[00:34:47] <wikibugs_>	 Beta-Cluster-Infrastructure, Wikimedia-Site-requests: Fix language interwikis in beta to point to beta - https://phabricator.wikimedia.org/T120427#3069685 (Dalba)
[00:34:50] <wikibugs_>	 Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, Pywikibot-core, Pywikibot-tests: Run pywikibot test suite regularly on beta cluster as part of MediaWiki/Wikimedia CI - https://phabricator.wikimedia.org/T100903#3069684 (Dalba)
[00:37:08] <wikibugs_>	 Beta-Cluster-Infrastructure, Wikimedia-Site-requests: Fix language interwikis in beta to point to beta - https://phabricator.wikimedia.org/T120427#3069688 (Dalba)
[00:37:13] <wikibugs_>	 Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, Pywikibot-core, Pywikibot-tests: Run pywikibot test suite regularly on beta cluster as part of MediaWiki/Wikimedia CI - https://phabricator.wikimedia.org/T100903#3069687 (Dalba)
[00:37:26] <wikibugs_>	 Beta-Cluster-Infrastructure, Wikimedia-Site-requests: Fix language interwikis in beta to point to beta - https://phabricator.wikimedia.org/T120427#1853735 (Dalba)
[00:37:29] <wikibugs_>	 Beta-Cluster-Infrastructure, Continuous-Integration-Infrastructure, Pywikibot-core, Pywikibot-tests: Run pywikibot test suite regularly on beta cluster as part of MediaWiki/Wikimedia CI - https://phabricator.wikimedia.org/T100903#1323218 (Dalba)
[02:26:14] <wmf-insecte>	 Project beta-scap-eqiad build #144733: FAILURE in 1 min 21 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144733/
[02:36:33] <wmf-insecte>	 Project beta-scap-eqiad build #144734: STILL FAILING in 1 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144734/
[02:46:29] <wmf-insecte>	 Project beta-scap-eqiad build #144735: STILL FAILING in 1 min 36 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144735/
[02:48:29] <wikibugs_>	 Gerrit, Blueprint, WikimediaUI Style Guide: Rename mediawiki/skins/LivingStyleGuide code repository to mediawiki/skins/Blueprint - https://phabricator.wikimedia.org/T93568#3069872 (Volker_E)
[02:57:05] <wmf-insecte>	 Yippee, build fixed!
[02:57:05] <wmf-insecte>	 Project beta-scap-eqiad build #144736: FIXED in 2 min 14 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144736/
[04:07:05] <wmf-insecte>	 Project selenium-MultimediaViewer » safari,beta,OS X 10.9,BrowserTests build #318: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/318/
[04:10:19] <wikibugs_>	 Release-Engineering-Team, MediaWiki-Configuration, Patch-For-Review, Services (watching): Automate WMF wiki creation - https://phabricator.wikimedia.org/T158730#3069969 (tstarling) Probably better to use etcd than a separate web host, since that seems to be the current standard solution. See abov...
[04:14:03] <wikibugs_>	 Release-Engineering-Team, MediaWiki-Configuration, Patch-For-Review, Services (watching): Automate WMF wiki creation - https://phabricator.wikimedia.org/T158730#3069976 (tstarling)
[04:18:49] <wikibugs_>	 (PS1) Harej: Adding Doxygen and JSDuck postmerge jobs for CollaborationKit [integration/config] - https://gerrit.wikimedia.org/r/340921
[04:20:21] <wikibugs_>	 (PS1) Harej: Merge branch 'master' of https://gerrit.wikimedia.org/r/integration/config [integration/config] - https://gerrit.wikimedia.org/r/340922
[04:20:37] <wikibugs_>	 (Abandoned) Harej: Merge branch 'master' of https://gerrit.wikimedia.org/r/integration/config [integration/config] - https://gerrit.wikimedia.org/r/340922 (owner: Harej)
[04:21:19] <wikibugs_>	 (PS2) Harej: Adding Doxygen and JSDuck postmerge jobs for CollaborationKit [integration/config] - https://gerrit.wikimedia.org/r/340921
[05:56:16] <wmf-insecte>	 Project beta-scap-eqiad build #144754: FAILURE in 1 min 22 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144754/
[06:06:16] <wmf-insecte>	 Project beta-scap-eqiad build #144755: STILL FAILING in 1 min 25 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144755/
[06:16:14] <wmf-insecte>	 Project beta-scap-eqiad build #144756: STILL FAILING in 1 min 24 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144756/
[06:20:35] <shinken-wm_>	 PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<55.56%)
[06:27:07] <wmf-insecte>	 Yippee, build fixed!
[06:27:08] <wmf-insecte>	 Project beta-scap-eqiad build #144757: FIXED in 2 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144757/
[06:45:35] <shinken-wm_>	 RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[06:58:26] <wikibugs_>	 Release-Engineering-Team, MediaWiki-Configuration, Patch-For-Review, Services (watching): Automate WMF wiki creation - https://phabricator.wikimedia.org/T158730#3045487 (Joe) @tstarling I agree, dblists is one of the things that could be stored in etcd and read from there.  On the other hand, it'...
[07:20:51] <wikibugs_>	 (CR) Hashar: [C: 2] Adding Doxygen and JSDuck postmerge jobs for CollaborationKit [integration/config] - https://gerrit.wikimedia.org/r/340921 (owner: Harej)
[07:21:46] <wikibugs_>	 (Merged) jenkins-bot: Adding Doxygen and JSDuck postmerge jobs for CollaborationKit [integration/config] - https://gerrit.wikimedia.org/r/340921 (owner: Harej)
[07:22:09] <wikibugs_>	 (CR) Hashar: "I have deployed it just a minute ago." [integration/config] - https://gerrit.wikimedia.org/r/340790 (owner: Gergő Tisza)
[07:23:16] <wikibugs_>	 (PS2) Hashar: Adding dependencies for current BlueSpice* repos [integration/config] - https://gerrit.wikimedia.org/r/340747 (owner: Robert Vogel)
[07:23:56] <wikibugs_>	 (CR) Hashar: [C: 2] Adding dependencies for current BlueSpice* repos [integration/config] - https://gerrit.wikimedia.org/r/340747 (owner: Robert Vogel)
[07:24:41] <wikibugs_>	 (Merged) jenkins-bot: Adding dependencies for current BlueSpice* repos [integration/config] - https://gerrit.wikimedia.org/r/340747 (owner: Robert Vogel)
[10:30:02] <wikibugs_>	 Beta-Cluster-Infrastructure: Special:Version displays incorrect information for what commit is deployed there - https://phabricator.wikimedia.org/T159520#3070282 (Nikerabbit)
[10:32:44] <wikibugs_>	 (CR) EddieGP: "Thanks again for doing this so fast. :)" [integration/config] - https://gerrit.wikimedia.org/r/340790 (owner: Gergő Tisza)
[11:09:40] <wikibugs_>	 Gerrit, Browser-Support-Internet-Explorer: Internet Explorer 8 cannot display gerrit patch set with @/at sign in search result - https://phabricator.wikimedia.org/T52818#3070328 (Aklapper) Resolved>declined As we do not know if anything is improved this is not `resolved`. Closing as `declined`. S...
[11:26:36] <wmf-insecte>	 Project beta-scap-eqiad build #144787: FAILURE in 1 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144787/
[11:36:26] <wmf-insecte>	 Project beta-scap-eqiad build #144788: STILL FAILING in 1 min 32 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144788/
[11:46:22] <wmf-insecte>	 Project beta-scap-eqiad build #144789: STILL FAILING in 1 min 28 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144789/
[11:57:00] <wmf-insecte>	 Yippee, build fixed!
[11:57:00] <wmf-insecte>	 Project beta-scap-eqiad build #144790: FIXED in 2 min 6 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144790/
[12:46:52] <wikibugs_>	 Gerrit, Browser-Support-Internet-Explorer: Internet Explorer 8 cannot display gerrit patch set with @/at sign in search result - https://phabricator.wikimedia.org/T52818#3070470 (Paladox) Whoops sorry
[13:11:47] <wikibugs_>	 Browser-Tests-Infrastructure, MediaWiki-General-or-Unknown, JavaScript, Patch-For-Review, and 2 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3070518 (zeljkofilipin)
[13:17:23] <shinken-wm_>	 PROBLEM - Puppet run on jenkinstest is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[13:24:05] <wikibugs_>	 Continuous-Integration-Config, TestMe: fix or mark as inactive extensions currently failing CI - https://phabricator.wikimedia.org/T134090#3070551 (Sebschlicht)
[13:27:22] <shinken-wm_>	 RECOVERY - Puppet run on jenkinstest is OK: OK: Less than 1.00% above the threshold [0.0]
[13:29:45] <Zppix>	 hashar:  whats going on on jenkinstest?
[13:31:26] <wikibugs_>	 (PS2) Hashar: test: mw ext/skins have test templates in Zuul [integration/config] - https://gerrit.wikimedia.org/r/332890
[13:33:20] <hashar>	 Zppix: work
[13:33:28] <Zppix>	 what is that xD
[13:33:33] <hashar>	 Zppix: it is a test instance for jenkins
[13:33:37] <wmf-insecte>	 Project beta-scap-eqiad build #144799: FAILURE in 8 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144799/
[13:33:39] <hashar>	 to try out some puppet patch I am working on 
[13:33:42] <Zppix>	 hashar:  i meant what is work xD
[13:33:45] <wikibugs_>	 (CR) jerkins-bot: [V: -1] test: mw ext/skins have test templates in Zuul [integration/config] - https://gerrit.wikimedia.org/r/332890 (owner: Hashar)
[13:34:21] <Zppix>	 hashar:  can you translate 13:24:56 [Fri Mar  3 13:24:56 2017] [hphp] [2308:7f0cedf626c0:0:000001] [] Unable to set ResourceLimit.CoreFileSize to 8589934592: Operation not permitted (1) into human please?
[13:34:43] <hashar>	 some oddity somewhere in hhvm conf i guess
[13:34:48] <hashar>	 can be ignored
[13:34:54] <Zppix>	 ok
[13:35:03] <hashar>	 hhvm fails to set some process limit. it is probably not important
[13:35:16] <Zppix>	 hashar:  then i cannot find out why https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm-jessie/7346/console failed
[13:37:34] <wmf-insecte>	 Project beta-scap-eqiad build #144800: STILL FAILING in 1 min 58 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144800/
[13:38:58] <wmf-insecte>	 Project beta-scap-eqiad build #144801: STILL FAILING in 1 min 21 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144801/
[13:41:52] <wikibugs_>	 (PS1) Hashar: Merge mw/core autoload-dev in vendor.git [integration/config] - https://gerrit.wikimedia.org/r/340975 (https://phabricator.wikimedia.org/T158674)
[13:42:06] <wikibugs_>	 (CR) Hashar: [C: 2] Merge mw/core autoload-dev in vendor.git [integration/config] - https://gerrit.wikimedia.org/r/340975 (https://phabricator.wikimedia.org/T158674) (owner: Hashar)
[13:42:57] <wikibugs_>	 (Merged) jenkins-bot: Merge mw/core autoload-dev in vendor.git [integration/config] - https://gerrit.wikimedia.org/r/340975 (https://phabricator.wikimedia.org/T158674) (owner: Hashar)
[13:43:59] <Zppix>	 13:25:23 Fatal error: Call to undefined function WikibaseQuality\ConstraintReport\Tests\Specials\SpecialConstraintReport\both() in /home/jenkins/workspace/mediawiki-extensions-hhvm- hashar  this looks bad
[13:44:42] <hashar>	 Zppix: yeah 
[13:44:55] <hashar>	 known I have been working on that with wikidata folks last week and this week
[13:44:56] <Zppix>	 hashar:  is that commit or ci error
[13:45:01] <hashar>	 the patches I have sent above would fix it
[13:45:11] <Zppix>	 hashar:  would that cause a V: -1
[13:45:15] <hashar>	 hopefully
[13:45:19] <hashar>	 yeah that is a fatal error
[13:45:21] <hashar>	 so the job fails
[13:45:36] <Zppix>	 hashar:  can you override the v: -1 on https://gerrit.wikimedia.org/r/#/c/340965/ please
[13:45:42] <hashar>	 no
[13:46:00] <hashar>	 should be fixed now
[13:46:02] <Zppix>	 ok
[13:46:05] <Zppix>	 ill run recheck
[13:46:05] <hashar>	 just comment in Gerrit 'recheck'
[13:46:11] <hashar>	 and that will run the jobs again hopefully
[13:46:15] <hashar>	 bah it is not fixed
[13:46:17] <hashar>	 grrlbllblb
[13:46:34] <wmf-insecte>	 Project beta-scap-eqiad build #144802: STILL FAILING in 1 min 32 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144802/
[13:46:47] <Zppix>	 hashar:  you have 1 job xD
[13:46:54] <wmf-insecte>	 Yippee, build fixed!
[13:46:54] <wmf-insecte>	 Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #325: FIXED in 2 min 53 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/325/
[13:49:45] <wikibugs_>	 (PS1) Hashar: We want to match the project repo name [integration/config] - https://gerrit.wikimedia.org/r/340976 (https://phabricator.wikimedia.org/T158674)
[13:51:06] <Zppix>	 jessie's still failing hashar 
[13:55:45] <Zppix>	 eddiegp:  fyi recheck is whitelisted only certain people can use it
[13:56:30] <eddiegp>	 Zppix: Yep, got whitelisted yesterday.
[13:56:40] <Zppix>	 oh :P
[13:56:53] <Zppix>	 my logs didnt go back that far for some reason
[13:57:01] <hashar>	 eddiegp: hello :)
[13:57:12] <hashar>	 I think the job is fixed now
[13:57:19] <eddiegp>	 Hello :)
[13:58:18] <hashar>	 eddiegp: so for your patch
[13:58:25] <hashar>	 when you remove the phpcs rules in https://gerrit.wikimedia.org/r/#/c/340965/4/phpcs.xml
[13:58:31] <hashar>	 that causes CI to stop ignoring them
[13:58:38] <hashar>	 and thus we know your patch is good :]
[14:00:34] <eddiegp>	 hashar: I've left you a comment at "part 3" https://gerrit.wikimedia.org/r/#/c/340964/ and had amended "part 4" https://gerrit.wikimedia.org/r/#/c/340965/ to do this.
[14:00:59] <wikibugs_>	 Continuous-Integration-Config, Wikidata, Patch-For-Review, Wikidata-Sprint: Fatal error: Call to undefined function Wikibase\Client\Tests\RecentChanges\both() - on jenkins - https://phabricator.wikimedia.org/T158674#3070637 (hashar) Open>Resolved a:hashar So that one is fixed. The roo...
[14:01:34] <eddiegp>	 Oh, I got your message wrong.
[14:01:37] <Zppix>	 eddiegp:  for future reference please try to keep it all in one commit its less struggle for us to review
[14:01:52] <eddiegp>	 Jep, I've got that by now :)
[14:01:55] <hashar>	 yeah looks good
[14:02:05] <hashar>	 wanna learn how to merge the commits ? :]
[14:02:22] <hashar>	 it is all some git magic
[14:02:42] <eddiegp>	 Jep, up for it ;)
[14:02:47] <hashar>	 so
[14:02:56] <hashar>	 when you send a patch to gerrit
[14:03:20] <hashar>	 Gerrit doesnt know for which change it is
[14:03:29] <hashar>	 to attach your patch to an existing change it looks at:
[14:03:32] <hashar>	 1) the repository
[14:03:34] <hashar>	 2) the branch
[14:03:43] <hashar>	 3) the Change-Id field in the commit message
[14:04:06] <hashar>	 eg for SiteMatrix if you create an entirely new commit but with the same Change-Id and send it, that will create a new patchset for that change
[14:04:11] <hashar>	 that is for the context
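The Change-Id matching described above can be tried locally. A minimal sketch, assuming only plain git in a throwaway repository (the identity, file name and Change-Id value are made up for illustration, and no Gerrit server is involved): amending a commit changes its hash, but the Change-Id trailer in the message stays the same, which is what lets Gerrit attach the new commit to the existing change.

```shell
#!/bin/sh
# Sketch only: throwaway repo, hypothetical identity and Change-Id value.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.org"
git config user.name "Example Dev"

# First version of the commit, carrying a Change-Id trailer.
echo one > file.txt
git add file.txt
git commit -q -m "First draft

Change-Id: I1111111111111111111111111111111111111111"
first_sha=$(git rev-parse HEAD)
first_id=$(git log -1 --format=%B | sed -n 's/^Change-Id: //p')

# Rewrite the commit: the content and hash change, the message is kept.
echo two > file.txt
git commit -q -a --amend --no-edit
second_sha=$(git rev-parse HEAD)
second_id=$(git log -1 --format=%B | sed -n 's/^Change-Id: //p')

echo "hash before: $first_sha  Change-Id: $first_id"
echo "hash after:  $second_sha  Change-Id: $second_id"
```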
[14:04:25] <hashar>	 what we want is to get part4 code added in part3 code
[14:04:39] <hashar>	 to do so locally, you want to retrieve part3:  git-review -d 340964
[14:05:02] <hashar>	 then you can use git to add the code of the other change but not commit it
[14:05:07] <hashar>	 that is known as cherry picking
[14:05:28] <eddiegp>	 Yep, I've done that before, wait a moment...
[14:05:43] <hashar>	 part4 patchset is commit 6335a23fdeac7c0a3d2cc3785d2eca843060eb5e
[14:05:46] <hashar>	 so you can:
[14:05:50] <hashar>	 git cherry-pick --no-commit 6335a23fdeac7c0a3d2cc3785d2eca843060eb5e
[14:06:03] <hashar>	 that will take all the content of part 4 and apply it on part 3 WITHOUT creating a commit
[14:06:09] <hashar>	 using "git diff --cached" you should see part 4 
[14:06:12] <hashar>	 you can then git add
[14:06:14] <hashar>	 git commit -ammend
[14:06:26] <hashar>	 err git commit --amend
[14:06:37] <hashar>	 and that updates part3 commit with whatever you have cherry picked :]
[14:06:38] <eddiegp>	 actually it gets staged right away, so no need to git add.
[14:06:47] <hashar>	 even better
[14:07:00] <hashar>	 once you have sent the updated part 3 commit to Gerrit
[14:07:27] <hashar>	 you can try the [Rebase] button on part 4 change https://gerrit.wikimedia.org/r/#/c/340965/  and it should show an empty list of files
[14:07:36] <wmf-insecte>	 Yippee, build fixed!
[14:07:37] <wmf-insecte>	 Project beta-scap-eqiad build #144803: FIXED in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144803/
[14:07:40] <hashar>	 since all the changes have been incorporated in previous change (part 3) and thus you can abandon it
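The squash steps walked through above can be rehearsed without touching Gerrit. A minimal sketch, assuming a throwaway local repository in which "part 3" and "part 4" are two stacked commits (the identity, Change-Id and file contents are hypothetical; in the real workflow part 3 is fetched with `git-review -d 340964` and the result is sent back with git-review afterwards):

```shell
#!/bin/sh
# Sketch only: simulates folding "part 4" into "part 3" in a scratch repo.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.org"   # hypothetical identity
git config user.name "Example Dev"

echo base > README
git add README
git commit -q -m "base"

# "part 3": the change that should end up holding everything.
echo "rule A" > phpcs.xml
git add phpcs.xml
git commit -q -m "part 3

Change-Id: I0000000000000000000000000000000000000003"

# "part 4": a follow-up commit stacked on top of part 3.
echo "rule B" >> phpcs.xml
git commit -q -a -m "part 4"
part4=$(git rev-parse HEAD)

# Go back to part 3 (stand-in for: git-review -d 340964).
git checkout -q -b squash HEAD~1

# Apply part 4's content without creating a commit; it lands staged.
git cherry-pick --no-commit "$part4"
git diff --cached --stat

# Fold the staged content into part 3, keeping its message and Change-Id.
git commit -q --amend --no-edit

git log --oneline
```

Because the amended commit keeps part 3's Change-Id, sending it would update the existing part 3 change rather than open a new one.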
[14:08:01] <eddiegp>	 Is up at https://gerrit.wikimedia.org/r/#/c/340964/
[14:08:42] <hashar>	 I rebased https://gerrit.wikimedia.org/r/#/c/340965/
[14:08:48] <hashar>	 it is now empty and I have abandoned it
[14:08:56] <hashar>	 now we just have to wait for CI
[14:08:58] <hashar>	 review the code
[14:09:02] <hashar>	 and I will +2 it :]
[14:09:46] <Zppix>	 i changed the commit
[14:09:48] <eddiegp>	 Why is it needed to rebase it when I'll abandon it right afterwards anyway?
[14:10:07] <Zppix>	 eddiegp:  merge conflict 
[14:10:09] <Zppix>	 thats why
[14:10:34] <hashar>	 eddiegp:  I rebased it merely to make sure that all changes have been properly incorporated
[14:10:37] <chasemp>	 hashar: how is nodepool doing since we upped the quota?
[14:10:48] <hashar>	 chasemp: on steroids!
[14:11:14] <Zppix>	 hashar:  dang it i've sent nodepool to rehab once already
[14:11:47] <eddiegp>	 Zppix: Ah, okay.
[14:11:54] <hashar>	 chasemp: there are less changes queued in gearman, and the pool looks less busy
[14:12:03] <chasemp>	 hashar: ok good, seems decent to me too, realigning the jessie ready nodes alone seems to have been good
[14:12:25] <eddiegp>	 Zppix: I would say it's still (part 3) as part 1 and 2 have already been merged ...
[14:12:44] <hashar>	 chasemp: I also stopped triggering zend jobs for a lot of repositories so that is less instances consumed
[14:12:45] <Zppix>	 ok eddiegp  ill fix
[14:12:54] <chasemp>	 hashar: nice
[14:12:57] <chasemp>	 even better
[14:12:59] <hashar>	 chasemp: the zend jobs are still triggered but only on CR+2
[14:13:09] <hashar>	 there are probably some jobs we can merge now
[14:13:18] <hashar>	 typically run composer/npm/phpunit all in a single job 
[14:14:19] <hashar>	 chasemp: for the instance creation slowness is there anything I can help with ?
[14:14:31] <hashar>	 labnet nova logs dont show much though :/
[14:15:01] <eddiegp>	 That was the part I struggled with mostly when hashar left me the comment to put all of those into one single commit: I couldn't figure out how to do this for the patches already merged. Seems I just got him wrong :) 
[14:15:04] <chasemp>	 hashar: it's a sometimes thing where I suspect the scheduler but andrew or I will have to find time to dig in
[14:15:14] <chasemp>	 fwiw this has most likely been going on for a long time
[14:15:22] <chasemp>	 it's just now we are watching
[14:15:43] <hashar>	 eddiegp: I highly recommend reading (and even buying as a reference) https://git-scm.com/book/
[14:16:18] <hashar>	 eddiegp: if you had only one chapter to read, that would be chapter 3 about branches  https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell
[14:16:36] <Zppix>	 eddiegp:  or do what i do type commands like a mad man break a few things and make hashar  fix it xD
[14:17:16] <hashar>	 chasemp: based on a similar nodepool metric, it looks like it really degraded over the last couple weeks 
[14:17:37] <chasemp>	 hashar: well that's definitely interesting
[14:17:48] <Zppix>	 hashar:  i think load balancing could be done a little better 
[14:17:53] <chasemp>	 does it correlate w/ our quota bump for nodepool itself?
[14:18:12] <wikibugs_>	 Gerrit, Browser-Support-Internet-Explorer: Internet Explorer 8 cannot display gerrit patch set with @/at sign in search result - https://phabricator.wikimedia.org/T52818#3070679 (Aklapper) No problem :)
[14:18:22] <chasemp>	 the scheduler is much busier than before etc
[14:18:32] <chasemp>	 but really I don't know why, only that it's predictable atm
[14:19:00] <hashar>	 apparently not related
[14:19:13] <hashar>	 looking at the nodepool graphs over 30 days on https://grafana.wikimedia.org/dashboard/db/nodepool?from=now-30d&to=now  
[14:19:38] <hashar>	 the top left one shows the nodepool bump, you can put your mouse pointer on it to have a vertical red bar on all other graphs
[14:19:58] <hashar>	 and below there are a couple graphs showing the median and max times to launch an instance
[14:20:11] <hashar>	 (which is the equivalent of your fullstack metric  + all of nodepool overhead)
[14:20:46] <hashar>	 but I guess your fullstack metric would show it nicely
[14:20:55] <chasemp>	 right, barring a drop-everything emergency it's going to have to wait for andrew
[14:21:02] <chasemp>	 this is the worst week possible
[14:21:08] <eddiegp>	 To be honest I've already read this chapter once, getting from it the very basics (what branches, merges and rebasing even are), going through the whole book to understand things more deeply is still somewhere on my todo list ;) Thanks for the reminder anyway. :)
[14:21:29] <hashar>	 chasemp: for nodepool the pool is now large enough that it is not much of an issue imho
[14:21:48] <chasemp>	 yeah that's good, I still hate it, but that's good
[14:23:04] <shinken-wm_>	 PROBLEM - Puppet run on deployment-copper is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[14:23:04] <hashar>	 something that might be related. I noticed that once new snapshots have been generated, it takes ages to spawn instances
[14:23:13] <Zppix>	 hashar:  i'd say load balancing is partially to blame i've seen jessie jobs queued for over 30 mins before and its only because another job on jessie was running for the last half hour
[14:23:15] <hashar>	 but I guess that is due to transferring the image to the Compute nodes
[14:23:50] <hashar>	 Zppix: that happens when one CR+2s.  The changes are put in a queue and throttled
[14:24:12] <Zppix>	 but i also then see that there are empty jessie instances
[14:24:18] <Zppix>	 that could have been easily used
[14:25:07] <hashar>	 well it does not necessarily run all the jobs and might wait for others to complete first
[14:25:19] <hashar>	 there are also some jobs that can only have one build at a time
[14:26:06] <Zppix>	 hashar:  hhvm one of them? i notice that one takes a lot of time
[14:26:13] <hashar>	 depends on the job
[14:26:20] <Zppix>	 example for ext
[14:26:21] <wmf-insecte>	 Project beta-scap-eqiad build #144806: FAILURE in 1 min 22 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144806/
[14:26:39] <Zppix>	 beta-scap seems to be failing a lot today is there a known ci issue or something?
[14:27:13] <hashar>	 hhvm in the name is just a way to indicate it is using hhvm to run whatever script/commands
[14:27:32] <hashar>	 yeah there is a task for beta-scap. Puppet related and I dont think anyone is looking into it
[14:27:55] <Zppix>	 i would if i had the access but sadly i don't, what do we know so far about it?
[14:28:22] <hashar>	 Zppix: https://phabricator.wikimedia.org/T159332
[14:29:22] <Zppix>	 it needs a host key verified is one of the issues im seeing within the first few seconds
[14:30:02] <Zppix>	 why delete known-hosts to begin with?
[14:31:18] <hashar>	 no clue :)
[14:31:28] <hashar>	 I am heading to office brb
[14:36:21] <wmf-insecte>	 Project beta-scap-eqiad build #144807: STILL FAILING in 1 min 26 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144807/
[14:53:27] <wmf-insecte>	 Project beta-scap-eqiad build #144808: STILL FAILING in 8 min 29 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144808/
[14:57:18] <wmf-insecte>	 Project beta-scap-eqiad build #144809: STILL FAILING in 1 min 57 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144809/
[15:06:32] <wmf-insecte>	 Project beta-scap-eqiad build #144810: STILL FAILING in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144810/
[15:16:28] <wmf-insecte>	 Project beta-scap-eqiad build #144811: STILL FAILING in 1 min 30 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144811/
[15:36:04] <wmf-insecte>	 Yippee, build fixed!
[15:36:05] <wmf-insecte>	 Project beta-scap-eqiad build #144812: FIXED in 11 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144812/
[15:40:10] <paladox>	 RainbowSprinkles i think the reason why polygerrit comments are broken when using the rewrites is we need to add the base path to its internal system
[15:40:11] <paladox>	 https://github.com/gerrit-review/gerrit/blob/7e83ae9ef03f62cd2d224e4743268bedb68a8f70/polygerrit-ui/app/elements/shared/gr-rest-api-interface/gr-rest-api-interface.js#L425
[15:44:38] <wmf-insecte>	 Yippee, build fixed!
[15:44:38] <wmf-insecte>	 Project selenium-MobileFrontend » chrome,beta,Linux,BrowserTests build #346: FIXED in 22 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/346/
[16:03:54] <shinken-wm_>	 PROBLEM - Puppet run on integration-saltmaster is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[16:07:05] <paladox>	 nodepool is down again
[16:07:05] <paladox>	 https://integration.wikimedia.org/zuul/
[16:07:07] <paladox>	 hashar ^^
[16:07:23] <paladox>	 chasemp ^^
[16:08:55] <shinken-wm_>	 RECOVERY - Puppet run on integration-saltmaster is OK: OK: Less than 1.00% above the threshold [0.0]
[16:11:12] <wikibugs_>	 Release-Engineering-Team, Labs, Operations, Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3070970 (Paladox)
[16:11:29] <paladox>	 hashar i filed ^^
[16:11:54] <wikibugs_>	 Release-Engineering-Team, Labs, Operations, Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3070982 (Paladox) p:Triage>High
[16:15:07] <chasemp>	 hashar: thcipriani|afk service interruption in vm creation as we updated a field in a config file that hit like every service and it's taking awhile to come back
[16:15:32] <hashar>	 chasemp: lets stop nodepool entirely to avoid overloading the nova api?
[16:15:50] <chasemp>	 hashar: ok
[16:16:08] <chasemp>	 andrewbogott is babysitting it hashar and afaik it's coming back just slow
[16:16:09] <hashar>	 it is already in bad shape apparently
[16:16:19] <hashar>	 oh so the change already happened
[16:16:36] <chasemp>	 well, a host of in-flight creations went error as services via puppet restarted
[16:16:38] <chasemp>	 I  think
[16:16:53] <hashar>	 bad puppet
[16:16:59] <hashar>	 going to stop jenkins instead
[16:17:23] <chasemp>	 well, it's expected but maybe we should shepherd this out by hand for nova idk
[16:17:47] <hashar>	 ReadTimeout: HTTPConnectionPool(host='labnet1001.eqiad.wmnet', port=8774): Read timed out. (read timeout=60.0)
[16:17:49] <hashar>	 I stopped Jenkins
[16:18:51] <chasemp>	 andrewbogott: ^ any idea of current status?
[16:19:34] <hashar>	 in which channel do you typically handle labs/openstack ?  I don't mind switching to wherever you two are already talking about it
[16:19:37] <andrewbogott>	 things should be back to normal now, or almost
[16:20:09] <chasemp>	 andrewbogott: should we force cleanup error instances in contintcloud?
[16:20:14] <hashar>	 nova-api.log on labnet1001 has some traces still 
[16:20:16] <hashar>	 such as: instance's host labvirt1007 is down, deleting from database
[16:20:21] <andrewbogott>	 probably
[16:20:43] <andrewbogott>	 hashar: I'm watching that too, I think there are some very long-lived timeouts still happening
[16:20:52] <hashar>	 okkk
[16:22:32] <chasemp>	 error instances are not reducing for contintcloud on their own so far
[16:22:33] <hashar>	 which eventually cascade in Nodepool timing out talking to labnet1001:8774 after 60 secs
[16:22:34] <chasemp>	 19 and holding
[16:22:50] <hashar>	 yeah nodepool keeps trying to delete the ones it has flagged for deletion
[16:23:14] <hashar>	 bunch are showing in ERROR state
[16:23:21] <chasemp>	 is there a way to tell nodepool "forget all your state those are dead"
[16:23:31] <andrewbogott>	 a few labvirts are still unresponsive, so if we're trying to delete instances there they would timeout
[16:23:51] <chasemp>	 andrewbogott: ok I'm holding off on doing anything since you're on it
[16:25:54] <hashar>	 I guess when the ERROR instances get cleaned out from openstack , nodepool will consider them as deleted and flush them out
[16:27:23] <wikibugs_>	 Release-Engineering-Team, Labs, Operations, Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3071028 (chasemp) a:Andrew we merged https://gerrit.wikimedia.org/r/#/c/340986/ causing nova services to...
[16:33:13] <hashar>	 andrewbogott: I don't think they are long lived timeouts.
[16:33:46] <hashar>	 nodepool keeps trying to request deletion from the api, and after some seconds nova-api says a message can not be dispatched because labvirtXXX is down
[16:34:08] <andrewbogott>	 "a few labvirts are still unresponsive, so if we're trying to delete instances there they would timeout"
[16:34:59] <hashar>	 ahh ok
[16:37:52] <andrewbogott>	 There is a serious problem in nova-compute that means it takes sometimes 20-30 minutes for the service to be responsive after a restart.
[16:37:59] <andrewbogott>	 I fixed the issue but the fix is in M
[16:38:15] <andrewbogott>	 It's only noticeable if we restart all nodes at once, generally
[16:39:28] <hashar>	 and puppet ended up restarting them all together right?
[16:42:12] <andrewbogott>	 yes, due to a config change
[16:43:29] <hashar>	 andrewbogott: and I guess there is no way to clear out the instances in error?
[16:45:37] <paladox>	 RainbowSprinkles i think it is an access denied problem. Most likely it doesn't like us rewriting the urls. I think it's https://github.com/gerrit-review/gerrit/blob/7e83ae9ef03f62cd2d224e4743268bedb68a8f70/polygerrit-ui/app/elements/shared/gr-rest-api-interface/gr-rest-api-interface.js#L118
[16:48:06] <wikibugs_>	 Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3064414 (thcipriani) So this is happening continually, but the root cause is somewhat confounding.  According t...
[16:52:29] <wikibugs_>	 Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3071059 (thcipriani) p:Triage>Normal It seems this is as a result of a patch on beta puppetmaster relat...
[16:54:51] <paladox>	 Strange logging in works but not posting comments. /me wonders what rest api polygerrit uses.
[16:55:42] <wikibugs_>	 Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3071087 (hashar) Ahhh I forgot about cherry-picks.  So yeah that seems to enable collection of host keys to gen...
[16:57:14] <hashar>	 thcipriani: there is another puppet resource for /etc/ssh/ssh_known_hosts , maybe that is the one generating an empty file
[16:57:16] <hashar>	 race condition!!
[16:57:42] <thcipriani>	 oh
[16:57:46] <thcipriani>	 that would make some sense
[16:58:05] <hashar>	 so maybe craft a puppet patch to drop that file{}
[16:58:09] <hashar>	 that might resolve the race
[16:58:17] <hashar>	 and hopefully the file perms are all ok
[16:58:53] <thcipriani>	 where is the 2nd file resource?
[16:59:16] <thcipriani>	 wouldn't this cause puppet to explode at the compile phase?
[17:00:48] <hashar>	 thcipriani: different type of resources I believe
[17:00:55] <hashar>	 one would be File[ssh_known_hosts]
[17:01:05] <hashar>	 the other something like  @@Ssh_Keys   or whatever
[17:01:29] <hashar>	 puppet definitely does not see a conflict between them
[17:01:38] <thcipriani>	 ah, got it, role::ci::slave::labs
[17:01:48] <hashar>	 whatever generates the ssh key collection does not define a File resource I believe
[17:02:01] <hashar>	 thcipriani: yes https://phabricator.wikimedia.org/T159332#3071087 :}
[17:02:25] <thcipriani>	 still. seems weird that it would overwrite the content rather than just change perms
[17:02:30] <hashar>	 meanwhile the whole CI is down due to inability to delete instances in openstack ( https://phabricator.wikimedia.org/T159543 )
[17:03:42] <thcipriani>	 I suppose that'll happen
[17:05:05] <andrewbogott>	 well, that was a strange interconnected mess...
[17:05:10] <andrewbogott>	 hashar, are things clearing now?
[17:09:23] <paladox>	 RainbowSprinkles it uses the wrong rest api, probably because of the apache rules. Just tested with rebase: even though i did not actually fix that, it was a 404 error and now it just reloads the page but doesn't actually rebase.
[17:10:04] <paladox>	 I will see if fixing it internally will do the job
[17:12:15] <Yaron>	 Is Jenkins down?
[17:19:07] <hashar>	 andrewbogott: looks better yes
[17:19:45] <andrewbogott>	 ok, I may be about to cause slightly more disruption but I'll keep an eye out
[17:21:27] <mutante>	 Yaron: <+hashar> meanwhile the whole CI is down due to inability to delete instances in openstack ( https://phabricator.wikimedia.org/T159543 )
[17:22:46] <paladox>	 Yaron nova is having a problem currently
[17:23:35] <wikibugs_>	 06Release-Engineering-Team, 06Labs, 06Operations, 07Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3071222 (10Paladox) p:05High>03Unbreak! Guessing unbreak as ci is down?
[17:24:00] <paladox>	 Should the topic in this channel be changed?
[17:24:04] <shinken-wm_>	 PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:24:31] <hashar>	 andrewbogott: some instances managed to get deleted so we run on a small pool
[17:24:49] <andrewbogott>	 hashar: great.  I'm still trying to sort out why the others won't delete.
[17:26:15] <hashar>	 there are still a bunch of instances in ERROR state though
[17:31:37] <andrewbogott>	 It's down to 3 now
[17:32:39] <hashar>	 andrewbogott: yup all good
[17:32:46] <hashar>	 delta the three left over
[17:33:28] <hashar>	 andrewbogott: that is more or less the queue of CI jobs: https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&from=now-3h&to=now-5m
[17:33:29] <hashar>	 red = waiting
[17:33:34] <hashar>	 it should go down now
[17:33:40] <hashar>	 and green is functions running
[17:34:42] <wikibugs_>	 10Beta-Cluster-Infrastructure, 10Cognate, 10MediaWiki-extensions-InterwikiSorting, 13Patch-For-Review, 15User-Addshore: Create beta hewiktionary for testing InterwikiSorting & Cognate - https://phabricator.wikimedia.org/T158628#3071279 (10Addshore) While doing this I have also updated the docs on wikitec...
[17:35:11] <wikibugs_>	 10Beta-Cluster-Infrastructure, 10Cognate, 10MediaWiki-extensions-InterwikiSorting, 13Patch-For-Review, and 2 others: Create beta hewiktionary for testing InterwikiSorting & Cognate - https://phabricator.wikimedia.org/T158628#3042289 (10Addshore)
[17:35:28] <hashar>	 andrewbogott: thanks!! :} 
[17:35:40] <hashar>	 I guess the 3 left over will eventually get deleted at some point
[17:35:56] <Yaron>	 mutante: paladox: thanks.
[17:42:53] <paladox>	 you're welcome
[17:44:32] <Yaron>	 Yay, Jenkins is running again!
[17:49:58] <hashar>	 andrewbogott: all instances that were in ERROR are gone now. Service is fully back!!    Have a nice week-end
[18:04:05] <shinken-wm_>	 RECOVERY - Puppet run on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:13:42] <Yaron>	 Jenkins appears to be down or back-logged again.
[18:14:27] <wikibugs_>	 06Release-Engineering-Team, 06Labs, 06Operations, 07Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3071508 (10Paladox) p:05Unbreak!>03High
[18:15:04] <paladox>	 Yaron it will be a while before everything is caught up
[18:24:57] <wikibugs_>	 10Scap, 06Services (later), 15User-mobrovac: Delay repooling trending service after a restart - https://phabricator.wikimedia.org/T156687#2983681 (10Fjalapeno) @mobrovac what is the ETA on this? Just curious for planing
[18:26:36] <Yaron>	 paladox: yes, okay.
[18:54:33] <wikibugs_>	 10Beta-Cluster-Infrastructure: Special:Version displays incorrect information for what commit is deployed there - https://phabricator.wikimedia.org/T159520#3070282 (10greg) Is it still incorrect? You might have looked mid-update. Remember it updates every 10 minutes.
[19:12:38] <eddiegp>	 Does anyone know if gerritbot has been changed today? I noticed it now posts the shell user name in comments, not the LDAP user name. At least it did at T150423 (difference between 2:26 and 12:19 UTC). I don't know if that's intended or a bug.
[19:12:38] <stashbot>	 T150423: SiteMatrix should use short array syntax and shorter lines - https://phabricator.wikimedia.org/T150423
[19:14:50] <paladox>	 Yes it has
[19:15:18] <paladox>	 and yes it is intended.
[19:17:53] <mutante>	 eddiegp: https://phabricator.wikimedia.org/T159441  https://phabricator.wikimedia.org/T76291
[19:23:16] <eddiegp>	 Thanks
[19:27:13] <wikibugs_>	 (03Draft2) 10Addshore: Add Cognate to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/332912
[19:27:33] <wikibugs_>	 (03CR) 10Addshore: "The security review has now been closed so this is ready to be added to this script!" [tools/release] - 10https://gerrit.wikimedia.org/r/332912 (owner: 10Addshore)
[19:28:14] <wikibugs_>	 (03PS3) 10Addshore: Add Cognate to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/332912 (https://phabricator.wikimedia.org/T150182)
[19:28:33] <wikibugs_>	 (03CR) 10Chad: [C: 032] Add Cognate to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/332912 (https://phabricator.wikimedia.org/T150182) (owner: 10Addshore)
[19:28:50] <addshore>	 thanks RainbowSprinkles!
[19:29:08] <RainbowSprinkles>	 yw
[19:32:11] <wikibugs_>	 (03Merged) 10jenkins-bot: Add Cognate to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/332912 (https://phabricator.wikimedia.org/T150182) (owner: 10Addshore)
[19:32:16] <addshore>	 in fact, RainbowSprinkles, I have a question that I have no doubt you know the answer to!
[19:32:26] <RainbowSprinkles>	 Shoot
[19:33:17] <addshore>	 In InitialiseSettings & the $wgConf->settings array, what would happen if for a setting i would set 'default' => false, 'wikidataclient' => true, 'wikipedia' => false ?
[19:33:42] <addshore>	  / is there a way to enable something for all wikidataclients excluding wikipedia inline, or would I have to create a dblist for it?
[19:35:31] <RainbowSprinkles>	 Yeah, might want to create a computed dblist.
[19:35:41] <addshore>	 okay! :)
[19:35:48] <RainbowSprinkles>	 My *guess* would be that wikidataclient would override wikipedia, actually
[19:36:03] <RainbowSprinkles>	 (named dblists override project dblists, I think is the order)
[19:36:07] <RainbowSprinkles>	 But I could be wrong!
[19:36:17] <addshore>	 hahaa, okay, computed db list it is :D
[19:36:52] <RainbowSprinkles>	 Yeah, should be easy, just do wikidataclient - wikipedia
[19:56:33] <wmf-insecte>	 Project beta-scap-eqiad build #144837: 04FAILURE in 1 min 26 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144837/
[20:02:25] <thcipriani>	 huh, /etc/ssh/ssh_known_hosts is empty again, puppet rerun repopulated
[20:06:51] <wmf-insecte>	 Yippee, build fixed!
[20:06:51] <wmf-insecte>	 Project beta-scap-eqiad build #144838: 09FIXED in 1 min 54 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144838/
[20:17:55] <wikibugs_>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3071890 (10thcipriani) >>! In T159332#3071087, @hashar wrote: > And I guess puppet has two difference resources i...
[20:26:21] <wmf-insecte>	 Project beta-scap-eqiad build #144840: 04FAILURE in 1 min 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144840/
[20:37:03] <wmf-insecte>	 Yippee, build fixed!
[20:37:03] <wmf-insecte>	 Project beta-scap-eqiad build #144841: 09FIXED in 1 min 58 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/144841/
[20:37:28] <Zppix>	 hashar:  ^^ yay
[20:41:40] <matt_flaschen>	 Not sure if T159575 is a regression.
[20:41:40] <stashbot>	 T159575: Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575
[20:42:39] <Sagan>	 hashar: heh, thanks for your trust :)
[20:42:50] <hashar>	 ;-}
[20:43:24] <hashar>	 but watch out, if you mess up someone will retaliate :-D
[20:43:51] <Sagan>	 :D
[20:43:56] <hashar>	 given your ops on several other channels, I hope your request will be fulfilled
[20:44:42] <Sagan>	 yeah, I hope too. would make it a bit easier, if the trolls arrive again :)
[20:51:48] <wikibugs_>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3071976 (10hashar) I thought that maybe the File resource would mess up with the content somehow.  Apparently bet...
[20:52:31] <thcipriani>	 hashar: yeah, TIL we have deployment-puppetdb01
[20:52:43] <hashar>	 magic!
[20:52:55] <hashar>	 which also means we could setup a puppet dashboard :}
[20:53:06] <thcipriani>	 ...if we don't have one already :)
[20:53:11] <hashar>	 hehe
[20:54:08] * thcipriani double checks horizon proxies to be sure
[20:54:52] <thcipriani>	 heh, no, we don't :)
[20:55:27] <shinken-wm_>	 PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:57:34] <hashar>	 maybe one can craft a small manifest that realizes the ssh::key thing
[20:57:49] <hashar>	 and then  while loop; puppet apply --debug ; done;
[20:58:21] <hashar>	 I am randomly shooting out nonsense ideas. Really I have no idea how that works
[20:58:47] <thcipriani>	 there's evidently a puppetdb cli
[20:59:01] <thcipriani>	 although not on deployment-puppetdb01
[20:59:21] <hashar>	 java.lang.AssertionError: Assert failed: status
[20:59:21] <hashar>	 hehe
[21:01:25] <thcipriani>	 yeah, that's me trying to get to the status page...I think...
[21:05:28] <shinken-wm_>	 RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[21:21:05] <Zppix>	 hashar: are the beta scap tests fixed?
[21:21:53] <icinga-wm>	 PROBLEM - Check systemd state on contint2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[21:22:03] <icinga-wm>	 PROBLEM - jenkins_service_running on contint2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war
[21:23:21] <icinga-wm>	 ACKNOWLEDGEMENT - Check systemd state on contint2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. daniel_zahn conversion to systemd https://gerrit.wikimedia.org/r/#/c/337404/
[21:23:21] <icinga-wm>	 ACKNOWLEDGEMENT - jenkins_service_running on contint2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war daniel_zahn conversion to systemd https://gerrit.wikimedia.org/r/#/c/337404/
[21:23:21] <icinga-wm>	 ACKNOWLEDGEMENT - jenkins_zmq_publisher on contint2001 is CRITICAL: connect to address 127.0.0.1 and port 8888: Connection refused daniel_zahn conversion to systemd https://gerrit.wikimedia.org/r/#/c/337404/
[21:39:04] <hashar>	 Zppix: subscribe to the task and once it gets fixed you will know :-}
[21:39:09] <hashar>	 I am not actively working on it
[21:39:36] <Zppix>	 hashar: i just wanted to check in, i haven't got to look at phab yet
[21:39:48] <hashar>	 ;-)
[21:39:53] <Zppix>	 hashar: if there's anything i can do let me know
[21:46:47] <wikibugs_>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3072120 (10thcipriani) I think this means these resrouces exist. I found them in the psql db that backs Puppet DB...
[21:47:52] <icinga-wm>	 RECOVERY - Check systemd state on contint2001 is OK: OK - running: The system is fully operational
[21:48:02] <icinga-wm>	 RECOVERY - jenkins_service_running on contint2001 is OK: PROCS OK: 1 process with regex args ^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war
[21:52:20] <mutante>	 :) ^
[21:52:32] <mutante>	 the "systemd" part is the nice part
[21:53:21] <paladox>	 mutante hashar we should probably upstream it :)
[21:54:57] <mutante>	 do it :)
[21:55:59] <Zppix>	 ^ we should cloak wmf-insecte
[22:00:10] <wikibugs_>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team: Beta cluster scap job ( beta-scap-eqiad ) fails due to puppet erasing /etc/ssh/ssh_known_hosts - https://phabricator.wikimedia.org/T159332#3072143 (10thcipriani) Good query to note  ``` puppetdb=> select title, parameters from catalog_resources cr join...
[22:01:14] <wikibugs_>	 10Continuous-Integration-Infrastructure, 07Jenkins, 07Upstream: /etc/init.d/jenkins script provided by Debian doesn't work properly - https://phabricator.wikimedia.org/T53817#3072145 (10hashar) For the record, we are now running Jenkins with systemd. So this definitely no more applies.
[22:02:36] <Zppix>	 hashar: what was jenkins on before systemd
[22:02:47] <paladox>	 It was on init
[22:03:11] <Zppix>	 No wonder issues were existent
[22:03:12] <paladox>	 init.d to be precise (Yes there's a difference between init and init.d)
[22:03:17] <paladox>	 Why?
[22:03:30] <Zppix>	 Init.d in my experience is well "fun"
[22:03:37] <paladox>	 oh
[22:04:48] <RainbowSprinkles>	 death to systemd
[22:04:58] <RainbowSprinkles>	 sysv init 4 ever
[22:05:23] <Zppix>	 No no
[22:05:41] <paladox>	 lol
[22:05:52] <paladox>	 systemd is strict
[22:06:00] <paladox>	 you can't define functions or custom params
[22:06:16] <Zppix>	 ^ but that means fewer errors
[22:06:26] <paladox>	 No it doesn't
[22:10:06] <paladox>	 Zppix ^^
[22:10:29] <Zppix>	 With custom params it is
[22:11:10] <paladox>	 No it's not. systemd anyways is from another distro
[22:11:15] * paladox forgot which one
[22:11:23] <Zppix>	 Linux?
[22:11:37] <paladox>	 That's not a distro
[22:11:38] <paladox>	 that's an os
[22:11:46] <Zppix>	 Derp i knew that
[22:11:50] <paladox>	 lol
[22:11:55] <Zppix>	 Ignore my brain dying
[22:11:58] <paladox>	 ah, it's centos
[22:12:02] <paladox>	 or red hat
[22:12:13] <Zppix>	 ^ that's what i meant
[22:12:59] <paladox>	 Oh
[22:13:11] <paladox>	 mac has a form of systemd
[22:13:18] <paladox>	 since they name apache apachectl
[22:13:31] <RainbowSprinkles>	 launchd/launchctl isn't quite like systemd
[22:13:33] <Zppix>	 paladox: are you able to merge in quarrybot repo yet?
[22:13:37] <paladox>	 but mysql is mysql.start, so it's hard to know what command to use
[22:13:38] <paladox>	 no
[22:13:41] <Zppix>	 Ok
[22:13:48] <RainbowSprinkles>	 But yeah, that's OSX's init/launcher/etc type thing
[22:13:49] <RainbowSprinkles>	 :)
[22:13:54] <Zppix>	 RainbowSprinkles: do you have admin perms on gerrit?
[22:14:02] <RainbowSprinkles>	 I have god powers on gerrit
[22:14:05] <RainbowSprinkles>	 I am gerrit
[22:14:17] <RainbowSprinkles>	 Gerrit and I merged into a singularity
[22:14:21] <Zppix>	 Can you add paladox to the gerrit group quarrybot-enwiki
[22:14:25] <Zppix>	 RainbowSprinkles
[22:14:30] <paladox>	 lol
[22:14:59] <paladox>	 RainbowSprinkles it tells me to visit an actual localhost page for the status
[22:15:11] <paladox>	 it's like not available for me to view.
[22:15:14] <RainbowSprinkles>	 Zppix: Done. And also I made the group self-managing so you can DIY next time
[22:15:29] <Zppix>	 Thank god
[22:17:00] <wikibugs_>	 10Gerrit, 10QuarryBot-enwiki: Add Paladox to  Group labs-tools-quarrybot-enwiki - https://phabricator.wikimedia.org/T159049#3072214 (10Zppix) 05Open>03Resolved a:03demon
[22:17:25] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072217 (10Paladox)
[22:18:39] <Zppix>	 paladox: if you want to merge the change for the pywikibot addition you may, to be sure you can
[22:19:14] <paladox>	 Which repo is that?
[22:19:42] <paladox>	 https://gerrit.wikimedia.org/r/#/c/339829/
[22:19:46] <paladox>	 Zppix ^^ ?
[22:20:16] <Zppix>	 https://gerrit.wikimedia.org/r/339829
[22:21:08] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072219 (10Paladox) Per @ebernhardson   <ebernhardson> cluster is a start, the other big thing i would like to see there is some sort of semaphore that lim...
[22:22:16] <paladox>	 done
[22:22:19] <paladox>	 Zppix ^^
[22:22:23] <Zppix>	 Ok
[22:23:01] <paladox>	 RainbowSprinkles thanks for adding me to the group :)
[22:23:06] <RainbowSprinkles>	 yw
[22:23:31] <Zppix>	 I think a self-managed group needs to be the default when requested by non-gerrit admins
[22:23:46] <paladox>	 It is, i think.
[22:24:04] <Zppix>	 No, or i would have been able to add you myself
[22:24:17] <paladox>	 oh
[22:26:03] <RainbowSprinkles>	 Zppix: Complain to person who makes group ;-)
[22:26:23] <Zppix>	 Well i mean it should be *policy*
[22:26:28] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072224 (10EBernhardson) The single query wont do it, but someone mischievous could cause hundreds of long running queries to be running in parallel, which...
[22:26:58] <RainbowSprinkles>	 Default owner isn't actually admins, it's Project-And-Group-Creators
[22:27:03] <RainbowSprinkles>	 Which is Admins + a few other peeps
[22:27:07] * RainbowSprinkles shrugs
[22:27:17] <RainbowSprinkles>	 I wish you could configure it to "Default to self-owning"
[22:27:22] <Zppix>	 ^
[22:27:27] <Zppix>	 That's what i mean
[22:27:44] <RainbowSprinkles>	 I know. 
[22:27:53] <RainbowSprinkles>	 I can't automate it though :(
[22:27:55] <RainbowSprinkles>	 That's my beef
[22:27:55] <paladox>	 RainbowSprinkles i swear, there's a self service plugin
[22:28:03] <paladox>	 i saw a plugin
[22:28:11] <paladox>	 i forgot the name though
[22:28:13] <paladox>	 openstack uses
[22:28:14] <paladox>	 it
[22:28:14] <RainbowSprinkles>	 Well, we're going to be using Phabricator any day now, right? So why bother :p
[22:28:29] <paladox>	 I thought that's on pause?
[22:29:28] <Zppix>	 NO i love gerrit
[22:29:49] <RainbowSprinkles>	 Nobody loves gerrit. They just deal with it :p
[22:30:01] <Zppix>	 I do
[22:30:02] <paladox>	 lol
[22:30:18] <paladox>	 RainbowSprinkles it has better dev tools than phabricator, whereas phabricator has the good looks
[22:30:35] <mutante>	 i love gerrit
[22:30:49] <RainbowSprinkles>	 mutante: shush you
[22:31:17] <paladox>	 I would love to see all of gerrit's features ported to phabricator before we fully migrate to it :)
[22:31:25] <greg-g>	 no.
[22:31:36] <paladox>	 I like the inline editing especially
[22:31:45] <Zppix>	 I WILL cry if we go phab
[22:31:53] <greg-g>	 I'm sick and tired of that meme. Different tools have different use-cases and don't need all of the features of every other tool.
[22:32:10] <greg-g>	 Zppix: many cry every day with gerrit
[22:32:20] <Zppix>	 Gerrit so simple to use
[22:32:59] <paladox>	 Phabricator for one has no easy access for users to edit a patch through the browser
[22:33:05] <Zppix>	 Imagine the config changes releng would have to do just on jenkins alone
[22:33:06] <paladox>	 especially useful for mobile users like me
[22:33:15] <Zppix>	 And me
[22:33:50] <mutante>	 turns paladox into an android user and makes him enhance mgerrit https://play.google.com/store/apps/details?id=com.jbirdvegas.mgerrit&hl=en
[22:33:54] <greg-g>	 Zppix: we're completely redoing the ci pipeline in the next year or so, so it doesn't actually matter
[22:33:58] <paladox>	 lol, /me hates android
[22:34:06] <paladox>	 I'm an ios user
[22:34:08] <Zppix>	 Iphone all the way
[22:34:15] <Zppix>	 greg-g:
[22:34:18] <Zppix>	 :(
[22:34:59] <greg-g>	 no, actually that's a :) because it'll be a modern setup that gives us what we need.
[22:35:13] <paladox>	 But it takes away inline editing
[22:35:15] <paladox>	 plain git
[22:35:25] <paladox>	 forcing us to use arcanist
[22:35:31] <greg-g>	 #1: not sure, #2: untrue
[22:35:42] <mutante>	 Zppix, but  ... https://www.gnu.org/proprietary/malware-apple.en.html :)
[22:36:00] <paladox>	 using arcanist is true.
[22:36:03] <Zppix>	 You have to try to infect an iphone
[22:36:23] <greg-g>	 look, I'm going to put the brakes on this conversation topic right now, it's not useful as not everyone is well informed and it's not happening in the short term so there's no need to debate things people aren't sure of the details about.
[22:36:55] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072256 (10EBernhardson) Additionally to debug this, i need to see what query phabricator is issuing to elasticsearch. It would be convenient to have some...
[22:44:03] <wikibugs_>	 (03PS1) 10Harej: Adding CollaborationKit to doc.wikimedia.org portal [integration/docroot] - 10https://gerrit.wikimedia.org/r/341098
[22:44:42] <Zppix>	 greg-g: ack
[23:00:09] <wikibugs_>	 06Release-Engineering-Team, 06Labs, 06Operations, 07Nodepool: Investigate why nodepool keeps leaking instances and why it stops for no reason sometimes - https://phabricator.wikimedia.org/T159543#3072330 (10hashar) 05Open>03Resolved Nova / OpenStack recovered. Thus instances managed to get deleted and...
[23:15:43] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3071951 (10demon) Here's the query:  ``` lang=json  {     "_source": false,     "query": {         "bool": {             "must": [                 {...
[23:16:59] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072378 (10Paladox) I was told that ^^ is loading 10,000 results.
[23:27:41] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072387 (10demon) So, to summarize our IRC discussion: # The query itself is fine -- even with 10k results requested Elasticsearch can respond plenty quick...
[23:31:01] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072395 (10EBernhardson) fwiw i also think the query building might not be doing everything it should, this query is asking for items tagged flow that cont...
[23:33:37] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072396 (10demon) >>! In T159575#3072395, @EBernhardson wrote: > fwiw i also think the query building might not be doing everything it should, this query i...
[23:34:09] <Zppix>	 ^ RainbowSprinkles could you subscribe me to that
[23:34:22] <paladox>	 Zppix can't you subscribe yourself?
[23:34:29] <paladox>	 By clicking the subscribe button?
[23:34:48] <Zppix>	 Can't i just be lazy?
[23:35:04] <paladox>	 lol
[23:35:05] <RainbowSprinkles>	 You've  spent more time asking us than it would take to DIY
[23:35:12] <RainbowSprinkles>	 So, no, nobody will subscribe you :p
[23:36:06] <Zppix>	 Dang
[23:36:09] * Zppix mumbles
[23:37:48] <Zppix>	 Phab supports html tags right?
[23:39:40] <wikibugs_>	 06Release-Engineering-Team, 07Performance, 10Phabricator (Search): Extremely slow Phabricator search (1m 45s) - https://phabricator.wikimedia.org/T159575#3072404 (10Zppix) Also one fix could be to set the query limit <b> PER </b> page to something like 50.
[23:40:00] <RainbowSprinkles>	 Of course not
[23:41:06] <Zppix>	 How the h3ll do i bold
[23:41:28] <greg-g>	 click the B button
[23:41:37] <greg-g>	 or **whatever**
[23:41:37] <RainbowSprinkles>	 **this is bold text**
[23:41:57] <RainbowSprinkles>	 And //this would be italics//
[23:42:07] <greg-g>	 here ya go: https://secure.phabricator.com/book/phabricator/article/remarkup/
[23:42:11] <RainbowSprinkles>	 Markdown: the spec that everyone makes their own version of!
[23:42:45] <Zppix>	 Now that i look like a dumb dev that can't do simple text formatting, i got it lol
[23:45:27] <greg-g>	 Zppix: just don't be the dumb user who doesn't know they can edit their own comments
[23:45:30] <greg-g>	 :)
[23:45:46] <Zppix>	 Well i knew that
[23:54:35] <wikibugs_>	 (03CR) 10Legoktm: [C: 032] Adding CollaborationKit to doc.wikimedia.org portal [integration/docroot] - 10https://gerrit.wikimedia.org/r/341098 (owner: 10Harej)