[00:05:33] <paladox>	 no_justification done :)
[00:07:35] * paladox has been watching a lot of Queen Victoria (ITV) over the weekend heh :)
[00:11:05] <no_justification>	 414598 works (if you add some src/*, otherwise bazel gives up)
[00:11:20] <paladox>	 thanks :)
[00:14:48] * legoktm huggles no_justification and paladox
[00:15:04] <paladox>	 thanks, my thanks to no_justification too :)
[00:17:24] <wikibugs>	 Gerrit, Developer-Relations: Create a self-service portal for trusted users to easily create new Gerrit repos - https://phabricator.wikimedia.org/T188196#4000225 (Legoktm) Sounds awesome!  Would it be possible to extend the plugin thing to run bootstrap scripts like cookiecutter (https://gerrit.wikimedia...
[00:19:41] <no_justification>	 https://gerrit.wikimedia.org/r/#/c/414599/
[00:19:55] <paladox>	 no_justification should the mac file be added to it too?
[00:20:02] <paladox>	 the one the file viewer produces?
[00:20:22] <paladox>	 .DS_Store
[00:20:37] <paladox>	 I always seem to have the file added whenever I do things on the mac.
[00:21:23] <no_justification>	 I have that in my local gitignore globally :p
[00:21:25] <no_justification>	 But doesn't hurt
[00:24:46] <paladox>	 no_justification thanks
[00:30:13] <paladox>	 no_justification i guess i should merge it as is?
[00:30:21] <no_justification>	 Added .DS_Store now
[00:30:27] <paladox>	 ah
[00:30:28] <paladox>	 thanks
[00:30:35] <paladox>	 merging
[00:30:38] <no_justification>	 .DS_Store is the modern version of thumbs.db
[00:30:43] <no_justification>	 Useless f'ing metadata files
[00:30:52] <paladox>	 no_justification hmm i don't seem to have the submit button on https://gerrit.wikimedia.org/r/#/c/414599/
[00:31:16] <paladox>	 no_justification yeh, they should get rid of them, or at least they should hide it from everyone.
[00:34:19] <paladox>	 no_justification i wonder if adding submit for that gerrit group (on https://gerrit.wikimedia.org/r/#/admin/projects/operations/software/gerrit/plugins/wikimedia,access) would fix that?
[00:36:00] <no_justification>	 Yeah, do the acl change
[00:36:18] <paladox>	 ok
[00:36:19] <paladox>	 thanks
[00:36:40] <paladox>	 done
[00:36:54] <paladox>	 https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/gerrit/plugins/wikimedia/+/4b537f2539654bf96bfd16aaa2f4ce290aaa8acc
[00:57:50] <no_justification>	 https://gerrit.wikimedia.org/r/#/c/414604/
[00:58:08] <paladox>	 thanks.
[00:58:14] <paladox>	 merging
[01:06:27] <no_justification>	 Ok, fleshed out most of the basics: https://gerrit.wikimedia.org/r/#/c/414605/
[01:06:30] * no_justification goes to take a break
[01:07:57] <paladox>	 no_justification  thanks :). 
[01:08:30] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[01:09:40] * paladox goes too.
[01:10:23] <paladox>	 no_justification i can merge https://gerrit.wikimedia.org/r/#/c/414605/ if you want? (will wait for your say so :))
[01:11:07] <no_justification>	 If ya test it first :)
[01:11:12] <no_justification>	 I got it to compile tho!
[01:11:17] <paladox>	 Ok.
[01:12:42] <paladox>	 no_justification it builds
[01:12:43] <paladox>	 bazel-genfiles/wikimedia.jar
[01:15:02] <no_justification>	 legoktm: only downside with cookie cutter is we can't shell out very easily
[01:15:15] <no_justification>	 So using `cookiecutter`
[01:15:20] <no_justification>	 Itself might be problematic 
[01:15:22] <legoktm>	 hm
[01:15:35] <no_justification>	 But the functionality of bootstrapping a repo is doable
[01:15:37] <legoktm>	 but the plugin will be able to create a repo with the .gitreview file?
[01:15:46] <no_justification>	 Ya
[01:15:49] <legoktm>	 ok
[01:15:55] <no_justification>	 My idea is to have it use a skeleton
[01:16:12] <paladox>	 no_justification not sure if it's a master problem, but i get
[01:16:13] <paladox>	 Caused by: java.lang.ClassNotFoundException: com.googlesource.gerrit.plugins.Module
[01:16:33] <no_justification>	 Sounds like a class path bug 
[01:16:40] <paladox>	 oh
[01:16:46] <no_justification>	 Is that when installing the plugin?
[01:16:53] <no_justification>	 Because yeah might be master issue
[01:17:13] <no_justification>	 Our bazelets targets 2.14
[01:17:35] <paladox>	 no_justification yep when installing as a plugin
[01:17:43] <paladox>	 ah
[01:17:49] <paladox>	 let me build with master :)
[01:17:58] * paladox copies folder
[01:18:01] <no_justification>	 Yeah, just change the sha1 of bazelets dep
[01:20:16] <paladox>	 hmm still says
[01:20:16] <paladox>	 Caused by: java.lang.ClassNotFoundException: com.googlesource.gerrit.plugins.Module
[01:20:34] <legoktm>	 no_justification: sounds good to me, I'm not tied to cookiecutter as a tool
[01:21:09] <paladox>	 no_justification ah
[01:21:12] <paladox>	 found the error
[01:21:26] <paladox>	 no_justification see https://gerrit.wikimedia.org/r/#/c/414605/1/BUILD@8
[01:24:48] <paladox>	 no_justification fixed it, left a comment on ^^ which fixes the problem.
[01:29:23] <paladox>	 no_justification it worked
[01:29:26] <paladox>	 "[2018-02-26 01:27:24,197] [HTTP-91] INFO  com.googlesource.gerrit.plugins.wikimedia.github.ProjectCreatedListener : New project: 'testtttttt', Parent: 'All-Projects'"
[01:29:47] <paladox>	 (i set the log level to debug, to confirm it worked).
[01:30:37] <no_justification>	 :)
[01:30:40] <no_justification>	 Yay!
[01:30:52] <paladox>	 :)
[01:31:33] <paladox>	 no_justification thanks for the fix, want me to merge?
[01:31:39] <no_justification>	 Go for it
[01:31:50] <paladox>	 ok thanks :)
[01:31:59] <paladox>	 done
[01:48:30] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:52:43] <no_justification>	 Are GerritUiExtensionPoint.* hooks deprecated in Polygerrit or were they ported over?
[01:55:24] <paladox>	 no_justification they are porting them over (but marked as deprecated), i think
[01:55:46] <paladox>	 no_justification are you making the plugin web ui accessible? i.e. so you can do it through the ui?
[01:56:15] <paladox>	 I could help do that. Port over from gwtui to polygerrit. It can support both at the same time.
[01:58:08] <paladox>	 no_justification if by GerritUiExtensionPoint* you mean, like .on and more then yes
[01:58:09] <no_justification>	 Just poking at stuff
[01:58:10] <no_justification>	 :)
[01:58:35] <paladox>	 no_justification there's docs too :)
[01:58:38] <paladox>	 for pg plugins
[01:58:39] <paladox>	 https://gerrit.googlesource.com/gerrit/+/master/Documentation/pg-plugin-dev.txt
[01:58:46] <paladox>	 https://gerrit.googlesource.com/gerrit/+/master/Documentation/pg-plugin-endpoints.txt
[01:58:47] <no_justification>	 Yeah
[01:58:53] <paladox>	 https://gerrit.googlesource.com/gerrit/+/master/Documentation/pg-plugin-migration.txt
[01:59:03] <paladox>	 :)
[02:04:08] <paladox>	 no_justification i've added support for pg in delete-project here https://gerrit-review.googlesource.com/c/plugins/delete-project/+/140591
[02:11:08] * paladox is going to go for the night - 02:11am.
[02:11:13] <paladox>	 And snow on the way heh
[02:12:37] <paladox>	 https://news.sky.com/story/live-britain-braces-for-beastly-freeze-11268196
[02:22:09] <no_justification>	 legoktm: I thought you'd like this one https://gerrit.wikimedia.org/r/c/414607/
[02:22:27] <legoktm>	 today must be christmas
[02:23:32] <no_justification>	 All of this stuff is WIP/untested, but with the bootstrapping work done it'll be *easy* for us to add stuff we need
[02:23:45] <legoktm>	 no_justification: can the plugin make external requests to get the contents of wikiversions.json?
[02:23:47] <no_justification>	 There's a lot of workflow/automation stuff we've put off for too long
[02:23:52] <no_justification>	 Yes!
[02:23:54] <no_justification>	 It could do that
[02:24:04] <legoktm>	 https://noc.wikimedia.org/conf/wikiversions.json
[02:24:34] <no_justification>	 It could also get the file from the repo on disk :p
[02:24:43] <no_justification>	 But that's probably expensive hah
[02:25:08] <legoktm>	 well noc is what's actually deployed, though they should be the same most of the time
[02:25:23] <no_justification>	 True true
[02:26:05] <no_justification>	 We can fetch it and cache locally for X time too
[02:26:11] <no_justification>	 The data doesn't /have/ to be perfect
[02:26:21] * legoktm nods
[02:26:21] <no_justification>	 (and we don't want every change page hitting noc, heh)
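The "fetch it and cache locally for X time" idea above can be sketched as a small time-to-live cache. This is only an illustration, not the plugin's code: the class name TtlCache, the injected loader, and the injected clock are all hypothetical, and the real fetch of wikiversions.json is replaced by a stub.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical sketch: caches one loaded value and reloads it once the TTL has elapsed. */
public class TtlCache<T> {
    private final Supplier<T> loader;       // assumption: performs the expensive fetch (e.g. an HTTP GET)
    private final Duration ttl;             // the "X time" from the discussion
    private final Supplier<Instant> clock;  // injected so the expiry logic can be tested
    private T value;
    private Instant loadedAt;

    public TtlCache(Supplier<T> loader, Duration ttl, Supplier<Instant> clock) {
        this.loader = loader;
        this.ttl = ttl;
        this.clock = clock;
    }

    /** Returns the cached value, calling the loader only on first use or after expiry. */
    public synchronized T get() {
        Instant now = clock.get();
        if (loadedAt == null || Duration.between(loadedAt, now).compareTo(ttl) >= 0) {
            value = loader.get();
            loadedAt = now;
        }
        return value;
    }

    public static void main(String[] args) {
        // Stub loader standing in for a real fetch; the JSON content is made up.
        TtlCache<String> cache =
            new TtlCache<>(() -> "{\"stub\": true}", Duration.ofMinutes(5), Instant::now);
        System.out.println(cache.get()); // first call loads
        System.out.println(cache.get()); // second call is served from the cache
    }
}
```

With this shape, slightly stale data is accepted in exchange for not hitting the remote host on every page view, which matches the trade-off described in the conversation.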
[02:27:06] <no_justification>	 Also!
[02:27:07] <no_justification>	 branches.removeIf(branch -> !(branch.equals("master") || branch.startsWith("wmf/")));
[02:27:14] <no_justification>	 ^ First time using java8 predicates :)
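For readers unfamiliar with Collection.removeIf, here is a minimal, self-contained sketch of the predicate quoted above. The class name BranchFilter, the helper method, and the sample branch names are assumptions made for illustration; only the removeIf line comes from the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BranchFilter {
    /** Returns a copy of the input that keeps only "master" and branches starting with "wmf/". */
    static List<String> keepDeployBranches(List<String> input) {
        List<String> branches = new ArrayList<>(input); // copy, so the caller's list is not modified
        branches.removeIf(branch -> !(branch.equals("master") || branch.startsWith("wmf/")));
        return branches;
    }

    public static void main(String[] args) {
        // Sample branch names are made up for the demonstration.
        List<String> sample = Arrays.asList("master", "wmf/1.31.0-wmf.20", "REL1_30", "wmf/1.31.0-wmf.22");
        System.out.println(keepDeployBranches(sample));
    }
}
```

Note that removeIf mutates the list in place and throws UnsupportedOperationException on fixed-size or immutable lists, which is why the sketch copies into an ArrayList first.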
[02:27:21] <paladox>	 I’ve only just started implementing Included In @ pg :)
[02:27:38] <no_justification>	 paladox...go to bed :P
[02:27:39] <no_justification>	 haha
[02:27:45] <paladox>	 Heh yes :)
[02:27:51] <paladox>	 Thanks
[04:50:30] <shinken-wm>	 PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[04:57:39] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<33.33%)
[05:09:32] <wmf-insecte>	 Yippee, build fixed!
[05:09:32] <wmf-insecte>	 Project mediawiki-core-code-coverage build #3349: FIXED in 2 hr 9 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3349/
[05:20:31] <shinken-wm>	 PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[07:07:37] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:56:19] <wikibugs>	 Phabricator (2018-02-15), Upstream: Click to select arbitrary tasks across workboard columns for batch edit - https://phabricator.wikimedia.org/T129528#4000423 (Aklapper) Open>Resolved No reply - assuming yes.
[08:13:21] <wikibugs>	 (CR) Hashar: [C: 2] Add unit tests for CloneDiff extension [integration/config] - https://gerrit.wikimedia.org/r/414381 (owner: Umherirrender)
[08:14:45] <wikibugs>	 (Merged) jenkins-bot: Add unit tests for CloneDiff extension [integration/config] - https://gerrit.wikimedia.org/r/414381 (owner: Umherirrender)
[08:18:45] <wikibugs>	 (PS1) Hashar: Make jobs voting for CloneDiff extension [integration/config] - https://gerrit.wikimedia.org/r/414626
[08:19:06] <wikibugs>	 (CR) Hashar: [C: 2] Make jobs voting for CloneDiff extension [integration/config] - https://gerrit.wikimedia.org/r/414626 (owner: Hashar)
[08:20:30] <wikibugs>	 (Merged) jenkins-bot: Make jobs voting for CloneDiff extension [integration/config] - https://gerrit.wikimedia.org/r/414626 (owner: Hashar)
[08:49:10] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:52:42] <wikibugs>	 Phabricator: Delete unused and empty workboards database-side - https://phabricator.wikimedia.org/T105865#4000475 (Aklapper) Open>declined Again, anyone could disable unused workboards in their projects. This does not require any Phab admin time.
[08:52:59] <wikibugs>	 (CR) Hashar: "Thanks for the review! Lets indeed switch to php7" (3 comments) [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[08:53:12] <wikibugs>	 (PS3) Hashar: phpunit-coverage-publish job on Docker [integration/config] - https://gerrit.wikimedia.org/r/413737
[08:54:59] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[09:12:37] <wikibugs>	 (PS4) Hashar: phpunit-coverage-publish job on Docker [integration/config] - https://gerrit.wikimedia.org/r/413737
[09:13:29] <wikibugs>	 (CR) Hashar: "cleaned up a few mistakes. Going to replace the old jobs in next patchset and update Zuul." [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[09:14:18] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[09:19:05] <wikibugs>	 (PS5) Hashar: phpunit-coverage-publish job on Docker [integration/config] - https://gerrit.wikimedia.org/r/413737
[09:32:49] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:33:00] <wikibugs>	 (CR) Hashar: [C: 2] "Replaced all occurences phpunit-coverage-publish" [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[09:35:53] <wikibugs>	 (CR) Hashar: [C: 2] "And I have saved the cover directory to /srv/org/wikimedia/doc/cover20182601-0934.tar.gz" [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[09:36:17] <wikibugs>	 (Merged) jenkins-bot: phpunit-coverage-publish job on Docker [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[09:38:35] <wikibugs>	 (CR) Hashar: [C: 2] "Tested on the last cdb.git merged change:" [integration/config] - https://gerrit.wikimedia.org/r/413737 (owner: Hashar)
[09:39:14] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:39:21] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#3986123 (hashar)
[10:09:19] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:12:50] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:19:13] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:32:17] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4000645 (hashar)
[10:40:30] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, User-zeljkofilipin, Wikimedia-Incident: Create selenium-core-jessie daily Jenkins job - https://phabricator.wikimedia.org/T185011#4000693 (zeljkofilipin) I have just talked with @hashar about this. Another option would be to log in using th...
[11:19:24] <wikibugs>	 Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, Operations, HHVM: Readd complete URL parsing fix from 3.18.7 release - https://phabricator.wikimedia.org/T185024#4000766 (MoritzMuehlenhoff) p:Unbreak!>Normal
[11:32:02] <wikibugs>	 Phabricator, Release-Engineering-Team (Kanban), media-storage: Connect Phabricator to swift for storage of git-lfs and file uploads. - https://phabricator.wikimedia.org/T182085#4000805 (mmodell) a:mmodell
[11:44:16] <wikibugs>	 (CR) Hashar: [C: 2] Add a tox env to run JJB [integration/config] - https://gerrit.wikimedia.org/r/413402 (owner: Hashar)
[11:45:43] <wikibugs>	 (Merged) jenkins-bot: Add a tox env to run JJB [integration/config] - https://gerrit.wikimedia.org/r/413402 (owner: Hashar)
[11:55:19] <wikibugs>	 Phabricator, Analytics-Tech-community-metrics, Bugzilla-Migration, DevRel-November-2015: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#4000910 (Aklapper) For the records, the [[ https://phabricator.wikimedia.org/phame/post/view/85/phabricat...
[12:52:10] <wikibugs>	 (PS5) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[12:53:09] <wikibugs>	 (PS6) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[13:06:51] <dcausse>	 hi, what's the equivalent of terbium in deployment-prep (I mean the machine that has mwscript crontab)
[13:36:29] <Reedy>	 there isn't one
[13:36:47] <Reedy>	 dcausse: https://phabricator.wikimedia.org/T187826
[13:37:37] <wikibugs>	 (PS7) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[13:42:29] <dcausse>	 Reedy: thanks! yes just saw that
[13:45:35] <wikibugs>	 (PS8) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[13:48:32] <awight>	 I’m trying to construct a URL which will always point to the raw, master version of a JSON schema.  Anyone know how that might be done in Diffusion?  For example, https://phabricator.wikimedia.org/diffusion/EJAD/browse/master/jsonschema/scoring/damaging/v1.json
[13:49:23] <awight>	 That URL doesn’t quite work because it points to a fancy rendering which isn’t machine-readable.  But the “raw file” link has a URL which includes revision hashes..
[14:01:31] <wikibugs>	 Phabricator, Release-Engineering-Team (Kanban), Operations, Patch-For-Review, and 2 others: Apache on phab1001 is gradually leaking worker processes which are stuck in "Gracefully finishing" state - https://phabricator.wikimedia.org/T182832#4001208 (MoritzMuehlenhoff) >>! In T182832#3982284, @elu...
[14:07:01] <wikibugs>	 (PS9) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[14:12:21] <wikibugs>	 (PS10) Hashar: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404
[14:12:54] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001233 (hashar)
[14:15:10] <wikibugs>	 (CR) Hashar: [C: 2] Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404 (owner: Hashar)
[14:16:33] <wikibugs>	 (Merged) jenkins-bot: Experimental integration-jjb-config-diff on Docker [integration/config] - https://gerrit.wikimedia.org/r/413404 (owner: Hashar)
[14:18:25] <wikibugs>	 (CR) Hashar: [C: 2] "recheck" [integration/config] - https://gerrit.wikimedia.org/r/413404 (owner: Hashar)
[14:19:55] <wikibugs>	 (PS1) Hashar: Run integration-jjb-config-diff-docker solely in test pipeline [integration/config] - https://gerrit.wikimedia.org/r/414673
[14:20:30] <wikibugs>	 (CR) Hashar: [C: 2] "recheck" [integration/config] - https://gerrit.wikimedia.org/r/413404 (owner: Hashar)
[14:40:17] <wikibugs>	 (PS1) Hashar: Promote tests for 3d2png [integration/config] - https://gerrit.wikimedia.org/r/414678
[14:40:33] <wikibugs>	 (CR) Hashar: [C: 2] ":)" [integration/config] - https://gerrit.wikimedia.org/r/414678 (owner: Hashar)
[14:41:44] <wikibugs>	 (Merged) jenkins-bot: Promote tests for 3d2png [integration/config] - https://gerrit.wikimedia.org/r/414678 (owner: Hashar)
[14:43:24] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3: Migrate PHPUnit Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4001370 (hashar)
[14:43:26] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001371 (hashar)
[14:43:51] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3: Migrate PHPUnit Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#3855599 (hashar) I am migrating the long tail first: T187797
[15:00:24] <shinken-wm>	 PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<11.11%)
[15:01:39] <shinken-wm>	 PROBLEM - Free space - all mounts on integration-slave-jessie-1002 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1002.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1002.diskspace._srv.byte_percentfree (<40.00%)
[15:02:14] <wmf-insecte>	 Project mediawiki-core-code-coverage-php7 build #111: FAILURE in 0.85 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage-php7/111/
[15:04:22] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001457 (hashar)
[15:10:59] <Zoranzoki21>	 Hi
[15:11:04] <Zoranzoki21>	 What's happening with CI?
[15:11:05] <Zoranzoki21>	 https://integration.wikimedia.org/ci/job/operations-mw-config-php55lint/19054/console
[15:13:41] <hashar>	 oahar stupid job
[15:14:15] <Zoranzoki21>	 I wanted to rebase https://integration.wikimedia.org/ci/job/operations-mw-config-php55lint/19054/console and add in deployment
[15:14:23] <Zoranzoki21>	 and CI as always
[15:14:25] <Zoranzoki21>	 failed
[15:14:28] <hashar>	 Zoranzoki21: the job is broken sometimes
[15:14:29] <Zoranzoki21>	 -1 jenkins bot
[15:21:31] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Pywikibot-core: Migrate pywikibot-tests-beta-cluster to a tox env in pywikibot/core - https://phabricator.wikimedia.org/T188256#4001495 (hashar)
[15:21:41] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001509 (hashar)
[15:22:10] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Pywikibot-core: Migrate pywikibot-tests-beta-cluster to a tox env in pywikibot/core - https://phabricator.wikimedia.org/T188256#4001495 (hashar)
[15:27:27] <wikibugs>	 Release-Engineering-Team (Kanban), Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Browser-Tests, and 2 others: Write browser tests for DonationInterface - https://phabricator.wikimedia.org/T99955#4001530 (zeljkofilipin) I have provisioned fundraising role, but looks like something i...
[15:27:43] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:29:38] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001537 (hashar)
[15:29:41] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Pywikibot-core: Migrate pywikibot-tests-beta-cluster to a tox env in pywikibot/core - https://phabricator.wikimedia.org/T188256#4001535 (hashar) Open>declined I am going to delete it. The job never ran any test.
[15:30:40] <wikibugs>	 (PS1) Hashar: Delete pywikibot-tests-beta-cluster [integration/config] - https://gerrit.wikimedia.org/r/414688 (https://phabricator.wikimedia.org/T188256)
[15:31:03] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#3986123 (hashar)
[15:32:53] <wikibugs>	 Release-Engineering-Team (Kanban), Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Browser-Tests, and 2 others: Write browser tests for DonationInterface - https://phabricator.wikimedia.org/T99955#4001544 (zeljkofilipin) Ah, looks like that is what is supposed to happen. :) I can log...
[15:33:42] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:52:14] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1002 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:55:32] <wikibugs>	 (CR) Hashar: [C: 2] Delete pywikibot-tests-beta-cluster [integration/config] - https://gerrit.wikimedia.org/r/414688 (https://phabricator.wikimedia.org/T188256) (owner: Hashar)
[15:56:55] <wikibugs>	 (Merged) jenkins-bot: Delete pywikibot-tests-beta-cluster [integration/config] - https://gerrit.wikimedia.org/r/414688 (https://phabricator.wikimedia.org/T188256) (owner: Hashar)
[15:57:09] <paladox>	 no_justification hi, i will add build documentation for the plugin :)
[16:00:52] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban): Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001685 (hashar)
[16:01:11] <paladox>	 no_justification https://gerrit.wikimedia.org/r/#/c/414698/
[16:01:39] <shinken-wm>	 RECOVERY - Free space - all mounts on integration-slave-jessie-1002 is OK: OK: integration.integration-slave-jessie-1002.diskspace._mnt.byte_percentfree (No valid datapoints found)
[16:08:00] <wikibugs>	 (PS1) Hashar: Delete wikimedia-fundraising-civicrm-jessie [integration/config] - https://gerrit.wikimedia.org/r/414703 (https://phabricator.wikimedia.org/T187797)
[16:08:43] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4001791 (hashar)
[16:09:49] <wikibugs>	 Continuous-Integration-Config, Wikidata, Patch-For-Review, User-Addshore: Only run npm job on Jenkins for builds of data-values/value-view - https://phabricator.wikimedia.org/T178083#4001799 (WMDE-leszek) Open>Resolved a:WMDE-leszek
[16:09:53] <wikibugs>	 Release-Engineering-Team, Wikidata, Epic, Patch-For-Review, User-Addshore: [Epic] Kill the Wikidata build step - https://phabricator.wikimedia.org/T173818#4001802 (WMDE-leszek)
[16:32:11] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:56:30] <shinken-wm>	 RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[17:13:59] <robh>	 What is the process of re-enabling a disabled phab account?
[17:14:09] <robh>	 Erik Z is asking since he wants to reactivate his stuff
[17:14:12] <robh>	 phab is first step.
[17:14:24] <robh>	 (is it something where we can enable it, and he can do a password reset?)
[17:15:00] <wikibugs>	 Phabricator, Release-Engineering-Team (Kanban), Operations, Patch-For-Review, and 2 others: Apache on phab1001 is gradually leaking worker processes which are stuck in "Gracefully finishing" state - https://phabricator.wikimedia.org/T182832#4002197 (mmodell) Thanks @MoritzMuehlenhoff! Please let...
[17:15:26] <shinken-wm>	 PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<55.56%)
[17:18:19] <paladox>	 no_justification i can merge it if you want? Or you can? :)
[17:19:07] <paladox>	 no_justification https://gerrit-review.googlesource.com/c/gerrit/+/162450
[17:19:22] <wikibugs>	 Release-Engineering-Team, Scoring-platform-team: Investigate deployment concurrency limitations for ORES - https://phabricator.wikimedia.org/T188281#4002206 (awight)
[17:20:04] <wikibugs>	 Phabricator, Release-Engineering-Team (Kanban), media-storage: Connect Phabricator to swift for storage of git-lfs and file uploads. - https://phabricator.wikimedia.org/T182085#4002219 (mmodell) @halfak: I'm working on this.
[17:29:14] <wikibugs>	 Release-Engineering-Team (Next), Maps-Sprint, Repository-Admins, Maps (Tilerator): Setup diffusion and github sync for kartotherian and tilerator package repositories - https://phabricator.wikimedia.org/T182848#4002274 (greg)
[17:30:11] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Watching / External), Operations, Goal, and 2 others: Add Prometheus exporter to Jenkins instances - https://phabricator.wikimedia.org/T182759#4002278 (greg)
[17:30:20] <wikibugs>	 Gerrit, Release-Engineering-Team (Watching / External), Operations: Add prometheus exporter to Gerrit - https://phabricator.wikimedia.org/T184086#4002280 (greg)
[17:30:55] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team (Watching / External), Recommendation-API, Patch-For-Review, Scoring-platform-team (Current): What to do with deployment-sca03? - https://phabricator.wikimedia.org/T184501#4002283 (greg)
[17:31:16] <wikibugs>	 Release-Engineering-Team (Watching / External), MediaWiki-Core-Tests, MediaWiki-extensions-ORES, MW-1.31-release-notes (WMF-deploy-2018-02-06 (1.31.0-wmf.20)), and 2 others: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#4002285 (greg)
[17:32:33] <wikibugs>	 Phabricator, RelEng-Archive-FY201718-Q2: Phabricator search degraded in quality for almost any query - https://phabricator.wikimedia.org/T182088#4002331 (mmodell)
[17:32:38] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Wikidata Query UI: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4002314 (hashar)
[17:32:40] <wikibugs>	 Phabricator, Release-Engineering-Team (Kanban), monitoring, Browser-Tests, User-zeljkofilipin: Develop tests for phabricator search to detect regressions / search quality issues - https://phabricator.wikimedia.org/T182160#4002329 (mmodell) Open>stalled Still need to develop a few more...
[17:32:59] <wikibugs>	 Release-Engineering-Team (Kanban), Release Pipeline: Pipeline image build cleanup - https://phabricator.wikimedia.org/T177867#3673142 (thcipriani) a:thcipriani>None Not currently working on this, but may pick it up again in near future.
[17:34:31] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Wikidata Query UI: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4002314 (hashar) From some paste research, that seems to be an issue with the pl...
[17:38:21] <robh>	 hrmm... no answer so im going to start pinging folks cuz i dont wanna do this wrong
[17:38:38] <robh>	 twentyafterfour: whats the process for un-suspending a phabricator account?  Specifically Erik Zachte's?
[17:38:57] <robh>	 He had all his accounts suspended when he had two computer breakdowns
[17:39:23] <robh>	 greg-g: ^ perhaps you also know the process
[17:39:32] <robh>	 (if you guys are wrong people to ask sorry for the pings!)
[17:39:39] <greg-g>	 ah sure
[17:39:47] <greg-g>	 I didn't know it was suspended. task?
[17:39:51] <greg-g>	 or something?
[17:39:54] <robh>	 was via email it seems
[17:39:57] <robh>	 he is emailing me now as well
[17:40:08] <robh>	 so my plan was to figure out the process, then do a hangout with him to eliminate social engineering
[17:40:12] <robh>	 since i know his face and all =]
[17:40:26] <greg-g>	 doesn't look suspended: https://phabricator.wikimedia.org/people/manage/1177/
[17:41:43] <twentyafterfour>	 yeah, what greg said :-/  ... 
[17:42:16] <robh>	 hrmm, email sent to followup with him
[17:42:29] <robh>	 he basically blind emailed me last week since i was on clinic duty
[17:42:38] <wikibugs>	 10Phabricator, 10Release-Engineering-Team (Kanban), 10Phlogiston: Phlogiston reports don't have new data since mid-February - https://phabricator.wikimedia.org/T188149#4002399 (10JAufrecht) Have dug in deeper; pretty sure this is a change in the data files provided to Phlogiston, and not a code problem intro...
[17:54:13] <wikibugs>	 (03PS7) 10Hashar: Experimental docker job wikidata-query-rdf-maven [integration/config] - 10https://gerrit.wikimedia.org/r/410169 (https://phabricator.wikimedia.org/T188285)
[17:56:29] <wikibugs>	 (CR) jerkins-bot: [V: -1] Experimental docker job wikidata-query-rdf-maven [integration/config] - https://gerrit.wikimedia.org/r/410169 (https://phabricator.wikimedia.org/T188285) (owner: Hashar)
[17:57:35] <no_justification>	 eddiegp: I finished most of the site stats refreshes over the weekend
[17:57:39] <no_justification>	 Just 5 left to go!
[18:00:43] <madhuvishy>	 hey y'all, Keegan just pinged us about T188288, looks like a deployment-prep thing (managed through https://github.com/wikimedia/puppet/blob/production/hieradata/labs/deployment-prep/host/deployment-cache-text04.yaml maybe?) - wasn't sure what project to tag there, so poking here
[18:00:43] <stashbot>	 T188288: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288
[18:01:37] <wikibugs>	 (03PS8) 10Hashar: Experimental docker job wikidata-query-rdf-maven [integration/config] - 10https://gerrit.wikimedia.org/r/410169 (https://phabricator.wikimedia.org/T188285)
[18:01:53] <greg-g>	 madhuvishy: #beta-cluster-infra
[18:01:58] * greg-g does
[18:02:11] <wikibugs>	 10Beta-Cluster-Infrastructure: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288#4002497 (10greg)
[18:02:17] <paladox>	 thanks 
[18:02:41] <madhuvishy>	 greg-g: Thank you!
[18:04:37] <wikibugs>	 10Phabricator, 10Release-Engineering-Team (Kanban), 10Phlogiston: Phlogiston reports don't have new data since mid-February - https://phabricator.wikimedia.org/T188149#4002510 (10mmodell) @JAufrecht I think you are probably right, I'm looking at the schema changes and I will update the dump. Hopefully nothin...
[18:04:59] <wikibugs>	 10Beta-Cluster-Infrastructure: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288#4002472 (10greg) Looks like it wasn't added to the list of let's encrypt domains.
[18:06:17] <wikibugs>	 (CR) Hashar: [C: 2] Experimental docker job wikidata-query-rdf-maven [integration/config] - https://gerrit.wikimedia.org/r/410169 (https://phabricator.wikimedia.org/T188285) (owner: Hashar)
[18:07:53] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Wikidata Query UI, 10Patch-For-Review: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4002526 (10hashar)
[18:08:51] <wikibugs>	 (CR) Hashar: [C: 2] "Deployed" [integration/config] - https://gerrit.wikimedia.org/r/414678 (owner: Hashar)
[18:09:08] <hashar>	 !log Zuul: reloaded to apply https://gerrit.wikimedia.org/r/#/c/414678/  "Promote tests for 3d2png"
[18:09:09] <stashbot>	 hashar: Failed to log message to wiki. Somebody should check the error logs.
[18:09:23] <wikibugs>	 (03Merged) 10jenkins-bot: Experimental docker job wikidata-query-rdf-maven [integration/config] - 10https://gerrit.wikimedia.org/r/410169 (https://phabricator.wikimedia.org/T188285) (owner: 10Hashar)
[18:12:27] <wikibugs>	 10Continuous-Integration-Config: User commenting on a merge change that it had Cr+2 cause it to enter gate-and-submit again - https://phabricator.wikimedia.org/T188290#4002561 (10hashar)
[18:12:43] <hashar>	 !log reloading zuul for https://gerrit.wikimedia.org/r/#/c/410169/ "Experimental docker job wikidata-query-rdf-maven"
[18:12:45] <stashbot>	 hashar: Failed to log message to wiki. Somebody should check the error logs.
[18:14:20] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Wikidata Query UI, 10Patch-For-Review: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4002593 (10hashar) Testing out my theory on https://gerrit.w...
[18:17:50] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-greg: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288#4002620 (greg) p:Triage>High a:greg
[18:21:00] <wikibugs>	 10Continuous-Integration-Config, 10Release-Engineering-Team (Someday), 10MediaWiki-extensions-SendGrid: Extensions with PHP 5.6+ as requirements making Jenkins to fail on merge when CR+2 - https://phabricator.wikimedia.org/T185451#4002640 (10greg)
[18:22:28] <wikibugs>	 10Release-Engineering-Team (Next), 10Scap, 10Wikimedia-Incident: Scap sync-file: report the file on IRC/SAL on canary error rate failure - https://phabricator.wikimedia.org/T186064#4002650 (10greg)
[18:22:36] <wikibugs>	 10Release-Engineering-Team (Next), 10Scap, 10Wikimedia-Incident: Scap: on canary failure, report the list of failed hosts - https://phabricator.wikimedia.org/T186065#4002652 (10greg)
[18:22:38] <wikibugs>	 10Release-Engineering-Team (Next), 10Scap, 10Wikimedia-Incident: Scap sync-file: allow to sync multiple files in different directories - https://phabricator.wikimedia.org/T186067#4002654 (10greg)
[18:22:48] <wikibugs>	 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Next): Wikimedia Portals needs libpng-dev for npm-browser-node-6 tests - https://phabricator.wikimedia.org/T186117#4002658 (10greg)
[18:36:06] <wikibugs>	 (03PS2) 10Umherirrender: Add unit tests for BlueSpiceRSSFeeder [integration/config] - 10https://gerrit.wikimedia.org/r/414069
[18:39:13] <mutante>	 !log deployment-cache-text04 - manually creating Letsencrypt SSL cert for fr.wikipedia.beta.wmflabs.org (acme-setup -i "fr_wikipedia_beta_wmflabs_org" -s "fr.wikipedia.beta.wmflabs.org" --key-user root --key-group root), restarted nginx (T188288)
[18:39:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:39:20] <stashbot>	 T188288: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288
[18:39:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:39:29] <mutante>	 ..but .. it did not fix it yet, heh
[18:40:11] <mutante>	 i see, what i did is create a new cert for just the missing one
[18:40:20] <mutante>	 but usually they are all added to a single unified one
[18:47:42] <wikibugs>	 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-greg: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288#4002472 (10Dzahn) 13:39 < mutante> ..but .. it did not fix it yet, heh 13:40 < mutante> i see,...
[18:55:00] <awight>	 twentyafterfour: How do I tell which version of scap is deployed?
[18:55:21] <twentyafterfour>	 `scap version` ?
[18:55:32] <awight>	 lol thanks
[18:55:56] <twentyafterfour>	 awight:   '--version' was used for something else :-O
[18:56:07] <awight>	 no offense taken
[18:56:48] <awight>	 twentyafterfour: uh, current tag isn’t in git?
[18:57:37] <awight>	 Scap self-reports version 3.8.0, but the git tags stop at 3.7.6
[18:58:01] <twentyafterfour>	 awight: on beta? 
[18:58:05] <awight>	 yah
[18:58:34] <twentyafterfour>	 beta is one version ahead of what's been tagged. It should really be '3.8.0-dev'
[18:58:53] <awight>	 How would I tell whether a particular commit is included?
[18:59:37] <twentyafterfour>	 it should be running master, although we were intending to change that 
[18:59:57] <twentyafterfour>	 you can look at the scap package version... hang on
[19:00:01] <awight>	 hmm.  OK I’ll run with that, and confirm by looking at the code directly
[19:00:11] <awight>	 pkg version matches the self-reported version, happily.
[19:00:18] <twentyafterfour>	 oh hmm
[19:00:50] <twentyafterfour>	 apt show scap:
[19:00:56] <twentyafterfour>	 Version: 3.8.0-1~20180222003831.298
[19:01:45] <awight>	 I’m good for now, I found scap/script.py, which proves that 286585b112ae17b137aa85047fb732280c2540c9 is present.
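The package version quoted above can be pulled apart to compare it with the git tags. A minimal Python sketch, assuming the standard Debian `[epoch:]upstream[-revision]` layout; reading the `~20180222003831.298` suffix as a build timestamp plus build number is a guess from its shape, not something stated in the channel, and all function names are hypothetical.

```python
import re
from datetime import datetime


def split_debian_version(version):
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:  # no epoch present; Debian treats that as 0
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:  # native package, no Debian revision
        upstream, revision = rest, ""
    return epoch, upstream, revision


def guess_build_stamp(revision):
    """ASSUMPTION: a '~YYYYMMDDHHMMSS.N' suffix is build time plus build number."""
    match = re.search(r"~(\d{14})\.(\d+)$", revision)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S"), int(match.group(2))


def as_tuple(upstream):
    """Turn '3.8.0' into (3, 8, 0) so versions compare numerically."""
    return tuple(int(part) for part in upstream.split("."))


epoch, upstream, revision = split_debian_version("3.8.0-1~20180222003831.298")
print(upstream, revision)
print(guess_build_stamp(revision))
# Packaged upstream version compared with the last git tag mentioned (3.7.6).
print(as_tuple(upstream) > as_tuple("3.7.6"))
```

This only shows that the package is ahead of the last tag; confirming that a specific commit is present still needs a look at the code or the git history, as awight did.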
[19:08:42] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[19:09:45] <shinken-wm>	 PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[19:10:21] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[19:10:23] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[19:14:59] <wikibugs>	 10Release-Engineering-Team (Kanban): Investigate the Extension Submittal Review process - https://phabricator.wikimedia.org/T182731#3832907 (10Jrbranaa) Moved to blocked as I am waiting for response from Daniel on my request.
[19:19:21] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4002951 (10awight) I'm trying this out on beta, using target code https://gerrit.wikimedia.org/r/#/c/392682/ (PS 9) and deployment-tin reports scap version 3.8.0-1~20180222003831.298  There must be a missing import,...
[19:26:24] <wikibugs>	 10Gerrit, 10Release-Engineering-Team (Next), 10Security: Upgrade gerrit from 2.14.6 to 2.14.7 - https://phabricator.wikimedia.org/T186135#3934807 (10greg) Should this really be tagged with #security? I'm sure there are lots of third-party upgrades that improve some facet of security-relate functionality...
[19:26:35] <wikibugs>	 10Gerrit, 10Release-Engineering-Team (Next), 10Security: Upgrade gerrit from 2.14.6 to 2.14.7 - https://phabricator.wikimedia.org/T186135#4002996 (10greg)
[19:26:49] <wikibugs>	 10Phabricator, 10Release-Engineering-Team (Someday), 10Operations, 10Patch-For-Review: Add support for stretch in the phabricator puppet class - https://phabricator.wikimedia.org/T187127#4002998 (10greg)
[19:26:59] <wikibugs>	 10Release-Engineering-Team (Kanban), 10MediaWiki-SWAT-deployments: Proposal: Effective immediately, disallow multi-sync patch deployment - https://phabricator.wikimedia.org/T187761#4003000 (10greg)
[19:27:49] <wikibugs>	 10Release-Engineering-Team (Next), 10MediaWiki-Maintenance-scripts, 10Utilities-code-utils: Write some version of foreachwiki(indblist) that respects replag and/or has some --delay parameter between wikis - https://phabricator.wikimedia.org/T187852#4003004 (10greg)
[19:28:51] <wikibugs>	 10Scap, 10Scoring-platform-team: Investigate deployment concurrency limitations for ORES - https://phabricator.wikimedia.org/T188281#4003007 (10greg) Using a fan out method is how this is handled normally.
[19:30:22] <andrewbogott>	 I'm looking at 'scap lock --help' and I don't see how to unlock when I'm done
[19:30:29] <andrewbogott>	 (there does not appear to be a 'scap unlock')
[19:31:33] <wikibugs>	 10Gerrit, 10Release-Engineering-Team (Next), 10Security: Upgrade gerrit from 2.14.6 to 2.14.7 - https://phabricator.wikimedia.org/T186135#4003025 (10Paladox) @greg nope, forgot to remove that tag. The changes were reverted in this release + i later found out the updates wouldn't have affected us anyway.
[19:32:03] <wikibugs>	 10Gerrit, 10Release-Engineering-Team (Next), 10Security: Upgrade gerrit from 2.14.6 to 2.14.7 - https://phabricator.wikimedia.org/T186135#4003028 (10Paladox) Hmm herald re added it. Wonder why it did that?
[19:32:51] <andrewbogott>	 thcipriani: is there something that does the equivalent of 'scap unlock'?
[19:33:10] <wikibugs>	 10Scap, 10Scoring-platform-team: Investigate deployment concurrency limitations for ORES - https://phabricator.wikimedia.org/T188281#4003034 (10awight) >>! In T188281#4003007, @greg wrote: > Using a fan out method is how this is handled normally.  @greg Sorry, I'm not sure how to parse that.  I think that's co...
[19:34:22] <thcipriani>	 andrewbogott: IIRC lock expires when time is up or if you hit ctrl-c (I don't think scap lock returns terminal control)
[19:34:33] <thcipriani>	 unlock would be just deleting the lock file
[19:34:41] <thcipriani>	 if you need to do that manually
[19:34:59] <andrewbogott>	 oh, I see.  I didn't realize that it holds the terminal.
[19:35:28] <wikibugs>	 10Gerrit, 10Release-Engineering-Team (Next): Upgrade gerrit from 2.14.6 to 2.14.7 - https://phabricator.wikimedia.org/T186135#4003042 (10greg)
[19:44:44] <shinken-wm>	 RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0]
[19:45:22] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:41] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ms-be04 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:13] <paladox>	 no_justification heh testing your change
[20:00:14] <paladox>	 Beta Cluster	Most likely
[20:00:20] <paladox>	 https://gerrit.wikimedia.org/r/#/c/414607/
[20:02:25] <no_justification>	 Ehhhhh
[20:02:41] <no_justification>	 I guess 'branches' isn't just merged-to but also includes open changes?
[20:02:52] <no_justification>	 Oh wait, nvm
[20:02:54] <no_justification>	 I misread
[20:03:08] <paladox>	 no_justification it seems it is included for all repos.
[20:03:14] <no_justification>	 Yeahhhh
[20:03:25] <no_justification>	 Hmmm
[20:03:28] <no_justification>	 This might not be useful
[20:03:38] <no_justification>	 It'd need a lot of domain knowledge about what extensions are deployed, etc.
[20:03:42] <paladox>	 no_justification you could use regex or use a list of deployed ext.
[20:03:43] <paladox>	 yeh
[20:03:58] <wikibugs>	 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-greg: fr.wikipedia.beta.wmflabs.org uses an invalid security certificate - https://phabricator.wikimedia.org/T188288#4003140 (10Dzahn) This changed the puppet error to a new one, since now there is a kafka_cluste...
[20:04:48] <paladox>	 no_justification and best, it still compiles on master :)
[20:08:06] <paladox>	 no_justification i wonder if it was possible we could use the plugin to allow users to merge changes to specific files. like reviewers.config in all-projects?
[20:15:20] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:30:08] <wikibugs>	 (03PS1) 10Chad: Add FileExporter/FileImporter to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/414768
[20:30:10] <wikibugs>	 (CR) Chad: [C: 2] Add FileExporter/FileImporter to make-wmf-branch [tools/release] - https://gerrit.wikimedia.org/r/414768 (owner: Chad)
[20:31:08] <wikibugs>	 (03Merged) 10jenkins-bot: Add FileExporter/FileImporter to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/414768 (owner: 10Chad)
[20:32:51] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-mediawiki06 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki06.diskspace.root.byte_percentfree (<11.11%)
[20:32:55] <wikibugs>	 (03PS1) 10Chad: Add PerformanceInspector to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/414770
[20:32:57] <wikibugs>	 (CR) Chad: [C: 2] Add PerformanceInspector to make-wmf-branch [tools/release] - https://gerrit.wikimedia.org/r/414770 (owner: Chad)
[20:33:16] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Wikidata Query UI, 10Patch-For-Review: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4003382 (10hashar) I went for dinner and magically @Gehel an...
[20:33:33] <wikibugs>	 (03Merged) 10jenkins-bot: Add PerformanceInspector to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/414770 (owner: 10Chad)
[20:33:37] <wikibugs>	 (03PS2) 10Hashar: Delete wikimedia-fundraising-civicrm-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/414703 (https://phabricator.wikimedia.org/T187797)
[20:33:48] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-mediawiki05 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki05.diskspace.root.byte_percentfree (<11.11%)
[20:34:48] <wikibugs>	 (03PS1) 10Hashar: Migrate wikidata-query-rdf-maven job to Docker [integration/config] - 10https://gerrit.wikimedia.org/r/414772 (https://phabricator.wikimedia.org/T188285)
[20:35:35] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4003399 (10hashar)
[20:36:38] <wikibugs>	 (CR) Hashar: [C: 2] Migrate wikidata-query-rdf-maven job to Docker [integration/config] - https://gerrit.wikimedia.org/r/414772 (https://phabricator.wikimedia.org/T188285) (owner: Hashar)
[20:37:22] <wikibugs>	 10Beta-Cluster-Infrastructure, 10ORES, 10Scoring-platform-team: Beta: Could not find class role::ores::worker - https://phabricator.wikimedia.org/T188316#4003417 (10thcipriani)
[20:37:47] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4003439 (10thcipriani) >>! In T154612#4002951, @awight wrote: > I'm trying this out on beta, using target code https://gerrit.wikimedia.org/r/#/c/392682/ (PS 9) and deployment-tin reports scap version 3.8.0-1~201802...
[20:38:05] <wikibugs>	 (03Merged) 10jenkins-bot: Migrate wikidata-query-rdf-maven job to Docker [integration/config] - 10https://gerrit.wikimedia.org/r/414772 (https://phabricator.wikimedia.org/T188285) (owner: 10Hashar)
[20:38:31] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4003442 (10awight) @thcipriani Wow, thanks for digging that up!
[20:39:02] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki07 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[20:39:55] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4003449 (10hashar)
[20:39:59] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Wikidata Query UI, Patch-For-Review: Running mvn on wikidata/query/rdf fails with: /bin/sh: 1: npm: not found - https://phabricator.wikimedia.org/T188285#4003448 (hashar) Open>Resolved
[20:42:49] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-mediawiki06 is OK: OK: All targets OK
[20:43:47] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-mediawiki05 is OK: OK: All targets OK
[20:52:03] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Disable coverage builds for WBQEV [integration/config] - 10https://gerrit.wikimedia.org/r/414774 (https://phabricator.wikimedia.org/T185697)
[20:54:47] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-mediawiki05 is CRITICAL: CRITICAL: deployment-prep.deployment-mediawiki05.diskspace.root.byte_percentfree (<11.11%)
[20:59:48] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-mediawiki05 is OK: OK: All targets OK
[21:19:34] <bearND>	 Anyone know what's up with logmsgbot? On ops channel I just saw a "Failed to log message to wiki. Somebody should check the error logs" a second time today.
[21:20:17] <shinken-wm>	 PROBLEM - Puppet errors on deployment-eventlog02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[21:20:43] <mdholloway>	 !log deployed to BC: [mobileapps/deploy@9970f97]: Update mobileapps to 8aa38e7
[21:20:43] <stashbot>	 mdholloway: Failed to log message to wiki. Somebody should check the error logs.
[21:21:10] <mdholloway>	 uh oh
[21:21:27] <mdholloway>	 bearND: oh, i see this isn't just happening to me then ;)
[21:21:55] <paladox>	 mdholloway bearND hi, wikitech is currently having maint done
[21:23:10] <mdholloway>	 paladox: i see, thanks
[21:23:16] <paladox>	 you're welcome.
[21:29:15] <halfak>	 Hey folks.  I have a beta cluster host that I think might be misconfigured.  
[21:29:16] <halfak>	 deployment-ores01.deployment-prep.eqiad.wmflabs
[21:29:24] <halfak>	 It should have an ores-worker role enabled 
[21:29:29] <halfak>	 and I'm not sure it does.  
[21:29:39] <halfak>	 I'm not sure how to check except for in wikitech when you're a project admin
[21:30:54] <thcipriani>	 you should be able to check puppet roles applied to that host on horizon
[21:31:23] <eddiegp>	 no_justification, re stats recount: THANKS! :)
[21:31:41] <eddiegp>	 I began to like your jfdi attitude ;)
[21:31:53] <no_justification>	 :)
[21:32:04] <no_justification>	 enwiki's stats have been wrong for years, it was long overdue :P
[21:36:38] <thcipriani>	 halfak: also Puppet provides a list of classes applied to a node by default in /var/lib/puppet/state/classes.txt
[21:36:51] <halfak>	 Thanks
[21:37:07] <halfak>	 thcipriani, perm denied :|
[21:37:14] <halfak>	 nvm 
[21:37:16] <halfak>	 sudo!
[21:37:18] <thcipriani>	 :)
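Checking that file for a given role can be scripted. A minimal Python sketch, assuming `classes.txt` holds one class name per line and that a case-insensitive match is appropriate; the sample content and the helper names are made up for illustration, and reading the real file needs root, as halfak found.

```python
CLASSES_FILE = "/var/lib/puppet/state/classes.txt"  # path quoted in the channel


def parse_classes(text):
    """Return the set of class names, one per non-empty line (assumed format)."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def has_class(text, name):
    """True if `name` appears in the class list, ignoring case."""
    return name.lower() in parse_classes(text)


def host_has_class(name, path=CLASSES_FILE):
    """Read the class file from disk; run with sufficient privileges."""
    with open(path) as handle:
        return has_class(handle.read(), name)


# Hypothetical file content, not copied from deployment-ores01.
sample = "settings\nrole::ores::worker\nprofile::base\n"
print(has_class(sample, "role::ores::worker"))
print(has_class(sample, "role::ores::web"))
```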
[21:39:02] <halfak>	 !log ran `sudo service celery-ores-worker start` on deployment-ores01
[21:39:04] <stashbot>	 halfak: Failed to log message to wiki. Somebody should check the error logs.
[21:39:18] <halfak>	 woops.  what did I miss? 
[21:39:42] <halfak>	 oh I see the note above ^_^
[21:40:23] <halfak>	 thanks thcipriani 
[21:40:41] <thcipriani>	 sure thing
[21:55:35] <paladox>	 no_justification i just realised something. I forgot to add support for other names in Included In @ polygerrit change heh
[21:55:37] <paladox>	 https://gerrit-review.googlesource.com/c/gerrit/+/128411/
[21:56:20] <paladox>	 "expected array for `items`, found"
[21:56:21] <paladox>	 {Beta Cluster: ["Most likely"]}
[21:56:22] <paladox>	 and throws that now.
[22:00:18] <shinken-wm>	 RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:01:06] <paladox>	 no_justification the only thing with polygerrit, is what browsers support es6?
[22:01:21] <paladox>	 Otherwise we will have to use vulcanize like the codemirror-editor plugin.
[22:08:28] <legoktm>	 I'm manually submitting some jenkins jobs
[22:08:51] <legoktm>	 (if you see random stuff in the zuul queue that makes no sense)
[22:09:31] <paladox>	 legoktm, zuul won't show anything if you manually do it in jenkins.
[22:09:50] <legoktm>	 I'm queuing them through zuul :)
[22:09:55] <James_F>	 Unlike normal?
[22:10:03] <James_F>	 :-)
[22:10:17] <paladox>	 legoktm, i thought you can only do that if you do it through gerrit?
[22:10:19] <legoktm>	 `zuul-test-repo ext:Whatever`, it's the equivalent of commenting "recheck" on a bunch of patches
[22:10:31] <legoktm>	 paladox: we have a script to fake it
[22:10:35] <paladox>	 oh i see
[22:24:08] <wikibugs>	 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate leftover Nodepool jobs to Docker - https://phabricator.wikimedia.org/T187797#4003783 (10hashar)
[22:27:06] <Hauskatze>	 legoktm: shame on you, trying to fool poor jenkins ;-)
[22:27:28] <Hauskatze>	 ("we have a script to fake it")
[22:29:27] <Hauskatze>	 and, if for something weird you mean merged patches being tested yeah, I see some of those :)
[22:39:07] <Hauskatze>	 !log purging old abusefilter IP data from Beta Cluster wikis while we wait for a cron job to do this automatically
[22:39:11] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:40:43] <Hauskatze>	 !log updating list of Tor nodes for TorBlock on Beta Cluster wikis
[22:40:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:43:22] <wikibugs>	 10Phabricator: Email sometimes not being sent when a task is created - https://phabricator.wikimedia.org/T182549#4003890 (10Anomie) Ping. This is still happening. :(
[22:47:10] <wikibugs>	 10Phabricator: Email sometimes not being sent when a task is created - https://phabricator.wikimedia.org/T182549#4003905 (10mmodell) I'm not sure how to debug this without more information. Are other people experiencing similar problems? Can you post a screenshot of your phabricator email notification settings?...
[22:57:13] <shinken-wm>	 PROBLEM - SSH on integration-slave-docker-1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:59:32] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:00:36] <shinken-wm>	 PROBLEM - Puppet errors on jenkinstest is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[23:01:07] <shinken-wm>	 PROBLEM - Puppet errors on integration-publishing is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:02:11] <shinken-wm>	 RECOVERY - SSH on integration-slave-docker-1001 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[23:02:23] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[23:02:27] <shinken-wm>	 PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:02:52] <shinken-wm>	 PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 16.67% of data above the critical threshold [0.0]
[23:03:14] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc07 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[23:03:18] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[23:03:20] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[23:03:49] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:04:03] <shinken-wm>	 PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:04:17] <Hauskatze>	 yay, beta cluster puppet going down
[23:05:05] <paladox>	 may be wikitech
[23:05:38] <paladox>	 Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, Reading data from Hosts/gerrit-mysql failed: NoMethodError: undefined method `[]' for nil:NilClass at /etc/puppet/manifests/realm.pp:22:14 on node gerrit-mysql.git.eqiad.wmflabs
[23:06:41] <wikibugs>	 10Release-Engineering-Team (Kanban), 10Release Pipeline: Installation method for Minikube on CI for k8s testing - https://phabricator.wikimedia.org/T184457#4004012 (10thcipriani) I've made a flattened clone of minikube at https://gerrit.wikimedia.org/r/#/admin/projects/operations/debs/minikube and added all th...
[23:07:18] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1007 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:07:32] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:07:39] <paladox>	 yep wikitech.
[23:07:48] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[23:07:51] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cpjobqueue is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[23:07:52] <paladox>	 Hauskatze ^^
[23:07:57] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cassandra3-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[23:07:57] <shinken-wm>	 PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[23:08:25] <Hauskatze>	 I don't know how to fix those
[23:08:41] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1005 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[23:08:56] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:09:14] <paladox>	 Hauskatze you can't
[23:09:24] <paladox>	 you have to wait for wikitech to be back online.
[23:09:25] <Hauskatze>	 even better
[23:09:33] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:09:41] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[23:10:02] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cumin is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[23:10:14] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:10:23] <andrewbogott>	 thcipriani, no_justification, I have a 'scap lock' open on silver but scap overwrote my changes anyway
[23:10:34] <andrewbogott>	 So, I guess 'scap lock --all' doesn't work, and also I could use a hand unbreaking wikitech
[23:10:44] <shinken-wm>	 PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[23:10:46] <andrewbogott>	 I mean, I can hotfix it again but apparently it'll just break itself at any time
[23:10:50] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:11:22] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[23:11:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:12:07] <thcipriani>	 andrewbogott: scap lock is for the deployment server to ensure that no one can start a deployment from tin.
[23:12:08] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4004045 (10awight) It's working well now.  Just noting here that the lack of cwd means my script has to do the following to calculate the deployment root:   deploy_dir="$( realpath "$( dirname "${BASH_SOURCE[0]}" )/...
[23:12:13] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-android is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:12:17] <andrewbogott>	 thcipriani: well, now I know
[23:12:19] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[23:12:22] <andrewbogott>	 so there's no way to exclude a host from deployment?
[23:12:34] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:12:44] <greg-g>	 andrewbogott: get it out of the dsh group, I suppose
[23:12:49] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[23:13:01] <greg-g>	 but that's auto-created by puppet
[23:13:05] <thcipriani>	 greg-g: andrewbogott yeah, but I'm trying to remember if those are controlled via...yeah that
[23:13:16] <greg-g>	 :)
[23:13:33] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cassandra3-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:13:39] <shinken-wm>	 PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[23:13:51] <andrewbogott>	 At this point I guess I would just like a merge so I can stop caring about this
[23:14:06] <shinken-wm>	 PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:14:28] <greg-g>	 do you not have merge rights there?
[23:14:35] <greg-g>	 https://gerrit.wikimedia.org/r/c/414733/ <-- this one
[23:15:25] <shinken-wm>	 PROBLEM - Puppet errors on deployment-imagescaler02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:15:33] <andrewbogott>	 on mw-config?  I surely do have merge rights but don't feel great self-merging
[23:15:42] <shinken-wm>	 PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[23:15:47] <andrewbogott>	 I guess that's maybe the best of a bad situation at this point
[23:16:05] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:16:09] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4004068 (10thcipriani) >>! In T154612#4004045, @awight wrote: > It's working well now. >  > Just noting here that the lack of cwd means my script has to do the following to calculate the deployment root: >   deploy_...
[23:16:43] <Krenair>	 andrewbogott, self-merging on an operations/ repository? :)
[23:17:01] <greg-g>	 there's plenty of self-merging in mw-config :)
[23:17:08] <chasemp>	 andrewbogott: not sure how to help you man, merging is probably the only sane option at this point
[23:17:12] <Krenair>	 I used to do that all the time because ops people got away with it on operations/puppet
[23:17:35] <Krenair>	 on wmf deployment branches in mediawiki/ repositories too
[23:17:52] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[23:17:59] <andrewbogott>	 Right, but I /don't/ do it all the time which means I wouldn't mind a hand from someone
[23:18:02] <greg-g>	 self-merging in mw-config is not seen as A Bad Thing (TM) afaict
[23:18:08] <greg-g>	 ack
[23:18:35] <greg-g>	 thcipriani: does https://gerrit.wikimedia.org/r/c/414733/ look ok?
[23:19:23] <Krenair>	 mm to be fair I never really touched the db- files
[23:19:38] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Operations, 10ops-eqdfw, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4004108 (10RobH) To be clear, mgmt responds, but the mgmt password doesn't work.
[23:19:44] <Krenair>	 jaime would know about that
[23:19:56] <thcipriani>	 I don't normally swap around database stuff, honestly, but it was already deployed?
[23:19:56] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Operations, 10ops-eqiad, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4004112 (10RobH)
[23:20:29] <thcipriani>	 it was only deployed on silver probably. right.
[23:20:30] <Krenair>	 in some places in there where you're referring to a db cluster, the other entries in the file refer to individual DB servers
[23:21:02] <andrewbogott>	 thcipriani: that patch is live-hacked on silver, yes
[23:21:07] <andrewbogott>	 well, intermittently :)
[23:22:20] <andrewbogott>	 and it works, but that doesn't necessarily mean it's right
[23:23:03] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4004118 (10awight) Confirmed, thanks for reminding me about the new variable!
[23:23:36] <thcipriani>	 blerg. yeah, I don't know anything about this configuration.
[23:25:38] <thcipriani>	 I have a thought about preventing scap deploys to silver
[23:26:02] <bd808>	 thcipriani: I'm just going to merge it and take the blame if it fails
[23:26:34] <Krenair>	 I think if I'd done this a year ago I'd have gotten input from jaime
[23:26:39] <thcipriani>	 fair enough, although silver is a one-off so removing from the dsh file should be a quick hierafile change.
[23:27:28] <thcipriani>	 https://github.com/wikimedia/puppet/blob/production/hieradata/common/scap/dsh.yaml#L40
[23:27:32] <thcipriani>	 just need to remove ^
[23:27:42] <andrewbogott>	 thcipriani: I think "there's no way to hotfix" is probably an ok policy, as long as it goes along with "it's ok to merge things outside of a SWAT window"
[23:27:43] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review, 10Scoring-platform-team (Current), 10Wikimedia-Incident: Cache ORES virtualenv within versioned source - https://phabricator.wikimedia.org/T181071#3778593 (10awight) Do not merge patches, currently blocked on scap 3.8 dep...
[23:28:02] <awight>	 thcipriani: Unrelated question: is there a tentative timeline for deploying scap 3.8?  I like the new feature :-)
[23:28:05] <Krenair>	 it's probably safe, may be possible to clean it up to be Better later on
[23:28:07] <andrewbogott>	 As it is, all the ways to exclude a thing from scap seem to require merging a puppet patch, which feels like just circling around the problem :)
[23:28:32] <Krenair>	 there has been some strange "Once something is added here don't remove it again" in those db- files which I was always suspicious of
[23:28:50] <Krenair>	 so I'd be mindful of that
[23:28:53] <andrewbogott>	 yeah, my patch leaves in a resolution for silver exactly because it has that comment by it
[23:29:01] <andrewbogott>	 and I was intimidated :)
[23:29:09] <andrewbogott>	 I'll pull that line out when I shut silver down for good
[23:29:21] <thcipriani>	 fwiw, it's fine to merge stuff outside of a swat window.
[23:29:53] <thcipriani>	 especially if things are breaking :)
[23:30:40] <paladox>	 no_justification  i got included in working in pg https://phabricator.wikimedia.org/F14044877 :).
[23:30:47] <paladox>	 Shows "Beta Cluster"
[23:30:48] <Krenair>	 I always tried to get a window in advance to officially plan to do something
[23:31:01] <thcipriani>	 awight: we hadn't discussed a timeline for it, glad you like the new feature, I'll check with other scap folks to see if there's anything else we want to get in before a new version goes out. short answer would be: not this week, but Soon™.
[23:31:11] <Krenair>	 either by piggybacking on SWAT or a full blown window of my own
[23:32:00] <awight>	 thcipriani: Great, anything is fine for us, I just want to stay abreast of the plans.  Our downstream fix is a luxury feature at this point, which will make ORES rollbacks many times faster.
[23:33:00] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4004176 (10awight) FWIW, I now have a script-check which has been smoke-tested on the beta cluster and behaves like we want.  I've set this task to block mine, so we'll just throw a small party when it can all be de...
[23:34:35] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Scoring-platform-team (Current), 10Wikimedia-Incident: [Spike] Write reports about why Ext:ORES is helping cause server 500s and write tasks to fix - https://phabricator.wikimedia.org/T181010#4004181 (10awight)
[23:34:38] <wikibugs>	 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#4004182 (10mmodell) It is probably just about time for another scap release, eh? </canadianAccent>
[23:34:40] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Operations, 10Patch-For-Review, 10Scoring-platform-team (Current), 10Wikimedia-Incident: [Blocked] Cache ORES virtualenv within versioned source - https://phabricator.wikimedia.org/T181071#4004179 (10awight) 05Open>03stalled
[23:39:04] <shinken-wm>	 RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:39:28] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:40:39] <shinken-wm>	 RECOVERY - Puppet errors on jenkinstest is OK: OK: Less than 1.00% above the threshold [0.0]
[23:41:08] <shinken-wm>	 RECOVERY - Puppet errors on integration-publishing is OK: OK: Less than 1.00% above the threshold [0.0]
[23:42:14] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:42:21] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:42:34] <shinken-wm>	 RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:42:49] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:42:59] <shinken-wm>	 RECOVERY - Puppet errors on deployment-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:43:15] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc07 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:43:20] <shinken-wm>	 RECOVERY - Puppet errors on deployment-redis06 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:43:22] <shinken-wm>	 RECOVERY - Puppet errors on deployment-redis05 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:43:40] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1005 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:43:49] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:44:17] <no_justification>	 andrewbogott: Just caught scrollback (sorry, internet issues). Soooo, I don't see why we couldn't DWIM here w.r.t. `scap lock`
[23:44:35] <no_justification>	 At the very least: making commands unambiguous as to whether they're a local or remote thing would be nice
[23:45:39] <wikibugs>	 10Release-Engineering-Team (Watching / External), 10Operations, 10ops-eqiad, 10Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4004238 (10RobH) a:05Cmjohnson>03RobH
[23:46:23] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:47:25] <andrewbogott>	 no_justification: Having a host-specific lock would be cool.  Failing that, having it say "Dude, I totally don't do anything anywhere but on Tin" would also be cool
[23:47:33] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:47:51] <no_justification>	 andrewbogott: I don't see why `scap pull` couldn't check for a lock file :)
[23:47:51] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cpjobqueue is OK: OK: Less than 1.00% above the threshold [0.0]
[23:47:55] <no_justification>	 It's all abstracted nicely
[23:48:16] <no_justification>	 Also, if we ever get the etcd/conftool stuff finished, you could make it just depool
[23:48:18] <no_justification>	 Or something
[23:48:19] <andrewbogott>	 no_justification: would be nice!  I'll make a ticket
[23:52:10] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-jessie-android is OK: OK: Less than 1.00% above the threshold [0.0]
[23:52:23] <wikibugs>	 10Release-Engineering-Team, 10Scap: Provide a mechanism ('scap lock'?) to exclude an individual host from deploys - https://phabricator.wikimedia.org/T188347#4004305 (10Andrew)
[23:52:50] <andrewbogott>	 no_justification: outline of a task ^ but you might want to adjust the project tags as appropriate
[23:53:31] <no_justification>	 "There's not currently a way to prevent syncs on a host without merging a puppet patch" -- wait, I thought depooling from conftool rewrites the dsh files?
[23:53:38] <no_justification>	 (that was kinda half of the point....?)
[23:53:44] <no_justification>	 Otherwise, +1 to task
[23:55:27] <andrewbogott>	 no_justification: Please correct if I'm wrong, I don't know a thing about how conftool is supposed to work in this case
[23:56:09] <no_justification>	 Nobody does :p
[23:56:09] <no_justification>	 heheheeh
[23:57:29] <thcipriani>	 depooling from conftool works for all but a handful of one-offs
[23:57:42] <thcipriani>	 https://github.com/wikimedia/puppet/blob/production/hieradata/common/scap/dsh.yaml#L27-L45
[23:57:48] <thcipriani>	 "hosts" ^
[23:59:47] <no_justification>	 Derp, so it wouldn't have worked here anyway