[00:01:43] <halfak>	 !log deploy-prep awight disabled ORES service
[00:01:47] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:02:16] <halfak>	 !log deploy-prep awight enabled ORES service
[00:02:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:12:54] <awight>	 Grabbing an emergency deployment window, beginning now.
[00:43:58] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Cache, MediaWiki-Vagrant, Performance-Team, User-zeljkofilipin: MediaWiki core Selenium tests fail when targeting Vagrant - https://phabricator.wikimedia.org/T180035#3795136 (aaron) This looks like an integration issue with ChronologyProtector vs W...
[00:46:46] <awight>	 greg-g: done with “my” window
[00:46:52] <greg-g>	 :)
[00:47:02] <awight>	 you can keep it! :p
[00:47:02] <greg-g>	 I have to run now, don't break anything
[00:47:09] <awight>	 no problem ;-)
[00:52:46] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Cache, MediaWiki-Vagrant, Performance-Team, User-zeljkofilipin: MediaWiki core Selenium tests fail when targeting Vagrant - https://phabricator.wikimedia.org/T180035#3795161 (aaron) The simple thing is to not set INTERIM keys in the same request th...
[01:23:43] <shinken-wm>	 RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:29:47] <shinken-wm>	 RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0]
[01:39:01] <shinken-wm>	 RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:42:26] <wikibugs>	 Continuous-Integration-Config, BlueSpice, Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#3795227 (Paladox) Oh sorry for late reply. Your change may require that the integration config change is merged or that the other repo is published to composer...
[01:43:47] <shinken-wm>	 RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:00:15] <wmf-insecte>	 Project mediawiki-core-code-coverage build #3161: FAILURE in 14 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3161/
[03:23:02] <Krinkle>	 !log Jenkins jobs for mediawiki-core-php55lint consistently failing on integration-slave-jessie ("git: stderr: error: failed to write..")
[03:23:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[03:23:26] <Krinkle>	 gotta go. Failed twice on https://gerrit.wikimedia.org/r/#/c/393983/
[03:23:28] <Krinkle>	 o/
[03:23:54] <wikibugs>	 Beta-Cluster-Infrastructure, Performance-Team: Make MediaWiki profiler in Beta match production - https://phabricator.wikimedia.org/T180766#3795357 (Krinkle)
[03:53:26] <legoktm>	 03:00:13 DEBUG:git.cmd:AutoInterrupt wait stderr: 'fatal: write error: No space left on device\nfatal: index-pack failed'
[03:53:51] <legoktm>	 The last Puppet run was at Tue Nov 28 17:19:11 UTC 2017 (634 minutes ago). 
[03:56:01] <legoktm>	 !log deleted all workspaces on integration-slave-jessie-1003; /srv ran out of space
[03:56:04] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[03:56:07] <legoktm>	 /dev/mapper/vd-second--local--disk   21G  4.5G   15G  24% /srv
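A minimal sketch of reading a `df -h` row like the one pasted above, assuming the usual six-column layout (filesystem, size, used, available, use%, mount point). The names `DfRow`, `parse_df_row` and `needs_cleanup`, and the 90% threshold, are hypothetical and only there for illustration; they are not part of any tooling mentioned in the channel.

```python
from typing import NamedTuple


class DfRow(NamedTuple):
    filesystem: str
    size: str
    used: str
    avail: str
    use_percent: int
    mount: str


def parse_df_row(line: str) -> DfRow:
    """Split one `df -h` data row into its six whitespace-separated columns."""
    fs, size, used, avail, pct, mount = line.split(None, 5)
    return DfRow(fs, size, used, avail, int(pct.rstrip("%")), mount.strip())


def needs_cleanup(row: DfRow, threshold: int = 90) -> bool:
    """Return True when usage is at or above the (hypothetical) threshold."""
    return row.use_percent >= threshold


if __name__ == "__main__":
    row = parse_df_row(
        "/dev/mapper/vd-second--local--disk   21G  4.5G   15G  24% /srv"
    )
    print(row.mount, row.use_percent, needs_cleanup(row))
```

Sizes are kept as strings on purpose, since the human-readable units (`21G`, `4.5G`) would need their own conversion.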
[03:58:23] <wmf-insecte>	 Yippee, build fixed!
[03:58:24] <wmf-insecte>	 Project selenium-MultimediaViewer » firefox,mediawiki,Linux,BrowserTests build #592: FIXED in 2 min 23 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=BrowserTests/592/
[04:00:21] <wikibugs>	 Continuous-Integration-Infrastructure: mwgate-php55lint workspaces are getting huge - https://phabricator.wikimedia.org/T179963#3795398 (Legoktm) Just had to clean this up on integration-slave-jessie-1003.
[04:07:50] <shinken-wm>	 RECOVERY - Free space - all mounts on integration-slave-jessie-1003 is OK: OK: All targets OK
[04:52:10] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Cache, MediaWiki-Vagrant, Performance-Team, and 2 others: MediaWiki core Selenium tests fail when targeting Vagrant - https://phabricator.wikimedia.org/T180035#3795413 (aaron) I noticed a worse bug of cpPosTime cookies not being used (not related to WA...
[04:54:50] <shinken-wm>	 PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[05:19:52] <shinken-wm>	 PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[05:51:22] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Cache, MediaWiki-Vagrant, Performance-Team, and 2 others: MediaWiki core Selenium tests fail when targeting Vagrant - https://phabricator.wikimedia.org/T180035#3795451 (aaron) >>! In T180035#3795413, @aaron wrote: > I noticed a worse bug of cpPosTime c...
[07:00:01] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[08:54:11] <shinken-wm>	 PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[09:00:39] <elukey>	 hello! I am wondering if https://phabricator.wikimedia.org/T181219 is enough to ask for a github mirror from gerrit
[09:06:10] <hashar>	 elukey: hmm yeah maybe ? :)
[09:08:01] <hashar>	 !log github: created repo operations-software-druid_exporter | T181219
[09:08:05] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:08:05] <stashbot>	 T181219: Mirror the gerrit repo operations/software/druid_exporter - https://phabricator.wikimedia.org/T181219
[09:08:31] <elukey>	 ahahahah I didn't mean to push you to do it now, just wanted to know if I got the right procedure :/
[09:08:38] <elukey>	 but thanks! 
[09:09:44] <hashar>	 elukey: in theory we just have to create the repo on GitHub and supposedly Gerrit will end up replicating to it
[09:09:45] <hashar>	 https://github.com/wikimedia/operations-software-druid_exporter
[09:09:56] <hashar>	 not sure whether the underscore will end up being a problem though
[09:10:27] <elukey>	 all right I'll keep it monitored! thanks!
[09:10:37] <elukey>	 we also have https://github.com/wikimedia/operations-software-hhvm_exporter
[09:10:48] <elukey>	 so the _ should be fine
[09:12:09] <hashar>	 it does not seem to replicate though :(
[09:14:05] <hashar>	 !log github: created wikimedia/operations-debs-contenttranslation-apertium-crh-tur and wikimedia/operations-debs-prometheus-openldap-exporter
[09:14:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:18:18] <hashar>	 !log gerrit: forcing replication: ssh -p 29418 hashar@gerrit.wikimedia.org replication start operations/software/druid_exporter  # T181219
[09:18:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:18:22] <stashbot>	 T181219: Mirror the gerrit repo operations/software/druid_exporter - https://phabricator.wikimedia.org/T181219
[09:19:42] <hashar>	 elukey: mirrored! https://github.com/wikimedia/operations-software-druid_exporter
[09:21:15] <elukey>	 hashar: you rock, thank you!!!
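The naming pattern visible in this exchange (operations/software/druid_exporter on Gerrit appearing as wikimedia/operations-software-druid_exporter on GitHub) can be sketched as a small helper. This is only an illustration inferred from the names in the log: the function name `github_mirror_name` and the default `wikimedia` organisation argument are assumptions, and the rule itself is a guess, not taken from Gerrit's replication configuration.

```python
def github_mirror_name(gerrit_repo: str, org: str = "wikimedia") -> str:
    """Guess the GitHub mirror path for a Gerrit repository name.

    Assumption (inferred from the chat, not from Gerrit config):
    slashes become hyphens and underscores are kept as they are.
    """
    return f"{org}/{gerrit_repo.strip('/').replace('/', '-')}"


if __name__ == "__main__":
    print(github_mirror_name("operations/software/druid_exporter"))
```

As hashar notes, the GitHub repository still has to exist before Gerrit can replicate to it, and a forced `replication start` was needed in this case.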
[10:39:57] <wikibugs>	 (Draft1) MarcoAurelio: Archive the ActionEditSubmit extension [integration/config] - https://gerrit.wikimedia.org/r/394033 (https://phabricator.wikimedia.org/T180808)
[10:40:02] <wikibugs>	 (PS2) MarcoAurelio: Archive the ActionEditSubmit extension [integration/config] - https://gerrit.wikimedia.org/r/394033 (https://phabricator.wikimedia.org/T180808)
[12:05:50] <wikibugs>	 Release-Engineering-Team (Kanban), User-zeljkofilipin: Upgrade WebdriverIO to 4.9 - https://phabricator.wikimedia.org/T180144#3796173 (zeljkofilipin) a:zeljkofilipin
[12:22:47] <wmf-insecte>	 Yippee, build fixed!
[12:22:49] <wmf-insecte>	 Project selenium-GettingStarted » firefox,beta,Linux,BrowserTests build #601: FIXED in 46 sec: https://integration.wikimedia.org/ci/job/selenium-GettingStarted/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/601/
[13:13:41] <wikibugs>	 Continuous-Integration-Config, Security: Run composer with `--dev` flag - https://phabricator.wikimedia.org/T180235#3796300 (Reedy) stalled>declined
[13:23:28] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current), and 2 others: Git refusing to clone some ORES submodules - https://phabricator.wikimedia.org/T181552#3796325 (awight) @thcipriani @mmodell Is the fix for T179013 deployed to production?  I'm hoping the fix will be that s...
[14:22:20] <wikibugs>	 (PS1) Phuedx: Add npm job for the Chromium render service [integration/config] - https://gerrit.wikimedia.org/r/394058 (https://phabricator.wikimedia.org/T179552)
[14:30:55] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[14:31:25] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:10:54] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:11:28] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:28:18] <wmf-insecte>	 Yippee, build fixed!
[16:28:19] <wmf-insecte>	 Project mediawiki-core-code-coverage build #3162: FIXED in 1 hr 28 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3162/
[16:46:27] <wikibugs>	 Continuous-Integration-Infrastructure: Jenkins silently ignores patches if a dependency loop is involved somehow - https://phabricator.wikimedia.org/T181574#3796972 (Anomie)
[17:24:20] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:24:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:24:27] <shinken-wm>	 PROBLEM - Puppet errors on deployment-prometheus01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:24:53] <shinken-wm>	 PROBLEM - Puppet errors on deployment-jobrunner02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:24:57] <shinken-wm>	 PROBLEM - Puppet errors on deployment-apertium02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:25:05] <shinken-wm>	 PROBLEM - Puppet errors on jenkinstest is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:25:45] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:26:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:27:46] <shinken-wm>	 PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:28:12] <shinken-wm>	 PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:29:05] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1002 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:29:55] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cassandra3-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:30:03] <shinken-wm>	 PROBLEM - Puppet errors on deployment-trending01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:30:11] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cassandra3-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:30:25] <shinken-wm>	 PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:30:26] <shinken-wm>	 PROBLEM - Puppet errors on castor02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:30:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:30:38] <shinken-wm>	 PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:30:44] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:30:46] <shinken-wm>	 PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:31:37] <shinken-wm>	 PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:31:53] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:32:17] <greg-g>	 eeek?
[17:32:25] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:32:49] <shinken-wm>	 PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:32:53] <shinken-wm>	 PROBLEM - Puppet errors on integration-cumin is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:32:55] <paladox>	 i think those may be the same as i am getting
[17:32:58] <paladox>	 Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Variable $::nameservers is not defined! at /etc/puppet/modules/base/manifests/resolving.pp:6 on node phabricator.phabricator.eqiad.wmflabs
[17:32:58] <paladox>	 Warning: Not using cache on failed catalog
[17:32:58] <paladox>	 Error: Could not retrieve catalog; skipping run
[17:33:26] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ircd is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:34:12] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1005 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:34:48] <shinken-wm>	 PROBLEM - Puppet errors on deployment-eventlog02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:35:00] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:35:02] <shinken-wm>	 PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:35:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:36:17] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:36:23] <shinken-wm>	 PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:36:29] <shinken-wm>	 PROBLEM - Puppet errors on integration-publishing is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:36:31] <shinken-wm>	 PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:37:19] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:37:39] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:38:07] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:38:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc07 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:39:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-videoscaler01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:39:38] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sentry01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:39:40] <shinken-wm>	 PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:39:42] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cumin is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:40:26] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current), and 2 others: Git refusing to clone some ORES submodules - https://phabricator.wikimedia.org/T181552#3797253 (awight) @thcipriani OK thank you for the workaround.  I'll note that I don't have permissions to do that mysel...
[17:40:27] <shinken-wm>	 PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:40:37] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1003 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:40:57] <shinken-wm>	 PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:41:03] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:42:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:42:46] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-android is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:43:06] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:43:12] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:43:14] <shinken-wm>	 PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:43:47] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:43:51] <shinken-wm>	 PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:43:54] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:44:21] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1007 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:44:45] <shinken-wm>	 PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:44:47] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mcs01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[17:45:03] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:45:23] <shinken-wm>	 PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:45:27] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:45:35] <shinken-wm>	 PROBLEM - Puppet errors on deployment-pdfrender02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:45:39] <shinken-wm>	 PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:45:53] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:46:14] <shinken-wm>	 PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:49:21] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:49:35] <shinken-wm>	 PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:49:39] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:49:51] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[17:50:01] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:50:17] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1006 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:50:22] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:50:28] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cpjobqueue is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:50:38] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis05 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:50:58] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki07 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:51:27] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:52:27] <awight>	 How would I find the internal service URL for a beta cluster machine, in this case ores-beta.wmflabs.org ?  I don’t see that configured in puppet...
[17:52:40] <shinken-wm>	 PROBLEM - Puppet errors on webperformance is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:53:29] <shinken-wm>	 PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:53:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-poolcounter04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:54:52] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:55:16] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ores-redis-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:55:48] <shinken-wm>	 PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[17:56:08] <awight>	 Answering my question, I found the mapping in https://tools.wmflabs.org/openstack-browser/project/deployment-prep
[17:56:32] <shinken-wm>	 PROBLEM - Puppet errors on deployment-imagescaler02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[17:56:41] <awight>	 However, I’d like to discover it dynamically, like using ores.discovery.wmnet in production.
[17:57:27] <awight>	 My motivation is to have the ORES clients on beta connect directly to the server rather than going through an external proxy.  Should I not bother?
[17:59:01] <awight>	 K all the other services use the external DNS names.  I won’t bother.
[18:09:56] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current): Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3797349 (awight)
[18:09:59] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current): Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3797349 (awight) a:awight>None
[18:14:05] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current), and 2 others: Git refusing to clone some ORES submodules - https://phabricator.wikimedia.org/T181552#3797381 (awight) a:awight>None
[18:28:41] <greg-g>	 paladox: sorry, in meetings back-to-back, any more info?
[18:29:19] <thcipriani>	 hrm... Error 400 on SERVER: Variable $::nameservers is not defined! at /etc/puppet/modules/base/manifests/resolving.pp:6
[18:30:04] <thcipriani>	 and...that's right, it isn't defined....but was it before...
[18:36:36] <wikibugs>	 Release-Engineering-Team (Kanban), Patch-For-Review, User-zeljkofilipin: Run Cucumber+Selenium+Node.js in CI - https://phabricator.wikimedia.org/T179190#3797446 (zeljkofilipin) a:zeljkofilipin>None
[18:37:36] <wikibugs>	 Release-Engineering-Team (Kanban), User-zeljkofilipin: Use createAccount() in Selenium tests - https://phabricator.wikimedia.org/T180379#3797449 (zeljkofilipin) a:zeljkofilipin>None
[18:39:13] <wikibugs>	 Release-Engineering-Team (Kanban), releng-201718-q1, MediaWiki-General-or-Unknown, Epic, and 5 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3797455 (zeljkofilipin) a:zeljkofilipin>None
[18:39:43] <wikibugs>	 Release-Engineering-Team (Kanban), Discovery, Discovery-Search (Current work), User-zeljkofilipin: Run selenium-EXTENSION-jessie Jenkins job for CirrusSearch - https://phabricator.wikimedia.org/T175179#3797457 (zeljkofilipin) a:zeljkofilipin>None
[18:40:37] <wikibugs>	 Release-Engineering-Team (Kanban), Patch-For-Review, User-zeljkofilipin: Run Selenium tests in CI for extensions - https://phabricator.wikimedia.org/T164721#3797470 (zeljkofilipin) a:zeljkofilipin>None
[18:42:50] <wikibugs>	 Release-Engineering-Team (Watching / External), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: WebdriverIO should run Chrome headlessly - https://phabricator.wikimedia.org/T167507#3797479 (zeljkofilipin)
[18:42:52] <wikibugs>	 Continuous-Integration-Infrastructure, User-zeljkofilipin: Upgrade to Chromium 59 or newer on Debian Jessie in CI - https://phabricator.wikimedia.org/T170032#3797478 (zeljkofilipin) Open>stalled
[18:45:27] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Next), User-zeljkofilipin: Create a Jenkins job that runs Echo RSpec tests daily - https://phabricator.wikimedia.org/T171753#3797494 (zeljkofilipin) Open>declined
[18:47:52] <wikibugs>	 Release-Engineering-Team (Kanban), releng-201718-q1, MediaWiki-General-or-Unknown, Epic, and 5 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3797516 (zeljkofilipin) a:zeljkofilipin
[18:48:03] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current): Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3797349 (mmodell) This is very strange. I can't tell exactly what would be causing this.
[19:05:13] <shinken-wm>	 PROBLEM - Free space - all mounts on integration-slave-docker-1001 is CRITICAL: CRITICAL: integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_e6e01440f04a145c09affed5a3ca6f3b53daaab53d038a0d9a25a2321ad2e83e.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_ae0c0365a418e6d09fafa87f9895fd4e627f98827599ba01651a8d2bb7
[19:05:13] <shinken-wm>	 e (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_542702f60914ef9f11b4fff9801fb50b98babed3d88dab01384772c8a8f1f043.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_382de2b48f7a102ecf17cbd12d459f88057d36dba96cedb922040e35e9b82d2e.byte_percentfree (No valid datapoints found) integrat
[19:05:13] <shinken-wm>	 docker-1001.diskspace._var_lib_docker_devicemapper_mnt_244a2d354369e401b6c66d4bd544048c3840d932cf778fdf984537361725f816.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_d8fc462aeb76074a5f89517c31b54a82a6b9bee7edadd84f909f7bf192acd160.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_
[19:05:13] <shinken-wm>	 a7f3437c5df7fe9022aa85e5b8d3f8456090e82a1199b2b29dec9635842.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_1c0fb41f7a2181b123e5946fba45e85cdfef2556fc8597cb15f65675a1b19486.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_eed8308ecda3c515fc3eb746da906634666ed3ec1b4
[19:05:13] <shinken-wm>	 byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_31efaafd7825c53306a17741dcce1df6a28a339aadc13d4aa228986a0325c8a7.byte_percentfree (No valid datapoints found) integration.integration-slave-docker-1001.diskspace._var_lib_docker_devicemapper_mnt_734cb090a18ec3f54bb70110faa369e4073fe6ad423521c364e403698f3ec8b1.byte_percentfree (No valid datapoints 
[19:05:22] <wikibugs>	 Beta-Cluster-Infrastructure, WMF-Legal, Privacy, Security: Require email address to register on Beta Cluster - https://phabricator.wikimedia.org/T181034#3797631 (Bawolff) p:Triage>Normal
[19:06:31] <wikibugs>	 Continuous-Integration-Config, Librarization, Security-Team, Composer, and 2 others: Expand our usage of FriendsOfPHP/security-advisories - https://phabricator.wikimedia.org/T180278#3797633 (Bawolff) p:Triage>Normal
[19:11:48] <shinken-wm>	 RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[19:12:09] <wikibugs>	 Release-Engineering-Team (Kanban), StructuredDiscussions, Browser-Tests, Collaboration-Team-Triage (Collab-Team-This-Quarter), and 2 others: Flow: Migrate browser tests from Ruby to node.js - https://phabricator.wikimedia.org/T174591#3797647 (zeljkofilipin) p:Triage>Normal a:zeljkofili...
[19:12:48] <paladox>	 greg-g it was fixed but andrew says the cron should run and fix this soon. He's going to keep an eye on it.
[19:13:26] <paladox>	 to fix it manually, replace the default_manifest line in puppet.conf with default_manifest = $confdir/manifests and then run puppet.
[19:13:37] <paladox>	 then sudo service apache2 restart
[19:13:53] <paladox>	 then puppet should start working. you only need to do that on the puppetmaster.
[19:14:21] <paladox>	 thcipriani ^^
[19:14:29] <chasemp>	 that's only going to work if https://gerrit.wikimedia.org/r/#/c/394098/4 has landed there I imagine
[19:14:37] <paladox>	 yeh
[19:14:38] <thcipriani>	 34.todd_                    │19:12:47           paladox greg-g it was fixed but andrew says the cron should run and fix this soon. Hes going to keep and eye on it.
[19:14:55] <thcipriani>	 heh middle click paste rather than middle click link click
[19:15:10] <thcipriani>	 sorry for pings
[19:15:48] <paladox>	 cc chasemp ^^ :)
[19:16:14] <thcipriani>	 chasemp: afaict that has landed on deployment-puppetmaster02
[19:16:25] <paladox>	 i think all puppetmasters may need to do that.
[19:17:40] <chasemp>	 paladox: so I wondered the same but that does not seem to fix things
[19:17:46] <chasemp>	 I know it's part of the issue anyway
[19:17:54] <paladox>	 hmm, it fixes it for me
[19:18:29] <chasemp>	 there were other transitional changes that maybe didn't happen here first?
[19:18:46] <thcipriani>	 that's possible
[19:20:05] <paladox>	 1. edit /etc/puppet/puppet.conf with vi (replace the default_manifest line with default_manifest = $confdir/manifests) then save
[19:20:09] <paladox>	 2. restart apache
[19:20:13] <paladox>	 then run puppet
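A minimal Python sketch of step 1 above, for anyone who would rather script the edit than open vi. The function name `fix_default_manifest` and the sample config text are hypothetical; only the replacement value comes from the log. It rewrites text in memory and does not touch /etc/puppet/puppet.conf itself.

```python
import re

# Replacement value quoted in the log.
FIXED_LINE = "default_manifest = $confdir/manifests"

# Matches any line that sets default_manifest, capturing its indentation.
_PATTERN = re.compile(r"^([ \t]*)default_manifest\b.*$", re.MULTILINE)


def fix_default_manifest(conf_text: str) -> str:
    """Return conf_text with every default_manifest line set to FIXED_LINE."""
    return _PATTERN.sub(lambda m: m.group(1) + FIXED_LINE, conf_text)


# Made-up sample; a real puppet.conf has more settings than this.
sample = "[master]\ndefault_manifest = /some/old/path\n"
print(fix_default_manifest(sample))
```

Once the result is written back to puppet.conf on the puppetmaster, the remaining steps from the log still apply: `sudo service apache2 restart`, then run puppet.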
[19:20:57] <wikibugs>	 Release-Engineering-Team (Kanban), StructuredDiscussions, Browser-Tests, Collaboration-Team-Triage (Collab-Team-This-Quarter), and 2 others: Flow: Migrate browser tests from Ruby to node.js - https://phabricator.wikimedia.org/T174591#3797689 (zeljkofilipin) I have assigned the task to myself unti...
[19:21:24] <thcipriani>	 ah...I did just notice that puppet.conf had the wrong default_manifest line
[19:21:31] <thcipriani>	 and now it seems to be running ok
[19:21:41] <chasemp>	 I changed it back after it didn't work the first time
[19:21:51] <chasemp>	 it's possible I race conditioned originally w/ the puppet checkout
[19:22:02] <chasemp>	 order of operational things
[19:22:11] <thcipriani>	 ah, gotcha, well puppet run was just successful on deployment-puppetmaster02
[19:22:23] <chasemp>	 thcipriani: can you spot check a few clients as well please?
[19:22:28] <thcipriani>	 yep, doing
[19:23:37] <thcipriani>	 looking good for a handful of them!
[19:23:54] <thcipriani>	 chasemp: paladox thanks for your help!
[19:24:00] <chasemp>	 ok cool
[19:24:09] <thcipriani>	 I think we'll start to see recoveries from shinken soon :)
[19:24:14] <paladox>	 You're welcome, though andrew gave me the hint to replace default_manifest line :)
[19:24:45] <paladox>	 chasemp i think this will need to be done on all local puppet masters.
[19:24:46] <chasemp>	 there are a lot of broken masters out there I imagine https://tools.wmflabs.org/openstack-browser/puppetclass/role::puppetmaster::standalone
[19:25:05] <paladox>	 puppet-phabricator.phabricator.eqiad.wmflabs and puppet-paladox3 are fixed :)
[19:27:20] * thcipriani checks integration
[19:31:31] <shinken-wm>	 RECOVERY - Puppet errors on deployment-imagescaler02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:32:15] <thcipriani>	 yeap. Looks like integration-puppetmaster needed a manual nudge as well.
[19:32:46] <shinken-wm>	 RECOVERY - Puppet errors on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:32:52] <wikibugs>	 (PS1) Zfilipin: Do not run Ruby Selenium jobs for Flow [integration/config] - https://gerrit.wikimedia.org/r/394111 (https://phabricator.wikimedia.org/T174591)
[19:34:16] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:34:22] <shinken-wm>	 RECOVERY - Puppet errors on deployment-secureredirexperiment is OK: OK: Less than 1.00% above the threshold [0.0]
[19:34:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:34:52] <wikibugs>	 (PS2) Zfilipin: Do not run Ruby Selenium jobs for Flow [integration/config] - https://gerrit.wikimedia.org/r/394111 (https://phabricator.wikimedia.org/T174591)
[19:34:53] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[19:34:55] <shinken-wm>	 RECOVERY - Puppet errors on deployment-jobrunner02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:34:57] <shinken-wm>	 RECOVERY - Puppet errors on deployment-apertium02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:35:11] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cassandra3-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:35:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[19:35:33] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:35:47] <shinken-wm>	 RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:35:59] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki07 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:36:38] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:37:50] <shinken-wm>	 RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:38:12] <shinken-wm>	 RECOVERY - Puppet errors on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:39:55] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cassandra3-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:01] <shinken-wm>	 RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:03] <shinken-wm>	 RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:03] <shinken-wm>	 RECOVERY - Puppet errors on deployment-trending01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:23] <shinken-wm>	 RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:39] <shinken-wm>	 RECOVERY - Puppet errors on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:40:43] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:41:22] <shinken-wm>	 RECOVERY - Puppet errors on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:41:28] <shinken-wm>	 RECOVERY - Puppet errors on integration-publishing is OK: OK: Less than 1.00% above the threshold [0.0]
[19:41:30] <shinken-wm>	 RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:41:54] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:42:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:43:25] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ircd is OK: OK: Less than 1.00% above the threshold [0.0]
[19:44:47] <shinken-wm>	 RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:45:40] <shinken-wm>	 RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:46:05] <shinken-wm>	 RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:46:20] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:47:20] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:47:40] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:06] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:10] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:14] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:34] <shinken-wm>	 RECOVERY - Puppet errors on deployment-memc07 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:45] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:48:51] <shinken-wm>	 RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:23] <shinken-wm>	 RECOVERY - Puppet errors on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:38] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:40] <shinken-wm>	 RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:46] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cumin is OK: OK: Less than 1.00% above the threshold [0.0]
[19:49:46] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mcs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:04] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:21] <shinken-wm>	 RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:25] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:30] <shinken-wm>	 RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:39] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:50:55] <shinken-wm>	 RECOVERY - Puppet errors on deployment-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:51:14] <shinken-wm>	 RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:52:45] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-jessie-android is OK: OK: Less than 1.00% above the threshold [0.0]
[19:53:05] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[19:53:51] <shinken-wm>	 RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:53:51] <shinken-wm>	 PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [10.0]
[19:54:22] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1007 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:54:42] <shinken-wm>	 RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:55:20] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:55:30] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ms-be03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:55:34] <shinken-wm>	 RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:55:36] <shinken-wm>	 RECOVERY - Puppet errors on deployment-redis05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:55:53] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:57:42] <shinken-wm>	 RECOVERY - Puppet errors on webperformance is OK: OK: Less than 1.00% above the threshold [0.0]
[19:58:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:58:34] <shinken-wm>	 RECOVERY - Puppet errors on deployment-poolcounter04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:20] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:35] <shinken-wm>	 RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:39] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:51] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:59:55] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ms-be04 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:03] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:05] <shinken-wm>	 RECOVERY - Puppet errors on jenkinstest is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:17] <shinken-wm>	 RECOVERY - Puppet errors on deployment-ores-redis-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:17] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:32] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cpjobqueue is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:46] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:00:48] <shinken-wm>	 RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0]
[20:01:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:01:26] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:04:05] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:05:46] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:07:54] <shinken-wm>	 RECOVERY - Puppet errors on integration-cumin is OK: OK: Less than 1.00% above the threshold [0.0]
[20:09:13] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1005 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:09:56] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Support git-lfs files in gerrit - https://phabricator.wikimedia.org/T171758#3797904 (Paladox) git-lfs is now supported :).  See https://gerrit.wikimedia.org/r/#/c/394125/
[20:10:27] <shinken-wm>	 RECOVERY - Puppet errors on castor02 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:11:07] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:11:36] <shinken-wm>	 RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:12:18] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:15:35] <paladox>	 we should write up some lfs docs now :).
[20:16:39] <shinken-wm>	 PROBLEM - Puppet errors on deployment-redis05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[20:16:58] <mutante>	 paladox: in a really large file...
[20:17:20] <paladox>	 yep. 1mb file. i just used head to try and create a test file.
[20:18:00] <mutante>	 hehe "Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others" :)
[20:18:12] <paladox>	 heh
[20:18:42] <shinken-wm>	 PROBLEM - Puppet errors on webperformance is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[20:19:26] <shinken-wm>	 PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[20:20:20] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:20:35] <shinken-wm>	 PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:20:41] <shinken-wm>	 PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:20:51] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:20:55] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[20:21:01] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:21:15] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-docker-1006 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:21:29] <shinken-wm>	 PROBLEM - Puppet errors on deployment-cpjobqueue is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:21:37] <paladox>	 hmm
[20:21:46] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[20:21:48] <shinken-wm>	 PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[20:21:50] <paladox>	 it happened for me too, but as soon as i re-ran it, it worked
[20:21:57] <shinken-wm>	 PROBLEM - Puppet errors on deployment-mediawiki07 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:22:27] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:22:29] <shinken-wm>	 PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:23:12] <wikibugs>	 (CR) Jdlrobson: [C: 1] "The unit tests have been fixed now.. Keen to get this merged so they don't break again!" [integration/config] - https://gerrit.wikimedia.org/r/393642 (https://phabricator.wikimedia.org/T181429) (owner: Jdlrobson)
[20:24:35] <shinken-wm>	 PROBLEM - Puppet errors on deployment-poolcounter04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[20:31:27] <wikibugs>	 Release-Engineering-Team (Watching / External), Scap, ORES, Operations, Scoring-platform-team: ORES should use a git large file plugin for storing serialized binaries - https://phabricator.wikimedia.org/T171619#3797980 (demon)
[20:31:35] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Support git-lfs files in gerrit - https://phabricator.wikimedia.org/T171758#3797978 (demon) Open>Resolved a:demon
[20:32:09] <shinken-wm>	 PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[20:37:07] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 2 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798012 (awight)
[20:38:50] <shinken-wm>	 RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[20:39:57] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 2 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798028 (awight)
[20:41:12] <Krinkle>	 hashar: zeljkof: why is https://phabricator.wikimedia.org/T170032 stalled?
[20:41:17] <Krinkle>	 What is it blocked on?
[20:44:11] <hashar>	 Krinkle: will reopen and write a status update
[20:45:55] <zeljkof>	 Krinkle: there is no progress on it in months
[20:45:56] <wikibugs>	 Release-Engineering-Team (Watching / External), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: WebdriverIO should run Chrome headlessly - https://phabricator.wikimedia.org/T167507#3798083 (hashar)
[20:45:58] <wikibugs>	 Continuous-Integration-Infrastructure, User-zeljkofilipin: Upgrade to Chromium 59 or newer on Debian Jessie in CI - https://phabricator.wikimedia.org/T170032#3798079 (hashar) stalled>Open **Status update**  T179360 provides a Docker container with a Xvfb driver and running npm install. That was a...
[20:45:59] <hashar>	 Krinkle: that is a work in progress more or less
[20:46:29] <shinken-wm>	 PROBLEM - Puppet errors on deployment-ms-be03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:46:38] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Support git-lfs files in gerrit - https://phabricator.wikimedia.org/T171758#3798085 (awight)
[20:46:44] <wikibugs>	 Release-Engineering-Team (Watching / External), Scap, ORES, Operations, Scoring-platform-team: ORES should use a git large file plugin for storing serialized binaries - https://phabricator.wikimedia.org/T171619#3798086 (awight)
[20:46:48] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 2 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798084 (awight)
[20:46:53] <zeljkof>	 hashar: sorry, I thought it was stalled :)
[20:46:54] <hashar>	 Krinkle: the container that has Xvfb in the background and phantomjs is https://gerrit.wikimedia.org/r/#/c/393232/2
[20:47:10] <hashar>	 Krinkle: then I will switch it to stretch  and add chromium/firefox
[20:48:10] <hashar>	 zeljkof: no worries :)
[20:48:26] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), Patch-For-Review: Create "npm-browser" docker image with npm, xvfb, chromium, and firefox installed - https://phabricator.wikimedia.org/T179360#3798096 (hashar)
[20:48:28] <wikibugs>	 Continuous-Integration-Infrastructure, User-zeljkofilipin: Upgrade to Chromium 59 or newer on Debian Jessie in CI - https://phabricator.wikimedia.org/T170032#3798095 (hashar)
[20:48:34] <Krinkle>	 hashar: ok. 
[20:49:26] <wikibugs>	 Release-Engineering-Team (Watching / External), Scap, ORES, Operations, Scoring-platform-team: ORES should use a git large file plugin for storing serialized binaries - https://phabricator.wikimedia.org/T171619#3798097 (Paladox) stalled>Open
[20:49:45] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), User-zeljkofilipin: Upgrade to Chromium 59 or newer on Debian Jessie in CI - https://phabricator.wikimedia.org/T170032#3417522 (hashar)
[20:50:07] <hashar>	 Krinkle: the thing I have to dig into is how to auto rebuild the container when a new chromium is available
[20:50:17] <hashar>	 but there is some tooling being built for that by ops
[20:51:57] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 2 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798104 (awight) I'm guessing we want to do something like, # Copy repos to a read-only location. # Set LFS flags...
[20:53:05] <wikibugs>	 Continuous-Integration-Infrastructure (shipyard), User-zeljkofilipin: Upgrade to Chromium 59 or newer on Debian Jessie in CI - https://phabricator.wikimedia.org/T170032#3798112 (hashar) And I thought I commented on this task sorry. My thoughts are: * I am not going to add a stretch image to Nodepool. Nod...
[20:55:18] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki05 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:55:41] <shinken-wm>	 RECOVERY - Puppet errors on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:55:59] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:56:17] <shinken-wm>	 RECOVERY - Puppet errors on integration-slave-docker-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:56:27] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cpjobqueue is OK: OK: Less than 1.00% above the threshold [0.0]
[20:56:37] <shinken-wm>	 RECOVERY - Puppet errors on deployment-redis05 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:56:58] <shinken-wm>	 RECOVERY - Puppet errors on deployment-mediawiki07 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:58:16] <Krinkle>	 hashar: Makes sense.
[20:58:21] <Krinkle>	 hashar: What about mediawiki/php/apache
[20:58:31] <Krinkle>	 e.g. for npm jobs for mediawiki and extensions
[20:58:33] <hashar>	 no clue yet :(
[20:58:40] <Krinkle>	 What are the blockers for that?
[20:58:42] <shinken-wm>	 RECOVERY - Puppet errors on webperformance is OK: OK: Less than 1.00% above the threshold [0.0]
[20:59:08] <hashar>	 I haven't thought much about it yet
[20:59:27] <shinken-wm>	 RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:59:35] <shinken-wm>	 RECOVERY - Puppet errors on deployment-poolcounter04 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:00:33] <shinken-wm>	 RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:00:50] <shinken-wm>	 RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:00:55] <wikibugs>	 Release-Engineering-Team, ORES, Operations, Scoring-platform-team (Current): Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3798137 (thcipriani) Hrm. I think this error probably has something to do with ssh client timeout. I'm not sure if anything rece...
[21:01:47] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 2 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798142 (demon) They'll mirror just fine since Phabricator just observes upstream.
[21:01:48] <shinken-wm>	 RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0]
[21:02:26] <shinken-wm>	 RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:15:39] <wikibugs>	 Release-Engineering-Team (Kanban), Operations, Release Pipeline, User-Joe: Upgrade latest docker-registry.wikimedia.org/nodejs-devel to stretch - https://phabricator.wikimedia.org/T180524#3798234 (Joe) p:Triage>High
[21:17:03] <awight>	 !log deployment-prep Verbose logging for ORES Celery
[21:17:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:30:20] <wikibugs>	 (CR) Hashar: [C: 2] Run QUnit tests on Minerva skin non-experimental [integration/config] - https://gerrit.wikimedia.org/r/393642 (https://phabricator.wikimedia.org/T181429) (owner: Jdlrobson)
[21:30:34] <awight>	 thcipriani: Just curious, I see that fetching a new revision of ORES takes a long time even if it’s just a minor submodule bump.  I would’ve thought that the git cache would make that fast.  Is it really just a helluva lot of pure I/O file copying?
[21:30:49] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-sca03 is CRITICAL: CRITICAL: deployment-prep.deployment-sca03.diskspace._srv.byte_percentfree (<30.00%)
[21:31:29] <wikibugs>	 (Merged) jenkins-bot: Run QUnit tests on Minerva skin non-experimental [integration/config] - https://gerrit.wikimedia.org/r/393642 (https://phabricator.wikimedia.org/T181429) (owner: Jdlrobson)
[21:31:50] <greg-g>	 awight: #REDIRECT twentyafterfour ^ ( thcipriani is trying to focus on non-scap stuff as much as possible)
[21:31:51] <thcipriani>	 awight: are you seeing that on beta or prod? We're doing things two different ways currently :)
[21:32:03] <thcipriani>	 heh, or that works, too :)
[21:32:34] <awight>	 twentyafterfour: This is on beta (pretending you are thcipriani :)
[21:33:31] <awight>	 Interestingly, a scap --force of the same revision is lightning-fast.
[21:33:50] <wikibugs>	 Continuous-Integration-Config, Patch-For-Review, User-Jdlrobson: QUnit tests are not running on Minerva skin - https://phabricator.wikimedia.org/T181429#3798288 (hashar) a:Jdlrobson Thanks Jon for taking care of the CI config :]
[21:35:57] <thcipriani>	 well. To finish my thought. If it were on prod it's because submodule caching is still only in master. Since it's happening on beta it bears some investigation: I don't know, it should be pretty fast since there is a cache for submodules as well now, IIRC.
[21:46:18] <wikibugs>	 Continuous-Integration-Config, Proton, Patch-For-Review, Readers-Web-Backlog (Tracking): Set up Jenkins for chromium-render and chromium-render-deploy repositories - https://phabricator.wikimedia.org/T179552#3798316 (hashar) There is some eslint missing: ``` ├─┬ UNMET PEER DEPENDENCY eslint@3.19....
[21:46:18] <wikibugs>	 (CR) Hashar: "Replied on T179552 :D Lets poke each other!" [integration/config] - https://gerrit.wikimedia.org/r/394058 (https://phabricator.wikimedia.org/T179552) (owner: Phuedx)
[21:47:29] <paladox>	 no_justification wondering could you merge https://gerrit.wikimedia.org/r/#/c/394179/ please? :)
[21:48:45] <no_justification>	 Lemme finish this other thing up then yeah I'll have a look
[21:48:54] <no_justification>	 Unrelated: we should have a jenkins job for refs/meta/config changes
[21:48:59] <no_justification>	 Should be easy to lint
[21:49:31] <paladox>	 ah
[21:49:49] <paladox>	 hmm, i wonder how will jenkins lint the project.config and groups.config file?
[21:49:56] <paladox>	 is there a lint for that? :)
[21:50:13] <paladox>	 (also thanks for reviewing it when you can :))
[21:50:34] <icinga-wm>	 PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[21:52:15] <no_justification>	 paladox: gitconfig files should be easy to lint -- looks like there's an npm package that does it already
[21:52:21] <no_justification>	 (ew, npm, i suggested that?!)
[21:52:29] <paladox>	 ah, and lol
[21:52:43] <paladox>	 i guess we can do that
[21:54:06] * paladox wonders what the package name is
[21:54:20] <Zppix>	 I need to shower... i just saw the word npm xD
[21:54:22] <no_justification>	 Actually, ConfigParser in Python works just fine, if you strip the leading spaces
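A rough Python sketch of the ConfigParser idea above, assuming the file being linted is git-config style text such as a project.config. The function name `lint_gitconfig` and the sample text are hypothetical, and this is not an existing Jenkins job. Two simplifications are assumptions of this sketch: repeated keys are tolerated (`strict=False`), and a line with no `=` is reported as an error even though git itself accepts bare boolean keys.

```python
import configparser


def lint_gitconfig(text: str) -> list[str]:
    """Return a list of parse errors for git-config style text (empty if none)."""
    # Strip leading whitespace so indented keys are not read as continuations.
    stripped = "\n".join(line.lstrip() for line in text.splitlines())
    parser = configparser.ConfigParser(
        strict=False,        # repeated sections and keys are tolerated
        interpolation=None,  # do not treat "%" in values as special
        delimiters=("=",),   # only "=" separates key and value
    )
    try:
        parser.read_string(stripped)
    except configparser.Error as exc:
        return [str(exc)]
    return []


# Made-up sample in the shape of a project.config fragment.
SAMPLE = '[access "refs/*"]\n\tread = group Administrators\n\tread = group Anonymous Users\n'
print(lint_gitconfig(SAMPLE))  # prints []
```

A CI job could call this on each changed file and fail when the returned list is non-empty.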
[21:57:28] <no_justification>	 paladox: Merged
[21:57:30] <paladox>	 thanks no_justification 
[21:58:05] <no_justification>	 yw
[22:22:30] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Watching / External), Cloud-VPS, Nodepool, and 2 others: figure out if nodepool is overwhelming rabbitmq and/or nova - https://phabricator.wikimedia.org/T170492#3798413 (hashar) The issue  ( T170492#3581822 ) still happens from time...
[22:23:22] <hashar>	 Nodepool should be back 
[22:24:06] <hashar>	 something got stalled in openstack at 21:09 and probably again after that
[22:24:09] <hashar>	 but it seems to be recovering
[22:51:48] <Krinkle>	 addshore: btw, thoughts on using a subdirectory for MW in docker-dev?
[22:51:54] <Krinkle>	 (e.g. 'w' or mediawiki, instead of root)
[22:52:21] <addshore>	 Yup, can do! :)
[22:52:42] <addshore>	 Again, thanks for all the PRs..... I've not had much time to poke it much recently
[22:52:44] <Krinkle>	 addshore: Did you go with root because it was easiest with the webdev images, or because it is preferred?
[22:52:50] <addshore>	 I really want to make running unit tests easier
[22:52:52] <Krinkle>	 addshore: Thank you for reviewing them :)
[22:53:29] <addshore>	 I simply went with root as I didn't see a reason not to really :P
[22:53:35] <Krinkle>	 Ah, okay.
[22:53:53] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Watching / External), Cloud-VPS, Nodepool, and 2 others: figure out if nodepool is overwhelming rabbitmq and/or nova - https://phabricator.wikimedia.org/T170492#3798548 (Andrew) >>! In T170492#3798417, @Stashbot wrote: > {nav icon=f...
[22:53:58] <Krinkle>	 I suppose it would still be preferable to avoid having to write nginx/apache 2.2/apache 2.4/whatever config separately just for a simple rewrite rule
[22:54:06] <Krinkle>	 But I worked around that by using an index.php wrapper instead.
[22:54:08] <Krinkle>	 in the document root
[22:54:10] <Krinkle>	 simple enough
[22:55:04] <addshore>	 Ahh yes, perhaps that's why I did it, to avoid touching the images where possible and try to just use the env vars you can pass in
[22:55:25] <Krinkle>	 Yeah, they have no env API for rewrite rules (eww..)
[22:55:55] <Krinkle>	 Atm it's blocking QUnit and other npm commands because Karma will serve the HTML transparently, and then MW will try to relatively access itself where / is already taken by Karma itself.
[22:55:56] <Platonides>	 why should they?
[22:56:06] <Krinkle>	 I'll submit a PR :)
[22:56:25] <addshore>	 wouldn't it be cool (once CI is all run in containers) to be able to have CI run in the same docker environment locally as it does currently within the wmf infrastructure? ;)
[22:56:33] <addshore>	 anyway! I'm off to bed for now!
[22:56:37] <Krinkle>	 Platonides: Yeah, it makes sense not to want to provide a server-agnostic API to rewrite rules in the form of env variables.
[22:57:21] <Krinkle>	 but it also means that to mount a web app elsewhere than root, you have to commit to a specific web server when writing the config, whereas right now the niceness is that we can swap the nginx/apache base images in mediawiki-docker-dev without any modifications.
[22:57:24] <Krinkle>	 Only run-time plugging.
[22:58:33] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798554 (Halfak) Trying start a gerrit review for wheels.  Got this:  ``` Do you really want to submit the above...
[22:59:47] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798556 (Halfak) Putting repo backups here: https://analytics.wikimedia.org/datasets/archive/public-datasets/all/...
[23:17:56] <wikibugs>	 Gerrit, Release-Engineering-Team (Kanban), Scap (Tech Debt Sprint FY201718-Q2), ORES, and 3 others: Plan migration of ORES repos to git-lfs - https://phabricator.wikimedia.org/T181678#3798581 (Halfak) https://github.com/wiki-ai/draftquality is fully updated.
[23:59:36] <icinga-wm>	 RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1