[00:12:23] <shinken-wm>	 PROBLEM - Puppet staleness on deployment-apertium01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]  
[03:04:11] <shinken-wm>	 PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]  
[03:11:18] <shinken-wm>	 PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]  
[03:38:25] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<44.44%)  
[04:55:25] <shinken-wm>	 PROBLEM - Puppet failure on deployment-stream is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]  
[04:58:31] <shinken-wm>	 PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[05:02:10] <shinken-wm>	 PROBLEM - Puppet failure on deployment-redis02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]  
[05:04:48] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]  
[05:05:37] <shinken-wm>	 PROBLEM - Puppet failure on deployment-redis01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]  
[05:07:40] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[05:08:22] <shinken-wm>	 PROBLEM - Puppet failure on deployment-restbase03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]  
[05:11:05] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc02 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]  
[05:12:14] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]  
[05:14:55] <shinken-wm>	 PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]  
[05:18:37] <shinken-wm>	 PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]  
[05:24:04] <shinken-wm>	 PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]  
[05:26:04] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]  
[05:28:20] <shinken-wm>	 PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]  
[05:29:46] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:32:06] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluoride is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]  
[05:32:40] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic08 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:32:54] <shinken-wm>	 PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]  
[05:33:24] <shinken-wm>	 RECOVERY - Puppet failure on deployment-restbase03 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:33:26] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[05:35:36] <shinken-wm>	 RECOVERY - Puppet failure on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:36:08] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc02 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:37:13] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:37:57] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]  
[05:39:09] <shinken-wm>	 PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]  
[05:39:57] <shinken-wm>	 RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:40:47] <shinken-wm>	 PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]  
[05:43:36] <shinken-wm>	 RECOVERY - Puppet failure on deployment-cache-bits01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:44:22] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoid04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]  
[05:45:24] <shinken-wm>	 RECOVERY - Puppet failure on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:48:41] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mathoid is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]  
[05:52:10] <shinken-wm>	 RECOVERY - Puppet failure on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:53:50] <shinken-wm>	 PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]  
[05:57:07] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluoride is OK: OK: Less than 1.00% above the threshold [0.0]  
[05:58:18] <shinken-wm>	 RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:01:03] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:03:09] <shinken-wm>	 PROBLEM - Puppet failure on deployment-redis02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]  
[06:03:25] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki02 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:04:10] <shinken-wm>	 RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:05:48] <shinken-wm>	 RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:09:04] <shinken-wm>	 RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:13:40] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:14:18] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoid04 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:17:54] <shinken-wm>	 RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:18:48] <shinken-wm>	 RECOVERY - Puppet failure on deployment-eventlogging02 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:22:57] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoid05 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:23:33] <shinken-wm>	 RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:28:08] <shinken-wm>	 RECOVERY - Puppet failure on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0]  
[06:38:24] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK  
[06:39:38] <shinken-wm>	 PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[07:04:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-cache-bits01 is OK: OK: Less than 1.00% above the threshold [0.0]  
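The shinken alerts above report the share of recent datapoints that sit above a threshold (for example "22.22% of data above the critical threshold [0.0]"). A minimal sketch of that calculation follows, assuming the check simply counts non-missing datapoints above the threshold; `percent_above` is a hypothetical helper, not the actual check plugin.

```python
def percent_above(datapoints, threshold):
    """Return the percentage of non-missing datapoints strictly above threshold.

    Assumption: missing samples are represented as None and are ignored.
    """
    values = [v for v in datapoints if v is not None]
    if not values:
        return 0.0
    above = sum(1 for v in values if v > threshold)
    return 100.0 * above / len(values)


# Two failing samples out of nine, with a threshold of 0.0.
print(f"{percent_above([0, 0, 1, 0, 2, 0, 0, 0, 0], 0.0):.2f}%")  # 22.22%
```

With a threshold of 0.0, a single failed puppet run in the window is enough to push the percentage above zero, which would be consistent with how noisy these alerts are.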
[08:54:04] <grrrit-wm>	 (PS1) Adrian Lang: (Does not work) Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - https://gerrit.wikimedia.org/r/180418
[08:54:54] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] (Does not work) Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - https://gerrit.wikimedia.org/r/180418 (owner: Adrian Lang)
[08:57:00] <grrrit-wm>	 (PS2) Adrian Lang: (Does not work) Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - https://gerrit.wikimedia.org/r/180418
[09:06:43] <grrrit-wm>	 (CR) Gilles: [C: 1] Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[09:17:05] <wikibugs>	 Continuous-Integration: common gating job for mediawiki core and extensions - https://phabricator.wikimedia.org/T60772#852789 (hashar) Not yet. We have the `mediawiki-gate` dummy jobs used to enforce MediaWiki related changes to share the same queue in the Zuul gate-and-submit pipeline.  That bug is about mak...
[10:09:42] <grrrit-wm>	 (CR) Hashar: Add jobs for Sentry (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:09:52] <grrrit-wm>	 (PS2) Hashar: Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:10:28] <grrrit-wm>	 (CR) Hashar: [C: 2] "Jobs created:" [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:24:10] <grrrit-wm>	 (PS3) Hashar: Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:24:20] <grrrit-wm>	 (CR) Hashar: [C: 2] Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:25:02] <grrrit-wm>	 (CR) jenkins-bot: [V: -1] Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:30:57] <grrrit-wm>	 (Merged) jenkins-bot: Add jobs for Sentry [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[10:38:02] <grrrit-wm>	 (PS1) Hashar: Drop {name}-{ext-name}-testextension [integration/config] - https://gerrit.wikimedia.org/r/180437
[10:50:47] <grrrit-wm>	 (CR) Hashar: [C: 2] Drop {name}-{ext-name}-testextension [integration/config] - https://gerrit.wikimedia.org/r/180437 (owner: Hashar)
[10:57:49] <grrrit-wm>	 (Merged) jenkins-bot: Drop {name}-{ext-name}-testextension [integration/config] - https://gerrit.wikimedia.org/r/180437 (owner: Hashar)
[11:35:50] <wmf-insecte>	 Project beta-scap-eqiad build #34296: FAILURE in 1 min 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34296/
[11:55:55] <wmf-insecte>	 Yippee, build fixed!
[11:55:56] <wmf-insecte>	 Project beta-scap-eqiad build #34298: FIXED in 1 min 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34298/
[12:43:27] <wikibugs>	 Continuous-Integration: remove operations-apache-config-lint on operations/mediawiki-config - https://phabricator.wikimedia.org/T78782#853100 (Dzahn) NEW
[12:44:15] <wikibugs>	 Continuous-Integration: remove operations-apache-config-lint on operations/mediawiki-config - https://phabricator.wikimedia.org/T78782#853100 (Dzahn) example links:  https://gerrit.wikimedia.org/r/#/c/180451/ operations-apache-config-lint FAILURE in 29s (non-voting)  https://integration.wikimedia.org/ci/job/o...
[13:27:44] <grrrit-wm>	 (PS1) Hashar: Prevent hhvm on REL1_19 and REL1_22 [integration/config] - https://gerrit.wikimedia.org/r/180469
[13:34:54] <grrrit-wm>	 (PS1) QChris: Add jobs for analytics/blog [integration/config] - https://gerrit.wikimedia.org/r/180470
[13:35:06] <wmf-insecte>	 Project beta-scap-eqiad build #34308: FAILURE in 1 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34308/
[13:41:26] <grrrit-wm>	 (CR) Hashar: [C: 2] Prevent hhvm on REL1_19 and REL1_22 [integration/config] - https://gerrit.wikimedia.org/r/180469 (owner: Hashar)
[13:42:45] <grrrit-wm>	 (Merged) jenkins-bot: Prevent hhvm on REL1_19 and REL1_22 [integration/config] - https://gerrit.wikimedia.org/r/180469 (owner: Hashar)
[13:43:03] <grrrit-wm>	 (PS2) Hashar: Add jobs for analytics/blog [integration/config] - https://gerrit.wikimedia.org/r/180470 (owner: QChris)
[13:45:01] <grrrit-wm>	 (CR) Hashar: [C: 2] "Jobs deployed, thanks!" [integration/config] - https://gerrit.wikimedia.org/r/180470 (owner: QChris)
[13:49:51] <grrrit-wm>	 (Merged) jenkins-bot: Add jobs for analytics/blog [integration/config] - https://gerrit.wikimedia.org/r/180470 (owner: QChris)
[13:55:27] <wmf-insecte>	 Yippee, build fixed!
[13:55:27] <wmf-insecte>	 Project beta-scap-eqiad build #34310: FIXED in 1 min 27 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34310/
[14:03:28] <grrrit-wm>	 (PS1) Hashar: Force color for tox [integration/config] - https://gerrit.wikimedia.org/r/180474
[14:09:52] <grrrit-wm>	 (CR) Hashar: [C: 2] Force color for tox [integration/config] - https://gerrit.wikimedia.org/r/180474 (owner: Hashar)
[14:14:22] <grrrit-wm>	 (Merged) jenkins-bot: Force color for tox [integration/config] - https://gerrit.wikimedia.org/r/180474 (owner: Hashar)
[14:14:35] <wikibugs>	 Continuous-Integration: remove operations-apache-config-lint on operations/mediawiki-config - https://phabricator.wikimedia.org/T78782#853270 (Dzahn) also see T72068
[14:34:34] <shinken-wm>	 PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]  
[14:53:34] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]  
[15:04:32] <shinken-wm>	 RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]  
[15:16:11] <grrrit-wm>	 (PS1) Hashar: Drop mediawiki-phpunit [integration/config] - https://gerrit.wikimedia.org/r/180487
[15:21:19] <grrrit-wm>	 (CR) Hashar: [C: 2] Drop mediawiki-phpunit [integration/config] - https://gerrit.wikimedia.org/r/180487 (owner: Hashar)
[15:25:42] <grrrit-wm>	 (Merged) jenkins-bot: Drop mediawiki-phpunit [integration/config] - https://gerrit.wikimedia.org/r/180487 (owner: Hashar)
[16:03:40] <grrrit-wm>	 (PS1) Hashar: (WIP) gating extensions together (WIP) [integration/config] - https://gerrit.wikimedia.org/r/180494
[16:11:01] <grrrit-wm>	 (CR) Hashar: [C: -2] (WIP) gating extensions together (WIP) (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/180494 (owner: Hashar)
[16:14:53] <wikibugs>	 MediaWiki-Unit-tests, Continuous-Integration: MediaWiki core 'structure' tests are not run for extensions - https://phabricator.wikimedia.org/T78798#853473 (hashar) NEW
[16:38:13] <wikibugs>	 Continuous-Integration: HHVM Jenkins job throw: Unable to set CoreFileSize to 8589934592: Operation not permitted (1) - https://phabricator.wikimedia.org/T78799#853512 (hashar) NEW
[16:44:45] <grrrit-wm>	 (PS2) Hashar: (WIP) gating extensions together (WIP) [integration/config] - https://gerrit.wikimedia.org/r/180494
[16:45:18] <grrrit-wm>	 (CR) Hashar: "Added VisualEditor and added a hack to initialize the submodule." [integration/config] - https://gerrit.wikimedia.org/r/180494 (owner: Hashar)
[16:54:05] <grrrit-wm>	 (CR) Hashar: "Failures can be seen at https://integration.wikimedia.org/ci/job/mediawiki-phpunit-integration-hhvm/6/#showFailuresLink" [integration/config] - https://gerrit.wikimedia.org/r/180494 (owner: Hashar)
[17:07:50] <wmf-insecte>	 Yippee, build fixed!
[17:07:51] <wmf-insecte>	 Project browsertests-Wikidata-WikidataTests-linux-firefox-sauce build #75: FIXED in 2 hr 45 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-firefox-sauce/75/
[17:29:19] <hasharCall>	 off
[17:43:46] <chrismcmahon>	 ryasmeen: I think I fixed all the browser test failures, I kicked off a build to see what happens: https://integration.wikimedia.org/ci/view/BrowserTests/view/-All/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/474/
[17:46:24] <ryasmeen>	 chrismcmahon:nice!
[17:47:52] <chrismcmahon>	 ryasmeen: the only issue that was not just a new locator was needing to clear the default text first from the Media search text box: https://gerrit.wikimedia.org/r/#/c/180524/
[17:48:01] <chrismcmahon>	 hooray for Page Objects :-)
[17:55:38] <wmf-insecte>	 Project beta-scap-eqiad build #34335: FAILURE in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34335/
[18:16:00] <wmf-insecte>	 Yippee, build fixed!
[18:16:00] <wmf-insecte>	 Project beta-scap-eqiad build #34337: FIXED in 1 Minute 53 Sekunden: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34337/
[18:36:17] <wmf-insecte>	 Project browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce build #231: FAILURE in 22 Minuten: https://integration.wikimedia.org/ci/job/browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce/231/
[18:38:03] <wmf-insecte>	 Yippee, build fixed!
[18:38:03] <wmf-insecte>	 Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #142: FIXED in 36 Minuten: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/142/
[18:40:11] <wikibugs>	 Scrum-of-Scrums, Beta-Cluster: beta cluster: deployment-cache-upload02 does not seem to purge content when getting PURGE - https://phabricator.wikimedia.org/T67683#853799 (Cmcmahon) Open>Resolved
[18:40:33] <wikibugs>	 Scrum-of-Scrums, Beta-Cluster: beta cluster: deployment-cache-upload02 does not seem to purge content when getting PURGE - https://phabricator.wikimedia.org/T67683#698061 (Cmcmahon) spoke to Brandon on IRC, as far as anyone can tell this is now working correctly
[19:14:31] <James_F>	 Krinkle: Why has Jenkins switched into German?
[19:14:47] <Nikerabbit>	 it seems to be in random languages recently
[19:14:52] <James_F>	 Hmm.
[19:23:10] <Krinkle>	 James_F: Yeah, been happening for the past 12 months. 
[19:23:18] <Krinkle>	 Sometimes it adopts one of the languages of the users.
[19:25:17] <Krinkle>	 James_F: Fixed by changing default language from en to en. (Yeah, makes so much sense, right)
[19:25:35] <James_F>	 Krinkle: Helpful. :-)
[19:25:48] <James_F>	 Krinkle: Can we ban users from setting their interface language to stop it breaking?
[19:26:33] <Krinkle>	 James_F: Annoyingly, there's several areas where messages are substituted and now stay in French, Spanish and German.
[19:26:36] <Krinkle>	 E.g. build logs 
[19:27:00] <Krinkle>	 most of build scripts are english, but the bootstrap from Jenkins itself is actually localised
[19:27:02] <James_F>	 Even better.
[19:27:10] <James_F>	 It's also not well localised.
[19:27:17] <James_F>	 Half the messages "in German" were still in English.
[19:37:42] <shinken-wm>	 PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]  
[19:43:56] <grrrit-wm>	 (PS1) Ejegg: Add composer check for DonationInterface/vendor [integration/config] - https://gerrit.wikimedia.org/r/180566
[19:48:58] <grrrit-wm>	 (Abandoned) Ejegg: Add composer check for DonationInterface/vendor [integration/config] - https://gerrit.wikimedia.org/r/180566 (owner: Ejegg)
[19:54:55] <wmf-insecte>	 Project beta-scap-eqiad build #34347: FAILURE in 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34347/
[20:16:08] <wmf-insecte>	 Yippee, build fixed!
[20:16:08] <wmf-insecte>	 Project beta-scap-eqiad build #34349: FIXED in 1 min 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34349/
[20:26:16] <tonythomas>	 i am getting different values for the field user_id in bounce_record table while running php unit tests on my local install and later on jenkins - https://gerrit.wikimedia.org/r/#/c/176366/ - and of course - jenkins fails 
[20:26:47] <tonythomas>	 probably - jenkins user table and my php unit created user table are on a mismatch :\ 
[20:26:52] <tonythomas>	 any way it can be sorted out ?
[20:29:25] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<11.11%)  
[20:39:25] <grrrit-wm>	 (PS1) Legoktm: Setup php-composer-validate for operations/mediawiki-config [integration/config] - https://gerrit.wikimedia.org/r/180591
[21:02:16] <hashar>	 !log cancelled all browser tests, suspecting them to deadlock Jenkins somehow :(
[21:02:21] <qa-morebots>	 Logged the message, Master
[21:03:07] <greg-g>	 eh?
[21:09:26] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<11.11%)  
[21:29:42] <grrrit-wm>	 (PS1) Dduvall: Removed inclusion of pry-byebug [selenium] (env-abstraction-layer) - https://gerrit.wikimedia.org/r/180639
[21:29:44] <grrrit-wm>	 (PS1) Dduvall: Refactored EAL configuration overrides [selenium] (env-abstraction-layer) - https://gerrit.wikimedia.org/r/180640
[22:10:27] <grrrit-wm>	 (CR) Gergő Tisza: Add jobs for Sentry (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/180309 (owner: Gergő Tisza)
[22:10:55] <shinken-wm>	 PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]  
[22:14:25] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<11.11%)  
[22:26:35] <wikibugs>	 Release-Engineering: Add shinken output for Beta Cluster to -operations channel - https://phabricator.wikimedia.org/T1334#854342 (yuvipanda) Bump?
[22:39:26] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK  
[22:39:58] <wikibugs>	 Release-Engineering: Mukunda ready to do deploy on 12/16 - https://phabricator.wikimedia.org/T76049#854388 (greg) Open>Resolved a:greg This was done (and successful) :)
[22:40:06] <wikibugs>	 Release-Engineering: Mukunda ready to do deploy on 12/16 - https://phabricator.wikimedia.org/T76049#854393 (greg) p:High>Triage
[22:40:10] <hashar>	 twentyafterfour: congratulations :]
[22:40:56] <shinken-wm>	 RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0]  
[22:54:01] <wmf-insecte>	 Project beta-scap-eqiad build #34356: FAILURE in 30 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/34356/
[22:58:02] <wikibugs>	 Continuous-Integration, MediaWiki-Unit-tests: MediaWiki core 'structure' tests are not run for extensions - https://phabricator.wikimedia.org/T78798#854427 (hashar) a:hashar Announced on wikitech https://lists.wikimedia.org/pipermail/wikitech-l/2014-December/079864.html
[23:03:20] <twentyafterfour>	 hashar: congratulations? for surviving my first deploy?
[23:14:01] <hashar>	 twentyafterfour: yup :]
[23:14:47] <hashar>	 twentyafterfour: and hopefully gives you some good ideas to refine the deploy process
[23:19:51] <Krinkle>	 hashar: shinken puppet failures are driving me mad.
[23:19:53] <Krinkle>	 Can we do something about that?
[23:20:02] <Krinkle>	 It seems most of them aren't about aptitude
[23:20:27] <Krinkle>	 When I look in syslog, I see puppet is locked. Its previous run still has the lock in place. Then the next time is fine.
[23:20:37] <hashar>	 well it is past midnight about to sleep
[23:20:39] <Krinkle>	 Presumably taking longer than the iteration duration.
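The symptom Krinkle describes (a previous run still holding the lock when the next run starts) can be checked by comparing the lock file's age with the run interval. This is a rough sketch only: `lock_outlived_interval`, the lock path and the interval are hypothetical parameters, not values taken from the puppet configuration on these hosts.

```python
import os
import time


def lock_outlived_interval(lock_path, interval_seconds, now=None):
    """Return True if the lock file exists and is older than one run interval.

    Assumption: the lock file's mtime marks when the holding run started.
    """
    try:
        mtime = os.path.getmtime(lock_path)
    except FileNotFoundError:
        # No lock file means no run is currently holding the lock.
        return False
    now = time.time() if now is None else now
    return (now - mtime) > interval_seconds
```

A True result would support the guess that a run is taking longer than the interval between runs, rather than failing for some other reason.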
[23:20:53] <hashar>	 but in short, we labs had transient DNS issues for most of the week which causes random failures of puppet
[23:20:56] <Krinkle>	 hashar: jenkins is about to shutdown from what it looks like.
[23:21:06] <Krinkle>	 I assume you scheduled that
[23:21:08] <hashar>	  /var/vdb/ is reported as full on the varnish instances, because varnish actually fills the disk
[23:21:17] <hashar>	 Krinkle: yeah to remove a bunch of deadlock
[23:21:20] <hashar>	 will restart Zuul as well
[23:21:23] <hashar>	 and went to sleep
[23:21:56] <hashar>	 there is a nasty interaction between plugins that causes execution slots to end up locked :(
[23:22:22] <hashar>	 been trying to solve it without disturbing devs too much 
[23:24:09] <Krinkle>	 hashar: The static file server for build logs seems easy enough. Write stdout/stderr and artefacts to a unique directory that's mounted as some-thing.wikimedia.org (or wmfusercontent.org) and a cronjob to purge directories older than 30 days. Zuul config can take custom urls for build result (openstack uses it already). And remove Jenkins forever :P. Goal for next quarter? Seems relatively lightweight.
[23:24:18] <Krinkle>	 compared to other plans, like disposable vms
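The purge step in the proposal above (drop build directories older than 30 days) could look roughly like the sketch below, run from cron. It assumes one directory per build directly under a log root; `purge_old_build_dirs` and the directory layout are hypothetical, not an existing script.

```python
import shutil
import time
from pathlib import Path

MAX_AGE_DAYS = 30  # retention period mentioned in the discussion


def purge_old_build_dirs(root, max_age_days=MAX_AGE_DAYS, now=None):
    """Delete direct subdirectories of root whose mtime is older than the cutoff.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialise the listing first so deletions do not disturb iteration.
    for entry in list(Path(root).iterdir()):
        if entry.is_symlink() or not entry.is_dir():
            continue
        if entry.stat().st_mtime < cutoff:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return sorted(removed)
```

One caveat: a directory's mtime only changes when entries are added or removed directly inside it, so this works best if each build directory is written once and then left alone.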
[23:25:27] <Krinkle>	 The main thing I guess we'll need to do in addition is figure out what we use Jenkins for. Right now it's mostly execution. We're already bypassing most of the plugins (e.g. zuul-cloner instead of jenkins git).
[23:25:49] <Krinkle>	 And execution of the job I think still depends on Jenkins. I'm curious how that would work otherwise. Can Zuul/Gearman connect directly?
[23:25:59] <Krinkle>	 creation of workspace on the slaves
[23:28:07] <Krinkle>	 hashar: Looking at openstack, I think they still use Jenkins. They just don't link to it. The build output looks like it's generated by Jenkins (e.g. "Started by user anonymous")
[23:28:47] <hashar>	 yeah they scp iirc
[23:28:52] <hashar>	 or maybe send to swift can't remember
[23:29:08] <hashar>	 we could indeed get some entry under wmfusercontent.org with a server having a bunch of disk space
[23:29:16] <Krinkle>	 Looks like they use jenkins build html export, and then sync to the build log server with scp.
[23:29:25] <hashar>	 then use LOG_PATH ( ex: 63/180663/1/test/mwext-MobileFrontend-npm/741798c )
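The LOG_PATH example above reads as a short shard prefix, the change number, the patchset, the pipeline, the job name and a short build id. The sketch below rebuilds that string; the field meanings are inferred from the single example in the log (63 being the last two digits of 180663), and `build_log_path` is a hypothetical helper rather than Zuul's own code.

```python
def build_log_path(change, patchset, pipeline, job, short_id):
    """Assemble a LOG_PATH-style relative path for one build.

    Assumption: the leading component is the last two digits of the change
    number, which spreads builds across subdirectories.
    """
    change = str(change)
    shard = change[-2:]
    return "/".join([shard, change, str(patchset), pipeline, job, short_id])


print(build_log_path(180663, 1, "test", "mwext-MobileFrontend-npm", "741798c"))
# 63/180663/1/test/mwext-MobileFrontend-npm/741798c
```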
[23:29:44] <Krinkle>	 hashar: wouldn't need much space. gallium seems to handle it fine as well. It's just text files and a few artefacts (log files).
[23:29:51] <hashar>	 yup
[23:29:58] <Krinkle>	 we'll have to purge manually, since jenkins wouldn't do that for us anymore.
[23:30:23] <Krinkle>	 hashar: I'm curious if the performance slowdown would get better if we still have Jenkins but without it keeping any builds in memory.
[23:30:29] <hashar>	 we have a bunch of old app server we can probably reuse
[23:30:33] <Krinkle>	 That should speed things up a bit, but ideally we'd cut it out entirely.
[23:30:37] <hashar>	 feel free to get one allocated for that :]
[23:31:44] <Krinkle>	 hashar: well, we can just mount it on gallium. like doc.wikimedia.org
[23:31:50] <Krinkle>	 wouldn't be too difficult. 
[23:31:54] <Krinkle>	 Same space, different directory.
[23:32:18] <Krinkle>	 Separate it as a module first, then move to separate server later.
[23:33:15] <hashar>	 well ideally I would prefer to get rid of gallium one day
[23:33:48] <Krinkle>	 hashar: well, then we'll need more people to do it for us. I don't see us having time for that. Or priority.
[23:35:54] <Krinkle>	 hashar: Hm.. did you research this before? Would save me some time. E.g. how to export it and from where we'd hook in to do that (e.g. post-build task inside the jenkins job? or is there a way to do it globally?)
[23:36:05] <hashar>	 if we get a spec defining what we want, I am sure ops will be happy to comply
[23:36:24] <hashar>	 i.e. something like:   low CPU/mem + 1 TB disk  with a 10.0.0.0 IP
[23:36:35] <hashar>	 have  cilog.wmfusercontent.org point to misc varnish
[23:36:36] <Krinkle>	 hashar: it'll need to be puppetised, which means it'll take me a month or Infinity for someone else to write it.
[23:36:46] <hashar>	 configure misc varnish for that DNS entry to point to the new host
[23:37:07] <hashar>	 then get the 500GB disk mounted on /srv/  and ask for a lame apache virtual host
[23:37:14] <Krinkle>	 If I do it on gallium I can finish it in 2 days and be done with it. And then leave it to ops or the next guy to puppetise it. Terrible terrible, I know. But may be more realistic.
[23:37:39] <hashar>	 I am sure it can be done reasonably fast on a new server
[23:38:19] <hashar>	 or maybe on lanthanum
[23:38:27] <hashar>	 it has 380GB free :]
[23:39:05] <hashar>	 but then that is still a Precise machine :/
[23:42:35] <legoktm>	 hashar: https://integration.wikimedia.org/ci/job/mwext-MobileFrontend-qunit-mobile/8310/console 23:39:52 java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: request to write '1272' bytes exceeds size in header of '1219' bytes for entry 'log/mw-debug-www.log'
[23:42:40] <legoktm>	 causing the submit job to fail
[23:42:45] <hashar>	 ah yeah
[23:42:50] <hashar>	 was looking at that task in phabricator
[23:43:18] <hashar>	 legoktm: https://phabricator.wikimedia.org/T78590
[23:43:30] <hashar>	 legoktm: the qunit test ends before all apache requests have been completed
[23:43:46] <legoktm>	 weird
[23:43:49] <legoktm>	 ok
[23:43:50] <hashar>	 at the end of the qunit test, Jenkins compresses all the logs using gzip
[23:43:52] <legoktm>	 but
[23:43:57] <legoktm>	 only the submit jobs are failing
[23:43:59] <legoktm>	 the test ones passed
[23:44:00] <hashar>	 then uses tar to gather them
[23:44:18] <hashar>	 but an apache thread ends up still writing to the log file and tar complains the file has changed while it was processing it
[23:44:33] <legoktm>	 https://gerrit.wikimedia.org/r/#/c/180653/ the "recheck" one passed, then after I +2'd, it failed
[23:44:34] <hashar>	 yeah that is a race condition in the publishing :-5
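The race hashar describes (a log file still growing while it is being archived) can be narrowed by waiting until the files stop changing size before archiving them. This is only an illustrative sketch with hypothetical helpers `wait_until_quiet` and `archive_logs`; it is not how the Jenkins job publishes logs, and it does not replace making the test wait for outstanding requests.

```python
import os
import tarfile
import time


def wait_until_quiet(paths, settle_seconds=1.0, timeout=30.0, poll=0.2):
    """Return True once no file in paths changed size for settle_seconds.

    Returns False if the files are still changing when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    last_sizes = None
    quiet_since = time.monotonic()
    while time.monotonic() < deadline:
        sizes = [os.path.getsize(p) for p in paths]
        now = time.monotonic()
        if sizes != last_sizes:
            # A file grew (or this is the first sample): restart the quiet timer.
            last_sizes = sizes
            quiet_since = now
        elif now - quiet_since >= settle_seconds:
            return True
        time.sleep(poll)
    return False


def archive_logs(paths, archive_path):
    """Write the given files into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in paths:
            tar.add(p, arcname=os.path.basename(p))
```

A writer that resumes after the quiet period can still trigger the same failure, so this shrinks the window rather than closing it.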
[23:44:38] <legoktm>	 blagh
[23:44:56] <hashar>	 there must be some request that is not properly waited for.  Timo commented on the task
[23:45:05] <hashar>	 if you remove the +2 and revote, that might pass
[23:46:03] <wikibugs>	 MediaWiki-General-or-Unknown, Mobile-Web, Continuous-Integration: MediaWiki QUnit test does not wait for all requests to complete, causing a race condition in Jenkins - https://phabricator.wikimedia.org/T78590#854539 (Legoktm) Also happening on https://gerrit.wikimedia.org/r/#/c/180653/ intermittently.
[23:46:30] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-cache-upload02 is CRITICAL: CRITICAL: deployment-prep.deployment-cache-upload02.diskspace._srv_vdb.byte_percentfree.value (<100.00%)  
[23:48:35] <legoktm>	 passed :D
[23:49:02] <hashar>	 legoktm: :-]
[23:49:09] <hashar>	 legoktm: one will have to investigate what happens though