[00:00:14] (PS1) Dduvall: Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 [00:02:49] (CR) jenkins-bot: [V: -1] Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 (owner: Dduvall) [00:06:59] (PS2) Dduvall: Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 [00:07:41] (CR) jenkins-bot: [V: -1] Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 (owner: Dduvall) [00:12:06] (PS3) Dduvall: Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 [00:14:07] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [00:25:39] I think there's something messed up with how jenkins is doing the unit tests for gwtoolset - https://gerrit.wikimedia.org/r/#/c/207329/ [00:54:05] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0] [01:15:03] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [01:30:04] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [02:07:13] PROBLEM - Puppet staleness on deployment-elastic07 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [02:23:34] (PS1) Krinkle: Create npm-run-doc job [integration/config] - https://gerrit.wikimedia.org/r/207363 [02:28:33] (PS2) Krinkle: Create npm-run-doc job [integration/config] - https://gerrit.wikimedia.org/r/207363 [02:28:45] (CR) Krinkle: [C: 2] "Deployed npm-run-doc job." [integration/config] - https://gerrit.wikimedia.org/r/207363 (owner: Krinkle) [02:30:49] (Merged) jenkins-bot: Create npm-run-doc job [integration/config] - https://gerrit.wikimedia.org/r/207363 (owner: Krinkle) [02:35:30] (PS1) Legoktm: Make OAI phpunit job voting, use generic job [integration/config] - https://gerrit.wikimedia.org/r/207368 (https://phabricator.wikimedia.org/T67895) [02:36:03] (CR) Legoktm: [C: 2] Make OAI phpunit job voting, use generic job [integration/config] - https://gerrit.wikimedia.org/r/207368 (https://phabricator.wikimedia.org/T67895) (owner: Legoktm) [02:37:56] (Merged) jenkins-bot: Make OAI phpunit job voting, use generic job [integration/config] - https://gerrit.wikimedia.org/r/207368 (https://phabricator.wikimedia.org/T67895) (owner: Legoktm) [02:39:12] Krinkle: crap, I didn't realize you hadn't deployed yet :/ [02:39:57] !log deploying https://gerrit.wikimedia.org/r/207363 and https://gerrit.wikimedia.org/r/207368 [02:40:04] Logged the message, Master [02:42:27] Perfect [02:42:39] legoktm: No worries [02:55:08] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [02:55:22] PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [03:29:20] PROBLEM - Puppet staleness on deployment-redis01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [04:04:24] Project beta-scap-eqiad build #50885: FAILURE in 29 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50885/ [04:15:05] Yippee, build fixed!
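
The recurring "N% of data above the critical threshold" PROBLEM/RECOVERY lines above come from monitoring checks that sample recent datapoints from Graphite and alert on the fraction exceeding a limit: 0.0 for Puppet failure counts, 43200.0 seconds (12 hours) for Puppet staleness. A minimal sketch of that percent-above-threshold computation, assuming a plain list of datapoints rather than the check's actual Graphite query code:

```
# Minimal sketch of the percent-above-threshold logic behind alerts like
# "CRITICAL: 44.44% of data above the critical threshold [0.0]".
# The real check fetches datapoints from Graphite; here we assume a
# plain list of (timestamp, value) pairs.

def percent_above(datapoints, threshold):
    """Return the percentage of non-null datapoints above `threshold`."""
    values = [v for _, v in datapoints if v is not None]
    if not values:
        return 0.0
    over = sum(1 for v in values if v > threshold)
    return 100.0 * over / len(values)

# Example: 4 of 9 recent "puppet failure" samples are non-zero, so the
# check reports 44.44% of data above the [0.0] threshold.
samples = list(enumerate([0, 0, 1, 1, 0, 1, 1, 0, 0]))
print(round(percent_above(samples, 0.0), 2))  # -> 44.44
```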
[04:15:06] Project beta-scap-eqiad build #50886: FIXED in 1 min 7 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50886/ [04:48:36] RECOVERY - App Server Main HTTP Response on deployment-mediawiki03 is OK: HTTP OK: HTTP/1.1 200 OK - 47285 bytes in 9.787 second response time [04:54:37] PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:20] RECOVERY - Puppet failure on deployment-kafka02 is OK: OK: Less than 1.00% above the threshold [0.0] [05:41:31] Yippee, build fixed! [05:41:31] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce build #54: FIXED in 25 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce/54/ [05:46:27] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [06:32:14] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #597: FAILURE in 13 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/597/ [06:37:41] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK [06:51:18] PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [07:43:15] Yippee, build fixed! [07:43:16] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce build #21: FIXED in 34 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce/21/ [07:50:57] PROBLEM - Parsoid on deployment-parsoid05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:55:54] RECOVERY - Parsoid on deployment-parsoid05 is OK: HTTP OK: HTTP/1.1 200 OK - 1086 bytes in 5.384 second response time [08:21:20] Yippee, build fixed! [08:21:22] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #566: FIXED in 1 min 19 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/566/ [08:28:31] RECOVERY - App Server Main HTTP Response on deployment-mediawiki03 is OK: HTTP OK: HTTP/1.1 200 OK - 47296 bytes in 3.832 second response time [08:43:32] just so folks know, I am still on track for the salt upgrade in deployment-prep in 1 hour and 15 minutes (10 am utc). [08:43:45] I'll notify here before I proceed. [08:44:01] this may impact salt-related commands including git-deploy. [08:44:25] I got my timezone calculations wrong and thought it would be in 15 minutes :-D [09:08:31] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG build #1: FAILURE in 18 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG/1/ [09:12:28] Yippee, build fixed!
[09:12:28] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG build #2: FIXED in 1 min 28 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG/2/ [09:16:51] zeljkof: any clue what that job is: browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce [09:17:01] https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/ [09:17:22] hashar: not sure what you mean :/ [09:17:29] it is in jenkins [09:17:34] but apparently not in our jjb config file [09:17:39] it should be [09:17:49] ah was created by gilles [09:17:54] if it is not in jjb file, delete it [09:17:54] maybe a patch in progress [09:18:02] yes, that might be it [09:21:41] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG build #3: FAILURE in 7 sec: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce-T89343-DEBUG/3/ [09:22:48] (PS1) Hashar: Logrotate mwext-VisualEditor-sync-gerrit [integration/config] - https://gerrit.wikimedia.org/r/207400 (https://phabricator.wikimedia.org/T91396) [09:22:50] (PS1) Hashar: browsertest for MultimediaViewer win7+ie11 [integration/config] - https://gerrit.wikimedia.org/r/207401 (https://phabricator.wikimedia.org/T91396) [09:23:00] (CR) Hashar: [C: 2] Logrotate mwext-VisualEditor-sync-gerrit [integration/config] - https://gerrit.wikimedia.org/r/207400 (https://phabricator.wikimedia.org/T91396) (owner: Hashar) [09:24:51] (Merged) jenkins-bot: Logrotate mwext-VisualEditor-sync-gerrit [integration/config] - https://gerrit.wikimedia.org/r/207400 (https://phabricator.wikimedia.org/T91396) (owner: Hashar) [09:26:29] RECOVERY - Puppet failure on deployment-parsoid05 is OK: OK: Less than 1.00% above the threshold [0.0] [09:29:34] (CR) Hashar: [C: 2] browsertest for MultimediaViewer win7+ie11 [integration/config] - https://gerrit.wikimedia.org/r/207401 (https://phabricator.wikimedia.org/T91396) (owner: Hashar) [09:31:42] (Merged) jenkins-bot: browsertest for MultimediaViewer win7+ie11 [integration/config] - https://gerrit.wikimedia.org/r/207401 (https://phabricator.wikimedia.org/T91396) (owner: Hashar) [09:51:24] RECOVERY - Puppet failure on deployment-kafka02 is OK: OK: Less than 1.00% above the threshold [0.0] [09:56:41] (PS1) Hashar: translatewiki-puppetlint-lenient is now voting [integration/config] - https://gerrit.wikimedia.org/r/207409 (https://phabricator.wikimedia.org/T95090) [09:57:21] (CR) Hashar: [C: 2] translatewiki-puppetlint-lenient is now voting [integration/config] - https://gerrit.wikimedia.org/r/207409 (https://phabricator.wikimedia.org/T95090) (owner: Hashar) [09:57:44] I'm going to get started on the salt upgrade in deployment-prep now [09:58:57] (Merged) jenkins-bot: translatewiki-puppetlint-lenient is now voting [integration/config] - https://gerrit.wikimedia.org/r/207409 (https://phabricator.wikimedia.org/T95090) (owner: Hashar) [10:15:38] Project beta-scap-eqiad build #50922: FAILURE in 1 min 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50922/ [10:16:27] apergos: great to hear!
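
Above, a job exists on the Jenkins server but not in the JJB (Jenkins Job Builder) config, which is how such orphans get spotted and deleted. A minimal sketch of a drift check, assuming the python-jenkins library and a local integration/config checkout; the server URL is real, but the YAML path and the naive literal-name scan are illustrative (real JJB job-templates would need expansion first):

```
# Sketch: list Jenkins jobs that are not defined in the JJB YAML config,
# like browsertests-MultimediaViewer-...-windows_7-internet_explorer-11-sauce.
# Assumes the python-jenkins library; paths are illustrative, and
# '{name}'-style job-templates are skipped rather than expanded.
import glob
import jenkins  # pip install python-jenkins
import yaml

server = jenkins.Jenkins('https://integration.wikimedia.org/ci/')
live_jobs = {job['name'] for job in server.get_jobs()}

defined = set()
for path in glob.glob('integration/config/jjb/*.yaml'):
    with open(path) as f:
        for doc in yaml.safe_load_all(f):
            for entry in doc or []:
                if isinstance(entry, dict):
                    name = entry.get('job', {}).get('name')
                    if name and '{' not in name:  # skip unexpanded templates
                        defined.add(name)

for orphan in sorted(live_jobs - defined):
    print('on Jenkins but not in JJB:', orphan)
```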
[10:26:34] doo dee doo dee doo [10:26:50] salt is upgraded on the master; the syndic and the minion there are upgraded as well [10:27:04] and everybody is responsive except parsoid05 which is heavily loaded [10:27:18] now setting up for upgrade of all non jessie minions [10:27:28] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0] [10:36:28] Yippee, build fixed! [10:36:28] Project beta-scap-eqiad build #50924: FIXED in 2 min 33 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50924/ [10:38:07] deployment-parsoid05 is still very unhappy.... anyone around to look at it? [10:40:59] PROBLEM - Puppet failure on deployment-test is CRITICAL: CRITICAL: 90.00% of data above the critical threshold [0.0] [11:01:24] I'll take that as a 'no'. It might not get updated til a quieter time for it then [11:15:27] besides deployment-parsoid05, the other host that failed to update is deployment-cache-bits01; I will be doing this host manually [11:15:56] RECOVERY - Puppet failure on deployment-test is OK: OK: Less than 1.00% above the threshold [0.0] [12:02:21] now doing the two jessie instances [12:08:34] apergos: parsoid05 suffers from some misconfiguration that causes the parsoid service to eat 100% cpu [12:08:39] so I guess it is unrelated [12:08:45] oh I know it's not me [12:09:02] it was unhappy before I started. but I got the update done over there [12:09:05] the bug being https://phabricator.wikimedia.org/T97421 [12:09:29] I'll leave that tab open [12:09:35] might be worth looking at later [12:11:31] that might be the instance having some issue though [12:13:20] !log killing puppet on deployment-parsoid05, it eats all CPU for some reason [12:13:28] Logged the message, Master [12:14:20] fyi deployment-restbase01 and 02 are the two jessie instances that are taking some extra time [12:15:42] maybe that is the underlying virt node which is just slow [12:16:10] [414080.036058] BUG: soft lockup - CPU#1 stuck for 21s! [apt-cache:7539] [12:21:42] I am going to upgrade and reboot it [12:22:14] !log deployment-parsoid05 slowdown is https://phabricator.wikimedia.org/T97421 . Running apt-get upgrade and rebooting it, but its slowness issue might be with the underlying hardware [12:22:18] Logged the message, Master [12:23:14] (PS3) JanZerebecki: Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:25:06] (CR) jenkins-bot: [V: -1] Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:29:32] (CR) Soeren.oldag: [C: 1] Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:35:40] (PS4) JanZerebecki: Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:37:59] PROBLEM - Parsoid on deployment-parsoid05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:38:59] eyeroll [12:42:21] (CR) Soeren.oldag: [C: 1] Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:43:28] (PS5) JanZerebecki: Added job for WikidataQuality extension.
[integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:43:43] (CR) JanZerebecki: "PS5 is rebase only" [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:50:55] (CR) JanZerebecki: "Deployed the Jenkins jobs this commit adds: mwext-WikidataQuality-npm, mwext-WikidataQuality-qunit, mwext-WikidataQuality-repo-tests-mysql" [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:52:34] (CR) Soeren.oldag: [C: 1] Added job for WikidataQuality extension. [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [12:57:53] RECOVERY - Parsoid on deployment-parsoid05 is OK: HTTP OK: HTTP/1.1 200 OK - 1086 bytes in 3.397 second response time [12:58:29] all instances in deployment prep have been updated to 2015.7.5, all are responsive (at least for now :-P) [13:02:12] !log labvirt1005 seems to have a hardware issue. Impacts a bunch of beta cluster / integration instances as listed on https://phabricator.wikimedia.org/T97521#1245217 [13:02:16] Logged the message, Master [13:04:00] PROBLEM - Parsoid on deployment-parsoid05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:05:17] zeljkof: CFisch_WMDE will be there in a minute.. ;) [13:05:41] Tobi_WMDE_SW: ok, no rush :) [13:11:28] !log Rebooting deployment-parsoid05 via wikitech interface. [13:11:32] Logged the message, Master [13:14:01] PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120) [13:25:06] (CR) JanZerebecki: "npm jobs works:" [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [13:25:25] hashar: did you tell me at some point that deployment-bastion has some firewall issues maybe? that host has never done well being a target of git deploy for me [13:25:38] but the other targets respond fine so I'm inclined to ignore [13:26:06] it probably has ferm rules [13:26:12] and might be missing the one for salt? [13:27:51] well salt works, as in test.ping is ok [13:28:05] but some other piece maybe [13:29:03] it's a little odd since it's the deploy server and also a target [13:29:08] (CR) JanZerebecki: "Will fix: https://phabricator.wikimedia.org/T97529" [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [13:31:42] doubt there will be [13:34:05] apergos: you might want to announce it somewhere :] [13:34:09] QA list might be a good place [13:34:59] I have to look again to see if I'm on that [13:38:28] I sent to qa@lists; if it doesn't show up in a few minutes holler, I'll send it to you to forward I guess [13:39:13] the subject has "salt upgraded" in it [13:39:19] (PS1) Hashar: Cloner: Implement cache-no-hardlinks argument [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/207438 [13:46:51] (PS1) Hashar: zuul-cloner can now hardlink from cache-dir [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/207442 (https://phabricator.wikimedia.org/T97106) [13:50:02] apergos: else poke the engineering list :] [13:53:46] (CR) Hashar: [C: 2 V: 2] "Build and published at http://people.wikimedia.org/~hashar/debs/zuul_2.0.0-304-g685ca22-wmf2/" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/207442 (https://phabricator.wikimedia.org/T97106) (owner: Hashar) [13:54:38] guess it didn't show up eh?
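
The upgrade verification above boils down to salt's test.ping: a minion that answers is alive even if some higher-level piece (like git-deploy against deployment-bastion) misbehaves. A minimal sketch of that responsiveness sweep using salt's Python client API, which must run on the salt master; the target glob and sample host set are illustrative:

```
# Sketch: check which deployment-prep minions answer test.ping after the
# upgrade, the same probe mentioned above. Must run on the salt master;
# the target glob and the expected host set are illustrative.
import salt.client

client = salt.client.LocalClient()
expected = {'deployment-bastion', 'deployment-parsoid05',
            'deployment-cache-bits01'}  # sample set, not the full cluster
replies = client.cmd('deployment-*', 'test.ping', timeout=15)

for minion in sorted(expected):
    status = 'ok' if replies.get(minion) is True else 'NO RESPONSE'
    print('{}: {}'.format(minion, status))
```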
[13:58:17] (PS1) Hashar: Merge branch 'debian/precise-wikimedia' into debian/trusty-wikimedia [integration/zuul] (debian/trusty-wikimedia) - https://gerrit.wikimedia.org/r/207444 (https://phabricator.wikimedia.org/T97106) [14:00:26] (CR) Hashar: [C: 2 V: 2] Merge branch 'debian/precise-wikimedia' into debian/trusty-wikimedia [integration/zuul] (debian/trusty-wikimedia) - https://gerrit.wikimedia.org/r/207444 (https://phabricator.wikimedia.org/T97106) (owner: Hashar) [14:03:12] (CR) JanZerebecki: [C: -1] "Needs two composer runs or a combined one..." [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [14:11:45] !log rebooting integration-saltmaster stalled. [14:11:52] Logged the message, Master [14:16:21] PROBLEM - Host integration-saltmaster is DOWN: CRITICAL - Host Unreachable (10.68.18.24) [14:18:59] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.63 ms [14:23:49] RECOVERY - Parsoid on deployment-parsoid05 is OK: HTTP OK: HTTP/1.1 200 OK - 1086 bytes in 0.032 second response time [14:25:23] !log upgrading zuul on integration-slave-precise-1011 for https://phabricator.wikimedia.org/T97106 [14:25:28] Logged the message, Master [14:32:30] RECOVERY - Puppet failure on deployment-parsoid05 is OK: OK: Less than 1.00% above the threshold [0.0] [14:33:24] PROBLEM - Host deployment-test is DOWN: CRITICAL - Host Unreachable (10.68.16.149) [14:34:52] Krinkle|detached: I have incorporated the zuul-cloner git cache hardlink stuff in our .deb packages. Built both the Precise and Trusty ones and put them in /data/project/root/ ; integration-slave-precise1011 has it now. Full details at https://phabricator.wikimedia.org/T97106#1245429 [14:52:56] hashar: cool [14:53:41] Krinkle: I guess there will be no side effect [14:53:48] Yeah [14:53:57] Krinkle: I have written some explanations on the task, so feel free to do the upgrade on the other instances [14:54:08] RECOVERY - Host deployment-test is UP: PING OK - Packet loss = 0%, RTA = 0.84 ms [14:54:10] I would have mass-done it via salt but the saltmaster is dead at the moment :] [14:54:15] a good point [14:54:22] hashar: Aye [14:54:23] cherry picking a patch is reasonably easy [14:54:29] What happened? [14:54:33] oh [14:54:48] integration-saltmaster was migrated last week to new labs hardware (the labvirtXXXX hosts) [14:55:03] and the hardware has faulty memory :] [14:55:08] they are cursed :/ [14:57:18] PROBLEM - Host deployment-cache-bits01 is DOWN: CRITICAL - Host Unreachable (10.68.16.12) [15:00:05] !log Instances are being moved out from labvirt1005 which has some faulty memory. List of instances at https://phabricator.wikimedia.org/T97521#1245217 [15:00:09] Logged the message, Master [15:00:51] golly [15:04:00] (PS1) Aude: Update Wikidata branch to wmf/1.26wmf4 [tools/release] - https://gerrit.wikimedia.org/r/207459 [15:10:43] RECOVERY - Host deployment-cache-bits01 is UP: PING OK - Packet loss = 0%, RTA = 0.98 ms [15:11:29] PROBLEM - Host deployment-elastic06 is DOWN: CRITICAL - Host Unreachable (10.68.17.186) [15:12:12] hashar: Ah, right. [15:45:50] hashar: You told Andrew about https://phabricator.wikimedia.org/T96706 ? [15:46:11] I know he's busy with the migration, just making sure. [15:48:30] Krinkle: yeah, yesterday during the weekly meeting [15:48:37] quite easy to do [15:48:41] RECOVERY - Host deployment-elastic06 is UP: PING OK - Packet loss = 0%, RTA = 377.75 ms [15:48:42] I am off, meeting!
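
The zuul-cloner change packaged above builds on plain git behavior: cloning from a local cache path hardlinks object files instead of copying them, which is what makes workspace clones from a cache-dir cheap, and the new cache-no-hardlinks argument exists to turn that off. A sketch of the underlying git calls, not zuul-cloner's actual code; the paths and the follow-up fetch are illustrative:

```
# Sketch of the git behavior zuul-cloner's cache-dir feature relies on:
# a clone from a local path hardlinks objects by default; --no-hardlinks
# (what the cache-no-hardlinks argument would toggle) forces full copies.
# Paths are illustrative.
import subprocess

CACHE = '/srv/git/cache/mediawiki-core.git'   # pre-seeded local cache (assumed)
WORKSPACE = '/tmp/workspace/mediawiki-core'

def clone_from_cache(hardlinks=True):
    cmd = ['git', 'clone', CACHE, WORKSPACE]
    if not hardlinks:
        cmd.insert(2, '--no-hardlinks')  # copy objects instead of hardlinking
    subprocess.check_call(cmd)
    # zuul-cloner would then fetch the current state (and the Zuul ref)
    # from the real origin on top of the cached objects:
    subprocess.check_call(
        ['git', '-C', WORKSPACE, 'fetch',
         'https://gerrit.wikimedia.org/r/mediawiki/core'])

clone_from_cache(hardlinks=True)
```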
[15:50:55] PROBLEM - Host deployment-kafka02 is DOWN: CRITICAL - Host Unreachable (10.68.17.156) [15:54:37] PROBLEM - Host deployment-mediawiki03 is DOWN: CRITICAL - Host Unreachable (10.68.17.55) [15:55:54] RECOVERY - Host deployment-kafka02 is UP: PING OK - Packet loss = 0%, RTA = 0.86 ms [16:34:37] Project beta-scap-eqiad build #50962: FAILURE in 31 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50962/ [16:47:07] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1245997 (demon) a:demon>None [16:49:45] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1246003 (demon) a:demon Whoops, didn't mean to unassign. Also: refreshLinks has since finished... [16:55:13] Yippee, build fixed! [16:55:13] Project beta-scap-eqiad build #50964: FIXED in 1 min 11 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50964/ [16:59:08] huh: cp: cannot create regular file `/srv/mediawiki-staging/php-master/cache/l10n/l10n_cache-ab.cdb': Permission denied [17:04:54] I bet there is a l10nupdate user in both ldap and locally on deployment-bastion [17:15:22] !log removed l10nupdate user from /etc/passwd on deployment-bastion [17:15:29] Logged the message, Master [17:17:24] Release-Engineering, MediaWiki-Vagrant, Documentation: Document RSpec workflow on MediaWiki-Vagrant - https://phabricator.wikimedia.org/T97464#1246100 (dduvall) A slightly easier way would be to invoke the specific gem version using `__` following the bin name. ``` /Users/.../vagrant $ gem in... [17:26:41] Browser-Tests: Create new account at Sauce Labs for running Jenkins jobs - https://phabricator.wikimedia.org/T97549#1246124 (zeljkofilipin) NEW a:zeljkofilipin [17:27:45] Browser-Tests: Create new account at Sauce Labs for running Jenkins jobs - https://phabricator.wikimedia.org/T97549#1246124 (zeljkofilipin) [17:35:12] Browser-Tests: Create new account at Sauce Labs for running Jenkins jobs - https://phabricator.wikimedia.org/T97549#1246154 (zeljkofilipin) User with username wikimedia-jenkins created. Asked OIT to create jenkins@wikimedia.org. [17:50:14] RECOVERY - Host deployment-mediawiki03 is UP: PING OK - Packet loss = 0%, RTA = 0.80 ms [17:52:16] PROBLEM - Host deployment-restbase01 is DOWN: CRITICAL - Host Unreachable (10.68.17.227) [17:52:58] Blocked-on-RelEng, Release-Engineering, Multimedia, Reading-Infrastructure-Team, and 4 others: Create basic puppet role for Sentry - https://phabricator.wikimedia.org/T84956#1246211 (bd808) [18:00:05] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:00:19] Release-Engineering: Shorten/Simply MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246233 (greg) NEW [18:00:41] Beta-Cluster, Blocked-on-RelEng, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation, and 3 others: Setup new wikis in Beta Cluster for Content Translation - https://phabricator.wikimedia.org/T90683#1246240 (mmodell) @KartikMistry it should be correct now, I'm going to try to f...
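
The cp permission error and the hunch above describe a classic NSS shadowing problem: the same account defined both locally in /etc/passwd and in LDAP, potentially with different UIDs, with whichever source nsswitch.conf lists first masking the other. A minimal sketch of how to confirm which sources define the user, using glibc's getent -s to query one NSS service at a time (service-name availability depends on the host's libc and nsswitch.conf):

```
# Sketch: detect a user defined both locally and in LDAP, the
# l10nupdate situation diagnosed above.
import subprocess

def uid_from(source, name):
    # glibc getent can query a single NSS source via -s (e.g. files, ldap);
    # availability of the 'ldap' service depends on the host's setup.
    result = subprocess.run(['getent', '-s', source, 'passwd', name],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return None  # not defined by this source
    return int(result.stdout.split(':')[2])  # third passwd field is the UID

name = 'l10nupdate'
uids = {src: uid_from(src, name) for src in ('files', 'ldap')}
print(uids)
if all(uid is not None for uid in uids.values()):
    # Defined in both places: the source listed first in nsswitch.conf
    # wins, shadowing the other entry (and its UID/ownership).
    print(name, 'is defined twice; the local /etc/passwd entry shadows LDAP')
```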
[18:02:04] Release-Engineering: Shorten/Simply MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246257 (greg) (NB: This proposal is basically how FB does their deploy cadence, other than how the gradual rollout happens. Essentially weekly starting on Monday.) [18:06:13] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246287 (greg) [18:16:33] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246327 (demon) If we did this we should automate the cutting (and testing) of the new branch like Sunday night. [18:21:29] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246343 (Legoktm) Why MTuW instead of TuWTh? (deploying on monday means you're rushed to fix any bugs that might have been discovered over the weekend) [18:28:30] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246367 (greg) >>! In T97553#1246327, @demon wrote: > If we did this we should automate the cutting (and testing) of the new branch like Sunday night. +1 >>! In T97553#1246343, @Legoktm wrote... [18:37:54] (CR) Aude: [C: 2] Update Wikidata branch to wmf/1.26wmf4 [tools/release] - https://gerrit.wikimedia.org/r/207459 (owner: Aude) [18:38:04] (Merged) jenkins-bot: Update Wikidata branch to wmf/1.26wmf4 [tools/release] - https://gerrit.wikimedia.org/r/207459 (owner: Aude) [18:54:17] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246452 (mmodell) We really need to automate the branching stuff anyway - it's really time-consuming and error-prone. The way it is now wastes not just my time but @aude's and anyone else who w... [18:54:49] PROBLEM - Parsoid on deployment-parsoid05 is CRITICAL: Connection refused [18:55:53] Release-Engineering: Shorten/Simplify MW train deploy cadence to M->Tu->W - https://phabricator.wikimedia.org/T97553#1246460 (Jdforrester-WMF) I'd also suggest TuWTh so that the preponderance of holiday Mondays doesn't massively disrupt schedules. (As well as @demon's, @mmodell's and @legoktm's points.) [18:58:13] Release-Engineering: Shorten/Simplify MW train deploy cadence to Tu->W->Th - https://phabricator.wikimedia.org/T97553#1246481 (greg) [19:00:26] PROBLEM - Free space - all mounts on deployment-eventlogging02 is CRITICAL: CRITICAL: deployment-prep.deployment-eventlogging02.diskspace._var.byte_percentfree (<100.00%) [19:01:29] greg-g: https://wikitech.wikimedia.org/w/index.php?title=Deployments&type=revision&diff=156472&oldid=156458 [19:06:23] Beta-Cluster, MediaWiki-extensions-GWToolset, Multimedia, HHVM, Patch-For-Review: GWToolset XML upload fails with “The file that was uploaded exceeds the upload_max_filesize and/or the post_max_size directive in php.ini” on hhvm 3.6 - https://phabricator.wikimedia.org/T97415#1246514 (Bawolff) >>... [19:06:53] manybubbles: ty, I think I had an edit window open for that yesterday but never did it [19:07:18] greg-g: I'm just trying to get the last of it done so I can get it to beta tomorrow!
[19:07:31] It's mostly done, but I've got some small stuff to finish up [19:09:35] coolio [19:11:30] Release-Engineering, Wikidata: enable use of production deployed autoloader for extensions that is created by composer - https://phabricator.wikimedia.org/T97560#1246532 (JanZerebecki) NEW [19:13:51] Release-Engineering, Wikidata: enable use of production deployed autoloader for extensions that is created by composer - https://phabricator.wikimedia.org/T97560#1246563 (JanZerebecki) [19:15:32] Project beta-scap-eqiad build #50978: FAILURE in 1 min 28 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50978/ [19:15:57] (CR) JanZerebecki: "A combined composer run won't work for now, see https://phabricator.wikimedia.org/T97560 ." [integration/config] - https://gerrit.wikimedia.org/r/206392 (owner: Soeren.oldag) [19:17:35] RECOVERY - Host integration-saltmaster is UP: PING OK - Packet loss = 0%, RTA = 4.00 ms [19:20:47] RECOVERY - SSH on integration-saltmaster is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0) [19:21:10] Release-Engineering, Team-Practices-This-Week: Test phabricator sprint extension updates - https://phabricator.wikimedia.org/T95469#1246627 (KLans_WMF) Open>Resolved [19:25:01] Yippee, build fixed! [19:25:02] Project beta-scap-eqiad build #50979: FIXED in 1 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/50979/ [19:28:40] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<50.00%) [19:30:15] Release-Engineering, Wikidata, Composer: enable use of production deployed autoloader for extensions that is created by composer - https://phabricator.wikimedia.org/T97560#1246671 (bd808) [19:31:16] PROBLEM - Puppet staleness on integration-saltmaster is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [19:36:55] Release-Engineering: Shorten/Simplify MW train deploy cadence to Tu->W->Th - https://phabricator.wikimedia.org/T97553#1246722 (mmodell) I'm confused at how the best case would be 3 days but worst case 10 days? [19:41:12] Release-Engineering: Shorten/Simplify MW train deploy cadence to Tu->W->Th - https://phabricator.wikimedia.org/T97553#1246771 (greg) >>! In T97553#1246722, @mmodell wrote: > I'm confused at how the best case would be 3 days but worst case 10 days? If you merge into master right after the new branch which hap... [19:46:35] RECOVERY - Host deployment-restbase01 is UP: PING OK - Packet loss = 0%, RTA = 0.94 ms [19:48:41] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK [19:52:09] Release-Engineering: Shorten/Simplify MW train deploy cadence to Tu->W->Th - https://phabricator.wikimedia.org/T97553#1246830 (mmodell) could also do it like this: cut the branch and push to testing wikis on Wednesday like we already do, but promote sooner: wednesday: new branch thursday: group 1 friday...
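
The "best case 3 days, worst case 10 days" arithmetic debated above follows directly from the cadence: a change merged just before the weekly branch cut rides out in the 3-day rollout, while one merged just after waits a full week for the next branch. A small illustration of that lead-time calculation under the proposed Tu->W->Th cadence; pure arithmetic, not any deploy tooling:

```
# Illustration of merge-to-everywhere lead time under the proposed
# Tu->W->Th cadence (branch cut Tuesday, full rollout Thursday).
# Pure arithmetic; day numbers are Mon=0 .. Sun=6.
BRANCH_CUT = 1           # Tuesday
FULL_ROLLOUT_OFFSET = 2  # Thursday, two days after the cut

def lead_time(merge_day):
    """Days from merging into master until the change is on all wikis."""
    days_until_cut = (BRANCH_CUT - merge_day) % 7
    if days_until_cut == 0:
        days_until_cut = 7  # merged after the cut -> wait for next branch
    return days_until_cut + FULL_ROLLOUT_OFFSET

for day, name in enumerate(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']):
    print(name, lead_time(day), 'days')
# Best case: merge Monday -> 3 days. Worst case: merge right after
# Tuesday's cut -> 7 + 2 = 9 days, or ~10 counting partial days.
```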
[20:00:57] PROBLEM - Host integration-raita is DOWN: CRITICAL - Host Unreachable (10.68.16.53) [20:06:09] RECOVERY - Host integration-raita is UP: PING OK - Packet loss = 0%, RTA = 0.72 ms [20:06:47] PROBLEM - Host integration-saltmaster is DOWN: PING CRITICAL - Packet loss = 100% [20:16:21] RECOVERY - Host integration-saltmaster is UP: PING OK - Packet loss = 0%, RTA = 0.71 ms [20:21:53] Release-Engineering: Convert old wmf/* deployment branches to tags (recurring chore) - https://phabricator.wikimedia.org/T1288#1246928 (Krinkle) a:Krinkle>None [20:49:09] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce build #412: ABORTED in 4 min 47 sec: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce/412/ [20:54:34] aborted? [21:07:06] PROBLEM - Host deployment-bastion is DOWN: CRITICAL - Host Unreachable (10.68.16.58) [21:10:41] PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org) [21:10:56] PROBLEM - Host deployment-cache-text02 is DOWN: CRITICAL - Host Unreachable (10.68.16.16) [21:11:02] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:11:28] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:11:54] RECOVERY - Host deployment-cache-text02 is UP: PING OK - Packet loss = 0%, RTA = 0.56 ms [21:11:58] RECOVERY - Host deployment-bastion is UP: PING OK - Packet loss = 0%, RTA = 0.69 ms [21:14:20] RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 0.81 ms [21:14:56] PROBLEM - Host deployment-db1 is DOWN: CRITICAL - Host Unreachable (10.68.16.193) [21:15:22] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:17:30] PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1532 bytes in 3.091 second response time [21:19:18] PROBLEM - Host deployment-db2 is DOWN: CRITICAL - Host Unreachable (10.68.17.94) [21:19:32] PROBLEM - App Server Main HTTP Response on deployment-mediawiki01 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1532 bytes in 2.054 second response time [21:20:00] PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1532 bytes in 2.062 second response time [21:20:10] ugh [21:20:19] greg-g: these are "planned" sort of [21:20:23] oh, ok [21:20:28] RECOVERY - Host deployment-db1 is UP: PING OK - Packet loss = 0%, RTA = 0.98 ms [21:20:34] andrewbogott is migrating instances [21:20:56] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1888 bytes in 6.103 second response time [21:22:17] thcipriani: please poke me when you are done [21:22:26] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1885 bytes in 2.050 second response time [21:22:58] matanya: will do [21:23:20] thanks [21:24:32] RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 47144 bytes in 0.890 second response time [21:25:00] RECOVERY - App Server Main HTTP Response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 47065 bytes in 1.076 second response time
[21:25:48] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 28386 bytes in 0.538 second response time [21:26:16] PROBLEM - Host deployment-jobrunner01 is DOWN: CRITICAL - Host Unreachable (10.68.17.96) [21:27:26] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 47358 bytes in 0.576 second response time [21:27:26] RECOVERY - App Server Main HTTP Response on deployment-mediawiki03 is OK: HTTP OK: HTTP/1.1 200 OK - 47281 bytes in 0.557 second response time [21:30:20] RECOVERY - Host deployment-jobrunner01 is UP: PING OK - Packet loss = 0%, RTA = 1.09 ms [21:30:40] PROBLEM - Host deployment-logstash1 is DOWN: CRITICAL - Host Unreachable (10.68.16.134) [21:31:29] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:35:19] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:35:27] PROBLEM - Host deployment-mediawiki02 is DOWN: PING CRITICAL - Packet loss = 100% [21:36:01] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:36:19] RECOVERY - Host deployment-logstash1 is UP: PING OK - Packet loss = 0%, RTA = 0.63 ms [21:37:48] RECOVERY - Host deployment-mediawiki02 is UP: PING OK - Packet loss = 0%, RTA = 1.07 ms [21:39:30] PROBLEM - Host deployment-rsync01 is DOWN: CRITICAL - Host Unreachable (10.68.17.66) [21:41:02] RECOVERY - SSH on deployment-mediawiki02 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2 (protocol 2.0) [21:41:34] RECOVERY - Host deployment-rsync01 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms [21:51:34] Beta-Cluster, Release-Engineering, Continuous-Integration-Config, Parsoid: Parsoid patches don't update Beta Cluster automatically -- only deploy repo patches seem to update that code - https://phabricator.wikimedia.org/T92871#1247250 (cscott) We tend to deploy something very close to Parsoid mast... [21:58:46] matanya: all clear!
[21:58:56] thanks much thcipriani [22:04:52] (CR) Dduvall: [C: 2] Enforce jshint linting [integration/raita] - https://gerrit.wikimedia.org/r/207103 (owner: Dduvall) [22:05:54] (Merged) jenkins-bot: Enforce jshint linting [integration/raita] - https://gerrit.wikimedia.org/r/207103 (owner: Dduvall) [22:11:03] thcipriani: still broken: (Cannot access the database: Unknown database 'dawiki' (10.68.17.94)) [22:13:27] * thcipriani looking [22:13:55] thcipriani: i tried to create a user, if that helps [22:14:45] (PS4) Dduvall: Raita Elasticsearch logging [selenium] - https://gerrit.wikimedia.org/r/207324 [22:20:18] (PS2) Dduvall: Field mappings for more build information [integration/raita] - https://gerrit.wikimedia.org/r/207291 [22:21:33] (CR) Dduvall: [C: 2] Field mappings for more build information [integration/raita] - https://gerrit.wikimedia.org/r/207291 (owner: Dduvall) [22:21:45] (Merged) jenkins-bot: Field mappings for more build information [integration/raita] - https://gerrit.wikimedia.org/r/207291 (owner: Dduvall) [22:24:30] hmm, yeah, I don't see it in mysql either, fwiw: ERROR 1049 (42000): Unknown database 'dawiki' [22:27:37] matanya: looks like this is being worked on: https://phabricator.wikimedia.org/T90683 [22:28:26] thanks thcipriani, i'll try again tomorrow [22:28:29] see also: https://phabricator.wikimedia.org/T97388 [22:37:52] Deployment-Systems, Release-Engineering, Services, operations: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#1247411 (ssastry) Goes without saying that the individual services should also be able to work with the fact that multiple versions of t... [23:31:20] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [23:31:24] PROBLEM - Puppet staleness on deployment-urldownloader is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [23:31:26] (PS1) Dduvall: Moved index.html to a docroot directory [integration/raita] - https://gerrit.wikimedia.org/r/207679 [23:31:44] PROBLEM - Puppet failure on deployment-cache-text02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
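
The dawiki failure above ("Unknown database", mysql ERROR 1049) is the kind of check that is easy to script across the Beta Cluster wikis instead of probing one database by hand. A minimal sketch, assuming the PyMySQL client; the host, credentials, and wiki list are all illustrative:

```
# Sketch: confirm which expected Beta Cluster wiki databases exist, the
# check behind "Unknown database 'dawiki'" above. Assumes PyMySQL; the
# host, credentials, and expected wiki list are illustrative.
import pymysql

conn = pymysql.connect(host='10.68.17.94', user='wikiadmin',
                       password='...', database='information_schema')
expected = ['enwiki', 'dewiki', 'dawiki']  # sample of wikis Beta should serve

with conn.cursor() as cur:
    cur.execute('SELECT schema_name FROM information_schema.schemata')
    present = {row[0] for row in cur.fetchall()}

for db in expected:
    print(db, 'ok' if db in present else 'MISSING')
```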