[01:12:36] RECOVERY - Puppet run on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:38:21] PROBLEM - Parsoid on deployment-parsoid06 is CRITICAL: Connection refused
[03:45:54] PROBLEM - Puppet run on deployment-mediawiki01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[04:20:52] RECOVERY - Puppet run on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:34:55] PROBLEM - Puppet run on deployment-salt is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[05:10:02] RECOVERY - Puppet run on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]
[06:31:57] PROBLEM - Puppet run on deployment-salt is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[06:39:50] RECOVERY - Puppet run on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:11:57] RECOVERY - Puppet run on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]
[07:55:13] RECOVERY - Puppet run on deployment-pdf01 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:11:59] PROBLEM - Puppet run on deployment-mediawiki01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[08:36:48] Yippee, build fixed!
[08:36:49] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #961: FIXED in 26 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/961/
[08:51:52] RECOVERY - Puppet run on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:18:29] Project selenium-MultimediaViewer-286674 » internet_explorer 11.0,beta,Windows 7,contintLabsSlave && UbuntuTrusty build #12: FAILURE in 9.2 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer-286674/BROWSER=internet_explorer%2011.0,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Windows%207,label=contintLabsSlave%20&&%20UbuntuTrusty/12/
[10:18:31] Project selenium-MultimediaViewer-286674 » internet_explorer 10.0,beta,Windows 8,contintLabsSlave && UbuntuTrusty build #12: FAILURE in 11 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer-286674/BROWSER=internet_explorer%2010.0,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Windows%208,label=contintLabsSlave%20&&%20UbuntuTrusty/12/
[10:18:39] Project selenium-MultimediaViewer-286674 » internet_explorer 11.0,beta,Windows 8.1,contintLabsSlave && UbuntuTrusty build #12: FAILURE in 19 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer-286674/BROWSER=internet_explorer%2011.0,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Windows%208.1,label=contintLabsSlave%20&&%20UbuntuTrusty/12/
[10:28:27] Yippee, build fixed!
[10:28:28] Project selenium-MultimediaViewer-286674 » safari,beta,OS X 10.9,contintLabsSlave && UbuntuTrusty build #12: FIXED in 10 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer-286674/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=contintLabsSlave%20&&%20UbuntuTrusty/12/
[11:18:14] PROBLEM - Host integration-dev is DOWN: CRITICAL - Host Unreachable (10.68.17.81)
[11:56:52] RECOVERY - Puppet run on integration-slave-trusty-1015 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:29:10] Release-Engineering-Team, Patch-For-Review, Release: MW-1.27.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T131557#2266951 (hoo)
[14:50:22] Browser-Tests-Infrastructure, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2267217 (zeljkofilipin)
[14:51:46] Browser-Tests-Infrastructure, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2267237 (zeljkofilipin)
[15:02:54] PROBLEM - Puppet run on deployment-eventlogging04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:06:44] (PS1) Zfilipin: Stephane Bisson is owner of Echo and Flow Selenium jobs [integration/config] - https://gerrit.wikimedia.org/r/287094 (https://phabricator.wikimedia.org/T134492)
[15:18:11] (CR) Sbisson: [C: 1] Stephane Bisson is owner of Echo and Flow Selenium jobs [integration/config] - https://gerrit.wikimedia.org/r/287094 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[15:23:14] (CR) Zfilipin: [C: 2] Stephane Bisson is owner of Echo and Flow Selenium jobs [integration/config] - https://gerrit.wikimedia.org/r/287094 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[15:24:12] (Merged) jenkins-bot: Stephane Bisson is owner of Echo and Flow Selenium jobs [integration/config] - https://gerrit.wikimedia.org/r/287094 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[15:43:04] RECOVERY - Puppet run on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:49] hey dcausse, yt?
[15:51:11] ottomata: yep
[15:51:23] do you know if the monolog avro kafka stuff is set up in deployment-prep?
[15:51:26] it looks like not, but i'm not sure
[15:51:35] I'm pretty sure no
[15:51:40] we are going to upgrade the kafka cluster next week, and i'd like to test that this piece still works
[15:51:49] i don't expect it not to, but it would be good to have it
[15:52:07] ottomata: ok, I'll try to have a look
[15:52:10] i see a place to set $wmfAllServices['eqiad']['kafka'] in LabsServices.php in mediawiki-config
[15:52:17] but, i'm not sure if that is enough
[15:52:37] do we have kafka/hadoop there?
[15:53:20] kafka yes
[15:53:28] yes, the 'analytics' cluster/instances is deployment-kafka02
[15:53:33] hadoop no
[15:53:37] but i don't need to test that side :)
[15:53:38] you're more concerned by the mediawiki <-> kafka connector?
[15:53:40] ja
[15:53:45] ok got it
[15:53:49] we can setup hadoop there
[15:53:58] but i haven't successfully kept a hadoop cluster running in labs long term
[15:54:03] with enough work i'm sure we could
[15:54:11] but it does a lotta stuff and fills up disks i think, dunno
[15:54:19] i set them up as one offs when i need to test them
[15:54:19] yes...
[15:54:42] I'll try to set this up in labs, in the worst case I'll try it locally
[15:54:46] what kafka version?
[15:54:47] ok
[15:54:53] 0.9.0.1 we are using confluent's package now
[15:55:18] ok
[15:55:32] i'll go ahead and push a patch to mw-config that sets the kafka broker info in deployment prep
[15:55:32] I'll let you know, thanks for the heads up :)
[15:55:34] and then you can add to it
[15:55:39] ok
[15:55:39] not sure what else is needed
[15:57:09] dcausse: https://gerrit.wikimedia.org/r/#/c/287106/
[16:00:03] ottomata: I'll check for the rest we'll probably need to setup a new monolog channel (will have a look)
[16:00:09] hm… is CI mostly working now, or mostly not? I broke it for a bit but it should be catching up...
[16:00:10] ok thank you!
[16:00:58] yw! :)
[16:01:12] nevermind, I just got a jenkins review
[16:03:07] Release-Engineering-Team, Operations, Phabricator, ops-eqiad: iridium (Phabricator host) went down, Possible cpu heat issue - https://phabricator.wikimedia.org/T131742#2267481 (Cmjohnson) resolving this task. Please re-open if the heat issues return.
[16:03:15] Release-Engineering-Team, Operations, Phabricator, ops-eqiad: iridium (Phabricator host) went down, Possible cpu heat issue - https://phabricator.wikimedia.org/T131742#2267483 (Cmjohnson) Open>Resolved a:Cmjohnson
[16:24:00] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2267534 (Jdlrobson)
[16:24:11] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2267217 (Jdlrobson)
[16:26:16] Browser-Tests, MobileFrontend: `Generic special page features.Search from Watchlist` test failing - https://phabricator.wikimedia.org/T130971#2267540 (Jdlrobson)
[16:26:18] Browser-Tests-Infrastructure, Reading-Web-Backlog, Patch-For-Review: Fix MobileFrontend scenarios that fail at en.wikipedia.beta.wmflabs.org or do not run them daily - https://phabricator.wikimedia.org/T94156#2267541 (Jdlrobson)
[16:26:20] Browser-Tests-Infrastructure, MobileFrontend, Reading-Web-Backlog: Net::ReadTimeout in MobileFrontend browser tests when visiting Watchlist page - https://phabricator.wikimedia.org/T129328#2267537 (Jdlrobson) Open>Resolved a:Jdlrobson These appear to be passing consistently now.
[16:26:47] Browser-Tests-Infrastructure, Release-Engineering-Team, Epic, Patch-For-Review, and 2 others: Fix scenarios that fail at en.wikipedia.beta.wmflabs.org or do not run them daily - https://phabricator.wikimedia.org/T94150#2267542 (Jdlrobson)
[16:49:22] Release-Engineering-Team, User-greg: Determine parental leave and delegation plan - https://phabricator.wikimedia.org/T131198#2267591 (greg) p:Unbreak!>High Work happening (calling/emailing the State with questions etc)....
[16:58:15] Continuous-Integration-Infrastructure: Install php7 and the php-ast extension so etsy/phan can be run from jenkins - https://phabricator.wikimedia.org/T132636#2267595 (EBernhardson) Had a chance to try out dotdeb's php on a jessie instance in labs. The php7.0-cli package does not appear to conflict in any wa...
[17:02:40] Continuous-Integration-Infrastructure: Install php7 and the php-ast extension so etsy/phan can be run from jenkins - https://phabricator.wikimedia.org/T132636#2267600 (EBernhardson) Also i was going to see if dotdeb was interested in adding php-ast to their repo, but he already has a page (perhaps outdated)...
[17:32:40] Continuous-Integration-Infrastructure, Release-Engineering-Team, Differential-Beta, Mobile-App-Goals, Wikipedia-Android-App-Backlog: Investigate migrating the Wikipedia Android App to Differential - https://phabricator.wikimedia.org/T134505#2267696 (mmodell)
[17:42:53] PROBLEM - Puppet run on deployment-mediawiki01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[17:52:14] Release-Engineering-Team, User-greg: Determine parental leave and delegation plan - https://phabricator.wikimedia.org/T131198#2267802 (greg)
[18:18:56] Beta-Cluster-Infrastructure, Deployment-Systems, Scap3, Operations, Patch-For-Review: Automate the generation deployment keys (keyholder-managed ssh keys) - https://phabricator.wikimedia.org/T133211#2267929 (mmodell) From @faidon's code review on Gerrit > From a quick look, this looks like it...
[18:29:34] Release-Engineering-Team, User-greg: Create FY1617 annual personal goals (for RelEng team members) - https://phabricator.wikimedia.org/T134517#2267961 (greg)
[18:32:24] Release-Engineering-Team, User-greg: Create FY1617 annual personal goals (for RelEng team members) - https://phabricator.wikimedia.org/T134517#2267978 (greg)
[18:34:36] Release-Engineering-Team, User-greg: Create FY1617Q1 personal goals (for RelEng team members) - https://phabricator.wikimedia.org/T134518#2267986 (greg)
[18:52:52] RECOVERY - Puppet run on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:02:27] Beta-Cluster-Infrastructure, Patch-For-Review: Creating wiki at beta cluster for the Dutch Wikipedia - https://phabricator.wikimedia.org/T118005#2268159 (Krenair) Doesn't seem to be working yet: http://nl.wikipedia.beta.wmflabs.org/api/rest_v1/page/html/Hoofdpagina @mobrovac? I notice puppet on deployme...
[19:02:35] PROBLEM - Puppet run on deployment-memc03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[19:17:42] Staging, releng-201415-Q4, Tracking: Create staging cluster (tracking) - https://phabricator.wikimedia.org/T88702#2268221 (Danny_B)
[19:17:44] Release-Engineering-Team, Staging, releng-201415-Q3, Tracking: [Quarterly Success Metric] Green nightly builds on the staging cluster (tracking) - https://phabricator.wikimedia.org/T88701#2268222 (Danny_B)
[19:21:47] ostriches, are you making the deployment calendars with greg away?
[19:23:16] I should be but I haven't kept up.
[19:23:43] ostriches, right, want me to make the deployment calendar for next week?
[19:24:26] If you've got the cycles, that'd be fantastic <3
[19:30:05] Continuous-Integration-Config, I18n, Tracking: Configure banana checker for i18n files to run on all MediaWiki extensions and skins (tracking) - https://phabricator.wikimedia.org/T94547#2268242 (Danny_B)
[19:30:07] Continuous-Integration-Config, Release-Engineering-Team, HHVM, Tracking: Jenkins: Implement hhvm based voting jobs for mediawiki and extensions (tracking) - https://phabricator.wikimedia.org/T75521#2268244 (Danny_B)
[19:38:12] Continuous-Integration-Infrastructure, Tracking: gallium and lanthanum disks full (tracking) - https://phabricator.wikimedia.org/T91211#2268263 (Danny_B)
[19:38:14] Beta-Cluster-Infrastructure, Operations, Puppet, Tracking: Minimize differences between beta and production (Tracking) - https://phabricator.wikimedia.org/T87220#2268264 (Danny_B)
[19:43:05] ostriches, I took the existing one, updated the dates, removed phabricator and UploadLink, cleared the SWAT lists, and updated the train versions
[19:43:18] UploadsLink*
[19:44:24] Continuous-Integration-Infrastructure, Patch-For-Review, Tracking: Create CI slaves using Debian Jessie (tracking) - https://phabricator.wikimedia.org/T94836#2268272 (Danny_B)
[19:45:27] Krenair: Ty <3
[19:48:37] Continuous-Integration-Infrastructure, Patch-For-Review, Tracking: Phase out yamllint jobs (tracking) - https://phabricator.wikimedia.org/T95890#2268293 (Danny_B)
[19:59:32] Continuous-Integration-Infrastructure, Tracking: Switch zuul to be gearman based (tracking) - https://phabricator.wikimedia.org/T52664#2268304 (Danny_B)
[20:02:12] Deployment-Systems, Release-Engineering-Team, RESTBase, Services, Tracking: Create or improve the RESTBase deploy method (tracking) - https://phabricator.wikimedia.org/T102667#2268308 (Danny_B)
[20:02:17] Deployment-Systems, Tracking: Trebuchet blockers for MediaWiki (tracking) - https://phabricator.wikimedia.org/T45338#2268310 (Danny_B)
[20:05:46] Release-Engineering-Team, releng-201415-Q3, releng-201415-Q4, Tracking, User-greg: [Quarterly Success Metric] RelEng+TPG process discussion and improvements (tracking) - https://phabricator.wikimedia.org/T88708#2268321 (Danny_B)
[20:12:10] Browser-Tests: Allow specifying required permissions for a new user - https://phabricator.wikimedia.org/T134529#2268338 (SBisson)
[20:13:17] Browser-Tests: Allow specifying required permissions for a new user - https://phabricator.wikimedia.org/T134529#2268354 (SBisson) @zeljkofilipin Could you please tag with the proper projects so it's not lost.
[20:17:06] Beta-Cluster-Infrastructure, Patch-For-Review: Creating wiki at beta cluster for the Dutch Wikipedia - https://phabricator.wikimedia.org/T118005#2268367 (mobrovac) We are testing a newer version of Cassandra there, and Puppet fails on package pins (even though a newer package is installed, non-sense). I...
[20:32:07] Beta-Cluster-Infrastructure, Deployment-Systems, Scap3, Operations, Patch-For-Review: Automate the generation deployment keys (keyholder-managed ssh keys) - https://phabricator.wikimedia.org/T133211#2268389 (mobrovac) >>! In T133211#2267929, @mmodell wrote: > 1. Just let all of the service de...
[20:32:28] Beta-Cluster-Infrastructure, Tracking: upload, thumbnails and transcoding on beta (tracking) - https://phabricator.wikimedia.org/T39080#2268395 (Danny_B)
[20:32:55] Deployment-Systems, Release-Engineering-Team, Operations, Services: Streamline our service development and deployment process - https://phabricator.wikimedia.org/T93428#2268404 (GWicke) @akosiaris, I added the tag to reflect that several aspects of these requirements (especially config managemen...
[20:48:39] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #1006: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/1006/
[21:21:01] Beta-Cluster-Infrastructure, Patch-For-Review: Creating wiki at beta cluster for the Dutch Wikipedia - https://phabricator.wikimedia.org/T118005#2268540 (Krenair) That RB URL is still broken though
[21:23:48] Krenair: /me looking
[21:24:14] ty
[21:41:05] Beta-Cluster-Infrastructure, Parsoid, Services, Patch-For-Review: Creating wiki at beta cluster for the Dutch Wikipedia - https://phabricator.wikimedia.org/T118005#2268575 (mobrovac) It turns out for that specific URL, there's a Parsoid error: ```lines=10 [2016-05-05T21:38:54.573Z] ERROR: restba...
[21:41:15] Krenair: ^
[22:09:51] !log Promoted Yurik and Jgirault to sysops on beta enwiki. Through shell because logging in is broken for me.
[22:09:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[22:10:07] thx MaxSem
[22:10:27] MaxSem: Ask me next time :D
[22:10:38] *you can ask me next time
[22:10:41] laaaazy! :P
[22:11:12] next time I'll just hack the wiki because it's even lazier than using SSH :}
[22:11:52] MaxSem: What do you mean with log in is broken?
[22:12:04] login or logging?
[22:12:11] https://gist.github.com/MaxSem/75e2c9ec10e6560afcf68ab41c72ca39
[22:12:20] (already reported)
[23:41:52] PROBLEM - Puppet run on deployment-mediawiki01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]