[00:07:31] RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:29:55] addshore: can we make a decision on https://phabricator.wikimedia.org/T174922 ?
[00:30:01] *looks*
[00:30:22] I can write an email to the team and see if they have come to one?
[00:30:34] Do you want a decision there before moving forward with anything? or?
[00:30:48] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3616657 (Addshore) p:Normal>High
[00:31:14] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3577093 (Addshore) Marking as high as this is the one undecided thing blocking the killing of the build r...
[00:37:08] addshore: I think it should be decided (not necessarily actually implemented though) before we start moving forward
[00:37:15] okay!
[00:37:18] I can write an email!
[00:38:30] thanks :)
[00:39:02] legoktm: sent
[00:44:20] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#2101252 (mmodell) itshappening
[00:46:59] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3616670 (JeroenDeDauw)
[00:54:51] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3577093 (JeroenDeDauw) If you add this code to the Wikivase.git repository you likely make it even harder...
[02:13:08] Project selenium-QuickSurveys » chrome,beta,Linux,BrowserTests build #534: FAILURE in 6.5 sec: https://integration.wikimedia.org/ci/job/selenium-QuickSurveys/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/534/
[02:23:59] (PS1) Addshore: WIP DNM dicker: lintr image - linter for R [integration/config] - https://gerrit.wikimedia.org/r/378831 (https://phabricator.wikimedia.org/T176194)
[02:27:10] right, bed...
[02:35:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:56:14] Project selenium-MultimediaViewer » chrome,beta,OS X 10.9,BrowserTests build #521: FAILURE in 13 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/521/
[03:56:15] Project selenium-MultimediaViewer » firefox,beta,Linux,BrowserTests build #521: FAILURE in 13 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/521/
[03:56:21] Project selenium-MultimediaViewer » safari,beta,OS X 10.9,BrowserTests build #521: FAILURE in 21 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/521/
[04:15:19] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:36:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[05:36:39] Continuous-Integration-Config, MediaWiki-Platform-Team, Patch-For-Review: Run MediaWiki tests on PHP 7 - https://phabricator.wikimedia.org/T144962#3616898 (tstarling) a:Legoktm @Legoktm is taking this on next quarter.
[05:39:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:04:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[06:16:15] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:35:23] Beta-Cluster-Infrastructure: IP Address Lookup Tool Installation - https://phabricator.wikimedia.org/T176074#3616908 (Samtar) I believe you're essentially looking for #checkuser or CheckUser-like functionality? I've had a scour of the beta cluster docs and can't see why it //wasn't// installed, but I imagine...
[06:37:19] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[06:57:18] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[07:17:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:56:17] Release-Engineering-Team (Backlog), Scap, Operations, Parsoid: Check 'depool' failed while deploying - https://phabricator.wikimedia.org/T176184#3616952 (KartikMistry) Also: https://phabricator.wikimedia.org/P6022 - blocking cxserver deployment.
[08:01:59] RECOVERY - Free space - all mounts on integration-slave-jessie-1001 is OK: OK: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)
[08:09:04] (CR) Hashar: [C: 2] fab: git gc zuul repo on the servers [integration/config] - https://gerrit.wikimedia.org/r/378667 (owner: Hashar)
[08:22:28] (PS2) Hashar: fabric: fix file documentation [integration/config] - https://gerrit.wikimedia.org/r/378785 (owner: Addshore)
[08:22:35] (CR) Hashar: [C: 2] "Fixed a typo" [integration/config] - https://gerrit.wikimedia.org/r/378785 (owner: Addshore)
[08:24:10] (Merged) jenkins-bot: fabric: fix file documentation [integration/config] - https://gerrit.wikimedia.org/r/378785 (owner: Addshore)
[08:24:36] (PS2) Hashar: docker: updated cache-buster cmd for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/378811 (owner: Addshore)
[08:25:07] (CR) GoranSMilovanovic: [C: 1] WIP DNM dicker: lintr image - linter for R [integration/config] - https://gerrit.wikimedia.org/r/378831 (https://phabricator.wikimedia.org/T176194) (owner: Addshore)
[08:28:51] Release-Engineering-Team (Backlog), Scap, Operations, Parsoid, Patch-For-Review: Check 'depool' failed while deploying - https://phabricator.wikimedia.org/T176184#3616972 (Joe) a:Joe
[08:29:32] (PS3) Hashar: docker: updated cache-buster cmd for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/378811 (owner: Addshore)
[08:29:52] Release-Engineering-Team (Backlog), Scap, Operations, Parsoid, Patch-For-Review: Check 'depool' failed while deploying - https://phabricator.wikimedia.org/T176184#3616450 (Joe) This was caused by https://gerrit.wikimedia.org/r/#/c/365891/, yet another case of a labs-specific fix breaking prod...
[08:30:19] (CR) Hashar: [C: 2] "Fixed typo in the Gerrit url (operation -> operations)" [integration/config] - https://gerrit.wikimedia.org/r/378811 (owner: Addshore)
[08:31:19] (Merged) jenkins-bot: docker: updated cache-buster cmd for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/378811 (owner: Addshore)
[08:38:19] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[08:40:07] (CR) Hashar: [C: 1] "sounds good (almost :D )" (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378807 (owner: Addshore)
[08:46:55] (CR) Hashar: "Whenever we get CI to build the image for us, I guess it will have a step to pull the new image on all the docker hosts." (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378783 (owner: Addshore)
[08:48:00] (CR) Hashar: [C: -1] docker: mediawiki-extensions-phan image [integration/config] - https://gerrit.wikimedia.org/r/371708 (owner: Addshore)
[08:54:49] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[08:57:58] (CR) Hashar: docker: zuul-cloner image (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/375834 (owner: Addshore)
[09:00:44] (PS7) Hashar: docker: zuul-cloner image [integration/config] - https://gerrit.wikimedia.org/r/375834 (owner: Addshore)
[09:01:14] (CR) Hashar: [C: 1] "I have slightly amended the patch to fix it up :) I guess just refresh the job and CR+2 this?"
[integration/config] - https://gerrit.wikimedia.org/r/375834 (owner: Addshore)
[09:02:02] PROBLEM - Puppet errors on deployment-trending01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[09:06:52] Release-Engineering-Team (Backlog), Scap, Operations, Parsoid, Services (watching): Check 'depool' failed while deploying - https://phabricator.wikimedia.org/T176184#3617004 (mobrovac) Open>Resolved Confirmed to have fixed deployments on SCB, resolving. Thank you @Joe for the quick fix!
[09:11:04] !log Ran mwscript cleanupSpam.php on the beta cluster, but it didn't worked (looks it is not fetching the domains properly)
[09:11:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:13:30] !log Re-run previous script and it worked this time, see https://deployment.wikimedia.beta.wmflabs.org/wiki/Template_talk:Rotate/en
[09:13:33] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:32:12] Release-Engineering-Team (Kanban), Cleanup, Repository-Admins, User-MarcoAurelio: Deprecate unmaintained/inactive XMLContentExtension - https://phabricator.wikimedia.org/T148825#3617050 (MarcoAurelio) Open>Resolved
[09:42:13] Release-Engineering-Team (Watching / External), Wikidata, Story: [Story] Use composer-merge-plugin to include Wikidata components in mediawiki-vendor - https://phabricator.wikimedia.org/T95663#3617117 (thiemowmde)
[09:49:53] Continuous-Integration-Config, Wiki-Loves-Monuments-Database, Patch-For-Review: Add Shell linting to heritage repo - https://phabricator.wikimedia.org/T175906#3617142 (JeanFred) >>! In T175906#3607546, @JeanFred wrote: > Looks like Shellcheck has been explored in T148494, but as far as I can see in r...
[10:14:06] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3617186 (Lucas_Werkmeister_WMDE)
[10:57:35] (CR) Hashar: docker: php7 base image (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378530 (owner: Addshore)
[11:00:49] (CR) Hashar: "Good to me, but you forgot to git add sury-php.gpg :]" [integration/config] - https://gerrit.wikimedia.org/r/378530 (owner: Addshore)
[11:04:30] Beta-Cluster-Infrastructure: IP Address Lookup Tool Installation - https://phabricator.wikimedia.org/T176074#3617225 (Sau226) Yes I mean checkuser. If somehow rights were modified so only NDAed people could user it/ add those rights or if there was a mutual confirmation like thing (1 user asks and another us...
[11:11:56] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3617262 (MarcoAurelio)...
[11:16:05] Beta-Cluster-Infrastructure, MediaWiki-extensions-BlockAndNuke, wikimedia-extension-review-queue: Consider installing BlockAndNuke on the Beta Cluster - https://phabricator.wikimedia.org/T176207#3617264 (MarcoAurelio)
[11:20:47] (CR) Addshore: [C: -1] docker: zuul-cloner image (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/375834 (owner: Addshore)
[11:24:44] Release-Engineering-Team (Backlog), Scap, Operations, Parsoid, Services (watching): Check 'depool' failed while deploying - https://phabricator.wikimedia.org/T176184#3617282 (KartikMistry) Thanks @Joe and @mobrovac
[11:32:44] hashar: hey, does this mean I need to fix the extension?
https://gerrit.wikimedia.org/r/#/c/378787/
[11:32:59] probably make the extension.json
[11:37:01] Is that repo not on github?
[11:37:23] And it's not on diffusion either..
[11:40:04] I can do the diffusion thing Amir1 and Reedy
[11:40:22] Just added an empty repo on github
[11:40:24] https://github.com/wikimedia/mediawiki-extensions-DataTypes
[11:40:26] yeah, I asked for that last night
[11:41:09] Amir1: The autoload for that extension is only in composer.json
[11:41:29] so a wgAutoloadClass needs adding to https://github.com/wikimedia/mediawiki-extensions-DataTypes/blob/master/DataTypes.mw.php
[11:41:44] Reedy: Thanks for the help, I do it ASAP
[11:42:38] diffusion mirror created
[11:42:49] I'll set the gerrit URI to mirror it
[11:44:57] tabbycat: Thanks
[11:46:00] https://phabricator.wikimedia.org/diffusion/EDTP/ is being imported
[11:47:20] oh shizz
[11:47:27] EDTY already exists
[11:48:19] wait, it was on Module:Callsigns but not on diffusion
[11:48:29] well, I'll just rename the callsign on Phab then
[11:49:11] https://phabricator.wikimedia.org/diffusion/EDTY/ <-- factum est
[12:02:38] Beta-Cluster-Infrastructure: Requesting Global Rights - https://phabricator.wikimedia.org/T176140#3617326 (Sau226) Even though I trust this user I am currently thinking whether the rights are appropriate and if we really need a global sysop. You will be notified on this task soon. @Anooprao May I please ask...
[12:03:58] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[12:04:26] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[12:20:08] (CR) Hashar: [C: -1] docker: composer image (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/378531 (owner: Addshore)
[12:23:57] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:32:41] Reedy: can you enable Travis? https://travis-ci.org/wikimedia/mediawiki-extensions-DataTypes
[12:33:11] Nope
[12:33:11] You don't have sufficient rights to enable this repo on Travis.
[12:33:11] Please contact the admin to enable it or to receive admin rights yourself.
[12:33:11] it would be great if I can do it (be an admin in the organization)
[12:33:24] Same for me :/
[12:44:25] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:45:09] (PS1) Aude: Bump Wikidata [tools/release] - https://gerrit.wikimedia.org/r/378895
[12:45:22] (CR) Aude: [C: 2] Bump Wikidata [tools/release] - https://gerrit.wikimedia.org/r/378895 (owner: Aude)
[12:48:38] (Merged) jenkins-bot: Bump Wikidata [tools/release] - https://gerrit.wikimedia.org/r/378895 (owner: Aude)
[13:01:35] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - string 'Wikipedia' not found on 'https://en.m.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 1971 bytes in 3.128 second response time
[13:02:00] lies
[13:04:54] Project selenium-Math » chrome,beta,Linux,BrowserTests build #519: FAILURE in 53 sec: https://integration.wikimedia.org/ci/job/selenium-Math/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/519/
[13:04:55] Project selenium-Math » firefox,beta,Linux,BrowserTests
build #519: FAILURE in 53 sec: https://integration.wikimedia.org/ci/job/selenium-Math/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/519/
[13:06:35] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 35599 bytes in 1.884 second response time
[13:13:16] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:30:39] Release-Engineering-Team (Kanban), Phabricator: Add support for task types - https://phabricator.wikimedia.org/T93499#3617440 (Aklapper) > Any idea how this shows up in the database? See https://secure.phabricator.com/D17441 for the database part. (For Phab admins: The actual task (sub)types are defined...
[13:47:59] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617488 (Sau226)
[13:54:32] (PS1) Hashar: docker: clone puppet.git in a different layer [integration/config] - https://gerrit.wikimedia.org/r/378911
[13:55:09] (CR) Hashar: "I am not sure how it is a good idea : ) but that saves me from having to dl puppet.git over and over." [integration/config] - https://gerrit.wikimedia.org/r/378911 (owner: Hashar)
[14:00:54] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617488 (MarcoAurelio) I don't think this should be done. The user just arrived 9 days ago and has self-assigned each and every flag on the wikis he's got access to for no real reason (T175555 · [[ h...
[14:05:07] (PS2) Hashar: docker: clone puppet.git in a different layer [integration/config] - https://gerrit.wikimedia.org/r/378911
[14:06:00] (CR) Hashar: docker: clone puppet.git in a different layer (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378911 (owner: Hashar)
[14:08:05] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[14:15:54] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617592 (Sau226) p:Triage>Normal Fair enough although a second opinion can't hurt.
[14:21:56] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617611 (Sau226) As you have suggested I have too many user rights on wikis for no real reason I have stripped my rights from wikis where I don't need these rights in the forseeable future. I will ho...
[14:25:25] Continuous-Integration-Config, Wikidata, Patch-For-Review, User-Tobi_WMDE_SW: E-Mail notification on failures of Wikidata-builds - https://phabricator.wikimedia.org/T152495#3617624 (Addshore)
[14:30:49] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617488 (Steinsplitter) >>! In T176213#3617523, @MarcoAurelio wrote: > I don't think this should be done. The user just arrived 9 days ago and has self-assigned each and every flag on the wikis he's...
[14:33:29] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617662 (Sau226) Open>declined Community consensus is clear. I see no reason to pursue this matter further
[14:33:39] Yippee, build fixed!
[14:33:40] Project selenium-WikiLove » firefox,beta,Linux,BrowserTests build #521: FIXED in 1 min 38 sec: https://integration.wikimedia.org/ci/job/selenium-WikiLove/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/521/
[14:35:20] Beta-Cluster-Infrastructure: Requesting Global Rights - https://phabricator.wikimedia.org/T176140#3617665 (Sau226) p:Triage>Normal a:Sau226>None
[14:46:39] Beta-Cluster-Infrastructure: Requesting Global Rights - https://phabricator.wikimedia.org/T176140#3617715 (Aklapper) p:Normal>Triage @Sau226: Please do not prioritize tasks if you do not plan to work on them. Thanks!
[15:00:43] PROBLEM - Puppet errors on deployment-redis02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:02:25] PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:02:39] PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:02:45] PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:02:49] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:03:01] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:03:34] PROBLEM - Puppet errors on deployment-memc05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:04:27] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:04:28] PROBLEM - Puppet errors on castor02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:04:34] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:04:47] PROBLEM - Puppet errors on
deployment-puppetdb01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:04:58] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:05:26] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:05:55] PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:06:08] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:06:15] PROBLEM - Host deployment-stream is DOWN: CRITICAL - Host Unreachable (10.68.17.106)
[15:06:19] PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:07:23] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:07:29] PROBLEM - Puppet errors on integration-publishing is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:08:22] PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:08:26] PROBLEM - Puppet errors on deployment-ircd is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:08:34] Beta-Cluster-Infrastructure, MediaWiki-extensions-BlockAndNuke, wikimedia-extension-review-queue: Consider installing BlockAndNuke on the Beta Cluster - https://phabricator.wikimedia.org/T176207#3617264 (Sau226) If this extension actually works as intended it would be highly beneficial to sysops and...
[15:08:46] PROBLEM - Puppet errors on deployment-eventlog02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:09:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:09:26] PROBLEM - Puppet errors on deployment-pdf01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:09:32] PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:09:32] PROBLEM - Puppet errors on deployment-memc07 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:10:37] PROBLEM - Puppet errors on deployment-sentry01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:10:45] PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:11:07] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:12:27] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:12:41] PROBLEM - Puppet errors on saucelabs-02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:12:41] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:12:55] PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:13:13] hashar ^
[15:13:19] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:13:48] andre__: maybe you know who to ping here for the shinken errors?
[15:13:52] tabbycat know :)
[15:13:55] know = known
[15:13:59] see -cloud
[15:14:10] thanks paladox
[15:14:15] your welcome :)
[15:14:24] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:14:24] oh, -cloud is worse
[15:14:28] PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:14:36] PROBLEM - Puppet errors on integration-slave-docker-1003 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:14:36] i think it will recover soon
[15:14:41] running puppet works for me now
[15:14:46] PROBLEM - Puppet errors on deployment-mcs01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:15:04] PROBLEM - Puppet errors on deployment-sca03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:15:08] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:15:11] if c.m.p is on it surely it will recover soon :)
[15:15:39] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:15:44] yep
[15:16:11] PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:17:17] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:17:23] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:19:16] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3617796 (jmatazzoni) I...
[15:38:34] RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:27] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:29] RECOVERY - Puppet errors on castor02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:35] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:47] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:58] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:24] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:42] RECOVERY - Puppet errors on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:40:54] RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:10] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:18] RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:25] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:25] RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:29] RECOVERY - Puppet errors on integration-publishing is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:39] RECOVERY - Puppet errors on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:45] RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:42:49] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:43:01] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the
threshold [0.0]
[15:43:21] RECOVERY - Puppet errors on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:43:24] RECOVERY - Puppet errors on deployment-ircd is OK: OK: Less than 1.00% above the threshold [0.0]
[15:43:46] RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:16] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:28] RECOVERY - Puppet errors on deployment-pdf01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:33] RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:33] RECOVERY - Puppet errors on deployment-memc07 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:44:58] Continuous-Integration-Infrastructure, Jenkins, Patch-For-Review: Upgrade jenkins to 2.73.1 (new lts release) - https://phabricator.wikimedia.org/T168644#3617875 (Paladox)
[15:46:07] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:39] RECOVERY - Puppet errors on saucelabs-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:39] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:55] RECOVERY - Puppet errors on deployment-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:48:20] RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:22] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:28] RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:36] RECOVERY - Puppet errors on integration-slave-docker-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:49:48] RECOVERY - Puppet errors on deployment-mcs01 is OK: OK: Less than 1.00% above the threshold
[0.0]
[15:50:04] RECOVERY - Puppet errors on deployment-sca03 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:08] RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:38] RECOVERY - Puppet errors on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:39] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:45] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:51:11] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:51:11] RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:13] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:21] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:25] RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:58:04] RECOVERY - Free space - all mounts on deployment-kafka01 is OK: OK: All targets OK
[16:08:28] bah puppet is broken on beta
[16:08:30] looking
[16:09:21] ah was transient
[16:10:37] hashar: lots of backscroll in -operations. I think things are good in most of the world now.
[16:11:23] bd808: thanks :) [16:13:14] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [16:16:20] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T170636#3617985 (demon) Open>Resolved [16:22:43] (CR) Hashar: docker: base image for CI images (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378033 (owner: Hashar) [16:26:25] (CR) Hashar: docker: base image for CI images (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/378033 (owner: Hashar) [16:31:50] (PS2) Hashar: docker: base image for CI images [integration/config] - https://gerrit.wikimedia.org/r/378033 [16:48:14] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [16:51:44] Release-Engineering-Team (Kanban), Phabricator: Add support for task types - https://phabricator.wikimedia.org/T93499#3618177 (Fjalapeno) Thanks @mmodell! Sorry this got buried in my inbox… as @MBinder_WMF said, for our teams, I think the one that would be the most useful one would be the bug type… fol... [16:53:17] Release-Engineering-Team (Kanban), Phabricator: Add support for task types - https://phabricator.wikimedia.org/T93499#3618191 (Fjalapeno) Just a quick follow up, I think the others defined in the task description are useful as well… just wanted to note what the 90% case would be. [17:19:41] Beta-Cluster-Infrastructure: Access to deployment-prep for sau226 - https://phabricator.wikimedia.org/T176213#3617488 (Legoktm) @Sau226 I think you currently don't understand what beta cluster is even for. From a quick glance at deletion logs, pages you're deleting and then salting as spam are most definitel...
[17:26:28] !log removed rights from User:Sau226 on beta cluster due to block of account used for browser tests [17:26:32] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:27:44] also for some reason https://en.wikinews.beta.wmflabs.org/wiki/Main_Page has the beta commons logo? [17:39:09] (CR) Legoktm: "Based on we should consider looking into using a base stretc" [integration/config] - https://gerrit.wikimedia.org/r/378530 (owner: Addshore) [17:40:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [17:42:29] no_justification: should I interpret https://phabricator.wikimedia.org/T174760#3611453 as "go ahead and do it"? [17:42:41] Yes, jfdi [17:42:53] Go forth and godspeed! [17:42:54] :) [17:51:02] * paladox had to order a new router today, bt has to replace it, but not after a half an hour on the phone with them. lol [17:51:02] woops [17:51:06] wrong place. [17:52:24] Beta-Cluster-Infrastructure, Analytics-Kanban, Wikimedia-Stream, Patch-For-Review: Decom RCStream in Beta Cluster - https://phabricator.wikimedia.org/T172356#3618442 (Nuria) Open>Resolved [17:54:13] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Analytics-Kanban, Analytics-Wikistats: Set up continuous integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3618459 (Nuria) [18:00:46] Beta-Cluster-Infrastructure: IP Address Lookup Tool Installation - https://phabricator.wikimedia.org/T176074#3618509 (greg) Open>declined Declining for now per T176213#3617488 [18:20:07] legoktm: greg-g: hashar: https://gerrit.wikimedia.org/r/#/c/378743/ [18:20:19] Not sure if that pet peeve is just me :) [18:21:03] you're only slightly more OCD than I :P [18:32:30] shoot i got my links mixed up in the email to wikitech-i.
[18:34:00] (PS2) Umherirrender: Add configuration to generate PHPUnit coverage reports [tools/codesniffer] - https://gerrit.wikimedia.org/r/377042 (owner: Legoktm) [18:34:04] (CR) Umherirrender: [C: 2] Add configuration to generate PHPUnit coverage reports [tools/codesniffer] - https://gerrit.wikimedia.org/r/377042 (owner: Legoktm) [18:36:05] (Merged) jenkins-bot: Add configuration to generate PHPUnit coverage reports [tools/codesniffer] - https://gerrit.wikimedia.org/r/377042 (owner: Legoktm) [18:44:37] (PS3) Krinkle: zuul: Don't toggle panel when clicking Gerrit patch link [integration/docroot] - https://gerrit.wikimedia.org/r/378743 [18:46:34] (CR) Krinkle: "Upstreaming at https://review.openstack.org/#/c/505366/" [integration/docroot] - https://gerrit.wikimedia.org/r/373667 (https://phabricator.wikimedia.org/T174058) (owner: Thcipriani) [18:46:44] (CR) Krinkle: "Upstreaming at https://review.openstack.org/#/c/505368/" [integration/docroot] - https://gerrit.wikimedia.org/r/378743 (owner: Krinkle) [18:51:31] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3618821 (Anomie) >>! I... [19:32:07] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3619030 (MarcoAurelio)... [19:40:12] Release-Engineering-Team (Watching / External), Wikidata: Decide what to do with Wikibase JS-only libraries regarding the build/deployment of Wikidata code - https://phabricator.wikimedia.org/T174922#3619066 (Krinkle) >>!
In T174922#3583384, @WMDE-leszek wrote: > One concern with the Option 5 I could ima... [19:45:19] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [20:06:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [20:16:18] Continuous-Integration-Config, Release-Engineering-Team (Watching / External), Discovery, Discovery-Analysis (Current work), Patch-For-Review: Add lint/CI to all wikimedia/discovery analytics repositories - https://phabricator.wikimedia.org/T153856#3619225 (debt) It looks like we're pretty mu... [21:07:26] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): Upgrade docker on integration-slave-docker-* - https://phabricator.wikimedia.org/T176267#3619405 (hashar) [21:07:45] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): Upgrade docker on integration-slave-docker-* - https://phabricator.wikimedia.org/T176267#3619419 (hashar) [21:07:47] Release-Engineering-Team (Kanban), Operations, Release Pipeline, Patch-For-Review: Provision Docker >= 17.05 on contint1001 - https://phabricator.wikimedia.org/T175293#3589181 (hashar) [21:33:36] does the Phabricator notification server have to be accessible from the world [21:33:40] or just from Phab service [21:33:45] re: firewall rules [21:46:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [23:17:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [23:38:19] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]