[06:10:43] <wmf-insecte>	 Project beta-scap-eqiad build #222986: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/222986/
[06:21:26] <wikibugs>	 Project-Admins: Replace tracking bug T21719 by new project tag "HTML5" - https://phabricator.wikimedia.org/T102502 (Aklapper) I'm missing a use case why someone would like to follow only HTML-5 related tasks.
[06:24:00] <wmf-insecte>	 Project beta-scap-eqiad build #222987: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/222987/
[06:32:35] <Krinkle>	 The beta update job failure says "sudo: a password is required"
[06:38:30] <wmf-insecte>	 Yippee, build fixed!
[06:38:30] <wmf-insecte>	 Project beta-scap-eqiad build #222988: FIXED in 13 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/222988/
[07:09:18] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-deploy01 is CRITICAL: CRITICAL: deployment-prep.deployment-deploy01.diskspace.root.byte_percentfree (<11.11%)
[07:24:16] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-deploy01 is CRITICAL: CRITICAL: deployment-prep.deployment-deploy01.diskspace.root.byte_percentfree (<11.11%)
[07:39:17] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-deploy01 is CRITICAL: CRITICAL: deployment-prep.deployment-deploy01.diskspace.root.byte_percentfree (<11.11%)
[07:47:09] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[08:02:00] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[08:02:33] <shinken-wm>	 PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[08:12:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:20:51] <wmf-insecte>	 Project beta-scap-eqiad build #222999: FAILURE in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/222999/
[09:34:02] <wmf-insecte>	 Project beta-scap-eqiad build #223000: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223000/
[09:47:07] <wikibugs>	 Release-Engineering-Team, Scap, Operations: mwdebug1001 and mwdebug1002 are reliably the last two hosts to finish scap-cdb-rebuild - https://phabricator.wikimedia.org/T203625 (MoritzMuehlenhoff) p:Triage>Normal
[09:47:17] <wmf-insecte>	 Project beta-scap-eqiad build #223001: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223001/
[09:49:14] <hashar>	 zeljkof: wdio/ffmpeg I rewrote it this morning
[09:49:22] <hashar>	 as a wdio reporter, see https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/462656/2
[09:49:30] <hashar>	 and I left some comment on your change https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/422933/
[09:49:45] <hashar>	 I basically copy pasted the code from wdio.conf.js to a new file
[09:49:50] <hashar>	 and wdio.conf.js is now all about:
[09:50:04] <hashar>	 reporters: [ require('wdio-mediawiki/VideoRecordingReporter') ],
[09:50:08] <hashar>	 reporterOptions: {
[09:50:14] <hashar>	   video: { videoPath: logPath }
[09:50:15] <hashar>	 }
[09:50:29] <zeljkof>	 cool!
[09:50:50] <hashar>	 so we can add video recording to the extensions daily jobs fairly trivially (just release wdio-mediawiki 0.0.3, bump in the extension package.json and add the above reporter configuration)
[09:50:55] <hashar>	 then we get video reporting on daily jobs
[09:51:37] <hashar>	 for running the extension selenium tests from Quibble, I think I have a good plan.  Will do my best to deliver it by end of the week and cut a new quibble version
[09:52:17] <zeljkof>	 even cooler!
[10:00:38] <wmf-insecte>	 Project beta-scap-eqiad build #223002: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223002/
[10:13:51] <wmf-insecte>	 Project beta-scap-eqiad build #223003: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223003/
[10:17:49] <wikibugs>	 Release-Engineering-Team, Wikimedia-Incident, Wikimedia-production-error: Promoting group1 to 1.32.0-wmf.22 caused a spam of  web request took longer than 60 seconds and timed out - https://phabricator.wikimedia.org/T204871 (hashar)
[10:18:34] <wikibugs>	 Release-Engineering-Team, Wikimedia-Incident, Wikimedia-production-error: Deployments of MediaWiki with scap cause a spam of "web request took longer than 60 seconds and timed out" - https://phabricator.wikimedia.org/T204871 (hashar)
[10:20:27] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, MediaWiki-extensions-Other, User-zeljkofilipin: EditSubpages extension fails mediawiki/core webdriver.io tests - https://phabricator.wikimedia.org/T196436 (zeljkofilipin) @hashar this will be resolved by {T199116}?
[10:21:46] <wikibugs>	 Release-Engineering-Team, Scap, Datacenter-Switchover-2018, Patch-For-Review: Scap is checking canary servers in dormant instead of active-dc - https://phabricator.wikimedia.org/T204907 (hashar)
[10:22:08] <wikibugs>	 Release-Engineering-Team, Scap, Operations, Datacenter-Switchover-2018, Wikimedia-Incident: Scap is checking canary servers in dormant instead of active-dc - https://phabricator.wikimedia.org/T204907 (hashar)
[10:24:44] <wikibugs>	 Release-Engineering-Team (Kanban), Release, Train Deployments, User-zeljkofilipin: 1.32.0-wmf.22 deployment blockers - https://phabricator.wikimedia.org/T191068 (hashar) Train report published on Wikitech: https://wikitech.wikimedia.org/wiki/Incident_documentation/20180918-train
[10:27:20] <wmf-insecte>	 Project beta-scap-eqiad build #223004: STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223004/
[10:40:56] <wmf-insecte>	 Project beta-scap-eqiad build #223005: ABORTED in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223005/
[10:44:17] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-deploy01 is OK: OK: All targets OK
[10:49:18] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Scap: On deployment-prep scap cache_git_info takes 12 minutes (that is too slow) - https://phabricator.wikimedia.org/T204762 (hashar) Even funnier, I wanted to trace the execution of `scap sync` using the python `trace` module. The git c...
[10:57:25] <wmf-insecte>	 Yippee, build fixed!
[10:57:25] <wmf-insecte>	 Project beta-scap-eqiad build #223006: FIXED in 15 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223006/
[11:02:14] <wmf-insecte>	 Project beta-scap-eqiad build #223007: FAILURE in 3 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223007/
[11:08:30] <wikibugs>	 Release-Engineering-Team (Someday), MediaWiki-Core-Tests, User-zeljkofilipin: Someday/maybe Selenium framework improvements - https://phabricator.wikimedia.org/T190995 (hashar)
[11:08:40] <hashar>	 zeljkof: I have just declined that wdio async: false task 
[11:09:17] <hashar>	 seems webdriver.io 4 runs them synchronously now. That is good enough :]
[11:10:19] <zeljkof>	 it's actually a 4.x feature I think, provides a more JS syntax
[11:10:37] <hashar>	 na 3.x ran them asynchronously, and 4.x runs them synchronously
[11:11:09] <hashar>	 and  async: true is merely for back compatibility so people don't have to rewrite all their tests when upgrading
[11:13:41] <zeljko-evil-twin>	 hashar: something is wrong with freenode/irccloud, my messages get rejected all the time :/
[11:16:23] <wmf-insecte>	 Yippee, build fixed!
[11:16:23] <wmf-insecte>	 Project beta-scap-eqiad build #223008: FIXED in 13 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223008/
[11:26:35] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Video recording for Selenium tests in Node.js - https://phabricator.wikimedia.org/T179188 (zeljkofilipin)
[11:26:48] <wikibugs>	 Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin: Video recording for Selenium tests in Node.js - https://phabricator.wikimedia.org/T179188 (zeljkofilipin) p:High>Normal
[11:42:24] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Scap: On deployment-prep scap cache_git_info takes 12 minutes (that is too slow) - https://phabricator.wikimedia.org/T204762 (hashar) I wanted some historical build durations. I took the IRC logs from https://wm-bot.wmflabs.org/logs/%23w...
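Build durations like the ones mentioned in the T204762 comment can be recovered from the wmf-insecte notifications in a channel log such as this one. A minimal Python sketch, assuming the notification format seen in this log ("Project <job> build #<n>: <STATUS> in <duration>: <url>"). The names BUILD_RE, duration_seconds and parse_builds are hypothetical and not part of scap or Jenkins, and the "hr" unit is an assumption, since only "min" and "sec" appear here:

```python
import re

# Matches notification lines such as
#   "Project beta-scap-eqiad build #223007: FAILURE in 3 min 59 sec: <url>"
# An optional two-digit prefix on the status (a leftover IRC colour code,
# e.g. "04FAILURE") is tolerated and discarded.
BUILD_RE = re.compile(
    r"Project (?P<job>\S+) build #(?P<number>\d+): "
    r"(?:\d{2})?(?P<status>[A-Z ]+?) in (?P<duration>[^:]+):"
)

# Assumed unit spellings; "hr" does not occur in this log.
UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}


def duration_seconds(text):
    """Convert a duration such as '12 min' or '3 min 59 sec' to whole seconds."""
    total = 0.0
    for value, unit in re.findall(r"(\d+(?:\.\d+)?) (hr|min|sec)", text):
        total += float(value) * UNIT_SECONDS[unit]
    return int(total)


def parse_builds(lines):
    """Return (job, build number, status, seconds) for each notification line."""
    builds = []
    for line in lines:
        match = BUILD_RE.search(line)
        if match:
            builds.append((
                match["job"],
                int(match["number"]),
                match["status"],
                duration_seconds(match["duration"]),
            ))
    return builds


# Two lines taken from this log, one with and one without the colour prefix.
sample = [
    "[10:57:25] <wmf-insecte> Project beta-scap-eqiad build #223006: 09FIXED in 15 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223006/",
    "[11:02:14] <wmf-insecte> Project beta-scap-eqiad build #223007: FAILURE in 3 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223007/",
    "[10:57:25] <wmf-insecte> Yippee, build fixed!",
]
builds = parse_builds(sample)
```

Lines that are not build notifications, such as "Yippee, build fixed!", are skipped, so the whole log can be fed in unfiltered.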
[11:44:39] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Scap: On deployment-prep scap cache_git_info takes 12 minutes (that is too slow) - https://phabricator.wikimedia.org/T204762 (hashar) If that is from scap, that might be one of:  * 35e4dc6ea8557bb0ba63ec7eacd7e7233a24cd7e - sync-wikivers...
[12:13:28] <wikibugs>	 Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin)
[12:17:19] <wikibugs>	 Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin)
[12:21:45] <wikibugs>	 Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin)
[13:49:34] <enick_847>	 https://phabricator.wikimedia.org/tag/readers-web-backlog/https://phabricator.wikimedia.org/project/board/67/
[13:50:01] <enick_847>	 Sorry, bad message.
[13:50:37] <enick_847>	 zeljkof / zeljko-evil-twin: is it too late for a patch to make the train?
[13:50:57] <zeljkof>	 enick_847: yes
[13:51:17] <zeljkof>	 I've cut the branch, but for urgent things, you can always create a backport
[13:51:37] <wikibugs>	 Continuous-Integration-Config, Operations, Traffic: CI jobs for authdns linting need to run on Stretch - https://phabricator.wikimedia.org/T205439 (BBlack) p:Triage>Normal
[13:55:21] <enick_847>	 Ok, thanks zeljkof. 
[14:25:52] <wikibugs>	 Release-Engineering-Team (Watching / External), Scap, Operations, Datacenter-Switchover-2018, Wikimedia-Incident: Scap is checking canary servers in dormant instead of active-dc - https://phabricator.wikimedia.org/T204907 (greg) (I assume SRE will do the adding to conftool and the editing/ext...
[15:04:42] <wikibugs>	 (CR) Jforrester: [C: 2] "…" [tools/release] - https://gerrit.wikimedia.org/r/461733 (https://phabricator.wikimedia.org/T106067) (owner: Reedy)
[15:05:46] <wikibugs>	 (Merged) jenkins-bot: Stop branching DisableAccount [tools/release] - https://gerrit.wikimedia.org/r/461733 (https://phabricator.wikimedia.org/T106067) (owner: Reedy)
[15:08:20] <shinken-wm>	 PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:09:57] <James_F>	 Someone needs to set jenkins-bot's gerrit account "status" to an emoji. Maybe "🤖" or "🌋"?
[15:10:05] <wikibugs>	 Release-Engineering-Team (Kanban), Release, Train Deployments, User-zeljkofilipin: 1.32.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T191069 (zeljkofilipin) 1.32.0-wmf.23 at group 0: - https://tools.wmflabs.org/versions/ - https://www.mediawiki.org/w/index.php?diff=2894422&old...
[15:15:36] <wikibugs>	 Release-Engineering-Team (Kanban), Education-Program-Dashboard, MediaWiki-extensions-EducationProgram, Epic, and 2 others: Deprecate and remove the EducationProgram extension from Wikimedia servers after June 30, 2018 - https://phabricator.wikimedia.org/T125618 (Jdforrester-WMF)
[15:16:46] <wikibugs>	 Release-Engineering-Team (Kanban), Education-Program-Dashboard, MediaWiki-extensions-EducationProgram, Epic, and 2 others: Deprecate and remove the EducationProgram extension from Wikimedia servers after June 30, 2018 - https://phabricator.wikimedia.org/T125618 (Jdforrester-WMF)
[15:19:11] <wikibugs>	 Release-Engineering-Team (Kanban), Education-Program-Dashboard, MediaWiki-extensions-EducationProgram, Epic, and 2 others: Deprecate and remove the EducationProgram extension from Wikimedia servers after June 30, 2018 - https://phabricator.wikimedia.org/T125618 (Jdforrester-WMF)
[15:20:46] <wikibugs>	 (PS1) Jforrester: Stop branching EducationProgram [tools/release] - https://gerrit.wikimedia.org/r/462735 (https://phabricator.wikimedia.org/T125618)
[15:39:03] <shinken-wm>	 PROBLEM - Puppet errors on deployment-eventlog05 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[15:48:20] <shinken-wm>	 RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:07:28] <wikibugs>	 Continuous-Integration-Config: Capture failure logs from php-compile jobs (at least for luasandbox) - https://phabricator.wikimedia.org/T205453 (Anomie)
[16:45:23] <marxarelli>	 !log launching new integration-slave-docker-1039/1040 bigram instances
[16:45:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:46:27] <marxarelli>	 !log new instance creation delayed due to quota
[16:46:31] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:47:24] <marxarelli>	 !log increasing executors to 7 for jenkins nodes integration-slave-docker-1033/1034
[16:47:25] <greg-g>	 :(
[16:47:28] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:05:05] <marxarelli>	 !log taking integration-slave-docker-1030/1031 offline for replacement
[17:05:08] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:06:25] <awight>	 Hopefully I didn't break Zuul, I submitted a patch when gerrit "cherry-pick" didn't result in a merge job.
[17:09:29] <marxarelli>	 !log deleting integration-slave-docker-1030/1031 instances (T205362)
[17:09:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:09:35] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[17:11:38] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1030 is DOWN: CRITICAL - Host Unreachable (10.68.21.92)
[17:11:40] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1031 is DOWN: CRITICAL - Host Unreachable (10.68.20.213)
[17:12:22] <marxarelli>	 !log taking integration-slave-docker-1007/1008 offline for replacement (T205362)
[17:12:27] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:13:11] <marxarelli>	 !log launching new integration-slave-docker-1039 bigram instance
[17:13:14] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:23:04] <wmf-insecte>	 Project beta-scap-eqiad build #223032: FAILURE in 13 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223032/
[17:23:14] <wmf-insecte>	 Project beta-update-databases-eqiad build #28584: FAILURE in 3 min 13 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/28584/
[17:23:54] <Reedy>	 17:23:14 Exception: ('command: ', "echo 'wikidatawiki'; /usr/local/bin/mwscript update.php --wiki=wikidatawiki --quick", 'output: ', 'wikidatawiki\ngroups: cannot find name for group ID 50120\ngroups: cannot find name for group ID 51904\n\nWe trust you have received the usual lecture from the local System\nAdministrator. It usually boils down to these three things:\n\n    #1) Respect the privacy of others.\n    #2) Think before you 
[17:23:54] <Reedy>	 type.\n    #3) With great power comes great responsibility.\n\nsudo: no tty present and no askpass program specified\n')
[17:24:09] <marxarelli>	 !log deleting instances integration-slave-docker-1007/1008 (T205362)
[17:24:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:24:16] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[17:24:19] <Reedy>	 17:23:04 17:23:04 Last output:
[17:24:19] <Reedy>	 17:23:04 sudo: a password is required
[17:24:31] <Reedy>	 thcipriani: marxarelli ^ Has the keyholder become disarmed?
[17:25:29] <marxarelli>	 !log launching integration-slave-docker-1040 bigram instance (T205362)
[17:25:34] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:25:51] <marxarelli>	 thcipriani: ^ do you have a sec to look? i'm juggling jenkins nodes
[17:25:55] <thcipriani>	 sure
[17:26:00] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1008 is DOWN: CRITICAL - Host Unreachable (10.68.17.85)
[17:26:24] <thcipriani>	 hrm, that's not a symptom of a keyholder problem, but I have seen this a few times in the past couple of days
[17:26:39] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1007 is DOWN: CRITICAL - Host Unreachable (10.68.19.105)
[17:26:43] <thcipriani>	 seems like ldap is flapping maybe? /me files a task
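Both failed jobs quoted above end in a sudo prompt error rather than in a scap or update.php error. A minimal Python sketch of how job output could be checked for that signature, assuming only the two messages that appear in this log. SUDO_AUTH_ERRORS and is_sudo_auth_failure are hypothetical names, not part of scap, Jenkins or keyholder, and the check says nothing about the underlying cause:

```python
# Error strings copied from the job output quoted in this log.
SUDO_AUTH_ERRORS = (
    "sudo: a password is required",
    "sudo: no tty present and no askpass program specified",
)


def is_sudo_auth_failure(output):
    """Return True if captured job output contains a known sudo prompt failure."""
    return any(marker in output for marker in SUDO_AUTH_ERRORS)


# Lines taken from this log.
scap_failure = "17:23:04 sudo: a password is required"
healthy_line = "Yippee, build fixed!"
```

Matching on the literal messages keeps the check independent of which job produced the output.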
[17:30:44] <marxarelli>	 !log the puppet parameter for docker_lvm_volume specified in horizon was not applied correctly on the first puppet run for some reason. tearing down integration-slave-docker-1039...
[17:30:48] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:37:45] <wikibugs>	 Beta-Cluster-Infrastructure, Scap: Scap in beta fails occasionally due to permissions - https://phabricator.wikimedia.org/T205463 (thcipriani)
[17:37:55] <wmf-insecte>	 Yippee, build fixed!
[17:37:56] <wmf-insecte>	 Project beta-scap-eqiad build #223033: FIXED in 13 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/223033/
[17:38:14] <marxarelli>	 !log launching integration-slave-docker-1041 bigram instance (T205362)
[17:38:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:38:20] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[17:42:27] <marxarelli>	 !log configuring new jenkins node integration-slave-docker-1040 with 7 executors (T205362)
[17:42:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:50:51] <wikibugs>	 Continuous-Integration-Infrastructure, Wikimedia-production-error (Shared Build Failure): Jenkins jobs for MediaWiki failing with 'npm: shasum check failed' - https://phabricator.wikimedia.org/T203506 (Krinkle) Open>stalled This is blocked on CI upgrading to npm 5 or later. The cache instability...
[18:00:16] <marxarelli>	 !log configuring new integration-slave-docker-1041 jenkins node with 7 executors (T205362)
[18:00:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:00:24] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[18:13:07] <wikibugs>	 Release-Engineering-Team (Kanban), Release, Train Deployments, User-zeljkofilipin: 1.32.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T191069 (Krinkle)
[18:16:16] <marxarelli>	 !log reconfiguring bigram jenkins nodes to use 6 executors. 7 were configured by mistake (T205362)
[18:16:22] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:16:22] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[18:18:41] <wikibugs>	 Beta-Cluster-Infrastructure, Scap: Scap in beta fails occasionally due to permissions - https://phabricator.wikimedia.org/T205463 (Krenair) The other day I briefly observed that `sudo` had stopped functioning for me (on a random deployment-prep box, may have been deploy01) too. It resumed working soon af...
[18:20:48] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Niedzielski) @Ottomata, @fgiunchedi hello!  We'...
[18:22:18] <wmf-insecte>	 Yippee, build fixed!
[18:22:19] <wmf-insecte>	 Project beta-update-databases-eqiad build #28585: FIXED in 2 min 18 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/28585/
[18:29:41] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Ottomata) They should be.  Something is not wor...
[18:38:53] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Ottomata) BTW, I updated https://wikitech.wikim...
[18:46:07] <RoanKattouw>	 thcipriani: Would it be OK for me to deploy a maintenance-script only cherry-pick during the (unused) American train window? https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ORES/+/462788/
[18:48:00] <thcipriani>	 RoanKattouw: yep, sure. RelEng should probably need to revisit how SWATs are arranged during EU train-weeks. Thanks for checking :)
[18:49:42] <thcipriani>	 s/need to//
[18:53:42] <greg-g>	 I'm wary of variance
[18:54:04] <greg-g>	 not only because of the pain it puts on the current copy/paste/edit wiki workflow, but also because people's memory
[18:54:18] <greg-g>	 (I guess copy/edit/paste, but whatever)
[18:54:42] <greg-g>	 but if we can A) get rid of that stupid workflow then I'm probably OK with it :)
[18:54:47] <greg-g>	 (there is no B)
[18:54:55] <greg-g>	 (yet)
[18:55:29] <thcipriani>	 :)
[18:56:02] <thcipriani>	 that's a fair point, I am regularly surprised by SWAT since we moved the Wednesday window
[18:57:10] <Reedy>	 SURPRISE SWAT
[19:00:11] <Krinkle>	 greg-g: post is live
[19:02:03] <greg-g>	 Reedy: I want no surprise swat near my physical location :)
[19:02:05] <greg-g>	 Krinkle: yay
[19:27:51] <wikibugs>	 Continuous-Integration-Config, ContentTranslation, WorkType-Maintenance: ContentTranslation is not running PHPUnit structure tests - https://phabricator.wikimedia.org/T109670 (Petar.petkovic)
[19:31:43] <Volker_E>	 greg-g when is the deployment stop this year for the fundraiser?
[19:32:53] <greg-g>	 Volker_E: https://wikitech.wikimedia.org/wiki/Deployments#Upcoming
[19:34:00] <Volker_E>	 greg-g: wait, there are deployments in December? Or what am I misreading?
[19:34:25] <Volker_E>	 been on that page
[19:34:33] <greg-g>	 yes, as it was the last two years :P
[19:34:51] <greg-g>	 https://wikitech.wikimedia.org/wiki/Deployments/Archive/2017/12
[19:35:07] <greg-g>	 https://wikitech.wikimedia.org/wiki/Deployments/Archive/2016/12
[19:35:33] <greg-g>	 https://wikitech.wikimedia.org/wiki/Deployments/Archive/2015/12
[19:35:53] <greg-g>	 https://wikitech.wikimedia.org/wiki/Deployments/Archive/2014/12
[19:36:01] <greg-g>	 4 years :)
[20:13:34] <shinken-wm>	 PROBLEM - Puppet errors on deployment-prometheus01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[20:14:06] <Krenair>	 me
[20:15:28] <shinken-wm>	 PROBLEM - Puppet errors on deployment-maps03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[20:24:48] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Krenair) (for context: `modules/prometheus/mani...
[20:28:32] <shinken-wm>	 RECOVERY - Puppet errors on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:36:19] <p858snake|L>	 Volker_E: do we even need to stop deploys for fundraising anymore? They are on their own cluster with their own wiki handling things now
[20:38:39] <Reedy>	 CN
[20:45:28] <shinken-wm>	 RECOVERY - Puppet errors on deployment-maps03 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:48:18] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Krenair) With these puppet changes: ```diff --g...
[21:04:08] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: exported puppet resources are not queryable: cannot create grafana graphs of EventLogging running in beta cluster - https://phabricator.wikimedia.org/T204088 (Ottomata) @fgiunchedi Alex's change ^ should do...
[21:14:16] <wikibugs>	 Beta-Cluster-Infrastructure, Analytics-Kanban, Operations, Patch-For-Review, and 2 others: Prometheus resources in deployment-prep to create grafana graphs of EventLogging - https://phabricator.wikimedia.org/T204088 (Krenair)
[22:15:17] <marxarelli>	 !log taking remaining m1.medium m4executor jenkins nodes offline (T205362)
[22:15:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:15:24] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[22:33:16] <marxarelli>	 !log deleting remaining m1.medium instances used as m4executors (T205362)
[22:33:21] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:33:22] <stashbot>	 T205362: Migrate m4executor CI nodes to bigram instances - https://phabricator.wikimedia.org/T205362
[22:36:55] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1024 is DOWN: CRITICAL - Host Unreachable (10.68.17.114)
[22:36:59] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1012 is DOWN: CRITICAL - Host Unreachable (10.68.16.21)
[22:37:09] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1013 is DOWN: CRITICAL - Host Unreachable (10.68.19.155)
[22:38:33] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1023 is DOWN: CRITICAL - Host Unreachable (10.68.16.200)
[22:38:47] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1015 is DOWN: CRITICAL - Host Unreachable (10.68.19.76)
[22:38:55] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1014 is DOWN: CRITICAL - Host Unreachable (10.68.19.123)
[22:39:03] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1010 is DOWN: CRITICAL - Host Unreachable (10.68.18.61)
[22:39:11] <marxarelli>	 !log launching new integration-slave-docker-1042 bigram instance
[22:39:15] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:39:45] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1009 is DOWN: CRITICAL - Host Unreachable (10.68.21.208)
[22:39:49] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1022 is DOWN: CRITICAL - Host Unreachable (10.68.19.33)
[22:39:55] <shinken-wm>	 PROBLEM - Host integration-slave-docker-1011 is DOWN: CRITICAL - Host Unreachable (10.68.23.221)
[23:01:20] <marxarelli>	 !log replaced integration-slave-docker-1042 with new integration-slave-docker-1043 instance
[23:01:24] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[23:01:32] <marxarelli>	 !log configured new jenkins node integration-slave-docker-1043 with 6 executors
[23:01:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[23:05:03] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Release Pipeline: Evaluate different strategy for Docker CI instances - https://phabricator.wikimedia.org/T202160 (dduvall)
[23:05:20] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Release Pipeline: Evaluate different strategy for Docker CI instances - https://phabricator.wikimedia.org/T202160 (dduvall) Open>Resolved
[23:11:20] <wikibugs>	 Release-Engineering-Team (Kanban), Quibble, Patch-For-Review: Error: 1071 Specified key was too long; max key length is 767 bytes - https://phabricator.wikimedia.org/T193222 (Reedy)
[23:11:33] <wikibugs>	 Release-Engineering-Team (Kanban), Education-Program-Dashboard, MediaWiki-extensions-EducationProgram, Epic, and 2 others: Deprecate and remove the EducationProgram extension from Wikimedia servers after June 30, 2018 - https://phabricator.wikimedia.org/T125618 (Reedy)
[23:50:29] <wikibugs>	 Release-Engineering-Team (Kanban), Release, Train Deployments, User-zeljkofilipin: 1.32.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T191069 (Ryasmeen)