[00:31:29] <wikibugs>	 (PS1) 20after4: WIP: fix up branch.py so that it's suitable for wmf/ production branches [tools/release] - https://gerrit.wikimedia.org/r/543248
[00:33:49] <wikibugs>	 (CR) jerkins-bot: [V: -1] WIP: fix up branch.py so that it's suitable for wmf/ production branches [tools/release] - https://gerrit.wikimedia.org/r/543248 (owner: 20after4)
[00:33:51] <wikibugs>	 (CR) 20after4: "see wmf-branch.sh for an example of how branch.py would be used to make a production wmf branch (at least in theory, currently untested!)" [tools/release] - https://gerrit.wikimedia.org/r/543248 (owner: 20after4)
[01:53:47] <wikibugs>	 Phabricator, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Operations, and 2 others: Prepare Phame to support heavy traffic for a Tech Department blog - https://phabricator.wikimedia.org/T226044 (Krinkle)
[08:42:37] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:47:30] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki-09 is OK: HTTP OK: HTTP/1.1 200 OK - 49284 bytes in 0.548 second response time
[08:57:33] <elukey>	 !log created deployment-memc08 in deployment-prep as memcached test host for Buster - T213089
[08:57:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[08:57:36] <stashbot>	 T213089: Upgrade memcached for Debian Stretch/Buster - https://phabricator.wikimedia.org/T213089
[08:57:45] <elukey>	 please let me know if --^ is a problem
[09:02:30] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki-parsoid10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:02:55] <shinken-wm>	 PROBLEM - Host integration-agent-docker-1008 is DOWN: CRITICAL - Host Unreachable (172.16.3.105)
[09:03:20] <shinken-wm>	 PROBLEM - Host deployment-sca01 is DOWN: CRITICAL - Host Unreachable (172.16.5.13)
[09:04:00] <shinken-wm>	 PROBLEM - Host deployment-db05 is DOWN: CRITICAL - Host Unreachable (172.16.5.170)
[09:05:25] <shinken-wm>	 PROBLEM - Host saucelabs-02 is DOWN: CRITICAL - Host Unreachable (172.16.3.20)
[09:05:47] <shinken-wm>	 PROBLEM - Host integration-agent-docker-1005 is DOWN: CRITICAL - Host Unreachable (172.16.7.210)
[09:08:45] <shinken-wm>	 PROBLEM - Host deployment-memc05 is DOWN: CRITICAL - Host Unreachable (172.16.5.17)
[09:09:12] <shinken-wm>	 PROBLEM - Host integration-agent-jessie-docker-1001 is DOWN: CRITICAL - Host Unreachable (172.16.5.149)
[09:10:36] <shinken-wm>	 RECOVERY - Host deployment-sca01 is UP: PING OK - Packet loss = 0%, RTA = 0.81 ms
[09:10:49] <shinken-wm>	 RECOVERY - Host integration-agent-docker-1005 is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms
[09:11:31] <shinken-wm>	 RECOVERY - Host integration-agent-jessie-docker-1001 is UP: PING OK - Packet loss = 0%, RTA = 1.54 ms
[09:11:31] <shinken-wm>	 RECOVERY - Host integration-agent-docker-1008 is UP: PING OK - Packet loss = 0%, RTA = 1.32 ms
[09:12:22] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki-parsoid10 is OK: HTTP OK: HTTP/1.1 200 OK - 49268 bytes in 0.690 second response time
[09:12:36] <shinken-wm>	 RECOVERY - Host deployment-memc05 is UP: PING OK - Packet loss = 0%, RTA = 0.93 ms
[09:13:01] <shinken-wm>	 RECOVERY - Host deployment-db05 is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms
[09:15:32] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:15:46] <shinken-wm>	 RECOVERY - Puppet staleness on integration-agent-docker-1005 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:16:04] <shinken-wm>	 RECOVERY - Puppet staleness on saucelabs-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:17:24] <shinken-wm>	 RECOVERY - Puppet staleness on integration-agent-docker-1008 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:17:26] <shinken-wm>	 RECOVERY - Puppet staleness on integration-agent-jessie-docker-1001 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:17:46] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-db05 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:20:59] <shinken-wm>	 PROBLEM - Host integration-slave-jessie-1004 is DOWN: CRITICAL - Host Unreachable (172.16.2.228)
[09:22:23] <shinken-wm>	 PROBLEM - Host deployment-imagescaler03 is DOWN: CRITICAL - Host Unreachable (172.16.7.231)
[09:25:56] <shinken-wm>	 RECOVERY - Host integration-slave-jessie-1004 is UP: PING OK - Packet loss = 0%, RTA = 1.02 ms
[09:27:16] <shinken-wm>	 RECOVERY - Host deployment-imagescaler03 is UP: PING OK - Packet loss = 0%, RTA = 1.44 ms
[09:31:19] <shinken-wm>	 PROBLEM - Host deployment-webperf12 is DOWN: CRITICAL - Host Unreachable (172.16.4.24)
[09:31:46] <shinken-wm>	 PROBLEM - Host deployment-parsoid09 is DOWN: CRITICAL - Host Unreachable (172.16.5.63)
[09:32:02] <shinken-wm>	 PROBLEM - Host integration-slave-jessie-1001 is DOWN: CRITICAL - Host Unreachable (172.16.0.86)
[09:32:17] <shinken-wm>	 PROBLEM - Host deployment-kafka-main-1 is DOWN: CRITICAL - Host Unreachable (172.16.4.116)
[09:32:36] <shinken-wm>	 PROBLEM - Host deployment-deploy02 is DOWN: CRITICAL - Host Unreachable (172.16.4.19)
[09:32:43] <shinken-wm>	 PROBLEM - Host deployment-sca02 is DOWN: CRITICAL - Host Unreachable (172.16.5.112)
[09:33:02] <shinken-wm>	 PROBLEM - Host deployment-mcs01 is DOWN: CRITICAL - Host Unreachable (172.16.5.64)
[09:33:17] <shinken-wm>	 PROBLEM - Host deployment-sca04 is DOWN: CRITICAL - Host Unreachable (172.16.5.54)
[09:33:30] <shinken-wm>	 PROBLEM - Host deployment-kafka-jumbo-2 is DOWN: CRITICAL - Host Unreachable (172.16.5.47)
[09:33:37] <shinken-wm>	 PROBLEM - Host deployment-mediawiki-09 is DOWN: CRITICAL - Host Unreachable (172.16.4.106)
[09:34:06] <shinken-wm>	 PROBLEM - Host deployment-memc04 is DOWN: CRITICAL - Host Unreachable (172.16.5.76)
[09:34:35] <shinken-wm>	 PROBLEM - Host deployment-maps04 is DOWN: CRITICAL - Host Unreachable (172.16.4.10)
[09:34:43] <shinken-wm>	 PROBLEM - Host integration-agent-docker-1011 is DOWN: CRITICAL - Host Unreachable (172.16.3.126)
[09:35:19] <shinken-wm>	 PROBLEM - Host deployment-deploy01 is DOWN: CRITICAL - Host Unreachable (172.16.4.18)
[09:35:55] <shinken-wm>	 PROBLEM - Host integration-agent-docker-1012 is DOWN: CRITICAL - Host Unreachable (172.16.3.130)
[09:36:52] <shinken-wm>	 PROBLEM - Host deployment-fluorine02 is DOWN: CRITICAL - Host Unreachable (172.16.5.71)
[09:38:04] <shinken-wm>	 RECOVERY - Host deployment-mcs01 is UP: PING OK - Packet loss = 0%, RTA = 0.87 ms
[09:38:16] <shinken-wm>	 RECOVERY - Host deployment-kafka-main-1 is UP: PING OK - Packet loss = 0%, RTA = 0.88 ms
[09:38:16] <shinken-wm>	 RECOVERY - Host deployment-sca04 is UP: PING OK - Packet loss = 0%, RTA = 1.11 ms
[09:38:19] <shinken-wm>	 RECOVERY - Host deployment-parsoid09 is UP: PING OK - Packet loss = 0%, RTA = 1.24 ms
[09:38:26] <shinken-wm>	 RECOVERY - Host deployment-deploy02 is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms
[09:38:27] <shinken-wm>	 RECOVERY - Host deployment-sca02 is UP: PING OK - Packet loss = 0%, RTA = 1.01 ms
[09:38:30] <shinken-wm>	 RECOVERY - Host integration-agent-docker-1011 is UP: PING OK - Packet loss = 0%, RTA = 0.53 ms
[09:38:30] <shinken-wm>	 RECOVERY - Host deployment-mediawiki-09 is UP: PING OK - Packet loss = 0%, RTA = 0.83 ms
[09:38:32] <shinken-wm>	 RECOVERY - Host deployment-kafka-jumbo-2 is UP: PING OK - Packet loss = 0%, RTA = 0.70 ms
[09:38:32] <shinken-wm>	 RECOVERY - Host deployment-deploy01 is UP: PING OK - Packet loss = 0%, RTA = 0.83 ms
[09:38:35] <shinken-wm>	 RECOVERY - Host integration-slave-jessie-1001 is UP: PING OK - Packet loss = 0%, RTA = 0.70 ms
[09:38:35] <shinken-wm>	 RECOVERY - Host deployment-webperf12 is UP: PING OK - Packet loss = 0%, RTA = 1.21 ms
[09:39:06] <shinken-wm>	 RECOVERY - Host deployment-memc04 is UP: PING OK - Packet loss = 0%, RTA = 1.14 ms
[09:39:35] <shinken-wm>	 RECOVERY - Host deployment-maps04 is UP: PING OK - Packet loss = 0%, RTA = 0.53 ms
[09:39:36] <shinken-wm>	 RECOVERY - Host integration-agent-docker-1012 is UP: PING OK - Packet loss = 0%, RTA = 1.27 ms
[09:41:53] <shinken-wm>	 RECOVERY - Host deployment-fluorine02 is UP: PING OK - Packet loss = 0%, RTA = 1.16 ms
[09:42:32] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-maps04 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:43:09] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-kafka-main-1 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:43:10] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:44:19] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:45:29] <shinken-wm>	 RECOVERY - Puppet staleness on integration-agent-docker-1012 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:55:23] <wikibugs>	 (PS1) Daimona Eaytoy: layout: [mediawiki/tools/phan/SecurityCheckPlugin] Move to PHP72+ [integration/config] - https://gerrit.wikimedia.org/r/543394
[10:33:20] <wikibugs>	 (PS1) Pwirth: Activate tests for new repo BlueSpiceDistributionConnector [integration/config] - https://gerrit.wikimedia.org/r/543399
[10:58:26] <wikibugs>	 Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201907): Investigate gerrit session expiration - https://phabricator.wikimedia.org/T222472 (LarsWirzenius) Data point: I tend to need to log in about once a day, based on memory. Have not kept a log, though. I use...
[11:00:58] <wikibugs>	 Deployments, MediaWiki-SWAT-deployments: Figure out what to do with `fatalmonitor` script - https://phabricator.wikimedia.org/T234345 (Lucas_Werkmeister_WMDE) Now the script itself has been removed from `mwlog1001`, it seems.
[11:03:31] <wikibugs>	 (CR) jerkins-bot: [V: -1] Activate tests for new repo BlueSpiceDistributionConnector [integration/config] - https://gerrit.wikimedia.org/r/543399 (owner: Pwirth)
[11:05:42] <wikibugs>	 (PS2) Pwirth: Activate tests for new repo BlueSpiceDistributionConnector [integration/config] - https://gerrit.wikimedia.org/r/543399
[11:24:46] <wikibugs>	 (PS8) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[11:25:30] <wikibugs>	 (PS9) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[11:25:32] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218) (owner: Kosta Harlan)
[11:26:26] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218) (owner: Kosta Harlan)
[11:28:55] <wikibugs>	 (PS10) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[11:29:48] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218) (owner: Kosta Harlan)
[11:32:19] <wikibugs>	 (PS11) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[11:49:12] <hashar>	 kostajh: but. I thought that support for Apache had been merged and deployed already! ? :D
[11:49:17] <hashar>	 I am outdated :-\
[12:30:13] <kostajh>	 hashar: no, I shelved it but Krinkle suggested I revive it, so here it is again
[12:31:53] <hashar>	 +1
[12:32:19] <hashar>	 kostajh: iirc the benchmark you have done definitely proved that apache was wayy faster than the php builtin server
[12:32:28] <hashar>	 for good reason, php -S is serially processing requests hehe
[12:32:46] <kostajh>	 hashar: it was faster on my local machine but not in a CI-like environment (I provisioned a DigitalOcean droplet for some tests)
[12:32:57] <hashar>	 ohhh
[12:33:00] <hashar>	 strange :-\
[12:33:27] <kostajh>	 See also https://travis-ci.org/kostajh/quibble/builds/549546283?utm_source=github_status
[12:34:46] <hashar>	 kostajh: that is surprising
[12:35:14] <hashar>	 or maybe it is less prominent now that wdio runs tests in parallel
[12:35:43] <hashar>	 or maybe it does not run them in parallel
[12:36:15] <hashar>	 anyway, given you wrote the patch, I guess there is not much work to complete it and have the feature added
[12:36:36] <wikibugs>	 (PS12) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[12:36:55] <hashar>	 kostajh: will you be there on friday? Could use some of your time to talk about the tech conf sessions 
[12:37:23] <wikibugs>	 (CR) jerkins-bot: [V: -1] Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218) (owner: Kosta Harlan)
[12:37:27] <hashar>	 but I have to dig a bit more into them tomorrow and get more familiar with the proposed topic + figure out questions I could have for you :)
[12:37:29] <kostajh>	 hashar: yes, I'll be around, and hopefully with a proper internet connection too
[12:37:34] <hashar>	 good
[12:37:55] <kostajh>	 hashar: have you seen anywhere what exactly is involved in being a "lead/co-lead" for techconf session?
[12:38:10] <hashar>	 well
[12:38:11] <hashar>	 no
[12:38:22] <hashar>	 but in short I guess it is mostly facilitating during the session
[12:38:35] <hashar>	 and prepare the actual session. So reach out to people interested and see what they want to talk about
[12:38:45] <hashar>	 maybe even talk about stuff before the session happens
[12:38:55] <hashar>	 gather materials for people to read
[12:39:24] <hashar>	 and come up with an organization for the session :  unconference   versus   presentation+ questions/answers  versus whatever
[12:39:43] <hashar>	 I am not too worried, we master the topic so we should be at ease
[12:40:22] <hashar>	 I guess the lead /  co-lead role is thus to ensure there is a good dynamic, that the session is on track and does not derail in an off topic discussion or rant
[12:40:36] <hashar>	 and that in the end there is some positive outcome, eventually even a plan of action for the future
[12:41:39] <wikibugs>	 (PS13) Kosta Harlan: Add option for using Apache as server [integration/quibble] - https://gerrit.wikimedia.org/r/516729 (https://phabricator.wikimedia.org/T225218)
[12:42:02] <kostajh>	 k
[12:43:13] <hashar>	 I am not too worried about the session in itself
[12:43:20] <hashar>	 but would need to prepare some stuff before :]
[12:44:07] <kostajh>	 sure
[12:46:12] <hashar>	 anyway, gotta write slides and flip some more paper, so I am disconnecting
[12:46:26] <hashar>	 plan for tomorrow: review tech conf stuff and write thoughts about it
[12:46:33] <hashar>	 and hopefully review some of the pending Quibble patches!
[13:09:17] <elukey>	 one question about memcached in deployment prep
[13:09:38] <elukey>	 today I created deployment-memc08 with buster
[13:09:53] <elukey>	 so I was looking for the hiera config to add it to mcrouter
[13:10:34] <elukey>	 on deployment-mediawiki-07 I can see only two deployment-memc listed, 05 and 04
[13:10:45] <elukey>	 and they are listed in operations/puppet
[13:11:10] <elukey>	 but we also have deployment-memc06 and 07
[13:11:16] <elukey>	 do we use them elsewhere?
[13:11:20] <hauskater>	 Don't we use Horizon for beta things?
[13:11:43] <elukey>	 we have some config in operations/puppet for deployment-prep
[13:33:37] <wikibugs>	 Beta-Cluster-Infrastructure: Global developer for DannyS712 on beta cluster - https://phabricator.wikimedia.org/T235650 (Daimona)
[13:37:42] <wikibugs>	 Continuous-Integration-Config, Wikidata, Wikidata Query UI: Update wikidata-query-gui-build job versions (from Jessie, Node v6, npm v3) - https://phabricator.wikimedia.org/T235651 (Lucas_Werkmeister_WMDE)
[13:39:41] <wikibugs>	 Beta-Cluster-Infrastructure: Global developer for DannyS712 on beta cluster - https://phabricator.wikimedia.org/T235650 (MarcoAurelio) Global permissions are granted through [[ https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:GlobalUserRights | Special:GlobalUserRights ]]. But as noted on [[ https:...
[14:18:21] <wikibugs>	 Release-Engineering-Team, Scap, Operations, Wikimedia-General-or-Unknown, and 2 others: "Currently active MediaWiki versions:" broken on noc/conf - https://phabricator.wikimedia.org/T235338 (thcipriani) >>! In T235338#5569953, @Reedy wrote: > Current implementation: >  > `lang=html > <p>Currently...
[14:52:09] <wikibugs>	 Release-Engineering-Team (Deployment services), Security-Team, Wikimedia-Extension-setup, Wikimedia-Site-requests, Wikimedia-extension-review-queue: Deploy WebAuthn to Wikimedia Wikis - https://phabricator.wikimedia.org/T227242 (Reedy)
[15:26:29] <wikibugs>	 Release-Engineering-Team, Scap, Operations, Wikimedia-General-or-Unknown, and 2 others: "Currently active MediaWiki versions:" broken on noc/conf - https://phabricator.wikimedia.org/T235338 (Krinkle) I thought maybe it was user-permission or working-directory related. But, looks like not.. As www...
[15:26:56] <wikibugs>	 (CR) Thcipriani: "Some driveby comments inline" (2 comments) [tools/release] - https://gerrit.wikimedia.org/r/543248 (owner: 20after4)
[15:40:01] <wikibugs>	 Phabricator (Upstream), Upstream: Phabricator fonts look broken on systems with JoyPixels (formerly EmojiOne) installed - https://phabricator.wikimedia.org/T235339 (epriestley) (See <https://discourse.phabricator-community.org/t/website-specifies-emoji-font-in-body-tag/2139/> for the upstream position on...
[16:27:36] <Lucas_WMDE>	 beta doesn’t seem to update anymore, is that a known problem?
[16:27:46] <Lucas_WMDE>	 apparently https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/ doesn’t find a suitable executor
[16:28:01] <Lucas_WMDE>	 since yesterday evening or so
[16:40:19] <wikibugs>	 Beta-Cluster-Infrastructure: Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674 (Lucas_Werkmeister_WMDE)
[16:40:22] <Lucas_WMDE>	 reported as ^
[16:45:54] <hashar>	 Lucas_WMDE: ahh I noticed that and fixed it by killing the queued build
[16:46:37] <wikibugs>	 Beta-Cluster-Infrastructure: Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674 (hashar) Open→Resolved a:hashar Fixed it on the spot. I have canceled the queued builds in Jenkins, which eventually unblocks whatever deadlock occurred.
[16:46:50] <hashar>	 Lucas_WMDE: I should probably just convert that job to poll the scm instead
[16:46:56] <hashar>	 ie just pull from time to time
[16:47:04] <hashar>	 instead of on every single merged change
[16:48:52] <Lucas_WMDE>	 ok
[17:08:21] <hauskater>	 hashar: beta-scap-eqiad et al. ain't running since yesterday apparently
[17:08:45] <hauskater>	 https://integration.wikimedia.org/ci/view/Beta/
[17:08:55] <hauskater>	 can I scap manually while that's fixed?
[17:09:57] <James_F>	 Stalled on executor.
[17:10:01] <James_F>	 Don't scap manually.
[17:10:07] <James_F>	 I'll fix when I'm out of this meeting.
[17:10:38] <hauskater>	 Alright!
[17:10:38] <wikibugs>	 Release-Engineering-Team, Operations, Wikimedia Design Style Guide: Automatic pickup of Gerrit clone master doesn't happen - https://phabricator.wikimedia.org/T235677 (Volker_E)
[17:10:40] <Lucas_WMDE>	 hashar said he already fixed it…
[17:11:01] <James_F>	 Still looks stalled. :-(
[17:11:03] <hauskater>	 Lucas_WMDE: I should probably scroll up and down more often
[17:11:14] <Lucas_WMDE>	 :P
[17:11:15] <hauskater>	 but indeed it is not executing
[17:11:25] <wikibugs>	 Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201910), International-Developer-Events, Wikimedia-Technical-Conference-2019, and 2 others: Wikimedia Technical Conference 2019 Session: System level testing: patterns an... - https://phabricator.wikimedia.org/T234635
[17:11:47] <wikibugs>	 Gerrit, Release-Engineering-Team, Operations, Wikimedia Design Style Guide: Automatic pickup of Gerrit clone master doesn't happen - https://phabricator.wikimedia.org/T235677 (Dzahn)
[17:17:18] <wikibugs>	 Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Release Pipeline: contint1001 has lot of dangling Docker images - https://phabricator.wikimedia.org/T235680 (hashar)
[17:18:44] <wikibugs>	 (CR) 20after4: WIP: fix up branch.py so that it's suitable for wmf/ production branches (1 comment) [tools/release] - https://gerrit.wikimedia.org/r/543248 (owner: 20after4)
[17:19:27] <wikibugs>	 Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Release Pipeline: contint1001 has lot of dangling Docker images - https://phabricator.wikimedia.org/T235680 (hashar) p:Triage→Normal
[17:25:21] <hauskater>	 failure message reads "xxx do not have the BetaClusterBastion tag"
[17:26:40] <James_F>	 Yeah. Still in this meeting.
[17:26:44] <Lucas_WMDE>	 perhaps T235674 should be reopened until the issue is actually fixed?
[17:26:45] <stashbot>	 T235674: Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674
[17:27:21] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO (201910): Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674 (Jdforrester-WMF) Resolved→Open p:Triage→High Not fixed.
[17:27:27] <James_F>	 Done.
[17:27:31] <Lucas_WMDE>	 thanks
[17:42:32] <James_F>	 OK, back.
[17:43:05] <James_F>	 https://integration.wikimedia.org/ci/label/BetaClusterBastion/ has nodes and projects, so it's not a mis-config.
[17:43:23] <James_F>	 However https://integration.wikimedia.org/ci/computer/deployment-deploy01/ doesn't have anything assigned to it?
[17:44:30] <James_F>	 !log Marking deployment-deploy01 offline temporarily for T235674
[17:44:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:44:33] <stashbot>	 T235674: Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674
[17:45:11] <wikibugs>	 Project-Admins: Create Project: Watchlist-Expiry - https://phabricator.wikimedia.org/T235686 (ifried)
[17:46:55] <James_F>	 OK, it processed https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/16075/
[17:47:53] <shinken-wm>	 PROBLEM - Parsoid on deployment-parsoid09 is CRITICAL: connect to address 172.16.5.63 and port 8000: Connection refused PROBLEM - Parsoid on deployment-mediawiki-parsoid10 is CRITICAL: connect to address 172.16.0.141 and port 8000: Connection refused
[17:48:02] <James_F>	 Killed a few more hung jobs and it seems to be processing https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/16075/ now 
[17:48:06] * James_F crosses fingers.
[17:49:41] <wmf-insecte>	 Project beta-scap-eqiad build #271468: FAILURE in 1 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271468/
[17:49:59] <wikibugs>	 Release-Engineering-Team, Wikimedia Design Style Guide, Patch-For-Review, User-Ladsgroup: Use `git lfs` for large binary files of Design Style Guide - https://phabricator.wikimedia.org/T235013 (Dzahn)
[17:57:35] <wmf-insecte>	 Project beta-scap-eqiad build #271469: STILL FAILING in 2 min 7 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271469/
[17:57:50] <James_F>	 Oh dear.
[17:59:01] <James_F>	 Could be because there's so much churn, possibly.
[17:59:25] <James_F>	 Hmm, no, it's getting refused when it's trying to scap out files.
[17:59:45] <James_F>	 `17:49:35 sudo -u mwdeploy -n -- /usr/bin/scap cdb-rebuild on deployment-mediawiki-09.deployment-prep.eqiad.wmflabs returned [255]: Permission denied (publickey).`
[18:01:43] <hauskater>	 geez
[18:01:47] <hauskater>	 publickey?
[18:05:05] <hauskater>	 https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/268080/console was quite an update
[18:05:16] <James_F>	 Ha, yes.
[18:05:31] <James_F>	 But maybe the keyholder/whatever stuff isn't set right in Beta Cluster?
[18:06:25] <wmf-insecte>	 Project beta-scap-eqiad build #271470: STILL FAILING in 1 min 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271470/
[18:06:41] <hauskater>	 puppet thing?
[18:09:49] <hauskater>	 James_F: puppet said it was 'stopped' for some reason
[18:10:04] <hauskater>	 Notice: /Stage[main]/Confd/Base::Service_unit[confd]/Service[confd]/ensure: ensure changed 'stopped' to 'running'
[18:10:27] <James_F>	 Hmm. That'd not help.
[18:10:49] * James_F wonders if Krenair is around.
[18:10:55] <hauskater>	 that was from puppet agent -tv
[18:11:01] <hauskater>	 let's see if that helps
[18:11:27] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO (201910): Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674 (Jdforrester-WMF) Jobs populating correctly, but failing with:  `17:49:35 sudo -u mwdeploy -n -- /usr/bin/scap cdb-rebuild on deploym...
[18:15:17] <hauskater>	 Nope, still failing
[18:15:35] <wmf-insecte>	 Project beta-scap-eqiad build #271471: STILL FAILING in 1 min 17 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271471/
[18:16:26] <mutante>	 hauskater: you know what's odd.. if i follow that FAILING link above and look at Console Output.. it says SUCCESS at the end
[18:16:45] <mutante>	 https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/268082/console
[18:16:55] <mutante>	 oh, not the same job ID i guess.. ehm
[18:17:11] <mutante>	 beta-scap-eqiad vs beta-code-update-eqiad
[18:17:13] <hauskater>	 https://github.com/wikimedia/puppet/blob/8897bcbaae5d96f0c0bf2db43c93e5d717b1cd83/modules/mediawiki/manifests/users.pp#L29 <-- ?
[18:17:42] <mutante>	 hauskater: do you see a puppet error?
[18:17:42] <hauskater>	 mutante: it says it's a public key failure but I'm not sure if you've updated mwdeploy user keys?
[18:17:57] <hauskater>	 mutante: Not errors, some debug messages
[18:18:00] <hauskater>	 let me show you
[18:18:53] <wikibugs>	 (CR) Jforrester: [C: +2] layout: [mediawiki/tools/phan/SecurityCheckPlugin] Move to PHP72+ [integration/config] - https://gerrit.wikimedia.org/r/543394 (owner: Daimona Eaytoy)
[18:18:58] <hauskater>	 mutante: https://phabricator.wikimedia.org/P9365
[18:19:04] <hauskater>	 but I don't think they're related
[18:19:13] <wikibugs>	 (CR) Jforrester: [C: +2] Activate tests for new repo BlueSpiceDistributionConnector [integration/config] - https://gerrit.wikimedia.org/r/543399 (owner: Pwirth)
[18:20:32] <mutante>	 hauskater: yea, that looks like a successful puppet run. Does it start confd on every run? repeat it
[18:20:36] <wikibugs>	 (Merged) jenkins-bot: layout: [mediawiki/tools/phan/SecurityCheckPlugin] Move to PHP72+ [integration/config] - https://gerrit.wikimedia.org/r/543394 (owner: Daimona Eaytoy)
[18:20:52] <hauskater>	 mutante: I'll switch back to deploy01
[18:21:04] <wikibugs>	 (Merged) jenkins-bot: Activate tests for new repo BlueSpiceDistributionConnector [integration/config] - https://gerrit.wikimedia.org/r/543399 (owner: Pwirth)
[18:21:27] <Krenair>	 hello
[18:21:33] <Krenair>	 James_F, what's up?
[18:21:52] <James_F>	 Krenair: Puppet issues in Beta Cluster, but hauskater seems to be dealing?
[18:22:13] <hauskater>	 James_F: well, not really. Krenair is the expert
[18:22:21] <hauskater>	 mutante: same messages as previously
[18:22:27] <Krenair>	 permission denied errors everywhere? better check keyholder
[18:22:44] <mutante>	 hauskater: you can try "keyholder status"  https://wikitech.wikimedia.org/wiki/Keyholder
[18:22:56] <hauskater>	 Paste updated
[18:22:59] <mutante>	 it show a list of fingerprints
[18:23:01] <mutante>	 should
[18:23:18] <hauskater>	 -bash: keyholder: command not found
[18:23:26] <Krenair>	 jenkins-bot@deployment-deploy01:~$ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh-add -L
[18:23:26] <Krenair>	 The agent has no identities.
[18:23:35] <Krenair>	 did nobody arm it?
[18:23:49] * Krenair does
[18:24:16] <hauskater>	 arm and restart?
[18:24:41] <mutante>	 for reference:  sudo keyholder arm (and then it asks for a password for the key(s)) .. that's how it would be in prod
[18:24:44] <Krenair>	 krenair@deployment-deploy01:~$ sudo keyholder arm
[18:24:44] <Krenair>	 ...
[18:24:51] <Krenair>	  /etc/keyholder.d/mwdeploy is not an acceptable key. Is it an RSA or ED25519 key with passphrase?
[18:24:55] <hauskater>	 I've not seen Icinga complain about it as stated in Wikitech
[18:25:14] <Krenair>	 It looks like an RSA private key... hmm
[18:25:20] <hauskater>	 sudo did it mutante 
[18:25:35] <mutante>	 hauskater: i think if that is monitored that would be Shinken
[18:25:36] <wmf-insecte>	 Project beta-scap-eqiad build #271472: STILL FAILING in 1 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271472/
[18:26:10] <Krenair>	 krenair@deployment-deploy01:~$ sudo file /etc/keyholder.d/mwdeploy
[18:26:10] <Krenair>	  /etc/keyholder.d/mwdeploy: PEM RSA private key
[18:26:17] <mutante>	 Krenair: maybe a new key that does not have a passphrase?
[18:26:27] <Krenair>	 hmmm
[18:26:35] <Krenair>	 yes
[18:26:37] <Krenair>	 looks like it
[18:26:41] <Krenair>	 but
[18:26:42] <hauskater>	 When I did sudo keyholder status only keys for analytics_deploy and dumpsdeploy appear listed
[18:26:45] <Krenair>	 why?
[18:26:56] <James_F>	 !log Zuul: Activate tests for new repo BlueSpiceDistributionConnector
[18:26:57] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:27:05] <James_F>	 !log Zuul: [mediawiki/tools/phan/SecurityCheckPlugin] Move to PHP72+
[18:27:07] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:27:17] <Krenair>	  /etc/keyholder.d/mwdeploy on deployment-deploy01 does not match files/ssh/tin/mwdeploy_rsa from the puppetmaster
[18:27:58] <Krenair>	 puppet looks normal
[18:28:21] <Krenair>	 looks like it's picked up modules/secret/secrets/keyholder/mwdeploy instead
[18:28:21] <mutante>	 maybe somebody wanted to remove the "tin" remnants and replace with deploy1001
[18:29:53] <hauskater>	 https://phabricator.wikimedia.org/T235491#5577775 <-- related ?
[18:29:54] <mutante>	 but git log mwdeploy only shows one entry 
[18:30:00] <mutante>	 when it was added
[18:30:15] <mutante>	 talking about secret/secrets/keyholder in labs/private ..right
[18:31:02] <Krenair>	 hauskater, unlikely
[18:31:39] <Krenair>	 I tried 'cp files/ssh/tin/mwdeploy_rsa modules/secret/secrets/keyholder/mwdeploy' and while I could load that into keyholder, it was not accepted by the remote host
[18:32:03] <James_F>	 Is the problem on the keyholder server or the remotes?
[18:32:06] <James_F>	 (or both?)
[18:33:40] <Krenair>	 who knows?
[18:34:16] <Krenair>	 okay after checking things I am reasonably confident that modules/secret/secrets/keyholder/mwdeploy contains the key that remote hosts are expecting
[18:35:11] <mutante>	 sounds like the issue is it has no password and does not like that. 
[18:35:15] <Krenair>	 oh wait
[18:35:19] <Krenair>	 that's not a private file
[18:35:22] <mutante>	 but we dont see a change so far
[18:35:34] <wmf-insecte>	 Project beta-scap-eqiad build #271473: STILL FAILING in 1 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271473/
[18:36:03] <Krenair>	 are the remote hosts getting the wrong key and we're trying to load the wrong thing into keyholder?
[18:39:16] <wikibugs>	 Gerrit, Release-Engineering-Team, Operations, Wikimedia Design Style Guide: Automatic pickup of Gerrit clone master doesn't happen - https://phabricator.wikimedia.org/T235677 (Dzahn) The changes made in T235013 added a requirement to have git-lfs installed and use a different command to pull data...
[18:43:05] <mutante>	 Krenair: this seems interesting https://gerrit.wikimedia.org/r/c/operations/puppet/+/522008
[18:43:23] <wikibugs>	 Project-Admins: Create Project: Watchlist-Expiry - https://phabricator.wikimedia.org/T235686 (MBinder_WMF) Thanks for making a task @ifried . FWIW, this does sound just like #expiring-watchlist-items and this description (even mentioning the Community Tech Wishlist from 2015 in T124752 : https://phabricator....
[18:43:27] <mutante>	 see that commit message there 
[18:43:38] <mutante>	 maybe it needs the override in Hiera
[18:43:49] <Krenair>	 no
[18:43:56] <mutante>	 ... re-introduces a hiera setting that allows this to be changed on
[18:43:56] <Krenair>	 the key that should be in use has a password
[18:43:58] <mutante>	 a per-deploy basis. Allowing unencrypted keys.. ?
[18:44:05] <mutante>	 oh, ok
[18:44:30] <hauskater>	 do we need to add all the keys at /etc/keyholder.d in the keyholder?
[18:44:36] <hauskater>	 ie, eventlogging
[18:44:39] <mutante>	 i dont see any newer changes in Gerrit that mention keyholder
[18:45:18] <Krenair>	 no
[18:45:40] <wmf-insecte>	 Project beta-scap-eqiad build #271474: STILL FAILING in 1 min 12 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271474/
[18:48:25] <Krenair>	 ugh maybe I should just re-key this
[18:52:41] <wikibugs>	 Project-Admins: Create Project: Watchlist-Expiry - https://phabricator.wikimedia.org/T235686 (ifried) @MBinder_WMF I was thinking of creating a new component because the Expiring-Watchlist-Items component is attached to the work done by WMDE. In the past, the WMDE Community Tech team had taken on a [[ https:...
[18:54:21] <Krenair>	 now the keyholder agent is refusing to sign things... great...
[18:54:55] <Krenair>	 key works though
[18:55:09] <hauskater>	 was keyholder.d/mwdeploy fwiw?
[18:55:15] <Krenair>	 ok
[18:55:17] <hauskater>	 for documentation
[18:55:39] <Krenair>	 I'm not at the stage of dealing with docs yet
[19:02:40] <wmf-insecte>	 Project beta-scap-eqiad build #271475: STILL FAILING in 7 min 21 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271475/
[19:06:49] <Krenair>	 James_F, so if I'm right I think it should be fixed everywhere but snapshot01?
[19:08:39] <wmf-insecte>	 Project beta-scap-eqiad build #271476: STILL FAILING in 4 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271476/
[19:14:35] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:16:27] <James_F>	 Krenair: The job complained about snapshot01, mwmaint01 and deploy02.
[19:16:40] <James_F>	 Was that just race condition with your fixing things, or are those broken too?
[19:16:41] <Krenair>	 so close
[19:17:11] <Krenair>	 oh is that just about the opcache update?
[19:17:16] <Krenair>	 it always does that
[19:17:25] <wmf-insecte>	 Project beta-scap-eqiad build #271477: STILL FAILING in 8 min 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271477/
[19:17:38] <Krenair>	 snapshot01 is the bit that matters
[19:17:50] <James_F>	 Right.
[19:17:50] <Krenair>	 just gotta do the same fix on deployment-dumps-puppetmaster... :)
[19:17:55] <James_F>	 Fun.
[19:17:56] <Krenair>	 or at least the public part
[19:20:30] <Krenair>	 ok
[19:25:02] <wmf-insecte>	 Yippee, build fixed!
[19:25:03] <wmf-insecte>	 Project beta-scap-eqiad build #271478: FIXED in 6 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271478/
[19:25:07] <wikibugs>	 Project-Admins: Create new project for WatchTranslations tool - https://phabricator.wikimedia.org/T235700 (Urbanecm)
[19:27:06] <Krenair>	 James_F, ^
[19:32:11] <James_F>	 Success.
[19:32:40] <wikibugs>	 Beta-Cluster-Infrastructure, Release-Engineering-Team-TODO (201910): Beta cluster doesn’t update since ca. 2019-10-15 21:00 UTC - https://phabricator.wikimedia.org/T235674 (Jdforrester-WMF) Open→Resolved a:hashar→Krenair Fixed by @Krenair re-doing the keyholder configuration.
[19:39:16] <hashar>	 Krenair: oh so that was more complicated than just CI being broken. Thank you Krenair!
[19:39:25] <Krenair>	 yeah
[19:39:27] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki-09 is OK: HTTP OK: HTTP/1.1 200 OK - 49270 bytes in 1.034 second response time
[19:39:35] <Krenair>	 I'm still not sure how this happened
[19:40:38] <James_F>	 I blame cosmic rays.
[19:41:34] <wikibugs>	 Deployments, Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201910), Performance-Team (Radar): Reduce static asset time on disk from five trains' worth to two - https://phabricator.wikimedia.org/T140921 (Jdforrester-WMF) Open→Resolved
[19:41:53] <Krenair>	 the lack of recent entries in those hosts' puppet logs suggests they may have been accepting this bad key for a while, but then the mystery becomes how were the deployment servers using it if they didn't permit unencrypted keys?
[19:43:04] <Krenair>	 I should get some food
[19:43:30] <wikibugs>	 (PS1) Urbanecm: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme [integration/config] - https://gerrit.wikimedia.org/r/543681
[19:43:40] <James_F>	 Krinkle: Is that you using blameStartupRegistry?
[19:43:52] <James_F>	 See AW3WEXPaghP2xm4vmnOD etc.
[19:44:43] <Krinkle>	 https://logstash.wikimedia.org/goto/98450918bb82d3c3b05217dbe951e9cc
[19:44:46] <Krinkle>	 does not resolve
[19:44:57] <Krinkle>	 I haven't SSH'ed today yet
[19:45:02] <James_F>	 Oh.
[19:45:08] <Krinkle>	 but I did test with that some days ago
[19:45:33] <James_F>	 A smattering of `Error from line 160 of …/WikimediaMaintenance/blameStartupRegistry.php: Call to private method ResourceLoaderStartUpModule::getConfigSettings() from context 'BlameStartupRegistry'`
[19:45:36] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:49] <James_F>	 Did you set it on a cron?
[19:46:05] <Krinkle>	 It has been a cron for a while yes
[19:46:16] <James_F>	 Ah, and we made a breaking change in wmf.2 I guess?
[19:46:23] <James_F>	 Hence the sudden burst just after the train.
[19:47:28] <wikibugs>	 Project-Admins: Create Project: Watchlist-Expiry - https://phabricator.wikimedia.org/T235686 (MBinder_WMF) Thanks for laying it all out, helps me understand.  I think a separate project is probably OK. I mostly just worry about confusion for those people that aren't part of either project but want to engage...
[19:47:52] <paladox>	 Gerrit User Summit live for the first time: https://twitter.com/gerritforge/status/1184405834899607553
[19:48:19] <Krinkle>	 James_F: hm..  not exactly, I made it public in master and backported to wmf.1 some days ago
[19:48:26] <Krinkle>	 was the branch not cut on Tuesday?
[19:48:37] <Krinkle>	 wmf.2 if anything should be fixing it not causing it
[19:48:40] <James_F>	 It was.
[19:49:18] <Krinkle>	 https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/542711/
[19:49:25] <Krinkle>	 OK I guess I was dreaming when I thought the core change landed
[19:49:31] <Krinkle>	 :|
[19:49:46] <James_F>	 Yeah, tip of RLSUM is 5155abe0e6ab6589d4104a221df0a0b2c5142c16 on wmf.2
[19:49:55] <James_F>	 Not merged. :-)
[19:50:09] <James_F>	 I'm merging it now.
[19:50:16] <Krinkle>	 thx
[19:50:25] <Krinkle>	 can also revert the WikimediaMaint change if that doesn't fix it
[19:50:40] <James_F>	 Nah, let's just fix it.
[19:50:56] <Krinkle>	 thx for flagging it.
[19:51:00] <Krinkle>	 Gotta prep for next meeting now
[19:51:05] <James_F>	 That's what logspam is for. :-0
[19:51:21] <Krinkle>	 Feel free to deploy whenever. Am also find with rolling it out in 2-3h on my own otherwise
[19:51:25] <Krinkle>	 fine*
[19:52:02] <hauskater>	 hi, how's that publickey issue going?
[19:52:13] <wikibugs>	 (CR) Jforrester: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[19:56:49] <mutante>	 thcipriani: Deployment Freeze affects everything deployed via scap from deployment server or everything
[19:59:15] <thcipriani>	 in the past we've done everything that would be scheduled on the deployment page aside from extreme emergency SWAT deployments
[19:59:39] <thcipriani>	 so that may be a wider net than everything deployed via scap
[20:01:43] <James_F>	 Honestly, though, if SRE want to spend their Christmases taking down the e-mail MXes, I guess they can and RelEng don't get involved? :-)
[20:02:37] <mutante>	 thcipriani: ack
[20:03:25] <mutante>	 there are more subtle things like puppet is set to clone content from something and changes are in the deploy repo. but the Deployment Calendar rule applies 
[20:30:26] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki-09 is OK: HTTP OK: HTTP/1.1 200 OK - 49274 bytes in 0.434 second response time
[21:16:45] <wikibugs>	 (PS2) Urbanecm: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme [integration/config] - https://gerrit.wikimedia.org/r/543681
[21:19:18] <wikibugs>	 (CR) Urbanecm: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:06:06] <wmf-insecte>	 Project beta-scap-eqiad build #271495: FAILURE in 1 min 46 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271495/
[22:10:45] <James_F>	 !log https://integration.wikimedia.org/ci/job/wmf-quibble-core-vendor-mysql-php72-docker/8502/console got stuck in quibble's composer install step for half an hour; manually aborted. :-(
[22:10:46] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:11:30] <James_F>	 CalledProcessError. Interesting.
[22:16:21] <wmf-insecte>	 Project beta-scap-eqiad build #271496: STILL FAILING in 1 min 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271496/
[22:26:16] <wmf-insecte>	 Project beta-scap-eqiad build #271497: STILL FAILING in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271497/
[22:26:49] <James_F>	 Hmm, Beta seems to have broken itself again. :-(
[22:35:26] <wikibugs>	 (PS3) Jforrester: jjb: Point OOUI experimental image at node10-test-browser-php72-composer [integration/config] - https://gerrit.wikimedia.org/r/543227 (https://phabricator.wikimedia.org/T235570)
[22:35:28] <wikibugs>	 (PS1) Jforrester: dockerfiles: [node10-test-browser-php72-composer] Make this actually provide both PHP and Node [integration/config] - https://gerrit.wikimedia.org/r/543723
[22:36:03] <wmf-insecte>	 Project beta-scap-eqiad build #271498: STILL FAILING in 1 min 46 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271498/
[22:36:40] <wikibugs>	 (CR) Jforrester: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:37:24] <Krenair>	 no idea what that is
[22:37:42] <Krenair>	 some weird permissions problem now?
[22:38:19] <thcipriani>	 hrm
[22:38:28] <thcipriani>	 did we lose contact with ldap at some point?
[22:38:53] <wikibugs>	 (PS3) Urbanecm: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme [integration/config] - https://gerrit.wikimedia.org/r/543681
[22:39:01] <wikibugs>	 (CR) Urbanecm: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:39:42] <Krenair>	 thcipriani, sounds like you may have seen this before? :)
[22:40:08] <wikibugs>	 (CR) Jforrester: [C: +2] Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:40:43] <thcipriani>	 heh, unexplained permissions can mean that we've created a local mwdeploy user that shadows the ldap mwdeploy user
[22:40:50] <thcipriani>	 looking now
[22:41:53] <thcipriani>	 which looks like what happened on deployment-deploy02
[22:41:57] <wikibugs>	 (Merged) jenkins-bot: Add jobs for wikimedia-cz/web-plugin and wikimediacz/web-theme [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:42:13] <Krenair>	 now that you mention it there was a weird thing in puppet
[22:42:27] <Krenair>	 changing ownership from mwdeploy to mwdeploy I think?
[22:42:38] <Krenair>	 that fits with this
[22:42:42] <thcipriani>	 that would, yeah
[22:42:59] <thcipriani>	 !log deployment-deploy02:sudo vipw to remove mwdeploy user
[22:43:01] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:43:08] <Krenair>	 vipw?
[22:43:26] <Krenair>	 hm, ok
[22:43:49] <thcipriani>	 yep, just removing from passwd file locally is how I've been doing it
[22:44:16] <wikibugs>	 (CR) Urbanecm: "thanks" [integration/config] - https://gerrit.wikimedia.org/r/543681 (owner: Urbanecm)
[22:45:05] <thcipriani>	 scap pull works on deployment-deploy02, so hopefully that means beta-scap-eqiad'll be happy
[22:46:06] <wmf-insecte>	 Yippee, build fixed!
[22:46:07] <wmf-insecte>	 Project beta-scap-eqiad build #271499: FIXED in 1 min 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/271499/
[22:47:29] <Urbanecm>	 (y)