[00:19:11] serviceops, Patch-For-Review: Migrate the etcd main cluster to cfssl-based PKI - https://phabricator.wikimedia.org/T352245#10949081 (Scott_French) Thanks, @MoritzMuehlenhoff - that's an interesting idea! So, there are two related concerns here: 1. The race itself, which can cause nginx to be notified //...
[09:34:19] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: helmfile/scap does not reliably bootstrap mediawiki - https://phabricator.wikimedia.org/T397685#10949609 (Clement_Goubert) This [[ https://logstash.wikimedia.org/goto/812cbc37145f858389cf24c4c65bd115 | logstash search ]] shows missing config...
[09:34:28] serviceops: Upgrade Excimer to 1.2.5 in production - https://phabricator.wikimedia.org/T397907 (tstarling) NEW
[09:52:46] serviceops, DBA, MW-on-K8s: Should deployment servers include mariadb::maintenance profile - https://phabricator.wikimedia.org/T397847#10949700 (Clement_Goubert) Open→In progress p:Triage→Medium
[10:47:54] serviceops, MediaWiki-Engineering, Release-Engineering-Team: Deprecate mwdebugXXXX hosts - https://phabricator.wikimedia.org/T397498#10949861 (jijiki)
[10:50:12] serviceops, DBA, MW-on-K8s: Should deployment servers include mariadb::maintenance profile - https://phabricator.wikimedia.org/T397847#10949870 (Clement_Goubert) In progress→Resolved
[10:50:51] serviceops, MW-on-K8s: Turn down mwmaint production servers - https://phabricator.wikimedia.org/T397017#10949877 (Clement_Goubert) >>! In T397017#10947620, @Urbanecm_WMF wrote: > hey, before we plug the switch, would it be possible to adjust the default `mysql` prompt on the deployment machine? I like mw...
[10:52:28] serviceops, MW-on-K8s: Turn down mwmaint production servers - https://phabricator.wikimedia.org/T397017#10949905 (Urbanecm_WMF) >>! In T397017#10949871, @Clement_Goubert wrote: >>>! In T397017#10947620, @Urbanecm_WMF wrote: >> hey, before we plug the switch, would it be possible to adjust the default `my...
[10:55:07] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Update wikikube codfw to kubernetes 1.31 - https://phabricator.wikimedia.org/T397148#10949909 (JMeybohm)
[11:15:08] serviceops, MediaWiki-Engineering, Release-Engineering-Team, Patch-For-Review: Deprecate mwdebugXXXX hosts - https://phabricator.wikimedia.org/T397498#10949964 (jijiki)
[11:26:06] serviceops, MediaWiki-Engineering, Release-Engineering-Team, Patch-For-Review: Deprecate mwdebugXXXX hosts - https://phabricator.wikimedia.org/T397498#10949982 (jijiki)
[11:28:49] serviceops: Provide functionality to apply specific patches/gerrit changes to mw-experimental - https://phabricator.wikimedia.org/T397916 (jijiki) NEW
[11:30:05] serviceops: Provide functionality to apply specific patches/gerrit changes to mw-experimental - https://phabricator.wikimedia.org/T397916#10949999 (jijiki) p:Triage→Medium
[11:30:51] serviceops, MW-on-K8s, Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Make mw-experimental production ready - https://phabricator.wikimedia.org/T396767#10950001 (jijiki) In progress→Resolved a:jijiki opened T397916 for the concerns raised, marking this task as done.
[11:31:05] serviceops, MW-on-K8s, Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥): Make mw-experimental production ready - https://phabricator.wikimedia.org/T396767#10950008 (jijiki)
[11:32:15] serviceops, MW-on-K8s, Release-Engineering-Team (Priority Backlog 📥): Provide an mwdebug functionality on kubernetes (mw-experimental) - https://phabricator.wikimedia.org/T276994#10950017 (jijiki) Open→Resolved a:jijiki
[11:35:10] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#10950029 (Jdforrester-WMF)
[12:06:48] serviceops, MediaWiki-Platform-Team, Epic: Migrate Wikimedia production from PHP 8.1 to PHP 8.3 - https://phabricator.wikimedia.org/T360995#10950096 (Jdforrester-WMF)
[12:48:42] serviceops: Provide functionality to apply specific patches/gerrit changes to mw-experimental - https://phabricator.wikimedia.org/T397916#10950212 (Tgr) Note that typically these are not gerrit changes. There are two security workflows: * [[https://wikitech.wikimedia.org/wiki/How_to_deploy_code#Security_patc...
[12:55:49] serviceops: Provide functionality to apply specific patches/gerrit changes to mw-experimental - https://phabricator.wikimedia.org/T397916#10950223 (Tgr) Security aside, if it would be possible to make the various `/srv/mediawiki` subdirectories proper git repositories on mw-experimental, that would go a long...
[14:20:34] serviceops: Provide functionality to apply specific patches/gerrit changes to mw-experimental - https://phabricator.wikimedia.org/T397916#10950709 (sbassett) > Partially this is covered by patchdemo already Amusingly enough, I've recently been trying to get patchdemo to run locally. This could serve as a p...
[14:30:02] serviceops, Wikifeeds: Some wikifeeds endpoints very sensitive to mobileapps latency - https://phabricator.wikimedia.org/T397937 (hnowlan) NEW
[14:42:58] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: helmfile/scap does not reliably bootstrap mediawiki - https://phabricator.wikimedia.org/T397685#10950885 (Scott_French) @Clement_Goubert - Ah, thanks for the additional details! Yes, if a missing `STATSD_EXPORTER_PROMETHEUS_SERVICE_HOST` res...
[15:03:44] serviceops, Page Content Service, Patch-For-Review: mobileapps is comparatively slower to handle changeprop events - https://phabricator.wikimedia.org/T397750#10950977 (hnowlan) p:Triage→High A few notes: Throttling is happening even when mobileapps is not hitting the CPU limits. This isn't...
[15:52:57] serviceops, MediaWiki-Engineering, Patch-For-Review: Clean up UcfirstOverrides.php following PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T394556#10951180 (Scott_French)
[16:51:36] serviceops, Deployments, Patch-For-Review, Release-Engineering-Team (Radar), Wikimedia-production-error: httpb sometimes fails upon deployment with a HTTP 503 - https://phabricator.wikimedia.org/T380958#10951485 (akosiaris) I have https://gerrit.wikimedia.org/r/c/operations/deployment-charts/...
[17:27:38] serviceops, ChangeProp: Changeprop dashboard has multiple missing metrics - https://phabricator.wikimedia.org/T397970 (hnowlan) NEW
[18:01:05] serviceops, Infrastructure-Foundations, Patch-For-Review: Upgrade httpd images to bullseye or bookworm - https://phabricator.wikimedia.org/T378128#10951742 (Scott_French) As of 17:20 UTC, all mediawiki releases have now migrated to the bookworm-based webserver image. As before, no notable changes ha...
[18:36:16] serviceops, MW-on-K8s, Data-Platform-SRE (2025.06.13 - 2025.07.04), Discovery-Search (2025.06.13 - 2025.07.04): Investigate EQIAD daily completion suggester rebuild failure - https://phabricator.wikimedia.org/T395465#10951884 (EBernhardson) a:EBernhardson While the above is a reduced form of...
[18:54:28] serviceops, decommission-hardware, Patch-For-Review: decommission mw135[8-9], mw136[4-6], mw137[2-3], mw140[0-4], mw1406, mw14[11-13] - https://phabricator.wikimedia.org/T383227#10951952 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by jasmine@cumin1002 for hosts: `wikikube-worker[10...
[19:46:28] serviceops, decommission-hardware, Patch-For-Review: decommission mw135[8-9], mw136[4-6], mw137[2-3], mw140[0-4], mw1406, mw14[11-13] - https://phabricator.wikimedia.org/T383227#10952062 (jasmine_)
[20:29:45] serviceops, Commons, MW-Interfaces-Team, WMF-JobQueue: Temporarily run more refreshLinks jobs on Commons - https://phabricator.wikimedia.org/T380544#10952106 (LucasWerkmeister) Okay, would there be a problem with running more refreshLinks jobs across all wikis? 😇 (I’m not sure how we could e...
[20:56:34] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: Backplane failure for wikikube-worker1243.eqiad.wmnet - https://phabricator.wikimedia.org/T397851#10952153 (Jclark-ctr) Confirmed: Service Request 212013802
[21:31:30] serviceops, Patch-For-Review: Migrate the etcd main cluster to cfssl-based PKI - https://phabricator.wikimedia.org/T352245#10952215 (Scott_French) Alright, I //think// https://gerrit.wikimedia.org/r/1164264 is the simplest option to achieve the specific behavior we want - i.e., reload rather than restart...