[00:34:50] serviceops, Patch-For-Review: Turn up PHP 8.1 Shellbox deployments - https://phabricator.wikimedia.org/T375243#10252524 (Scott_French) Returning to this, there are two issues I'd like to resolve before I'd consider this ready-to-go: 1. There are a number of unapplied diffs against various prod shellbox i...
[04:06:49] serviceops, MW-on-K8s: Functional replacement for importImages.php on Kubernetes - https://phabricator.wikimedia.org/T377497#10252641 (Joe) >>! In T377497#10240939, @Urbanecm_WMF wrote: > For example, #video2commons (as one of the major sources of #server-side-upload-request) first attempts to upload the...
[07:44:20] serviceops, SRE: host rdb1014 is down - https://phabricator.wikimedia.org/T376961#10252886 (MoritzMuehlenhoff) Resolved→Open >>! In T376961#10225247, @akosiaris wrote: > I 'll resolve, although something tells me we 'll soon see this again. You jinxed it :-) rdb1014 is again down since three d...
[08:10:38] serviceops, MW-on-K8s: Functional replacement for importImages.php on Kubernetes - https://phabricator.wikimedia.org/T377497#10252956 (Urbanecm_WMF) >>! In T377497#10251281, @Joe wrote: >>>! In T377497#10248513, @Pppery wrote: >> That documentation isn't quite accurate. The goal of server-side uploads as...
[09:09:25] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517#10253164 (hnowlan) Running the client directly against a k8s worker IP also succeeds, which means that kube-proxy most likely isn't to...
[09:19:07] anyone around for a docker_registry change review? https://gerrit.wikimedia.org/r/c/operations/puppet/+/1082332 + https://gerrit.wikimedia.org/r/c/operations/puppet/+/1082334
[09:19:07] the gitlab-runner ips in codfw changed and have to be updated
[09:30:01] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: CPU 2 machine check error detected for rdb1014.eqiad.wmnet - https://phabricator.wikimedia.org/T376961#10253253 (Clement_Goubert) p:Triage→Medium a:akosiaris→Jclark-ctr
[09:32:22] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: CPU 2 machine check error detected for rdb1014.eqiad.wmnet - https://phabricator.wikimedia.org/T376961#10253273 (akosiaris) >>! In T376961#10252886, @MoritzMuehlenhoff wrote: >>>! In T376961#10225247, @akosiaris wrote: >> I 'll resolve, alth...
[11:53:53] serviceops: Evaluate out redis_misc cluster - https://phabricator.wikimedia.org/T325243#10253656 (jijiki)
[11:54:18] serviceops, ChangeProp, collaboration-services, Infrastructure-Foundations, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253657 (jijiki)
[12:07:59] serviceops, Patch-For-Review: Turn up PHP 8.1-flavored k8s deployments for all MediaWiki services - https://phabricator.wikimedia.org/T377040#10253724 (jijiki) p:Triage→Medium
[12:08:20] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10253725 (jijiki) p:Triage→Medium
[12:10:09] serviceops, MW-on-K8s, wikitech.wikimedia.org: Cleanup: Wikitech code leftovers - https://phabricator.wikimedia.org/T371378#10253730 (jijiki) p:Triage→Low
[12:10:22] serviceops, Patch-For-Review: Turn up PHP 8.1 Shellbox deployments - https://phabricator.wikimedia.org/T375243#10253731 (jijiki) p:Triage→Medium
[12:10:30] serviceops, Thumbor: Majority of thumbor containers on pods occasionally getting into a stuck state - https://phabricator.wikimedia.org/T374350#10253732 (jijiki) p:Triage→Medium
[12:11:34] serviceops, ChangeProp, cloud-services-team, collaboration-services, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10253695 (jijiki)
[13:18:52] serviceops, LPL Essential, MinT, Community Wishlist (Translations), Community-Tech (Jackal (not a fox) Fox): Caching service request for MinT - https://phabricator.wikimedia.org/T370755#10253978 (Pginer-WMF) Thanks for your input, @akosiaris. As always, super useful. I created a ticket to c...
[13:18:56] hnowlan: In case you can take a look, i think this is missing: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1082450
[13:25:55] done
[13:30:37] thanks, deploying this now
[13:31:16] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10254039 (JMeybohm)
[13:32:10] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10254043 (JMeybohm)
[13:35:02] serviceops, ChangeProp, cloud-services-team, collaboration-services, and 11 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10254048 (bking) Forgive the drive-by comment, but at the 6-month anniversary of this ticket, it m...
[13:38:38] serviceops, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254054 (Joe) I think @BTullis' idea is great - there's a few unknowns regarding how to keep what runs v...
[13:41:11] hnowlan: i think its ok I started receiving some traffic that shows up in the caching metrics
[13:41:16] thanks for the help
[13:41:19] serviceops, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254068 (akosiaris) >>! In T352650#10252263, @BTullis wrote: > I'm keen to hear your feedback, even if y...
[13:41:49] serviceops, Content-Transform-Team-WIP, Page Content Service, RESTBase Sunsetting, and 2 others: hewiki: Use backing node service instead of RESTBase on pregeneration changeprop rules - https://phabricator.wikimedia.org/T372749#10254070 (Jgiannelos) Open→Resolved
[13:42:01] nemo-yiannis: great!
[14:21:32] serviceops, MW-on-K8s: Add helper script functionality to our php images - https://phabricator.wikimedia.org/T377958 (Clement_Goubert) NEW
[14:21:45] serviceops, MW-on-K8s: Add helper script functionality to our php images - https://phabricator.wikimedia.org/T377958#10254307 (Clement_Goubert) p:Triage→High
[14:22:06] serviceops, MW-on-K8s, Patch-For-Review: Allow running periodic jobs for mw on k8s - https://phabricator.wikimedia.org/T341555#10254308 (Clement_Goubert) a:Clement_Goubert
[14:39:11] serviceops, MW-on-K8s: Add mwcron functionality to mediawiki chart - https://phabricator.wikimedia.org/T377961 (Clement_Goubert) NEW
[14:39:21] serviceops, MW-on-K8s, Datacenter-Switchover: Control mw-on-k8s periodic maintenance jobs with an etcd value - https://phabricator.wikimedia.org/T367118#10254376 (Clement_Goubert) Copying from T341555 with added reflections While we could rely on setting the `mwcron.enabled` parameter to `false` in...
[14:41:09] serviceops, MW-on-K8s: Turn up mwcron releases - https://phabricator.wikimedia.org/T377962 (Clement_Goubert) NEW
[14:41:30] serviceops, MW-on-K8s: Turn up mwcron releases - https://phabricator.wikimedia.org/T377962#10254394 (Clement_Goubert) p:Triage→High
[14:42:36] serviceops, MW-on-K8s, Patch-For-Review: Allow running periodic jobs for mw on k8s - https://phabricator.wikimedia.org/T341555#10254399 (Clement_Goubert) Open→In progress
[14:43:50] serviceops, MW-on-K8s, Patch-For-Review: Add helper script functionality to our php images - https://phabricator.wikimedia.org/T377958#10254404 (Clement_Goubert) Open→In progress
[14:44:00] serviceops, MW-on-K8s, Patch-For-Review: Turn up mwcron releases - https://phabricator.wikimedia.org/T377962#10254409 (Clement_Goubert) Open→In progress
[14:44:13] serviceops, MW-on-K8s, Datacenter-Switchover: Control mw-on-k8s periodic maintenance jobs with an etcd value - https://phabricator.wikimedia.org/T367118#10254413 (Clement_Goubert) p:Triage→High
[14:44:23] serviceops, MW-on-K8s, Datacenter-Switchover: Control mw-on-k8s periodic maintenance jobs with an etcd value - https://phabricator.wikimedia.org/T367118#10254401 (Clement_Goubert) Open→In progress a:Clement_Goubert
[14:46:39] serviceops, MW-on-K8s, Patch-For-Review: Add mwcron functionality to mediawiki chart - https://phabricator.wikimedia.org/T377961#10254406 (Clement_Goubert) Open→In progress p:Triage→High
[14:48:34] serviceops, MW-on-K8s: Identify low-criticity maintenance job to move to mwcron - https://phabricator.wikimedia.org/T377963 (Clement_Goubert) NEW
[14:50:50] serviceops, MW-on-K8s: Contact team responsible for a job on failure in mwcron - https://phabricator.wikimedia.org/T377964 (Clement_Goubert) NEW
[14:54:05] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: CPU 2 machine check error detected for rdb1014.eqiad.wmnet - https://phabricator.wikimedia.org/T376961#10254455 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=0d71122b-3e94-47c7-a121-4dda9db372d8) set by cgoubert@cumin1002...
[15:11:05] serviceops, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10254536 (Ottomata) +1, this is a great idea.
[15:37:07] serviceops, MediaWiki-extensions-PropertySuggester, MW-on-K8s, Wikidata, wmde-wikidata-tech: [PS] Update PropertySuggester update process for mwscript-k8s - https://phabricator.wikimedia.org/T376604#10254725 (Lucas_Werkmeister_WMDE) >>! In T376604#10206276, @Lucas_Werkmeister_WMDE wrote: > I...
[16:02:42] serviceops, MediaWiki-Platform-Team (Radar): Regenerate UcfirstOverrides.php for PHP 7.4 -> 8.1 transition - https://phabricator.wikimedia.org/T372603#10254909 (Krinkle)
[16:21:42] serviceops, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team: Prepare sticky cookie for gradual PHP 8.1 rollout - https://phabricator.wikimedia.org/T377987 (Krinkle) NEW
[16:28:35] serviceops, MW-on-K8s: Functional replacement for importImages.php on Kubernetes - https://phabricator.wikimedia.org/T377497#10255164 (Pppery) > I'm not 100% sure what the purpose of allowlisting is here. I think the point is to allow people to upload only from cites known to have free licenses to preve...
[16:35:24] serviceops, Data-Persistence, Patch-For-Review: Sessionstore's discovery TLS cert will expire before end of May 2024 - https://phabricator.wikimedia.org/T363996#10255176 (hnowlan) Open→Resolved sessionstore codfw and eqiad are running with an envoy tls terminator, and latencies etc look a...
[20:29:06] serviceops, Data-Platform-SRE, SRE: DegradedArray email alerts for aqs1013 and aqs1014 are firing since April 18 - https://phabricator.wikimedia.org/T373490#10256231 (Ottomata)
[23:14:29] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: CPU 2 machine check error detected for rdb1014.eqiad.wmnet - https://phabricator.wikimedia.org/T376961#10256855 (Jclark-ctr) Confirmed: Service Request 199807744 was successfully submitted.
[23:19:34] serviceops, MediaWiki-extensions-WikimediaEvents, MediaWiki-Platform-Team: Prepare sticky cookie for gradual PHP 8.1 rollout - https://phabricator.wikimedia.org/T377987#10256858 (Krinkle) The previous MW configuration and JS code for this has been kept and maintained since 2022 and so is already in p...