[03:05:27] 06serviceops, 10Citoid, 06Editing-team, 10RESTBase Sunsetting, and 2 others: Switchover plan from restbase to api gateway for Citoid - https://phabricator.wikimedia.org/T361576#10627058 (10Ryasmeen)
[03:38:34] 06serviceops, 10Image-Suggestions, 10Structured Data Engineering, 06Structured-Data-Backlog: Migrate data-engineering jobs to mw-cron - https://phabricator.wikimedia.org/T388537#10627082 (10Ottomata)
[07:38:18] 06serviceops, 10Shellbox, 10SyntaxHighlight, 13Patch-For-Review, 07Wikimedia-production-error: Shellbox bubbles GuzzleHttp\Exception\ConnectException when it should probably wrap it in a ShellboxError? - https://phabricator.wikimedia.org/T374117#10627201 (10hashar)
[07:59:18] hnowlan: o/
[07:59:47] There are some patches lined up to move changeprop to node20 and librdkafka 2.3 (we have node18 and librdkafka 2.2 now)
[07:59:54] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1126215 (+nexts)
[08:00:35] I am going to double check but this time we didn't see bumps in memory/cpu usage in staging, so in theory I don't expect any fireword
[08:00:38] *firework
[08:00:50] but there is the switchover lined up so this may be postponed
[08:01:21] lemme know your preference - I can deploy changeprop and changeprop-jobqueue in eqiad today in case, and complete the rollout by end of week
[08:01:36] or we can postpone to after the switchover week, which is probably safer
[08:18:10] 06serviceops, 10CX-cxserver, 10LPL Essential (LPL Essential 2025 Feb-Mar), 13Patch-For-Review, 07Technical-Debt: Use openapi compliant examples in swagger spec - https://phabricator.wikimedia.org/T382294#10627292 (10Nikerabbit) Please create a new task for the remaining work so that this can be resolved.
[08:52:09] 06serviceops, 10MediaWiki-extensions-CentralAuth, 10MW-on-K8s, 10MediaWiki-Platform-Team (Radar), 13Patch-For-Review: Missing backfill_localaccounts periodic jobs - https://phabricator.wikimedia.org/T388564#10627387 (10ArielGlenn) >>! In T388564#10624852, @Clement_Goubert wrote: > @ArielGlenn I've create...
[09:02:16] elukey: This is already apparently being handled by MW teams per https://phabricator.wikimedia.org/T381588
[09:02:38] akosiaris: yep I am helping them :D
[09:02:46] which is the linked task in the change, I expected to see you subscribed on the task too
[09:02:52] 🤦
[09:03:00] ma bad, disregard
[09:03:31] nono it was a good hint, when they reached out saying "we'd like to upgrade changeprop" I almost cried
[09:04:01] didn't expect it so really glad about it :)
[09:04:19] lol, why did they reach out to you specifically though?
[09:04:45] the usual curse, git log
[09:04:55] I upgraded the last time :D
[09:05:00] lol
[09:05:13] it was way more brutal, node10 to node18 + librdkafka etc..
[09:05:28] but if we keep upgrading in small steps I hope it will get better
[09:06:21] yes, that's the hope
[09:06:42] we'll see how that pans out. There is no nodejs upgrade slated for APP next year, unlike this year.
[09:07:12] but then again, node20 is going to be ok until 30 Apr 2026
[09:13:12] 06serviceops, 06Infrastructure-Foundations, 10Maps (Kartotherian), 13Patch-For-Review: Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10627585 (10elukey) @Jgiannelos I have three things to propose: 1) Try to use jemalloc (see above patch) via LD_P...
[09:14:01] 06serviceops, 06Content-Transform-Team, 07Epic, 10Maps (Kartotherian): Move Kartotherian to Kubernetes - https://phabricator.wikimedia.org/T216826#10627588 (10elukey) Status: Kartotherian runs on k8s now! We are still investigating a slow memory leak in T386926, so we are not totally done.
[09:55:53] elukey: I'd say go ahead and see how we do
[09:55:57] thanks for checking in though
[10:45:02] ack! I sadly found out that the deploy to staging brought a bit more cpu/memory usage
[10:45:05] https://grafana.wikimedia.org/d/000300/change-propagation?orgId=1&var-dc=eqiad%20prometheus%2Fk8s-staging&from=1741622567919&to=1741697593357
[10:45:09] (see saturation graphs)
[10:45:28] it is similar to what happened the last time, I think that bumping librdkafka causes this for some reason
[10:45:45] (I don't think it is a viz issue due to avg/max being used)
[10:48:20] 06serviceops, 10MediaWiki-extensions-CentralAuth, 10MW-on-K8s, 10MediaWiki-Platform-Team (Radar), 13Patch-For-Review: Missing backfill_localaccounts periodic jobs - https://phabricator.wikimedia.org/T388564#10627838 (10Clement_Goubert) 05In progress→03Resolved Jobs are now deployed on the mainten...
[10:49:15] not *enormous* jumps though in the grand scheme of things
[10:49:47] interesting that it comes with increased network traffic also, similar jump. has a poll rate increased?
[10:51:57] not that I know, but maybe librdkafka 2.2 -> 2.3 causes this? (plus I imagine noderdkafka changes)
[12:35:11] 06serviceops, 13Patch-For-Review: MediaWiki on PHP 8.1 production traffic ramp-up - https://phabricator.wikimedia.org/T383845#10628235 (10TheDJ) ping @jijiki as scap deployer
[12:38:22] 06serviceops, 13Patch-For-Review: MediaWiki on PHP 8.1 production traffic ramp-up - https://phabricator.wikimedia.org/T383845#10628252 (10Clement_Goubert) >>! In T383845#10628231, @TheDJ wrote: > ping @jijiki as scap deployer for the possible change that kicked this error rate of T388659 up This is an unrelat...
[13:03:18] 06serviceops, 10Page Content Service, 10RESTBase Sunsetting, 07Code-Health-Objective, 07Epic: Move PCS endpoints behind API Gateway - https://phabricator.wikimedia.org/T264670#10628342 (10MSantos)
[13:04:50] 06serviceops, 10Page Content Service, 10RESTBase Sunsetting, 07Code-Health-Objective, 07Epic: Move PCS endpoints behind API Gateway - https://phabricator.wikimedia.org/T264670#10628350 (10MSantos)
[13:07:47] 06serviceops, 10Page Content Service, 10RESTBase Sunsetting, 07Code-Health-Objective, 07Epic: Move PCS endpoints behind API Gateway - https://phabricator.wikimedia.org/T264670#10628355 (10MSantos) p:05Low→03High
[13:08:41] 06serviceops, 10Page Content Service, 10RESTBase Sunsetting, 07Code-Health-Objective, and 2 others: Move PCS endpoints behind API Gateway - https://phabricator.wikimedia.org/T264670#10628357 (10MSantos)
[14:24:49] 06serviceops, 10Deployments, 10Shellbox, 10Wikibase-Quality-Constraints, and 4 others: Burst of GuzzleHttp Exception for http://localhost:6025/call/constraint-regex-checker - https://phabricator.wikimedia.org/T371633#10628743 (10Lucas_Werkmeister_WMDE)
[14:26:55] 06serviceops, 10Deployments, 10Shellbox, 10Wikibase-Quality-Constraints, and 4 others: Burst of GuzzleHttp Exception for http://localhost:6025/call/constraint-regex-checker - https://phabricator.wikimedia.org/T371633#10628754 (10karapayneWMDE) To do: Update the gerrit change to catch the ClientExceptionInt...
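On the poll-rate question above (10:49–10:51): changeprop consumes Kafka through node-rdkafka, and the fetch cadence toward the brokers is largely governed by librdkafka consumer properties rather than application code, so a 2.2 → 2.3 bump can shift network behaviour even with an unchanged chart. Below is a minimal sketch of where those knobs live, assuming node-rdkafka's standard KafkaConsumer API; the broker, topic, group, and values are illustrative, not changeprop's real configuration.

```typescript
// Sketch only: shows which librdkafka consumer properties influence how often
// the client fetches from the brokers. Names/values here are examples, not
// the settings changeprop actually ships in deployment-charts.
import * as Kafka from 'node-rdkafka';

const consumer = new Kafka.KafkaConsumer(
  {
    'metadata.broker.list': 'kafka-example1001:9092', // hypothetical broker
    'group.id': 'changeprop-example',
    // How long a fetch request may wait server-side before returning even
    // without fetch.min.bytes of data; lower values mean more round trips.
    'fetch.wait.max.ms': 100,
    'fetch.min.bytes': 1,
    // Emit internal librdkafka statistics so a cadence change is visible.
    'statistics.interval.ms': 60000,
  },
  { 'auto.offset.reset': 'latest' }
);

consumer
  .on('ready', () => {
    consumer.subscribe(['example.topic']);
    consumer.consume(); // flowing mode: the library drives the fetch loop
  })
  .on('data', (msg) => {
    console.log(`offset ${msg.offset} from ${msg.topic}`);
  })
  .on('event.stats', () => {
    // Inspect librdkafka stats here to compare fetch behaviour across versions.
  });

consumer.connect();
```

Comparing the emitted statistics (or the Grafana saturation panels linked at 10:45) before and after the library bump is one way to tell whether the extra network traffic comes from a changed fetch cadence or simply from higher per-message overhead.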
[14:27:09] 06serviceops, 10Deployments, 10Shellbox, 10Wikibase-Quality-Constraints, and 5 others: Burst of GuzzleHttp Exception for http://localhost:6025/call/constraint-regex-checker - https://phabricator.wikimedia.org/T371633#10628756 (10karapayneWMDE)
[14:34:47] 06serviceops, 06Infrastructure-Foundations, 10Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10628841 (10elukey) Deployed the jemalloc change to staging, and verified that jemalloc's so is loaded: ` elukey@kubestage1005:~$ sudo...
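The verification command at 14:34:47 is truncated in this log. As a rough companion to the jemalloc-via-preload idea proposed at 09:13, here is a minimal sketch, not what was run on kubestage1005, of how the same check could be done from inside the Node process itself, assuming a Linux /proc filesystem and a jemalloc library whose path contains "jemalloc".

```typescript
// Sketch only: confirm from inside a Node service (e.g. Kartotherian) that a
// preloaded allocator such as jemalloc is actually mapped into the process.
// Assumes Linux; library path and env-var handling are illustrative.
import { readFileSync } from 'node:fs';

function mappedJemallocLibs(): string[] {
  // /proc/self/maps lists every memory-mapped file for the current process;
  // the last whitespace-separated field of each line is the pathname.
  const maps = readFileSync('/proc/self/maps', 'utf8');
  const libs = new Set<string>();
  for (const line of maps.split('\n')) {
    const path = line.trim().split(/\s+/).pop() ?? '';
    if (path.includes('jemalloc')) {
      libs.add(path);
    }
  }
  return [...libs];
}

const found = mappedJemallocLibs();
console.log('LD_PRELOAD =', process.env.LD_PRELOAD ?? '(unset)');
console.log(
  found.length > 0
    ? `jemalloc mapped: ${found.join(', ')}`
    : 'jemalloc not mapped into this process'
);
```

A check like this can run at service startup and log the result, which makes it easy to see at a glance whether the allocator swap is actually in effect when comparing memory graphs before and after the change.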