[08:07:16] o/ looking for some help to determine at what point a MW api.php request might fail with a timeout at 15sec, context is the search API that is supposedly hitting the search-https listener (which is configured with a 50s timeout)
[08:07:35] seems like this 15s timeout is kind of new according to users (T410007)
[08:08:13] since november 11th
[08:51:37] could this be due to rest-gateway.discovery.wmnet, enabled on group2 around that time? I see it has a timeout around 10s?
[09:19:23] serviceops, Data-Engineering, Machine-Learning-Team: Enable ChangeProp to consume mediawiki.page_content_change.v1 - https://phabricator.wikimedia.org/T409469#11377845 (JMonton-WMF) That sounds good. Then, we could consider increasing the partitions in Jumbo too, `codfw.mediawiki.page_content_change....
[09:38:40] not 100% but I believe this is caused by T408223, added a comment there
[09:38:43] *sure
[11:14:55] serviceops, Data-Engineering, Machine-Learning-Team: Enable ChangeProp to consume mediawiki.page_content_change.v1 - https://phabricator.wikimedia.org/T409469#11378447 (jijiki) Thank you for the discussion everyone! Reading through, I would suggest proceeding with Option D for the time being. This ap...
[11:18:01] serviceops, sre-alert-triage: Alert in need of triage: KubernetesWorkerUnschedulable - https://phabricator.wikimedia.org/T400969#11378450 (LSobanski) And one more time :)
[11:37:07] dcausse: hugh has prepped a patch, let us know if it worked
[11:38:56] serviceops, MediaWiki-Platform-Team, OKR-Work: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11378496 (hnowlan) Could you supply some of these IP addresses for investigation? My gut feeling is that these are going to be health che...
[11:39:14] sorry, missed your messages dcausse :) that should be fixed by now
[12:22:29] serviceops, MediaWiki-Platform-Team, OKR-Work: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11378605 (daniel) >>! In T410198#11378496, @hnowlan wrote: > Could you supply some of these IP addresses for investigation? My gut feelin...
[12:37:04] serviceops, MediaWiki-Platform-Team, OKR-Work: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11378640 (hnowlan) >>! In T410198#11378605, @daniel wrote: >>>! In T410198#11378496, @hnowlan wrote: >> Could you supply some of these IP...
[13:27:49] hnowlan, effie: it's working, thanks for the quick fix!
[13:54:02] serviceops, MediaWiki-Platform-Team, OKR-Work: api rate limiting: Assign ratelimit class based on IP range - https://phabricator.wikimedia.org/T410273 (daniel) NEW
[15:39:18] serviceops, SRE, Datacenter-Switchover: Add scap lock/unlock steps to sre.switchdc.mediawiki cookbook - https://phabricator.wikimedia.org/T330996#11379346 (LSobanski)
[15:42:17] serviceops, MediaWiki-Platform-Team, OKR-Work: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11379358 (akosiaris) > The 172.16.* hosts are from WMCS. We need to investigate what kind of traffic we're seeing from these hosts furthe...
[15:53:34] serviceops, DC-Ops, ops-eqiad, SRE: eqiad row C/D Service Ops host migrations - https://phabricator.wikimedia.org/T405950#11379425 (RobH) @Clement_Goubert, Is it possible that I could send the commands for this or do we need someone in your team? If we need someone in your team, could we schedu...
[17:34:08] serviceops, Content-Transform-Team, Wikifeeds: Significant increase in wikifeeds latency since 2025/11/13 - https://phabricator.wikimedia.org/T410296 (hnowlan) NEW
[17:36:54] serviceops, Content-Transform-Team, Wikifeeds: Significant increase in wikifeeds latency since 2025/11/13 - https://phabricator.wikimedia.org/T410296#11380206 (hnowlan)
[17:47:01] serviceops, Content-Transform-Team, Wikifeeds: Significant increase in wikifeeds latency since 2025/11/13 - https://phabricator.wikimedia.org/T410296#11380280 (hnowlan) Wikifeeds logs quite heavily in general, but it's hard to determine signal. Looks like there has been [[ https://logstash.wikimedia....
[17:48:23] serviceops, MediaWiki-Core-JobQueue, MW-Interfaces-Team, WMF-JobQueue, Patch-Needs-Improvement: Find a way to set elevated timeouts for job running - https://phabricator.wikimedia.org/T247114#11380289 (BPirkle) Open→Invalid
[18:03:59] serviceops, Content-Transform-Team, Wikifeeds: Significant increase in wikifeeds latency since 2025/11/13 - https://phabricator.wikimedia.org/T410296#11380430 (ssastry) [[ https://grafana.wikimedia.org/d/lxZAdAdMk/wikifeeds?orgId=1&from=now-7d&to=now&timezone=utc&var-dc=000000026&var-site=codfw&var-p...
[18:09:40] serviceops, Content-Transform-Team, Wikifeeds: Significant increase in wikifeeds latency since 2025/11/13 - https://phabricator.wikimedia.org/T410296#11380452 (ssastry) [[ https://grafana.wikimedia.org/d/8169987e-2ef2-4bf2-ba85-eefad1edbefa/rest-gateway-per-service-breakdown?orgId=1&from=now-7d&to=no...
[18:24:35] serviceops, MediaWiki-Core-JobQueue, MW-Interfaces-Team, WMF-JobQueue, Patch-Needs-Improvement: Find a way to set elevated timeouts for job running - https://phabricator.wikimedia.org/T247114#11380508 (aaron) Seems to have been dealt with in e784ab5897c9479aab525dbe2573b76ed46c83f2 and 10...
[18:46:24] serviceops, Patch-For-Review: MediaWiki on PHP 8.3 production workload migration - https://phabricator.wikimedia.org/T405955#11380563 (Scott_French)
[21:17:02] serviceops, Patch-For-Review: Migrate the etcd main cluster to cfssl-based PKI - https://phabricator.wikimedia.org/T352245#11381223 (Scott_French) I've gone ahead and moved codfw PyBals to conf2006 today, so that preparation step is out of the way. I've also posted additional patches related to disabling...