[00:21:07] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by pt1979@cumin2002 for host kubernetes2040.codfw.wmnet with OS bullseye completed: - kubernetes2040 (**PASS*...
[00:21:17] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (Papaul)
[07:49:50] serviceops, Observability-Tracing, Patch-For-Review, User-fgiunchedi: jaeger is configured to receive traces from production - https://phabricator.wikimedia.org/T344253 (JMeybohm)
[08:30:47] serviceops, Data-Platform-SRE, Discovery-Search (Current work): Requesting permission to use kafka-main cluster to transport CirrusSearch updates - https://phabricator.wikimedia.org/T341625 (Gehel) **DECISION** (as discussed in synchronous meeting): * Reading bulk data is done from the consumer (at t...
[09:21:51] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120 (Joe)
[09:22:44] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Migrate termbox to mw-api-int - https://phabricator.wikimedia.org/T334064 (Joe) Stalled→Resolved Termbox has been migrated
[09:26:23] serviceops, Data Products, RESTbase Sunsetting, Code-Health-Objective, Patch-For-Review: Route to new AQS Knowledge Gaps endpoint - https://phabricator.wikimedia.org/T342213 (Sfaci) Regarding AQS 2.0, I guess this new service could be managed in the same way we are managing the existing ones....
[09:53:18] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Migrate termbox to mw-api-int - https://phabricator.wikimedia.org/T334064 (Lucas_Werkmeister_WMDE) \o/
[10:16:11] there's RedisMemoryFull alerts firing since 2d, known/expected ?
[10:52:59] 6378? that's ORES
[10:54:09] although...it's 6 hosts? not 4 (2 per DC?)
[10:54:11] weird
[10:59:51] * akosiaris looking first into rdb2007, rdb2008, those should see that much memory usage
[11:07:19] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken since k8s migration - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE)
[11:08:47] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Migrate termbox to mw-api-int - https://phabricator.wikimedia.org/T334064 (Lucas_Werkmeister_WMDE) Hm, doesn’t seem to be fully working :/ {T344904}
[11:10:26] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken since k8s migration - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Probably worth mentioning that it’s also broken on Test Wikidata ([example item](https://test.m.wikidata.org/wiki/Q469)), w...
[11:22:54] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken since k8s migration - https://phabricator.wikimedia.org/T344904 (Joe) Yes, localhost:6008 is pointing to `termbox.discovery.wmnet:4004` in production. The problem doesn't seem to be in termbox, as we could both...
[11:33:05] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Sorry, I’m an idiot and couldn’t read the page properly. The termbox SSR is actually working fi...
[11:34:10] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE)
[11:40:26] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Looks like Test Wikidata (which is mw-on-k8s) can’t talk to the Termbox SSR (@Joe says in IRC i...
[11:48:29] serviceops, Observability-Metrics: Decide on default histogram buckets for MediaWiki timers - https://phabricator.wikimedia.org/T344751 (Aklapper)
[12:18:49] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Make termbox-test a proper production release - https://phabricator.wikimedia.org/T344914 (Clement_Goubert)
[12:39:21] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Clement_Goubert) >>! In T344904#9116570, @Lucas_Werkmeister_WMDE wrote: > Looks like Test Wikidata (which is mw-on-k8s)...
[12:41:18] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Clement_Goubert) Open→Resolved a:Clement_Goubert Resolving, feel free to reopen if there are still any issues.
[13:08:29] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, and 2 others: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Resolved→Open Still not working, I’m afraid – https://test.m.wikidata.org/wiki/Q469 stil...
[13:14:20] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) >>! In T344904#9116872, @Lucas_Werkmeister_WMDE wrote: > there are [new logstash messages](http...
[13:19:13] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Apparently the network policy is pretty old and I don’t see 10.192.0.195 in it, is that correct...
[13:21:13] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) (It does contain some of the other IPs seen in [_mediawiki-common_/global.yaml](https://gerrit....
[13:26:25] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Hm, in `kube_env mw-debug eqiad`, the sole network policy is also 169d old, but does contain `1...
[13:53:46] serviceops, MW-on-K8s, Wikidata, Wikidata-Termbox, wdwb-tech: Termbox SSR broken on Test Wikidata (since k8s migration? unclear) - https://phabricator.wikimedia.org/T344904 (Lucas_Werkmeister_WMDE) Open→Resolved Apparently @Joe fixed mw-web (it wasn’t deployed earlier), now it’s worki...
[14:44:14] serviceops, RESTbase Sunsetting, Epic, Platform Engineering Roadmap: Replace usage of RESTbase parsoid endpoints - https://phabricator.wikimedia.org/T328559 (Nikerabbit)
[15:24:26] serviceops, Abstract Wikipedia team: Sandboxing Strategy for Wikifunctions - https://phabricator.wikimedia.org/T343829 (Jdforrester-WMF) Open→In progress p:Triage→High a:cmassaro
[15:31:38] akosiaris: thank you for taking a look (and I missed the reply of course)
[15:33:17] thanks for notifying us. I didn't do much btw. flush an unused redis and just delete a few keys from the ORES cache ones.
[15:33:34] I am a bit perplexed as to why it alerted tbh after all this time. There has been 0 change
[15:33:48] but also, I am not sure it's worth to investigate more, ORES is going away pretty soon
[15:33:57] yay lift wing
[15:36:33] thank you, yeah that sounds odd
[15:38:20] serviceops, Content-Transform-Team-WIP, Maintenance-Worktype, Wikimedia-Incident: Maps Unavailability due to thanos-swift cfssl rollout (14 Aug 2023) - https://phabricator.wikimedia.org/T344324 (jijiki) Testing procedure: * stop puppet on eqiad * merge the cfssl puppet change and roll it out on...
[15:40:20] serviceops, Content-Transform-Team-WIP, Maintenance-Worktype, Wikimedia-Incident: Maps Unavailability due to thanos-swift cfssl rollout (14 Aug 2023) - https://phabricator.wikimedia.org/T344324 (jijiki) >>! In T344324#9112390, @Joe wrote: >>>! In T344324#9112384, @JMeybohm wrote: >> Could we jus...
[16:33:55] serviceops, MW-on-K8s, Observability-Logging, SRE: Keep calculating latencies for MediaWiki requests in the WikiKube environment - https://phabricator.wikimedia.org/T276095 (kamila) Benthos is deployed and producing metrics, but I am not closing this yet, because the logs contain quite a lot of e...
[16:50:34] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (Jhancock.wm)
[17:38:03] serviceops, RESTbase Sunsetting, Parsoid (Tracking): Enable WarmParsoidParserCache on all wikis - https://phabricator.wikimedia.org/T329366 (daniel)
[21:23:17] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye
[21:38:39] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye executed with errors: - kubernetes...
[21:43:16] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye
[21:43:30] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye executed with errors: - kubernetes...
[21:59:20] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye
[22:15:51] serviceops, DC-Ops, SRE, ops-codfw: Q1:rack/setup/install kubernetes20[25-54] - https://phabricator.wikimedia.org/T342534 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host kubernetes2025.codfw.wmnet with OS bullseye executed with errors: - kubernetes...