[00:04:28] FIRING: NodeTextfileStale: Stale textfile for cloudcontrol2005-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[00:14:55] FIRING: MaxConntrack: Max conntrack at 80.61% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:19:55] RESOLVED: MaxConntrack: Max conntrack at 80.1% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:24:28] RESOLVED: NodeTextfileStale: Stale textfile for cloudcontrol2005-dev:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[00:30:06] FIRING: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:32:55] FIRING: MaxConntrack: Max conntrack at 80.09% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:35:06] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_tool_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[00:42:55] RESOLVED: MaxConntrack: Max conntrack at 80.33% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:53:55] FIRING: MaxConntrack: Max conntrack at 80.01% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:58:55] RESOLVED: MaxConntrack: Max conntrack at 80.01% on cloudvirt1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[01:01:39] FIRING: ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:05:56] tool-wdlocator: Add UI language selector - https://phabricator.wikimedia.org/T386289 (Samwilson) NEW
[01:06:39] RESOLVED: [2x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:27:14] (open) samwilson: Add language selection dropdown [toolforge-repos/wdlocator] - https://gitlab.wikimedia.org/toolforge-repos/wdlocator/-/merge_requests/36 (https://phabricator.wikimedia.org/T386289)
[01:27:39] FIRING: [3x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:29:40] (merge) samwilson: Add language selection dropdown [toolforge-repos/wdlocator] - https://gitlab.wikimedia.org/toolforge-repos/wdlocator/-/merge_requests/36 (https://phabricator.wikimedia.org/T386289)
[01:32:39] RESOLVED: [4x] ProbeDown: Service tools-legacy-redirector-2:443 has failed probes (http_tools_wmflabs_org_main_page_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-legacy-redirector-2:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[01:37:09] tool-wdlocator: Add UI language selector - https://phabricator.wikimedia.org/T386289#10547283 (Samwilson) Open→Resolved a:Samwilson {F58393229,size=full}
[01:39:35] (open) samwilson: Build assets in CI [toolforge-repos/wdlocator] - https://gitlab.wikimedia.org/toolforge-repos/wdlocator/-/merge_requests/37
[01:58:55] (merge) samwilson: Build assets in CI [toolforge-repos/wdlocator] - https://gitlab.wikimedia.org/toolforge-repos/wdlocator/-/merge_requests/37
[02:05:41] tool-wdlocator: Add link to OSM feature - https://phabricator.wikimedia.org/T386296 (Samwilson) NEW
[04:51:45] Tools: ConnectTimeoutError when trying to `pip install` inside the deadlinkscanner Kubernetes namespace - https://phabricator.wikimedia.org/T386059#10547620 (taavi)
[08:33:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-75 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[08:38:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-75 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[10:47:36] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] Remove apt pinning and upgrade to latest version - https://phabricator.wikimedia.org/T385885#10548644 (fnegri) a:fnegri
[11:03:27] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10548701 (fnegri) Resolved→In progress Thanks @aborrero @Marostegui! I'm gonna reopen this for more investigation. > upgra...
[11:11:37] PAWS: hub-paws.wmcloud.org : SPARQL is absent at first connexion, and always timeout in further sessions - https://phabricator.wikimedia.org/T386339#10548736 (Wladek92)
[11:13:20] PAWS: hub-paws.wmcloud.org: SPARQL is absent at first connection, always timeout in further sessions - https://phabricator.wikimedia.org/T386339#10548741 (Aklapper)
[11:13:42] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge, Patch-For-Review: [toolsdb] Remove apt pinning and upgrade to latest version - https://phabricator.wikimedia.org/T385885#10548743 (fnegri) > I think we can remove that pinning completely, but we need to ensure unattended-upgrade does not upgrade Lo...
[11:18:04] PAWS: hub-paws.wmcloud.org: SPARQL is absent at first connection, always timeout in further sessions - https://phabricator.wikimedia.org/T386339#10548750 (rook) Ah thank you for finding that! Sparql was removed a few years ago as the plugin stopped updating, and stopped installing in jupyter. T320934 is the...
[11:18:17] PAWS: hub-paws.wmcloud.org: SPARQL is absent at first connection, always timeout in further sessions - https://phabricator.wikimedia.org/T386339#10548752 (rook) Open→Resolved a:rook
[11:21:34] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10548756 (fnegri) That transaction eventually completed around 21:00 UTC, replication lag went down quickly for about 1 hour, then replication got stu...
[11:24:31] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge, Patch-For-Review: [toolsdb] Remove apt pinning and upgrade to latest version - https://phabricator.wikimedia.org/T385885#10548757 (fnegri) Open→In progress
[11:28:42] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10548773 (fnegri) Full output of `SHOW SLAVE STATUS`: {P73459}
[11:32:16] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10548780 (fnegri) Raw binlog from the primary: {P73460}
[11:37:27] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services: [wikireplicas] Create views for new wiki kncwiki - https://phabricator.wikimedia.org/T385188#10548783 (fnegri) Open→In progress
[12:22:14] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services: [wikireplicas] Create views for new wiki kncwiki - https://phabricator.wikimedia.org/T385188#10548914 (fnegri) Views and DNS records were created, but 3 out of 4 DNS records are in status "PENDING" and are not actually working: ` fnegri@cloudcontro...
[12:22:32] (update) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/30
[12:22:48] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/6
[13:07:46] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services: [wikireplicas] Create views for new wiki kncwiki - https://phabricator.wikimedia.org/T385188#10549090 (Andrew) It seems like the zone serial isn't incrementing properly after a new recordset is added. I'm giving a poke with ` update zones set seri...
[13:14:51] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack
[13:15:19] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0)
[13:20:47] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services: [wikireplicas] Create views for new wiki kncwiki - https://phabricator.wikimedia.org/T385188#10549104 (Andrew) I restarted designate services and this has resolved. I don't have much of a theory beyond that.
[13:41:35] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services: [wikireplicas] Create views for new wiki kncwiki - https://phabricator.wikimedia.org/T385188#10549131 (fnegri) In progress→Resolved Thanks @Andrew, this task is now complete.
[13:44:35] cloud-services-team, Cloud-VPS: Neutron policy does not allow the admin role to modify security groups - https://phabricator.wikimedia.org/T348582#10549137 (Andrew) I think I need more specifics about this. From the CLI I was just now able to add and remove security group rules to an existing security gr...
[14:10:09] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), and 2 others: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for ou... - https://phabricator.wikimedia.org/T374830#10549209
[14:16:42] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, ci-test-error (WMF-deployed Build Failure), and 2 others: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for ou... - https://phabricator.wikimedia.org/T374830#10549225
[14:18:06] cloud-services-team, Openstack-Magnum: CSI Cinder issues causing periodic failures on Magnum cluster - https://phabricator.wikimedia.org/T383560#10549228 (Proc) Open→Resolved a:Proc For context, I recreated the cluster recently after my previous message. > Your current cluster appears to...
[14:18:13] cloud-services-team, Openstack-Magnum: CSI Cinder issues causing periodic failures on Magnum cluster - https://phabricator.wikimedia.org/T383560#10549231 (Proc) a:Proc→None
[14:21:15] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10549250 (Ladsgroup) I will try to help with this more once I'm done with {T386162} but a couple of small things that comes to mind: - A replica shou...
[14:25:31] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge, Patch-For-Review: [toolsdb] Remove apt pinning and upgrade to latest version - https://phabricator.wikimedia.org/T385885#10549255 (Andrew) Can you just leave the pinning as is and just upgrade the package specifying version with =? Or do you not tr...
[14:29:40] wikitech.wikimedia.org, serviceops-radar, SRE, Patch-For-Review, SRE-Unowned: Redesign wikitech-static - https://phabricator.wikimedia.org/T376400#10549272 (Andrew) p:Triage→Medium Update: gitlab is making daily snapshot builds and uploading them to quay.io -- the builds fail now and...
[14:31:09] cloud-services-team, Cloud-VPS, InternetArchiveBot: Block crawlers on cyberbot project (iabot.wmcloud.org) - https://phabricator.wikimedia.org/T383592#10549286 (Andrew) Open→Invalid ok! Once per second doesn't sound like a candidate for blocking so I'm closing this for now. @Cyberpower678...
[14:38:54] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge, Patch-For-Review: [toolsdb] Remove apt pinning and upgrade to latest version - https://phabricator.wikimedia.org/T385885#10549335 (fnegri) I thought I tried, but my bash history reveals I used `@` instead of `=`... Using `=` works fine: ` fnegri@to...
[14:42:28] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10549351 (fnegri)
[14:43:09] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10549355 (fnegri)
[14:51:49] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10549411 (fnegri) Thanks @Ladsgroup! > Please make sure the replica is read only ` MariaDB [(none)]> SELECT @@read_only; +-------------+ | @@read_on...
[14:53:54] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] ToolsToolsDBReplicationLagIsTooHigh - 2025-02-12 - https://phabricator.wikimedia.org/T386240#10549418 (fnegri)
[15:18:34] cloud-services-team, Toolforge: Support with steps to access Toolforge user data - https://phabricator.wikimedia.org/T386120#10549574 (KCVelaga_WMF) Open→Resolved All that information was super helpful @bd808! We are able to get the data required, thanks again.
[15:26:56] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10549630 (aborrero) >>! In T385900#10548701, @fnegri wrote: > I didn't find anything in the logs (see comments above) indicating wh...
[15:39:56] FIRING: SystemdUnitDown: The service unit backup_vms.service is in failed status on host cloudbackup1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[15:42:16] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10549744 (fnegri) I have no idea which table caused the crash, I don't see anything in the logs that points to a table name... :/...
[15:45:38] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10549769 (fnegri) We could try running [mariadb-check](https://mariadb.com/kb/en/mariadb-check/) to check //all// tables and //all/...
[15:48:10] Tool-ranker, translatewiki.net, LPL Essential (LPL Essential 2024 Nov-Jan), Patch-For-Review, Unplanned-Sprint-Work: Add Ranker to translatewiki.net - https://phabricator.wikimedia.org/T384061#10549814 (abi_) Open→Resolved >>! In T384061#10542600, @abi_ wrote: >>>! In T384061#1054...
[16:01:28] Cloud Services Proposals, cloud-services-team: Decision Request - PAWS user gatekeeping - https://phabricator.wikimedia.org/T386380 (aborrero) NEW
[16:06:05] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10549982 (aborrero) I would start by checking the few mentions we have in the different logs you collected, for example: `s51434__m...
[16:12:26] RESOLVED: SystemdUnitDown: The service unit backup_vms.service is in failed status on host cloudbackup1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[16:13:50] Cloud Services Proposals, cloud-services-team: Decision Request - PAWS user gatekeeping - https://phabricator.wikimedia.org/T386380#10550015 (aborrero) p:Triage→Medium
[16:49:59] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10550224 (fnegri) I tried ANALYZE TABLE on those ones (both on primary and replica host) and they all returned `OK`: ` MariaDB [s5...
[16:53:56] Cloud Services Proposals, cloud-services-team: Decision Request - PAWS user gatekeeping - https://phabricator.wikimedia.org/T386380#10550257 (bd808) Option 2 was the original design intent and the primary market benefit of PAWS (zero friction compute via an SUL account).
[17:03:08] Cloud Services Proposals, cloud-services-team: Decision Request - PAWS user gatekeeping - https://phabricator.wikimedia.org/T386380#10550272 (rook) The main abusive uses that I've seen are: crypto miners, proxies, dos scripts, and pulling down pirated material. T381373 appears to have significantly reduc...
[17:04:40] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge: [toolsdb] mariadb crashing repeatedly (innodb_fatal_semaphore_wait_threshold) - https://phabricator.wikimedia.org/T385900#10550276 (fnegri) Unless we see more crashes in the coming days/weeks, I don't think we need to debug this further and I think we c...
[17:06:56] FIRING: SystemdUnitDown: The service unit backup_vms.service is in failed status on host cloudbackup1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[18:01:32] Cloud Services Proposals, cloud-services-team: Decision Request - PAWS user gatekeeping - https://phabricator.wikimedia.org/T386380#10550459 (cmooney) Thanks for opening this task @aborrero ! I don't have sufficient knowledge to know what the exact best strategies to lock down access and prevent abuse o...
[19:01:56] FIRING: SystemdUnitDown: The systemd unit backup_vms.service on node cloudbackup1004 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[19:27:22] FIRING: HAProxyBackendUnavailable: HAProxy service nova-metadata-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[19:29:00] (update) raymond-ndibe: Draft: [jobs-api] use job k8s custom resources in code [repos/cloud/toolforge/jobs-api] (diff_job_runtime_method) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (https://phabricator.wikimedia.org/T359650)
[19:32:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service nova-metadata-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[19:36:21] (update) raymond-ndibe: Draft: [jobs-api] use job k8s custom resources in code [repos/cloud/toolforge/jobs-api] (diff_job_runtime_method) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (https://phabricator.wikimedia.org/T359650)
[19:53:11] (update) raymond-ndibe: Draft: [jobs-api] use job k8s custom resources in code [repos/cloud/toolforge/jobs-api] (diff_job_runtime_method) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (https://phabricator.wikimedia.org/T359650)
[20:41:16] PAWS: New upstream release for OpenRefine - https://phabricator.wikimedia.org/T386408#10550851 (LibUp-bot)
[20:59:01] RESOLVED: ToolsToolsDBReplicationLagIsTooHigh: ToolsDB replication on tools-db-5 is lagging behind the primary, the current lag is 4737 - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsDBReplication - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsToolsDBReplicationLagIsTooHigh
[21:36:58] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/6 (owner: l10n-bot)
[21:37:01] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/6 (owner: l10n-bot)
[21:38:44] (update) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/30 (owner: l10n-bot)
[21:39:01] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/30 (owner: l10n-bot)
[21:39:27] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/30 (owner: l10n-bot)
[21:54:06] Cloud-Services, cloud-services-team, Catalyst: Add catalyst project to prometheus-alerts alertmanager. - https://phabricator.wikimedia.org/T386416 (EBomani) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/pr...
[22:01:10] cloud-services-team, Cloud-VPS, VPS-Projects, Catalyst: Add catalyst project to prometheus-alerts alertmanager. - https://phabricator.wikimedia.org/T386416#10551139 (EBomani)
[22:25:14] cloud-services-team, Cloud-VPS, VPS-Projects, Catalyst: Add catalyst project to prometheus-alerts alertmanager. - https://phabricator.wikimedia.org/T386416#10551256 (bd808)
[22:25:14] cloud-services-team, Catalyst (Kiwen): Grafana.wmcloud.org has project alerts for catalyst, route alerts catalyst/patchdemo maintainers - https://phabricator.wikimedia.org/T385330#10551257 (bd808)