[00:35:57] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11762569 (Abdullahi207) I have completed the root path microtask (T415408). Implemented a HomePage component for the root path (/) with the following structure: - Hero Banner: di...
[00:58:43] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11762596 (Abdullahi207)
[05:13:11] cloud-services-team, Data-Services, Data-Persistence, DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11762729 (Marostegui) So no crashes since 10.11.13 got installed, it's been stable ever since. Seems to be related to 10.11.15 and .16....
[05:17:22] FIRING: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[05:20:08] cloud-services-team, Data-Services, Data-Persistence, DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11762748 (Marostegui) >>! In T420177#11762729, @Marostegui wrote: > So no crashes since 10.11.13 got installed, it's been stable ever si...
[05:22:22] RESOLVED: [2x] HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[08:08:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[08:13:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[08:18:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[08:23:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[08:39:19] Tool-wikimedia-attribution: Update data links to the Attribution API Beta - https://phabricator.wikimedia.org/T421385#11763025 (Sarai-WMF) This looks like a duplicate of the more verbose task {T416989}. I'll mark that ticket as invalid to avoid any confusion.
[08:40:07] Tool-wikimedia-attribution: [WAF·V1] Replace placeholder Attribution API references in signals' Data Sources section - https://phabricator.wikimedia.org/T416989#11763029 (Sarai-WMF) Open→Invalid Invalid because a parallel task was created to document this effort: {T421385}
[08:43:40] cloud-services-team, Data-Services, Data-Persistence, DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11763051 (Marostegui) Crashed with 10.11.16 and got some stack traces, I will start digging.
[08:44:00] FIRING: OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[08:49:41] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:07:56] FIRING: SystemdUnitDown: The service unit rabbitmq_detect_partition.service is in failed status on host cloudrabbit1002.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1002 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:08:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[09:11:32] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services
[09:11:49] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[09:12:56] RESOLVED: SystemdUnitDown: The service unit rabbitmq_detect_partition.service is in failed status on host cloudrabbit1002. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1002 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:12:59] !log filippo@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment eqiad1 for all services
[09:13:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[09:13:34] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,heat
[09:14:25] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,heat
[09:14:40] !log filippo@cloudcumin1001 admin START - Cookbook
wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,neutron,designate
[09:14:41] RESOLVED: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:14:56] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:15:11] RESOLVED: PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=eqiad%20prometheus%2Fcloud - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:15:53] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for all services
[09:18:04] !log filippo@cloudcumin1001 admin END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for service: project,neutron,designate
[09:18:10] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova,neutron,designate
[09:19:15] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS: rabbitmqctl list_queues in eqiad/codfw times out after 60s - https://phabricator.wikimedia.org/T420923#11763170 (fgiunchedi) Open→Resolved a:fgiunchedi This is fixed now, both rabbitmq server and CLI use ipv6 for erlang distribution prot...
[09:19:56] FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:24:56] FIRING: [2x] SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:26:11] FIRING: [4x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:28:58] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,designate
[09:30:01] !log filippo@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.restart_openstack (exit_code=99) on deployment eqiad1 for service: project,designate
[09:30:26] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova,neutron,designate
[09:31:32] Cloud-VPS (Project-requests): GOSSH (Global Open-Source Scientific Hardware) - https://phabricator.wikimedia.org/T421600#11763219 (Aklapper) Thanks, that's helpful info! When you say "verified by the GOSSH community", where is that community currently and how large is the community? Is there a discussion, e....
[09:41:38] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,designate
[09:42:46] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,designate
[09:44:12] Tool-wikimedia-attribution: Consider renaming the 'Attribution Count' signal - https://phabricator.wikimedia.org/T421683 (Sarai-WMF) NEW
[09:44:56] RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:45:26] Tool-wikimedia-attribution: Consider renaming the 'Attribution Count' signal - https://phabricator.wikimedia.org/T421683#11763288 (Sarai-WMF)
[09:46:11] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:46:26] FIRING: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:51:11] RESOLVED: [2x] PrometheusRestarted: Prometheus/cloud restarted: beware monitoring artifacts. - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRestarted
[09:54:56] FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:56:50] FIRING: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[09:57:45] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster on deployment eqiad1
[09:58:40] !log filippo@cloudcumin1001 admin END (FAIL) - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster (exit_code=99) on deployment eqiad1
[09:59:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[09:59:56] FIRING: [3x] SystemdUnitDown: The service unit magnum-conductor.service is in failed status on host cloudcontrol1006.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:00:55] FIRING: MaxConntrack: Elevated conntrack usage on cloudrabbit1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[10:04:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[10:04:56] FIRING: [5x] SystemdUnitDown: The service unit magnum-conductor.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:05:55] RESOLVED: MaxConntrack: Elevated conntrack usage on cloudrabbit1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[10:06:50] FIRING: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:09:56] FIRING: [6x] SystemdUnitDown: The service unit magnum-conductor.service is in failed status on host cloudcontrol1006.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:10:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[10:11:50] RESOLVED: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:12:05] FIRING: [48x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:12:20] RESOLVED: [9x] NeutronAgentDown: Neutron neutron-metadata-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:14:14] FIRING: [2x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server toolsbeta-test-k8s-ingress-12.toolsbeta.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[10:14:56] FIRING: [7x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:15:22] RESOLVED: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[10:17:32] FIRING: WidespreadInstanceDown: Widespread instances down in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadInstanceDown
[10:17:32] FIRING: InstanceDown: Project toolsbeta instance toolsbeta-test-k8s-ingress-12 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:17:32] FIRING: [2x] InstanceDown: Project gitlab-runners instance runner-1032 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[10:19:56] FIRING: [7x] SystemdUnitDown: The service unit designate_floating_ip_ptr_records_updater.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:21:30] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[10:22:18] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11763451 (Amitkhatri89) Hi @Nokib_Sarkar and @Tiven2240, I am interested in the CampWiz NxT Redesign project for GSoC 2026. I have basic experience in React, JavaScript and fronte...
[10:23:23] FIRING: ToolforgeKubernetesNodeNotReady: Kubernetes node toolsbeta-test-k8s-ingress-12 is not ready - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[10:29:04] FIRING: [6x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:34:05] FIRING: [24x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:36:02] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services
[10:38:49] FIRING: [38x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:39:56] RESOLVED: SystemdUnitDown: The service unit drain_rabbitmq_notification_error.service is in failed status on host cloudrabbit1001.
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[10:43:50] FIRING: [42x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[10:45:46] (open) valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2
[10:57:56] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster on deployment eqiad1
[10:58:44] cloud-services-team, Striker: Striker should warn against applying for membership with bot accounts - https://phabricator.wikimedia.org/T421690 (taavi) NEW
[10:59:44] Tool-campwiz-nxt, Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11763615 (Navneet212006) Hello, I have completed **Microtask T415408** as part of the CampWiz contribution tasks. ### Summary of Work * Built a React-based frontend using **Vite...
[11:00:17] FIRING: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[11:01:14] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster (exit_code=0) on deployment eqiad1
[11:05:17] RESOLVED: JobUnavailable: Reduced availability for job rabbitmq in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[11:05:56] FIRING: SystemdUnitDown: The service unit drain_rabbitmq_notification_error.service is in failed status on host cloudrabbit1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[11:06:25] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[11:08:50] FIRING: [42x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:09:22] FIRING: [7x] HAProxyBackendUnavailable: HAProxy service glance-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:13:50] FIRING: [42x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down -
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:14:04] FIRING: [42x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:14:22] RESOLVED: [7x] HAProxyBackendUnavailable: HAProxy service glance-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:18:50] FIRING: [42x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:20:57] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services
[11:22:32] RESOLVED: InstanceDown: Project toolsbeta instance toolsbeta-test-k8s-ingress-12 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[11:22:32] RESOLVED: WidespreadInstanceDown: Widespread instances down in project gitlab-runners - https://prometheus-alerts.wmcloud.org/?q=alertname%3DWidespreadInstanceDown
[11:22:32] RESOLVED: [2x] InstanceDown: Project gitlab-runners instance runner-1032 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[11:23:23] RESOLVED: ToolforgeKubernetesNodeNotReady: Kubernetes node toolsbeta-test-k8s-ingress-12 is not ready -
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[11:23:50] RESOLVED: [41x] NeutronAgentDown: Neutron neutron-openvswitch-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[11:24:14] RESOLVED: [2x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server toolsbeta-test-k8s-ingress-12.toolsbeta.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDo
[11:26:58] (update) valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2
[11:27:59] Cloud-VPS (Quota-requests), Browser Test Platform, Continuous-Integration-Infrastructure, Continuous-Integration-Config, Jenkins: New flavor for the integration project with more vCPU and ephemeral disk space - https://phabricator.wikimedia.org/T421242#11763790 (hashar)
[11:30:32] (update) valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2
[11:32:06] (update) valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2
[11:32:27] (approved) vriaa: fix: prevent
banner link input from overflowing in small screen sizes [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/37
[11:32:32] (merge) vriaa: fix: prevent banner link input from overflowing in small screen sizes [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/37
[11:32:39] (update) valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2
[11:35:56] RESOLVED: SystemdUnitDown: The service unit drain_rabbitmq_notification_error.service is in failed status on host cloudrabbit1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[11:36:00] (merge) vriaa: fix: prevent image fields from overflowing the inputs sidebar [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/38 (https://phabricator.wikimedia.org/T421120)
[11:36:57] (merge) vriaa: fix: prevent alignment buttons from overflowing in small screen sizes [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/40 (https://phabricator.wikimedia.org/T420940)
[11:38:07] (merge) vriaa: fix: prevent add link modal from overflowing [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/41 (https://phabricator.wikimedia.org/T421101)
[11:44:31] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Patch-For-Review: Move all openstack rabbitmq queues to quorum -
https://phabricator.wikimedia.org/T421054#11763845 (fgiunchedi) This took a bunch of tries today and despite my best attempts to mess with rabbit and oslo, openstack reacted reasonably...
[11:51:13] (PS1) Btullis: Update dummy keytabs to match the active list in puppet [labs/private] - https://gerrit.wikimedia.org/r/1264589 (https://phabricator.wikimedia.org/T421241)
[11:53:28] (CR) Btullis: [V:+2 C:+2] Update dummy keytabs to match the active list in puppet [labs/private] - https://gerrit.wikimedia.org/r/1264589 (https://phabricator.wikimedia.org/T421241) (owner: Btullis)
[11:54:38] (update) vriaa: refactor: migrate deep selectors to use :deep syntax [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/43 (https://phabricator.wikimedia.org/T421519)
[11:56:41] (merge) vriaa: refactor: migrate deep selectors to use :deep syntax [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/43 (https://phabricator.wikimedia.org/T421519)
[11:56:50] FIRING: NeutronAgentDown: Neutron neutron-l3-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[12:08:41] (CR) CI reject: [V:-1] Localisation updates from https://translatewiki.net.
[labs/tools/intuition] - https://gerrit.wikimedia.org/r/1264592 (owner: L10n-bot)
[12:20:47] (update) vriaa: fix: prevent banner link and text link from conflicting [toolforge-repos/centralnotice-banner-editor] - https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/42 (https://phabricator.wikimedia.org/T421068)
[12:29:28] cloud-services-team (FY2025/2026-Q3-Q4), Cloud-VPS, Patch-For-Review: Move all openstack rabbitmq queues to quorum - https://phabricator.wikimedia.org/T421054#11764075 (fgiunchedi) The final bits of flipping neutron-l3-agent to quorum queues will be done tomorrow at 7 UTC within a scheduled window. T...
[12:31:08] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/65
[12:34:17] FIRING: JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[12:39:17] RESOLVED: JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[12:53:10] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/wd-image-positions] - https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/65 (owner: l10n-bot)
[12:53:18] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net.
[toolforge-repos/wd-image-positions] - 10https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/merge_requests/65 (owner: 10l10n-bot) [12:53:56] FIRING: SystemdUnitDown: The service unit kiwix-mirror-update.service is in failed status on host clouddumps1001. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [12:59:35] 06cloud-services-team, 10Toolforge: Alert on Prometheus instability / unexpected restarts - https://phabricator.wikimedia.org/T421416#11764341 (10dcaro) After the first restart at least, it's failing to parse the WAL during startup and crashing in a loop, the "quick fix" for the restart loop is to remove wal (... [13:00:05] 06cloud-services-team, 10Toolforge: [infra,o11y] Alert on Prometheus instability / unexpected restarts - https://phabricator.wikimedia.org/T421416#11764342 (10dcaro) [13:03:15] 06cloud-services-team, 10Toolforge: [infra,o11y] Alert on Prometheus instability / unexpected restarts - https://phabricator.wikimedia.org/T421416#11764347 (10dcaro) promtool analysis: {{P89967}} [13:06:19] (03update) 10fnegri: Introduce diff-mode [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10 (https://phabricator.wikimedia.org/T351637) [13:06:24] (03update) 10fnegri: Introduce diff-mode [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10 (https://phabricator.wikimedia.org/T351637) [13:38:15] (03merge) 10valeriobozzolan: itwiki: fix breaking change on categorylinks, imagelinks (linktarget) [toolforge-repos/lists] - 10https://gitlab.wikimedia.org/toolforge-repos/lists/-/merge_requests/2 [13:39:17] FIRING: JobUnavailable: Reduced availability for job openstack in cloud@codfw - 
https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [13:40:58] (03update) 10raymond-ndibe: [status] make job status an enum, with clearly defined states [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/208 (https://phabricator.wikimedia.org/T401172) [13:42:45] !log filippo@cloudcumin1001 admin START - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster on deployment codfw1dev [13:42:55] 06cloud-services-team, 10Toolforge: [infra,o11y] Alert on Prometheus instability / unexpected restarts - https://phabricator.wikimedia.org/T421416#11765271 (10taavi) a:03taavi [13:43:21] (03open) 10taavi: monitoring: Alert on unexpected Prometheus restarts [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/58 (https://phabricator.wikimedia.org/T421416) [13:43:26] (03update) 10taavi: monitoring: Alert on unexpected Prometheus restarts [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/58 (https://phabricator.wikimedia.org/T421416) [13:43:34] (03update) 10taavi: monitoring: Alert on unexpected Prometheus restarts [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/58 (https://phabricator.wikimedia.org/T421416) [13:44:17] FIRING: [2x] JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [13:45:50] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.rabbitmq.rebuild_rabbit_cluster (exit_code=0) on deployment codfw1dev [13:48:07] !log filippo@cloudcumin1001 admin START - 
Cookbook wmcs.openstack.restart_openstack on deployment codfw1dev for all services [13:49:17] RESOLVED: [2x] JobUnavailable: Reduced availability for job openstack in cloud@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [13:51:29] 10Tools: QuickStatements not working since a couple of days (Item not available) - https://phabricator.wikimedia.org/T421345#11765344 (10pere_prlpz) An example from Commons: https://quickstatements.toolforge.org/#/batch/256058 Lately I've been able to run not very long batches in Wikidata but my batches in Comm... [13:51:41] (03approved) 10filippo: monitoring: Alert on unexpected Prometheus restarts [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/58 (https://phabricator.wikimedia.org/T421416) (owner: 10taavi) [13:52:17] !log filippo@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment codfw1dev for all services [13:53:12] (03merge) 10taavi: monitoring: Alert on unexpected Prometheus restarts [repos/cloud/toolforge/alerts] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/58 (https://phabricator.wikimedia.org/T421416) [13:55:49] FIRING: NeutronAgentDownForLong: Neutron neutron-l3-agent on cloudnet1005 has been down for more than 2h - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDownForLong [13:57:14] (03update) 10taavi: maintain-kubeusers: Sort quota entries [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1185 [13:57:15] (03update) 10taavi: maintain-kubeusers: Bump techactivity quota 
[repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 (https://phabricator.wikimedia.org/T421292) [13:57:15] (03open) 10taavi: maintain-kubeusers: Sort quota entries [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1185 [13:57:18] (03open) 10taavi: maintain-kubeusers: Bump techactivity quota [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 (https://phabricator.wikimedia.org/T421292) [13:57:24] (03update) 10taavi: maintain-kubeusers: Sort quota entries [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1185 [13:57:26] (03update) 10taavi: maintain-kubeusers: Bump techactivity quota [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 (https://phabricator.wikimedia.org/T421292) [13:57:30] 10Toolforge (Quota-requests), 13Patch-For-Review: Request increased quota for techactivity Toolforge tool - https://phabricator.wikimedia.org/T421292#11765372 (10taavi) a:03taavi [13:58:19] 06cloud-services-team, 10Data-Services, 06Data-Persistence, 06DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11765375 (10Marostegui) Created: https://jira.mariadb.org/browse/MDEV-39209 [14:21:02] (03approved) 10dcaro: maintain-kubeusers: Sort quota entries [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1185 (owner: 10taavi) [14:22:02] (03approved) 10dcaro: maintain-kubeusers: Bump techactivity quota [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 
(https://phabricator.wikimedia.org/T421292) (owner: 10taavi) [14:27:08] (03update) 10raymond-ndibe: [status] make job status an enum, with clearly defined states [repos/cloud/toolforge/jobs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/208 (https://phabricator.wikimedia.org/T401172) [14:38:37] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers [14:38:50] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-kubeusers [14:42:49] (03merge) 10taavi: maintain-kubeusers: Sort quota entries [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1185 [14:42:53] (03update) 10taavi: maintain-kubeusers: Bump techactivity quota [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 (https://phabricator.wikimedia.org/T421292) [14:42:57] (03merge) 10taavi: maintain-kubeusers: Bump techactivity quota [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1186 (https://phabricator.wikimedia.org/T421292) [14:43:52] 10Toolforge (Quota-requests), 13Patch-For-Review: Request increased quota for techactivity Toolforge tool - https://phabricator.wikimedia.org/T421292#11765675 (10taavi) 05Open→03Resolved [14:48:56] FIRING: SystemdUnitDown: The systemd unit kiwix-mirror-update.service on node clouddumps1001 has been failing for more than two hours. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [14:49:37] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [15:10:24] 06cloud-services-team, 10Cloud-VPS (Quota-requests): Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739 (10Andrew) 03NEW [15:11:24] 06cloud-services-team, 10Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11765924 (10taavi) [15:16:21] 06cloud-services-team, 10Data-Services, 06Data-Persistence, 06DBA: clouddb1013 crashed after the upgrade to mariadb 10.11.16 - https://phabricator.wikimedia.org/T420177#11765943 (10Marostegui) I've got a patch from mariadb which I've compiled already along with the .deb. However I am going to wait for the... 
[15:17:16] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [15:18:23] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [15:18:27] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [15:32:33] (03update) 10raymond-ndibe: logs: test since, until params for jobs-api logs [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917) [15:34:33] 06cloud-services-team, 10Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11766094 (10Andrew) [15:35:12] (03update) 10raymond-ndibe: logs-api: add more logs tests [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1140 (https://phabricator.wikimedia.org/T418326) [15:35:27] 06cloud-services-team, 10Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11766100 (10Andrew) For #2, taavi points out that there is a boilerplate description for auto-created records https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet... 
[15:36:23] (03update) 10raymond-ndibe: logs: test since, until params for jobs-api logs [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917) [15:36:53] (03update) 10raymond-ndibe: toolforge_deploy: add `local` option to choices [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/315 (owner: 10dcaro) [15:36:59] (03merge) 10raymond-ndibe: toolforge_deploy: add `local` option to choices [repos/cloud/toolforge/lima-kilo] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/315 (owner: 10dcaro) [15:37:39] (03update) 10raymond-ndibe: builds-api: update buildpacks to 24_0.21.5 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1175 (owner: 10dcaro) [15:40:21] 10Tool-campwiz-nxt, 10Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11766116 (10Priyanjali05) Hello, I am interested in working on the CampWiz NxT Redesign project for GSoC 2026. I have experience with JavaScript, HTML, and CSS, and I am currently le... 
[15:41:06] (03merge) 10vriaa: fix: prevent banner link and text link from conflicting [toolforge-repos/centralnotice-banner-editor] - 10https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/42 (https://phabricator.wikimedia.org/T421068) [15:41:17] (03merge) 10vriaa: fix: preserve line breaks in text elements [toolforge-repos/centralnotice-banner-editor] - 10https://gitlab.wikimedia.org/toolforge-repos/centralnotice-banner-editor/-/merge_requests/39 (https://phabricator.wikimedia.org/T420920) [15:43:15] 10Tool-centralnotice-banner-editor, 13Patch-For-Review: Banner-level and text-level link options conflict with each other - https://phabricator.wikimedia.org/T421068#11766133 (10Oyelola_Victoria) a:03Oyelola_Victoria [15:44:39] 10Tool-centralnotice-banner-editor, 13Patch-For-Review: Banner-level and text-level link options conflict with each other - https://phabricator.wikimedia.org/T421068#11766140 (10Oyelola_Victoria) 05Open→03Resolved [15:45:42] 10Tool-centralnotice-banner-editor, 13Patch-For-Review: Banner editor does not preserve line breaks in text elements - https://phabricator.wikimedia.org/T420920#11766150 (10Oyelola_Victoria) 05Open→03Resolved a:03Oyelola_Victoria [15:46:56] 10Tool-centralnotice-banner-editor: Alignment buttons are not responsive and are cut off on small screens - https://phabricator.wikimedia.org/T420940#11766173 (10Oyelola_Victoria) 05Open→03Resolved a:03Oyelola_Victoria [15:57:23] 06cloud-services-team, 10Striker: Striker should warn against applying for membership with bot accounts - https://phabricator.wikimedia.org/T421690#11766263 (10taavi) p:05Triage→03Low [15:57:44] 10Tool-centralnotice-banner-editor: Add link to text modal overflows and cuts off in small screens - https://phabricator.wikimedia.org/T421101#11766265 (10Oyelola_Victoria) 05Open→03Resolved a:03Oyelola_Victoria [15:58:14] 10Tool-centralnotice-banner-editor: Replace ::v-deep selectors with :deep() - 
https://phabricator.wikimedia.org/T421519#11766270 (10Oyelola_Victoria) 05Open→03Resolved a:03Oyelola_Victoria [16:02:04] 10Tool-centralnotice-banner-editor: Some image fields overflow and cut off in small screens - https://phabricator.wikimedia.org/T421120#11766285 (10Oyelola_Victoria) 05Open→03Resolved [16:04:53] 06cloud-services-team (FY2025/2026-Q3-Q4), 10Toolforge: cadvisor-reported Istio network usage is way too high - https://phabricator.wikimedia.org/T421386#11766300 (10elukey) In Prod we are working on reducing the buckets in T392886, now that we run a relatively recent version of Istio. [16:31:47] 10Tool-centralnotice-banner-editor: Allow links to be applied to individual words or phrases within a text element - https://phabricator.wikimedia.org/T420945#11766418 (10Oyelola_Victoria) 05Open→03Invalid [16:38:41] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [16:40:11] (03update) 10dcaro: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) (owner: 10raymond-ndibe) [16:40:11] (03approved) 10dcaro: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) (owner: 10raymond-ndibe) [16:40:27] 10Tool-centralnotice-banner-editor: Allow links to be applied to individual words or phrases within a text element - https://phabricator.wikimedia.org/T420945#11766438 (10Oyelola_Victoria) Closing this task as Invalid after discussing with @GFontenelle_WMF. It could lead to banners with too many links, which... 
[16:50:32] 10Tool-campwiz-nxt, 10Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11766478 (10Aklapper) [16:51:10] 10Tool-centralnotice-banner-editor: Add dark mode support to the editor interface - https://phabricator.wikimedia.org/T421746 (10Oyelola_Victoria) 03NEW [16:51:25] 10Tool-centralnotice-banner-editor: Extend Banner Editor to support more complex banner templates - https://phabricator.wikimedia.org/T420918#11766505 (10Oyelola_Victoria) p:05Triage→03High [16:54:47] 10Tool-centralnotice-banner-editor: Add contrast checker for accessibility guidelines - https://phabricator.wikimedia.org/T420929#11766529 (10Oyelola_Victoria) p:05Triage→03Medium [16:56:09] 10Tool-centralnotice-banner-editor: Add undo functionality to the editor - https://phabricator.wikimedia.org/T421061#11766536 (10Oyelola_Victoria) p:05Triage→03High [16:56:19] 10Tool-centralnotice-banner-editor: Add option to set border, margin, and padding for all sides at once - https://phabricator.wikimedia.org/T421063#11766537 (10Oyelola_Victoria) p:05Triage→03Low [17:00:00] (03update) 10fnegri: Replace only views that need updating [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/9 (https://phabricator.wikimedia.org/T351637) [17:00:00] (03update) 10fnegri: Add --diff-mode and remove --dry-run [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10 (https://phabricator.wikimedia.org/T351637) [17:00:00] (03open) 10fnegri: Add summary with counts [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11 [17:00:01] (03update) 10fnegri: Add summary with counts [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11 [17:00:28] (03update) 10fnegri: Add summary with counts [repos/cloud/wikireplicas-utils] - 
10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11 [17:00:28] (03update) 10fnegri: Add --diff-mode and remove --dry-run [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10 [17:00:30] (03update) 10fnegri: Replace only views that need updating [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/9 (https://phabricator.wikimedia.org/T351637) [17:04:47] (03update) 10fnegri: Add summary with counts [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/11 [17:04:48] (03update) 10fnegri: Add --diff-mode and remove --dry-run [repos/cloud/wikireplicas-utils] - 10https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/10 [17:29:06] 10Tool-centralnotice-banner-editor: Add image lookup within the editor - https://phabricator.wikimedia.org/T421072#11766676 (10Oyelola_Victoria) p:05Triage→03Low [18:04:22] 10Tool-campwiz-nxt, 10Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11766910 (10LikhithSappa) Hi, I have compelted the required migration tasks and submitted multiple pull requests for review. I have also sent an interview mail to the provided email... 
[18:11:21] (03open) 10andrew: Add ptr record for mail.wikimedia.az [repos/cloud/cloud-vps/tofu-infra] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/303 (https://phabricator.wikimedia.org/T421025) [18:14:57] (03update) 10andrew: Add ptr record for mail.wikimedia.az [repos/cloud/cloud-vps/tofu-infra] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/303 (https://phabricator.wikimedia.org/T421025) [18:24:47] 10Tool-refill, 07good first task: reFill UI still shows links to former maintainer’s user pages and GitHub account - https://phabricator.wikimedia.org/T421478#11767039 (10Alachuckthebuck) So I've gone and taken a look at the codebase and it links to @Curb_Safe_Charmer 's repo, but the bundled js being used by... [18:26:01] 10Tool-campwiz-nxt, 10Google-Summer-of-Code (2026): GSoC 2026: CampWiz NxT Redesign - https://phabricator.wikimedia.org/T414269#11767044 (10RidsiyaNimi) [18:33:51] 10Cloud-VPS (Project-requests): GOSSH (Global Open-Source Scientific Hardware) - https://phabricator.wikimedia.org/T421600#11767096 (10taavi) 05Open→03Declined Declining, since this does not seem related to any existing Wikimedia project, and Cloud VPS is not a hosting platform for any generic freely lic... [18:49:11] FIRING: SystemdUnitDown: The systemd unit kiwix-mirror-update.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [19:07:10] (03CR) 10MusikAnimal: [C:03+2] Add test coverage for Model (031 comment) [labs/xtools] - 10https://gerrit.wikimedia.org/r/1257870 (owner: 10Alien4444) [19:07:55] (03CR) 10MusikAnimal: [C:03+2] "Not sure why L10n-bot doesn't +2 itself here. I will look into it." 
[labs/xtools] - 10https://gerrit.wikimedia.org/r/1261416 (owner: 10L10n-bot) [19:09:20] (03Merged) 10jenkins-bot: Add test coverage for Model [labs/xtools] - 10https://gerrit.wikimedia.org/r/1257870 (owner: 10Alien4444) [19:11:43] 10Tool-centralnotice-banner-editor: Learn Vue - https://phabricator.wikimedia.org/T397729#11767228 (10Oyelola_Victoria) 05Open→03Resolved [19:12:01] 10Tool-centralnotice-banner-editor: Add dark mode support to the editor interface - https://phabricator.wikimedia.org/T421746#11767229 (10Oyelola_Victoria) p:05Triage→03Low [19:12:17] 10Tool-centralnotice-banner-editor: Automatically strip protocol from URLs - https://phabricator.wikimedia.org/T421070#11767231 (10Oyelola_Victoria) p:05Triage→03Medium [19:12:32] 10Tool-centralnotice-banner-editor: Add confirmation modal when deleting a banner - https://phabricator.wikimedia.org/T420958#11767233 (10Oyelola_Victoria) p:05Triage→03High [19:12:40] 10Tool-centralnotice-banner-editor: Add confirmation modal when selecting a new template while editing - https://phabricator.wikimedia.org/T420955#11767234 (10Oyelola_Victoria) p:05Triage→03High [19:13:02] 10Tool-centralnotice-banner-editor: Implement fixed CSS feature for default uneditable styles in banners - https://phabricator.wikimedia.org/T420950#11767237 (10Oyelola_Victoria) p:05Triage→03High [19:15:21] (03CR) 10MusikAnimal: "recheck" [labs/xtools] - 10https://gerrit.wikimedia.org/r/1264608 (owner: 10L10n-bot) [19:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [19:24:21] (03CR) 10MusikAnimal: [C:03+2] Localisation updates from https://translatewiki.net. 
[labs/xtools] - 10https://gerrit.wikimedia.org/r/1264608 (owner: 10L10n-bot) [19:30:21] 06cloud-services-team, 10Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11767301 (10Andrew) Item #2 is already handled by the code. I don't know how/why my last attempt was clobbered; trying again. [19:32:31] 06cloud-services-team, 10Cloud-VPS: Improvements to auto-generated floating ip ptr records - https://phabricator.wikimedia.org/T421739#11767340 (10Andrew) >>! In T421739#11767301, @Andrew wrote: > Item #2 is already handled by the code. I don't know how/why my last attempt was clobbered; trying again. OK, on... [19:50:17] FIRING: JobUnavailable: Reduced availability for job blackbox_http in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [19:55:17] RESOLVED: JobUnavailable: Reduced availability for job blackbox_http in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [20:08:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [20:18:56] RESOLVED: SystemdUnitDown: The service unit kiwix-mirror-update.service is in failed status on host clouddumps1001. 
- https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [20:18:56] RESOLVED: SystemdUnitDown: The systemd unit kiwix-mirror-update.service on node clouddumps1001 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=clouddumps1001 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [21:14:01] (03update) 10raymond-ndibe: logs: test since, until params for jobs-api logs [repos/cloud/toolforge/toolforge-deploy] (add_logs_api_tests) - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1141 (https://phabricator.wikimedia.org/T400917) [22:01:54] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [22:02:21] (03update) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [22:02:22] (03approved) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [22:02:35] (03merge) 10raymond-ndibe: support since and until params in logs endpoints [repos/cloud/toolforge/logs-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/logs-api/-/merge_requests/12 (https://phabricator.wikimedia.org/T400917) [22:05:55] (03open) 10group_203_bot_f4d95069bb2675e4ce1fff090c1c1620: logs-api: bump 
to 0.0.16-20260330220252-cd8acfa2 [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1187 (https://phabricator.wikimedia.org/T400917) [22:19:22] FIRING: HAProxyBackendUnavailable: HAProxy service wikireplica-db-web-s3 backend clouddb1022.eqiad.wmnet is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [23:00:44] FIRING: MaintainDBUsersStuck: Maintain-dbusers is stuck - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersStuck - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersStuck