[00:01:15] <wikibugs>	 Tool-paulina, Outreachy (Round 31): Outreachy 31: Features to edit author and work data on Wikidata directly from Paulina - https://phabricator.wikimedia.org/T392429#11494308 (Nurah_Wakili) Weekly Internship Report  Week 4: December 22 – December 27  Task 1: Conducted research on making authenticated req...
[00:03:04] <jinxer-wm>	 FIRING: ObjectStorageObjectQuotaFull: Object storage quota by 'objects' is 80.43% full for project tools-logging - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/ObjectStorageObjectQuotaFull - https://grafana.wikimedia.org/d/7120b794-4638-49f5-bccd-9716efc60f24/wmcs-object-storage-quotas - https://alerts.wikimedia.org/?q=alertname%3DObjectStorageObjectQuotaFull
[00:05:12] <wikibugs>	 Tool-paulina, Outreachy (Round 31): Outreachy 31: Features to edit author and work data on Wikidata directly from Paulina - https://phabricator.wikimedia.org/T392429#11494321 (Nurah_Wakili) Weekly Internship Report  Week 4: December 30 – January 5  Task 1: No development tasks were completed this week du...
[00:14:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[00:18:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[00:43:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[00:48:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[00:54:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[00:55:01] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[00:56:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (ERROR) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=97)
[00:57:12] <wikibugs>	 Tool-Pageviews, Data-Engineering: Generate 2025 topviews yearly datasets - https://phabricator.wikimedia.org/T413393#11494493 (HMonroy) @MusikAnimal looking at this task and noticed your last comment. Did you get around this? I do see data in PageViews from 2025 :)
[00:57:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[01:03:28] <wmcs-alerts>	 FIRING: InstanceDown: Project toolsbeta instance toolsbeta-test-k8s-etcd-27 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:05:28] <wmcs-alerts>	 RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[01:08:48] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99)
[01:13:37] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:13:41] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:13:45] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (ERROR) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=97)
[01:13:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[01:14:33] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:15:08] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:18:15] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:18:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[01:22:23] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:22:28] <wmcs-alerts>	 FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[01:22:58] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:23:02] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:23:28] <wmcs-alerts>	 RESOLVED: InstanceDown: Project toolsbeta instance toolsbeta-test-k8s-etcd-27 is down   - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:26:28] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:28:14] <wmcs-alerts>	 FIRING: [6x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server toolsbeta-test-k8s-control-10.toolsbeta.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[01:28:53] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:28:58] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:29:26] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:32:45] <wmcs-alerts>	 FIRING: Toolforge Kyverno no policy resources: Toolforge Kyverno has no policy resources - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/Toolforge_Kyverno_no_policy_resources - https://grafana.wmcloud.org/d/kyverno/kyverno?orgId=1&var-DS_PROMETHEUS_KYVERNO=prometheus-tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforge+Kyverno+no+policy+resources
[01:32:46] <wmcs-alerts>	 FIRING: Toolforge Kyverno unknown state: Toolforge Kyverno has unknown state. Kyverno might be down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/Toolforge_Kyverno_unknown_state - https://grafana.wmcloud.org/d/kyverno/kyverno?orgId=1&var-DS_PROMETHEUS_KYVERNO=prometheus-tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforge+Kyverno+unknown+state
[01:33:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:33:53] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:36:19] <wmcs-alerts>	 FIRING: TektonDown: Tekton is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/TektonDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTektonDown
[01:36:25] <wmcs-alerts>	 FIRING: JobsApiDown: JobsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsApiDown
[01:36:58] <wmcs-alerts>	 FIRING: JobsEmailerDown: JobsEmailer is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerDown
[01:37:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:37:17] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:37:18] <wmcs-alerts>	 FIRING: EnvvarsApiDown: EnvvarsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsApiDown
[01:37:21] <wmcs-alerts>	 FIRING: MaintainKubeusersDown: maintain-kubeusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersDown
[01:37:23] <wmcs-alerts>	 FIRING: ToolforgeKubernetesNodeNotReady: (no data) Multiple Kubernetes nodes are not ready #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[01:37:26] <wmcs-alerts>	 FIRING: BuildsApiDown: BuildsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/BuildsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DBuildsApiDown
[01:37:34] <wmcs-alerts>	 FIRING: ComponentsApiDown: ComponentsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ComponentsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DComponentsApiDown
[01:37:34] <wmcs-alerts>	 FIRING: EnvvarsAdmissionDown: EnvvarsAdmission is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsAdmissionDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsAdmissionDown
[01:42:43] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[01:43:49] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[01:43:53] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:43:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[01:46:19] <wmcs-alerts>	 RESOLVED: TektonDown: Tekton is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/TektonDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTektonDown
[01:46:25] <wmcs-alerts>	 RESOLVED: JobsApiDown: JobsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsApiDown
[01:47:21] <wmcs-alerts>	 RESOLVED: MaintainKubeusersDown: maintain-kubeusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainKubeusersDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DMaintainKubeusersDown
[01:47:26] <wmcs-alerts>	 RESOLVED: BuildsApiDown: BuildsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/BuildsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DBuildsApiDown
[01:47:34] <wmcs-alerts>	 RESOLVED: ComponentsApiDown: ComponentsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ComponentsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DComponentsApiDown
[01:47:45] <wmcs-alerts>	 RESOLVED: Toolforge Kyverno no policy resources: Toolforge Kyverno has no policy resources - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/Toolforge_Kyverno_no_policy_resources - https://grafana.wmcloud.org/d/kyverno/kyverno?orgId=1&var-DS_PROMETHEUS_KYVERNO=prometheus-tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforge+Kyverno+no+policy+resources
[01:47:46] <wmcs-alerts>	 RESOLVED: Toolforge Kyverno unknown state: Toolforge Kyverno has unknown state. Kyverno might be down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/Toolforge_Kyverno_unknown_state - https://grafana.wmcloud.org/d/kyverno/kyverno?orgId=1&var-DS_PROMETHEUS_KYVERNO=prometheus-tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforge+Kyverno+unknown+state
[01:48:14] <wmcs-alerts>	 RESOLVED: [6x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server toolsbeta-test-k8s-control-10.toolsbeta.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDo
[01:48:50] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=99)
[01:48:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[01:51:58] <wmcs-alerts>	 RESOLVED: JobsEmailerDown: JobsEmailer is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerDown
[01:52:18] <wmcs-alerts>	 RESOLVED: EnvvarsApiDown: EnvvarsApi is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsApiDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsApiDown
[01:52:23] <wmcs-alerts>	 RESOLVED: ToolforgeKubernetesNodeNotReady: (no data) Multiple Kubernetes nodes are not ready #page - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesNodeNotReady - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesNodeNotReady
[01:52:34] <wmcs-alerts>	 RESOLVED: EnvvarsAdmissionDown: EnvvarsAdmission is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/EnvvarsAdmissionDown  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DEnvvarsAdmissionDown
[01:52:54] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[01:52:58] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[01:59:49] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[02:00:14] <wmcs-alerts>	 FIRING: [3x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server tools-k8s-control-7.tools.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[02:00:44] <jinxer-wm>	 FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[02:04:00] <jinxer-wm>	 FIRING: [4x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[02:05:14] <wmcs-alerts>	 RESOLVED: [3x] ToolforgeKubernetesHAproxyServerDown: Toolforge HAProxy server tools-k8s-control-7.tools.eqiad1.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesHAproxyServerDown - https://grafana.wmcloud.org/d/toolforge-k8s-haproxy/toolforge-k8s-haproxy?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesHAproxyServerDown
[02:05:44] <jinxer-wm>	 RESOLVED: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[02:07:38] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services
[02:10:22] <jinxer-wm>	 FIRING: [9x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[02:11:25] <wikibugs>	 Tool-Pageviews, Data-Engineering: Generate 2025 topviews yearly datasets - https://phabricator.wikimedia.org/T413393#11494553 (MusikAnimal) >>! In T413393#11494493, @HMonroy wrote: > @MusikAnimal looking at this task and noticed your last comment. Did you get around this? I do see data in PageViews from...
[02:11:56] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[02:12:02] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[02:13:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[02:15:22] <jinxer-wm>	 RESOLVED: [9x] HAProxyBackendUnavailable: HAProxy service designate-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is DOWN - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[02:16:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[02:16:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[02:18:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[02:19:00] <jinxer-wm>	 FIRING: [5x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[02:19:58] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[02:20:48] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[02:20:52] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[02:21:33] <wikibugs>	 Tool-Pageviews, Data-Engineering: Generate 2025 topviews yearly datasets - https://phabricator.wikimedia.org/T413393#11494557 (MusikAnimal)
[02:22:21] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services
[02:24:00] <jinxer-wm>	 FIRING: [6x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[02:31:58] <wmcs-alerts>	 FIRING: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[02:35:20] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[02:39:45] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[02:39:49] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[02:43:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[02:45:41] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[02:47:07] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[02:47:11] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[02:53:18] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[02:57:07] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[02:58:10] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[03:08:41] <jinxer-wm>	 FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[03:14:32] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[03:16:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:19:00] <jinxer-wm>	 FIRING: [7x] OpenstackAPIResponse: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[03:20:16] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[03:20:22] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[03:21:56] <jinxer-wm>	 FIRING: [2x] SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:28:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[03:31:56] <jinxer-wm>	 FIRING: [2x] SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:34:27] <wikibugs>	 Tool-Pageviews, Data-Engineering: Generate 2025 topviews yearly datasets - https://phabricator.wikimedia.org/T413393#11494622 (MusikAnimal)
[03:36:56] <jinxer-wm>	 RESOLVED: [2x] SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:36:58] <wmcs-alerts>	 RESOLVED: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[03:44:22] <icinga-wm>	 RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.400 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[03:47:28] <wmcs-alerts>	 RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on toolsbeta-puppetserver-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[03:47:33] <wmcs-alerts>	 RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on tools-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[04:13:11] <jinxer-wm>	 RESOLVED: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[04:18:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[04:22:44] <wikibugs>	 Tool-Pageviews, Data-Engineering: Generate 2025 topviews yearly datasets - https://phabricator.wikimedia.org/T413393#11494644 (MusikAnimal) In progress→Resolved Annnnd the 2025 results are live! 🎉 https://pageviews.wmcloud.org/topviews/?date=2025  I was told to ping @LDickinsonWMF and Aiman J...
[04:43:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:09:08] <wikibugs>	 Tool-wsindex, Wikisource Reader App, Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11494666 (Muguro) **Weekly Internship Report**  //Week 4: December 29 - January 2//   **Overview of Tasks Completed:**  Task 1: Update [[ https://phab...
[05:18:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:43:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:49:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:14:56] <jinxer-wm>	 RESOLVED: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[06:49:56] <jinxer-wm>	 FIRING: SystemdUnitDown: The service unit prometheus-node-textfile-wmcs-dnsleaks.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:55:37] <wikibugs>	 Tool-wsindex, Wikisource Reader App: Display extra information from Wikidata in Book Details Screen - https://phabricator.wikimedia.org/T406245#11494952 (Saiphani02) If any value is unknown, we should not display that line. example, it should not say, "Publisher: Unknown Publisher" change label from "pla...
[10:25:24] <wikibugs>	 (03PS1) 10Majavah: toolforge: k8s: prepare_upgrade: Check that functional tests pass [cloud/wmcs-cookbooks] - 10https://gerrit.wikimedia.org/r/1223633
[10:25:24] <wikibugs>	 (03PS1) 10Majavah: toolforge: k8s: prepare_upgrade: Automatically set downtime [cloud/wmcs-cookbooks] - 10https://gerrit.wikimedia.org/r/1223634
[11:04:34] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Add an option to find new books added to the app - https://phabricator.wikimedia.org/T410597#11495181 (10Bodhisattwa)
[11:04:40] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495182 (10Bodhisattwa)
[11:05:13] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Add an option to find new books added to the app - https://phabricator.wikimedia.org/T410597#11495184 (10Bodhisattwa)
[11:05:17] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495185 (10Bodhisattwa)
[11:05:29] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495188 (10Bodhisattwa)
[11:06:33] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495190 (10Bodhisattwa)
[11:12:35] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Some basic usage analytics from the API - https://phabricator.wikimedia.org/T413864 (10Saiphani02) 03NEW
[11:13:16] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Some basic usage analytics from the API - https://phabricator.wikimedia.org/T413864#11495225 (10Bodhisattwa)
[11:13:22] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495226 (10Bodhisattwa)
[11:17:17] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Some basic usage analytics from the API - https://phabricator.wikimedia.org/T413864#11495235 (10Bodhisattwa)
[11:18:56] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App, 10Outreachy (Round 31): Outreachy 31: Improve the Wikisource Reader App - https://phabricator.wikimedia.org/T405593#11495259 (10Bodhisattwa)
[11:24:37] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Books update option in local library - https://phabricator.wikimedia.org/T408285#11495311 (10Saiphani02) Let us add a refresh type icon for each book in the Library. When the user clicks on it, they should first see a warning that any existing notes, highlights, and pro...
[11:25:07] <wikibugs>	 10Tool-wsindex, 10Wikisource Reader App: Books update option in local library - https://phabricator.wikimedia.org/T408285#11495312 (10Bodhisattwa) a:03Muguro
[12:07:23] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.prepare_upgrade for cluster toolsbeta upgrade to 1.31.14 (T413796)
[12:07:28] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[12:22:14] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.k8s.prepare_upgrade (exit_code=99) for cluster toolsbeta upgrade to 1.31.14 (T413796)
[12:22:20] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[12:25:14] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.prepare_upgrade for cluster toolsbeta upgrade to 1.31.14 (T413796)
[12:26:23] <wikibugs>	 cloud-services-team, Toolforge: toolforge jobs logs: requests.exceptions.HTTPError: 400 Client Error: Bad Request for url - https://phabricator.wikimedia.org/T413874 (taavi) NEW p:Triage→High
[12:34:06] <icinga-wm>	 PROBLEM - Host wikitech-static.wikimedia.org is DOWN: PING CRITICAL - Packet loss = 100%
[12:40:04] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.prepare_upgrade (exit_code=0) for cluster toolsbeta upgrade to 1.31.14 (T413796)
[12:40:10] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[12:41:29] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node toolsbeta-test-k8s-control-10 from 1.30.14 to 1.31.14 (T413796)
[12:49:54] <logmsgbot_cloud>	 taavi@cloudcumin1001 upgrade (PID 1086757) is awaiting input
[12:56:57] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node toolsbeta-test-k8s-control-10 from 1.30.14 to 1.31.14 (T413796)
[12:57:02] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[12:57:05] <wikibugs>	 (PS1) Majavah: toolforge: k8s: Fix check for first node to be upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223652
[12:57:06] <wikibugs>	 (PS1) Majavah: toolforge: k8s: Remind user about hostname being upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223653
[12:57:17] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node toolsbeta-test-k8s-control-11 from 1.30.14 to 1.31.14 (T413796)
[12:58:01] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=99) for node toolsbeta-test-k8s-control-11 from 1.30.14 to 1.31.14 (T413796)
[12:59:16] <wikibugs>	 (PS2) Majavah: toolforge: k8s: Fix check for first node to be upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223652
[12:59:16] <wikibugs>	 (PS2) Majavah: toolforge: k8s: Remind user about hostname being upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223653
[12:59:21] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node toolsbeta-test-k8s-control-11 from 1.30.14 to 1.31.14 (T413796)
[13:01:29] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=99) for node toolsbeta-test-k8s-control-11 from 1.30.14 to 1.31.14 (T413796)
[13:02:52] <wikibugs>	 (CR) CI reject: [V:-1] toolforge: k8s: Remind user about hostname being upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223653 (owner: Majavah)
[13:03:45] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade for node toolsbeta-test-k8s-control-12 from 1.30.14 to 1.31.14 (T413796)
[13:03:50] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[13:04:13] <wikibugs>	 (PS3) Majavah: toolforge: k8s: Remind user about hostname being upgraded [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223653
[13:05:56] <wikibugs>	 (PS1) Majavah: toolforge: k8s: Retry other errors when polling for version upgrade [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223656
[13:06:06] <wikibugs>	 cloud-services-team, Toolforge, Patch-For-Review: Maintain-dbusers is having sustained errors - https://phabricator.wikimedia.org/T413558#11495631 (Andrew) Open→Resolved
[13:08:31] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade (exit_code=0) for node toolsbeta-test-k8s-control-12 from 1.30.14 to 1.31.14 (T413796)
[13:09:17] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.restart_static_pods for toolsbeta-test-k8s-control-11 (T413796)
[13:09:22] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[13:11:54] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.restart_static_pods (exit_code=0) for toolsbeta-test-k8s-control-11 (T413796)
[13:12:04] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.restart_static_pods for toolsbeta-test-k8s-control-10 (T413796)
[13:14:40] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.restart_static_pods (exit_code=0) for toolsbeta-test-k8s-control-10 (T413796)
[13:14:45] <stashbot>	 T413796: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796
[13:17:02] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers for toolsbeta-test-k8s-worker-nfs-10, toolsbeta-test-k8s-worker-nfs-11, toolsbeta-test-k8s-worker-nfs-7, toolsbeta-test-k8s-worker-nfs-8, toolsbeta-test-k8s-worker-nfs-9, toolsbeta-test-k8s-worker-12, toolsbeta-test-k8s-worker-13
[13:19:12] <wikibugs>	 (PS1) Majavah: toolforge: k8s: worker: Fix cookbook path comments [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223657
[13:19:12] <wikibugs>	 (PS1) Majavah: toolforge: k8s: worker: Remove special handling for SGE bastion [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1223658
[13:24:27] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers (exit_code=0) for toolsbeta-test-k8s-worker-nfs-10, toolsbeta-test-k8s-worker-nfs-11, toolsbeta-test-k8s-worker-nfs-7, toolsbeta-test-k8s-worker-nfs-8, toolsbeta-test-k8s-worker-nfs-9, toolsbeta-test-k8s-worker-12, toolsbeta-test-k8s-worker-13
[13:25:30] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade_ingresses for toolsbeta-test-k8s-ingress-11, toolsbeta-test-k8s-ingress-12, toolsbeta-test-k8s-ingress-9
[13:28:05] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade_ingresses (exit_code=0) for toolsbeta-test-k8s-ingress-11, toolsbeta-test-k8s-ingress-12, toolsbeta-test-k8s-ingress-9
[13:28:46] <wikibugs>	 VPS-project-Codesearch, phan-taint-check-plugin: phan-taint-check-plugin / SecurityCheckPlugin is indexed by Codesearch twice - https://phabricator.wikimedia.org/T413879 (A_smart_kitten) NEW
[13:29:40] <wikibugs>	 (CR) A smart kitten: "I think this may have resulted in phan-taint-check-plugin being indexed twice by Codesearch (once under the `libs` group, once under the `" [labs/codesearch] - https://gerrit.wikimedia.org/r/965841 (owner: Gergő Tisza)
[13:33:00] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.worker.upgrade_bastions for toolsbeta-bastion-7.toolsbeta.eqiad1.wikimedia.cloud, toolsbeta-bastion-6.toolsbeta.eqiad1.wikimedia.cloud
[13:33:19] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade_bastions (exit_code=0) for toolsbeta-bastion-7.toolsbeta.eqiad1.wikimedia.cloud, toolsbeta-bastion-6.toolsbeta.eqiad1.wikimedia.cloud
[13:34:10] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.run_tests
[13:47:45] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[13:47:50] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[13:49:20] <logmsgbot_cloud>	 !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.run_tests (exit_code=0)
[13:49:45] <wikibugs>	 cloud-services-team, Toolforge: Upgrade toolsbeta cluster to Kubernetes 1.31 - https://phabricator.wikimedia.org/T413796#11495929 (taavi) Open→Resolved
[13:50:53] <wikibugs>	 cloud-services-team, Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.31 - https://phabricator.wikimedia.org/T372697#11495941 (taavi)
[13:52:58] <wmcs-alerts>	 FIRING: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[13:54:15] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=99)
[13:54:43] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.remove_k8s_etcd_node (T375217)
[13:54:48] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[13:59:22] <icinga-wm>	 PROBLEM - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/etcd/k8s - 490 bytes in 0.014 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[14:00:42] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.remove_k8s_etcd_node (exit_code=0)
[14:04:28] <wmcs-alerts>	 FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on tools-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[14:13:43] <wikibugs>	 cloud-services-team, Wikimedia-Mailing-lists: aborrero@wikimedia.org still subscribed to ops@lists.wikimedia.org - https://phabricator.wikimedia.org/T413883 (Andrew) NEW
[14:15:23] <wikibugs>	 cloud-services-team, Wikimedia-Mailing-lists: aborrero@wikimedia.org still subscribed to ops@lists.wikimedia.org - https://phabricator.wikimedia.org/T413883#11496106 (Ladsgroup) Mailman has automatic unsub if the emails bounce too many times I'd assume since ops@ is not sending that many emails or the bo...
[14:18:36] <wikibugs>	 (update) raymond-ndibe: images: resolve the image every time [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/251 (owner: dcaro)
[14:36:11] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 cloudvirt-canary START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate True, for hosts list: ['cloudvirtlocal1003']
[14:36:58] <wmcs-alerts>	 RESOLVED: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on tools-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[14:37:00] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 cloudvirt-canary END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on eqiad1, with recreate True, for hosts list: ['cloudvirtlocal1003']
[14:37:58] <wmcs-alerts>	 RESOLVED: JobsEmailerNoEmails: No emails sent in the last hour - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/JobsEmailerNoEmails  - https://prometheus-alerts.wmcloud.org/?q=alertname%3DJobsEmailerNoEmails
[14:39:44] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[14:39:48] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[14:53:25] <wikibugs>	 cloud-services-team, SRE, Wikimedia-Mailing-lists: aborrero@wikimedia.org still subscribed to ops@lists.wikimedia.org - https://phabricator.wikimedia.org/T413883#11496217 (Ladsgroup) I'm not seeing the email address in ops list. Maybe someone removed it in the mean time.
[14:56:29] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[14:57:34] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T375217)
[14:57:38] <stashbot>	 T375217: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217
[15:14:14] <logmsgbot_cloud>	 !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.add_k8s_etcd_node (exit_code=0)
[15:17:35] <wikibugs>	 wikitech.wikimedia.org, serviceops-radar, SRE, Patch-For-Review, SRE-Unowned: Redesign wikitech-static - https://phabricator.wikimedia.org/T376400#11496297 (taavi) >>! In T376400#11345384, @Andrew wrote: >> Sure, for example the first image on http://ec2-54-81-201-239.compute-1.amazonaws.com/...
[15:19:22] <icinga-wm>	 RECOVERY - toolschecker: All k8s etcd nodes are healthy on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 158 bytes in 0.366 second response time https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Toolschecker
[15:57:36] <wikibugs>	 cloud-services-team, PAWS, MediaWiki-extensions-OAuth, Wikidata, MediaWiki-Platform-Team (Radar): Add PAWS edit tag for Wikidata sitelink auto-change - https://phabricator.wikimedia.org/T263347#11496416 (Tgr)
[16:00:38] <wikibugs>	 cloud-services-team, Data-Services, MediaWiki-extensions-OAuth, Security-Team, and 2 others: (partially) expose oauth_registered_consumer table - https://phabricator.wikimedia.org/T247800#11496424 (Tgr)
[16:23:46] <wikibugs>	 cloud-services-team, Cloud-VPS: [tofu-infra] "tofu plan" failing in codfw - https://phabricator.wikimedia.org/T410265#11496572 (Andrew) >>! In T410265#11411496, @Andrew wrote: > This is probably unrelated, but  The fact that eqiad1 is now running all the same versions without issue makes it even less lik...
[16:40:21] <wikibugs>	 VPS-Projects, Security-Team, WM-Bot, SecTeam-Processed, and 2 others: https://wm-bot.wmcloud.org/github/index.php seems vulnerable to SQL injection - https://phabricator.wikimedia.org/T408876#11496688 (sbassett)
[16:50:50] <wikibugs>	 cloud-services-team, Cloud-VPS: Complete upgrading WMCS bare metal hosts to Trixie - https://phabricator.wikimedia.org/T375217#11496826 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin2002 for host cloudnet1006.eqiad.wmnet with OS trixie completed: - cloudnet1006 (**WARN*...
[17:22:10] <icinga-wm>	 PROBLEM - Memcached on cloudcontrol1007 is CRITICAL: connect to address 10.64.148.21 and port 11211: Connection refused https://wikitech.wikimedia.org/wiki/Memcached
[17:22:10] <icinga-wm>	 PROBLEM - Memcached on cloudcontrol1011 is CRITICAL: connect to address 10.64.151.8 and port 11211: Connection refused https://wikitech.wikimedia.org/wiki/Memcached
[17:23:10] <icinga-wm>	 RECOVERY - Memcached on cloudcontrol1007 is OK: TCP OK - 0.000 second response time on 10.64.148.21 port 11211 https://wikitech.wikimedia.org/wiki/Memcached
[17:23:10] <icinga-wm>	 RECOVERY - Memcached on cloudcontrol1011 is OK: TCP OK - 0.000 second response time on 10.64.151.8 port 11211 https://wikitech.wikimedia.org/wiki/Memcached
[17:34:47] <wikibugs>	 VPS-project-Codesearch, phan-taint-check-plugin: phan-taint-check-plugin / SecurityCheckPlugin is indexed by Codesearch twice - https://phabricator.wikimedia.org/T413879#11497211 (sbassett) > In October 2023, https://gerrit.wikimedia.org/r/c/labs/codesearch/+/965841 (2aff2405b99a) added the mediawiki/too...
[17:37:40] <wikibugs>	 (PS1) SBassett: Remove duplicated config for the SecurityCheckPlugin lib/tool [labs/codesearch] - https://gerrit.wikimedia.org/r/1223699 (https://phabricator.wikimedia.org/T413879)
[17:58:49] <wikibugs>	 (CR) Ladsgroup: Remove duplicated config for the SecurityCheckPlugin lib/tool (1 comment) [labs/codesearch] - https://gerrit.wikimedia.org/r/1223699 (https://phabricator.wikimedia.org/T413879) (owner: SBassett)
[18:11:42] <wikibugs>	 (CR) A smart kitten: Remove duplicated config for the SecurityCheckPlugin lib/tool (1 comment) [labs/codesearch] - https://gerrit.wikimedia.org/r/1223699 (https://phabricator.wikimedia.org/T413879) (owner: SBassett)
[19:17:21] <wikibugs>	 Cloud-VPS (Debian Bullseye Deprecation), Beta-Cluster-Infrastructure, Epic, Release-Engineering-Team (Priority Backlog 📥): Migrate deployment-prep away from Debian Bullseye to Bookworm/Trixie - https://phabricator.wikimedia.org/T401839#11497588 (bd808)
[19:45:13] <wikibugs>	 Tool-curator: Wait when lockmanager-fail-conflict error occurs - https://phabricator.wikimedia.org/T413915 (DaxServer) NEW
[19:45:43] <wikibugs>	 Tool-curator: Wait when lockmanager-fail-conflict error occurs - https://phabricator.wikimedia.org/T413915#11497701 (DaxServer) p:Triage→Medium
[20:15:19] <wikibugs>	 cloud-services-team, GitLab (CI & Job Runners), Release-Engineering-Team (Priority Backlog 📥): Recent incidents of buildkitd's storage volume filling up - https://phabricator.wikimedia.org/T395097#11497749 (dancy) In progress→Resolved @Andrew has since changed where he performing the buil...