[00:06:21] (PS2) Krinkle: write_config: clarify "skipping restart" log message [labs/codesearch] - https://gerrit.wikimedia.org/r/1260063 (https://phabricator.wikimedia.org/T421147)
[00:06:22] (PS1) Krinkle: write_config: Fix incomplete wmf_gitlab_group_projects paging [labs/codesearch] - https://gerrit.wikimedia.org/r/1267285
[00:12:17] cloud-services-team, Toolforge: Building/Running dotnet job fails on Toolforge - https://phabricator.wikimedia.org/T422224 (Hawkeye7) NEW
[00:12:51] (PS2) Krinkle: write_config: Fix incomplete wmf_gitlab_group_projects paging [labs/codesearch] - https://gerrit.wikimedia.org/r/1267285 (https://phabricator.wikimedia.org/T421147)
[00:12:56] (CR) Krinkle: [C:+2] write_config: Fix incomplete wmf_gitlab_group_projects paging [labs/codesearch] - https://gerrit.wikimedia.org/r/1267285 (https://phabricator.wikimedia.org/T421147) (owner: Krinkle)
[00:13:00] (PS3) Krinkle: write_config: clarify "skipping restart" log message [labs/codesearch] - https://gerrit.wikimedia.org/r/1260063 (https://phabricator.wikimedia.org/T421147)
[00:13:03] (CR) Krinkle: [C:+2] write_config: clarify "skipping restart" log message [labs/codesearch] - https://gerrit.wikimedia.org/r/1260063 (https://phabricator.wikimedia.org/T421147) (owner: Krinkle)
[00:14:02] (Merged) jenkins-bot: write_config: Fix incomplete wmf_gitlab_group_projects paging [labs/codesearch] - https://gerrit.wikimedia.org/r/1267285 (https://phabricator.wikimedia.org/T421147) (owner: Krinkle)
[00:14:12] (Merged) jenkins-bot: write_config: clarify "skipping restart" log message [labs/codesearch] - https://gerrit.wikimedia.org/r/1260063 (https://phabricator.wikimedia.org/T421147) (owner: Krinkle)
[00:15:46] cloud-services-team, Toolforge: Building/Running dotnet job fails on Toolforge - https://phabricator.wikimedia.org/T422224#11784762 (JJMC89)
[00:26:42] VPS-project-Codesearch, Patch-For-Review: Codesearch stuck at Feb 12th? - https://phabricator.wikimedia.org/T421147#11784788 (Krinkle) The status is generally "up" and the nightly restart/reindex appears to finish without errors. ` krinkle@codesearch9:~$ sudo journalctl -u codesearch-write-config -n1000...
[00:28:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[00:37:19] VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11784800 (Dzahn) >>! In T388022#11784516, @dancy wrote: > Interesting. What host did you run `scap` on? What does `scap...
[00:38:22] (CR) MusikAnimal: [C:+2] eslint: add spaces inside parentheses and brackets [labs/xtools] - https://gerrit.wikimedia.org/r/1260608 (https://phabricator.wikimedia.org/T392531) (owner: Novem Linguae)
[00:39:11] (Merged) jenkins-bot: eslint: add spaces inside parentheses and brackets [labs/xtools] - https://gerrit.wikimedia.org/r/1260608 (https://phabricator.wikimedia.org/T392531) (owner: Novem Linguae)
[00:40:03] VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11784803 (Dzahn) >>! In T388022#11784455, @A_smart_kitten wrote: > And..... it sent me an email for it! Wow! Cool. Someh...
[00:46:59] VPS-project-Codesearch: Codesearch stuck at Feb 12th? - https://phabricator.wikimedia.org/T421147#11784814 (Krinkle) a:Krinkle There's quite a few. It's currently 2AM in UTC and the daily restarts are done (all backends up), which means there shouldn't be any git repo in a locked state right now. And yet...
[00:48:33] VPS-project-Codesearch: Codesearch stuck at Feb 12th? - https://phabricator.wikimedia.org/T421147#11784819 (Krinkle) ` $ sudo rm hound-deployed/data/*/.git/shallow.lock` ` ` krinkle@codesearch9:/srv/hound$ sudo systemctl status hound-deployed * hound-deployed.service - hound-deployed Loaded: loaded (/l...
[01:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[01:21:17] VPS-project-Codesearch: Codesearch stuck at Feb 12th? - https://phabricator.wikimedia.org/T421147#11784871 (Krinkle) Open→Resolved Re-indexing of `hound-deployed` backend finished. The example at T421147#11745763 now returns fresh results. I've gone ahead and deleted the git locks in other backe...
[01:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[01:47:15] (PS1) Krinkle: hound: Upgrade Hound from v0.4.0 to v0.7.1 [labs/codesearch] - https://gerrit.wikimedia.org/r/1267343
[02:22:38] (CR) Krinkle: "check php" [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255901 (owner: Krinkle)
[02:44:07] (PS1) Krinkle: build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228)
[02:44:40] (CR) CI reject: [V:-1] build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228) (owner: Krinkle)
[02:46:11] (PS2) Krinkle: build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228)
[02:46:47] (CR) CI reject: [V:-1] build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228) (owner: Krinkle)
[02:47:21] (PS3) Krinkle: build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228)
[02:48:01] (CR) Krinkle: [C:+2] build: Prepare for PHP 8.5 [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255901 (owner: Krinkle)
[02:48:38] (Merged) jenkins-bot: build: Prepare for PHP 8.5 [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255901 (owner: Krinkle)
[03:16:56] FIRING: SystemdUnitDown: The service unit opentofu-infra-diff.service is in failed status on host cloudcontrol1007. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[03:19:06] (CR) Krinkle: "recheck" [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255902 (owner: Krinkle)
[03:19:43] (CR) Krinkle: [C:+2] config: Change key to targetpage and sort config to ease review [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255902 (owner: Krinkle)
[03:20:17] (Merged) jenkins-bot: config: Change key to targetpage and sort config to ease review [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1255902 (owner: Krinkle)
[05:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[05:11:56] FIRING: SystemdUnitDown: The systemd unit opentofu-infra-diff.service on node cloudcontrol1007 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[05:19:10] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[05:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[05:51:55] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of memory - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity
[06:44:49] (CR) MusikAnimal: [C:+2] build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228) (owner: Krinkle)
[06:45:42] (Merged) jenkins-bot: build: Enable eslint in `npm test` and make pass [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228) (owner: Krinkle)
[07:05:58] (CR) Novem Linguae: build: Enable eslint in `npm test` and make pass (1 comment) [labs/xtools] - https://gerrit.wikimedia.org/r/1267406 (https://phabricator.wikimedia.org/T422228) (owner: Krinkle)
[07:06:50] (PS4) Novem Linguae: eslint: autofix several rules [labs/xtools] - https://gerrit.wikimedia.org/r/1260616 (https://phabricator.wikimedia.org/T392531)
[07:54:33] VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11785098 (A_smart_kitten) >>! In T388022#11784803, @Dzahn wrote: > This is great to hear! Sounds resolved, then. Hopefu...
[09:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:12:11] FIRING: SystemdUnitDown: The systemd unit opentofu-infra-diff.service on node cloudcontrol1007 has been failing for more than two hours. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1007 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[09:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[09:38:41] FIRING: CloudVPSDesignateLeaks: Detected 6 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:04:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[11:25:36] Tool-quickcategories, MediaWiki-Action-API, Notifications (Echo), Traffic: Notifications API is returning a permissions error since 2026-04-01 for a bot account - https://phabricator.wikimedia.org/T421991#11785419 (LucasWerkmeister) This change also broke #tool-quickcategories: > mwapi.errors.AP...
[11:34:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[13:38:56] FIRING: CloudVPSDesignateLeaks: Detected 20 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 22 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:51:44] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[14:56:44] RESOLVED: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors
[15:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[15:12:49] VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11785747 (dancy) >>! In T388022#11783814, @Dzahn wrote: > I tried to do the scap deploy: > > ` > debug1: Server host key...
[15:16:56] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure: Puppet fail to create volume group for ephemeral disk space when it is sda (instead of sdb) - https://phabricator.wikimedia.org/T422258 (hashar) NEW
[15:18:41] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure: Puppet fail to create volume group for ephemeral disk space when it is sda (instead of sdb) - https://phabricator.wikimedia.org/T422258#11785765 (hashar)
[15:35:17] RESOLVED: PrometheusRestarted: Prometheus instance tools-prometheus-8:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[15:48:33] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure: Puppet fail to create volume group for ephemeral disk space when it is sda (instead of sdb) - https://phabricator.wikimedia.org/T422258#11785835 (hashar)
[15:55:05] cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure: Puppet fail to create volume group for ephemeral disk space when it is sda (instead of sdb) - https://phabricator.wikimedia.org/T422258#11785850 (hashar) The workaround is to manually run: ` sudo /usr/local/sbin/make-inst...
[16:20:51] VPS-project-devtools, Release-Engineering-Team, Scap: Upgrade ancient version of scap running on deploy-1006.devtools.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T422257#11785937 (A_smart_kitten)
[16:29:31] VPS-project-Phabricator, collaboration-services, Phabricator: Phabricator test project requires email verification but can't send email - https://phabricator.wikimedia.org/T388022#11785951 (Dzahn) Ah, thank you very much, @dancy :)
[16:58:03] Tool-quickcategories, MediaWiki-Action-API, Notifications (Echo), Traffic: Notifications API is returning a permissions error since 2026-04-01 for a bot account - https://phabricator.wikimedia.org/T421991#11786041 (matmarex)
[16:59:18] VPS-project-devtools, collaboration-services: Cleanup collaboration-services WMCS hiera config - https://phabricator.wikimedia.org/T390948#11786046 (A_smart_kitten) >>! In T390948#10993840, @Dzahn wrote: > Getting back to the origin of this ticket.. I think we need to first decide what we WANT to use. D...
[17:05:17] FIRING: PrometheusRestarted: Prometheus instance tools-prometheus-9:9902 restarted - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_was_restarted - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPrometheusRestarted
[17:11:40] (open) lucaswerkmeister: Add placeholder= attributes to