[00:53:55] cloud-services-team, wikitech.wikimedia.org, Infrastructure-Foundations, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#10223135 (GTrang) Should we re-enable account creation on Wikitech or not? Account creation was disabled in https://gerrit.wikimedia.org/r/c/operati...
[01:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[02:27:07] (update) raymond-ndibe: [lima-kilo] refactor the project to suit a multi VM configuration [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/198
[02:28:35] (update) raymond-ndibe: [lima-kilo] cache container images [repos/cloud/toolforge/lima-kilo] (refactor_in_preparation_for_cache) - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/196
[03:36:34] FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 4.963% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[05:19:22] Tool-ldap, Phabricator, Patch-For-Review: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10223261 (matmarex) Hmm, they’re not even 404s, but 500s for me. Both the space and the ‘ń’ seem to cause that.
[05:19:59] Tool-ldap, Phabricator, Patch-For-Review: https://ldap.toolforge.org/ integration assumes that `cn` and `uid` are equivalent - https://phabricator.wikimedia.org/T376769#10223262 (Pppery) Yeah, they're 500s for me too. I saw an error and made a bad assumption.
[05:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:35:59] cloud-services-team, Horizon: Keystone auth endpoint should use a standard HTTPS port - https://phabricator.wikimedia.org/T377055 (taavi) NEW
[06:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[07:36:34] FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 4.624% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[09:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:50:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[10:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:07:51] PROBLEM - Disk space on cloudbackup1004 is CRITICAL: DISK CRITICAL - free space: /srv 649590MiB (3% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=cloudbackup1004&var-datasource=eqiad+prometheus/ops
[11:36:34] FIRING: DiskSpace: Disk space cloudbackup1004:9100:/srv 3.847% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:45:30] cloud-services-team, Horizon: Keystone auth endpoint should use a standard HTTPS port - https://phabricator.wikimedia.org/T377055#10223421 (taavi)
[13:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[14:27:21] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops: keepalived: it doesn't support mixing IPv4 and IPv6 VIPs on the same VRRP instance - https://phabricator.wikimedia.org/T376879#10223483 (Multichill) Happy to hear it works now!
[15:03:04] cloud-services-team, wikitech.wikimedia.org, Infrastructure-Foundations, Epic: Make Wikitech an SUL wiki - https://phabricator.wikimedia.org/T161859#10223491 (GTrang) Instead, your Wikitech account will become "married" to your SUL account (along with other wiki accounts, so this is actually "pol...
[15:26:34] RESOLVED: DiskSpace: Disk space cloudbackup1004:9100:/srv 5.978% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1004 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[15:47:51] RECOVERY - Disk space on cloudbackup1004 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=cloudbackup1004&var-datasource=eqiad+prometheus/ops
[16:06:35] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: cloudgw: add support and enable IPv6 - https://phabricator.wikimedia.org/T374716#10223526 (cmooney) FWIW I was curious about the setting so I [[ https://github.com/topranks/homerlabs/tree/main/labs/v6_sysctl | labbed thi...
[17:17:44] Tools: Update welcome message in Zulip's goodbot - https://phabricator.wikimedia.org/T310826#10223572 (Pppery) >>! In T310826#10137768, @debt wrote: > Hi! I recently found out about the goodbot for Zulip...and it needs some updating! Is there someone that can help or direct me as to how to update the wording...
[17:21:27] FIRING: CloudVPSDesignateLeaks: Detected 5 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[18:35:51] Quarry: [bug] Quarry queries are stopped - https://phabricator.wikimedia.org/T377010#10223581 (Prototyperspective) Please prevent queries from getting stopped. One went through but the other still gets stopped all the time.
[21:21:27] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:07:19] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudweb2002-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[22:47:33] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate mwv-builder-03.mediawiki-vagrant.eqiad.wmflabs is about to expire in 12d 23h 58m 34s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire