[00:00:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[00:05:55] FIRING: MaxConntrack: Max conntrack at 82.51% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:20:08] (PS1) Jacob4code: Create a root package.json file to run both frontend and server [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1164323
[00:30:55] RESOLVED: MaxConntrack: Max conntrack at 80.4% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:31:25] FIRING: MaxConntrack: Max conntrack at 83.41% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:56:25] RESOLVED: MaxConntrack: Max conntrack at 81.32% on cloudvirt1067:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[01:08:08] cloud-services-team, Cloud-VPS, Patch-For-Review: OpenStack services should use system users to talk to Keystone - https://phabricator.wikimedia.org/T273150#10952623 (Andrew)
[02:50:04] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-46 has at least 12 procs in D state, and may be having NFS/IO issues -
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[04:00:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[06:08:57] (open) mdsshakil: Draft: Main2 [toolforge-repos/wikimediamonitorbot] - https://gitlab.wikimedia.org/toolforge-repos/wikimediamonitorbot/-/merge_requests/1
[07:01:28] FIRING: PuppetStaleCertificates: Found non-revoked Puppet certificates for 1 deleted instances on gitlab-runners-puppetserver-01 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[07:38:41] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[08:00:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[08:45:15] Toolforge (Toolforge iteration 21): lima-kilo failing on macos - https://phabricator.wikimedia.org/T398016 (Raymond_Ndibe) NEW
[08:45:46] Toolforge (Toolforge iteration 21): lima-kilo failing on macos - https://phabricator.wikimedia.org/T398016#10953208 (Raymond_Ndibe)
[08:46:05] Toolforge (Toolforge iteration 21): lima-kilo failing on macos -
https://phabricator.wikimedia.org/T398016#10953209 (Raymond_Ndibe) a:Raymond_Ndibe
[09:07:27] cloud-services-team, Acme-chief, IPv6, Patch-For-Review: tools-acme-chief-01 is attempting to validate DNS challenge against cloud authdns IPv6 addresses - https://phabricator.wikimedia.org/T245937#10953250 (taavi) Open→Invalid I think this is obsolete now, the `authdns_servers:` sett...
[09:30:21] cloud-services-team, Cloud-VPS: [tofu-cloudvps] Manage project puppet classes and hiera - https://phabricator.wikimedia.org/T397994#10953321 (taavi) The ENC API represents project Puppet data as a prefix where the prefix is an empty string.
[09:38:10] (approved) fnegri: bash-completion: Add file system recognition to autocomplete [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/46 (https://phabricator.wikimedia.org/T395077) (owner: chuckonwumelu)
[09:41:44] cloud-services-team, Toolforge (Toolforge iteration 21): [components-api] allow stopping a deployment that's running - https://phabricator.wikimedia.org/T388644#10953350 (dcaro)
[09:42:16] cloud-services-team, Toolforge (Toolforge iteration 21): [components-api] allow stopping a deployment that's running - https://phabricator.wikimedia.org/T388644#10953352 (dcaro) →Duplicate dup:T395039
[09:42:18] Toolforge (Toolforge iteration 21), Patch-For-Review: [components-api,components-cli] add `deploy cancel` feature - https://phabricator.wikimedia.org/T395039#10953354 (dcaro)
[09:43:13] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 21), Cloud-Services-Origin-Team, Cloud-Services-Worktype-Project, and 2 others: [Hypothesis] WE6.3.10 start a beta for the push-to-deploy features - https://phabricator.wikimedia.org/T393564#10953361 (dcaro)
[11:38:56] FIRING: CloudVPSDesignateLeaks: Detected 2 stray dns records -
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[11:52:18] (PS3) Kamila Součková: Add fake hcaptcha proxy secrets. [labs/private] - https://gerrit.wikimedia.org/r/1155221 (https://phabricator.wikimedia.org/T381265)
[12:00:03] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:08:58] (update) taavi: logging: loki: Add network policy rule for object storage access [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/827 (https://phabricator.wikimedia.org/T386480)
[12:09:11] (update) taavi: logging: loki: Add second Loki instance for infrastructure logs [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/834 (https://phabricator.wikimedia.org/T386480 https://phabricator.wikimedia.org/T97861)
[12:09:17] (update) taavi: logging: alloy: Add routing for infrastructure logs [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/835 (https://phabricator.wikimedia.org/T386480 https://phabricator.wikimedia.org/T97861)
[12:09:23] (update) taavi: logging: alloy: Allow running on the entire cluster [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/836 (https://phabricator.wikimedia.org/T97861)
[12:55:34] cloud-services-team, Toolforge: Lock down tools-sgebastion-10 (login-buster.toolforge.org) to only members of tools with known dependencies on it
- https://phabricator.wikimedia.org/T397459#10953965 (-jem-) Thanks, @bd808. Months ago, I had two or three conversations about my situation in IRC, but I...
[14:02:18] cloud-services-team, Infrastructure-Foundations, SRE-tools: sre.hosts.decommission often leaves dangling things in netbox - https://phabricator.wikimedia.org/T398052 (Andrew) NEW
[14:02:39] cloud-services-team, Infrastructure-Foundations, SRE-tools: sre.hosts.decommission often leaves dangling things in netbox - https://phabricator.wikimedia.org/T398052#10954206 (Andrew)
[14:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:17:47] cloud-services-team, Data-Services, Privacy Engineering: Make querycache, querycachetwo and querycache_info tables visible on Wiki Replicas - https://phabricator.wikimedia.org/T65782#10954530 (SD0001) I think these need signoff from Privacy instead of Legal nowadays.
[15:21:47] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudcontrol2004-dev.codfw.wmnet - https://phabricator.wikimedia.org/T396396#10954534 (Jhancock.wm) no, sorry, still getting the same error
[15:22:17] (PS3) Majavah: Remove support for SUL 'realname' field [labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:23:12] (CR) Majavah: [C:+2] "PS3 fixes a comment that was missed and wraps the commit message at 72 chars. Thank you!"
[labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:24:37] (CR) CI reject: [V:-1] Remove support for SUL 'realname' field [labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:28:23] (PS4) Majavah: Remove support for SUL 'realname' field [labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:28:29] (CR) Majavah: [C:+2] Remove support for SUL 'realname' field [labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:29:02] cloud-services-team, Striker, Documentation: Document how to run Striker migrations - https://phabricator.wikimedia.org/T398062 (taavi) NEW p:Triage→High
[15:31:08] (Merged) jenkins-bot: Remove support for SUL 'realname' field [labs/striker] - https://gerrit.wikimedia.org/r/1163331 (https://phabricator.wikimedia.org/T384206) (owner: Arendpieter)
[15:33:26] (Abandoned) Majavah: dev(docker): Add wmf-user custom LDAP schema [labs/striker] - https://gerrit.wikimedia.org/r/1076814 (https://phabricator.wikimedia.org/T148048) (owner: Majavah)
[15:33:38] (PS2) Majavah: labsauth: Write SUL account details to LDAP on registration [labs/striker] - https://gerrit.wikimedia.org/r/1076815 (https://phabricator.wikimedia.org/T148048)
[15:33:38] (PS2) Majavah: labsauth: Write SUL details to LDAP when updating linkage [labs/striker] - https://gerrit.wikimedia.org/r/1076816 (https://phabricator.wikimedia.org/T148048)
[15:37:18] cloud-services-team, Toolforge: Disable tools.maintain-harbor - https://phabricator.wikimedia.org/T397933#10954587 (taavi) p:Triage→Low
[16:00:04] FIRING: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 -
https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:10:06] cloud-services-team, Cloud-VPS: [tofu-cloudvps] Manage project puppet classes and hiera - https://phabricator.wikimedia.org/T397994#10954730 (bd808) >>! In T397994#10953321, @taavi wrote: > The ENC API represents project Puppet data as a prefix where the prefix is an empty string. Ah! So this is mostly...
[16:25:50] cloud-services-team, Cloud-VPS: TF irc alert on failure - https://phabricator.wikimedia.org/T340676#10954834 (taavi) Open→Resolved
[16:27:06] cloud-services-team, Cloud-VPS: cloudcontrol: puppet: mariadb package fails to install cleanly - https://phabricator.wikimedia.org/T337752#10954838 (taavi) Open→Resolved Not a problem since we moved back to upstream Debian packages for MariaDB.
[16:29:05] cloud-services-team, Cloud-VPS: Consider replacing our spreadcheck alerts with Server Groups Anti-Affinity policies - https://phabricator.wikimedia.org/T247213#10954843 (taavi) Open→Resolved a:taavi
[16:31:39] cloud-services-team, Cloud-VPS, Patch-Needs-Improvement, Puppet: role::puppetmaster::standalone clones Git repositories as gitpuppet, git-sync-upstream overwrites them as root - https://phabricator.wikimedia.org/T152059#10954854 (taavi) Open→Resolved I believe this was fixed with the...
[16:44:49] RESOLVED: PuppetZeroResources: Puppet has failed generate resources on cloudvirt2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[16:59:43] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudcontrol2004-dev.codfw.wmnet - https://phabricator.wikimedia.org/T396396#10954940 (cmooney) Folks you need to delete the interfaces on the box to get around this.
I've done that now, Jenn hopefully will work...
[17:13:58] (PS4) Kamila Součková: Add fake hcaptcha proxy secrets. [labs/private] - https://gerrit.wikimedia.org/r/1155221 (https://phabricator.wikimedia.org/T397841)
[17:28:57] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudcontrol2004-dev.codfw.wmnet - https://phabricator.wikimedia.org/T396396#10954997 (Jhancock.wm) Open→Resolved
[17:58:21] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-46
[18:00:42] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, SRE: decommission cloudcontrol2004-dev.codfw.wmnet - https://phabricator.wikimedia.org/T396396#10955048 (Andrew) >>! In T396396#10954940, @cmooney wrote: > Folks you need to delete the interfaces on the box to get around this....
[18:05:28] FIRING: InstanceDown: Project tools instance tools-k8s-worker-nfs-14 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[18:10:28] RESOLVED: InstanceDown: Project tools instance tools-k8s-worker-nfs-14 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[18:12:54] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-46
[18:15:13] cloud-services-team, Cloud-VPS, Bitu, Infrastructure-Foundations: developer service acounts and email - https://phabricator.wikimedia.org/T398074 (Andrew) NEW
[18:15:37] cloud-services-team, Cloud-VPS, Bitu, Infrastructure-Foundations: developer service acounts and email - https://phabricator.wikimedia.org/T398074#10955065 (Andrew)
[18:16:12] cloud-services-team, Cloud-VPS, Bitu, Infrastructure-Foundations: developer service acounts and email - https://phabricator.wikimedia.org/T398074#10955068 (Andrew) Also I'll be using option 1 (just edit ldap) until someone tells me not to!
[18:28:17] cloud-services-team, Cloud-VPS, Bitu, Infrastructure-Foundations: developer service accounts and email - https://phabricator.wikimedia.org/T398074#10955085 (Aklapper)
[18:35:05] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-46 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[18:36:22] Data-Services, Data-Engineering, Data-Engineering-Radar, DBA, Privacy Engineering: Create views for SecurePoll db tables in Toolforge replicas - https://phabricator.wikimedia.org/T381197#10955112 (SD0001)
[18:41:01] Data-Services, Data-Engineering, Data-Engineering-Radar, DBA, Privacy Engineering: Create views for SecurePoll db tables on Wiki Replicas - https://phabricator.wikimedia.org/T381197#10955116 (taavi)
[19:34:23] Tool-paulina: Authors page info displays wrong public domain message - https://phabricator.wikimedia.org/T397241#10955223 (Pepe_piton) Open→Resolved Done! I just added an if statement in author.html file. See lines 39 to 41 in [[ https://gitlab.wikimedia.org/toolforge-repos/paulina/-/blob/a072d...
[19:34:40] cloud-services-team, DC-Ops, decommission-hardware, ops-codfw, and 2 others: decommission cloudcephosd200[12]-dev.codfw.wmnet - https://phabricator.wikimedia.org/T397968#10955226 (Andrew)
[19:55:19] (PS1) Andrew Bogott: Add dummy ldap passwords for placement service user [labs/private] - https://gerrit.wikimedia.org/r/1164499 (https://phabricator.wikimedia.org/T273150)
[20:01:14] (CR) Andrew Bogott: [V:+2 C:+2] Add dummy ldap passwords for placement service user [labs/private] - https://gerrit.wikimedia.org/r/1164499 (https://phabricator.wikimedia.org/T273150) (owner: Andrew Bogott)
[20:44:41] cloud-services-team, Cloud-VPS, Patch-For-Review: OpenStack services should use system users to talk to Keystone - https://phabricator.wikimedia.org/T273150#10955311 (Andrew)
[20:51:14] (PS1) Andrew Bogott: Add dummy ldap passwords for trove service user [labs/private] - https://gerrit.wikimedia.org/r/1164503 (https://phabricator.wikimedia.org/T273150)
[20:59:32] (Abandoned) Andrew Bogott: Add dummy ldap passwords for trove service user [labs/private] - https://gerrit.wikimedia.org/r/1164503 (https://phabricator.wikimedia.org/T273150) (owner: Andrew Bogott)
[21:07:50] (PS1) Andrew Bogott: Fix misnamed fake password [labs/private] - https://gerrit.wikimedia.org/r/1164508
[21:08:12] (CR) Andrew Bogott: [V:+2 C:+2] Fix misnamed fake password [labs/private] - https://gerrit.wikimedia.org/r/1164508 (owner: Andrew Bogott)
[21:17:12] cloud-services-team, Cloud-VPS, Patch-For-Review: OpenStack services should use system users to talk to Keystone - https://phabricator.wikimedia.org/T273150#10955404 (Andrew)
[22:12:12] PAWS: [bug] - https://phabricator.wikimedia.org/T398087 (Kanbot777) NEW
[23:17:58] cloud-services-team, Cloud-VPS: [tofu-cloudvps] Manage project puppet classes and hiera - https://phabricator.wikimedia.org/T397994#10955587 (bd808) The API level magic is to pass `_` as the
prefix name. The `_preprocess_prefix(prefix)` function in puppet-enc.py turns that into the empty prefix that appl...