[00:01:22] (PS1) Andrew Bogott: Revert "WMFHACK: don't compress template_cache_preloads" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1025486
[00:01:35] (PS1) Andrew Bogott: Revert "WMFHACK: don't compress template_cache_preloads" [openstack/horizon/horizon] - https://gerrit.wikimedia.org/r/1025487
[00:01:53] (CR) Andrew Bogott: [V:+2 C:+2] Revert "WMFHACK: don't compress template_cache_preloads" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1025486 (owner: Andrew Bogott)
[00:02:03] (CR) Andrew Bogott: [V:+2 C:+2] Revert "WMFHACK: don't compress template_cache_preloads" [openstack/horizon/horizon] - https://gerrit.wikimedia.org/r/1025487 (owner: Andrew Bogott)
[00:14:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[00:24:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[00:35:40] (PS1) Andrew Bogott: Revert "Revert "WMFHACK: don't compress template_cache_preloads"" [openstack/horizon/horizon] - https://gerrit.wikimedia.org/r/1025491
[00:36:00] (CR) Andrew Bogott: [V:+2 C:+2] Revert "Revert "WMFHACK: don't compress template_cache_preloads"" [openstack/horizon/horizon] - https://gerrit.wikimedia.org/r/1025491 (owner: Andrew Bogott)
[00:36:30] (PS1) Andrew Bogott: Revert "Revert "WMFHACK: don't compress template_cache_preloads"" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1025492
[00:36:46] (CR) Andrew Bogott: [V:+2 C:+2] Revert "Revert "WMFHACK: don't compress template_cache_preloads"" [openstack/horizon/horizon] (2024.1) - https://gerrit.wikimedia.org/r/1025492 (owner: Andrew Bogott)
[01:33:00] (PS1) AntiCompositeNumber: SULWatcher: fix typo [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1025497
[01:37:22] (CR) AntiCompositeNumber: [C:+2] SULWatcher: fix typo [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1025497 (owner: AntiCompositeNumber)
[01:37:55] (Merged) jenkins-bot: SULWatcher: fix typo [labs/tools/stewardbots] - https://gerrit.wikimedia.org/r/1025497 (owner: AntiCompositeNumber)
[01:42:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:52:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:44:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:54:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:14:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:24:41] (CloudVPSDesignateLeaks) resolved: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[05:44:16] Tools, Community-Wishlist-Survey-2023, Wikimedia Wishathon: Investigate Dabfix tool implementation - https://phabricator.wikimedia.org/T336545#9756317 (Soda) There should be a working prototype at https://dabfix.toolforge.org :)
[07:47:41] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [toolforge] Redis refusing connections - https://phabricator.wikimedia.org/T363709#9756530 (dcaro)
[07:48:17] cloud-services-team, Toolforge (Toolforge iteration 09): lima-kilo: container image caching - https://phabricator.wikimedia.org/T362967#9756533 (dcaro)
[07:48:40] cloud-services-team, Toolforge (Toolforge iteration 09), Patch-For-Review: review pod templates for stricter security - https://phabricator.wikimedia.org/T362050#9756539 (dcaro)
[07:48:50] cloud-services-team, Toolforge (Toolforge iteration 09): toolforge lima-kilo: refresh maintain-kubeusers test data - https://phabricator.wikimedia.org/T363482#9756535 (dcaro)
[07:48:57] cloud-services-team, Toolforge (Toolforge iteration 09), Patch-For-Review: Toolforge: Introduce grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665#9756545 (dcaro) I'm leaving this open until we remove the old bastions, I'll rephrase the title
[07:49:12] cloud-services-team, Toolforge (Toolforge iteration 09), Patch-For-Review: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665#9756546 (dcaro)
[07:51:22] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800 (dcaro) NEW
[07:52:01] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9756558 (dcaro) Open→In progress p:Triage→High a:dcaro
[08:29:47] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9756644 (dcaro) It seems that inside the controller container apparmor is missing it's filesystem: ` dcaro@lima-kilo:/home/dcaro/Work/wikimedia/lima-kilo$ docker...
[08:41:08] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9756685 (dcaro) Still fails, but in a different way: ` │ - image: docker-registry.tools.wmflabs.org/toolforge-tektoncd-pipeline-cmd-controller:v0.33.2...
[09:04:13] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9756723 (dcaro) Deleting the apparmor entries on the tekton psp allows the containers to run. We still have seccomp entries, and that's what we seem to use in ot...
[09:06:34] Toolforge (Toolforge iteration 09), Patch-For-Review: [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9756724 (CodeReviewBot) dcaro opened https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/43 psp: remove appa...
[09:12:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:17:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:22:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:27:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[09:43:51] Toolforge (Toolforge iteration 09): [infra] NFS hangs in some workers until the worker is rebooted - https://phabricator.wikimedia.org/T362690#9756855 (dcaro) It did not happen again, will open a new task and continue the debugging if the issue happens anew.
[09:46:18] Toolforge (Toolforge iteration 09): [infra] NFS hangs in some workers until the worker is rebooted - https://phabricator.wikimedia.org/T362690#9756857 (dcaro) In progress→Resolved
[09:46:32] Toolforge (Toolforge iteration 09): I can't connect to Toolforge DB replicas from my PC using MySQL Workbench - https://phabricator.wikimedia.org/T360839#9756859 (dcaro) In progress→Stalled
[10:02:21] Toolforge: [builds-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363808 (dcaro) NEW
[10:03:25] Toolforge: [envvars-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363809 (dcaro) NEW
[10:03:48] Toolforge: [jobs-apii] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363346#9756958 (dcaro)
[10:07:42] Toolforge: [envvars-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363809#9756976 (dcaro) p:Triage→High
[10:07:43] Toolforge: [builds-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363808#9756977 (dcaro) p:Triage→High
[10:07:59] Toolforge: [envvars-api] Prefix all endpoints with `/tool/` - https://phabricator.wikimedia.org/T363809#9756981 (dcaro) a:dcaro→None
[10:25:34] Toolforge (Toolforge iteration 09), Patch-For-Review: [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9757045 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/43 psp: remove appa...
[10:26:57] Toolforge (Toolforge iteration 09), Patch-For-Review: [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9757050 (CodeReviewBot) project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforg...
[10:29:57] cloud-services-team, Data-Platform-SRE (2024.04.15 - 2024.05.05): Review and fix any bugs found in the automated bootstrap process for a ceph mon/mgr server - https://phabricator.wikimedia.org/T332987#9757064 (BTullis) p:Low→Medium a:BTullis I am going to work on this ticket at the moment, si...
[10:47:58] Toolforge (Toolforge iteration 09), Patch-For-Review: [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9757118 (aborrero) I think kind in particular has some issues working with appArmor. There is a reference in the doc to just disable it: htt...
[10:52:22] !log dcaro@cloudcumin1001 toolsbeta START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder
[10:52:40] !log dcaro@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder
[10:55:53] !log dcaro@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder
[10:56:13] !log dcaro@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder
[10:58:24] Toolforge (Toolforge iteration 09), Patch-For-Review: [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9757164 (CodeReviewBot) dcaro merged https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/268 builds-builde...
[11:16:59] (PS1) Btullis: Add dummy keydata for a wildcard ceph monitor [labs/private] - https://gerrit.wikimedia.org/r/1025728 (https://phabricator.wikimedia.org/T332987)
[11:25:10] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent enforcement model - https://phabricator.wikimedia.org/T362872#9757243 (aborrero) >>! In T362872#9752551, @dcaro wrote: > The decision about commiting to drop the extra component on the upgrade to k8s 1...
[11:27:25] (CR) Btullis: [V:+2 C:+2] Add dummy keydata for a wildcard ceph monitor [labs/private] - https://gerrit.wikimedia.org/r/1025728 (https://phabricator.wikimedia.org/T332987) (owner: Btullis)
[11:37:51] cloud-services-team, Data-Platform-SRE (2024.04.15 - 2024.05.05), Patch-For-Review: Review and fix any bugs found in the automated bootstrap process for a ceph mon/mgr server - https://phabricator.wikimedia.org/T332987#9757312 (ops-monitoring-bot) Host rebooted by btullis@cumin1002 with reason: Troub...
[12:14:12] Toolforge-standards-committee: Adoption request for Yapperbot - https://phabricator.wikimedia.org/T361426#9757455 (Soda) @taavi Would it be possible to unprotect the yml files in the yapper bot directory ? Based on a cursory reading of the code, they should not contain any account secrets in them, but rather...
[12:36:50] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent enforcement model - https://phabricator.wikimedia.org/T362872#9757536 (dcaro) >>! In T362872#9757243, @aborrero wrote: >>>! In T362872#9752551, @dcaro wrote: >> The decision about commiting to drop the...
[12:38:52] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9757539 (dcaro) The decision about commiting to drop the extra component on the upgrade to k8s 1.26 might become way more relevant with {T363683}, if we deci...
[12:40:32] Toolforge (Toolforge iteration 09): [builds-builder,lima-kilo] tekton stopped working on default setup - https://phabricator.wikimedia.org/T363800#9757546 (dcaro) In progress→Resolved
[12:43:22] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09), Patch-For-Review: [prometheus] [grafana] set scrape interval in data source config - https://phabricator.wikimedia.org/T363176#9757553 (dcaro) Stalled→In progress
[12:45:37] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [builds-api] Add dashboards with the new statistics - https://phabricator.wikimedia.org/T352764#9757572 (dcaro) It looks good :) I changed a couple things: * Added soft minimum to most graphs, to see the scale of the graph * Ad...
[12:47:19] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [builds-api] Add dashboards with the new statistics - https://phabricator.wikimedia.org/T352764#9757576 (dcaro) Maybe a link to the wiki page for the builds-api would be nice too. We could use a text panel like in https://grafa...
[12:48:14] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [builds-api] Add dashboards with the new statistics - https://phabricator.wikimedia.org/T352764#9757579 (dcaro) (btw. it's good enough as it is, we can do that in another round when we add something more interesting to it like d...
[12:49:27] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [builds-api] Add dashboards with the new statistics - https://phabricator.wikimedia.org/T352764#9757581 (dcaro) Stalled→In progress
[13:01:54] Tool-itwiki: Verifica automatica dei requisiti per la Commissione arbitrale di Itwiki - https://phabricator.wikimedia.org/T363826 (Melos) NEW
[13:07:53] PAWS: remove paws-123-13 cluster - https://phabricator.wikimedia.org/T363827 (rook) NEW
[13:09:04] PAWS: remove paws-123-13 cluster - https://phabricator.wikimedia.org/T363827#9757640 (github-toolforge-bot) vivian-rook opened https://github.com/toolforge/paws/pull/407
[13:09:18] vivian-rook opened https://github.com/toolforge/paws/pull/407
[13:16:28] Toolforge (Toolforge iteration 09), Patch-For-Review: [jobs-api,jobs-cli] Support services in jobs - https://phabricator.wikimedia.org/T348758#9757651 (dcaro) Open→In progress
[13:25:35] Tool-itwiki, Wikimedia-Site-requests, Patch-For-Review: Creating a new 'arbcom' usergroup on itwiki - https://phabricator.wikimedia.org/T363805#9757699 (Melos)
[13:28:07] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#9757736 (dcaro) In progress→Stalled
[13:29:22] Toolforge (Toolforge iteration 09), Patch-For-Review: [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client - https://phabricator.wikimedia.org/T354745#9757752 (dcaro)
[13:30:15] (PS1) Muehlenhoff: Remove obsolete stub cert [labs/private] - https://gerrit.wikimedia.org/r/1025777 (https://phabricator.wikimedia.org/T360439)
[13:30:36] Toolforge (Toolforge iteration 09): [toolforge] simplify calling the different toolforge apis from within the containers - https://phabricator.wikimedia.org/T356377#9757758 (dcaro) Open→Stalled
[13:31:18] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [toolforge] Redis refusing connections - https://phabricator.wikimedia.org/T363709#9757755 (dcaro) Open→In progress
[13:31:52] Toolforge (Toolforge iteration 09), Epic: [jobs-cli,builds-cli,toolforge-cli,webservice] Consolidate the Toolforge CLIs - https://phabricator.wikimedia.org/T356262#9757764 (dcaro) Open→Stalled
[13:43:06] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: [tf-infra-test] Authentication failed - https://phabricator.wikimedia.org/T363696#9757823 (rook) Rotated keys. Issue appears resolved. https://github.com/toolforge/tf-infra-test/pull/9
[13:44:43] cloud-services-team (FY2023/2024-Q3-Q4), Cloud-VPS: [tf-infra-test] Authentication failed - https://phabricator.wikimedia.org/T363696#9757824 (rook) Open→Resolved a:rook
[14:11:27] Toolforge, wikitech.wikimedia.org, Diffusion, Phabricator, Documentation: Document diffusion->github mirroring to https://github.com/toolforge/ on wikitech - https://phabricator.wikimedia.org/T361859#9757971 (Aklapper)
[14:14:17] Toolforge: [infra,k8s] pgrade Toolforge Kubernetes to version 1.29 - https://phabricator.wikimedia.org/T362868#9757996 (dcaro)
[14:20:48] Cloud Services Proposals, cloud-services-team, Toolforge: Decision Request - Toolforge policy agent - https://phabricator.wikimedia.org/T362233#9758024 (fnegri) The output of the decision meeting was that we go with option 3, with an additional caveat: we want to drop Kyverno in favor of Vallidation...
[14:42:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:52:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[14:53:15] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [prometheus] [grafana] set scrape interval in data source config - https://phabricator.wikimedia.org/T363176#9758120 (fnegri) The patch was merged but hasn't been applied yet to the Grafana instance because metricsinfra-puppetse...
[14:57:42] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:00:57] PAWS: remove paws-123-13 cluster - https://phabricator.wikimedia.org/T363827#9758148 (github-toolforge-bot) vivian-rook closed https://github.com/toolforge/paws/pull/407
[15:01:04] PAWS: remove paws-123-13 cluster - https://phabricator.wikimedia.org/T363827#9758149 (rook) Open→Resolved
[15:01:09] vivian-rook closed https://github.com/toolforge/paws/pull/407
[15:03:24] Cloud-VPS: update k8s version tf-infra-test - https://phabricator.wikimedia.org/T363837 (rook) NEW
[15:28:09] cloud-services-team (FY2023/2024-Q3-Q4), Infrastructure-Foundations, Spicerack, SRE-tools, and 2 others: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9758252 (RKemper) Open→Resolved
[15:29:55] Cloud-VPS: update k8s version tf-infra-test - https://phabricator.wikimedia.org/T363837#9758268 (rook) https://github.com/toolforge/tf-infra-test/pull/10
[15:30:00] Cloud-VPS: update k8s version tf-infra-test - https://phabricator.wikimedia.org/T363837#9758269 (rook) Open→Resolved
[15:33:59] Cloud-VPS: Add default security group to vm tf-infra-test - https://phabricator.wikimedia.org/T363841 (rook) NEW
[15:37:30] Cloud-VPS: Add default security group to vm tf-infra-test - https://phabricator.wikimedia.org/T363841#9758317 (rook) https://github.com/toolforge/tf-infra-test/pull/8
[15:42:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:47:41] (CloudVPSDesignateLeaks) firing: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:52:41] (CloudVPSDesignateLeaks) firing: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:57:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:04:49] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [builds-api] Add dashboards with the new statistics - https://phabricator.wikimedia.org/T352764#9758401 (fnegri) In progress→Resolved Thanks for the improvements! I've also added a text panel as you suggested.
[16:06:22] cloud-services-team (FY2023/2024-Q3-Q4), Toolforge (Toolforge iteration 09): [prometheus] [grafana] set scrape interval in data source config - https://phabricator.wikimedia.org/T363176#9758419 (fnegri) In progress→Resolved This is now applied to https://grafana-rw.wmcloud.org/ and is working...
[16:23:11] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1005028 (owner: Ketulucas)
[16:26:15] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1005029 (https://phabricator.wikimedia.org/T248587) (owner: Ketulucas)
[16:26:50] (CR) Eugene233: [C:+2] "Looks good!" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003743 (owner: Amire80)
[16:27:28] (Merged) jenkins-bot: Fix a lego message [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003743 (owner: Amire80)
[16:38:33] (CR) Eugene233: "recheck" [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991970 (owner: Afimaame)
[16:38:58] (CR) CI reject: [V:-1] Reviewed and improved the code comments on the Isa tool [labs/tools/Isa] - https://gerrit.wikimedia.org/r/991970 (owner: Afimaame)
[18:12:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:17:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:22:41] (CloudVPSDesignateLeaks) firing: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[18:27:41] (CloudVPSDesignateLeaks) resolved: (3) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:06:34] FIRING: DiskSpace: Disk space cloudbackup1002-dev:9100:/srv/cinder-backups 1.758% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1002-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[19:08:36] Toolforge (Software install/update): Upgrade golang buildpack to 1.22 - https://phabricator.wikimedia.org/T363854#9759233 (Pppery)
[19:16:34] (PS1) Andrew Bogott: Disable backup strategies panel [openstack/horizon/trove-dashboard] - https://gerrit.wikimedia.org/r/1025840
[19:17:08] (PS1) Andrew Bogott: Disable backup strategies panel [openstack/horizon/trove-dashboard] (2024.1) - https://gerrit.wikimedia.org/r/1025841
[19:24:38] PROBLEM - Disk space on cloudbackup1002-dev is CRITICAL: DISK CRITICAL - free space: /srv/cinder-backups 333MiB (1% inode=98%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=cloudbackup1002-dev&var-datasource=eqiad+prometheus/ops
[19:29:11] Data-Services, Tools: Some PetScan queries do not return any results anymore for some days now - https://phabricator.wikimedia.org/T363073#9759322 (M2k_dewiki)
[19:31:00] Data-Services, VPS-Projects: Some PetScan queries do not return any results anymore for some days now - https://phabricator.wikimedia.org/T363073#9759335 (M2k_dewiki)
[19:33:11] Data-Services, VPS-Projects: Some PetScan queries do not return any results anymore for some days now - https://phabricator.wikimedia.org/T363073#9759346 (taavi) Open→Invalid Per the issues link on https://petscan.wmflabs.org/, petscan issues should be reported at https://github.com/magnusman...
[19:42:41] FIRING: [2x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[19:52:42] RESOLVED: [2x] CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[20:46:07] cloud-services-team, Data-Platform-SRE (2024.04.15 - 2024.05.05): Review and fix any bugs found in the automated bootstrap process for a ceph mon/mgr server - https://phabricator.wikimedia.org/T332987#9759553 (BTullis) The test worked, so the automated bootstrapping of a new mon server is now OK. I have...
[20:46:39] cloud-services-team, Data-Platform-SRE (2024.04.15 - 2024.05.05): Review and fix any bugs found in the automated bootstrap process for a ceph mon/mgr server - https://phabricator.wikimedia.org/T332987#9759556 (BTullis) Open→Resolved
[20:56:08] (CR) Btullis: [C:+1] Remove obsolete stub cert [labs/private] - https://gerrit.wikimedia.org/r/1025777 (https://phabricator.wikimedia.org/T360439) (owner: Muehlenhoff)
[22:12:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:17:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:22:41] FIRING: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:27:41] RESOLVED: [3x] CloudVPSDesignateLeaks: Detected 2 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:51:43] Wikibugs: wikibugs SSE client does not seen events from gitlab-webhooks in real-time. - https://phabricator.wikimedia.org/T363877 (bd808) NEW
[22:52:31] Wikibugs: wikibugs SSE client does not seen events from gitlab-webhooks in real-time. - https://phabricator.wikimedia.org/T363877#9759831 (bd808) Open→In progress a:bd808
[22:57:54] Wikibugs: wikibugs SSE client does not seen events from gitlab-webhooks in real-time. - https://phabricator.wikimedia.org/T363877#9759842 (bd808) My very work-in-progress client code is now available at https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/commit/7fcd6c83ae24326e2e3397970b6272fbd8002cdb.
[23:06:34] FIRING: DiskSpace: Disk space cloudbackup1002-dev:9100:/srv/cinder-backups 1.758% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudbackup1002-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[23:10:52] Wikibugs: wikibugs SSE client does not see events from gitlab-webhooks in real-time. - https://phabricator.wikimedia.org/T363877#9759876 (bd808)