[02:32:36] cloud-services-team (Kanban), DBA, Puppet: labtestpuppetmaster2001 is failing to backup - https://phabricator.wikimedia.org/T256846#10872955 (Andrew) >>! In T256846#10867604, @jcrespo wrote: > @Andrew should we revert now https://gerrit.wikimedia.org/r/c/operations/puppet/+/612167/6/modules/profi...
[02:54:56] (open) raymond-ndibe: [components.tool_router] add example config endpoint [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T394753)
[03:13:57] (open) raymond-ndibe: [components-cli] print example config [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/34 (https://phabricator.wikimedia.org/T394753)
[03:21:34] (update) raymond-ndibe: [components.tool_router] add example config endpoint [repos/cloud/toolforge/components-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/82 (https://phabricator.wikimedia.org/T394753)
[03:28:05] (update) raymond-ndibe: [components-cli] print example config [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/34 (https://phabricator.wikimedia.org/T394753)
[03:29:16] (update) raymond-ndibe: [components-cli] print example config [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/34 (https://phabricator.wikimedia.org/T394753)
[03:29:49] (update) raymond-ndibe: [components-cli] print example config [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/34 (https://phabricator.wikimedia.org/T394753)
[03:29:58] (update) raymond-ndibe: [components-cli] print example config [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/34 (https://phabricator.wikimedia.org/T394753)
[03:33:42] Toolforge (Toolforge iteration 20), Patch-For-Review: [components-api] Add endpoint to get what would be the "current" config - https://phabricator.wikimedia.org/T394753#10872968 (Raymond_Ndibe) a:Raymond_Ndibe
[03:33:52] Toolforge (Toolforge iteration 20), Patch-For-Review: [components-api] Add endpoint to get what would be the "current" config - https://phabricator.wikimedia.org/T394753#10872970 (Raymond_Ndibe) Open→In progress
[03:54:24] Toolforge (Toolforge iteration 20): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10872975 (Raymond_Ndibe) Not sure how many of the below fields we want to make configurable like cpu and mem: ` cpu=None, emails=None,...
[03:54:46] Toolforge (Toolforge iteration 20): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10872976 (Raymond_Ndibe) a:Raymond_Ndibe
[08:22:41] (open) gautamsudhanshu: Improve UI/UX with edge-to-edge layout and responsive content width [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/6
[10:31:42] (update) addshore: Draft: Components [repos/cloud/toolforge/toolforge-gen-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-gen-cli/-/merge_requests/2
[10:33:19] (update) addshore: Components [repos/cloud/toolforge/toolforge-gen-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-gen-cli/-/merge_requests/2
[10:34:07] (merge) addshore: Components [repos/cloud/toolforge/toolforge-gen-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-gen-cli/-/merge_requests/2
[11:26:08] cloud-services-team, Toolforge, Tools: Flickr blocking image requests from Toolforge k8s, breaking multiple tools - https://phabricator.wikimedia.org/T384468#10873104 (Don-vip) Flickr apparently now blocks all requests with a 403 error (API calls, web page retrievals, everything). Can you please chec...
[12:44:51] superset.wmcloud.org, Pywikibot, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10873137 (Xqt) >>! In T395664#10871413, @Zache wrote: > Login to fiwiki and commonswiki worked, but they didn't require the...
[13:07:49] Tool-wdactle, MediaWiki-extensions-Wikibase-Repo, Wikibase Action API (WPP), Wikidata, Patch-For-Review: wbformatentities with generate=text/plain HTML-escapes some characters - https://phabricator.wikimedia.org/T395731#10873154 (LucasWerkmeister)
[16:34:06] Wikibugs, Phabricator (Upstream), Upstream: Wikibugs reports color of milestones wrong - https://phabricator.wikimedia.org/T395250#10873221 (Aklapper)
[17:36:04] Wikibugs, Phabricator (Upstream), Upstream: Wikibugs reports color of milestones wrong - https://phabricator.wikimedia.org/T395250#10873262 (Aklapper) ` diff --git a/src/applications/project/storage/PhabricatorProject.php b/src/applications/project/storage/PhabricatorProject.php index 74083ae646..bea...
[19:23:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-70 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[19:28:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-70 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[19:31:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[19:36:03] RESOLVED: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-69 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProce
[20:52:17] FIRING: KernelErrors: Server cloudvirt1045 logged kernel errors - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/KernelErrors - https://grafana.wikimedia.org/d/b013af4c-d405-4d9f-85d4-985abb3dec0c/wmcs-kernel-errors?orgId=1&var-instance=cloudvirt1045 - https://alerts.wikimedia.org/?q=alertname%3DKernelErrors
[20:52:28] cloud-services-team: KernelErrors Server cloudvirt1045 logged kernel errors - https://phabricator.wikimedia.org/T395739 (phaultfinder) NEW
[21:10:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[23:21:49] cloud-services-team, Cloud-VPS: Nova metadata service failing for all VMs - https://phabricator.wikimedia.org/T395742 (Andrew) NEW
[23:22:28] cloud-services-team, Cloud-VPS: Nova metadata service failing for all VMs - https://phabricator.wikimedia.org/T395742#10873455 (Andrew) p:Triage→High Stopping nova-api-metadata and all cloudcontrols doesn't affect the behavior. From this I conclude that the issue is upstream of the actual metadat...
[23:24:06] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,neutron (T395742)
[23:24:14] T395742: Nova metadata service failing for all VMs - https://phabricator.wikimedia.org/T395742
[23:31:56] FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[23:36:15] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,neutron (T395742)
[23:36:22] T395742: Nova metadata service failing for all VMs - https://phabricator.wikimedia.org/T395742