[01:38:42] cloud-services-team, artificial-intelligence: Supporting AI, LLM, and data models on WMCS - https://phabricator.wikimedia.org/T336905#10869930 (Huji) I keep occasionally getting pinged about this general topic on fawiki. Various users there are envisioning a lot of value from having LLMs helped with tran...
[02:44:37] Tools: zoomviewer uses an unreasonable amount of disk space - https://phabricator.wikimedia.org/T395020#10869960 (tstarling) The trend continued in the last 3 days, adding 414GB and purging 56GB for a net increase of 353GB. I ran a one-off purge of files older than 7 days, reducing the current size to 1048G...
[02:49:56] Tools: zoomviewer uses an unreasonable amount of disk space - https://phabricator.wikimedia.org/T395020#10869961 (dschwen) Storing the entire original file just for a timestamp is pretty wasteful. I'm sure we can come up with a better solution...
[03:03:36] Tools: zoomviewer uses an unreasonable amount of disk space - https://phabricator.wikimedia.org/T395020#10869969 (dschwen) Yeah, I don't see why we wouldn't be able to take the modification date of the pyramid instead.
[04:06:34] FIRING: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 1.801% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[04:16:34] RESOLVED: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 5.061% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[05:15:34] FIRING: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 5.438% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[05:25:34] RESOLVED: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 5.431% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[06:58:46] FIRING: [2x] ProbeDown: Service toolsbeta-test-k8s-haproxy-5:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:03:46] RESOLVED: [2x] ProbeDown: Service toolsbeta-test-k8s-haproxy-5:30000 has failed probes (http_admin_beta_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:22:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[07:32:31] (PS5) Slyngshede: Build: Update build system [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806
[07:33:09] (CR) CI reject: [V:-1] Build: Update build system [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806 (owner: Slyngshede)
[07:39:00] (CR) Slyngshede: Build: Update build system (1 comment) [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806 (owner: Slyngshede)
[07:41:51] RESOLVED: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[07:41:52] cloud-services-team: Is it a bug to have a hostname in profile::resolving::nameservers? - https://phabricator.wikimedia.org/T395633#10870094 (taavi) Open→Invalid `profile::resolving` documents this as `Array[Stdlib::Host]`, i.e. IP addresses or hostnames. If some other part of code needs the IPs...
[08:07:33] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[08:20:52] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10870180 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=3f1522fc-3d4f-4682-bff9-5f42de7bfff6) set by fceratto@cumin1002 for 7 days, 0:00:00 on 1 host(s) and their services with reason: Re...
[09:39:53] cloud-services-team (FY2024/2025-Q3-Q4), Data-Services, Data-Engineering, Data-Persistence, and 3 others: Migrate clouddb* hosts to MariaDB 10.11 - https://phabricator.wikimedia.org/T394372#10870365 (Gehel)
[10:06:00] FIRING: NovafullstackSustainedFailures: Novafullstack tests have been failing for more than 5hours in eqiad - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/NovafullstackSustainedFailures - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-nova-fullstack?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DNovafullstackSustainedFailures
[10:06:07] cloud-services-team: NovafullstackSustainedFailures Novafullstack tests have been failing for more than 5hours in eqiad - https://phabricator.wikimedia.org/T395658 (phaultfinder) NEW
[10:12:33] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[10:15:24] cloud-services-team, Cloud-VPS, Community-Tech, WS Export: Alternative ws-export instance for Wikisource Reader app - https://phabricator.wikimedia.org/T395660 (Saiphani02) NEW
[10:16:10] cloud-services-team, Cloud-VPS, Community-Tech, WS Export: Alternative ws-export instance for Wikisource Reader app - https://phabricator.wikimedia.org/T395660#10870561 (Saiphani02)
[11:16:22] superset.wmcloud.org, Pywikibot, Pywikibot-login.py, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664 (Xqt) NEW
[11:17:08] superset.wmcloud.org, Pywikibot, Pywikibot-login.py, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10870641 (Xqt) p:Triage→High
[11:30:33] superset.wmcloud.org, Pywikibot, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10870661 (Xqt)
[11:32:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[11:33:17] superset.wmcloud.org, Pywikibot, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10870663 (Zache) Not sure if this same, but I noticed yesterday that I could not login to "meta" using pywikibot. When I try...
[11:41:59] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10870688 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by fceratto@cumin1002 for host db2187.codfw.wmnet with OS bookworm
[12:09:29] cloud-services-team, Toolforge: [infra] Reports of slow connectivity from APAC - https://phabricator.wikimedia.org/T395135#10870774 (cmooney) Change has been rolled back on cr2-eqiad.
[12:17:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[12:22:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[12:27:04] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10870788 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by fceratto@cumin1002 for host db2187.codfw.wmnet with OS bookworm completed: - db2187 (**WARN**) - Downtimed on Icinga/Alertmana...
[13:27:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[13:47:20] (open) eliza189: Full database update cycle. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/5
[13:52:05] Quarry: Quarry (quarry.wmcloud.org) not working - https://phabricator.wikimedia.org/T395680 (RoySmith) NEW
[13:53:22] (update) eliza189: Full database update cycle. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/5
[13:54:14] (merge) eliza189: Full database update cycle. [toolforge-repos/miss-search] (linkhere_branch) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/5
[14:00:01] (merge) naorleizer: db_utils.py : Util functionality implementation [toolforge-repos/miss-search] (main-w-create-rank-db) - https://gitlab.wikimedia.org/toolforge-repos/miss-search/-/merge_requests/6
[14:24:56] !log dcaro@acme appservers START - Cookbook wmcs.openstack.cloudvirt.vm_console
[14:24:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Appservers/SAL
[14:25:08] !log dcaro@acme appservers END (FAIL) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=99)
[14:25:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Appservers/SAL
[14:25:26] !log dcaro@acme appservers START - Cookbook wmcs.openstack.cloudvirt.vm_console
[14:25:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Appservers/SAL
[14:37:39] (open) andrew: Fix dns servers for a couple of eqiad1 subnets [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/238
[14:37:48] (update) andrew: Fix dns servers for a couple of eqiad1 subnets [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/238
[14:38:57] (merge) andrew: Fix dns servers for a couple of eqiad1 subnets [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/238
[14:39:23] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch
[14:40:03] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch
[14:43:20] !log dcaro@acme appservers END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0)
[14:43:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Appservers/SAL
[15:23:30] (update) raymond-ndibe: components-api: deploy also on tools [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/785 (owner: dcaro)
[15:25:23] superset.wmcloud.org, Pywikibot, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10871361 (Xqt) Thanks for that hint but I changed pywikibot-test account to use bot passwords due to T395264. I am not sure...
[15:26:40] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component components-api
[15:28:37] !log raymond-ndibe@cloudcumin1001 tools END (ERROR) - Cookbook wmcs.toolforge.component.deploy (exit_code=97) for component components-api
[15:28:49] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-38, tools-k8s-worker-nfs-11
[15:29:29] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component components-api
[15:29:39] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component components-api
[15:31:45] cloud-services-team, Toolforge: purge-dup-args-29141613-j5rrs and purge-script-errors-ns0-29142064-gz5lz stuck in Terminating state - https://phabricator.wikimedia.org/T395693 (JJMC89) NEW
[15:37:50] superset.wmcloud.org, Pywikibot, Pywikibot-tests: TestSupersetWithAuth.test_login_and_oauth_permission tests of superset_tests fails - https://phabricator.wikimedia.org/T395664#10871413 (Zache) Login to fiwiki and commonswiki worked, but they didn't require the verification code for login. Only meta...
[15:40:44] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-38, tools-k8s-worker-nfs-11
[15:42:23] !log andrew@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-46
[15:48:07] !log andrew@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-46
[15:49:34] FIRING: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 4.046% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[15:49:39] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10871437 (FCeratto-WMF) Notifications enabled in puppet, icinga is green, downtime is gone. Pooling in db2187
[15:49:52] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10871438 (ops-monitoring-bot) Start pool of db2187 gradually with 4 steps - Pooling in after reimage - fceratto@cumin1002
[15:50:41] cloud-services-team, Toolforge: purge-dup-args-29141613-j5rrs and purge-script-errors-ns0-29142064-gz5lz stuck in Terminating state - https://phabricator.wikimedia.org/T395693#10871444 (JJMC89) Open→Resolved a:Andrew
[15:59:34] RESOLVED: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 4.031% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:07:33] FIRING: [2x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-11 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess
[16:12:28] (CR) Krinkle: Build: Update build system (4 comments) [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806 (owner: Slyngshede)
[16:24:31] cloud-services-team, Striker: Rotate StrikerBot GitLab PAT before it expires on 2025-07-29 - https://phabricator.wikimedia.org/T395694 (bd808) NEW
[16:26:22] !log raymond-ndibe@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component components-api
[16:26:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[16:26:54] cloud-services-team: PuppetFailure Puppet has failed on cloudcontrol2006-dev:9100 - https://phabricator.wikimedia.org/T395695 (phaultfinder) NEW
[16:27:24] !log raymond-ndibe@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component components-api
[16:31:34] FIRING: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 5.954% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:32:33] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-38 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[16:34:23] (CR) Krinkle: Build: Update build system (1 comment) [labs/countervandalism/CVNBot] - https://gerrit.wikimedia.org/r/1143806 (owner: Slyngshede)
[16:35:16] Data-Services, DBA: Remove sanitarium hosts from codfw - https://phabricator.wikimedia.org/T394884#10871598 (ops-monitoring-bot) Completed pool of db2187 gradually with 4 steps - Pooling in after reimage - fceratto@cumin1002
[16:36:48] RESOLVED: PuppetFailure: Puppet has failed on cloudcontrol2006-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[16:37:09] cloud-services-team, Striker: Update StrikerBot Developer, SUL, and related accounts to email folks besides just bd808 - https://phabricator.wikimedia.org/T395697 (bd808) NEW
[16:41:34] RESOLVED: DiskSpace: Disk space cloudcontrol2004-dev:9100:/ 5.947% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=cloudcontrol2004-dev - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:46:05] cloud-services-team, Cloud-VPS: Stop configuring the openstack osbpo repos on most VMs - https://phabricator.wikimedia.org/T394438#10871636 (Andrew) Open→Resolved
[16:48:48] cloud-services-team, Quarry: Quarry (quarry.wmcloud.org) not working - https://phabricator.wikimedia.org/T395680#10871640 (bd808) Open→Resolved a:taavi `lang=irc [05:54] < chlod> tools-static.wmflabs.org appears unresponsive. anyone available to give it a nudge? May 30, 2025 [06:01]...
[16:52:58] (open) raymond-ndibe: [components-service] add components api alert [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/31 (https://phabricator.wikimedia.org/T394275)
[17:00:02] cloud-services-team, Patch-For-Review: Is it a bug to have a hostname in profile::resolving::nameservers? - https://phabricator.wikimedia.org/T395633#10871663 (dancy) Thanks @taavi ! That bit of information got me on the right track.
[17:16:07] (update) raymond-ndibe: [components-service] add components api alert [repos/cloud/toolforge/alerts] - https://gitlab.wikimedia.org/repos/cloud/toolforge/alerts/-/merge_requests/31 (https://phabricator.wikimedia.org/T394275)
[17:30:40] Toolforge (Toolforge iteration 20), Patch-For-Review: [components-api] Add alerts and runbooks for basic service health - https://phabricator.wikimedia.org/T394275#10871794 (Raymond_Ndibe) Open→In progress
[18:27:53] Toolforge (Toolforge iteration 20): [components-api] Add endpoint to get what would be the "current" config - https://phabricator.wikimedia.org/T394753#10871982 (Raymond_Ndibe) I'd argue that there is very little benefit to this kind of `involved` approach (checking jobs, checking builds, etc) over just dump...
[18:30:02] Toolforge (Toolforge iteration 20): [functional-tests,builds-builder] create a test suite to run builds for all the sample tools we have - https://phabricator.wikimedia.org/T394927#10871995 (Raymond_Ndibe)
[18:30:22] Toolforge (Toolforge iteration 20): [components-api] Add endpoint to get what would be the "current" config - https://phabricator.wikimedia.org/T394753#10871998 (Raymond_Ndibe)
[18:31:13] Toolforge (Toolforge iteration 20): [components-api] Add all missing options for scheduled components - https://phabricator.wikimedia.org/T395071#10872003 (Raymond_Ndibe)
[18:31:43] Toolforge (Toolforge iteration 20): [components-api] add all the missing options for continuous components - https://phabricator.wikimedia.org/T395070#10872005 (Raymond_Ndibe)
[18:31:58] Toolforge (Toolforge iteration 20): [components-api] Add support for scheduled components - https://phabricator.wikimedia.org/T395065#10872006 (Raymond_Ndibe)
[18:37:02] Toolforge (Toolforge iteration 20): [components-api] Add support for scheduled components - https://phabricator.wikimedia.org/T395065#10872015 (Raymond_Ndibe) should probably wait until the jobs split PR (https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/154) has been merged before...
[18:55:30] cloud-services-team, Striker: Rotate StrikerBot GitLab PAT before it expires on 2025-07-29 - https://phabricator.wikimedia.org/T395694#10872150 (taavi) How does one log in to GitLab as StrikerBot? Is the password in pwstore (or somewhere in Striker's live hiera)?
[20:16:07] cloud-services-team, Striker: Rotate StrikerBot GitLab PAT before it expires on 2025-07-29 - https://phabricator.wikimedia.org/T395694#10872450 (bd808) >>! In T395694#10872150, @taavi wrote: > How does one log in to GitLab as StrikerBot? Is the password in pwstore (or somewhere in Striker's live hiera)?...
[21:05:31] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q4:rack/setup/install cloudcontrol2010-dev - https://phabricator.wikimedia.org/T393102#10872556 (Jhancock.wm) @Andrew hey sorry to get back to this so late. This one does not have a raid controller. just an HBA. cannot set a hardware raid.
[21:05:39] cloud-services-team (Hardware), DC-Ops, ops-codfw, SRE: Q4:rack/setup/install cloudcontrol2010-dev - https://phabricator.wikimedia.org/T393102#10872557 (Jhancock.wm)
[21:25:15] (CR) Leila237: [C:+1] rearrange the location of some files [labs/tools/WdTmCollab] - https://gerrit.wikimedia.org/r/1152117 (owner: NkwadaNora)
[23:46:47] cloud-services-team, Cloud-VPS: Is it allowed to expose HTTPS services targeting machines without web proxies? - https://phabricator.wikimedia.org/T395721 (XtexChooser) NEW