[00:06:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:07:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30039 bytes in 3.394 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:14:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:15:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30030 bytes in 4.482 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:18:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:20:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 3.257 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:23:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:25:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30035 bytes in 3.076 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:28:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:33:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: 
HTTP/1.1 200 OK - 30031 bytes in 5.270 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:39:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:40:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30041 bytes in 2.832 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:43:15] (PS1) BryanDavis: Delete orphan deployment-eventgate-1 and deployment-eventgate-2 data [cloud/instance-puppet] - https://gerrit.wikimedia.org/r/1204710 (https://phabricator.wikimedia.org/T409985) [00:43:47] (CR) BryanDavis: [V:+2 C:+2] Delete orphan deployment-eventgate-1 and deployment-eventgate-2 data [cloud/instance-puppet] - https://gerrit.wikimedia.org/r/1204710 (https://phabricator.wikimedia.org/T409985) (owner: BryanDavis) [00:45:08] (CR) BryanDavis: [V:+2 C:+2] "andrewbogott: do you have the rights to submit this?" [cloud/instance-puppet] - https://gerrit.wikimedia.org/r/1204710 (https://phabricator.wikimedia.org/T409985) (owner: BryanDavis) [00:47:15] (CR) Andrew Bogott: [C:-2] "unfortunately instance-puppet is a read-only archive, so even if we forced a change here it wouldn't affect the actual puppet config. 
The " [cloud/instance-puppet] - https://gerrit.wikimedia.org/r/1204710 (https://phabricator.wikimedia.org/T409985) (owner: BryanDavis) [00:49:55] (Abandoned) BryanDavis: Delete orphan deployment-eventgate-1 and deployment-eventgate-2 data [cloud/instance-puppet] - https://gerrit.wikimedia.org/r/1204710 (https://phabricator.wikimedia.org/T409985) (owner: BryanDavis) [00:51:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:54:23] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30037 bytes in 0.206 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [00:57:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [00:58:29] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 6.177 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [01:11:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [01:15:23] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 0.209 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [02:54:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [02:57:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 3.150 second 
response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:00:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:03:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30032 bytes in 0.759 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:06:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:08:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 3.691 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:12:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:13:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30032 bytes in 9.524 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:26:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:32:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30045 bytes in 9.464 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:35:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:36:29] RECOVERY - Wikitech-static main page 
has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30045 bytes in 5.856 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [03:40:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [03:47:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 8.595 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [04:32:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [04:38:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 3.571 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [04:42:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [04:47:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30041 bytes in 8.602 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [04:50:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [05:02:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 9.679 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [05:05:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 
seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [05:19:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 1.737 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [05:38:33] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [05:42:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 2.252 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [05:50:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [05:57:29] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 3.652 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:00:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:11:31] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30041 bytes in 5.801 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:14:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:20:33] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30041 bytes in 7.610 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:23:35] PROBLEM - 
Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:36:27] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30032 bytes in 1.738 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:40:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:43:29] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 5.535 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:47:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:49:31] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30031 bytes in 6.012 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [06:53:35] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [06:58:25] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30037 bytes in 1.634 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [07:06:34] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [07:19:34] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30039 
bytes in 9.473 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [07:22:34] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [07:24:34] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30041 bytes in 9.865 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [07:27:34] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [07:42:04] (open) marostegui: task_template*: Add some clarifications [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/15 (https://phabricator.wikimedia.org/T406008) [07:49:20] (update) marostegui: task_template*: Add some clarifications [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/15 (https://phabricator.wikimedia.org/T406008) [08:29:28] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30039 bytes in 3.097 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [08:34:34] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static [08:36:24] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30040 bytes in 0.852 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static [08:55:02] cloud-services-team, wikitech.wikimedia.org: Flapping wikitech-static icinga alert - https://phabricator.wikimedia.org/T409029#11369801 (dcaro) Tonight it was 
specially flappy (almost every hour like): {F70169176} [09:13:03] cloud-services-team (FY2025/26-Q1), Toolforge (Toolforge iteration 25), Patch-For-Review: [maintain-kubeusers,maintain-dbusers] user homes are not readable by replica_cnf so it fails to create replica.my.cnf files - https://phabricator.wikimedia.org/T409847#11369831 (dcaro) In progress→Res... [09:15:18] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009 (DeleMike) NEW [09:19:39] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009#11369863 (taavi) Your account has a bunch of open sshd-session processes running which are likely prev... [09:20:17] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009#11369865 (dcaro) You are reaching the limit of open ssh sessions using vscode remotely. Note that the... [09:23:18] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009#11369874 (DeleMike) Oh, thank you so much @taavi and @dcaro! All works now! ✨ [09:26:20] VPS-project-Phabricator, collaboration-services, Release-Engineering-Team (Radar): 'Fulltext' searches fail on test Phab instance due to ElasticSearch default config (PhutilAggregateException: All Fulltext Search hosts failed / CURLE_COULDNT_CONNECT) - https://phabricator.wikimedia.org/T403948#11369888 (... 
[09:29:15] Tool-yearinreview, MediaWiki-extensions-Translate, Wikipedia-Android-App-Backlog, LPL Essential (FY26 Q2), and 3 others: PLURAL syntax validator gets confused by other uses of equals signs in the message, as seen at [[Wikimedia:Wikipedia-android-st... - https://phabricator.wikimedia.org/T409655#11369896 [09:31:04] cloud-services-team, Toolforge: Check for non-libre vscode-server installs/processes on Toolforge bastions - https://phabricator.wikimedia.org/T390885#11369899 (dcaro) >>! In T390885#10793594, @taavi wrote: > I was hoping one way to do this would be to null route the domain name where VS Code downloads t... [09:31:34] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009#11369900 (taavi) Open→Resolved a:taavi Also, please note that the version of VS Code S... [09:33:32] cloud-services-team, Toolforge: SSH session hangs after authentication for user delemike on login.toolforge.org. Logs show hang at debug1: pledge: filesystem. - https://phabricator.wikimedia.org/T410009#11369911 (DeleMike) Noted, thanks! :) [09:33:44] FIRING: MaintainDBUsersDown: Maintain-dbusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersDown - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersDown [09:53:54] Tool-paulina: Sticky Navigation Bar - https://phabricator.wikimedia.org/T410013 (Oluwatumininu.m) NEW [09:54:49] Cloud-Services, Diffusion: create conduit method for the creation of phabricator policy objects - https://phabricator.wikimedia.org/T135249#11370001 (Aklapper) For the papertrail, the related commit is https://gitlab.wikimedia.org/repos/phabricator/extensions/-/commit/ce2f3d7a3d6c7f8929b9ddf05331c0a6... 
[10:08:41] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [10:40:08] (update) dcaro: [jobs-api] save business models in a DB [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (owner: raymond-ndibe) [10:42:20] (open) oluwatumininu: feat(ui): make main navigation sticky on scroll [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/171 (https://phabricator.wikimedia.org/T410013) [10:45:44] cloud-services-team, Cloud-VPS, Cloud-Services-Origin-Team, Cloud-Services-Worktype-Unplanned, Patch-For-Review: [cloudvirt] Enable jumbo frames on cloud-hosts/cloud-private interfaces - https://phabricator.wikimedia.org/T330075#11370162 (taavi) Open→Resolved [10:52:32] cloud-services-team, Toolforge: [toolsdb] pt-heartbeat service should automatically follow the primary - https://phabricator.wikimedia.org/T409890#11370170 (fgiunchedi) [10:58:30] (open) taavi: Set MTU 1500 for VXLAN networks in eqiad1 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/282 (https://phabricator.wikimedia.org/T408543) [10:58:34] (update) taavi: Set MTU 1500 for VXLAN networks in eqiad1 [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/282 (https://phabricator.wikimedia.org/T408543) [10:58:35] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.tofu running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/282 [10:59:10] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) 
running tofu plan for https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/282 [10:59:44] FIRING: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [11:05:51] Tool-paulina, Patch-For-Review: Fix error handling and add request timeouts to prevent indefinite hangs - https://phabricator.wikimedia.org/T409914#11370238 (System625) a:System625→None [11:14:17] FIRING: JobUnavailable: Reduced availability for job maintain_dbusers_eqiad in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [11:24:17] RESOLVED: JobUnavailable: Reduced availability for job maintain_dbusers_eqiad in cloud@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [11:28:14] RESOLVED: MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [11:35:14] RESOLVED: MaintainDBUsersDown: Maintain-dbusers is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersDown - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersDown [11:43:25] (update) marostegui: task_template*: Add some 
clarifications [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/15 (https://phabricator.wikimedia.org/T406008) [11:52:45] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,neutron [11:58:13] Toolforge (Quota-requests): Request increased build quota for MilHistBot Toolforge tool - https://phabricator.wikimedia.org/T409981#11370342 (dcaro) +1 [11:58:35] Cloud-VPS (Quota-requests): Increase volume storage on project analytics - https://phabricator.wikimedia.org/T409970#11370344 (dcaro) +1 [11:58:51] cloud-services-team, PAWS, tools-platform-team: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T408157#11370356 (fnegri) Open→In progress a:fnegri [11:59:20] cloud-services-team, Toolforge, tools-platform-team: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T408158#11370358 (fnegri) Open→In progress p:Triage→Medium a:fnegri [11:59:21] cloud-services-team, PAWS, tools-platform-team: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T408157#11370361 (fnegri) p:Triage→Medium [11:59:32] cloud-services-team, Cloud-VPS: `logging` project missing normal DNS zone delegation - https://phabricator.wikimedia.org/T409361#11370362 (fnegri) p:Triage→Medium [12:01:56] (approved) ladsgroup: task_template*: Add some clarifications [toolforge-repos/switchmaster] - https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/15 (https://phabricator.wikimedia.org/T406008) (owner: marostegui) [12:04:07] !log taavi@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,neutron [12:04:37] !log taavi@cloudcumin1001 admin START - Cookbook wmcs.openstack.roll_reboot_cloudnets [12:06:32] PROBLEM - Host cloudnet1006 is DOWN: PING CRITICAL - 
Packet loss = 100% [12:08:30] RECOVERY - Host cloudnet1006 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms [12:25:36] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm tools-k8s-haproxy-8 (cluster eqiad1) [12:25:57] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.instance.stop_start vm tools-k8s-haproxy-7 (cluster eqiad1) [12:26:26] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm tools-k8s-haproxy-7 (cluster eqiad1) [12:27:03] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.vps.instance.stop_start vm proxy-5 (cluster eqiad1) [12:27:41] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm proxy-5 (cluster eqiad1) [12:28:01] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.vps.instance.stop_start vm proxy-6 (cluster eqiad1) [12:28:12] (Merged) jenkins-bot: vps: Add cookbook to fully restart an instance [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1204841 (owner: Majavah) [12:28:39] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm proxy-6 (cluster eqiad1) [12:28:42] (open) system625: Fix mobile navigation hamburger menu overlay and animation [toolforge-repos/paulina] - https://gitlab.wikimedia.org/toolforge-repos/paulina/-/merge_requests/172 (https://phabricator.wikimedia.org/T410027) [12:30:31] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.instance.stop_start vm toolsbeta-bastion-7 (cluster eqiad1) [12:31:08] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm toolsbeta-bastion-7 (cluster eqiad1) [12:39:00] FIRING: HarborDown: Harbor is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown [12:39:56] FIRING: HarborDown: Harbor is 
down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown [12:40:45] (open) l10n-bot: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/21 [12:44:37] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/21 (owner: l10n-bot) [12:44:43] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/lexeme-forms] - https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/merge_requests/21 (owner: l10n-bot) [13:02:42] dhinus closed https://github.com/toolforge/paws/pull/503 [13:04:59] cloud-services-team, PAWS, tools-platform-team: New upstream release for Pywikibot - https://phabricator.wikimedia.org/T408157#11370571 (fnegri) In progress→Resolved https://github.com/toolforge/paws/pull/503 {F70176197} [13:09:00] RESOLVED: HarborDown: Harbor is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown [13:09:56] RESOLVED: HarborDown: Harbor is down - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/HarborDown - https://prometheus-alerts.wmcloud.org/?q=alertname%3DHarborDown [13:25:14] RESOLVED: [3x] MaintainDBUsersManyErrors: Maintain-dbusers is having sustained errors - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/MaintainDBUsersManyErrors - https://grafana.wikimedia.org/d/ae240a06-c13e-49f3-b12c-58432c551e85/wmcs-maintain-dbusers - https://alerts.wikimedia.org/?q=alertname%3DMaintainDBUsersManyErrors [13:50:38] (update) dcaro: [jobs-api] save business models in a DB [repos/cloud/toolforge/jobs-api] - 
https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/114 (owner: raymond-ndibe) [14:08:56] FIRING: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [15:00:26] cloud-services-team, Toolforge, Patch-For-Review: [jobs-api] apply topology constraints - https://phabricator.wikimedia.org/T408707#11370931 (DamianZaremba) Looks like this is working as expected; ` tools.cluebot3@tools-bastion-15:~$ kubectl get pods -o json | jq -r '.items[] | [.metadata.labels."ap... [15:08:56] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.instance.stop_start vm tools-bastion-14 (cluster eqiad1) [15:09:25] !log taavi@cloudcumin1001 tools END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm tools-bastion-14 (cluster eqiad1) [15:13:52] !log taavi@cloudcumin1001 tools START - Cookbook wmcs.vps.instance.stop_start vm toolsbeta-test-k8s-ingress-12 (cluster eqiad1) [15:13:54] !log taavi@cloudcumin1001 tools END (FAIL) - Cookbook wmcs.vps.instance.stop_start (exit_code=99) vm toolsbeta-test-k8s-ingress-12 (cluster eqiad1) [15:14:02] !log taavi@cloudcumin1001 toolsbeta START - Cookbook wmcs.vps.instance.stop_start vm toolsbeta-test-k8s-ingress-12 (cluster eqiad1) [15:14:53] !log taavi@cloudcumin1001 toolsbeta END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm toolsbeta-test-k8s-ingress-12 (cluster eqiad1) [15:15:30] cloud-services-team, wikitech.wikimedia.org: Flapping wikitech-static icinga alert - https://phabricator.wikimedia.org/T409029#11370989 (Andrew) I've just switched the mod_evasive settings to be more aggressive than the defaults: ` DOSHashTableSize 20000 DOSPageCount 2 DOSSiteCoun... 
[15:19:02] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.vps.instance.stop_start vm maps-proxy-5 (cluster eqiad1) [15:19:10] cloud-services-team, Toolforge: [logs-api,jobs-cli] `toolforge jobs logs` has inconsistent ordering - https://phabricator.wikimedia.org/T401552#11371008 (dcaro) [15:19:32] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm maps-proxy-5 (cluster eqiad1) [15:19:32] cloud-services-team, Toolforge: [jobs-cli,logs-api] `toolforge jobs logs` breaks on long log lines - https://phabricator.wikimedia.org/T401422#11371009 (dcaro) [15:19:46] cloud-services-team, Toolforge: [jobs-api,jobs-cli] Support `--timeout` for one-off jobs - https://phabricator.wikimedia.org/T401110#11371010 (dcaro) [15:19:53] cloud-services-team, Toolforge, Patch-For-Review: [jobs-api] apply topology constraints - https://phabricator.wikimedia.org/T408707#11371012 (DamianZaremba) Open→Resolved a:DamianZaremba [15:20:20] !log taavi@cloudcumin1001 project-proxy START - Cookbook wmcs.vps.instance.stop_start vm maps-proxy-6 (cluster eqiad1) [15:20:57] !log taavi@cloudcumin1001 project-proxy END (PASS) - Cookbook wmcs.vps.instance.stop_start (exit_code=0) vm maps-proxy-6 (cluster eqiad1) [15:25:29] cloud-services-team, Toolforge: [jobs-cli] provides no meaningful feedback for restart - https://phabricator.wikimedia.org/T410046 (DamianZaremba) NEW [15:26:11] cloud-services-team, Toolforge: [jobs-cli] provides no meaningful feedback for restart - https://phabricator.wikimedia.org/T410046#11371073 (DamianZaremba) [15:31:48] cloud-services-team, Toolforge: [jobs-cli] provides no meaningful feedback for delete - https://phabricator.wikimedia.org/T410048#11371131 (DamianZaremba) [15:41:23] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055 (DamianZaremba) NEW [15:47:16] 
!log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19337204923 (https://github.com/cluebotng/component-configs/commits/d5247b0e2c72c4426888761d70d73f59ba7362a5) [15:47:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL [15:55:04] 06cloud-services-team, 10Toolforge (Toolforge iteration 25), 13Patch-For-Review: [jobs-api] apply topology constraints - https://phabricator.wikimedia.org/T408707#11371275 (10dcaro) [15:59:47] 06cloud-services-team, 10Toolforge: [jobs-cli] provides no meaningful feedback for delete - https://phabricator.wikimedia.org/T410048#11371308 (10dcaro) I think it was an early decision on the jobs-cli side to not return anything when everything went well. I'm ok changing it. In other clis we return all the in... [16:00:17] 06cloud-services-team, 10Cloud-VPS (Project-requests): CloudVPS instance for ProVe - https://phabricator.wikimedia.org/T408387#11371309 (10fnegri) +1 to creating a project with enough quota to support one `g4.cores8.ram16.disk20` instance (per the [requirements doc](https://emckclac-my.sharepoint.com/:b:/g/per... [16:00:38] (03PS1) 10Muehlenhoff: Remove a lot of historical stub secrets [labs/private] - 10https://gerrit.wikimedia.org/r/1204913 (https://phabricator.wikimedia.org/T381565) [16:01:33] 06cloud-services-team, 10Openstack-Magnum: Magnum UI should offer full kube config - https://phabricator.wikimedia.org/T343362#11371315 (10Andrew) 05Open→03Resolved Unless I'm missing something, this feature is now available on Horizon via 'Get Cluster Config'. There's also an API for this. 
[16:02:12] (PS1) Muehlenhoff: Add stub secrets for the staging role [labs/private] - https://gerrit.wikimedia.org/r/1204915 (https://phabricator.wikimedia.org/T409528)
[16:08:51] (PS2) Muehlenhoff: Add stub secrets for the staging role [labs/private] - https://gerrit.wikimedia.org/r/1204915 (https://phabricator.wikimedia.org/T409528)
[16:11:09] (CR) Muehlenhoff: [V:+2 C:+2] Add stub secrets for the staging role [labs/private] - https://gerrit.wikimedia.org/r/1204915 (https://phabricator.wikimedia.org/T409528) (owner: Muehlenhoff)
[16:13:32] cloud-services-team, Toolforge: [builds-api] support specifying tag in build - https://phabricator.wikimedia.org/T410058 (DamianZaremba) NEW
[16:16:25] VPS-project-Phabricator, collaboration-services, Release-Engineering-Team (Radar): 'Fulltext' searches fail on test Phab instance due to ElasticSearch default config (PhutilAggregateException: All Fulltext Search hosts failed / CURLE_COULDNT_CONNECT)... - https://phabricator.wikimedia.org/T403948#11371383
[16:16:28] VPS-project-Phabricator, collaboration-services, Release-Engineering-Team (Radar): 'Fulltext' searches fail on test Phab instance due to ElasticSearch default config (PhutilAggregateException: All Fulltext Search hosts failed / CURLE_COULDNT_CONNECT)... - https://phabricator.wikimedia.org/T403948#11371384
[16:18:41] RESOLVED: CloudVPSDesignateLeaks: Detected 4 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[16:32:56] (PS1) Muehlenhoff: Add missing secret [labs/private] - https://gerrit.wikimedia.org/r/1204926 (https://phabricator.wikimedia.org/T409528)
[16:42:12] (CR) Muehlenhoff: [V:+2 C:+2] Add missing secret [labs/private] - https://gerrit.wikimedia.org/r/1204926 (https://phabricator.wikimedia.org/T409528) (owner: Muehlenhoff)
[16:48:33] cloud-services-team, Toolforge: [builds-api] support specifying tag in build - https://phabricator.wikimedia.org/T410058#11371620 (dcaro) Can you elaborate a bit on the flow you have in mind? Would that be sorted if components-api allowed to change the image name? One of the reasons to not have the tag...
[16:50:55] cloud-services-team, Toolforge: [jobs-cli] provides no meaningful feedback for delete - https://phabricator.wikimedia.org/T410048#11371625 (dcaro) I think that returning an info message (in the wrapper structure of the jobs-api https://api-docs.toolforge.org/docs#/Jobs/jobs_list) with "Job delete...
[17:00:56] cloud-services-team, Toolforge: [builds-api] support specifying tag in build - https://phabricator.wikimedia.org/T410058#11371692 (DamianZaremba) Being able to change the component name in components-api would be useful, but for different reasons (especially with reuse_build the naming gets a bit funny e...
[17:01:15] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371697 (dcaro) Agree, we can now try to extend the datastructure that logs-api returns (LogEntry), ideally we would want to support different types of logs too (...
[17:02:15] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371708 (dcaro) That would allow also to have a 'type' that's something like "internal", and express there the fact that it got no logs yet.
[17:02:55] cloud-services-team, Toolforge: [jobs-cli] provides no meaningful feedback for delete - https://phabricator.wikimedia.org/T410048#11371712 (DamianZaremba) Yeah, we do this for example in the patch endpoint, but not the delete or restart endpoint (the structure is there but always empty). components cli...
[17:11:35] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371761 (dcaro) Some notes for whomever implements this: We probably want to make the changes in small increments on the cli (first support both formats, change...
[17:11:45] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[17:11:49] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371763 (DamianZaremba) Eventually having 'system logs' (build, jobs, components) with a common 'trace id', so for a deployment you can see everything that happen...
[17:15:44] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30032 bytes in 8.459 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[17:16:15] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371792 (dcaro) @DamianZaremba btw. I think that you are the last one using the jobs-api log endpoint, can you move your code to use the logs-api instead? (so we...
[17:19:28] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371808 (DamianZaremba) >>! In T410055#11371792, @dcaro wrote: > @DamianZaremba btw. I think that you are the last one using the jobs-api log endpoint, can you mo...
[17:30:53] Cloud Services Proposals, cloud-services-team, Cloud-VPS: Decision Request - How openstack projects relate to tofu-infra - https://phabricator.wikimedia.org/T385604#11371837 (dcaro) I vote for the Option 1 (with Andrew's note on only for non-automatic projects), though Option 3 would be a close sec...
[17:41:37] cloud-services-team, Cloud-VPS, CAS-SSO, Infrastructure-Foundations: sso failure in codfw1dev (labtesthorizon.wikimedia.org) - https://phabricator.wikimedia.org/T409328#11371866 (Andrew) andrewbogott> Andrew Bogott moritzm: do you still aspire to look at https://phabricator.wikimedia.org/T409328...
[17:44:14] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19340563565 (https://github.com/cluebotng/component-configs/commits/a5ea7ec11c7d2a7ced4f29df1f2a742f52dbf777)
[17:44:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[17:44:56] Toolforge (Toolforge iteration 25), Patch-For-Review: [builds-api, maintain-harbor] fix build/image cleanup - https://phabricator.wikimedia.org/T404157#11371872 (dcaro)
[17:45:04] cloud-services-team, Cloud-VPS, CAS-SSO, Infrastructure-Foundations: sso failure in codfw1dev (labtesthorizon.wikimedia.org) - https://phabricator.wikimedia.org/T409328#11371888 (bd808) https://cloudidp-dev.wikimedia.org/oidc/.well-known is serving an infinite redirect loop to itself with a `serv...
[17:50:14] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19340717614 (https://github.com/cluebotng/component-configs/commits/a2ac5e2b12d80df59f7f37209ed8943fe471992a)
[17:50:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[17:50:17] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371918 (DamianZaremba) >>! In T410055#11371808, @DamianZaremba wrote: >>>! In T410055#11371792, @dcaro wrote: >> @DamianZaremba btw. I think that you are the las...
[17:58:52] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[17:59:47] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19340976225 (https://github.com/cluebotng/component-configs/commits/8c3a4f743978487c10274534162bb1a7a2ceb8a8)
[17:59:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:00:46] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 30033 bytes in 3.622 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[18:03:03] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19341017921 (https://github.com/cluebotng/component-configs/commits/b28d25e5b98fe60c2b85138e7d1a2a52cee03383)
[18:03:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:06:35] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11371972 (DamianZaremba) https://cluebotng-trainer.toolforge.org/Report%20Interface%20Import/2025-11-13%2018:02:49/logs/bayes-train.log example of where this gets...
[18:09:04] cloud-services-team, Toolforge: [logs-api] `--follow` returns inconsistent/artificial log entries - https://phabricator.wikimedia.org/T410055#11372002 (DamianZaremba) >>! In T410055#11371972, @DamianZaremba wrote: > https://cluebotng-trainer.toolforge.org/Report%20Interface%20Import/2025-11-13%2018:02:49...
[18:11:42] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19341306596 (https://github.com/cluebotng/component-configs/commits/6e3dd964163e25f7115b74c7b132316745613729)
[18:11:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:17:02] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19341445508 (https://github.com/cluebotng/component-configs/commits/2a497ee04f7182e6d114d1aafa79a99e29ff6d22)
[18:17:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:25:46] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19341671161 (https://github.com/cluebotng/component-configs/commits/84f054bb9402daea356833fbd2d606f4d2b44b48)
[18:25:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:29:18] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19341777629 (https://github.com/cluebotng/component-configs/commits/afb04303301910afc2a7692e2e5676540a27c38c)
[18:29:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[18:40:18] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19342079797 (https://github.com/cluebotng/component-configs/commits/b8c173ced436297aaa9a09b2ef9a4b814d696a30)
[18:40:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[19:07:40] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19342782944 (https://github.com/cluebotng/component-configs/commits/fed3ef035303fd54334c5a7932d23a43e381492e)
[19:07:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[19:15:48] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19342998586 (https://github.com/cluebotng/component-configs/commits/17a2abf82e9a2076113891e4ae3d5159c048b940)
[19:15:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[19:19:40] cloud-services-team, Striker, CAS-SSO, Patch-For-Review: Use IDP for authentication in Striker - https://phabricator.wikimedia.org/T359554#11372282 (Arendpieter) @taavi do I need to do something else for https://gerrit.wikimedia.org/r/c/labs/striker/+/1189915 ?
[19:45:47] Cloud-Services: gitlab-docker-runner-v2.analytics instance is innaccesible via SSH - https://phabricator.wikimedia.org/T410083 (xcollazo) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it w...
[20:20:16] cloud-services-team, Cloud-VPS: gitlab-docker-runner-v2.analytics instance is innaccesible via SSH - https://phabricator.wikimedia.org/T410083#11372479 (JJMC89)
[20:31:09] VPS-Projects: gitlab-docker-runner-v2.analytics instance is innaccesible via SSH - https://phabricator.wikimedia.org/T410083#11372512 (taavi) The instance has Puppet roles [[ https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/d9b99547019cac8a9702fe4791ffaeffcbbadcea%5E%21/#F0 | applied ]]...
[21:01:58] VPS-Projects: gitlab-docker-runner-v2.analytics instance is inaccessible via SSH - https://phabricator.wikimedia.org/T410083#11372600 (Aklapper)
[21:49:35] !log tools.cluebotng-editsets Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19346862272 (https://github.com/cluebotng/component-configs/commits/65fa127d349eb5521d47017350bdc52ceb35d9ae)
[21:49:36] !log tools.cluebotng-trainer Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19346862269 (https://github.com/cluebotng/component-configs/commits/65fa127d349eb5521d47017350bdc52ceb35d9ae)
[21:49:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-editsets/SAL
[21:49:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[22:25:34] !log tools.cluebotng-trainer Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19347680045 (https://github.com/cluebotng/component-configs/commits/5de2267efa622608c868e4cb66658a4f367aa3b0)
[22:25:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng-trainer/SAL
[22:39:29] cloud-services-team, Cloud-VPS, CAS-SSO, Infrastructure-Foundations: sso failure in codfw1dev (labtesthorizon.wikimedia.org) - https://phabricator.wikimedia.org/T409328#11372894 (Andrew) @taavi this is one of the codfw1dev issues that has me blocked. I've spent a while messing with the envoy conf...
[23:29:43] Toolforge (Toolforge iteration 25): [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images - https://phabricator.wikimedia.org/T409191#11372981 (Raymond_Ndibe) **Lima kilo env configurations for anyone who wants to recreate (configmaps, limitranges, resourcequotas, etc. I...
[23:31:25] Toolforge (Toolforge iteration 25): [jobs-api] Investigate if we can reuse the 'web' flavour pre-built images as regular images - https://phabricator.wikimedia.org/T409191#11372990 (Raymond_Ndibe) It seems like we don't need to do any special thing to get the images to run @dcaro @fnegri
[23:37:38] (close) raymond-ndibe: d/changelog: bump to 0.0.16 [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/65 (https://phabricator.wikimedia.org/T400064)
[23:38:44] (approved) raymond-ndibe: global: only suport python3.13/trixie [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/67 (owner: dcaro)
[23:39:33] (approved) raymond-ndibe: toolforge_depoy: removed extension [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/296 (owner: dcaro)
[23:39:41] (update) raymond-ndibe: toolforge_depoy: removed extension [repos/cloud/toolforge/lima-kilo] - https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/merge_requests/296 (owner: dcaro)
[23:39:50] (update) raymond-ndibe: global: only suport python3.13/trixie [repos/cloud/toolforge/components-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/components-cli/-/merge_requests/67 (owner: dcaro)
[23:45:18] VPS-Projects: gitlab-docker-runner-v2.analytics instance is inaccessible via SSH - https://phabricator.wikimedia.org/T410083#11373007 (xcollazo) @taavi, I removed all puppet roles, and it still doesn't allow me to SSH in.