[01:00:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-22 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[01:00:20] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186326 (Xaloria) a:Aklapper Hello Can you review my request.
[01:08:18] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186330 (JJMC89) a:Aklapper→None
[01:28:28] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186336 (Pppery) Note that much of the task description is AI-generated. Despite that https://meta.miraheze.org/wiki/User:Xaloria is a real user, and CreateWiki genuinely does have compli...
[03:04:58] Cloud-Services: Quarry not working today - https://phabricator.wikimedia.org/T375988 (Liz) NEW The #Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag to this task. T...
[03:07:06] Cloud-Services: Quarry not working today - https://phabricator.wikimedia.org/T375988#10186395 (Liz)
[03:08:33] Quarry: Quarry not working today - https://phabricator.wikimedia.org/T375988#10186396 (Liz)
[04:30:36] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186432 (taavi) Who is "we" here?
As far as I can tell this request was created by a brand new account with no history with CreateWiki development or in the wider Wikimedia movement..
[05:06:24] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186458 (Xaloria) Ok so what should I do now is my request directly declined.
[05:12:20] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186461 (Pppery) Your request has not been declined yet. Various people, none of whom have final decision-making authority, have expressed skepticism over it. Someone in the #cloud-servic...
[05:55:06] VPS-Projects: Request for project membership - https://phabricator.wikimedia.org/T375985#10186470 (Bugreporter)
[07:04:30] Quarry: Quarry not working today - https://phabricator.wikimedia.org/T375988#10186544 (rook) Looks like k8s is having trouble with garbage collection ` Warning FreeDiskSpaceFailed 118s (x159 over 13h) kubelet Failed to garbage collect required amount of images. Attempted to free 4281512755 bytes, but o...
[07:08:30] Cloud-VPS (Project-requests): Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186548 (RhinosF1) I'm not sure what this is going to gain. 1) CreateWiki + ManageWiki are heavily intertwined. 2) Both extensions are used on thousands of wikis together. 1) and 2) wou...
[07:09:29] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186551 (RhinosF1)
[07:29:07] Quarry: worker nodes issue with garbage collection - https://phabricator.wikimedia.org/T375997 (rook) NEW
[07:30:52] Quarry: Quarry not working today - https://phabricator.wikimedia.org/T375988#10186590 (rook) Quarry is working again. Though I didn't have time to investigate what is happening so this may happen again.
Opening T375997 to investigate the underlying issue.
[07:32:16] Quarry: Quarry not working today - https://phabricator.wikimedia.org/T375988#10186592 (rook) Open→Resolved a:rook
[07:44:11] Tool-wikiloves: Armenia in WLM statistic. - https://phabricator.wikimedia.org/T375998 (Beko) NEW
[07:45:16] Tool-wikiloves: Armenia in WLM statistic. - https://phabricator.wikimedia.org/T375998#10186626 (Beko)
[07:51:12] (Restored) Hashar: Revert "switch to main gerrit server instead of using the replica" [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: Hashar)
[07:51:49] VPS-project-Codesearch, VPS-project-Extdist, collaboration-services, Gerrit, Patch-For-Review: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710#10186643 (hashar) Stalled→Open I am reopening since I apparently mis...
[08:01:09] (PS3) Hashar: Revert "switch to main gerrit server instead of using the replica" [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710)
[08:03:10] (CR) Hashar: "I have restored this change since I'd like the codesearch indexer to hit the replica instead of the primary." [labs/codesearch] - https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: Hashar)
[08:07:27] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186662 (aborrero) Open→Declined I'm tempted to declined per the feedback from the folks. But I will wait for the reply by @Xaloria to the comments by @...
[08:10:58] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10186666 (aborrero) Declined→Open
[08:35:43] VPS-Projects: Request for project membership - https://phabricator.wikimedia.org/T375985#10186696 (Dreamy_Jazz) Open→Declined Going to decline this as the owner of this VPS project. I use it to test schema changes for #CheckUser. Because #checkuser stores private data, it means that access to the...
[08:47:20] cloud-services-team, Cloud-VPS: Puppet fails on cloudcontrol when updating /srv/tofu-infra - https://phabricator.wikimedia.org/T373815#10186760 (dcaro) The scripts probably should use the lower level 'git fetch && git reset --hard && git clean -fdx' instead of 'checkout' to avoid the extra logic creating...
[08:58:24] cloud-services-team, Cloud-VPS: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10186885 (aborrero) I just checked, and in eqiad1 alone, we could track at least around 10 projects, without counting tools/toolsbeta. We just started using tofu, and we are tracking resources...
[09:00:00] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T372814)
[09:00:01] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T372814)
[09:00:05] Quarry: Quarry shows error: This web service cannot be reached - https://phabricator.wikimedia.org/T375988#10186916 (Aklapper)
[09:00:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:00:07] T372814: Put cloudcephosd10[39-41] into service - https://phabricator.wikimedia.org/T372814
[09:00:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:00:59] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T372814)
[09:01:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:08:09] FIRING: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[09:32:25] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:32:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:32:39] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=99)
[09:32:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:38:09] RESOLVED: CephClusterInWarning: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning
[09:39:21] !log dcaro@urcuchillay admin START - Cookbook 
wmcs.ceph.osd.reset_weights
[09:39:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:39:49] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:39:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:40:57] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:41:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:41:24] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:41:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:44:01] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:44:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:44:42] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:44:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:45:12] FIRING: Primary cloud switch inbound port utilisation over 80%: Alert for device cloudsw1-e4-eqiad.mgmt.eqiad.wmnet - Primary cloud switch inbound port utilisation over 80% - https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+inbound+port+utilisation+over+80%25
[09:45:12] FIRING: Primary cloud switch port utilisation over 80%: Alert for device cloudsw1-d5-eqiad.mgmt.eqiad.wmnet - Primary cloud switch port utilisation over 80% - https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+port+utilisation+over+80%25
[09:45:17] cloud-services-team: Primary cloud switch inbound port utilisation over 80% Rule: Primary cloud switch inbound port utilisation over 80% Faults: #1: sysObjectID = .1.3.6.1.4.1.2636.1.1.1.4.82.22; sysDescr = Juniper Networks, Inc. qfx5120-48y-8c Ethernet Switc... 
- https://phabricator.wikimedia.org/T376018
[09:45:18] cloud-services-team: Primary cloud switch port utilisation over 80% Rule: Primary cloud switch port utilisation over 80% Faults: #1: sysObjectID = .1.3.6.1.4.1.2636.1.1.1.4.82.5; sysDescr = Juniper Networks, Inc. qfx5100-48s-6q Ethernet Switch, kernel JUNOS 2... - https://phabricator.wikimedia.org/T376019
[09:46:09] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:46:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:46:50] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:46:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:48:15] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:48:23] (open) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[09:48:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:48:45] VPS-Projects: Request for project membership in Checkuser Beta Wiki project - https://phabricator.wikimedia.org/T375985#10187071 (Aklapper)
[09:48:54] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:48:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:49:05] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:49:48] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:49:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:55:12] RESOLVED: Primary cloud switch inbound port utilisation over 80%: Device cloudsw1-e4-eqiad.mgmt.eqiad.wmnet recovered from Primary cloud switch inbound port utilisation over 80% - 
https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+inbound+port+utilisation+over+80%25
[09:55:12] RESOLVED: Primary cloud switch port utilisation over 80%: Device cloudsw1-d5-eqiad.mgmt.eqiad.wmnet recovered from Primary cloud switch port utilisation over 80% - https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+port+utilisation+over+80%25
[09:55:13] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:55:18] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[09:55:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:55:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:55:44] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: initial IPv6 support in neutron - https://phabricator.wikimedia.org/T375847#10187090 (cmooney) >>! In T375847#10182667, @aborrero wrote: > I see the dhcp6 packets from my test VM arriving into neutron: > > `...
[09:58:48] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[09:58:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:00:12] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=99)
[10:00:15] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.reset_weights
[10:00:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:00:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:02:10] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.reset_weights (exit_code=0)
[10:02:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:04:28] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[10:04:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:06:02] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.reset_weights (exit_code=99)
[10:06:04] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[10:06:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:06:06] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: initial IPv6 support in neutron - https://phabricator.wikimedia.org/T375847#10187153 (cmooney) @aborrero the network assignment is incorrect also. 2a02:ec80:a100::/56 is the entire public IPv6 allocation for...
[10:06:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:07:42] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:08:07] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.reset_weights (exit_code=0)
[10:08:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:09:39] (PS1) David Caro: ceph: add cookbook to reset weights in the cluster [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702
[10:09:42] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:10:30] (CR) David Caro: "Tested in codfw:" [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702 (owner: David Caro)
[10:10:39] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:12:02] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:13:17] (CR) CI reject: [V:-1] ceph: add cookbook to reset weights in the cluster [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702 (owner: David Caro)
[10:15:12] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:16:15] 
(update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[10:16:37] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187185 (Xaloria) @aborrero @RhinosF1 As told to see my response here is my response at first Thanks for sharing your thoughts on the createwikitest project. I un...
[10:27:09] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187199 (RhinosF1) It still feels like I'm talking to GPT but >>! In T375454#10187185, @Xaloria wrote: > @aborrero @RhinosF1 > As told to see my response here is m...
[10:27:42] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[10:27:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:28:55] (PS2) David Caro: ceph: add cookbook to reset weights in the cluster [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702
[10:29:38] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187203 (Xaloria) Actually I don't know but Sorry If I was that wrong and big a huge mistake by requesting this VPS project really sorry.
[10:34:28] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187212 (aborrero) Open→Declined declining per latest comments in this ticket.
[10:34:57] cloud-services-team, Cloud-VPS, Infrastructure-Foundations, netops, SRE: openstack: work out IPv6 and designate integration - https://phabricator.wikimedia.org/T374715#10187215 (cmooney) Guys I would propose the following: * We delegate the allocated 'public' and 'private' ranges to the codf...
[10:35:42] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187217 (Xaloria) @aborrero And can you let me know what type of media wiki project can be accepted.
[10:36:51] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.reset_weights (exit_code=0)
[10:36:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:46:02] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187241 (aborrero) >>! In T375454#10187217, @Xaloria wrote: > @aborrero And can you let me know what type of media wiki project can be accepted. I invite you to...
[10:48:18] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187242 (Xaloria) Ok so interactions is important Iam still interested in creating an media wiki project and helping others as possible.
[10:49:41] Toolforge (Quota-requests): Request increased quota for sqid Toolforge tool - https://phabricator.wikimedia.org/T375070#10187244 (fnegri) a:fnegri
[10:50:35] Cloud-VPS (Quota-requests): Temporary (1-2 weeks) quota increase for disaster recovery exercise - https://phabricator.wikimedia.org/T375977#10187246 (aborrero) Open→In progress p:Triage→Medium +1 LGTM.
[10:52:48] Cloud-VPS (Quota-requests): Temporary (1-2 weeks) quota increase for disaster recovery exercise - https://phabricator.wikimedia.org/T375977#10187251 (fnegri) a:fnegri
[11:08:46] (CR) FNegri: [C:+1] "LGTM, I didn't check all the details, but the interface looks ok and I trust the fact it worked as expected in codfw." [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702 (owner: David Caro)
[11:12:31] FIRING: ToolsNfsAlmostFull: Toolforge NFS is 0.8513419848380185/1 full - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsNfsAlmostFull - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsNfsAlmostFull
[11:18:44] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[11:18:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:18:53] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.reset_weights (exit_code=99)
[11:18:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:19:00] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[11:19:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:33:16] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[11:34:09] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[11:35:32] cloud-services-team, Cloud-VPS, Patch-For-Review: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10187327 (fnegri) > We just started using tofu, and we are tracking resources in more than 4 projects already.
Can you list the projects where we are tracking resources a...
[11:41:27] cloud-services-team, Cloud-VPS, Patch-For-Review: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10187332 (fnegri) > Only the project resources (DNS records, custom security groups, etc.) would go into the project folders. For managing the project list, I would keep t...
[11:42:28] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[11:49:59] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.reset_weights (exit_code=99)
[11:50:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:50:51] (CR) David Caro: [C:+2] ceph: add cookbook to reset weights in the cluster [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702 (owner: David Caro)
[11:54:18] (Merged) jenkins-bot: ceph: add cookbook to reset weights in the cluster [cloud/wmcs-cookbooks] - https://gerrit.wikimedia.org/r/1076702 (owner: David Caro)
[12:01:25] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:03:58] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:05:01] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:05:37] (update) aborrero: projects: refactor into new repository layout 
[repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:06:47] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:31:16] (update) aborrero: projects: refactor into new repository layout [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/69 (https://phabricator.wikimedia.org/T375283)
[12:53:47] Cloud-VPS (Project-requests), affects-Miraheze: Request creation of createwikitest VPS project - https://phabricator.wikimedia.org/T375454#10187488 (Aklapper) @Xaloria: For information to set up MediaWiki locally, see https://www.mediawiki.org/wiki/How_to_become_a_MediaWiki_hacker and https://develop...
[13:01:52] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[13:01:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:11:21] cloud-services-team (FY2024/2025-Q1-Q2), Data-Services, Infrastructure-Foundations: Remove wmcs-admin access from production cumin hosts - https://phabricator.wikimedia.org/T347979#10187541 (fnegri) @taavi good question! Given that wikireplica hosts are not owned by a single team, my suggestion would...
[13:15:10] (open) hunsvotti: Fix cursor on links in image boxes [toolforge-repos/imgs-for-wikt] - https://gitlab.wikimedia.org/toolforge-repos/imgs-for-wikt/-/merge_requests/11
[13:15:16] (merge) hunsvotti: Fix cursor on links in image boxes [toolforge-repos/imgs-for-wikt] - https://gitlab.wikimedia.org/toolforge-repos/imgs-for-wikt/-/merge_requests/11
[13:22:38] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.reset_weights (exit_code=0)
[13:22:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:26:45] (update) dcaro: all: upgrade to tekton 0.59.X LTS [repos/cloud/toolforge/builds-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/111 (https://phabricator.wikimedia.org/T374908)
[13:38:13] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights
[13:38:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:47:39] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0) (T372814)
[13:47:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:47:46] T372814: Put cloudcephosd10[39-41] into service - https://phabricator.wikimedia.org/T372814
[13:51:50] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Patch-For-Review: [builds-builder,builds-api] Upgrade tekton - https://phabricator.wikimedia.org/T374908#10187640 (dcaro)
[13:51:59] Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.28 - https://phabricator.wikimedia.org/T362867#10187642 (dcaro)
[13:52:02] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Patch-For-Review: [builds-builder,builds-api] Upgrade tekton - https://phabricator.wikimedia.org/T374908#10187641 (dcaro)
[13:52:04] Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.28 - https://phabricator.wikimedia.org/T362867#10187644 
(Raymond_Ndibe)
[13:52:07] cloud-services-team, Toolforge: [infra,k8s] remove deprecated kubelet flags before 1.27 upgrade - https://phabricator.wikimedia.org/T370245#10187645 (Raymond_Ndibe)
[13:52:15] cloud-services-team, Toolforge: [infra,k8s] remove deprecated kubelet flags before 1.27 upgrade - https://phabricator.wikimedia.org/T370245#10187646 (dcaro)
[13:52:19] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Patch-For-Review: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.27 - https://phabricator.wikimedia.org/T359641#10187647 (dcaro)
[13:52:43] cloud-services-team, Toolforge: [infra,k8s] remove deprecated kubelet flags before 1.28 upgrade (we might be able to remove all custom ones) - https://phabricator.wikimedia.org/T370245#10187649 (dcaro)
[13:52:50] Toolforge: [builds-builder, builds-api] upgrade tekton version - https://phabricator.wikimedia.org/T370869#10187638 (dcaro) →Duplicate dup:T374908
[13:53:28] cloud-services-team (FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Patch-For-Review: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.27 - https://phabricator.wikimedia.org/T359641#10187650 (dcaro)
[13:56:44] Toolforge (Toolforge iteration 15): add --force to wmcs.toolforge.remove_k8s_node cookbook - https://phabricator.wikimedia.org/T375158#10187670 (Raymond_Ndibe) In progress→Resolved
[13:56:57] cloud-services-team, Toolforge: [infra,k8s] Move to kubernetes PAVs and drop kyverno - https://phabricator.wikimedia.org/T364293#10187672 (dcaro)
[13:56:58] Toolforge, Kubernetes: Toolforge: replace admission controllers with an existing policy admin project - https://phabricator.wikimedia.org/T335131#10187671 (dcaro)
[13:57:09] cloud-services-team, Toolforge, Patch-For-Review: [infra] Replace PodSecurityPolicy in Toolforge Kubernetes - https://phabricator.wikimedia.org/T279110#10187673 (dcaro)
[13:57:56] cloud-services-team 
(FY2024/2025-Q1-Q2), Toolforge (Toolforge iteration 15), Patch-For-Review: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.27 - https://phabricator.wikimedia.org/T359641#10187662 (Raymond_Ndibe) In progress→Resolved
[13:58:33] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.reset_weights (exit_code=99)
[13:58:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:59:31] Toolforge, Kubernetes: Toolforge: replace admission controllers with an existing policy admin project - https://phabricator.wikimedia.org/T335131#10187665 (dcaro) On {T362233} was decided to drop kyverno in favor ValidationAdmissionPolicies, so the focus moves to seeing if this can be done with those
[14:00:50] Toolforge, Kubernetes: [infra,k8s] replace admission controllers with an existing policy admin project - https://phabricator.wikimedia.org/T335131#10187679 (dcaro)
[14:00:54] cloud-services-team, Toolforge: [infra,k8s] Move to kubernetes PAVs and drop kyverno - https://phabricator.wikimedia.org/T364293#10187680 (dcaro)
[14:00:56] cloud-services-team, Toolforge: [infra,k8s] Move to kubernetes PAVs and drop kyverno - https://phabricator.wikimedia.org/T364293#10187682 (dcaro)
[14:00:57] Toolforge, Kubernetes: [infra,k8s] replace admission controllers with an existing policy admin project - https://phabricator.wikimedia.org/T335131#10187681 (dcaro)
[14:01:01] Toolforge, Kubernetes: [infra,k8s] replace admission controllers with an existing policy admin project - https://phabricator.wikimedia.org/T335131#10187684 (dcaro)
[14:01:16] cloud-services-team, Toolforge, Sustainability (Incident Followup): [infra,k8s] Scrape Kubernetes controller-manager and apiserver metrics into Prometheus - https://phabricator.wikimedia.org/T308381#10187690 (dcaro)
[14:04:58] cloud-services-team, Toolforge: [k8s,kyverno]: explore change from per-namespace policy resource to a single ClusterPolicy resource - 
https://phabricator.wikimedia.org/T368135#10187701 (10dcaro) [14:10:14] (03open) 10aborrero: network: have projects in the resouce id [repos/cloud/cloud-vps/tofu-infra] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/70 (https://phabricator.wikimedia.org/T375283) [14:45:48] (03open) 10taavi: Adapt home archiving configuration for NFS-on-VMs usage [repos/cloud/toolforge/disable-tool] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/20 (https://phabricator.wikimedia.org/T372701) [14:45:51] (03update) 10taavi: Adapt home archiving configuration for NFS-on-VMs usage [repos/cloud/toolforge/disable-tool] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/20 (https://phabricator.wikimedia.org/T372701) [14:46:15] (03update) 10taavi: Adapt home archiving configuration for NFS-on-VMs usage [repos/cloud/toolforge/disable-tool] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/20 (https://phabricator.wikimedia.org/T372701) [14:48:16] (03update) 10taavi: Adapt home archiving configuration for NFS-on-VMs usage [repos/cloud/toolforge/disable-tool] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/20 (https://phabricator.wikimedia.org/T372701) [15:36:15] (03update) 10aborrero: network: have projects in the resouce id [repos/cloud/cloud-vps/tofu-infra] - 10https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/70 (https://phabricator.wikimedia.org/T375283) [15:56:10] (03update) 10dcaro: all: upgrade to tekton 0.59.X LTS [repos/cloud/toolforge/builds-api] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/merge_requests/111 (https://phabricator.wikimedia.org/T374908) [15:56:15] 10Data-Services, 06Data-Platform-SRE, 06DBA: Prepare and check storage layer for kgewiki - https://phabricator.wikimedia.org/T374814#10188262 (10Ladsgroup) a:05ABran-WMF→03None Done now. 
[15:57:08] 10Data-Services, 06Data-Platform-SRE, 06DBA: Prepare and check storage layer for madwiktionary - https://phabricator.wikimedia.org/T375023#10188272 (10Ladsgroup) a:05ABran-WMF→03None DBA side is done now. [15:57:29] 10Data-Services, 06Data-Platform-SRE, 06DBA: Prepare and check storage layer for gorwikiquote - https://phabricator.wikimedia.org/T375094#10188277 (10Ladsgroup) a:05ABran-WMF→03None DBA side is done now. [15:58:01] 10Data-Services, 06Data-Platform-SRE, 06DBA: Prepare and check storage layer for shnwikinews - https://phabricator.wikimedia.org/T375432#10188284 (10Ladsgroup) a:05ABran-WMF→03None DBA side is done now. [15:59:41] 10Data-Services, 06Data-Platform-SRE, 06DBA: Prepare and check storage layer for moswiki - https://phabricator.wikimedia.org/T375568#10188267 (10Ladsgroup) a:05ABran-WMF→03None DBA side is done now. [16:01:19] 06cloud-services-team, 10Toolforge: [toolsbeta] Rebuild servers to learn how to take down the services without downtime (and use affinities) - https://phabricator.wikimedia.org/T267140#10188293 (10dcaro) 05Open→03Resolved a:03dcaro I think we can close it, and if/when we need to do anything new speci... [16:02:32] RESOLVED: ToolsNfsAlmostFull: Toolforge NFS is 0.8517862409119462/1 full - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsNfsAlmostFull - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolsNfsAlmostFull [16:10:59] 10wikitech.wikimedia.org, 10Wikimedia-Site-requests: fold contentadmin group to sysop in Wikitech - https://phabricator.wikimedia.org/T375950#10188340 (10bd808) 05Open→03Stalled Seems reasonable to me, but only after LDAP auth is removed from Wikitech. 
[16:11:01] 06cloud-services-team, 10wikitech.wikimedia.org, 06Infrastructure-Foundations, 06serviceops: wikitech self-auth: Allow wikitech to use its own internal authentication - https://phabricator.wikimedia.org/T371588#10188343 (10bd808) [16:11:02] 10wikitech.wikimedia.org, 10Wikimedia-Site-requests: fold contentadmin group to sysop in Wikitech - https://phabricator.wikimedia.org/T375950#10188342 (10bd808) [16:13:04] (03CR) 10Majavah: [C:03+2] Fix a bunch of strict mypy errors [labs/tools/majavah-bot] - 10https://gerrit.wikimedia.org/r/1055580 (owner: 10Majavah) [16:14:42] (03Merged) 10jenkins-bot: Fix a bunch of strict mypy errors [labs/tools/majavah-bot] - 10https://gerrit.wikimedia.org/r/1055580 (owner: 10Majavah) [16:19:12] (03PS1) 10Giuseppe Lavagetto: Add ssh key for conftool2git [labs/private] - 10https://gerrit.wikimedia.org/r/1076792 [16:20:03] 10cloud-services-team (Hardware), 06DC-Ops, 10ops-codfw, 06SRE: Q1:rack/setup/install cloudlb2004-dev - https://phabricator.wikimedia.org/T370678#10188377 (10Jhancock.wm) @aborrero I accidentally ran a few imaging attempts while just going through lists. Could you update the site.pp file for us? Thanks! 
[16:21:32] (03CR) 10Giuseppe Lavagetto: [V:03+2 C:03+2] Add ssh key for conftool2git [labs/private] - 10https://gerrit.wikimedia.org/r/1076792 (owner: 10Giuseppe Lavagetto) [16:22:57] (03CR) 10Majavah: [C:03+2] labsauth: Add field for SUL account ID [labs/striker] - 10https://gerrit.wikimedia.org/r/1009310 (https://phabricator.wikimedia.org/T359428) (owner: 10Majavah) [16:25:14] (03Merged) 10jenkins-bot: labsauth: Add field for SUL account ID [labs/striker] - 10https://gerrit.wikimedia.org/r/1009310 (https://phabricator.wikimedia.org/T359428) (owner: 10Majavah) [16:34:53] (03PS1) 10David Caro: ceph.undrain_osds_in_chunks: fix the chunking [cloud/wmcs-cookbooks] - 10https://gerrit.wikimedia.org/r/1076796 [16:35:02] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.osd.undrain_node (T372814) [16:35:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [16:35:09] T372814: Put cloudcephosd10[39-41] into service - https://phabricator.wikimedia.org/T372814 [16:41:07] !log dcaro@urcuchillay admin START - Cookbook wmcs.ceph.reset_weights [16:41:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [16:44:44] (03update) 10dcaro: builds-buidler: upgrade tekton [repos/cloud/toolforge/toolforge-deploy] - 10https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/531 [16:49:00] FIRING: Primary cloud switch port utilisation over 80%: Alert for device cloudsw1-d5-eqiad.mgmt.eqiad.wmnet - Primary cloud switch port utilisation over 80% - https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+port+utilisation+over+80%25 [16:49:08] 06cloud-services-team: Primary cloud switch port utilisation over 80% Rule: Primary cloud switch port utilisation over 80% Faults: #1: sysObjectID = .1.3.6.1.4.1.2636.1.1.1.4.82.5; sysDescr = Juniper Networks, Inc. qfx5100-48s-6q Ethernet Switch, kerne... 
- https://phabricator.wikimedia.org/T376019#10188539 [16:54:01] RESOLVED: Primary cloud switch port utilisation over 80%: Device cloudsw1-d5-eqiad.mgmt.eqiad.wmnet recovered from Primary cloud switch port utilisation over 80% - https://alerts.wikimedia.org/?q=alertname%3DPrimary+cloud+switch+port+utilisation+over+80%25 [16:55:29] 10VPS-project-Wikistats: Add shnwikinews to wikistats - https://phabricator.wikimedia.org/T375437#10188561 (10Dzahn) a:03Dzahn [16:59:18] (03CR) 10Majavah: [C:03+2] labsauth: Store SUL user ID like username [labs/striker] - 10https://gerrit.wikimedia.org/r/1009311 (https://phabricator.wikimedia.org/T359428) (owner: 10Majavah) [16:59:56] !log dcaro@urcuchillay admin END (PASS) - Cookbook wmcs.ceph.reset_weights (exit_code=0) [17:00:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [17:00:31] (03Merged) 10jenkins-bot: labsauth: Store SUL user ID like username [labs/striker] - 10https://gerrit.wikimedia.org/r/1009311 (https://phabricator.wikimedia.org/T359428) (owner: 10Majavah) [17:05:58] (03PS2) 10Abaris: Add i18n messages for Codex [labs/tools/intuition] - 10https://gerrit.wikimedia.org/r/1072839 (https://phabricator.wikimedia.org/T374873) [17:09:54] (03PS3) 10Abaris: Add i18n messages for Codex [labs/tools/intuition] - 10https://gerrit.wikimedia.org/r/1072839 (https://phabricator.wikimedia.org/T374873) [17:11:20] (03PS4) 10Abaris: Add i18n messages for Codex [labs/tools/intuition] - 10https://gerrit.wikimedia.org/r/1072839 (https://phabricator.wikimedia.org/T374873) [17:22:51] (03PS1) 10Majavah: register: Also filter on user SUL ID [labs/striker] - 10https://gerrit.wikimedia.org/r/1076809 (https://phabricator.wikimedia.org/T359428) [17:42:53] (03PS1) 10Majavah: dev(docker): Add wmf-user custom LDAP schema [labs/striker] - 10https://gerrit.wikimedia.org/r/1076814 (https://phabricator.wikimedia.org/T148048) [17:42:56] (03PS1) 10Majavah: labsauth: Write SUL account details to LDAP on registration 
[labs/striker] - 10https://gerrit.wikimedia.org/r/1076815 (https://phabricator.wikimedia.org/T148048) [17:42:59] (03PS1) 10Majavah: labsauth: Write SUL details to LDAP when updating linkage [labs/striker] - 10https://gerrit.wikimedia.org/r/1076816 (https://phabricator.wikimedia.org/T148048) [17:56:29] 10Tool-video-answer-tool, 06Future-Audiences: Implement attribution requirements for demo video - https://phabricator.wikimedia.org/T374376#10188965 (10Maryana) p:05Triage→03High [17:57:44] 10Tool-video-answer-tool, 06Future-Audiences, 07Spike: Investigate different options for animation of images - https://phabricator.wikimedia.org/T374367#10188970 (10Maryana) 05Open→03Resolved [17:57:48] 10Tool-video-answer-tool, 06Future-Audiences: Generate first batch of videos with latest styling - https://phabricator.wikimedia.org/T375888#10188972 (10Maryana) p:05Triage→03High [17:57:58] 10Tool-video-answer-tool, 06Future-Audiences: Add element to video preview to allow for text editing - https://phabricator.wikimedia.org/T375933#10188975 (10Maryana) p:05Triage→03High [18:13:51] 10Tool-video-answer-tool, 06Future-Audiences: Ensure Image Attribution Is Readable - https://phabricator.wikimedia.org/T375830#10189051 (10derenrich) a:03derenrich [19:20:41] FIRING: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [19:24:39] 10Toolforge-standards-committee (Maintainer needed): The EditGroups tool needs new maintainer(s) - https://phabricator.wikimedia.org/T376072 (10Pintoch) 03NEW [19:28:40] 10Toolforge-standards-committee (Maintainer needed), 10Tools, 10Wikidata: The EditGroups tool needs new maintainer(s) - https://phabricator.wikimedia.org/T376072#10189364 (10Pintoch) [19:28:57] FIRING: [2x] SystemdUnitDown: The service unit 
wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [19:30:41] RESOLVED: CloudVPSDesignateLeaks: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks [19:33:57] RESOLVED: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [19:47:24] 10Toolforge-standards-committee (Maintainer needed), 10Tools, 10Wikidata: The EditGroups tool needs new maintainer(s) - https://phabricator.wikimedia.org/T376072#10189435 (10bd808) One change that might help EditGroups stability would be switching from the Toolforge shared Redis to [[https://wikitech.wikimed... [19:48:02] 10Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.28 - https://phabricator.wikimedia.org/T362867#10189438 (10Raymond_Ndibe) [19:48:27] (03CR) 10Krinkle: "We generally maintain localisation in the same repo as the source code using it. Is that an option here? Placing it here will significantl" [labs/tools/intuition] - 10https://gerrit.wikimedia.org/r/1072839 (https://phabricator.wikimedia.org/T374873) (owner: 10Abaris) [19:50:53] 10VPS-project-Wikistats: Add tddwiki to wikistats - https://phabricator.wikimedia.org/T375428#10189468 (10Dzahn) 05Open→03Stalled stalled by T375422 (NOT T375424) [19:56:05] (03CR) 10Abaris: "@krinkle@fastmail.com Codex PHP already has its own i18n folder within the source code repository. 
Would it make sense to add Codex PHP to" [labs/tools/intuition] - 10https://gerrit.wikimedia.org/r/1072839 (https://phabricator.wikimedia.org/T374873) (owner: 10Abaris) [19:58:24] 10VPS-project-Wikistats: Add gorwikiquote to wikistats - https://phabricator.wikimedia.org/T375099#10189496 (10Dzahn) a:03Dzahn [19:58:40] 10VPS-project-Wikistats: Add kgewiki to wikistats - https://phabricator.wikimedia.org/T374819#10189497 (10Dzahn) a:03Dzahn [19:59:01] 10VPS-project-Wikistats: Add madwiktionary to wikistats - https://phabricator.wikimedia.org/T375028#10189498 (10Dzahn) a:03Dzahn [20:10:01] 10VPS-project-Wikistats: Add madwiktionary to wikistats - https://phabricator.wikimedia.org/T375028#10189533 (10Dzahn) 05Open→03Resolved ` MariaDB [wikistats]> insert into wiktionaries (prefix, lang, loclang, method) select prefix,lang,loclang,method from wikipedias where prefix="mad"; Query OK, 1 row af... [20:12:18] !log dcaro@urcuchillay admin END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99) (T372814) [20:12:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [20:12:24] T372814: Put cloudcephosd10[39-41] into service - https://phabricator.wikimedia.org/T372814 [20:15:02] 10VPS-project-Wikistats: Add moswiki to wikistats - https://phabricator.wikimedia.org/T374648#10189573 (10Dzahn) 05Open→03Resolved ` MariaDB [wikistats]> insert into wikipedias (prefix, lang, loclang, method) values ("mos","Mooré", "Mooré", 8); Query OK, 1 row affected (0.011 sec) --- dzahn@wi... [20:17:32] 10VPS-project-Wikistats: Add kgewiki to wikistats - https://phabricator.wikimedia.org/T374819#10189578 (10Dzahn) 05Open→03Resolved ` MariaDB [wikistats]> insert into wikipedias (prefix, lang, loclang, method) values ("kge","Komering", "Basa Kumoring", 8); Query OK, 1 row affected (0.008 sec) --- dzahn@... 
[20:18:02] 10VPS-project-Wikistats: Add gorwikiquote to wikistats - https://phabricator.wikimedia.org/T375099#10189583 (10Dzahn) 05Open→03Resolved ` MariaDB [wikistats]> insert into wikiquotes (prefix, lang, loclang, method) select prefix,lang,loclang,method from wikipedias where prefix="gor"; Query OK, 1 row affec... [20:33:57] FIRING: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [20:38:08] 06cloud-services-team, 10Cloud-VPS: cloudinfra hosts switching between 2 puppet changes / changes on every puppet run - https://phabricator.wikimedia.org/T263790#10189631 (10Dzahn) Given this ticket has been open/High for a couple years, I checked if it's still relevant at all. On what seems to be the cur... [20:38:14] 06cloud-services-team, 10Cloud-VPS: cloudinfra hosts switching between 2 puppet changes / changes on every puppet run - https://phabricator.wikimedia.org/T263790#10189638 (10Dzahn) a:05Dzahn→03None [20:38:57] RESOLVED: [2x] SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1003. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [20:40:27] 06cloud-services-team, 10Cloud-VPS: cloudinfra hosts switching between 2 puppet changes / changes on every puppet run - https://phabricator.wikimedia.org/T263790#10189635 (10Dzahn) 05Open→03Resolved a:03Dzahn Also on mx-out06.cloudinfra I don't see this issue anymore. [20:44:45] 10VPS-project-Codesearch, 10VPS-project-Extdist, 06collaboration-services, 10Gerrit, 13Patch-For-Review: Move clients off of gerrit-replica.wikimedia.org back to gerrit.wikimedia.org - https://phabricator.wikimedia.org/T336710#10189664 (10Dzahn) A thing that changed here since the previous comments is th... 
[20:46:52] (03CR) 10Dzahn: "can we talk about it in the teams or is it an urgent fix because the primary server was affected badly?" [labs/codesearch] - 10https://gerrit.wikimedia.org/r/920243 (https://phabricator.wikimedia.org/T336710) (owner: 10Hashar) [21:28:57] FIRING: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [21:33:57] RESOLVED: SystemdUnitDown: The service unit wikitech_run_jobs.service is in failed status on host cloudweb1004. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1004 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown [22:12:32] 10Cloud-Services: Prepare "What's new with Wikimedia Cloud Services" presentation for WikiConNA 2024 - https://phabricator.wikimedia.org/T373159#10189861 (10bd808) Draft slides at https://docs.google.com/presentation/d/1hBrl_bOcpkYldoIS50hFUU8hrX3kDcokO2CeIVojJ-s. I will export a PDF and upload it to commons onc... [22:47:29] FIRING: PuppetCertificateAboutToExpire: Puppet CA certificate mwv-builder-03.mediawiki-vagrant.eqiad.wmflabs is about to expire in 24d 23h 58m 34s - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetCertificateAboutToExpire - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetCertificateAboutToExpire