[00:09:48] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T277354)', diff saved to https://phabricator.wikimedia.org/P18327 and previous config saved to /var/cache/conftool/dbconfig/20220104-000947-marostegui.json
[00:09:49] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
[00:09:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:09:51] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
[00:09:51] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[00:09:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:09:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:22:54] RECOVERY - SSH on mw2258.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[00:39:52] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T297094)', diff saved to https://phabricator.wikimedia.org/P18328 and previous config saved to /var/cache/conftool/dbconfig/20220104-003951-marostegui.json
[00:39:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:39:56] T297094: Add globaluser.gu_hidden_level column to production - https://phabricator.wikimedia.org/T297094
[00:50:26] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=sidekiq site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:50:58] SRE, ops-eqsin: cr3-eqsin:xe-0/1/1 interface errors - https://phabricator.wikimedia.org/T298459 (wiki_willy) a:RobH
[00:54:56] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[00:54:57] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18329 and previous config saved to /var/cache/conftool/dbconfig/20220104-005456-marostegui.json
[00:54:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:10:01] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18330 and previous config saved to /var/cache/conftool/dbconfig/20220104-011001-marostegui.json
[01:10:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:25:06] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T297094)', diff saved to https://phabricator.wikimedia.org/P18331 and previous config saved to /var/cache/conftool/dbconfig/20220104-012506-marostegui.json
[01:25:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:25:10] T297094: Add globaluser.gu_hidden_level column to production - https://phabricator.wikimedia.org/T297094
[01:51:19] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
[01:51:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:51:21] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
[01:51:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:51:25] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1144:3314 (T277354)', diff saved to https://phabricator.wikimedia.org/P18332 and previous config saved to /var/cache/conftool/dbconfig/20220104-015125-marostegui.json
[01:51:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:51:27] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[01:52:34] Hello, I'm having problem with Gerrit.
[01:53:03] I frequently have this error on git clone/git fetch:
[01:53:06] Connection to gerrit.wikimedia.org closed by remote host.KiB/s
[01:53:06] fetch-pack: unexpected disconnect while reading sideband packet
[01:53:07] fatal: early EOF
[01:53:07] fatal: fetch-pack: invalid index-pack output
[01:55:34] My internet is okay, ISP told me I should ask someone from WMF.
[02:06:02] PROBLEM - Ensure hosts are not performing a change on every puppet run on cumin1001 is CRITICAL: CRITICAL: the following (6) node(s) change every puppet run: releases2002, contint1001, ms-be2065, miscweb1002, releases1002, contint2001 https://wikitech.wikimedia.org/wiki/Puppet%23check_puppet_run_changes
[02:06:07] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:06:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:06:57] (PS1) TrainBranchBot: Branch commit for wmf/1.38.0-wmf.16 [core] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751224
[02:06:59] (CR) TrainBranchBot: [C: +2] Branch commit for wmf/1.38.0-wmf.16 [core] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751224 (owner: TrainBranchBot)
[02:07:06] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:07:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:07:07] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:07:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:08:20] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:08:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:24:50] (CR) jerkins-bot: [V: -1] Branch commit for wmf/1.38.0-wmf.16 [core] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751224 (owner: TrainBranchBot)
[02:25:13] (IcingaOverload) firing: Checks are taking long to execute on alert2001:9245 - https://grafana.wikimedia.org/d/rsCfQfuZz/icinga - https://alerts.wikimedia.org
[02:25:30] (Merged) jenkins-bot: Branch commit for wmf/1.38.0-wmf.16 [core] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751224 (owner: TrainBranchBot)
[02:28:31] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:28:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:29:25] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:29:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:29:26] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
[02:29:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:30:22] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
[02:30:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:35:13] (IcingaOverload) resolved: Checks are taking long to execute on alert2001:9245 - https://grafana.wikimedia.org/d/rsCfQfuZz/icinga - https://alerts.wikimedia.org
[02:50:34] PROBLEM - Maps tiles generation on alert1001 is CRITICAL: CRITICAL: 90.35% of data under the critical threshold [5.0] https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=8&fullscreen&orgId=1
[03:26:39] (PS1) Ladsgroup: sre.mysql.upgrade: Fix calling icinga with list [cookbooks] - https://gerrit.wikimedia.org/r/751225 (https://phabricator.wikimedia.org/T239814)
[03:29:57] (CR) Ladsgroup: [C: +2] sre.mysql.upgrade: Fix calling icinga with list [cookbooks] - https://gerrit.wikimedia.org/r/751225 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[03:32:43] (Merged) jenkins-bot: sre.mysql.upgrade: Fix calling icinga with list [cookbooks] - https://gerrit.wikimedia.org/r/751225 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[03:35:55] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T277354)', diff saved to https://phabricator.wikimedia.org/P18333 and previous config saved to /var/cache/conftool/dbconfig/20220104-033555-marostegui.json
[03:35:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:35:59] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[03:36:13] !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
[03:36:13] !log ladsgroup@cumin1001 END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2144.codfw.wmnet
[03:36:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:36:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:36:40] ugh
[03:42:02] PROBLEM - Ensure hosts are not performing a change on every puppet run on cumin2001 is CRITICAL: CRITICAL: the following (6) node(s) change every puppet run: contint1001, releases2002, miscweb1002, contint2001, ms-be2065, releases1002 https://wikitech.wikimedia.org/wiki/Puppet%23check_puppet_run_changes
[03:42:05] (PS1) Ladsgroup: sre.mysql.upgrade: Fix the icinga, second try [cookbooks] - https://gerrit.wikimedia.org/r/751226 (https://phabricator.wikimedia.org/T239814)
[03:43:14] (CR) Ladsgroup: [C: +2] sre.mysql.upgrade: Fix the icinga, second try [cookbooks] - https://gerrit.wikimedia.org/r/751226 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[03:45:47] (Merged) jenkins-bot: sre.mysql.upgrade: Fix the icinga, second try [cookbooks] - https://gerrit.wikimedia.org/r/751226 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[03:50:04] !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
[03:50:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:50:11] !log ladsgroup@cumin1001 END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2144.codfw.wmnet
[03:50:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:50:51] Made some progress
[03:51:00] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18334 and previous config saved to /var/cache/conftool/dbconfig/20220104-035059-marostegui.json
[03:51:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:54:41] (PS1) Ladsgroup: sre.mysql.upgrade: Add logger object [cookbooks] - https://gerrit.wikimedia.org/r/751228 (https://phabricator.wikimedia.org/T239814)
[03:56:54] (CR) Ladsgroup: [C: +2] sre.mysql.upgrade: Add logger object [cookbooks] - https://gerrit.wikimedia.org/r/751228 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[03:59:30] (Merged) jenkins-bot: sre.mysql.upgrade: Add logger object [cookbooks] - https://gerrit.wikimedia.org/r/751228 (https://phabricator.wikimedia.org/T239814) (owner: Ladsgroup)
[04:01:03] !log ladsgroup@cumin1001 START - Cookbook sre.mysql.upgrade for db2144.codfw.wmnet
[04:01:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:05:24] !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2144.codfw.wmnet
[04:05:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:06:05] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P18335 and previous config saved to /var/cache/conftool/dbconfig/20220104-040604-marostegui.json
[04:06:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:17:05] (PS2) Ladsgroup: auto_schema: Rework upgrade_mysql to reuse code and cookbooks [software] - https://gerrit.wikimedia.org/r/748720 (https://phabricator.wikimedia.org/T239814)
[04:21:09] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T277354)', diff saved to https://phabricator.wikimedia.org/P18337 and previous config saved to /var/cache/conftool/dbconfig/20220104-042109-marostegui.json
[04:21:11] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
[04:21:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:21:12] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
[04:21:13] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[04:21:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:21:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:21:17] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1143 (T277354)', diff saved to https://phabricator.wikimedia.org/P18338 and previous config saved to /var/cache/conftool/dbconfig/20220104-042116-marostegui.json
[04:21:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:26:13] (PS3) Ladsgroup: auto_schema: Rework upgrade_mysql to reuse code and cookbooks [software] - https://gerrit.wikimedia.org/r/748720 (https://phabricator.wikimedia.org/T239814)
[04:30:13] (IcingaOverload) firing: Checks are taking long to execute on alert2001:9245 - https://grafana.wikimedia.org/d/rsCfQfuZz/icinga - https://alerts.wikimedia.org
[04:35:13] (IcingaOverload) resolved: Checks are taking long to execute on alert2001:9245 - https://grafana.wikimedia.org/d/rsCfQfuZz/icinga - https://alerts.wikimedia.org
[05:03:54] PROBLEM - SSH on mw2254.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[05:13:36] PROBLEM - Ensure hosts are not performing a change on every puppet run on cumin2002 is CRITICAL: CRITICAL: the following (6) node(s) change every puppet run: contint1001, releases2002, miscweb1002, releases1002, ms-be2065, contint2001 https://wikitech.wikimedia.org/wiki/Puppet%23check_puppet_run_changes
[05:17:14] PROBLEM - Backup freshness on backup1001 is CRITICAL: Stale: 1 (gerrit1001), Fresh: 106 jobs https://wikitech.wikimedia.org/wiki/Bacula%23Monitoring
[06:02:08] releases/contint are my fault, fixing today
[06:05:02] RECOVERY - SSH on mw2254.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:24:37] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
[06:24:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:24:38] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
[06:24:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:28:46] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
[06:28:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:28:48] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
[06:28:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:33:04] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
[06:33:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:33:05] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
[06:33:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:09] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
[06:37:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:10] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
[06:37:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:15] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1105:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18339 and previous config saved to /var/cache/conftool/dbconfig/20220104-063714-marostegui.json
[06:37:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:37:18] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[06:39:19] (PS3) Giuseppe Lavagetto: deployment_server: fix permissions for mwbuilder/other [puppet] - https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673)
[06:39:21] (PS1) Giuseppe Lavagetto: kubernetes::deployment_server: only include additional stuff in the role [puppet] - https://gerrit.wikimedia.org/r/751312
[06:44:58] (PS1) Giuseppe Lavagetto: Update for refactor of deployment server role [labs/private] - https://gerrit.wikimedia.org/r/751313
[06:45:53] (CR) Giuseppe Lavagetto: [V: +2 C: +2] Update for refactor of deployment server role [labs/private] - https://gerrit.wikimedia.org/r/751313 (owner: Giuseppe Lavagetto)
[06:50:26] (CR) Giuseppe Lavagetto: [V: +1] "PCC SUCCESS (NOOP 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33101/console" [puppet] - https://gerrit.wikimedia.org/r/751312 (owner: Giuseppe Lavagetto)
[06:55:48] (CR) Giuseppe Lavagetto: [V: +1 C: +2] kubernetes::deployment_server: only include additional stuff in the role [puppet] - https://gerrit.wikimedia.org/r/751312 (owner: Giuseppe Lavagetto)
[06:59:42] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18340 and previous config saved to /var/cache/conftool/dbconfig/20220104-065942-marostegui.json
[06:59:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:59:46] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[07:14:47] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18341 and previous config saved to /var/cache/conftool/dbconfig/20220104-071446-marostegui.json
[07:14:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:29:52] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P18342 and previous config saved to /var/cache/conftool/dbconfig/20220104-072951-marostegui.json
[07:29:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:37:45] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T277354)', diff saved to https://phabricator.wikimedia.org/P18343 and previous config saved to /var/cache/conftool/dbconfig/20220104-073745-marostegui.json
[07:37:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:37:48] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[07:44:56] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18344 and previous config saved to /var/cache/conftool/dbconfig/20220104-074456-marostegui.json
[07:44:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:45:00] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[07:45:01] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on 9 hosts with reason: Maintenance
[07:45:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:45:08] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 9 hosts with reason: Maintenance
[07:45:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:30] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
[07:47:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:47:32] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
[07:47:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:49:55] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
[07:49:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:49:57] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
[07:49:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:50:22] (PS1) Marostegui: db2094: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/751378 (https://phabricator.wikimedia.org/T295965)
[07:52:02] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
[07:52:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:52:04] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
[07:52:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:52:50] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18345 and previous config saved to /var/cache/conftool/dbconfig/20220104-075249-marostegui.json
[07:52:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:53:56] (CR) Marostegui: "The schema change was done - this can go." [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup)
[07:54:01] (CR) Marostegui: [C: +2] db2094: Disable notifications [puppet] - https://gerrit.wikimedia.org/r/751378 (https://phabricator.wikimedia.org/T295965) (owner: Marostegui)
[07:55:37] (CR) Ladsgroup: [C: +1] wmcs: Change maintain-views to prepare for schema change [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup)
[07:56:27] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
[07:56:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:32] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance
[07:56:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:56:43] !log marostegui@cumin1001 START - Cookbook sre.hosts.reimage for host db2094.codfw.wmnet with OS bullseye
[07:56:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[07:57:22] (PS1) Ladsgroup: trafficserver: Point dbtree.wm.o to miscweb instead of dbmonitor [puppet] - https://gerrit.wikimedia.org/r/751379 (https://phabricator.wikimedia.org/T297605)
[07:57:48] (Abandoned) Ladsgroup: trafficserver: Point dbtree.wm.o to miscweb instead of dbmonitor [puppet] - https://gerrit.wikimedia.org/r/747602 (https://phabricator.wikimedia.org/T297605) (owner: Ladsgroup)
[08:00:46] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
[08:00:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:00:47] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
[08:00:49] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:00:52] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1170:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18346 and previous config saved to /var/cache/conftool/dbconfig/20220104-080051-marostegui.json
[08:00:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:00:55] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[08:01:24] PROBLEM - Prometheus jobs reduced availability on alert1001 is CRITICAL: job=mysql-labs site=codfw https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:04:15] (PS1) Ladsgroup: Move all records of tendril to tendril-legacy [puppet] - https://gerrit.wikimedia.org/r/751380 (https://phabricator.wikimedia.org/T297605)
[08:05:50] (PS2) Ladsgroup: trafficserver: Point dbtree and tendril to miscweb instead of dbmonitor [puppet] - https://gerrit.wikimedia.org/r/751379 (https://phabricator.wikimedia.org/T297605)
[08:05:52] (PS2) Ladsgroup: Move all records of tendril to tendril-legacy [puppet] - https://gerrit.wikimedia.org/r/751380 (https://phabricator.wikimedia.org/T297605)
[08:06:04] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18347 and previous config saved to /var/cache/conftool/dbconfig/20220104-080604-marostegui.json
[08:06:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:06:08] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[08:07:08] (PS1) Ladsgroup: Change DNS entries for tendril [dns] - https://gerrit.wikimedia.org/r/751381 (https://phabricator.wikimedia.org/T297605)
[08:07:55] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P18348 and previous config saved to /var/cache/conftool/dbconfig/20220104-080754-marostegui.json
[08:07:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:17:01] (CR) Giuseppe Lavagetto: [C: -1] "I like the idea, but one of the checks doesn't really work as is." [docker-images/docker-pkg] - https://gerrit.wikimedia.org/r/731149 (https://phabricator.wikimedia.org/T283855) (owner: Hashar)
[08:21:09] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18349 and previous config saved to /var/cache/conftool/dbconfig/20220104-082109-marostegui.json
[08:21:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:22:48] (CR) MVernon: [C: +2] admin: add approver for the "restricted" group [puppet] - https://gerrit.wikimedia.org/r/747463 (owner: MVernon)
[08:22:59] (PS2) MVernon: admin: add approver for the "restricted" group [puppet] - https://gerrit.wikimedia.org/r/747463
[08:22:59] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1143 (T277354)', diff saved to https://phabricator.wikimedia.org/P18350 and previous config saved to /var/cache/conftool/dbconfig/20220104-082259-marostegui.json
[08:23:00] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
[08:23:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:23:02] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354
[08:23:02] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
[08:23:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:23:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:23:07] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1142 (T277354)', diff saved to https://phabricator.wikimedia.org/P18351 and previous config saved to /var/cache/conftool/dbconfig/20220104-082306-marostegui.json
[08:23:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:24:55] (PS3) Ladsgroup: Move all records of tendril to tendril-legacy [puppet] - https://gerrit.wikimedia.org/r/751380 (https://phabricator.wikimedia.org/T297605)
[08:25:19] (CR) Giuseppe Lavagetto: [C: -1] [WIP] mediawiki-cache-warmup: Add support for POST requests (3 comments) [puppet] - https://gerrit.wikimedia.org/r/737498 (https://phabricator.wikimedia.org/T290989) (owner: Ladsgroup)
[08:26:33] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2094.codfw.wmnet with OS bullseye
[08:26:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:26:37] (CR) Giuseppe Lavagetto: [C: -1] conftool: clean up references to obsolete restbase service (1 comment) [puppet] - https://gerrit.wikimedia.org/r/747098 (https://phabricator.wikimedia.org/T244843) (owner: Hnowlan)
[08:29:35] (CR) Ayounsi: [C: +1] netops: removed empty class [puppet] - https://gerrit.wikimedia.org/r/751158 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[08:36:14] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P18352 and previous config saved to /var/cache/conftool/dbconfig/20220104-083613-marostegui.json
[08:36:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:39:26] (CR) Giuseppe Lavagetto: "Personally I don't think it makes sense to move a chart from 0.0.20 to 1.0.0 for this change." [deployment-charts] - https://gerrit.wikimedia.org/r/751070 (https://phabricator.wikimedia.org/T295750) (owner: Jelto)
[08:42:42] (PS1) Majavah: prod: WRITE_BOTH for centralauth hidden level migration [mediawiki-config] - https://gerrit.wikimedia.org/r/751385 (https://phabricator.wikimedia.org/T289068)
[08:45:41] RECOVERY - Prometheus jobs reduced availability on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Prometheus%23Prometheus_job_unavailable https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets
[08:51:20] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18353 and previous config saved to /var/cache/conftool/dbconfig/20220104-085118-marostegui.json
[08:51:21] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
[08:51:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:51:22] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
[08:51:23] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[08:51:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:51:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:51:27] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1146:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18354 and previous config saved to /var/cache/conftool/dbconfig/20220104-085127-marostegui.json
[08:51:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[08:53:14] (CR) Ladsgroup: [C: +1] "LGTM but be careful with the deployment." [mediawiki-config] - https://gerrit.wikimedia.org/r/751385 (https://phabricator.wikimedia.org/T289068) (owner: Majavah)
[08:53:33] (CR) Majavah: wmcs: Change maintain-views to prepare for schema change (1 comment) [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup)
[08:57:54] (PS2) Ladsgroup: wmcs: Change maintain-views to prepare for schema change [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094)
[08:58:02] (CR) Ladsgroup: wmcs: Change maintain-views to prepare for schema change (1 comment) [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup)
[09:03:17] (PS2) David Caro: logstash:input:syslog: remove unused module [puppet] - https://gerrit.wikimedia.org/r/751127 (https://phabricator.wikimedia.org/T272559)
[09:03:19] (CR) David Caro: logstash:input:syslog: remove unused module (1 comment) [puppet] - https://gerrit.wikimedia.org/r/751127 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:04:06] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18355 and previous config saved to /var/cache/conftool/dbconfig/20220104-090406-marostegui.json
[09:04:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:04:09] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316
[09:04:12] !log start merging puppet cleanup patches
[09:04:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[09:05:42] (CR) David Caro: [C: +2] logstash:plugin: remove unused module [puppet] - https://gerrit.wikimedia.org/r/751129 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:06:07] (CR) Ema: [C: +1] "LGTM, and I double-checked that both tendril.wikimedia.org and dbtree.wikimedia.org are in subjectAltName on the certificate sent by miscw" [puppet] - https://gerrit.wikimedia.org/r/751379 (https://phabricator.wikimedia.org/T297605) (owner: Ladsgroup)
[09:06:34] Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:11:59] (CR) David Caro: [C: +2] nginx:snippet: remove unused class [puppet] - https://gerrit.wikimedia.org/r/751159 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:12:50] Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:12:54] (CR) David Caro: [C: +2] nginx::ssl: remove class [puppet] - https://gerrit.wikimedia.org/r/751160 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:13:51] Puppet, SRE, Infrastructure-Foundations, Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (dcaro)
[09:15:17] (PS1) David Caro: nginx::ssl: remove orphan template [puppet] - https://gerrit.wikimedia.org/r/751389 (https://phabricator.wikimedia.org/T272559)
[09:15:30] (CR) David Caro: [C: +2] netops: removed empty class [puppet] - https://gerrit.wikimedia.org/r/751158 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:17:04] (CR) David Caro: [C: +2] parsoid: remove unused module [puppet] - https://gerrit.wikimedia.org/r/751163 (https://phabricator.wikimedia.org/T272559) (owner: David Caro)
[09:17:11] (PS2) Ayounsi: Capirca: disable shade check [software/homer] - https://gerrit.wikimedia.org/r/749696 (https://phabricator.wikimedia.org/T273865)
[09:19:11] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18356 and previous config saved to
/var/cache/conftool/dbconfig/20220104-091910-marostegui.json [09:19:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:20:23] (03PS2) 10David Caro: parsoid: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751163 (https://phabricator.wikimedia.org/T272559) [09:20:25] (03CR) 10David Caro: parsoid: remove unused module (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751163 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:21:05] (03CR) 10David Caro: [C: 03+2] locales: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751126 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:22:10] (03CR) 10David Caro: [C: 03+2] mariadb: remove unused class [puppet] - 10https://gerrit.wikimedia.org/r/751135 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:22:17] (03CR) 10Ayounsi: [C: 03+2] Capirca: disable shade check [software/homer] - 10https://gerrit.wikimedia.org/r/749696 (https://phabricator.wikimedia.org/T273865) (owner: 10Ayounsi) [09:24:35] (03CR) 10David Caro: [C: 03+2] labs_debrepo: remove unused modules [puppet] - 10https://gerrit.wikimedia.org/r/751105 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:25:16] (03CR) 10David Caro: [C: 03+2] identd: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751098 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:25:25] (03Merged) 10jenkins-bot: Capirca: disable shade check [software/homer] - 10https://gerrit.wikimedia.org/r/749696 (https://phabricator.wikimedia.org/T273865) (owner: 10Ayounsi) [09:25:56] (03CR) 10Hashar: deployment_server: fix permissions for mwbuilder/other (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto) [09:26:31] (03PS4) 10Ladsgroup: Move all records of tendril to tendril-legacy [puppet] - 10https://gerrit.wikimedia.org/r/751380 
(https://phabricator.wikimedia.org/T297605) [09:26:33] (03PS1) 10Ladsgroup: Add tendril to cache routes [puppet] - 10https://gerrit.wikimedia.org/r/751390 (https://phabricator.wikimedia.org/T297605) [09:26:50] (03PS1) 10Ayounsi: Bump Capirca to 2.0.4 [software/homer] - 10https://gerrit.wikimedia.org/r/751391 (https://phabricator.wikimedia.org/T273865) [09:27:39] (03PS3) 10Ladsgroup: trafficserver: Point dbtree and tendril to miscweb instead of dbmonitor [puppet] - 10https://gerrit.wikimedia.org/r/751379 (https://phabricator.wikimedia.org/T297605) [09:28:17] (03CR) 10Ladsgroup: [C: 03+2] trafficserver: Point dbtree and tendril to miscweb instead of dbmonitor [puppet] - 10https://gerrit.wikimedia.org/r/751379 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [09:29:07] 10Puppet, 10Cloud-VPS, 10Infrastructure-Foundations, 10puppet-compiler, and 2 others: Improve PCC support for cloud VPS environments - https://phabricator.wikimedia.org/T289666 (10jbond) [09:29:32] 10Puppet, 10Cloud-VPS, 10Infrastructure-Foundations, 10puppet-compiler, and 2 others: Improve PCC support for cloud VPS environments - https://phabricator.wikimedia.org/T289666 (10jbond) [09:34:15] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P18357 and previous config saved to /var/cache/conftool/dbconfig/20220104-093415-marostegui.json [09:34:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:39:56] (03CR) 10Jbond: [C: 03+1] Bump Capirca to 2.0.4 [software/homer] - 10https://gerrit.wikimedia.org/r/751391 (https://phabricator.wikimedia.org/T273865) (owner: 10Ayounsi) [09:40:36] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/751210 (owner: 10Muehlenhoff) [09:41:02] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [09:46:13] PROBLEM - 
PyBal backends health check on lvs1020 is CRITICAL: PYBAL CRITICAL - CRITICAL - wdqs-heavy-queries_8888: Servers wdqs1007.eqiad.wmnet, wdqs1005.eqiad.wmnet are marked down but pooled: wdqs-ssl_443: Servers wdqs1007.eqiad.wmnet, wdqs1005.eqiad.wmnet are marked down but pooled: wdqs_80: Servers wdqs1007.eqiad.wmnet, wdqs1005.eqiad.wmnet are marked down but pooled https://wikitech.wikimedia.org/wiki/PyBal [09:46:40] looking ^ [09:48:19] RECOVERY - PyBal backends health check on lvs1020 is OK: PYBAL OK - All pools are healthy https://wikitech.wikimedia.org/wiki/PyBal [09:49:20] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298316)', diff saved to https://phabricator.wikimedia.org/P18358 and previous config saved to /var/cache/conftool/dbconfig/20220104-094920-marostegui.json [09:49:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [09:49:23] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [09:52:52] (03CR) 10David Caro: [C: 03+2] initramfs:hook: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751102 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:53:15] (03CR) 10David Caro: [C: 03+2] apparmor::hardlink: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751073 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:53:55] (03CR) 10David Caro: [C: 03+2] geoip:data:package: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751091 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:54:29] (03CR) 10David Caro: [C: 03+2] alternatives::install: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751072 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:55:05] (03CR) 10David Caro: [C: 03+2] html5depurate: remove unused role and modules [puppet] - 10https://gerrit.wikimedia.org/r/751093 
(https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:55:51] (03CR) 10David Caro: [C: 03+2] monitoring::graphite_freshness: remove define/cleanup [puppet] - 10https://gerrit.wikimedia.org/r/751148 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:57:45] (03CR) 10David Caro: [V: 03+1 C: 03+2] "PCC SUCCESS (NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33103/console" [puppet] - 10https://gerrit.wikimedia.org/r/751148 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [09:58:33] (03CR) 10David Caro: [V: 03+1 C: 03+2] "PCC SUCCESS (NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33104/console" [puppet] - 10https://gerrit.wikimedia.org/r/751148 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [10:00:56] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [10:06:42] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [10:07:05] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [10:11:02] (03PS1) 10Ladsgroup: microsites: Fix link to tendril-legacy [puppet] - 10https://gerrit.wikimedia.org/r/751394 (https://phabricator.wikimedia.org/T297605) [10:13:28] (03PS5) 10Jelto: charts: update charts to api v2 [deployment-charts] - 10https://gerrit.wikimedia.org/r/751070 (https://phabricator.wikimedia.org/T295750) [10:13:30] (03CR) 10Ladsgroup: [C: 03+2] microsites: Fix link to tendril-legacy [puppet] - 10https://gerrit.wikimedia.org/r/751394 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:14:01] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T277354)', diff saved to 
https://phabricator.wikimedia.org/P18359 and previous config saved to /var/cache/conftool/dbconfig/20220104-101400-marostegui.json [10:14:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:14:04] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354 [10:14:49] (03CR) 10Vgutierrez: [C: 03+1] Add tendril to cache routes [puppet] - 10https://gerrit.wikimedia.org/r/751390 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:15:27] (03PS2) 10Ladsgroup: Add tendril to cache routes [puppet] - 10https://gerrit.wikimedia.org/r/751390 (https://phabricator.wikimedia.org/T297605) [10:15:34] (03CR) 10Ladsgroup: [C: 03+2] Add tendril to cache routes [puppet] - 10https://gerrit.wikimedia.org/r/751390 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:20:58] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance [10:20:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:21:00] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance [10:21:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:21:36] (03PS1) 10Jbond: hieradta - cloud pki: fix classes [puppet] - 10https://gerrit.wikimedia.org/r/751399 [10:21:50] (03CR) 10Jbond: [V: 03+2 C: 03+2] hieradta - cloud pki: fix classes [puppet] - 10https://gerrit.wikimedia.org/r/751399 (owner: 10Jbond) [10:22:16] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10fgiunchedi) [10:24:39] 10SRE, 10ops-codfw: ms-be2065 failed drive sdq - https://phabricator.wikimedia.org/T297933 (10fgiunchedi) Thank you @Papaul [10:26:13] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on 
db1140.eqiad.wmnet with reason: Maintenance [10:26:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:26:14] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance [10:26:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:26:19] PROBLEM - very high load average likely xfs on ms-be2065 is CRITICAL: CRITICAL - load average: 132.54, 112.14, 70.06 https://wikitech.wikimedia.org/wiki/Swift [10:27:10] (03PS2) 10Ladsgroup: Change DNS entries for tendril [dns] - 10https://gerrit.wikimedia.org/r/751381 (https://phabricator.wikimedia.org/T297605) [10:27:47] (03CR) 10Vgutierrez: [C: 03+1] Change DNS entries for tendril [dns] - 10https://gerrit.wikimedia.org/r/751381 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:29:05] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18360 and previous config saved to /var/cache/conftool/dbconfig/20220104-102905-marostegui.json [10:29:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:30:10] (03PS1) 10David Caro: bastionhost::migration: remove unused profile [puppet] - 10https://gerrit.wikimedia.org/r/751401 (https://phabricator.wikimedia.org/T272559) [10:31:01] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [10:31:25] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance [10:31:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:31:27] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance [10:31:28] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:32:20] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [10:34:11] PROBLEM - SSH on mw2252.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [10:36:01] (03PS2) 10Jelto: changeprop/eventgate: bump kafka-dev dependencie to 0.1.0 [deployment-charts] - 10https://gerrit.wikimedia.org/r/751120 (https://phabricator.wikimedia.org/T295750) [10:36:16] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance [10:36:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:36:18] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance [10:36:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:37:14] (03PS1) 10David Caro: profile::ceph::common: remove unused profile [puppet] - 10https://gerrit.wikimedia.org/r/751403 (https://phabricator.wikimedia.org/T272559) [10:39:32] (03PS1) 10Kosta Harlan: Monitoring: Adjust logic for counting reverts [extensions/GrowthExperiments] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751191 (https://phabricator.wikimedia.org/T286366) [10:41:06] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance [10:41:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:41:08] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance [10:41:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:41:09] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet 
with reason: Maintenance [10:41:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:41:11] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance [10:41:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:44:10] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P18362 and previous config saved to /var/cache/conftool/dbconfig/20220104-104410-marostegui.json [10:44:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:47:03] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance [10:47:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:47:04] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance [10:47:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:47:09] (03CR) 10Ladsgroup: [C: 03+2] Change DNS entries for tendril [dns] - 10https://gerrit.wikimedia.org/r/751381 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:51:04] (03PS5) 10Ladsgroup: Move all records of tendril to tendril-legacy [puppet] - 10https://gerrit.wikimedia.org/r/751380 (https://phabricator.wikimedia.org/T297605) [10:51:38] (03CR) 10Ladsgroup: [C: 03+2] "DNS is in, we are going in." 
[puppet] - 10https://gerrit.wikimedia.org/r/751380 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [10:52:38] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance [10:52:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:52:39] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance [10:52:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:52:44] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1099:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18364 and previous config saved to /var/cache/conftool/dbconfig/20220104-105244-marostegui.json [10:52:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:52:47] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [10:53:52] (03CR) 10Filippo Giunchedi: [C: 03+1] "LGTM, easy enough to restore if/when we needed it" [puppet] - 10https://gerrit.wikimedia.org/r/751401 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [10:54:18] (03CR) 10JMeybohm: [C: 03+1] "This should work" [deployment-charts] - 10https://gerrit.wikimedia.org/r/751171 (https://phabricator.wikimedia.org/T262265) (owner: 10DCausse) [10:56:00] (03PS1) 10Majavah: admin: remove non-existent files from ldapadmins sudo rules [puppet] - 10https://gerrit.wikimedia.org/r/751404 [10:58:07] (03CR) 10Filippo Giunchedi: [C: 03+1] "LGTM, see inline for a nit" [puppet] - 10https://gerrit.wikimedia.org/r/746801 (owner: 10Giuseppe Lavagetto) [10:59:15] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1142 (T277354)', diff saved to https://phabricator.wikimedia.org/P18365 and previous config saved to /var/cache/conftool/dbconfig/20220104-105914-marostegui.json [10:59:16] !log 
marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [10:59:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:18] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance [10:59:18] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354 [10:59:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:22] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1141 (T277354)', diff saved to https://phabricator.wikimedia.org/P18366 and previous config saved to /var/cache/conftool/dbconfig/20220104-105922-marostegui.json [10:59:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:50] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18367 and previous config saved to /var/cache/conftool/dbconfig/20220104-105949-marostegui.json [10:59:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [10:59:52] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [11:00:55] PROBLEM - HTTPS-tendril on dbmonitor1002 is CRITICAL: SSL CRITICAL - failed to verify tendril.wikimedia.org against dbtree.wikimedia.org https://wikitech.wikimedia.org/wiki/Tendril [11:04:59] expected :) ^^ Amir1 [11:05:21] ugh, I'm gonna shut that down [11:07:34] okay, I downtimed it for a couple of months [11:07:43] (03PS1) 10Jbond: O:idp: add tendril-legacy.wikimedia.org to idp services [puppet] - 10https://gerrit.wikimedia.org/r/751406 [11:11:07] Amir1: let me look at that [11:11:26] (03PS1) 10David Caro: nagios_common::command: 
don't use a config when removing a command [puppet] - 10https://gerrit.wikimedia.org/r/751407 (https://phabricator.wikimedia.org/T272559) [11:12:26] (03PS1) 10Btullis: Add remaining aqs_next hosts to the aqs cluster [puppet] - 10https://gerrit.wikimedia.org/r/751408 (https://phabricator.wikimedia.org/T297803) [11:14:48] (03Abandoned) 10Jbond: O:idp: add tendril-legacy.wikimedia.org to idp services [puppet] - 10https://gerrit.wikimedia.org/r/751406 (owner: 10Jbond) [11:14:55] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P18368 and previous config saved to /var/cache/conftool/dbconfig/20220104-111454-marostegui.json [11:14:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:15:07] (03CR) 10JMeybohm: [C: 04-1] charts: update charts to api v2 (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/751070 (https://phabricator.wikimedia.org/T295750) (owner: 10Jelto) [11:15:13] (03PS1) 10Marostegui: webserver.pp: Disable tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) [11:15:53] (03CR) 10jerkins-bot: [V: 04-1] webserver.pp: Disable tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) (owner: 10Marostegui) [11:16:47] (03PS2) 10Marostegui: webserver.pp: Disable tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) [11:17:02] !log jayme@deploy1002 helmfile [staging-codfw] START helmfile.d/admin 'apply'. [11:17:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:18:18] !log jayme@deploy1002 helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [11:18:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:20:38] !log jayme@deploy1002 helmfile [staging-eqiad] START helmfile.d/admin 'apply'. 
[11:20:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:20:42] !log jayme@deploy1002 helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [11:20:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:23:58] (03CR) 10Ladsgroup: "Random idea: Just remove them. It's in git, we can revert it if needed." [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) (owner: 10Marostegui) [11:25:07] (03PS2) 10David Caro: nagios_common::command: don't need for absent [puppet] - 10https://gerrit.wikimedia.org/r/751407 (https://phabricator.wikimedia.org/T272559) [11:26:11] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/751407 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [11:28:04] (03CR) 10David Caro: [C: 03+2] nagios_common::command: don't need for absent [puppet] - 10https://gerrit.wikimedia.org/r/751407 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [11:28:10] (03PS3) 10Marostegui: webserver.pp: Remove tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) [11:28:14] (03PS6) 10Jelto: charts: update charts to api v2 [deployment-charts] - 10https://gerrit.wikimedia.org/r/751070 (https://phabricator.wikimedia.org/T295750) [11:29:04] (03CR) 10Jbond: [C: 03+2] "LGTM thanks will merge" [puppet] - 10https://gerrit.wikimedia.org/r/751404 (owner: 10Majavah) [11:29:32] (03CR) 10Jelto: charts: update charts to api v2 (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/751070 (https://phabricator.wikimedia.org/T295750) (owner: 10Jelto) [11:29:59] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P18369 and previous config saved to /var/cache/conftool/dbconfig/20220104-112959-marostegui.json [11:30:00] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:32:00] RECOVERY - HTTPS-tendril on dbmonitor1002 is OK: SSL OK - Certificate tendril-legacy.wikimedia.org valid until 2022-04-04 09:55:41 +0000 (expires in 89 days) https://wikitech.wikimedia.org/wiki/Tendril [11:32:19] I have a backport to wmf.16 that can't be tested anywhere since wmf.16 isn't yet deployed; shall I just +2 it now so it's out of the way for the backport window? [11:33:10] (03PS1) 10Vgutierrez: site: Reimage cp5005 as cache::upload_envoy [puppet] - 10https://gerrit.wikimedia.org/r/751413 (https://phabricator.wikimedia.org/T271421) [11:34:10] RECOVERY - SSH on mw2252.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:34:14] (03PS4) 10Marostegui: webserver.pp: Remove tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) [11:34:49] (03CR) 10jerkins-bot: [V: 04-1] webserver.pp: Remove tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) (owner: 10Marostegui) [11:36:11] (03PS5) 10Marostegui: webserver.pp: Remove tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) [11:36:17] ^ cc Amir1 / urbanecm on my question above [11:36:38] kostajh: yeah sounds fine, if the wmf branch was cloned to deploy1002 you would need to fetch it there but wmf.16 wasn't yet [11:37:04] kostajh: yup, please go ahead [11:37:06] (03CR) 10Kosta Harlan: [C: 03+2] "Backport" [extensions/GrowthExperiments] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751191 (https://phabricator.wikimedia.org/T286366) (owner: 10Kosta Harlan) [11:37:10] thanks both [11:37:13] It doesn't need sync, just rebase [11:37:22] not sure even rebase is needed [11:37:42] (if the directory is not there in deploy1001, it doesn't need it, otherwise do it) [11:38:51] (03CR) 10Ladsgroup: [C: 03+1] "This 
would make Daniel happy." [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) (owner: 10Marostegui) [11:39:00] (03CR) 10Marostegui: [C: 03+2] webserver.pp: Remove tendril checks [puppet] - 10https://gerrit.wikimedia.org/r/751409 (https://phabricator.wikimedia.org/T297605) (owner: 10Marostegui) [11:44:14] PROBLEM - SSH on dns5001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [11:45:04] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18370 and previous config saved to /var/cache/conftool/dbconfig/20220104-114503-marostegui.json [11:45:05] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance [11:45:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:45:07] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance [11:45:07] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [11:45:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:45:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:52] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance [11:49:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:49:54] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance [11:49:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:52:51] (03CR) 10Btullis: [C: 03+1] "Also lgtm." 
[puppet] - 10https://gerrit.wikimedia.org/r/751104 (owner: 10Elukey) [11:54:42] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [11:54:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:54:43] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [11:54:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [11:58:13] (03PS1) 10David Caro: profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 [11:58:20] (03CR) 10jerkins-bot: [V: 04-1] Monitoring: Adjust logic for counting reverts [extensions/GrowthExperiments] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751191 (https://phabricator.wikimedia.org/T286366) (owner: 10Kosta Harlan) [11:59:31] (03CR) 10Kosta Harlan: [C: 03+2] "recheck due to T297345" [extensions/GrowthExperiments] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751191 (https://phabricator.wikimedia.org/T286366) (owner: 10Kosta Harlan) [11:59:50] (03CR) 10David Caro: [V: 03+1] "PCC SUCCESS (NOOP 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33106/console" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:00:04] Amir1, Lucas_WMDE, awight, and Urbanecm: May I have your attention please! UTC morning backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T1200) [12:00:04] MatmaRex, _joe_, MdsShakil, Juan_90264, and kostajh: A patch you scheduled for UTC morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. 
[12:00:09] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on 15 hosts with reason: Maintenance [12:00:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:00:20] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 15 hosts with reason: Maintenance [12:00:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:00:40] (03PS1) 10Ladsgroup: Remove dbtree from doc pages [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751420 (https://phabricator.wikimedia.org/T297605) [12:00:57] hello [12:01:47] Hey ☺️ [12:02:12] hi [12:03:03] my patch for wmf.16 is being backported, waiting on CI (Selenium errored out if anyone wants to +2 a patch disabling a flaky test https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseLexeme/+/751419) [12:05:40] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1106,1154].eqiad.wmnet with reason: Maintenance [12:05:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:05:44] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1106,1154].eqiad.wmnet with reason: Maintenance [12:05:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:06:30] is anyone deploying yet? 
[12:08:59] (03PS2) 10David Caro: profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 [12:08:59] apparently not, I can manage the window then [12:09:39] MatmaRex: let's start with your config patches [12:09:39] (03CR) 10David Caro: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33107/console" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:09:52] sure, thanks [12:09:53] (03CR) 10Majavah: [C: 03+2] Make reply tool available as opt-out on metawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751167 (https://phabricator.wikimedia.org/T297534) (owner: 10Bartosz Dziewoński) [12:10:34] (03CR) 10David Caro: [V: 03+1] "The pcc change is the expected change for the parameter" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:10:41] (03Merged) 10jenkins-bot: Make reply tool available as opt-out on metawiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751167 (https://phabricator.wikimedia.org/T297534) (owner: 10Bartosz Dziewoński) [12:11:16] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance [12:11:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:11:18] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance [12:11:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:11:23] MatmaRex: your first patch is on mwdebug1001, please test [12:12:12] taavi: seems good [12:12:42] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [12:12:46] (03CR) 10Majavah: [C: 03+2] Make reply tool available as opt-out on specieswiki [mediawiki-config] - 
10https://gerrit.wikimedia.org/r/751168 (https://phabricator.wikimedia.org/T297535) (owner: 10Bartosz Dziewoński) [12:13:19] !log taavi@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751167|Make reply tool available as opt-out on metawiki (T297534)]] (duration: 00m 59s) [12:13:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:13:22] T297534: Config Change: Deploy Reply Tool as Opt-Out at Meta - https://phabricator.wikimedia.org/T297534 [12:13:35] (03Merged) 10jenkins-bot: Make reply tool available as opt-out on specieswiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751168 (https://phabricator.wikimedia.org/T297535) (owner: 10Bartosz Dziewoński) [12:14:11] and the second patch is on mwdebug1001 too [12:14:43] taavi: also looks good [12:14:58] great, syncing [12:15:09] joe: around? your patches are up next [12:15:33] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:15:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:15:53] !log taavi@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751168|Make reply tool available as opt-out on specieswiki (T297535)]] (duration: 00m 57s) [12:15:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:15:55] T297535: Config Change: Deploy Reply Tool as Opt-Out at Wikispecies - https://phabricator.wikimedia.org/T297535 [12:16:15] MatmaRex: both of your patches are now live! 
[12:16:34] let's continue to MdsShakil's patches then [12:16:34] thanks [12:16:37] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance [12:16:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:16:39] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance [12:16:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:16:43] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1105:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18372 and previous config saved to /var/cache/conftool/dbconfig/20220104-121643-marostegui.json [12:16:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:16:46] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [12:16:55] (03PS6) 10Majavah: Create autopatroller and patroller groups on bnwiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/749244 (https://phabricator.wikimedia.org/T298187) (owner: 10MdsShakil) [12:17:32] (03CR) 10Majavah: [C: 03+2] Create autopatroller and patroller groups on bnwiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/749244 (https://phabricator.wikimedia.org/T298187) (owner: 10MdsShakil) [12:17:49] MdsShakil: do you have the x-wikimedia-debug browser extension installed? [12:18:17] No, I am in mobile [12:18:24] Thanks to taavi for handling the window. 
[12:18:39] (03Merged) 10jenkins-bot: Create autopatroller and patroller groups on bnwiktionary [mediawiki-config] - 10https://gerrit.wikimedia.org/r/749244 (https://phabricator.wikimedia.org/T298187) (owner: 10MdsShakil) [12:18:43] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:18:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:18:44] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:18:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:19:19] ah, that's going to make things a bit difficult :/ [12:19:28] (03CR) 10Jbond: [C: 03+1] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:20:04] this patch is in theory simple enough that I can verify it myself, but in the future please make sure you can do it yourself [12:20:32] Yeah, sure [12:21:32] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:21:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:21:51] special:listgrouprights seems to be correct, so syncing the change to the entire cluster [12:22:56] !log taavi@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:749244|Create autopatroller and patroller groups on bnwiktionary (T298187)]] (duration: 00m 57s) [12:22:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:22:59] T298187: Create autopatroller and patroller groups on bnwiktionary - https://phabricator.wikimedia.org/T298187 [12:23:03] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18373 and previous config saved to /var/cache/conftool/dbconfig/20220104-122302-marostegui.json [12:23:05] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:23:05] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [12:23:12] MdsShakil: your change is now live, could you double check it works as expected please? [12:23:22] (03CR) 10David Caro: [V: 03+1 C: 03+2] profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:23:40] Juan does not appear to be around either [12:25:12] Looks fine, thanks you [12:25:20] great, happy to help [12:25:37] unless someone has something to deploy, I'll take this opportunity to backport https://gerrit.wikimedia.org/r/c/mediawiki/extensions/LdapAuthentication/+/751405/ [12:26:03] (03Merged) 10jenkins-bot: Monitoring: Adjust logic for counting reverts [extensions/GrowthExperiments] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751191 (https://phabricator.wikimedia.org/T286366) (owner: 10Kosta Harlan) [12:26:16] (03PS1) 10Majavah: Include ldap errno on account creation debug logs [extensions/LdapAuthentication] (wmf/1.38.0-wmf.13) - 10https://gerrit.wikimedia.org/r/751192 (https://phabricator.wikimedia.org/T298508) [12:26:34] (03PS1) 10Majavah: Include ldap errno on account creation debug logs [extensions/LdapAuthentication] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751193 (https://phabricator.wikimedia.org/T298508) [12:26:34] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:26:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:26:54] (03CR) 10Majavah: [C: 03+2] Include ldap errno on account creation debug logs [extensions/LdapAuthentication] (wmf/1.38.0-wmf.13) - 10https://gerrit.wikimedia.org/r/751192 (https://phabricator.wikimedia.org/T298508) (owner: 10Majavah) [12:26:57] (03CR) 10Majavah: [C: 03+2] Include ldap errno on account creation debug logs 
[extensions/LdapAuthentication] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751193 (https://phabricator.wikimedia.org/T298508) (owner: 10Majavah) [12:27:56] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:27:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:27:58] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:27:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:29:07] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:29:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:29:18] (03Merged) 10jenkins-bot: Include ldap errno on account creation debug logs [extensions/LdapAuthentication] (wmf/1.38.0-wmf.13) - 10https://gerrit.wikimedia.org/r/751192 (https://phabricator.wikimedia.org/T298508) (owner: 10Majavah) [12:29:20] (03Merged) 10jenkins-bot: Include ldap errno on account creation debug logs [extensions/LdapAuthentication] (wmf/1.38.0-wmf.16) - 10https://gerrit.wikimedia.org/r/751193 (https://phabricator.wikimedia.org/T298508) (owner: 10Majavah) [12:29:50] (03CR) 10Btullis: [C: 03+2] Add remaining aqs_next hosts to the aqs cluster [puppet] - 10https://gerrit.wikimedia.org/r/751408 (https://phabricator.wikimedia.org/T297803) (owner: 10Btullis) [12:30:17] (03PS3) 10David Caro: profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 [12:30:19] (03CR) 10David Caro: profile::openldap: increase the size limit for labs servers (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:32:08] (03CR) 10Jbond: [C: 03+1] "lg if ci passes if not see comment" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:32:44] taavi: my patch finished merging, do you need me to do any rebase or are 
you doing that as part of the patch you're backporting? [12:33:03] kostajh: wmf.16 is not yet cloned on deploy1002, I don't think I need to do anything [12:34:03] That should be correct. [12:34:09] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:34:10] !log taavi@deploy1002 Synchronized php-1.38.0-wmf.13/extensions/LdapAuthentication/includes/LdapAuthenticationPlugin.php: Backport: [[gerrit:751192|Include ldap errno on account creation debug logs (T298508)]] (duration: 00m 58s) [12:34:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:34:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:34:14] T298508: ⧼authmanager-authplugin-create-fail⧽ when trying to create developer account on wikitech - https://phabricator.wikimedia.org/T298508 [12:35:00] taavi: let me know if you want to do the CU maintenance now as well, or in the evening :-). [12:35:20] RECOVERY - Maps tiles generation on alert1001 is OK: OK: Less than 90.00% under the threshold [10.0] https://wikitech.wikimedia.org/wiki/Maps/Runbook https://grafana.wikimedia.org/dashboard/db/maps-performances?panelId=8&fullscreen&orgId=1 [12:35:25] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:35:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:35:27] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:35:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:36:09] you mean CA maintenance (not CU)? [12:36:24] Yes, sorry [12:36:36] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:36:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:36:43] sure! 
[12:36:55] (03PS2) 10Majavah: prod: WRITE_BOTH for centralauth hidden level migration [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751385 (https://phabricator.wikimedia.org/T289068) [12:36:57] Well, let me know once you need sth from me :) [12:37:42] so I'll pull that patch to mwdebug1001, and then you can test hiding an account and I'll verify it sets the correct hidden_level value? [12:37:54] (03CR) 10Majavah: [C: 03+2] prod: WRITE_BOTH for centralauth hidden level migration [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751385 (https://phabricator.wikimedia.org/T289068) (owner: 10Majavah) [12:38:08] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P18374 and previous config saved to /var/cache/conftool/dbconfig/20220104-123807-marostegui.json [12:38:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:38:38] (03Merged) 10jenkins-bot: prod: WRITE_BOTH for centralauth hidden level migration [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751385 (https://phabricator.wikimedia.org/T289068) (owner: 10Majavah) [12:38:46] !log marostegui@cumin1001 dbctl commit (dc=all): 'Remove recentchanges from s2 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P18375 and previous config saved to /var/cache/conftool/dbconfig/20220104-123845-marostegui.json [12:38:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:38:48] T263127: Remove groups from db configs - https://phabricator.wikimedia.org/T263127 [12:39:14] (03PS4) 10David Caro: profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 [12:39:32] urbanecm: that's on mwdebug1001 now [12:40:52] so can you globally lock/hide a test account and then please let me know the account name [12:41:18] (03CR) 10David Caro: [V: 03+1] "PCC SUCCESS (DIFF 1): 
https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33109/console" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:41:41] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:41:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:42:12] taavi: will do. Do you have any preferred test acc? Or just any will do? [12:42:21] (03CR) 10Jbond: [C: 03+1] "lgtm" [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:42:43] any, feel free to use for example [[User:Majavah test]] [12:42:53] (03CR) 10David Caro: [V: 03+1 C: 03+2] profile::openldap: increase the size limit for labs servers [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:42:55] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:42:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:42:56] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [12:42:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:43:03] Majavah-test I mean [12:43:17] (03CR) 10David Caro: [V: 03+1 C: 03+2] profile::openldap: increase the size limit for labs servers (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751417 (owner: 10David Caro) [12:43:27] taavi: sure. 
just to double check, you want me to _hide_ it (not suppress) [12:44:02] hiding should be fine, yes [12:44:10] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [12:44:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:44:19] taavi: 13:44, 4 January 2022 Martin Urbanec talk contribs block changed status for global account "User:Majavah-test@global": set locked, hidden; unset (none) (testing: per Majavah's request) [12:44:38] neat, as far as I see it did the right thing [12:45:10] great [12:45:16] RECOVERY - SSH on dns5001.mgmt is OK: SSH OK - OpenSSH_7.4 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [12:45:18] suppressing is a reversible action, right? [12:45:32] all of locking, hiding and suppressing are supposed to be reversible [12:45:39] (unless you broke sth in CA, that is ;)) [12:45:57] that's what I'm trying to verify did not happen :D [12:46:34] can you then try to suppress it, just to make sure that works too before undoing [12:46:47] taavi: i fully suppressed my own test, `Martin Urbanec (test 2)` [12:47:09] gu_hidden: suppressed; gu_hidden_level: 2 [12:47:11] sounds about right? [12:47:14] that did the right thing too [12:47:26] now undo it for either account please? [12:47:34] sure [12:48:05] majavah-test unlocked, _but left hidden_ taavi [12:48:15] (I'm surprised that's an option, but apparently it is) [12:48:36] that's an interesting option [12:48:42] I wonder what happens when I log in [12:48:50] feel free to try :) [12:49:03] (I'm also happy to remove the hiding too, if you want taavi) [12:49:40] it lets me log in, but not edit because "You cannot edit because your account is locked." [12:49:47] feel free to remove the hiding now too [12:50:12] taavi: majavah-test is now unlocked, not hidden [12:50:14] and gu_hidden_level was again set correctly [12:50:20] thanks for your help! 
I think I'll sync then [12:50:31] taavi: before you sync... [12:50:38] ...let me also undo the suppression test [12:50:46] (just in case it behaves differently for some wrong reason) [12:50:48] sure, waiting [12:50:49] *weird [12:51:06] `Martin Urbanec (test 2)` is now unlocked, not hidden, not suppressed [12:51:23] if i understand the new DB schema right, it did the right thing too [12:51:25] looks good in the db too [12:51:44] excellent [12:51:46] so syncing for real [12:51:50] +2 [12:52:39] !log taavi@deploy1002 Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:751385|prod: WRITE_BOTH for centralauth hidden level migration (T289068)]] (duration: 00m 57s) [12:52:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:52:42] T289068: Normalise centralauth.gu_hidden - https://phabricator.wikimedia.org/T289068 [12:53:12] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P18376 and previous config saved to /var/cache/conftool/dbconfig/20220104-125312-marostegui.json [12:53:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:53:16] so, done? [12:53:20] yeah, thanks! 
[12:53:25] !log UTC morning deploys done [12:53:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:58:45] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T277354)', diff saved to https://phabricator.wikimedia.org/P18377 and previous config saved to /var/cache/conftool/dbconfig/20220104-125845-marostegui.json [12:58:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [12:58:51] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354 [13:01:24] (03PS1) 10David Caro: p:db:development,r:beta_dashboards: remove unused classes [puppet] - 10https://gerrit.wikimedia.org/r/751430 (https://phabricator.wikimedia.org/T272559) [13:02:34] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [13:04:51] taavi: sorry, I forgot I put the patches up today, and I was cooking lunch [13:05:01] I'll deploy them myself later :/ [13:08:17] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298316)', diff saved to https://phabricator.wikimedia.org/P18378 and previous config saved to /var/cache/conftool/dbconfig/20220104-130816-marostegui.json [13:08:18] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance [13:08:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:20] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance [13:08:20] T298316: Fix nullability of column recentchanges.rc_params on wmf wikis - https://phabricator.wikimedia.org/T298316 [13:08:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:08:22] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:13:50] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18379 and previous config saved to /var/cache/conftool/dbconfig/20220104-131349-marostegui.json [13:13:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:26:58] RECOVERY - Backup freshness on backup1001 is OK: Fresh: 107 jobs https://wikitech.wikimedia.org/wiki/Bacula%23Monitoring [13:28:54] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P18380 and previous config saved to /var/cache/conftool/dbconfig/20220104-132854-marostegui.json [13:28:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:43:59] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1141 (T277354)', diff saved to https://phabricator.wikimedia.org/P18381 and previous config saved to /var/cache/conftool/dbconfig/20220104-134359-marostegui.json [13:44:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:01] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1121,1155].eqiad.wmnet with reason: Maintenance [13:44:03] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354 [13:44:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:06] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1121,1155].eqiad.wmnet with reason: Maintenance [13:44:07] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:44:10] !log marostegui@cumin1001 dbctl commit (dc=all): 'Depooling db1121 (T277354)', diff saved to https://phabricator.wikimedia.org/P18382 
and previous config saved to /var/cache/conftool/dbconfig/20220104-134410-marostegui.json [13:44:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:56:58] !log marostegui@cumin1001 START - Cookbook sre.hosts.downtime for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance [13:56:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:56:59] !log marostegui@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2087.codfw.wmnet with reason: Maintenance [13:57:00] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:57:10] !log bump prometheus k8s + ops space in eqiad [13:57:11] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [13:58:34] (03PS5) 10Jelto: services: cleanup helmfiles, update SAL logging [deployment-charts] - 10https://gerrit.wikimedia.org/r/737034 (https://phabricator.wikimedia.org/T251305) [13:59:32] (03CR) 10Jelto: services: cleanup helmfiles, update SAL logging (031 comment) [deployment-charts] - 10https://gerrit.wikimedia.org/r/737034 (https://phabricator.wikimedia.org/T251305) (owner: 10Jelto) [14:04:56] jouncebot: next [14:04:57] In 2 hour(s) and 55 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T1700) [14:05:12] ok, I have time to just deploy my patches from earlier [14:05:49] (03CR) 10Giuseppe Lavagetto: [C: 03+2] Remove dead symlinks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/746794 (https://phabricator.wikimedia.org/T285232) (owner: 10Giuseppe Lavagetto) [14:06:56] (03Merged) 10jenkins-bot: Remove dead symlinks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/746794 (https://phabricator.wikimedia.org/T285232) (owner: 10Giuseppe Lavagetto) [14:09:51] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [14:09:52] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:11:03] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:11:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:11:04] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [14:11:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:12:14] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:12:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:12:27] !log oblivian@deploy1002 Synchronized images: Config: Remove dead symlinks (T285232) (duration: 00m 58s) [14:12:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:12:29] T285232: The restricted/mediawiki-webserver image should include skins and resources - https://phabricator.wikimedia.org/T285232 [14:13:43] (03CR) 10Giuseppe Lavagetto: [C: 03+2] Make symlinks relative so they work on a local checkout too [mediawiki-config] - 10https://gerrit.wikimedia.org/r/746795 (https://phabricator.wikimedia.org/T285232) (owner: 10Giuseppe Lavagetto) [14:14:17] (03PS4) 10Jbond: WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 [14:14:24] (03Merged) 10jenkins-bot: Make symlinks relative so they work on a local checkout too [mediawiki-config] - 10https://gerrit.wikimedia.org/r/746795 (https://phabricator.wikimedia.org/T285232) (owner: 10Giuseppe Lavagetto) [14:15:47] (03CR) 10Jbond: [C: 03+1] p:db:development,r:beta_dashboards: remove unused classes [puppet] - 10https://gerrit.wikimedia.org/r/751430 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [14:17:12] (03PS1) 10PipelineBot: mathoid: pipeline bot promote [deployment-charts] - 10https://gerrit.wikimedia.org/r/751439 [14:17:26] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START 
helmfile.d/services/mwdebug: apply on pinkunicorn [14:17:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:18:36] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:18:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:18:37] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [14:18:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:19:49] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:19:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:20:08] joe: let me know if/when you're done. I have a noc doc patch [14:20:27] (03CR) 10jerkins-bot: [V: 04-1] WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 (owner: 10Jbond) [14:20:39] Amir1: a couple minutes I think [14:20:54] sure, no rush. 
Mine is pretty minor [14:21:24] !log oblivian@deploy1002 Synchronized docroot: Config: Make symlinks relative so they work on a local checkout too (T285232) (duration: 00m 57s) [14:21:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:21:27] T285232: The restricted/mediawiki-webserver image should include skins and resources - https://phabricator.wikimedia.org/T285232 [14:22:16] Amir1: done [14:22:24] awesome [14:22:43] (03PS2) 10Ladsgroup: Remove dbtree from doc pages [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751420 (https://phabricator.wikimedia.org/T297605) [14:22:47] (03CR) 10Ladsgroup: [C: 03+2] Remove dbtree from doc pages [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751420 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [14:23:30] (03Merged) 10jenkins-bot: Remove dbtree from doc pages [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751420 (https://phabricator.wikimedia.org/T297605) (owner: 10Ladsgroup) [14:24:12] done [14:26:08] PROBLEM - SSH on contint1001.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [14:29:59] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [14:30:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:30:41] (03CR) 10Giuseppe Lavagetto: deployment_server: fix permissions for mwbuilder/other (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto) [14:31:16] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:31:17] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:31:17] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [14:31:18] Logged the message at 
https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:34:12] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [14:34:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [14:35:08] (03PS1) 10David Caro: docker: remove unused modules/role/profiles [puppet] - 10https://gerrit.wikimedia.org/r/751445 (https://phabricator.wikimedia.org/T272559) [14:35:52] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [14:35:54] (03CR) 10Marostegui: [C: 03+1] auto_schema: Rework upgrade_mysql to reuse code and cookbooks [software] - 10https://gerrit.wikimedia.org/r/748720 (https://phabricator.wikimedia.org/T239814) (owner: 10Ladsgroup) [14:37:59] (03CR) 10Ladsgroup: [C: 03+2] auto_schema: Rework upgrade_mysql to reuse code and cookbooks [software] - 10https://gerrit.wikimedia.org/r/748720 (https://phabricator.wikimedia.org/T239814) (owner: 10Ladsgroup) [14:38:35] (03Merged) 10jenkins-bot: auto_schema: Rework upgrade_mysql to reuse code and cookbooks [software] - 10https://gerrit.wikimedia.org/r/748720 (https://phabricator.wikimedia.org/T239814) (owner: 10Ladsgroup) [14:40:16] (03PS4) 10Giuseppe Lavagetto: deployment_server: fix permissions for mwbuilder/other [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) [14:41:04] (03CR) 10jerkins-bot: [V: 04-1] deployment_server: fix permissions for mwbuilder/other [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto) [14:42:37] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [14:42:42] (03PS5) 10Giuseppe Lavagetto: deployment_server: fix permissions for mwbuilder/other [puppet] - 10https://gerrit.wikimedia.org/r/751166 
(https://phabricator.wikimedia.org/T297673) [14:45:11] (03CR) 10Andrew Bogott: [C: 03+1] wmcs: Change maintain-views to prepare for schema change [puppet] - 10https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: 10Ladsgroup) [14:45:20] (03PS1) 10David Caro: profile::parsoid::diffserver: remove unused profile [puppet] - 10https://gerrit.wikimedia.org/r/751446 (https://phabricator.wikimedia.org/T272559) [14:46:36] (03CR) 10Giuseppe Lavagetto: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33110/console" [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto) [14:46:58] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [14:49:25] (03CR) 10AOkoth: [C: 03+2] changeprop: increase memory limit for staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/748781 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth) [14:50:31] (03CR) 10Andrew Bogott: [C: 03+1] "I'm fine with 'delete now, recreate later if we need it'" [puppet] - 10https://gerrit.wikimedia.org/r/737437 (owner: 10David Caro) [14:51:29] (03CR) 10AOkoth: [V: 03+2 C: 03+2] changeprop: increase memory limit for staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/748781 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth) [14:52:42] (03Merged) 10jenkins-bot: changeprop: increase memory limit for staging [deployment-charts] - 10https://gerrit.wikimedia.org/r/748781 (https://phabricator.wikimedia.org/T293729) (owner: 10AOkoth) [15:01:52] (03PS1) 10David Caro: openldap: Raise openstack acl query size limit [puppet] - 10https://gerrit.wikimedia.org/r/751451 [15:02:30] (03CR) 10Andrew Bogott: [C: 03+1] openldap: Raise openstack acl query size limit [puppet] - 10https://gerrit.wikimedia.org/r/751451 (owner: 10David 
Caro) [15:03:22] (03PS7) 10Jelto: gitlab_runner: use config template for registering new runners [puppet] - 10https://gerrit.wikimedia.org/r/747539 (https://phabricator.wikimedia.org/T295481) [15:03:24] (03PS1) 10Jelto: P:prometheus::ops: add prometheus job and ferm rules for gitlab_runner metrics [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481) [15:05:44] (03CR) 10David Caro: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33111/console" [puppet] - 10https://gerrit.wikimedia.org/r/751451 (owner: 10David Caro) [15:06:47] (03CR) 10David Caro: [C: 03+2] "PCC makes sense" [puppet] - 10https://gerrit.wikimedia.org/r/751451 (owner: 10David Caro) [15:07:40] !log aokoth@deploy1002 helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . [15:07:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:11:42] (03CR) 10Filippo Giunchedi: [C: 03+1] "LGTM, CC'ing David as this is WMCS-specific" [puppet] - 10https://gerrit.wikimedia.org/r/751207 (https://phabricator.wikimedia.org/T273673) (owner: 10Zabe) [15:13:49] (03PS1) 10David Caro: role::wmcs::prometheus: remove unused role [puppet] - 10https://gerrit.wikimedia.org/r/751454 (https://phabricator.wikimedia.org/T238096) [15:16:00] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [15:18:00] (03CR) 10Jelto: [V: 03+1] "PCC SUCCESS (DIFF 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33112/console" [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481) (owner: 10Jelto) [15:20:19] (03CR) 10David Caro: [C: 03+1] "Thanks for the notice! 
LGTM, though we are currently swapping our cloudmetric hosts and there's nothing using this 'right now', adding and" [puppet] - 10https://gerrit.wikimedia.org/r/751207 (https://phabricator.wikimedia.org/T273673) (owner: 10Zabe) [15:21:32] (03PS2) 10Jelto: P:prometheus::ops: add prometheus job and ferm rules for gitlab_runner metrics [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481) [15:24:16] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T277354)', diff saved to https://phabricator.wikimedia.org/P18384 and previous config saved to /var/cache/conftool/dbconfig/20220104-152416-marostegui.json [15:24:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:24:19] T277354: "chemical" major mime type was never added to production database - https://phabricator.wikimedia.org/T277354 [15:25:33] (03PS1) 10David Caro: sonofagridengine: cleanup unused classes [puppet] - 10https://gerrit.wikimedia.org/r/751456 (https://phabricator.wikimedia.org/T272559) [15:26:37] (03CR) 10Jelto: [V: 03+1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33113/console" [puppet] - 10https://gerrit.wikimedia.org/r/751452 (https://phabricator.wikimedia.org/T295481) (owner: 10Jelto) [15:27:16] RECOVERY - SSH on contint1001.mgmt is OK: SSH OK - OpenSSH_6.6 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [15:27:29] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [15:28:07] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [15:29:35] 10SRE, 10Infrastructure-Foundations: Setup new mirror server (mirror1001.wikimedia.org) - https://phabricator.wikimedia.org/T286898 (10jhathaway) >>! 
In T286898#7594730, @faidon wrote: > Not sure if this has been flagged by anyone else or considered but note that our mirror is an official mirror for [[ https:/... [15:29:41] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [15:39:21] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18386 and previous config saved to /var/cache/conftool/dbconfig/20220104-153920-marostegui.json [15:39:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:39:23] (03CR) 10Btullis: [C: 03+2] Increase the threshold EventgateLoggingExternalLatency [alerts] - 10https://gerrit.wikimedia.org/r/748704 (https://phabricator.wikimedia.org/T294911) (owner: 10Btullis) [15:41:28] (03Merged) 10jenkins-bot: Increase the threshold EventgateLoggingExternalLatency [alerts] - 10https://gerrit.wikimedia.org/r/748704 (https://phabricator.wikimedia.org/T294911) (owner: 10Btullis) [15:42:14] (03PS2) 10Vgutierrez: site: Reimage cp5005 as cache::upload_envoy [puppet] - 10https://gerrit.wikimedia.org/r/751413 (https://phabricator.wikimedia.org/T271421) [15:42:16] (03PS1) 10Vgutierrez: cache::envoy: Increase LimitNOFILE [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) [15:43:19] (03CR) 10Vgutierrez: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33114/console" [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [15:44:15] (03CR) 10Giuseppe Lavagetto: [C: 04-1] cache::envoy: Increase LimitNOFILE (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [15:47:03] joe: interesting, I guess that would imply switching to use_override => false on the 
envoyproxy module to avoid puppet-override.conf collisions, right? [15:48:04] oh sigh, you're right [15:48:32] makes sense though, thanks [15:48:55] in this case, just add a notify => Exec['systemctl daemon-reload'] to the file resources [15:48:57] it's simpler [15:49:24] ah sigh no [15:50:07] notify Exec['systemctl daemon-reload for envoyproxy'] is the right name to use [15:50:39] Exec['systemd reload for envoyproxy'] even :D [15:50:49] sorry, I can't read apparently today [15:51:11] (03CR) 10Jbond: [C: 03+1] docker: remove unused modules/role/profiles [puppet] - 10https://gerrit.wikimedia.org/r/751445 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [15:51:28] (03CR) 10Jbond: [C: 03+1] profile::parsoid::diffserver: remove unused profile [puppet] - 10https://gerrit.wikimedia.org/r/751446 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [15:54:26] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P18387 and previous config saved to /var/cache/conftool/dbconfig/20220104-155425-marostegui.json [15:54:27] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:57:27] (03PS1) 10David Caro: p:wmcs::nfs::misc/mist_backup/backup_keys: remove unused profiles [puppet] - 10https://gerrit.wikimedia.org/r/751460 (https://phabricator.wikimedia.org/T272559) [15:57:34] joe: hmm that triggers a dependency cycle due to /etc/systemd/system/envoyproxy.service.d/ being created by the override managed by the envoyproxy module that also requires the daemon-reload Exec [15:58:23] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [15:58:57] (03CR) 10Andrew Bogott: p:wmcs::nfs::misc/mist_backup/backup_keys: remove unused profiles (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751460 (https://phabricator.wikimedia.org/T272559) (owner: 
10David Caro) [16:00:17] (03PS2) 10David Caro: p:wmcs::nfs::misc/misc_backup/backup_keys: remove unused profiles [puppet] - 10https://gerrit.wikimedia.org/r/751460 (https://phabricator.wikimedia.org/T272559) [16:00:19] (03CR) 10David Caro: p:wmcs::nfs::misc/misc_backup/backup_keys: remove unused profiles (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751460 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [16:00:21] (03CR) 10Herron: [C: 04-1] "Great idea to make these easier to manage! Please see comments inline" [puppet] - 10https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: 10JHathaway) [16:01:16] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:02:25] (03CR) 10Vgutierrez: [V: 03+1] cache::envoy: Increase LimitNOFILE (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [16:04:23] (03CR) 10Jbond: [C: 03+1] role::wmcs::prometheus: remove unused role [puppet] - 10https://gerrit.wikimedia.org/r/751454 (https://phabricator.wikimedia.org/T238096) (owner: 10David Caro) [16:05:03] (03PS1) 10David Caro: r:wmcs:openstack:eqiad1:cumin_controller: remove unused role [puppet] - 10https://gerrit.wikimedia.org/r/751461 (https://phabricator.wikimedia.org/T234462) [16:06:52] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:09:30] !log marostegui@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T277354)', diff saved to https://phabricator.wikimedia.org/P18388 and previous config saved to /var/cache/conftool/dbconfig/20220104-160930-marostegui.json [16:09:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:09:34] T277354: "chemical" major mime type was never added to 
production database - https://phabricator.wikimedia.org/T277354 [16:10:12] vgutierrez: uhm not sure I follow how that's a dependency cycle [16:10:27] you already depend on the module being included before you create those files [16:10:50] (03CR) 10Bartosz Dziewoński: [C: 04-1] "Done in https://gerrit.wikimedia.org/r/751167, I didn't realize you had already prepared a patch :/" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/748780 (https://phabricator.wikimedia.org/T297534) (owner: 10Esanders) [16:11:13] (03CR) 10Bartosz Dziewoński: [C: 04-1] "Done in https://gerrit.wikimedia.org/r/https://gerrit.wikimedia.org/r/751168, I didn't realize you had already prepared a patch :/" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/748727 (https://phabricator.wikimedia.org/T297535) (owner: 10Esanders) [16:11:17] uhm unless puppet is stupid enough to float the whole class later because of that [16:12:05] vgutierrez: then the alternative is to declare another daemon-reload yourself, not so horrible tbh [16:14:49] (03PS2) 10Vgutierrez: cache::envoy: Increase LimitNOFILE [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) [16:15:51] (03PS5) 10Jbond: WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 [16:16:07] (03PS4) 10Southparkfan: Add WMCS specific cloud role for syslog server [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) [16:16:11] (03CR) 10Vgutierrez: [V: 03+1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/33115/console" [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [16:16:48] (03CR) 10jerkins-bot: [V: 04-1] Add WMCS specific cloud role for syslog server [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) (owner: 10Southparkfan) [16:17:28] joe: pcc seems happy with your approach 
[16:17:31] testing on WMCS [16:17:42] (03CR) 10Southparkfan: [C: 04-1] "Merely a rebase; do not merge. The next step for me is to process your comments." [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) (owner: 10Southparkfan) [16:17:47] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:19:18] (03CR) 10Michael DiPietro: [C: 03+1] kubeadm: raise default to 1.20 [puppet] - 10https://gerrit.wikimedia.org/r/739402 (owner: 10Majavah) [16:20:04] (03CR) 10Giuseppe Lavagetto: [V: 03+1 C: 03+2] deployment_server: fix permissions for mwbuilder/other [puppet] - 10https://gerrit.wikimedia.org/r/751166 (https://phabricator.wikimedia.org/T297673) (owner: 10Giuseppe Lavagetto) [16:22:27] (03CR) 10jerkins-bot: [V: 04-1] WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 (owner: 10Jbond) [16:22:33] joe: you win :) [16:22:54] vgutierrez: puppet is stupid, but not *that* stupid [16:23:13] (03PS6) 10Jbond: WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 [16:23:19] 10SRE, 10SRE-OnFire, 10Wikimedia-Incident: Incident: 2021-12-03 mx2001->Gmail delivery issues - https://phabricator.wikimedia.org/T297127 (10herron) >>! In T297127#7583730, @Dzahn wrote: > @Herron How do you see this is as the task creator, should this stay open until all subtasks are resolved? That means ev... 
[16:26:51] (03PS1) 10David Caro: r:wmcs:paws:k8s:etcd: remove unused role [puppet] - 10https://gerrit.wikimedia.org/r/751463 (https://phabricator.wikimedia.org/T188912) [16:27:32] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:29:08] vgutierrez@traffic-cache-envoyupload-buster:~$ sudo -i systemctl show envoyproxy.service |grep NOFILE [16:29:09] LimitNOFILE=65536 [16:29:20] sadly the most restrictive LimitNOFILE seems to be applied [16:29:48] (03CR) 10jerkins-bot: [V: 04-1] WIP: add reposync [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 (owner: 10Jbond) [16:30:28] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:31:54] (03CR) 10Herron: [C: 03+1] hieradata: add more network probes for internal services [puppet] - 10https://gerrit.wikimedia.org/r/747805 (https://phabricator.wikimedia.org/T291946) (owner: 10Filippo Giunchedi) [16:34:28] (03CR) 10Herron: [C: 03+1] prometheus: extend blackbox probes options [puppet] - 10https://gerrit.wikimedia.org/r/747835 (https://phabricator.wikimedia.org/T291946) (owner: 10Filippo Giunchedi) [16:35:09] (03CR) 10Herron: [C: 03+1] logstash:input:syslog: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751127 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [16:35:53] (03Abandoned) 10Herron: logstash: move api-feature-usage outputs to elk7 cluster [puppet] - 10https://gerrit.wikimedia.org/r/744862 (https://phabricator.wikimedia.org/T297239) (owner: 10Herron) [16:37:52] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10jbond) [16:44:41] (03CR) 10Giuseppe Lavagetto: wmflib: add service::get_services_for function (031 comment) [puppet] - 
10https://gerrit.wikimedia.org/r/746801 (owner: 10Giuseppe Lavagetto) [16:44:52] (03PS2) 10Giuseppe Lavagetto: wmflib: add service::get_services_for function [puppet] - 10https://gerrit.wikimedia.org/r/746801 [16:45:00] (03PS2) 10David Caro: profile::parsoid::diffserver: remove unused profile [puppet] - 10https://gerrit.wikimedia.org/r/751446 (https://phabricator.wikimedia.org/T272559) [16:46:06] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:48:07] (03CR) 10Giuseppe Lavagetto: [C: 03+1] cache::envoy: Increase LimitNOFILE [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [16:48:20] _joe_: that doesn't work [16:48:35] <_joe_> vgutierrez: wdym? [16:49:12] (03PS1) 10David Caro: varnish: remove empty init class and unused module [puppet] - 10https://gerrit.wikimedia.org/r/751465 (https://phabricator.wikimedia.org/T272559) [16:49:20] (03PS3) 10Vgutierrez: cache::envoy: Increase LimitNOFILE [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) [16:49:29] (03CR) 10Jbond: WIP: add reposync (037 comments) [software/spicerack] - 10https://gerrit.wikimedia.org/r/747116 (owner: 10Jbond) [16:49:48] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:49:53] (03PS2) 10Majavah: LabsServices: refresh cloudmetrics server [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747494 (https://phabricator.wikimedia.org/T289888) [16:49:55] _joe_: two things.. LimitNOFile= triggers a syntax error. 
And we need to set the new LimitNOFILE value on a config file that gets evaluated after puppet-override.conf [16:49:55] <_joe_> vgutierrez: aHAHAHA just read the comment [16:50:03] so traffic-limits.conf for instance :) [16:50:28] <_joe_> vgutierrez: oh right, LimitNOFile expects an integer [16:50:39] <_joe_> so it replaces values? I didn't know, TIL [16:51:02] yep [16:52:01] https://www.irccloud.com/pastebin/1XOHamRN/ [16:52:02] <_joe_> but yes the alphabetical order makes perfect sense [16:52:10] (03PS1) 10David Caro: udp2log:rsyncd: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751466 (https://phabricator.wikimedia.org/T272559) [16:52:22] lexicographic order per systemd documentation [16:52:36] I wonder if we should refactor our puppet-override.conf to be called 99-puppet-override.conf [16:53:02] <_joe_> why not 00- [16:53:04] <_joe_> ? [16:53:15] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [16:53:23] <_joe_> heh that's something to consider [16:53:36] yeah sorry, 00 [16:53:52] <_joe_> but yes, I get the idea [16:54:04] <_joe_> so if it's an override, we allow to define a priority for it [16:54:25] <_joe_> and possibly setting a name [16:54:44] <_joe_> well there is some designing to do if we want to refactor it, not at 6 pm :D [16:54:52] that's for sure [16:54:58] PS3 works for me :) [16:55:23] <_joe_> 🚢 it! [16:55:35] !log ebernhardson@deploy1002 Started deploy [wikimedia/discovery/analytics@b38fb58]: Switch mjolnir norm_query_clustering to the shsaded refinery jar [16:55:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:55:58] now I don't want to! 
damn emojis ;P [16:56:32] (03CR) 10Vgutierrez: [C: 03+2] cache::envoy: Increase LimitNOFILE [puppet] - 10https://gerrit.wikimedia.org/r/751459 (https://phabricator.wikimedia.org/T271421) (owner: 10Vgutierrez) [16:57:46] !log ebernhardson@deploy1002 Finished deploy [wikimedia/discovery/analytics@b38fb58]: Switch mjolnir norm_query_clustering to the shsaded refinery jar (duration: 02m 11s) [16:57:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [17:00:05] jbond and rzl: #bothumor When your hammer is PHP, everything starts looking like a thumb. Rise for Puppet request window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T1700). [17:00:05] No Gerrit patches in the queue for this window AFAICS. [17:04:49] (03PS5) 10Southparkfan: Add WMCS specific cloud role for syslog server [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) [17:09:16] (03PS1) 10David Caro: systemtap::runtime: remove unused module [puppet] - 10https://gerrit.wikimedia.org/r/751469 (https://phabricator.wikimedia.org/T272559) [17:11:27] (03PS1) 10Zabe: graphite: whisper_cleanup: migrate cron to systemd timer job [puppet] - 10https://gerrit.wikimedia.org/r/751470 (https://phabricator.wikimedia.org/T273673) [17:11:29] (03PS1) 10Zabe: graphite: whisper_cleanup: remove absented cron [puppet] - 10https://gerrit.wikimedia.org/r/751471 (https://phabricator.wikimedia.org/T273673) [17:11:35] 10Puppet, 10SRE, 10Infrastructure-Foundations, 10Patch-For-Review: Unused puppet resources audit, 2021 - https://phabricator.wikimedia.org/T272559 (10dcaro) [17:18:02] (03CR) 10Southparkfan: "It has taken a while due to certain events, but I should have fixed all comments by now. Please re-review!" [puppet] - 10https://gerrit.wikimedia.org/r/682259 (https://phabricator.wikimedia.org/T127717) (owner: 10Southparkfan) [17:37:21] (03PS2) 10Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) [17:38:25] (03CR) 10Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. (032 comments) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: 10Clare Ming) [17:48:29] (03PS1) 10Majavah: P:graphite: support not using CAS [puppet] - 10https://gerrit.wikimedia.org/r/751477 (https://phabricator.wikimedia.org/T241285) [17:50:47] (03PS3) 10Majavah: LabsServices: use deployment-graphite01 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747494 (https://phabricator.wikimedia.org/T241285) [17:52:15] (03CR) 10Majavah: "check experimental" [puppet] - 10https://gerrit.wikimedia.org/r/751477 (https://phabricator.wikimedia.org/T241285) (owner: 10Majavah) [18:00:05] chrisalbon and accraze: (Dis)respected human, time to deploy Services – Graphoid / ORES (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T1800). Please do the needful. [18:20:34] marostegui: can you put my name in clinic duty? i dont have op right flags [18:21:41] I'm guessing Manuel might be gone for the day ;P [18:30:45] Reedy: thx! 
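[editor's note] The drop-in ordering discussed between 16:49 and 16:53 above can be illustrated with a minimal sketch. It assumes, as stated in the channel, that systemd reads the `*.conf` drop-ins of a unit in lexicographic order of file name and that a later assignment of a single-valued setting such as LimitNOFILE replaces an earlier one. The function name `effective_setting`, the value 500000, and the file name `envoy-limits.conf` are hypothetical and chosen only for illustration; `puppet-override.conf`, `traffic-limits.conf`, and 65536 are taken from the log. Python's `sorted` compares by code point, which matches byte-wise ordering only for ASCII file names like these.

```python
def effective_setting(dropins, key, base=None):
    """Return the value left in effect after applying drop-ins in name order.

    dropins maps a drop-in file name to the settings it assigns.
    base is the value from the main unit file, if any.
    """
    value = base
    for name in sorted(dropins):          # lexicographic order on file name
        if key in dropins[name]:
            value = dropins[name][key]    # a later file replaces an earlier value
    return value


dropins = {
    "puppet-override.conf": {"LimitNOFILE": "65536"},
    "traffic-limits.conf": {"LimitNOFILE": "500000"},  # hypothetical new limit
}

# "traffic-limits.conf" sorts after "puppet-override.conf", so its value wins.
print(effective_setting(dropins, "LimitNOFILE"))
```

Under the same assumptions, a file named `envoy-limits.conf` would sort before `puppet-override.conf` and be overridden by it, while renaming the Puppet-managed file to `00-puppet-override.conf` (the idea floated at 16:53) would let any other drop-in take precedence.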
[18:38:22] (03PS1) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [18:39:01] (03CR) 10jerkins-bot: [V: 04-1] Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [18:40:52] (03PS2) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:00:05] Deploy window Pre MediaWiki train break (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T1900) [19:01:38] RECOVERY - Check systemd state on cumin1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [19:01:56] (03PS3) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:02:31] (03CR) 10jerkins-bot: [V: 04-1] Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [19:03:35] (03PS1) 10Ebernhardson: Move CirrusSearch more_like traffic to eqiad [mediawiki-config] - 10https://gerrit.wikimedia.org/r/751485 [19:04:10] (03PS4) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:11:03] (03PS5) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:14:58] (03PS6) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:18:00] (03PS7) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:18:51] 10Puppet, 10SRE, 10Infrastructure-Foundations: Ensure that there are no firewall rules in modules - https://phabricator.wikimedia.org/T114209 (10Majavah) [19:22:32] (03CR) 10Jdlrobson: Deploy sticky header to pilot wikis, launch A/B test. 
(031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: 10Clare Ming) [19:25:19] (03PS8) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:39:54] PROBLEM - Check systemd state on stat1006 is CRITICAL: CRITICAL - degraded: The following units failed: jupyter-iflorez-singleuser.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [19:44:18] (03PS9) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [19:46:26] RECOVERY - Check systemd state on stat1006 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [19:58:30] (03PS1) 10MSantos: tegola: place_label i18n fix [deployment-charts] - 10https://gerrit.wikimedia.org/r/751490 (https://phabricator.wikimedia.org/T288728) [20:00:04] twentyafterfour and hashar: #bothumor My software never has bugs. It just develops random features. Rise for MediaWiki train - Utc-7+Utc-0 Version. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220104T2000). 
[20:01:02] ^lol [20:01:41] (03PS10) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [20:02:18] (03CR) 10jerkins-bot: [V: 04-1] Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [20:03:04] (03PS11) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [20:04:59] (03CR) 10jerkins-bot: [V: 04-1] Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [20:07:35] (03PS12) 10Ahmon Dancy: Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 [20:08:45] (03PS8) 10Urbanecm: snapshot: Dump information about Growth mentorship [puppet] - 10https://gerrit.wikimedia.org/r/740371 (https://phabricator.wikimedia.org/T291966) [20:09:42] (03CR) 10jerkins-bot: [V: 04-1] Beginnings of git::daemon class [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [20:17:44] (03PS2) 10Dzahn: planet: update Sumana's RSS feed [puppet] - 10https://gerrit.wikimedia.org/r/750795 (owner: 10Amire80) [20:18:14] (03PS3) 10Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) [20:19:50] (03CR) 10Dzahn: [C: 03+2] "thank you for monitoring the wiki requests page!" [puppet] - 10https://gerrit.wikimedia.org/r/750795 (owner: 10Amire80) [20:19:54] (03CR) 10Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. 
(031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: 10Clare Ming) [20:20:19] (03PS9) 10Urbanecm: snapshot: Dump information about Growth mentorship [puppet] - 10https://gerrit.wikimedia.org/r/740371 (https://phabricator.wikimedia.org/T291966) [20:21:17] (03CR) 10Urbanecm: snapshot: Dump information about Growth mentorship (032 comments) [puppet] - 10https://gerrit.wikimedia.org/r/740371 (https://phabricator.wikimedia.org/T291966) (owner: 10Urbanecm) [20:22:26] (03PS3) 10Clare Ming: Fix wordmark svgs for strategywiki, viwikibooks. [mediawiki-config] - 10https://gerrit.wikimedia.org/r/748214 (https://phabricator.wikimedia.org/T290091) [20:23:40] (03CR) 10Dzahn: "please ask Subbu if this will not be needed again" [puppet] - 10https://gerrit.wikimedia.org/r/751446 (https://phabricator.wikimedia.org/T272559) (owner: 10David Caro) [20:26:55] (03PS13) 10Ahmon Dancy: Define git::daemon class and use it in class profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 [20:28:52] (03CR) 10jerkins-bot: [V: 04-1] Define git::daemon class and use it in class profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 (owner: 10Ahmon Dancy) [20:32:05] Running `scap prep` for wmf.16 - T293957 [20:32:05] T293957: 1.38.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T293957 [20:33:02] (03PS14) 10Ahmon Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) [20:33:47] (03Abandoned) 10Ahmon Dancy: rsync::quickdatacopy: Allow dest_path to be supplied [puppet] - 10https://gerrit.wikimedia.org/r/749563 (owner: 10Ahmon Dancy) [20:33:54] (03Abandoned) 10Ahmon Dancy: profile::releases::mediawiki::private: Enable timer and alter target directory [puppet] - 10https://gerrit.wikimedia.org/r/749566 
(https://phabricator.wikimedia.org/T298165) (owner: 10Ahmon Dancy) [20:33:57] !log MediaWiki train for 1.38.0-wmf.16 - ran `scap prep` T293957 [20:33:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:34:57] (03CR) 10jerkins-bot: [V: 04-1] Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: 10Ahmon Dancy) [20:35:47] twentyafterfour_: good thing you thought about the train cause I have completely forgot about it ;D [20:37:17] I have subscribed to the task so I can be aware of what is going on [20:37:20] but for now, bed time! [20:37:39] hashar: good night :) [20:42:33] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [20:42:34] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:43:46] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [20:43:46] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:43:47] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [20:43:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:44:58] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [20:44:59] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:45:35] (03PS15) 10Ahmon Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) [20:46:12] (03CR) 10jerkins-bot: [V: 04-1] Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - 10https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: 10Ahmon Dancy) [20:47:31] (03PS16) 10Ahmon 
Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) [20:48:06] (CR) jerkins-bot: [V: -1] Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: Ahmon Dancy) [20:48:57] (PS17) Ahmon Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) [20:50:58] (CR) Subramanya Sastry: "Well, this code will be quite useful for https://phabricator.wikimedia.org/T295907 ... can someone tell me if cloud VM instances are puppe" [puppet] - https://gerrit.wikimedia.org/r/751446 (https://phabricator.wikimedia.org/T272559) (owner: David Caro) [20:53:01] (CR) Ahmon Dancy: "PCC results: https://puppet-compiler.wmflabs.org/pcc-worker1002/33125/deploy1002.eqiad.wmnet/fulldiff.html" [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: Ahmon Dancy) [20:54:49] (CR) Ahmon Dancy: "Notes for dzahn: This commit exposes some deploy server directories via git-daemon that were already exposed via rsync.
Exposing via git " [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: Ahmon Dancy) [20:57:55] SRE, DC-Ops: Confirm support of PERC 750 raid controller - https://phabricator.wikimedia.org/T297913 (RobH) a:MoritzMuehlenhoff [20:59:39] (CR) Majavah: Define git::daemon class and use it in profile::mediawiki::deployment::server (5 comments) [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: Ahmon Dancy) [21:05:55] (PS1) Urbanecm: Add Zscaler to list of trusted hosts for XFF [extensions/TrustedXFF] (wmf/1.38.0-wmf.16) - https://gerrit.wikimedia.org/r/751195 (https://phabricator.wikimedia.org/T298241) [21:06:08] (PS1) Urbanecm: Add Zscaler to list of trusted hosts for XFF [extensions/TrustedXFF] (wmf/1.38.0-wmf.13) - https://gerrit.wikimedia.org/r/751196 (https://phabricator.wikimedia.org/T298241) [21:07:59] (PS18) Ahmon Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) [21:08:03] (CR) Ahmon Dancy: Define git::daemon class and use it in profile::mediawiki::deployment::server (5 comments) [puppet] - https://gerrit.wikimedia.org/r/751481 (https://phabricator.wikimedia.org/T298165) (owner: Ahmon Dancy) [21:15:19] (PS5) JHathaway: exim: add the ability to silently drop senders [puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) [21:18:14] (CR) JHathaway: "@Herron, thanks for reviewing!"
[puppet] - https://gerrit.wikimedia.org/r/748884 (https://phabricator.wikimedia.org/T298038) (owner: JHathaway) [21:35:27] (PS1) Andrew Bogott: apt sources.list templates: add some comments [puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) [21:43:00] (CR) Dzahn: "If "profile::apt::purge_sources" is true in Hiera then files in sources.list.d would also get purged, but this seems to only be abled for " [puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) (owner: Andrew Bogott) [21:43:44] (PS2) Andrew Bogott: apt sources.list templates: add some comments [puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) [21:43:46] (PS1) Andrew Bogott: cloud-vps: puppetize /etc/apt/sources.list [puppet] - https://gerrit.wikimedia.org/r/751498 (https://phabricator.wikimedia.org/T264311) [21:44:53] (CR) Subramanya Sastry: "I've scheduled this for this evening's swat window." [mediawiki-config] - https://gerrit.wikimedia.org/r/749302 (owner: Subramanya Sastry) [21:45:30] (PS3) Andrew Bogott: apt sources.list templates: add some comments [puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) [21:45:32] (PS2) Andrew Bogott: cloud-vps: puppetize /etc/apt/sources.list [puppet] - https://gerrit.wikimedia.org/r/751498 (https://phabricator.wikimedia.org/T264311) [21:49:20] (CR) Dzahn: [C: +1] apt sources.list templates: add some comments [puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) (owner: Andrew Bogott) [21:55:25] (CR) JHathaway: [V: +1] "looks good!"
[puppet] - https://gerrit.wikimedia.org/r/751497 (https://phabricator.wikimedia.org/T264311) (owner: Andrew Bogott) [21:56:01] Puppet, SRE, Infrastructure-Foundations: Ensure that there are no firewall rules in modules - https://phabricator.wikimedia.org/T114209 (Dzahn) I would feel responsible for the "phabricator" one here but I am not doing that because we have T296022 anyways. And if that happens we will remove the entir... [22:16:22] (PS1) 20after4: testwikis wikis to 1.38.0-wmf.16 refs T293957 [mediawiki-config] - https://gerrit.wikimedia.org/r/751509 [22:16:24] (CR) 20after4: [C: +2] testwikis wikis to 1.38.0-wmf.16 refs T293957 [mediawiki-config] - https://gerrit.wikimedia.org/r/751509 (owner: 20after4) [22:17:20] RECOVERY - snapshot of s1 in eqiad on alert1001 is OK: Last snapshot for s1 at eqiad (db1139.eqiad.wmnet:3311) taken on 2022-01-04 20:50:14 (1048 GB) https://wikitech.wikimedia.org/wiki/MariaDB/Backups%23Alerting [22:17:39] (Merged) jenkins-bot: testwikis wikis to 1.38.0-wmf.16 refs T293957 [mediawiki-config] - https://gerrit.wikimedia.org/r/751509 (owner: 20after4) [22:17:41] !log twentyafterfour@deploy1002 Started scap: testwikis wikis to 1.38.0-wmf.16 refs T293957 [22:17:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:17:44] T293957: 1.38.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T293957 [22:20:30] (PS1) Dzahn: phabricator: move vcs firewall rules to profile [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) [22:21:08] !log mwdebug-deploy@deploy1002 helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [22:21:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:21:26] PROBLEM - very high load average likely xfs on ms-be2065 is CRITICAL: CRITICAL - load average: 105.14, 96.73, 100.20 https://wikitech.wikimedia.org/wiki/Swift [22:21:28] (CR) jerkins-bot:
[V: -1] phabricator: move vcs firewall rules to profile [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) (owner: Dzahn) [22:22:18] !log mwdebug-deploy@deploy1002 helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [22:22:19] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:22:19] !log mwdebug-deploy@deploy1002 helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [22:22:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:23:31] !log mwdebug-deploy@deploy1002 helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [22:23:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:24:15] (CR) Jdlrobson: [C: +1] Fix wordmark svgs for strategywiki, viwikibooks. (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/748214 (https://phabricator.wikimedia.org/T290091) (owner: Clare Ming) [22:27:04] (PS2) BryanDavis: toolhub: Bump container version to 2021-12-23-121200-production [deployment-charts] - https://gerrit.wikimedia.org/r/749220 (https://phabricator.wikimedia.org/T271490) [22:27:40] (CR) BryanDavis: toolhub: Bump container version to 2021-12-23-121200-production [deployment-charts] - https://gerrit.wikimedia.org/r/749220 (https://phabricator.wikimedia.org/T271490) (owner: BryanDavis) [22:32:40] (CR) Jdlrobson: [C: -1] Deploy sticky header to pilot wikis, launch A/B test. (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: Clare Ming) [22:32:56] (CR) Jdlrobson: [C: -1] Deploy sticky header to pilot wikis, launch A/B test.
(1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: Clare Ming) [22:34:56] PROBLEM - very high load average likely xfs on ms-be2065 is CRITICAL: CRITICAL - load average: 112.10, 104.08, 100.87 https://wikitech.wikimedia.org/wiki/Swift [22:39:32] (PS4) Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) [22:45:04] (CR) Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: Clare Ming) [22:45:55] (PS1) Bking: Bug: T298525 [alerts] - https://gerrit.wikimedia.org/r/751513 (https://phabricator.wikimedia.org/T298525) [22:46:58] PROBLEM - SSH on mw2258.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [22:48:56] (CR) Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: Clare Ming) [22:51:36] (PS2) Dzahn: phabricator: move vcs firewall rules to profile [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) [22:54:25] (PS2) Bking: Blazegraph: further relax free allocators check [alerts] - https://gerrit.wikimedia.org/r/751513 (https://phabricator.wikimedia.org/T298525) [22:54:47] wow scap sync seems extra slow today.
[22:55:37] !log twentyafterfour@deploy1002 Finished scap: testwikis wikis to 1.38.0-wmf.16 refs T293957 (duration: 37m 56s) [22:55:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:55:40] T293957: 1.38.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T293957 [22:56:06] hmm I guess that's not tooo terrible ... 38 minutes isn't great though. [23:00:32] (PS3) Dzahn: phabricator: move vcs firewall rules to profile [puppet] - https://gerrit.wikimedia.org/r/751510 (https://phabricator.wikimedia.org/T114209) [23:00:48] (PS3) Bking: Blazegraph: further relax free allocators check [alerts] - https://gerrit.wikimedia.org/r/751513 (https://phabricator.wikimedia.org/T298525) [23:02:18] (CR) Ebernhardson: [C: +1] Blazegraph: further relax free allocators check [alerts] - https://gerrit.wikimedia.org/r/751513 (https://phabricator.wikimedia.org/T298525) (owner: Bking) [23:09:45] (CR) Arlolra: [C: +1] Enable slow-parsoid logs (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/749302 (owner: Subramanya Sastry) [23:10:16] (CR) Clare Ming: Deploy sticky header to pilot wikis, launch A/B test. (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/747981 (https://phabricator.wikimedia.org/T295976) (owner: Clare Ming) [23:15:57] (CR) Razzi: [C: +2] wmcs: Change maintain-views to prepare for schema change [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup) [23:16:40] (CR) Razzi: [C: +2] "This looks good to me.
I'm writing up my plan to deploy this in the task https://phabricator.wikimedia.org/T298505 and will await confirma" [puppet] - https://gerrit.wikimedia.org/r/743948 (https://phabricator.wikimedia.org/T297094) (owner: Ladsgroup) [23:34:37] SRE, serviceops, Wikimedia-production-error: wtp* hosts: Out of memory (allocated 39845888) (tried to allocate 131072 bytes) in OutputHandler.php - https://phabricator.wikimedia.org/T297517 (tstarling) I filed T298573 for the kernel tuning issue. [23:48:04] RECOVERY - SSH on mw2258.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [23:54:20] (PS2) Subramanya Sastry: Enable slow-parsoid logs [mediawiki-config] - https://gerrit.wikimedia.org/r/749302 [23:54:40] (CR) Subramanya Sastry: "PS2: rebased" [mediawiki-config] - https://gerrit.wikimedia.org/r/749302 (owner: Subramanya Sastry) [23:58:15] twentyafterfour: Do you have a transcript? I'm curious about what phase took the most time. [23:59:46] dancy: https://phabricator.wikimedia.org/P18393 [23:59:59] most of the time was sync apaches