[00:00:19] <icinga-wm>	 RECOVERY - Check systemd state on maps2009 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:07:23] <icinga-wm>	 PROBLEM - Check systemd state on maps2009 is CRITICAL: CRITICAL - degraded: The following units failed: planet_sync_tile_generation-gis.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:13:17] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P33418 and previous config saved to /var/cache/conftool/dbconfig/20220828-001317-ladsgroup.json
[00:19:53] <icinga-wm>	 RECOVERY - SSH on wtp1044.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[00:28:23] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P33419 and previous config saved to /var/cache/conftool/dbconfig/20220828-002823-ladsgroup.json
[00:43:13] <icinga-wm>	 RECOVERY - Check systemd state on logstash2026 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:43:30] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33420 and previous config saved to /var/cache/conftool/dbconfig/20220828-004329-ladsgroup.json
[00:43:34] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
[00:43:48] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
[00:43:49] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
[00:44:05] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
[00:44:11] <icinga-wm>	 RECOVERY - Check systemd state on logstash1026 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[00:44:11] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33421 and previous config saved to /var/cache/conftool/dbconfig/20220828-004410-ladsgroup.json
[00:50:16] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33422 and previous config saved to /var/cache/conftool/dbconfig/20220828-005015-ladsgroup.json
[01:05:22] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33423 and previous config saved to /var/cache/conftool/dbconfig/20220828-010522-ladsgroup.json
[01:20:29] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33424 and previous config saved to /var/cache/conftool/dbconfig/20220828-012028-ladsgroup.json
[01:23:11] <icinga-wm>	 RECOVERY - SSH on mw1327.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[01:35:35] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33425 and previous config saved to /var/cache/conftool/dbconfig/20220828-013534-ladsgroup.json
[01:35:39] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
[01:35:53] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
[01:35:59] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33426 and previous config saved to /var/cache/conftool/dbconfig/20220828-013558-ladsgroup.json
[01:36:45] <jinxer-wm>	 (JobUnavailable) firing: (3) Reduced availability for job redis_gitlab in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[01:41:02] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33427 and previous config saved to /var/cache/conftool/dbconfig/20220828-014101-ladsgroup.json
[01:41:45] <jinxer-wm>	 (JobUnavailable) firing: (8) Reduced availability for job nginx in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[01:46:45] <jinxer-wm>	 (JobUnavailable) firing: (10) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[01:51:45] <jinxer-wm>	 (JobUnavailable) firing: (10) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[01:56:08] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P33428 and previous config saved to /var/cache/conftool/dbconfig/20220828-015608-ladsgroup.json
[01:58:55] <icinga-wm>	 PROBLEM - SSH on wtp1040.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[02:06:45] <jinxer-wm>	 (JobUnavailable) firing: (8) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[02:11:15] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P33429 and previous config saved to /var/cache/conftool/dbconfig/20220828-021114-ladsgroup.json
[02:11:45] <jinxer-wm>	 (JobUnavailable) resolved: (5) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
[02:26:21] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33430 and previous config saved to /var/cache/conftool/dbconfig/20220828-022620-ladsgroup.json
[02:26:25] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
[02:26:38] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
[02:30:52] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
[02:31:05] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
[02:31:12] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33431 and previous config saved to /var/cache/conftool/dbconfig/20220828-023111-ladsgroup.json
[02:36:19] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33432 and previous config saved to /var/cache/conftool/dbconfig/20220828-023618-ladsgroup.json
[02:51:25] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P33433 and previous config saved to /var/cache/conftool/dbconfig/20220828-025124-ladsgroup.json
[03:06:31] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P33434 and previous config saved to /var/cache/conftool/dbconfig/20220828-030631-ladsgroup.json
[03:09:35] <jinxer-wm>	 (FrontendUnavailable) firing: varnish-text has reduced HTTP availability #page - https://wikitech.wikimedia.org/wiki/Varnish#Diagnosing_Varnish_alerts - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=3 - https://alerts.wikimedia.org/?q=alertname%3DFrontendUnavailable
[03:09:35] <jinxer-wm>	 (FrontendUnavailable) firing: HAProxy (cache_text) has reduced HTTP availability #page - TODO - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DFrontendUnavailable
[03:10:22] <rzl>	 looking
[03:12:57] <icinga-wm>	 RECOVERY - SSH on mw1314.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[03:14:35] <jinxer-wm>	 (FrontendUnavailable) resolved: varnish-text has reduced HTTP availability #page - https://wikitech.wikimedia.org/wiki/Varnish#Diagnosing_Varnish_alerts - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=3 - https://alerts.wikimedia.org/?q=alertname%3DFrontendUnavailable
[03:14:35] <jinxer-wm>	 (FrontendUnavailable) resolved: HAProxy (cache_text) has reduced HTTP availability #page - TODO - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DFrontendUnavailable
[03:21:37] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33435 and previous config saved to /var/cache/conftool/dbconfig/20220828-032137-ladsgroup.json
[03:21:43] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
[03:21:56] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
[03:22:02] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33436 and previous config saved to /var/cache/conftool/dbconfig/20220828-032202-ladsgroup.json
[03:23:55] <icinga-wm>	 PROBLEM - SSH on wtp1044.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[03:27:13] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33437 and previous config saved to /var/cache/conftool/dbconfig/20220828-032713-ladsgroup.json
[03:42:19] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P33438 and previous config saved to /var/cache/conftool/dbconfig/20220828-034219-ladsgroup.json
[03:57:26] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P33439 and previous config saved to /var/cache/conftool/dbconfig/20220828-035725-ladsgroup.json
[04:01:41] <icinga-wm>	 RECOVERY - SSH on wtp1040.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[04:06:30] <wikibugs>	 (PS4) Gergő Tisza: Declare mediawiki.createaccount_blocked_user schema [mediawiki-config] - https://gerrit.wikimedia.org/r/822686 (https://phabricator.wikimedia.org/T306018) (owner: Sergio Gimeno)
[04:12:32] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33440 and previous config saved to /var/cache/conftool/dbconfig/20220828-041231-ladsgroup.json
[04:12:37] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
[04:12:51] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
[04:15:59] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
[04:16:13] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
[04:16:14] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
[04:16:17] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
[04:16:23] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33441 and previous config saved to /var/cache/conftool/dbconfig/20220828-041622-ladsgroup.json
[04:21:46] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33442 and previous config saved to /var/cache/conftool/dbconfig/20220828-042145-ladsgroup.json
[04:36:52] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P33443 and previous config saved to /var/cache/conftool/dbconfig/20220828-043651-ladsgroup.json
[04:51:58] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P33444 and previous config saved to /var/cache/conftool/dbconfig/20220828-045157-ladsgroup.json
[05:03:13] <wikibugs>	 SRE, Wikimedia-Mailing-lists: ESEAP mailing list set up - https://phabricator.wikimedia.org/T316454 (Legoktm) To clarify, by "Secondary list administrator's email address" we want each list to have a second list administrator, not a secondary email for the first administrator :)
[05:07:04] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33445 and previous config saved to /var/cache/conftool/dbconfig/20220828-050704-ladsgroup.json
[05:07:10] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
[05:07:23] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
[05:07:29] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33446 and previous config saved to /var/cache/conftool/dbconfig/20220828-050729-ladsgroup.json
[05:12:37] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33447 and previous config saved to /var/cache/conftool/dbconfig/20220828-051237-ladsgroup.json
[05:27:44] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P33448 and previous config saved to /var/cache/conftool/dbconfig/20220828-052743-ladsgroup.json
[05:42:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P33449 and previous config saved to /var/cache/conftool/dbconfig/20220828-054249-ladsgroup.json
[05:57:56] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33450 and previous config saved to /var/cache/conftool/dbconfig/20220828-055756-ladsgroup.json
[05:58:01] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
[05:58:15] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
[05:58:21] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33451 and previous config saved to /var/cache/conftool/dbconfig/20220828-055821-ladsgroup.json
[06:03:36] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33452 and previous config saved to /var/cache/conftool/dbconfig/20220828-060336-ladsgroup.json
[06:17:01] <icinga-wm>	 PROBLEM - SSH on mw1314.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[06:18:43] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P33453 and previous config saved to /var/cache/conftool/dbconfig/20220828-061842-ladsgroup.json
[06:33:49] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P33454 and previous config saved to /var/cache/conftool/dbconfig/20220828-063348-ladsgroup.json
[06:48:55] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33455 and previous config saved to /var/cache/conftool/dbconfig/20220828-064855-ladsgroup.json
[06:49:00] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
[06:49:14] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
[06:49:20] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33456 and previous config saved to /var/cache/conftool/dbconfig/20220828-064920-ladsgroup.json
[06:49:52] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33457 and previous config saved to /var/cache/conftool/dbconfig/20220828-064952-ladsgroup.json
[06:55:58] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33458 and previous config saved to /var/cache/conftool/dbconfig/20220828-065557-ladsgroup.json
[07:00:05] <jouncebot>	 Deploy window No deploys all day! See Deployments/Emergencies if things are broken. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20220828T0700)
[07:00:14] <wikibugs>	 (CR) Physikerwelt: "@Αλέξανδρος Κοσιάρης can you deploy this?" [deployment-charts] - https://gerrit.wikimedia.org/r/826957 (owner: PipelineBot)
[07:11:04] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P33459 and previous config saved to /var/cache/conftool/dbconfig/20220828-071103-ladsgroup.json
[07:26:10] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P33460 and previous config saved to /var/cache/conftool/dbconfig/20220828-072610-ladsgroup.json
[07:41:16] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33461 and previous config saved to /var/cache/conftool/dbconfig/20220828-074116-ladsgroup.json
[07:43:32] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33462 and previous config saved to /var/cache/conftool/dbconfig/20220828-074332-ladsgroup.json
[07:58:39] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P33463 and previous config saved to /var/cache/conftool/dbconfig/20220828-075838-ladsgroup.json
[08:13:45] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P33464 and previous config saved to /var/cache/conftool/dbconfig/20220828-081344-ladsgroup.json
[08:28:51] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33465 and previous config saved to /var/cache/conftool/dbconfig/20220828-082851-ladsgroup.json
[08:42:49] <wikibugs>	 SRE, Wikimedia-Etherpad: Upgrade etherpad.wikimedia.org to (more) recent Etherpad version with more rich end-user features - https://phabricator.wikimedia.org/T316421 (JeanFred) That would be 1.8.16 → 1.8.18 then? There’s not much in the change log: https://github.com/ether/etherpad-lite/blob/develop/CHA...
[09:05:03] <wikibugs>	 (PS1) Majavah: dynamicproxy: improve /zones API [puppet] - https://gerrit.wikimedia.org/r/826986 (https://phabricator.wikimedia.org/T316463)
[09:20:55] <icinga-wm>	 RECOVERY - SSH on mw1314.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[09:28:32] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
[09:28:45] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
[09:31:49] <icinga-wm>	 RECOVERY - SSH on wtp1044.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[09:33:27] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
[09:33:40] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
[09:33:46] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33466 and previous config saved to /var/cache/conftool/dbconfig/20220828-093346-ladsgroup.json
[09:39:05] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33467 and previous config saved to /var/cache/conftool/dbconfig/20220828-093904-ladsgroup.json
[09:54:11] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33468 and previous config saved to /var/cache/conftool/dbconfig/20220828-095411-ladsgroup.json
[09:54:54] <wikibugs>	 (PS1) Majavah: dynamicproxy: api: do not return 'no such project' errors [puppet] - https://gerrit.wikimedia.org/r/826990
[10:09:17] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33469 and previous config saved to /var/cache/conftool/dbconfig/20220828-100917-ladsgroup.json
[10:12:53] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[10:19:59] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[10:24:24] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33470 and previous config saved to /var/cache/conftool/dbconfig/20220828-102423-ladsgroup.json
[10:24:29] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
[10:24:42] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
[10:27:40] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
[10:27:54] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
[10:28:00] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33471 and previous config saved to /var/cache/conftool/dbconfig/20220828-102800-ladsgroup.json
[10:33:14] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33472 and previous config saved to /var/cache/conftool/dbconfig/20220828-103314-ladsgroup.json
[10:47:49] <icinga-wm>	 PROBLEM - Check systemd state on logstash1026 is CRITICAL: CRITICAL - degraded: The following units failed: curator_actions_cluster_wide.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:48:21] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33473 and previous config saved to /var/cache/conftool/dbconfig/20220828-104820-ladsgroup.json
[11:03:27] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33474 and previous config saved to /var/cache/conftool/dbconfig/20220828-110326-ladsgroup.json
[11:18:33] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33475 and previous config saved to /var/cache/conftool/dbconfig/20220828-111832-ladsgroup.json
[11:18:38] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
[11:18:52] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
[11:18:58] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33476 and previous config saved to /var/cache/conftool/dbconfig/20220828-111857-ladsgroup.json
[11:24:12] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33477 and previous config saved to /var/cache/conftool/dbconfig/20220828-112412-ladsgroup.json
[11:39:19] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33478 and previous config saved to /var/cache/conftool/dbconfig/20220828-113918-ladsgroup.json
[11:54:25] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33479 and previous config saved to /var/cache/conftool/dbconfig/20220828-115424-ladsgroup.json
[11:59:41] <icinga-wm>	 PROBLEM - PHD should be supervising processes on phab1001 is CRITICAL: PROCS CRITICAL: 2 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
[12:01:51] <icinga-wm>	 RECOVERY - PHD should be supervising processes on phab1001 is OK: PROCS OK: 16 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
[12:09:31] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33480 and previous config saved to /var/cache/conftool/dbconfig/20220828-120931-ladsgroup.json
[12:09:37] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
[12:09:51] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
[12:09:52] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
[12:09:54] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
[12:10:01] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33481 and previous config saved to /var/cache/conftool/dbconfig/20220828-121000-ladsgroup.json
[12:15:15] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33482 and previous config saved to /var/cache/conftool/dbconfig/20220828-121515-ladsgroup.json
[12:30:22] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33483 and previous config saved to /var/cache/conftool/dbconfig/20220828-123021-ladsgroup.json
[12:45:28] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33484 and previous config saved to /var/cache/conftool/dbconfig/20220828-124527-ladsgroup.json
[12:50:34] <wikibugs>	 SRE, Performance-Team, Traffic: Enable HTTP compression for arclamp trace logs - https://phabricator.wikimedia.org/T305783 (Krinkle) Open→Declined
[13:00:34] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33485 and previous config saved to /var/cache/conftool/dbconfig/20220828-130033-ladsgroup.json
[13:00:39] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
[13:00:53] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
[13:00:59] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33486 and previous config saved to /var/cache/conftool/dbconfig/20220828-130059-ladsgroup.json
[13:06:15] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33487 and previous config saved to /var/cache/conftool/dbconfig/20220828-130614-ladsgroup.json
[13:21:21] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33488 and previous config saved to /var/cache/conftool/dbconfig/20220828-132120-ladsgroup.json
[13:24:44] <wikibugs>	 SRE, Wikimedia-Mailing-lists: ESEAP mailing list set up - https://phabricator.wikimedia.org/T316454 (Aklapper) What is "ESEAP"?
[13:25:57] <wikibugs>	 SRE, Wikimedia-Mailing-lists: ESEAP mailing list set up - https://phabricator.wikimedia.org/T316454 (Aklapper) Ah, found something. Is this somehow related to https://meta.wikimedia.org/wiki/ESEAP_Hub ? Who to use that list?
[13:28:45] <icinga-wm>	 PROBLEM - SSH on restbase2012.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[13:36:27] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33489 and previous config saved to /var/cache/conftool/dbconfig/20220828-133627-ladsgroup.json
[13:40:53] <wikibugs>	 SRE, Wikimedia-Mailing-lists: ESEAP mailing list set up - https://phabricator.wikimedia.org/T316454 (Ladsgroup) >>! In T316454#8191716, @Legoktm wrote: > To clarify, by "Secondary list administrator's email address" we want each list to have a second list administrator, not a secondary email for the firs...
[13:51:33] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33490 and previous config saved to /var/cache/conftool/dbconfig/20220828-135133-ladsgroup.json
[13:51:39] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
[13:51:52] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
[13:51:58] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33491 and previous config saved to /var/cache/conftool/dbconfig/20220828-135158-ladsgroup.json
[13:57:14] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33492 and previous config saved to /var/cache/conftool/dbconfig/20220828-135713-ladsgroup.json
[14:12:20] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33493 and previous config saved to /var/cache/conftool/dbconfig/20220828-141220-ladsgroup.json
[14:27:26] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33494 and previous config saved to /var/cache/conftool/dbconfig/20220828-142726-ladsgroup.json
[14:29:59] <icinga-wm>	 RECOVERY - SSH on restbase2012.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[14:42:33] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33495 and previous config saved to /var/cache/conftool/dbconfig/20220828-144232-ladsgroup.json
[14:42:38] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
[14:42:51] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
[14:42:57] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33496 and previous config saved to /var/cache/conftool/dbconfig/20220828-144257-ladsgroup.json
[14:43:20] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33497 and previous config saved to /var/cache/conftool/dbconfig/20220828-144319-ladsgroup.json
[14:48:31] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33498 and previous config saved to /var/cache/conftool/dbconfig/20220828-144830-ladsgroup.json
[14:56:25] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[14:58:49] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[15:03:37] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P33499 and previous config saved to /var/cache/conftool/dbconfig/20220828-150336-ladsgroup.json
[15:12:52] <wikibugs>	 (PS1) Ori: Increase roll-out of query-sorting to 50% [puppet] - https://gerrit.wikimedia.org/r/826994 (https://phabricator.wikimedia.org/T314868)
[15:14:30] <wikibugs>	 (CR) Ori: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/37010/console" [puppet] - https://gerrit.wikimedia.org/r/826994 (https://phabricator.wikimedia.org/T314868) (owner: Ori)
[15:18:43] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P33500 and previous config saved to /var/cache/conftool/dbconfig/20220828-151843-ladsgroup.json
[15:19:23] <wikibugs>	 (PS1) Ori: Increase roll-out of query-sorting to 75% [puppet] - https://gerrit.wikimedia.org/r/826996 (https://phabricator.wikimedia.org/T314868)
[15:20:19] <wikibugs>	 (CR) Ori: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/37011/console" [puppet] - https://gerrit.wikimedia.org/r/826996 (https://phabricator.wikimedia.org/T314868) (owner: Ori)
[15:30:41] <wikibugs>	 (PS1) Ori: Increase query-sorting to 100%, remove sampling code [puppet] - https://gerrit.wikimedia.org/r/826997 (https://phabricator.wikimedia.org/T314868)
[15:31:36] <wikibugs>	 (CR) Ori: [V: +1] "PCC SUCCESS (DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/37012/console" [puppet] - https://gerrit.wikimedia.org/r/826997 (https://phabricator.wikimedia.org/T314868) (owner: Ori)
[15:33:49] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33501 and previous config saved to /var/cache/conftool/dbconfig/20220828-153349-ladsgroup.json
[15:38:06] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33502 and previous config saved to /var/cache/conftool/dbconfig/20220828-153806-ladsgroup.json
[15:46:01] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[15:50:45] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[15:53:12] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P33503 and previous config saved to /var/cache/conftool/dbconfig/20220828-155312-ladsgroup.json
[16:08:19] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P33504 and previous config saved to /var/cache/conftool/dbconfig/20220828-160818-ladsgroup.json
[16:23:25] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33505 and previous config saved to /var/cache/conftool/dbconfig/20220828-162324-ladsgroup.json
[16:23:30] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
[16:23:44] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
[16:23:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33506 and previous config saved to /var/cache/conftool/dbconfig/20220828-162349-ladsgroup.json
[16:29:07] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33507 and previous config saved to /var/cache/conftool/dbconfig/20220828-162906-ladsgroup.json
[16:32:11] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33508 and previous config saved to /var/cache/conftool/dbconfig/20220828-163211-ladsgroup.json
[16:34:28] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
[16:34:41] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
[16:34:48] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2152 (T316186)', diff saved to https://phabricator.wikimedia.org/P33509 and previous config saved to /var/cache/conftool/dbconfig/20220828-163447-ladsgroup.json
[16:40:04] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2152 (T316186)', diff saved to https://phabricator.wikimedia.org/P33510 and previous config saved to /var/cache/conftool/dbconfig/20220828-164004-ladsgroup.json
[16:41:35] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
[16:41:48] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
[16:41:50] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[16:42:05] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
[16:42:11] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33511 and previous config saved to /var/cache/conftool/dbconfig/20220828-164211-ladsgroup.json
[16:47:22] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33512 and previous config saved to /var/cache/conftool/dbconfig/20220828-164722-ladsgroup.json
[17:02:28] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P33513 and previous config saved to /var/cache/conftool/dbconfig/20220828-170228-ladsgroup.json
[17:17:35] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P33514 and previous config saved to /var/cache/conftool/dbconfig/20220828-171734-ladsgroup.json
[17:18:23] <icinga-wm>	 PROBLEM - SSH on db1101.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[17:32:41] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33515 and previous config saved to /var/cache/conftool/dbconfig/20220828-173241-ladsgroup.json
[17:32:45] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
[17:32:57] <Amir1>	 hmm, db1101 seems to be fine from the mariadb point of view
[17:32:59] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
[17:33:05] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1141 (T316186)', diff saved to https://phabricator.wikimedia.org/P33516 and previous config saved to /var/cache/conftool/dbconfig/20220828-173304-ladsgroup.json
[17:39:47] <icinga-wm>	 PROBLEM - High average GET latency for mw requests on api_appserver in codfw on alert1001 is CRITICAL: cluster=api_appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[17:40:03] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling failed', diff saved to https://phabricator.wikimedia.org/P33517 and previous config saved to /var/cache/conftool/dbconfig/20220828-174002-ladsgroup.json
[17:40:40] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
[17:40:53] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
[17:40:59] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33518 and previous config saved to /var/cache/conftool/dbconfig/20220828-174059-ladsgroup.json
[17:46:30] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33519 and previous config saved to /var/cache/conftool/dbconfig/20220828-174630-ladsgroup.json
[17:46:36] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
[17:46:49] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
[17:46:55] <icinga-wm>	 RECOVERY - High average GET latency for mw requests on api_appserver in codfw on alert1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=codfw+prometheus/ops&var-cluster=api_appserver&var-method=GET
[17:46:56] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33520 and previous config saved to /var/cache/conftool/dbconfig/20220828-174655-ladsgroup.json
[17:52:47] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33521 and previous config saved to /var/cache/conftool/dbconfig/20220828-175246-ladsgroup.json
[17:52:52] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
[17:53:06] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
[17:53:12] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2140 (T316186)', diff saved to https://phabricator.wikimedia.org/P33522 and previous config saved to /var/cache/conftool/dbconfig/20220828-175311-ladsgroup.json
[18:00:43] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2140 (T316186)', diff saved to https://phabricator.wikimedia.org/P33523 and previous config saved to /var/cache/conftool/dbconfig/20220828-180042-ladsgroup.json
[18:00:48] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
[18:01:02] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
[18:01:08] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2172 (T316186)', diff saved to https://phabricator.wikimedia.org/P33524 and previous config saved to /var/cache/conftool/dbconfig/20220828-180108-ladsgroup.json
[18:07:26] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2172 (T316186)', diff saved to https://phabricator.wikimedia.org/P33525 and previous config saved to /var/cache/conftool/dbconfig/20220828-180725-ladsgroup.json
[18:07:31] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
[18:07:45] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
[18:07:51] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2119 (T316186)', diff saved to https://phabricator.wikimedia.org/P33526 and previous config saved to /var/cache/conftool/dbconfig/20220828-180751-ladsgroup.json
[18:14:21] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2119 (T316186)', diff saved to https://phabricator.wikimedia.org/P33527 and previous config saved to /var/cache/conftool/dbconfig/20220828-181421-ladsgroup.json
[18:14:26] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
[18:14:40] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
[18:17:46] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
[18:17:59] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
[18:18:05] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2138:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33528 and previous config saved to /var/cache/conftool/dbconfig/20220828-181805-ladsgroup.json
[18:18:31] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2138:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33529 and previous config saved to /var/cache/conftool/dbconfig/20220828-181830-ladsgroup.json
[18:19:35] <icinga-wm>	 RECOVERY - SSH on db1101.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[18:23:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33530 and previous config saved to /var/cache/conftool/dbconfig/20220828-182350-ladsgroup.json
[18:26:06] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33531 and previous config saved to /var/cache/conftool/dbconfig/20220828-182605-ladsgroup.json
[18:26:11] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
[18:26:25] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
[18:26:31] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2136 (T316186)', diff saved to https://phabricator.wikimedia.org/P33532 and previous config saved to /var/cache/conftool/dbconfig/20220828-182630-ladsgroup.json
[18:31:57] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2136 (T316186)', diff saved to https://phabricator.wikimedia.org/P33533 and previous config saved to /var/cache/conftool/dbconfig/20220828-183156-ladsgroup.json
[18:32:02] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
[18:32:16] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
[18:32:17] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
[18:32:20] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
[18:32:26] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2155 (T316186)', diff saved to https://phabricator.wikimedia.org/P33534 and previous config saved to /var/cache/conftool/dbconfig/20220828-183226-ladsgroup.json
[18:38:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2155 (T316186)', diff saved to https://phabricator.wikimedia.org/P33535 and previous config saved to /var/cache/conftool/dbconfig/20220828-183850-ladsgroup.json
[18:38:55] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
[18:39:09] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
[18:39:15] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33536 and previous config saved to /var/cache/conftool/dbconfig/20220828-183915-ladsgroup.json
[18:45:43] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33537 and previous config saved to /var/cache/conftool/dbconfig/20220828-184542-ladsgroup.json
[18:50:03] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
[18:50:16] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
[18:50:23] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33538 and previous config saved to /var/cache/conftool/dbconfig/20220828-185022-ladsgroup.json
[18:55:36] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33539 and previous config saved to /var/cache/conftool/dbconfig/20220828-185536-ladsgroup.json
[18:55:42] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
[18:55:55] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
[18:55:57] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
[18:55:59] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
[18:56:07] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2126 (T316186)', diff saved to https://phabricator.wikimedia.org/P33540 and previous config saved to /var/cache/conftool/dbconfig/20220828-185606-ladsgroup.json
[19:02:38] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2126 (T316186)', diff saved to https://phabricator.wikimedia.org/P33541 and previous config saved to /var/cache/conftool/dbconfig/20220828-190238-ladsgroup.json
[19:02:43] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
[19:02:57] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
[19:03:03] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2125 (T316186)', diff saved to https://phabricator.wikimedia.org/P33542 and previous config saved to /var/cache/conftool/dbconfig/20220828-190303-ladsgroup.json
[19:08:25] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2125 (T316186)', diff saved to https://phabricator.wikimedia.org/P33543 and previous config saved to /var/cache/conftool/dbconfig/20220828-190824-ladsgroup.json
[19:08:30] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
[19:08:43] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
[19:08:50] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2107 (T316186)', diff saved to https://phabricator.wikimedia.org/P33544 and previous config saved to /var/cache/conftool/dbconfig/20220828-190849-ladsgroup.json
[19:14:15] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2107 (T316186)', diff saved to https://phabricator.wikimedia.org/P33545 and previous config saved to /var/cache/conftool/dbconfig/20220828-191414-ladsgroup.json
[19:14:20] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
[19:14:34] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
[19:14:42] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33546 and previous config saved to /var/cache/conftool/dbconfig/20220828-191440-ladsgroup.json
[19:19:51] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33547 and previous config saved to /var/cache/conftool/dbconfig/20220828-191951-ladsgroup.json
[19:19:57] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
[19:20:10] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
[19:20:16] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2170:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33548 and previous config saved to /var/cache/conftool/dbconfig/20220828-192016-ladsgroup.json
[19:20:43] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db2170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33549 and previous config saved to /var/cache/conftool/dbconfig/20220828-192042-ladsgroup.json
[19:21:57] <icinga-wm>	 PROBLEM - SSH on wtp1040.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:25:51] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33550 and previous config saved to /var/cache/conftool/dbconfig/20220828-192550-ladsgroup.json
[19:27:06] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33551 and previous config saved to /var/cache/conftool/dbconfig/20220828-192705-ladsgroup.json
[19:27:12] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
[19:27:26] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
[19:34:41] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
[19:34:54] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
[19:35:00] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33552 and previous config saved to /var/cache/conftool/dbconfig/20220828-193500-ladsgroup.json
[19:41:19] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33553 and previous config saved to /var/cache/conftool/dbconfig/20220828-194119-ladsgroup.json
[19:56:26] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P33554 and previous config saved to /var/cache/conftool/dbconfig/20220828-195625-ladsgroup.json
[20:11:32] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P33555 and previous config saved to /var/cache/conftool/dbconfig/20220828-201131-ladsgroup.json
[20:18:39] <icinga-wm>	 PROBLEM - SSH on mw1313.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[20:18:57] <ori>	 !log mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set energy-performance preference to 0 via 'x86_energy_perf_policy --hwp-epp 0' T315398
[20:19:00] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:19:02] <stashbot>	 T315398: Set MW appserver scaling_governor to performance - https://phabricator.wikimedia.org/T315398
[20:26:38] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33556 and previous config saved to /var/cache/conftool/dbconfig/20220828-202638-ladsgroup.json
[20:26:42] <logmsgbot>	 !log ladsgroup@cumin1001 START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
[20:26:55] <logmsgbot>	 !log ladsgroup@cumin1001 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
[20:27:02] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Depooling db1129 (T316186)', diff saved to https://phabricator.wikimedia.org/P33557 and previous config saved to /var/cache/conftool/dbconfig/20220828-202701-ladsgroup.json
[20:32:24] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1129 (T316186)', diff saved to https://phabricator.wikimedia.org/P33558 and previous config saved to /var/cache/conftool/dbconfig/20220828-203223-ladsgroup.json
[20:47:30] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33559 and previous config saved to /var/cache/conftool/dbconfig/20220828-204729-ladsgroup.json
[21:02:36] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33560 and previous config saved to /var/cache/conftool/dbconfig/20220828-210235-ladsgroup.json
[21:03:37] <logmsgbot>	 !log ladsgroup@cumin1001 dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33561 and previous config saved to /var/cache/conftool/dbconfig/20220828-210336-ladsgroup.json
[21:14:11] <wikibugs>	 SRE, Patch-For-Review, Performance-Team (Radar): Set MW appserver scaling_governor to performance - https://phabricator.wikimedia.org/T315398 (ori) I tried setting EPP to 0 using `x86_energy_perf_policy`, thinking that bypassing the sysfs interface and writing directly to the MSR would make the setti...
[21:48:17] <icinga-wm>	 PROBLEM - SSH on wtp1044.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[21:52:51] <wikibugs>	 (PS1) Legoktm: Use shell webservice-runner for python35/python37 images [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/827007 (https://phabricator.wikimedia.org/T293552)
[22:06:32] <wikibugs>	 (PS1) Legoktm: Use shell webservice-runner for node16 image [docker-images/toollabs-images] - https://gerrit.wikimedia.org/r/827009 (https://phabricator.wikimedia.org/T293552)
[22:21:17] <icinga-wm>	 RECOVERY - SSH on mw1313.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[22:25:55] <icinga-wm>	 RECOVERY - SSH on wtp1040.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:50:53] <icinga-wm>	 RECOVERY - SSH on wtp1044.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[23:52:46] <wikibugs>	 SRE, Wikimedia-Mailing-lists: ESEAP mailing list set up - https://phabricator.wikimedia.org/T316454 (Samwilson) a:Samwilson→None Other #WMAU lists have `accounts@wikimedia.org.au` as a secondary admin (Mailman username `wmau`). Would that be suitable, or is it best to have a single person nominat...