[03:32:15] FIRING: [2x] SystemdUnitFailed: wmf_auto_restart_prometheus-mysqld-exporter@s3.service on db2239:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:04:45] I am going to switchover m3-master (phabricator)
[08:01:16] Amir1: You can go ahead with db1193 (old s8 master switched past week) - it is pooled though
[08:06:18] I am going to switch es5 codfw master
[09:25:25] I am going to reboot and upgrade the backup source for all the mX sections
[09:57:14] thanks!
[10:15:11] Emperor: the clean up script is done with 06 - 0f (inclusive). Wanna vacuum them?
[10:18:08] Good question :)
[12:38:04] I am going to reboot and upgrade the backup source for all the mX sections -> doing the same again but for eqiad
[13:27:04] I am going to enable slow query log on a host in x1, to investigate a possible full table scan
[13:30:57] done, and disabled, ticket incoming
[14:06:45] Amir1: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db1179&var-port=9104&from=now-1h&to=now&viewPanel=3
[14:07:24] weeeeeee
[14:09:08] marostegui: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=db1179&var-datasource=thanos&var-cluster=mysql&from=now-1h&to=now&viewPanel=3 and https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=db1179&var-datasource=thanos&var-cluster=mysql&from=now-1h&to=now&viewPanel=25
[14:09:25] Excellent, I will fix codfw now
[15:12:14] Progress! https://phabricator.wikimedia.org/T383053#10432813
[15:39:51] marostegui: did you see btw I got you and Amir1 a little Christmas present? https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1105792
[15:40:53] cdanis: I wasn't aware of this!
[15:41:17] Mate' and I did a little hackathon ourselves :D
[15:41:28] <3
[15:41:32] I hope to have it enabled on a test wiki this week
[15:41:40] group0 or something
[15:41:52] I was just going to ask what was the plan for deploying
[15:41:53] Nice