[08:18:40] does anyone know if planned work at T405942 was completed? Should I restart backups?
[08:18:41] T405942: eqiad row C/D Data Persistence host migrations - https://phabricator.wikimedia.org/T405942
[08:20:08] jynus: I think it was, I was checking the spreadsheet and most of the hosts that were scheduled yesterday are marked as what I understand to be done
[08:20:34] I will double-check it there
[08:21:01] I think they did one that wasn't, but that's ok
[08:22:10] I would just have liked on-ticket confirmation, to avoid accidents
[08:22:24] I understand that if there was confusion, it didn't happen
[08:22:35] jynus: yeah, I agree
[08:22:52] but that's precisely when I would prefer clarity: because of the confusion, a small summary message
[08:23:21] I will comment that I will start backups now on eqiad
[13:11:08] Emperor: the clean-up script is finished in eqiad, wanna do the ms-fe reboots? Let me know once you're done (I need to restart the codfw ones too)
[13:32:37] Amir1: ack, thanks.
[13:45:25] FIRING: [2x] SystemdUnitFailed: swift_dispersion_stats.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:54:14] ^-- consequence of reboot, fixed now.
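[editor's note: the SystemdUnitFailed alert above fires when a unit such as swift_dispersion_stats.service is left in the failed state after a reboot. As a minimal illustrative sketch (not the actual alerting pipeline), failed units can be extracted from `systemctl --failed --plain --no-legend` output like this; the sample text below is invented for the example, not captured from ms-fe1009:]

```python
def failed_units(systemctl_output: str) -> list[str]:
    """Parse `systemctl --failed --plain --no-legend` output.

    Each line has the shape: UNIT LOAD ACTIVE SUB DESCRIPTION
    We only need the first field, the unit name.
    """
    units = []
    for line in systemctl_output.splitlines():
        fields = line.split()
        if fields:
            units.append(fields[0])
    return units


# Hypothetical sample output, mirroring the alert above
sample = "swift_dispersion_stats.service loaded failed failed Swift dispersion stats\n"
print(failed_units(sample))  # ['swift_dispersion_stats.service']
```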
[13:55:25] RESOLVED: [2x] SystemdUnitFailed: swift_dispersion_stats.service on ms-fe1009:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:16:40] https://usercontent.irccloud-cdn.com/file/2F9FziHL/grafik.png
[14:17:06] the number of files in s3 /srv/ has gone down by around 12% in the past six months
[14:17:17] https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=db1157&var-datasource=000000026&var-cluster=mysql&from=now-6M&to=now&timezone=utc&viewPanel=panel-12
[14:17:40] 👏
[14:24:28] Amir1: that's very dedicated work you've done over the last months, thank you
[14:25:47] it's very addictive trying to reduce the number
[14:27:49] once we finish the file migration, it'll be a couple more percent lower. I think I can get linter and betafeatures_user_counts moved to x1 once we set up replication to wmcs
[14:49:00] Amir1: eqiad frontend reboots done.
[14:49:38] Awesome. I'll start the new batch of deletions
[14:50:15] 👍
[15:51:49] I caught up with rob.h via privmsg about T405942 and the confusion about moss-be1002 there, so I think that should all be good now.
[15:51:50] T405942: eqiad row C/D Data Persistence host migrations - https://phabricator.wikimedia.org/T405942
[17:03:22] Amir1: pc1014 was depooled in the morning
[17:03:33] marostegui: you're the best <3
[18:20:16] PROBLEM - MariaDB sustained replica lag on s1 on db1219 is CRITICAL: 50.6 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1219&var-port=9104
[18:22:16] RECOVERY - MariaDB sustained replica lag on s1 on db1219 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1219&var-port=9104
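[editor's note: the PROBLEM/RECOVERY pair above encodes the check's thresholds inline — "50.6 ge 10" means the measured lag of 50.6 s crossed the critical threshold of 10 s, and "(C)10 ge (W)5 ge 0" shows critical 10 s, warning 5 s, measured 0 s on recovery. A minimal sketch of that classification logic; the defaults mirror the values in the alert text, while the real Icinga check plugin may differ:]

```python
def replica_lag_status(lag_seconds: float, warn: float = 5.0, crit: float = 10.0) -> str:
    """Classify sustained replica lag against warning/critical thresholds.

    CRITICAL when lag >= crit, WARNING when lag >= warn, OK otherwise.
    Thresholds default to the "(C)10 ge (W)5" values shown in the
    recovery message; they are assumptions for this sketch.
    """
    if lag_seconds >= crit:
        return "CRITICAL"
    if lag_seconds >= warn:
        return "WARNING"
    return "OK"


print(replica_lag_status(50.6))  # CRITICAL, matching the 18:20 alert
print(replica_lag_status(0.0))   # OK, matching the 18:22 recovery
```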