[08:54:33] <federico3>	 we have a PSU fault on es2055
[09:01:51] <Emperor>	 sometimes monitoring will make a ticket automatically...
[10:45:35] <Emperor>	 federico3: looking at the essential work doc, you look to have added a new section the other day, but it's still called week of 18 August, and copies the object storage update from the week of 18 August. I think that's an error and you meant to make a new week of 25 August section, so I should update the date and the object storage bits. Is that correct?
[10:45:55] <Emperor>	 [alternatively, you might have been meaning to add things to the 18th August section]
[10:46:36] <federico3>	 oh, I'll move those to the 25th
[10:47:24] <Emperor>	 thanks
[11:33:57] <federico3>	 thank you
[11:43:10] <federico3>	 Amir1: updated https://gerrit.wikimedia.org/r/c/operations/puppet/+/1182592 
[11:43:52] <federico3>	 Amir1: also can I start the schema change on s3 in codfw?
[11:49:37] <Amir1>	 federico3: I'm running schema change on s3
[11:49:42] <Amir1>	 https://wikitech.wikimedia.org/wiki/Map_of_database_maintenance
[11:49:47] <Amir1>	 It'll be over soon though
[11:50:05] <federico3>	 I can wait or do s7 instead
[11:50:29] <Amir1>	 wait a bit since I want to do s7 next :P
[13:06:57] <federico3>	 Amir1: can I start the upgrade and clone of es2049 or maybe we want to do the clone on monday?
[13:07:12] <Amir1>	 let's do it on Monday
[13:07:34] <federico3>	 good to go for the upgrade now?
[13:31:25] <jinxer-wm>	 FIRING: SystemdUnitFailed: prometheus-mysqld-exporter.service on es2049:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:49:11] <Emperor>	 ^-- expected?
[13:49:53] <Amir1>	 yeah, it's being provisioned which shouldn't trigger an alert
[15:33:07] <federico3>	 Amir1: can I start the upgrade on es2049 now?
[15:34:08] <federico3>	 also can I start the schema change on s3? I see yours is not running on cumin1002
[15:37:49] * Emperor is just going to gently note it's Friday EU-afternoon
[16:01:58] <Amir1>	 I'm also a bit confused by the notion of upgrade. It's a new host being provisioned, it doesn't have anything to upgrade. What am I missing here?
[16:03:19] <Amir1>	 s3 is done on my side
[17:31:40] <jinxer-wm>	 FIRING: SystemdUnitFailed: prometheus-mysqld-exporter.service on es2049:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:51:25] <jinxer-wm>	 FIRING: [2x] SystemdUnitFailed: prometheus-mysqld-exporter.service on es2049:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:51:40] <jinxer-wm>	 FIRING: [2x] SystemdUnitFailed: prometheus-mysqld-exporter.service on es2049:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:35:02] <Emperor>	 Is this going to alert all weekend? Can we at least silence it until Monday?
[22:40:10] <Amir1>	 I'll try to silence it
[22:43:59] <Amir1>	 downtimed, let's see if that does anything
[22:44:27] <Amir1>	 this is empty but can't say whether it's real or not, https://alerts.wikimedia.org/?q=alertname%21%3DSystemdUnitFailed&q=team%3D~data-persistence&q=%40state%3Dactive
[22:44:54] <federico3>	 is this triggered by the deployment through puppet?
[22:45:19] <Amir1>	 it's not fully provisioned that's the problem
[22:45:42] <Amir1>	 so puppet is checking for a service that's not set up (yet) and fails
[22:46:28] <federico3>	 but.. should puppet have run and deployed mariadb? I was planning to run a host update to get all the packages deployed
[22:49:31] <Amir1>	 I don't think puppet deploys mariadb, most notably the data is missing
[22:49:43] <Amir1>	 needs cloning from another host first
[23:13:02] <federico3>	 (fwiw the wmf-mariadb1011 package is deployed)