[05:52:56] Blocked-on-schema-change, DBA: Schema change for renaming name_title_timestamp on archive table - https://phabricator.wikimedia.org/T273359 (Marostegui)
[05:53:09] Blocked-on-schema-change, DBA: Schema change for renaming name_title_timestamp on archive table - https://phabricator.wikimedia.org/T273359 (Marostegui) Open→Resolved All done
[05:53:50] DBA, Orchestrator: orchestrator: Upgrade to v3.2.4 (ish) - https://phabricator.wikimedia.org/T275784 (Marostegui) p:Triage→Medium
[05:59:54] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui) I will start repooling this host on Monday
[06:02:26] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui) a:Marostegui
[06:11:05] DBA: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui)
[06:11:43] DBA: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui)
[06:15:27] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui) >>! In T275343#6863407, @Marostegui wrote: > I will start repooling this host on Monday This host crashed overnight, so the data is probably corrupted from the previous crash.
So it needs to be rebuilt
[06:15:43] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui)
[06:22:01] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui)
[06:39:03] DBA, decommission-hardware, Patch-For-Review: decommission db1092.eqiad.wmnet - https://phabricator.wikimedia.org/T275019 (Marostegui)
[07:12:31] DBA: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui) Just ran: `sudo cumin 'db21[45-52].codfw.wmnet' 'sudo lvextend -L+1100G /dev/mapper/tank-data && sudo xfs_growfs /srv'` on all the hosts in codfw
[07:14:27] DBA, SRE, ops-eqiad: eqiad: move db1111 to rack A8 - https://phabricator.wikimedia.org/T273982 (elukey)
[07:41:07] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui)
[07:41:25] DBA: Reimage db1134 to Buster and repool it - https://phabricator.wikimedia.org/T275343 (Marostegui) db1134 has been cloned. Will leave it running during the weekend before repooling it
[08:06:36] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui) @Ladsgroup is the schema change needed only: `ALTER TABLE oldimage ALTER oi_timestamp DROP DEFAULT;`? I am seeing: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/656586/2/main...
[09:14:42] DBA, DC-Ops, SRE, ops-eqiad: (Need By: TBD) rack/setup/install db11[76-84] - https://phabricator.wikimedia.org/T273566 (Marostegui) Reminder: do not add IPV6 entries for DB hosts.
[09:15:08] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Ladsgroup) The second one is for non-binary fields (you'd see it as varchar in wikitech) and it's not needed for our production.
[09:31:12] DBA, DC-Ops, SRE, ops-codfw: (Need By: TBD) rack/setup/install db21[45-52] - https://phabricator.wikimedia.org/T273568 (Marostegui) @Papaul reminder for next iterations, please do not add ipv6 entries for DB hosts (T270101) I have already removed them from netbox Thanks!
[09:44:14] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui) Thanks for the clarification!
[09:44:55] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui)
[12:10:16] :wq
[12:10:22] ok, here it is not useful
[12:17:48] DBA, Patch-For-Review: Productionize db21[45-52] and db11[76-84] - https://phabricator.wikimedia.org/T275633 (Marostegui)
[12:47:43] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui) Altered db2089:3316 will leave it running for a few days s6 progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1005 [] db2141 [] db2129 [] db2124 [] db21...
[12:48:17] Blocked-on-schema-change, DBA: Drop default of oldimage.oi_timestamp - https://phabricator.wikimedia.org/T272511 (Marostegui)
[12:58:45] PROBLEM - MariaDB sustained replica lag on db1148 is CRITICAL: 24 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1148&var-port=9104
[12:59:59] RECOVERY - MariaDB sustained replica lag on db1148 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1148&var-port=9104
[13:12:25] ^is it possible these spikes were due to ongoing schema changes?
[13:12:52] I've seen a few s4 hosts complaining in the last days
[13:13:13] nope, no schema changes are on-going at the moment.
that host is vslow,dump so maybe slow queries and/or dumps
[13:13:43] it is not that, previously it was another host
[13:14:07] so probably something going on on s4 (high concurrency/bad query)
[13:14:30] nothing important now, but to keep an eye if it recurs
[16:05:22] DBA, DC-Ops, SRE, ops-codfw: (Need By: TBD) rack/setup/install db21[45-52] - https://phabricator.wikimedia.org/T273568 (Papaul) @Marostegui understood. We will have to mention that on all the next racking tasks now as a side note so I do not forget. Thanks.
[16:19:35] DBA, DC-Ops, SRE, ops-codfw: (Need By: TBD) rack/setup/install db21[45-52] - https://phabricator.wikimedia.org/T273568 (Marostegui) Thank you!
[17:07:20] DBA, SRE, ops-eqiad: db1162 crashed - https://phabricator.wikimedia.org/T275309 (Cmjohnson) update: this was scheduled for today but when I sent the tech the access ticket I was told it's been re-assigned and someone should've contacted me. That did not happen. I need to figure it out and this will...
[18:44:55] DBA, SRE, ops-eqiad: db1162 crashed - https://phabricator.wikimedia.org/T275309 (Marostegui) Thanks for the update. Much appreciated!