[00:59:50] DBA, Patch-For-Review: Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Krinkle) > nightmare […] It all starts with having to depool them via a MW commit, […] I'm confident we can avoid this for mainstash db (x2). The fact that this requires a MW commit today for parser cache is...
[01:00:24] DBA, Patch-For-Review, Performance-Team (Radar): Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Krinkle)
[06:12:28] DBA, DC-Ops, SRE, ops-eqiad: (Need By: TBD) rack/setup/install db11[76-84] - https://phabricator.wikimedia.org/T273566 (Marostegui)
[06:12:58] DBA, DC-Ops, SRE, ops-codfw: (Need By: TBD) rack/setup/install db21[45-52] - https://phabricator.wikimedia.org/T273568 (Marostegui)
[06:20:43] DBA, Patch-For-Review, Performance-Team (Radar): Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Marostegui) >>! In T269324#6794656, @Krinkle wrote: >> nightmare […] It all starts with having to depool them via a MW commit, […] > > I'm confident we can avoid this for mainstas...
[06:31:44] DBA, Orchestrator, Patch-For-Review, User-Kormat: Enable report_host for mariadb - https://phabricator.wikimedia.org/T266483 (Marostegui)
[06:36:08] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui)
[06:36:09] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui) es4 cleaned
[06:53:31] DBA, Orchestrator: Add m* and es4/es5 sections to Orchestrator - https://phabricator.wikimedia.org/T272568 (Marostegui) es4 added to orchestrator
[06:53:39] DBA, Orchestrator: Add m* and es4/es5 sections to Orchestrator - https://phabricator.wikimedia.org/T272568 (Marostegui)
[07:19:51] DBA, DC-Ops, decommission-hardware, ops-eqiad: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui) a:Marostegui→wiki_willy
[07:19:54] DBA, DC-Ops, decommission-hardware, ops-eqiad: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui)
[07:20:06] DBA, DC-Ops, decommission-hardware, ops-eqiad: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui) Ready for DC-Ops!
[07:21:07] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[09:12:40] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui) After almost 2h, I have entirely cleaned up s4 :)
[09:13:51] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui)
[10:22:17] DBA, Data-Services: Clean up heartbeat table on clouddb hosts - https://phabricator.wikimedia.org/T273593 (Marostegui)
[10:23:27] DBA, Data-Services: Clean up heartbeat table on clouddb hosts - https://phabricator.wikimedia.org/T273593 (Marostegui) p:Triage→Medium I will try to get this done this week
[10:24:44] DBA, Data-Services: Clean up heartbeat table on clouddb hosts - https://phabricator.wikimedia.org/T273593 (Marostegui)
[10:26:06] DBA, Data-Services: Clean up heartbeat table on clouddb hosts - https://phabricator.wikimedia.org/T273593 (Marostegui)
[10:33:09] DBA, Data-Services: Clean up heartbeat table on clouddb hosts - https://phabricator.wikimedia.org/T273593 (Marostegui)
[10:34:00] there is really no other option, marostegui, right?
https://gerrit.wikimedia.org/r/c/operations/puppet/+/661074
[10:34:45] I just commented hehe
[10:35:04] and yes, you need that unfortunately
[10:35:05] sorry, I meant about the stretch
[10:35:28] yep :(
[10:42:59] DBA, decommission-hardware: decommission db1078.eqiad.wmnet - https://phabricator.wikimedia.org/T273597 (Marostegui)
[10:43:27] DBA, decommission-hardware: decommission db1078.eqiad.wmnet - https://phabricator.wikimedia.org/T273597 (Marostegui)
[10:43:29] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[10:43:49] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[10:48:11] Data-Persistence-Backup, SRE, decommission-hardware, ops-eqiad: decommission helium.eqiad.wmnet and helium-array - https://phabricator.wikimedia.org/T273049 (jcrespo) @wiki_willy As promised, we sped up the decommissioning of eqiad hw, this should free 3Us of space. No blocker on us, but I though...
[10:52:12] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jynus on cumin1001.eqiad.wmnet for hosts: ` db1171.eqiad.wmnet ` The log can be foun...
[11:14:15] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1171.eqiad.wmnet'] ` and were **ALL** successful.
[11:48:55] I am updating zarcillo, and the row format issues happen again- something may have changed on config
[11:49:48] as far as I know we never took a super deep look and did the work around of: set session...
[11:50:03] I had to do the same on command line
[11:50:10] that is new
[11:50:10] yeah, that's what I mean :)
[11:50:42] yes, what I am saying is that as far as I remember no one took a deep look at why it started happening
[11:53:21] I am doing a "recovery" of db1171 with myloader to 1) prevent any option of corruption 2) test logical backups
[11:58:05] DBA, Orchestrator: Enable communication between orchestrator and clouddb hosts - https://phabricator.wikimedia.org/T273606 (Marostegui)
[11:58:16] DBA, Orchestrator: Enable communication between orchestrator and clouddb hosts - https://phabricator.wikimedia.org/T273606 (Marostegui) p:Triage→Medium
[13:20:04] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui) db1174 is now replicating. Not pooling until tomorrow.
[13:47:01] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui)
[15:09:56] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui)
[15:47:30] DBA, DC-Ops, SRE, ops-codfw: (Need By: TBD) rack/setup/install db21[45-52] - https://phabricator.wikimedia.org/T273568 (Papaul)
[16:58:55] Data-Persistence-Backup, SRE, decommission-hardware, ops-eqiad: decommission helium.eqiad.wmnet and helium-array - https://phabricator.wikimedia.org/T273049 (wiki_willy) Thanks a lot @jcrespo, it's much appreciated! >>! In T273049#6795619, @jcrespo wrote: > @wiki_willy As promised, we sped up th...