[04:34:34] DBA, Wikimedia-Rdbms, Goal, Performance-Team (Radar), and 2 others: FY18/19 TEC1.6 Q4: Improve or replace the usage of GTID_WAIT with pt-heartbeat in MW - https://phabricator.wikimedia.org/T221159 (aaron) >>! In T221159#5562483, @daniel wrote: > Patch was merged, removing the patch for review tag...
[07:16:01] DBA, Phabricator, Patch-For-Review: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1132.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reima...
[07:50:29] DBA, Phabricator, Patch-For-Review: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1132.eqiad.wmnet'] ` and were **ALL** successful.
[07:55:43] DBA: Enable DB replication codfw -> eqiad before the switchover - https://phabricator.wikimedia.org/T243373 (Marostegui) Scheduled for 27th August.
[08:32:35] DBA, Phabricator: Upgrade m3 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T259589 (Marostegui)
[08:54:09] DBA, Patch-For-Review, User-Urbanecm: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui)
[08:54:11] DBA: Move more wikis from s3 to s5 - https://phabricator.wikimedia.org/T226950 (Marostegui)
[08:57:17] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (Marostegui) @jcrespo can we take a logical backup from this specific wiki so I can truncate its tables? The wiki is super small...
[09:01:29] ^all wikis are backed up, but you want a specific one of zerowiki, right?
[09:01:37] time, now?
[09:02:02] no rush
[09:02:02] not urgent :)
[09:02:18] yeah, but when you want it, when you plan to do your maintenance?
[09:02:31] ah, if you can do it this week, that'd be cool
[09:02:39] I can do it now
[09:03:03] but backup will only last for some weeks, it won't be there forever
[09:03:09] I know
[09:03:18] ok, just making it explicit
[09:03:23] ok
[09:27:53] https://mysqlserverteam.com/mysql-shell-dump-load-part-2-benchmarks/
[09:29:30] in particular this image: https://mysqlserverteam.com/wp-content/uploads/2020/07/dump-1-1024x640.png
[09:30:48] stackoverflow data is interesting
[09:30:58] I don't want to know what they have there :-/
[09:31:53] have you checked their schemas just because of curiosity?
[09:32:26] nah, I am trying to know what they call English wikipedia database, as nothing we offer matches their dataset
[09:33:35] I guess an xml dump?
[09:33:45] "uncompressed TSV size"
[09:34:00] 130 GB
[09:34:13] maybe it is just one table?
[09:34:15] too small to be content dumps, but to large to be only metadata
[09:35:35] I may ask them
[09:37:29] "mydumper was faster in dumping wikipedia than MySQL Shell, which might be because the wikipedia dataset contains many binary columns which MySQL Shell converts to base64 format"
[09:37:44] there are some binary strings on metadata, but most of them are on content
[09:37:47] so not sure
[09:47:46] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (jcrespo) I have created both a mydumper and a mysqldump exports of zerowiki. It was anyway being backed up regularly. You can s...
[09:47:59] ^done
[09:48:37] mydumper with a regex takes a lot of time on s3, as it does metadata checking of all 100000+ tables
[09:48:41] thanks
[09:48:42] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (Marostegui) Thank you! @Jdforrester-WMF I can proceed with the truncate now. However, I do have a last question, do we have to...
[09:49:36] not sure if relevant, but x1 too (maybe?)
[09:49:51] yeah, maybe, let me ask that too
[09:50:08] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (Marostegui) And same question goes for x1 tables?
[09:51:05] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (Marostegui) >>! In T227717#6362222, @Marostegui wrote: > And same question goes for x1 tables? Disregard that, there are no ta...
[10:01:19] DBA, Patch-For-Review, User-Urbanecm: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Urbanecm) >>! In T259004#6356753, @Marostegui wrote: > And also de yaml that will regenerate the new s5 dblist Updated the moving patch.
[10:26:20] I will restart prometheus exporter for db1117:s3
[10:28:38] this is not really a systemd issue, because the bug is that after "Error pinging mysql", for some reason the binary doesn't reconnect back
[10:28:59] it's a prometheus bug
[10:29:42] thanks
[10:30:05] or if it bails out after a lot of retry, the systemd unit should mark itself as failed
[10:30:10] it's weird
[10:31:00] either work or fail properly, don't go into a limbo state
[11:38:07] DBA, Product-Infrastructure-Team-Backlog, Release-Engineering-Team-TODO: Drop DB tables for now-deleted zerowiki from production - https://phabricator.wikimedia.org/T227717 (Jdforrester-WMF) >>! In T227717#6362221, @Marostegui wrote: > Thank you! > @Jdforrester-WMF I can proceed with the truncate now...
[12:28:15] DBA, Operations, User-Kormat: DBA python layout - https://phabricator.wikimedia.org/T259516 (Kormat)
[12:52:32] are you running a check or other maintenance on db1117:m3 (I see no current activity)?
[12:52:52] not me
[12:53:03] asking manuel as he stopped earlier
[12:53:10] *it
[12:53:35] jynus: yes, I was
[12:53:39] let me see if it finished
[12:54:03] were you doing a check, a data migration, just out of curiosity?
[12:54:14] I was checking some data
[12:54:20] restarted replication!
[12:54:23] thanks for the heads up!
[12:54:23] cool
[12:54:29] sorry, I didn't meant to pressure
[12:54:36] no no, no pressure
[12:54:37] but as it is also the source of backups
[12:54:54] I wasnted to make sure it didn't stay stopped until tomorrow
[12:55:18] *wanted
[12:55:39] I also remember you wanted to do an m3 switchover, but I don't remember when
[12:56:25] found it on the calendar, it was next week
[12:56:44] so not related (or at least not directly) to this
[12:57:30] thanks for the checks, we should do more of those, more frequently
[14:49:45] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[15:16:37] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[15:29:35] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul) ` Interface Admin Link Description xe-4/0/20 up up dbprov2003 Logical Vlan TAG MAC STP L...
[15:30:03] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[15:48:42] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (jcrespo)
[15:53:46] DBA, DC-Ops, Operations, ops-eqiad: (2020-09-14) rack/setup/install dbprov1003.eqiad.wmnet - https://phabricator.wikimedia.org/T258750 (jcrespo)
[15:55:07] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (jcrespo)
[15:59:14] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul) Status Name State Layout Size Media Type Read Policy Write Policy Stripe Size Secured Remaining Redundancy Virtual Disk 0 Onlin...
[16:07:00] DBA, DC-Ops, Operations, ops-codfw, Patch-For-Review: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[16:18:05] DBA, DC-Ops, Operations, ops-codfw, Patch-For-Review: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (jcrespo) They use the `custom/db.cfg` recipe, but only on first install, after that they are moved to `custom/reuse-dbprov.cfg`.
[18:22:04] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` dbprov2003.codfw.wmnet ` The log can be found i...
[18:39:15] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['dbprov2003.codfw.wmnet'] ` Of which those **FAILED**: ` ['dbprov2003.codfw.wmnet'] `
[18:54:51] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (jcrespo) This was the only issue we had the last time with the same hw and recipe: T218336#5068836
[19:21:10] DBA, DC-Ops, Operations, ops-codfw, Patch-For-Review: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` dbprov2003.codfw.wmnet `...
[19:44:02] DBA, Goal: Expand database provisioning/backup service to accomodate for growing capacity and high availability needs - https://phabricator.wikimedia.org/T257551 (Papaul)
[19:46:03] DBA, DC-Ops, Operations, ops-codfw, Patch-For-Review: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[19:59:47] DBA, DC-Ops, Operations, ops-codfw, Patch-For-Review: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['dbprov2003.codfw.wmnet'] ` and were **ALL** successful.
[20:12:43] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul)
[20:12:58] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-09-14) rack/setup/install dbprov2003.codfw.wmnet - https://phabricator.wikimedia.org/T258749 (Papaul) Open→Resolved This is done
[22:34:14] Blocked-on-schema-change, DBA, Anti-Harassment (The Letter Song), MW-1.35-notes (1.35.0-wmf.36; 2020-06-09): ipb_address_unique has an extra column in production but not in the code - https://phabricator.wikimedia.org/T251188 (Ladsgroup) Resolved→Open Sorry :( The drift report gave this:...
[22:36:35] marostegui: so I ran it again and only found the things we are waiting for master switchover (in s8) and this one I just reopened but 1- 25% of tables are in abstract schema mode and my sql parser is not good enough to understand these OTOH, writing a drift checker using abstract schema is much easier, I just need a bit of time 2- I accidentally introduced a drift in the code. A column in change_tag is not unsigned but I will make a
[22:36:35] patch and make a ticket for that soon