[02:18:16] Blocked-on-schema-change, DBA, Anti-Harassment (The Letter Song), MW-1.35-notes (1.35.0-wmf.36; 2020-06-09): ipb_address_unique has an extra column in production but not in the code - https://phabricator.wikimedia.org/T251188 (TK-999) This will probably address T157508 as well.
[04:52:55] DBA, Data-Services, cloud-services-team (Kanban): Prepare and check storage layer for awawiki - https://phabricator.wikimedia.org/T251410 (Marostegui) Open→Resolved This is done as labsdb1011 has been recloned from labsdb1012
[04:53:36] DBA, Data-Services, cloud-services-team (Kanban): Prepare and check storage layer for gomwiktionary - https://phabricator.wikimedia.org/T250706 (Marostegui) Open→Resolved This is done as labsdb1011 has been recloned from labsdb1012
[04:57:48] Blocked-on-schema-change, DBA, Anti-Harassment (The Letter Song), MW-1.35-notes (1.35.0-wmf.36; 2020-06-09): ipb_address_unique has an extra column in production but not in the code - https://phabricator.wikimedia.org/T251188 (Marostegui)
[04:58:26] Blocked-on-schema-change, DBA, Anti-Harassment (The Letter Song), MW-1.35-notes (1.35.0-wmf.36; 2020-06-09): ipb_address_unique has an extra column in production but not in the code - https://phabricator.wikimedia.org/T251188 (Marostegui)
[04:58:42] Blocked-on-schema-change, DBA, Anti-Harassment (The Letter Song), MW-1.35-notes (1.35.0-wmf.36; 2020-06-09): ipb_address_unique has an extra column in production but not in the code - https://phabricator.wikimedia.org/T251188 (Marostegui) Open→Resolved This is all done, thanks everyone fo...
[06:17:22] DBA, Performance-Team: Database for XHGui profiles - https://phabricator.wikimedia.org/T254795 (Marostegui) Thanks for the information. I think we might want to place this on m2, where otrs, recommendationapi, gerrit live, so we can group things that come from "outside". The connection should go via th...
[06:31:43] DBA, Upstream, cloud-services-team (Kanban): Reimage labsdb1011 to Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T249188 (Marostegui) Open→Declined After a couple of months, I am declining this task because it has been impossible to upgrade labsdb to Buster and Mariadb 10.4 To...
[06:31:45] DBA, Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (Marostegui)
[06:48:36] DBA: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[06:56:52] DBA, Operations: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (Marostegui)
[06:57:39] DBA, Growth-Team, MediaWiki-Recent-changes, Schema-change: recentchanges table indexes: tmp1, tmp2 and tmp3 - https://phabricator.wikimedia.org/T206103 (Marostegui)
[07:06:10] DBA: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[07:07:34] DBA: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[07:08:20] DBA: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[07:17:44] DBA, Patch-For-Review: Upgrade m1 to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254556 (Marostegui)
[07:22:55] DBA, Patch-For-Review: Upgrade m1 to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254556 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1097.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006090722_m...
[07:40:17] DBA, Patch-For-Review: Upgrade m1 to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254556 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1097.eqiad.wmnet'] ` and were **ALL** successful.
[08:46:28] DBA: Upgrade m1 to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254556 (Marostegui)
[08:47:03] DBA, Patch-For-Review: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[09:15:08] DBA, Patch-For-Review: Productionize db114[1-9] - https://phabricator.wikimedia.org/T252512 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1141.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006090914_marostegui_259...
[09:17:49] DBA, Patch-For-Review: Productionize db114[1-9] - https://phabricator.wikimedia.org/T252512 (Kormat)
[09:18:27] DBA, Patch-For-Review: Productionize db114[1-9] - https://phabricator.wikimedia.org/T252512 (Kormat) Removed the loan to labsdb from the description now that https://gerrit.wikimedia.org/r/603907 is merged.
[09:35:55] DBA, Patch-For-Review: Productionize db114[1-9] - https://phabricator.wikimedia.org/T252512 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1141.eqiad.wmnet'] ` and were **ALL** successful.
[09:39:25] DBA: Compress enwiki InnoDB tables - https://phabricator.wikimedia.org/T254462 (Marostegui)
[10:28:16] DBA, Growth-Team, MediaWiki-Recent-changes, Schema-change: recentchanges table indexes: tmp1, tmp2 and tmp3 - https://phabricator.wikimedia.org/T206103 (Marostegui)
[10:42:08] Blocked-on-schema-change, DBA: CentralNotice: Update DB schema on Meta for new features - https://phabricator.wikimedia.org/T254371 (Marostegui) Open→Stalled Stalling per the above comment
[11:56:46] DBA: Identify and delete non-useful files on database hosts taking space unnecesarily - https://phabricator.wikimedia.org/T182563 (Marostegui) Open→Declined I am going to close this task as we are not really using it for anything (tracking or anything). This is an ongoing permanent thing really (also...
[12:00:42] DBA, Patch-For-Review: Productionize db114[1-9] - https://phabricator.wikimedia.org/T252512 (Marostegui)
[12:29:35] DBA: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[12:34:12] DBA, Analytics: Upgrade analytics dbstore databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254870 (Marostegui)
[12:34:25] DBA, Analytics: Upgrade analytics dbstore databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254870 (Marostegui) p:Triage→Medium
[12:42:31] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (Marostegui)
[12:43:39] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (Marostegui) p:Triage→Medium @jcrespo would you be okay if I upgrade db1095 and db2101 to Buster? Those are backup sources.
[12:45:35] DBA, Analytics: Upgrade analytics dbstore databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254870 (elukey) No preference! I think that given the limited downtime it can be done anytime, no problems from my side. Do you require help in doing anything?
[12:46:57] DBA, Analytics: Upgrade analytics dbstore databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254870 (Marostegui) >>! In T254870#6205693, @elukey wrote: > No preference! I think that given the limited downtime it can be done anytime, no problems from my side. Do you require help in...
[13:31:25] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Marostegui) Some other exceptions seeing on...
[13:53:30] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db2131.codfw.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006091353_marostegui_10...
[14:17:23] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2131.codfw.wmnet'] ` and were **ALL** successful.
[14:21:53] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db2131.codfw.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006091421_marostegui_12...
[14:47:56] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2131.codfw.wmnet'] ` and were **ALL** successful.
[14:49:57] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (Marostegui)
[14:58:06] heads up, I have to run a bunch of deletes from m2 master (via debmonitor) because I discovered that the garbage collection of not-anymore-referenced packages has been broken for a while
[14:58:30] I just released the fix and I ofc prefer to run it manually the first time rather than waiting for the bi-weekly cron
[14:58:34] volans: a bunch meaning..?
[14:59:20] should be just a few thousand rows, and each query does at most 1000 rows
[14:59:34] ah cool, that's cool
[14:59:42] I don't expect any impact, but checking if there is any maintenance ongoing or stuff
[14:59:47] that I should hold on
[14:59:48] we don't care much about lag on misc anyways, but good to batch them in small sizes yeah
[14:59:55] volans: go ahead!
[15:00:09] thanks! :)
[15:00:38] if you're curious I had to find the rows to delete in the code as the single query I was using before would explode with the added dimensions in the join :D
[15:00:55] hahahaha
[15:08:32] marostegui: FYI {done}, results at https://phabricator.wikimedia.org/T254865#6206340
[15:08:56] ah that was tiny! but thanks for the heads up <3
[15:34:51] o/ marostegui a question that you may be able to help me with :) in "show slave status" what are the things I should look at to indicate replication is okay and healthy? (not related to WM production)
[15:35:08] I was doing something like this:
[15:35:11] seconds_behind_master=$(mysql -e "show slave status\G" -uroot -p$password_aux | grep "Seconds_Behind_Master" | awk '{print $2}')
[15:35:47] but it turns out if replication breaks then seconds behind can be "Null", so wondering if there is another sensible thing to use in conjunction with seconds behind?
[15:36:22] Slave_IO_Running and/or Slave_SQL_Running perhaps?
[15:40:21] addshore: those are probably the key items
[15:41:24] Right, now to figure out how to fix "binlog truncated in the middle of event" or if I should rebuild my replica...
[15:49:32] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Bstorm)
[17:57:44] addshore: normally seconds behind the master (although that's not always correct, if there is an intermediate master)
[17:57:55] but in most use cases, that's what you want to look at
[17:58:22] you can monitor last error and IO running and slave SQL running too yes
[18:20:25] Cool, I'll be sure to try and bake that into my helm chart :0
[18:21:03] I need to document my process of snapshotting a master and then setting it up as a replica next, but perhaps not tonight...
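Editor's note: the replication check discussed above (Slave_IO_Running and Slave_SQL_Running, plus Seconds_Behind_Master, which can be NULL when replication breaks) might be sketched as below. This is only an illustration under assumptions: the function names `slave_status_field` and `replication_health` and the 60-second default lag threshold are hypothetical and do not come from the log; only the three field names are taken from the conversation. As noted in the chat, Seconds_Behind_Master is not always accurate when there is an intermediate master.

```shell
# Hypothetical helpers (not from the log). POSIX sh; needs only awk.

# Print the value of one exactly-named field from `show slave status\G` text.
# $1 = status text, $2 = field name. Exact matching keeps Slave_SQL_Running
# from also matching Slave_SQL_Running_State.
slave_status_field() {
  printf '%s\n' "$1" | awk -F': ' -v key="$2" '
    { name = $1; sub(/^[ \t]+/, "", name) }
    name == key { print $2; exit }'
}

# Read `show slave status\G` output on stdin; print a verdict and return
# 0 when replication looks healthy, 1 otherwise.
# $1 = maximum acceptable lag in seconds (assumed default: 60).
replication_health() {
  rh_status=$(cat)
  rh_max_lag="${1:-60}"
  rh_io=$(slave_status_field "$rh_status" Slave_IO_Running)
  rh_sql=$(slave_status_field "$rh_status" Slave_SQL_Running)
  rh_lag=$(slave_status_field "$rh_status" Seconds_Behind_Master)

  if [ "$rh_io" != "Yes" ]; then
    echo "unhealthy: Slave_IO_Running=$rh_io"; return 1
  fi
  if [ "$rh_sql" != "Yes" ]; then
    echo "unhealthy: Slave_SQL_Running=$rh_sql"; return 1
  fi
  # A non-numeric lag (for example NULL) means the lag is unknown.
  case "$rh_lag" in
    ''|*[!0-9]*) echo "unhealthy: Seconds_Behind_Master=$rh_lag"; return 1 ;;
  esac
  if [ "$rh_lag" -gt "$rh_max_lag" ]; then
    echo "unhealthy: lag ${rh_lag}s exceeds ${rh_max_lag}s"; return 1
  fi
  echo "healthy"
}

# Possible usage, reusing the client invocation quoted in the chat:
#   mysql -e "show slave status\G" -uroot -p$password_aux | replication_health 30
```

Reading the status text from stdin keeps the check independent of how the client is invoked, so it can be exercised with canned output before being wired into a probe.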
[19:30:01] Blocked-on-schema-change, DBA: CentralNotice: Update DB schema on Meta for new features - https://phabricator.wikimedia.org/T254371 (AndyRussG)