[05:28:09] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui)
[05:28:25] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui) All codfw sanitarium masters are running 10.4 and Buster
[05:35:50] Blocked-on-schema-change, DBA: Schema change for renaming new_name_timestamp to rc_new_name_timestamp in recentchanges - https://phabricator.wikimedia.org/T276292 (Marostegui)
[05:44:20] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui) Transfer from db1079 to db1158 started.
[06:04:56] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['db1124.eqiad.wmnet'] ` The log ca...
[06:06:20] volans|off: not sure how to proceed with this error, first time seeing it: https://phabricator.wikimedia.org/P15508
[06:07:04] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1124.eqiad.wmnet'] ` Of which those **FAILED**: ` ['db1124.eqiad.wmnet'] `
[06:10:32] DBA, SRE, Wikimedia-Mailing-lists: Upgrade lists-next to bullseye mailman versions - https://phabricator.wikimedia.org/T280887 (Marostegui) @Legoktm we can do it on Monday if you like just ping on #wikimedia-databases whenever you want to start and we can coordinate.
[06:10:55] DBA, SRE, Wikimedia-Mailing-lists: Upgrade lists-next to bullseye mailman versions - https://phabricator.wikimedia.org/T280887 (Marostegui) I meant wikimedia-databases irc channel, not the tag :)
[06:24:25] Blocked-on-schema-change, DBA: Schema change for renaming new_name_timestamp to rc_new_name_timestamp in recentchanges - https://phabricator.wikimedia.org/T276292 (Marostegui)
[06:26:54] Blocked-on-schema-change, DBA: Schema change for renaming new_name_timestamp to rc_new_name_timestamp in recentchanges - https://phabricator.wikimedia.org/T276292 (Marostegui) s8 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1005 [] db1177 [] db1172 [] db1154 [] db1126 [x] db1116 [...
[06:27:22] Blocked-on-schema-change, DBA: Schema change for renaming new_name_timestamp to rc_new_name_timestamp in recentchanges - https://phabricator.wikimedia.org/T276292 (Marostegui)
[06:28:44] marostegui: 301 john (not here?) that was migrating debmonitor to cfssl, if the paths have changed we might need to adjust the reimage script and spicerack module
[06:29:05] volans|off: thanks, I will ping him :)
[06:56:38] PROBLEM - MariaDB sustained replica lag on pc2008 is CRITICAL: 2.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=pc2008&var-port=9104
[07:01:20] RECOVERY - MariaDB sustained replica lag on pc2008 is OK: (C)2 ge (W)1 ge 0.4 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=pc2008&var-port=9104
[07:24:28] DBA: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185 (jcrespo) There was recently maintenance on tendril to migrate its web frontend. While in theory puppet should setup everything automatically, make sure that 1) its memcache instance is setup correctly and it is workin...
[07:32:28] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (jcrespo) Could you provide/confirm more details about what you want to achieve? db1156 needs to host s2 and you want to rebuild it from db1171 (stretch backup...
[07:32:41] DBA: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185 (Marostegui) The large table matches whatever happened during the night (and is still happening): https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=12&orgId=1&refresh=5m&var-server=db1115&var-datasource...
[07:34:28] DBA: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185 (jcrespo) Just to be clear- I was just guessing on possible explanations, I think you know better the internals than me. Other option is to start truncating it, either manually or on a cronjob, if that makes sense.
[07:35:02] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui) db1156 is now running 10.4 + Buster. What I would like to do would be: - Delete all its content - Place a logical dump from buster backup source (...
[07:36:22] DBA: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185 (Marostegui) >>! In T231185#7028757, @jcrespo wrote: > Other option is to start truncating it, either manually or on a cronjob, if that makes sense. Yeah, that's possibly what I will do, I want to see first if this ta...
[07:39:15] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (jcrespo) Thanks for clarifications- this helps me making sure I don't break the wrong host and I recover the right one, because I lack all the context. **Will...
[07:40:11] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui) No worries, stretch + mysql_upgrade should be fine :) Feel free to ask as many things you need if something isn't clear Thanks a lot for the help...
[07:41:24] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui)
[07:44:46] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (jcrespo) FYI - I will load the stretch backup directly, as the logical backups do not contain system tables (only wiki ones)- that will have to be setup separ...
[07:45:25] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (Marostegui) That's perfectly fine yeah. Thanks :)
[08:34:33] DBA, Data-Persistence-Backup, Patch-For-Review: Upgrade all sanitarium masters to 10.4 and Buster - https://phabricator.wikimedia.org/T280492 (jcrespo) https://grafana.wikimedia.org/d/000000273/mysql?viewPanel=5&orgId=1&var-server=db1156&var-port=9104&from=1619165049343&to=1619202849343
[08:37:45] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui) db1158 is now replicating. Once it's caught up I will enable GTID and start checking tables.
[08:44:30] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui) db1158: GTID enabled and tables being checked
[08:44:43] DBA: Upgrade 10.4.13 hosts to a higher version - https://phabricator.wikimedia.org/T279281 (elukey) @razzi ping :) Can you update this task with a timeline about when the job will be done?
[12:19:06] DBA, SRE, Wikimedia-Mailing-lists: Upgrade lists-next to bullseye mailman versions - https://phabricator.wikimedia.org/T280887 (akosiaris) p:Triage→Medium
[12:32:47] s2 is probably going "soon" after s6 and s3 for 10.4 upgrade, right? (note the quotes meaning relatively)
[12:32:55] yeah
[12:32:59] it is either s2 or s5
[12:33:13] if yes, I may use db1156 to setup in advance an s2 buster source backups
[12:34:04] I think I have s5 done, so that would be s6, s3, s5 and s6 duplicated into buster
[12:34:25] (x1 too, but I don't count that)
[12:34:40] and I have s3 left to do on eqiad
[12:35:03] I think I need a ticket to track this
[12:35:18] I will create one as a subtiquet of one of yours
[12:36:39] sounds good!
[12:36:57] just for my own organization, I will still work on a common ticket for each section
[12:37:53] DBA: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185 (Marostegui) So it grew like 40G in 5 hours...: ` -rw-rw---- 1 mysql mysql 140G Apr 23 12:25 general_log_sampled.ibd ` From what I gathered, it is definitely being used on the activity report we use. As I truncated it...
[12:51:05] Data-Persistence-Backup: Upgrade pending stretch backup hosts to buster - https://phabricator.wikimedia.org/T280979 (jcrespo)
[12:53:21] I think there is enough activity at T231185 that is worth reopening :-)
[12:53:21] T231185: Disable/remove unused features on Tendril - https://phabricator.wikimedia.org/T231185
[13:01:46] Blocked-on-schema-change, DBA: Schema change for renaming new_name_timestamp to rc_new_name_timestamp in recentchanges - https://phabricator.wikimedia.org/T276292 (Marostegui)
[19:44:53] DBA, DiscussionTools, OWC2020, Editing-team (FY2020-21 Kanban Board): DBA review: conversation subscriptions - https://phabricator.wikimedia.org/T263817 (matmarex) Info for DBA review: Note that the code is already merged after we reviewed it internally, but the functionality is disabled using `...
[19:45:57] DBA, DiscussionTools, OWC2020, Editing-team (FY2020-21 Kanban Board): DBA review: conversation subscriptions - https://phabricator.wikimedia.org/T263817 (matmarex)
[19:46:38] DBA, DiscussionTools, OWC2020, Editing-team (FY2020-21 Kanban Board): DBA review: conversation subscriptions - https://phabricator.wikimedia.org/T263817 (matmarex) a:matmarex→None