[04:43:13] DBA, Gerrit: Make sure `reviewdb-test` database (used forgerrit upgrade testing) gets torn down - https://phabricator.wikimedia.org/T255715 (Marostegui) a:jcrespo Thanks for the new ticket. As Jaime handled the initial DB and grants ticket, I am going to assign this to him as he has probably all the...
[04:48:48] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Marostegui)
[04:53:22] DBA, Patch-For-Review: Relocate "old" s4 hosts - https://phabricator.wikimedia.org/T253217 (Marostegui)
[07:42:58] DBA, Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin2001.codfw.wmnet for hosts: ` ['es1025.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/20200618...
[08:17:44] DBA, Gerrit, Patch-For-Review: Make sure `reviewdb-test` database (used forgerrit upgrade testing) gets torn down - https://phabricator.wikimedia.org/T255715 (jcrespo) Waiting to proceed on @Dzahn's ok +private report revert: https://gerrit.wikimedia.org/r/c/operations/puppet/+/606387
[08:18:12] DBA, Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['es1025.eqiad.wmnet'] ` and were **ALL** successful.
[08:48:35] DBA, Epic, Patch-For-Review: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin2001.codfw.wmnet for hosts: ` ['es2022.codfw.wmnet'] ` The log can be found in `/var/log/wmf...
[09:16:53] DBA, Epic, Patch-For-Review: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['es2022.codfw.wmnet'] ` and were **ALL** successful.
[09:23:29] this doesn't look right - a profile using a role: https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/mariadb/monitor/prometheus.pp#L28
[09:27:12] DBA, Patch-For-Review: Make checksum parallel to the data transfer in transferpy package - https://phabricator.wikimedia.org/T254979 (Privacybatm) //(Machine spec: i5-2nd Gen with SATA HDD and 6GB DDR3 RAM)// I have run my patch (**in the following order**) for the data of Size: 6928532418 bytes (**6.5G...
[09:27:46] i think role::prometheus::mysqld_exporter should be changed to a profile
[09:30:43] DBA, Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (Marostegui)
[09:31:30] DBA, Patch-For-Review: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (Marostegui) Stalled→Open @Kormat I think we can proceed with this, es1025 and es2022 reimages worked fine.
[09:43:47] DBA, Patch-For-Review: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (Kormat) https://gerrit.wikimedia.org/r/606401 is out to make reuse-parts the default for most of the db fleet. We still need recipes for the tendril, zarcillo and dbprov hosts however.
[09:57:58] DBA: Switchover es5 master from es1023 to es1024 - https://phabricator.wikimedia.org/T255755 (Marostegui)
[09:58:12] DBA: Switchover es5 master from es1023 to es1024 - https://phabricator.wikimedia.org/T255755 (Marostegui) p:Triage→Medium
[09:59:28] DBA, Patch-For-Review: Make checksum parallel to the data transfer in transferpy package - https://phabricator.wikimedia.org/T254979 (jcrespo) As you correctly assume, those numbers may be misleading due to filesystem cache + parallelism behavior on memory. We should test with larger filesets to increase...
[10:08:21] DBA: Switchover es5 master from es1023 to es1024 - https://phabricator.wikimedia.org/T255755 (Marostegui)
[10:18:15] DBA: Switchover es5 master from es1023 to es1024 - https://phabricator.wikimedia.org/T255755 (jcrespo) > We cannot put es5 in RO from MW We used to be able to do it: we can remove writes from it from rotation, and only write to es4. That allows a switchover without read only/impact on application (make es1,...
[10:19:08] DBA, Patch-For-Review: Make checksum parallel to the data transfer in transferpy package - https://phabricator.wikimedia.org/T254979 (Privacybatm) Okay!
[10:20:35] DBA, Operations: Replace `role::prometheus::mysqld_exporter` with `profile::prometheus::mysqld_exporter_instance` - https://phabricator.wikimedia.org/T255758 (Kormat)
[10:21:10] DBA, Operations: Replace `role::prometheus::mysqld_exporter` with `profile::prometheus::mysqld_exporter_instance` - https://phabricator.wikimedia.org/T255758 (Kormat) p:Triage→Medium
[10:27:05] DBA: Switchover es5 master from es1023 to es1024 - https://phabricator.wikimedia.org/T255755 (Marostegui) @CDanis can you confirm if es5 can be set up as RO with: `dbctl --scope eqiad section es5 ro "Maintenance on es5"` and then revert with `dbctl --scope eqiad section es5 rw` ? I don't recall if we've ever...
[10:28:33] kormat: you are right, welcome to puppet :-)
[10:28:50] "yaaaaycrap"
[10:28:58] jokes aside, it takes some time to fix everything in puppet
[10:29:04] we do our best
[10:29:09] arturo: i don't doubt it
[10:29:37] and there is the question: if you introduce a new rule or model, who is responsible for refreshing/refactoring all the previous code?
[10:29:50] marostegui: reading the files, probably requires a mw config edit
[10:30:13] removing a cluster from $wgDefaultExternalStore
[10:30:36] and make it is static => true
[10:30:52] arturo: indeed
[10:30:56] I seem to remember that was what tim showed us to do, but it was a long time ago
[10:32:06] it is slower than dbctl, but doesn't matter as things will be writing all the time
[10:36:41] jynus: yeah, I cannot recall how we did it last time
[10:36:49] I am not sure we tried Tim's method
[10:37:16] I think we did 1 and 1
[10:37:18] But I think it might be possible with dbctl as I think es are treated as a normal section
[10:37:32] I am trying to find the task
[10:38:16] found it: https://phabricator.wikimedia.org/T202364
[10:38:40] marostegui: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/454210/3/wmf-config/db-eqiad.php
[10:39:11] but that is pre-dbctl era, no?
[10:39:24] yeah, but I don't think dbctl handles that yet
[10:39:32] as it is not technically "read only"
[10:39:41] as much as "default configuration"
[10:39:52] it is like x1, no good support from dbctl
[10:39:56] yeah
[10:40:04] but please don't take my word for granted
[10:40:08] that is my understanding
[10:40:15] what I am 100% sure of is that it is possible
[10:40:26] yeah, tim's approach is what that patch shows
[10:40:26] (without dbctl)
[10:40:50] you should ask if dbctl changed that in any way or even request it as a feature
[10:41:01] Yeah, I have asked cd4nis
[10:41:02] on the task
[10:41:11] if it is not possible, let's go with tim's approach (that patch)
[10:41:22] it is cleaner than enforcing read only on mysql
[10:41:27] my guess is that it is like x1
[10:41:42] yeah but x1 doesn't even have the possibility on MW
[10:41:42] being an outlier it is still file controlled
[10:41:52] but it will be 0 downtime
[10:42:02] yep, and it can be done "anytime"
[10:42:14] it is the whole idea behind having 2 rw hosts
[10:42:24] cannot be depooled, but we can stop writing to 1
[10:42:25] let's wait for chris and then decide
[10:42:38] but I think we haven't done it but 1 or 2 times
[10:43:31] yeah, I definitely didn't remember it
[10:44:47] we did es3 with the read only method https://phabricator.wikimedia.org/T197073#4449452
[10:44:53] but that was before
[10:45:50] haha 2 months before
[10:51:38] DBA, Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (Marostegui)
[11:49:45] DBA, Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (Marostegui) s3 eqiad progress [x] dbstore1004 [] db1123 [] db1112 [] db1095 [] db1078 [] db1075
[12:08:35] DBA, Patch-For-Review: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts: ` ['db1077.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006181208_...
[12:28:09] DBA, Patch-For-Review: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1077.eqiad.wmnet'] ` and were **ALL** successful.
[12:28:30] heck yeah
[12:28:56] :-D
[12:37:11] DBA, Patch-For-Review: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (Kormat) Open→Resolved a:Kormat This is done :)
[12:38:44] DBA: Create reuse recipes for tendril/zarcillo/dbprov hosts - https://phabricator.wikimedia.org/T255768 (Kormat)
[12:38:57] DBA: Create reuse recipes for tendril/zarcillo/dbprov hosts - https://phabricator.wikimedia.org/T255768 (Kormat) p:Triage→Medium
[13:05:44] jynus: we can probably expect the same warning for s6 eqiad, as I am altering the backupsources now, for revision and archive table
[13:24:07] marostegui: re: es5, I am not sure, I need to look at the code path in Mediawiki, you might need a mwconfig deploy
[13:24:12] I will look today
[13:24:44] cdanis: it would be good to know, but yeah, mwconfig deploy is ok too so no worries
[13:45:47] thanks for the heads up
[15:17:13] DBA, Gerrit, Patch-For-Review: Make sure `reviewdb-test` database (used forgerrit upgrade testing) gets torn down - https://phabricator.wikimedia.org/T255715 (Dzahn) Since we won't need a mysql/mariadb db anymore for Gerrit after the upgrade is complete, this ticket might as well be used to drop the...
[15:40:13] DBA, Gerrit, Patch-For-Review: Make sure `reviewdb-test` database (used forgerrit upgrade testing) gets torn down - https://phabricator.wikimedia.org/T255715 (jcrespo) It is all ok, let's rename the task to state so and wait for the upgrade + some time to act on it.
[15:40:47] DBA, Gerrit, Patch-For-Review: Make sure `reviewdb-test` database (used forgerrit upgrade testing) gets torn down - https://phabricator.wikimedia.org/T255715 (jcrespo) Open→Stalled