[05:16:26] DBA: Remove commonswiki.templatelinks partitions - https://phabricator.wikimedia.org/T258956 (Marostegui) Open→Resolved ` mysql:root@localhost [commonswiki]> alter table templatelinks remove partitioning; Query OK, 2383408636 rows affected (10 hours 40 min 48.339 sec) Records: 2383408636 Duplicates...
[05:22:16] DBA, Gerrit, Patch-For-Review: Make sure both `reviewdb-test` (used for gerrit upgrade testing) and `reviewdb` (formerly production) databases get torn down - https://phabricator.wikimedia.org/T255715 (Marostegui) >>! In T255715#6338856, @Dzahn wrote: > If it hasn't already happened you can remove the...
[07:09:14] jynus: kormat I would need to take friday off, sorry for the short notice, any objection or can I go ahead and request it?
[07:09:52] yes
[07:10:19] yes to go ahead or yes you have an objection :p
[07:18:58] DBA, Patch-For-Review: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['dbproxy1015.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202007280718_maro...
[07:29:11] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui)
[07:29:23] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui) p:Triage→Medium
[07:31:07] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui) @Ladsgroup @Reedy even if they are closed wikis, do we need to change db-eqiad.php and db-codfw.php and make sure we point them to s5? s3.dblist would need to be modified, as they do show...
[07:39:33] marostegui: yes :)
[07:39:43] :(
[07:39:54] DBA, Patch-For-Review: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['dbproxy1015.eqiad.wmnet'] ` and were **ALL** successful.
[07:41:13] marostegui: any chance you could take more time off?
[07:41:22] XDDDDDD
[07:41:34] I would be happy to, but I don't want you to drop all databases!
[07:42:31] dang, i've been caught
[07:42:40] fiiiine, take friday then. it's better than nothing. ;)
[07:43:10] I will just work, but I can ignore you as I can pretend I am on holidays
[07:43:21] ahhaha
[07:44:39] I don't see anything on jynus' calendar, so I will go ahead and request it
[07:44:40] thanks .)
[07:47:59] DBA, Patch-For-Review: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (Marostegui)
[07:59:04] Last dump for zarcillo at codfw (db2093.codfw.wmnet) taken on 2020-07-28 05:55:05 is 14 GB, but previous one was 0 GB, a change of 3633670.7%
[08:03:10] jynus: i meant to have a look at that
[08:03:15] how the hell was it ever 14G
[08:03:27] oh. i'm misreading
[08:03:31] it's 14G _now_
[08:03:37] i've broken something, almost certainly.
[08:04:40] /srv/sqldata/zarcillo is 27M on db1115
[08:05:08] db2093 probably has wikitech, from my test
[08:05:18] oh you bastard
[08:05:25] I am not sure though, if I ever deleted it
[08:05:35] the question is why it backed up more than it should?
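The 07:59:04 alert reports a finite percentage against a "0 GB" previous dump, which only makes sense if the previous size was small but non-zero and got rounded in the message. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the assumption that the alert rounds to whole GB with 1 GB = 10**9 bytes is a guess, not taken from the monitoring code. Taken at face value, the figures imply a previous dump on the order of a few hundred kilobytes.

```python
def percent_change(previous_bytes: float, current_bytes: float) -> float:
    """Relative size change of the latest dump versus the previous one, in percent."""
    if previous_bytes <= 0:
        raise ValueError("previous size must be positive to compute a relative change")
    return (current_bytes - previous_bytes) / previous_bytes * 100.0


def implied_previous_size(current_bytes: float, change_percent: float) -> float:
    """Invert percent_change: recover the previous size from the current size and the change."""
    return current_bytes / (1.0 + change_percent / 100.0)


def format_gb(size_bytes: float) -> str:
    """Round to whole GB (assumption: the alert rounds like this, with 1 GB = 10**9 bytes)."""
    return f"{round(size_bytes / 10**9)} GB"


if __name__ == "__main__":
    # Figures from the alert: 14 GB now, a change of 3633670.7%.
    previous = implied_previous_size(14 * 10**9, 3633670.7)
    print(f"implied previous dump size: about {previous / 1000:.0f} kB")
    print(f"which would be displayed as: {format_gb(previous)}")
```

A previous dump that small is consistent with a backup that used to contain only the zarcillo schema and now picks up additional databases.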
[08:05:46] not if there is something else there
[08:06:00] i'm happy for both of these to be questions :)
[08:06:39] `modules/profile/templates/mariadb/backup_config/dbprov2002.cnf.erb` _looks_ fine
[08:07:36] I wonder if someone has altered the grants of the mysql accounts and I wasn't told
[08:08:07] not all accounts, I mean the backup account specifically
[08:08:20] I doubt that
[08:08:54] `GRANT RELOAD, FILE, SUPER, REPLICATION CLIENT ON *.* TO 'dump'@'10.192.16.96'`
[08:08:58] or maybe it was wrongly overwritten when imported/exported
[08:09:01] `GRANT SELECT, INSERT, UPDATE, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `zarcillo`.* TO 'dump'@'10.192.16.96'`
[08:09:57] the grants look identical to the ones on db1115
[08:10:47] INSERT, UPDATE that looks very wrong
[08:10:54] DBA: page_restrictions indexes have been majestically drifting from code - https://phabricator.wikimedia.org/T256682 (Marostegui)
[08:11:29] INSERT UPDATE is what the backup metadata should have, not what the dumping should have
[08:12:23] indeed
[08:13:53] there are also 6 accounts on the host
[08:14:05] there should only be 2, one per dbprov
[08:14:17] i can look at cleaning all that up
[08:14:19] I am going to wipe all accounts and set them up
[08:14:23] let me do it
[08:14:29] ... ok
[08:14:36] does any of this explain the backup size increase?
[08:16:32] there is test_labswiki backed up
[08:16:46] maybe it had very generous grants so everybody connected could read it
[08:17:24] jynus: huh. i would have assumed that the db specified in the backup_config was the only one backed up. is that not how it works?
[08:17:58] db?
[08:18:27] DBA: page_restrictions indexes have been majestically drifting from code - https://phabricator.wikimedia.org/T256682 (Marostegui) s1 eqiad progress [x] labsdb1012 [x] labsdb1011 [x] labsdb1010 [x] labsdb1009 [x] dbstore1003 [x] db1140 [x] db1139 [x] db1135 [x] db1134 [x] db1124 [] db1119 [] db1118 [x] db110...
[08:19:05] jynus: https://github.com/wikimedia/puppet/blob/production/modules/profile/templates/mariadb/backup_config/dbprov1002.cnf.erb#L32
[08:19:13] oh, I know why, this is a security issue
[08:19:21] ohh. that's a 'section' name. not a db name.
[08:19:30] kormat: that is just an identifier, arbitrary
[08:19:31] so.. it'll connect and back up everything it can read i guess
[08:19:46] I will explain it on a private channel
[08:23:50] db selection can be done with "regex:" option, eg. regex: ^zarcillo\.
[08:26:47] ahh, i see
[08:27:25] DBA: page_restrictions indexes have been majestically drifting from code - https://phabricator.wikimedia.org/T256682 (Marostegui)
[08:27:47] but normally not needed, we try to enforce it through grants
[08:28:07] I am going to drop the stats user and the dump user on both hosts
[08:28:15] then recreate the backup user
[08:34:22] for the record, grants needed have to be only: FILE, RELOAD, REPLICATION CLIENT, SUPER ON *.* and GRANT EVENT, LOCK TABLES, SELECT, SHOW VIEW, TRIGGER ON individual dbs
[08:35:01] we will remove SUPER when we are on later mariadb versions that include more fine-grained replication control
[08:35:18] maybe we can use roles?
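The two controls discussed above (restricting what the dump account can read through grants, and optionally narrowing the backup with a `regex:` option) can be sketched in Python. This is a rough illustration, not the actual backup tooling. The allowed privilege sets are the ones listed at 08:34:22, while the function names, the simplified `GRANT` parsing (no column-level privileges), the sample table names, and the assumption that the regex is matched against `database.table` names are all hypothetical.

```python
import re

# Privileges listed in the log as sufficient for the dump account.
ALLOWED_GLOBAL = {"FILE", "RELOAD", "REPLICATION CLIENT", "SUPER"}
ALLOWED_PER_DB = {"EVENT", "LOCK TABLES", "SELECT", "SHOW VIEW", "TRIGGER"}

# Simplified parser: handles "GRANT a, b ON scope TO ..." only (an assumption).
_GRANT_RE = re.compile(
    r"^GRANT\s+(?P<privs>.+?)\s+ON\s+(?P<scope>\S+)\s+TO\s+", re.IGNORECASE
)


def excess_privileges(grant_statement: str) -> set:
    """Return the privileges in one GRANT statement that go beyond the allowed set."""
    match = _GRANT_RE.match(grant_statement.strip())
    if match is None:
        raise ValueError(f"not a recognised GRANT statement: {grant_statement!r}")
    privileges = {p.strip().upper() for p in match.group("privs").split(",")}
    allowed = ALLOWED_GLOBAL if match.group("scope") == "*.*" else ALLOWED_PER_DB
    return privileges - allowed


def select_tables(names, pattern):
    """Keep only the 'database.table' names matching the filter, e.g. ^zarcillo\\."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


if __name__ == "__main__":
    # The two grants pasted at 08:08:54 and 08:09:01.
    grants = [
        "GRANT RELOAD, FILE, SUPER, REPLICATION CLIENT ON *.* TO 'dump'@'10.192.16.96'",
        "GRANT SELECT, INSERT, UPDATE, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER "
        "ON `zarcillo`.* TO 'dump'@'10.192.16.96'",
    ]
    for grant in grants:
        print(sorted(excess_privileges(grant)))

    # Hypothetical table names, for illustration only.
    tables = ["zarcillo.instances", "test_labswiki.page"]
    print(select_tables(tables, r"^zarcillo\."))
```

Run against the grants pasted earlier, the check flags only `INSERT` and `UPDATE` on the per-database grant, matching what was called out at 08:10:47.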
[08:35:23] uff
[08:35:34] that seems strange coming from you :-P
[08:35:38] I know XD
[08:35:53] roles would make a lot of sense, conceptually
[08:36:00] conceptually 100%
[08:36:01] I was actually typing: "but we are not having great experiences with them"
[08:36:02] XDDD
[08:36:16] but we never resolved the "case of disappearing grants"
[08:36:32] or taking a non-deterministic time to apply
[08:36:44] plus the 10.4 issues
[08:37:22] Still waiting for https://jira.mariadb.org/browse/MDEV-22645
[08:37:27] For an answer I mean
[08:37:29] You know I proposed it at T254756#6202036
[08:37:40] T254756: Setup a global admin account that can only read/have limited privileges to databases for safer debugging - https://phabricator.wikimedia.org/T254756
[08:37:53] in the short-to-medium term, we could monitor this _relatively_ easily
[08:38:07] but heard the issues so not feeling confident
[08:38:34] I think we may have to implement roles ourselves, which we will need anyway as there is no support for distributed grants
[08:38:53] roles and gropus
[08:38:58] *groups
[08:39:13] let's fix the ongoing issue first, ok?
[09:33:40] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Ladsgroup) >>! In T259004#6339835, @Marostegui wrote: > @Ladsgroup @Reedy even if they are closed wikis, do we need to change db-eqiad.php and db-codfw.php and make sure we point them to s5? Yup, that...
[09:34:56] Amir1: can we set wgReadOnly to just specific wikis? how is that done?
[09:35:10] it should be doable
[09:35:16] in IS.php
[09:35:20] let dig
[09:35:26] *let me dig
[09:36:49] ./wmf-config/CommonSettings.php:$wgReadOnly = $etcdConfig->get( "$wmfDatacenter/ReadOnly" );
[09:36:59] ugh, it's not that hard but it's still doable
[09:37:25] *that easy I really need my coffee
[09:37:58] I didn't know we could set specific wikis to read only
[09:37:59] that's interesting
[09:38:10] we'd definitely need your help with this yeah
[09:41:28] You see this line? https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings.php#L134
[09:41:41] now look at https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings.php#L147
[09:41:48] we can pull off something similar
[09:42:06] interesting...
[09:42:16] should be easy to test by a steward I guess?
[09:42:23] to see if it is really RO
[09:42:33] unless it has some unintended consequences but I highly doubt that, if global variables bleed to other wikis, we would have way bigger problems
[09:43:08] Sure. That sounds good
[09:43:17] do you want to test with beta first?
[09:43:33] yeah, and maybe even on codfw via mwdebug, will that show a RO message?
[09:45:31] I think so
[09:45:53] so yeah, beta is a good first test, to see if it breaks something obvious
[09:46:00] then maybe codfw
[09:46:08] we won't be able to write, but at least we can see if we get the RO banner
[09:46:51] https://www.mediawiki.org/wiki/Manual:$wgReadOnly
[09:46:59] Warning: In contrast to its name, this settings does not make the database read only! Even if $wgReadOnly is set, extensions, API scripts and other cacheable events can write data nonetheless.
[09:47:18] :-/
[09:48:15] but it reduces a huge amount of writes from what I see. This should be used on top of db locking, it seems
[09:48:36] so those writes that we were seeing on the logging table, where do they come from?
[09:49:01] We cannot set the specific databases to read_only, it is a global variable in mysql
[09:49:32] ugh
[09:49:52] it would reduce those loggings too
[09:51:22] I hope
[09:51:54] so writes only happen to logging and module_deps (I guess because there was a release)
[09:53:58] logging happens for two reasons: Someone visits the wiki so their accounts get created, someone with an account has their account globally renamed (that would write to actor table too)
[09:54:05] and page
[09:54:09] and a couple other
[09:55:03] yeah: https://phabricator.wikimedia.org/P12071
[09:57:28] sites is because of creating new wikis, I rebuild the table globally (the table should be sitting somewhere global instead of me changing it on every wiki but that's another story)
[09:58:23] So there is really no way to prevent all writes from happening on a closed wiki, no?
[09:58:28] Even if stewards are informed
[09:59:04] yeah, you can reduce its chance to practically zero but still things might happen
[10:00:11] let's test on beta and all that and then we can see if we feel confident about proceeding...
[10:00:47] worst case scenario, if there are writes, we could maybe replicate them extracting the binlogs, but that is a bit ugly....worst case scenario, what if those writes got lost?
[10:01:20] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Urbanecm) Yes, putting the wikis to read only is certainly a good idea, plus informing global renamers and stewards to not do global renames during the window (I'm honestly not sure what it does when...
[10:02:24] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui) On IRC I told Amir that on MySQL level, we cannot put specific databases into read_only mode, as it is a global mysql flag, so it is either everything (no-go) or nothing.
[10:25:11] DBA, Operations: db1082 crashed - https://phabricator.wikimedia.org/T258336 (Marostegui) Open→Resolved This host has been fully repooled. The BBU is now gone {T258910} so this same crash shouldn't happen again. This host will be decommissioned in Q2 {T258361}
[10:25:14] DBA, Operations: db1080-95 batch possibly suffering BBU issues - https://phabricator.wikimedia.org/T258386 (Marostegui)
[10:25:17] DBA, Operations: Refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[10:45:01] Amir1: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/615723/ should reduce the chance of getting an account on those wikis to virtually zero (only stews etc), we can also temp-revoke the autocreate permission
[10:52:24] marostegui: Amir1: maybe a stupid question, but do we have a beta test plan? 🙂
[10:52:34] (like, what would we test, and also who :D)
[10:53:05] Urbanecm: I would assume that trying to set those wikis into read-only doesn't break the site at least :)
[10:53:32] As discussed earlier, looks like it will be hard to be 100% sure that no writes will happen, but that would reduce the chance from what I understood
[10:53:47] yup
[10:54:55] thanks
[10:55:53] But yeah, we need to see what happens if for some reason there's just one write
[10:55:55] how to handle that
[10:57:01] depending which type of a write
[10:57:49] account autocreations should be fine to throw away IIRC, so most cache updates (I see querycache got updated in https://phabricator.wikimedia.org/P12071)
[10:58:44] Urbanecm: these are some of the tables that have been written "lately" on one of the closed wikis https://phabricator.wikimedia.org/P12071
[10:59:16] global renames might be an issue, as it's a global action, but that should be possible to turn off globally during the window if we want to be 100% sure
[10:59:51] yeah, that'd be nice
[10:59:57] one less thing to worry about
[11:00:11] yup :)
[11:09:44] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Urbanecm) So, the first step to do would be to test what happens when wgReadOnly is turned on for one wiki in the beta cluster (how does global renaming behave, account autocreations, querycache updat...
[11:11:31] DBA: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Marostegui) >>! In T259004#6340263, @Urbanecm wrote: > So, the first step to do would be to test what happens when wgReadOnly is turned on for one wiki in the beta cluster (how does global renaming be...
[11:14:36] sounds good to me
[11:31:02] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (Marostegui)
[11:32:20] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (Marostegui) Open→Resolved I have added the results for today. They are pretty much the same as we've got for t...
[12:02:18] DBA: page_restrictions indexes have been majestically drifting from code - https://phabricator.wikimedia.org/T256682 (Marostegui)
[12:04:57] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (DannyS712) @Marostegui am I reading this right that none of the tables' compressed sizes changed, and other than 07-07...
[12:06:26] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (Marostegui) Correct, no noticeable impact on disk size
[12:11:53] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (Huji) >>! In T257223#6340333, @Marostegui wrote: > Thanks everyone! I bet @Niharika would have appreciated it if you...
[12:31:32] DBA: page_restrictions indexes have been majestically drifting from code - https://phabricator.wikimedia.org/T256682 (Marostegui) s3 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1124 [] db1123 [] db1112 [x] db1095 [] db1078 [] db1075
[12:49:09] DBA, Operations, User-Kormat: Package wmfmariadbpy as a .deb - https://phabricator.wikimedia.org/T259021 (Kormat)
[12:49:17] DBA, Operations, User-Kormat: Package wmfmariadbpy as a .deb - https://phabricator.wikimedia.org/T259021 (Kormat) p:Triage→Medium
[12:49:28] jynus: any thoughts on ^ ?
[13:09:59] DBA, Patch-For-Review, User-Urbanecm: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Urbanecm)
[13:16:47] I was working on that, I needed to reorganize the repo a bit
[13:19:19] jynus: is that work current?
[13:19:41] the reason i came to this is i'd like to write a couple of libraries for interfacing with zarcillo and tendril, and wmfmariadbpy would be a good place to put them
[13:19:48] https://gerrit.wikimedia.org/r/c/operations/software/wmfmariadbpy/+/571528
[13:21:42] but I wouldn't mind you taking over
[13:22:37] ack. i think what you're doing there isn't a blocker for me in any case
[13:29:13] DBA, Cloud-Services, MW-1.35-notes (1.35.0-wmf.36; 2020-06-09), Platform Team Initiatives (MCR Schema Migration), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Marostegui)
[14:30:15] so sure, put them there, it is the right fit
[14:30:35] what I meant is that if you want to speed up the packaging you may have to help me
[14:30:41] as in
[14:31:11] if you just want the new stuff packaged quickly you may have to work on that
[15:50:58] DBA, Patch-For-Review, User-Urbanecm: Move muswiki and mhwiktionary (closed wikis) from s3 to s5 - https://phabricator.wikimedia.org/T259004 (Urbanecm) I have turned cswiki beta to read only mode via https://www.mediawiki.org/wiki/Manual:$wgReadOnly, and then executed `ls -lt /srv/sqldata/cswiki | he...
[17:26:56] DBA, CheckUser, Growth-Team, Thanks, User-DannyS712: Monitor the growth of CheckUser tables thanks to the addition of Thanks data - https://phabricator.wikimedia.org/T257223 (Niharika) Haha. #thanks all! Great work here. :)
[17:42:09] qq - an-coord1001 is replicating a lot of dbs to db1108's meta instance. Is it ok if I add a new test database on an-coord1001's mariadb or does it break replication?
[17:42:20] (the new test database doesn't need replication at all)
[17:51:42] marostegui: I don't know if you noticed but for the MCR changes on s1, it cut the size of the db to half https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&fullscreen&panelId=12&from=now-30d&to=now&var-server=db1119&var-datasource=eqiad%20prometheus%2Fops&var-cluster=mysql
[17:51:57] but for s4 the change is around 0.3%, that seems really weird
[17:53:59] Amir1: that's the innodb compression I'm running on S1 :)
[17:54:44] Amir1: the revision table shrank too, but not that much. I have the figures somewhere (maybe in the ticket) but I'm on my phone :)
[17:55:11] that's weird, the rev_user_text column is huge
[17:55:38] and rev_comment
[17:56:33] it reduced its size by 25%
[17:56:38] the table size I mean
[17:57:33] ooooh, that's nice