[06:14:34] DBA, Goal, Patch-For-Review: Implement database binary backups into the production infrastructure - https://phabricator.wikimedia.org/T206203 (Marostegui) s1 and s6 eqiad backups have been on "prepare" status for almost 24h now. stracing the processes on dbprov1001 and dbprov1002 looks like they are...
[08:29:10] hello! could we by chance get https://phabricator.wikimedia.org/T221339 deployed soon, maybe even today? :)
[08:29:16] cc marostegui
[08:30:09] dropping the _user and _user_index columns is slated to happen next Monday, 2019-05-27, yet maintainers are still unable to use the new schema due to the lack of the index on rev_actor
[08:32:12] I was hoping to start migrating tools today/tomorrow while I'm at the Hackathon
[08:32:27] cc bstorm_
[08:48:41] musikanimal: that won't happen today :-(. we need to depool the replicas for that, and we are doing maintenance on them due to disk space reaching 90% usage; hopefully it will be done during next week
[08:54:15] eek! okay, well I'll have to kindly request we delay deployment of https://gerrit.wikimedia.org/r/c/operations/puppet/+/510595
[08:54:24] bstorm_:
[08:55:24] this is the biggest schema change to ever happen to the replicas; we need multiple weeks to update our tools. Hundreds of tools, bots, etc. are going to break
[08:55:35] and it hasn't been mentioned in Tech News yet either
[08:59:24] musikanimal: Yeah, but getting the replicas back to a non-worrying disk space usage takes priority. I will do my best to coordinate with bstorm_ next week so she can get the changes she needs out of the way, and then I will continue with the maintenance
[08:59:48] musikanimal: But that definitely won't happen during a weekend (the maintenance is ongoing now, and one of the replicas is depooled with 2 days to catch up on replication)
[09:02:57] marostegui: hmm alright, well if we expect any breaking changes this week, we should put it in Tech News now.
[09:03:14] I've got some people here at the Hackathon who can help with that. What should we say?
[09:08:28] "Toolforge and VPS tools may break later this week due to maintenance, and a related schema change. Maintainers will need to update their tools to use the new schema" -- is that accurate? then we'd link to the email Brooke sent out
[09:09:35] musikanimal: I don't really know, I guess you need to reach out to bd808 or bstorm_
[09:13:51] musikanimal: sadly I don't know anything more than Brooke's email
[09:18:45] oh boy
[13:23:28] DBA, Goal, Patch-For-Review: Implement database binary backups into the production infrastructure - https://phabricator.wikimedia.org/T206203 (Marostegui) I have killed both, they were not doing anything, and the server was fully idle. strace didn't show any activity at all. Killed both the processes...
[14:25:34] musikanimal: I can likely get updates on specific tables and specific view dbs.
[14:25:47] The difficulty is ensuring an update works across everything
[14:26:34] But if there are particular dbs, I might be able to do it regardless of maintenance. In fact, if I only do it to revision_userindex, I might get away with putting that up now.
[14:26:38] On every db
[14:34:10] that would be amazing!
[14:48:30] I'm finding it difficult. revision is a busy table, which means I cannot get a lock on it.
[14:48:59] the enwiki_p.revision_userindex table in particular won't let go of it
[14:49:17] I was able to do updates across quite a lot
[14:49:51] bstorm_: make sure not to forget labsdb1012 on all these operations
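(A note on the locking problem above: in MariaDB, replacing a view needs an exclusive metadata lock, and any long-running query through the view holds a shared one, so the redefinition stalls until those queries finish. The sketch below is a generic diagnostic for this situation, not the exact procedure used on the Wiki Replicas; only the enwiki_p.revision_userindex name comes from the log.)

-- Hedged sketch, assuming a standard MariaDB setup: find the
-- long-running sessions touching the busy revision table that hold
-- shared metadata locks and keep the view redefinition waiting.
SELECT id, user, time, state, LEFT(info, 120) AS query
FROM information_schema.PROCESSLIST
WHERE info LIKE '%revision%'
ORDER BY time DESC;

-- The blocked operation sits in "Waiting for table metadata lock"
-- until those sessions finish. The real view definition is long and
-- managed by tooling, so only its shape is shown here:
-- CREATE OR REPLACE VIEW enwiki_p.revision_userindex AS SELECT ... ;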
[14:50:44] Yup! The important updates for the schema are already there. The hard part is getting the performance updates for the revision_userindex view deployed on labsdb1010 and labsdb1011 :)
[14:51:18] 1009 is fully done?
[14:51:20] Since naturally, there's a lot hitting that... and the index wasn't working there lol
[14:51:58] 1009 was done with the most recent updates for the comment table stuff. I haven't done the performance update that musikanimal is talking about there yet. I can try it
[14:52:13] No, I was just curious :)
[14:52:20] It is fully depooled but still running alter tables
[14:52:21] I'm trying to unblock several parties...
[14:52:32] With competing interests, in a way
[14:53:04] I will kill the alters tomorrow evening, so hopefully on Monday/Tuesday we can repool it, depool 1010, do all the stuff, repool 1010, depool 1011, do all the stuff, repool 1011, and then I can depool 1009 again to finish the maintenance there
[14:55:18] The change regarding the actor table is basically the same thing I did in March for the comment table, but for user records. That's not even going to be merged until the 27th. This is tools/VPS users trying to get ahead of that, but a coalesce and such is blocking index use when doing user revision queries. Since that is related to attempting to refactor to be ready for the 27th.... that's the hurry
[14:56:46] The Monday depool is going to unblock brad, because there was a mistake in the comment table reshuffle that caused it to not update some tables, so I have to get that fixed. It will also include this revision_userindex update if I cannot get it out before then.
[14:58:44] marostegui: now 1009 is fully up to date :)
[14:58:58] Unfortunately, musikanimal's traffic most likely doesn't use it
[14:59:54] bstorm_: cool, hopefully it can be done Monday or Tuesday (depending on how long it takes to catch up with replication)
[15:06:28] 👍🏻
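(On the coalesce problem mentioned at 14:55:18: wrapping an indexed column in COALESCE inside a view makes filters on that column non-sargable, so user revision queries fall back to full table scans. The exact Wiki Replica view definitions are not in the log; the following is a minimal, simplified reconstruction of the pattern, with stand-in table and column names.)

-- Minimal illustration, assuming a simplified revision table; the
-- names below are hypothetical, not the production schema.
CREATE TABLE revision_base (
    rev_id        INT PRIMARY KEY,
    rev_user      INT,
    rev_user_text VARBINARY(255),
    KEY idx_rev_user_text (rev_user_text)
);

-- A view that wraps the indexed column in COALESCE. A filter on the
-- view column now filters on a function result, so the optimizer
-- cannot use idx_rev_user_text.
CREATE VIEW revision_userindex AS
SELECT rev_id,
       rev_user,
       COALESCE(rev_user_text, '') AS rev_user_text
FROM revision_base;

-- EXPLAIN shows type: ALL (full scan) through the view; the same
-- predicate on the bare column would show type: ref via the index.
EXPLAIN SELECT * FROM revision_userindex WHERE rev_user_text = 'Example';

Presumably, the performance update being deployed exposes the column without the function wrapper (or only coalesces where the value can actually be NULL), so the index becomes usable again for these queries.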