[06:41:26] I am going to depool ms2 and migrate it to mariadb 10.11
[07:24:36] I'm still in the market for reviews for https://gerrit.wikimedia.org/r/c/operations/puppet/+/1134221 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/1134208 for bringing some more frontends into service, please :)
[08:04:59] TY :)
[09:34:05] Since it was the first time I've added a node to apus (rather than bootstrapping an entire cluster), I wrote up the process https://wikitech.wikimedia.org/wiki/Ceph/Cephadm#Adding_a_host
[09:34:08] It's pretty painless
[11:03:52] Amir1 federico3 https://phabricator.wikimedia.org/T391056 it is a bit confusing that s6 is marked as done but has a comment, I'd suggest just to untick the box and leave "Done except dc masters" so once it is fully done, we just mark the box
[11:04:04] And also s8 is running, but we should untick the box too
[11:05:06] ok
[11:05:49] federico3: For the codfw master, feel free to create a master switch task as that can be done anytime really
[11:05:55] (talking about s6)
[11:06:01] (maybe I can put 3 tickboxes instead?)
[11:06:31] ?
[11:06:42] federico3: Sometimes I do:
[11:06:45] [] s6
[11:06:46] do we want to create a task for the masters for each section?
[11:06:49] ** [] eqiad
[11:06:53] ** [] codfw
[11:07:06] federico3: I meant a task using: https://switchmaster.toolforge.org/schedule
[11:07:14] As you need to switch the masters
[11:07:29] I don't think we need to switch the masters
[11:07:40] at least for most sections
[11:07:54] Ah that's better then
[11:09:51] marostegui: maybe something like this? (updated the task summary) https://phabricator.wikimedia.org/T391056
[11:10:25] federico3: Yeah, that works for me
[11:22:05] Amir1: Switchmaster only looks for a candidate using the "candidate" flag right, not the binlog?
[11:22:16] Amir1: I am asking cause I am going to start with https://phabricator.wikimedia.org/T383795
[11:22:23] For at least one section (s5 probably)
[11:25:18] Amir1: Database maintenance map doesn't seem to be working: https://wikitech.wikimedia.org/wiki/Map_of_database_maintenance
[11:25:36] It should have federico3's schema change and a !log I did today when I upgraded ms3
[11:30:22] marostegui: regarding switchmaster, it uses the comment in puppet. We probably should use something better
[11:30:33] regarding the map, yeah I think I need to add him to the allowed list
[11:30:36] give me a bit
[11:31:12] Amir1: But mine wasn't logged either
[11:31:29] then something bigger is broken :(
[11:31:50] Amir1: Also I see some of his from the last 7 days
[11:31:58] Why everything has tendency to just break out of nowhere (I know it's called 2nd law of thermodynamics but still)
[13:04:41] this has been opened 8 years ago, perhaps some of the unchecked items in the description have been implemented in the meantime? https://phabricator.wikimedia.org/T143896
[13:06:40] Created: https://phabricator.wikimedia.org/T391346
[13:07:07] federico3: More or less only the third point
[13:10:56] I can open subtasks for clarity and add a bit of details
[13:11:55] federico3: That is fine, but I think we have other more pressing things in that regard
[13:50:17] btullis, marostegui, my recollection is that y'all thought we should expand the clouddb cluster. Is that right? And do you happen to know how many new servers I should ask for?
[13:50:22] Or maybe you've already talked to willy about that
[13:53:25] andrewbogott: btullis' situation is different, they use one host for everything. Wikireplicas don't do that. The expansion on clouddb* depends pretty much on your estimation of users' usage really and if the service wants to be continued. It also depends on whether this wants to be scaled vertically or horizontally (which I assume it is horizontally, but needs to be decided).
[13:55:00] ok! So no big grand vision or redesign in mind, just organic growth.
[13:55:09] that's easy, I'll just order another pair of what we've got.
[13:55:57] andrewbogott: We have 8 hosts, and they are split by section, it is not that easy
[13:56:37] Yeah, we have the an-redacteddb1001 which is one big host for all sections. But our usage of it is fairly minimal and I am hoping that we can deprecate the whole service before we need to scale it horizontally.
[13:56:56] btullis: ok, so that sounds like "don't order any new hardware for me" correct?
[13:57:12] Correct, thanks.
[13:57:24] andrewbogott: But are you talking about their hosts or your (WMCS) hosts? Because there are 8 that you own
[13:57:38] marostegui: both! But now just wmcs hosts.
[13:57:55] I am hoping that dhinus already has a plan for how to shuffle around the sections to spread out to 10. But I'll let him chime in.
[13:57:55] marostegui: with a couple more hosts, I guess we could have a dedicated host for the sections with the highest traffic
[13:58:07] andrewbogott: Then an-redacteddb1001 isn't under your budget?
[13:58:38] marostegui: not as far as I know? But also, it already exists so won't show up on this year's spreadsheet anyway
[13:58:54] andrewbogott: I am very confused then with: [15:56:56] btullis: ok, so that sounds like "don't order any new hardware for me" correct?
[13:59:09] I think this needs a bit more thinking and it is not that easy
[13:59:29] The moving section parts I mean between hosts
[13:59:41] marostegui: the fact that ben's replica use has moved off of clouddb* hardware is news to me, I'm adjusting before your eyes
[13:59:50] And how to organize everything, but if you all are comfortable with ordering just two hosts, that's ok
[14:00:07] Now I have a meeting but will plan on returning to you and dhinus having a plan!
[14:00:12] andrewbogott: Yeah, it used to be clouddb1021 but it was still their host for years
[14:00:29] I will be gone as it is the end of my day, but we can continue tomorrow
[14:00:35] As far as I am aware an-redacteddb1001 is the only part of wikireplicas that is under the DPE budget and doesn't need refreshing, nor expanding.
[14:01:57] I've no strong feelings on however you (both teams) wish to expand/refresh the clouddb10* hosts, or any other hosts involved in wikireplicas.