[04:22:39] Going to start with s4 switchover
[08:08:24] wow x1 is already 1.5TB?
[08:13:05] should we rename it to xl1? :-P
[08:13:53] hahaha
[08:58:05] godog: looking at site.pp, I note that new thanos-be nodes are not automatically added to thanos::backend, unlike new ms-be nodes that are automatically added to swift::storage. Is that intentional / is there some reason why thanos::backend is less safe than swift::storage?
[08:58:41] (I NTK because I have to do something to site.pp before the new thanos backends can be racked)
[09:00:09] Emperor: no reason, AFAIK swift::storage and thanos::backend can be treated the same now
[09:00:33] godog: cool, thanks, I'll put in a CR to make that adjustment
[09:01:07] SGTM, happy to take a look
[09:09:42] godog: thanks! https://gerrit.wikimedia.org/r/c/operations/puppet/+/1058092
[09:13:19] marostegui: there's a puppet change from you unmerged (db1179: No longer candidate master), are you OK for me to merge that, or should I pause?
[09:13:31] Emperor: oh yes, sorry, you can go ahead
[09:13:34] thank you
[09:13:38] NP, doing so :)
[09:14:53] {{done}}
[14:24:55] hello folks!
[14:25:21] due to an issue with myself and Python, I ended up breaking the provision cookbook
[14:25:36] some db nodes probably have a wrong BIOS config, namely https://phabricator.wikimedia.org/T371132 (see description)
[14:26:05] elukey: hola guapo
[14:26:08] from a chat with Riccardo they all seem not to be in production; I just wanted to ask for a quick check, and also whether you could avoid assigning puppet roles for the moment
[14:26:14] marostegui: hola hermoso
[14:26:18] elukey: The only one in production is db1179
[14:26:33] and it is a slave, so we can easily depool it
[14:26:45] okok I'll take a note and keep it for last
[14:26:47] sorry :(
[14:26:53] no worries, it happens!
[14:27:04] elukey: let me know if you need me
[14:27:10] sure!
[14:33:21] I think db1179 was the x1 replica that had issues a while ago
[14:33:30] yes the one that crashed
[14:48:58] marostegui: let me know if you get some time to discuss T370852
[14:48:59] T370852: Migrate codfw row C & D database hosts to new Leaf switches - https://phabricator.wikimedia.org/T370852
[14:50:18] topranks: it's probably better if you coordinate with Arnaud (he's back 12th) as all that maintenance will happen when I'm on sabbatical
[14:51:29] marostegui: ok
[14:52:04] we're hoping to get it started asap though, if anyone could advise before the 12th it'd be great
[14:52:18] topranks: before the 12th August?
[14:52:27] topranks: I'm out next week too :)
[14:52:34] So maybe that will need to wait
[14:53:59] it's a busy time for holidays so obviously we gotta work with that
[14:54:24] but basically there are no blockers to migrating hosts and dc-ops are anxious to get it done
[14:56:10] topranks: I understand, but I just got back from a week out, and I'll be gone next week too. After that I'll be on sabbatical and this will extend while I'm out, so I think it makes sense for Arnaud to decide how he wants to tackle this, as he'll be the primary point of contact
[14:56:58] yep I know - we were waiting to ambush you on your return and before you went off again :P
[14:57:19] but no probs, we can wait for arnaudb
[14:57:34] topranks: I can suggest things but I don't want to decide what works, as I won't be here when it happens
[14:58:49] no probs
[14:59:25] topranks: if I were to do this, I'd probably get the list of easy hosts and get all those done at once, as we can easily depool all of them (keeping in mind how many belong to each section), and then see what's left in terms of masters
[15:00:14] yeah that's basically the plan, but my own knowledge doesn't go much deeper than that
[15:00:16] topranks: If this is that urgent, we could even depool codfw and do all of them at once with no user impact
[15:01:21] nah it's not that urgent - I guess the main factor on our side is wanting to complete the full migration before the dc-switchover in mid-sept
[15:01:31] Yeah that makes total sense
[15:02:09] Keep in mind that master switchovers within the non-writable DC are way easier, so it should not take long for the "non easy" hosts to be done
[15:02:48] I think it can totally be done before the DC switchover, but luckily I won't be here to see it \o/
[15:03:44] I'll also ping kwakuofori ^ so he's aware of all this upcoming operational work
[15:04:40] to be clear, when I say "full migration" I mean all 400 hosts in those rows
[15:04:56] and we're trying to get the db ones done before we begin the rack-by-rack migrations so it's not as much pain as last time
[15:05:46] if we can get the DB hosts out of the way in August we should be ok I think
[15:05:57] and if not... it's ok too, no disaster
[15:07:25] topranks: I think we really need to wait for Arnaud, I don't want to commit to anything as I won't be part of it :)
[15:08:04] ok thanks we will discuss with him on his return
[15:08:10] Sounds good
[15:08:16] Sorry
[19:58:28] is a query that takes 30 seconds on m3-master something you would call expensive or nothing at all?
[19:59:04] responded privately, given that it's once a day, it should be fine!
[20:01:24] ACK, thank you
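For context on the site.pp change discussed above (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1058092): the pattern being compared is a regex-based node block in site.pp that applies a role to every host whose name matches, so newly racked backends pick up the role with no per-host entry. The snippet below is only an illustrative sketch of that pattern; the hostname regexes and datacenter suffixes are assumptions for the example, not copied from the actual operations/puppet manifest.

```puppet
# Illustrative sketch only -- hostname patterns and DC suffixes are assumed,
# not taken verbatim from the real site.pp.

# ms-be hosts already match a regex node block, so new Swift backends
# receive swift::storage automatically when they are racked:
node /^ms-be[12]\d{3}\.(eqiad|codfw)\.wmnet$/ {
    role(swift::storage)
}

# The proposed change gives thanos-be hosts the same treatment, so new
# Thanos backends pick up thanos::backend without editing site.pp per host:
node /^thanos-be[12]\d{3}\.(eqiad|codfw)\.wmnet$/ {
    role(thanos::backend)
}
```

With a block like this in place, racking a new host whose name matches the regex is enough for it to get the role on its first puppet run, which is what makes thanos::backend and swift::storage equivalent to handle.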