[01:25:55] DBA, Wikimedia-General-or-Unknown, Performance: Clean up skin properties - https://phabricator.wikimedia.org/T171643#3939860 (demon) I'm not planning to run any queries. I was just digging at some data.
[07:06:49] DBA, Wikimedia-Site-requests: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940284 (Marostegui) you around?
[09:17:10] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940425 (jcrespo) @Marostegui- this is bad, codfw machine was created as tendril2001 T186123, and this was called db1115. This is not a terrible name for...
[09:18:48] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940433 (Marostegui) >>! In T185788#3940425, @jcrespo wrote: > @Marostegui- this is bad, codfw machine was created as tendril2001 T186123, and this was ca...
[09:24:15] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup tendril1001 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940445 (Marostegui) >>! In T185788#3940441, @jcrespo wrote: >> Suggestions? > > The easy thing would be call it tendril1001 (which I do not 100% li...
[09:26:51] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup tendril1001 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940447 (jcrespo) > To be honest, I would call it as a normal database name, to avoid making any kind of exception and having dedicated hostnames Wh...
[09:27:46] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup tendril1001 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940448 (Marostegui) >>! In T185788#3940447, @jcrespo wrote: >> To be honest, I would call it as a normal database name, to avoid making any kind of...
[09:37:10] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940476 (Marostegui)
[09:41:11] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940489 (jcrespo) Note for that one means involving papaul and renaming stuff, from the physical label to racktables, to dns, etc.
[09:41:36] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940491 (Marostegui) >>! In T185788#3940489, @jcrespo wrote: > Note for that one means involving papaul and renaming stuff, from the physical label to rac...
[09:43:09] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940492 (jcrespo) All db hosts will have a hw RAID except these, it will be confusing.
[09:44:40] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940493 (Marostegui) >>! In T185788#3940492, @jcrespo wrote: > All db hosts will have a hw RAID except these, it will be confusing. ok - I am going to st...
[10:00:14] DBA, Operations, ops-eqiad, Patch-For-Review: Rack and setup db1115 (tendril replacement database) - https://phabricator.wikimedia.org/T185788#3940555 (Marostegui) Open>stalled
[10:37:18] I'll install the updated jessie kernels on the remaining db hosts, I know that you usually simply reimage to stretch, but that way they no longer show up in the list of updatable packages and it might still catch up a few cases where hosts are restarted for hw maintenance. ok?
[10:37:42] sounds good!
[10:37:43] thanks
[10:44:26] DBA, Wikimedia-Site-requests: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940595 (MarcoAurelio) @Marostegui I'm here. No IRC access atm though.
[10:44:47] ok, will upgrade in half an hour
[10:44:51] DBA, Wikimedia-Site-requests: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940596 (Marostegui) if you want to go for it, I am fine :)
[10:45:40] DBA, Wikimedia-Site-requests: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940597 (MarcoAurelio) @Marostegui Okay, starting in un momentito.
[10:47:09] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940598 (Marostegui) a:MarcoAurelio haha! please paste the progress URL whenever you get it! Thanks!
[10:48:48] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940613 (MarcoAurelio) https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/St%C3%AFnger
[10:49:11] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940618 (Marostegui) Thank you!
[11:11:06] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940645 (MarcoAurelio) eswiki, commonswiki and metawiki are the wikis with most of the edits as stated at the top of the task; you may want to do some of w...
[11:12:10] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940648 (Marostegui) >>! In T185795#3940645, @MarcoAurelio wrote: > eswiki, commonswiki and metawiki are the wikis with most of the edits as stated at the...
[11:27:12] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940706 (MarcoAurelio) p:Triage>Normal
[12:39:13] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940819 (MarcoAurelio) Heading off for lunch. Will check back later in case there's something that you need me to do.
[12:54:14] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3940830 (Marostegui) No worries, it is well throttled; once the big ones have already been done, there is not much to keep an eye on.
[12:56:28] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T186319#3940835 (Marostegui) This is s5 master
[12:56:56] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T186319#3940837 (Marostegui) p:Triage>High
[12:57:45] would you be ok with pre-selecting and preparing master failovers?
[12:58:14] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T186319#3940831 (Marostegui) a:Cmjohnson @Cmjohnson this host is out of warranty. Can we replace its disk as soon as possible - if possible before the weekend comes?
[12:58:18] jynus: what?
[12:59:25] pre-selecting master failover candidates
[12:59:38] sure
[12:59:59] It requires depooling them and possibly restarting them
[13:00:22] we can start with s5 ;)
[13:00:29] and we should agree on which one
[13:00:40] based on many things- size, physical position, etc.
[13:00:57] some of them probably need to be physically moved to a different row
[13:01:06] that would be part of this
[13:01:21] for s5, I would say db1100
[13:01:34] that is a larger one, isn't it?
[13:01:37] yes
[13:01:43] but the non large one is db1051
[13:01:45] which has BBU issues
[13:01:45] plus there are some now that require row
[13:01:55] we can move them, too
[13:02:00] there was a plan on gerrit
[13:02:06] that we can of course change
[13:02:18] that is the whole point of being on gerrit, review it and amend it
[13:02:31] I suggested db1100 because: different row, old master data
[13:02:33] the other thing
[13:02:51] is the only non powerful server is db1051 which I don't think we should pick as it has bbu issues, the rest are all powerful, so it doesn't really matter
[13:03:54] should I create a task?
[13:04:02] yeah, I would say so
[13:04:03] related to decom?
[13:04:17] it doesn't matter I think
[13:04:26] but feel free, I don't really mind
[13:04:34] just a track so we can track what we are doing
[13:04:38] *task
[13:04:58] so maybe not one for the 2, but related to the decom tickets
[13:05:10] because if we say, for example, to decom db1052
[13:05:18] we need 2 candidates, not just 1
[13:05:19] I really don't mind, whatever you feel comfortable with
[13:07:11] I would say we leave db1110 (or whichever we pick) restarted and running statement for the weekend, as s5 had the disk broken and not sure chris will replace it (I will ping him later)
[13:07:20] the rest of the candidates we can work them out during next week
[13:12:30] I've created https://phabricator.wikimedia.org/T186320
[13:12:34] to do on Q4
[13:13:21] nice
[13:15:41] DBA: Decommission db1051-db1060 (DBA tracking) - https://phabricator.wikimedia.org/T186320#3940873 (jcrespo)
[13:40:25] DBA: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3940891 (jcrespo)
[13:41:11] work on T186321 could start now
[13:41:11] T186321: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321
[13:42:44] DBA: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3940925 (Marostegui)
[14:21:00] DBA, Operations, ops-codfw, Patch-For-Review: rack/setup/install tendril2001 - https://phabricator.wikimedia.org/T186123#3934359 (Dzahn) After talking with jcrespo on IRC: We should use a different name for this system. So far tendril is only a service name, not a host name. And tendril might be...
[15:11:40] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3941025 (Ks-M9) Now it is just done. So, thanks to all!
[15:41:52] DBA, Operations, ops-codfw, Patch-For-Review: rack/setup/install tendril2001 - https://phabricator.wikimedia.org/T186123#3941063 (Marostegui) As I have stated on T185788 my personal preference is to keep using db* on both, eqiad and codfw. I was also fine with tendrilXXXX. I honestly think we a...
[15:45:43] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941073 (Marostegui)
[15:47:45] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3940891 (Marostegui) p:Triage>Normal
[15:51:33] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of Ks-M9 → Stïnger: supervision needed - https://phabricator.wikimedia.org/T185795#3941088 (Cyberpower678) Open>Resolved Looks like the rename was successful.
[15:51:48] DBA, Operations, ops-codfw, Patch-For-Review: rack/setup/install tendril2001 - https://phabricator.wikimedia.org/T186123#3941091 (Papaul) Thanks to all for the name discussion, but so far no decision has been made yet if we are keeping the same name or changing the name. Please confirm if we ar...
[15:53:04] DBA, Wikimedia-Site-requests: Global rename of Dick Laurent → Qehath: supervision needed - https://phabricator.wikimedia.org/T185719#3941108 (Cyberpower678) @Marostegui If you're up for it, this is the only other rename in need of supervision. I can take this one on now.
[15:53:56] DBA, Wikimedia-Site-requests: Global rename of Dick Laurent → Qehath: supervision needed - https://phabricator.wikimedia.org/T185719#3941109 (Marostegui) >>! In T185719#3941108, @Cyberpower678 wrote: > @Marostegui If you're up for it, this is the only other rename in need of supervision. I can take this...
[15:54:29] DBA, Wikimedia-Site-requests: Global rename of Dick Laurent → Qehath: supervision needed - https://phabricator.wikimedia.org/T185719#3941110 (Cyberpower678) Okay.
[15:54:45] DBA, Operations, ops-codfw, Patch-For-Review: rack/setup/install tendril2001 - https://phabricator.wikimedia.org/T186123#3941112 (jcrespo) The thing is people like @faidon expressed that our current schema names were confusing for him, and I can see a reason why. We can run, even with difficulty,...
[16:17:32] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T186319#3941192 (Marostegui) Thanks @Cmjohnson for replacing this disk so fast! ``` root@db1070:~# megacli -PDRbld -ShowProg -PhysDrv [32:0] -a0 Rebuild Progress on Device at Enclosure 32, Slot 0 Completed 17% in...
[16:30:27] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941194 (Marostegui) For s8: ``` 'db1054' => 0, # A3 2.8TB 96GB, master 'db1053' => 0, # A2 2.8TB 96GB, vslow, dump...
[16:31:54] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941197 (Marostegui) >>! In T186321#3941194, @Marostegui wrote: > For s8: > ``` > 'db1054' => 0, # A3 2.8TB 96GB, master >...
[16:35:35] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941201 (Marostegui)
[16:42:33] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941232 (jcrespo) This works ok for now, but we may need a proper longer strategy for the others - db1061-db1073 will be the only non-500GB ho...
[16:46:10] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941237 (Marostegui) >>! In T186321#3941232, @jcrespo wrote: > This works ok for now, but we may need a proper longer strategy for the others...
[16:51:14] DBA, Patch-For-Review: Prepare and indicate proper master db failover candidates for all database sections (s1-s8, x1) - https://phabricator.wikimedia.org/T186321#3941258 (Marostegui)
[17:09:23] DBA, Operations, ops-eqiad: Degraded RAID on db1070 - https://phabricator.wikimedia.org/T186319#3941310 (Marostegui) Open>Resolved ``` root@db1070:~# megacli -LDInfo -L0 -a0 Adapter 0 -- Virtual Drive Information: Virtual Drive: 0 (Target Id: 0) Name : RAID Level : P...
[17:33:08] DBA, Operations, ops-eqiad: db1051 database host BBU issues - https://phabricator.wikimedia.org/T186049#3941390 (Cmjohnson) @marostegui Let's do this Tuesday (my morning) 1500UTC
[17:33:45] DBA, Operations, ops-eqiad: db1051 database host BBU issues - https://phabricator.wikimedia.org/T186049#3941392 (Cmjohnson) Tuesday 6 Feb
[17:59:20] DBA, Operations, ops-eqiad: db1051 database host BBU issues - https://phabricator.wikimedia.org/T186049#3941461 (Marostegui) Great! Will have the server ready by then. Thanks!