[05:35:03] DBA, Community-Tech, MediaWiki-extensions-CentralAuth: Add indices for local_user_id and global_user_id in production - https://phabricator.wikimedia.org/T148243#2727631 (kaldari)
[08:36:05] DBA, Patch-For-Review: Reimage dbstore2001 as jessie - https://phabricator.wikimedia.org/T146261#2727832 (Marostegui) The table `change_tag` is definitely corrupted (that is the one that crashed the first time, and during the different two attempts of copying it). It is impossible to use it, not even to s...
[08:42:58] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2727848 (Marostegui) @chasemp are we talking about this user from labsdb1008 that you want to get replicated to the other labs hosts? If so, I will get...
[08:57:50] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2727898 (jcrespo) A couple of things- I created a new password for something similar to this, which is currently living on palladium secrets git store. I...
[09:04:05] DBA, Patch-For-Review: Reimage dbstore2001 as jessie - https://phabricator.wikimedia.org/T146261#2727908 (jcrespo) Maybe stopping the slave, exporting it (probably it is very small), dropping the table and recreating it logically on the source? Check if it has a primary key- wasn't this one that caused...
[09:04:30] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2727921 (Marostegui) @jcrespo thanks - I see it now. You suggest using the same password for this new user? or even using the same user? @chasemp do yo...
[09:08:05] Blocked-on-schema-change, DBA: Apply change_tag and tag_summary primary key schema change to Wikimedia wikis - https://phabricator.wikimedia.org/T147166#2727930 (jcrespo) p:Normal>High change_tag is causing issues T146261#2727832, I believe this schema change could mitigate the issue, directly or...
[09:09:35] DBA, Patch-For-Review: Reimage dbstore2001 as jessie - https://phabricator.wikimedia.org/T146261#2727934 (Marostegui) >>! In T146261#2727908, @jcrespo wrote: > Maybe stopping the slave, exporting it (probably it is very small), dropping the table and recreating it logically on the source? > That is a...
[09:11:24] I remember you having issues with an innodb table on eqiad, can you remember the ticket?
[09:11:34] so similar to this
[09:12:36] my bet is that it is a fleet-wide issue
[09:12:52] Blocked-on-schema-change, DBA: Apply change_tag and tag_summary primary key schema change to Wikimedia wikis - https://phabricator.wikimedia.org/T147166#2727937 (Marostegui) I can start with this on S1 in codfw if you agree @jcrespo
[09:13:43] ^yes, I believe it will help or break things completely- please go very slow
[09:13:51] Sure
[09:13:58] I will only do one host on codfw maybe
[09:14:12] maybe even the one you were just using
[09:14:16] By the way, any objection to start altering commonswiki.revision on db1069 now?
[09:14:26] no obj
[09:14:32] I will proceed then :)
[09:14:35] (with db1069)
[09:14:35] monitor the lag on labs
[09:14:45] I believe toku change will not be online
[09:14:55] because it is a PK
[09:14:59] Do we want to replicate the change?
[09:15:09] I would say no
[09:15:20] We think alike then :)
[09:15:22] too much traffic, but it is up to you
[09:15:27] No, no I didn't want to
[09:15:31] Just checking with you :)
[09:16:32] maybe I can start the partitioning now because it may take a few days?
[09:16:51] I do not need you to shepherd it, just to be aware of it
[09:17:06] Sure, let's go ahead
[09:17:19] Will you report in SAL the host you are altering?
[09:17:22] I will stop the sql thread
[09:17:24] of course
[09:17:38] and put a 5-day downtime
[09:17:55] sure thing
[09:17:55] so you are aware why it is stopped
[09:18:03] this is all for db1053
[09:18:09] * marostegui still thinks about 5 days ALTER…XD
[09:18:09] I will apply it on a screen on localhost
[09:18:18] Oh yeah, it is only db1053 true :)
[09:18:41] and that will fix our lack of servers for s4
[09:19:39] backups are still running on dbstore
[09:20:16] I saw you repooled db1064, thank you!
[09:20:23] you are the best partner
[09:23:35] mmmm I appreciate the compliment, but are you sure you have the latest copy from the repo? In my db1064 still looks depooled :)
[09:23:41] I can repool it if you like :)
[09:24:19] I am looking at https://noc.wikimedia.org/conf/highlight.php?file=db-eqiad.php
[09:24:43] oh
[09:24:47] s*
[09:25:05] they had issues with mira/tin yesterday
[09:25:17] I bet they reverted to a safe config
[09:25:31] wow, that could have caused a large problem for us
[09:25:33] mmm, I just did a git fetch - pull
[09:25:39] And I still have it depooled
[09:25:40] no, it is not you
[09:25:42] yes
[09:25:48] but servers have it pooled
[09:25:53] oh I know what you mean
[09:25:53] pffff
[09:25:58] git =/= deployed code
[09:26:05] Yep, get it now
[09:26:09] this is quite urgent
[09:26:13] it could have been a disaster
[09:26:16] tell around
[09:26:19] sure
[09:26:38] it is probably safe, but by pure randomness
[09:26:52] I will not start the partitioning to avoid more variables
[09:27:00] yep
[09:27:02] good idea
[11:07:50] DBA, Operations, ops-eqiad: Degraded RAID on db1046.eqiad.wmnet. - https://phabricator.wikimedia.org/T148627#2728197 (elukey)
[11:07:59] DBA, Operations, ops-eqiad: Degraded RAID on db1046.eqiad.wmnet. - https://phabricator.wikimedia.org/T148627#2728188 (elukey) p:Triage>High
[11:08:08] hello people
[11:08:12] elukey: please don't, still working on ti
[11:08:16] *it
[11:08:28] don't?
[11:08:36] don't manage that ticket
[11:08:42] I wanted to ask if it was already work in progress
[11:08:52] was created automagically
[11:08:57] yes I got it
[11:09:11] but I'm still working to refine a couple of things I think I will close it and let the system create a new one in a few
[11:09:31] with the name instead of IP and also there was another thing that didn't work as expected :(
[11:09:36] ok good, just wanted to know if you guys were on it
[11:09:51] (don't blame the clinic duty :P)
[11:10:03] it's more on DC actually ;) but yeah m*rostegui know about it
[11:30:15] I do :)
[11:32:56] I found a small issue, I'll try to fix it and re-force the check, sorry in advance about a couple of duplicated tasks
[11:35:25] :)
[11:46:02] DBA, Patch-For-Review: Reimage dbstore2001 as jessie - https://phabricator.wikimedia.org/T146261#2728276 (Marostegui) Different server, same table crashed. Interesting. I am going to run the alter table described at: T147166 in `db2055.codfw.wmnet` and see what happens - I can also try to export the tab...
[12:13:57] Blocked-on-schema-change, DBA: Apply change_tag and tag_summary primary key schema change to Wikimedia wikis - https://phabricator.wikimedia.org/T147166#2728317 (Marostegui) I have changed those two tables on db2055.codfw.wmnet ``` MariaDB PRODUCTION s1 localhost enwiki > show create table tag_summary\G...
[12:47:43] DBA, Operations, ops-eqiad: Degraded RAID on db1046.eqiad.wmnet. - https://phabricator.wikimedia.org/T148627#2728403 (Volans)
[12:56:26] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728476 (chasemp) >>! In T148560#2727848, @Marostegui wrote: > @chasemp are we talking about this user from labsdb1008 that you want to get replicated t...
[13:03:40] DBA, Operations, ops-eqiad: Degraded RAID on db1046 - https://phabricator.wikimedia.org/T148633#2728534 (elukey)
[13:08:07] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728548 (Marostegui) >>! In T148560#2728476, @chasemp wrote: >>>! In T148560#2727848, @Marostegui wrote: >> @chasemp are we talking about this user from...
[13:16:07] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728584 (Krenair) I think it needs to be able to at least read non-_p databases to be allowed to create views selecting them?
[13:19:41] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728602 (chasemp) At the moment the user has full permissions iirc. @Marostegui for my part I meant more like "Let's not use this user for anything els...
[13:21:59] marostegui: morning! (afternoon even) thanks for poking at maintain-views user
[13:22:17] do you want to tweak in place on labsdb1008 and I'll try out a better permissions setup?
[13:36:29] chasemp: Morning! Yeah, maybe it is easier to create a different user maintain-replicas2 maybe to tweak it
[13:36:38] So we do not break the existing one
[13:36:41] sure
[13:36:54] maintain-views2?
[13:37:02] Ok, I am going to give it grants to all _p tables and then SELECT and SHOW for the rest of databases
[13:37:19] Yeah, let's try that one
[13:37:45] kk
[13:37:49] marostegui: same pass?
[13:38:01] Yep
[13:38:06] Same as maintain-replicas?
[13:38:28] maintain-views :)
[13:38:29] sure
[13:38:56] we have disavowed all knowledge of a script ever existing called maintain-replicas and also that perl is a thing
[13:39:29] Where is maintain-views user now? Because it is not in labsdb1003 even
[13:39:42] (so I can get the pass)
[13:40:21] only on labsdb1008 and there it's defined in the password module on the puppetmaster
[13:40:40] Ah right
[13:40:41] I see it :)
[13:40:43] Thanks
[13:41:14] I'll clean up this puppet config and move to labs/db/views.pp or something but I think everything is kosher afa done right otherwise
[13:41:21] and apply it to the existing labsdb's
[13:45:10] chasemp: I have given maintain-views2 ALL privileges on %\_p databases and SELECT on *.*
[13:45:20] marostegui: on labsdb1008?
[13:45:22] yep
[13:45:27] kk I'll do some poking
[13:45:31] ok
[13:45:40] I am sure we will need to tune it a bit more
[13:45:54] I'm out half-day today and through wed of next week so may be a bit before I circle back to you on this
[13:46:10] yeah, I can't think of anything else needed atm but I'm sure it will be something
[13:46:26] Ah sure
[13:46:28] No worries
[13:46:31] I will update the ticket
[13:46:32] with this
[13:46:37] For tracking
[13:50:06] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728703 (Marostegui) As @chasemp and myself agreed on IRC, we have decided to do some tweaking with a new user (to avoid touching maintain-views one) ....
[13:54:28] marostegui: by any chance do you have a DB in codfw that is due to be decommissioned? I just need to force an icinga check and I don't want to bother production
[13:56:57] volans: mmm not that I am aware really
[13:57:07] I haven't spoken with Jaime about any decommissioning in CODFW
[13:57:47] I don't see any even commented out
[13:57:57] if it's out eqiad is fine too
[13:58:51] volans: All the ones under db1050 are supposed to be ready to be decommissioned as you probably know already
[13:58:54] Maybe picking one of those?
[13:59:11] ie: db1035?
[13:59:53] ok thanks
[14:00:55] volans: Actually db1019
[14:00:57] that one :)
[14:01:11] Ah, but it is in spare role in Puppet
[14:01:11] ack
[14:01:15] it's fine
[14:01:20] I guess
[14:01:23] let me check icinga
[14:01:41] Ah, I think it was removed
[14:01:44] fine for me
[14:01:46] It is in the last step of being decommissioned
[14:01:50] ahhh
[14:01:53] Yeah
[14:01:57] but the config on icinga is still there
[14:02:01] so ok for me :D
[14:02:09] You can also check this: https://phabricator.wikimedia.org/T148078
[14:02:22] There are three hosts there :)
[14:10:06] thanks marostegui all done with db1019 :)
[14:11:55] nice
[14:12:58] yw!
[14:16:06] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728781 (chasemp) With maintain-views2 error: > pymysql.err.InternalError: (1227, 'Access denied; you need (at least one of) the SUPER privilege(s) for...
[14:19:12] chasemp: I will try that manually
[14:19:24] cool
[14:19:40] I'm wondering if it's because the definer is not the executing user?
[14:19:43] but I'm not sure
[14:20:02] It is weird, I am going to try the SELECT first
[14:20:15] I hate mysql grants :)
[14:44:45] chasemp: I am only able to create that definer with the SUPER privilege indeed :(
[14:51:15] DBA, Labs, Labs-Infrastructure, Operations: Create maintain-views user for labsdb1001 and labsdb1003 - https://phabricator.wikimedia.org/T148560#2728901 (Marostegui) So far I have only been able to create it with either SUPER or ALL PRIVILEGES ON: `GRANT SELECT ON *.* TO 'maintain-views2'@'10.64....
[14:52:06] chasemp: By the way, db1069 is running an ALTER table that might take a few more hours, so you will see delay on S4 on labs. This is the alter: https://phabricator.wikimedia.org/T147305
[14:59:45] marostegui: k thanks for the heads up :)
[15:15:10] DBA, Patch-For-Review: Reimage dbstore2001 as jessie - https://phabricator.wikimedia.org/T146261#2728979 (Marostegui) After adding the PK change I am now importing the enwiki tables and `change_tag` table has been imported finely. So, so far so good!. We will see how it goes with the rest of the tables,...
[17:09:29] DBA, Operations, ops-codfw: db2037: Disk in predictive failure - https://phabricator.wikimedia.org/T148373#2729189 (Papaul) a:Papaul>Marostegui Disk replacement complete
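
Note on the maintain-views2 grants (13:45:10 and 14:44:45 above): a minimal Python sketch of the statements described in the chat, not the commands that were actually run. The host `192.0.2.10`, the definer name `viewmaster`, and the helper names `build_grants` and `definer_needs_super` are assumptions made for illustration; the real host is truncated in the log. The second helper is a deliberate simplification of the rule the 1227 error points at, namely that in MariaDB of that era naming a DEFINER other than the connected account needs SUPER.

```python
def build_grants(user, host):
    """Build the two GRANT statements described at 13:45:10 (hypothetical helper)."""
    account = "'{}'@'{}'".format(user, host)
    return [
        # `%\_p` matches every database name ending in "_p"; the backslash
        # makes the underscore literal instead of a one-character wildcard.
        "GRANT ALL PRIVILEGES ON `%\\_p`.* TO {};".format(account),
        # Read access everywhere else, so views can select from base tables.
        "GRANT SELECT ON *.* TO {};".format(account),
    ]


def definer_needs_super(definer, executing_user):
    """Simplified rule: a DEFINER other than the connected account needs SUPER."""
    return definer != executing_user


if __name__ == "__main__":
    # Host and definer below are placeholders, not values from the log.
    for statement in build_grants("maintain-views2", "192.0.2.10"):
        print(statement)
    print(definer_needs_super("viewmaster", "maintain-views2"))
```

Generating the statements as strings keeps the sketch free of any database connection; whether these two grants are sufficient is exactly what the conversation leaves open.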