[06:44:50] DBA: Decommission db1059 - https://phabricator.wikimedia.org/T196606#4263030 (jcrespo) p:Triage>Normal
[06:45:54] DBA, Patch-For-Review: Decommission db1053 - https://phabricator.wikimedia.org/T194634#4263046 (jcrespo)
[06:46:08] DBA, Patch-For-Review: Decommission db1053 - https://phabricator.wikimedia.org/T194634#4204024 (jcrespo)
[07:46:40] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4263134 (jcrespo) > ir_type | varbinary No, don't do this. If anomie has a veto on enums that's fine to me, but...
[08:32:42] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4263212 (Marostegui) >>! In T195690#4259933, @jcrespo wrote: > @RobH Can you check if we have next-business day support for defects for this hw provider and purchase? Because they seem to not be honori...
[08:46:49] be careful with those events, they were installed on the old hosts, but they could break replication
[08:47:07] yeah
[08:47:13] I only deployed on one of the sanitariums
[08:47:20] And only one thread
[08:47:44] After two hours I deployed it to the rest of threads of that single host, and I am seeing data so far, but won't deploy to the other host till tomorrow
[08:47:47] just to be 100% sure
[09:42:11] can we delay our meeting 15 minutes?
[09:43:08] yep
[09:43:25] (we can finish at the original time)
[12:21:04] db1111 and 2 are coming back now reimaged, marostegui
[12:43:16] \o/
[12:43:22] jynus: so I can start with them?
[12:43:44] I just restarted them for intel-microcode installing
[12:43:55] nice!
[12:44:37] do you want me to depool db1091 and I reimage that after cloning?
[12:44:50] or do you prefer to load from backups
[12:45:11] I was going to depool a host in s4
[12:45:13] any preference?
[12:45:16] :q!
[12:46:51] db1091?
[12:47:14] https://gerrit.wikimedia.org/r/437985
[12:47:22] \o/
[12:57:39] marostegui: db1091 is all yours ping me when done so I can reimage it- note that copying it before reimage will require a mysql_upgrade
[12:57:46] awesome!
[12:57:46] thanks
[12:57:49] I will start now
[14:20:57] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4264100 (dbarratt) >>! In T193449#4263134, @jcrespo wrote: > No, don't do this. If anomie has a veto on enums that's...
[14:22:52] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4264102 (jcrespo) > it should only exist in code That is ok to me.
[14:29:57] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4264120 (Anomie) I don't claim a veto, but the cons seem to outweigh the pros. NameTableStore is basically a "manual...
[14:36:21] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4264157 (jcrespo) > I don't claim a veto, but the cons seem to outweigh the pros I meant veto as a figure of speech...
[14:39:32] Heads up that the databases for the 5 new wikis have been created (all public)
[14:41:21] can you comment on the tickets, so I can get to sanitize them and then pass them over to cloud team?
[14:44:23] I can when I've done the rest :P
[14:44:28] Or, I can cheat
[14:44:29] * Reedy grins
[14:44:45] haha
[15:04:33] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4264282 (RobH) We had next day support until it expired on May 25th. However, if this case was open before then, they should honor the warranty.
[15:06:48] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4264299 (Marostegui) >>! In T195690#4264282, @RobH wrote: > We had next day support until it expired on May 25th. However, if this case was open before then, they should honor the warranty. No, it wa...
[15:13:27] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for sahwikiquote - https://phabricator.wikimedia.org/T196362#4264310 (Marostegui) stalled>Open This wiki has been sanitized. Ready for the views creation cc @bd808 @Bstorm. We have to double check the script for the views wor...
[15:13:33] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for sahwikiquote - https://phabricator.wikimedia.org/T196362#4264315 (Marostegui) p:Triage>Normal
[15:13:57] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4264316 (Marostegui) stalled>Open p:Triage>Normal This wiki has been sanitized. Ready for the views creation cc @bd808 @Bstorm. We have to double check the script for the vie...
[15:14:17] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for pswikivoyage - https://phabricator.wikimedia.org/T196359#4264323 (Marostegui) stalled>Open p:Triage>Normal This wiki has been sanitized. Ready for the views creation cc @bd808 @Bstorm. We have to double check the...
[15:14:42] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for pmswikisource - https://phabricator.wikimedia.org/T195008#4264330 (Marostegui) This wiki has been sanitized. Ready for the views creation cc @bd808 @Bstorm. We have to double check the script for the views works correctly before...
[15:15:04] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4264333 (Marostegui) This wiki has been sanitized. Ready for the views creation cc @bd808 @Bstorm. We have to double check the script for the views works correctly before go...
[15:15:50] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4264338 (RobH) >>! In T195690#4264282, @RobH wrote: > We had next day support until it expired on May 25th. However, if this case was open before then, they should honor the warranty. I misread the r...
[15:47:06] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4264500 (Bstorm) a:Bstorm I'll try this one as our first test case.
[15:50:35] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4264510 (dmaza) `tinyint` and PHP constant it is then. Thank you all for the help and prompt response.
[15:51:17] DBA, MediaWiki-User-management, Anti-Harassment (AHT Sprint 23): Draft a proposal for granular blocks table schema(s), submit for DBA review - https://phabricator.wikimedia.org/T193449#4169871 (dmaza) Open>Resolved
[15:53:40] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for idwikimedia - https://phabricator.wikimedia.org/T193187#4264531 (Bstorm) Surprisingly, the script has failed on the rights to create `_p` the database (could not execute the `CREATE DATABASE` statement. I could have sworn that w...
[15:54:32] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for sahwikiquote - https://phabricator.wikimedia.org/T196362#4264546 (Bstorm) a:Bstorm
[15:54:47] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for pswikivoyage - https://phabricator.wikimedia.org/T196359#4264549 (Bstorm) a:Bstorm
[15:55:11] DBA, Cloud-Services: Prepare and check storage layer for bn.wikivoyage - https://phabricator.wikimedia.org/T196358#4264551 (Bstorm) a:Bstorm
[15:55:42] DBA, Cloud-Services, User-Urbanecm: Prepare and check storage layer for pmswikisource - https://phabricator.wikimedia.org/T195008#4264558 (Bstorm) a:Bstorm
[16:08:02] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264628 (Papaul) a:Papaul>jcrespo @jcrespo Dell Shipped a new main board and a new network card. I replaced first the network card to see if the network card was the problem and yes we h...
[16:10:48] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4264633 (Marostegui) Disk arrived and got replaced. Note, it is bigger than the other ones. It is rebuilding: ``` logicaldrive 1 (11.6 TB, RAID 1+0, Recovering, 2% complete) physicaldrive...
[16:15:37] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264648 (Marostegui) I got the server back with network up. We will take it from here. Thanks a lot @Papaul
[16:18:11] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264650 (Marostegui) So, MySQL is up, but unfortunately the master's binlog where pc2005 was replicating from is gone
[16:22:41] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264660 (Marostegui) I guess as this data is just be erased when the number of days expires, we can probably start replication from the current position?
[16:25:39] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264697 (jcrespo) > I guess as this data is just be erased when the number of days expires, we can probably start replication from the current position? Yes, or replicating from the first pc1005...
[16:32:22] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264711 (Marostegui) Sounds like a plan, I will start replication from the first available position
[16:38:24] DBA, Operations, ops-codfw, Patch-For-Review: pc2005 down - https://phabricator.wikimedia.org/T196339#4264723 (Marostegui) Replication has started. It is not ROW: ``` +------------+ | @@hostname | +------------+ | pc1005 | +------------+ +---------------+-----------+ | Variable_name | Value...
[16:42:31] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Deploy MCR storage layer - https://phabricator.wikimedia.org/T174044#4264731 (CCicalese_WMF)
[16:45:28] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Test MCR Storage Layer Patches - https://phabricator.wikimedia.org/T196653#4264733 (CCicalese_WMF) p:Triage>Normal
[16:46:59] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Deploy MCR storage layer - https://phabricator.wikimedia.org/T174044#4264752 (CCicalese_WMF)
[16:47:07] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Test MCR Storage Layer Patches - https://phabricator.wikimedia.org/T196653#4264733 (CCicalese_WMF)
[16:47:53] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Deploy MCR storage layer - https://phabricator.wikimedia.org/T174044#3549134 (CCicalese_WMF)
[16:51:18] marostegui: just wondering, whats the state with the data reload? :)
[16:54:21] addshore: every time you ask, it gets delayed one day :-)
[16:54:55] wahahahaa
[16:54:57] ;)
[16:54:59] db1111 should be almost there, but we need to copy it to db1112
[16:55:05] ack, thanks :)
[16:55:09] we said one day, but we started midday
[16:55:17] so think midday tomorrow
[16:55:41] we had to take care of more urgent things in the morning
[16:56:03] thats fine :) just getting a status update as I'm in a meeting about it now! thanks for your work!
[17:03:12] DBA, MediaWiki-Platform-Team, Structured-Data-Commons, Wikidata, and 2 others: Deploy MCR storage layer - https://phabricator.wikimedia.org/T174044#4264823 (CCicalese_WMF)
[17:07:49] addshore: I replied earlier, it is all done
[17:07:56] all done? :D
[17:08:11] https://phabricator.wikimedia.org/T196172#4264758
[17:08:29] GREAT, thanks!
[17:10:31] pc2005 is catching up nicely
[17:10:37] I guess tomorrow it will be up to date
[17:10:42] I will leave it depooled till monday though
[17:10:47] Just to make sure it is stable
[17:10:56] And on monday if all goes fine I will repool it
[18:30:57] DBA, Operations, ops-eqiad: Degraded RAID on labsdb1009 - https://phabricator.wikimedia.org/T195690#4265369 (Marostegui) Open>Resolved All good! Thank you!! ``` logicaldrive 1 (11.6 TB, RAID 1+0, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, Solid State SATA, 1600.3 GB, OK)...
[20:11:58] DBA, Operations, ops-eqiad: rack/setup/install dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T196690#4265618 (RobH) p:Triage>Normal
[22:54:15] DBA, Operations, ops-codfw: replace bad disk in db2059 - https://phabricator.wikimedia.org/T196709#4266144 (RobH) p:Triage>High
[22:56:15] DBA, Operations, ops-codfw: replace bad disk in db2059 - https://phabricator.wikimedia.org/T196709#4266159 (RobH) I've set this to high priority due to the looming end of fiscal. If this new SAS disk works fine and the raid rebuilds without incident, we'll be ordering a dozen (or more) additional di...