[05:30:36] DBA, Labs: Prepare and check storage layer for kbp.wikipedia.org - https://phabricator.wikimedia.org/T160869#3399517 (Marostegui) a:Marostegui
[05:55:00] DBA, Labs, User-bd808, cloud-services-team (Kanban): Prepare and check storage layer for atjwiki - https://phabricator.wikimedia.org/T167715#3399535 (Marostegui) I have sanitized the tables on both sanitarium hosts and that replicated to labs. Also recreated the views on labsdb1009, labsdb1010, l...
[05:55:55] DBA, Labs: Prepare and check storage layer for kbp.wikipedia.org - https://phabricator.wikimedia.org/T160869#3399548 (Marostegui) Open>Resolved I have sanitized the tables on both sanitarium hosts and that replicated to labs. Also recreated the views on labsdb1009, labsdb1010, labsdb1011 and the...
[07:31:53] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s7 - https://phabricator.wikimedia.org/T166208#3399651 (Marostegui)
[07:36:04] Blocked-on-schema-change, DBA, Patch-For-Review: Convert unique keys into primary keys for some wiki tables on s7 - https://phabricator.wikimedia.org/T166208#3399680 (Marostegui) #cloud-services-team I have started the alter table on labsdb1003, so s7 replication thread will be delayed for around 48h.
[08:33:51] DBA, Commons, MediaWiki-Special-pages, Wikimedia-General-or-Unknown, and 3 others: Special:ShortPages does not load in Wikimedia Commons: "Read timeout is reached" - https://phabricator.wikimedia.org/T168010#3399815 (Aklapper) This "Unbreak Now" priority task has not seen updates for two weeks. W...
[08:37:43] DBA, Commons, MediaWiki-Special-pages, Wikimedia-General-or-Unknown, and 3 others: Special:ShortPages does not load in Wikimedia Commons: "Read timeout is reached" - https://phabricator.wikimedia.org/T168010#3399829 (jcrespo) > Who plans to review I think nobody, so we should just merge it- wors...
[08:44:15] DBA, Operations, ops-eqiad: Degraded RAID on db1066 - https://phabricator.wikimedia.org/T169448#3398609 (jcrespo) @Marostegui That doesn't work- older hosts have 300GB disks- older but not so much have 600GB ones.
[08:52:36] DBA, Operations, ops-eqiad: Degraded RAID on db1066 - https://phabricator.wikimedia.org/T169448#3399923 (Marostegui) Good catch @jcrespo - thank you. @Cmjohnson please advise if you ran out of 600GB spare disks. Thanks guys
[09:09:27] DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3400000 (Marostegui)
[09:09:33] DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3400015 (Marostegui) p:Triage>Normal
[09:13:11] DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3400000 (jcrespo) We may want to hold this, at least unless a switch is planned- hosts DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3400043 (Marostegui) Open>stalled
[09:30:43] DBA, Operations, Wikimedia-Site-requests: Global rename of Markos90 → Mαρκος: supervision needed - https://phabricator.wikimedia.org/T169396#3400081 (alanajjar) @Marostegui start it now?
[09:38:07] DBA, Operations, Wikimedia-Site-requests: Global rename of Markos90 → Mαρκος: supervision needed - https://phabricator.wikimedia.org/T169396#3400117 (alanajjar) We started https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/M%CE%B1%CF%81%CE%BA%CE%BF%CF%82
[09:44:06] DBA, Labs, Labs-Infrastructure, Tracking: LabsDB replica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3400123 (jcrespo)
[09:44:10] DBA, Labs, Epic: Labs database replica drift - https://phabricator.wikimedia.org/T138967#3400124 (jcrespo)
[09:44:12] DBA, Labs: enwiki_p logging vs logging_userindex returning dramatically different results - https://phabricator.wikimedia.org/T168349#3400122 (jcrespo)
[10:11:08] DBA, Operations, Wikimedia-Site-requests: Global rename of Markos90 → Mαρκος: supervision needed - https://phabricator.wikimedia.org/T169396#3400169 (Marostegui) a:alanajjar
[10:16:53] DBA, Operations, Wikimedia-Site-requests: Global rename of Markos90 → Mαρκος: supervision needed - https://phabricator.wikimedia.org/T169396#3400195 (alanajjar) Finished
[10:17:09] DBA, Operations, Wikimedia-Site-requests: Global rename of Markos90 → Mαρκος: supervision needed - https://phabricator.wikimedia.org/T169396#3400211 (alanajjar) Open>Resolved
[10:24:30] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#3400239 (jcrespo)
[10:24:58] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#3400240 (Marostegui) p:Triage>Normal
[10:31:32] DBA: Setup dbstore2002 with 2 new mysql instances from production and enable GTID - https://phabricator.wikimedia.org/T169510#3400245 (jcrespo)
[10:31:47] DBA: Setup dbstore2002 with 2 new mysql instances from production and enable GTID - 
https://phabricator.wikimedia.org/T169510#3400268 (Marostegui) p:Triage>Normal
[10:32:08] DBA: Setup dbstore2002 with 2 new mysql instances from production and enable GTID - https://phabricator.wikimedia.org/T169510#3400271 (jcrespo)
[10:32:10] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#3400272 (jcrespo)
[10:40:07] DBA, Epic: Meta ticket: Migrate multi-source database hosts to multi-instance - https://phabricator.wikimedia.org/T159423#3400304 (jcrespo)
[10:40:09] DBA: Migrate dbstore2001 to multi instance - https://phabricator.wikimedia.org/T168409#3400303 (jcrespo)
[10:43:00] DBA: Refactor puppet mariadb class to support multi-instance hosts - https://phabricator.wikimedia.org/T169514#3400308 (jcrespo)
[10:46:22] DBA, Operations: Create less overhead on bacula jobs when dumping production databases - https://phabricator.wikimedia.org/T162789#3400335 (jcrespo) a:jcrespo>None
[10:50:45] DBA: Implement cron-based mydumper backups on the dbstore role - https://phabricator.wikimedia.org/T169516#3400346 (jcrespo)
[10:55:28] DBA, Documentation: Research backup storage options and prepare a design document - https://phabricator.wikimedia.org/T169517#3400365 (jcrespo)
[10:56:30] DBA: Implement cron-based mydumper backups on the dbstore role - https://phabricator.wikimedia.org/T169516#3400379 (jcrespo)
[10:56:32] DBA, Operations: Create less overhead on bacula jobs when dumping production databases - https://phabricator.wikimedia.org/T162789#3400378 (jcrespo)
[10:56:47] DBA: Implement cron-based mydumper backups on the dbstore role - https://phabricator.wikimedia.org/T169516#3400346 (jcrespo)
[10:56:49] DBA, Wikimedia-Incident: Improve regular production database backups handling - https://phabricator.wikimedia.org/T138562#2403942 (jcrespo)
[11:57:03] DBA: Setup dbstore2002 with 2 new mysql instances from production and 
enable GTID - https://phabricator.wikimedia.org/T169510#3400637 (Marostegui) a:Marostegui
[11:58:13] DBA, Wikidata, Patch-For-Review, Performance, and 2 others: slow master queries on Wikibase\Client\Usage\Sql\EntityUsageTable::getAffectedRowIds - https://phabricator.wikimedia.org/T169336#3400643 (thiemowmde)
[11:58:54] DBA, Wikidata, Patch-For-Review, Performance, and 2 others: slow master queries on Wikibase\Client\Usage\Sql\EntityUsageTable::getAffectedRowIds - https://phabricator.wikimedia.org/T169336#3395074 (thiemowmde) p:Triage>Low
[12:16:50] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3400682 (Marostegui) Taking db1102 for: T169510
[12:17:15] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3400684 (Marostegui)
[12:31:28] DBA, Patch-For-Review: Setup dbstore2002 with 2 new mysql instances from production and enable GTID - https://phabricator.wikimedia.org/T169510#3400711 (Marostegui)
[12:32:56] DBA: Query from stat1003 brought down db1047 - https://phabricator.wikimedia.org/T136214#2326904 (Marostegui) Just came across this ticket and given how old it is....is this still an issue or can be closed?
[12:38:57] DBA: Query from stat1003 brought down db1047 - https://phabricator.wikimedia.org/T136214#3400742 (jcrespo) Open>Resolved
[12:48:46] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3400760 (jcrespo) Marostegui, quick question, do you know what is the state of this- are those other servers still not going up, do you want me to have a third quick look in case it is a...
[12:50:35] DBA, Operations, Patch-For-Review: eqiad rack/setup 11 new DB servers - https://phabricator.wikimedia.org/T162233#3400761 (Marostegui) >>! 
In T162233#3400760, @jcrespo wrote: > Marostegui, quick question, do you know what is the state of this- are those other servers still not going up, do you want m...
[13:40:07] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400808 (alanajjar)
[13:40:40] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400808 (Marostegui) @alanajjar if you want to do it now, I am available
[13:42:16] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400822 (alanajjar)
[13:42:48] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400808 (alanajjar) @Marostegui. of course, let us start!
[13:43:06] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400825 (Marostegui) Go ahead!
[13:45:19] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400845 (alanajjar) We start. [[https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/JMagalh%C3%A3es |The log]]
[13:45:50] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400846 (alanajjar) a:Marostegui
[13:45:56] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400847 (Marostegui) >>! In T169527#3400845, @alanajjar wrote: > We start. 
> > [[https://meta.wikimedia.org/wiki/Special:GlobalRenameProgress/JMag...
[14:19:12] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400983 (alanajjar) Finished. Thanks @Marostegui
[14:19:35] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400984 (alanajjar) Open>Resolved
[14:19:45] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400986 (Marostegui) Thank you!! :-)
[14:21:27] DBA, Operations, Wikimedia-Site-requests: Global rename of Antero de Quintal → JMagalhães: supervision needed - https://phabricator.wikimedia.org/T169527#3400992 (alanajjar) >>! In T169527#3400986, @Marostegui wrote: > Thank you!! :-) :)
[14:36:22] as FYI I am going to run some long running alter tables on dbstore1002, the same as db1047 (/home/elukey/dbstore1002.sql for more info)
[14:36:29] Manuel is aware :D
[14:36:52] I checked and no alters are running, plus I am going to submit only alters for table less than 100M rows
[14:36:58] what is the ETA for those alters?
[14:37:06] ok
[14:37:27] good question, db1047 is more than half way through and it is not trashing atm
[14:37:38] but I'd need to start with dbstore1002
[14:38:51] jynus: anything against it? Otherwise I'll start now
[14:39:07] they will probably run for some days :(
[14:39:12] no, I asked the ETA because during it
[14:39:20] we cannot restart the server
[14:39:51] okok.. I learned my lesson while restarting db1047, now I triple check with you guys first
[14:40:01] all right going to start
[15:19:51] DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3401229 (Papaul) @Marostegui Proposal approved.
[15:20:32] DBA, Operations, ops-codfw: Move some masters away from B6 - https://phabricator.wikimedia.org/T169501#3401232 (Marostegui) Thanks @Papaul - let's leave this stalled for now. We will ping you if we decide to go for it :-)
[16:30:48] DBA, Labs: ukwikimedia still present on replicas dbs on labs hosts - https://phabricator.wikimedia.org/T169488#3401577 (jcrespo)
[16:31:03] DBA, Labs: ukwikimedia still present on replicas dbs on labs hosts - https://phabricator.wikimedia.org/T169488#3401579 (Marostegui) Thanks @bd808! Can you clean up the views and I will take care of removing the db?
[16:31:27] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3401580 (jcrespo)
[16:32:05] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3399766 (jcrespo) Yes, we may need your help to update the meta database, maybe? I can take care of the actual data deletion.
[16:33:09] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3401597 (jcrespo) Self-reminder, reload the replication filters, too, just in case.
[16:40:04] marostegui: I can most certainly drop the views, but ... I need you to help me understand what to actually do. The only maint I've done on the replicas so far is running the scripts that maintain replicas and metadata
[16:44:00] bd808: As far as I know there is a way to clean up views, that chasemp told me a few days ago
[16:44:06] let me see if I find it
[16:45:42] bd808: https://phabricator.wikimedia.org/T153213#3382376
[16:46:10] ok. I think I can run that in all the places now :)
[16:48:38] hmmm... so with the change to add to deleted.dblist maintain-views will actually skip the db... 
[16:48:48] * bd808 reads the code
[16:51:21] hhmmm... maybe --clean will remove. It's worth a shot
[16:57:06] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3401733 (bd808) bah. !log fail: `[16:55] < bd808> !log Running maintain-views --all-databases --clean --replace-all --debug on labsdb1001`
[17:04:13] bd808 : --clean work out?
[17:12:02] chasemp: not sure yet. Doing --clean with just the db name did not. The db gets skipped because it is in the deleted list
[17:12:29] running the full deal now
[17:12:34] (not quickly)
[17:12:42] Oh, right. It won't change with full list....
[17:13:09] need a force for deleted I guess
[17:14:03] yeah.. it's not common at all, so maybe we just document how to find and remove all the views for a decommissioned wiki?
[17:14:37] Although if I'm going to do it I guess it needs to be handled by the script I have access to run
[17:15:51] * chasemp nods
[17:18:38] running `maintain-views --all-databases --clean --replace-all --debug` on labsdb1001 is taking much longer than the 5m you saw on 1009
[17:19:11] it's at ~25m now and still going
[17:19:14] Well it has never been run on the old labsdb I believe
[17:19:20] ah
[17:19:28] I treated those with kid gloves
[17:19:30] probably time then ;)
[17:19:44] * bd808 wildly breaks all the things
[17:20:02] assuming it has never been cleaned up this way
[17:20:27] the time seems to be soaked up by creating a lot of missing views
[17:20:43] page_props, pagelinks, etc
[18:03:09] volans: I filed some tasks on Friday for a couple other core tables that can be truncated
[18:03:15] (linked from the epic-mega-tracking-task)
[18:03:46] RainbowSprinkles: you probably wanted to ping the DBAs ;)
[18:03:58] Derp, I meant marostegui ;-)
[18:04:05] Dunno how I conflated y'all
[18:04:16] lol
[18:11:48] RainbowSprinkles: yeah, saw them, thanks a lot!! :)
[18:12:06] No problem. 
Should all be mostly empty anyway, but yeah another to check off the list :)
[18:12:08] maintain-views is still running 80m later...
[18:12:29] apparently the views on labsdb1001 were quite out of date
[18:14:09] bd808: is it doing useful work?
[18:14:29] It looks like it is creating missing views
[18:14:34] I've been keeping with the motto "all things will be right on new cluster" and been afraid of a mass run there
[18:14:35] so yes, useful
[18:14:52] so all ops I've done have been per wiki or 1 table across all
[18:55:56] DBA, Wikidata, Patch-For-Review, Performance, and 2 others: slow master queries on Wikibase\Client\Usage\Sql\EntityUsageTable::getAffectedRowIds - https://phabricator.wikimedia.org/T169336#3402183 (Ladsgroup) I don't think this should be low as it definitely contributes to master go read-only 1,5...
[20:30:50] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3402456 (bd808) >>! In T169488#3401733, @bd808 wrote: > bah. !log fail: `[16:55] < bd808> !log Running maintain-views --all-databases --clean --repl...
[23:25:03] DBA, Labs: Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts) - https://phabricator.wikimedia.org/T169488#3402799 (bd808) > seemed to add quite a large number of missing views Not necessarily true. I used `--replace-all`.
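
Editor's note on the 17:14:03 suggestion to "document how to find and remove all the views" for a decommissioned wiki: the following is only a rough sketch of what such a manual cleanup might look like, not the maintain-views implementation and not a procedure anyone in the log confirmed. It assumes the views live in a database named after the wiki with a `_p` suffix (guessed from the `enwiki_p` name that appears earlier in the log, so `ukwikimedia_p` is an assumption), and the helper names `quote_ident` and `drop_view_statements` are hypothetical. It only builds SQL text; running it against a server, and with which client, is left open.

```python
import re

# Query to list the views of one database; information_schema.VIEWS is a
# standard MySQL/MariaDB table. The %s placeholder is meant to be bound by
# whatever DB-API client is used (an assumption, no client is named in the log).
LIST_VIEWS_SQL = (
    "SELECT TABLE_NAME FROM information_schema.VIEWS "
    "WHERE TABLE_SCHEMA = %s"
)

# Only plain identifiers are accepted, so nothing unexpected ends up in a DROP.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def quote_ident(name):
    """Backtick-quote an identifier, rejecting anything unusual (hypothetical helper)."""
    if not _IDENT.match(name):
        raise ValueError(f"refusing unusual identifier: {name!r}")
    return "`" + name + "`"


def drop_view_statements(db_name, view_names):
    """Return one DROP VIEW statement per view, sorted by view name (hypothetical helper)."""
    db = quote_ident(db_name)
    return [
        f"DROP VIEW IF EXISTS {db}.{quote_ident(view)};"
        for view in sorted(view_names)
    ]


if __name__ == "__main__":
    # 'ukwikimedia_p' and the view names are illustrative assumptions only.
    for statement in drop_view_statements("ukwikimedia_p", ["page", "logging"]):
        print(statement)
```

Printing the statements for review before executing anything keeps the step reversible up to the last moment, which seems in keeping with the "kid gloves" handling of the older labsdb hosts mentioned at 17:19:28.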