[02:09:05] DBA, Beta-Cluster-Infrastructure, Release-Engineering-Team: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (mmodell) Stalled→Declined I think we should just delete db03 now that we've got db04 and db05 going. Can we snapshot t...
[05:59:53] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Marostegui) @MarcoAurelio green light from us to get the job re-scheduled
[06:03:13] DBA, Notifications, Growth-Team (Current Sprint), WorkType-Maintenance: Clean up orphaned echo_event rows again - https://phabricator.wikimedia.org/T217073 (Marostegui) Thanks for the heads up, that works for me!
[06:26:04] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui) I used the following query on db1083 to measure the impact of the index change (I executed the query twice to make sure it was "warm"): ` SELECT...
[06:26:45] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui) s1 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1003 [] dbstore1002 [] dbstore1001 [] db1124 [] db1119 [] db1118 [] db1106...
[06:27:00] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui)
[06:47:54] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Tgr) The job has finished running, no obvious sign of anything going wro...
[06:51:32] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Marostegui) Thanks @Tgr! @Wilfredor can you try to log-in now?
[07:01:22] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[07:23:24] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[08:02:33] Blocked-on-schema-change, MediaWiki-Change-tagging, Patch-For-Review, User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (Marostegui) Open→Stalled Stalling this until we have failed over s1 master, as it is impossible to alter that host...
[08:07:57] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui)
[08:31:44] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Daimona) @Marostegui Thanks, that's nice to hear! Given that queries like that one are pretty common, this is surely a huge performance boost.
[08:57:01] I am going to perform an s1 snapshot, will stop replication to speed up the process
[08:57:05] on dbstore1001
[08:57:11] great!
[09:41:39] Blocked-on-schema-change, DBA, AbuseFilter, Patch-For-Review: Apply AbuseFilter patch-fix-index - https://phabricator.wikimedia.org/T187295 (Marostegui) >>! In T187295#4983856, @Daimona wrote: > @Marostegui Thanks, that's nice to hear! Given that queries like that one are pretty common, this is s...
[10:08:36] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (MarcoAurelio) Lots of unattached accounts at https://meta.wikimedia.org/...
[10:11:47] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui) s2 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [x] dbstore1002 [] db1125 [] db1122 [x] db1105 [...
[10:39:34] \o marostegui
[10:39:42] o/
[10:39:43] I am also running a logical dump of m5 to test for regressions
[10:39:53] I'm presenting the stuff I have been doing on wb_terms over the past month to people later today, just preparing some slides
[10:40:01] \o/
[10:40:09] I was wondering what are the 3 biggest tables for wikidatawiki, ignoring wb_terms?
[10:40:22] in terms of size?
[10:41:19] yes, probably rows and/or disk
[10:41:42] So probably revision and pagelinks I would say
[10:41:43] let me check
[10:41:59] just so I can give some sort of scale to the fact that the biggest table from the normalization will be ~60GB and 1.8 billion rows
[10:43:51] https://phabricator.wikimedia.org/P8130
[10:49:55] ty
[10:51:19] checking the backup size can be useful too
[10:51:51] because low entropy may point to a way to normalize content
[10:53:51] jynus: oooh, that is indeed interesting
[11:05:01] addshore: see https://phabricator.wikimedia.org/P8130#48330
[11:05:29] the fact that wb_terms is not the first in size after compression means it is fixable
[11:05:45] * marostegui saves that query
[11:09:50] and we will be able to compare it with physical sizes once snapshot metrics are working :-)
[11:11:15] m5 backups went through, checking now the metadata
[11:12:59] they didn't work: it was marked as correct (expected, the backup worked), but it didn't correctly gather the individual file data
[11:13:34] A MySQL error occurred while inserting the backup file details
[11:13:49] so it just aborts
[11:14:08] aborted but was marked as correct?
[11:14:25] ERROR:backup:We could not find the existing statistics for the finished dump
[11:14:47] yes, remember that I do not consider gathering metrics as critical
[11:14:59] yeah, I meant where was it marked as correct?
[11:15:22] the general backup process was correct, so it went from ongoing to correct
[11:15:37] because it was correct, it only failed to get the individual file statistics
[11:15:44] so it was marked as finished on the DB but failed to get the stats
[11:15:46] yeah, that
[11:15:59] yes, it was correctly marked as finished
[11:16:06] and correct, because it is
[11:16:19] yes, I was trying to understand what you meant, to avoid assuming things, I get it now
[11:16:21] but it errored (non critically) on gathering the individual file stats
[11:16:27] gotcha
[11:28:47] interesting, now it works, but it says it cannot find the new backup
[11:29:03] oh, I see
[11:29:28] it is because it needs to search among the ongoing backups
[11:29:52] but it fails because there are several backups with the same name either finished or failed
[11:41:11] so I just "can" change the status of ongoing backups, and never change failed, finished or deleted backups
[11:41:44] in the future, I should just work with ids, as I presumed something that never happened (backups with the same name)
[11:42:12] I am not sure I understand this: ˜/jynus 12:41> so I just "can" change the status of ongoing backups, and never change failed, finished or deleted backups
[11:42:16] now it is supported because a failed backup (ongoing) can be prepared again (prepare)
[11:43:03] so I have to enable the preparation of backups, but because there were older failed backups with the same name
[11:43:20] the gathering stats failed, because there were 2 backups with the same name
[11:43:33] the older one that failed and the new ongoing one
[11:43:53] so a process will have to fail ongoing backups after, let's say, 24 hours
[11:44:24] so there is only 1 ongoing backup with the same name
[11:44:42] it will make sense, don't worry
[11:45:19] the summary is "ignore failed backups"
[11:45:32] with a "and status = 'ongoing'"
[11:45:37] Ah, I see what you mean now
[11:45:51] Sorry for making you repeat it :)
[11:46:01] no no, it is me that I am so deep
[11:46:05] that I am monologuing here
[11:46:30] I think we discussed how long we should leave a backup as ongoing before
[11:46:43] yeah, didn't get to that
[11:46:53] note some dumps may take 12 hours, like es ones
[11:46:55] But then we realised that sometimes they are still ongoing but they are actually finished and correct (which is what happened now)
[11:47:17] yeah, that is why I am doing prepare not only for xtrabackup
[11:47:23] but for mydumper, too
[11:47:36] so it can be prepared/data gathered again
[11:47:47] yep, makes sense
[11:47:53] so the process will be to let the failure be recognized
[11:48:04] (with an automated process we don't have yet)
[11:48:11] exactly, I was going to ask how we identify whether after 24h the process is fine or failed
[11:48:12] or do it manually
[11:48:35] and if the backup succeeded, just manually move it to ongoing and --only-postprocess
[11:48:40] or generate a new one
[11:50:00] ERROR:backup:Expecting 1 matching dump for m5, found 0
[11:50:11] ^because I forgot to move it to ongoing
[11:50:43] it finished correctly, you mean right? it was a "finished"?
[11:51:59] yep
[11:52:03] and has all metadata
[11:52:08] including size
[11:52:23] with the new recursive directory listing
[11:52:53] it is nice that only the logical part of the programming was broken
[11:53:06] and not the actual coding
[11:53:24] needs proper testing with the snapshot
[11:53:51] the snapshot finished already
[11:53:56] nice
[11:54:35] 1h45
[11:57:46] prepare ongoing, it may take a bit
[11:59:14] it is curious because while copying files is actually faster, the fact that mydumper generates a smaller amount of data
[11:59:26] it reads from memory most of it
[11:59:44] and has higher parallelization opportunities makes it faster to generate in general
[11:59:52] or about the same
[12:00:04] the huge gain will be on recovery
[12:04:56] mydumper takes less than 1:45 for s1?
[12:05:54] around that, for s1 on dbstore1001 (local copy, which is both good and bad)
[12:06:07] it worked, I think
[12:06:19] 776 | snapshot.s1.2019-02-26--09-45-12 | ongoing | dbstore1001.eqiad.wmnet:3311 | dbstore1001.eqiad.wmnet | snapshot
[12:06:28] s1 | 2019-02-26 11:57:26 | NULL | 1111887557719
[12:06:49] it says no end date and ongoing because it is now pigz'ing
[12:07:12] 776 | | aria_log.00000001 | 16384 | 2019-02-26 11:32:36 | NULL
[12:07:20] 776 | enwiki | abuse_filter.frm | 2951 | 2019-02-26 11:32:31 | NULL
[12:07:23] yay!
[12:07:28] nice!
[12:07:38] :)
[12:07:57] this is nice info to cross-reference with backup_files from dump
[17:24:12] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Tgr) That's not good... Can you check if the move otherwise went OK? Es...
[17:36:31] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Tgr) ` Wikimedia\Rdbms\LoadBalancer::runMasterTransactionIdleCallbacks:...
[18:11:18] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Tgr) Oh, right, this is {T188882}, let's follow up there. The accounts s...
[20:55:29] DBA, Wikimedia-Site-requests, MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (aaron) >>! In T215107#4983666, @Tgr wrote: > The job has finished runnin...
[21:39:39] DBA, Wikimedia-Site-requests: Global rename of Дагиров Умар → Takhirgeran Umar: supervision needed - https://phabricator.wikimedia.org/T216444 (Nihlus) @Marostegui Is it okay to proceed with this rename? I see that T215107 has been somewhat resolved.
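
The lookup fix discussed between 11:41 and 11:48 (match only backups with status = 'ongoing', and have a separate process fail ongoing backups after roughly 24 hours) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `backups` table, its columns, and the function names are hypothetical and are not the real backup metadata schema or tooling; SQLite stands in for the actual database so the sketch is self-contained.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema for illustration only; not the real metadata tables.
SCHEMA = """
CREATE TABLE backups (
    id INTEGER PRIMARY KEY,
    section TEXT NOT NULL,
    status TEXT NOT NULL,      -- 'ongoing', 'finished', 'failed' or 'deleted'
    start_date TEXT NOT NULL   -- 'YYYY-MM-DD HH:MM:SS'
)
"""


def find_ongoing_backup(conn, section):
    """Return the id of the single ongoing backup for `section`.

    Failed, finished and deleted backups are ignored by the
    "AND status = 'ongoing'" filter.
    """
    rows = conn.execute(
        "SELECT id FROM backups WHERE section = ? AND status = 'ongoing'",
        (section,),
    ).fetchall()
    if len(rows) != 1:
        raise LookupError(
            f"Expecting 1 matching dump for {section}, found {len(rows)}"
        )
    return rows[0][0]


def fail_stale_backups(conn, now, max_age_hours=24):
    """Mark ongoing backups started more than `max_age_hours` ago as failed.

    Returns the number of rows updated. The 24 hour default is the value
    floated in the discussion, not a confirmed setting.
    """
    cutoff = (now - timedelta(hours=max_age_hours)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "UPDATE backups SET status = 'failed' "
        "WHERE status = 'ongoing' AND start_date < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build an in-memory database with an old failed and a new ongoing m5 backup."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO backups (section, status, start_date) VALUES (?, ?, ?)",
        [
            ("m5", "failed", "2019-02-25 10:00:00"),
            ("m5", "ongoing", "2019-02-26 10:00:00"),
        ],
    )
    return conn
```

With the status filter in place, `find_ongoing_backup(demo(), "m5")` picks the new ongoing row even though an older failed backup exists for the same section. Working with ids throughout, as suggested at 11:41:44, would avoid the ambiguity altogether.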