[06:27:49] DBA, Wikidata: Make a copy of the current wb_terms table on the MCR testing DB servers - https://phabricator.wikimedia.org/T211338 (Marostegui) Open>Resolved @Addshore wb_terms has been imported into db1111 (and replicated to db1112). Please check that you have access and if not talk to me priva...
[06:32:56] DBA, Operations, ops-eqiad: Degraded RAID on db1063 - https://phabricator.wikimedia.org/T211537 (Marostegui) p:Triage>High a:Cmjohnson @Cmjohnson I am setting this to high priority because there is one failed disk and another one with smart errors (on a different SPAN). Let's **replace on...
[06:34:09] DBA, Patch-For-Review, cloud-services-team (Kanban): cloudvps: dedicated openstack database - https://phabricator.wikimedia.org/T202889 (Marostegui) Open>Resolved a:Marostegui Closing this for now per T202889#4798131 If someone feels we need to revisit this, please re-open! Thanks everyone
[06:35:49] DBA, Analytics, Data-Services, User-Banyek, User-Elukey: Hardware for cloud db replicas for analytics usage - https://phabricator.wikimedia.org/T210749 (Marostegui) Open>Resolved a:Marostegui Closing this as the hardware has been decided to be purchased and will be followed up at:...
[06:35:51] DBA, Operations: Predictive failures on disk S.M.A.R.T. status - https://phabricator.wikimedia.org/T208323 (Marostegui)
[06:37:57] DBA, MediaWiki-extensions-FlaggedRevs, Wikimedia-Site-requests, User-Zoranzoki21: Drop FlaggedRevs tables in database for srwikinews - https://phabricator.wikimedia.org/T209761 (Marostegui) p:Triage>Normal @Zoranzoki21 can you confirm if this is good to go and if these are the tables that...
[07:14:17] DBA, MediaWiki-extensions-FlaggedRevs, Wikimedia-Site-requests, User-Zoranzoki21: Drop FlaggedRevs tables in database for srwikinews - https://phabricator.wikimedia.org/T209761 (Zoranzoki21) >>! In T209761#4809283, @Marostegui wrote: > @Zoranzoki21 can you confirm if this is good to go and if the...
[07:14:48] DBA, MediaWiki-extensions-FlaggedRevs, Wikimedia-Site-requests, User-Zoranzoki21: Drop FlaggedRevs tables in database for srwikinews - https://phabricator.wikimedia.org/T209761 (Marostegui) Thanks!
[07:19:29] DBA, Wikimedia-Site-requests, User-Zoranzoki21: Drop FlaggedRevs tables in database for ptwikipedia - https://phabricator.wikimedia.org/T211544 (Zoranzoki21) Open>stalled
[08:45:50] marostegui: you mentioned we should rewrite the reimport from master from mysqldump to mydumper. I still agree, as the labsdb1004 fix is still in progress: 63 hours and it is still at 93%
[08:46:30] banyek: It was a general thing, not something we have to do right now. That host is very old and has HDDs
[08:46:37] It was expected to take long if the table is that big
[08:47:05] I was expecting a whole day as 'long' not three :)
[08:47:14] but yeah
[08:47:19] The rewrite shouldn't be difficult, it is just replacing mysqldump with mydumper, so it shouldn't take more than 5 minutes (assuming myloader/mydumper are installed)
[08:47:56] Those hosts will be decommissioned anyways at some point
[08:48:06] And moved to a VM - I believe that's cloud's idea
[08:48:25] well, we'll see that when it comes
[08:48:49] I go now, and recheck the errors, phab and mail, also update the dba-sync
[08:49:02] thanks
[08:50:12] I have updated my part already
[10:04:06] Blocked-on-schema-change, DBA, MediaWiki-Database, Scoring-platform-team, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (Marostegui)
[10:04:13] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_counter on wmf databases - https://phabricator.wikimedia.org/T86338 (Marostegui)
[10:17:22] DBA, Analytics, Analytics-Kanban, Data-Services, Core Platform Team Backlog (Watching / External): Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (JAllemandou) a:JAllemandou>Milimetric
[10:22:48] Blocked-on-schema-change, DBA, MediaWiki-Database, Scoring-platform-team, and 2 others: Schema change for rc_this_oldid index - https://phabricator.wikimedia.org/T202167 (Marostegui) s8 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1002 [] db1124 [] db1116 [] db1109 [] db1104...
[10:22:52] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_counter on wmf databases - https://phabricator.wikimedia.org/T86338 (Marostegui) s8 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1002 [] db1124 [] db1116 [] db1109 [] db1104 [] db1101 [] db109...
[10:42:19] i see that dbstore2001 lags because of the wb_terms updates
[10:43:39] ?
[10:44:30] i see that dbstore2001 lags because wb_terms updates
[10:44:32] better
[10:44:40] I can't edit
[10:44:42] It is not because of that, check SAL
[10:45:49] ah, the schema change
[10:45:58] ;)
[10:46:21] it's weird b/c wb_terms updates (inserts and deletes) show up in the processlist
[10:46:33] why would that be weird?
[10:46:49] Aren't those supposed to show up?
[10:47:32] iirc on non-MariaDB I wouldn't see those, and I still need to get used to it ;)
[10:47:57] You can see those too on mysql
[10:48:01] With processlist
[10:48:05] If they are not fast enough
[10:52:32] ah those are hdd-s
[10:52:48] 🐢
[11:12:52] marostegui: so i think you need to send me a password by some means!
[11:13:02] addshore: sure - meeting now
[11:13:06] coolio!
[13:48:13] addshore: I have sent you an email with instructions
[13:48:24] marostegui: great thanks!
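A minimal sketch of the processlist point made around 10:47-10:48: statements such as the wb_terms inserts and deletes become visible when they run long enough to be caught. It assumes rows shaped like the output of `SHOW PROCESSLIST` (columns `Id`, `Command`, `Time`, `Info`); the function name, the threshold, and the sample rows are hypothetical and not taken from the hosts discussed.

```python
from typing import Dict, List, Optional


def slow_statements(processlist: List[Dict[str, Optional[object]]],
                    min_seconds: int = 1) -> List[Dict[str, Optional[object]]]:
    """Keep rows that carry a statement (Info) and whose Time is >= min_seconds."""
    result = []
    for row in processlist:
        info = row.get("Info")
        seconds = int(row.get("Time") or 0)
        if info and seconds >= min_seconds:
            result.append(row)
    return result


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"Id": 11, "Command": "Sleep", "Time": 300, "Info": None},
    {"Id": 12, "Command": "Query", "Time": 0, "Info": "SELECT 1"},
    {"Id": 13, "Command": "Query", "Time": 7, "Info": "INSERT INTO wb_terms ..."},
]

if __name__ == "__main__":
    for row in slow_statements(sample_rows, min_seconds=5):
        print(row["Id"], row["Time"], row["Info"])
```

On a host with slow disks, more statements would cross whatever threshold is chosen, which fits the "those are hdd-s" remark.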
[13:48:42] addshore: Let me know if it works, once you've tested it
[14:28:51] DBA, Wikimedia-Site-requests, User-Zoranzoki21: Drop FlaggedRevs tables in database for ptwikipedia - https://phabricator.wikimedia.org/T211544 (Banyek) p:Triage>Normal
[14:44:01] DBA, Wikimedia-Site-requests, User-Banyek, User-Zoranzoki21: Drop FlaggedRevs tables in database for ptwikipedia - https://phabricator.wikimedia.org/T211544 (Banyek)
[14:54:16] marostegui: looks gooood and working to me :)
[14:54:22] great!
[14:55:35] DBA, Wikidata: Make a copy of the current wb_terms table on the MCR testing DB servers - https://phabricator.wikimedia.org/T211338 (Addshore) Looking good from my side! ` test_wikiadmin@db1111(wikidatawiki)> show tables; +------------------------+ | Tables_in_wikidatawiki | +------------------------+ |...
[14:57:31] DBA: Grant addshore access to test-s4 servers - https://phabricator.wikimedia.org/T211593 (Marostegui) Open>Resolved p:Triage>Normal
[15:04:42] DBA: Grant addshore access to test-s4 servers - https://phabricator.wikimedia.org/T211593 (Marostegui)
[15:06:12] marostegui: and how much free space is there on that server? :D
[15:06:41] 1.3T free there
[15:06:44] YAY
[15:06:47] that's perfect ;)
[15:07:02] * marostegui scared of why addshore needs more space!
[15:07:42] well, I want to experiment with a couple of different things, but with the full size of the table
[15:07:56] the table is compressed (and defragmented there), so it is "only 400G"
[15:08:03] oh nice!
[15:08:09] what would be the easiest way to essentially duplicate the table? so I still have the original?
[15:08:41] probably the fastest is to import it again
[15:08:51] on a different db
[15:09:35] I could do that (after the meeting I am at now)
[15:09:44] Something like importing it into wikidata2 for example
[15:09:50] ack, I guess "INSERT newtable SELECT * FROM oldtable;" would be pretty slow then? :)
[15:09:58] yeah, it is a huge table
[15:10:11] and that would just use one thread, whereas using myloader I usually use 10
[15:10:40] what I can do is rename it to wb_terms_original
[15:10:45] and then reimport it again
[15:10:48] into wikidata
[15:10:54] so we don't end up with wikidata and wikidata2
[15:12:31] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change, User-Banyek: Dropping user.user_options on wmf databases - https://phabricator.wikimedia.org/T85757 (Banyek)
[15:17:26] addshore: should I do that?
[15:19:35] yes a second one will be good :)
[15:19:49] addshore: ok I will rename wb_terms to wb_terms_original now
[15:19:57] And I will start loading the other wb_terms
[15:19:58] Thanks!
[15:20:58] ok, rename done and I have started the new import
[15:37:59] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change, User-Banyek: Dropping user.user_options on wmf databases - https://phabricator.wikimedia.org/T85757 (Banyek)
[15:50:09] only 7 minutes left for labsdb1004!
[16:47:07] DBA, Analytics, Analytics-Kanban, Data-Services, Core Platform Team Backlog (Watching / External): Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (fdans)
[17:08:08] banyek: labsdb1004 finished \o/
[17:08:21] yay
[17:08:33] last time it was still years from finishing
[17:08:40] can you take care of it?
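The rename-then-reimport approach from 15:10-15:20 (and the earlier mysqldump-to-mydumper idea) could be sketched roughly as below. This only builds the statement and command lines; it does not run anything. The helper names, the dump directory, and the choice of flags are assumptions for illustration, not the commands actually used; the thread count of 10 and the name `wb_terms_original` come from the conversation.

```python
from typing import List


def rename_statement(table: str, suffix: str = "_original") -> str:
    """SQL that keeps the existing table under a new name before a reimport."""
    return f"RENAME TABLE {table} TO {table}{suffix};"


def mydumper_command(database: str, table: str, outdir: str,
                     threads: int = 10) -> List[str]:
    """Argument list for a parallel dump of one table (hypothetical helper)."""
    return [
        "mydumper",
        "--database", database,
        "--regex", f"^{database}\\.{table}$",  # restrict the dump to one table
        "--outputdir", outdir,
        "--threads", str(threads),
    ]


def myloader_command(database: str, dump_dir: str, threads: int = 10) -> List[str]:
    """Argument list for a parallel load of a mydumper directory."""
    return [
        "myloader",
        "--database", database,
        "--directory", dump_dir,
        "--threads", str(threads),
    ]


if __name__ == "__main__":
    # "/srv/dumps/wb_terms" is a made-up path used only for this example.
    print(rename_statement("wb_terms"))
    print(" ".join(mydumper_command("wikidatawiki", "wb_terms", "/srv/dumps/wb_terms")))
    print(" ".join(myloader_command("wikidatawiki", "/srv/dumps/wb_terms")))
```

The point of the parallel loader over `INSERT ... SELECT` is the one made in the chat: the latter copies the table in a single thread.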
[17:08:43] I'll finish our part after the SRE meeting yes
[17:08:47] thanks
[17:08:51] np
[18:06:55] DBA, Operations, ops-eqiad, User-Marostegui: rack/setup/install db11[26-38].eqiad.wmnet - https://phabricator.wikimedia.org/T211613 (RobH) p:Triage>Normal
[18:09:57] DBA: labsdb1004 replication broken for linkwatcher_linklog table - https://phabricator.wikimedia.org/T211210 (Banyek) aand after I restarted the instance, I've got: ` Last_SQL_Error: Could not execute Write_rows_v1 event on table s51230__linkwatcher.linkwatcher_linklog; Duplicate entry '573509232' for key...
[18:11:02] DBA, Operations, ops-eqiad, User-Marostegui: rack/setup/install db11[26-38].eqiad.wmnet - https://phabricator.wikimedia.org/T211613 (RobH) So, to figure out the racking plan: db1061: s6 master : C3 db1062: s7 master : D4 db1063: m1 master : C5 db1064: x1 slave : D1 db1065: m5 master : D1 db1066:...
[18:32:25] banyek: no
[18:32:27] don't do that
[18:32:31] don't ignore it
[18:33:05] is it done already?
[18:33:46] yes, it is done already
[18:33:50] that shouldn't have been done
[18:33:56] it took 3 days to import the table
[18:36:11] DBA: labsdb1004 replication broken for linkwatcher_linklog table - https://phabricator.wikimedia.org/T211210 (Marostegui) >>! In T211210#4811409, @Banyek wrote: > aand after I restarted the instance, I've got: > > ` Last_SQL_Error: Could not execute Write_rows_v1 event on table s51230__linkwatcher.linkwatc...
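As a hedged illustration of inspecting the failing event rather than ignoring the table, the sketch below pulls the event type, schema, table, and duplicate value out of a `Last_SQL_Error` string of the shape quoted above. The regular expression and function name are assumptions for this example; the key name is truncated in the log, so the parser deliberately does not try to capture it.

```python
import re
from typing import Dict, Optional

# Matches the part of the error text that is visible in the log; anything
# after "for key" is cut off in the source and is therefore not parsed.
_DUPLICATE_ERROR = re.compile(
    r"Could not execute (?P<event>\w+) event on table "
    r"(?P<schema>[^.;]+)\.(?P<table>[^;]+); "
    r"Duplicate entry '(?P<value>[^']*)' for key"
)


def parse_duplicate_error(last_sql_error: str) -> Optional[Dict[str, str]]:
    """Return event, schema, table and duplicate value, or None if no match."""
    match = _DUPLICATE_ERROR.search(last_sql_error)
    return match.groupdict() if match else None


if __name__ == "__main__":
    error = (
        "Could not execute Write_rows_v1 event on table "
        "s51230__linkwatcher.linkwatcher_linklog; "
        "Duplicate entry '573509232' for key"
    )
    print(parse_duplicate_error(error))
```

With the table and value in hand, the row can be looked up on the replica and compared against the binlog event to decide whether the duplicate is real.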
[18:37:21] :(
[18:38:26] I am going to check the binlog to see how many transactions we might have lost for that table
[18:38:36] But don't ignore tables that easily, especially big ones
[18:38:40] Especially if they take 3 days to reimport
[18:38:48] It is easier to inspect the transaction
[18:38:49] DBA: labsdb1004 replication broken for linkwatcher_linklog table - https://phabricator.wikimedia.org/T211210 (Banyek) I was thinking we should let it catch up, and then redo the import, but with mydumper instead of mysqldump as we were talking about that Also it's weird why it happened again
[18:39:20] ok, see my comment
[18:39:26] Yeah, but that makes no sense
[18:39:31] It takes 3 days
[18:39:42] It is easier to see if the duplicate is real or not
[18:39:55] Ignoring the table without checking what really happened isn't a good idea
[18:40:01] Especially if it is that big
[18:41:13] well, we should discuss this tomorrow, I leave now because my kids are demanding me
[18:41:22] bye
[18:57:33] DBA: labsdb1004 replication broken for linkwatcher_linklog table - https://phabricator.wikimedia.org/T211210 (Marostegui) >>! In T211210#4811472, @Banyek wrote: > I was thinking we should let it catch up, and then redo the import, but with mydumper instead of mysqldump as we were talking about that > Also it...
[20:02:42] DBA: labsdb1004 replication broken for linkwatcher_linklog table - https://phabricator.wikimedia.org/T211210 (Marostegui) I think I have replayed all the transactions that were skipped during that time. Of course, going thru binlogs is hard and tedious and I might have missed transactions, so far replication...
[20:08:41] DBA, Gerrit, Operations, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Paladox) ReviewDB has now been removed upstream.
[21:06:31] DBA, Operations, Performance-Team: Increase parsercache keys TTL from 22 days back to 30 days - https://phabricator.wikimedia.org/T210992 (Imarlier) a:aaron @aaron to provide feedback, will assign back once he has.