[05:39:49] 10DBA, 10wikitech.wikimedia.org, 10cloud-services-team (Kanban): Labtestwiki database not reachable from deploy1001 using the standard `sql` helper script - https://phabricator.wikimedia.org/T250002 (10Marostegui) 05Open→03Resolved a:03Marostegui `labtestwiki` lives on `clouddb2001-dev.codfw.wmnet` whi... [06:02:39] 10DBA, 10Data-Services, 10Operations, 10cloud-services-team (Kanban): Prepare and check storage layer for gr.wikimedia.org - https://phabricator.wikimedia.org/T245912 (10Marostegui) @Urbanecm I saw you created the database, next time please ping us on this sort of ticket "Prepare and check storage layer" s... [06:19:01] 10DBA, 10wikitech.wikimedia.org: Wikitech has lots of database drifts with core and rest of the databases - https://phabricator.wikimedia.org/T249972 (10Marostegui) Thank you both for detecting and fixing this so fast! [06:36:27] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) Thanks for this Amir -... [06:37:01] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) p:05Triage→03Medium [06:45:38] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) a:03Marostegui @Ladsgroup are all thse on `dewiki` right? The drifts from `externallinks` look like they come from {T114117} which probably were not done on db1110 (or it got re... [06:47:00] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) [06:47:02] 10DBA, 10Schema-change: Drop externallinks.el_from_namespace on wmf databases - https://phabricator.wikimedia.org/T114117 (10Marostegui) [06:49:49] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) Confirmed it only affects `dewiki` [07:17:19] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) 05Open→03Resolved This is fixed: ` root@cumin1001:/home/marostegui# mysql.py -hdb1110 -e "show create table dewiki.externallinks\G" *************************** 1. row *********... [07:17:21] 10DBA, 10Schema-change: Drop externallinks.el_from_namespace on wmf databases - https://phabricator.wikimedia.org/T114117 (10Marostegui) [07:19:46] Amir1: probably the easier thing for now, is to collect all those drifts which are extra fields in production [07:19:54] those are probably left overs from schema changes that were half applied [07:53:09] marostegui: sure [07:55:56] We were joking you would hate me for doing this [07:56:14] No way! [07:56:21] I love you (even more than before) for doing this! [08:11:33] <3 [08:17:23] 10DBA: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (10Marostegui) dbproxy1011 grants removed ` root@cumin1001:/home/marostegui# host dbproxy1011 dbproxy1011.eqiad.wmnet has address 10.64.37.15 root@cumin1001:/home/marostegui# for i in labsdb1009 lab... [08:17:41] 10DBA: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (10Marostegui) [08:18:24] 10DBA: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (10Marostegui) [08:33:43] Amir1: I was checking the img_deleted column and it indeed doesn't appear on tables.sql but it does on production (enwiki), but I cannot even find a ticket where it said that it had to get dropped in production [08:34:15] select * from image where img_deleted != 0; [08:34:15] Empty set (1.53 sec) [08:34:26] hmm, it can be that people made the change and forgot to follow up on that? [08:34:38] I have seen this several times [08:35:13] I cannot even find the gerrit patch for that column drop [08:35:39] [wikidatawiki]> select * from image where img_deleted != 0; [08:35:39] Empty set (0.00 sec) [08:35:42] But it looks 0 everywhere [08:38:39] "Fun" [08:39:28] I wonder if it is safe to delete [08:39:58] A grep on core doesn't show anything for img_deleted [08:40:23] nothing anywhere: https://codesearch.wmflabs.org/search/?q=img_deleted&i=nope&files=&repos= [08:40:54] I didn't know that url! [08:40:56] Bookmarked! [08:51:25] https://www.mediawiki.org/wiki/Codesearch [08:51:26] :D [08:52:00] I am creating a task for img_deleted [08:52:01] :) [08:53:43] Thanks [08:55:17] marostegui: I think one index in ipblocks need renaming [08:55:46] we have two large drifts that seem to be completing each other [08:55:47] ipblocks ipb_address_unique index-mismatch-prod-extra [08:55:57] ipblocks ipb_address index-mismatch-code-extra [08:58:07] 10DBA, 10Schema-change: Remove image.img_deleted column from production - https://phabricator.wikimedia.org/T250055 (10Marostegui) [08:58:25] 10DBA, 10Schema-change: Remove image.img_deleted column from production - https://phabricator.wikimedia.org/T250055 (10Marostegui) p:05Triage→03Medium a:03Marostegui [08:58:29] I create a ticket for lc_lang_key [08:58:45] I will create one for ipblocks [08:59:01] Thanks! [08:59:15] We need to check if there are parts of code forcing that specific index though [08:59:20] like USE index or stuff like that [09:00:59] Codesearch should be enough I guess [09:01:33] yeah [09:02:19] I wish we could rename indexes in mysql :( [09:02:37] oh wait, I think it is implemented already [09:02:40] let me check [09:03:34] https://jira.mariadb.org/browse/MDEV-7318 10.5 :( [09:04:03] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Ladsgroup) [09:04:12] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Ladsgroup) [09:04:43] 10.5, we will be at 10.4 :(( [09:04:46] so close [09:04:54] yeah [09:04:58] but mysql implemented this on 5.7 [09:04:59] :( [09:10:48] 10DBA: type_acton index in logging table is lingering in production - https://phabricator.wikimedia.org/T250057 (10Ladsgroup) [09:11:10] marostegui: I found two incomplete schema change for you ^ [09:11:11] :D [09:11:12] I am wondering if we should just change the ipb_address index on tables.sql rather than altering all production hosts [09:11:53] Changing tables.sql is definitely faster for sure [09:12:28] let me check, I think it might collide with an index in another table, some DBMSes do that [09:14:14] ah ok [09:14:31] also fixing searchindex might be easy, we don't use it in production [09:17:15] https://phabricator.wikimedia.org/T250056 this one looks easy too [09:17:41] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) a:03Marostegui [09:18:39] interesting the PK is there but the index wasn't dropped [09:18:43] that's weird [09:22:40] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) s1 fixed - they extra KEY only existed on db1089 and dbstore1003:3311: ` root@cumin1001:/home/marostegui# ./section s1 | while read host port; do echo "$host:$port"; mysql.py -h$host:$port enwiki -e "sh... [09:25:33] 10DBA: searchindex indexes are missing in production - https://phabricator.wikimedia.org/T250058 (10Ladsgroup) [09:25:51] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) s6 fixed for: db1093, db1131, db1139:3316, dbstore1005:3316 [09:27:14] man, text tabled is completely fucked up, 400 drifts [09:27:26] responsible for 40% of all drifts [09:27:27] yeah, that one is a mess [09:31:26] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) s7 fixed, only db1094 had the index (on all the wikis) [09:33:53] tl_namespace index is not unique only in s8 [09:34:27] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) 05Open→03Resolved s2 fixed: dbstore1004:3312, db1140:3312, db1122:3312, db1090:3312, [09:35:04] 10DBA: tl_namespace index on templatelinks is not unique only in s8 - https://phabricator.wikimedia.org/T250060 (10Ladsgroup) [09:35:23] did you check if ipblocks can be changed on tables.sql rather than changing all replicas? [09:37:03] 10DBA: tl_namespace index on templatelinks is unique only in s8 - https://phabricator.wikimedia.org/T250060 (10Ladsgroup) [09:37:37] sorry, I should [09:41:38] at first glance it should be okay but I'm trying to find out why it was in the first place [09:45:01] ok, cool! [09:45:08] 10DBA, 10Data-Services, 10Operations, 10cloud-services-team (Kanban): Prepare and check storage layer for gr.wikimedia.org - https://phabricator.wikimedia.org/T245912 (10Urbanecm) Upps, sorry, will do next time @Marostegui! [09:45:52] 10DBA, 10Data-Services, 10Operations, 10cloud-services-team (Kanban): Prepare and check storage layer for gr.wikimedia.org - https://phabricator.wikimedia.org/T245912 (10Marostegui) No problem! Can you check my comment at: T245911#6051001 Thank you! [09:48:46] The latest change on that index is for 2009 and it's not related [09:48:47] https://github.com/wikimedia/mediawiki/commit/a7c7a3fd3305dc118416bebff2d4d209e26cfd88#diff-a22af9c6d1b8ab836a50a290227b98c9R685 [09:49:37] 10DBA: searchindex indexes are missing in production - https://phabricator.wikimedia.org/T250058 (10Marostegui) I am thinking about declining this. The tables in production are InnoDB and binary with binary columns, so we cannot create full text indexes there. tables.sql shows the table being created as MyISAM... [09:51:09] Amir1: But I don't see anything regarding ipblocks no? [09:51:27] there's adding prefix there [09:51:33] to the index but it's not related [09:51:38] and it's realllly old [09:53:58] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Ladsgroup) [09:56:50] 10DBA, 10Data-Services, 10Operations, 10cloud-services-team (Kanban): Prepare and check storage layer for gr.wikimedia.org - https://phabricator.wikimedia.org/T245912 (10Marostegui) #cloud-services-team this is ready for the views creation on labsdb1009, 1010, 1011 and 1012. I have run this: `set session s... [09:57:24] I think we might need to close this https://phabricator.wikimedia.org/T250058#6051265 [10:01:12] Sure [10:01:26] New finding inverse_timestamp field on text table [10:01:52] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) And it does sound as unused: ` root@db1109.eqiad.wmnet[sys]> select * from schema_unused_indexes where index_name='ipb_parent_block_id_2'; +---------------+-------------+-------------------... [10:01:54] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) p:05Triage→03Medium [10:01:56] 10DBA: searchindex indexes are missing in production - https://phabricator.wikimedia.org/T250058 (10Ladsgroup) 05Open→03Declined Sure! [10:06:50] 10DBA: inverse_timestamp column exists in text table, it shouldn't - https://phabricator.wikimedia.org/T250063 (10Ladsgroup) [10:16:16] Amir1: we have an issue with the vslow host again in wokidata [10:16:19] and the following query [10:16:35] SELECT /* SpecialFewestRevisions::reallyDoQuery */ page_namespace AS `namespace`,page_title AS `title`,COUNT(*) AS `value` FROM `revision`,`page` WHERE page_namespace IN (640,146,0) AND (page_id = rev_page) AND (page_is_redirect = 0) GROUP BY page_namespace,page_title ORDER BY value LIMIT 5000 [10:16:54] It has been running for 9h [10:16:57] and it is making the host lag [10:17:00] And creating errors [10:17:03] I am going to kill it [10:17:09] Urghhh. I will strangle these special pages with my bare hands [10:17:25] Do you want to disable it for now? [10:17:40] yeah, if it is not needed [10:18:14] It can die for now. I don't think people will use this special page [10:18:15] I need to go and find the php process and kill it [10:18:49] killed [10:18:57] I should make a general ticket for all of these [10:19:07] And audit them [10:19:20] yeah, definitely [10:21:17] Brb [10:24:04] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) [11:20:16] 10DBA: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (10Ladsgroup) [11:23:10] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Ladsgroup) [11:23:31] marostegui: quick q, did you make a ticket for renaming ipblocks table index in core? [11:25:50] nope [11:25:54] you want me to? [11:26:01] I was waiting on your assessment [11:26:04] no, I make it [11:26:10] don't worry [11:26:17] you mean renaming in tables.sql, right? [11:26:24] yup [11:26:37] nope, I didn't [11:26:52] also I created some tickets for more work for you :P [11:26:53] if that happens, I will need to alter the most recent created wikis [11:26:58] but shouldn't be a lot I guess [11:27:16] and certainly faster than enwiki [11:27:20] and they are tiny [11:30:26] yeah [11:34:18] I am checking s3 and most of them have the ipb_address_unique index [11:36:16] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) [11:36:24] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [11:36:29] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) p:05Triage→03Medium [11:36:56] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) [11:37:02] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [11:49:00] 10DBA: tl_namespace index on templatelinks is unique only in s8 - https://phabricator.wikimedia.org/T250060 (10Marostegui) [11:49:22] 10DBA: tl_namespace index on templatelinks is unique only in s8 - https://phabricator.wikimedia.org/T250060 (10Marostegui) [11:49:28] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [11:49:34] 10DBA: tl_namespace index on templatelinks is unique only in s8 - https://phabricator.wikimedia.org/T250060 (10Marostegui) p:05Triage→03Medium [11:52:13] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) a:03Marostegui It is unused everywhere: ` dbstore1005.eqiad.wmnet:3318 object_schema object_name index_name wikidatawiki ipblocks ipb_parent_block_id_2 db1126.eqiad.wmnet:3306 object_schem... [11:54:14] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) [11:59:15] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) [12:03:50] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) [12:04:14] 10DBA: ipb_parent_block_id_2 index on ipblocks table on s8 only - https://phabricator.wikimedia.org/T250062 (10Marostegui) 05Open→03Resolved This is all done ` root@cumin1001:/home/marostegui# ./section s8 | while read host port; do echo "$host:$port"; mysql.py -h$host:$port wikidatawiki -e "show create tabl... [12:04:21] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [12:21:20] marostegui: thank you for doing all of this <3 [12:21:33] it is great that you reported it! [12:21:48] once we are done with these first batch, let's run the script again [12:25:27] YES [12:35:09] 10DBA, 10Core Platform Team, 10MediaWiki-User-management, 10Schema-change: Rename ipb_address index on ipb_address to ipb_address_unique - https://phabricator.wikimedia.org/T250071 (10Ladsgroup) [12:39:22] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Ladsgroup) For the sake of documen... [12:59:46] 10DBA, 10Core Platform Team, 10MediaWiki-User-management, 10Schema-change: Rename ipb_address index on ipb_address to ipb_address_unique - https://phabricator.wikimedia.org/T250071 (10Marostegui) p:05Triage→03Medium +1 to rename this on table.sql. The wikis that were created and would need an `alter t... [13:01:26] marostegui: I was thinking maybe we can have a report on diffs, to scream for new drifts [13:01:55] yeah, drifts can also be useful indeed [13:02:38] If I can find a way to automatically send the data out of mwmaint1002 (like to people.wikimedia.org), I can even make creating tickets automatic [13:03:21] hmm, I might not even need that, I can create the ticket right from mwmaint1002 [13:26:16] 10DBA, 10Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (10Marostegui) Can someone from #core_platform_team confirm Amir's guess? [13:26:28] 10DBA, 10Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (10Marostegui) p:05Triage→03Medium [14:16:18] Amir1: when do you want to kill wb_terms? [14:16:19] next week? [14:18:08] marostegui: this week sounds fine for WMDE jus keep the labs replica for a bit longer [14:18:17] yeah, that's not a problem [14:18:30] the tables have been renamed everywhere, so it is like if they were dropped anyways [14:18:48] maybe a couple of weeks more for that but production-wise it's fine. wb_terms did its last kicks too and ready to be killed [14:19:09] by that, I means labs [14:19:24] yeah, not in a rush with labs [14:19:26] whenever you guys decide [14:19:43] marostegui: it would be great if addshore and I can do one as an honorary move :D [14:20:16] xD [14:20:21] unless it needs to be depooled and all that dance [14:20:40] labs? [14:20:49] or core? [14:21:07] one of replicas :P [14:21:22] whatever you decide and it's not a big deal [14:21:23] sure thing, you can do the first one if you like. But it needs depooling and all that yeah [14:21:31] but I am happy to depool the first one and you can do it if you like :) [14:21:49] addshore: do you want to do the honors? [14:21:59] well, not right now XD [14:22:03] I am about to go into an interview [14:22:07] But we can schedule for this week [14:22:22] sure [14:22:34] I go back to fighting LDAP [14:22:40] * marostegui hides [14:22:43] i got back to fighting google cloud buckets [14:22:45] :P [14:23:06] I go back to close all Amir1's tickets as declined [14:23:12] :D [14:23:39] * Amir1 summons his microservices [14:23:57] they are AI-powered [15:52:22] Amir1: I don't understand this one https://phabricator.wikimedia.org/T250067 [15:53:00] I am comparing db1089 (enwiki) and db1138 and they have the same structure for user_newktalk [15:53:05] marostegui: basically https://phabricator.wikimedia.org/T234066 was not done in s4 [15:53:23] hmm, maybe it's the testcommons? [15:53:28] Are you sure? I am checking db1138 and db1097:3314 and they are done [15:53:32] let me check testcommons [15:53:44] yes, it is testcommons [15:53:56] cool [15:54:01] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) [15:54:01] should be quick [15:54:35] Actually, I don't know what do with different wikis having different structure in the same host, right now the bot picks a wiki from the section at random [15:54:58] otherwise any issue in s3 will overwhelm the whole report [15:55:42] leave it like that for now [15:55:47] it is easy to check the wikis manually [15:55:53] I just forgot that we have testcommons :) [15:55:56] when I saw s4 [15:56:06] I was thinking of doing logarithmic scaling. If you have 1000 wikis, you will use 7 wikis (ln(100) ~=6.9) [15:57:01] any way, that can be done later [15:57:27] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) codfw fixed: ` root@cumin1001:/home/marostegui# mysql.py -hdb2099:3314 testcommonswiki -e "show create table user_newtalk\G" *************************** 1. row *************************** Table... [15:57:46] Amir1: yeah, no rush on that [16:01:57] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) s4 eqiad testcommonswiki: [x] labsdb1012 [x] labsdb1011 [x] labsdb1010 [x] labsdb1009 [x] dbstore1004 [] db1138 [x] db1125 [x] db1121 [x] db1103 [x] db1102 [x] db1097 [] db1091 [] db1084 [] db1081 [16:05:27] 10DBA: user_newtalk has two indexes not renamed in s4 - https://phabricator.wikimedia.org/T250067 (10Marostegui) 05Open→03Resolved a:03Marostegui All done: ` root@cumin1001:/home/marostegui# ./section s4 | while read host port; do echo "$host:$port"; mysql.py -h$host:$port testcommonswiki -e "show create t... [16:05:33] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Wikimedia-Incident, 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [16:19:27] marostegui: can you check s3 if any new wikis also missed the index rename? I highly doubt it [16:20:09] for which task? I did it for the ipblocks [16:20:26] and posted the list of wikis that would need rename on the db [17:17:35] 10DBA, 10wikitech.wikimedia.org: Wikitech has lots of database drifts with core and rest of the databases - https://phabricator.wikimedia.org/T249972 (10Reedy) I've done `labtestwiki` now too that {T250002} is fixed [17:20:36] 10DBA, 10wikitech.wikimedia.org, 10cloud-services-team (Kanban): Labtestwiki database not reachable from deploy1001 using the standard `sql` helper script - https://phabricator.wikimedia.org/T250002 (10bd808) >>! In T250002#6050986, @Marostegui wrote: > `labtestwiki` lives on `clouddb2001-dev.codfw.wmnet` wh... [20:02:48] https://grafana.wikimedia.org/d/XyoE_N_Wz/wikidata-database-cpu-saturation?orgId=1 db1126 isnt happy [20:23:23] looks like its having quite an impact on performance