[07:09:32] DBA, Analytics, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3100960 (Marostegui) >>! In T156844#3099522, @Ottomata wrote: > Oh yeah, rats, I totally forgot to put this in our budget request. Hm. do db1046 and db1047 host just EL da...
[08:14:22] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101019 (Marostegui) Just to be on the safe side, I have taken a backup of cx_translations and it is at: `db10...
[08:19:37] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101021 (KartikMistry) >>! In T160407#3101019, @Marostegui wrote: > Just to be on the safe side, I have taken...
[08:32:12] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101058 (Marostegui) This has been applied in all the core servers, it was really fast, so no delay was apprec...
[08:36:53] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3101065 (Marostegui) We have tested an ALTER table + parallel replication + gtid_domain_id on x1 (T160407). It wa...
[08:56:57] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101139 (Marostegui) dbstore1002 is misbehaving, not sure if it is because of parallel replication or because...
[09:00:43] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101147 (Marostegui) Open>Resolved It might have been caused by parallel replication, I was able to di...
[09:09:53] Blocked-on-schema-change, DBA, ContentTranslation, ContentTranslation-Deployments, and 3 others: Apply wikishared.cx_translations index change - https://phabricator.wikimedia.org/T160407#3101184 (Marostegui) It is definitely multisource + parallel replication. All the multisource slaves are/were...
[09:38:30] DBA: Defragment db1070, db1082, db1087, db1092 - https://phabricator.wikimedia.org/T137191#3101208 (Marostegui) db1070 status update: dewiki has been 100% over night. wikidatawiki has only 3 tables pending which have been since yesterday already: revision, logging, wb_terms ``` root@db1070:/srv/sqldata/wik...
[10:33:33] DBA: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3101348 (Marostegui)
[10:46:43] DBA: Update change tag indexes - https://phabricator.wikimedia.org/T42867#3101400 (TTO) @Jcrespo is this task still relevant today? I know you and @Marostegui have been doing Herculean amounts of cleanup on this type of thing...
[10:48:36] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - https://phabricator.wikimedia.org/T160415#3101402 (matthiasmullie)
[10:50:33] DBA: Update change tag indexes - https://phabricator.wikimedia.org/T42867#3101409 (jcrespo) @TTO I cannot say for sure- only researching it requires querying 20,000 databases to check the current state or implement T104459.
[10:52:53] DBA: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3101415 (Marostegui) The following tables per database do not have a PK and will need to be excluded: frwiki: ``` archive_save categorylinks change_tag click_tracking click_tracking_user_properties cur edit_page_tracking flaggedrev...
[10:55:31] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3101418 (Marostegui)
[11:12:58] DBA, Operations, Patch-For-Review: Decommission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#3101425 (Marostegui)
[11:13:00] DBA: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3101424 (Marostegui)
[11:34:26] DBA: run pt-tablechecksum on s6 - https://phabricator.wikimedia.org/T160509#3101461 (Marostegui) dsns_s6 table looks good, and I have added the replication filter to the rc slaves (db1037 and db2039) to ignore the wmf_checksums table to avoid the issue we already had: T154485#3050716 ``` Replicate_Wild_Igno...
[11:43:26] the bad thing about small chunks is that we now have 20M-row tables
[11:44:54] yeah...
[11:45:02] wait a sec, why are there several dbs on the same db?
[11:45:15] on the checksum table?
[11:45:48] ERROR 1146 (42S02): Table 'zhwiki.__wmf_checksums' doesn't exist
[11:45:53] Yes
[11:46:00] so you put all of them on the same table?
[11:46:05] As I started testing with nlwiki, I just left it there
[11:46:14] Now for s6 and (onwards) I will use ops
[11:46:17] For all the dbs
[11:46:31] I chopped them in the past
[11:46:38] because now it is a pain to query
[11:47:04] yeah, I can do that too, especially for s3
[11:47:10] actually
[11:47:13] for s3
[11:47:18] I would not do that
[11:47:22] why?
[11:47:39] creating 900 new tables is too messy
[11:47:48] haha that is true too
[11:48:00] however, on the larger dbs, creating 6 new tables is easier to query
[11:48:10] yep, that is true
[11:48:18] I will do that for s6 :)
[11:48:50] generating them is the first step
[11:49:16] fixing the data is the important one, the reports are not that useful
[11:49:25] as much as identifying the actual rows with problems
[11:57:22] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3101494 (jcrespo) db1047: ``` db tbl total_rows chunks enwiktionary archive 100 1 idwiki geo_tags 100 1 nlwiki archive 200 2...
[12:13:11] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3101539 (jcrespo) Good and bad news: {P5061} There are no differences in data, but the primary key values differ, probably because those were inserted with the unsa...
[12:17:26] what were the issues with multisource and parallel replication?
[12:17:34] I remember those happening on labs
[12:17:45] basically, servers getting stuck and struggling
[12:17:59] almost impossible to stop slaves
[12:18:08] yeah, I think I mentioned that in the past, I forgot about the dbstores
[12:18:24] maybe we should set up some mysql slaves
[12:18:27] for testing
[12:18:47] whatever it is, it looks more or less confirmed that multisource and parallel do not like each other that much
[12:19:04] I thought it was toku db, however
[12:19:11] but 2001 isn't toku anymore
[12:19:15] had the hope it wouldn't happen with innodb
[12:19:27] yeah, that is why it was a hope :-)
[12:19:44] I haven't looked much up yet, to see if people out there had issues too
[12:20:14] going to grab some food
[12:27:15] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3101548 (jcrespo) I intend to run: ``` (echo "SET SESSION sql_log_bin=0; " && mysqldump --single-transaction -h db1018.eqiad.wmnet enwiktionary archive --where "ar_...
[12:46:56] DBA, Wikimedia-Site-requests, Tracking: Database table cleanup (tracking) - https://phabricator.wikimedia.org/T18660#3101750 (MarcoAurelio)
[13:39:18] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102023 (Marostegui) >>! In T154485#3101548, @jcrespo wrote: > I intend to run: > ``` > (echo "SET SESSION sql_log_bin=0; " && mysqldump --single-transaction -h db...
[13:48:04] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102046 (jcrespo) I have done the above, there is no longer a diff for db1047 on enwiktionary.archive. I will repeat the steps for all detected changes. Some import...
[13:48:14] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3097979 (Marostegui) Table sizes: s1 - 1G (we might need to depoo...
[14:12:29] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102130 (Marostegui) I did a quick test on s7, db2047 and alterin...
[14:19:03] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3097979 (jcrespo) @Marostegui Note the main issue is not size, bu...
[14:22:08] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102174 (Marostegui) >>! In T160415#3102147, @jcrespo wrote: > @M...
[15:17:06] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102395 (Marostegui) a:Marostegui The following hosts have be...
[15:24:57] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102436 (Marostegui) s2 eqiad hosts done so the full shard is now...
[15:37:30] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3102481 (Nuria) ping @jcrespo Is the solution of renaming tables easier on the database end? That would work great for us too. Please let us know.
[15:39:05] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3102484 (jcrespo) Not sure of all the details of that, but yes, renaming a table is an instant operation, while doing a schema change can take up to a week per table and server.
[15:43:29] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102490 (Marostegui) s6 is now fully done - hosts done: dbstore2...
[15:52:44] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3102530 (Nuria) Excellent, let us know what you think is a good time on your end to do this and we will take an outage accordingly. For us, the sooner the better. Ideally (I think) we might...
[16:03:49] DBA, Analytics-Kanban: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3102668 (Ottomata) > change https://github.com/wikimedia/eventlogging/blob/master/eventlogging/jrm.py#L79 for length of varchars to what? (@jcrespo to advise) FYI, the comment on this line sa...
[16:09:12] Blocked-on-schema-change, DBA, Multimedia, MW-1.29-release (WMF-deploy-2017-03-21_(1.29.0-wmf.17)), and 3 others: Review schema changes for T125071 - Add index to image table on all wikis - https://phabricator.wikimedia.org/T160415#3102706 (Marostegui) s7 current hosts done (all codfw): dbstore2...
[16:36:14] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102881 (Marostegui) The last database for s2 was checksummed and these are the results: ``` Differences on db1047 TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_B...
[16:38:56] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102889 (jcrespo) I have actually fixed all the differences found on db1047. I am now working on dbstore1002.
[16:52:35] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102925 (jcrespo) dbstore1002: ``` $ mysql -h $slave nlwiki -e "SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM __wmf_checksums WHERE ( maste...
[16:54:12] DBA, Operations, ops-codfw: es2015 crashed on 2017-03-11 - https://phabricator.wikimedia.org/T160242#3102927 (Papaul) Dell will replace the main board and the CPU. Hi Papaul, Please accept my apologies for the delayed response. I just came into office and hence there was a delay in the response. The...
[16:55:35] DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, TCB-Team, and 2 others: Allow setting the watchlist table to read-only on a per-wiki basis - https://phabricator.wikimedia.org/T160062#3102936 (Lea_WMDE)
[16:57:00] DBA, Operations, ops-codfw: es2015 crashed on 2017-03-11 - https://phabricator.wikimedia.org/T160242#3102940 (jcrespo) Thank you very much! We will shut down the server tomorrow ahead of time- ping us if you have more details about the predicted schedule for it.
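The per-table summaries quoted in the T154485 updates (db, tbl, total_rows, chunks) are cut off in this log, so the exact WHERE clause that was run is not shown. As a rough sketch only, the following reproduces the shape of the replica-difference query described in the pt-table-checksum documentation, run against an in-memory SQLite table so it can be tried without a replica. The table name `checksums`, the trimmed column set, and the sample rows (`wiki_a`, `wiki_b`) are assumptions made for illustration, and SQLite's `IS NULL` comparison stands in for MySQL's `ISNULL()`.

```python
import sqlite3

# Hypothetical stand-in for a checksum table such as __wmf_checksums.
# Only the columns needed for the summary are kept (an assumption).
SCHEMA = """
CREATE TABLE checksums (
    db TEXT NOT NULL,
    tbl TEXT NOT NULL,
    chunk INTEGER NOT NULL,
    this_crc TEXT,
    this_cnt INTEGER,
    master_crc TEXT,
    master_cnt INTEGER,
    PRIMARY KEY (db, tbl, chunk)
)
"""

# A chunk counts as different when the row counts differ, the CRCs differ,
# or exactly one of the two CRCs is missing.
DIFF_QUERY = """
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM checksums
WHERE master_cnt <> this_cnt
   OR master_crc <> this_crc
   OR (master_crc IS NULL) <> (this_crc IS NULL)
GROUP BY db, tbl
ORDER BY db, tbl
"""


def summarize_differences(conn):
    """Return one (db, tbl, total_rows, chunks) row per table with differing chunks."""
    return conn.execute(DIFF_QUERY).fetchall()


def demo():
    """Load made-up sample chunks and return the difference summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        ("wiki_a", "archive", 1, "aaaa", 100, "aaaa", 100),  # identical chunk
        ("wiki_a", "archive", 2, "bbbb", 100, "cccc", 100),  # CRC differs
        ("wiki_b", "geo_tags", 1, "dddd", 99, "dddd", 100),  # row count differs
        ("wiki_b", "geo_tags", 2, None, 50, "eeee", 50),     # CRC missing on replica
    ]
    conn.executemany("INSERT INTO checksums VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return summarize_differences(conn)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

As noted in the channel, a summary like this only narrows things down to tables and chunk counts; identifying and fixing the actual rows is a separate step.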
[17:07:17] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3102980 (jcrespo) dbstore1001 - it is 24 hours behind, so it has to be checked tomorrow again: ``` root@dbstore1001[nlwiki]> SELECT db, tbl, SUM(this_cnt) AS total_...
[17:50:39] DBA, Operations, ops-codfw: es2015 crashed on 2017-03-11 - https://phabricator.wikimedia.org/T160242#3103272 (Marostegui) >>! In T160242#3102940, @jcrespo wrote: > Thank you very much! We will shut down the server tomorrow ahead of time- ping us if you have more details about the predicted schedule for it...
[23:42:06] DBA, Phabricator, Security: Improve privilege separation for phabricator's config files and mysql credentials - https://phabricator.wikimedia.org/T146055#3104558 (mmodell) >>! In T146055#3104429, @Dzahn wrote: > should this ticket have DBA tag? Yeah probably so...