[07:12:09] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3066798 (Marostegui) dbstore2002: ``` root@dbstore2002.codfw.wmnet[enwiki]> show create table revision\G *************************** 1. row **************************...
[07:22:10] DBA, Patch-For-Review: Rampant differences in indexes on enwiki.revision across the DB cluster - https://phabricator.wikimedia.org/T132416#3066804 (Marostegui) The master of codfw (db2016) is now being altered
[07:31:53] DBA, Epic: Meta DBA ticket for the DC switchover - https://phabricator.wikimedia.org/T155099#3066813 (Marostegui)
[07:31:55] DBA: Review capacity on codfw - https://phabricator.wikimedia.org/T155102#3066808 (Marostegui) Open>Resolved a:Marostegui I believe this is no longer necessary as we believe we are good to go for the DC switchover with the current capacity. If we get the new hardware for codfw on time (T158669)...
[08:15:37] Blocked-on-schema-change, DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Schema changes for expiring user groups - https://phabricator.wikimedia.org/T155605#3066853 (Marostegui) Open>Resolved a:Marostegui s3 is missing the column on: `closed_zh_twwiki` which I...
[08:29:51] Blocked-on-schema-change, DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Schema changes for expiring user groups - https://phabricator.wikimedia.org/T155605#3066863 (TTO) Thank you, Manuel, for your work on this! Will these schema changes automatically percolate to the La...
[08:33:42] Blocked-on-schema-change, DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Schema changes for expiring user groups - https://phabricator.wikimedia.org/T155605#3066864 (Marostegui) >>! In T155605#3066863, @TTO wrote: > Thank you, Manuel, for your work on this! You are welcom...
[08:34:31] Blocked-on-schema-change, DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Schema changes for expiring user groups - https://phabricator.wikimedia.org/T155605#3066866 (jcrespo) @TTO the db changes should have included labs physical table change already, but to make them visi...
[08:51:14] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066902 (Marostegui)
[08:52:28] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066917 (jcrespo) I am already taking care of RCs.
[08:54:45] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066918 (jcrespo) e.g.: ``` $ mysql -h db1037.eqiad.wmnet frwiki -e "SHOW CREATE TABLE revision\G" | grep KEY PRIMARY KEY (`rev_id`,`rev_user`), KEY `rev_timestamp` (`...
[08:56:30] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066921 (Marostegui) Lovely, so we will "ignore" the rc slaves in this task then!
[09:00:49] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066926 (Marostegui)
[09:22:35] DBA, Labs, Labs-Infrastructure, Operations, Patch-For-Review: Migrate labsdb1005/1006/1007 to jessie - https://phabricator.wikimedia.org/T123731#3066980 (Marostegui)
[09:22:39] DBA, Labs, Labs-Infrastructure, Operations, and 2 others: labsdb1005 (mysql) maintenance for reimage - https://phabricator.wikimedia.org/T157358#3066977 (Marostegui) Open>Resolved a:Marostegui I am closing this as nothing has been reported so far. If something arises, feel free to reo...
[09:23:48] DBA, Datasets-General-or-Unknown, Labs, Labs-Infrastructure: Rebuild old timestamp format tables - https://phabricator.wikimedia.org/T151607#3066984 (Marostegui) p:Normal>Low
[09:25:08] DBA: Drop echo tables from local wiki databases - https://phabricator.wikimedia.org/T153638#3066986 (Marostegui) I want to start working on this slowly in the background as doing T136428 was confusing and painful with so many echo tables that were all over the shards and not really needed.
[09:25:56] Blocked-on-schema-change, DBA, MW-1.28-release-notes, Patch-For-Review: Clean up revision UNIQUE indexes - https://phabricator.wikimedia.org/T142725#3066990 (Marostegui)
[09:25:59] DBA: Rampant differences in indexes and PK on s6 (frwiki, jawiki, ruwiki) for revision table - https://phabricator.wikimedia.org/T159414#3066989 (Marostegui)
[10:01:49] DBA, Epic: Meta ticket: The future of multi source replication slaves vs multi instance ones. - https://phabricator.wikimedia.org/T159423#3067139 (Marostegui)
[10:27:16] DBA: convert dbstore1001 to InnoDB compressed by importing db shards to it - https://phabricator.wikimedia.org/T159430#3067279 (jcrespo)
[10:27:33] DBA: convert dbstore1001 to InnoDB compressed by importing db shards to it - https://phabricator.wikimedia.org/T159430#3067264 (jcrespo)
[10:27:35] DBA, Epic: Meta ticket: The future of multi source replication slaves vs multi instance ones. - https://phabricator.wikimedia.org/T159423#3067281 (jcrespo)
[10:27:40] DBA, Labs, Patch-For-Review: Add and sanitize s2, s4, s5, s6 and s7 to sanitarium2 and new labsdb hosts - https://phabricator.wikimedia.org/T153743#3067283 (jcrespo)
[10:29:03] DBA: convert dbstore1001 to InnoDB compressed by importing db shards to it - https://phabricator.wikimedia.org/T159430#3067264 (jcrespo)
[11:19:14] DBA: Investigate db1047 replication lag - https://phabricator.wikimedia.org/T159266#3067442 (jcrespo) Resolved>Open
[11:24:59] DBA: Investigate db1047 replication lag - https://phabricator.wikimedia.org/T159266#3067449 (jcrespo) ``` BBU status for Adapter: 1 BatteryType: BBU Voltage: 3788 mV Current: 0 mA Temperature: 39 C Battery State: Failed BBU Firmware Status: Charging Status : None Voltage...
[12:48:05] DBA, Operations, ops-eqiad: Investigate db1047 replication lag - https://phabricator.wikimedia.org/T159266#3067703 (jcrespo) We need to change the battery of the second controller (number 1) and disable auto-learning there (it was only disabled on number 0). For the first part we need @Cmjohnson, ne...
[12:48:57] DBA, Operations, ops-eqiad: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3067709 (jcrespo)
[12:49:09] DBA, Operations, ops-eqiad: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3062225 (jcrespo) a:jcrespo>None
[12:49:21] DBA, Operations, ops-eqiad: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3062225 (jcrespo) p:Triage>Normal
[13:08:23] DBA, Operations, ops-eqiad: db1047 BBU RAID issues (was: Investigate db1047 replication lag) - https://phabricator.wikimedia.org/T159266#3067736 (Marostegui) I have disabled the auto-learn mode for that controller - I have not set it to "2" (warn via an event) because we are not really using it: ```...
[13:45:57] DBA, Availability: Look into Maria 10 parallel-replication - https://phabricator.wikimedia.org/T85266#942877 (Marostegui) We would need to enable parallel replication for: T130067
[13:58:48] DBA, Availability: Look into Maria 10 parallel-replication - https://phabricator.wikimedia.org/T85266#3067809 (jcrespo) This thread is more about group-commit and several synchronous commit techniques/features. While GTID-domain based replication is parallel replication, as Aaron said above, it is not go...
[15:12:44] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3067897 (jcrespo) The status of this may not be 100% clear- Manuel did all the necessary testing to check this ca...
[15:17:38] I need some extra background task- I am going to try to reimage db1001
[15:17:50] good luck :p
[15:17:53] shout if there is a problem with that
[15:18:21] it is very old, but I need to upgrade mysql anyway- reimaging will be faster as it only has 300GB
[15:18:48] ah cool :)
[15:19:05] if you have time, not sure if you have yet to clean up es1017?
[15:19:15] sure, I can do that
[15:19:23] and try to stick to dbstore1001 now
[15:19:29] *we should
[15:20:00] yeah
[15:20:02] makes sense
[15:20:06] I will clean es1017 in a bit
[15:20:17] there is no rush
[15:20:21] just a reminder
[15:36:41] I have cleaned up screens + the old files in srv/tmp - I haven't deleted dbstore1001 files, just in case
[15:36:48] maybe we can do that in a few days?
[15:40:56] yay
[16:13:28] DBA, Monitoring: m1 slaves all broke replication due to bacula.DelCandidates temporary table - https://phabricator.wikimedia.org/T158764#3068074 (jcrespo) db1001 has been repooled without the replication filter again. I will keep it there for a second, and if the replication is stable, I will retire the...
[16:16:02] DBA, Patch-For-Review: run pt-table-checksum before decommissioning db1015, db1035,db1044,db1038 - https://phabricator.wikimedia.org/T154485#3068091 (Marostegui) Today's checksums idwiki - after resuming it today, db1047 has differences on one table: idwiki.geo_tags 1 0 1 PRIMARY 988797 1916637 itwiki -...
[16:16:34] DBA, Operations, Patch-For-Review: Followup for TLS MariaDB server roll-out - https://phabricator.wikimedia.org/T157702#3068092 (jcrespo) m1 slave db1001 has been restarted and TLS enabled.
[16:49:02] DBA, Community-Tech, MediaWiki-Categories, Patch-For-Review: Increase size of categorylinks.cl_collation column - https://phabricator.wikimedia.org/T158724#3068146 (Bawolff) I think there is too much fuss being made over this bug. As a temporary solution until some larger table refactoring takes...
[17:06:04] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3068203 (Addshore) @Marostegui @jcrespo Do we have an estimation of how long each wiki would need to be in a watc...
[17:09:54] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3068213 (jcrespo) Based on manuel's explanation, 10 minutes for enwiki for the actual alter, 10 minutes for the c...
[17:43:22] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3068364 (Marostegui) I will get: https://gerrit.wikimedia.org/r/#/c/340130/ deployed on Monday by the way which i...
[17:45:50] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3068373 (jcrespo) >>! In T130067#3068213, @jcrespo wrote: > Based on manuel's explanation, 10 minutes for enwiki...
[17:47:36] Blocked-on-schema-change, DBA, Expiring-Watchlist-Items, MediaWiki-Watchlist, and 3 others: Add wl_id to watchlist tables on production dbs - https://phabricator.wikimedia.org/T130067#3068382 (Addshore) As far as I am aware there would be no reason that this couldn't be done in batches. We (WMDE)...
[17:52:08] DBA, Monitoring: m1 slaves all broke replication due to bacula.DelCandidates temporary table - https://phabricator.wikimedia.org/T158764#3068394 (jcrespo) Open>Resolved No replication filters anywhere, things look good.
[20:49:39] DBA, MediaWiki-User-blocking, Community-Tech-Sprint: Do test queries for range contributions to gauge performance of using different tables - https://phabricator.wikimedia.org/T156318#3068954 (MusikAnimal) >>! In T156318#3063773, @jcrespo wrote: > Why 255 bytes for the ip address? I went off of `cu_c...
[23:11:39] DBA, Labs, Labs-Infrastructure: Data integrity issue with enwiki_p user_groups on Wikimedia Tool Labs (missing rows) - https://phabricator.wikimedia.org/T159493#3069402 (MZMcBride)