[05:43:09] DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3801865 (bd808)
[05:43:12] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3801864 (bd808) stalled>Open
[06:20:07] DBA, Operations, ops-codfw: db2044: RAID disk with predictive failure - https://phabricator.wikimedia.org/T181775#3801889 (Marostegui)
[06:22:58] Blocked-on-schema-change, DBA, Data-Services, Dumps-Generation, and 2 others: Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3801904 (Marostegui) @Anomie s3 is almost done (T174569#3796157). I will not alter the master on a Friday, as it takes several hou...
[06:25:59] DBA, Operations, ops-codfw: db2044: RAID disk with predictive failure - https://phabricator.wikimedia.org/T181775#3801905 (Marostegui) p:Triage>Normal
[06:52:41] DBA: Meta ticket: Deploy InnoDB compression where possible - https://phabricator.wikimedia.org/T150438#3802064 (Marostegui) Open>Resolved a:Marostegui Going to close this task, as compressing InnoDB is something we are naturally doing where it makes sense, so no need this task to remind it to us...
[06:59:04] DBA, Operations, ops-codfw: Degraded RAID on db2044 - https://phabricator.wikimedia.org/T181779#3802076 (Marostegui) p:Triage>Normal a:Papaul Can we get this replaced @Papaul ? Thanks!
[06:59:35] DBA, Operations, ops-codfw: db2044: RAID disk with predictive failure - https://phabricator.wikimedia.org/T181775#3802080 (Marostegui) Open>Resolved And it finally failed: T181779 Let's follow up there
[08:31:09] When moving to another ticket, I think it is less confusing to close as "merged into other ticket", but it is just a convention T181775#3802080
[08:31:09] T181775: db2044: RAID disk with predictive failure - https://phabricator.wikimedia.org/T181775
[08:32:31] suer, can do
[08:32:34] sure
[08:32:43] it doesn't matter now
[08:33:15] but on other less clear tickets, "Resolved" would be misleading
[08:34:16] technically the predictive failure is resolved
[08:34:19] but anyways, ok
[10:32:13] DBA, Patch-For-Review: Set barracuda InnoDB file format as the default configuration everywhere - https://phabricator.wikimedia.org/T150949#3802258 (Marostegui) I would say we leave the strict mode enablement for another time (T108255), as it requires a lot more coordination and complete the Barracuda en...
[10:36:02] DBA, Patch-For-Review: Set barracuda InnoDB file format as the default configuration everywhere - https://phabricator.wikimedia.org/T150949#3802276 (jcrespo) Marostegui, you misunderstood. This task is about **Innodb strict mode** which is very safe and highly useful (not sql_mode strict all tables/trans...
[10:37:01] DBA, Patch-For-Review: Set barracuda InnoDB file format as the default configuration everywhere - https://phabricator.wikimedia.org/T150949#3802278 (Marostegui) Oh, sorry, indeed. My mind was thinking about sql strict mode!
[10:37:15] DBA, Patch-For-Review: Set barracuda InnoDB file format as the default configuration everywhere - https://phabricator.wikimedia.org/T150949#3802292 (jcrespo) > Oracle recommends enabling innodb_strict_mode when using ROW_FORMAT and KEY_BLOCK_SIZE clauses in CREATE TABLE, ALTER TABLE, and CREATE INDEX sta...
[10:37:48] DBA, Patch-For-Review: Set barracuda InnoDB file format as the default configuration everywhere - https://phabricator.wikimedia.org/T150949#3802293 (Marostegui) Yep yep, I was just thinking about sql strict mode. Sorry!
[10:54:11] DBA, Epic, Patch-For-Review: Decouple roles from mariadb.pp into their own file - https://phabricator.wikimedia.org/T150850#3802356 (Marostegui) https://gerrit.wikimedia.org/r/#/c/394541/
[11:10:24] DBA, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3802437 (elukey) Opened https://phabricator.wikimedia.org/T181784 to fully decom db104[67]
[11:21:11] I am going to upgrade and restart dbstore2001
[11:21:31] ok!
[11:56:52] can I also upgrade only mysql on labsdb1004?
[11:57:00] oh yes!
[11:57:04] :)
[11:57:26] I was actually trying to find out how (if possible) a failover on labsdb1004/1005 would work
[11:57:38] so you read my mind :)
[11:57:51] not very well
[11:57:57] better to just restart
[11:58:23] it should be fast anyways, yep
[12:03:35] it takes very little on an idle server
[12:04:21] 1-2 minutes at most
[12:12:45] meanwhile, dbstore2001:3313 keeps running mysql_upgrade...
[12:13:01] haha hope you did it on a screen
[14:09:39] can I provision db1096:s6?
[14:10:01] nope, it is not ready
[14:10:08] not ready?
[14:10:10] or you mean build it?
[14:10:25] put data there and start it
[14:10:47] I would wait till Monday because I am compressing s4 and I gave all the available memory to s4 to make it faster
[14:10:48] provision, not pool
[14:10:53] ok
[14:10:56] that is why I asked
[14:10:59] so you'd need to stop s4 (and it is still compressing, since yesterday)
[14:11:03] and will finish tomorrow probably
[14:11:11] That is why I didn't build it yet :)
[14:11:18] if we added the possibility of adding comments to tendril
[14:11:26] I wouldn't need to ask on irc every time
[14:11:32] indeed
[14:11:37] that would be a nice feature!
[14:11:47] can I get rid of s3 multiinstance on codfw?
[14:11:53] yes!
[14:11:55] all yours
[14:12:00] I would suggest one thing
[14:12:11] yes?
[14:12:24] Those hosts in codfw that you will free up, maybe put them back here?: https://phabricator.wikimedia.org/T170662
[14:12:37] you do not need to suggest that
[14:12:42] that should happen
[14:12:47] without you saying it
[14:13:07] well, just in case you were not aware that task is still open :)
[14:13:45] we can use db2078, db2090 and db2092 for misc
[14:14:45] great!
[14:15:04] we'll see
[14:15:22] maybe moving some old hosts to those and then the older for m*
[14:15:37] but making those available
[14:15:45] yep, we have to do a brainstorm to renew all the misc :)
[14:15:51] Once we have all clear
[14:16:10] problem is proxies
[14:16:14] we do not have them yet
[14:16:29] so we will have to do some numbers
[14:16:35] yeah
[14:17:33] worst case scenario, we set up misc_multiinstance (we will eventually call them misc, as all should be converted to it)
[14:18:03] and we set up s1+s2, s3; s3, s5; s5, s1+s2
[14:18:28] after cleanup I do not think s2 makes much sense on its own
[14:18:32] oh, that'd be a good idea, some multiinstance for misc!
[14:18:55] unlike s3 (phab, unreliable), s4 (log, unreliable)
[14:19:27] I will start by freeing db2092
[14:19:31] cool :)
[14:19:38] and moving s1 to db2085
[14:20:44] sounds good yeah, make sure to update https://phabricator.wikimedia.org/T178359 when you have done the move :)
[14:20:49] yes
[14:21:03] gotta go, will be back later!
[14:21:08] bye
[15:21:17] I made some tunings so it is easier to detect replication group issues: https://logstash.wikimedia.org/goto/08bf79894e5ba9d25966ef464530315d
[15:46:21] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3803102 (jcrespo)
[15:46:49] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3780959 (jcrespo)
[15:48:25] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3803121 (jcrespo)
[18:11:26] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3803574 (jcrespo)
[18:11:29] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3803575 (jcrespo)
[18:14:32] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3803600 (jcrespo) So the summary of changes: db2085 (s3) and db2092 (s3) have disappeared (data has not been physically deleted, but mysql is no longer running). db2092(s1) has been moved to db2085 (aga...
[18:15:25] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3803601 (jcrespo) Latest updates: T178359#3803600, reflected on the description already.
[18:23:28] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3803612 (bd808) I used this query to collect a description of the indexes on enwiki tables from both labsdb1001 and labsdb1011:...
[18:30:16] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3803648 (jcrespo) `watchlist_count` is T59617#3070203. `__wmf_checksums` can be ignored, it is an ops-only table (to check data...
[18:38:03] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3803698 (bd808) For wikidatawiki, the substantive diff is: ```lang=diff +wb_items_per_site wb_ips_site_page ips_site...
[18:45:56] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3803773 (Marostegui) Also lately there have been at least 3-4 schema changes that never arrived to labsdb1001 as it is in read o...
[18:53:59] DBA, Data-Services, User-bd808, cloud-services-team (Kanban): Determine schema differences between labsdb1001 and labsdb1009 - https://phabricator.wikimedia.org/T177223#3803808 (jcrespo)