[01:00:46] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3655999 (EddieGP) a:EddieGP >>! In T176754#3655920, @Dzahn wrote: > ..of...
[02:16:48] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3656030 (Legoktm) Open>declined I agree with T176754#3636245 and am...
[05:24:56] DBA, Operations, ops-eqiad: Degraded RAID on db1056 - https://phabricator.wikimedia.org/T177171#3656157 (Marostegui) Open>Resolved Raid back to optimal - thank you Chris!: ``` root@db1056:~# megacli -LDPDInfo -aAll Adapter #0 Number of Virtual Disks: 1 Virtual Drive: 0 (Target Id: 0) Name...
[05:28:16] Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3656159 (Marostegui) Sure! Just let me know if you need anything...
[05:32:50] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3656161 (Marostegui) >>! In T153033#3653939, @Nuria wrote: > @marostegui: let's put them on a mediawiki-archive database, the staging database (if I am not mistaken) has open permits for everyone to delete /up...
[05:35:48] DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656163 (Marostegui)
[05:44:24] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3656164 (Marostegui)
[06:10:27] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3656189 (Marostegui)
[06:11:03] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3635130 (Marostegui)
[06:31:53] DBA, Patch-For-Review: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679#3656200 (Marostegui)
[06:34:43] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3656202 (Paladox) Bump.
[06:36:11] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3628916 (Marostegui) >>! In T176532#3656202, @Paladox wrote: > Bump. Hey Paladox Chec...
[07:02:33] DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656247 (Marostegui)
[07:15:31] DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656262 (Marostegui)
[07:17:37] DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656264 (Marostegui)
[07:21:57] elukey: https://phabricator.wikimedia.org/T168303#3653222 and following. ok with the plan?
[07:22:20] 1TB back!!!
[07:22:32] marostegui: and hopefully replication
[07:22:55] jynus: I am super ok, thanks!
[07:22:56] jynus: I guess your last comment means dbstore1001?
[07:23:01] (on the ticket)
[07:25:02] yes
[07:31:09] thanks a lot people for all the help on dbstore1002
[07:38:58] I always say we have no problem helping
[07:39:14] but that is the key: I am helping your team, you own the service
[08:20:12] yes completely agree
[08:21:04] BTW, marostegui s5 backups were created flawlessly on dbstore2001 for s5 during the night
[08:21:37] dbstore1001 is choking trying to do something
[08:23:11] trumping into start and stopping all slaves, which conflicts with itself (even if delayed replication is disabled)
[08:27:34] jynus: great news about dbstore2001!! :)
[08:29:33] DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656417 (Marostegui)
[08:30:31] DBA, Operations, ops-codfw: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656422 (Marostegui) a:Papaul db2010 is ready to be fully decommissioned by @Papaul
[08:40:24] so I am thinking of creating a /srv/backups/ logical + raw + binlog / in_progress + latest + 24_hours
[09:33:23] DBA, Patch-For-Review: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679#3656592 (Marostegui)
[09:50:52] I am going to upgrade labsdbs to validate the new package and test the rolling restart workflow
[09:51:09] mmm
[09:51:13] let me check 1010
[09:51:16] because I was altering it
[09:51:26] oh, ok
[09:51:30] I can wait
[09:51:40] 1010 was actually upgraded already
[09:51:42] it should be done in 2-3 hours I think
[09:51:46] I am only touching 1010
[09:51:51] I was going to do 9 and 11
[09:51:53] ah
[09:51:55] then go ahead :)
[09:52:20] labsdb1009 10.1.25
[09:52:27] labsdb1010 10.1.28
[09:52:36] labsdb1011 10.1.25
[09:53:03] ah :)
[09:53:04] nice
[09:53:11] so 1009 and 1011 I am not touching
[09:53:55] see also https://gerrit.wikimedia.org/r/#/c/382144/
[09:54:17] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3656638 (Paladox) @Marostegui oh thanks. Is there a way we can fix this please? As it w...
[10:01:15] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3628916 (jcrespo) > As it was working before It wasn't working before- there was a sec...
[10:35:16] I am thinking of breaking s4 replication so that it cannot start back
[10:35:26] on dbstore1001
[10:35:45] sure
[10:35:47] each of the 900 backups that are happening run
[10:35:49] it is lagging anyways, no?
[10:36:01] START SLAVE and STOP SLAVE on all shards
[10:36:16] and it takes 30 minutes on commons for that to happen
[10:36:19] buf
[10:36:34] 30*900, imagine our throughput
[10:36:51] if I break replication temporarily, at least that will be instant
[10:37:05] and that may at least let the backups finish
[10:37:06] yeah
[10:37:10] let's just do it
[10:37:13] it is a bug for the START SLAVE
[10:37:21] to happen on an already stopped slave
[10:37:28] on --slave-info
[10:38:00] combined with broken s5 and s4 replication threads
[10:38:15] doing full table scans instead of indexes for writes
[10:38:20] it is making things not work
[10:38:30] so I can add a row on recentchanges
[10:38:40] so replication breaks
[10:38:53] and then delete it after backups finish
[10:39:05] that should stop the sql thread for good
[10:41:41] or maybe even disconnect it and then configure it
[10:41:48] but yes, breaking it would also work
[10:41:49] oh, that is actually easier
[10:41:53] thanks
[10:42:07] although it is dangerous
[10:42:18] because the async STOP/START
[10:42:39] probably I can STOP;SHOW;RESET?
[10:42:45] yeah
[10:42:51] make sure to do the set default_connection bla bla bla
[10:42:52] or rely on the error log for the coords?
[10:42:53] XD
[10:43:03] i would do a stop; show; reset
[10:43:21] if it starts in the middle, it will fail?
[10:43:44] maybe you ca
[10:43:44] can
[10:43:46] disable events
[10:43:50] do all the stuff
[10:43:52] and start events?
[10:43:53] events are disabled
[10:44:01] the problem are the --slave-info
[10:44:05] ah
[10:44:11] that starts the replication even if it is stopped
[10:44:18] i guess the reset would complain if the replication is working
[10:45:30] STOP SLAVE 's4'; SHOW SLAVE 's4' STATUS; RESET MASTER 's4' ALL;
[10:45:59] if it doesn't work, I will go with the breaking replication plan :-)
[10:46:10] that command looks good to me
[10:46:36] i normally use set default_master_connection at the start, just to be sure
[10:46:39] but that is personal manias
[10:46:52] so it is a combination of what I think is a backup bug from mysqldump
[10:47:10] and s4 lagging due to full table scans for some tables
[10:47:27] I think s5 was doing that but I possibly fixed it by reconstructing some tables
[10:47:49] but I cannot do that for s4 in the middle of the backups running
[10:47:55] yeah
[10:48:07] dbstore1001 has reached its performance limit anyways
[10:48:14] with all the extra load from s5..
[10:48:39] it is mostly the crashes and tokudb
[10:48:53] plus the purging lag
[10:48:59] but dbstore2001 didn't have toku and we saw it happening
[10:49:07] crashes
[10:49:16] plus one replica set goes wrong
[10:49:19] and all are affected
[10:49:19] that is true, since dbstore2001 crash..it was never the same
[10:49:21] because transactions
[10:49:31] affect all dbs
[10:49:38] btw
[10:49:42] did you see this
[10:49:53] ?
[10:50:03] https://phabricator.wikimedia.org/T149418#3653125
[10:50:35] yeah, I was waiting for the :-) "Time allowing, it'd be nice to try it in a test environment just to see if it actually works."
[10:50:39] haha
[10:50:47] remember I was the one to comment on the ticket "that will help us"
[10:51:10] oh yeah you are subscribed to the mariadb ticket, i forgot :)
[10:51:19] https://jira.mariadb.org/browse/MDEV-12012?focusedCommentId=99569&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-99569
[10:51:43] in fact, you commented right below
[10:52:04] either you forgot or you thought that was a different jaime
[10:52:07] :-D
[10:52:14] hahaha
[10:52:26] hey I have been on holidays!
[10:52:36] :)
[10:52:54] You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''s4' ALL' at line 1
[10:52:56] was this jaime with beard or without beard commenting?
[10:54:02] just try with default_master_connection=s4
[10:54:06] and then the normal commands
[10:54:12] yeah, I just did that
[10:55:18] I have shared the output on the typical place
[10:55:49] that should unlock the backup process
[10:56:07] although we could directly drop s4 and s5 and forget about that
[10:56:29] is s4 on dbstore2001? if yes, we could run a manual backup run
[10:56:43] I can see the output now :)
[10:56:56] "see"
[10:57:05] s4 is not in dbstore2001
[10:57:28] we could test the remote backup and test it
[10:57:49] at least by next week
[10:57:55] OR
[10:58:02] setup multi-instance now
[10:58:20] (when backup finishes)
[10:58:31] you mean copying s4 on dbstore2001?
[10:58:37] no
[10:58:52] setting up s4 instance on dbstore1001
[10:58:57] aaaah
[10:59:02] which is a bit more involved
[10:59:33] but we have been delaying it for months
[10:59:44] yeah, that is also a good idea
[10:59:49] we skipped the innodb conversion
[11:00:00] but I hope multi-instance is the way
[11:00:11] i really think it is
[11:00:36] and if some sets keep having issues like these
[11:00:38] i would start converting 1001 to multi-instance indeed
[11:00:54] it "should" be easy
[11:01:01] just a transfer
[11:01:09] yeah
[11:01:20] no filtering or engine conversion
[11:01:20] we could actually convert the whole dbstore1001 maybe in a whole week
[11:01:26] (maybe)
[11:01:45] although we have the lack of performance
[11:01:57] seen on dbstore2001
[11:02:08] yeah, but with delayed replication, i think we could be ok
[11:02:13] (which is not what we want)
[11:02:15] but for now...
[11:02:26] we should talk to mark and see his opinion on purchases
[11:02:33] we can do that tomorrow
[11:02:45] having a couple of extra replicas + dbstore could be enough
[11:02:52] as it is our bi weekly meeting with him
[11:04:26] puppet agent -tv
[11:04:33] systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
[11:04:39] systemctl start mariadb
[11:05:22] mysql_upgrade --skip-ssl
[11:05:49] on labsd?
[11:05:51] labs
[11:06:00] on every upgrade
[11:06:14] I was thinking that if I write it I will remember it
[11:06:18] haha
[11:06:22] I do have it on my own notes
[11:06:23] XD
[11:06:38] I am not sure my draining method was too successful
[11:06:46] I think I hard-closed connections
[11:07:01] what did you do to drain them?
[11:07:03] I will try again with the other haproxy
[11:07:22] I tried drain
[11:07:50] but not sure it worked, so I just overwrote the haconfig and reloaded
[11:07:54] what did you do? just disabled it on haproxy?
[11:07:56] ah right
[11:08:11] i guess some people just leave the connection open and reuse it?
[11:08:15] https://github.com/rancher/rancher/issues/8627
[11:08:32] the thing is that drain/weight 0 may not work with us
[11:08:41] because we do not do load balancing, but backup
[11:09:01] so maybe it needs 3 phases, load balancing + drain
[11:09:14] and then removed from the pool?
[11:09:34] I also need to test it with a long running connection
[11:09:49] then we can just script it
[11:10:05] at least I checked connections were moved
[11:10:32] so worst case scenario, connections dropped and reconnected, which is the "right" behaviour
[11:11:16] yeah, i guess there was no error given to an active conneciton
[11:11:18] connection
[11:11:23] just killed an open sleeping one
[11:11:49] I don't know
[11:11:59] I will wait for buffer pool to heat again
[11:12:04] and will reload the proxy
[11:12:25] labsdb1011 10.1.28 516G 6m 1s 325 Yes 38m
[11:12:31] \o/
[11:12:45] I do not want to assume you know less than me
[11:12:56] but you should get familiar with haproxy commands
[11:13:00] (I wasn't)
[11:13:15] and in an emergency, there will be little time to look at manuals
[11:13:33] yeah, I normally do some reload, check the status of a given weight etc
[11:13:44] but normally not more than that
[11:13:44] yeah, reload and check, yes
[11:13:50] exactly,
[11:13:52] same here
[11:14:36] do you normally use more than that?
[11:14:46] no, that is what I am trying to learn
[11:14:58] so this was more of a self reminder
[11:15:01] that I extended to you
[11:15:42] and if possible, have predefined script with "failover to 10" or similar
[11:15:56] oh that'd be nice indeed
[11:15:59] just a "button"
[11:16:12] with e.g. drain, wait X minutes, hard failover
[11:17:08] in the past I think sean used haproxies for master failover
[11:17:29] to minimize read only time
[11:17:38] on core, I mean
[12:42:14] I'm cleaning up ores_classification in enwiki, the deletes will be a little high
[12:42:34] but it's almost finished some small things
[12:42:54] thanks for the heads up
[12:53:42] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3657128 (Marostegui)
[12:54:04] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3638543 (Marostegui)
[12:55:06] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3638544 (Marostegui)
[13:17:22] done now
[13:19:09] I think it would be great if you shrink it to claim some space
[13:20:47] it is currently 5.3G on the master
[13:21:05] let me see how much we can get in codfw
[13:25:13] after the optimize it goes from 5.4 to 784M
[13:25:26] I will run an optimize on codfw with replication
[13:25:33] thanks
[13:25:37] it took only 2 minutes
[13:30:39] Blocked-on-schema-change, DBA, MediaWiki-extensions-ORES, MW-1.29-release (WMF-deploy-2017-04-25_(1.29.0-wmf.21)), and 5 others: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753#3657239 (Marostegui) >>! In T159753#3657238, @Stashbot wrote: > {nav...
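(editor's sketch) The "button" idea from 11:15-11:16 (drain, wait X minutes, hard failover) could look roughly like the plan builder below. This is only a sketch under assumptions: the backend name `mariadb` and the function and class names are hypothetical, and nothing here talks to HAProxy; it only produces the ordered steps. The `set server <backend>/<server> state drain|maint` strings are HAProxy runtime API commands, but as noted at 11:08, drain may not behave as hoped when the second server is configured as a backup rather than load balanced, so this would need testing with a long running connection first.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One action in the plan: an HAProxy runtime command or a pause."""
    kind: str   # "haproxy" or "sleep"
    value: str  # the runtime API command, or the seconds to wait (as text)


def failover_plan(backend: str, server: str, wait_minutes: int) -> List[Step]:
    """Build the drain -> wait -> hard failover sequence for one server.

    `backend` and `server` are whatever names the local haproxy.cfg uses;
    the values used in the example call below are made up for illustration.
    """
    if wait_minutes < 0:
        raise ValueError("wait_minutes must not be negative")
    target = f"{backend}/{server}"
    return [
        # Soft phase: stop handing new connections to the server.
        Step("haproxy", f"set server {target} state drain"),
        # Give existing sessions time to finish on their own.
        Step("sleep", str(wait_minutes * 60)),
        # Hard phase: put the server in maintenance.
        Step("haproxy", f"set server {target} state maint"),
    ]


if __name__ == "__main__":
    for step in failover_plan("mariadb", "labsdb1011", 5):
        print(step.kind, step.value)
```

Keeping the plan as plain data means it can be reviewed or dry-run before anything is sent to the HAProxy admin socket; the sending part is deliberately left out.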
[14:44:19] "ORDER BY RAND() LIMIT 10569" ugh, I think I am going to get sick
[14:52:15] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657440 (MarcoAurelio) declined>Open Sorry, I dare to totally disagr...
[15:00:11] DBA, Wikidata: Migrate wb_terms to using prefixed entity IDs instead of numeric IDs - https://phabricator.wikimedia.org/T114903#3657469 (Ladsgroup)
[15:13:45] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3635590 (jcrespo) This is my (I hope) neutral evaluation of the issue: * Th...
[15:21:59] DBA, Cloud-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3657622 (jcrespo)
[15:22:16] DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#2795855 (jcrespo)
[15:23:07] DBA, Data-Services: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657633 (jcrespo)
[15:23:10] DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#2795855 (jcrespo)
[15:29:30] DBA, Operations, ops-codfw: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657671 (Marostegui)
[15:31:41] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657682 (MarcoAurelio) @jcrespo Thank you. Point number two is what people a...
[15:32:59] DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657688 (MarcoAurelio)
[15:34:09] DBA, Operations, ops-codfw, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657692 (Marostegui)
[15:37:22] DBA, Operations, ops-codfw, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657708 (Marostegui)
[15:40:49] DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657453 (Marostegui) What are we (DBAs) supposed to do here? (Asking as we got added to the ticket :-) )
[15:42:26] DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657728 (MarcoAurelio) @jcrespo so you can evaluate if #dba is needed here as you're also on T176754.
[15:46:02] DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657734 (jcrespo) In my humble opinion, this is the part that is purely #mediawiki-database (actual mediawiki bug, not DBA relat...
[15:48:30] DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657738 (Marostegui) >>! In T177096#3657674, @gerritbot wrote: > Change 382170 had a related patch set uploaded (by Jcrespo; owner:...
[15:57:20] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657784 (jcrespo) Legoktm was totally legitimate about closing the ticket wi...
[16:05:47] DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657823 (jcrespo) This should fix things: ``` root@labsdb1010[enwiki]> ALTER TABLE archive ADD KEY `user_timestamp` (`ar_user`,`ar...
[17:42:35] DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3658252 (jcrespo)
[17:42:38] DBA, Data-Services, Goal, cloud-services-team (FY2017-18): Migrate all users to new Wiki Replica cluster and decommission old hardware - https://phabricator.wikimedia.org/T142807#3658253 (jcrespo)
[17:42:41] DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3658249 (jcrespo) Open>Resolved a:jcrespo @MusikAnimal Your query takes now 3 second cold, 0.13 seconds hot on the new...
[17:50:42] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3658342 (Dzahn) Yea, this should just wait for the proper setup in codfw. I don't see a...
[17:55:57] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658385 (kaldari) >If purging is undesirable on production, which is somethi...
[18:19:45] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658483 (EddieGP) When digging a bit further into this, I found that it was...
[18:25:38] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658511 (MarcoAurelio) Sorry but I don't see it that way. As much as I respe...
[18:46:04] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658551 (kaldari) @EddieGP: I believe there were performance concerns with t...
[18:57:46] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658580 (MarcoAurelio) @kaldari Tool Labs is mentioned here because I discov...
[19:44:28] DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3658782 (MusikAnimal) Lightning fast! :D Many thanks for the prompt assistance
[19:54:39] DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658801 (Joe) So what I extract from the errors is you're trying to connect to db2048 by IP and not by h...
[20:01:13] DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, and 2 others: Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658814 (EddieGP) p:Normal>Low >>! In T176754#3658580, @MarcoAurelio wrote: > The issue here is that MediaWi...
[20:22:13] DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3658840 (demon) p:Triage>Low
[20:25:56] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658843 (demon) >>! In T153033#3656161, @Marostegui wrote: > Just to be clear, you are talking about dbstore1002/db1047? > We also have to keep in mind that there are thousands of tables (two per wiki basically...
[20:33:58] DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658863 (aaron)
[20:39:14] DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658870 (aaron) Also, there is https://bugs.php.net/bug.php?id=74445 :)
[20:47:05] DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658884 (aaron) >>! In T175672#3658801, @Joe wrote: > So what I extract from the errors is you're trying...
[21:08:20] Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3658963 (CKoerner_WMF) I've posted the message to the Commons Vi...
[21:08:38] Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3658965 (CKoerner_WMF)
[21:27:40] DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658983 (Nuria) >Could/should we drop the ones that are completely empty already--assuming some wikis never actually used it. Would that make it more manageable? Yes, please. I think that makes loads of sense.