[01:00:46] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3655999 (EddieGP) a:EddieGP >>! In T176754#3655920, @Dzahn wrote: > ..of...
[02:16:48] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3656030 (Legoktm) Open>declined I agree with T176754#3636245 and am...
[05:24:56] <wikibugs_>	 DBA, Operations, ops-eqiad: Degraded RAID on db1056 - https://phabricator.wikimedia.org/T177171#3656157 (Marostegui) Open>Resolved Raid back to optimal - thank you Chris!: ``` root@db1056:~# megacli -LDPDInfo -aAll  Adapter #0  Number of Virtual Disks: 1 Virtual Drive: 0 (Target Id: 0) Name...
[05:28:16] <wikibugs_>	 Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3656159 (Marostegui) Sure! Just let me know if you need anything...
[05:32:50] <wikibugs_>	 DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3656161 (Marostegui) >>! In T153033#3653939, @Nuria wrote: > @marostegui: let's put them on a mediawiki-archive database, the staging database  (if I am not mistaken) has open permits for everyone to delete /up...
[05:35:48] <wikibugs_>	 DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656163 (Marostegui)
[05:44:24] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3656164 (Marostegui)
[06:10:27] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3656189 (Marostegui)
[06:11:03] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3635130 (Marostegui)
[06:31:53] <wikibugs_>	 DBA, Patch-For-Review: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679#3656200 (Marostegui)
[06:34:43] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3656202 (Paladox) Bump.
[06:36:11] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3628916 (Marostegui) >>! In T176532#3656202, @Paladox wrote: > Bump.  Hey Paladox  Chec...
[07:02:33] <wikibugs_>	 DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656247 (Marostegui)
[07:15:31] <wikibugs_>	 DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656262 (Marostegui)
[07:17:37] <wikibugs_>	 DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656264 (Marostegui)
[07:21:57] <jynus>	 elukey: https://phabricator.wikimedia.org/T168303#3653222 and following. ok with the plan?
[07:22:20] <marostegui>	 1TB back!!! 
[07:22:32] <jynus>	 marostegui: and hopefully replication
[07:22:55] <elukey>	 jynus: I am super ok, thanks!
[07:22:56] <marostegui>	 jynus: I guess your last comment means dbstore1001?
[07:23:01] <marostegui>	 (on the ticket)
[07:25:02] <jynus>	 yes
[07:31:09] <elukey>	 thanks a lot people for all the help on dbstore1002
[07:38:58] <jynus>	 I always say we have no problem helping
[07:39:14] <jynus>	 but that is the key: I am helping your team, you own the service
[08:20:12] <elukey>	 yes completely agree
[08:21:04] <jynus>	 BTW, marostegui: s5 backups were created flawlessly on dbstore2001 during the night
[08:21:37] <jynus>	 dbstore1001 is choking trying to do something
[08:23:11] <jynus>	 it keeps running into starting and stopping all slaves, which conflicts with itself (even if delayed replication is disabled)
[08:27:34] <marostegui>	 jynus: great news about dbstore2001!! :)
[08:29:33] <wikibugs_>	 DBA, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656417 (Marostegui)
[08:30:31] <wikibugs_>	 DBA, Operations, ops-codfw: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3656422 (Marostegui) a:Papaul db2010 is ready to be fully decommissioned by @Papaul
[08:40:24] <jynus>	 so I am thinking of creating a /srv/backups/ logical + raw + binlog / in_progress + latest + 24_hours
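One possible reading of the layout proposed above, as a rough sketch: one directory per backup type (logical, raw, binlog), each holding in_progress, latest and 24_hours subdirectories. The nesting order and the BACKUP_ROOT variable are assumptions; the chat only lists the names and the /srv/backups/ prefix.

```shell
#!/bin/sh
# Sketch only: create the proposed backup layout under a root directory.
# BACKUP_ROOT defaults to a scratch path so the script is safe to try;
# the chat proposes /srv/backups as the real location.
BACKUP_ROOT="${BACKUP_ROOT:-/tmp/backups-sketch}"

create_backup_layout() {
    root="$1"
    # Assumed nesting: <root>/<type>/<state>
    for type in logical raw binlog; do
        for state in in_progress latest 24_hours; do
            mkdir -p "$root/$type/$state"
        done
    done
}

create_backup_layout "$BACKUP_ROOT"
# List the nine resulting directories.
ls -d "$BACKUP_ROOT"/*/*
```

Running it with BACKUP_ROOT=/srv/backups would create the real tree, assuming that reading of the proposal is the intended one.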
[09:33:23] <wikibugs_>	 DBA, Patch-For-Review: Productionize 11 new eqiad database servers - https://phabricator.wikimedia.org/T172679#3656592 (Marostegui)
[09:50:52] <jynus>	 I am going to upgrade labsdbs to validate the new package and test the rolling restart workflow
[09:51:09] <marostegui>	 mmm
[09:51:13] <marostegui>	 let me check 1010
[09:51:16] <marostegui>	 because I was altering it
[09:51:26] <jynus>	 oh, ok
[09:51:30] <jynus>	 I  can wait
[09:51:40] <jynus>	 1010 was actually upgraded already
[09:51:42] <marostegui>	 it should be done in 2-3 hours I think
[09:51:46] <marostegui>	 I am only touching 1010
[09:51:51] <jynus>	 I was going to do 9 and 11
[09:51:53] <marostegui>	 ah
[09:51:55] <marostegui>	 then go ahead :)
[09:52:20] <jynus>	 labsdb1009	10.1.25
[09:52:27] <jynus>	 labsdb1010	10.1.28
[09:52:36] <jynus>	 labsdb1011	10.1.25
[09:53:03] <marostegui>	 ah :)
[09:53:04] <marostegui>	 nice
[09:53:11] <marostegui>	 so 1009 and 1011 I am not touching
[09:53:55] <jynus>	 see also https://gerrit.wikimedia.org/r/#/c/382144/
[09:54:17] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3656638 (Paladox) @Marostegui oh thanks. Is there a way we can fix this please? As it w...
[10:01:15] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3628916 (jcrespo) > As it was working before  It wasn't working before- there was a sec...
[10:35:16] <jynus>	 I am thinking of breaking s4 replication so that it cannot start back
[10:35:26] <jynus>	 on dbstore1001
[10:35:45] <marostegui>	 sure
[10:35:47] <jynus>	 each of the 900 backups that are happening runs
[10:35:49] <marostegui>	 it is lagging anyways, no?
[10:36:01] <jynus>	 START SLAVE and STOP SLAVE on all shards
[10:36:16] <jynus>	 and it takes 30 minutes on commons for that to happen
[10:36:19] <marostegui>	 buf
[10:36:34] <jynus>	 30*900, imagine our throughput
[10:36:51] <jynus>	 if I break replication temporarily, at least that will be instant
[10:37:05] <jynus>	 and that may at least let the backups finish
[10:37:06] <marostegui>	 yeah
[10:37:10] <marostegui>	 let's just do it
[10:37:13] <jynus>	 it is a bug for the START SLAVE
[10:37:21] <jynus>	 to happen on an already stopped slave
[10:37:28] <jynus>	 on --slave-info
[10:38:00] <jynus>	 combined with broken s5 and s4 replication threads
[10:38:15] <jynus>	 doing full table scans instead of indexes for writes
[10:38:20] <jynus>	 it is making things not work
[10:38:30] <jynus>	 so I can add a row on recentchanges
[10:38:40] <jynus>	 so replication breaks
[10:38:53] <jynus>	 and then delete it after backups finish
[10:39:05] <jynus>	 that should stop the sql thread for good
[10:41:41] <marostegui>	 or maybe even disconnect it and then configure it
[10:41:48] <marostegui>	 but yes, breaking it would also work
[10:41:49] <jynus>	 oh, that is actually easier
[10:41:53] <jynus>	 thanks
[10:42:07] <jynus>	 although it is dangerous
[10:42:18] <jynus>	 because the async STOP/START
[10:42:39] <jynus>	 probably I can STOP;SHOW;RESET?
[10:42:45] <marostegui>	 yeah
[10:42:51] <marostegui>	 make sure to do the set default_connection bla bla bla
[10:42:52] <jynus>	 or rely on the error log for the coords?
[10:42:53] <marostegui>	 XD
[10:43:03] <marostegui>	 I would do a stop; show; reset
[10:43:21] <jynus>	 if it starts in the middle, it will fail?
[10:43:44] <marostegui>	 maybe you can
[10:43:46] <marostegui>	 disable events
[10:43:50] <marostegui>	 do all the stuff
[10:43:52] <marostegui>	 and start events?
[10:43:53] <jynus>	 events are disabled
[10:44:01] <jynus>	 the problem is the --slave-info
[10:44:05] <marostegui>	 ah
[10:44:11] <jynus>	 that starts the replication even if it is stopped
[10:44:18] <marostegui>	 I guess the reset would complain if the replication is working
[10:45:30] <jynus>	 STOP SLAVE 's4'; SHOW SLAVE 's4' STATUS; RESET MASTER 's4' ALL;
[10:45:59] <jynus>	 if it doesn't work, I will go with the breaking replication plan :-)
[10:46:10] <marostegui>	 that command looks good to me
[10:46:36] <marostegui>	 i normally use set default_master_connection at the start, just to be sure
[10:46:39] <marostegui>	 but that is just a personal quirk
[10:46:52] <jynus>	 so it is a combination of what I think is a backup bug from mysqldump
[10:47:10] <jynus>	 and s4 lagging due to full table scans for some tables
[10:47:27] <jynus>	 I think s5 was doing that but I possibly fixed it by reconstructing some tables
[10:47:49] <jynus>	 but I cannot do that for s4 in the middle of the backups running
[10:47:55] <marostegui>	 yeah
[10:48:07] <marostegui>	 dbstore1001 has reached its performance limit anyways
[10:48:14] <marostegui>	 with all the extra load from s5..
[10:48:39] <jynus>	 it is mostly the crashes and tokudb
[10:48:53] <jynus>	 plus the purging lag
[10:48:59] <marostegui>	 but dbstore2001 didn't have toku and we saw it happening
[10:49:07] <jynus>	 crashes
[10:49:16] <jynus>	 plus one replica set goes wrong
[10:49:19] <jynus>	 and all are affected
[10:49:19] <marostegui>	 that is true, since the dbstore2001 crash... it was never the same
[10:49:21] <jynus>	 because transactions
[10:49:31] <jynus>	 affect all dbs
[10:49:38] <marostegui>	 btw
[10:49:42] <marostegui>	 did you see this
[10:49:53] <jynus>	 ?
[10:50:03] <marostegui>	 https://phabricator.wikimedia.org/T149418#3653125
[10:50:35] <jynus>	 yeah, I was waiting for the :-) "Time allowing, it'd be nice to try it in a test environment just to see if it actually works."
[10:50:39] <marostegui>	 haha
[10:50:47] <jynus>	 remember I was the one to comment on the ticket "that will help us"
[10:51:10] <marostegui>	 oh yeah you are subscribed to the mariadb ticket, i forgot :)
[10:51:19] <jynus>	 https://jira.mariadb.org/browse/MDEV-12012?focusedCommentId=99569&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-99569
[10:51:43] <jynus>	 in fact, you commented right below
[10:52:04] <jynus>	 either you forgot or you thought that was a different jaime
[10:52:07] <jynus>	 :-D
[10:52:14] <marostegui>	 hahaha
[10:52:26] <marostegui>	 hey I have been on holidays!
[10:52:36] <marostegui>	 :)
[10:52:54] <jynus>	 You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''s4' ALL' at line 1
[10:52:56] <marostegui>	 was this jaime with beard or without beard commenting?
[10:54:02] <marostegui>	 just try with default_master_connection=s4
[10:54:06] <marostegui>	 and then the normal commands
[10:54:12] <jynus>	 yeah, I just did that
[10:55:18] <jynus>	 I have shared the output on the typical place
[10:55:49] <jynus>	 that should unlock the backup process
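The "stop; show; reset" sequence discussed above, in the variant that sets default_master_connection first, can be sketched as below. It only prints the SQL for a named MariaDB multi-source connection; nothing is executed. The function name is hypothetical. Note that RESET SLAVE ALL discards the connection's configuration, which is why the SHOW SLAVE STATUS output (the replication coordinates) should be saved before running the last statement.

```shell
#!/bin/sh
# Sketch only: print the SQL to stop and remove one named replication
# connection on a MariaDB multi-source replica. Review the output and run
# it by hand in a client session; this script does not touch any server.

reset_connection_sql() {
    conn="$1"
    cat <<EOF
SET @@default_master_connection = '$conn';
STOP SLAVE;
SHOW SLAVE STATUS\G
RESET SLAVE ALL;
EOF
}

# Example: the s4 connection from the discussion above.
reset_connection_sql s4
```

Setting the session variable first means every following statement applies to that one connection, so the other replication connections on the host are not touched.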
[10:56:07] <jynus>	 although we could directly drop s4 and s5 and forget about that
[10:56:29] <jynus>	 is s4 on dbstore2001? if yes, we could run a manual backup run
[10:56:43] <marostegui>	 I can see the output now :)
[10:56:56] <jynus>	 "see"
[10:57:05] <marostegui>	 s4 is not in dbstore2001
[10:57:28] <jynus>	 we could test the remote backup and test it
[10:57:49] <jynus>	 at least by next week
[10:57:55] <jynus>	 OR
[10:58:02] <jynus>	 setup multi-instance now
[10:58:20] <jynus>	 (when backup finishes)
[10:58:31] <marostegui>	 you mean copying s4 on dbstore2001?
[10:58:37] <jynus>	 no
[10:58:52] <jynus>	 setting up s4 instance on dbstore1001
[10:58:57] <marostegui>	 aaaah
[10:59:02] <jynus>	 which is a bit more involved
[10:59:33] <jynus>	 but we have been delaying it for months
[10:59:44] <marostegui>	 yeah, that is also a good idea
[10:59:49] <jynus>	 we skipped the innodb conversion
[11:00:00] <jynus>	 but I hope multi-instance is the way
[11:00:11] <marostegui>	 i really think it is
[11:00:36] <jynus>	 and if some sets keep having issues like these
[11:00:38] <marostegui>	 i would start converting 1001 to multi-instance indeed
[11:00:54] <jynus>	 it "should" be easy
[11:01:01] <jynus>	 just a transfer
[11:01:09] <marostegui>	 yeah
[11:01:20] <jynus>	 no filtering or engine conversion
[11:01:20] <marostegui>	 we could actually convert the whole dbstore1001 maybe in a whole week
[11:01:26] <marostegui>	 (maybe)
[11:01:45] <jynus>	 although we have the lack of performance
[11:01:57] <jynus>	 seen on dbstore2001
[11:02:08] <marostegui>	 yeah, but with delayed replication, i think we could be ok
[11:02:13] <marostegui>	 (which is not what we want)
[11:02:15] <marostegui>	 but for now...
[11:02:26] <jynus>	 we should talk to mark and see his opinion on purchases
[11:02:33] <marostegui>	 we can do that tomorrow
[11:02:45] <jynus>	 having a couple of extra replicas + dbstore could be enough
[11:02:52] <marostegui>	 as it is our biweekly meeting with him
[11:04:26] <jynus>	 puppet agent -tv
[11:04:33] <jynus>	 systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
[11:04:39] <jynus>	 systemctl start mariadb
[11:05:22] <jynus>	 mysql_upgrade --skip-ssl
[11:05:49] <marostegui>	 on labs?
[11:06:00] <jynus>	 on every upgrade
[11:06:14] <jynus>	 I was thinking that if I write it I will remember it
[11:06:18] <marostegui>	 haha
[11:06:22] <marostegui>	 I do have it on my own notes
[11:06:23] <marostegui>	 XD
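The upgrade commands pasted above can be wrapped in a small dry-run script, as a sketch. The run helper, the DRY_RUN variable and the final unset-environment step are assumptions added for illustration; stopping the server and installing the new package happen before these steps and are not shown in the chat, and neither is restarting replication afterwards.

```shell
#!/bin/sh
# Sketch only: the per-host post-upgrade steps, wrapped in a dry-run helper.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

upgrade_steps() {
    run puppet agent -tv
    # Keep replication stopped while the server comes up on the new version.
    run systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
    run systemctl start mariadb
    run mysql_upgrade --skip-ssl
    # Assumption, not in the chat: clear the override afterwards so that a
    # later restart starts replication normally again.
    run systemctl unset-environment MYSQLD_OPTS
}

upgrade_steps
```

With DRY_RUN=0 the same function would execute the commands for real, which is only sensible on a host that has already been depooled.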
[11:06:38] <jynus>	 I am not sure my draining method was too successful
[11:06:46] <jynus>	 I think I hard-closed connections
[11:07:01] <marostegui>	 what did you do to drain them?
[11:07:03] <jynus>	 I will try again with the other haproxy
[11:07:22] <jynus>	 I tried drain
[11:07:50] <jynus>	 but not sure it worked, so I just overwrote the haproxy config and reloaded
[11:07:54] <marostegui>	 what did you do? just disabled it on haproxy?
[11:07:56] <marostegui>	 ah right
[11:08:11] <marostegui>	 i guess some people just leave the connection open and reuse it?
[11:08:15] <jynus>	 https://github.com/rancher/rancher/issues/8627
[11:08:32] <jynus>	 the thing is that drain/weight 0 may not work for us
[11:08:41] <jynus>	 because we do not do load balancing, but backup
[11:09:01] <jynus>	 so maybe it needs 3 phases, load balancing + drain
[11:09:14] <jynus>	 and then removed from the pool?
[11:09:34] <jynus>	 I also need to test it with a long running connection
[11:09:49] <jynus>	 then we can just script it
[11:10:05] <jynus>	 at least I checked connections were moved
[11:10:32] <jynus>	 so worst case scenario, connections dropped and reconnected, which is the "right" behaviour
[11:11:16] <marostegui>	 yeah, i guess there was no error given to an active connection
[11:11:23] <marostegui>	 just killed an open sleeping one
[11:11:49] <jynus>	 I don't know
[11:11:59] <jynus>	 I will wait for the buffer pool to warm up again
[11:12:04] <jynus>	 and will reload the proxy
[11:12:25] <jynus>	 labsdb1011	10.1.28	516G	6m	1s	325	Yes	38m	
[11:12:31] <marostegui>	 \o/
[11:12:45] <jynus>	 I do not want to assume you know less than me
[11:12:56] <jynus>	 but you should get familiar with haproxy commands
[11:13:00] <jynus>	 (I wasn't)
[11:13:15] <jynus>	 and in an emergency, there will be little time to look at manuals
[11:13:33] <marostegui>	 yeah, I normally do some reload, check the status of a given weight etc
[11:13:44] <marostegui>	 but normally not more than that
[11:13:44] <jynus>	 yeah, reload and check, yes
[11:13:50] <jynus>	 exactly,
[11:13:52] <jynus>	 same here
[11:14:36] <marostegui>	 do you normally use more than that?
[11:14:46] <jynus>	 no, that is what I am trying to learn
[11:14:58] <jynus>	 so this was more of a self reminder
[11:15:01] <jynus>	 that I extended to you
[11:15:42] <jynus>	 and if possible, have a predefined script with "failover to 10" or similar
[11:15:56] <marostegui>	 oh that'd be nice indeed
[11:15:59] <marostegui>	 just a "button"
[11:16:12] <jynus>	 with e.g. drain, wait X minutes, hard failover
[11:17:08] <jynus>	 in the past I think sean used haproxies for master failover
[11:17:29] <jynus>	 to minimize read only time
[11:17:38] <jynus>	 on core, I mean
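The "drain, wait X minutes, hard failover" idea above could be scripted roughly as follows, using the HAProxy runtime commands over the admin socket. This is a sketch under assumptions: the socket path, the backend name mariadb and the helper names are hypothetical, and socat is assumed to be installed. As noted in the discussion, drain may not behave as hoped when the other server is configured as a backup rather than load-balanced, and the sketch does not address that.

```shell
#!/bin/sh
# Sketch only: "drain, wait X minutes, hard failover" for one HAProxy server.
# HAPROXY_SOCKET, the backend name and the server name are hypothetical
# placeholders; DRY_RUN=1 (the default) prints the runtime commands only.
DRY_RUN="${DRY_RUN:-1}"
HAPROXY_SOCKET="${HAPROXY_SOCKET:-/run/haproxy/haproxy.sock}"

haproxy_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$1" | socat stdio "unix-connect:$HAPROXY_SOCKET"
    fi
}

failover_from() {
    target="$1"        # backend/server, e.g. mariadb/labsdb1011 (hypothetical)
    wait_seconds="$2"  # grace period for clients to finish
    # 1. Stop sending new connections, keep established ones.
    haproxy_cmd "set server $target state drain"
    # 2. Give existing sessions time to finish (skipped in dry-run mode).
    [ "$DRY_RUN" = "1" ] || sleep "$wait_seconds"
    # 3. Hard failover: mark as maintenance and close what is left.
    haproxy_cmd "set server $target state maint"
    haproxy_cmd "shutdown sessions server $target"
}

failover_from mariadb/labsdb1011 300
```

Testing it against a long-running connection first, as suggested above, would show whether the drain phase actually spares active sessions.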
[12:42:14] <Amir1>	 I'm cleaning up ores_classification in enwiki, the delete rate will be a little high
[12:42:34] <Amir1>	 but it's almost finished, just some small things left
[12:42:54] <marostegui>	 thanks for the heads up
[12:53:42] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3657128 (Marostegui)
[12:54:04] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3638543 (Marostegui)
[12:55:06] <wikibugs_>	 Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3638544 (Marostegui)
[13:17:22] <Amir1>	 done now
[13:19:09] <Amir1>	 I think it would be great if you shrink it to reclaim some space
[13:20:47] <marostegui>	 it is currently 5.3G on the master
[13:21:05] <marostegui>	 let me see how much we can get in codfw
[13:25:13] <marostegui>	 after the optimize it goes from 5.4G to 784M
[13:25:26] <marostegui>	 I will run an optimize on codfw with replication
[13:25:33] <Amir1>	 thanks
[13:25:37] <marostegui>	 it took only 2 minutes
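A rough sketch of the shrink step described above: print a size query and an OPTIMIZE TABLE statement for a given table, either letting the statement replicate or keeping it local to one host. The function name and the replicate/local switch are hypothetical, and the information_schema figure is only an approximation of the on-disk size quoted in the chat.

```shell
#!/bin/sh
# Sketch only: print SQL to measure a table and rebuild it with OPTIMIZE TABLE.
# Nothing is executed; review the output and run it by hand.

optimize_sql() {
    schema="$1"
    table="$2"
    mode="$3"   # "replicate" lets the statement reach replicas; "local" keeps it on this host
    cat <<EOF
SELECT ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
  FROM information_schema.tables
 WHERE table_schema = '$schema' AND table_name = '$table';
EOF
    if [ "$mode" = "local" ]; then
        echo "OPTIMIZE NO_WRITE_TO_BINLOG TABLE $schema.$table;"
    else
        echo "OPTIMIZE TABLE $schema.$table;"
    fi
}

# Example: the table from the discussion above, replicated to the replicas.
optimize_sql enwiki ores_classification replicate
```

Running the size query before and after gives a before/after comparison similar to the 5.4G to 784M figures mentioned above.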
[13:30:39] <wikibugs_>	 Blocked-on-schema-change, DBA, MediaWiki-extensions-ORES, MW-1.29-release (WMF-deploy-2017-04-25_(1.29.0-wmf.21)), and 5 others: Concerns about ores_classification table size on enwiki - https://phabricator.wikimedia.org/T159753#3657239 (Marostegui) >>! In T159753#3657238, @Stashbot wrote: > {nav...
[14:44:19] <jynus>	 "ORDER BY RAND() LIMIT 10569" ugh, I think I am going to get sick
[14:52:15] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657440 (MarcoAurelio) declined>Open Sorry, I dare to totally disagr...
[15:00:11] <wikibugs_>	 DBA, Wikidata: Migrate wb_terms to using prefixed entity IDs instead of numeric IDs - https://phabricator.wikimedia.org/T114903#3657469 (Ladsgroup)
[15:13:45] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3635590 (jcrespo) This is my (I hope) neutral evaluation of the issue:  * Th...
[15:21:59] <wikibugs_>	 DBA, Cloud-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3657622 (jcrespo)
[15:22:16] <wikibugs_>	 DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#2795855 (jcrespo)
[15:23:07] <wikibugs_>	 DBA, Data-Services: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657633 (jcrespo)
[15:23:10] <wikibugs_>	 DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#2795855 (jcrespo)
[15:29:30] <wikibugs_>	 DBA, Operations, ops-codfw: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657671 (Marostegui)
[15:31:41] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657682 (MarcoAurelio) @jcrespo Thank you. Point number two is what people a...
[15:32:59] <wikibugs_>	 DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657688 (MarcoAurelio)
[15:34:09] <wikibugs_>	 DBA, Operations, ops-codfw, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657692 (Marostegui)
[15:37:22] <wikibugs_>	 DBA, Operations, ops-codfw, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3657708 (Marostegui)
[15:40:49] <wikibugs_>	 DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657453 (Marostegui) What are we (DBAs) supposed to do here? (Asking as we got added to the ticket :-)  )
[15:42:26] <wikibugs_>	 DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657728 (MarcoAurelio) @jcrespo so you can evaluate if #dba is needed here as you're also on T176754.
[15:46:02] <wikibugs_>	 DBA, Community-Tech, Stewards-and-global-tools (Temporary-UserRights): Expired user groups not added to user_former_groups table - https://phabricator.wikimedia.org/T177404#3657734 (jcrespo) In my humble opinion, this is the part that is purely #mediawiki-database (actual mediawiki bug, not DBA relat...
[15:48:30] <wikibugs_>	 DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657738 (Marostegui) >>! In T177096#3657674, @gerritbot wrote: > Change 382170 had a related patch set uploaded (by Jcrespo; owner:...
[15:57:20] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3657784 (jcrespo) Legoktm was totally legitimate about closing the ticket wi...
[16:05:47] <wikibugs_>	 DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3657823 (jcrespo) This should fix things:  ``` root@labsdb1010[enwiki]> ALTER TABLE archive ADD KEY `user_timestamp` (`ar_user`,`ar...
[17:42:35] <wikibugs_>	 DBA, Data-Services, Tracking: Wikireplica service for tools and labs - issues and missing available views (tracking) - https://phabricator.wikimedia.org/T150767#3658252 (jcrespo)
[17:42:38] <wikibugs_>	 DBA, Data-Services, Goal, cloud-services-team (FY2017-18): Migrate all users to new Wiki Replica cluster and decommission old hardware - https://phabricator.wikimedia.org/T142807#3658253 (jcrespo)
[17:42:41] <wikibugs_>	 DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3658249 (jcrespo) Open>Resolved a:jcrespo @MusikAnimal Your query takes now 3 second cold, 0.13 seconds hot on the new...
[17:50:42] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3658342 (Dzahn) Yea, this should just wait for the proper setup in codfw. I don't see a...
[17:55:57] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658385 (kaldari) >If purging is undesirable on production, which is somethi...
[18:19:45] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658483 (EddieGP) When digging a bit further into this, I found that it was...
[18:25:38] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658511 (MarcoAurelio) Sorry but I don't see it that way. As much as I respe...
[18:46:04] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658551 (kaldari) @EddieGP: I believe there were performance concerns with t...
[18:57:46] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, Stewards-and-global-tools (Temporary-UserRights): Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658580 (MarcoAurelio) @kaldari Tool Labs is mentioned here because I discov...
[19:44:28] <wikibugs_>	 DBA, Data-Services, Patch-For-Review: Some queries to new replica hosts are dramatically slower than labsdb; missing indexes? - https://phabricator.wikimedia.org/T177096#3658782 (MusikAnimal) Lightning fast! :D Many thanks for the prompt assistance
[19:54:39] <wikibugs_>	 DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658801 (Joe) So what I extract from the errors is you're trying to connect to db2048 by IP and not by h...
[20:01:13] <wikibugs_>	 DBA, Community-Tech, MediaWiki-General-or-Unknown, Operations, and 2 others: Regularly purge expired temporary userrights from DB tables - https://phabricator.wikimedia.org/T176754#3658814 (EddieGP) p:Normal>Low >>! In T176754#3658580, @MarcoAurelio wrote: > The issue here is that MediaWi...
[20:22:13] <wikibugs_>	 DBA, Gerrit, Operations, Patch-For-Review, Release-Engineering-Team (Next): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532#3658840 (demon) p:Triage>Low
[20:25:56] <wikibugs_>	 DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658843 (demon) >>! In T153033#3656161, @Marostegui wrote: > Just to be clear, you are talking about dbstore1002/db1047? > We also have to keep in mind that there are thousands of tables (two per wiki basically...
[20:33:58] <wikibugs_>	 DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658863 (aaron)
[20:39:14] <wikibugs_>	 DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658870 (aaron) Also, there is https://bugs.php.net/bug.php?id=74445 :)
[20:47:05] <wikibugs_>	 DBA, Operations, Availability (Multiple-active-datacenters), Performance-Team (Radar): Make apache/maintenance hosts TLS connections to mariadb work - https://phabricator.wikimedia.org/T175672#3658884 (aaron) >>! In T175672#3658801, @Joe wrote: > So what I extract from the errors is you're trying...
[21:08:20] <wikibugs_>	 Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3658963 (CKoerner_WMF) I've posted the message to the Commons Vi...
[21:08:38] <wikibugs_>	 Blocked-on-schema-change, DBA, Readers-Community-Engagement, Community-Liaisons (Oct-Dec 2017): Help communicate read-only time for Commons for schema change required by adding 3D filetype - https://phabricator.wikimedia.org/T176883#3658965 (CKoerner_WMF)
[21:27:40] <wikibugs_>	 DBA, Analytics: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3658983 (Nuria) >Could/should we drop the ones that are completely empty already--assuming some wikis never actually used it. Would that make it more manageable? Yes, please. I think that makes loads of sense.