[04:56:42] 10DBA, 10cloud-services-team (Kanban): Reimage labsdb1011 to Buster and 10.4 - https://phabricator.wikimedia.org/T249188 (10Marostegui) Yes [04:58:17] 10DBA, 10Data-Services, 10Quarry: Quarry query became work much slower - https://phabricator.wikimedia.org/T247978 (10Marostegui) Unfortunately, the servers that we use for Quarry and for the all wikireplicas in general is very specific (and very costly) so we do not have hot spares ready to take over any mo... [05:03:14] 10DBA, 10DC-Ops, 10Operations, 10ops-codfw: (Need By: 31st May) rack/setup/install db213[6-9] and db2140 - https://phabricator.wikimedia.org/T251639 (10Marostegui) a:05jcrespo→03Papaul [05:03:28] 10DBA, 10DC-Ops, 10Operations, 10ops-codfw: (Need By: 31st May) rack/setup/install db213[6-9] and db2140 - https://phabricator.wikimedia.org/T251639 (10Marostegui) >>! In T251639#6101118, @RobH wrote: > @jcrespo or @Marostegui: > > The racking details from the ordering task only list 4 hosts, but we ended... [05:16:40] 10DBA, 10DC-Ops, 10Operations, 10ops-eqiad: (Need By: 31st May) rack/setup/install db114[1-9] - https://phabricator.wikimedia.org/T251614 (10Marostegui) [07:08:07] 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, 10Sustainability (Incident Prevention), 10WorkType-NewFunctionality: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459 (10Marostegui) [07:08:09] 10DBA, 10MediaWiki-User-management, 10Core Platform Team Workboards (Clinic Duty Team), 10MW-1.35-notes (1.35.0-wmf.30; 2020-04-28), and 2 others: Rename ipb_address index on ipb_address to ipb_address_unique - https://phabricator.wikimedia.org/T250071 (10Marostegui) [07:08:11] 10DBA: inverse_timestamp column exists in text table, it shouldn't - https://phabricator.wikimedia.org/T250063 (10Marostegui) [07:08:13] 10DBA, 10Core Platform Team: text table still has old_* fields and indexes on some hosts - https://phabricator.wikimedia.org/T250066 (10Marostegui) [07:08:15] 10DBA: lc_lang_key index is lingering in production - https://phabricator.wikimedia.org/T250056 (10Marostegui) [07:08:17] 10DBA: type_acton index in logging table is lingering in production - https://phabricator.wikimedia.org/T250057 (10Marostegui) [07:08:19] 10DBA: searchindex indexes are missing in production - https://phabricator.wikimedia.org/T250058 (10Marostegui) [07:08:21] 10DBA: db1110 has 5 important database drifts that are unique to the host - https://phabricator.wikimedia.org/T249973 (10Marostegui) [07:08:24] 10DBA, 10wikitech.wikimedia.org: Wikitech has lots of database drifts with core and rest of the databases - https://phabricator.wikimedia.org/T249972 (10Marostegui) [07:25:43] 10DBA, 10Operations: Upgrade and restart s5 and s6 primary DB master: Tue 5th May - https://phabricator.wikimedia.org/T251154 (10Marostegui) 10.1.43-2 has been installed on both masters (without mysql_upgrade) and they are ready for tomorrow's restart. [07:33:36] 10DBA, 10DC-Ops, 10Operations, 10ops-codfw, 10Patch-For-Review: (Need By: 31st May) rack/setup/install db213[6-9] and db2140 - https://phabricator.wikimedia.org/T251639 (10Marostegui) [07:34:38] 10DBA, 10DC-Ops, 10Operations, 10ops-codfw, 10Patch-For-Review: (Need By: 31st May) rack/setup/install db213[6-9] and db2140 - https://phabricator.wikimedia.org/T251639 (10Marostegui) @Papaul the initial puppet changes are done. From puppet side the only pending thing is; to add them to the DCHP file (if... 
[07:37:14] 10DBA, 10User-DannyS712: Drop flagged revs tables on mediawikiwiki - https://phabricator.wikimedia.org/T248298 (10Marostegui) 05Open→03Resolved Dropped everywhere. I kept a temporary tiny backup at: ` root@cumin1001:/home/marostegui/T248298# ls -lh flagged* -rw-r--r-- 1 root root 2.7M May 4 07:30 flaggedi... [07:37:19] 10DBA, 10Epic, 10Tracking-Neverending: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921 (10Marostegui) [07:39:06] Amir1: I am going to go ahead and drop wb_terms on the labsdb hosts [08:13:52] 10Blocked-on-schema-change, 10DBA: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) a:03Marostegui [08:22:05] 10Blocked-on-schema-change, 10DBA: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) [08:31:22] 10Blocked-on-schema-change, 10DBA: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10jcrespo) Question: are block functionality maintainers (https://www.mediawiki.org/wiki/Developers/Maintainers#MediaWiki_core) aware of these issues, fixes?... [08:31:48] 10Blocked-on-schema-change, 10DBA: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) Heh, this needs to be made in different transactions as we were hit (again) by: https://jira.mariadb.org/browse/MDEV-8351 (I reported it too a f... [08:32:09] 10Blocked-on-schema-change, 10DBA: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) [08:46:44] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) Good question @jcrespo - on the other hand, wikidata (and all the new s3 wikis) are being created with this 4 columns into... [09:16:35] re:netboot.cfg, wouldn't it be easier to have db[12]* ? [09:16:53] we can do that now, yes [09:17:33] as long as any exception is defined above it should be fine anytime [09:19:53] 10DBA, 10Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (10Kormat) I'm going to work on db1001 and es2025. [09:35:58] what's db1001? [09:36:20] a test [09:36:37] like, a vm? [09:36:52] no, like a typo :) fixed now, db1101 [09:36:58] ah, I see [09:37:00] sorry [09:37:06] I was like super-confused [09:37:13] because that db host actually existed [09:37:22] don't be, i'm glad you asked :) i'm paranoid about typing the wrong hostname, so it's good to know when it happens [09:38:00] Re: errors, don't worry [09:38:36] most errors will eventually come a few months in, when within the gap of confidence :-D [09:39:02] haha [09:58:05] marostegui: cool! Thanks [10:20:55] 10DBA, 10Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts: ` ['db1101.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202005041020...
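The netboot.cfg exchange above hinges on match ordering: the file maps hostname patterns to partman recipes and the first pattern that matches wins, which is why any exception has to be defined above a broad db[12]* entry. A minimal sketch of that idea, assuming netboot.cfg keeps its usual shell case-statement form; the exception host and the exact mapping below are placeholders, not the real production entries:

    # First match wins, so list specific exceptions before the catch-all.
    case "$(hostname)" in
        labsdb*) echo partman/custom/example-labsdb.cfg ;;   # hypothetical exception, listed first
        db[12]*) echo partman/custom/no-srv-format.cfg ;;    # broad pattern; recipe name taken from T251768
    esac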
[10:41:02] 10DBA, 10Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1101.eqiad.wmnet'] ` and were **ALL** successful. [11:07:15] jynus: i have no idea why tendril needs to die - is there a phab task or something that gives context? [11:07:40] yep [11:07:45] one sec [11:10:20] kormat: run! ;) [11:11:14] good point, lunch! [11:40:55] Amir1: every time I inspect recentchanges queries I really want to cry [11:48:27] I feel you 😭 [11:48:52] there is a task for that [11:51:13] there are multiple [11:51:26] multiple tasks for crying? [11:51:28] :-D [11:52:30] kormat: technically tendril doesn't have to die, just transform into something maintainable and with a modern stack - functionality has to stay the same (or improved) [12:11:33] hmm. prometheus-mysqld-exporter.service is enabled on db1101, but it's a multi-instance host [12:12:00] yeah, that is a bug [12:12:04] that should be disabled [12:12:38] the common package is used and that enables it by default [12:12:57] which is known to be problematic and cause races between package install & puppet [12:13:04] (in general) [12:18:34] ah, lovely. [12:19:01] it could still be added to puppet, I think on the appropriate roles [12:19:08] as a workaround [12:19:46] there are a few upgrading issues documented at: https://wikitech.wikimedia.org/wiki/MariaDB#Stretch_+_10.1_-%3E_Buster_+_10.4_known_issues [12:20:05] which we don't have enough information about to decide how to solve or workaround [12:21:50] feel free to add that "disable X for multi-instance hosts" [12:21:58] and we can later puppetize it [12:22:11] or repackage, depending on the best way to solve it [12:25:18] done [12:42:55] marostegui: PTAL at the dbctl config diff on cumin1001 for repooling db1101 into s7 and s8 [12:43:10] checking [12:43:11] s7 is straight to 100% (weight 150), s8 is at 50% (weight 175) [12:43:44] I would suggest 25% for s8 and 50% for s7 instead [12:43:52] ah, ok :) [12:43:52] s8 can hit hard [12:45:01] updated [12:45:29] checking [12:46:02] notifications are enabled and the hosts are green on icinga? [12:46:10] correct [12:46:15] then +1! [12:46:24] not +2? :( [12:46:26] (kidding :) [12:46:34] XDD [12:58:51] marostegui: how do i know when it's time to increase the percentage? [12:59:55] kormat: normally 10 minutes are ok, what can give you more indications are the per host graph and if the graphs are stable, especially the number of connections, traffic, innodb buffer pool efficiency and disk latency: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&from=now-15d&to=now&var-dc=eqiad%20prometheus%2Fops&var-server=labsdb1010&var-port=9104 [13:00:02] that's for labsdb1010 [13:00:33] if you go for db1101, remember you need to change the port as those are multi instance, 9104 isn't going to give you any data, but 1331X where X is the section [13:00:38] so in that case 13317 and 13318 [13:01:17] oh yeah, on that topic - who owns the mysql dashboard? [13:01:35] because it's ~trivial to change the variable config such that only the relevant ports will be shown in the dropdown [13:01:36] I don't think it has a clear owner :) [13:01:44] kormat: go for it! [13:02:21] do we use grafonnet or similar to generate the dashboards [13:02:25] or is the live copy the source?
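The interim fix discussed above for the packaged exporter ("disable X for multi-instance hosts") amounts to stopping the single-instance systemd unit so that only the per-section exporters run on hosts like db1101. A minimal sketch, assuming the Debian package's default unit name; the mask step is an optional extra, not something stated in the log:

    # The package enables the single-instance unit by default; on a
    # multi-instance host only the per-section exporters should run.
    sudo systemctl disable --now prometheus-mysqld-exporter.service
    # Optionally mask it so a later package upgrade cannot re-enable it.
    sudo systemctl mask prometheus-mysqld-exporter.service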
[13:02:39] the live copy is the source as far as I know [13:02:43] godog: ^ [13:03:15] it depends [13:03:22] some are defined in puppet, some are "live" [13:03:23] ™ [13:03:36] those defined in puppet IIRC aren't editable in the UI (or at least used to be) [13:03:59] Specifically the mysql one I don't think it is on puppet [13:04:05] mandatory reference: T171482 [13:04:06] T171482: Programmatic generation of grafana dashboards - https://phabricator.wikimedia.org/T171482 [13:04:11] I have changed some items there to adapt it to 10.4 stuff [13:04:20] and [13:04:21] T178690 [13:04:22] T178690: Better organization for SRE grafana dashboards - https://phabricator.wikimedia.org/T178690 [13:09:55] yup, what has been said already basically [13:10:03] is there a way to run queries directly against prom itself? [13:10:14] i'm assuming the prom instances aren't publicly accessible [13:10:42] that's correct, we don't expose the prometheus web interface even behind ldap, but you can ssh-tunnel to it [13:10:54] searching for the wikitech link [13:11:10] https://wikitech.wikimedia.org/wiki/Prometheus#Access_Prometheus_web_interface [13:11:23] great, thanks [13:12:00] kormat: there is also the Explore feature in Grafana [13:12:03] np! [13:12:16] that gives you quick access to explore metrics, IIRC you have to be logged in [13:12:38] ah. i haven't used that before. i'll give it a shot [13:13:31] volans: this is useful, thanks :) [13:14:23] * volans completed the 1 useful action / day goal :D [13:14:51] marostegui: check this out: https://grafana.wikimedia.org/d/kUxjfbeWk/mysql-kormat?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1101&var-port=13317 [13:15:04] if you change the server dropdown, you'll only get valid ports in the port dropdown [13:15:13] I see, that's nice yeah [13:15:27] Otherwise you have to start guessing if you don't know which sections that host belongs to [13:15:35] yep [13:15:49] ok. i'll make the change to the main dashboard [13:15:57] kormat: don't worry, there will be a moment when you know just by looking at the hostname [13:16:00] kormat: `1 [13:16:00] (almost wrote 'db' as a shorthand, then realised that might be confusing :) [13:16:02] +1 [13:21:29] marostegui: db1101 is looking healthy to me [13:23:08] shall i add another 25% to both sections? [13:23:15] sounds good to me yeah [13:23:53] mmm, is it me or I cannot see any data on its dashboard now? [13:24:15] ah nevermind [13:24:17] I am stupid [13:24:24] +1 :-P [13:24:27] don't scare me like that :) [13:24:28] XDDD [13:25:36] marostegui: diff ready on cumin1001 [13:25:50] checking [13:26:29] +1 [13:26:45] * volans wonders if we should add a more integrated cross-check capability to dbctl [13:27:08] what I would love to have is a dbctl restore-config [13:27:09] or something [13:27:18] like to roll back the last change [13:27:37] we have rollback capability for the MW side (context for kormat) [13:27:47] we miss it on the normalized side of things [13:28:13] it is nice that dbctl still provides you the way to restore the last one, but if you have lost all that in your terminal, it can be tricky [13:28:24] ? [13:28:50] marostegui: just make a shell one-liner to iterate over everything in /var/cache/conftool/dbconfig, and hit ctrl-c when you think it restored the right one. simples. [13:28:59] volans: the output of: Previous configuration saved.
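For the ssh-tunnel access mentioned above, a rough sketch of how one could reach the Prometheus HTTP API and list which per-section exporter ports (1331X) a host exposes; the Prometheus hostname and port are examples only (the wikitech page linked in the log is authoritative), and the metric name and host:port instance labels assume the standard mysqld_exporter conventions:

    # Forward the Prometheus web port over ssh (hostname and port are illustrative).
    ssh -N -L 9090:localhost:9090 prometheus1003.eqiad.wmnet &
    # List the mysqld exporter series for db1101 to discover its per-section ports.
    curl -sG 'http://localhost:9090/api/v1/series' \
         --data-urlencode 'match[]=mysql_up{instance=~"db1101:.*"}'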
To restore it run: dbctl config restore /var/cache/conftool/dbconfig/20200504-075148-marostegui.json [13:29:03] that's nice [13:29:35] marostegui: yeah so just go to /var/cache/conftool/dbconfig and ls | tail [13:30:04] volans: as I was saying, it is nice it can still be done, but I would love to have it with just dbctl restore-last or whatever [13:30:23] ahhh, ok, I misunderstood [13:30:36] the current problem is that that would be local to the host, and that sucks [13:30:52] a better approach would be to move everything to a git repo that is automatically replicated between the two hosts [13:30:58] (like we do for the Netbox generated dns) [13:30:59] yeah, I remember we discussed all this during the deployment of dbctl [13:31:24] at that point it should also be doable to save the normalized state and be able to revert that one too [13:31:48] and/or at that point remove the normalized view from etcd entirely [13:31:52] and have it only on git [13:33:57] marostegui: going to kick off reimaging of es2025 now. it looks like the section in general is very lightly loaded, and the host is in codfw, so i'm going to depool it directly. [13:34:17] kormat: yeah, it doesn't get any traffic being in codfw, go for it [13:44:43] 10DBA, 10Growth-Team, 10MediaWiki-Recent-changes, 10Schema-change: recentchanges table indexes: tmp1, tmp2 and tmp3 - https://phabricator.wikimedia.org/T206103 (10Marostegui) [13:49:05] 10DBA: Make partman/custom/no-srv-format.cfg work - https://phabricator.wikimedia.org/T251768 (10Kormat) [13:49:29] 10DBA, 10Operations: Make enabling reimaging for db hosts more humane - https://phabricator.wikimedia.org/T251392 (10Kormat) 05Open→03Resolved a:03Kormat Closing this, and opened T251768 to cover fixing the partman recipe. [13:58:09] marostegui: es2025 has a different pxe path than the other hosts: http://apt.wikimedia.org/tftpboot/stretch-installer-bootif/ [13:58:30] kormat: Yeah, that was a fun thing [13:58:32] Let me look for the ticket [13:59:03] kormat: https://phabricator.wikimedia.org/T242481 [13:59:20] I believe this was fixed on Buster, moritzm am I right? [14:06:36] looking [14:06:58] moritzm: I believe the hack was just because they were being installed with stretch [14:07:06] no? [14:07:15] yeah, on buster this works out of the boot, we only need the stretch-bootif setting for these specific servers when installed with stretch [14:07:24] out of the box :-) [14:07:31] sweeeet [14:07:40] moritzm: great, thanks [14:07:46] kormat: so you can treat them as a normal server now then, just remove the installer line [14:08:00] when all those hosts are moved to buster, ping me and I'll axe the stretch-installer-bootif stuff [14:08:15] Wow, moritzm that issue was BEFORE all hands? [14:08:32] I thought it happened like a month ago only [14:11:11] time flies when you're having a pandemic :-) [14:11:18] haha [14:11:43] yeah, during the all hands we were seeing the covid as a very far away thing, and now look at us [14:11:56] I modified my concept that it's always the same week to one where each week is a new generation of the previous, with tiny differences driven by evolution [14:12:04] wait [14:12:08] where is april? [14:12:15] I think I lost it! [14:12:29] jynus: was an april's fool ;) [14:12:44] no april in a pandemic year [14:13:10] I hope this is the first and last year I have to spend in a pandemic! [14:13:18] *birthday [14:14:06] marostegui: db1101 still looks ok to me. time to increase to 100% s7, 75% s8? [14:14:26] kormat: go for it!
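The "dbctl restore-last" wish above can be approximated today with the one-liner volans suggests: pick the newest saved state under /var/cache/conftool/dbconfig and pass it to `dbctl config restore`. A sketch, where the cache path and the restore subcommand are taken from the log and the rest is illustrative:

    # Restore the most recently saved dbconfig state on this cumin host.
    latest="$(ls -t /var/cache/conftool/dbconfig/*.json | head -n 1)"
    dbctl config restore "$latest"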
[14:15:52] I wrote something for new dbas, let me find it [14:16:11] jynus: was it "Run The Fuck Away" repeated 1000 times? [14:16:23] kormat: https://phabricator.wikimedia.org/P7516 [14:16:27] it may help as a checklist [14:16:34] just not all at the same time [14:16:55] ignore the [DONE] [14:18:02] jynus: I am doing some math about the s4 hosts and all that, and we can definitely have the backup testing host back [14:18:17] after the new hosts, right? [14:18:21] yep [14:18:24] cool [14:18:39] even with redundancy, right? [14:18:51] jynus: Also, db1102 (I believe it is a backup source) will be replaced with one of the new hosts too, with a larger disk [14:18:59] unless you prefer not to (right now it is 50%) [14:19:06] jynus: yep, even with redundancy [14:19:22] wait, what? [14:19:37] is that a natural, sped-up refresh happening now? [14:19:42] or a new thing [14:19:48] jynus: no, that is part of SDC expansion [14:19:49] or happening next year [14:20:48] jynus: the SDC expansion was meant to get new hosts with larger disks for all eqiad hosts needing it, and the backup source host was included of course [14:20:57] oh [14:21:02] I didn't remember that [14:21:07] surprise! [14:21:17] that's actually great news [14:21:34] I thought that was the broken source [14:22:03] No, but we can do some moves there if you like and send db1102 to replace db1140 [14:22:13] and once db1140 is fixed, we can put it somewhere else as a core host [14:22:18] well, not sure whether to replace it [14:22:26] but more disk will give us more flexibility [14:22:47] the problem with backups is that they take 5 minutes more every time [14:22:53] what I am saying is that we can use a new host to replace db1102 (as scheduled) and move db1102 to replace db1140's role [14:23:07] so from 2 years ago to today, it takes several hours more [14:23:20] yep, I got you [14:23:45] and once db1140 is fixed we can fit it somewhere else within s1,s2,s3 etc [14:24:01] anyways, TLDR; db1102 will be upgraded and the backup testing host will be back :) [14:24:17] thanks kormat for T251392 ! [14:24:17] T251392: Make enabling reimaging for db hosts more humane - https://phabricator.wikimedia.org/T251392 [14:29:29] you are almost welcome! :) [14:38:28] 10DBA, 10Growth-Team, 10MediaWiki-Recent-changes, 10Schema-change: recentchanges table indexes: tmp1, tmp2 and tmp3 - https://phabricator.wikimedia.org/T206103 (10Marostegui) For the record, I have caught this query using `tmp_3`: ` explain SELECT /* SpecialRecentChanges::doMainQuery */ /*! STRAIGHT_JOIN *... [14:43:45] 10DBA, 10Epic, 10Patch-For-Review: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (10ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts: ` ['es2025.codfw.wmnet'] ` The log can be found in `/var/log/wmf-aut... [15:06:30] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10dmaza) IIRC `ipb_anon_only` and `ipb_auto` are mutually exclusive from the business logic perspective and that's why we probably haven'... [15:11:27] 10DBA, 10Epic: Upgrade WMF database-and-backup-related hosts to buster - https://phabricator.wikimedia.org/T250666 (10ops-monitoring-bot) Completed auto-reimage of hosts: ` ['es2025.codfw.wmnet'] ` and were **ALL** successful. [15:11:36] marostegui: db1101 still seems fine. doing the final 25% on s8. [15:11:41] sweet!
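The gradual repool of db1101 being discussed (stepped up per section in stages, watching the per-host Grafana dashboard between steps) maps onto dbctl roughly as below; treat this as a sketch and check the exact flag names against `dbctl --help`, since they are not spelled out in the log:

    # Raise db1101 to 100% in s7 and 75% in s8, then review and commit.
    dbctl instance db1101 pool -p 100 --section s7
    dbctl instance db1101 pool -p 75 --section s8
    dbctl config diff
    dbctl config commit -m "Increase db1101 pooling percentage after buster reimage T250666"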
[15:11:44] thank you [15:16:57] de nada [15:17:30] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10dmaza) Disregard my last comment. I was still organizing my thoughts and pressed submit by mistake. I think we don't need the 4th colu... [15:20:05] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Marostegui) >>! In T251188#6105052, @dmaza wrote: > Disregard my last comment. I was still organizing my thoughts and pressed submit by... [15:25:30] 10DBA, 10DC-Ops, 10Operations, 10ops-eqiad: db1140 (backup source) crashed - https://phabricator.wikimedia.org/T250602 (10Cmjohnson) a:05Cmjohnson→03Jclark-ctr @Jclark-ctr can you start the process with HPE please. [15:25:32] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Ladsgroup) It might be a covering index though meaning it works better with four columns instead of three. [15:39:59] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10dmaza) **Extra info:** Seems to have been added back in 2009 `git show 4124558d7b4 maintenance/tables.sql`. @Marostegui let me doubl... [15:52:05] marostegui: es2025 done. [15:52:19] Excellent!!! No issues with the 10G cards this time then? [15:52:26] nope, all went smoothly [15:52:34] yay! [15:53:01] 10DBA, 10Operations, 10User-notice: Upgrade and restart s4 (commonswiki) primary database master: Tue 12th May - https://phabricator.wikimedia.org/T251502 (10Trizek-WMF) [16:35:10] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10dbarratt) >>! In T251188#6105052, @dmaza wrote: > I think we don't need the 4th column as part of the index. From the product perspecti... [16:58:00] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Tchanders) @jcrespo @Marostegui - thanks for pinging AHT. This would explain {T46657}, where the example given is duplicate IP blocks w... [17:06:07] 10DBA, 10DC-Ops, 10Operations, 10Sustainability (Incident Prevention): PXE Boot defaults to automatically reimaging (normally destroying os and all filesystemdata) on all servers - https://phabricator.wikimedia.org/T251416 (10jcrespo) [17:15:25] 10DBA, 10DC-Ops, 10Operations, 10ops-eqiad: (Need By: TBD) rack/setup/install backup1002 + array - https://phabricator.wikimedia.org/T250816 (10jcrespo) Thanks, will take it from here, I should be able to handle this on my own unless unexpected issues arise. [18:20:45] 10DBA, 10Core Platform Team, 10MediaWiki-API, 10Wikimedia-production-error: LBFactoryMulti: Unknown cluster 'cluster14' - https://phabricator.wikimedia.org/T251778 (10Reedy) p:05Triage→03Lowest There's no cluster14 in db-eqiad.php... Tagging #dba but imagine they might not be aware of history from 2007... 
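On the duplicate-blocks concern raised around T251188 and T46657, a quick pre-check before enforcing a narrower unique key would be to group on the candidate columns and count collisions. A minimal sketch using the column names from the task discussion; the wiki and the exact key definition are placeholders, not the actual production change:

    # Count rows that would collide under a (ipb_address, ipb_user, ipb_auto) unique key.
    mysql enwiki -e "
        SELECT ipb_address, ipb_user, ipb_auto, COUNT(*) AS dupes
        FROM ipblocks
        GROUP BY ipb_address, ipb_user, ipb_auto
        HAVING dupes > 1;"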
[18:21:36] 10DBA, 10Core Platform Team, 10Wikimedia-General-or-Unknown, 10Wikimedia-production-error: LBFactoryMulti: Unknown cluster 'cluster14' - https://phabricator.wikimedia.org/T251778 (10Reedy) [22:15:14] 10Blocked-on-schema-change, 10DBA, 10Anti-Harassment, 10Patch-For-Review: ipb_address_unique has an extra column in the code but not in production - https://phabricator.wikimedia.org/T251188 (10Niharika) Would it be possible to tell ahead of time if there are duplicate blocks on our projects already? If du...