[01:00:34] DBA, Performance-Team: Database for XHGui profiles - https://phabricator.wikimedia.org/T254795 (dpifke) User needs to be able to CREATE TABLE, INSERT, and SELECT. Possibly DELETE if we someday want to limit retention. Connections from mwdebug* and xhgui*. (No need for the webperf* hosts.)
[04:50:04] DBA, Patch-For-Review: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts: ` ['dbproxy1014.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006260449_maro...
[04:56:36] DBA, Upstream: Investigate possible memory leak on db1115 - https://phabricator.wikimedia.org/T231769 (Marostegui) The "usual" OOM happened last night: {F31906146} ` Jun 25 20:46:17 db1115 kernel: [4014818.142520] mysqld invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0...
[05:11:23] DBA: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['dbproxy1014.eqiad.wmnet'] ` and were **ALL** successful.
[05:17:54] DBA, Patch-For-Review: Upgrade dbproxyXXXX to Buster - https://phabricator.wikimedia.org/T255408 (Marostegui)
[05:57:29] DBA, Performance-Team, Patch-For-Review: Database for XHGui profiles - https://phabricator.wikimedia.org/T254795 (Marostegui) a:Marostegui I have created the user and allowed it from dbproxies. Reminder: the application needs to connect to `m2-master.eqiad.wmnet` which points to our dbproxies. Th...
[08:02:02] DBA: Solve transferpy concurrency issue with auto port detection - https://phabricator.wikimedia.org/T256450 (Privacybatm)
[08:02:37] DBA: Solve transferpy concurrency issue with auto port detection - https://phabricator.wikimedia.org/T256450 (Privacybatm) This race condition can be solved/reduced by making a directory (`mkdir`) in temp as soon as we see a free port.
[08:45:33] DBA, Patch-For-Review: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jynus on cumin1001.eqiad.wmnet for hosts: ` ['db1102.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/202006260...
[09:03:35] DBA, Patch-For-Review: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1102.eqiad.wmnet'] ` and were **ALL** successful.
[10:11:04] DBA, Patch-For-Review: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (jcrespo)
[10:11:54] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (jcrespo) a:jcrespo→Marostegui db1095 instance (stretch) has been backed up and moved to db1102 (buster). Backups are not done there and sent to backup1002.
[10:13:15] DBA: Upgrade x1 databases to Buster and Mariadb 10.4 - https://phabricator.wikimedia.org/T254871 (Marostegui) Thank you! I will go ahead and finish codfw master and then start looking at dates to failover x1 primary master.
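Editor's note on the [08:02:37] suggestion in T256450: a minimal sketch of how a `mkdir`-based reservation could reduce the race, assuming that each transfer process scans a port range and treats a per-port directory in temp as a lock. The names `reserve_port`, `release_port`, `port_is_free`, and the `transfer_port_<N>.lock` naming are hypothetical and are not taken from transferpy itself.

```python
import os
import socket
import tempfile


def port_is_free(port, host="127.0.0.1"):
    """Best-effort check: the port counts as free if we can bind to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def reserve_port(start, end, lock_root=None, is_free=port_is_free):
    """Return (port, lock_dir) for the first free, unreserved port.

    os.mkdir either creates the directory or raises FileExistsError,
    so two processes that see the same free port cannot both claim it.
    """
    lock_root = lock_root or tempfile.gettempdir()
    for port in range(start, end + 1):
        if not is_free(port):
            continue
        # Hypothetical lock naming, for illustration only.
        lock_dir = os.path.join(lock_root, f"transfer_port_{port}.lock")
        try:
            os.mkdir(lock_dir)
        except FileExistsError:
            continue  # another process reserved this port first
        return port, lock_dir
    raise RuntimeError(f"no free port in range {start}-{end}")


def release_port(lock_dir):
    """Drop the reservation once the listener no longer needs the port."""
    os.rmdir(lock_dir)
```

The check and the reservation are still two separate steps, so this narrows the window rather than closing it (hence "solved/reduced" in the task comment), and a crashed process would leave a stale lock directory that needs cleaning up.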
[10:27:35] DBA, User-Kormat: Create reuse recipes for tendril/zarcillo/dbprov/backup hosts - https://phabricator.wikimedia.org/T255768 (Kormat)
[10:28:25] DBA, Operations, SRE-tools, User-Kormat: Add native mysql module to spicerack - https://phabricator.wikimedia.org/T255409 (Kormat)
[10:28:36] DBA, Patch-For-Review, User-Kormat: Create prometheus alert to detect lag spikes - https://phabricator.wikimedia.org/T253120 (Kormat)
[10:28:53] DBA, Operations, SRE-tools, Patch-For-Review, User-Kormat: Audit all cumin queries in switchdc scripts - https://phabricator.wikimedia.org/T243935 (Kormat)
[10:32:09] DBA, User-Kormat: Create reuse recipes for tendril/zarcillo/dbprov/backup hosts - https://phabricator.wikimedia.org/T255768 (jcrespo) Note backup hosts are very unique ones- with different hardware (some have 1 or 2 external arrays of disks). I believe they were setup manually each one, as each one has d...
[10:54:47] DBA, Operations, decommission-hardware, ops-eqiad: decommission dbproxy1003.eqiad.wmnet - https://phabricator.wikimedia.org/T256216 (Marostegui)
[13:07:14] the partitioning on the tendril/zarcillo hosts is.. odd. there's 3 separate software raid volumes between a pair of disks
[13:07:41] (instead of a single raid volume, and lvm to partition that up)
[13:08:01] i'm not sure this can safely use reuse-parts
[13:08:40] because i don't know if the naming of md{0..2} is deterministic
[13:13:36] the man-page specifically says it's non-deterministic
[13:15:33] DBA, Patch-For-Review, User-Kormat: Create reuse recipes for tendril/zarcillo/dbprov/backup hosts - https://phabricator.wikimedia.org/T255768 (Kormat) tendril/zarcillo are probably unsupportable by reuse-parts. They have multiple mdraid arrays, and the documentation says that the naming of arrays is...
[13:20:03] DBA, Patch-For-Review, User-Kormat: Create reuse recipes for tendril/zarcillo/dbprov/backup hosts - https://phabricator.wikimedia.org/T255768 (Kormat) Aand the same applies to the backup* hosts.
[14:57:56] DBA, CheckUser, Trust-and-Safety, WMF-Legal, and 2 others: Configure WMF wikis to log login attempts in CheckUser - https://phabricator.wikimedia.org/T253802 (Huji) @Ladsgroup I understand that due to T256395, all users were logged out and had to log in again. This could provide a once-in-a-longt...