[02:19:28] PROBLEM - MariaDB sustained replica lag on pc2010 is CRITICAL: 2.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=pc2010&var-port=9104
[02:29:10] PROBLEM - MariaDB sustained replica lag on pc2010 is CRITICAL: 2.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=pc2010&var-port=9104
[02:34:08] RECOVERY - MariaDB sustained replica lag on pc2010 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=pc2010&var-port=9104
[04:45:29] DBA, DiscussionTools, OWC2020, Editing-team (FY2020-21 Kanban Board), Patch-For-Review: DBA review: conversation subscriptions - https://phabricator.wikimedia.org/T263817 (Marostegui)
[05:17:15] DBA, decommission-hardware, Patch-For-Review: decommission db1087.eqiad.wmnet - https://phabricator.wikimedia.org/T282093 (Marostegui)
[05:26:54] DBA, DiscussionTools, Editing-team, Performance-Team, and 2 others: Reduce parser cache retention temporarily for DiscussionTools - https://phabricator.wikimedia.org/T280605 (Marostegui) There was a script running from May 2nd still on mwmaint1002, which was stuck. I have killed it and as soon as...
[06:15:03] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui) s5 is done, pending the master.
[06:15:07] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui) s5 is done, pending the master.
[06:15:10] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui) s5 is done, pending the master.
[06:15:29] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui)
[06:15:50] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui)
[06:16:10] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui)
[06:45:36] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui)
[06:45:44] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui)
[06:45:51] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui)
[07:29:12] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui) s2 eqiad [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1182 [] db1171 [] db1170 [] db1162 [] db1156 [] db1155 [] db1146 [] db1129 [...
[07:29:15] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui) s2 eqiad [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1182 [] db1171 [] db1170 [] db1162 [] db1156 [] db1155 [] db1146 []...
[07:29:20] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui) s2 eqiad [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1182 [] db1171 [] db1170 [] db1162 [] db1156 [] db1155 [...
[08:06:37] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui)
[08:06:50] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui)
[08:07:08] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui)
[08:09:24] I am compiling 10.4 in bullseye, wish me luck
[08:09:50] moritzm: ^
[08:16:08] I was going to ask you today
[08:16:34] I have done mariadb 10.5 for testing, but he told me cumin2002 is intended to go into production in a month
[08:18:15] the client should be fine with 10.5 I guess
[08:18:42] I said I wanted to ask you because it looked a bit risky being so critical and so soon
[08:19:17] yeah, it is definitely sooner than I would have expected
[08:20:45] however, on the other side 10.5 is the connector that bullseye will ship, and the client is normally not touched a lot (at most, added extra authentications methods)
[08:21:27] yes, that's what I meant above that it will only have the client so it is not too bad
[08:21:45] I assume it will work fine with 10.1 and 10.4
[08:22:37] we really need to evaluate if we want to just go for 10.5 with bullseye or 10.4 but it really depends on when we finish the migration to buster and 10.4
[08:22:59] yeah, I am not thinking of mariadb-server at all
[08:23:32] I happened to be working with the server because we get xtrabackup from there, but that is a
completely different conversation
[08:23:44] yeah
[08:24:07] too bad we've blocked so long with 10.1 :(
[08:24:58] despite that I think we are ok- some folks are still migrating away from jessie
[08:26:00] hehe we just did with dbmonitor!
[08:26:19] the migration was done a long time ago
[08:26:28] we just completed the last package
[08:27:50] in any case, I think you should just select 1 version, being the client we can always change it
[08:28:07] marostegui: https://www.youtube.com/watch?v=a2GVxYfKSxA&t=135s
[08:28:43] yeah indeed
[08:29:04] * sobanski not clicking in case it's a rickroll
[08:30:25] and I am not intending to cause you stress, but moritz asked me last night and I said, I need to ask you!
[08:30:44] (as I had done some work already with porting wmfbackups to bullseye)
[08:31:34] I think for cumin2002 the 10.5 should be fine, but going for mariadb-server is a different story yeah
[08:31:39] which needs a loooot more time
[11:25:49] Data-Persistence-Backup, GitLab (Initialization), Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (jcrespo) > Will this require new hardware? If so we should add it into annual planning quickly. I discussed this with my manager, and we had taken the...
[11:55:39] DBA: Switchover s6 from db1131 to db1173 - https://phabricator.wikimedia.org/T282124 (Kormat)
[12:00:09] DBA, Patch-For-Review: Switchover s6 from db1131 to db1173 - https://phabricator.wikimedia.org/T282124 (Kormat)
[12:00:43] 10.4 compiled fine on bullseye, going to try to create the packages
[12:03:39] DBA, Patch-For-Review: Switchover s6 from db1131 to db1173 - https://phabricator.wikimedia.org/T282124 (Kormat)
[13:06:26] Data-Persistence-Backup, GitLab (Initialization), Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (thcipriani) >>! In T274463#7069222, @jcrespo wrote: >> Will this require new hardware?
If so we should add it into annual planning quickly. > > I disc...
[13:22:26] moritzm: I am building the 10.4 packages for bullseye now, once done I will test them and if they work well I will merge https://gerrit.wikimedia.org/r/c/operations/software/+/685512
[13:22:29] At least for the client
[13:52:11] DBA, Analytics, Event-Platform, WMF-Architecture-Team, Services (later): Consistent MediaWiki state change events | MediaWiki events as source of truth - https://phabricator.wikimedia.org/T120242 (Marostegui) >>! In T120242#6925040, @Ottomata wrote: >> How would it handle if the replica goes...
[14:07:07] Looks like 10.4.18 works fine on bullseye
[14:07:56] I have left the packages at apt1001:/home/marostegui/bullseye/10.4/
[15:11:42] Data-Persistence-Backup: Setup backup1003 and backup2003 as the storage location for es bacula backups - https://phabricator.wikimedia.org/T282249 (jcrespo)
[15:12:22] Data-Persistence-Backup: Setup backup1003 and backup2003 as the storage location for es bacula backups - https://phabricator.wikimedia.org/T282249 (jcrespo) p:Triage→High High because this is causing some available disk space alarms.
[16:47:56] DBA, Analytics, Event-Platform, WMF-Architecture-Team, Services (later): Consistent MediaWiki state change events | MediaWiki events as source of truth - https://phabricator.wikimedia.org/T120242 (Ottomata) Interesting thanks! So brainstorming how that would work for Debezium, since Debezium...
[18:32:53] Data-Persistence-Backup, GitLab (Initialization), Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Sergey.Trofimovsky.SF) >> One thing that should be made explicit is that we had into account the amount of disk and resources needed to store backups...
[19:17:25] so we've run into https://phabricator.wikimedia.org/T282271 in Mailman3
[19:17:46] it's trying to insert an emoji into a column which is using CHARACTER SET utf8
[19:18:09] and I know that you're supposed to use utf8mb4 for real UTF-8, which is what most columns use
[19:18:36] buff
[19:18:37] but for some columns mailman forces utf8 https://gitlab.com/mailman/mailman/-/blob/master/src/mailman/database/types.py#L105 "We hardcode the collate here to make string comparison case sensitive."
[19:18:42] and for password, what a fail
[19:19:00] does utf8mb4 not have case sensitive comparison?
[19:19:38] DBA, DiscussionTools, Editing-team, Performance-Team, and 2 others: Reduce parser cache retention temporarily for DiscussionTools - https://phabricator.wikimedia.org/T280605 (Krinkle) >>! In T280605#7068496, @Marostegui wrote: > There was a script running from May 2nd still on mwmaint1002, which...
[19:19:51] utf8mb4_bin probably, there are a few collations
[19:20:44] the normal fix is to just convert it to binary, but that won't work for python as binary strings and character strings have a completely different workflow and type in python
[19:21:51] for now, if it is a single user (or a very small number) manually change the display name to be utf8mb3
[19:21:57] then report the issue
[19:22:09] * legoktm nods
[19:22:33] the issue is easy from a db point of view, but not as much from the application side
[19:23:02] but using utf8 is a big oof
[19:23:06] especially for the password
[19:23:23] because it's not actually utf8?
[19:23:28] but also for the other stuff- we need 4 byte utf8 to support all communities
[19:23:35] yes, sorry, it is confusing
[19:23:46] until 4 months ago they were recommending to use utf8 and not utf8mb4: https://gitlab.com/mailman/mailman/-/commit/e6e0a10adc369c78e3e158f2ab3d68ce63d79e7a
[19:24:11] there is UTF-8 (the standard) and utf8 (the mysql value) which is an alias for utf8mb3
[19:24:50] you won't be able to support emojis and other important non-western languages with utf8mb3
[19:24:59] and we normally need to support those
[19:25:56] another option is to alter it- the db should be ok with that, but cannot speak for the application
[19:26:24] gotcha
[19:26:55] I think we can get away with the alter, but there are a bunch of columns probably using utf8, so I'll do a more detailed check
[19:27:04] we can test it in our Cloud VPS setup to make sure the application is ok
[19:27:55] that is more delicate
[19:28:15] I suggest if it is a few records, to update the data, and register those changes somewhere
[19:28:26] report the issue
[19:28:40] then do the larger refactoring- it will be a bit more secure
[19:29:02] you will break a single user vs all users, potentially
[19:29:21] makes sense
[19:29:37] and we can then unbreak the few users with a more careful change later on
[19:29:47] after feedback from developers, etc.
[19:30:07] the display_name doesn't actually matter for this mailing list since it's announce-only so I can drop the emoji without any trouble
[19:30:18] if I was a user and was told "you cannot use emojis for some time in the name because of an application bug"
[19:30:28] I think I would understand
[19:30:42] and then fix it later after migration, with more time, etc.
[19:30:56] and especially with upstream support, if we make a good case
[19:31:23] I would support changing display name and password to at the very least utf8mb4_bin
[19:31:27] later on
[19:31:40] yep, makes sense
[19:32:07] thanks, /me will write up a summary of this
[19:36:51] https://phabricator.wikimedia.org/T282271#7070452
[20:08:26] if this is like etherpad, there could be some convoluted reasoning to use utf8_bin, and the fix will be ugly, if there is any :-(
[20:13:08] alternatively, this could be an unintended/old thing and they really want utf8mb4_bin
[20:14:03] I would accept not being able to use invalid unicode characters in a password, but not being unable to use emojis!
[20:15:47] it is not a large_prefix_index thing, they have fields of >500 characters
[20:16:19] I am almost sure it is unintended, and a bug due to bad hardcoding (expecting utf8 to be the default, instead of utf8mb4)
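
Note on the utf8 vs utf8mb4 point above: a rough sketch of how one might check, before touching the data, whether a display name would fit in a MySQL `utf8` (utf8mb3) column. utf8mb3 stores only characters that need at most 3 bytes in UTF-8, i.e. code points up to U+FFFF, so emoji such as U+1F389 are rejected. The helper names `fits_utf8mb3` and `strip_supplementary` are made up for illustration and are not part of Mailman or any library.

```python
# Illustrative sketch only: both helpers are hypothetical names,
# not Mailman code and not a MySQL/MariaDB API.

BMP_MAX = 0xFFFF  # highest code point that encodes to <= 3 bytes in UTF-8


def fits_utf8mb3(text: str) -> bool:
    """Return True if every character needs at most 3 bytes in UTF-8."""
    return all(ord(ch) <= BMP_MAX for ch in text)


def strip_supplementary(text: str) -> str:
    """Drop characters above U+FFFF (emoji and other 4-byte characters)."""
    return "".join(ch for ch in text if ord(ch) <= BMP_MAX)


if __name__ == "__main__":
    name = "Announce \U0001F389"  # display name ending in an emoji
    print(len("\U0001F389".encode("utf-8")))  # byte length of the emoji
    print(fits_utf8mb3(name))                 # whether utf8mb3 can hold it
    print(strip_supplementary(name).rstrip()) # name with the emoji removed
```

This only covers the "fix the few affected records" workaround discussed above; the longer-term fix mentioned in the conversation (moving the columns to utf8mb4_bin) is a schema change and is not shown here.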