[06:33:23] andrewbogott: o/ replied in the task, I think we shouldn't since deployment-prep uses them
[17:03:48] sobanski: swfrench-wmf: taavi: here
[17:03:55] ack
[17:03:59] Can it actually be a problem with db1163?
[17:04:07] s1 is r/o
[17:04:15] thanks taavi
[17:04:22] sobanski: yeah, that's what I'm wondering, yeah
[17:04:28] taavi: ack, thanks!
[17:04:31] wow and i just got okta'd
[17:04:35] Looking at orchestrator it's been unavailable for 11 minutes
[17:04:35] did someone start a doc already?
[17:04:44] I can update status page if someone can grab a DBA
[17:04:50] taavi: I'm starting one
[17:04:57] replication error on 2203 is now "Too many connections"
[17:05:06] Let me try and alert DBAs (we have Federico around already)
[17:05:25] Amir1, jynus ^
[17:05:53] all hosts in s1 are lagging indeed
[17:06:17] https://docs.google.com/document/d/1j-1dz6L_Xt_3eF4bVOm7fK_DoD4RsuGsgB30OiOn5SE/edit?tab=t.0#heading=h.8lq88tc7h8f8
[17:06:30] creating status page incident
[17:07:11] for around 10 mins? https://grafana.wikimedia.org/d/bd60e6f6-11fc-47f4-a6ba-109c1aed251d/federico-s-mariadb-replication-dash?folderUid=Wagp6Ryik&from=2025-09-19T16%3A50%3A14.414Z&orgId=1&timezone=utc&to=2025-09-19T17%3A04%3A03.011Z
[17:08:06] sobanski: yeah, I think 1163 is the issue
[17:08:41] status page updated
[17:08:49] thanks scott
[17:08:59] I see lots of read queries stuck there as well
[17:09:16] Did anyone try to connect to 1163?
[17:09:37] I got into the mariadb console in via the admin port on a cumin box
[17:09:58] https://phabricator.wikimedia.org/P83442
[17:10:15] I'm able to log in directly using ssh
[17:10:48] but the process is not responding to me
[17:11:00] mysqld is using every single bit of CPU available
[17:11:06] I'm taking a look
[17:12:19] spawned a ton of threads in accept()
[17:12:52] MediaWiki\User\User::loadFromDatabase
[17:13:34] PHP worker saturation has resolved too
[17:13:35] the master is clearly overloaded
[17:13:54] CPU dropped, I see recovery
[17:14:31] Something is loading mediawiki user from master. That's a lot of load
[17:15:19] do you think we should set s1 r/w again?
[17:15:30] was just about to remind the channel :)
[17:15:39] yes
[17:15:44] I want to see what's causing this
[17:17:09] thank you to whomever just did
[17:17:20] amir did I think
[17:20:02] If anyone has time, can someone check why loading the user is happening on master? 80% of show processlist entries during the overload were `MediaWiki\User\User::loadFromDatabase`
[17:20:13] > SELECT /* MediaWiki\User\User::loadFromDatabase */ user_id,user_name,user_real_name,user_email,us
[17:21:10] the default is the master: https://gerrit.wikimedia.org/g/mediawiki/core/+/da1292846e8bd16c0fd9d3d3e9c2b263da410c13/includes/user/User.php#1110
[17:24:45] Amir1: nothing that jumps out at me here https://trace.wikimedia.org/search?end=1758302616340000&limit=200&lookback=1h&maxDuration&minDuration&operation=Database%20SELECT%20enwiki&service=mediawiki&start=1758299016340000&tags=%7B%22code.function%22%3A%22.%2AloadFromDatabase.%2A%22%7D
[17:26:02] thanks Scott!
[17:28:32] I can't find any code path that would actually force reading from the master
[17:28:50] except on auto creating accounts
[17:28:54] which is quite rare
[17:28:59] https://trace.wikimedia.org/search?end=1758302847289000&limit=200&lookback=24h&maxDuration&minDuration&operation=Database%20SELECT%20enwiki&service=mediawiki&start=1758216447289000&tags=%7B%22code.function%22%3A%22.%2AloadFromDatabase.%2A%22%2C%22and%22%3A%22true%22%2C%22server.address%22%3A%22db1163%22%7D
[17:29:04] this is also empty
[17:29:36] apologies, the jaeger query UI is not good and you're holding it wrong
[17:29:45] it's not added as a replica or having weight (accidents like that happens from time to time)
[17:29:49] AND is just a space
[17:29:53] https://trace.wikimedia.org/search?end=1758302980158000&limit=200&lookback=24h&maxDuration&minDuration&operation=Database%20SELECT%20enwiki&service=mediawiki&start=1758216580158000&tags=%7B%22code.function%22%3A%22.%2AloadFromDatabase.%2A%22%2C%22server.address%22%3A%22db1163%22%7D
[17:31:35] ah thanks
[17:32:10] https://trace.wikimedia.org/search?end=1758303106094000&limit=200&lookback=2h&maxDuration&minDuration&service=mediawiki&start=1758295906094000&tags=%7B%22server.address%22%3A%22db1163%22%7D also probably interesting to look at