[07:21:04] https://github.com/miraheze/mw-config/pull/6436 merged and deployed. A test wiki was successfully created on prod.
[09:22:33] @Infrastructure Specialists db161 looks sad, I am currently waiting in security queue
[09:22:44] So I will be a little bit if no else around
[09:23:08] I’m mobile atm
[09:29:46] Same
[09:30:12] I have ssh from mobile but I am in the passport control queue
[09:30:27] <.labster> poor gateways
[09:55:25] @posix_memalign I see you out c2 in maint mode, can you see what's causing the outage?
[09:55:49] I will try and restart MySQL but it may not help
[09:56:52] It's not even responding to ssh
[09:58:50] @Infrastructure Specialists db161 is unresponsive even to ssh as far as I can see. It will need someone with cloud access to do it from that side.
[09:59:15] I didn't find anything super obvious on CF. The usual scraper stuff but nothing crazy.
[10:00:33] @paladox can you remember if there's a way to reset a VM from CLI
[10:00:37] By SSH to cloud16
[10:01:54] I’ve never used the cli to do that but I think there is
[10:01:56] https://forum.proxmox.com/threads/start-and-stop-kvm-vm-from-command-line.580/
[10:02:31] @paladox shall I do 'qm reset' then
[10:03:17] I think so. Hopefully it’ll start back up after
[10:04:13] There’s a risk of data loss but the vm is unresponsive so it’s our last resort.
[10:05:05] <1024720274928697384> (did the machine run out of storage? i know that pve can suspend machines if it does)
[10:05:21] That's not been really really helpful
[10:05:40] <1024720274928697384> Huh? The suspension behavior?
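For reference, the 'qm reset' step discussed above can be sketched roughly as follows. This is a minimal sketch, not the procedure actually run during the incident: it assumes it is executed on the Proxmox host itself (for example over SSH), that `qm status <vmid>` prints a line like `status: running`, and the VM ID `161` is a hypothetical placeholder since the log never states the real one. The helper names are made up for illustration, and it defaults to a dry run because a hard reset risks data loss.

```python
import shlex
import subprocess


def parse_qm_status(output: str) -> str:
    """Extract the state from `qm status` output such as 'status: running'."""
    for line in output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "status":
            return value.strip()
    raise ValueError(f"no status line in: {output!r}")


def build_reset_command(vmid: int) -> list[str]:
    """Build the argv for a hard reset of one VM (no shell involved)."""
    return ["qm", "reset", str(vmid)]


def hard_reset(vmid: int, dry_run: bool = True) -> str:
    """Reset a VM, or only return the command line when dry_run is True."""
    cmd = build_reset_command(vmid)
    if dry_run:
        # Nothing is executed; the caller can review the command first.
        return shlex.join(cmd)
    status_out = subprocess.run(
        ["qm", "status", str(vmid)],
        capture_output=True, text=True, check=True,
    ).stdout
    state = parse_qm_status(status_out)
    if state != "running":
        # Assumption: a reset only makes sense for a VM the host sees as running.
        raise RuntimeError(f"VM {vmid} is {state!r}, refusing to reset")
    subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # 161 is a hypothetical VM ID used only for illustration.
    print(hard_reset(161, dry_run=True))
```

A guest that is hung internally may still be reported as running by the hypervisor, so the status check here only guards against resetting a stopped VM; it says nothing about whether the guest is healthy.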
[10:05:53] We have monitoring for disk space
[10:05:59] <1024720274928697384> Oh, nevermind then
[10:06:49] @posix_memalign it should be coming back
[10:06:58] Please monitor and repool when you feel comfortable
[10:07:13] And then update #announcements
[10:07:49] I added a few more rules but no idea if it addresses the root cause
[10:08:01] I have zero clue what happened
[10:08:08] I am from my phone in a departures lounge
[10:08:32] are we in trouble?
[10:08:48] Useless comments in #general or #offtopic please
[10:09:00] I’m not sure this will be addressed in cf layer. It seems to be our side. Something either in our extensions or mw is triggering it in the db (unless there’s a bug in MariaDB)
[10:09:14] Or all 3?
[10:09:19] Yeh
[10:09:31] im hoping the db logging we set up has something useful now
[10:09:59] more likely than not it's a stupidly shit piece of code on a very big wiki
[10:10:21] Hopefully the log catches what it is lol
[10:11:48] Repooling
[10:13:44] [1/2] I think it may have something to do with being OOM.
[10:13:45] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1516022894965297222/image.png?ex=6a3121d8&is=6a2fd058&hm=4fd0100251db86a310e6298742e3ee9d154b1efb7747a42464e6823d906966cd&
[10:14:28] OOM is not good
[10:14:32] Especially if it stalled stuff
[10:14:39] That's probably more likely a MariaDB bug
[10:15:22] Yeah I would rather it kill transactions than just stall
[10:17:01] It should never stall
[10:17:18] It also should trigger OOM killer if needed
[10:17:38] But it's designed to use as much memory as possible without crashing, stalling or killing anything
[10:23:32] We have `innodb_buffer_pool_size` at 78GiB so it should never use so much RAM
[10:23:39] https://github.com/miraheze/puppet/blob/adb279f7fa2df51f6f41ba7e8944894ba2e1e415/hieradata/hosts/db161.yaml#L5
[11:13:59] Not sure how heavily you guys use salt but you could set it so if icinga finds the machine unreachable over ssh a salt reactor rule could trigger a qm reset on the cloud host it uses
[11:14:14] although not sure automating that is necessarily the best idea
[11:27:38] We use both salt and icinga but a part of me just died reading that
[11:28:12] Automating uncontrolled shutdowns of equipment in response to potential faults as a recovery mechanism is questionable
[11:28:22] You might as well play Russian roulette
[11:28:25] :shrug:
[11:28:30] alternatively you could just like
[11:28:33] move stuff off db161 to another db* host
[11:28:36] that would probably also fix it
[11:29:00] not sure what utilisation it has storage wise but if there's less things hitting it that would also probably do it favours
[11:29:49] No it wouldn't
[11:29:55] It would almost certainly just move the problem
[11:30:24] _wonders if @zipppee wants him to have a heart attack_
[11:30:38] personally I have distaste towards db161
[11:30:43] my personal favourite is definitely db172
[11:31:09] I mean so do we all
[11:31:20] well
[11:32:35] aside of actually fixing the OOM, the only options I'd see are take some load off it to a new db* server, or automate the depooling better
[11:33:31] Automating the depooling better would be the most useful we can do
[11:34:05] Alongside working out the root cause
[11:34:35] I'd assume it's one database in particular causing it
[11:34:46] unless it has like significantly more load on it than other db servers for some reason
[11:35:08] It should have an about equal number of databases
[11:35:15] Probably one query
[11:35:34] that reminds me I need to better instrument mysql for myself at some point
[11:35:41] probably a sentry thing
[13:33:11] [1/2] https://cdn.discordapp.com/attachments/1006789349498699827/1516073087886557265/image.jpg?ex=6a315097&is=6a2fff17&hm=4ce40094c97d8d91abca1d949da8155e9bf7f3917f7987c04d8ce49c88675240&
[13:33:12] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1516073088243204218/image.jpg?ex=6a315097&is=6a2fff17&hm=0ecb5ce61e0db3ef2c2aaece3298ff51e04f77edd5ea2008f2ce9f993747a303&
[18:04:49] Errrr I think the CreateWiki Extension is broken?
[18:05:13] <_arawynn> yeah it is, tech should know about it
[18:05:57] Can't ping tech lol
[18:11:00] @pskyechology CreateWiki Extension is brokennnnnn
[18:12:03] i think you were just told we know
[18:27:30] oh oops sorry
[23:16:41] [1/2] well, trying to get https://github.com/miraheze/DiscordNotifications for the discord
[23:16:41] [2/2] why does it seem to require computer folders when no such thing is possible
[23:17:12] ?
[23:18:28] ```Download the latest release of this extension, uncompress the archive, and move folder DiscordNotifications into your mediawiki_installation/extensions folder. (And instead of manually downloading the latest version, you could also just git clone this repository to that same extensions folder).```
[23:18:53] DN is installed on all miraheze wikis if that's what you're worried about
[23:19:05] it is already installed onto your miraheze wiki. the settings are in managewiki
[23:28:54] how long does it take before it'll start doing notifications