[07:16:47] <_joe_> cdanis: what you want to patch php-fpm for?
[08:40:50] anyone willing to admit ganeti knowledge?
[08:45:11] kormat: what do you need?
[08:46:00] hmm. it might have resolved itself, eventually. gnt-instance shutdown took many minutes, node itself logged that the instance wasn't running
[08:46:16] i'm trying to use d-i-test to test partman recipes
[08:46:31] <_joe_> lol
[08:47:02] <_joe_> I need to add a new rule to onboarding: do not let newcomers near partman before they're hooked
[08:47:18] d-i-test is typically not running I think, it only gets powered on when people are doing install tests
[08:47:38] moritzm: it was running since 2019-11 when i checked yesterday :) in any case i'd done a fresh install, and wanted to boot off disk
[08:48:25] so.. how do i log into a fresh install? do i need to do https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Manual_installation before i get login?
[08:48:57] all i want to do is look at the resulting partitioning
[08:48:58] simply set "gnt-instance modify --hypervisor-parameters=boot_order=network FQDN"
[08:49:11] gnt-instance start
[08:49:31] and then gnt-instance console for the console
[08:49:36] _joe_: don't worry, i'm already aware of how user-hostile partman is :)
[08:49:48] moritzm: i've done the install. how do i log into the instance?
[08:50:03] <_joe_> kormat: have you already felt the dread of finding out the best partman docs are already on wikitech?
[08:50:09] _joe_: yep!
[08:50:26] <_joe_> ok, you unblocked one quest!
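[Editor's note: the Ganeti sequence quoted above, collected in one place. This is a sketch run on the Ganeti master node; the FQDN is the test instance from this conversation, and exact hypervisor parameters may differ per cluster.]

```shell
# Force the instance to PXE-boot for the next (re)install cycle.
gnt-instance modify --hypervisor-parameters=boot_order=network d-i-test.eqiad.wmnet

# Start the instance, then attach to its serial console to watch
# the installer (detach with the usual console escape sequence).
gnt-instance start d-i-test.eqiad.wmnet
gnt-instance console d-i-test.eqiad.wmnet
```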
[08:50:28] kormat: puppet didn't run yet, so your user isn't there yet:
[08:50:36] log into puppetmaster1001.eqiad.wmnet
[08:50:39] then run
[08:50:54] sudo install_console d-i-test.eqiad.wmnet
[08:50:58] then run puppet there
[08:51:09] it will generate the certificate request
[08:51:17] ahh
[08:51:21] then sign it on puppetmaster1001 "puppet cert sign d-i-test.eqiad.wmnet"
[08:51:21] install_console is all i need actually
[08:51:36] i don't need to enroll this properly in puppet, i'm going to be doing a bunch of reinstalls
[08:51:36] and then the next install will create your user
[08:51:39] ok
[08:51:45] this is perfect, thanks :)
[09:15:54] gehel: fyi wdqs-updater is failing to start on wdqs1009
[09:16:15] jbond42: thanks! I thought I silenced that one
[09:16:35] ahh i noticed it because of the "Ensure hosts are not performing a change on every puppet run" check
[09:16:35] this is a test server and we're still dealing with the aftermath of the broken deploy yesterday...
[09:17:05] ack ok ill ignore it thanks
[12:49:21] _joe_: I'd like to properly track in-use php workers, without requiring a free worker to do so
[13:14:24] <_joe_> cdanis: you mean have a dedicated worker for statistics?
[13:16:22] <_joe_> because right now we get the stats by interrogating the stats endpoint in php-fpm already, but it's handled by php-fpm as any other request
[13:27:48] _joe_: either that, or other options, like having some out-of-band mechanism for php-fpm to report basic stats
[13:28:06] even if it was just changing the argv of a worker when it starts/stops processing a request
[14:26:38] hey ema - I'm looking at the message passing bit of changeprop to send purge events to purged. Do you know if purged would understand a uri of the format `//en.wikipedia.org/stuff`?
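[Editor's note: moritzm's first-login flow as one sequence. A sketch: install_console is a WMF puppetmaster helper, and the exact puppet agent invocation on the fresh host is assumed (not quoted verbatim in the conversation).]

```shell
# On puppetmaster1001.eqiad.wmnet: open a root console on the
# freshly installed host via the install_console helper.
sudo install_console d-i-test.eqiad.wmnet

# On the new host: run the agent once so it generates a
# certificate signing request (typical invocation; assumed here).
puppet agent -tv

# Back on puppetmaster1001: sign the pending request. Subsequent
# puppet runs then create user accounts on the host.
sudo puppet cert sign d-i-test.eqiad.wmnet
```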
[14:29:19] cdanis: rzl: fyi have been looking at pcc today and you can use `./utils/pcc last $hosts` to check the current head which i think is pretty much what your pcc function does
[14:30:15] hnowlan: good question, I think it should work yeah
[14:31:13] ema: cool :)
[14:31:13] jbond42: oh beautiful, thanks!
[14:44:15] https://github.com/openjdk/jdk/blob/master/src/java.security.jgss/share/classes/sun/security/krb5/internal/ccache/FileCredentialsCache.java#L448-L456
[14:44:35] all hardcoded, TIL
[14:44:44] what a joyful day
[14:49:14] we could submit a patch to also check /run/user/UID/krb5cc
[14:49:30] but obviously this would only trickle in much later
[14:50:35] I see bug reports from 2013 about it, I suspect that there is a twisted reason why this wasn't implemented
[14:50:54] or maybe there is an obscure option to tune
[14:54:16] I don't think there's a strict reason, this Oracle stuff is based on RHEL, so it's always 5 years behind everything...
[14:58:16] and /run/user is the canonical place to store this in the systemd age
[15:11:58] <_joe_> hnowlan: are you preparing a patch? we still haven't released the new code yet
[15:12:09] <_joe_> so the order of release will need to be
[15:12:21] <_joe_> 1 - purged 0.11 with kafka disabled
[15:12:37] <_joe_> 2 - try kafka enabled for a test on a single cp host
[15:12:48] <_joe_> 3 - deploy purged correctly configured
[15:12:57] <_joe_> 4 - switch something to produce to kafka
[15:13:11] <_joe_> I hope to get 1-3 in order by start of next week
[15:16:28] _joe_: as it stands we're looking to implement populating the topic from k8s and then consuming the messages from the existing scb instance of changeprop. Are you okay with us doing this now-ish?
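[Editor's note: until a patch teaching the JDK about /run/user lands, a common workaround for the hardcoded ccache lookup discussed above is to set KRB5CCNAME explicitly, which (to my understanding) both MIT krb5 and the JDK's FileCredentialsCache consult before falling back to the built-in default locations. A sketch, assuming the MIT-style FILE: cache syntax.]

```shell
# Point Kerberos consumers at the per-user runtime cache instead of
# relying on the hardcoded /tmp/krb5cc_<uid> fallback. The id -u
# substitution yields the current user's numeric UID.
export KRB5CCNAME="FILE:/run/user/$(id -u)/krb5cc"
echo "$KRB5CCNAME"
```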
We can disable consumption from scb whenever needs be
[15:16:49] this is just a stopgap rather than a longterm plan
[15:29:17] <_joe_> ok that's not a terrible idea
[15:29:45] <_joe_> that would allow you to transfer more things to k8s while waiting for us to finish work on our side
[18:06:21] I'm curious, is it possible to easily remove the timeout and memory limit restrictions on a mwdebug host?
[19:46:39] <_joe_> no.
[19:46:41] <_joe_> :P
[19:47:05] <_joe_> addshore: jokes aside, what do you want to do
[19:47:43] I have an api call that runs out of memory in prod and also on mwdebug when profiling then runs out of time.
[19:48:04] <_joe_> ok so you want to profile an api call in production
[19:48:04] ideally I wanted to get a full profile for it, rather than just the bit up until it breaks
[19:48:10] <_joe_> right
[19:48:11] indeed
[19:48:34] <_joe_> you need to tell me again tomorrow morning when we're both online in working hours :P
[19:48:42] :D I'll try to!
[19:49:04] <_joe_> you need someone with root to help you do that, I think
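[Editor's note: the limits addshore runs into are standard PHP INI settings. On the CLI they can be lifted per invocation with -d overrides, as sketched below; on mwdebug the requests go through php-fpm, whose pool configuration would need the equivalent change by someone with root, matching _joe_'s last remark. script.php is a placeholder for the code being profiled.]

```shell
# Lift the memory and wall-clock limits for one CLI invocation:
# memory_limit=-1 means unlimited memory, and max_execution_time=0
# disables the timeout (already the CLI default for the latter).
php -d memory_limit=-1 -d max_execution_time=0 script.php
```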