[07:16:47] <_joe_>	 cdanis: what you want to patch php-fpm for?
[08:40:50] <kormat>	 anyone willing to admit ganeti knowledge?
[08:45:11] <moritzm>	 kormat: what do you need?
[08:46:00] <kormat>	 hmm. it might have resolved itself, eventually. gnt-instance shutdown took many minutes, node itself logged that the instance wasn't running
[08:46:16] <kormat>	 i'm trying to use d-i-test to test partman recipes
[08:46:31] <_joe_>	 lol
[08:47:02] <_joe_>	 I need to add a new rule to onboarding: do not let newcomers near partman before they're hooked
[08:47:18] <moritzm>	 d-i-test is typically not running I think, it only gets powered one when people are doing install tests
[08:47:38] <kormat>	 moritzm: it was running since 2019-11 when i checked yesterday :) in any case i'd done a fresh install, and wanted to boot off disk
[08:48:25] <kormat>	 so.. how do i log into a fresh install? do i need to do https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Manual_installation before i get login?
[08:48:57] <kormat>	 all i want to do is look at the resulting partitioning
[08:48:58] <moritzm>	 simply set "gnt-instance modify --hypervisor-parameters=boot_order=network FQDN"
[08:49:11] <moritzm>	 gnt-instance start
[08:49:31] <moritzm>	 and then  gnt-instance console for the console
[08:49:36] <kormat>	 _joe_: don't worry, i'm already aware of how user-hostile partman is :)
[08:49:48] <kormat>	 moritzm: i've done the install. how do i log into the instance?
[08:50:03] <_joe_>	 kormat: have you already felt the dread of finding out the best partman docs are already on wikitech?
[08:50:09] <kormat>	 _joe_: yep!
[08:50:26] <_joe_>	 ok, you unblocked one quest!
[08:50:28] <moritzm>	 kormat: puppet didn't run yet, so your user isn't there yet:
[08:50:36] <moritzm>	 log into puppetmaster1001.eqiad,wmnet
[08:50:39] <moritzm>	 then run
[08:50:54] <moritzm>	 sudo install_console d-i-test.eqiad.wmnet
[08:50:58] <moritzm>	 then run puppet there
[08:51:09] <moritzm>	 it will generate the certificate request
[08:51:17] <kormat>	 ahh
[08:51:21] <moritzm>	 then sign it on puppetmaster1001 "puppet cert sign d-i-test.eqiad.wmnet"
[08:51:21] <kormat>	 install_console is all i need actually
[08:51:36] <kormat>	 i don't need to enroll this properly in puppet, i'm going to be doing a bunch of reinstalls
[08:51:36] <moritzm>	 and then the next install will create your user
[08:51:39] <moritzm>	 ok
[08:51:45] <kormat>	 this is perfect, thanks :)
[09:15:54] <jbond42>	 gehel: fyi wdqs-updater is failing to start on wdqs1009
[09:16:15] <gehel>	 jbond42: thanks! I thought I silenced that one
[09:16:35] <jbond42>	 ahh i noticed it because of the "Ensure hosts are not performing a change on every puppet run
[09:16:35] <gehel>	 this is a test server and we're still dealing with the aftermath of the broken deploy yesterday...
[09:16:38] <jbond42>	 View Extra Service Notescheck
[09:17:05] <jbond42>	 ack ok ill ignore it thanks
[12:49:21] <cdanis>	 _joe_: I'd like to properly track in-use php workers, without requiring a free worker to do so
[13:14:24] <_joe_>	 cdanis: you mean have a dedicated worker for statistics?
[13:16:22] <_joe_>	 because right now we get the stats by interrogating the stats endpoint in php-fpm already, but it's handled by php-fpm as any other request
[13:27:48] <cdanis>	 _joe_: either that, or other options, like having some out-of-band mechanism for php-fpm to report basic stats
[13:28:06] <cdanis>	 even if it was just changing the argv of a worker when it starts/stops processing a request
[14:26:38] <hnowlan>	 hey ema - I'm looking at the message passing bit of changeprop to send purge events to purged. Do you know if purged would understand a uri of the format `//en.wikipedia.org/stuff`?
[14:29:19] <jbond42>	 cdanis: rzl: fyi have been looking at pcc today and you can use `./utils/pcc last $hosts` to check the current head which i think is pretty much what your pcc function does
[14:30:15] <ema>	 hnowlan: good question, I think it should work yeah
[14:31:13] <hnowlan>	 ema: cool :)
[14:31:13] <rzl>	 jbond42: oh beautiful, thanks!
[14:44:15] <elukey>	 https://github.com/openjdk/jdk/blob/master/src/java.security.jgss/share/classes/sun/security/krb5/internal/ccache/FileCredentialsCache.java#L448-L456
[14:44:35] <elukey>	 all hardcoded, TIL
[14:44:44] <elukey>	 what a joyful day
[14:49:14] <moritzm>	 we could submit a patch to also check /run/user/UID/krb5cc
[14:49:30] <moritzm>	 but obviously this would only trickle in much later
[14:50:35] <elukey>	 I see bug reports from 2013 about it, I suspect that there is a twisted reason why this wasn't implemented
[14:50:54] <elukey>	 or maybe there is an obscure option to tune
[14:54:16] <moritzm>	 I don't think there's a strict reason, this Oracle stuff is based on RHEL, so it's always 5 years behind everything...
[14:58:16] <moritzm>	 and /run/user is the canonical place to store this in the systemd age
[15:11:58] <_joe_>	 hnowlan: are you preparing a patch? we still haven't released the new code yet
[15:12:09] <_joe_>	 so the order of release will need to be
[15:12:21] <_joe_>	 1 - purged 0.11 with kafka disabled
[15:12:37] <_joe_>	 2 - try kafka enabled for a test on a single cp host
[15:12:48] <_joe_>	 3 - deploy purged correctly configured
[15:12:57] <_joe_>	 4 - switch something to produce to kafka
[15:13:11] <_joe_>	 I hope to get 1-3 in order by start of next week
[15:16:28] <hnowlan>	 _joe_: as it stands we're looking to implement populating the topic from k8s and then consuming the messages from the existing scb instance of changeprop. Are you okay with us doing this now-ish? We can disable consumption from scb whenever needs be
[15:16:49] <hnowlan>	 this is just a stopgap rather than a longterm plan
[15:29:17] <_joe_>	 ok that's not a terrible idea
[15:29:45] <_joe_>	 that would allow you to transfer more things to k8s while waiting for us to finish work on our side
[18:06:21] <addshore>	 I'm curious, is it possible to easily remove the timeout and memory limit restrictions on a mwdebug host?
[19:46:39] <_joe_>	 no.
[19:46:41] <_joe_>	 :P
[19:47:05] <_joe_>	 addshore: jokes aside, what do you want to do
[19:47:43] <addshore>	 I have an api call that runs out of memory in prod and also on mwdebug when profiling then runs out of time.
[19:48:04] <_joe_>	 ok so you want to profile an api call in production
[19:48:04] <addshore>	 ideally I wanted to get a full profile for it, rather than just the bit up until it breaks
[19:48:10] <_joe_>	 right
[19:48:11] <addshore>	 indeed
[19:48:34] <_joe_>	 you need to tell me again tomorrow morning when we're both online in working hours :P
[19:48:42] <addshore>	 :D I'll try to!
[19:49:04] <_joe_>	 you need someone with root to help you do that, I think