[10:15:30] elukey:
[10:16:42] regarding T385531, to my understanding, the issue was that we were uploading builds which in total were exceeding the 4G tmpfs size?
[10:16:43] T385531: Image publishing via docker-pkg on build2001 repeatedly failing - https://phabricator.wikimedia.org/T385531
[10:18:03] effie: o/ yes more precisely docker image layers that *compressed* filled up the 4GB tmpfs
[10:18:46] or more than one upload happening at the same time with layers that summed together were big enough to fill up tmpfs
[10:19:03] at that point nginx refuses to continue the upload and returns 500
[10:19:06] (to dockerd)
[10:19:27] alright, so for the time being it will be something to look out for
[10:19:41] thank you very much for digging into it!
[10:20:19] np!
[10:20:54] <3
[11:29:33] please don't use build-production-images on build2001 for the moment
[12:10:26] should be good now
[12:23:10] Reedy mszabo thank you very very much, do you reckon we should rollback due to those errors?
[12:23:32] I think no, this seems like the usual security scanner bots injecting garbage
[12:23:43] outstanding
[12:23:59] T385568 should be easy too - rethrow a 400 instead of letting the original error bubble up
[12:24:00] T385568: InvalidArgumentException: Bad key: -1" OR 2+583-583-1=0+0+0+1 -- - https://phabricator.wikimedia.org/T385568
[12:27:33] It's seemingly against the Ukrainian Wiki
[12:27:39] You have to wonder who might be doing that
[12:28:20] wait
[12:28:21] it's ur not uk
[12:28:23] nvm
[12:30:23] it's just the state of the internet in 2025
[13:00:52] heads up, I'm upgrading prometheus-poolcounter-exporter per https://phabricator.wikimedia.org/T333947
[13:01:02] no impact expected
[13:09:17] Is puppet7 now the default for new servers, or do we still need to set force_puppet7 in hiera one host at a time?
[14:02:33] godog: <#
[14:02:34] <3
[14:12:35] jayme, quick look at https://gerrit.wikimedia.org/r/c/operations/puppet/+/1116888 when you have a moment? ty
[14:15:19] andrewbogott: o/ I think Janis may be out today, the change looks good but could you please run pcc on some wikikube/ml-serve workers just in case?
[14:16:52] thx elukey, will do
[14:26:14] elukey: pcc is weirdly inserting known hosts changes but I assume that's a bad cache or something, otherwise no changes
[14:36:58] andrewbogott: yes yes, but I see some changes in resource names, not a big deal but it may trigger some removal/re-add
[14:37:07] lemme check
[14:37:57] mmm no so sysctl::parameters has $ensure, so it doesn't auto-cleanup
[14:42:26] andrewbogott: left a comment
[14:44:51] claime: didn't you and hnowla.n work a restbase issue that manifested as cached entries with an empty result?
[14:45:10] hmmm
[14:45:20] that faintly rings a bell
[14:46:11] I remember you doing gnarly logstash queries to farm related errors, and then scripting a request with a no-cache header
[14:46:49] That was probably not me as I'm not a friend of logstash x)
[14:47:00] hoping to find an associated phab or something, it sounds like T379017
[14:47:02] T379017: API returning completely empty contents from mobile-html endpoint for some articles - https://phabricator.wikimedia.org/T379017
[14:47:31] well... if my memory is right, you were not very happy with that process, so it does track :)
[14:48:02] honestly, that is kind of why it is memorable...because the whole thing was pretty awful
[14:48:15] (as anything to do with restbase typically is...)
[14:48:55] hnowlan: ^^^ any of this ringing a bell?
[16:22:42] I don't remember it manifesting as empty responses? I do vaguely remember having to purge the cache for a bunch of stuff from logstash
[18:58:06] hnowlan: if not empty, do you remember what the problem was?
[18:58:17] some other kind of corruption I guess?
[18:58:57] I'm trying to find the ticket, and perhaps the corresponding changeset
[18:59:48] the only thing I'm certain of is that it/something happened, but I can't recall enough about when, or any of the details
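
Note on the T385531 discussion at 10:16-10:19: the failure described there (compressed layers from one or more simultaneous uploads summing past the tmpfs size, after which nginx returns 500 to dockerd) comes down to a capacity check. A rough sketch of that arithmetic follows. It assumes "4G" means 4 GiB and that each in-flight layer occupies its full compressed size in the tmpfs until its upload finishes; the function names are hypothetical and not part of docker-pkg or nginx.

```python
# Assumption: the "4G" tmpfs from the chat is 4 GiB.
TMPFS_CAPACITY_BYTES = 4 * 1024**3


def uploads_fit_in_tmpfs(compressed_layer_sizes, capacity=TMPFS_CAPACITY_BYTES):
    """Return True if all in-flight compressed layers fit in the tmpfs at once.

    compressed_layer_sizes: sizes in bytes of every layer being uploaded
    concurrently, across all simultaneous pushes.
    """
    return sum(compressed_layer_sizes) <= capacity


def headroom_bytes(compressed_layer_sizes, capacity=TMPFS_CAPACITY_BYTES):
    """Return remaining tmpfs space in bytes (negative means it would overflow)."""
    return capacity - sum(compressed_layer_sizes)
```

For example, two concurrent uploads of 3 GiB and 2 GiB compressed layers would not fit, even though each layer fits on its own, which matches the "more than one upload happening at the same time" case mentioned at 10:18:46.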