[17:14:09] Hi, I need to compile some gettext files, but I don't have either xgettext or make. Do I really have to build a container image just to run that once?
[17:21:40] Joutbis: I think all of the pre-built containers have `make` in them. I am not finding anything that also has gettext installed, unfortunately.
[17:23:08] Oh hi, a couple of days ago you advised me to rewrite my app (saying it was not good tech support advice). I want to reassure you it was good advice anyway.
[17:23:22] heh
[17:23:52] The thing is, I am not the author of the app, it's something that's published on GitHub. I installed it some four years ago, and have been maintaining it since with minor tweaks.
[17:24:47] But migration to 3.11 really broke it. Looking back at GitHub, I see the author has found the same problems and has got rid of flask-uwsgi-websocket, so I'll reinstall everything from scratch
[17:25:11] so I wanted to say thanks to you
[17:26:18] I'm glad you are finding some possible solutions. I will stick with "rewrite your app" being generally poor advice to give, but it is good to hear that led you to thinking about options. :)
[17:26:25] Anyway, back to my first question, I will probably be better off if I compile the gettext files locally on my personal computer, and then upload the compiled files to Toolforge.
[17:37:52] Before your advice, installing from scratch was a dreaded option. After seeing the alternatives, and looking again at the author's comments on GitHub, I saw it was the best course
[21:24:43] !log gitlab-runners upgrading gitlab-runner package to 17.3.3-1 on all instances
[21:24:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Gitlab-runners/SAL
[23:42:13] I don't seem to be able to SSH into pageviews01.commtech.eqiad1.wikimedia.cloud anymore (permission denied, public key). I can SSH into other VPS instances though. Could this be due to the Puppet failures the other day?
[23:46:00] musikanimal: I don't think it's related to the puppet failures. that should not influence user keys. I would suggest rebooting the instance via Horizon first and trying again.
[23:47:56] musikanimal: fwiw, I can ssh to that as root
[23:48:23] I do notice that the puppet run is broken
[23:48:37] yeah the tool is running (mostly) okay, but I see the backend API is down
[23:48:48] Error: Connection to https://puppetmaster.cloudinfra.wmflabs.org:8140/puppet/v3 failed,
[23:48:49] I thought puppet was responsible for adding SSH keys so I thought that could be it
[23:48:56] Failed to open TCP connection to puppetmaster.cloudinfra.wmflabs.org:8140
[23:50:08] yes, puppet would add user keys on the first run
[23:50:15] does any of that sound familiar to you? I'm not sure how to debug, especially since I can't get in
[23:50:20] but I don't think a failed run means that any keys get deleted
[23:50:21] here's what the backend API is failing with now https://pageviews.wmcloud.org/massviews/api.php?project=en.wikipedia.org&category=Articles_lacking_sources_from_December_2008&limit=20000
[23:50:52] which also sounds like a network communication issue
[23:51:06] you don't normally use a local puppetmaster in that project, do you?
[23:51:15] no
[23:51:23] ok, i'm looking a bit
[23:51:29] awesome, thank you! :)
[23:51:42] puppetmaster.cloudinfra.wmflabs.org: Temporary failure in name resolution
[23:53:21] your /etc/resolv.conf does not have a DNS server configured
[23:53:33] comparing to a different cloud VPS in my own project.. where I do have that
[23:53:45] trying to manually hack that
[23:53:46] hmm, interesting!
[23:53:56] yeah, I don't know anything about that file, but I know I didn't touch it
[23:54:13] not on purpose, anyway hehe. I haven't touched this VM at all in quite a while
[23:54:45] I got the first report of something being broken ~12 hours ago, so I think whatever it is broke very recently
[23:55:06] !log commtech - on instance pageviews01 puppet fails because it can't connect to the puppetmaster, and that fails because it can't look it up in DNS. /etc/resolv.conf does not have a nameserver line. adding "nameserver 172.20.255.1" to it manually
[23:55:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Commtech/SAL
[23:55:36] and now puppet works again
[23:55:49] and now I can SSH in! yay! \o/
[23:55:53] thank you :)
[23:55:56] it's not like I saw it do anything to SSH keys
[23:56:05] so it was all the DNS lookup failure
[23:56:32] guess so, which also affected the backend, which is now working :)
[23:56:41] I don't know why that happened. but it simply wasn't configured with a DNS server to use
[23:56:48] so it could not look up any names
[23:57:06] is that prone to break again, you think? or is that file supposed to be automated?
[23:57:42] I am pretty sure it's supposed to be automated. Either it comes with the image when the instance is created or it's set by DHCP when it boots.
[23:57:49] seems like a bug somewhere
[23:58:19] really no idea how likely it is to happen again
[23:58:44] I don't recall it happening on my instances so far.. I think.
[23:58:49] okay, well thanks for fixing! I'll set up a monitor for the backend so I'm notified next time
[23:59:04] I don't recall ever having this issue, either
[23:59:23] there are some tickets about spurious DNS failures in cloud VPS
[23:59:33] but as far as I know it wasn't like this
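
For anyone hitting the same symptom: the root cause found in the log above was an /etc/resolv.conf with no nameserver line. Below is a minimal sketch of that check, assuming Python 3.9+ is available on the instance. The function name `find_nameservers` is hypothetical and is not part of any tool mentioned in the conversation; it only reads the file and changes nothing.

```python
from pathlib import Path


def find_nameservers(resolv_conf_text: str) -> list[str]:
    """Return the addresses from `nameserver` lines in resolv.conf-style text."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    try:
        text = Path("/etc/resolv.conf").read_text()
    except OSError:
        # Treat a missing or unreadable file the same as an empty one.
        text = ""
    found = find_nameservers(text)
    if found:
        print("nameservers:", ", ".join(found))
    else:
        # With no entries the resolver falls back to the local machine,
        # which fails unless a DNS server is actually running there.
        print("no nameserver line found in /etc/resolv.conf")
```

An empty result would match the state described in the !log entry before "nameserver 172.20.255.1" was added by hand. The same function could be called from a periodic monitoring job, though how to schedule and alert on it is left open here.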