[07:53:03] Hmm … I have a job like this (toolforge jobs list -o long): | three-per-hour | ~/jobs/three_per_hour.sh | schedule: 0,20,40 * * * * | php8.2 | none | yes | /data/project/persondata/logs/three_per_hour.out | /data/project/persondata/logs/three_per_hour.err | none | default | all | no | none | Last schedule time: 2024-09-04T05:00:00Z |
[07:53:14] hi folks, I filed T374513 for the outstanding lint alerts on neutron agent alerts
[07:53:14] T374513: Lint problems for NeutronAgentDownForLong and NeutronAgentDown - https://phabricator.wikimedia.org/T374513
[07:53:23] Any reason why it did not start in the last few days?
[07:53:25] not sure about the tags, please adjust at will
[07:55:10] godog: thanks!
[07:55:15] Wurgl: let me take a look
[07:55:24] sure np dcaro
[07:55:35] Thanks, persondata is the tool
[07:57:55] Wurgl: it seems the job it generates is failing to validate the kyverno policies, looking
[07:58:28] >$ kubectl describe pod three-per-hour-28757100-k45ck
[07:58:28] ..
[07:58:28] 644 Warning PolicyViolation 45m kyverno-scan policy toolforge-kyverno-pod-policy/toolforge-validate-pod-policy fail: validation error: pod security configuration must be correct. rule toolforge-validate-pod-policy failed at path /spec/securityContext/runAsGroup/
[07:58:28] ...
[07:59:03] What does that mean?
[07:59:21] internal k8s stuff that should not break xd, looking into it
[08:00:08] Another job | once-per-hour | ~/jobs/once_per_hour.sh | schedule: 17 0,4-22 * * * | php8.2 | none | yes | /data/project/persondata/logs/once_per_hour.out | /data/project/persondata/logs/once_per_hour.err | none | default | all | no | none | Last schedule time: 2024-09-04T04:17:00Z |
[08:00:34] the rest seems to run fine
[08:07:03] hmm, the cluster is being very slow, just got a timeout
[08:07:57] I was able to kick-start the job though
[08:08:03] https://www.irccloud.com/pastebin/O42pFNOt/
[08:11:30] they are all getting errors though
[08:11:37] (three-per-hour and once-per-hour I mean)
[08:11:39] Is there any way I can kickstart such a job other than deleting and rescheduling it?
[08:12:30] yes, with `toolforge jobs restart`
[08:13:07] that will also force it to run right away (not wait for the previous schedule)
[08:13:12] fyi
[08:14:24] if you are using filelogs btw, I recommend adding `| ts` to the job command, like `php ~/td/process_statistics.php | ts`, that will add the timestamp to each line in the log
[08:14:38] toolforge jobs restart once-per-hour … hmm … requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='api.svc.tools.eqiad1.wikimedia.cloud', port=30003): Read timed out. (read timeout=30)
[08:24:25] Wurgl: your jobs are quite old, can you try deleting them and recreating them? you can do something like `toolforge jobs dump > myjobs.yaml`, `toolforge jobs flush` (this will delete them all) and `toolforge jobs load myjobs.yaml`
[08:25:57] … compared to my age, these jobs are very young ;^)
[08:26:46] hahahaha, yep, I feel similarly xd
[08:30:32] Okay, case closed. Next question: I can add an option -mem … which units are recognized? "G" is gigabyte, okay, what else? Can/shall I reduce the memory to 128 K (is "K" understood)?
[08:31:18] or 256K or 1.5G?
[08:33:53] It will understand it yes, though 128K is very little, I recommend at least 50M (for very very small things, probably golang or compiled languages) or higher
[08:34:27] usually the default is ok until you need more
[08:36:34] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Jobs_framework <-- maybe you want to explain that option a little bit; I read only Gi, G and Mi there
[08:37:22] Yesterday I ran out of memory, so I am trying to reduce it a little bit.
[08:41:35] Oh, I see, you want to actually force less memory than the default, good point
[08:45:54] https://www.php.net/manual/de/function.memory-get-peak-usage.php <-- I am using this function to check the max memory usage
[08:48:31] you'll probably need a bit more than that, for the wrapper shell and such
[08:49:11] it's an interesting exercise though, I'm curious to see how much memory you actually need xd
[08:53:23] quick review https://gerrit.wikimedia.org/r/c/operations/puppet/+/1072146
[08:53:54] and I just found a typo myself xd
[11:30:30] dcaro: 100 MB is the minimum, 128 is good enough for simple PHP scripts (with API and database access)
[11:31:01] PHP usage is 4 MB
[12:04:45] dcaro: maybe adding "colud" to https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/typos would be a good idea? ^^
[12:08:13] that'd be a lifesaver for me!
[12:19:48] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1072188
[15:24:07] Wurgl: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-units-in-kubernetes is the upstream doc for what units are used.
[15:31:08] LOL! "Pay attention to the case of the suffixes. If you request 400m of memory, this is a request for 0.4 bytes."
[15:32:44] millibytes are a silly unit
[15:34:39] 0.4 bytes are exactly 3.2 bits …
[15:38:36] mebibytes
[15:46:53] When I buy a hard drive, 1 MB is 1024 x 1024 bytes.
When I sell a hard drive, 1 MB is exactly 1000000 bytes
[15:48:52] one of them is MiB but nobody seems to use it
[16:43:44] Hi. Could someone help me understand the problem with my tool - https://github.com/DaxServer/flickypedia/pull/1
[16:43:45] I built it using `toolforge build start --ref init-curatorbot https://github.com/DaxServer/flickypedia.git`
[16:43:45] and ran it using `toolforge jobs run --image tool-curator/tool-curator:latest --emails all --command "bot --skip-until M120585117" backfillr`
[16:43:46] I keep getting the error `python /workspace/src/flickypedia/backfillr/bot.py: line 1: python /workspace/src/flickypedia/backfillr/bot.py: No such file or directory`
[16:43:46] I pulled the image `tools-harbor.wmcloud.org/tool-curator/tool-curator:latest` locally and checked it has the file.
[16:45:50] DaxServer: is there a `python` in the $PATH in the image or only a `python3`?
[16:46:57] bd808 python is in path, symlinked to python3
[16:46:57] ls -al /usr/bin/python
[16:46:58] lrwxrwxrwx 1 root root 7 Oct 11 2021 /usr/bin/python -> python3
[16:50:19] bd808
[16:50:19] ```
[16:50:20] which python
[16:50:21] $ ls -al /usr/bin/python
[16:50:21] lrwxrwxrwx 1 root root 7 Oct 11 2021 /usr/bin/python -> python3
[16:50:22] $ ls -al /usr/bin/python3
[16:50:22] lrwxrwxrwx 1 root root 10 Aug 18 2022 /usr/bin/python3 -> python3.10
[16:50:23] $ ls -al /usr/bin/python3.10
[16:50:23] -rwxr-xr-x 1 root root 5904904 Nov 20 2023 /usr/bin/python3.10
[16:50:24] ```
[16:50:24] During build, I get a `No Python version specified, using the current default of Python 3.12.1.` - I was expecting 3.12.1 to be in the image, but it seems I have 3.10
[16:51:00] DaxServer: use a pastebin please or ozone is likely to kick you out of the channel as a spammer
[16:51:17] oh, thanks for the note
[16:52:15] 3.12.1 will be in /layers/... somewhere.
Buildpacks do "interesting" things about file locations
[16:57:44] yep, you have to wrap your call with 'launcher' to set the $PATH and other vars correctly, like `launcher bash` will start a shell where `which python` should point to the buildpack installed one
[16:57:55] you can also run `launcher python` and get the python shell directly
[16:58:21] (the toolforge jobs command adds `launcher` automatically to your command, you can see it in toolforge jobs list -o long, or toolforge jobs show )
[17:01:09] ah I see, thanks for the tip dcaro
[17:01:19] DaxServer: I think that you are hitting the issue where you can't pass arguments to a procfile entry
[17:01:53] T356016
[17:01:57] T356016: [builds-builder,jobs-api,upstream] Calling nontrivial Procfile commands with arguments results in confusing error (“no such file or directory”) - https://phabricator.wikimedia.org/T356016
[17:02:23] there's no fix yet, but there's a workaround
[17:02:24] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service#Example%3A_Python_web_service
[17:03:23] you wrap your command in a shell script (say, bot.sh), then in the procfile you have `bot: bot.sh`, and in the shell script you have something like `python /workspace/src/flickypedia/backfillr/bot.py "$@"`
[17:06:20] dcaro This will run inside the `launcher` scope in Toolforge, and thus there is no need to put that inside bot.sh, did I get that right?
[17:06:53] yep, correct, toolforge will take care of wrapping everything in the `launcher` program
[17:06:54] !log lucaswerkmeister@tools-bastion-13 tools.sal added health-check-path: /toolinfo.json to service.template (probably the cheapest endpoint of the tool?), which should hopefully reduce the need for manual restarts; also removed canonical: True which hasn’t been needed for a while now
[17:06:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sal/SAL
[17:07:16] done ^^ (re @lucaswerkmeister: (feels like this tool could use a health check URL 🤔 maybe I can do that later))
[17:10:26] gtg, I might be around later, DaxServer let me know how it goes, I might be able to send a PR later (or tomorrow) to help if you still have issues
[17:15:17] Thanks, I'll let you know
[17:22:24] dcaro Seems to work with the `/bin/bash` shebang but not `/usr/bin/env bash`. Now I have to update the Pywikibot envvars and ...
[17:31:25] @lucaswerkmeister: nice hack :)
[17:31:48] I didn’t want to bother you with a PR for an empty PHP file :P
[17:33:00] I need to move that repo into gitlab where the world isn't held hostage by the number of spoons I can hold.
[18:50:52] hello yall :)
[18:55:37] hi
[19:03:20] i wanna help a friend who currently runs infra on toolforge
[19:03:22] sadly php
[19:03:32] but i wanna help him clean up and improve it, or port it to py
[19:03:36] any resources )
[19:03:38] :)
[19:08:19] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Tool_Accounts
[20:34:19] andre, awesome thanks
[20:34:28] ill bug yall later for sure :)
[20:42:50] tian_: when you come back, please try not to denigrate the implementation languages chosen by others. Having personal opinions and preferences is great. Crapping on the choices of others is not so great.
[20:44:49] Picking on PHP is a popular internet sport, but maybe one that should be practiced further away from the largest non-profit PHP web project in the world.
[20:46:31] i mean it because i am not well versed in it, therefore it is sad i am not able to improve on the current one,
[20:46:46] as i could if i had more experience with it
[20:47:38] no intent to offend
[21:00:18] !log lucaswerkmeister@tools-bastion-13 tools.lexeme-forms deployed 38b3b281ed (fix two ZIDs for Breton templates)
[21:00:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
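
Note on the memory-unit exchange (08:30 to 15:48): the suffix arithmetic discussed there can be checked with a small script. What follows is only a rough sketch, not Kubernetes' or Toolforge's own parser. The function name `quantity_to_bytes` is made up for illustration, and the suffix table is limited to the ones the upstream Kubernetes page lists (E, P, T, G, M, k, their power-of-two forms Ei, Pi, Ti, Gi, Mi, Ki, and the milli suffix m). Which spellings the `toolforge jobs` memory option itself accepts is not checked here.

```python
from decimal import Decimal

# Multipliers for the suffixes listed in the upstream Kubernetes
# resource-units documentation. Case matters: "M" is mega, "m" is milli.
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
    "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
    "k": 10**3, "M": 10**6, "G": 10**9,
    "T": 10**12, "P": 10**15, "E": 10**18,
    "m": Decimal("0.001"),
}


def quantity_to_bytes(quantity: str) -> Decimal:
    """Convert a quantity string such as "128Mi" to a number of bytes.

    Hypothetical helper for illustration only. Unknown suffixes make
    Decimal() raise decimal.InvalidOperation.
    """
    text = quantity.strip()
    # Check two-character suffixes first so "Mi" wins over a shorter match.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * SUFFIXES[suffix]
    # No suffix: the value is already in bytes.
    return Decimal(text)


if __name__ == "__main__":
    for example in ("128Mi", "1.5G", "400M", "400m"):
        print(example, "->", quantity_to_bytes(example), "bytes")
```

Using `Decimal` rather than `float` keeps the "400m is 0.4 bytes" case exact, which is the point of the upstream warning about suffix case.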