[05:58:25] <dcaro>	 I'm not around today, but maybe taav.i or d.hinus or godo.g can have a look. I think we can even double the maxconn again (from the CPU/mem/net usage perspective)
[10:31:13] <taavi>	 !log admin remove /var/lib/prometheus/node.d/kernel-messages.prom which was left over from the now-removed exporter
[10:31:20] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:40:21] <godog>	 !log metricsinfra set default timezone for grafana 'wikimedia cloud services' org to UTC
[11:40:23] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Metricsinfra/SAL
[12:52:28] <wm-bot>	 !log damian-scripts@tools-bastion-15 tools.cluebotng bot deployed @ refs/tags/v1.2.4
[12:52:30] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng/SAL
[13:01:35] <ItsNyoty>	 Hello, I've got a problem trying to install my Python bot on Toolforge. While trying to create a virtual environment I receive the error that I need to install python3-venv, but since it's a shared server I don't have access to run sudo commands to install it. Is there an option to get root on your Toolforge Debian server?
[13:02:41] <taavi>	 ItsNyoty: venvs cannot be created on the bastions directly, see https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Virtual_environments_with_prebuilt_images
[13:02:46] <Damianz>	 builds appear to be failing with `[step-inject-buildpacks] 2025-09-26T12:59:43.832107903Z wget: can't connect to remote host (208.80.154.145): Connection refused` =\ assume it's related to the maintenance/upgrade but is pretty impactful
[13:03:26] <taavi>	 i would guess that's related to the gitlab maintenance. but yep, not great that builds fail during that
[13:04:05] <Westvleteren>	 ItsNyoty: https://github.com/Daniuu-Wikipedia/DaniuuBot/blob/main/Toolforge_venv/Create_venv.sh (the script I showed you) should be run as a job
[13:04:11] <Westvleteren>	 Sorry, forgot to mention it to you
[13:04:26] <Damianz>	 working again now, I would expect it to use a replica that is highly available... can make a ticket later
[13:10:53] <Orval>	 ItsNyoty: you see the scripts for DaniuuBot right?
[13:11:05] <Orval>	 I can help you with running the job
[13:11:17] <Orval>	 But need more time, doing five things at the same time rn
[13:16:11] <Damianz>	 Can someone shed some light on where the "CPU Usage" metric on the "Tool Dashboard" comes from? I assume some cgroup metric but don't have 'explore' access in grafana to check. I have a pod running constantly at a load of 2-3 (3 cpu), which makes sense because it cycles a lot of processes and does a bunch of network requests, but according to metrics it does a lot of not much
[16:24:01] <lucaswerkmeister>	 Damianz: even without “explore” access you can use the three-dot menu > Inspect > Panel JSON option to see what the underlying metric is (it’s a bit annoying but it works)
[16:24:06] <lucaswerkmeister>	 in this case it appears to be `sum (rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_pod_namespace=\"$namespace\"}[10m]))`
[16:25:05] <lucaswerkmeister>	 (which isn’t *super* helpful I guess, it’s still “CPU usage” of some kind. but you can search for that and find e.g. https://kubernetes.io/docs/reference/instrumentation/metrics/#:~:text=container_cpu_usage_seconds_total)
[18:14:59] <Damianz>	 That's definitely sub-optimal compared to being able to just see the metrics, but good tip... so metrics server
[18:23:17] <Damianz>	 Created https://phabricator.wikimedia.org/T405782 regarding my earlier comment about builds exploding
[19:16:32] <mutante>	 !log mailman shutting down instance mailman-puppetserver-1 - T402889
[19:16:36] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Mailman/SAL
[19:16:37] <stashbot>	 T402889: Puppet CA certificate Puppet CA: mailman-puppetmaster.mailman.eqiad.wmflabs expired - https://phabricator.wikimedia.org/T402889