[05:58:25] I'm not around today, but maybe taav.i or d.hinus or godo.g can have a look. I think we can even double the maxconn again (from the CPU/mem/net usage perspective)
[10:31:13] !log admin remove /var/lib/prometheus/node.d/kernel-messages.prom which was left over from the now-removed exporter
[10:31:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:40:21] !log metricsinfra set default timezone for grafana 'wikimedia cloud services' org to UTC
[11:40:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Metricsinfra/SAL
[12:52:28] !log damian-scripts@tools-bastion-15 tools.cluebotng bot deployed @ refs/tags/v1.2.4
[12:52:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cluebotng/SAL
[13:01:35] Hello, I've got a problem trying to install my Python bot on toolforge. While trying to create a virtual environment I receive the error that I need to install python3-venv, but since it's a shared server I don't have access to run sudo commands to install it. Is there an option to get root on your toolforge debian server?
[13:02:41] ItsNyoty: venvs cannot be created on the bastions directly, see https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Virtual_environments_with_prebuilt_images
[13:02:46] builds appear to be failing with `[step-inject-buildpacks] 2025-09-26T12:59:43.832107903Z wget: can't connect to remote host (208.80.154.145): Connection refused` =\ assume it's related to the maintenance/upgrade but is pretty impactful
[13:03:26] i would guess that's related to the gitlab maintenance. but yep, not great that builds fail during that
[13:04:05] ItsNyoty: https://github.com/Daniuu-Wikipedia/DaniuuBot/blob/main/Toolforge_venv/Create_venv.sh (the script I showed you) should be run as a job
[13:04:11] Sorry, forgot to mention it to you
[13:04:26] working again now, I would expect it to use a replica that is highly available... 
can make a ticket later
[13:10:53] ItsNyoty: you see the scripts for DaniuuBot right?
[13:11:05] I can help you with running the job
[13:11:17] But need more time, doing five things at the same time rn
[13:16:11] Can someone shed some light on where the "CPU Usage" metric on the "Tool Dashboard" comes from? I assume some cgroup metric but don't have 'explore' access in grafana to check. I have a pod running constantly at a load of 2-3 (3 cpu), which makes sense because it cycles a lot of processes and does a bunch of network requests, but according to metrics it does a lot of not much
[16:24:01] Damianz: even without “explore” access you can use the three-dot menu > Inspect > Panel JSON option to see what the underlying metric is (it’s a bit annoying but it works)
[16:24:06] in this case it appears to be `sum (rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_pod_namespace=\"$namespace\"}[10m]))`
[16:25:05] (which isn’t *super* helpful I guess, it’s still “CPU usage” of some kind. but you can search for that and find e.g. https://kubernetes.io/docs/reference/instrumentation/metrics/#:~:text=container_cpu_usage_seconds_total)
[18:14:59] That's definitely sub-optimal compared to being able to just see the metrics, but good tip... so metrics server
[18:23:17] Created https://phabricator.wikimedia.org/T405782 regarding my earlier comment about builds exploding
[19:16:32] !log mailman shutting down instance mailman-puppetserver-1 - T402889
[19:16:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Mailman/SAL
[19:16:37] T402889: Puppet CA certificate Puppet CA: mailman-puppetmaster.mailman.eqiad.wmflabs expired - https://phabricator.wikimedia.org/T402889
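
Note on the [16:24:06] query: `container_cpu_usage_seconds_total` is a cumulative counter of CPU seconds consumed, so `sum(rate(...[10m]))` is roughly "average cores used over the last 10 minutes", summed over the namespace's containers. That is a different quantity from load average, which also counts processes waiting to run or blocked on I/O, so a load of 2-3 alongside low CPU usage is plausible. Below is a rough sketch of the arithmetic, assuming made-up sample data and hypothetical helper names (`simple_rate`, `namespace_cpu_usage`); it only takes the counter increase over elapsed time and does not reproduce the window extrapolation that Prometheus's real `rate()` performs.

```python
from typing import Dict, List, Tuple

# (unix timestamp in seconds, cumulative CPU seconds used so far)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase of a counter over the sampled window.

    Simplified stand-in for PromQL rate(): handles counter resets but
    skips the boundary extrapolation Prometheus applies.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. container restart),
        # so the new value itself is the increase since the reset.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


def namespace_cpu_usage(series: Dict[str, List[Sample]]) -> float:
    """Sum of per-container rates, i.e. average cores used by the namespace."""
    return sum(simple_rate(samples) for samples in series.values())


# Hypothetical samples for two containers over a 10 minute window.
series = {
    "worker": [(0.0, 100.0), (300.0, 160.0), (600.0, 220.0)],
    "sidecar": [(0.0, 10.0), (300.0, 13.0), (600.0, 16.0)],
}

usage = namespace_cpu_usage(series)
print(f"{usage:.2f} cores")
```

With these made-up numbers the worker burns 120 CPU seconds in 600 seconds of wall time (0.2 cores) even if many short-lived processes were queued during that window, which is the kind of gap between load and "CPU Usage" described at [13:16:11].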