[08:33:26] I'll merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1177956 which will change things about throttling in nftables, this might impact Gerrit (on the https port), I'll be monitoring the situation but please let me know if you see something!
[08:42:03] (reverted)
[12:49:33] sukhe: thanks, merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/1178585
[12:49:57] thanks! I will take care of prod DNS bits
[13:43:56] summary: PYBAL CRITICAL - CRITICAL - labweb-ssl_7443: Servers cloudweb1004.wikimedia.org are marked down but pooled
[13:44:49] I see mixed signals on who was last working on it, but moritzm and taavi, maybe? ^
[13:45:09] I'm not aware of any work there but I can have a look
[13:45:28] ok thanks
[13:52:55] moritzm: I think this is from https://gerrit.wikimedia.org/r/c/operations/puppet/+/1098556, the haproxy_allowed_healthcheck_sources hiera key used to generate the LOAD_BALANCER_HEALTH_CHECKS firewall set only contains addresses in the private VLANs but cloudweb hosts have public addressing
[13:54:18] ok, I'll push a revert for 1098556 and fix that up separately
[13:55:10] thank you!
[13:55:22] thanks, folks! might be helpful to depool the server in the meantime if required
[13:56:10] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1178884
[13:56:35] now I wonder why this is affecting 1004 only and not 1003
[13:57:13] sukhe@lvs1019:~$ curl localhost:9090/pools/labweb-ssl_7443
[13:57:13] cloudweb1004.wikimedia.org: enabled/down/pooled
[13:57:13] cloudweb1003.wikimedia.org: enabled/down/not pooled
[14:02:39] that's the depool threshold saving us i think
[14:05:40] revert has been merged and I've forced puppet runs on cloudweb
[14:05:46] thanks! pybal is happy again
[14:07:14] If pybal is happy, I'm happy!
[14:08:06] we should get soon get used to making Liberica happy
[14:08:16] one less get
[15:26:59] Anyone know how EmailAuth is triggered for an account?
https://www.mediawiki.org/wiki/Extension:EmailAuth, I am trying to trigger it on my test account, so I can get an email from the extension
[15:57:05] I'd probably ask t.gr on that one
[16:00:17] thanks swfrench-wmf
[16:26:39] https://www.mediawiki.org/wiki/Help:Extension:EmailAuth
[16:27:03] Trying a different IP, UA (and therefore no cookies) etc should be enough to trigger one in many cases...
[16:27:07] Make sure you've confirmed the email too
[16:27:19] orrr
[16:27:19] >This is installation-specific, but on Wikimedia wikis, if you want to test the functionality (e.g. because you are translating it), you can set a cookie with the name forceEmailAuth and the value 1 on the domain auth.wikimedia.org; as long as the cookie is present, you will always be required to go through email verification during login.
[16:27:32] oooh, nice
[16:27:45] thanks Reedy I'll give that a go
[16:27:53] oh, that's awesome
[17:29:52] andre made it possible to have templated or "stock" answers in Phabricator comments. isn't this cool? just need to be creative where and what texts make the most sense (maybe for your team) https://phabricator.wikimedia.org/F65748921
[17:31:36] oh, i'm sorry if this is not actually viewable. the point is you can select from pre-filled responses
[18:14:38] When running `kube_env cirrus-streaming-updater eqiad` (same for codfw), then `helmfile status` in `/srv/deployment-charts/helmfile.d/services/cirrus-streaming-updater` i get an error that .Values.kubernetesVersion isn't defined. I see it in /etc/helmfile-defaults/general-eqiad.yaml. Any idea what might be wrong?
[18:17:02] sigh, nm i'm just totally forgetful on helmfile arguments, there were some more needed
[18:34:50] was trying to understand why the flink-operator sometimes doesn't shut down the backfill. Logs are ...curious.
Basically it says "Observing job status", "Job Status (FINISHED) unchanged", "nothing to do"
[18:38:10] at some point in the past it changed from INITIALIZED to FINISHED, i wonder if it's some oddity about it running for such a short time period
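
(note on the 13:57 pybal paste) A minimal sketch of picking out the "marked down but pooled" hosts from that pool output. It assumes every line has the `host: enabled/health/pooled` shape seen in the two lines pasted above; the function name `down_but_pooled` is hypothetical and not part of pybal.

```python
def down_but_pooled(status_text):
    """Return hosts whose health is 'down' while they are still 'pooled'.

    Assumes the 'host: enabled/health/pooled' line shape from the paste;
    lines that do not match that shape are skipped.
    """
    hosts = []
    for line in status_text.splitlines():
        line = line.strip()
        if ": " not in line:
            continue
        host, _, state = line.partition(": ")
        fields = state.split("/")
        if len(fields) != 3:
            continue
        _enabled, health, pooled = fields
        # 'not pooled' is a different string, so only truly pooled hosts match
        if health == "down" and pooled == "pooled":
            hosts.append(host)
    return hosts


# The two status lines pasted at 13:57:13
sample = (
    "cloudweb1004.wikimedia.org: enabled/down/pooled\n"
    "cloudweb1003.wikimedia.org: enabled/down/not pooled\n"
)
print(down_but_pooled(sample))
```

With the paste as input this flags only cloudweb1004, matching the alert: both hosts failed the health check, but only 1004 was still pooled.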