[00:06:16] @inflatador: Will do. I thought we should be able to see it in Grafana, too, so thank you for confirming that that's anomalous.
[00:07:22] @swfrench-wmf: For the memory issue, https://phabricator.wikimedia.org/T400515. For the other, we don't (yet) have a direct task, but some of the symptomatic logs can be seen here: https://phabricator.wikimedia.org/T400757. We seem to have an undocumented 100kB request size limit in our backend services.
[00:39:42] apine: got it, thanks. from a quick glance, a default heap size of 512 MiB sounds plausible given the container memory limit you're using on orchestrator - i.e., 50% of 1GiB, which is coming from the default value in the orchestrator chart [0].
[00:39:42] if needed, you should be able to override those defaults in the helmfile values for the service - e.g., in [1] - though, depending on what value you have in mind, some other changes may be necessary on our end to allow it (there are limits on how much you can request).
[00:39:42] in any case, I can ask around a bit tomorrow about the other one (the request size limit doesn't sound familiar offhand).
[00:39:42] [0] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/function-orchestrator/values.yaml#21
[00:39:42] [1] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/wikifunctions/values-main-orchestrator.yaml
[00:44:51] @swfrench-wmf: Thank you! Okay, that makes sense. In local testing, 2GiB seems safe, but this issue is hard to repro - it's ultimately related to GC, so very spiky. I'd want to try bumping the chart to 2 CPUs and 4GiB. Would that be reasonable, or beyond the limits for what we can request?
[00:52:22] that cpu limit should be fine, but I think the memory limit might require a change on our side to allow it. I'll take a look to confirm tomorrow, unless someone else from serviceops does in the interim :)
[20:26:26] Are the ES hosts for trace.wikimedia.org directly queryable somewhere, rather than going via trace.wikimedia.org/api/?
[20:37:24] addshore: you can also get at the raw spans storage via the kibana "discover" interface - I can help you with that tomorrow if you need
[20:37:49] ooooo
[20:38:02] how long are they kept for? and what's the request sampling rate?
[20:43:09] oooooh, `select * from jaeger-span-2025.08.06 limit 10`
[21:00:00] oooh, and found it in discover, thanks for the pointer
[21:06:51] addshore: sampling rate depends on the service, you can find it in deployment-charts; for mediawiki I think it's 0.1% iirc
[21:07:00] Retention is 90d
[21:07:04] 👍
[21:07:38] oh except mw-debug is 100% :)
[21:11:36] right, time for a nap
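
As a follow-up to the undocumented ~100kB request size limit mentioned early in the log: a minimal sketch of how one might bisect where requests start failing. The target URL is a placeholder, and the actual failure mode (413, connection reset, an envoy or service-level error) is an open question, not something confirmed in the chat.

```python
# Rough probe for an apparent request size limit: POST progressively
# larger payloads and report where behaviour changes.
# TARGET is hypothetical - substitute the real backend endpoint.
import requests

TARGET = "https://example-backend.discovery.wmnet/endpoint"  # placeholder

for size_kb in (50, 90, 95, 99, 100, 101, 105, 110, 150):
    payload = "x" * (size_kb * 1024)  # size_kb kilobytes of dummy data
    try:
        r = requests.post(TARGET, data=payload, timeout=10)
        print(f"{size_kb:>4} kB -> HTTP {r.status_code}")
    except requests.RequestException as exc:
        print(f"{size_kb:>4} kB -> error: {exc}")
```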
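
For the raw-spans question at the end of the log, a minimal sketch of querying the Jaeger span index directly, roughly what the `select * from jaeger-span-2025.08.06 limit 10` quip and the Kibana "discover" pointer amount to. The Elasticsearch endpoint below is a placeholder (the real cluster address and any auth are not covered in the chat); the index name is the one mentioned above, and the field names follow the standard Jaeger Elasticsearch span schema.

```python
# Fetch the 10 most recent spans from a daily jaeger-span index.
# ES_HOST is hypothetical - point it at the actual spans cluster.
import requests

ES_HOST = "https://jaeger-es.example.wikimedia.org:9200"  # placeholder
INDEX = "jaeger-span-2025.08.06"  # daily index, as named in the chat

query = {
    "size": 10,  # the "limit 10"
    "sort": [{"startTimeMillis": {"order": "desc"}}],
    "query": {"match_all": {}},  # e.g. swap in {"term": {"process.serviceName": "mediawiki"}}
}

resp = requests.post(f"{ES_HOST}/{INDEX}/_search", json=query, timeout=30)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    span = hit["_source"]
    print(span.get("traceID"), span.get("operationName"), span.get("duration"))
```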