[06:36:52] <_joe_> btullis: uhm
[06:37:39] <_joe_> we haven't discussed running ipoid, which is a fundamental production payload, on a k8s cluster dedicated to analytics payloads.
[06:37:53] <_joe_> I... am not that happy about that, let's say
[06:41:21] <_joe_> I understand you need persistence to run opensearch, but we've also made the call that, for website payloads, we will stick to running persistence off k8s exactly because we don't trust k8s to schedule io-bound payloads sensibly enough to ensure minimal latency
[06:41:45] <_joe_> while that isn't a problem for async payloads, it is for services that need to respond to live requests
[08:46:06] Hi _joe_: and thanks for sharing your concerns. I do understand where you're coming from. However, I feel there is still a good case to be made for this approach in the specific use-case of ipoid, for a number of reasons.
[08:47:02] <_joe_> as long as we have a SLO for latency and errors of the service, so that we know what we expect of it and we monitor we're within those boundaries, my concerns will be much less relevant
[08:47:16] * btullis 1) The volume of requests to ipoid is relatively low (~1rps) as it is
[08:47:27] Gah, pressed the wrong key.
[08:47:38] <_joe_> yeah, I wouldn't trust that to be true long-term (low rps to ipoid)
[08:48:27] 2) the opensearch load will not be io-bound. It will be memory bound. The whole data set is 4GB of JSON, reloaded once per day, with no ongoing indexing.
[08:49:17] There are some SLOs it seems in development here: https://wikitech.wikimedia.org/wiki/SLO/ipoid - Although it looks like they need work.
[08:55:58] And yes, I fully take your point about monitoring the latency and errors. If we find that we need to scale the OpenSearch index and migrate it off to a bare-metal cluster in time, then that's definitely possible.
[08:58:12] At the moment, though. This is a requirement for a *new* OpenSearch cluster and it would be beneficial for us to be able to deploy this without running a lot of new bare-metal or VMs, for what is effectively a small and focused service.
[09:03:22] <_joe_> how is the index only 4 GB? I guess we're removing a lot of stuff from the documents we get
[09:03:43] <_joe_> the feed is something like 7 GB compressed, 32 GB uncompressed IIRC
[09:05:01] This could be out-of-date, but it says 700MB and 4GB here: https://gitlab.wikimedia.org/repos/mediawiki/services/ipoid/-/blob/main/README.md#ipoid
[13:06:13] _joe_ the design doc for OpenSearch on k8s is at https://docs.google.com/document/d/1G7OTmBmzl5GVwoanCrzQNHMMXIrAvROpmKumdWtZfuo/edit?tab=t.0 , feel free to read it over and add your concerns
[13:15:20] FWiW, the iPoid indices are 42 GB, Kosta put some good notes at https://office.wikimedia.org/wiki/User:KHarlan_(WMF)/OpenSearch_ipoid#Server_setup
[13:45:49] <_joe_> thanks
[13:45:58] <_joe_> I'll comment on the doc if I can find the time
[13:52:38] np, I agree with all your concerns BTW, but I think we should be able to overcome them