[10:52:27] errand+lunch
[14:07:07] o/
[14:58:18] \o
[15:02:56] o/
[15:04:55] think it's worthwhile to set up an ml ingest pipeline in relforge? I was pondering it, but then i was pondering 144-core relforge vs the 10k-core hadoop cluster
[15:10:53] had this discussion with Fabian, and yes, using the hadoop cluster to compute the embeddings might be more realistic
[15:11:24] ok, i'll look in that direction then. Was trying to find the relevant ticket but no luck yet
[15:11:40] it's interesting to know how opensearch would react with an ingest pipeline, but relforge is probably not appropriate for that; with two nodes it's hard to allocate ml-only nodes
[15:11:47] separately, mildly annoyed at a user comment on regex search: "Maybe wikis would need a simple plain-text search, without all the flawed indexed "magic", and the resource-hoggy regex magic."
[15:11:58] yes, saw this :/
[15:12:08] haven't had time to respond
[15:12:52] not sure why this is a problem now, suspecting that the recent timeout issues with rest-gateway made them quite suspicious about all this
[15:13:16] i did see something in the docs about making normal nodes also do the ml stuff, but yea, i suspect it will turn a 1-day thing into a 1-week thing computing in relforge
[15:14:05] yea, i'm not sure either. i did notice the in-request timeout sent to cirrus is 15s, but that looks to have been the shard timeout for a long time
[15:14:12] somehow i thought the shard timeout was 60s
[15:14:25] there's no ticket about that yet, but I think we could agree with Fabian on the shape of the index so that he can adapt his notebook to produce all this and we'll ingest it
[15:15:07] i'll try and set up an ingestion notebook then: read simplewiki, gen vectors, and ship to relforge
[15:15:14] yes, the 15s was decided to be sure we return before the MW timeouts, to avoid leaking opensearch compute threads outside of the poolcounter protection
[15:17:55] a quick check to make sure that you run the pre-trained opensearch models the same way as inside opensearch might be interesting
[15:19:13] hmm, yea, i suppose i can set up a demon pipeline, am at least curious how that works out
[15:19:17] s/demon/demo/
[15:29:56] there's a predict api in opensearch that could be used as well, to compare embeddings computed from python vs the ones computed by the ml plugin
[15:30:22] the pre-trained models I've seen are all TorchScript
[15:33:31] sure, can do that. i was going to try and use spark-ml, but if that's being difficult i'll probably switch to python
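A minimal sketch of the embedding comparison discussed above, assuming an ml-commons text_embedding predict endpoint and an already-deployed model; the host, model id, and model name are placeholders rather than actual relforge values:

```python
# Sketch: compare a locally computed sentence embedding (sentence-transformers)
# against the vector returned by the OpenSearch ml-commons predict API.
import numpy as np
import requests
from sentence_transformers import SentenceTransformer

OPENSEARCH = "https://relforge.example:9200"          # placeholder host
MODEL_ID = "REPLACE_WITH_DEPLOYED_MODEL_ID"           # ml-commons model id (placeholder)
MODEL_NAME = "sentence-transformers/msmarco-distilbert-base-tas-b"  # example model only

text = "A quick check that both sides produce the same vector."

# Local embedding computed in python.
local_vec = SentenceTransformer(MODEL_NAME).encode(text)

# Embedding computed by the ml plugin inside opensearch (assumed endpoint/shape).
resp = requests.post(
    f"{OPENSEARCH}/_plugins/_ml/_predict/text_embedding/{MODEL_ID}",
    json={
        "text_docs": [text],
        "return_number": True,
        "target_response": ["sentence_embedding"],
    },
).json()
remote_vec = np.array(resp["inference_results"][0]["output"][0]["data"])

# Cosine similarity should be ~1.0 if both sides run the model the same way.
cos = float(
    np.dot(local_vec, remote_vec)
    / (np.linalg.norm(local_vec) * np.linalg.norm(remote_vec))
)
print(f"cosine similarity: {cos:.6f}")
```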
[16:38:20] quick school run
[17:03:23] heading out
[17:51:10] inflatador: not a big deal, but ops-monitoring-bot is posting wdqs reimages to the opensearch 3 task
[17:52:41] ebernhardson ACK, I fixed it for the 2nd set of reimages, sorry for the trouble
[17:52:53] no worries
[17:53:23] * ebernhardson is mostly idling while waiting for maven to download the entire world...well, just spark-nlp, but it's been going for a while now
[21:17:05] damn, I used the wrong task again for those reimages
[21:57:19] inflatador: otw back with dog, 2:15
[22:15:50] inflatador: back, https://meet.google.com/fde-tbpf-wqh
[22:26:02] feeling dubious about spark-nlp...on the one hand i'm getting vectors out of it. On the other hand, i can't seem to figure out how to get it to load the model on executors from hadoop. Every time i start a run it pegs the driver's network at 120MB/s for minutes at a time, shipping the model back out to the executors
[22:27:19] (and then after the 5-minute timeout, increased from 2 minutes, some fraction of executors fail, having not finished fetching the model)
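A minimal sketch of the "switch to python" fallback mentioned earlier, as an alternative to shipping the spark-nlp model from the driver: an iterator-style pandas UDF loads the model once per task on each executor (from its local cache or a pre-staged directory), so nothing large is broadcast from the driver. The table, column, and model names are hypothetical, and this assumes a python environment with sentence-transformers is available on the executors:

```python
# Sketch: compute sentence embeddings on the hadoop cluster with a pandas UDF,
# loading the model on the executors instead of broadcasting it from the driver.
from typing import Iterator

import pandas as pd
from pyspark.sql import SparkSession, functions as F, types as T
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()

MODEL_NAME = "sentence-transformers/msmarco-distilbert-base-tas-b"  # example model only


@pandas_udf(T.ArrayType(T.FloatType()))
def embed(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Imported and instantiated inside the UDF so the model is loaded on the
    # executor, once per task, rather than serialized out from the driver.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    for texts in batches:
        vectors = model.encode(texts.tolist(), batch_size=32)
        yield pd.Series([v.tolist() for v in vectors])


# Hypothetical simplewiki content table and column names.
df = spark.table("discovery.simplewiki_content")
with_vectors = df.withColumn("embedding", embed(F.col("text")))

# Persist for a later bulk-index step into relforge.
with_vectors.write.mode("overwrite").saveAsTable("discovery.simplewiki_embeddings")
```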