[09:25:01] hello folks
[09:25:47] I found https://www.featurestore.org/ that is very interesting, a nice summary in my opinion (namely from somebody super ignorant about the subject)
[09:26:09] IIUC the opensource solutions available are Feast and Hops
[09:26:34] and I am a little puzzled about how they both handle storage
[09:26:57] it seems that there is the distinction between offline and online, namely training vs serving
[09:27:15] for serving, there is the support for Redis/Cassandra
[09:27:37] but IIRC we'll not need it initially (but we may later on down the next fiscal year road?)
[09:27:57] for offline/historical, the situation seems less straightfoward
[09:28:00] *forward
[09:28:35] Hops provides Hive/Hudi support, Feast seems to support only BigQuery
[09:29:11] (https://github.com/feast-dev/feast/issues/482 is very interesting)
[09:31:00] (https://github.com/feast-dev/feast/issues/259 talks about Hive + HBase)
[09:31:10] what is our idea about the feature store?
[09:31:45] We'd need to understand it to buy the right hardware, and also to sync with Data Engineering about our needs
[09:48:55] https://doordash.engineering/2020/11/19/building-a-gigascale-ml-feature-store-with-redis/ is also another interesting example, all Redis based
[10:53:22] I sent a link to a machine/capex planning doc on Slack, feel free to add text/info/comments in preparation for Monday's meeting.
[13:07:45] klausman: o/ https://gerrit.wikimedia.org/r/c/operations/puppet/+/682097 - naming ok for you?
[13:14:46] sec
[13:15:38] +1'd
[13:19:29] <3
[13:57:55] Lift-Wing, SRE-swift-storage, Machine-Learning-Team (Active Tasks), Patch-For-Review: Swift account to store ML models - https://phabricator.wikimedia.org/T280773 (elukey) Open→Resolved a:elukey ` elukey@ml-serve1001:~$ cat .s3cfg [default] access_key = mlserve:prod host_base = https...
[13:57:57] Lift-Wing, Machine-Learning-Team (Active Tasks): Find a way to store models for Kubeflow - https://phabricator.wikimedia.org/T280025 (elukey)
[13:58:14] s3cmd worked with our new account --^
[13:59:12] Lift-Wing, Machine-Learning-Team (Active Tasks): Find a way to store models for Kubeflow - https://phabricator.wikimedia.org/T280025 (elukey) Swift account created in T280773 on Thanos Swift (also tested it with `s3cmd`)
[16:16:49] elukey: glad to hear s3cmd works for thanos swift
[16:17:03] woot
[16:18:43] also yeah offline/historical is the hard part of feature stores :/
[16:30:13] Is there a minimum viable decision around the feature store we can get to in order to do a rough hardware spec? And then only decide on the rest of the specifics of the system down the road?
[16:59:25] chrisalbon__: in theory the "offline" part of the feature store should ideally be taken care of by what is already offered by DE (like Hive) plus maybe something else like Hudi/Iceberg, that is IIUC what should be used in Train Wing. The main problem is that nobody seems to really support what we need (maybe only Hops does, even if DE prefers Iceberg over Hudi)
[16:59:54] the online part is easier, if we need it, since it is Redis or Cassandra
[17:00:13] this assuming that we'll not create our own custom Feature store :D
[17:00:29] accraze: lemme know if what I wrote is totally wrong or not --^
[17:00:42] I have zero experience with feature stores, but it seems a world of closed source options :(
[17:36:23] elukey: I think you've got it. We may not need an online feature store for Lift Wing / serving anytime soon
[17:37:04] offline is what I think will be used in exploration/analysis and model training (Train Wing)
[17:37:29] I think Feast can support most blob storage now (BigQuery isn't required), and it seems to support the S3 API so maybe Swift would work? I know it uses Spark to handle ingestion from offline to online, which I believe is already in use here.
[17:42:18] accraze: it would be great if we could fetch data directly from the Hadoop cluster for Train Wing, without pushing a lot of data to Swift.. today I was reading https://github.com/feast-dev/feast/issues/259 that was not very encouraging
[17:43:30] (we don't have HBase but it is included in Bigtop if needed)
[17:44:02] (HBase is suggested for serving so not a concern)
[17:44:38] ahh interesting, yeah I agree fetching directly from Hadoop would be ideal
[17:45:25] https://github.com/feast-dev/feast/discussions/1382 - just found it
[17:47:05] oh nice!
[17:47:47] that also brought me to https://www.applyconf.com/, all videos on yutube
[17:47:51] *youtube
[17:48:33] oh whoa, that was yesterday
[17:49:48] yes! :D
[17:50:08] great timing
[17:50:13] https://www.logicalclocks.com/hopsworks-featurestore looks like it has full support for Kubeflow
[17:53:16] hmm elukey accraze i set up a meeting on monday with search eng to talk about future data platform stuff, which I think might be relevant to things you are talking about here
[17:53:30] shall I invite you too?!
[17:54:23] discussions around
[17:54:24] https://docs.google.com/document/d/15QqLTsKIrUCfhGPHIkl6OKeh2S1NZeEe4h0O9yTm7Fo/edit#heading=h.k12p3bpi70y5
[17:54:50] and https://docs.google.com/document/d/1CpxSbL1RfCfnSnl2tFrMLoLFx_Dd2I4zRzlaM1qjcCw/edit
[17:57:10] ottomata: I can join for sure, haven't brought it up before to the team since this use case still seems a bit fuzzy, but we can write a generic use case for the "offline feature store", namely fetching data from Hive/Hadoop in some form and materializing it into features (this is my super ignorant view)
[17:57:46] finding an opensource project that we can use and that leverages Hive/Hadoop/Spark seems challenging
[17:58:00] for the moment only Hops seems to be doing it
[17:58:16] Feast is another one that could be good, but seems less integrated with Hadoop
[18:00:09] accraze: what do you think?
[18:00:25] ottomata: when is the meeting? If we are not busy I think it would be interesting to join
[18:00:34] (dinner time, will read later :)
[18:07:54] ottomata: yeah same here, not sure if I'll have much to add but definitely interested
[18:09:20] I'll attend if I can, but it's the Director of DE's first day and annual planning.
[18:32:06] (afk for the weekend, have a good day folks :)
[18:32:15] (and weekend!)
[18:47:21] Machine-Learning-Team, ORES, Scap: Scap deploy for ORES reports success even when uwsgi fails to start up - https://phabricator.wikimedia.org/T280998 (Halfak)
[19:25:31] let's schedule another meeting another day then :) this one will just be search and de, prob better to keep it small atm
[19:50:11] Lift-Wing, Machine-Learning-Team (Active Tasks): Naming convention for the model storage structure - https://phabricator.wikimedia.org/T280467 (ACraze) Thanks @Theofpa , this is really helpful right now as I'm working with a sandbox KF install to test some of our models with. > Object key can have tags...
[22:06:25] Lift-Wing, Machine-Learning-Team (Active Tasks): Naming convention for the model storage structure - https://phabricator.wikimedia.org/T280467 (Theofpa) > I noticed you are using a timestamp for the version, is this to ensure a unique key? We currently use semver with ORES but I could see how that could...