[00:19:58] Machine-Learning-Team, Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11248240 (Eevans) >>! In T401021#11246721, @isarantopoulos wrote: >>Update: Gr...
[01:03:14] (CR) Tim Starling: [C:+2] Allow filtering models by rc_source instead of rc_type [extensions/ORES] - https://gerrit.wikimedia.org/r/1187807 (https://phabricator.wikimedia.org/T74157) (owner: Zabe)
[01:05:46] (CR) Tim Starling: [C:+2] Replace most usages of rc_type with rc_source [extensions/ORES] - https://gerrit.wikimedia.org/r/1187802 (https://phabricator.wikimedia.org/T74157) (owner: Zabe)
[01:12:14] Machine-Learning-Team, Data-Persistence, Data-Persistence-Design-Review, Growth-Team, and 3 others: Data Persistence Design Review: Improve Tone Suggested Edits newcomer task - https://phabricator.wikimedia.org/T401021#11248269 (Eevans) >>! In T401021#11197656, @achou wrote: > > [ ... ] > > @Eev...
[01:17:14] (Merged) jenkins-bot: Allow filtering models by rc_source instead of rc_type [extensions/ORES] - https://gerrit.wikimedia.org/r/1187807 (https://phabricator.wikimedia.org/T74157) (owner: Zabe)
[01:19:55] (Merged) jenkins-bot: Replace most usages of rc_type with rc_source [extensions/ORES] - https://gerrit.wikimedia.org/r/1187802 (https://phabricator.wikimedia.org/T74157) (owner: Zabe)
[05:58:40] Machine-Learning-Team, Essential-Work: Update tone-check training pipeline to use Parquet datasets instead of CSV - https://phabricator.wikimedia.org/T406117#11248465 (kevinbazira) Open→Resolved
[07:22:27] Hola!
[07:29:24] good morning
[07:40:12] good morning!
[08:02:47] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11248752 (OKarakaya-WMF)
[08:03:09] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11248753 (OKarakaya-WMF)
[08:08:55] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11248766 (gkyziridis) ==Update== Since both tables do not include PII data, we configure them under `static_data/sanitization/event_sanitized_main_allow...
[08:16:27] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11248782 (isarantopoulos) Can we verify that the tables exist before we resolve this? I ran a quick check and the table `event_sanitized.mediawiki_page_...
[08:40:02] kevinbazira: I approved the MR: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1726 Thnx so much for tackling that one mate!
[08:40:24] georgekyz: thanks for the review :)
[09:05:46] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11248985 (santhosh) The LLM evaluation could be a little bit systematic. Currently GPS-OSS is 20B param mode, which we are comparing with Aya Expanse 32B. And Aya Expanse is an older mo...
[09:33:07] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11249149 (gkyziridis) >>! In T405358#11248782, @isarantopoulos wrote: > Can we verify that the tables exist before we resolve this? I ran a quick check...
[09:48:24] Machine-Learning-Team, Essential-Work: Orchestrate end-to-end tone-check pipeline using the TriggerDagRunOperator - https://phabricator.wikimedia.org/T406302#11249205 (kevinbazira) After stripping down the tone-check pipeline, it still ran successfully in [[ https://wikitech.wikimedia.org/wiki/Data_Platf...
[09:49:50] isaranto, klausman - I think that ml-serve1012 is ready for a test, it is currently cordoned in k8s but it should be able to get pods.. The missing thing is the node labeller, so we cannot really ask to run a pod with a GPU of a given size
[09:50:01] but it shouldn't block testing
[09:53:40] wow, thanks! let's go trixie!
[10:01:28] I've asked the k8s sig for an opinion about the node labeller as well, maybe we can add it without too much pain
[10:01:47] (to have also the possibility to select gpu vram size etc..)
[10:33:13] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11249393 (isarantopoulos) Resolved→Open Thanks for clearing that up! I assume data eng are the ones that deploy this but we could open the patch for...
[10:35:05] Machine-Learning-Team, Data-Platform-SRE: Investigate Label functionality of AMD GPU device plugin on k8s - https://phabricator.wikimedia.org/T373806#11249395 (elukey)
[10:35:06] Machine-Learning-Team: Setup & experiments for MI300x GPUs used for LiftWing - https://phabricator.wikimedia.org/T403599#11249396 (elukey)
[10:36:48] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11249399 (gkyziridis) >>! In T405358#11249393, @isarantopoulos wrote: > Thanks for clearing that up! I assume data eng are the ones that deploy this but we...
[10:38:21] Machine-Learning-Team, Data-Platform-SRE: Investigate Label functionality of AMD GPU device plugin on k8s - https://phabricator.wikimedia.org/T373806#11249403 (elukey) Some high level notes: * The node labeller component is a K8s controller, so it needs to run as a pod (it is unlikely that we can run it...
[10:38:24] elukey: I am out sick today and tomorrow, so feel free to proceed (or wait until I am back). Thanks for the update
[10:39:05] klausman: o/ I just updated the task with the info, we can chat when you're back about next steps (both labeller and ml-serve1012), there is no rush
[10:39:31] ack.
[10:40:19] isaranto: The patch for exporting tables in the event_sanitized schema is deployed. I updated the phab task: https://phabricator.wikimedia.org/T405358#11249399, I think we can close it.
[10:43:28] awesome georgekyz ! go ahead and resolve it then!
[10:43:29] jfyi: the thumbnails you have attached on the task don't have public visibility so all we can see is {F66735901}
[10:44:17] isaranto: really?? oh I thought they are available... I am gonna recheck that. They are not available even if you refresh the page ?
[10:45:25] I usually just use copy+paste in the comment. It probably takes some time to upload the image or something like that.
[10:46:11] refreshing won't help, it has to do with permissions `Access Denied: Restricted File`
[10:46:37] https://phabricator.wikimedia.org/F66735901 is not available -- you won't be able to verify because you have access
[10:47:19] ah wait, ofc you can, with a private browser or just not logged in
[10:50:50] isaranto: https://phabricator.wikimedia.org/T405358#11249399 Is it available now ?
[10:51:34] same
[10:52:10] nice
[10:53:08] I used the uploading tool from phabricator. is there anything else that I need to do ?
[10:54:25] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11249438 (OKarakaya-WMF)
[10:55:08] not sure but you could follow the documentation https://www.mediawiki.org/wiki/Phabricator/Help/lb#File_visibility
[10:55:44] and check if you can change access from there, otherwise I can ping you later to have a look together
[10:57:21] I think I found it
[10:57:26] thnx
[11:03:24] Lift-Wing, Machine-Learning-Team: Add LiftWing streams data to event_sanitized (increase data retention) - https://phabricator.wikimedia.org/T405358#11249470 (gkyziridis) Open→Resolved
[11:06:05] kevinbazira: Hey Kevin, I think there is no `enwiki` for the reverted model, I updated the phab task here: https://phabricator.wikimedia.org/T403236#11241938. Should I open an explicit phab task for the `reverted` model and close this one? Feel free to comment your thoughts on the ticket. Thank you in advance
[11:06:40] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11249499 (OKarakaya-WMF)
[11:06:52] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11249503 (OKarakaya-WMF)
[11:13:35] ack. looking ...
[11:46:15] Machine-Learning-Team, Essential-Work: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11249666 (kevinbazira) Thank you for working on this @gkyziridis. Since there is no enwiki reverted model (and we don't plan to train one), the locust test for this is...
[12:13:48] (PS1) Gkyziridis: locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236)
[12:15:04] (CR) CI reject: [V:-1] locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236) (owner: Gkyziridis)
[12:40:28] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11249840 (gkyziridis) ==Update== - Locust tests issue fixed ✅ - **2nd option**: Deploy enwiki-goodfaith on staging and keep the test aligned wit...
[12:41:45] (PS2) Gkyziridis: locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236)
[12:42:55] (CR) CI reject: [V:-1] locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236) (owner: Gkyziridis)
[12:44:06] (PS3) Gkyziridis: locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236)
[12:47:09] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11249851 (gkyziridis)
[12:53:38] Hey folks, whenever anybody has some time, please cast an eye over this patch for review: https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/1194185
[13:32:42] just out of a meeting. taking a look now ...
[13:36:28] (CR) Kevin Bazira: [C:+1] "Thank you for working on this George. LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236) (owner: Gkyziridis)
[13:43:21] (CR) Gkyziridis: [C:+2] locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236) (owner: Gkyziridis)
[13:43:35] Machine-Learning-Team, Essential-Work: Orchestrate end-to-end tone-check pipeline using the TriggerDagRunOperator - https://phabricator.wikimedia.org/T406302#11250084 (BTullis) I have deployed the new version of the airflow chart to the airflow-ml instance. This has fixed the issue, I believe. ` btullis...
[13:43:50] (Merged) jenkins-bot: locust_test: Add locust (load) test for revscoring-editquality-reverted model. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1194185 (https://phabricator.wikimedia.org/T403236) (owner: Gkyziridis)
[13:54:01] Machine-Learning-Team, Goal: Q1 FY2025-26 Goal: Airflow training pipeline for Tone check model - https://phabricator.wikimedia.org/T398970#11250151 (kevinbazira) * Updated the tone-check training pipeline to use parquet files instead of csv: ( [job logic](https://gitlab.wikimedia.org/repos/machine-learni...
[13:59:19] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11250161 (gkyziridis) Open→Resolved
[15:18:24] Machine-Learning-Team, DC-Ops, ops-eqiad, SRE: eqiad row C/D Machine Learning host migrations - https://phabricator.wikimedia.org/T405647#11250698 (RobH) a:RobH→klausman @klausman, Can you provide feedback on when we can migrate these hosts from one network port to the new network port?...
[17:20:25] Machine-Learning-Team, Add-Link-Structured-Task, Growth-Team (FY2026-26 Q2 Sprint 1): Add a Link: Rollout "Add a Link" Structured Task to Wikipedias that are supported by V2 model - https://phabricator.wikimedia.org/T404460#11251403 (KStoller-WMF) a:Urbanecm_WMF