[06:38:16] good morning
[07:28:30] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11239518 (OKarakaya-WMF) I've updated the [prompt](https://gitlab.wikimedia.org/repos/machine-learning/exploratory-notebook/-/blob/semantic_search_poc/semantic_search_poc/notebooks/qa_...
[07:32:24] good morning
[07:49:00] hey folks morning! Quick patch for the new gpu metrics: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1193133 (if anybody has time)
[07:49:15] Looking
[07:49:28] Machine-Learning-Team, Essential-Work: Orchestrate end-to-end tone-check pipeline using the TriggerDagRunOperator - https://phabricator.wikimedia.org/T406302 (kevinbazira) NEW
[08:30:48] fixed my patch thanks :)
[10:00:57] still not perfect but: https://grafana.wikimedia.org/d/ZAX3zaIWz/amd-rocm-gpu?orgId=1&from=now-6h&to=now&timezone=utc&var-source=000000006&var-instance=ml-serve1012:9100
[10:01:04] it shows all the 64 gpus configured
[10:02:52] Hey folks, I have a question about revscoring-goodfaith. I see in the isvc something like:
[10:02:52] ```
[10:02:52] revscoring_inference_services:
[10:02:52]   - wiki: "zhwiki"
[10:02:52]     version: "20220214192324"
[10:02:53]     predictor:
[10:02:53]       config:
[10:02:53]         minReplicas: 0
[10:02:54] ```
[10:03:13] Where can I find the versions for each wiki?
[10:04:55] You mean the available versions of the model checkpoints?
[10:06:28] not sure if this is a checkpoint, but for instance I see the image version of the revscoring model:
[10:06:28] ```
[10:06:28] predictor:
[10:06:28]   image: "machinelearning-liftwing-inference-services-revscoring"
[10:06:28]   version: "2025-08-27-093725-publish"
[10:06:28] ```
[10:06:48] So there are two versions: the image we run, and the model checkpoint
[10:06:51] and there is also a version for `zhwiki`:
[10:06:52] ```
[10:06:52] - wiki: "zhwiki"
[10:06:52]   version: "20220214192324"
[10:06:52] ```
[10:07:06] the 'version' field is about the model checkpoint
[10:07:10] alright
[10:07:12] and they can be found here https://analytics.wikimedia.org/published/wmf-ml-models/
[10:07:53] thank youuu
[10:08:24] you can also use `s3cmd ls s3://wmf-ml-models/` on a statbox to look at the S3 bucket.
[10:08:53] (you also need to supply the config file for credentials, but that's always the case with s3cmd)
[10:13:35] ack! ty
[11:05:19] hello! back online
[11:18:12] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11240582 (gkyziridis) ==Update== `enwiki-goodfaith` is deployed on staging. ` $ kube_env revscoring-editquality-goodfaith ml-staging-codfw $ kub...
[11:25:17] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11240604 (isarantopoulos) We could remove the zhwiki deployment and free up some resources. In staging it makes sense to have 1 deployment for ea...
[11:52:49] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11240719 (gkyziridis) > We could remove the zhwiki deployment and free up some resources. In staging it makes sense to have 1 deployment for each...
[12:01:28] FIRING: [3x] HelmfileAdminNGPendingChangesLiftWing: Pending admin_ng changes on ml-serve-codfw - https://wikitech.wikimedia.org/wiki/Kubernetes/Add_a_new_service#Deploy_changes_to_helmfile.d%2Fadmin_ng - https://alerts.wikimedia.org/?q=alertname%3DHelmfileAdminNGPendingChangesLiftWing
[12:05:37] Machine-Learning-Team, Goal, Patch-For-Review: Q1 FY2025-26 Goal: Scaling Add-a-link to more wikis via production (airflow) pipelines - https://phabricator.wikimedia.org/T398950#11240752 (OKarakaya-WMF) ###__**Reporting (03/10/2025)**__ **Progress update on the hypothesis for the week, including if s...
[12:06:02] Machine-Learning-Team, Semantic Search: Semantic Search POC - In article QA - https://phabricator.wikimedia.org/T405359#11240758 (OKarakaya-WMF)
[12:06:28] FIRING: [6x] HelmfileAdminNGPendingChangesLiftWing: Pending admin_ng changes on ml-serve-codfw - https://wikitech.wikimedia.org/wiki/Kubernetes/Add_a_new_service#Deploy_changes_to_helmfile.d%2Fadmin_ng - https://alerts.wikimedia.org/?q=alertname%3DHelmfileAdminNGPendingChangesLiftWing
[12:10:12] I'll do the adming stuff in a hot second
[12:18:34] ack!
[12:18:51] the tone check latency SLO is looking nice https://slo.wikimedia.org/objectives?expr=%7B__name__=%22tonecheck-latency-v1%22,%20revision=%221%22,%20service=%22tonecheck%22,%20team=%22ml%22%7D&grouping=%7B%7D
[13:01:28] FIRING: [6x] HelmfileAdminNGPendingChangesLiftWing: Pending admin_ng changes on ml-serve-codfw - https://wikitech.wikimedia.org/wiki/Kubernetes/Add_a_new_service#Deploy_changes_to_helmfile.d%2Fadmin_ng - https://alerts.wikimedia.org/?q=alertname%3DHelmfileAdminNGPendingChangesLiftWing
[13:06:28] RESOLVED: [6x] HelmfileAdminNGPendingChangesLiftWing: Pending admin_ng changes on ml-serve-codfw - https://wikitech.wikimedia.org/wiki/Kubernetes/Add_a_new_service#Deploy_changes_to_helmfile.d%2Fadmin_ng - https://alerts.wikimedia.org/?q=alertname%3DHelmfileAdminNGPendingChangesLiftWing
[13:19:20] (PS1) Gkyziridis: locust_tests: Add locust tests for revscoring models. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1193415 (https://phabricator.wikimedia.org/T403236)
[13:23:39] (PS2) Gkyziridis: locust_tests: Add locust tests for revscoring models. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1193415 (https://phabricator.wikimedia.org/T403236)
[14:49:29] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11241402 (gkyziridis) ==Update== I am having some issues while trying to run the locust tests (even for models that we had reran them like edit-c...
[15:10:08] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11241544 (isarantopoulos) @gkyziridis the error is quite self explanatory the file is just not there :) It seems that it has been renamed in htt...
[15:19:35] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11241607 (gkyziridis) >>! In T403236#11241544, @isarantopoulos wrote: > @gkyziridis the error is quite self explanatory the file is just not ther...
[15:29:15] (PS3) Gkyziridis: locust_tests: Add locust tests for revscoring models. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1193415 (https://phabricator.wikimedia.org/T403236)
[15:41:00] (PS4) Gkyziridis: locust_tests: Add locust tests for revscoring models. [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1193415 (https://phabricator.wikimedia.org/T403236)
[15:52:00] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11241938 (gkyziridis) ==Update== - Locust tests issue fixed ✅ - **2nd option**: Deploy enwiki-goodfaith on staging and keep the test aligned wit...
[15:53:49] Have a great weekend all
[15:58:50] have a nice weekend!
[16:00:48] Machine-Learning-Team, Essential-Work, Patch-For-Review: Fix revscoring load tests to match staging deployments - https://phabricator.wikimedia.org/T403236#11242090 (gkyziridis) a:gkyziridis
[16:18:21] o/ have a nice weekend!
[17:48:39] Lift-Wing, Tech-Docs-Team: Lift Wing API documentation standardization - https://phabricator.wikimedia.org/T406369 (TBurmeister) NEW
[19:25:15] Machine-Learning-Team, Growth-Team, Research, Revise-Tone-Structured-Task, and 2 others: Analyze samples of articles to see how many structured tasks we might be able to generate - https://phabricator.wikimedia.org/T401968#11242701 (achou) Open→Resolved I reposted @Sucheta-Salgaonkar-...
[20:03:44] Machine-Learning-Team, Goal, OKR-Work: Q1 FY2025-26 Goal: Apply the Tone Check model to published articles, to learn whether we can build a pool of high-quality structured tasks for new editors - https://phabricator.wikimedia.org/T392283#11242820 (achou) **Weekly Report** Progress update on the hypo...