[07:33:12] hello!
[08:25:58] Morning!
[08:26:09] morning Tobias!
[08:26:43] just got a notification that ml-testing has issues with puppet. I tried and couldn't connect to it
[08:27:02] just bringing it up! it is not urgent in any case
[08:27:55] I'll have a look
[08:28:29] where did you get the notification?
[08:42:13] ah, it got buried in those other hosts with the same error
[09:59:15] (PS5) Nik Gkountas: Initialize campaign cache and update it every 1 hour [research/recommendation-api] - https://gerrit.wikimedia.org/r/1075974
[10:04:35] (PS6) Nik Gkountas: Initialize campaign cache and update it every 1 hour [research/recommendation-api] - https://gerrit.wikimedia.org/r/1075974
[10:06:54] klausman: I got an email
[10:11:34] (PS7) Nik Gkountas: Initialize campaign cache and update it every 1 hour [research/recommendation-api] - https://gerrit.wikimedia.org/r/1075974
[10:11:34] (PS7) Nik Gkountas: Use category search to find campaign pages instead of template [research/recommendation-api] - https://gerrit.wikimedia.org/r/1076020 (https://phabricator.wikimedia.org/T373132)
[10:12:20] isaranto: yeah, found it
[10:44:21] (PS3) Nik Gkountas: Replace "campaign" term with "collection" or "page_collection" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1079467
[10:44:36] (CR) Nik Gkountas: Replace "campaign" term with "collection" or "page_collection" (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/1079467 (owner: Nik Gkountas)
[11:08:50] * klausman lunch
[11:29:45] isaranto: the puppet runs on ml-testing are working again now. I suspect it was a bad puppet change
[11:31:26] ack, thanks!
[11:35:30] * klausman goes back to his discovery of the week: kleftiko
[11:37:28] (PS16) Nik Gkountas: Fetch campaign metadata and return them with recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1070308 (https://phabricator.wikimedia.org/T373132)
[11:38:14] (CR) CI reject: [V:-1] Fetch campaign metadata and return them with recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1070308 (https://phabricator.wikimedia.org/T373132) (owner: Nik Gkountas)
[11:53:56] Tobias I'll be waiting for your feedback on kleftiko! I'm not a big fan but curious what others think ;)
[12:03:15] I like it, it's very hearty, just right for autumn weather. But I guess it really depends on the specific marinade used, and whether the lamb is good.
[12:03:32] (I personally hate mutton, for example)
[12:03:57] brb, this machine needs a reboot
[12:05:28] and back
[13:18:54] klausman: o/ shall we do ml-serve201[01]?
[13:19:00] yes!
[13:20:51] Draining 2011 as we speak
[13:21:00] and drained.
[13:21:26] elukey: do you want to do any needed silences?
[13:21:49] yes please
[13:22:34] Alright, 2011 is all yours
[13:23:35] proceeding :)
[13:31:28] ok /dev/kvm is gone, you can uncordon!
[13:37:42] uncordoned, and now draining 2010
[13:38:52] elukey: 2010 is drained and all yours
[13:38:57] super
[13:50:07] (PS17) Nik Gkountas: Fetch campaign metadata and return them with recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1070308 (https://phabricator.wikimedia.org/T373132)
[13:50:07] (PS11) Nik Gkountas: Support Default collections [research/recommendation-api] - https://gerrit.wikimedia.org/r/1072175 (https://phabricator.wikimedia.org/T374597) (owner: Santhosh)
[13:53:29] all done :)
[14:02:06] Lift-Wing, Machine-Learning-Team: [LLM] Allow loading model weights as int8 models with HF - https://phabricator.wikimedia.org/T377848 (isarantopoulos) NEW
[14:07:31] merci!
[14:31:00] Lift-Wing, Machine-Learning-Team: [LLM] Allow loading model weights as int8 with HF - https://phabricator.wikimedia.org/T377848#10250488 (isarantopoulos)
[14:34:01] Machine-Learning-Team: [ml-lab] Use a (jupyter) notebook and load a LLM from huggingface - https://phabricator.wikimedia.org/T377574#10250512 (isarantopoulos)
[14:34:30] Lift-Wing, Machine-Learning-Team: [LLM] Allow loading model weights as int8 with HF - https://phabricator.wikimedia.org/T377848#10250507 (isarantopoulos) p:Triage→High
[14:37:24] Lift-Wing, Machine-Learning-Team: [langid] fasttext only processes one line at a time - https://phabricator.wikimedia.org/T377751#10250534 (isarantopoulos) p:Triage→Unbreak!
[14:39:53] found https://github.com/ROCm/k8s-device-plugin/commit/d7114784a180d8436794ebab56dd0a6346501aae while checking the new release
[14:40:11] https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300a-data-sheet.pdf
[14:40:32] no idea what those "partition" entail, but..
[14:45:34] Machine-Learning-Team, ORES: Rename ORES extension - https://phabricator.wikimedia.org/T377563#10250606 (isarantopoulos) a:isarantopoulos
[14:46:02] Lift-Wing, Machine-Learning-Team: [langid] fasttext only processes one line at a time - https://phabricator.wikimedia.org/T377751#10250610 (isarantopoulos) a:kevinbazira
[14:47:08] Machine-Learning-Team: ml-lab should have a symlink /home -> /srv/home/ - https://phabricator.wikimedia.org/T377478#10250616 (klausman) Open→Resolved
[14:47:36] Machine-Learning-Team: ml-lab should have a symlink /home -> /srv/home/ - https://phabricator.wikimedia.org/T377478#10250618 (klausman) Resolved→Open
[14:49:59] Machine-Learning-Team: ml-lab can't install rocm torch - https://phabricator.wikimedia.org/T376967#10250631 (klausman) Open→Resolved
[14:52:42] Machine-Learning-Team: ml-lab should have a symlink /home -> /srv/home/ - https://phabricator.wikimedia.org/T377478#10250673 (klausman) Open→Resolved
[14:57:08] Machine-Learning-Team: Refactor locust load test data handling for consistency with model-servers - https://phabricator.wikimedia.org/T377418#10250697 (kevinbazira) We renamed the data directory and all data files to accurately reflect their contents, then updated all respective paths in the load tests. We n...
[14:57:39] Machine-Learning-Team: Refactor locust load test data handling for consistency with model-servers - https://phabricator.wikimedia.org/T377418#10250698 (kevinbazira) In progress→Resolved
[15:04:37] (CR) AikoChou: [C:+1] "Should we also remove the `validate_qid` function in utils.py? because it's not being used. Other than that, the patch LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1081809 (https://phabricator.wikimedia.org/T371897) (owner: Kevin Bazira)
[15:09:24] So if partitions work as expected with the Mi300a, virtualization would provide 24GB per instance (?)
[15:09:56] btw the MI300X has 192GB of VRAM and 8 partitions
[15:10:05] haven't understood it yet, but it seems somehow similar to one of the partitioning that Nvidia offerr
[15:10:10] *offers
[15:10:18] ack
[15:22:29] would definitely be neat if feasible for us
[15:23:41] (CR) Kevin Bazira: [C:+2] "Thanks for the review, Aiko! `validate_qid` is still used here:" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1081809 (https://phabricator.wikimedia.org/T371897) (owner: Kevin Bazira)
[15:26:28] (Merged) jenkins-bot: article-country: remove support for QID input [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1081809 (https://phabricator.wikimedia.org/T371897) (owner: Kevin Bazira)
[15:31:53] there, all scorecards so far submitted. taking a quick break before the circle-up
[16:11:22] Machine-Learning-Team, serviceops, Data-Platform-SRE (2024.10.19 - 2024.11.08), Security: Migrate the ownership of DPE-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373534#10251162 (BTullis)
[17:17:02] Machine-Learning-Team, serviceops: Replace the current recommendation-api service with a newer version - https://phabricator.wikimedia.org/T338471#10251472 (akosiaris) To keep the archives happy, unless I am mistaken, per {T373611} Android applications have moved from the old recommendation-api to a...
[17:42:58] going afk folks, have a nice evening/rest of day o/
[18:44:20] (CR) Eamedina: [C:+1] "Nice" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1081237 (https://phabricator.wikimedia.org/T377124) (owner: Sbisson)
[18:45:43] (CR) Eamedina: [C:+1] Filter out disambiguation pages from search API response [research/recommendation-api] - https://gerrit.wikimedia.org/r/1081236 (owner: Sbisson)