[07:02:42] Machine-Learning-Team, ORES, artificial-intelligence, articlequality-modeling, drafttopic-modeling: ORES deployment - Spring 2021 - https://phabricator.wikimedia.org/T278723 (elukey) IIUC the next steps should be to run something like T212818#4865070 for `drafttopic`, then updating the relate...
[07:03:33] kevinbazira: ==^ o/
[09:13:28] new version of https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/688211
[09:13:54] I have still some doubts about the entrypoints
[09:14:25] those are the same as the upstream docker images
[09:15:02] but I compared them with the debs produced by `make deb` in the istio repo
[09:15:40] and the istio-sidecar.deb pkg runs istio-start.sh, that seems to add some parameters to pilot
[09:16:08] so I am trying to understand if the parameters are added by istioctl/helm when starting the containers/pods
[09:16:16] (basically overriding the entry point)
[09:17:55] but layers in https://hub.docker.com/layers/istio/proxyv2/1.6.14/images/sha256-fa7ca1d60e6d4d7f212aaf497d273ae58e5b02eddb4b80ab9cfe424cf6fb58d2?context=explore seems consistent with what we have
[09:39:21] Machine-Learning-Team, ORES, artificial-intelligence, articlequality-modeling, drafttopic-modeling: ORES deployment - Spring 2021 - https://phabricator.wikimedia.org/T278723 (kevinbazira) @elukey, that's fine, I set up a meeting and shared it with you on your calendar. Please feel free to adj...
[09:40:05] elukey: ==^ o/
[09:41:11] :)
[10:03:58] also TIL - we use https://gerrit.wikimedia.org/r/admin/repos/operations/docker-images/docker-pkg to build our docker images
[10:04:07] that offers some nice templating etc..
[10:04:23] I am going to try to make it work and test my patch/images
[12:01:17] kevinbazira: I am having a problem with my monitor, will join shortly sorry!
[12:01:42] That's fine elukey. I will wait for you!
[12:36:05] kevinbazira: I finished the pull, ~1GB downloaded
[12:49:53] Ok ... mine is still going
[12:50:07] Jumping into the call now.
[13:06:03] elukey: after the upload, we should be able to see the changes here: https://gerrit.wikimedia.org/r/p/scoring/ores/drafttopic/+/dashboard/default:open
[13:06:35] ack, the upload is still super slow
[13:10:29] kevinbazira: done
[13:11:33] Great. I'm still on the call. We can go to the next step.
[13:11:50] I am quickly checking the gerrit repo
[13:21:03] ahahah kevinbazira still cloning the repo
[13:21:23] after that will try git lfs pull to verify that it works
[13:21:27] and then we'll be able to proceed
[13:21:39] can you prep the patch for the deploy repo?
[13:21:50] Done cloning it and added gerrit to remote ... don't want to push yet.
[13:22:22] kevinbazira: I have already pushed no?
[13:22:46] Oh alright, that's fine... miscommunication
[13:23:35] ok all good! git lfs pull doesn't return errors for the gerrit repo too
[13:23:47] kevinbazira: if you are ok we can proceed with the deploy repo patch
[13:23:50] and then beta deployment
[13:24:12] Yep, I'm ok with it.
[13:24:28] ack, are you going to create the patch?
[13:30:26] kevinbazira: ==^
[13:30:36] otherwise I can create one
[13:30:55] It's really not clear to me what the deploy repo is. Is it: https://wikitech.wikimedia.org/wiki/ORES/Deployment#Update_ores-wmflabs-deploy
[13:32:02] ah sorry! it should be https://gerrit.wikimedia.org/r/admin/repos/scoring/ores/deploy
[13:32:20] at this point I think that we should update the submodule reference to point to the new code
[13:38:12] ah no wait, it is https://gerrit.wikimedia.org/r/admin/repos/mediawiki/services/ores/deploy
[13:39:17] Thanks. Cloning it now
[13:39:32] on deploy1001 I see
[13:39:33] [remote "origin"] url = https://gerrit.wikimedia.org/r/p/mediawiki/services/ores/deploy.git fetch = +refs/heads/*:refs/remotes/origin/*
[13:39:37] so that should be it
[13:51:44] elukey: my clone keeps failing
[13:51:45] error: RPC failed; curl 18 transfer closed with outstanding read data remaining
[13:51:45] fatal: The remote end hung up unexpectedly
[13:51:45] fatal: early EOF
[13:51:45] fatal: index-pack failed
[13:51:52] :(
[13:53:05] If it is possible, you could update the submodules: https://www.vogella.com/tutorials/GitSubmodules/article.html
[13:53:43] In case that doesn't work. We'll ask Aron to do it in the phab task.
[13:58:02] I am currently checking out submodules with git submodules update --init, but it is taking ages
[13:58:14] I'll try to file the code review
[13:59:39] Alright, I'm on stand by.
[14:19:23] (PS1) Elukey: Update drafttopic to its latest version [services/ores/deploy] - https://gerrit.wikimedia.org/r/689884 (https://phabricator.wikimedia.org/T278723)
[14:19:52] kevinbazira: --^
[14:19:53] :)
[14:20:42] Checking ...
[14:23:11] elukey: Nice. This looks good!
[14:24:09] (CR) Kevin Bazira: [V: +2] Update drafttopic to its latest version [services/ores/deploy] - https://gerrit.wikimedia.org/r/689884 (https://phabricator.wikimedia.org/T278723) (owner: Elukey)
[14:24:46] kevinbazira: ok to merge + deploy in beta?
[14:26:44] Yep.
[14:28:39] (CR) Elukey: [C: +2] Update drafttopic to its latest version [services/ores/deploy] - https://gerrit.wikimedia.org/r/689884 (https://phabricator.wikimedia.org/T278723) (owner: Elukey)
[14:40:17] kevinbazira: https://ores-beta.wmflabs.org/v3/scores/viwiki/123125/articletopic seems working now!
[14:40:42] Checking ...
[14:41:03] Machine-Learning-Team, ORES, artificial-intelligence, articlequality-modeling, and 2 others: ORES deployment - Spring 2021 - https://phabricator.wikimedia.org/T278723 (elukey) https://ores-beta.wmflabs.org/v3/scores/viwiki/123125/articletopic seems working now :) @Halfak do you want to do a sani...
[14:41:11] Yep. it's working elukey.
[14:42:04] ok I asked to Aaron to sanity check in beta, then we can move to prod!
[14:51:17] * elukey taking a break
[14:51:29] Alright, that's fine.
[14:52:27] Will also be AFK for a bit.
[16:14:08] we now have a gerrit repo for our inference services: https://gerrit.wikimedia.org/r/#/admin/projects/machinelearning/liftwing/inference-services
[16:14:44] super
[16:14:57] github mirror: https://github.com/wikimedia/machinelearning-liftwing-inference-services
[16:21:17] accraze: me and Kevin today merge+deployed a new version of ores' drafttopic, with Aaron patch for the viwiki
[16:21:34] nice!
[16:21:53] if you want to sanity check in beta please do, otherwise we were thinking about waiting for Aaron's sanity check in deployment-prep and deploy
[16:22:04] (basic tests that were failing now works, so it seems fine)
[16:22:07] anything against it?
[16:22:18] also klausman --^
[16:22:24] nah that should be fine, will take a look here in just a sec
[16:24:29] "that should be fine" --> all good stories start with something similar :D
[16:26:45] lol yeah famous last words
[16:30:21] beta looks good to me
[16:30:51] super
[16:31:02] hopefully we'll get rid of the trailing errors
[16:31:24] elukey: nice work. no objections
[16:32:15] klausman: https://phabricator.wikimedia.org/T212818#4865070 made me really sad
[16:32:20] not sure if you saw it
[16:32:31] elukey: just reading all the above messages from earlier today, looks like you had some fun w/ git lfs ;)
[16:33:03] accraze: now I know why having swift storing model will be so much better
[16:33:07] *models
[16:33:14] 100%
[16:33:45] the other thing to discuss is the priority of https://github.com/wikimedia/editquality/pull/233
[16:33:56] that probably can go out after the drafttopic one
[16:34:03] elukey: yeah, it's one of the downsides of lfs/annex
[16:34:46] :(
[16:35:09] kevinbazira: what plans do you have for the turkish edit quality change?
[16:35:31] I would not couple it with the next deployment, but maybe set up some time next week to do it
[16:35:52] (if it is something that is waiting to go out, otherwise we can wait)
[16:37:53] ^ rebuilding/retraining a model is less of an issue than deploy a bunch of new models
[18:10:56] (CR) Umherirrender: [C: +2] build: Updating dependencies [extensions/ORES] - https://gerrit.wikimedia.org/r/689486 (owner: Libraryupgrader)
[20:27:03] rebooting the MiniKF VM, sandbox cluster will be down for a little bit
[21:49:20] Lift-Wing, Machine-Learning-Team: Upgrade Istio & Knative on sandbox cluster - https://phabricator.wikimedia.org/T282752 (ACraze)
[22:05:28] Lift-Wing, Machine-Learning-Team, Patch-For-Review: Load a fastText model in to KFServing - https://phabricator.wikimedia.org/T276862 (ACraze) Thanks @Isaac, I see that reflected in the code now, but didn't have `threshold` documented with the other params. I've added a patch for that in gerrit. The...
[22:13:28] Lift-Wing, Machine-Learning-Team, Patch-For-Review: Load outlinks topic model model in to KFServing - https://phabricator.wikimedia.org/T276862 (ACraze) a:ACraze
[22:33:23] Lift-Wing, artificial-intelligence, revscoring, Machine-Learning-Team (Active Tasks), Patch-For-Review: Load a revscoring model into KFServing - https://phabricator.wikimedia.org/T279000 (ACraze)
[22:50:28] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Load outlinks topic model model in to KFServing - https://phabricator.wikimedia.org/T276862 (ACraze)
[23:07:32] rebooting minikf one more time, load testing with transformers keeps freezing up the cluster