[07:14:06] (PS1) Kevin Bazira: revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060)
[07:14:12] (CR) CI reject: [V:-1] revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[07:17:43] (CR) Kevin Bazira: "recheck" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[07:17:50] (CR) Kevin Bazira: "recheck" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[09:12:48] (PS2) Kevin Bazira: revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060)
[09:12:54] (CR) CI reject: [V:-1] revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[09:28:29] The Jenkins CI is failing on the inference services pipeline pre-commit check: https://phabricator.wikimedia.org/P88633
[09:28:33] o_0
[10:15:27] (CR) Kevin Bazira: "recheck" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[10:20:07] the CI issue has been fixed by RelEng: https://phabricator.wikimedia.org/P88633#356784
[11:56:48] (CR) Gkyziridis: [C:+1] "Thnx LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:09:34] (CR) Kevin Bazira: [C:+2] revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:11:21] (Merged) jenkins-bot: revertrisk-wikidata: parallelize async calls that fetch metadata features [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236618 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:39:14] (PS1) Kevin Bazira: revertrisk-wikidata: add localized cache for Wikidata API responses [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236713 (https://phabricator.wikimedia.org/T414060)
[12:40:48] Machine-Learning-Team, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, Moderator-Tools-Team, and 2 others: WE 1.3.4 Roll out Revert Risk Filters to Wikis that don't have damaging/goodfaith Edit Models - https://phabricator.wikimedia.org/T408388#11582948 (DMburugu)
[12:41:38] Machine-Learning-Team, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, Moderator-Tools-Team, and 2 others: WE 1.3.4 Roll out Revert Risk Filters to Wikis that don't have damaging/goodfaith Edit Models - https://phabricator.wikimedia.org/T408388#11582957 (DMburugu)
[12:52:03] (CR) Gkyziridis: [C:+1] "LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236713 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:53:18] (CR) Kevin Bazira: [C:+2] revertrisk-wikidata: add localized cache for Wikidata API responses [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236713 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:54:33] (Merged) jenkins-bot: revertrisk-wikidata: add localized cache for Wikidata API responses [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1236713 (https://phabricator.wikimedia.org/T414060) (owner: Kevin Bazira)
[12:59:34] Machine-Learning-Team, MediaWiki-extensions-ORES, MediaWiki-Recent-changes, Moderator-Tools-Team (Kanban), OKR-Work (WE1 FY2025-26): Enable revert risk filters for first batch of wikis: < 1000 monthly edits - https://phabricator.wikimedia.org/T411485#11583076 (Samwalton9-WMF) >>! In T411485#1...
[13:38:46] Machine-Learning-Team, Essential-Work: Unify and improve load testing strategy for inference services - https://phabricator.wikimedia.org/T416475 (BWojtowicz-WMF) NEW
[14:01:49] FIRING: KubernetesDeploymentUnavailableReplicas: ...
[14:01:49] Deployment revertrisk-wikidata-predictor-00006-deployment in revertrisk at codfw has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=codfw&var-cluster=k8s-mlserve&var-namespace=revertrisk&var-deployment=revertrisk-wikidata-predictor-00006-deployment - ...
[14:01:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[14:32:00] Machine-Learning-Team, Moderator-Tools-Team, PersonalDashboard, OKR-Work (WE1 FY2025-26): Surface edits to moderators which may require their review - https://phabricator.wikimedia.org/T404174#11583476 (Samwalton9-WMF) →Duplicate dup:T409059
[14:32:45] Machine-Learning-Team, Moderator-Tools-Team, PersonalDashboard, OKR-Work (WE1 FY2025-26): Surface edits to moderators which may require their review - https://phabricator.wikimedia.org/T404174#11583483 (Samwalton9-WMF) We're thinking about edit review recommendations in a broader sense, so cl...
[14:35:49] FIRING: KubernetesDeploymentUnavailableReplicas: ...
[14:35:49] Deployment revertrisk-wikidata-predictor-00006-deployment in revertrisk at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revertrisk&var-deployment=revertrisk-wikidata-predictor-00006-deployment - ...
[14:35:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[14:45:49] RESOLVED: KubernetesDeploymentUnavailableReplicas: ...
[14:45:49] Deployment revertrisk-wikidata-predictor-00006-deployment in revertrisk at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revertrisk&var-deployment=revertrisk-wikidata-predictor-00006-deployment - ...
[14:45:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[15:21:38] Machine-Learning-Team, Wikipedia-Android-App-Backlog: Migrate Machine-generated Article Descriptions from toolforge to liftwing. - https://phabricator.wikimedia.org/T343123#11583854 (Aklapper) a:kevinbazira→None @kevinbazira Removing task assignee as this open task has been assigned for more than...
[16:29:12] Lift-Wing, Machine-Learning-Team, Wikidata, OKR-Work: Optimize revertrisk-wikidata inference service to achieve ~500ms latency target - https://phabricator.wikimedia.org/T414060#11584141 (kevinbazira) In the logs referenced in T414060#11578047, we identified edge-cases that occur under heavy load...
[16:44:29] Machine-Learning-Team, Wikimedia-Enterprise-Kanban-On-Call: Test liftwing wikidata revert risk API for scale and latency - https://phabricator.wikimedia.org/T409388#11584234 (kevinbazira) @SGupta-WMF, thank you for the clarification regarding WME's infrastructure and the use of the external endpoint. We...