[02:10:26] Revision-Scoring-As-A-Service, Diffusion, ORES, Repository-Admins, rsaas-editquality: Diffusion repository can't be cloned: 500 errors (research-ores-editquality) - https://phabricator.wikimedia.org/T157141#2999986 (demon) >>! In T157141#2998631, @Paladox wrote: > we will probably want to set...
[05:37:26] (CR) Gergő Tisza: Make API not fail when edit is deleted or can't be scored (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/335959 (https://phabricator.wikimedia.org/T157078) (owner: Ladsgroup)
[05:43:41] (CR) Ladsgroup: Make API not fail when edit is deleted or can't be scored (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/335959 (https://phabricator.wikimedia.org/T157078) (owner: Ladsgroup)
[07:06:55] (CR) Gergő Tisza: [C: 2] Make API not fail when edit is deleted or can't be scored [extensions/ORES] - https://gerrit.wikimedia.org/r/335959 (https://phabricator.wikimedia.org/T157078) (owner: Ladsgroup)
[07:08:25] (Merged) jenkins-bot: Make API not fail when edit is deleted or can't be scored [extensions/ORES] - https://gerrit.wikimedia.org/r/335959 (https://phabricator.wikimedia.org/T157078) (owner: Ladsgroup)
[07:13:16] Revision-Scoring-As-A-Service, MediaWiki-extensions-ORES, Patch-For-Review, User-Ladsgroup, Wikimedia-log-errors: ores logspam: Model contains an error - https://phabricator.wikimedia.org/T157078#2994821 (Ladsgroup) Open>Resolved
[07:13:58] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3000449 (greg) {F5506544} Whatever it was hasn't subsided yet.
[07:24:17] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#2999048 (Joe) Before raising the number of workers for ORES: - has anyone done an analysis of where...
[07:45:10] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3000507 (Joe) So after taking a quick look at ORES's logs: around 70% of requests come from changepro...
[08:20:39] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3000706 (Ladsgroup) https://github.com/wikimedia/change-propagation/pull/161 will reduce the old by d...
[08:52:44] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3000780 (Ladsgroup) I'm pretty someone externally is putting pressure on the service too. This correl...
[10:04:56] Revision-Scoring-As-A-Service, Beta-Cluster-Infrastructure, ORES, Release-Engineering-Team, scap: Running out of space when deploying on sca03 (deploy-cache) - https://phabricator.wikimedia.org/T157199#2998859 (hashar) The root cause is that the scap cache clone submodules from the deployment...
[11:21:43] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3001202 (Joe) From my further analysis of logs: - there is one API heavy hitter, whose rate of consu...
[11:41:58] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3001307 (Joe) So, graphing `ores.*.scores_request.*.count` it shows most requests seem to come from `...
[11:48:01] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3001311 (Joe) scratch what I said; the counter for etwiki is most likely broken. The surge in reque...
[12:08:59] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3001387 (Joe) Looking into it better, the api user wasn't a red herring after all; I am going to ban...
[12:20:14] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 2 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3001406 (Ladsgroup) Okay, let's block them for now. Until we find a way to only hold-out the abuser.
[12:29:13] (PS1) Ladsgroup: Remove all (except meta) API funcationality hooks [extensions/ORES] - https://gerrit.wikimedia.org/r/336213 (https://phabricator.wikimedia.org/T157206)
[12:50:51] (CR) Giuseppe Lavagetto: [C: 1] "LGTM" [extensions/ORES] - https://gerrit.wikimedia.org/r/336213 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[12:52:07] (CR) Giuseppe Lavagetto: [C: 2] Remove all (except meta) API funcationality hooks [extensions/ORES] - https://gerrit.wikimedia.org/r/336213 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[12:53:43] (Merged) jenkins-bot: Remove all (except meta) API funcationality hooks [extensions/ORES] - https://gerrit.wikimedia.org/r/336213 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[12:54:05] (PS1) Ladsgroup: Remove all (except meta) API funcationality hooks [extensions/ORES] (wmf/1.29.0-wmf.10) - https://gerrit.wikimedia.org/r/336215 (https://phabricator.wikimedia.org/T157206)
[14:01:56] (PS2) Addshore: Remove all (except meta) API funcationality hooks [extensions/ORES] (wmf/1.29.0-wmf.10) - https://gerrit.wikimedia.org/r/336215 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[14:02:36] (CR) Addshore: [C: 2] Remove all (except meta) API funcationality hooks [extensions/ORES] (wmf/1.29.0-wmf.10) - https://gerrit.wikimedia.org/r/336215 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[14:05:03] (Merged) jenkins-bot: Remove all (except meta) API funcationality hooks [extensions/ORES] (wmf/1.29.0-wmf.10) - https://gerrit.wikimedia.org/r/336215 (https://phabricator.wikimedia.org/T157206) (owner: Ladsgroup)
[15:23:05] o/
[15:24:08] o/
[15:43:19] Hey Amir1. Just saw some pings about ores throttling. Anything weird going on?
[15:58:07] halfak: Hey
[15:58:36] TLDR: We've been working to reduce ORES workload in the past 14 hours
[15:58:54] It just worked (that's why I took a break)
[15:59:45] Our Chinese friend was making a shit ton of requests in the past days, we had to remove api.php functionality for now
[15:59:55] until we can find a way to issue a varnish ban
[16:00:53] Yikes!
[16:01:11] Hmm. I wonder what the right long term strategy is.
[16:01:19] We could just build up capacity to handle this.
[16:01:33] Or we could implement some strategies for tracking individual querying users.
[16:01:44] we did but still we had issues
[16:01:49] I'd written up a strategy a while ago.
[16:02:33] halfak: What _joe_ suggested was to make mediawiki send the IP of the original requester when hitting the ores endpoint, so varnish can ban it in the middle
[16:03:01] Amir1, +1 to that for now.
[16:03:10] I'd like to do some things with user-agents.
[16:03:31] We'd need to have user-agent forwarding from the MediaWiki API.
[16:04:02] that would be a good option too
[16:17:39] Everyone gets to make requests until the queue hits 50. Then, between 50 and 75, only requests that contain a user-agent with an "@" (implying an email address) get through. Between 75 and 100 items in the queue, only the precaching systems would get through.
[16:18:17] Might want to bring the queue ceiling for any user-agent down to 25 or maybe even 15.
[16:18:45] We'd want to then implement some blocks based on user-agent.
[16:19:02] Oh! There are also some options for limiting the request rates based on IP and user-agent.
[16:20:34] I investigated and it's very discouraged to do something like this. It's better to have several queues (and one or two as fast lanes dedicated to vital applications)
[16:21:11] Amir1, seems like my earlier proposal is a lane-like solution.
[16:21:42] Also, we've talked about implementing separate queues in celery. I forget why we didn't continue that.
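The queue-depth tiers proposed at 16:17:39 can be sketched roughly as follows. This is only an illustration under stated assumptions, not ORES code: the function name `admit`, the `is_precache` flag, and the constant names are hypothetical. It also assumes that precaching requests are still admitted in the 50 to 75 band and that nothing is admitted once the queue reaches 100, neither of which is stated in the discussion.

```python
# Hypothetical thresholds, taken from the numbers mentioned in the chat.
OPEN_LIMIT = 50        # below this queue size, every request is admitted
IDENTIFIED_LIMIT = 75  # below this, user-agents containing "@" are admitted
HARD_LIMIT = 100       # assumed ceiling: at or above this, nothing is admitted


def admit(queue_size, user_agent="", is_precache=False):
    """Return True if a request should be queued at the current queue size."""
    if queue_size >= HARD_LIMIT:
        # Assumption: a full queue rejects everyone, including precaching.
        return False
    if queue_size < OPEN_LIMIT:
        return True
    if is_precache:
        # Assumption: precaching is admitted in both upper bands.
        return True
    if queue_size < IDENTIFIED_LIMIT:
        # An "@" is treated as a hint that the user-agent carries an email address.
        return "@" in (user_agent or "")
    return False


if __name__ == "__main__":
    for size in (10, 60, 80, 100):
        print(size, admit(size, "examplebot (someone@example.org)"))
```

Lowering `OPEN_LIMIT` to 25 or 15, as suggested at 16:18:17, only changes the first constant. Note that this is a single-queue admission check, so it is closer to the "lane-like" idea than to separate celery queues.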
[16:42:10] Revision-Scoring-As-A-Service, Revision-Scoring-As-A-Service-Backlog, ORES, Operations, and 4 others: ORES Overloaded (particularly 02/05/17 2:25-2:30) - https://phabricator.wikimedia.org/T157206#3002362 (Joe) Open>Resolved
[16:48:17] Revision-Scoring-As-A-Service, rsaas-editquality: Restore recall-threshold-based metrics for editquality models. - https://phabricator.wikimedia.org/T156644#2982373 (Halfak) https://github.com/wiki-ai/editquality/pull/58
[17:10:50] Revision-Scoring-As-A-Service-Backlog, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3002472 (Fjalapeno) > @mobrovac We're talking about special model in ORES that evaluates the article quality as...
[17:14:26] Revision-Scoring-As-A-Service-Backlog, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3002481 (Pchelolo) > The two questions that IMO need to be addressed for this IMO are how to cache the data Su...
[17:29:55] Revision-Scoring-As-A-Service-Backlog, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#2996522 (Fjalapeno) From @Halfak : documents how long it took to rescore all of Wikipedia: T135684
[17:34:03] Revision-Scoring-As-A-Service, Beta-Cluster-Infrastructure, ORES, Release-Engineering-Team, scap: Running out of space when deploying on sca03 (deploy-cache) - https://phabricator.wikimedia.org/T157199#2998859 (hashar)
[17:34:32] Revision-Scoring-As-A-Service, Beta-Cluster-Infrastructure, ORES, Release-Engineering-Team, scap: Running out of space when deploying on sca03 (deploy-cache) - https://phabricator.wikimedia.org/T157199#2998859 (hashar) Made this bug a dupe of the older task T137124 and I have copy pasted the...
[17:38:28] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3002587 (Halfak)
[17:38:39] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services: Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002588 (Fjalapeno)
[17:47:55] Revision-Scoring-As-A-Service, ORES, Reading-Infrastructure-Team: [Spike] Review ORES architecture for Reading Product plans - https://phabricator.wikimedia.org/T153321#3002637 (Halfak)
[17:48:24] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3002640 (Halfak)
[17:48:26] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services: Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002641 (Halfak)
[17:48:29] Revision-Scoring-As-A-Service, ORES, Reading-Infrastructure-Team: [Spike] Review ORES architecture for Reading Product plans - https://phabricator.wikimedia.org/T153321#2876634 (Halfak)
[17:49:16] Revision-Scoring-As-A-Service, ORES, Reading-Infrastructure-Team: [Spike] Review ORES architecture for Reading Product plans - https://phabricator.wikimedia.org/T153321#2876634 (Halfak)
[17:59:00] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services: Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002671 (Jdlrobson) My hope would be that an ores score is in the kafka event in some form. What I'm interested in...
[18:02:21] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#2996522 (GWicke) @Fjalapeno: Do we have information on median / p99 score times?
[18:05:14] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services: Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002708 (Pchelolo) > My hope would be that an ores score is in the kafka event in some form. What I'm interested in...
[18:23:37] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services: Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002750 (mobrovac) >>! In T145829#3002708, @Pchelolo wrote: >> My hope would be that an ores score is in the kafka...
[18:26:43] o/ Is there anything I can do to support the proposal for more tech resources?
[18:45:40] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services (designing): Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3002788 (Pchelolo)
[18:52:49] * awight pings nobody in particular ;)
[19:20:17] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services (designing): Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#2642399 (Ottomata) Would a dedicated stream of revision-scores (ORES scores) work for this? We are tal...
[19:27:22] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3003028 (Fjalapeno) @Halfak do you have data on @GWicke's question?
[19:34:54] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services (designing): Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3003038 (Fjalapeno) @Ottomata reading that ticket it seems the basic thrust is that you want to provide...
[19:36:16] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, Reading-Web-Trending-Service, Services (designing): Trending API should consult ORES - https://phabricator.wikimedia.org/T145829#3003040 (Ottomata) Correct :)
[20:25:59] (CR) Legoktm: [C: 2] Move RecentChangesFlags to top level in extension.json [extensions/ORES] - https://gerrit.wikimedia.org/r/336125 (owner: Umherirrender)
[20:32:22] (Merged) jenkins-bot: Move RecentChangesFlags to top level in extension.json [extensions/ORES] - https://gerrit.wikimedia.org/r/336125 (owner: Umherirrender)
[20:37:01] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3003250 (Halfak) Not handy, but this graph should be close-ish. One big difference is that we won't need to wait for t...
[20:53:45] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, Services (designing): Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3003281 (GWicke) >>! In T157132#3003250, @Halfak wrote: > Not handy, but this graph should be close-ish. One big diffe...
[23:01:05] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, and 2 others: Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3003608 (Fjalapeno)
[23:06:48] Revision-Scoring-As-A-Service, Mobile-Content-Service, ORES, RESTBase-API, and 2 others: Add ORES WP10 data to summaries? - https://phabricator.wikimedia.org/T157132#3003638 (Fjalapeno) > We could use that for ORES model updates - bump a patch version of the content type. > >Massive Varnish purge...