[00:34:59] Scoring-platform-team (Current), ORES, Patch-For-Review, User-Ladsgroup: Enable wp10 and draftquality models for testwiki - https://phabricator.wikimedia.org/T198997 (Catrope) >>! In T198997#4481811, @gerritbot wrote: > Change 450392 **merged** by Halfak: > [mediawiki/services/ores/deploy@master]...
[14:02:34] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @leszek_wmde & @jakob_WMDE - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:05:52] o/
[14:37:08] Scoring-platform-team (Current), ORES: ORES deployment (Early August) - https://phabricator.wikimedia.org/T201518 (Halfak)
[14:38:41] Scoring-platform-team (Current), ORES, editquality-modeling, artificial-intelligence: Duplicated feature name in editquality - https://phabricator.wikimedia.org/T197679 (Halfak)
[14:38:45] Scoring-platform-team (Current), Analytics, Analytics-Kanban, EventBus, and 3 others: Fix "score_schema" -- invalid JSON Schema - https://phabricator.wikimedia.org/T197828 (Halfak)
[14:38:47] Scoring-platform-team (Current), articlequality-modeling, User-Ladsgroup, artificial-intelligence: Train and test wp10 model for fawiki - https://phabricator.wikimedia.org/T190050 (Halfak)
[14:38:51] Scoring-platform-team (Current), ORES: ORES deployment (Early August) - https://phabricator.wikimedia.org/T201518 (Halfak)
[14:39:23] Scoring-platform-team (Current), ORES, Patch-For-Review, User-Ladsgroup: Enable wp10 and draftquality models for testwiki - https://phabricator.wikimedia.org/T198997 (Halfak) @awight is working on some updates to some of our models. I believe that is the only blocker to the next ORES deployment....
[14:45:14] Scoring-platform-team, ORES, Performance-Team, Wikimedia-log-errors: ORES Storage::SqlScoreStorage exception every 2-3 minutes: Model contains an error for [id]: TimeoutError - https://phabricator.wikimedia.org/T201412 (Imarlier) @awight I'm not sure what you're seeing -- can you clarify?
[14:52:08] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: @leszek_wmde & @jakob_WMDE - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:54:14] Scoring-platform-team, ORES, Performance-Team, Wikimedia-log-errors: ORES Storage::SqlScoreStorage exception every 2-3 minutes: Model contains an error for [id]: TimeoutError - https://phabricator.wikimedia.org/T201412 (Halfak) For one, mean latency jumps from 60ms to 75ms. There's a similar shi...
[14:55:36] Scoring-platform-team, ORES, Performance-Team, Wikimedia-log-errors: ORES Storage::SqlScoreStorage exception every 2-3 minutes: Model contains an error for [id]: TimeoutError - https://phabricator.wikimedia.org/T201412 (Halfak) FWIW, we're seeing timeout increases in ORES that correspond to the...
[15:04:18] halfak: Around?
[15:04:19] Yup
[15:05:08] Cool :)
[15:05:21] For using item completeness as signal
[15:05:33] If I got your email right, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/PropertySuggester/+/356043 would be the next step
[15:06:11] (sorry if I'm not thinking straight… it's 34°C in this office… hard to stay focused)
[15:08:18] right.
[15:08:42] I'm amazed reading through this how much work Glorian did and how much review he got!
[15:11:28] Essentially, steps are: (1) Implement a way to get all suggestions from the API (all_suggestions) (2) Implement features for extracting signal in revscoring/item_quality, (3) Train and test new models to look for evidence of improvement, (4) Explore avenues for improving PropertySuggestion itself.
[15:13:28] Ok… so I'll start by seeing how well Glorian's patch can be adapted (it's > 1y old) and take it over if possible… starting from scratch, if necessary
[15:13:39] Sounds reasonable.
[15:42:05] Scoring-platform-team (Current), revscoring, User-Ladsgroup, artificial-intelligence: Rewrite scoring libraries to replace pywikibase with mwbase - https://phabricator.wikimedia.org/T194758 (Halfak) a:Ladsgroup>Halfak
[15:42:22] Scoring-platform-team (Current), revscoring, User-Ladsgroup, artificial-intelligence: Rewrite scoring libraries to replace pywikibase with mwbase - https://phabricator.wikimedia.org/T194758 (Halfak) https://github.com/mediawiki-utilities/python-mwbase/pull/7 to add some string output.
[15:52:47] ah, nuts.
[15:52:56] Scrum of scrums was at the earlier time today.
[16:06:29] Scoring-platform-team (Current), ORES: ORES deployment (Early August) - https://phabricator.wikimedia.org/T201518 (awight)
[16:48:16] Scoring-platform-team, AbuseFilter, JADE: AbuseFilter integration for JADE - https://phabricator.wikimedia.org/T201365 (Harej) p:Triage>High
[16:53:07] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (Harej) I've worked with @awight on a document describing JADE's requirements and possible implementation...
[16:53:46] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:54:20] awight: are you working on updating https://phabricator.wikimedia.org/T200297 ?
[16:54:38] harej: Just made some changes, lmk what you think.
[16:54:45] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:57:09] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[16:57:37] harej: OK, I’m done in there.
[16:57:41] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight)
[17:00:39] looks good!
[17:09:26] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > be curated (patrolled, deleted, etc.) within MediaWiki. The most important question is: how i...
[17:33:51] harej: Am I missing any of our abuse-limiting strategies?
[17:33:56] *JADE will only be deployed after community consensus and an agreement that they will not allow bots to automatically populate every possible page.
[17:33:57] *Normal on-wiki mechanisms to combat vandalism and harassment. These work best in our preferred JADE-per-wiki implementation.
[17:33:58] *Various workflows will be integrated with attention given to each one, and the technical means to disable the integration at any point if it becomes a problem.
[17:37:31] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4489021, @daniel wrote: >> be curated (patrolled, deleted, etc.) within MediaWiki....
[17:38:21] --> Lunch
[17:38:40] I’m out for a few hours…
[19:17:39] https://www.theverge.com/2018/8/8/17663544/ai-scientists-wikipedia-primer
[19:39:58] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > That sounds great to us, but we've put quite a bit of time into planning how to prevent abuse a...
[20:08:18] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Pchelolo) After a quick h-o with @Ottomata and @JAllemandou we've understood that the `/precache` endpoi...
[20:25:38] Pchelolo, o/
[20:25:47] hi halfak
[20:25:52] Let me know if I misunderstood something with ^
[20:26:05] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Halfak) That's roughly right. Precache will always produce ORES native format that has been designed fo...
[20:26:12] Oh there ^
[20:26:18] lemme read it first :)
[20:26:19] wikibugs is a bit delayed :D
[20:27:23] halfak: so, someone other than change-prop actually uses /precache POST endpoint??
[20:28:13] Pchelolo, I'm not 100% sure on that.
[20:28:32] But I don't want custom reformatting code for EventStream in ORES. Just doesn't seem to make sense.
[20:28:34] it's not even documented correctly https://ores.wikimedia.org/v3/#!/scoring/get_v3_precache
[20:29:05] halfak: heh.. nobody wants to have that code :)
[20:29:24] Pchelolo, is there a reason it makes *more* sense to be in ORES than somewhere else?
[20:29:40] It's very nice that we can use the same output formatter across all of v3
[20:29:56] Maybe we can make a vCrazyEventStreamFormat :P
[20:30:46] We built a whole score-schema thing for y'all so that we wouldn't have to do it in ORES.
[20:31:14] Also, good point re. v3/precache docs. Looks silly.
[20:33:05] halfak: the biggest pro with having the reformatter in ores is that when you change something in the model output you know that something is being changed
[20:33:23] huh?
[20:34:15] let's say you release some new version of a model and add a new field to the output - then if the reformatter lives in ores it's only one place where that change needs to be applied. If elsewhere - 2 places
[20:34:56] I'm ok with making it more robust and leaving it in change-prop, but that would require some kind of guarantees for format stability within model versions
[20:35:05] Pchelolo, but EventStream is just one consumer of ORES. Are we to do this for all consumers?
[20:35:19] makes sense..
[20:35:26] Yes. We provide those guarantees.
[20:35:30] That's why we version.
[20:35:41] And maintain backwards compatibility.
[20:36:07] ok. the backward compatibility part wasn't clear
[20:37:11] Right. If you use v3, it won't change when we come up with v4 -- whatever that will be.
[20:37:25] Same with v2 and v1 but they don't have a precache endpoint.
[20:37:41] ok, lemme update the ticket.
[20:39:49] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Ottomata) Sooooo, could we make a /v4/precache endpoint that does this?
[20:41:45] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Pchelolo) Ok, I revoke the idea of having this in ores - @Halfak said that within an API version regardl...
[20:44:02] Weird. I re-read the description of the ticket and something doesn't make sense.
[20:44:29] The JSON provided as an example of what won't work isn't our JSON.
[20:45:07] Our JSON *is* keyed by model names.
[20:47:22] Also, our probabilities are keyed by class names.
[20:47:26] Pchelolo, ^
[20:48:53] halfak: ye, just looked at that and I guess it's all easier actually
[20:49:07] the only issue I see there is that the whole thing is prefixed with rev_id
[20:49:12] but that's easy to remove
[20:49:13] OK. Let me know if I can help. FWIW, I could see an issue with rev_id. Yeah...
[20:49:25] oh, ottomata is not on this channel
[20:49:36] So maybe v4 could be super-simplified and contain no rev_ids and also not allow for scoring multiple edits in one shot.
[20:50:14] * halfak invited him
[20:50:28] * halfak wishes there were the equivalent of sudo in IRC
[20:50:34] like /opdo
[20:50:36] or whatever
[20:51:30] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Pchelolo) And also, looking into native ORES response, like https://ores.wikimedia.org/v3/scores/enwiki/...
[20:52:11] ok halfak thank you. I'll tell the analytics guys about this discussion and we'll most likely very much simplify this whole thing now
[20:53:23] Cool. FWIW, I'm starting to get a bit interested in this super-simple v4 schema that drops most of the functionality if it'll make life a lot easier for y'all.
[20:53:38] I want to call it something different though to warn people away from it.
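Editor's note: the rev_id point in the exchange above can be sketched roughly as follows. This is only an illustration, assuming a response nested as wiki -> "scores" -> rev_id -> model name -> "score", with probabilities keyed by class name as described in the discussion. The sample payload and the names flatten_scores and SAMPLE are hypothetical; this is not ORES or change-prop code.

```python
from typing import Any, Dict, List


def flatten_scores(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a nested, rev_id-prefixed response into one flat record per revision.

    Assumed input shape (hypothetical, modeled on the discussion):
    {wiki: {"scores": {rev_id: {model_name: {"score": {...}}}}}}
    """
    records = []
    for wiki, body in response.items():
        for rev_id, models in body.get("scores", {}).items():
            # The rev_id moves from a dictionary key into a plain field.
            record = {"database": wiki, "rev_id": int(rev_id), "scores": {}}
            for model_name, result in models.items():
                score = result.get("score")
                if score is None:
                    # Skip models that returned an error instead of a score.
                    continue
                record["scores"][model_name] = {
                    "prediction": score["prediction"],
                    "probability": dict(score["probability"]),
                }
            records.append(record)
    return records


# Hypothetical sample payload, not real ORES output.
SAMPLE = {
    "enwiki": {
        "scores": {
            "123456": {
                "damaging": {
                    "score": {
                        "prediction": False,
                        "probability": {"false": 0.9, "true": 0.1},
                    }
                },
                "goodfaith": {
                    "error": {"type": "TimeoutError", "message": "timed out"}
                },
            }
        }
    }
}

if __name__ == "__main__":
    for rec in flatten_scores(SAMPLE):
        print(rec)
```

Scores stay keyed by model name and probabilities by class name, so two models that share a class label (for example "true") do not collide after flattening.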
[20:53:45] v0
[20:53:50] vSimple
[20:57:12] v-1 ;)
[20:57:17] o/ awight
[20:57:40] heya
[20:57:42] Looks like we're likely to do an ORES deploy tomorrow. Anything I can do to help you be ready with the work you're already doing?
[20:58:26] Thanks, I haven’t looked at the patches yet but can’t think of anything else that needs doing.
[21:01:19] Currently reading through this horrible wikitech-l thread
[21:01:34] I’m really surprised at some of our peers here
[21:06:12] Yes. I've had it open most of the day.
[21:07:32] ugh I still don’t want to participate, but it’s gross and disappointing
[21:11:00] awight, do you have a patch for ORES config that needs review?
[21:11:06] It would be cool to get that in beta ASAP.
[21:11:20] KK I’ll do some actual work :-)
[21:11:26] awight: which thread?
[21:11:29] Probably best if I don’t have access to email today, actually.
[21:11:30] there are several
[21:11:39] Hauskatze: exactly. That mess
[21:11:53] we have two ban threads that are the most active atm
[21:12:09] sadly, I must say
[21:12:17] * Hauskatze feels hungry
[21:12:30] I need some cookies or something - brb
[21:12:37] It’s frustrating to see our staff and volunteer colleagues defending some kind of abstract right to free speech
[21:12:59] well, freedom of speech exists, but no right is unlimited
[21:13:01] Seems like the current U.S. political climate has already thrown that whole concept into a stark light
[21:13:27] yeah, like you start telling people they’re gaslighting you and “cruelly” wasting time, and they’re gonna hang up the phone.
[21:14:10] * awight tears eyes away from the 20-car pileup
[21:14:19] in fact no right or liberty is absolute because you're not the only person inhabiting the world
[21:14:53] not commenting on the specific case as I don't have access to any evidence
[21:18:52] good night
[21:18:55] have a good one
[21:28:39] Scoring-platform-team, Analytics, Analytics-Kanban, EventBus, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Halfak) FWIW, I am interested in supporting a super-basic format, but I don't want to call it v4 because...
[21:39:57] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4489550, @daniel wrote: > My point was that "must be editable and watchable" prett...
[21:48:49] Curses on that 1GB repo
[21:50:37] Yes. Down with big repos
[21:50:43] Up with LFS
[21:51:49] haha that is an LFS repo
[21:52:53] dog only knows why it’s re-downloading anything.
[21:54:25] halfak: To answer your earlier question, yeah it looks like there’s going to be a config patch to review. Putting that together now.
[21:55:32] OK cool. I have a few more minutes before I need to start on evening chores :|
[21:57:54] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) > are you just saying that pages on a central wiki seems reasonable, or that you think judgments...
[21:59:08] awight & harej, I'll be AFK on Aug 22nd.
[21:59:22] Can I rely on y'all to handle the IRC discussion with TechCom?
[21:59:38] See daniel's comment above.
[22:00:24] halfak: I see you already CR’d, but this seems odd to me: https://gerrit.wikimedia.org/r/#/c/mediawiki/services/ores/deploy/+/450392/
[22:00:41] What's up?
[22:00:50] enwiki models and the testwiki extractor…
[22:01:01] Right. It's going to actually score content on testwiki
[22:01:01] I guess it just takes testwiki data and feeds it through the enwiki model, like simplewiki
[22:01:03] Using the enwiki models.
[22:01:06] right
[22:01:21] Great
[22:02:08] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @CFisch_WMDE & @chiborg - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[22:03:54] Looks like the changes are actually just a new revscoring wheel, and submodule bumps.
[22:04:23] Confirmed.
[22:04:55] I'll be AFK briefly. Should be back in time to review.
[22:05:03] I seem to be ~50% through downloading word2vec, yeah ^ that’s just what I was gonna say.
[22:05:10] No need to drag out your night for this.
[22:05:29] No sweat. I can prepare for my bike training while I wait :D
[22:05:35] hehe k
[22:07:14] halfak: Sure, I can represent us Aug 22nd.
[22:09:34] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/451488 (owner: L10n-bot)
[22:16:18] Found some missing config, after all...
[22:19:56] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4490106, @daniel wrote: > How do you feel about having a public discussion on this...
[22:20:11] Oops, I’m doing Phabricator while annoyed.
[22:45:08] (PS1) Awight: [WIP] New fawiki wp10 model, other submodule updates [services/ores/deploy] - https://gerrit.wikimedia.org/r/451539 (https://phabricator.wikimedia.org/T201518)
[22:48:05] o/
[22:48:49] awight, ready for review?
[22:49:09] naw I’m building wheels
[22:51:04] I don’t think revscoring is being updated, I’ll bump the version number to force.
[22:51:40] OK I'm going to head out for tonight ^_^. I'll be on in the AM and might have time for a beta deployment after I review :D
[22:51:53] awight, ^ have a good one!
[22:52:07] kk
[22:52:39] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: @CFisch_WMDE & @chiborg - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[23:05:06] Oh wait.
[23:05:10] I'm reading this backwards!
[23:05:14] awight, ^
[23:05:16] lol
[23:05:28] goddamn
[23:05:38] (CR) Halfak: [C: 2] Update wheels [research/ores/wheels] - https://gerrit.wikimedia.org/r/451542 (owner: Awight)
[23:05:44] ah that was weird, cos you had me convinced too.
[23:17:11] too bad, I got burned for rebuilding the models early after all:
[23:17:23] sklearn/base.py:311: UserWarning: Trying to unpickle estimator RandomForestClassifier from version 0.19.1 when using version 0.19.2. This might lead to breaking code or invalid results. Use at your own risk.
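Editor's note: the warning above appears when an estimator pickled under one library version is loaded under another. Below is a minimal sketch of one way to catch such a mismatch before the model is used, assuming the model file is written together with the version string that produced it. The helpers save_model and load_model and the bundle layout are hypothetical illustrations, not part of scikit-learn, revscoring, or the ORES deploy tooling.

```python
import io
import pickle


class VersionMismatch(RuntimeError):
    """Raised when a model was pickled under a different library version."""


def save_model(model, library_version, fileobj):
    # Store the producing library version next to the model itself.
    pickle.dump({"library_version": library_version, "model": model}, fileobj)


def load_model(fileobj, current_version, strict=True):
    # Compare the recorded version with the running one before returning.
    bundle = pickle.load(fileobj)
    saved = bundle["library_version"]
    if saved != current_version and strict:
        raise VersionMismatch(
            "model pickled with %s, loading with %s" % (saved, current_version)
        )
    return bundle["model"]


if __name__ == "__main__":
    buf = io.BytesIO()
    # A plain dict stands in for a trained estimator here.
    save_model({"weights": [1, 2, 3]}, "0.19.1", buf)
    buf.seek(0)
    try:
        load_model(buf, "0.19.2")
    except VersionMismatch as err:
        print("refusing to load:", err)
```

With a real estimator, the version string could be taken from sklearn.__version__ at both save and load time; whether to fail hard or only log on mismatch is a policy choice the log does not settle.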