[07:40:17] (CR) Ladsgroup: "ping!" [extensions/ORES] - https://gerrit.wikimedia.org/r/333493 (https://phabricator.wikimedia.org/T155903) (owner: Ladsgroup)
[07:40:42] (CR) Ladsgroup: "ping" [extensions/ORES] - https://gerrit.wikimedia.org/r/333497 (https://phabricator.wikimedia.org/T155930) (owner: Ladsgroup)
[08:44:14] Revision-Scoring-As-A-Service, Bad-Words-Detection-System, revscoring: Add language support for Romanian - https://phabricator.wikimedia.org/T152482#2971670 (Andrei_Stroe)
[08:44:48] Revision-Scoring-As-A-Service, Bad-Words-Detection-System, revscoring: Add language support for Romanian - https://phabricator.wikimedia.org/T152482#2849918 (Andrei_Stroe) I have reviewed the list and copied words to the two lists.
[09:41:24] Revision-Scoring-As-A-Service, Analytics, ChangeProp, EventBus, and 2 others: Rewrite ORES precaching change propagation configuration as a code module - https://phabricator.wikimedia.org/T148714#2971851 (Liuxinyu970226)
[10:12:42] (CR) Thiemo Mättig (WMDE): [C: -1] "I believe this could be merged as it is, and improved later. Personally I like it more if issues are fixed before merge. Please tell me wh" (3 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/333493 (https://phabricator.wikimedia.org/T155903) (owner: Ladsgroup)
[10:18:22] (CR) Thiemo Mättig (WMDE): [C: 1] "The code seems to be fine. But the commit message does not seem to describe the code change.
There are also no tests that help reviewers t" (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/333497 (https://phabricator.wikimedia.org/T155930) (owner: Ladsgroup)
[10:35:50] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2971998 (Andrei_Stroe)
[10:52:37] (CR) Ladsgroup: Add javascript highlighting to Special:Contributions as well (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/333497 (https://phabricator.wikimedia.org/T155930) (owner: Ladsgroup)
[10:53:11] (PS3) Ladsgroup: Add javascript highlighting to Special:Contributions as well [extensions/ORES] - https://gerrit.wikimedia.org/r/333497
[10:55:07] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2971998 (MUTARO12) [[ URL | name ]]
[10:55:07] [1] https://meta.wikimedia.org/wiki/Help:URL - Redirect from https://meta.wikimedia.org/wiki/URL?redirect=no
[10:57:25] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972056 (MUTARO12) I AM LOOKING FOR PEOPLE INTERESTED IN INVESTING IN A PAICHE PROJECT IN AFRICA 0033659021810
[11:44:17] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972177 (MUTARO12) [[ URL | name {F5361865}]]{F5362045} {F5362047} {F5362050} {F5362057}
[11:44:17] [2] https://meta.wikimedia.org/wiki/Help:URL - Redirect from https://meta.wikimedia.org/wiki/URL?redirect=no
[11:47:25] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972183
(MUTARO12) {F5362063} {F5362067} {F5362074} {F5362082}
[11:50:59] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972195 (MUTARO12)
[12:48:44] Amir1: halfak: o/ Quick q: https://grafana.wikimedia.org/dashboard/db/ores?from=now-30d&to=now .. any idea what is behind the increased usage in "External scores returned" and "Scores processed" since the 16th? They coincide with an increase in SCB CPU usage
[13:01:40] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972328 (Andrei_Stroe)
[13:02:29] Amir1: halfak: I might have an idea ...
[13:21:55] Revision-Scoring-As-A-Service, Wikidata, User-Ladsgroup, WMDE-Tech-Communication-Mentoring-And-Events: Build item_quality form - https://phabricator.wikimedia.org/T155828#2972375 (Glorian_Yapinus) @Halfak : I thought we do not have to define the criteria for each quality grade because the classif...
[14:08:34] (CR) Thiemo Mättig (WMDE): [C: 1] "Code is fine. Not manually tested with Special:Contributions." [extensions/ORES] - https://gerrit.wikimedia.org/r/333497 (owner: Ladsgroup)
[14:59:41] o/
[15:11:55] So, it looks like this Romanian Wikipedia campaign is good to go.
[15:12:05] It looks like we still need the list of trusted user groups though.
[15:14:21] I'm drafting a response to the task creator to let him/her know what we need to move forward
[15:14:51] Amir1, do you know a good reference to help Wikipedians understand what we mean by "user groups"?
[15:15:11] Aha!
[15:15:12] https://en.wikipedia.org/wiki/Special:ListGroupRights
[15:15:18] What's the prefix for Romanian?
[15:15:34] looks like ro.wikipedia.org
[15:15:38] yup
[15:16:12] I wish that https://ro.wikipedia.org/wiki/Special:List%C4%83_drepturi_grup would list out the standard group ID in the interface.
[15:20:37] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2971998 (Halfak) @Andrei_Stroe, in order to move forward with this, we need help determining which user_groups are only awarded to "trusted" u...
[15:20:43] There
[15:20:45] Amir1, https://phabricator.wikimedia.org/T156357#2972806
[15:20:59] I think that'll make it clear what we need Andrei_Stroe to do.
[15:21:26] Great
[15:21:47] Once it's done we can deploy the campaign
[15:22:08] Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2972810 (Halfak) p:Triage>Normal
[15:22:30] OK. That card is cleaned up. Next is adding a licensing notice.
[15:22:50] https://phabricator.wikimedia.org/T156052
[15:22:53] btw. Some languages are making a hell of a lot of progress: http://tools.wmflabs.org/dexbot/tools/wikilabels_stats.php
[15:23:09] That's great news.
[15:23:42] Oh! And it reminds me. I think I want to change our revisions_to_review strategy again -- but I'd be changing it back to look like enwiki and fawiki used to.
[15:23:44] halfak: I can add it, it looks easy
[15:23:55] Maybe we could chat about that quickly.
[15:24:08] Oh great. Let me move that card to the main board and assign it to you :)
[15:24:19] Go ahead :0
[15:24:24] * :)
[15:24:39] Awesome
[15:24:41] OK. done
[15:24:41] Revision-Scoring-As-A-Service, Wikilabels, User-Ladsgroup: Add notice of CC0 status of Wikilabels data to UI & Docs - https://phabricator.wikimedia.org/T156052#2972821 (Halfak) p:Triage>Normal a:Ladsgroup
[15:24:47] So yeah... revisions_to_review.
[15:27:33] So I was worried about getting review of revisions that "do not need review" so that we can have some observations of sysops accidentally doing damage
[15:27:46] This was back in the days when we just filtered out all sysop edits.
[15:28:00] But now, we include a sysop edit in "needs review" if it was reverted.
[15:28:04] And I think that's pretty good.
[15:29:06] So, I want to have a little party where we go through the labeled data from past campaigns (enwiki, fawiki, etc.) and run the current version of the auto-labeler.
[15:29:40] Then update the task status and make the old campaigns active again. I think we'll need to label maybe 100s more edits and then we'll be done again and we'll have better training data.
[15:30:16] hmm, I don't think we have many observations for sysops, so not a big difference but rather a good gain
[15:30:39] For the campaigns where we split 5k 50/50, I don't think there's any harm in keeping the 2.5k edits that "do not need review" in the campaign, but I'd like to try to find a way to supplement the edits that "do need review".
[15:30:58] Well, right now, we usually have ~75% marked as "not needing review"
[15:31:09] The subset of those that were reverted will need to be labeled.
[15:31:22] * halfak was using "sysop" as an example ;)
[15:32:54] Revision-Scoring-As-A-Service, Bad-Words-Detection-System, revscoring: Add language support for Romanian - https://phabricator.wikimedia.org/T152482#2972844 (Halfak) Great! We'll get this integrated as soon as we can. Thanks for your work!
[15:33:10] Amir1, ^ all that sound OK to you?
[15:33:24] Revision-Scoring-As-A-Service, Bad-Words-Detection-System, revscoring: Add language support for Romanian - https://phabricator.wikimedia.org/T152482#2972845 (Halfak) a:Halfak
[15:33:35] I'm reading and trying to get my head around it
[15:34:31] halfak: can you get an estimate on how many of those 75% edits are reverted?
[15:34:46] Also I think we need to start working on self-training soon
[15:34:49] Yeah. We'll need to re-run the auto-labeler and have a look.
[15:34:56] self-training...?
[15:36:24] halfak: It would be a great help to decide. If we end up needing to label 50% of those edits, we probably shouldn't :D
[15:36:45] self-training so we can get a random sample and make a boost in our models
[15:36:53] Amir1, ha! Fair point. Well, we can ask the locals. For enwiki and wikidata, I'd be willing to put in some extra work.
[15:37:07] Not sure I understand what you mean by self-training.
[15:37:16] Is this the semi-supervised strategy?
[15:37:41] yup
[15:38:05] Gotcha. Yeah, I'd like to see some experimentation with that.
[15:38:10] it's not very related but I just remembered that
[15:38:10] Something you want to pick up soon?
[15:38:21] probably depends on WMDE work
[15:38:37] Gotcha.
[15:39:24] OK back to the backlog. So we just have one more on the "backlog" lane
[15:39:29] https://phabricator.wikimedia.org/T155853
[15:39:42] It looks like this belongs in the "ideas" column and it might not be a revscoring task.
[15:40:04] I'm interested in just leaving this on the "artificial-intelligence" backlog. What do you think?
[15:40:42] halfak: agreed
[15:41:02] Even though the idea is good but ...
[15:43:03] Cool. I'll make sure that Shilad can find the task quick.
[15:48:43] halfak: Okay, what else needs to be done?
[15:52:14] Oh! The "ideas" column.
[15:52:22] I think I have that mostly cleaned up, but let's have a look
[15:52:40] (Sorry was doing a bunch of stuff to try to help Shilad)
[15:53:06] https://phabricator.wikimedia.org/T155845 -- Interesting to us or remove rsaas?
[15:53:25] Honestly, I'm not sure I *get* this one.
[15:54:03] Amir1, ^
[15:54:10] I think I'm going to remove rsaas
[15:54:30] halfak: I was reading it
[15:54:33] https://phabricator.wikimedia.org/T155843 seems totally do-able. Would be our first non-ML scorer.
[15:54:35] Sorry
[15:54:39] * halfak waits more.
[15:54:59] It's a lot of work for little gain, we can do it but in the next three years :D
[15:55:00] Revision-Scoring-As-A-Service-Backlog, Easy, artificial-intelligence: Text complexity scoring - https://phabricator.wikimedia.org/T155843#2972906 (Halfak) p:Triage>Low
[15:55:15] agreed. Not sure there's demand for this idea.
[15:55:39] I'm sorry, this shitty connection. If we had a better one it'd be much faster
[15:55:52] No worries. I'm glad we could still meet over IRC ^_^
[15:56:08] Finished reading complexity: yay NLP
[15:56:19] I'd love to do more NLP work
[15:56:46] like finding new types of swears :D
[15:58:40] :) Relevant to NLP is https://phabricator.wikimedia.org/T155111
[15:59:05] Having an NLP service with support for 50 languages (!!!) might be a huge catalyst for us.
[15:59:22] Would also address our concerns with memory footprint for the feature extractors that include grammar models
[15:59:56] * halfak subscribes Amir1
[16:00:21] Thanks. I will look into this soon
[16:02:24] We're out of time, but I think this was a good run. I'll be cleaning up the "ideas" column a little more this week.
[16:02:47] And I'll send an announcement of https://phabricator.wikimedia.org/tag/artificial-intelligence/ to the AI mailing list.
[16:03:03] awesome, ping me when you need any opinion
[16:03:12] Will do.
[16:03:49] Oh! Amir1, before you go, I'd love to know about anything you think needs to be communicated about broadly.
[16:04:04] I've been making a list here: https://etherpad.wikimedia.org/p/ores_announcements
[16:04:34] hmm, let me think and I'll come back to you before end of the day
[16:04:38] would it work for you?
[16:05:02] Sure! I have a meeting with comms people (you're included) in 4.5 hours
[16:05:09] Which sounds like it will be end-of-day for you.
[16:06:52] I'm okay timely, but this connection
[16:10:13] halfak: I need to go, o/
[16:13:24] o/
[16:22:25] Revision-Scoring-As-A-Service, Wikidata, User-Ladsgroup, WMDE-Tech-Communication-Mentoring-And-Events: Build item_quality form - https://phabricator.wikimedia.org/T155828#2973035 (Halfak) @Glorian_Yapinus, we'll need to have consistency in labeling in order for the classifier to be able to learn...
[16:25:51] Revision-Scoring-As-A-Service, Wikidata, User-Ladsgroup, WMDE-Tech-Communication-Mentoring-And-Events: Build item_quality form - https://phabricator.wikimedia.org/T155828#2973051 (Halfak) Note that, the more consistent our labels, the better test data we'll have. It means that we'll be able to g...
[16:49:04] Revision-Scoring-As-A-Service-Backlog, artificial-intelligence: Major edit detector - https://phabricator.wikimedia.org/T156385#2973141 (Halfak)
[18:32:29] (CR) Ladsgroup: Enhance ORES support in enhanced changes list (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/333493 (https://phabricator.wikimedia.org/T155903) (owner: Ladsgroup)
[18:49:24] Revision-Scoring-As-A-Service-Backlog, Research Ideas, revscoring, artificial-intelligence: Non-article focused damage detection - https://phabricator.wikimedia.org/T156403#2973628 (Halfak)
[18:50:04] Revision-Scoring-As-A-Service-Backlog, Research Ideas, revscoring, artificial-intelligence: Non-article focused damage detection - https://phabricator.wikimedia.org/T156403#2973647 (Halfak) Note that we currently train models on a random sample of edits. However, we could also train models on sp...
[20:12:26] Revision-Scoring-As-A-Service, revscoring: Generate PCFG sentence models - https://phabricator.wikimedia.org/T148037#2712808 (Halfak) See https://github.com/wiki-ai/wikigrammar
[20:17:09] Revision-Scoring-As-A-Service-Backlog, Research Ideas, artificial-intelligence: Article categorizer - https://phabricator.wikimedia.org/T123327#2974028 (Halfak)
[23:13:52] wiki-ai/revscoring#869 (tamil_lang - ed8aeb5 : halfak): The build passed. https://travis-ci.org/wiki-ai/revscoring/builds/195710839
[23:41:50] Revision-Scoring-As-A-Service, Design-Research, Research Ideas, Research-and-Data, Wikimedia-Developer-Summit (2017): Evaluating the user experience of AI systems - https://phabricator.wikimedia.org/T149373#2750321 (Capt_Swing) I was sick, so this talk didn't get delivered :(
[23:42:09] Revision-Scoring-As-A-Service, Design-Research, Research Ideas, Research-and-Data, Wikimedia-Developer-Summit (2017): Evaluating the user experience of AI systems - https://phabricator.wikimedia.org/T149373#2974691 (Capt_Swing) Open>Resolved
[23:42:31] Revision-Scoring-As-A-Service-Backlog, Easy, artificial-intelligence: Text complexity scoring - https://phabricator.wikimedia.org/T155843#2956513 (DatGuy) Is this really 'easy'?