[13:10:56] Scoring-platform-team, WMF-Legal, Wikilabels: Add legal language to wikilabels - https://phabricator.wikimedia.org/T223313 (Halfak) Here's the language I got from the legal folks: # Wiki Label Terms of Use Users are prohibited from engaging in deceptive activities, including misrepresentation of a...
[13:11:05] Scoring-platform-team, WMF-Legal, Wikilabels: Add legal language to wikilabels - https://phabricator.wikimedia.org/T223313 (Halfak) p:Low→High
[13:54:37] Scoring-platform-team (Current), Operations, Release-Engineering-Team: Production shell access for Chris Albon - https://phabricator.wikimedia.org/T256412 (ssingh) a:ssingh→None The only remaining item on this task is the "cloud (labs) groups: deployment-prep", that is better suited for the r...
[14:11:04] wikimedia/ores#1473 (build_event_set - 68fe200 : halfak): The build was fixed. https://travis-ci.org/wikimedia/ores/builds/705795558
[14:11:26] \o/
[14:11:29] That was an old one.
[14:11:32] woops
[14:11:37] lol
[14:11:38] forgot to change my nick :|
[14:11:48] o/ kevinbazira
[14:11:58] o/
[15:52:24] hi halfak, I don't know if you saw what I said here yesterday: I found the page that had that C tokenizer error with articlequality, and I mentioned that articlequality was accumulating memory
[15:52:40] Oh! Great.
[15:52:56] We can use that to file a bug against mwparserfromhell.
[15:56:19] danilo, are you running our oresapi utility directly?
[15:57:09] no, I installed articlequality in a virtualenv in toolforge
[15:58:18] Aha. Can you show me what script you are running?
[15:59:36] it is not in a repository, it is in toolforge in /data/project/ptwikis/scripts/db-artigos/dumpartigos.py
[16:00:25] danilo, see https://github.com/earwig/mwparserfromhell/issues/248 re that parser error.
[16:01:19] the part of the script that uses articlequality is this http://dpaste.com/0WC80N2
[16:29:59] Scoring-platform-team, Wikidata, Wikidata-Query-Service, articlequality-modeling, artificial-intelligence: Add ORES article quality predictions to the WDQS - https://phabricator.wikimedia.org/T257341 (Halfak)
[16:31:47] Scoring-platform-team, Wikidata, Wikidata-Query-Service, articlequality-modeling, artificial-intelligence: Add ORES article quality predictions to the WDQS - https://phabricator.wikimedia.org/T257341 (Halfak) We already store article quality predictions in the `ores_classification` table on t...
[16:35:18] Scoring-platform-team, Wikidata, Wikidata-Query-Service, articlequality-modeling, artificial-intelligence: Add ORES article quality predictions to the WDQS - https://phabricator.wikimedia.org/T257341 (Spinster)
[16:40:52] Scoring-platform-team, revscoring, Chinese-Sites, artificial-intelligence: Tokenization of "word" things for CJK - https://phabricator.wikimedia.org/T111179 (calbon) a:calbon
[16:54:42] danilo, I'm not sure if the article quality stuff is related to the memory leak.
[16:54:48] I just took a pass and I don't see any issues there.
[17:50:43] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Ottomata) https://dvc.org/
[18:13:47] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (ACraze) > https://dvc.org/ @ottomata I've been looking at possibly combining dvc and its related project [[ https://cml.dev/ | cml ]] to automate our workflows, not sure if it does everything w...
[18:13:48] [1] https://meta.wikimedia.org/wiki/https://cml.dev/
[18:16:18] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Ottomata) Yeah, and also seems possibly useful for non ML workflows instead of tools like git-fat or git-lfs.
[18:47:31] halfak: ok, that is not an issue for me since I found a way to deal with the memory usage, I just reported it because it seemed to be related to articlequality: I didn't have that problem when I tested the same script without articlequality
[18:49:20] Interesting! Thanks. That is good to know. Thanks!
[18:50:19] BTW, danilo: did you show halfak the tool you've created for comparing ORES quality predictions to your script predictions?
[18:50:52] no, I thought you did
[18:51:05] https://ptwikis.toolforge.org/Qualidades_diferentes
[18:51:37] What are some values I can use here?
[18:51:54] danilo, I think it was a previous version, just the result of a query, if I remember correctly
[18:52:09] https://ptwikis.toolforge.org/Qualidades_diferentes?cat=Biografias&catdeep=2
[18:52:21] halfak, ^
[18:52:32] The cells contain links
[18:52:51] they will show you which articles have a given ORES prediction, and a given prediction by his script
[18:52:52] Oh interesting. Looks like ORES splits quality 1 into 1 & 2.
[18:53:00] Generally, it's more optimistic.
[18:53:26] halfak, and the tool restricts the comparison to the given category
[18:53:37] Any general thoughts after using this tool about what ORES is getting right/wrong?
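Editor's sketch: the comparison table discussed above (script prediction versus ORES prediction, per article) is a cross-tabulation. The block below is a minimal illustration, not the code behind Qualidades_diferentes; the function names, the 1 to 6 quality scale as a default, and the sample pairs are all assumptions made for the example.

```python
from collections import Counter

def quality_matrix(pairs, levels=range(1, 7)):
    """Cross-tabulate (script_quality, ores_quality) pairs.

    Rows are the script's quality level, columns are the ORES level.
    The 1..6 scale is an assumption for illustration.
    """
    counts = Counter(pairs)
    return [[counts[(s, o)] for o in levels] for s in levels]

def optimism(pairs):
    """Return (n_ores_higher, n_ores_lower) over all pairs."""
    higher = sum(1 for s, o in pairs if o > s)
    lower = sum(1 for s, o in pairs if o < s)
    return higher, lower

# Hypothetical sample: (script prediction, ORES prediction) per article.
sample = [(1, 1), (1, 2), (1, 2), (3, 4), (4, 4), (5, 4)]
matrix = quality_matrix(sample)
ores_higher, ores_lower = optimism(sample)
```

If `ores_higher` is clearly larger than `ores_lower`, that matches the "more optimistic" reading; cells above the diagonal of `matrix` hold the articles ORES rates higher than the script.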
[18:54:49] The tool seems related to what was mentioned in today's meeting, about investigating the quality for pages related to a given wikiproject
[18:56:04] halfak, when danilo produced a file with the quality predictions for all articles, I was playing with its data here: https://paws-public.wmflabs.org/paws-public/User:He7d3r/analysis/article-quality-assessment-differences.ipynb
[18:57:25] the "tema de introdução" is an approximation of the "topic" of the article, extracted from its introduction, if I remember correctly
[18:58:31] halfak, given the matrix from Out [4], I agree with you that ORES seems to be more optimistic in general
[18:58:31] Buffer 4 is empty.
[18:58:39] one difference is that ORES gives quality 5 and 6 to articles that didn't pass the featured/good article nominations, and my algorithm (the same one used in the Lua module) uses the featured/good article template to give those qualities
[18:59:21] Aha. Yeah, I think it's good that ORES still makes those predictions and we apply rules/human review to judge on top of (overrule) ORES
[19:01:27] Scoring-platform-team, editquality-modeling, artificial-intelligence: Update Turkish Wikipedia's labeling campaign for 2020 - https://phabricator.wikimedia.org/T257359 (Halfak)
[19:02:50] I need to run to grab lunch. Back in a bit.
[19:03:39] yeah, I believe the community could consider using e.g. articles with current quality <= 4, but which ORES predicts as having quality 5 or 6, as possible candidates for the featured/good article nominations
[19:05:34] Helder_: I put the two qualities in the lists of https://ptwikis.toolforge.org/Matriz_acessos , it will take a while for me to put it in the table because I will need to write all the tool code for that
[19:06:37] *rewrite
[19:07:25] danilo, great! I'll use that to choose the next math-related article I'll translate into Portuguese :)
[19:07:44] :)
[19:08:28] some small-medium sized article, with a large number of page views, to maximize the impact of each translated word
[19:12:59] halfak: the "tema de introdução" ("introduction subject" in English) in the tool is the subject extracted from the first sentence of each article, which usually follows the pattern "'''''' is a "
[19:14:09] I did some research on those introduction subjects two years ago: https://pt.wikipedia.org/wiki/Usu%C3%A1rio:Danilo.mac/Temas_mais_comuns
[19:43:39] Jade, Scoring-platform-team (Current), MW-1.35-notes (1.35.0-wmf.41; 2020-07-14): Render wikitext in Jade endorsementcomment and notes fields. - https://phabricator.wikimedia.org/T254355 (ACraze) nice one @kevinbazira ! I'm glad we figured out how to include the parsed output in the results without a...
[20:01:51] Aha! that introduction subject makes a lot of sense.
[20:02:11] That's really cool and a really interesting way to search categories of topics.
[20:47:07] ORES, Scoring-platform-team, Growth-Scaling, Growth-Team, drafttopic-modeling: Add articletopic model to testwiki - https://phabricator.wikimedia.org/T257248 (MMiller_WMF) @Tgr -- which team would be doing this work? Is it Growth, Scoring, or Search? Or all three?
[20:49:06] ORES, Scoring-platform-team, Growth-Scaling, Growth-Team, drafttopic-modeling: Add articletopic model to testwiki - https://phabricator.wikimedia.org/T257248 (Halfak) Aha! We don't have it here, but we could. See https://ores.wikimedia.org/v3/scores/testwiki/ I'm thinking that we would jus...
[21:28:29] I just ran "git review" and nothing happened. O_O
[21:29:03] It happily checked the configs, processed a change, and then didn't tell me that it had created a changeset
[21:30:59] Well I guess that is going to be a problem for tomorrow.
[21:31:05] Take care, folks!
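Editor's sketch: the 19:03 suggestion (treat articles whose current quality is at most 4 but which ORES predicts as 5 or 6 as possible nomination candidates) amounts to a simple filter. The function name, the tuple layout, and the sample data below are hypothetical; only the thresholds come from the conversation.

```python
def nomination_candidates(articles, max_current=4, min_predicted=5):
    """Select titles whose current quality is low but whose ORES prediction is high.

    `articles` is an iterable of (title, current_quality, ores_quality) tuples;
    this layout is an assumption made for the example.
    """
    return [
        title
        for title, current, predicted in articles
        if current <= max_current and predicted >= min_predicted
    ]

# Hypothetical sample data: (title, current quality, ORES prediction).
sample = [("A", 3, 5), ("B", 4, 6), ("C", 5, 6), ("D", 2, 3)]
candidates = nomination_candidates(sample)
```

As noted at 18:59, the prediction would only be a starting point: human review still decides whether a candidate passes nomination.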
[21:43:23] * halAFK lied and tried to troubleshoot for another 10 minutes
[21:43:28] actually giving up now :(
[21:43:30] o/
[22:36:28] ORES, Scoring-platform-team, Growth-Scaling, Growth-Team, drafttopic-modeling: Add articletopic model to testwiki - https://phabricator.wikimedia.org/T257248 (Tgr) >>! In T257248#6287235, @MMiller_WMF wrote: > @Tgr -- which team would be doing this work? Is it Growth, Scoring, or Search? Or...
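Editor's sketch: the ORES v3 scores endpoint linked at 20:49 takes the wiki as a path segment and `models` / `revids` as pipe-separated query parameters. The helper below only builds such a URL and makes no network request; the helper name and the revision IDs are hypothetical, and `articletopic` on `testwiki` is the model under discussion in T257248, not one confirmed to be deployed there.

```python
from urllib.parse import urlencode

ORES_BASE = "https://ores.wikimedia.org/v3/scores"

def ores_scores_url(context, models, revids):
    """Build an ORES v3 scores URL for one wiki (context).

    Multiple models or revision IDs are joined with "|", which
    urlencode percent-encodes as %7C.
    """
    query = urlencode({
        "models": "|".join(models),
        "revids": "|".join(str(r) for r in revids),
    })
    return f"{ORES_BASE}/{context}/?{query}"

# Hypothetical revision IDs, for illustration only.
url = ores_scores_url("testwiki", ["articletopic"], [123, 456])
```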