[11:12:59] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3483594 (Zache) A couple of questions: #1 to get model to be useful then at some point we need [[ https://ores.wmflabs.org/v2/sco...
[14:02:21] o/
[15:44:49] halfak: hi!
[16:05:53] o/ fajne!
[16:06:15] I've got to leave soon, but I'll be around for a few more minutes if you want to chat about something.
[16:09:19] Sorry to not get your ping. I was helping out another volunteer work with the article quality dataset ^_^
[16:11:19] ok, i just sent you an email
[16:13:12] halfak: in a word, i need instructions on how to start wikilabels under wikilabels-wmflabs-deploy
[16:14:29] Okay!
[16:14:40] * halfak looks
[16:14:56] hey
[16:15:17] fajne, try "python labels_web.py" from the base of the wikilabels-wmflabs-deploy directory
[16:15:21] Hey Amir1_!
[16:15:46] I sent you a telegram message that I'll be late but you haven't seen it
[16:15:47] :D
[16:15:58] Bah. My phone didn't notify me
[16:16:24] * halfak opens telegram and it immediately pings
[16:16:27] WTF telegram
[16:16:52] halfak: this one did, here is the output: Traceback (most recent call last): File "labels_web.py", line 8, in from wikilabels.wsgi import server ImportError: No module named wikilabels.wsgi
[16:17:09] run "git submodule update --init"
[16:17:13] Then try again
[16:19:05] i am doing it all on my local computer.. is that ok?
[16:20:03] fatal: Not a git repository (or any parent up to mount point /mnt/c) Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
[16:22:14] halfak: or, if i use your server, how can i launch the browser from there?
[16:23:47] Doing from your computer is great
[16:23:53] Sorry was AFK briefly.
[16:24:23] So, the "not a git repository" -- are you doing this from the same directory as where the "labels_web.py" file is?
[16:25:08] halfak: btw. Do you have any opinion about the email I sent to scoring-internal regarding patrolling against mullahs
[16:25:18] how we should proceed, etc.
[16:25:28] Amir1_, Yeah. Let me look again though.
[16:26:40] Thanks :)
[16:26:58] Amir1_, OK so here's my thought: I want to model the past edits of a user.
[16:27:19] And I think we'll get signal from looking for high amounts of consistency between edits across different pages.
[16:27:50] The problem with removing the same-sex image is that we don't know the significance of the image from a computing point of view.
[16:28:27] But maybe we'll see consistency in file names or a regular removal of images by the user.
[16:28:50] E.g. maybe we have a set of features defined on the last_n_edits(N)
[16:29:04] I don't want to fix just one issue, I just want to use this opportunity to improve signals we get
[16:29:07] halfak: yep, labels_web.py is here, in the /deploy
[16:29:09] Right
[16:29:24] one feature that would be very useful is number of images removed / added
[16:29:32] fajne, that error is saying that you are not working in a git repository. What happens when you run "git status"?
[16:29:41] Amir1_, I think we have that, don't we?
[16:29:54] I don't think so
[16:30:01] I checked briefly
[16:30:05] Oh! We don't!
[16:30:20] o/ hows everything breaking i mean doing
[16:30:33] Amir1_, Agreed! We'll need to localize file namespaces, but let's do it
[16:30:35] :D
[16:30:36] o/ Zppix
[16:30:47] Suddenly this channel is alive!
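A minimal sketch of the "image links added/removed" idea discussed above, assuming mwparserfromhell for parsing. The function names and the hard-coded English "File:"/"Image:" prefixes are illustrative only and are not the revscoring API; as noted next in the conversation, this naive approach also misses images placed through galleries and infobox parameters.

    # Sketch only: count image links added/removed between two revisions of a page.
    # Assumes the English "File:"/"Image:" prefixes; a real feature would use the
    # localized file-namespace names for each wiki (see the sitematrix/siteinfo
    # discussion further down).
    import mwparserfromhell

    IMAGE_PREFIXES = ("file:", "image:")

    def image_links(wikitext):
        """Return the set of image link targets found in a revision's wikitext."""
        parsed = mwparserfromhell.parse(wikitext or "")
        return {str(link.title).strip().lower()
                for link in parsed.filter_wikilinks()
                if str(link.title).strip().lower().startswith(IMAGE_PREFIXES)}

    def image_links_added_removed(parent_text, current_text):
        """Return (added, removed) counts of image links between two revisions."""
        before, after = image_links(parent_text), image_links(current_text)
        return len(after - before), len(before - after)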
[16:30:53] computationally is not easy, e.g. in galleries and infoboxes
[16:31:10] Amir1_, agreed but we have solutions for this.
[16:31:20] halfak i tend to make stuff come alive :P
[16:31:29] Amir1_, https://github.com/mediawiki-utilities/python-mwxml/blob/master/ipython/labs_example.ipynb
[16:31:35] Thanks!
[16:32:02] ok, i think i just downloaded it, not "opened on desktop"
[16:32:24] if i redo this part, all will work smoothly?
[16:32:38] localizing namespaces... thats hell waiting to happen
[16:33:07] fajne, yeah. You want to "git clone"
[16:33:28] And you will need to do that "git submodule update --init" then inside of the cloned directory.
[16:33:33] fajne are you familiar with how git works?
[16:33:35] It pulls down the wikilabels assets.
[16:33:47] Zppix, fajne has a few recent PRs :)
[16:34:01] halfak i see thats great.
[16:34:08] Zppix, it is hell, but we have sitematrix to help.
[16:34:24] Check this out: https://en.wikipedia.org/w/api.php?action=sitematrix&formatversion=2
[16:34:29] this is true but we dont have the best track record
[16:34:38] If you run that and then iterate through it running siteinfo, you can get all of the localizations.
[16:35:32] E.g. https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=general%7Cnamespaces%7Cnamespacealiases%7Cstatistics
[16:35:40] halfak i am aware of sitematrix :P
[16:35:48] i fiddled with it a bit in my spare time before
[16:35:59] Gotcha. I like it. Very useful
[16:36:23] Amir1_, you could build a script that gets the relevant localizations for each wiki.
[16:36:30] Or maybe we could include this in revscoring somehow
[16:36:39] it is but its hard to decode just parts of it without having to pull unneeded info (in my experience)
[16:36:52] Like maybe sitematrix is a dependency that gets solved when you create a api.Extractor
[16:37:04] I think pywikibot has something like that, let me check how it does what it does
[16:37:05] halfak for revs?
[16:37:09] Meh. A little unneeded info is OK
[16:37:41] I think ores would better benefit with using sitematrix than revscoring using it no?
[16:38:04] Hmm... Not really.
[16:38:10] ORES is just a hosting service
[16:38:19] revscoring is responsible for figuring out what data it needs.
[16:38:22] https://github.com/wiki-ai/revscoring/blob/master/revscoring/extractors/api/datasources.py#L9
[16:38:24] Amir1_, ^
[16:38:27] ah yes i forgot
[16:38:29] I think I'd add an API call there
[16:38:43] And then I'd add the site matrix to the persistent cache.
[16:38:58] OH wait. no
[16:39:13] Here
[16:39:14] https://github.com/wiki-ai/revscoring/blob/master/revscoring/extractors/api/extractor.py#L20
[16:39:43] yeah, I was wondering :D
[16:39:56] Then we can add something like "image links added/removed" to https://github.com/wiki-ai/revscoring/blob/master/revscoring/features/wikitext/datasources/parsed.py
[16:40:20] We'll need to have some datasources for "siteinfo" or something like that in https://github.com/wiki-ai/revscoring/tree/master/revscoring/datasources
[16:40:44] If you copy those notes to a phab task, I'd love to work with you on this :)
[16:44:45] halfak: ImportError: No module named flask_oojsui ... pip install?
[16:44:48] Thanks!
[16:44:56] pip install -r requirements.txt
[16:45:03] fajne, ^
[16:45:04] Amir1_, :)
[16:45:30] I've got to run. Amir1_, can you help fajne get wikilabels-wmflabs-deploy running so she can test some i18n changes?
[16:45:47] sure
[16:45:53] Thanks!
[16:45:53] thanks!
[16:45:53] hey fajne
[16:45:56] Have a good rest of your saturday!
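A rough sketch of the localization lookup halfak describes: walk the sitematrix, then ask each wiki's siteinfo for its namespace names and aliases. The requests calls below hit the same API endpoints linked above, but the response-field handling and the hard-coded /w/api.php path are assumptions, and how this would actually be wired into revscoring's api.Extractor or its persistent cache is left open.

    # Sketch: collect the localized names of the File namespace (id 6) for every
    # wiki listed in the sitematrix. Field names follow the standard MediaWiki API
    # responses with formatversion=2; treat this as illustrative, not tested.
    import requests

    def site_urls():
        """Yield the base URL of every wiki listed in the sitematrix."""
        resp = requests.get("https://en.wikipedia.org/w/api.php",
                            params={"action": "sitematrix", "format": "json",
                                    "formatversion": 2}).json()
        for key, group in resp["sitematrix"].items():
            if key == "count":
                continue
            # Language groups carry a "site" list; "specials" is already a list.
            sites = group if key == "specials" else group.get("site", [])
            for site in sites:
                yield site["url"]

    def file_namespace_names(wiki_url):
        """Return the localized name and aliases of the File namespace."""
        resp = requests.get(wiki_url + "/w/api.php",
                            params={"action": "query", "meta": "siteinfo",
                                    "siprop": "namespaces|namespacealiases",
                                    "format": "json", "formatversion": 2}).json()
        info = resp["query"]
        names = [ns["name"] for ns in info["namespaces"].values() if ns["id"] == 6]
        names += [a["alias"] for a in info["namespacealiases"] if a["id"] == 6]
        return names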
[16:45:58] o/
[16:46:12] you too
[16:48:20] fajne: Ping me if you get any errors
[16:48:27] i have a bunch
[16:50:37] Amir1_: pastebin doesn't work, so ...
[16:50:39] Traceback (most recent call last): File "labels_web.py", line 8, in from wikilabels.wsgi import server File "/mnt/c/Users/Fajne/Documents/wikilabels-wmflabs-deploy/wikilabels/wsgi/server.py", line 10, in from . import assets, routes, sessions File "/mnt/c/Users/Fajne/Documents/wikilabels-wmflabs-deploy/wikilabels/wsgi/routes/__init__.py", line 7, in from . import form_builder File
[16:51:02] It is not complete
[16:51:20] http://pastebin.ubuntu.com/
[16:51:22] ?
[16:52:16] http://pastebin.ubuntu.com/25198657/
[16:53:10] oh, yeah, uninstalled dependency
[16:54:20] try pip install functools
[17:05:33] I have to be afk for fifteen minutes
[17:24:24] Amir1_: are you back?
[17:28:10] fajne: I'm back now
[17:28:14] Amir1_: here is a bunch of things i tried while you were afk: http://pastebin.ubuntu.com/25198869/
[17:29:18] wow that's impressive
[17:30:57] fajne: God Save the Google (and stackoverflow)
[17:30:57] functools32
[17:31:01] install this instead
[17:31:15] it seems you are using python2.7 btw. Can you check?
[17:31:40] Amir1_: yes, i do
[17:31:46] a conflict?
[17:32:00] btw, 32 worked
[17:32:16] well, wikilabels doesn't work with python 2.7
[17:32:21] you need to use python 3
[17:32:35] (lru-cache is already in python3)
[17:32:41] daaaamn
[17:32:42] ok
[17:33:01] everywhere in ores, we use python3
[17:34:39] does ores have that in its dependencies?
[17:34:41] doesnt*
[17:35:11] no, I guess
[17:38:13] Scoring-platform-team-Backlog, Bad-Words-Detection-System, revscoring, artificial-intelligence: Add language support for Croatian (hr.wiki) - https://phabricator.wikimedia.org/T172046#3483983 (Ivi104)
[17:38:58] i do have python3 as well actually
[17:41:16] Scoring-platform-team-Backlog, Wikilabels, editquality-modeling, artificial-intelligence: Edit quality campaign for hr.wiki - https://phabricator.wikimedia.org/T172047#3483996 (Ivi104)
[17:53:05] Amir1_: have a wonderful German evening)) I am off to have a breakfast finally. This damn thing works, yay!
[17:53:32] fajne: YAY \o/
[17:53:42] Thanks
[17:53:44] have fun
[17:53:50] thank you
[19:46:00] Scoring-platform-team, editquality-modeling, Release-Engineering-Team (Watching / External), User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3484048 (Ladsgroup) The team decided to mov...
[19:46:13] Scoring-platform-team, editquality-modeling, Release-Engineering-Team (Watching / External), User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3484049 (Ladsgroup) Open→declined
[19:51:24] Scoring-platform-team-Backlog, revscoring, artificial-intelligence: Get signal from adding/removing images - https://phabricator.wikimedia.org/T172049#3484056 (Ladsgroup)
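On the functools32 confusion above: lru_cache lives in the Python 3 standard library, and functools32 is the Python 2.7 backport that provides it, which is why the plain "pip install functools" did not help and why wikilabels, which targets Python 3, failed under 2.7. The usual compatibility shim looks like the sketch below, shown only for context since wikilabels simply requires Python 3.

    # A common way to import lru_cache on both Python 3 and Python 2.7;
    # for reference only -- wikilabels itself just requires Python 3.
    try:
        from functools import lru_cache        # Python 3 standard library
    except ImportError:
        from functools32 import lru_cache      # Python 2.7 backport package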