[09:11:05] Amir1: Hello! You are mentoring my task on GCI (adding feature to wikiclass). May I ask you some questions about my task? [09:34:05] Phantom42: hey, sure. Can you give link to the exact phab card? [09:34:11] Scoring-platform-team (Current), ORES, Operations, Graphite, and 2 others: Regularly purge old ores graphite metrics - https://phabricator.wikimedia.org/T169969#3892437 (fgiunchedi) Open>Resolved All done! Agreed the parameter isn't the best, and naming is hard :( This task is done from... [09:37:21] Amir1: Here is the task on Phab: https://phabricator.wikimedia.org/T174384 [09:38:05] Amir1: Cool [09:38:07] So I checked the wikiclass and revscoring code and made myself familiar with it. I also understand what needs to be done in that task [09:38:16] That would be very interesting [09:38:22] pinged myself. facepalm [09:38:57] Phantom42: so what you need to do is to add some features that can signal big clumps of unreferenced text [09:39:02] I didn't have problems with building models on my machine (with `make models/enwiki.nettrom_wp10.gradient_boosting.model`). But I have problems with generating tuning reports [09:39:24] Error log for tuning reports: https://dpaste.de/LqOV [09:39:56] it seems the make command is wrong [09:40:13] Yes. There are some problems with Makefile [09:41:25] Phantom42: can you get me the version of the revscoring library (you can get it with pip freeze) [09:42:07] pip shows 2.0.8 [09:42:15] I had some problems with 2.1.0 so I downgraded it [09:42:37] Try generating tuning report with 2.1.0? [09:42:42] yup [09:42:46] otherwise it doesn't work [09:42:53] we should fix the upgrade [09:42:57] Okay, thanks. 
Will try it now [09:47:37] Amir1: Still failing after upgrading: https://dpaste.de/L3aZ :( [09:50:06] let me check the changes in the code [09:51:38] I played a bit with parameters and found out that it stops failing after removing `--label-type=str`, but generates empty output file :( [09:52:31] can you send me the log and what it outputs you? [09:54:31] One moment... [09:59:15] Amir1: So here is what happens if I remove `--label-type=str`: https://dpaste.de/uSCy [10:00:24] it seems sklearn library is not installed and called correctly [10:00:34] Model sklearn.svm.SVC does not have a train() method. [10:04:20] Hm, I think I forgot to install it. Will install and try again now. [10:11:26] Amir1: Okay, I installed it with (pip3 install -U scikit-learn). The output has changed a bit, but looks like the error is the same: https://dpaste.de/VDbu [10:13:25] Phantom42: I think the version that should be installed is very important [10:13:32] otherwise, it makes a mess [10:14:44] Phantom42: install 0.17 [10:15:09] Okay, will try now... [10:37:31] Amir1: Installing 0.17 did not help: https://dpaste.de/3nSq Maybe there are some other dependencies needed? [10:48:26] Phantom42: it can be [10:48:38] go through dependencies in requirements.txt in revscoring [10:51:37] Amir1: Just tried running "pip3 install -r requirements.txt" for revscoring requirements.txt. Got "Requirement already satisfied" for all of them [10:54:42] Phantom42: give it a try with pip3 install -U -r requirements.txt [10:54:55] and if that doesn't work out, I'm running out of ideas :( [11:00:59] I just tried it. Some dependencies were updated. But still the same problem. I am doing everything in parallel on 2 machines - same problem on both. Let me try rebuilding the model. Hopefully I will get that error I was previously getting there and it gives us some clues. If it doesn't, I will try again in virtualenv [11:06:57] Hm, I didn't get error building model with 2.1.0 revscoring this time. 
Okay, let's wait for model to rebuild and while I am waiting, I will try in virtualenv other machine [11:15:07] Unluckily no difference with virtualenv [11:16:34] I will also try doing things on machine with different OS. Both previous machines were running Ubuntu, but I also have laptop running windows. Will try there too... [14:48:23] o/ [14:49:16] Amir1: I rebuilt the model, but it didn't help. And trying things on Windows doesn't seem to be good, as there are even more problems because some dependencies need to be compiled. I better try to get it to work on my Ubuntu machine... [14:51:51] (03PS1) 10Awight: Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - 10https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) [14:51:56] halfak: heyo [14:52:04] yo! Good morning [14:52:32] Today is super weird -- sandwiched between trips. [14:52:47] Just about to start my first meeting :| [14:54:34] Fine with me, I’ve been digging into E:ORES tests like a wombat [14:54:50] Phantom42: Sorry I’m late to the party. Looking at your error output, I think there’s something wrong with our Makefile. [14:55:14] awight, do you have a rough sense for what the progress is on that work? E.g. do you think we'll be able to call the refactor "done" soon? [14:55:40] * halfak needs to find time to finish converting ores.wmflabs.org to stretch. [14:55:42] Phantom42: The --pop-rate option doesn’t appear in the “tune” tool’s error output. [14:56:22] halfak: The refactor is happing at blinding speed, IIRC Amir1 is only planning on slicing up one more small class, if any. [14:56:36] gotcha. Cool. thanks :) [14:56:42] The tests are at 56% coverage, out of a self-imposed goal of 60%. [14:56:50] I’ll hit that in a couple of hours. 
[14:59:11] (03CR) 10jerkins-bot: [V: 04-1] Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - 10https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: 10Awight) [14:59:54] halfak: ah, also I remembered about this great code quality tool to help us find the pain points: https://scrutinizer-ci.com/g/wikimedia/mediawiki-extensions-ORES [15:03:36] Oh we must have already had this conversation…. https://scrutinizer-ci.com/g/wiki-ai/revscoring/ [15:05:04] 10Scoring-platform-team, 10Collaboration-Community-Engagement, 10MediaWiki-extensions-ORES, 10Patch-For-Review, 10User-notice-collaboration: Deploy ORES filters to Simple Wikipedia - https://phabricator.wikimedia.org/T182012#3893520 (10Trizek-WMF) What is the status of that task? I haven't seen the filte... [15:10:36] awight: Hm, initially the problem was `--label-type` parameter and we removed it from command, so it works. But what's with `--pop-rate`? [15:13:39] Phantom42: Sorry, this is definitely a problem with our Makefile. [15:13:44] o/ [15:14:05] Phantom42: Can I ask why you’re trying to build that particular model? Have you and Amir1 already determined it’s the closest fit for your project? [15:14:34] oh hehe I see, you’re just running the tuning reports. [15:16:59] * awight files a bug [15:18:22] awight: I am working on this task: https://phabricator.wikimedia.org/T174384 I am working on adding new feature to enwiki and I need running reports to see if model prediction rates improved after new feature was added. [15:18:41] tuning reports * [15:22:12] Phantom42: Can you show me the result of “pip freeze | grep revscoring"? [15:22:34] oh nvm, I see it’s in your paste already! [15:25:28] Phantom42: I’m curious how you got “datasets/enwiki.labeling_revisions.w_cache.nettrom_30k.json”, which isn’t in the Makefile. [15:26:17] ah my fault again—I was in editquality, but you’re working in wikiclass. 
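Editor's note: much of the debugging above hinges on which versions of revscoring and scikit-learn are installed (the `pip freeze | grep revscoring` request). A minimal sketch of the same check from inside Python, assuming Python 3.8+ for `importlib.metadata`; the helper name `installed_versions` is made up for illustration and is not part of revscoring or wikiclass:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the conversation; adjust to taste.
    for name, version in installed_versions(["revscoring", "scikit-learn"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the virtualenv that runs the Makefile targets shows what the tune utility will actually import, which may differ from what a system-wide `pip freeze` reports.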
[15:27:13] * awight pretends to drink some coffee [15:29:46] Amir1: btw, <3 tqdm, I use it for everything now. [15:31:36] awight: sorry, was afk meeting with Angel [15:31:38] so [15:31:58] Amir1: that’s right! Cool, I hope it went well. [15:32:51] Yeah [15:33:08] Okay, Cool. Today is wikidata day, so can't work much [15:33:17] but if there is anything, let me know :) [15:33:50] 10Scoring-platform-team, 10ORES: Makiefile tuning reports broken by deprecated command parameters - https://phabricator.wikimedia.org/T184727#3893621 (10awight) [15:34:45] awight: there is one small thing with your patch: https://integration.wikimedia.org/ci/job/mwext-testextension-hhvm-jessie/28693/console [15:35:11] Amir1: I’m planning to write tests for maintenance/, do you think that’s worthwhile? [15:35:12] otherwise it should've been done loong time ago [15:36:09] Amir1: That’s weird, it totally does use ORESService. [15:36:21] Lint fail? [15:40:09] Phantom42: Looks like it was something simple. You can pull this branch, or make the change locally as you wish: https://github.com/wiki-ai/wikiclass/pull/58/files [15:42:28] awight: Good! But unluckily it still does not help with the problem that I get empty tuning report. [15:42:41] hahaha that’s something else, then. [15:43:01] Can you paste the output? [15:43:43] awight: https://dpaste.de/3nSq [15:44:40] Phantom42: I haven’t actually run this tool myself, yet :-/ but this line looks most suspicious: [15:44:42] > Running gridsearch for 0 model/params pairs [15:44:48] Not much of a grid! [15:45:30] But I have a built model... Why doesn't it use it? [15:46:10] Hmm, I get a totally different error... [15:46:30] What do you get? [15:47:01] wiki-ai/wikiclass#28 (fix_T184727 - 8066d2b : Adam Roses Wight): The build passed. 
https://travis-ci.org/wiki-ai/wikiclass/builds/327731953 [15:50:09] Phantom42: I get, https://dpaste.de/ooK6 [15:50:25] that is extremely weird [15:50:41] I’m copying down a built model to see if that’s the issue and it’s just a misleading error. [15:51:19] Amir1: phpcs passes locally! [15:51:31] awight: Hm. I didn't have such error [15:52:55] (CR) Awight: "recheck" [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [15:54:11] (CR) jerkins-bot: [V: -1] Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [15:57:34] (PS2) Awight: Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) [15:58:50] (CR) Awight: "php-cs is killing me. With the change in PS2, php-cs seems to pass, but of course the script crashes at runtime. 
With or without the cha" [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [15:59:22] (CR) Awight: [C: -1] Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [16:02:10] (PS3) Awight: Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) [16:02:12] (CR) jerkins-bot: [V: -1] Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [16:02:28] what is happening [16:04:51] Amir1: :D It was silently rebased, which changed that line to your MediaWikiServices singleton call. So my use statement was extraneous after the rebase. [16:05:55] (PS4) Awight: Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) [16:08:53] (CR) Awight: Namespace maintenance scripts so they're discoverable from tests [extensions/ORES] - https://gerrit.wikimedia.org/r/403663 (https://phabricator.wikimedia.org/T184140) (owner: Awight) [16:10:23] halfak: Feel like sharing a clue… https://dpaste.de/ooK6/raw [16:11:02] awight, features aren't extracted. [16:12:57] halfak: I double-checked and datasets/enwiki.labeling_revisions.w_cache.nettrom_30k.json does include the cached features. [16:13:28] Or at least, a bunch of binary data in the “cache” field... [16:13:31] awight, looks like one of the features isn't cached. :/ That's my only explanation [16:13:46] Might be different versions of wikiclass/revscoring [16:13:52] OK cool, that’s helpful! Yes I think that’s it. 
[16:14:13] I grabbed this file out of a random homedir to “save time” :) [16:14:18] Will rebuild from scratch. [16:14:24] :) godspeed [16:17:18] And meanwhile I am experimenting with dependencies, versions, etc... [16:17:48] Phantom42: Probably a good idea. Thanks for being patient :) [16:19:52] halfak: Feel free to try to diagnose the latest issue that Phantom42 is running against, if you find time: https://dpaste.de/3nSq [16:19:58] Note line 34. [16:20:07] The output is empty. [16:20:31] Old version of wikiclass with a new version of revscoring :) [16:20:54] Phantom42: :D ^ [16:20:58] Ack! No! [16:21:03] lol [16:21:11] We haven't updated the wikiclass param set since we updated revscoring to 2.0! [16:21:30] OK so here's the solution. Check out the editquality param set here: [16:21:38] https://github.com/wiki-ai/editquality/blob/master/config/classifiers.params.yaml [16:21:40] I did find a —label-type param, which I crudely butchered out. [16:21:53] Note that you see lines like "revscoring.scoring.models.GradientBoosting" [16:22:04] Rather than "sklearn.ensemble.GradientBoostingClassifier" [16:22:25] halfak: gotcha. I’ll see if I can adapt it, always nice to explore new alleyways. [16:22:29] This is because we now use our own "scoring.Model" classes rather than sklearn directly. [16:22:35] \o/ Thanks awight [16:22:43] Just got done with mega meeting [16:22:53] Will be working on ores.wmflabs.org --> Stretch soon. [16:23:01] yuck. OK great! [16:24:04] halfak does revscoring cv_train save the statistics when we save the model? [16:24:17] yes [16:24:24] The statistics are stored inside of the model. [16:24:29] awight, +1 for yuck! [16:24:37] :D But it's a really good test ^_^ [16:24:54] It's very exciting for me to be back to normal work [16:25:00] Conferences get old after a while :) [16:25:06] I missed y'all [16:25:09] halfak: Should I comment out SVC? [16:25:26] awight, +1 SVC is super slow and bad [16:25:33] It never wins. just wastes time. 
[16:25:41] {{done}} [16:25:41] How efficient, awight! [16:26:23] n.b. I noticed that the “RandomForestClassifier” key is still inconsistent with the others, which have had the “Classifier” suffix stripped. [16:28:42] awight: are you refering to classifier names in revscoring/scoring/models? [16:29:18] codezee: These are the keys in config/classifiers.params.yaml —I’m not sure what they correspond to, yet. [16:29:32] awight: those keys correspond to the classifier to use [16:30:05] awight: oh sorry, you're on wikiclass right? [16:31:11] codezee: I’m updating wikiclass/config, yeah [16:31:24] awight: that key is just a key to the config in yaml, the real thing is the "class:" attribute from which it'll pick up classs [16:31:33] tgr|away: Argh, donno how I missed the meeting notification! Is there anything I can help with, halfway through? [16:31:46] codezee: oho, thanks. So the keys are arbitrary? [16:32:53] awight: yes, see - https://github.com/wiki-ai/revscoring/blob/master/revscoring/utilities/tune.py#L253 [16:33:02] its the "name" part [16:33:36] awight: btw i use the latest config for drafttopic , sth like this - https://dpaste.de/19Do [16:34:01] codezee: Nice, ty [16:34:06] np :) [16:34:52] note to self: draftquality needs an updated config as well. [16:36:46] 10Scoring-platform-team, 10ORES: Wikiclass tuning broken, needs revscoring 2 update - https://phabricator.wikimedia.org/T184727#3893930 (10awight) [16:37:25] Thank you awight! Will try that now [16:37:33] Phantom42: It’s working for me! 
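Editor's note: the point made above is that each top-level key in `config/classifiers.params.yaml` is only a label, while the `class:` attribute names the model class that gets loaded. A minimal sketch of how such a parameter set could expand into model/params pairs. The dict below only mirrors the YAML layout; the `params` sub-key, the parameter values and the `RandomForest` class path are assumptions for illustration, and `expand_grid` is a made-up helper, not the revscoring tune code:

```python
from itertools import product

# Hypothetical parameter set, written as a dict that mirrors the YAML layout.
PARAM_SETS = {
    "GradientBoosting": {
        "class": "revscoring.scoring.models.GradientBoosting",
        "params": {
            "n_estimators": [100, 300],
            "max_depth": [3, 5],
            "learning_rate": [0.1],
        },
    },
    # The key is an arbitrary label; only "class" decides what is loaded.
    "RandomForestClassifier": {
        "class": "revscoring.scoring.models.RandomForest",  # assumed path
        "params": {"n_estimators": [80, 160, 320]},
    },
}


def expand_grid(param_sets):
    """Yield (label, class_path, params) for every parameter combination."""
    for label, spec in param_sets.items():
        names = sorted(spec["params"])
        value_lists = [spec["params"][name] for name in names]
        for values in product(*value_lists):
            yield label, spec["class"], dict(zip(names, values))


if __name__ == "__main__":
    pairs = list(expand_grid(PARAM_SETS))
    print(f"Running gridsearch for {len(pairs)} model/params pairs")
```

With a parameter set that still points at `sklearn.*` classes, a loader expecting revscoring model classes would find nothing usable, which is consistent with the "0 model/params pairs" symptom discussed earlier.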
[16:38:09] Scoring-platform-team, ORES: Wikiclass tuning broken, needs revscoring 2 update - https://phabricator.wikimedia.org/T184727#3893621 (awight) https://github.com/wiki-ai/wikiclass/pull/58 [16:43:04] Scoring-platform-team, ORES: Wikiclass tuning broken, needs revscoring 2 update - https://phabricator.wikimedia.org/T184727#3893940 (awight) https://github.com/wiki-ai/draftquality/pull/18 [16:43:12] awight: btw, after the fast scoring change is merged, model building should take 5 times less time... I'm hoping [16:43:18] Scoring-platform-team, ORES: Tuning broken in some repos, needs revscoring 2 update - https://phabricator.wikimedia.org/T184727#3893941 (awight) [16:43:39] codezee: wat! I missed this, is this about reducing the number of estimators for drafttopic? [16:44:34] awight: this was a general revscoring change that would improve tuning times for every model by 5 times approx [16:44:52] by scoring items in a bunch rather than one by one [16:44:59] Fantastic, thanks for scaling us! [16:45:41] awight: https://github.com/wiki-ai/revscoring/pull/388 there hasn't been any substantial model building after that so I'm hoping you could report on that [16:45:55] https://img00.deviantart.net/bfed/i/2011/099/6/1/shrinking_or_growing__by_cannibalcupcake-d3dl5hn.jpg [16:46:39] codezee: Cool. I’m running the wikiclass enwiki tuning on ores-misc-01, if that’s going to be comparable with any older data points? [16:47:16] darn! Our tuning reports don’t include total run time! [16:49:22] awight: I don't suppose a quantitative comparison is possible, was just hoping you could compare from memory.... :D [16:49:48] codezee: This is the first time I’ve run a tuning report /o\ [16:50:47] Scoring-platform-team, ORES: Tuning broken in some repos, needs revscoring 2 update - https://phabricator.wikimedia.org/T184727#3893974 (Phantom42) Looks like there is one more minor problem with Makefile. 
`enwiki_tuning_reports` rule runs `tuning_reports/enwiki.wp10.md` and `tuning_reports/enwiki.nettro... [16:50:53] nevermind :P [16:51:00] 10Scoring-platform-team, 10ORES, 10Performance: Tuning reports should give us a rough indication of algorithm performance - https://phabricator.wikimedia.org/T184743#3893975 (10awight) [16:51:08] codezee: For next time ^ [16:51:15] awight hi [16:51:22] about adding javascript to add a class [16:51:29] how would i use that in the css please? [16:51:29] paladox: -releng? [16:51:44] awight for that gerrit change you reviewed :) [16:51:55] https://gerrit.wikimedia.org/r/#/c/402665/ [16:51:58] codezee: I’ll record the total time and we can see if anyone else remembers ballpark [16:52:15] paladox: Totally, I was just suggesting #wikimedia-releng cos other people there might be interested. [16:52:21] ah i see [16:52:36] paladox: My thought was that you could jam the .js line into the init() function [16:52:48] yep, it works in as far as it adds the class [16:52:56] but im wondering how do i use it? [16:53:27] Oh great! So the next change is to rewrite any CSS with the “rootNode” keyword, to instead be a top-level rule addressing .loginParent { [16:53:59] ah so [16:54:08] .loginParent body { [16:54:10] for example? [16:54:34] hmm doing this [16:54:35] html body .loginParent [16:54:38] does [16:54:39] https://gerrit.git.wmflabs.org/r/login/ [16:55:01] I’m confused about $root body [16:55:11] that's html [16:55:18] $root = html [16:55:28] so html body ($root body) [16:55:48] paladox: Are you already using the newer gerrit? Might want to use the same version as WMF [16:56:22] awight i am using gerrit 2.14. But the version dosen't really matter as we use GerritSite.css (gerrit 2.13 uses that too). 
[16:56:28] paladox: I thought that $root was a trick to get $(‘.loginForm').parentNode [16:56:42] awight that was a eqcss var [16:56:52] paladox: ah cos I noticed that the element already has an ID that you can address, so we don’t need the extra JS code [16:57:00] yep [16:57:11] awight: thanks, it was mostly MediaWiki-related topics [16:57:49] paladox: confirmed that it exists in WMF gerrit 2.13 [16:58:12] tgr: OK, thanks for handling! [16:58:16] awight: Looks like it works now! 77 model/param pairs! Thank you so much! [16:58:34] yep [16:58:34] Phantom42: \o/ any time I can help muddle through things, at your service [16:59:25] paladox: You’re right, I see “$root which always refers to the HTML document.” on http://elementqueries.com [16:59:33] yep [16:59:33] In that case, I have no idea what we’re doing here. [16:59:53] well what we are trying to do is apply this new css only if we are on the login page [17:00:01] otherwise it will apply it to the whole of gerrit [17:00:19] paladox: ty. OK so the mere existence of a matching element is enough... [17:00:42] yep [17:01:28] So I’m a terrible hack, but perhaps you could use the login init() function to add a class to the body, then qualify all the login-only rules with “body.isLoginPage ..." [17:02:17] The JS to add the class would look like, “document.body.classList.add"... [17:03:27] ah [17:03:27] ok [17:03:33] thanks will try with that [17:04:08] awight yay [17:04:12] that works i think [17:04:12] https://gerrit.git.wmflabs.org/r/login/ [17:04:23] Looks good! [17:04:41] Seems like it’s not wrecking non-login pages either? [17:05:04] Sorry to be a jerk about EQCSS, it just seems like overkill. [17:05:38] Heh yeh [17:52:23] 10Scoring-platform-team, 10Analytics, 10Analytics-Wikistats, 10ORES: Discuss Wikistats integration for ORES - https://phabricator.wikimedia.org/T184479#3884392 (10Milimetric) totally, put a meeting on our calendar or let's chat here. 
[17:54:54] 10Scoring-platform-team (Current), 10Beta-Cluster-Infrastructure, 10Recommendation-API, 10Release-Engineering-Team: What to do with deployment-sca03? - https://phabricator.wikimedia.org/T184501#3894240 (10Nuria) [17:59:37] biab [18:09:37] Nettrom: quick q. - what tool do you use to plot graphs? [18:11:06] codezee: I use R and ggplot2 [18:31:58] halfak: we chatted about using just the “OK” label from the draft quality model, let me know when you have a few minutes to dig more into that [18:32:13] Oh yeah. So. [18:32:23] You probably want to choose a threshold based on some constraints. [18:32:33] E.g. 90% recall of non-OK drafts [18:33:53] * halfak gets a link/api call [18:34:39] https://ores.wikimedia.org/v3/scores/enwiki/?models=draftquality&model_info=statistics.thresholds.OK.%22maximum%20!precision%20@%20!recall%20%3E=%200.9%22 [18:34:55] That is wrong 50% of the time when it flags something as !OK [18:35:04] But it catches 90% of !OK stuff. [18:35:17] 0.664 probability threshold. [18:37:56] ah, so I can use that to figure out the cost/benefit of using different thresholds? [18:39:13] * Nettrom thinks about this for a bit [18:42:51] Right :D [18:43:07] It will also help you figure out the "meaning" of certain prediction "probabilities" [18:43:28] You could fit a spline to recall or precision so you can convert "probability" to your desired metric. [18:49:20] halfak: while I was collecting statistics with different hyperparams, I saw that averaged best precision lies around 35% for drafttopic while recall is as high as 82%, does this suggest that its not missing out on topics that are assigned(high recall), and predicting some more(low precision) ? [18:58:49] 10Scoring-platform-team, 10Collaboration-Community-Engagement, 10MediaWiki-extensions-ORES, 10Patch-For-Review, 10User-notice-collaboration: Deploy ORES filters to Simple Wikipedia - https://phabricator.wikimedia.org/T182012#3894528 (10Halfak) I don't think this is blocked from #ORES end. 
Is there any i... [19:03:06] halfak: thanks for the link and the info, I might end up making a plot, but first I need to wrap my head around precision & recall metrics again to make sure I know what I want :) [19:04:09] codezee, average precision (aka PR-AUC) is a measure of precision and recall across the set of all thresholds. [19:04:28] While the recall presented is a measure of recall at one specific threshold. [19:04:59] Nettrom, +1 specificity and sensitivity (precision, recall) are weird to think about but really useful concepts. [19:05:38] I usually use 90% recall for the damaging model as a "needs review" threshold because patrolers like it. [19:05:51] "Let's make sure to catch at least 90% of the damage on the first pass" [19:06:02] In practice, any obvious damage is going to get caught by that. [19:06:30] yeah, that makes total sense [19:06:34] The stuff that doesn't get caught is often a False False-Positive ;) [19:07:12] and I’m fairly sure I can make some good progress by figuring out a better threshold for the draft quality model, just need to make sure I understand it [19:07:46] Nettrom, see my brief essay http://socio-technologist.blogspot.com/2016/01/notes-on-writing-wikipedia-vandalism.html for some more thoughts [19:07:59] excellent, thanks! [19:08:02] Should be relevant to the new article reviewing problem (at least WRT draftquality) [19:08:18] codezee, did you see my response earlier? [19:09:50] 10Scoring-platform-team, 10Wikilabels, 10editquality-modeling, 10User-Tgr, 10artificial-intelligence: Complete edit quality campaign for Hungarian Wikipedia - https://phabricator.wikimedia.org/T167968#3894584 (10Halfak) Bot edits should not be included in the dataset. Is it possible that some bots that... [19:11:19] halfak: so by averaged precision I meant the metric shown when we pass "precision.macro" to tune as fitness...and I'm assuming thats the precision of all classes averaged? [19:11:36] Oh! Right. 
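Editor's note: the threshold advice above ("maximum precision at recall >= 0.9", and the 90% recall "needs review" rule of thumb) can be illustrated on toy data. This is a minimal sketch, not how ORES computes its threshold statistics; the scores, labels and function names are invented, and the positive class stands for whatever should be caught (for example non-OK drafts):

```python
# Toy model scores and true labels (1 = the class we want to catch).
SCORES = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
LABELS = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]


def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(scores, labels, min_recall=0.9):
    """Among thresholds meeting min_recall, pick the one with highest precision.

    Returns (threshold, precision, recall), or None if no threshold qualifies.
    """
    best = None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, threshold)
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (threshold, precision, recall)
    return best


if __name__ == "__main__":
    print(best_threshold(SCORES, LABELS, min_recall=0.9))
```

Lowering `min_recall` lets the search pick a higher threshold with better precision, which is the cost/benefit trade-off Nettrom asks about.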
[19:12:00] That "average precision" metric that sklearn has is such a confusing name. I'm happy we're avoiding it :D [19:13:16] halfak: also according to the implementation I suppose we're penalizing a prediction if its NOT in the true set, not the other way round right? [19:14:14] Right. It's a "false positive". One thing we could do is encourage editors to help us by cleaning up wikiproject tags on our train/test set and re-extracting that data. [19:14:41] E.g. we could give editors work lists of "missing" mid-level categories and given them a list of potentially relevant WikiProject tags. [19:15:29] some papers that I read upon take the set difference of actual and predicted sets, thereby accounting both ways, do you think we should do that here? [19:15:41] (03CR) 10Catrope: [C: 032] Tentatively re-enable ORES filters on RecentChangesLinked [extensions/ORES] - 10https://gerrit.wikimedia.org/r/403479 (https://phabricator.wikimedia.org/T179718) (owner: 10Sbisson) [19:15:57] so a penalization even if we missed on predicting a tag [19:16:47] codezee, yeah. that will penalize recall [19:17:12] codezee, I don't feel strongly about that. but I suppose we could add a metric for it if you felt strongly. [19:17:27] I think I like the nuanced metrics for each target class :) [19:18:05] I just remembered that a lot of problems could be solved by fixing the directory hierarchy too. :) [19:18:27] Again, that'd be on Wikipedians. [19:18:38] I'm guessing we'll see a lot of that when Wikipedians first start using the tool :D [19:22:44] (03Merged) 10jenkins-bot: Tentatively re-enable ORES filters on RecentChangesLinked [extensions/ORES] - 10https://gerrit.wikimedia.org/r/403479 (https://phabricator.wikimedia.org/T179718) (owner: 10Sbisson) [19:25:30] Oh...I got confused, the current implementation already does a kind of set difference and is penalizing both ways so nothing to worry... [19:27:03] yes, +1 for directory hierarchy... 
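Editor's note: the false-positive versus missed-tag discussion above can be made concrete for a single multi-label prediction. A minimal sketch, assuming plain Python sets of tags; the topic names are invented and `set_metrics` is an illustrative helper, not the drafttopic or revscoring implementation:

```python
def set_metrics(true_tags, predicted_tags):
    """Compare one item's true tag set with its predicted tag set."""
    true_tags, predicted_tags = set(true_tags), set(predicted_tags)
    hits = true_tags & predicted_tags
    false_positives = predicted_tags - true_tags  # predicted, not in true set
    false_negatives = true_tags - predicted_tags  # in true set, but missed
    return {
        "precision": len(hits) / len(predicted_tags) if predicted_tags else 0.0,
        "recall": len(hits) / len(true_tags) if true_tags else 0.0,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        # Symmetric difference counts errors in both directions at once.
        "symmetric_difference": len(true_tags ^ predicted_tags),
    }


if __name__ == "__main__":
    # Hypothetical example: everything labeled is found, plus two extras.
    result = set_metrics({"History", "Europe"},
                         {"History", "Europe", "Military", "Politics"})
    print(result["precision"], result["recall"], result["symmetric_difference"])
```

A prediction like this one has full recall but only half precision, the same shape as the high-recall, low-precision pattern codezee reports; if the extra tags are actually correct but missing from the labels, cleaning up the WikiProject tags would raise measured precision.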
[19:30:25] (CR) jenkins-bot: Tentatively re-enable ORES filters on RecentChangesLinked [extensions/ORES] - https://gerrit.wikimedia.org/r/403479 (https://phabricator.wikimedia.org/T179718) (owner: Sbisson) [19:34:41] :) [19:34:51] OK emails done. Time to start working on ores.wmflabs.org [19:35:15] I have 1.5 hours before I must wrap up and leave. :/ [19:37:48] Scoring-platform-team (Current), ORES: Back up ores-misc-01 to ores-staging-01 - https://phabricator.wikimedia.org/T184765#3894658 (Halfak) [19:38:07] Scoring-platform-team (Current), ORES: Convert ores-misc-01 to stretch - https://phabricator.wikimedia.org/T184766#3894669 (Halfak) [19:38:38] Scoring-platform-team (Current), ORES: Convert ores-misc-01 to stretch - https://phabricator.wikimedia.org/T184766#3894682 (Halfak) [19:38:40] Scoring-platform-team (Current), ORES: Back up ores-misc-01 to ores-staging-01 - https://phabricator.wikimedia.org/T184765#3894681 (Halfak) [19:38:56] Scoring-platform-team (Current), ORES: Convert CloudVPS instances to stretch. - https://phabricator.wikimedia.org/T184296#3894684 (Halfak) [19:38:58] Scoring-platform-team (Current), ORES: Convert ores-misc-01 to stretch - https://phabricator.wikimedia.org/T184766#3894669 (Halfak) [19:39:15] Anyone see if awight said he'd be back soon? [19:39:43] Last I saw was "biab" 1.5 hours ago. [19:43:03] halfak: not able to scp to ores-staging from ores-misc, is that not possible? [19:47:46] @seen awight [19:47:46] Error: Command “seen” not recognized. Please review and correct what you’ve written. [19:48:06] It's a pain to do that. You'll need to rsync from remote to remote. It passes the data through your computer so it's not exactly performant. [19:48:07] wm-bot4: you're supposed to answer :/ [19:48:32] Is @seenon not enabled here? [19:50:37] :/ means I'll have to rebuild models.... 
then [19:50:50] i'll copy the important reports in that case [20:02:17] codezee, important reports should be committed to the repo [20:02:21] Same with important models ^_^ [20:02:39] Usually the only think I need to copy is datasets that are painful (slow) to re-produce. [20:03:53] Like enwiki's? [20:04:39] Depends on the dataset [20:05:06] E.g. working from a random sample for enwiki is pretty quick :) [20:08:43] halfak: yes I'll copy them eventually, I have these reports of varying one hyperparameter keeping others constant for reporting them [20:08:54] I'll commit the main statistics report and model in a commit [20:09:19] kk. Others could be copied then. [20:09:24] * halfak looks for a better way [20:27:30] OK. I got ores-web-01 set up with new code. My plan is to switch that node in. It should be able to talk to the old celery nodes. We'll find out. [21:00:07] codezee: Looks like the tuning run took 2.5hr [21:04:54] awight: how many models x params did it run? [21:05:34] codezee: You can see results in ores-misc-01:/srv/awight/wikiclass/tuning_reports/enwiki.nettrom_wp10.md, if u want more detail. Lemme see... [21:05:54] awight: nevermind if its from config file i can see [21:06:20] Really? I’m having a hard time interpreting. [21:07:05] argh scrollback limit [21:08:09] I guess it’s just the product of options in each line? So 4^3 + 2*3 + 1 + 1 + 7*5*2 [21:08:26] awight: its simple, yes [21:08:27] 142 [21:08:28] that one [21:09:06] awight: just can you also tell the number of entries in enwiki.nettrom.wp10? [21:09:33] I could wc -l, but there’s lots of header junk [21:09:56] 186 lines, so I think my 142 guess is on point. [21:10:12] awight: sorry, i meants the dataset... [21:10:15] :/ [21:10:19] 32k or so [21:10:21] *meant [21:10:32] 32,450 [21:10:37] oh, then it seems to be pretty fast scoring 142 classifier runs [21:10:51] Does that account for CV? [21:10:55] 5-fold. [21:11:06] so 32k * 142 * 5? or 6? 
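Editor's note: the back-of-envelope estimate above (142 model/params pairs, 32,450 observations, 5-fold CV) can be written out explicitly. A minimal sketch, assuming equal-size folds and ignoring any final refit on the full dataset, since the log does not confirm whether the tune utility performs one:

```python
# Grid sizes read off the parameter config in the conversation:
# 4^3 + 2*3 + 1 + 1 + 7*5*2
GRID_SIZES = [4 ** 3, 2 * 3, 1, 1, 7 * 5 * 2]
TOTAL_PAIRS = sum(GRID_SIZES)

DATASET_ROWS = 32450
FOLDS = 5


def cv_workload(pairs, rows, folds):
    """Rough fit and row counts for k-fold cross-validation over a grid."""
    test_rows_per_fold = rows // folds
    train_rows_per_fold = rows - test_rows_per_fold
    return {
        "fits": pairs * folds,
        "train_rows_per_fit": train_rows_per_fold,
        # Each fit trains on (folds - 1) / folds of the data ...
        "total_rows_trained": pairs * folds * train_rows_per_fold,
        # ... and each row is scored exactly once per model/params pair.
        "total_rows_scored": pairs * folds * test_rows_per_fold,
    }


if __name__ == "__main__":
    print(TOTAL_PAIRS)
    print(cv_workload(TOTAL_PAIRS, DATASET_ROWS, FOLDS))
```

Under these assumptions the training volume per pair is four dataset passes rather than five, because each of the five fits sees only four fifths of the rows.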
[21:11:08] awight: tune does a 5 fold CV [21:11:14] so yes [21:11:30] I think training does the CV folds, then a final run over all the data [21:11:45] oh it’s tricker, cos each fold is actually 4/5 of the data. [21:12:18] 4/5 * 5 = 4 :) [21:16:09] 10Scoring-platform-team (Current), 10ORES: Convert CloudVPS instances to stretch. - https://phabricator.wikimedia.org/T184296#3895041 (10Halfak) ores-web-01 is created and configured, but not pooled. It seems to work fine to request scores from this node and it is able to talk to celery as planned. [21:16:32] halfak: How is the wheels thing not a problem? [21:17:40] Not sure I'd wee how it could be a problem [21:17:58] The Jessie instances are running Jessie wheels and the one Stretch instance is running Stretch wheels. :) [21:18:01] halfak: We were talking about the wheel versioning stuff... [21:18:02] AH [21:18:04] great. [21:18:07] Same exact version of celery :) [21:18:08] thanks for doing all that, then :) [21:18:14] yep that should be fine. [21:18:19] Which means we can switch out the web node first and then the celery nodes second. A [21:18:28] Also we can probably switch them out one half at a time :) [21:18:37] Regretfully I need to go get on an airplane. [21:18:43] o/ [21:18:46] So it'll need to wait until i have more time. [21:18:53] I'll be AFK tomorrow and Saturday. [21:18:54] halfak: safe travels! [21:18:56] Should be around on Sunday. [21:19:05] hahahaahahahahaha [21:19:10] Looks like Monday is a holiday. I plan to show up for our sync meetings and take the rest of the day off. [21:19:11] you and whose army. [21:19:25] Thanks Nettrom [21:19:26] :) [21:19:41] OK good warning. I’ll… try to make the Monday meeting. [22:36:35] 10Scoring-platform-team, 10MediaWiki-extensions-ORES, 10Release-Engineering-Team: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#3895786 (10awight) [22:56:23] (03CR) 10jenkins-bot: Localisation updates from https://translatewiki.net. 
[extensions/ORES] - https://gerrit.wikimedia.org/r/403815 (owner: L10n-bot) [23:18:51] (PS1) Awight: Steal model fixtures for TestHelper; add dirty tricks [extensions/ORES] - https://gerrit.wikimedia.org/r/403838 (https://phabricator.wikimedia.org/T184140) [23:18:53] (PS1) Awight: Add a maintenance script test [extensions/ORES] - https://gerrit.wikimedia.org/r/403839 (https://phabricator.wikimedia.org/T184140)