[13:45:52] Scoring-platform-team, draftquality-modeling, artificial-intelligence: Test draftquality sentiment feature on Editquality - https://phabricator.wikimedia.org/T170177#3465965 (Sumit) **Enwiki damaging gives slight rise in accuracy:** Model tuning report - Revscoring version: 1.3.17 - Features: editqu...
[13:48:50] Scoring-platform-team, draftquality-modeling, artificial-intelligence: Test draftquality sentiment feature on Editquality - https://phabricator.wikimedia.org/T170177#3465993 (Sumit) **Goodfaith model shows a slight fall in accuracy:** Model tuning report - Revscoring version: 1.3.17 - Features: edit...
[14:04:34] o/
[14:13:06] Scoring-platform-team-Backlog, ORES, RESTBase, RESTBase-API, Services (later): Use RESTBase for ORES precaching - https://phabricator.wikimedia.org/T166161#3466137 (GWicke)
[15:40:18] Scoring-platform-team-Backlog, Wikidata, WMDE-Tech-Communication-Mentoring-And-Events: Increase signal of feature set for Wikidata model - https://phabricator.wikimedia.org/T127473#3466492 (Halfak)
[15:40:21] Scoring-platform-team, Wikidata, editquality-modeling, User-Ladsgroup, artificial-intelligence: Use 'informals', 'badwords', etc. in Wikidata feature set - https://phabricator.wikimedia.org/T162617#3466491 (Halfak) Open>Resolved
[15:40:24] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Add new data for damaging models of Persian Wikipedia - https://phabricator.wikimedia.org/T170960#3466494 (Halfak) Open>Resolved
[15:40:31] Scoring-platform-team, User-Ladsgroup: Apply mediawiki core styling convention on javascript files - https://phabricator.wikimedia.org/T169576#3466496 (Halfak)
[15:51:28] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3449254 (Halfak) We really want something like git-lfs that tracks the history but never forces...
[15:54:37] Scoring-platform-team, draftquality-modeling, artificial-intelligence: Experiment with Sentiment score feature for draftquality - https://phabricator.wikimedia.org/T167305#3466561 (Halfak) I'd try this: ``` $ cd draftquality/ $ python Python 3.5.1+ (default, Mar 30 2016, 22:46:26) [GCC 5.3.1 201603...
[15:55:53] o/ awight
[15:56:10] I'm working on the agenda for the meeting that'll start in a bit. See https://etherpad.wikimedia.org/p/scoring_platform if you want to add some stuff.
[15:56:15] heyo
[15:56:25] Cool, I was just looking
[15:56:52] lol trying to respond to your comment from 1 minute ago
[15:56:58] :D
[15:57:14] aww
[15:57:15] File "/home/awight/revscoring/revscoring/scoring/models/sklearn.py", line 51, in __init__
[15:57:18] self.estimator = self.Estimator(**params)
[15:57:21] TypeError: __init__() got an unexpected keyword argument 'threshold_optimizations'
[15:57:31] so flaggedrevs thing didn't complete
[15:57:31] awight, uninstall revscoring 2.0
[15:57:36] Oh
[15:57:37] that
[15:57:37] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3466569 (Ladsgroup) Production can't talk to the outside and download things (unless using carb...
[15:57:59] awight, what utility failed?
[15:58:48] cv_train
[15:59:02] I can paste that somewhere if it's helpful
[15:59:08] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Split editquality repo to two repos, one with full history, one shallow - https://phabricator.wikimedia.org/T170967#3466577 (Halfak) > add a secondary step to our prod deploys that allow us to copy stuff from ou...
[15:59:47] awight, I realized while AFK that I think we can proceed with revscoring 1.3.x for the work that you are doing.
[16:00:00] Either that or you can track down the bug in revscoring 2.0 and work on it yourself ;)
[16:00:56] hehe gotcha
[16:01:06] perhaps we should branch either 1.3 or 2.0
[16:11:33] Scoring-platform-team-Backlog, ORES: Blog about ORES regex-pocalypse - https://phabricator.wikimedia.org/T171486#3466616 (Halfak)
[16:22:38] Scoring-platform-team, Wikilabels, User-Ladsgroup: linting tests for wikilabels - https://phabricator.wikimedia.org/T171084#3466685 (Ladsgroup)
[16:29:55] Scoring-platform-team, draftquality-modeling, artificial-intelligence: Experiment with Sentiment score feature for draftquality - https://phabricator.wikimedia.org/T167305#3466705 (awight) Verified that the with_sentiment model includes , ,
Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Investigate small loss in fitness with the new data in fawiki - https://phabricator.wikimedia.org/T171386#3466761 (Halfak) It happened with the goodfaith model here too: {T170177}
[16:59:50] Scoring-platform-team, MediaWiki-extensions-ORES: Make list of features and locations of ORES Review Tool for handoff - https://phabricator.wikimedia.org/T167911#3466788 (awight) Open>Resolved This is complete from our perspective. @jmatazzoni has a plan for the preferences, which we'll discuss...
[17:02:09] halfak: regarding this fall in fitness, is revscoring version change a likely suspect?
[17:04:53] awight: I'm interested in parallelizing model makefile ( https://phabricator.wikimedia.org/T170650 ), is there a cluster accessible to test hadoop integration?
[17:06:04] codezee: oh cool!
[17:06:22] Yes, try the "stat" boxes--lemme get you the latest howto
[17:07:12] codezee: So the machine is in flux, it was stat1002 and will soon be stat1005: https://phabricator.wikimedia.org/T165368
[17:07:42] https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Stats_machines
[17:07:47] ^ seems to already be stat1005
[17:07:53] halfak: You did show me an ORES development timeline, right? If so, where is it again? I'll bookmark it. I'm going to start working on some community engagement visualizations
[17:09:39] awight: I probably can't access them, as I don't have prod access :(
[17:09:55] codezee: These are available to anyone with an NDA on file
[17:10:24] awight: I've not signed an NDA
[17:10:32] yet
[17:11:05] codezee: It's fun and rewarding! https://wikitech.wikimedia.org/wiki/Volunteer_NDA
[17:11:26] Keegan, rough timeline is here: https://etherpad.wikimedia.org/p/abbey_aaron
[17:11:48] can maybe get back to you later with more
[17:12:03] Amir1: lolol thx for the video
[17:12:11] :P
[17:13:35] awight: oh I see, but would need an endorsement from someone I guess...
[17:13:39] halfak: Great, thank you. This looks like plenty to digest for now :)
[17:14:31] https://etherpad.wikimedia.org/p/natalia_aaron
[17:14:50] Great Keegan. Only bits I have are some summary graphics. :D
[17:15:46] codezee: I would support but it looks like we need to get halfak's opinion due to managerliness
[17:16:34] At this point, I'm happy to support.
[17:17:26] especially if you can speed up our model training by 4x ;-)
[17:18:00] ok I'll file a task sometime this week :)
[17:20:52] halfak: there seems to be observations on sentiment in natalia's etherpad...^ is she also working on something similar?
[17:33:43] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Unlabeled goodfaith observations are assumed "false" -- should be "true" - https://phabricator.wikimedia.org/T171491#3466851 (Halfak)
[17:33:54] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Unlabeled goodfaith observations are assumed "false" -- should be "true" - https://phabricator.wikimedia.org/T171491#3466865 (Halfak) a:Natalia
[17:38:30] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3466879 (Halfak)
[17:45:26] Scoring-platform-team-Backlog, ORES: Design a re-review pattern for meta ORES - https://phabricator.wikimedia.org/T171496#3466935 (Halfak)
[17:47:49] Scoring-platform-team-Backlog, ORES, editquality-modeling, artificial-intelligence: Review training set to check strange examples of labels - https://phabricator.wikimedia.org/T171497#3466961 (Halfak)
[17:48:13] Scoring-platform-team, ORES, editquality-modeling, artificial-intelligence: Review training set to check strange examples of labels - https://phabricator.wikimedia.org/T171497#3466977 (Halfak)
[17:48:40] Scoring-platform-team, ORES, editquality-modeling, artificial-intelligence: Review training set to check strange examples of labels - https://phabricator.wikimedia.org/T171497#3466961 (Halfak)
[17:48:41] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3466978 (Halfak)
[17:49:05] Scoring-platform-team, ORES, editquality-modeling, artificial-intelligence: Review training set to check strange examples of labels - https://phabricator.wikimedia.org/T171497#3466961 (Halfak)
[17:49:07] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Unlabeled goodfaith observations are assumed "false" -- should be "true" - https://phabricator.wikimedia.org/T171491#3466981 (Halfak)
[17:53:00] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3467022 (Halfak) ``` cat enwiki.labeled_revisions.20k_2015.json | grep '"damaging...
[17:59:13] o/ im back sorry for not comming on when i came home last night (i was feeling very unwell and just now starting to feel better)
[17:59:39] Zppix: sorry to hear it! Please don't get any of that on our chat window :p
[17:59:56] * awight uncharacteristically gives hands an extra washing
[18:00:08] lol
[18:00:15] awight ah non of that yet... and hopefully never
[18:00:20] none*
[18:00:23] :D
[18:13:58] (PS1) Catrope: Add index on oresc_probability [extensions/ORES] - https://gerrit.wikimedia.org/r/367449
[18:18:52] (CR) Catrope: "Putting this in because I think it will likely be helpful, but it's more of a proposal at this point. I think it might help with some slow" [extensions/ORES] - https://gerrit.wikimedia.org/r/367449 (owner: Catrope)
[18:20:06] Scoring-platform-team, ORES: Late-July 2017 ORES deploy - https://phabricator.wikimedia.org/T171505#3467194 (Ladsgroup)
[18:20:51] Scoring-platform-team, ORES: Late-July 2017 ORES deploy - https://phabricator.wikimedia.org/T171505#3467211 (Ladsgroup)
[18:20:54] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Add new data for damaging models of Persian Wikipedia - https://phabricator.wikimedia.org/T170960#3467210 (Ladsgroup)
[18:21:01] Scoring-platform-team, ORES: Late-July 2017 ORES deploy - https://phabricator.wikimedia.org/T171505#3467194 (Ladsgroup)
[18:21:03] Scoring-platform-team, Wikidata, editquality-modeling, User-Ladsgroup, artificial-intelligence: Use 'informals', 'badwords', etc. in Wikidata feature set - https://phabricator.wikimedia.org/T162617#3467212 (Ladsgroup)
[18:59:08] Amir1: halfak: (codezee: fajne:) public service announcement, we’re already up against the wall with disk space on ores-misc. I just moved my work to /srv & you all should consider doing so as well.
[18:59:47] I'm downloading lots of thing by just git cloning the deploy repo :/
[18:59:53] once it's done I move it there
[19:00:38] kk
[19:00:48] I just made /srv/~ for everyone as a courtesy
[19:03:27] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Investigate small loss in fitness with the new data in fawiki - https://phabricator.wikimedia.org/T171386#3467422 (Ladsgroup) FYI, for old data and new revscoring version, this is result of damaging: ``` ScikitL...
[19:16:39] halfak: heads-up, the docs meeting is canceled due to being redundant with last week’s chat
[19:18:15] awight, thanks
[19:18:34] Scoring-platform-team-Backlog, ORES, Reading-Admin: Announce presence of "oresscores" in api.php - https://phabricator.wikimedia.org/T153688#3467469 (dr0ptp4kt) @Tgr I spoke with @Halfak and we were wondering, would you feel comfortable sending an email to mediawiki-api (and other appropriate mailing...
[19:19:08] Scoring-platform-team-Backlog, ORES, Reading-Admin: Announce presence of "oresscores" in api.php - https://phabricator.wikimedia.org/T153688#3467472 (dr0ptp4kt) ^ That, instead of doing a full blown blog post.
[19:19:17] ooh also ^ scrum-of-scrums material
[19:20:10] noted.
[19:24:50] awight i agree halfak ^
[20:14:37] Getting a handful of strange test errors in revscoring on ores-misc…. they won’t block my work though.
[20:14:49] enchant.errors.DictNotFoundError: Dictionary for language 'ta' could not be found
[20:14:55] same error for bn
[20:15:05] however, aspell-ta and aspell-bn are installed
[20:15:32] Then, a fun failure for Ukrainian:
[20:15:33] nose.proxy.AssertionError: [Token('потовщена', type='word'), Token('ущільнена', type='word'), Token('і', type='word'), Token('worn
[20:15:33] gly', type='word')] != ['потовщена', 'ущільнена', 'і']
[20:15:47] awight, upgrade revscoring to 1.3.18
[20:15:55] I’m in -master
[20:16:09] Oh! You could rebase on 1.3.x
[20:16:20] harr, I was going to fix the errors I found
[20:16:29] :) rebase would be better
[20:16:35] but yeah on second thought, lemme just unblock and finish the Finnish
[20:16:35] Will make eventually easier.
[20:16:42] Those problems were fixed in 1.3.x
[20:20:20] Amir1, pr-auc is way better with new fawiki data :)))
[20:20:22] \o/
[20:20:55] I wasn't sure if it was good or bad
[20:20:56] so yay
[20:21:00] (I still get the Ukrainian fail with 1.3.x but that’s fine for now)
[20:21:22] halfak: Just to make sure I understand it correctly. In this code https://github.com/wiki-ai/editquality/blob/master/editquality/utilities/fetch_labels.py#L85 "automated" means edits made by trusted editors and hence labeled automatically as not damaging/goodfaith?
[20:27:30] BTW, here’s a recent paper that will be at OpenSym, improves the wp10 model using Deep Learning: https://hal.inria.fr/hal-01559693/document (points out that it’s currently not production ready as it takes ~4s to get a prediction)
[20:28:50] halfak: Sorry, not sure what happened with my irc, so just repeating the question. In this code https://github.com/wiki-ai/editquality/blob/master/editquality/utilities/fetch_labels.py#L85 "automated" means edits made by trusted editors and hence labeled automatically as not damaging/goodfaith?
[20:35:24] Random side thing, I’m going to wire up https://scrutinizer-ci.com/ to our repos
[20:46:11] https://scrutinizer-ci.com/g/wiki-ai/ores/
[20:47:29] https://scrutinizer-ci.com/g/wiki-ai/revscoring/
[20:47:34] bd808 would be proud.
[20:55:47] halfak: cv_train on revscoring1 doesn’t allow the null folds thing. Would you prefer I try to cherry-pick, or just run train w/o CV?
[20:57:25] cv_train --folds=1 in 2.0 == train_model in 1.3
[20:57:31] awight, ^
[20:57:44] gotcha
[20:57:47] :D! Looks like we scored pretty well for revscoring :))
[20:58:05] Hell yeah!
[20:58:33] It’s still a useful service though, finds DRY stuff, lets you know about potential pain points and neglected functions etc...
[21:01:09] halfak: what would -w "true=50" map to?
[21:01:21] --balance-sample-weight?
[21:03:31] Yes :)
[21:03:34] Nice intuition :D
[21:04:10] lol excellent. It was actually a vague memory of our last conversation in which you probably had me delete that line and add -w
[21:35:20] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3466879 (Natalia) **a note to the design of the new buttons: in addition to rep...
[21:40:01] asda is trying to buy b&m
[21:48:16] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3468174 (Halfak) +1 that sounds like a good idea to me
[23:00:35] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), Patch-For-Review: Summarize what it will take to separate product and platform for ORES Extension - https://phabricator.wikimedia.org/T167908#3468394 (Catrope)
[23:08:34] gerrit's about to rewrite the change url in 2.15
[23:08:34] https://gerrit-review.googlesource.com/#/c/108592/
[23:08:35] heh
[23:12:19] * halfak runs away from the computer at full tilt
[23:12:22] have a good evening folks
[23:12:23] o/
[23:18:29] \o