[09:48:39] <wikibugs>	 (PS2) Ladsgroup: Use minified responses [extensions/ORES] - https://gerrit.wikimedia.org/r/334695
[11:22:24] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, Wikilabels, rsaas-editquality: Deploy edit quality campaign for Romanian Wikipedia - https://phabricator.wikimedia.org/T156357#2979187 (Andrei_Stroe) Apparently, the labels I translated in https://meta.wikimedia.org/wiki/Wiki_labels/Interface_translation/Edit_qu...
[13:28:34] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, AbuseFilter, ORES, Community-Wishlist-Survey-2015: Suggesting AbuseFilter by machine learning - https://phabricator.wikimedia.org/T120741#2979326 (BethNaught)
[16:50:54] <wikibugs>	 (CR) Legoktm: "Has the ORES change been deployed yet? (Does it matter?)" [extensions/ORES] - https://gerrit.wikimedia.org/r/334695 (owner: Ladsgroup)
[16:55:03] <wikibugs>	 (CR) Ladsgroup: "It's not deployed there but it'll be early next week (if nothing happens) and it doesn't matter too. AFAIK it just ignores extra params." [extensions/ORES] - https://gerrit.wikimedia.org/r/334695 (owner: Ladsgroup)
[16:55:06] <halfak>	 o/
[16:56:24] <Amir1>	 halfak: hey there, I'm working on getting the cswiki damaging model up and running, I'm updating my venv atm
[16:56:33] <halfak>	 Great! 
[16:56:38] <halfak>	 Let me know how the stats work out. 
[16:57:02] <halfak>	 I'm worried that the 2.5/2.5k balancing strategy was ... problematic 
[16:57:33] <Amir1>	 sure
[16:57:54] <wikibugs>	 Revision-Scoring-As-A-Service-Backlog, AbuseFilter, ORES, Community-Wishlist-Survey-2015: Suggesting AbuseFilter by machine learning - https://phabricator.wikimedia.org/T120741#2979547 (Huji)
[17:00:44] <Amir1>	 btw. halfak: I made a commit in revscoring and pushed it directly to master. It fixes a broken link in the docs
[17:00:49] <Amir1>	 hope you don't mind
[17:01:20] * halfak checks 
[17:01:47] <halfak>	 Amir1, hmm... I don't think that's right. 
[17:02:09] <halfak>	 Eek!  Looks like enchant.org is down
[17:02:15] <halfak>	 But pyenchant != enchant
[17:02:35] <Amir1>	 hmmm
[17:03:02] <Amir1>	 What do you think we should do?
[17:03:15] <halfak>	 https://en.wikipedia.org/wiki/Enchant_(software)
[17:03:17] <halfak>	 maybe?
[17:03:27] <halfak>	 http://abisource.com/projects/enchant/
[17:03:31] <halfak>	 That could work too
[17:03:45] <halfak>	 Or https://abiword.github.io/enchant/
[17:04:53] <halfak>	 Amir1, do you want to do that?  I could do this right now if you're busy. 
[17:04:57] <Amir1>	 the github link is creepy 
[17:05:04] <halfak>	 lol
[17:05:05] <Amir1>	 let's go with Wikipedia :D
[17:05:10] <halfak>	 Yeah,  Super basic. 
[17:05:12] <halfak>	 +1
[17:05:15] <halfak>	 Wikipedia
[17:05:32] <Amir1>	 I'll do it
[17:05:35] <Amir1>	 halfak: ^
[17:07:15] <halfak>	 cool. 
[17:07:16] <Amir1>	 halfak: and don
[17:07:20] <Amir1>	 *done
[17:07:21] <halfak>	 \o/
[17:09:29] <wikibugs>	 Revision-Scoring-As-A-Service, Wikidata, User-Ladsgroup, WMDE-Tech-Communication-Mentoring-And-Events: Build item_quality form - https://phabricator.wikimedia.org/T155828#2979558 (Halfak) Looks good.  Maybe it's time to move it to the Wiki and to host a discussion about it.  Once there's some buy...
[17:09:43] <mattsun>	 halfak, Amir1 nice to meet you! this is matthew from the GSoC emails :) 
[17:10:25] <halfak>	 o/ mattsun 
[17:10:51] <Amir1>	 mattsun: hey there
[17:10:55] <Amir1>	 nice to meet you too
[17:11:41] <mattsun>	 so I was looking at this task https://phabricator.wikimedia.org/T156494
[17:12:02] <mattsun>	 and was wondering if you had any advice on how to get started (Amir1, halfak)
[17:12:25] <mattsun>	 halfak: the task description says you have some helpful datasets?
[17:12:35] <halfak>	 Oh yes.  Let me dig around a bit quick :) 
[17:13:12] <mattsun>	 Great, thanks! 
[17:13:46] <halfak>	 https://github.com/wiki-ai/draftquality/blob/master/datasets/enwiki.draft_quality.201508-201608.tsv.bz2
[17:13:49] <halfak>	 mattsun, ^ 
[17:14:18] <halfak>	 That dataset contains a record for every article creation in the last year (ending on Aug 2016)
[17:14:18] <mattsun>	 cool, downloading right now
[17:14:31] <halfak>	 It also contains a label for spam/vandalism/attack/OK
[17:15:19] <mattsun>	 got it
[17:16:07] <mattsun>	 so my goal is probably to run the draftquality model on those articles and see how well it matches the actual labels, is that correct?
[17:16:20] <halfak>	 Right.
[17:16:26] <halfak>	 Oh wait... I have a better dataset for this.  
[17:16:33] <halfak>	 Regretfully, we trained the model on that data. 
[17:16:38] <halfak>	 I'll get some fresh data. 
[17:17:08] <mattsun>	 Got it, thanks!
[17:18:18] <halfak>	 Hmm... OK this is going to be a bit of a pain.  One minute. 
[17:18:46] <halfak>	 Regretfully, my process isn't finished.  But I can get some intermediary data so you can start experimenting.
[17:18:47] <mattsun>	 No worries :)
[17:18:57] <mattsun>	 That sounds good
[17:20:23] <halfak>	 OK I just sent an email that should arrive shortly.  
[17:20:38] <mattsun>	 Great, I'll be on the lookout
[17:20:42] <halfak>	 It contains a sample from the month of Aug 2016 -- the month immediately after our training sample. 
[17:20:52] <halfak>	 Which should be good for testing. 
[17:21:52] <halfak>	 61k OK, 1.5k spam, 426 vandalism, 175 attack. 
[17:22:26] <mattsun>	 Great, just received it!
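
With classes this imbalanced (roughly 61k OK against 175 attack), overall accuracy says little about how well the rare classes are detected, so a per-class breakdown is one reasonable way to compare predictions with the labels. The sketch below is illustrative only: the function name and the toy lists are invented here, and it assumes predictions are available as one label per observation, which the log does not confirm.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label seen in either list.

    A value is None when its denominator would be zero.
    """
    actual_count = Counter(true_labels)
    predicted_count = Counter(predicted_labels)
    true_pos = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_pos[truth] += 1

    stats = {}
    for label in sorted(set(actual_count) | set(predicted_count)):
        precision = (
            true_pos[label] / predicted_count[label]
            if predicted_count[label] else None
        )
        recall = (
            true_pos[label] / actual_count[label]
            if actual_count[label] else None
        )
        stats[label] = (precision, recall)
    return stats


# Toy data, invented for illustration; not taken from the real dataset.
truth = ["OK", "OK", "OK", "spam", "spam", "attack"]
guess = ["OK", "OK", "spam", "spam", "OK", "attack"]
stats = per_class_precision_recall(truth, guess)
for label, (precision, recall) in stats.items():
    print(label, precision, recall)
```
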
[17:23:32] <mattsun>	 I usually use R for this kind of data analysis stuff - is that something you recommend?
[17:24:48] <mattsun>	 And should I run the draftquality model by making api requests? 
[17:25:59] <halfak>	 R is fine.  I'd recommend python for data gathering from an API, but R will work. 
[17:26:11] <halfak>	 I do recommend you use the API. 
[17:26:30] <halfak>	 Oh... wait.  Damn it. 
[17:26:50] <halfak>	 You can't gather predictions for this because you can't access the text of deleted pages through ORES. 
[17:27:00] <halfak>	 ORES has no privileged access to data. 
[17:27:03] <halfak>	 hmm. 
[17:27:21] <halfak>	 I might need to generate the scores for all of the deleted pages in that set. 
[17:28:06] * halfak thinks. 
[17:28:38] <mattsun>	 Oh, I see. That makes sense
[17:28:40] <halfak>	 OK.  I think I need to get you scores. 
[17:28:51] <halfak>	 I'm going to look into doing that. 
[17:29:12] <mattsun>	 OK, thanks so much!
[17:30:34] <mattsun>	 By "scores" do you mean the scores that the non-draftquality ORES system would give the deleted pages? (not sure if that question even makes sense)
[17:31:06] <halfak>	 hmm... indeed. it seems I'm confused. 
[17:31:22] <halfak>	 ORES can't score deleted things. 
[17:32:03] <halfak>	 Many pages in that dataset are deleted. 
[17:32:07] <mattsun>	 Oh
[17:32:12] <halfak>	 Basically, all of the not "OK" examples
[17:32:17] <Amir1>	 halfak: quick question. for 5k samples, do we need to add other revs from the 20k or just train based on the 5k?
[17:32:57] <halfak>	 Amir1, for the balanced dataset, the theory was to train on the 5k after sampling with replacement. 
[17:33:12] <halfak>	 But I'm pretty skeptical of that, honestly. 
[17:33:17] <halfak>	 :(  
[17:33:27] <halfak>	 Past halfak may have been a dummy. 
[17:33:33] <Amir1>	 let's go with the 5k and see how it turns out
[17:33:44] <halfak>	 OK
[17:33:55] <Amir1>	 I don't think it will be horrible. We gather all signals from the 5k too
[17:34:00] <Amir1>	 (IMO)
[17:34:16] <halfak>	 Yeah.  Our test stats are going to be weird though :/
[17:34:51] <halfak>	 mattsun, OK so my plan is to get you predictions for those observations. 
[17:34:58] <halfak>	 I'm going to have to hack together a script to do that. 
[17:35:03] <halfak>	 I think it'll be pretty easy. 
[17:35:21] <mattsun>	 OK, gotcha
[17:35:23] <halfak>	 but it's going to take a bit. 
[17:35:43] <mattsun>	 OK, cool
[17:36:17] <mattsun>	 So even though some pages are deleted, you're going to make predictions for them
[17:36:22] <halfak>	 In the meantime, how about you get that dataset loaded and get us a nice plot that shows us trends over time in that dataset. 
[17:36:27] <halfak>	 Roght
[17:36:29] <halfak>	 *Right
[17:36:36] <mattsun>	 Ok, got it
[17:36:38] <mattsun>	 Will do!
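
A first step toward a trend plot is a table of daily counts per label. The following is a minimal sketch, assuming the TSV has a `creation_timestamp` column in YYYYMMDDHHMMSS form; the label column name `draft_quality` and the sample rows are assumptions made for illustration, not details from the actual file.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def daily_label_counts(tsv_file,
                       timestamp_col="creation_timestamp",
                       label_col="draft_quality"):  # label_col is hypothetical
    """Count rows per (ISO day, label) pair in a tab-separated file object."""
    counts = Counter()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        day = datetime.strptime(row[timestamp_col], "%Y%m%d%H%M%S").date()
        counts[(day.isoformat(), row[label_col])] += 1
    return counts


# Invented sample rows standing in for the real dataset.
sample = io.StringIO(
    "creation_timestamp\tdraft_quality\n"
    "20160801000408\tOK\n"
    "20160801131500\tspam\n"
    "20160802090000\tOK\n"
    "20160802100000\tOK\n"
)
counts = daily_label_counts(sample)
for (day, label), n in sorted(counts.items()):
    print(day, label, n)
```

The resulting (day, label) counts can be handed to any plotting tool, with one line per label.
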
[17:52:05] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring, rsaas-editquality, User-Ladsgroup, User-Urbanecm: Train and test editquality models for Czech Wikipedia - https://phabricator.wikimedia.org/T156492#2979568 (Ladsgroup) ``` # Model tuning report - Revscoring version: 1.3.5 - Features: editquality.feat...
[17:52:59] <halfak>	 Amir1, that's a not-so-great tuning. 
[17:53:03] <halfak>	 How many true obs?
[17:54:03] <Amir1>	 halfak: It seems it doesn't have it in the tuning reports, I need to build the model
[17:54:29] <halfak>	 cat dataset | grep '"damaging": true' | wc -l
[17:55:35] <Amir1>	 475 true cases
[17:55:53] <halfak>	 That's pretty good. 
[17:56:23] <halfak>	 cat dataset | grep '"damaging": true' | grep '"needs_review": true' | wc -l
[17:57:28] <Amir1>	 (p3)ladsgroup@ores-compute-01:~/editquality$  cat datasets/cswiki.human_labeled_revisions.5k_2016.json | grep '"damaging": true' | grep '"needs_review": true' | wc -l
[17:57:28] <Amir1>	 434
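
The grep pipeline above matches on raw text. An equivalent count that parses each record is sketched below; it assumes the dataset holds one JSON object per line with top-level `damaging` and `needs_review` keys, which the grep patterns suggest but the log does not confirm. The function name and sample lines are made up for illustration.

```python
import json


def count_damaging(lines):
    """Return (damaging, damaging_and_needs_review) counts for JSON lines."""
    damaging = 0
    damaging_and_review = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obs = json.loads(line)
        if obs.get("damaging") is True:
            damaging += 1
            if obs.get("needs_review") is True:
                damaging_and_review += 1
    return damaging, damaging_and_review


# Invented records; the real file has many more fields and rows.
sample_lines = [
    '{"rev_id": 1, "damaging": true, "needs_review": true}',
    '{"rev_id": 2, "damaging": true, "needs_review": false}',
    '{"rev_id": 3, "damaging": false, "needs_review": true}',
    '{"rev_id": 4, "damaging": false, "needs_review": false}',
]
result = count_damaging(sample_lines)
print(result)
```

For a real file, an open file handle can be passed directly, since iterating over it yields lines.
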
[17:59:12] <mattsun>	 halfak, quick question about the dataset you gave me - how is creation_timestamp formatted exactly? what date is "20160801000408" for example?
[17:59:29] <halfak>	 %Y%m%d%H%M%S
[17:59:39] <halfak>	 YYYYMMDDHHMMSS
[17:59:43] <mattsun>	 got it, thank you!
[17:59:50] <halfak>	 :) 
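
As a quick check of the YYYYMMDDHHMMSS layout, the example timestamp from the question can be parsed with Python's standard `datetime.strptime`. The variable names are illustrative only.

```python
from datetime import datetime

raw = "20160801000408"
# Four-digit year, month, day, hour, minute, second, with no separators.
parsed = datetime.strptime(raw, "%Y%m%d%H%M%S")
print(parsed.isoformat())  # 2016-08-01T00:04:08
```

So "20160801000408" is 1 August 2016 at 00:04:08.
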
[18:09:01] <wikibugs>	 (CR) Legoktm: [C: 2] Use minified responses [extensions/ORES] - https://gerrit.wikimedia.org/r/334695 (owner: Ladsgroup)
[18:10:38] <wikibugs>	 (Merged) jenkins-bot: Use minified responses [extensions/ORES] - https://gerrit.wikimedia.org/r/334695 (owner: Ladsgroup)
[18:11:33] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring, rsaas-editquality, User-Ladsgroup, User-Urbanecm: Train and test editquality models for Czech Wikipedia - https://phabricator.wikimedia.org/T156492#2979570 (Ladsgroup) Model for damaging: ```  - type: GradientBoosting  - params: balanced_sample_weigh...
[18:13:31] <halfak>	 script is ready.  Working on running it now. 
[18:20:11] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring, rsaas-editquality, User-Ladsgroup, User-Urbanecm: Train and test editquality models for Czech Wikipedia - https://phabricator.wikimedia.org/T156492#2979579 (Ladsgroup) https://github.com/wiki-ai/editquality/pull/57
[18:22:35] <halfak>	 script running.  Making coffee. 
[18:26:06] <Amir1>	 I got to go, be back soonish
[18:26:24] <halfak>	 o/
[18:27:39] <mattsun>	 halfak, looks like my relatives came early to pick me up for chinese new year celebrations (happy lunar new year to anyone who celebrates it!)
[18:27:56] <mattsun>	 i gotta go but i'm making good progress on the R script! i'll report back when it's done :)
[18:51:48] <halfak>	 Sounds good mattsun.  See you soon :) 
[18:52:52] <halfak>	 Amir1, the threshold statistics won't work with the sample as-is. 
[18:53:00] <halfak>	 We need to re-scale the observations. 
[18:54:05] * halfak looks into the obs for cswiki
[19:00:55] <halfak>	 OK.  So this might be a bit hare-brained. 
[19:01:01] <halfak>	 But here's what I propose we do. 
[19:01:45] <halfak>	 sample 4558 observations from the "needs_review": true labeled subsample (of 2.5k, so it'll be sampling with replacement)
[19:02:31] <halfak>	 and then merge the labels we have for damage/goodfaith into the remaining "needs_review": false subset
[19:02:39] <halfak>	 We'll get ~20k observations.  
[19:03:03] <halfak>	 A very small number of "damaging" edits that were marked as not needing review will be mislabeled. 
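
One possible reading of this proposal is sketched below. It assumes each observation is a dict, that the labeled "needs_review": true subsample is resampled with replacement up to the target size, and that unreviewed observations keep any labels they already have and otherwise default to damaging=false, goodfaith=true. Those defaults, the function name, and the toy records are assumptions for illustration; they are not taken from the actual PR.

```python
import random


def build_rescaled_sample(reviewed_labeled, not_reviewed, n_resample, seed=0):
    """Resample labeled observations with replacement, then append the
    unreviewed observations with assumed default labels."""
    rng = random.Random(seed)
    # random.choices draws with replacement, so n_resample may exceed
    # the size of the labeled subsample.
    resampled = rng.choices(reviewed_labeled, k=n_resample)
    # Existing labels win; the defaults only fill in what is missing.
    # Truly damaging edits that nobody flagged get mislabeled here.
    assumed = [
        {"damaging": False, "goodfaith": True, **obs}
        for obs in not_reviewed
    ]
    return resampled + assumed


# Toy records, invented for illustration.
reviewed = [
    {"rev_id": 1, "damaging": True, "goodfaith": False},
    {"rev_id": 2, "damaging": False, "goodfaith": True},
]
unreviewed = [
    {"rev_id": 3},
    {"rev_id": 4, "damaging": True, "goodfaith": True},
]
merged = build_rescaled_sample(reviewed, unreviewed, n_resample=5)
print(len(merged))
```

In the scenario discussed, `n_resample` would be 4558 and the labeled subsample would hold about 2.5k observations.
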
[19:08:13] <halfak>	 Damn.  This is messy.  I wish I could go back in time and have the cswiki folk just label all of the "needs_review": true observations. 
[19:08:24] <halfak>	 I'm going to iterate on this PR. 
[19:24:49] <halfak>	 OK.  Just completed the work.  Am rebuilding the model now. 
[19:24:58] <halfak>	 Will have a followup commit soon
[22:20:12] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring, rsaas-editquality, User-Ladsgroup, User-Urbanecm: Train and test editquality models for Czech Wikipedia - https://phabricator.wikimedia.org/T156492#2979829 (Halfak) ``` ScikitLearnClassifier  - type: GradientBoosting  - params: warm_start=false, min_...
[22:24:55] <halfak>	 Arg.  Forgot to build the goodfaith model.  Fixing that now. 
[22:29:50] <wikibugs>	 Revision-Scoring-As-A-Service, revscoring, rsaas-editquality, User-Ladsgroup, User-Urbanecm: Train and test editquality models for Czech Wikipedia - https://phabricator.wikimedia.org/T156492#2979844 (Halfak) ``` ScikitLearnClassifier  - type: GradientBoosting  - params: center=true, scale=tru...
[22:33:30] <Amir1>	 halfak: should I merge it or you do it?
[22:33:41] <halfak>	 Merge if you like the changes :) 
[22:33:57] <halfak>	 Just about to run away.  
[22:34:00] <halfak>	 Have a good one! 
[22:34:01] <halfak>	 o/
[22:34:01] <Amir1>	 with pleasure 
[22:34:10] <Amir1>	 you too