[08:02:01] Revision-Scoring-As-A-Service, MediaWiki-extensions-Translate, translatewiki.net: qqq for a wiki-ai message cannot be loaded - https://phabricator.wikimedia.org/T132197#3167574 (Nikerabbit) This was reported again elsewhere. The issue is that the key has two spaces, and MediaWiki does not support tha...
[14:16:31] glorian_wd, I'm looking at your query and I don't understand why you have cut down the number of observations for 262144 and inf to 500.
[14:16:31] https://quarry.wmflabs.org/query/17904
[14:19:14] halfak: Actually I cut down the number of the first and second highest strata. The reason was, I think most of the items in those strata will be bio related items or chess players. Generally, those items don't have a high quality grade. I'd say most of them are graded "C" at most.
[14:19:14] On the other hand, the low q-id items have higher odds of obtaining a high quality grade. Therefore, I increased the number of those low q-id items in order to increase our odds of getting more items with class "A" and "B"
[14:20:45] glorian_wd, but why 500
[14:21:06] If you eventually want 500 of them, then you should query for 750 so that filtering will not remove all that you want.
[14:21:51] halfak: nope. I only wanted 250 from both those strata. Look at: https://github.com/wiki-ai/wikiclass/pull/32/files
[14:22:04] OK
[14:22:25] In the PR, specifically in the "shuf" command, I specified the number of items which I think is appropriate
[14:22:58] halfak: but that's just my proposal, which is based purely on my estimation. So, obviously, you can tell me if it's not correct :P
[14:23:50] Revision-Scoring-As-A-Service, MediaWiki-extensions-Translate, translatewiki.net: qqq for a wiki-ai message cannot be loaded - https://phabricator.wikimedia.org/T132197#3168414 (Halfak) @Nikerabbit, shouldn't any valid JSON object key be valid? It seems like you're proposing a work-around, but not a...
[14:24:39] It does look strange, but really, so long as we get as much representation of the assessment classes as we can, I think it'll be fine.
[14:25:53] halfak: hmm do you think the strata "262144" and "inf" will yield items which fall in class A and B?
[14:27:55] Because based on the pilot campaign result, I'd say those strata do not give us many class A and B items.
[14:28:11] This means we won't get items which cover all quality grades
[14:28:16] so long as we get as much representation of the assessment classes as we can, I think it'll be fine.
[14:29:35] Revision-Scoring-As-A-Service-Backlog, Community-Liaisons, ORES, User-Johan: Support ORES - https://phabricator.wikimedia.org/T141357#3168440 (Elitre) Johan, looks like either Keegan is working on something similar, or another CL will, at some point? I don't think you volunteered to be the CL for...
[14:29:36] Yeah. If I correctly understand what you are saying, we will be fine if we have items which represent all assessment classes, right?
[14:30:13] halfak: OK, what do you think should be changed?
[14:38:35] * halfak starts building the datafiles
[15:04:19] Revision-Scoring-As-A-Service, MediaWiki-extensions-Translate, translatewiki.net: qqq for a wiki-ai message cannot be loaded - https://phabricator.wikimedia.org/T132197#3168566 (Nikerabbit) There is no fix, only work-arounds. MediaWiki limits what page titles are valid. I believe that it is far easie...
[15:27:22] Revision-Scoring-As-A-Service, Analytics, ChangeProp, EventBus, and 3 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3168638 (mobrovac) I agree that the minimum that should be done here is to switch to POST-style requests. Would you be availabl...
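A minimal sketch of the sampling step discussed between 14:16 and 14:30: query more candidates per stratum than are needed, filter them, then shuffle and trim to the wanted count. The stratum names "262144" and "inf" and the 500/250 figures come from the conversation. The function name, the Q-ids, and the filter are made up for illustration, and this is not the code from the wikiclass pull request.

```python
import random


def sample_strata(candidates, quotas, keep, seed=0):
    """Filter each stratum's over-queried candidates, then shuffle and trim.

    candidates: dict mapping stratum name -> list of queried item ids
    quotas:     dict mapping stratum name -> number of items wanted
    keep:       predicate standing in for whatever filtering happens later
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = {}
    for stratum, items in candidates.items():
        usable = [item for item in items if keep(item)]
        rng.shuffle(usable)  # plays the role of `shuf` in a shell pipeline
        sample[stratum] = usable[:quotas[stratum]]
    return sample


# Hypothetical data: 500 queried Q-ids per stratum, 250 wanted from each.
candidates = {
    "262144": ["Q%d" % n for n in range(131072, 131572)],
    "inf": ["Q%d" % n for n in range(262144, 262644)],
}
quotas = {"262144": 250, "inf": 250}


def keep(item_id):
    # Stand-in filter that drops every fourth candidate.
    return int(item_id[1:]) % 4 != 0


sample = sample_strata(candidates, quotas, keep)
```

Querying 500 to keep 250 leaves headroom for the filter, which is the same concern as the "query for 750 if you want 500" remark at 14:21:06.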
[15:34:04] Revision-Scoring-As-A-Service, Analytics, ChangeProp, EventBus, and 3 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3168675 (Ladsgroup) Yeah, Just please make a phab card (a subtask of this) and assign it to me. I get it done ASAP.
[15:44:26] Amir1, seems like POST would be an abuse of the request type, but I'm OK with it.
[15:44:39] I'm just surprised that the Services folks are OK with abusing POST
[15:45:55] halfak: hmm, I'm not sure either
[15:46:00] Do you want to talk to Gabriel?
[15:46:54] Na. I'm OK with it.
[15:47:00] * halfak is not a purist
[15:48:13] Amir1, https://github.com/wiki-ai/ores/blob/master/ores/wsgi/util.py#L15
[15:48:42] you can set the type to "json.loads"
[15:48:48] And it'll all get handled nicely.
[15:49:31] yeah
[15:51:05] Revision-Scoring-As-A-Service, MediaWiki-extensions-Translate, translatewiki.net: qqq for a wiki-ai message cannot be loaded - https://phabricator.wikimedia.org/T132197#3168696 (Halfak) > MediaWiki limits what page titles are valid. I see. So all message keys must be render-able as a page title in...
[16:02:13] halfak: ping
[16:28:40] Revision-Scoring-As-A-Service-Backlog, Wikidata, rsaas-editquality: Use 'informals', 'badwords', etc. in Wikidata feature set - https://phabricator.wikimedia.org/T162617#3168776 (Halfak)
[16:29:23] Revision-Scoring-As-A-Service-Backlog, Wikidata, rsaas-editquality: Use 'informals', 'badwords', etc. in Wikidata feature set - https://phabricator.wikimedia.org/T162617#3168776 (Halfak) See https://github.com/wiki-ai/bwds/blob/master/dump_based_detection.py#L71 for known alphabets
[16:49:11] Revision-Scoring-As-A-Service, MediaWiki-Vagrant, ORES, Wikilabels: ORES services should have vagrant roles - https://phabricator.wikimedia.org/T159105#3168835 (Halfak) @Ladsgroup said he would take a look.
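A rough sketch of the 15:48 suggestion to "set the type to json.loads": a parameter reader that takes a conversion callable can decode a JSON string from a POST body in one step. The helper linked in ores/wsgi/util.py is not reproduced here, so the name `read_param`, its signature, and the payload below are assumptions for illustration only.

```python
import json


def read_param(params, name, default=None, type=str):
    """Hypothetical helper: fetch a raw request parameter and convert it.

    `type` is any callable that turns the raw string into a value. The
    argument name mirrors the wording in the chat and shadows the builtin
    inside this function only.
    """
    if name not in params:
        return default
    raw = params[name]
    try:
        return type(raw)
    except (ValueError, TypeError) as error:
        # json.JSONDecodeError is a subclass of ValueError.
        raise ValueError("Could not parse %r: %s" % (name, error))


# Made-up raw value, as it might arrive in a POST form body.
params = {"event": '{"rev_id": 12345, "database": "enwiki"}'}
event = read_param(params, "event", type=json.loads)
```

With this shape, the endpoint code would receive a dict rather than a string, and malformed JSON surfaces as a single parse error at the point where the parameter is read.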
[16:50:58] Revision-Scoring-As-A-Service, Wikilabels, rsaas-editquality, User-Ladsgroup: Start v2 editquality campaign for trwiki - https://phabricator.wikimedia.org/T161977#3168840 (Halfak) a:Ladsgroup
[16:51:56] Revision-Scoring-As-A-Service, Wikilabels: Manage wikilabels for labsdb1004 maintenance - https://phabricator.wikimedia.org/T162265#3168843 (Halfak) a:Halfak
[17:22:06] Revision-Scoring-As-A-Service-Backlog, Bad-Words-Detection-System, revscoring, Bengali-Sites: Add language support for Bengali - https://phabricator.wikimedia.org/T162620#3168926 (Aftabuzzaman)
[17:50:41] halfak: are you still building the datafiles?
[17:56:43] glorian_wd, yup
[17:56:54] halfak: it hasn't finished yet?
[17:57:40] glorian_wd, I have other things to do with my mornings :P
[17:58:01] halfak: oh okay. A while ago, I thought probably it's a good idea if you teach me to do it :)
[17:58:12] how to do it*
[18:06:09] I'm dealing with new types of weirdness in wikibase format.
[18:20:12] http://labels.wmflabs.org/campaigns/wikidatawiki/52/?campaign=stats
[18:20:15] glorian_wd, ^ done
[18:20:41] halfak: Thanks!
[18:21:10] Revision-Scoring-As-A-Service, Wikidata, Wikilabels: Deploy Wikidata item quality campaign - https://phabricator.wikimedia.org/T157493#3007227 (Halfak) @Glorian_WD updated the sampling strategy so I re-deployed. See http://labels.wmflabs.org/campaigns/wikidatawiki/52/?campaign=stats ``` halfak@wik...
[18:21:21] Revision-Scoring-As-A-Service, Wikidata, Wikilabels: Complete Wikidata item quality campaign - https://phabricator.wikimedia.org/T157495#3169174 (Halfak)
[18:21:30] Revision-Scoring-As-A-Service, Wikidata, Wikilabels: Complete Wikidata item quality campaign - https://phabricator.wikimedia.org/T157495#3007265 (Halfak) Updated in http://labels.wmflabs.org/campaigns/wikidatawiki/52/?campaign=stats
[18:24:33] o/ Amir1
[18:24:45] halfak: hey, the deployment just finished
[18:24:59] I'm checking if Amir is around
[18:25:00] I saw that. What exactly was deployed this time?
[18:25:21] Oh yeah. I can wait to talk to you about other stuff ^_^
[18:26:31] the review tool is enabled as a beta feature
[18:28:24] halfak: one thing, it seems the recall is good for soft precision too for hewiki
[18:28:52] we might need to change it to lower sensitivity (like wikidata)
[18:28:59] but I need to check the test stats to be sure
[18:29:49] Gotcha. Let me know what you learn.
[18:29:58] I thought that we'd already enabled it for hewiki
[18:31:37] nah, I made the patch but it didn't get to SWAT (the patches before took too much time) then travel
[18:35:07] halfak: It seems to me that at min_recall = 0.9 the filter rate is "filter_rate": 0.843, which is really high compared to enwiki (0.736) and a little bit lower than wikidata (0.887)
[18:35:15] I'm an undecided voter right now
[18:38:16] halfak: what do you think? I might have some time to deploy it in this window
[18:38:29] If we can decide fast :P
[18:40:42] Oh, you're in backlog grooming
[18:43:57] * halfak reads scroll
[18:45:35] Not sure what you'd like to change. Essentially, it seems we're aiming to make one of the thresholds more strict, right?
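A toy sketch of how a min_recall setting can map to a score threshold and a filter rate, to make the 18:35 numbers easier to read. It assumes that recall is measured on the flagged (positive) class and that the filter rate is the share of edits whose score falls below the threshold. The function name and the scores are invented, and this is not the ORES implementation.

```python
def filter_rate_at_recall(scored, min_recall):
    """Find the strictest threshold that still reaches `min_recall`.

    scored: list of (score, is_positive) pairs from a labeled test set.
    Returns (threshold, recall, filter_rate), or None if no threshold works.
    """
    total = len(scored)
    positives = sum(1 for _, label in scored if label)
    best = None
    # Thresholds are tried in ascending order, so the last one that still
    # satisfies min_recall is the highest (strictest) such threshold.
    for threshold in sorted({score for score, _ in scored}):
        flagged = [(s, label) for s, label in scored if s >= threshold]
        recall = sum(1 for _, label in flagged if label) / positives
        if recall >= min_recall:
            filter_rate = 1 - len(flagged) / total
            best = (threshold, recall, filter_rate)
    return best


# Made-up scores: 4 positive edits among 10.
scored = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.40, True), (0.30, False), (0.20, False), (0.10, False), (0.05, False),
]
default_stats = filter_rate_at_recall(scored, min_recall=0.75)
sensitive_stats = filter_rate_at_recall(scored, min_recall=0.9)
```

On this toy data, asking for higher recall pushes the threshold down and lowers the filter rate, so a wiki that keeps a high filter rate at min_recall = 0.9 can afford the more sensitive setting.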
[18:47:04] * halfak cut his finger and is typing really slow
[18:47:40] let's use the recall_at_precision 99 as the strictest
[18:48:40] no, I mean when we want to have the default
[18:49:11] the default is min_recall = 0.75 to have acceptable precision (except in Wikidata, where 0.9 recall has enough precision too)
[18:49:27] I was thinking we can do the same with the Hebrew
[18:49:45] but I checked some edits by hand and, using Google Translate, I think we are fine
[18:49:59] halfak: ^
[18:51:05] ok. We can always move the thresholds based on feedback if we get this wrong
[18:51:14] now I'm available for more talks on other things
[18:51:20] yeah
[18:54:41] Revision-Scoring-As-A-Service, Analytics, ChangeProp, EventBus, and 3 others: Switch `/precache` to be a POST end point - https://phabricator.wikimedia.org/T162627#3169314 (mobrovac)
[18:56:35] halfak: ^ (If you're busy I can talk at another time)
[19:09:55] sorry, am dealing with bloody finger
[19:09:58] Amir1, ^
[19:10:30] really, the only thing I wanted to talk about is the way we load editquality campaigns
[19:10:45] oh shit
[19:10:47] 'cause you were looking at doing the tr one
[19:10:51] no worries, deal with it
[19:11:21] we have been loading 2.5k/2.5k needs_review/not
[19:12:01] I think we should go back to 100% loading the needs_review. The new autolabeler is better at flagging sketchy edits from "trusted editors"
[19:12:25] hmm, okay
[19:12:29] got it
[19:12:36] I will do it tomorrow
[19:12:56] btw. I have a comment on the phab card for tensorflow assessment
[19:13:14] linky?
[19:13:21] sure
[19:13:48] https://phabricator.wikimedia.org/T161376
[19:18:31] That ROC-AUC is pretty bad. What do you think is happening?
[19:20:35] halfak: do we test on an unbalanced sample or a balanced one
[19:20:41] this is unbalanced
[19:26:11] halfak: with a balanced one, it gives out 0.90 AUC
[19:26:33] Amir1, is the test set balanced then?
[19:27:06] in the 0.90 AUC yes, but in the ~0.7 no
[19:27:53] Interesting.
This shouldn't happen
[19:28:10] I'm guessing we need hyperparameter tuning in the network architecture
[19:28:24] once that's done we might boost that
[19:28:35] Indeed.
[19:28:55] Right now, we get .92 ROC-AUC on enwiki damaging
[19:28:58] someone showed me there are genetic algorithms to find the best architecture
[19:29:04] that is indeed fun and fancy
[19:29:22] Fancy is the enemy of software engineering
[19:29:26] ;P
[19:29:29] :D
[19:30:59] Scientifically speaking I would sacrifice a spaghetti to FSM to play with this
[19:31:23] I never did genetic algorithms but I'd love to
[19:31:30] :P
[19:34:35] I'm not against the work. Can you help put the value argument in context?
[19:34:55] E.g. if we had a genetic tuner, we'd get
[19:40:22] Right now, I don't think we're suffering from our grid search, but it might be that there's a way bigger parameter space for NN models
[19:46:45] halfak: exactly
[19:46:58] there is a huge parameter space for NNs
[19:49:16] OK that's really all I need to know. Does TF have something to help us with that?
[23:23:15] OK that's enough for the evening.
[23:23:52] Amir1, I'm probably going to self-merge that big ol' pull request once awight has a chance to look at it.
[23:24:17] Or awight could merge if he's feeling bold and that it all came together OK.
[23:28:00] hi!
[23:28:09] I liked it the last I looked
[23:28:12] lessee...
[23:35:00] Not much change. Fixed the plumbing. Tests work.
[23:35:09] Responded to a couple of your notes.
[23:35:28] ojala that you added patches rather than the heinous gerrit --amend workflow
[23:37:04] Yeah. I've been considering going back and re-writing the commits to be cleaner, but that's probably not worth the time it'd take.
[23:37:18] plus the reviewer strain
[23:41:14] Quite the internal leap forward!
[23:41:25] \o/
[23:41:38] * halfak goes to cut a version
[23:43:04] And it's in pypi
[23:43:11] Thanks for your help, awight :)
[23:43:22] * halfak goes to play with doggo
[23:43:23] o/
[23:43:26] Ah thank you for letting me press the green button