[00:04:59] wiki-ai/revscoring#1133 (data_utils - 9677b27 : Adam Roses Wight): The build was broken. https://travis-ci.org/wiki-ai/revscoring/builds/259999670 [01:54:17] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3491795 (Liuxinyu970226) [01:54:21] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3488931 (Liuxinyu970226) [07:27:58] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3492096 (awight) I have some more results--allow me to muddle through an attempt at interpreting them. * This model catches more o... [09:12:18] Scoring-platform-team-Backlog, Research Ideas, Research-Backlog, Wikimedia-Hackathon-2017, and 2 others: General image classifier for commons - https://phabricator.wikimedia.org/T155538#3492314 (Strainu) >>! In T155538#3490759, @Basvb wrote: > Hi Strainu, whose work are you referring to? The go... [09:58:29] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3492365 (Nikerabbit) Two issues I can see: # Mysterious root us...
[10:32:54] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3492458 (Ladsgroup) Thanks for finding this commit, I will fix it. [10:33:59] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3492459 (Ladsgroup) >>! In T172180#3492365, @Nikerabbit wrote:... [10:40:38] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3492477 (Ladsgroup) https://github.com/wiki-ai/wikilabels-wmfla... [10:43:34] Scoring-platform-team, Wikilabels, translatewiki.net, I18n: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3492481 (Nikerabbit) >>! In T172180#3492459, @Ladsgroup wrote:... [15:32:20] I am doing an icinga2 update which includes changes to the configuration, so sorry for the noise if there is any :) [15:34:16] Thanks for the note paladox [15:34:22] you're welcome :) [15:51:54] well notification_logtosyslog is a new config [15:51:58] * paladox wonders what it does [15:54:47] wow huge increase in the shell scripts for notifications [16:00:45] o/ [16:11:55] halfak what do you think about https://github.com/wiki-ai/wikilabels-wmflabs-deploy/pull/40 is it the right move? [16:14:01] Zppix, seems localization should be fixed in mediawiki.org [16:14:39] halfak regardless it's kinda weird to use mediawiki.org for oauth either way no? 
[16:14:59] halfak and since when cause when myself and adam tested it wouldn't even change it [16:15:14] what? [16:15:21] oauth dialog box [16:15:21] And I'm not sure that mediawiki.org is weird. [16:15:34] "since when cause" [16:15:40] "tested it" [16:15:43] since when was it fixed [16:15:51] What was fixed? [16:15:51] it = oauth dialog box with uselang [16:16:00] [11:14]<+halfak> Zppix, seems localization should be fixed in mediawiki.org [16:16:17] Oh. I'm not saying it was fixed. I'm saying it *should* be fixed. [16:16:28] well it's not unless we're doing it wrong xD [16:17:28] Right. Someone should fix that or identify what wiki we *should* use for oauth. [16:17:49] I don't believe there's a standard wiki to OAuth against but I do know that it changed from meta to mediawiki at some point. [16:18:28] well i spoke with bryan davis a member of the oauth ext project on phab and he said the oauth dialog should work as we expect (localisation) on meta [16:19:00] so in theory making T166472 resolved [16:19:01] T166472: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472 [16:20:27] Zppix, OK I appreciate that then. [16:20:39] Can you record this somewhere or ask bd808 to record it? [16:20:43] Like on the task [16:22:22] uhh let me go grab my logs [16:27:54] o/ codezee [16:28:14] I'm just finishing up re-training that draftquality model so we can finally resolve that task. [16:32:51] Scoring-platform-team, editquality-modeling, User-Ladsgroup, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3493861 (Halfak) Interesting and surprising. Could it be that our labels for `"damaging": false` are recording something differen... [16:34:36] nice... 
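The localization question above comes down to which wiki renders the OAuth dialog, since the dialog is just a special page served by the target wiki's index.php (optionally with a uselang override, as discussed). A minimal sketch of how the authorize URL is assembled; the helper name and consumer key here are made up for illustration, and a real app would use an OAuth library rather than building URLs by hand:

```python
# Hedged sketch: the wiki that renders (and localizes) the OAuth dialog is
# simply the wiki whose index.php the authorize URL points at, so switching
# from mediawiki.org to meta is a config change, not a code change.
from urllib.parse import urlencode

def authorize_url(wiki_base, request_token_key, uselang=None):
    """Build an OAuth authorize URL for a given wiki (illustrative only)."""
    params = {
        "title": "Special:OAuth/authorize",
        "oauth_token": request_token_key,
        "oauth_consumer_key": "example_consumer_key",  # placeholder value
    }
    if uselang:
        # uselang overrides the dialog's interface language, per the chat.
        params["uselang"] = uselang
    return wiki_base + "/w/index.php?" + urlencode(params)

print(authorize_url("https://meta.wikimedia.org", "abc123", uselang="fi"))
```

Pointing `wiki_base` at meta instead of mediawiki.org is essentially what the wikilabels-wmflabs-deploy pull request above proposes.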
[16:42:30] halfak [22:49:56] meta would be localized to their ui language I think [16:42:37] i can put that on the task [16:43:04] Scoring-platform-team-Backlog, Wikilabels: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472#3493910 (Zppix) [22:49:56] meta would be localized to their ui language I think so i proposed a change to config to make meta the oauth wiki https://github.co... [16:58:11] halfak: from what i make of yesterday's discussion, a more promising way would be to send new drafts to some categories which make sense as long as the content is in goodfaith, (even though quality might be below "Wikipedia standards" initially), right? [16:58:11] rather than deleting them outright... [17:17:21] awight there's an icinga2 update :) [17:17:28] the scripts have a huge update [17:17:45] see https://gerrit.wikimedia.org/r/#/c/369681/ :) [17:17:53] hopefully config is still compatible! [17:18:07] awight nope huge change [17:18:10] ah hehe that’s exactly what your patch addresses [17:18:26] yeh but the mail script is failing locally for some reason [17:18:29] This might be why ops is still using v1 [17:18:34] so i filed https://github.com/Icinga/icinga2/issues/5453 [17:18:46] awight yeh icinga2 is a huge syntax change [17:18:52] will require huge prep work [17:19:04] though it's worth it :) [17:19:32] I’m sure, but yeah maintaining a large amount of config would feel kinda… tail-chasey [17:19:54] yep, but puppetdb generates the host configs [17:20:02] (on icinga on ops) [17:20:15] That empty -6 parameter is weird [17:20:20] in your bug report [17:21:12] oh [17:21:42] -e is the param it’s complaining about, I believe [17:22:00] ah also -u [17:22:12] oh [17:22:13] * awight rubs eyes [17:22:16] thanks [17:25:33] aha [17:25:40] awight :) :) :) :) :) :) [17:25:45] they missed two prams [17:25:47] params [17:25:48] lol [17:28:46] awight shoot i know what i did wrong [17:28:52] i copied the service shell script [17:28:56] 
instead of the host one [17:28:57] lol [17:29:41] hahaha that’s a nice solution to the mystery though [17:30:49] lol :) [17:30:53] thanks for spotting that [17:31:02] i would not have spotted that without you :) [17:34:23] Four eyes good, two eyes bad :) [17:35:15] :) [17:38:18] awight lol https://github.com/Icinga/icinga2/issues/5453 i think he now thinks wikimedia upgraded to icinga2 lol [17:38:21] they did not heh [18:05:48] awight i merged https://gerrit.wikimedia.org/r/#/c/369681/ :) [18:05:55] * paladox has fingers crossed heh [18:14:08] awight damn i missed the syntax error in commands.conf [18:14:10] fixed in https://gerrit.wikimedia.org/r/#/c/369705/ [18:16:34] CUSTOM - Host ores-redis-02 is UP: PING OK - Packet loss = 0%, RTA = 0.61 ms paladox test [18:20:12] * awight is a bit late to the +2 [18:22:57] lol [18:29:01] codezee, sorry to miss your message earlier. I guess I was unclear but that's exactly what I'd proposed. [18:29:17] Only the severely problematic pages would get flagged for quick review/deletion [18:29:25] spam/vandalism/attack [18:29:33] Everything else would get routed somewhere else. [19:25:24] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: New article categorizer - https://phabricator.wikimedia.org/T123327#3494657 (Halfak) [19:35:02] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: New article categorizer - https://phabricator.wikimedia.org/T123327#3494713 (Halfak) Based on the [WikiProject Directory](https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Council/Directory), I think we want to predict the mid-... [19:35:24] I just did a bunch of digging into https://phabricator.wikimedia.org/T123327 so that we can advertise it during the Wikimania Hackathon. 
[19:36:20] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: New article review routing AI - https://phabricator.wikimedia.org/T123327#3494717 (Halfak) [19:38:10] Nettrom, ^ BTW [19:38:19] Thought you might be interested in this. [19:38:49] That looks fun... [19:39:30] * halfak is racing to get it all defined so it actually makes sense. [19:40:20] It should also enable some cool analysis of pageviews [19:48:43] harej, is there a good way to get a machine-readable version of the WikiProject directory? [19:51:33] Not that I know of. [19:53:08] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Build mid-level WikiProject category training set - https://phabricator.wikimedia.org/T172321#3494777 (Halfak) [19:54:01] harej, let's say we had a JSON blob or something like that and checked it into a repository, then we could have a bot that rebuilt https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Council/Directory periodically [19:54:04] Crazy? Sane? [19:56:15] My goal is to replace the council directory altogether [19:56:21] I like this [19:56:29] Tell me more [19:56:58] harej council directory as in government council? (lol) [19:57:06] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Build mid-level WikiProject category training set - https://phabricator.wikimedia.org/T172321#3494798 (Halfak) [20:01:19] halfak: I need to get the new directory up to feature parity with the old one, then with their blessing I just make everything a redirect [20:02:41] Interesting. How would that work? [20:02:47] Do you have an example somewhere? [20:03:09] Well first step I need to get a lot of free time or enough money to pay Earwig to do it [20:03:39] I think the main things are clustering task forces with their main project and then allowing annotations [20:04:35] harej, is this a job for Wikidata? 
[20:05:03] No it's a job for the assessments extension [20:05:22] That community tech built [20:06:10] harej, Oh! I see [20:07:44] halfak: that’s definitely interesting, I’ve been thinking along some similar lines with regards to NPP specifically for ACTRIAL [20:07:49] * Nettrom subscribed [20:08:16] Nettrom, +1 I think it's time to take this task seriously. I've been talking about it for too long. [20:08:24] I'll have some more stuff fleshed out in ~ an hour. [20:08:30] cool [20:09:45] harej, is there any structured data right now? [20:09:50] E.g. categories or something like that? [20:10:01] Or is building this going to be highly manual for now? [20:10:07] The page assessments table? [20:10:18] The WikiProject Directory [20:10:46] Essentially turning WikiProject Mathematics into its tree path (e.g. ["stem", "mathematics", "WikiProject Mathematics"]) [20:11:10] halfak on enwiki there's the wikiproject dir, i think there's a wikiproject cat for each wikiproject [20:11:28] Right. see the things I have been linking to Zppix :P [20:11:35] e.g. https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Council/Directory [20:11:46] halfak I'm only half paying attention, I'm trying to multitask lol [20:12:11] :) [20:13:09] halfak: there are categories but they suck. [20:13:17] Gotcha. Thanks. [20:13:26] It's good to know before I start trying to use them [20:13:54] i mean it's a start, you can usually scan the talk page for templates of wikiprojects as well but that's relying on human reliability [20:13:56] I also had a project to associate WikiProjects with Wikidata items but that never finished [20:14:18] Zppix: you can just use the page assessments table for that [20:14:29] true [20:15:26] i don't know, i can't think of some way to get wikiproject stuff without relying on something or doing it manually [20:15:51] halfak: I’m reading the scikit-learn docs but haven’t found the answer I’m looking for. Is it fair to compare cvtrain test results with manually split train/test results? 
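halfak's tree-path idea above could be captured in a very small machine-readable format: a JSON mapping from WikiProject name to its path in the directory tree. This schema is hypothetical (nothing like it exists yet, which is the point of the task), and the second entry is made up purely for illustration:

```python
# Sketch of a possible machine-readable WikiProject Directory: each project
# maps to its tree path, e.g. ["stem", "mathematics", "WikiProject Mathematics"].
# The schema and the "WikiProject Medicine" entry are illustrative assumptions.
import json

directory_json = """
{
  "WikiProject Mathematics": ["stem", "mathematics", "WikiProject Mathematics"],
  "WikiProject Medicine": ["stem", "medicine", "WikiProject Medicine"]
}
"""

directory = json.loads(directory_json)

def mid_level_category(project):
    """Return the mid-level grouping for a project (the second path element),
    which is what the training-set task T172321 wants to predict."""
    path = directory.get(project)
    return path[1] if path and len(path) >= 2 else None

print(mid_level_category("WikiProject Mathematics"))  # -> mathematics
```

A blob like this, checked into a repository, is also what a bot would need in order to regenerate the on-wiki Directory page periodically, as proposed above.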
[20:15:57] harej, is the page assessments table actually being used now? [20:16:15] awight, only problem is confidence intervals. [20:16:22] Ask community tech? I think it's deployed [20:16:24] Essentially, your manual split is just one fold CV [20:16:38] It seems like cvtrain has the “advantage” of having been trained on the same data we’re testing against. [20:17:44] halfak iirc enwiki has a bot that updates a wp10 assessments table onwiki [20:18:27] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Efficient method for mapping a WikiProject template to the WikiProject Directory - https://phabricator.wikimedia.org/T172325#3494885 (Halfak) [20:19:37] Zppix, yes, I'm familiar with that bot. [20:19:53] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Create machine-readable version of the WikiProject Directory - https://phabricator.wikimedia.org/T172326#3494902 (Halfak) [20:20:03] Zppix, we've been using WikiProject assessments for the article quality models, so you might imagine we have a bit of experience processing the assessments. [20:20:06] The problem is the directory. [20:20:15] It's super useful but not machine readable. [20:20:37] Nettrom, see https://phabricator.wikimedia.org/T123327 now [20:20:48] It has a sequence of sub-tasks that I think represent the basic work. [20:20:50] ah I must be talking about something else [20:21:05] I'll answer bd808's question in cloud for you halfak [20:25:59] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Create machine-readable version of the WikiProject Directory - https://phabricator.wikimedia.org/T172326#3494953 (JMinor) Have you checked out the [[ https://www.mediawiki.org/wiki/Extension:PageAssessments | PageAssessments ]]... 
[20:28:39] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: Create machine-readable version of the WikiProject Directory - https://phabricator.wikimedia.org/T172326#3494960 (Halfak) See the WikiProject Directory link in the task description. This will certainly make identifying which Wik... [20:34:09] Scoring-platform-team, Wikilabels: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472#3494970 (Zppix) [20:34:36] Scoring-platform-team, Wikilabels, User-Zppix: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472#3494974 (Zppix) a: Zppix [20:34:41] Scoring-platform-team, Wikilabels, translatewiki.net, I18n, User-Ladsgroup: Wiki-ai-wikilabels-form-dagf-damaging-label and Wiki-ai-wikilabels-form-dagf-goodfaith-label appear as empty in translatewiki.net - https://phabricator.wikimedia.org/T172180#3494976 (Halfak) a: Ladsgroup [20:57:05] halfak: not sure you saw my question above? [20:57:17] > Is it fair to compare cvtrain test results with manually split train/test results? [20:57:38] > It seems like cvtrain has the “advantage” of having been trained on the same data we’re testing against. [20:58:35] awight, ahh yes, but not for any of the statistics reporting [20:58:56] not unfair, you say? [20:59:06] Essentially, the cross-validation step creates N slices of the data, trains on N-1 and tests on the leftovers. [20:59:09] Does that N times. [20:59:23] So yeah, not unfair. [20:59:38] In this case, we're not including any of this new data in the test data because we're skeptical of it. [21:00:47] I re-ran the test on all 20k human- and auto-labeled observations just to see what it would do to the stats… [21:01:15] ok so you’re saying that CVtrain gives us stats that are a sum of each CV fold? [21:01:21] right [21:01:25] cool. 
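The scheme halfak describes above — split into N folds, train on N-1, score the held-out fold, repeat N times, then pool the held-out results into one set of statistics — can be shown with a toy, library-free sketch. This is a simplification for illustration, not revscoring's actual cv_train code; the "model" is just the training-set mean:

```python
# Minimal illustration of N-fold cross-validation with pooled statistics.
# Each observation is scored exactly once, by a model that never saw it.
from statistics import mean

def k_fold_indices(n_obs, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous, equal-size folds."""
    fold_size = n_obs // n_folds
    for k in range(n_folds):
        test = list(range(k * fold_size, (k + 1) * fold_size))
        train = [i for i in range(n_obs) if i not in test]
        yield train, test

# Toy labels and a trivial stand-in "model": predict the training-set mean.
data = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
errors = []
for train, test in k_fold_indices(len(data), n_folds=5):
    prediction = mean(data[i] for i in train)           # "train" on N-1 folds
    errors.extend(abs(data[i] - prediction) for i in test)  # score the holdout

# One pooled statistic across all held-out predictions, not one per fold --
# this is the "sum of each CV fold" reading confirmed in the chat.
print(round(mean(errors), 3))  # -> 0.55
```

As noted above, a manual train/test split is just this procedure with a single fold, so the two are comparable, only with wider confidence intervals for the one-fold case.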
[21:01:28] :) [21:01:46] A bit more robust than just reporting the results of a one-fold test [21:01:52] And we get to train on all the data at the end. [21:01:59] yeah. [21:02:13] and once it’s trained on all the data, we don’t validate again? [21:02:18] So in reality, it's very likely that the test statistics are pessimistic because the model we deploy was trained on all of the data [21:02:23] awight, right [21:02:29] ok [21:02:37] We assume that it won't do worse by including 1/N more data [21:02:44] I think N is usually 10 [21:03:51] Well, I spot-checked a dozen or so revisions from the approved list and they’re indeed non-damaging. [21:04:09] I’ll go ahead and write up what I’ve done for this second iteration [21:04:16] Unless you have other ideas about things I can try? [21:05:03] brb new meeting [21:05:07] back in 25 mins [21:05:51] k. From your Phabricator comment, I take it I should spot-check some more, then go ahead and add to the real model, regardless of fitness loss? [21:11:30] I kind of want Zache to comment on that [21:11:40] His judgement will be important there [21:12:19] But yeah, I think that if you just add those observations in and do a cv_train, you'll get better fitness stats. [21:12:24] But it won't be a fair comparison. [21:14:54] Scoring-platform-team-Backlog, Research Ideas, artificial-intelligence: New article review routing AI - https://phabricator.wikimedia.org/T123327#3495080 (Halfak) [21:17:10] halfak: Why would that be unfair? AIUI we haven’t changed the definition of damaging, since we’re only adding observations of non-damaging edits…. [21:17:29] ROC-AUC is sensitive to the proportion of positive cases. [21:17:42] The proportion of positive cases will go down [21:18:00] I see [21:48:09] when can I assume wikilabels-wmflabs changes will be deployed? [21:56:22] Zppix, make a "Deploy Wikilabels Early Aug. 
2017" task and put it in active [21:56:26] Assign it to me [21:56:41] I'll likely get it out tomorrow or Friday [21:56:55] Make sure to add your work as a subtask [21:57:12] And anything else that seems relevant (like fajne's recent work) [21:59:06] ok [21:59:52] Scoring-platform-team, User-Zppix: Early Aug 2017 Wikilabels Deployment - https://phabricator.wikimedia.org/T172332#3495230 (Zppix) [22:00:07] Scoring-platform-team, User-Zppix: Early Aug 2017 Wikilabels Deployment - https://phabricator.wikimedia.org/T172332#3495243 (Zppix) a: Halfak [22:00:28] Scoring-platform-team, Wikilabels, User-Zppix: Wikilabels should authenticate on the right wiki - https://phabricator.wikimedia.org/T166472#3495245 (Zppix) [22:00:30] Scoring-platform-team, User-Zppix: Early Aug 2017 Wikilabels Deployment - https://phabricator.wikimedia.org/T172332#3495230 (Zppix) [22:01:10] what has fajne done? [22:04:07] Zppix: halfak is probably referring to https://meta.wikimedia.org/wiki/Research_talk:Automated_classification_of_edit_quality/Work_log/2017-07-24#Labels.27_validity_test [22:04:38] tl;dr, fixing problems with our input data sets [22:05:54] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Change "yes/no" in damaging_goodfaith form to "damaging/good" and "good-faith/bad-faith" - https://phabricator.wikimedia.org/T171493#3495258 (Zppix) [22:05:56] Scoring-platform-team, User-Zppix: Early Aug 2017 Wikilabels Deployment - https://phabricator.wikimedia.org/T172332#3495257 (Zppix) [22:06:54] halfak ^^ there you go [22:13:56] Scoring-platform-team, User-Zppix: Early Aug 2017 Wikilabels Deployment - https://phabricator.wikimedia.org/T172332#3495282 (Zppix) p: Triage>Normal [22:20:28] Scoring-platform-team, Edit-Review-Improvements-RC-Page, ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017): Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis - 
https://phabricator.wikimedia.org/T164331#3495321 (Etonkovidova) [22:20:32] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), MW-1.30-release-notes (WMF-deploy-2017-08-01_(1.30.0-wmf.12)), Patch-For-Review: Summarize what it will take to separate UI and infrastruc... - https://phabricator.wikimedia.org/T167908#3495320 [22:36:17] Zppix, thanks [22:36:56] Oh! I was referring to switching the Yes/No for "Damaging/Good" [22:56:09] np [23:27:24] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), MW-1.30-release-notes (WMF-deploy-2017-08-01_(1.30.0-wmf.12)): Hide ORES review letter from the change list legend. - https://phabricator.wikimedia.org/T172338#3495526 (awight) [23:33:23] Scoring-platform-team, MediaWiki-extensions-ORES, Collaboration-Team-Triage (Collab-Team-Q1-Jul-Sep-2017), MW-1.30-release-notes (WMF-deploy-2017-08-01_(1.30.0-wmf.12)): Componentization-lite of Extension:ORES UI and API - https://phabricator.wikimedia.org/T172339#3495548 (awight)