[04:10:32] 10Scoring-platform-team (Current): Scoring platform team FY18 Q2 - https://phabricator.wikimedia.org/T176324#4048658 (10awight) [04:10:36] 10Scoring-platform-team (Current), 10JADE: Deploy JADE prototype in Beta Cluster - https://phabricator.wikimedia.org/T176333#4048656 (10awight) 05Open>03Resolved [04:10:47] 10Scoring-platform-team (Current), 10JADE: Deploy JADE prototype in Beta Cluster - https://phabricator.wikimedia.org/T176333#3621812 (10awight) https://en.wikipedia.beta.wmflabs.org/wiki/JADE:Diff/376901 [04:19:35] (03PS1) 10Awight: Update schema URLs in test fixtures [extensions/JADE] - 10https://gerrit.wikimedia.org/r/419345 [08:34:01] (03CR) 10Brian Wolff: [C: 032] Security review followups [extensions/JADE] - 10https://gerrit.wikimedia.org/r/419211 (https://phabricator.wikimedia.org/T188308) (owner: 10Awight) [08:37:30] (03Merged) 10jenkins-bot: Security review followups [extensions/JADE] - 10https://gerrit.wikimedia.org/r/419211 (https://phabricator.wikimedia.org/T188308) (owner: 10Awight) [08:40:18] (03CR) 10jenkins-bot: Security review followups [extensions/JADE] - 10https://gerrit.wikimedia.org/r/419211 (https://phabricator.wikimedia.org/T188308) (owner: 10Awight) [09:54:03] Does anyone know how to request a username change on phabricator? [10:56:57] 10Scoring-platform-team, 10editquality-modeling, 10artificial-intelligence: Complete edit quality campaign for Arabic Wikipedia - https://phabricator.wikimedia.org/T131669#4049282 (10Ghassanmas) @Halfak 96 edits left to label, but all of them seems to point to a deleted or unavailable edit, given all of th... [11:12:26] 10Scoring-platform-team, 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Complete Latvian Wikipedia editquality campaign - https://phabricator.wikimedia.org/T163005#4049310 (10Papuass) Yeah, I have been struggling to convince others to join, so had to do most of it myself :/ Will try to in... [14:59:30] awight, when you're ready, let's talk more about goals that we can't afford :) [14:59:41] hehe [14:59:57] I’m waiting for Elliott to drop by and take some keys, then I’m free [15:00:01] kk [15:00:28] back in 10 [15:04:09] codezee, looking at your PR, I want to drop the global dict [15:04:33] And we can have the "user" of the word2vec datasource pass a function [15:05:01] E.g. oh! I have a better idea. Let me try something on your branch [15:06:43] halfak: Shall I call? [15:06:49] Sure! [15:13:31] halfak: pass a function? and what would the function have? [15:17:51] The function would wrap the keyed_vectors asset [15:18:59] something like a partial func? [15:33:51] Nope. Just a regular func. You'll see. I'm updating the test to make sure this works [15:34:00] codezee, sorry was AFK doing QR stuff with awight [15:35:11] Ahh and I got caught with something that is not pickle-able. [15:35:13] * halfak digs [15:38:45] OK kind of like a partial but without explicitly giving it a member. [15:43:04] OK I just submitted a new commit to the branch. [15:43:08] codezee, ^ [15:43:34] Note how "test_vectors" gets wrapped in a function implicitly. [15:43:40] (in the test) [15:43:52] I think this will work. Would you adapt your memory test and give it a try? [15:46:38] Also, ha! I did submit my offsite travel request after all. Looks like it's just travel who is being slow. [15:52:05] akosiaris, o/ it looks like awight and I are both going to be AFK March 27-29th. [15:52:21] I thought you'd want to know in case ORES starts on fire then. [15:52:28] I will have 12-hour latency due to plane rides. 
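(A minimal sketch of the function-wrapping idea halfak describes above for the word2vec datasource: instead of a global dict, the caller passes a plain function that wraps the keyed-vectors asset, roughly like a partial. The names below are illustrative, not the actual revscoring/editquality API, and the real code also has to work around the pickling issue mentioned in the log.)

    # Illustrative sketch only -- not the actual revscoring API.
    def make_vectorizer(keyed_vectors):
        """Return a plain function that closes over the loaded vectors asset."""
        def vectorize(word):
            # Hypothetical lookup; a real datasource would also handle
            # out-of-vocabulary words.
            return keyed_vectors[word]
        return vectorize

    # In a test, a small in-memory mapping can stand in for the real binary
    # asset, so "test_vectors" gets wrapped in a function implicitly:
    test_vectors = {"revert": [0.1, 0.2], "vandal": [0.3, 0.4]}
    vectorize = make_vectorizer(test_vectors)
    assert vectorize("revert") == [0.1, 0.2]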
[15:53:08] having dinner, will attend to it... [15:53:45] halfak ok I'll keep it in mind. Thanks for the heads up [15:54:11] akosiaris, no problem. Would you like me to put something in your calendar? [15:55:24] done already [16:17:18] halfak: https://etherpad.wikimedia.org/p/Wikimedia_Scoring_Integration_Manual [16:32:35] Amir1: I need to deliver a broken bike to my sketchy shop, are you up for SoS in an hour? [16:52:25] 10Scoring-platform-team (Current), 10Packaging, 10Patch-For-Review: Package word2vec binaries - https://phabricator.wikimedia.org/T188446#4050548 (10awight) I see we have git-lfs on ores*.eqiad.wmnet, so we're almost ready to give this a try. What I'm missing is the `scoring/ores/assets` repo (request submi... [16:53:43] let me take a look [17:06:12] Amir1, I'm around now if you want to talk wikimania or I could go to lunch and be back after SoS [17:06:33] halfak: Are you going to SoS? [17:06:45] I thought you were :) [17:06:49] Since awight asked [17:06:52] But I could [17:07:12] :) thanks, that would be a big help. [17:08:17] btw this UNICEF paper is fantastic and a line drive through our problem space [17:08:58] I can go, it's okay [17:13:43] Thanks Amir1 [17:34:54] back in 30-90min [17:44:09] 10Scoring-platform-team, 10editquality-modeling, 10artificial-intelligence: Complete edit quality campaign for Arabic Wikipedia - https://phabricator.wikimedia.org/T131669#4050755 (10Halfak) Oh great! That means that you're done. :) [17:44:48] 10Scoring-platform-team (Current), 10editquality-modeling, 10artificial-intelligence: Train and test damaging/goodfaith model for arwiki - https://phabricator.wikimedia.org/T189710#4050761 (10Halfak) [17:44:58] 10Scoring-platform-team, 10editquality-modeling, 10artificial-intelligence: Complete edit quality campaign for Arabic Wikipedia - https://phabricator.wikimedia.org/T131669#4050772 (10Halfak) [17:45:00] 10Scoring-platform-team (Current), 10editquality-modeling, 10artificial-intelligence: Train and test damaging/goodfaith model for arwiki - https://phabricator.wikimedia.org/T189710#4050771 (10Halfak) [17:45:16] 10Scoring-platform-team (Current), 10editquality-modeling, 10artificial-intelligence: Complete edit quality campaign for Arabic Wikipedia - https://phabricator.wikimedia.org/T131669#2174291 (10Halfak) [17:53:02] \o/ arwiki has data! [17:53:06] We can train models :) [17:53:30] Nice goading :) [17:54:40] * halfak starts work while his noodles boil [18:00:16] Oh crap. [18:00:46] haha! arwiki is one of our balanced datasets *and* it looks like we converted the true/false to the strings "True" and "False" [18:00:47] :) [18:00:50] New corner case. [18:02:04] I wonder if I could clean this up in the wikilabels database [18:02:05] hmm [18:03:16] Nope. Not gonna work [18:04:05] o/ Amir1 [18:04:10] how was SoS? [18:04:30] halfak: hey, they told me RelEng is working on the blockers :) [18:04:35] Nice :) [18:04:49] Wondering if you have some thoughts on what I have discovered. [18:04:53] besides that, usual stuff I think [18:04:57] for arwiki [18:05:23] :((((((((((((((( [18:07:27] Amir1, I think we should consider having a special case for balanced labeled sets [18:07:44] *and* we should just modify the template to handle both 'true' and '"True"' [18:07:48] What say? [18:08:06] halfak: That wouldn't be hard I think [18:11:07] Want to chat about wikimania? [18:11:46] Amir1, [18:11:47] ^ [18:13:11] halfak: The labeled sets seem to be the code generation’s biggest weakness, hence all the special cases.
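(A sketch of the cleanup implied above for the arwiki set, where boolean labels were converted to the strings "True"/"False" in one of the balanced datasets. The field names follow the damaging/goodfaith models named in the tasks above; the function names and the exact set of accepted spellings are assumptions.)

    import json

    # Accepted spellings are an assumption; the log only mentions "True"/"False".
    TRUTHY = {True, "True", "true"}
    FALSY = {False, "False", "false"}

    def normalize_label(value):
        """Coerce a label that may be a bool or a stringified bool to a bool."""
        if value in TRUTHY:
            return True
        if value in FALSY:
            return False
        raise ValueError("Unexpected label value: {!r}".format(value))

    def normalize_observations(lines, fields=("damaging", "goodfaith")):
        """Yield labeled-revision dicts with the given fields coerced to bool."""
        for line in lines:
            obs = json.loads(line)
            for field in fields:
                if field in obs:
                    obs[field] = normalize_label(obs[field])
            yield obs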
[18:13:23] I was thinking we could analyze the labeled sets on-the-fly... [18:13:30] This would give pop-rate for one. [18:13:47] "on the fly"? [18:14:07] part of the build process will summarize data from the labeled inputs [18:14:22] Ahh. I see [18:14:50] We’ll use that data automatically as part of make, e.g. to calculate pop-rate, and semi-automatically, like comparing actual rates against a “balanced” tag to validate the preconditions [18:17:39] Would that help with your arwiki issue? [18:19:03] Hmm. Maybe. I don't think it'd make it easier but it might open us up to other solutions. [18:22:56] halfak: shoot, I was afk for dinner (doner :D) [18:23:11] I forgot to mention it, sorry, too hungry [18:23:15] No worries [18:23:36] So, I think we generally need to solve for balanced datasets [18:23:37] I haven’t looked at the special cases, but I can imagine they’ll have to stay, good point. [18:24:02] I can produce a basic template for getting from autolabels + human_labels to labeled_revisions. [18:24:04] halfak: Let's chat [18:24:45] awight, for what it's worth, I think these balanced datasets all should be identical. [18:24:53] Even pseudocode would be great [18:24:56] ok perfect [18:25:04] So it's really only one special case; the adopted quirks can be ignored. [18:25:43] Amir1, so, I think that measuring time to revert is a good first start. In talking with awight, I realized that we could do a little bit of coding like we did for the old wikidata vandalism detection paper. [18:25:47] The only thing that bothers me so far is that we have this random section of “5k_unbalanced” tags, getting rid of those warts in the template file would be worth additional code complexity, imho [18:26:31] Amir1, E.g. we can look at what is flagged as damage, but was not reverted and use that to get a sense for how much more (or less) vandalism is being caught now. [18:26:35] halfak: my mind is blurry, what type of coding do you mean? going through dump of wikidata? that's my favourite [18:26:55] 5k_unbalanced is the common case. [18:27:00] 5k_balanced is less common. [18:27:42] Just to confirm, both of those cases would be easy to detect by processing the labeled inputs, right? [18:27:50] Amir1, I'm thinking that (1) we take a sample of edits from Wikidata, (2) we score them with ores, and (3) we filter to edits that are flagged as likely to be damage and (4) we label them manually. [18:28:36] awight, we could confirm by processing the inputs, yes. But we have a finite set of wikis/datasets that look like this. [18:29:00] halfak: hmm, I would suggest going through the dump and trying to build a dataset of reverted edits and how long it took to revert them and check the difference over time [18:29:07] both of them can happen ofc [18:29:16] Amir1, oh yes. That's the more straightforward analysis :) [18:30:44] Amir1, let's start with that [18:30:44] halfak: we did this to build the reverted model in wikidatawiki [18:30:51] Amir1, right [18:31:21] Amir1, I think you could just run mwreverts on the wikidata dumps and then analyze the results. [18:31:31] yup [18:32:40] https://pythonhosted.org/mwreverts/utilities.html#mwreverts-dump2reverts [18:33:48] halfak: I think we built that for wikidata :P [18:33:51] IIRC [18:34:07] Na. Older than my wikidata work :P [18:34:39] hmm.. Still I think you might be right about this particular utility [18:34:42] :( [18:38:07] :(? [18:39:48] "Na. Older than my wikidata work :P " [18:40:34] Oh. Just that this utility is pretty old. Way older than ORES.
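(A sketch of the time-to-revert analysis proposed above. It assumes the mwreverts dump2reverts utility linked in the log emits one JSON revert document per line, shaped roughly like {"reverting": {...}, "reverteds": [...]} with "id" and "timestamp" on each revision document; the exact schema and timestamp format should be checked against the mwreverts documentation. Bucketing the output by the month of the reverted edit would give the over-time comparison Amir1 describes.)

    import json
    import sys
    from datetime import datetime

    # Assumed timestamp format for revision documents in the dump output.
    TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

    def parse_ts(ts):
        return datetime.strptime(ts, TS_FORMAT)

    def times_to_revert(revert_docs):
        """Yield (reverted_rev_id, seconds_until_revert) pairs."""
        for doc in revert_docs:
            reverting_ts = parse_ts(doc["reverting"]["timestamp"])
            for reverted in doc["reverteds"]:
                delta = reverting_ts - parse_ts(reverted["timestamp"])
                yield reverted["id"], delta.total_seconds()

    if __name__ == "__main__":
        docs = (json.loads(line) for line in sys.stdin if line.strip())
        for rev_id, seconds in times_to_revert(docs):
            print("{}\t{}".format(rev_id, seconds))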
I first built this utility in 2008 or 2009. :) [18:40:53] But I suppose I pulled it into this repo around the time we were detecting reverts in wikidata [18:41:54] cool [18:42:08] Either way. I think this is a good place to start ^_^ [18:42:30] brb I need to take advantage of the sunshine to go break an ice dam in my yard. [18:42:44] When I get back, I'll make a pseudo-template for balanced datasets. [18:45:22] I would jump into CG code to help, but I’m not going to this week :) [18:46:55] 10Scoring-platform-team (Current), 10JADE, 10WMF-Communications: Blog about JADE - https://phabricator.wikimedia.org/T183200#4051130 (10awight) Submitted an "extended abstract" about the political economy of JADE and its context: https://github.com/adamwight/jade-pol-econ/blob/master/jade-pol-econ.pdf [18:50:21] What the heck. [18:50:34] Wikimania submissions are private, in EasyChair. No wiki. [18:51:07] Hmm, there’s an explanation, at least. [18:57:38] Ooh, there’s no way to add an image to your Wikimania proposal, it’s a 300-word plain-text abstract, nothing more. [19:04:33] halfak: I’m planning to make the JADE talk just a 25-min presentation, unless someone stops me. [19:12:22] ah you can link to the project page where pictures live. [19:15:17] +1 awight [19:15:52] wow. that took longer than expected. Got some work done, but had a neighbor come over to tell me about parallel dimensions and government conspiracies for a while. [19:16:01] I didn't want to be rude, so I heard him out. [19:16:42] holy moley. [19:17:01] People get strange when the ice pack builds up [19:17:03] Amir1, forgot to say a next step for the wikimania stuff. How about I do a first draft of the talk proposal on Saturday during the hack session. Then you can review and we'll get to work on analysis. [19:17:06] lol awight [19:17:28] Turns out they opened a portal to a parallel dimension at CERN. Who knew? [19:17:33] Road trip to Chicago! [19:17:34] Hell [19:17:53] That does start to explain some gov’t conspiracies, of course... [19:18:11] https://www.cnbc.com/2017/02/22/alternate-realities-and-trump-mandala-effect-and-what-cern-does.html [19:18:14] ^ relevant [19:18:49] Oh wait. CERN is in Europe [19:18:52] rofl [19:18:55] What's the one in Chicago? [19:22:09] There was the original Fermi reactor… [19:22:30] Oh the new thing is named after him as well, https://en.wikipedia.org/wiki/Fermilab [19:23:02] I do agree that it’s a bad look to try to make mini black holes on Earth. [19:23:25] Even if it’s “very likely” that they’ll collapse immediately rather than devour our solar system. [19:24:38] 10Scoring-platform-team, 10Wikilabels, 10editquality-modeling, 10artificial-intelligence: Complete Latvian Wikipedia editquality campaign - https://phabricator.wikimedia.org/T163005#4051298 (10Halfak) Sounds good. Thanks for doing so much hard work. I'm hoping ORES will be helpful for you. It's been a h... [19:25:26] awight, the only difference between our particle acceleration and what happens naturally is that ours is more controlled and predictable. [19:25:59] I hadn’t heard of it in those terms. What kind of energy do natural collisions have? [19:31:49] nvm, my curiosity is being satisfied. https://en.wikipedia.org/wiki/Safety_of_high-energy_particle_collision_experiments [19:36:08] :D [19:36:12] awight, https://gist.github.com/halfak/78e5f7ca11f0a724f1b1705ebb01326e [19:36:24] When we have a balanced set, we want to do the above thing. [19:36:36] When we have an unbalanced set, we do the normal thing.
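(The gist itself is not reproduced in this log, but the balanced-versus-unbalanced branching it encodes could be driven by tallying the labeled inputs, as suggested earlier: "both of those cases would be easy to detect by processing the labeled inputs." This is only a rough sketch; the field name and tolerance are assumptions.)

    import json
    from collections import Counter

    def label_counts(path, field="damaging"):
        """Count label values for one field in a JSON-lines labeled_revisions file."""
        counts = Counter()
        with open(path) as f:
            for line in f:
                counts[json.loads(line)[field]] += 1
        return counts

    def looks_balanced(counts, tolerance=0.1):
        """True if every label's share is within `tolerance` of an even split."""
        total = float(sum(counts.values()))
        even_share = 1.0 / len(counts)
        return all(abs(n / total - even_share) <= tolerance
                   for n in counts.values())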
[19:37:23] halfak: Just a detail, but have you tried that regex? I think you need “grep -E”? [19:37:37] -P is perl style [19:37:37] 04Error: Command “p” not recognized. Please review and correct what you’ve written. [19:37:40] Which I usually reach for [19:37:45] * halfak pats AsimovBot [19:37:52] But -E works too [19:38:04] ah, just one is using -P [19:38:07] gotcha [19:38:27] -Punch-AsimovBot [19:38:35] aww [19:38:52] Woops [19:39:06] halfak: also, lines 19 and 20 should be *human*no_review* and *human*review*, right? [19:39:30] Got 'em [19:39:51] great [19:40:21] Fixed [19:40:23] Oh woops [19:40:27] Why is "\" before ".review"? [19:40:59] There. I simplified [19:41:17] awight, ^ [19:41:34] ARG [19:41:51] OK now it's good [19:41:53] lol [19:43:21] halfak: Why are we merging the autolabeled_revisions? Aren’t those guaranteed to be unique? [19:43:43] Maybe that’s supposed to be $* in line 15? [19:43:43] human_labeled are a subset of autolabeled. [19:43:47] halfak: sure [19:43:49] Oh yeah probably [19:43:57] oops $^ [19:44:20] halfak: ^ [19:44:26] ha [19:44:35] {{done}} [19:44:36] You rule, halfak! [19:44:40] Thanks AsimovBot [19:44:52] * halfak rewards himself with frozen fruit [19:46:12] IMO the intermediate human*review* file was cool. Why would we inline that rule? [19:47:31] halfak: otherwise, lgtm [19:47:42] 'cause we don't use it for anything else, I guess. [19:47:52] Maybe it was more clear what was happening that way? [19:48:24] * halfak turns his mouth blue [19:49:31] lol I would rather just merge_labels the whole lot of ‘em, damn the redundancy [19:49:39] I’m fine with the rules as you have them now [19:49:47] The nice thing about templating is that we can iterate :D [19:53:37] :) [19:53:50] Do you want to take a hack at Amir's PR to see if you can get this working [19:53:51] ? [19:53:54] halfak: Side note, I just realized that my suggestion to automatically calculate the population rate will be trickier than I thought [19:53:55] He has one for cswiki. [19:54:00] halfak: sure, will do [19:54:28] Cool. I have to go generate a table for a paper. But I'll be around for chatting/review. [19:54:31] pop-rate is actually encoded into the Quarry query to pull the sample, right? [19:54:32] kk [19:54:44] Kinda, yeah. That's where it all starts. [19:54:47] A pure random sample [19:54:47] ah it’s not even part of the query [19:54:50] +1 [19:56:23] So if it’s a random sample, then you can calculate the pop rate by tallying the sample, but if it’s been changed from random, e.g. balanced manually, then you do have to include that in configuration :-/ [20:05:22] Right [20:05:27] And do some algebra :) [20:05:29] ALGEBRA! [20:10:39] Our template has way too much logic. [20:22:08] halfak: cswiki PR updated [20:39:59] I’m smoke testing on ores-misc-01, btw [20:44:06] \o/ [20:47:25] fwiw [20:47:26] wc -l datasets/cswiki.labeled_revisions.20k_2016.json [20:47:26] 18015 datasets/cswiki.labeled_revisions.20k_2016.json [20:47:52] cat datasets/cswiki.labeled_revisions.20k_2016.json | grep -P '"needs_review": (false|"False")' | wc -l [20:47:53] 15525 [20:48:08] Not sure how else to check the Makefile [21:09:20] Wandering over to check on my “bike”, aka pile of parts [21:16:01] * halfak checks out awight's work [21:27:22] OK I have a sense for what is going on here [21:27:27] But I have some questions.
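(The "algebra" mentioned above, sketched out: for a pure random sample the population rate is just the observed label proportion, but for a manually balanced sample the per-label sampling fractions have to come from configuration so the balancing can be undone. Function names and the example numbers in the comment are illustrative.)

    def pop_rates_from_random_sample(counts):
        """Observed proportions estimate population rates for a pure random sample."""
        total = float(sum(counts.values()))
        return {label: n / total for label, n in counts.items()}

    def pop_rates_from_balanced_sample(counts, sampling_fractions):
        """Undo per-label down-sampling using fractions supplied by configuration.

        sampling_fractions[label] = (rows of that label kept in the balanced set)
        / (rows of that label in the original random sample).
        """
        recovered = {label: n / sampling_fractions[label]
                     for label, n in counts.items()}
        total = float(sum(recovered.values()))
        return {label: n / total for label, n in recovered.items()}

    # e.g. a 2,500/2,500 balanced set built by keeping every "true" row but only
    # 1 in 7 of the "false" rows recovers roughly {"true": 0.125, "false": 0.875}:
    # pop_rates_from_balanced_sample({"true": 2500, "false": 2500},
    #                                {"true": 1.0, "false": 1.0 / 7})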
[22:57:30] (03PS1) 10Awight: [DNM] Experimental git-lfs submodule [services/ores/deploy] - 10https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627) [23:28:39] (03PS2) 10Awight: [DNM] Experimental git-lfs submodule [services/ores/deploy] - 10https://gerrit.wikimedia.org/r/419613 (https://phabricator.wikimedia.org/T180627)