[00:15:49] Krenair, I was considered eligible even though I didn't edit fr in the last 2 years :P
[00:16:37] I was going to relate your case with mine in phab, but saw you were easier to be considered "active"
[00:17:03] I just hope they didn't consider everyone active
[00:30:54] (I alerted them on meta)
[04:01:37] harej: sorry I had to leave earlier.
[10:06:24] Quarry: Show all published queries in profile - https://phabricator.wikimedia.org/T77948#1404110 (Edgars2007) Temporary solution. In the "Recent queries" page add `?limit=5000` to page URL, so you (currently) get all queries. Then you can search for your username. Yes, it isn't a simple way, but at least it...
[18:29:53] halfak I'm running a couple minutes late. Be at our meeting in ~5.
[18:30:04] hokay. Thanks for letting me know. :)
[18:33:03] * halfak takes the opportunity to reinstall chrome
[18:33:19] Yay! Dropdowns work again
[18:33:29] * halfak joins meeting
[20:46:43] hey Nemo_bis. just want to thank you for chiming in in the talk page. You've been helping us with your comments. :-)
[20:48:28] are people shouting online again
[20:50:00] leila: good
[20:50:44] harej: to be fair the loudest people were those mass-emailing candidate translators :D
[20:51:06] hi harej. some people are not happy, some people are happy, generally the feedback has been very valuable. We'd like to respond to all messages directly, and that's a challenge since every message requires some time to write. Nemo_bis has been helping with answering to few of the comments which is great.
[20:51:39] Nemo_bis: :-[
[20:52:11] loud doesn't equal bad :)
[20:52:27] :-)
[21:05:01] leila, will you be reaching out to the community liaison team for future experiments? (I don't know if you did so already - I wasn't aware this was going on until I got the email)
[21:05:40] Ironholds: we have been working with CL team all along.
[21:06:22] There was a frwiki village pump announcement about it two days ago as well (which doesn't solve exactly the problem you experienced since you don't go there often)
[21:06:27] Ironholds, ^
[21:09:11] leila & Ironholds, what's up?
[21:09:24] Something more than Juergen's recent email?
[21:09:45] * halfak reads scrollback
[21:09:46] Juergen, halfak?
[21:09:57] leila, awesome, cool
[21:09:59] I'll get a link
[21:10:07] like I said, I didn't know if you had been; it wasn't a "why did you not do this thing"
[21:10:25] leila, see https://lists.wikimedia.org/pipermail/wiki-research-l/2015-June/004542.html
[21:10:28] I understand, Ironholds. it's appreciated because that's the way we should be doing it. :-)
[21:10:37] Sam Klein is a fan. :)
[21:10:50] Ironholds, I was more surprised that you didn't know about the research based on your comment in wiki-research-l
[21:11:17] I guess I missed the threads?
[21:11:18] ah! cool, halfak. yeah, I saw that, I sent a batch response to the mailing list.
[21:11:20] leila, seems like you saw the thread and responded. Juergen Fenn's comment is last
[21:11:26] :-)
[21:11:56] I don't see the ethical issue with having an algorithm make personalized recommendations, but I don't much feel like having an email fight over the weekend :
[21:12:00] halfak: there are a lot of comments in https://meta.wikimedia.org/wiki/Research_talk:Increasing_article_coverage that I'm trying to respond to. That's the major effort right now.
[21:12:08] Gotcha.
[21:12:16] That's a better spot to have the discussion anyway.
[21:12:23] np, halfak, and I think it's really a personal choice
[21:13:00] The nice thing about research talk pages is that the discussion about the project is attached to the project documentation itself.
[21:13:18] Email archives are really awful for everything
[21:13:22] I had very good reasons for choosing email over anything else but it was a tough call and I can see how some people are not happy about it.
[21:13:34] yeah, agreed.
[21:13:53] Ironholds, we mostly talked about this research (not so frequently) in the research meeting and back when we had standups.
[21:14:05] leila, I'm talking about the research project talk page. Not how you contacted editors.
[21:14:21] As for how you contact editors, I can see the frustration with email.
[21:14:26] Given that Bob came for this research (one of the two), I somehow assumed the whole planet knows about it. Should communicate more.
[21:14:38] ah! I see, halfak. sorry. missed that.
[21:15:41] I wish I were faster in responding to comments, halfak. I'll be around Saturday and Sunday, too, at least for part of the two days. :-)
[21:15:59] Any of the CLs helping out with that?
[21:16:23] leila, oh, gotcha. Yeah, mailing lists are always better because sometimes meetings aren't around - I have a standup that conflicts with the research meeting which is why I personally haven't been in as much
[21:16:27] I got loads of support for the recent VE experiment from the CLs.
[21:16:41] Ironholds, noo! Let's move the meeting.
[21:16:43] I'm thinking of extending the "if it didn't happen on the mailing list, it didn't happen" rule to my work on measuring user satisfaction
[21:16:51] maybe we could use that as a template?
[21:16:52] halfak, we don't have allocated resources from the CL team, but Benoit has been working with me and has been very helpful.
[21:17:08] just to see what the research-l reaction to "here is something we're planning on doing do you have any thoughts" is
[21:17:12] Ironholds, yes. Please. I have no idea what you are doing.
[21:17:17] halfak, awesome!
[21:17:21] ^ not 100% true, but enough
[21:17:24] wait. bad phrasing
[21:17:27] ha
[21:17:31] :P
[21:17:37] yeah, Ironholds, I didn't know you have a conflict. :-\
[21:17:41] halfak, to be fair, you mostly don't know what I'm doing because I haven't done any research since I was actually on the research team ;p
[21:18:01] leila, that and..well, most of the work I do these days 'doesn't count' and so I feel kinda weird bothering people with it.
[21:18:02] that's not true, Ironholds. you presented a couple of weeks back what you're working on. ;-)
[21:18:03] Boo. Not even clever, sneaky research like with AFT?
[21:18:18] that's the user satisfaction stuff, THAT is going to be research! :D
[21:18:29] everything else is building mobile-optimised dashboards and making eventlogging not fall over
[21:18:45] ^ sounds relevant to me
[21:20:25] I guess? It doesn't feel on the scale of what yinz do
[21:20:32] you find out new things, I learn python ;p
[21:21:11] and move to Pittsburgh, apparently
[21:22:18] hah
[21:22:18] * harej explodes in a morass of WikiProjects
[21:40:05] are people shouting online again < Yes. And I need to repeat my mantra "Don't respond to idiots. Don't respond to idiots." more than usual today.
[21:41:12] harej: did you remove https://tools.wmflabs.org/projanalysis/config.php ? what should I be using instead?
[21:41:24] Mailing lists are the new Comments.
[21:41:35] Mailing lists are the *old* comments :P
[21:41:38] leila:
[21:41:40] :p
[21:42:11] leila: And you're not seeing the messages on the internal WM-FR list, fueled by anti-WMF rage.
[21:42:19] Nettrom: I think config.php died in the NFS outage. Let me re-upload it quickly.
[21:42:35] harej: sweet, working on it now
[21:42:59] "Why do they think articles from the English Wikipedia are more important than articles from other Wikipedia? TERRORISTS!" (only slightly exaggerated)
[21:44:22] but fr-wp already translated a whole bunch of English articles, what are they complaining about?
[21:44:28] (see my 2012 WikiSym paper :)
[21:44:34] ow guillom. :P
[21:45:03] Yeah, French and English are both Western languages, so a lot of overlap in interests. It's not like we're translating English into Xhosa and bringing civilization to the dark continent (I'm being sarcastic)
[21:45:54] yeah, that was a concern that we were aware of guillom. that's why we explained in the village pump announcement that the choice of English and French is just for the test.
[21:47:37] Nettrom: have fun! https://tools.wmflabs.org/projanalysis/config.php
[21:47:43] leila: For what it's worth, I think there's a silent majority that enjoyed the suggestions. Maybe a future version of the message could be worded so that it reads more like an invitation (like "of course you're free to contribute on any subject; these are merely suggestions. and we'd love to get your feedback to help us improve future recommendations"). It was clear to me, but some felt more directed than the message intended, I think.
[21:48:12] I think there's more confusion than anything else
[21:48:26] Especially for those anglophones who got an email in French yet can't read French.
[21:48:33] I can *read* French but I don't think you want me *writing* French.
[21:49:02] leila: I think we all fail to remember that people don't read village pumps, or even the full recommendation emails. They just get outraged and write angry messages. :)
[21:49:07] this is a good suggestion, guillom.
[21:49:34] we just updated the page with the lessons learned so far: https://meta.wikimedia.org/wiki/Research:Increasing_article_coverage#French_Wikipedia_Test:_Lessons_Learned
[21:49:40] harej: thanks!
[21:49:42] harej: I think they've changed the contribution thresholds already :)
[21:50:09] https://fr.wikipedia.org/wiki/Sp%C3%A9cial:Contributions/Harej << Look! I made *one edit* to an article!
[21:50:13] I understand guillom. I like to learn to help reduce that reaction.
[21:50:58] And many said "Don't spam me, I have a talk page goddammit!". But of course if you actually read your documents, you would see that that had privacy implications. But they don't read…
[21:51:11] harej: our lessons learned write up may be interesting to you, too. we have changed the way we identify editors and we are explaining it there: https://meta.wikimedia.org/wiki/Research:Increasing_article_coverage#French_Wikipedia_Test:_Lessons_Learned
[21:51:34] guillom: thanks for sharing these. they are very valuable.
[21:51:46] Anyway; I'm interested to see the final results :)
[21:51:59] I haven't actually looked at my new recommendations :)
[21:52:04] This reminds me of the issue with my WikiProject Directory, in that it names people as contributing to a subject area even if they're just AutoWikiBrowser power users
[21:53:08] b.t.w., guillom, did you get any recommendation?
[21:53:18] ah! sorry. hadn't read that line yet
[21:53:32] Take my friend Ser Amantio di Nicolao, who has over a million edits and is named as a contributor to several different subject areas, not necessarily because he's interested in them, but because in his work he edits pretty much all the articles.
[21:53:37] And those edits are things like category sorting
[21:56:28] harej, in my head Ser Amantio is registered as "sounds like a Giano sock but actually a really nice person"
[21:57:05] harej: We could probably use WikiLabels to train an algorithm to recognize such edits (category edits, portal edits, template and other maintenance edits, image edits etc.). It could be a fun project.
[21:57:05] He is a nice person.
[21:57:30] WHOOAAAA
[21:57:55] guillom: that's an awesome project that will have a lot of customers.
[21:57:58] [Idea: get the WikiProject Directory classified as an analytics product, have me work on it full time]
[21:59:12] but yes WikiLabels is a great invention
[22:00:08] Ironholds: He's actually kind of the anti-Giano in that he steadfastly refuses to get involved with drama
[22:00:47] Was that test counting MediaWiki namespace edits?
[22:00:54] or just main namespace?
[22:05:19] * guillom makes a note to ping halfa.k about edit-type classification once he's not in "war mode" any more.
[22:05:44] I guess on frwiki it doesn't really matter
[22:06:04] but on some other wikis you'll only really see me with MediaWiki: namespace contributions and perhaps a user page
[22:06:22] Because I have global editinterface
[22:06:22] That's a good point.
[22:07:43] Also, I've added photos to several articles on wikis where I don't speak the language (usually, photos of people I took at events), so this is another scenario where someone with edits might not speak the language. I guess that edit-type classifier really could come in handy in a variety of projects.
[22:08:40] actually, no local user page anymore because of global user pages
[22:09:10] Yeah, I had all of my user pages deleted as well after GUP was deployed.
[22:10:35] I still have a couple of local user pages.
[22:10:47] My English Wikipedia user page is over ten years old! It's a legacy!
[22:13:18] harej, kill it!
[22:13:23] Come over to the meta side
[22:13:23] NEVER
[22:13:33] I *have* a meta user page but I have a specific exemption for English Wikipedia.
[22:13:38] And Wikidata, I think.
[22:13:43] Ahhh. Cool
[22:14:01] leila: related to my recommendations: maybe remove Disambiguation pages? I'm not sure it makes sense to translate those.
[22:14:14] halfak!
[22:14:25] making a note of that now, guillom. thanks.
[22:14:42] o/ guillom
[22:15:15] We were talking a few minutes ago about using WikiLabels to train an algorithm to recognize types of edits (category edits, template/maintenance edits, image edits etc.).
[22:15:39] ha. I'm working on a project description for that right now.
[22:15:52] I've been working with some researchers at CMU to replicate some past work.
[22:15:58] \o/
[22:16:16] http://www.aclweb.org/old_anthology/C/C12/C12-1044.pdf
[22:16:25] ^ Regretfully non-useful model
[22:16:43] *but* they did some ground work in developing features and a proposed classification scheme.
[22:17:48] Unrelatedly, I read Michael Ekstrand's paper about "discarded work in history" while you were away, as part of work on my history analysis tool. Great stuff.
[22:18:37] +1
[22:18:44] He got a best paper for that one.
[22:18:54] "rv your dumb", right?
[22:18:58] yup :)
[22:19:17] Oh... he used the right "you're"
[22:19:33] For some reason I thought he purposefully stuck a typo in there.
[22:20:20] *that* is an under-cited paper. I appreciate the multiple and complex strategies with the basic recommendation of "meh, just match checksums"
[22:20:25] It's good science.
[22:20:50] There's a strong temptation to develop a complex technique that works a tiny bit better and try to sell it.
[22:21:01] See most of RecSys literature
[22:21:21] (Which, incidentally, is Michael's scholarly niche)
[22:22:07] Yeah, I've been looking into cosine similarity and similar distance analysis stuff and it was great to read that part.
[22:23:26] I wondered if the results with CS and adoption coefficient might be better if they were applied to the diffs themselves, instead of the revisions. I want to look into it after Wikimania.
[22:24:43] That's related to some work that FaFlo has been doing.
[22:25:03] He wasn't really looking for a tree so much as better measurement of provenance.
[22:25:16] http://www.aifb.kit.edu/images/1/1f/Www2014_submission_715_(9).pdf
[22:25:19] ^ WikiWho
[22:25:41] The general diff algorithm he developed for this is implemented in one of my diff libraries.
[22:25:52] https://pythonhosted.org/deltas/
[22:25:52] Ah, great! Yes, my context was also one of authorship, not of trees.
[22:26:06] Cool, this'll be easy then.
[22:26:12] pip install deltas
[22:26:25] :D
[22:26:33] from deltas import segment_matcher
[22:26:48] segment_matcher.diff(tokens_a, tokens_b)
[22:26:51] I'm doing this one in JavaScript, but was considering using Python/C for the computing-heavy parts.
[22:27:05] guillom, I don't think it'll be hard to convert.
[22:27:31] If you can do regex-based segmentation and compare hash maps (javascript'll do great), then you're set.
[22:28:09] halfak: Yeah; I'll look into it after Wikimania. Right now I'm focusing on those presentations.
[22:28:15] Nemo_bis: question about https://meta.wikimedia.org/wiki/Research_talk:Increasing_article_coverage#Bug_with_imports
[22:28:24] +1 guillom
[22:28:39] how can content get imported? is there a way we can identify such imports quickly?
[22:29:11] Transwiki imports and XML imports
[22:29:26] The import log should have entries for both.
[22:31:22] thanks guillom.
[22:35:21] in theory they should have rev_user = 0
[22:36:02] you can also notice them because they have rev_id higher than revisions with higher rev_timestamp
[22:36:54] Oh, I didn't know about rev_user = 0.
[22:37:29] But that broke :( I don't know when
[22:38:17] what broke?
[22:38:36] are imports getting actual local user IDs?!
[22:47:38] http://quarry.wmflabs.org/query/4120 for https://www.wikidata.org/w/index.php?title=Template:-&oldid=2285007
[22:47:46] Have to go home, bbl
[22:52:53] Alright. I'm off for the evening. Have a good one folks. o/
[22:53:11] (also consider signing up to collaborate in my schemes while I am gone https://meta.wikimedia.org/wiki/Research:Automated_classification_of_edit_types)
[22:53:20] Krenair: yes, see https://meta.wikimedia.org/wiki/Research_talk:Increasing_article_coverage#Bug_with_imports
[23:01:15] Nemo_bis, it sounds like it didn't just blindly copy in the source wiki's user ID, but CA may have looked up the correct one?
[23:07:38] no
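
Note on the import heuristic from 22:36:02: the idea that imported revisions have a rev_id higher than revisions with a higher rev_timestamp can be sketched roughly as below. This is only an illustration under stated assumptions, not how anyone in the channel implemented it. The `Revision` tuple and the `find_out_of_order_revisions` function are hypothetical names, the input is assumed to be all revisions of a single page with sortable timestamp strings, and rev_user = 0 is deliberately not used because the log says that signal broke.

```python
from typing import Iterable, List, NamedTuple


class Revision(NamedTuple):
    """Hypothetical container; field names follow the columns named in the chat."""
    rev_id: int
    rev_timestamp: str  # sortable timestamp string, e.g. "20150626223602"


def find_out_of_order_revisions(revisions: Iterable[Revision]) -> List[Revision]:
    """Return revisions whose rev_id is higher than the rev_id of some
    revision with a later rev_timestamp (a hint that they were imported)."""
    # Sort oldest first; ties on timestamp are ordered by rev_id so that
    # same-second edits are not flagged against each other.
    ordered = sorted(revisions, key=lambda r: (r.rev_timestamp, r.rev_id))
    flagged = []
    min_later_id = None  # smallest rev_id among revisions seen so far (all later)
    # Walk from newest to oldest, comparing each rev_id to the later ones.
    for rev in reversed(ordered):
        if min_later_id is not None and rev.rev_id > min_later_id:
            flagged.append(rev)
        if min_later_id is None or rev.rev_id < min_later_id:
            min_later_id = rev.rev_id
    flagged.reverse()  # report in chronological order
    return flagged


if __name__ == "__main__":
    history = [
        Revision(10, "20100101000000"),
        Revision(500, "20110101000000"),  # old timestamp, suspiciously new rev_id
        Revision(20, "20120101000000"),
        Revision(30, "20130101000000"),
    ]
    for rev in find_out_of_order_revisions(history):
        print(rev.rev_id, rev.rev_timestamp)
```

The heuristic cannot catch an imported revision that also happens to carry the latest timestamp on the page, so the import log mentioned at 22:29:26 remains the more direct source.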