[05:14:40] halfak: btw, paws has mwclient, mwxml and mwdb, and dumps at /public/dumps
[14:41:57] yuvipanda, no mwapi?
[18:34:44] halfak: bah, I meant mwapi (not mwclient)
[18:34:56] \o/
[18:42:59] halfak: remind me, your DOI dataset includes accompanying Wikipedia articles, right?
[18:43:35] Yes
[18:43:41] And a date at which the DOI was first added.
[18:43:54] Oooh that's nice.
[18:44:05] BRB lunch
[18:44:12] Also, you can now query ElasticSearch directly. Not just the subset exposed on the website, but the whole thing and its complexity.
[19:06:27] harej, we might be able to use ElasticSearch with references. E.g. we can grab the content that immediately precedes the reference and index that content.
[19:06:44] Seems like that would make for the beginnings of a poor-man's recommender
[20:47:26] OMG I got work done today! See https://meta.wikimedia.org/wiki/Research_talk:Measuring_edit_productivity/Work_log/2015-12-02
[20:47:58] It looks like registered editor productivity in English Wikipedia hasn't really fallen along with "the decline"
[20:48:20] Anyone who wants to volunteer to have me plot your productivity over time, let me know.
[20:48:21] :D
[20:48:51] you can plot mine but it'll probably be flat allll the time :)
[20:49:32] Yeah. me too. Gotta figure out how to measure tool dev productivity ;)
[20:50:15] heh
[20:50:29] Who is the dev of HotCat?
[20:50:35] And AutoWikiBrowser
[20:50:38] Those would be huge!
[20:50:39] not sure, plenty of them (HotCat)
[20:50:42] yeah
[20:53:02] oh uh, I know one of the awb devs but I don't know his username
[20:53:04] meh
[20:54:41] Reedy!
[20:55:34] I think my productivity graph would look very sad
[21:53:08] halfak: What's up?
[21:53:59] o/ Reedy
[21:54:18] What do you want?
[21:54:19] :P
[21:54:31] I'm technically the lead dev of AWB
[21:55:23] apergos: You thinking of Magioladitis?
[21:55:23] Presuming due to both being in Greece
[21:55:38] yes I am thinking of him indeed
[21:55:49] halfak: FYI, we have https://tools.wmflabs.org/awb/stats/
[21:56:07] The PHP code and SQL queries are bad, at best
[21:56:10] so that means I know two of the devs :-P
[21:56:12] db tables are interesting, as is the indexing
[21:56:16] apergos: You know 3...
[21:56:26] MaxSem was pretty active too
[21:56:27] who's the other one?
[21:56:29] But not recently
[21:56:29] oh heh
[21:56:31] nice
[21:57:23] heh
[21:57:46] halfak: AWB has made hundreds of millions of edits to WMF projects :)
[21:57:50] Reedy, "What do you want?" not sure.
[21:57:52] Oh!
[21:58:05] I was talking about measuring value-added by tool developers with Yuvi
[21:58:06] :0
[21:58:14] yep it's a really popular tool
[21:58:17] Some archived static stats are at https://tools.wmflabs.org/awb/stats/archive/May2009.html to
[21:58:28] So, I've been working on this project for measuring value added in Wikipedia.
[21:58:30] where's that linux port eh :-P
[21:58:39] See https://meta.wikimedia.org/wiki/Research:Measuring_value-added
[21:58:48] apergos: I wish I had the time/investment to be able to do one
[21:58:52] So, right now, our measure of productivity only counts for contributions to articles that stick.
[21:59:01] But that cuts out a lot of value added by other types of wiki work.
[21:59:02] We try to support/work around for Mono/WINE where possible
[21:59:17] I flagged tool devs as one type of work where it will be difficult to measure value.
[21:59:24] And AWB has likely been VERY valuable.
[22:00:01] Reedy: understood, I just get less and less enthusiastic about running anything under wine as the years go by
[22:02:49] heh, yeah
[22:03:22] I'd love to know "what to do", and have someone cover my costs for a few months to write a multiplatform version
[22:03:24] Like Huggle did
[22:03:49] IEG?
[22:03:57] Oh go online with it.
[22:04:01] Maybe
[22:04:12] You should be able to do it all in javascript, right?
[22:04:53] Potentially, in theory
[22:04:57] I'm not a JS developer :)
[22:05:18] Ahh. I see. Might be able to find some collaborators. JS is hot pants these days.
[22:05:19] It probably could be
[22:05:48] I think Addshore was working on a version that did mass editing using the MW job queue etc
[22:05:49] https://github.com/reedy/AutoWikiBrowserUsageStats/blob/master/includes/database.sql
[22:06:09] halfak: ^
[22:06:24] Interesting -- and kinda scary.
[22:06:24] We store session "stats" and such
[22:06:30] We're very open about it
[22:06:35] And you can opt out
[22:06:42] makes sense.
[22:06:52] We didn't know how well used AWB was till we did it
[22:06:57] Even more so, outside WMF projects
[22:07:12] That's on my long list of stuff to improve etc
[22:08:01] I can give you access to the awb project on labs if you want
[22:08:09] And/or make you a db dump if you wanted to do some digging
[22:09:24] Oh man. I could tell you a lot about AWB. We catch it everywhere and need to account for it in analysis work.
[22:10:00] Reedy, the most important thing for me right now for AWB are some high level thoughts on how we think about the impact that the tool dev has on Wikipedia.
[22:10:19] I've been thinking about templates in a way that might apply.
[22:10:30] Magioladitis is one of the highest editors by count on enwiki
[22:10:34] Most of that is AWB
[22:10:49] Seems like the devs of AWB should get a little bit of credit for each one of those edits.
[22:11:01] It would be great if we could quantify the time saved using AWB.
[22:11:18] I bet it would be MASSIVE.
[22:11:37] In fact, I have some evidence that Wikipedians are getting more and more efficient over time.
[22:11:59] The labor hours invested in Wikipedia editing are going down, but the overall productivity is holding stable.
[22:12:19] ^ At least as far as article contributions are concerned.
[22:22:29] We're just enablers ;)
[22:25:02] * yuvipanda buys drugs from Reedy
[22:27:18] Ha. the WMF is a big bunch of enablers under that framing ;)
[22:27:52] I like the word "empower"
[22:30:24] yuvipanda: You mean sugar?
[22:30:56] I've surprisingly taken off a lot of sugar from my diet
[22:31:00] most of it comes from grapes now
[22:38:55] [17:05:43] I think Addshore was working on a version that did mass editing using the MW job queue etc <- https://www.mediawiki.org/wiki/Extension:MassAction
[22:39:54] It would be cool if we could encode some intensive operations that way.
[22:40:20] E.g. renaming a category.
[22:40:36] I suppose wikised would be cool too.
[22:40:47] (after unix `sed`)
[22:44:54] wiki sed whattttt?
[22:50:31] "WHAT DO YOU MEAN WIKISED DOESN'T SUPPORT SED SYNTAX!?"
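
(Editor's sketch of the "wikised" idea floated above. Nothing like this is described in the log: the function names `parse_sed_substitution` and `wikised` are hypothetical, and this is not how Extension:MassAction works. It only shows what applying a sed-style `s/pattern/replacement/flags` expression to a page's wikitext might look like. It assumes Python `re` syntax rather than real sed syntax, supports only the `g` and `i` flags, and treats the whole text as one unit instead of going line by line as sed does.)

```python
import re


def parse_sed_substitution(expr):
    """Split a sed-style 's<d>pattern<d>replacement<d>flags' expression.

    Hypothetical helper for illustration; only the 's' command is handled.
    """
    if len(expr) < 2 or expr[0] != "s":
        raise ValueError("only the 's' command is supported in this sketch")
    delim = expr[1]
    # Split on delimiters that are not preceded by a backslash.
    parts = re.split(r"(?<!\\)" + re.escape(delim), expr[2:])
    if len(parts) != 3:
        raise ValueError("expected s<d>pattern<d>replacement<d>flags")
    return parts[0], parts[1], parts[2]


def wikised(expr, wikitext):
    """Apply one sed-style substitution to a string of wikitext.

    Assumption: the pattern uses Python regex syntax, not POSIX sed syntax.
    """
    pattern, replacement, flags = parse_sed_substitution(expr)
    re_flags = re.IGNORECASE if "i" in flags else 0
    # Without 'g', replace only the first match in the whole text.
    count = 0 if "g" in flags else 1
    return re.sub(pattern, replacement, wikitext, count=count, flags=re_flags)


if __name__ == "__main__":
    page = "[[Category:Foo]] text [[Category:Foo|sort key]]"
    # A category rename, as in the example mentioned in the log.
    print(wikised("s/Category:Foo/Category:Bar/g", page))
```

(A real tool would also need to fetch and save pages and queue the work, which this sketch leaves out entirely.)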