[00:01:15] fhocutt: great email!
[00:01:27] (I'd reply on-list but I'm not meant to be reading my email)
[00:01:32] * fhocutt grins
[00:01:35] thanks, Ironholds.
[00:02:17] Also, on point #2: my favourite paper ever has a similar message, and I keep wanting to find some way to make everyone read it.
[00:02:41] link? :)
[00:03:11] https://services.brics.dk/java/courseadmin/TQMHCI/documents/getDocument/%28bdz1%29Bardzell-feministHCI.pdf?d=101833
[00:03:28] oh agh, the formatting got messed up
[00:03:29] oh well
[00:03:49] ooo, that looks interesting just from the title
[00:03:52] saving this
[00:04:05] It's about trying to apply feminist theory to human-computer interaction methodologies, which is awesome but secondary to why I love it, which is: it rebuts the "but we have no maps for these territories! we don't know what to do! it's HARD" argument by saying "look, you think this is only a HCI problem? Anthropology and ethnography have these issues. They came up with these methodological approaches. Use them and stop saying you have to think of your own"
[00:04:48] a lot of the junk that happens when we say "we need to turn this into a better environment for good-faith users" is "but HARD, we're special snowflakes!". Let's do a Shaowen Bardzell and see how other communities/groups deal with this. It's not a new problem.
[00:05:01] anyway, I should head off because I'm being quite-rightly scolded ;p. See y'all on Thursday!
[00:05:06] o/
[00:05:28] which mailing list is this?
[14:06:43] Hey hey science people. o/
[17:01:47] * Nettrom has a new Random Forest classifier, yay
[17:01:55] now to go make one in Python as well
[17:05:23] o/ Nettrom
[17:05:50] Could you share your labeled observations?
[17:05:55] was thinking about it this morning, in the WikiSym'13 paper we report ~40% accuracy, now it's slightly over 60%
[17:06:15] yeah, I want to get the dataset up as well as the new model that goes with it
[17:07:00] :) I'll need that dataset when I start experimenting with revscoring.
[17:07:45] I'll need to use the revscoring feature extractor so just rev_id, class pairs would be great.
[17:08:15] that's even easier, as that dataset is already on Tool Labs :)
[17:08:32] just need to extract two columns from it
[17:08:46] Cool. Where do I find it?
[17:09:10] morning folks
[17:10:08] Hey DarTar
[17:10:28] halfak: /data/project/suggestbot/projects/improved-qualpred/datasets/cleaned-assessment-sample.tsv
[17:10:43] halfak: not 100% sure it's readable outside of the suggestbot tool account, though
[17:10:48] hey DarTar
[17:10:54] howdy
[17:11:22] * Nettrom takes 5
[17:34:23] hey halfak, do you have 10 mins to chat before standup?
[17:36:16] Sadly no.
[17:36:50] halfak: kk np
[17:38:23] in meeting
[17:44:15] j/k DarTar. meeting done early. Can chat.
[17:44:38] cool, hangout?
[18:03:33] running behind, Elder is performing a firmware 8-|
[18:04:06] laptop!
[18:04:20] laptop always runs out of battery, no?
[18:05:05] DarTar, ^
[18:05:22] :P
[18:05:33] no, it’s actually the mac mini in the room
[18:05:37] for a change
[18:05:39] getting there
[18:33:30] Deskana: swing by when you have a sec, I want to show you something
[19:17:57] p
[19:18:00] oops, sorry
[19:23:01] DarTar: does each Wikimania submission have to have 1 author?
[19:23:48] leila: no, they are typically co-authored
[19:23:57] most of my presentations were
[19:23:59] got it. thanks, DarTar.
[19:24:14] I see, I guess by accident the few I checked today were all single-authored, DarTar.
[19:24:17] thanks!
[20:48:32] halfak: do you know which Wikimania track is reasonable for research submissions?
[20:48:49] I don't think that research submissions get a special track.
[20:48:56] I've split mind between technical and community.
[20:48:59] *mine
[20:49:01] exactly!
[20:49:14] okay, I'll go with technology, but that's really not the one. ;-)
[20:49:49] I think it's really about the audience you want.
[20:50:02] For revscoring stuff, I want a technical audience.
[20:50:16] For social science stuff, I want a general audience.
[20:54:16] got it. thanks, halfak.
[20:54:38] :)
[21:06:32] halfak / leila: seen ellery?
[21:06:47] I've been trying to ask him about the hackathon
[21:06:49] milimetric: he was in the standup today
[21:07:02] milimetric, we need word from Ironholds as well.
[21:07:17] yes, but I've gotta send out the travel email so they can start the ball rolling
[21:07:24] I put Oliver with a ? :)
[21:07:32] Did we decide on dates yet?
[21:07:36] and I was just wondering if to add Ellery with ? or not at all
[21:07:40] yes
[21:07:43] March 2nd to 6th
[21:09:05] Cool.
[22:40:54] halfak: do you have 5-10 minutes for a Hangout chat?
[22:48:58] leila -- in 30 minutes?
[22:49:07] sure, halfak
[22:53:13] halfak: do we have diff dumps?
[22:53:24] as opposed to revision dumps, I mean
[22:53:47] nope. I'm working to change that though :)
[22:54:01] yeah, that's what I thought.
[22:54:26] is this something you're generating now? like if I want to look at the past 30 days and find all the diffs with timestamp in enwiki, can I do this?
[23:06:30] leila, regretfully not. This is the point of the process I've been trying to run in hadoop. :(
[23:07:12] no worries. just thought to check before spending time on dumps, halfak.
[23:07:42] Some day, I'm going to just give you a query that will download whatever slice of diffs you want.
[23:07:45] some day...
[23:07:56] anyway, I just finished the thing I needed done. Do you want to get on a call quick?
[23:08:15] sure.
[23:08:22] Call when ready
[23:20:45] leila, BTW, the documentation at the top of that script is a lie.
[23:20:49] * halfak feels shame
[23:21:03] It *does not* extract PubMed ids
[23:21:10] haha! I'll make sure I don't pay attention to anything but you just explained
[23:21:11] :D
[23:23:31] leila, BTW, see the docs on mw.xml_dump here: https://pythonhosted.org/mediawiki-utilities/core/xml_dump.html#mw-xml-dump
[23:26:20] ow cool! thanks!
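
For the 17:07-17:08 exchange about pulling just rev_id, class pairs out of the Tool Labs TSV, a minimal sketch of that two-column extraction might look like the following. The log does not show the file's layout, so the header row and the column names `rev_id` and `class` are assumptions, the helper names are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def extract_pairs(source, id_column="rev_id", class_column="class"):
    """Yield (rev_id, class) pairs from a tab-separated file object.

    Assumes the file has a header row naming its columns; the default
    column names are guesses, not taken from the real dataset.
    """
    reader = csv.DictReader(source, delimiter="\t")
    for row in reader:
        yield int(row[id_column]), row[class_column]


def write_pairs(pairs, target):
    """Write (rev_id, class) pairs as two tab-separated columns."""
    writer = csv.writer(target, delimiter="\t", lineterminator="\n")
    for rev_id, label in pairs:
        writer.writerow([rev_id, label])


# Made-up sample standing in for the real dataset.
sample = "rev_id\ttitle\tclass\n101\tFoo\tStub\n202\tBar\tFA\n"
pairs = list(extract_pairs(io.StringIO(sample)))

output = io.StringIO()
write_pairs(pairs, output)
print(output.getvalue(), end="")
```

With a real file, `io.StringIO(sample)` would be replaced by `open(path, newline="")`, and the column names adjusted to whatever the header actually contains.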