[00:07:45] o/ srijan
[00:08:10] Yeah. That's a mega-slow query.
[00:08:27] You can see why I commented it out. I didn't want my makefile accidentally overwriting an old run of the SQL!
[00:08:44] I can get you this data with the date restriction if that would be helpful.
[00:15:20] hey halfak!
[00:15:33] yeah that would be useful :)
[00:15:43] OK. Gimme a few minutes.
[00:15:44] I will figure out what to do with the remaining time duration
[00:16:38] sure
[00:16:44] thanks!
[00:18:19] Data is extracting from the DB
[00:20:19] great :)
[00:21:29] Must be a big table. It's taking a long time.
[00:21:47] I'll keep monitoring and ping when it is done.
[00:21:59] okay cool
[00:22:03] thanks a ton
[00:29:38] srijan, 2.8 GB! I'll compress and get it on one of our http servers
[00:32:19] woah that's huge!
[00:49:24] Down to 811 MB
[00:49:27] Transferring
[00:49:34] okay :)
[00:50:20] At some time in the next 30 minutes you should see a file appear at this URL:
[00:50:28] http://datasets.wikimedia.org/public-datasets/enwiki/etc/nov13_page.tsv.bz2
[00:50:48] Also note that if you go up a dir, you can get some of the other datasets I found lying around from that study.
[00:51:15] e.g. "creation", "move", "page_origin", and "user_stats"
[00:52:46] okay cool!
[00:54:27] o/ notconfusing
[00:54:42] halfak, yes?
[00:55:22] Just saying hi. :)
[00:57:03] halfak, hi
[00:57:37] working on processing gender ratios from wikidata
[00:58:00] before i get to minneapolis and have to focus on nonwikimedia things
[00:58:17] :) maybe some wikimedia things.
[00:58:30] Gotta keep your hobbies while the course work beats up your mind.
[01:00:42] is that how it is?
[01:01:30] i wrote a programmers guide to "Nonviolent communication" (the conflict management theory). That was kind of fun
[01:01:41] Hobbies are really important to remember the spice of life
[01:03:25] Indeed. The problem is when your hobbies become your work.
[01:03:28] Or maybe that's good.
[01:03:33] I haven't decided yet.
[01:10:40] notconfusing: link?
[01:11:03] http://notconfusing.com/universal-empathy-machine/
[01:11:11] YuviPanda, ^
[01:11:37] halfak: madhuvishy as an FYI I will be on vacation next week
[01:11:47] my hobby became work and now i have no hobbies
[01:12:12] hare: that's two of us
[01:13:11] halfak, im not sure either
[01:16:13] YuviPanda, cool. Going anywhere fun?
[01:16:48] Yeah. I used to have hobbies other than building wiki tools, but then I needed to finish grad school, so now I build wiki tools all the time.
[01:16:54] I guess I do some science too. :)
[01:27:24] halfak: chaos computer club camp in Germany then Rome then Berlin
[01:28:01] Chaos Computer Club Camp!? Do you guys do fun things with cellular automata?
[01:28:42] halfak: hah :)
[01:29:25] So, no chaotic systems stuff or WAY COOLER chaotic systems stuff?
[01:33:55] halfak: bit of both I guess?
[01:34:07] halfak: they built their own gsm tower at some point
[01:35:08] Ha. I wouldn't want to be nearby with my phone when they turned it on.
[01:35:23] Then again, I bet those things take a lot of energy.
[01:35:29] :)
[01:35:33] It's a fun and old group
[01:36:13] https://en.wikipedia.org/wiki/Chaos_Communication_Camp?
[01:36:55] halfak: yes
[01:39:59] halfak, thanks for the file :)
[01:41:57] srijan, happy to help!
[14:45:32] Good morning sciency people.
[15:11:19] Good morning halfak & everyone.
[17:01:08] o/ bearloga
[17:01:14] running a little late. there in 1 minute
[17:03:28] halfak: np
[17:23:08] leila: also I ran into bob west at the airport yesterday night :D
[17:23:34] mmm, ooooo!
[17:23:45] why were /you/ in the airport?
[17:25:08] leila: I was coming back from Chicago!
[17:25:33] leila: I took the train to Chicago
[17:26:49] mmm, I thought you were coming back with train YuviPanda.
[17:26:57] leila: no that'd be crazy :)
[17:27:16] haha!
[17:27:30] ACRip_
[18:51:02] Quarry: Large queries results not displaying (Aug 2015) - https://phabricator.wikimedia.org/T108084#1511927 (Dispenser) NEW
[19:26:00] YuviPanda, what did you see in Chicago?
[19:26:28] Ironholds: the seafront, the art institute, skydeck, and a super fun thunderstorm from the lake!
[19:26:30] err
[19:26:32] lakefront
[19:26:35] nice!
[19:26:46] the museum district is SO COOL. I might be out there in October
[19:27:20] nice
[19:28:20] Ironholds, me too! What are you going to be out there for?
[19:28:32] halfak, SO's lunatic cousin's wedding, first weekend of the month
[19:28:59] and then I'm hopefully going to DC for the conference because I am *puts on snooty eyeglass* on the Programme Committee, so I get to hang out with Aaron/Keilana/et al for two weekends in a row
[19:29:00] me too! maybe we have the same cousin!
[19:29:09] naw, I've met halfak's cousins
[19:29:17] they're hella-cool people but not tall enough to be part of this family
[19:29:22] I might go to CXCW?
[19:29:26] ooh
[19:29:35] I've never been to a Proper Conference other than useR honestly
[19:30:00] heh
[19:30:02] I'll be doing an "invited" talk at Northwestern -- assuming all the logistics work out.
[19:30:05] NICE
[19:30:14] I might do wikiconference usa too
[19:30:23] halfak: I was thinking of proposing a quarry session at wikiconference usa
[19:30:25] "invited" because I suggest it myself. They were happy to accommodate though.
[19:30:36] When/where is wikiconference?
[19:30:39] DC
[19:30:46] second weekend
[19:31:01] halfak: btw, the stat -> labs pipeline is 99% complete, just waiting on some network ACL requests now
[19:31:28] Cool!
[19:31:35] YuviPanda, I mean, the programme committee would welcome the submission :D
[19:31:43] * Ironholds puts on PC hat
[19:31:48] halfak, which weekend?
[19:31:50] or week?
[19:36:33] Ironholds, hasn't been nailed down yet
[19:36:55] lemme know if it syncs up!
And you should come to WCUSA if you can; the programme is going to be ~~awesome~~
[20:06:23] hmn. Where's dartar?
[20:18:38] not at his desk Ironholds.
[20:19:29] thanks leila :). Can I ask you for a favour?
[20:19:41] let's see. :D
[20:20:12] So, I've asked my friend Hilary to do a tech talk on integrating data into product work; she's a lead data person at Etsy (and previously a biostatistician).
[20:20:37] If the timing works out for your calendar would you be okay with leading the in-office questions section? I'm remote and you know far more than me about data /anyhoo/ ;p
[20:22:27] does someone usually lead the questions in the office, Ironholds?
[20:22:41] what I've seen is someone taking care of IRC and others just asking questions, Ironholds.
[20:22:45] thinking about it I actually don't know
[20:22:50] nevermind then, we're cool :D
[20:22:53] And I know Rachel used to introduce the speaker and all that.
[20:22:57] :D
[20:23:10] in that case: can you do me the favour of coming to the talk! She's one of the smartest people I know and I expect it will be excellently useful
[20:23:15] I'm happy to help, but I think they have already nailed down the logistics for tech talks. :-)
[20:23:39] I will be there for sure unless there is a conflict or OoO. that's for all the tech talks. :-)
[20:23:57] do you know when it will be?
[20:24:54] Ironholds, ^
[20:25:13] Not quite, we're setting up timing with Rachel now :)
[20:25:17] there will be a wmfall email I think
[20:25:37] ooki. will await the email then
[20:27:05] yay! Oh! When do you want to chat about the data/legal/etc stuff?
[20:27:17] I totally forgot to put a meeting together. Should I just find a free calendar space?
[20:27:36] how do tomorrow after RG meeting sound to you Ironholds?
[20:27:46] does*
[20:28:08] perfect!
[20:28:36] I will put an event together so there is a room for you (and so I do not forget ;p)
[20:29:31] ah! I'll work from home, Ironholds, so no room, but thanks for reserving a time slot.
[20:29:32] :-)
[20:32:18] okee, will cancel room for the common good, or something
[22:00:46] Nemo_bis: how do you feel about archiving the discussions in https://meta.wikimedia.org/wiki/Research_talk:Increasing_article_coverage ?
[22:01:22] I expect new discussions to start in a week or two when we will start testing a tool on Labs and I'd like to have a fresh page, Nemo_bis.
[22:03:31] leila, I recommend against a fresh page.
[22:03:40] why is that halfak?
[22:03:41] I think that people will find it informative to see old discussions.
[22:03:54] there will be link to archive, right?
[22:04:05] Yeah... I'd miss that. ;)
[22:04:15] ah! okay! ;-) so is that cool?
[22:04:37] archiving and having a link to the archive from the current talk page halfak?
[22:04:51] Yup.
[22:05:23] I would not notice the link to the archive and I'd be on the lookout for active topics -- and specifically, responses from the person running the study.
[22:06:03] sounds good. I'll wait for Nemo_bis as the person who always keeps an eye on my talk pages. :-)
[22:54:46] hi halfak...i have another question (sorry)
[22:55:00] No worries. You're in the right place :)
[22:55:11] :D
[22:55:13] do you still have the raw data for the first plot in https://meta.wikimedia.org/wiki/Research:New_editor
[22:57:07] srijan, do you just want the counts of monthly active editors?
[22:58:49] not monthly active editors, but new users who registered
[22:59:26] with n = 1
[22:59:28] Just registered?
[22:59:32] Oh. Sure.
[23:00:18] OK. So I have a pretty good one that I think will make sense for you.
[23:00:29] It has one row per wiki-user-month.
[23:00:48] Users are annotated by whether they meet the "new editor" criteria or not
[23:01:05] So you can set your own threshold
[23:01:21] I'm compressing it right now. It's 2.3 GB
[23:01:35] Oh. And it accounts for all wikis.
[23:01:48] okay great
[23:02:04] i guess there is a field that says "enwiki" to filter that one out?
[23:02:12] Yup
[23:02:17] awesome :)
[23:06:29] srijan, some time in the next 30 minutes, your data will be here: http://datasets.wikimedia.org/public-datasets/enwiki/etc/joined_editor_months.all.tsv.bz2
[23:06:59] awesome
[23:07:05] 'revisions' is a count of all revisions this user has ever made.
[23:07:06] thanks again, halfak! :)
[23:07:19] 'archives' is a count of revisions to deleted pages.
[23:07:20] okay
[23:07:32] If you did a raw count on the revisions table, you would end up with revisions-archives.
[23:07:35] Make sense?
[23:08:08] okay got it
[23:08:24] Oh! And 'attached_method' != "login" means that they count as a "newly registered user"
[23:08:50] When attached_method == "login" that means their account was autocreated for them when they were browsing in between wikis.
[23:09:01] oh okay
[23:09:05] Useful for counting old actives, but not new actives.
[23:09:20] okay
[23:46:21] leila: are you still in the office? :)
[23:46:50] yes, YuviPanda. Do you want to meet in 15 min or so?
[23:48:24] leila: yes I do. I'm just walking to the office.
[23:48:37] ooki. let's meet then, YuviPanda.
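Editor's sketch of the column notes given for joined_editor_months.all.tsv.bz2 in the 23:02-23:09 exchange above: keep one wiki, drop rows whose attached_method is "login" (autocreated accounts), and derive the count a raw query on the revisions table would give (revisions minus archives). Only 'revisions', 'archives', and 'attached_method' are named in the log. The 'wiki' column name, the presence of a header row, and plain integer values in the count fields are assumptions, as is the helper name newly_registered_rows. The file is streamed because it was described as 2.3 GB before compression.

```python
import bz2
import csv


def newly_registered_rows(path, wiki="enwiki"):
    """Yield rows for one wiki whose accounts were not autocreated.

    Assumes a tab-separated file with a header row and a column
    named 'wiki' (hypothetical name; the log only says there is a
    field that holds values like "enwiki").
    """
    # Stream the compressed file line by line instead of loading it all.
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row["wiki"] != wiki:
                continue
            # attached_method == "login" marks accounts autocreated while
            # the user browsed between wikis; skip them.
            if row["attached_method"] == "login":
                continue
            # A raw count on the revisions table excludes revisions to
            # deleted pages, i.e. revisions minus archives.
            # Assumes both fields hold plain integers.
            row["live_revisions"] = int(row["revisions"]) - int(row["archives"])
            yield row
```

Something like `sum(1 for _ in newly_registered_rows("joined_editor_months.all.tsv.bz2"))` would then count the matching wiki-user-month rows. Counting distinct users would need a user identifier column, whose name is not given in the log.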