[00:00:31] I'll need to query enwiki_p
[00:00:36] What host should I use?
[00:01:07] halfak: enwiki.labsdb
[00:01:16] that's just a convention, they have all the dbs
[00:01:25] Great.
[00:01:37] And I should be able to make my user DB on that server?
[00:01:42] yes
[00:01:58] in the format of
[00:02:06] __
[00:02:11] halfak: note the two underscores and not one
[00:02:20] COol. Worked great
[00:02:37] awesome
[00:02:49] halfak: my hot water just came back so I might take a shower in like 10minutes
[00:03:08] Sure. no prob. I'm going to be moving files soon anyway.
[00:03:21] That'll take some doing. I think i have enough to hack with :)
[00:04:28] halfak: :D so far I've: 1. R magic, 2. SQL magic, 3. .sql files editing, 4. versioning
[00:04:31] anything else?
[00:05:05] Nothings coming to mind
[00:05:43] Would be great to use keyboard shortcuts to copy-paste, I guess.
[00:05:50] but thaat
[00:05:52] 's OK
[00:06:35] halfak: in terminal or?
[00:07:44] Yeah.. in terminal.
[00:07:53] Seems to work fine in notebook
[00:07:55] Sorry
[00:08:01] halfak: yeah, that's fixed in next version.
[00:08:52] woot
[00:19:26] J-Mo, looks like we have about half as many obs. as last time (for teahouse study). Does that sound right?
[00:19:57] yep.
[00:20:04] cool.
[00:20:05] but about the same number of controls
[00:20:15] That'll make all the difference.
[00:20:26] * J-Mo crosses fingers and toes
[00:20:30] Also, I'm going to lump both groups together with the experiment # as a predictor.
[00:20:42] I'm calling the last experiment th2 and this one th3
[00:20:50] Since th1 was from your original pub
[00:21:06] ... just in case I slip and say those acronyms at you
[00:21:10] :)
[00:24:15] Oooh! Looks like I can work on ipython notebooks directly in my editor
[00:24:21] <3 jupyter team
[00:25:34] joop-why-ter
[00:30:34] ShiveringPanda, when you get back, it would be great if we could get some colors in the term.
[00:30:58] I just want files, folders and my prompt to be different colors.
[00:31:44] hmm.. I can't get 'ls' to show colors but maybe I can do my own prompt
[00:33:01] lol.
[00:33:07] pasting sometimes forgets line breaks
[00:35:03] Yeah!
[00:35:06] Changed my prompt
[00:35:07] wooo
[00:35:17] muuuuch better
[00:37:10] halfak: yeah, it respects .bashrc
[00:37:15] we can add our defaults too yeah
[00:37:47] Looks like I was able to get colors enabled for 'ls' too
[00:37:49] :)
[00:37:59] nice
[00:39:51] yay hot showers
[00:46:22] So... the labsdbs are missing some indexes that we have on the analytics dbs
[00:46:38] which ones?
[00:46:53] e.g. an index that lets you filter 'archive_userindex' by user_id & timestamp
[00:47:07] it exists on revision_userindex, but it's not on archive because it doesn't exist in prod.
[00:47:19] I see.
[00:47:20] rather it doesn't exist on archive in prod.
[00:47:32] right, and is an index that was added for analytics-store
[00:47:38] it TOTALLY MAKES SENSE to have, but I couldnt' convince springle.
[00:47:45] Yeah
[00:47:54] heh, maybe we can make it happen for labsdb. can you file a bug?
[00:48:01] we have another DBA starting in a few months
[00:48:07] Maybe I can find the old bug.
[00:48:14] I hope we were in phab at the time
[00:48:15] halfak: sorry, didn't notice your reply. no, i don't mind moving the page to meta, but please leave a soft redirect at the original (i managed to link it in a few places already)
[00:48:24] Will do
[00:54:38] halfak: does the lack of that index prevent you from doing what you wanted to do?
[00:56:19] YuviPanda, gathering user metrics from the archive table quickly.
[00:56:36] E.g. a user who edits a page that eventually gets deleted technically was still editing.
[00:56:47] So I need to include it in my queries.
[00:59:09] * YuviPanda nods
[00:59:18] I hope we can convince jynus to put that index on labsdb
[00:59:26] but not holding my breath for this month at least :(
[01:01:54] Yeah. We'll see if it finishes in a reasonable time.
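The exchange above is about counting a user's edits in both live revisions (revision_userindex) and deleted ones (archive_userindex), filtered by user and timestamp. A minimal sketch of how such a query might be assembled is below. The log does not show the actual query, so everything here is an assumption: the helper name `build_edit_count_query` is hypothetical, and the column names (`rev_user`, `rev_timestamp`, `ar_user`, `ar_timestamp`) are taken from the MediaWiki schema of that era rather than from the log.

```python
def build_edit_count_query(user_ids, start, end):
    """Build SQL that counts edits per user across live and deleted revisions.

    Hypothetical helper. Column names are assumed from the 2015-era
    MediaWiki schema and are not confirmed by the log.
    """
    if not user_ids:
        raise ValueError("user_ids must not be empty")
    for ts in (start, end):
        # MediaWiki stores timestamps as 14-digit strings (YYYYMMDDHHMMSS).
        if not (isinstance(ts, str) and ts.isdigit() and len(ts) == 14):
            raise ValueError("timestamps must be 14-digit strings")

    # Coerce to int so only numeric ids are ever interpolated into the SQL.
    id_list = ", ".join(str(int(u)) for u in user_ids)

    return f"""
SELECT user_id, SUM(edits) AS edits
FROM (
  -- Edits to pages that still exist.
  SELECT rev_user AS user_id, COUNT(*) AS edits
  FROM revision_userindex
  WHERE rev_user IN ({id_list})
    AND rev_timestamp BETWEEN '{start}' AND '{end}'
  GROUP BY rev_user
  UNION ALL
  -- Edits to pages that were later deleted.
  SELECT ar_user AS user_id, COUNT(*) AS edits
  FROM archive_userindex
  WHERE ar_user IN ({id_list})
    AND ar_timestamp BETWEEN '{start}' AND '{end}'
  GROUP BY ar_user
) AS combined
GROUP BY user_id
""".strip()


query = build_edit_count_query([1, 2], "20150101000000", "20151215000000")
```

The second half of the UNION is the part that the missing index would slow down: without an index on the archive table covering user and timestamp, that branch has to scan far more rows than the revision branch does.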
[01:02:02] Only getting stats for ~7000 users
[01:03:03] ok!
[01:04:46] * halfak waits for it to run.
[01:05:02] If I close a tab with a term, will it stay alive?
[01:05:16] Aha! It does!
[01:05:20] * halfak lives dangerously
[01:05:29] Will come back to this later
[01:05:35] Thanks for your help YuviPanda
[01:05:46] :D
[01:05:48] ok
[01:05:50] I ended up doing mostly prep work today, but I'll come back and actually do some analysis tomorrow.
[01:06:07] halfak: ok!
[01:06:18] I'm thinking of best way to enable sharing and publishing of notebooks
[01:07:38] halfak: found out that one of the biggest contributors to this ecosystem works at CERN :D
[01:08:36] You know, it might be nice to look over someone's shoulder as they use ipython.
[01:08:45] I bet I'd like some good behaviors within minutes.
[01:09:27] halfak: yeah, ellery is your best bet I think
[01:09:34] since he does a *lot* of his work with ipython
[01:10:42] My notes from today. Nothing fancy.
[01:10:43] http://halfak-jupy.wmflabs.org/notebooks/projects/thr/ipython/2015-12-15.ipynb
[01:11:17] awesome
[01:11:23] I can see it only because I've the password though
[01:11:28] no sharing setup yet I think?
[01:11:31] or maybe there is
[01:11:43] no there isn't
[15:53:31] o/
[16:06:41] \o
[16:20:35] Señor halfak.
[16:33:00] :D
[16:57:17] FYI: had some ORES downtime this morning. https://wikitech.wikimedia.org/wiki/Incident_documentation/20151216-ores
[19:48:53] halfak
[19:49:03] o/ diligent
[19:49:27] Hey!
[19:49:41] hello
[19:50:11] We were going to discuss projects, right?
[19:50:18] yes
[19:50:45] finally, got you
[19:51:01] :)
[19:51:10] * halfak is reviewing our email thread.
[19:51:40] Ahh yes. Interests and skills. What kind of things are you interested in working on?
[19:53:30] interests are evaluating personality attributes, relating wikipedia with collective intelligence, social network analysis
[19:54:39] by skills, you mean to know about... language, tools?
[19:55:02] Yeah.
Do you like to program systems, do stats, interviews & ethnography?
[19:55:20] or any other ideas...
[19:56:38] like stats,...interviews and developing models from interview results ,
[19:58:11] mostly stats...and system development can also be done but...confused, whether could be done by me or not as have less experience in that but want to do...
[19:59:26] Gotcha.
[19:59:35] How far are you into your PhD program?
[20:00:39] i have completed my course work, and now have to select some phd level project...have to find problem,...
[20:01:56] i have been studying work done on wikipedia....
[20:02:19] Gotcha. Any past work I can look at to get a sense for the direction you are going?
[20:02:35] I'm trying to avoid just dumping a bucket of potential projects at you :)
[20:03:06] and think ...whatever problem comes to my mind, have already been discussed....
[20:03:56] diligent, IMO, no research project is ever done.
[20:04:25] If you ever find a study that is truly terminal (no more to learn), then study how to apply the learnings :)
[20:04:26] yes, in phd no work done yet..
[20:05:32] OK. Well, we have a few active projects now that I think are really interesting. Harej is working to build up infrastructure around WikiProjects. I think that's likely to have substantial impacts on the efficiency of content production and newcomer retention.
[20:05:46] no...things are not terminal...i just meant to say, i found little similar projects
[20:05:55] Leila and ellery are working on strategies for recommending articles based on editing/reading behavior.
[20:06:28] I'm working on building basic infrastructure for quality control, newcomer routing (based on quality of work) and curation.
[20:06:50] The last example involves a lot of machine learning of behavioral studies.
[20:07:18] If you want to just get in on the behavioral studies, theres another active project that I'm sort-of advising.
[20:07:21] * halfak gets link
[20:07:47] https://meta.wikimedia.org/wiki/Grants:IEG/Editor_Behaviour_Analysis
[20:08:15] Essentially, they are taking advanced measures of Wiki behavior and re-applying them more broadly than the original studies did.
[20:08:47] On a related note, I'm doing a focused study into measuring "value-adding" activities in Wikipedia. See https://meta.wikimedia.org/wiki/Research:Measuring_value-added
[20:09:51] okay...so if i work on any of these,...can this be my phd work...?
[20:09:53] There's yet another project where we are exploring strategies for importing scholarly paper metadata into Wikidata. See http://librarybase.wmflabs.org/wiki/Librarybase:Home and https://phabricator.wikimedia.org/T120115
[20:10:29] So, academia and wiki's are a little weird when mixed. If you work with me on a project, I'll insist on open documentation while we work.
[20:10:44] This is unusual in academic contexts, so it is generally frowned upon.
[20:10:59] But IME, no one ever gets scooped because they posted progress reports on a wiki.
[20:11:58] yes, i guess, there are no issues on open documentation...
[20:13:05] but i was asking,if i work on these projects, can these serve as my phd work?
[20:13:40] as these will be in collaboration
[20:14:40] hello...
[20:14:48] diligent, that's a question for your advisor. FWIW, most of the work that went into my dissertation was done by me *while* I was collaborating with others.
[20:15:29] The nature of these sort of projects (in my experience) is that there is always plenty of work to go around and that there's always someone who does the vast majority of the work.
[20:16:53] if we publish some work, will i be able to write my institution name , as my affiliation?
[20:17:02] Yup
[20:17:15] depends on your are, from what I have been told
[20:17:23] late post
[20:17:57] diligent, here's an example of a paper that I did while in grad school with WMF staffers: http://nitens.org/docs/cscw13.pdf
[20:18:50] i guess, if we can write institutes, then supervisor may agree...
[20:19:20] i have seen many people working in this way...
[20:20:52] the people who are working on these projects, will agree for all this? anddd can i work on multiple projects at the same time??
[20:22:55] It's called "open collaboration" for a reason :)
[20:23:06] But it does depend on the individual and the arrangement you work out.
[20:23:27] Generally though, I'd assume that anyone would be quite happy to have the support.
[20:24:19] :)
[20:24:35] okay
[20:24:55] diligent: what are you doing your PhD work in?
[20:25:16] and can work on multiple projects?
[20:26:32] guerillero: want to work on wikipedia, was here to have some project ideas...
[20:27:33] No one will fault you for working on multiple projects.
[20:27:37] diligent: yes but through what field are you getting there. Anthro, CS, Soc, Poli Sci, HCI?
[20:27:42] In fact, I'd recommend starting broadly and seeing what works out.
[20:27:54] halfak is really very nice ,....giving me suggestions...
[20:28:48] Gurillero: through CS
[20:30:45] I imagine Guerillero|BNC has some project ideas too :)
[20:31:13] Halfak: yes, i must look upon,... i look into projects you told.
[20:32:07] Yes if Guerillero has, must be telling
[20:33:03] but I am an ethnography type
[20:34:17] And 1 more thing, suppose if i want to work upon some project with you, how we will be collaborating? And division of work?
[20:34:53] Ooh! Another bit of 'halfak' advice is to find someone with a completely different methodological skill set to collaborate with :)
[20:35:15] diligent, A good way to coordinate is via this channel.
[20:35:26] As far as dividing work, whoever gets to it first.
[20:35:43] If you lead a project, I'll be much too busy to do any of your work before you get to it.
[20:35:52] What kind of projects ethnography involve?
[20:36:32] Becoming a community member along with your research participants.
[20:36:34] E.g. http://community.hciresearch.org/sites/community.hciresearch.org/files/Bryant05-BecomingWikipedian.pdf
[20:36:45] Although, that's more of an interview study.
[20:36:50] Yes collaborating with people from different field can also result in some good work, i agree with you halfak
[20:38:22] So the questions I am interested in are along the lines of power structures, how automated data can be used as part of ethnography, how do people use tools to edit, how the tools we use effect our culture
[20:38:56] I will point out that I am not a phd http://guerillero.net/about-me/
[20:39:16] halfak: All of Forte's articles are great
[20:39:50] Guerillero|BNC, I have something you might be interested in.
[20:39:58] http://socio-technologist.blogspot.com/2015/12/disparate-impact-of-damage-detection-on.html
[20:40:10] I've been working night and day to try to rectify this problem.
[20:40:32] very interesting
[20:40:42] It was a good excuse to fix our infra and I think it'll inspire a new set of strategies for testing our models.
[20:41:35] * Guerillero|BNC nods
[20:41:58] diligent: also anything that has to do with maps
[20:42:35] Maps?
[20:43:01] means people from.different regions ?
[20:43:05] http://guerillero.net/WikipediaMap
[20:44:16] Your stream of work also seems to be interesting....
[20:47:01] Halfak
[20:47:19] Are you there...?
[20:48:11] I went away from this window and connection is lost...
[20:48:25] Halfak
[20:49:19] Hellooo
[20:49:56] Any body there
[20:50:00] yes
[20:50:38] I am chatting here for first time,...
[20:51:10] Why connection is lost if i go away from this window
[20:51:48] And i need my chat history...can i get that?
[20:52:00] I have the logs
[20:52:13] Guerillero: whts ur email id?
[20:52:27] tfish@guerillero.net
[20:52:51] Can you please send me log of this previous chat?
[20:53:23] Halfak
[20:54:11] Guerillero: can you please send me log?
[20:55:03] send me an email and I will get it to you
[20:55:14] My email id is diligent13@gmail.com
[20:56:00] sent
[20:56:18] Thanks :)
[20:56:39] But why this is lost...from my window..
[20:59:01] Sorry. Was afk for a few minutes.
[20:59:22] diligent_, the logs of this channel are available at http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-research
[20:59:22] The world melted while you were gone.
[20:59:28] :(
[20:59:44] Or maybe it now tastes like a better cinnamon roll?
[21:01:32] :) got it
[21:02:32] Ok, let me have a look on these ideas and then i will be back ..
[21:05:24] Halfak: will be back...
[21:08:39] Sounds good. Talk to you soon diligent_
[21:08:44] :)