[14:59:59] Good morning science people
[15:00:04] o/ lzia
[15:00:09] o/ kevinator
[15:00:39] good morning halfak.
[15:06:19] good morning… it’s Scrum time!
[16:23:41] halfak: you are using a lot of space in /home on stat1003
[16:23:48] can you use /srv there instead?
[16:23:55] 644G ./halfak
[16:24:07] Yes. Will move some stuff.
[16:24:12] you are welcome to make a /srv/halfak directory, if you like
[16:24:22] danke
[16:27:22] Hey ottomata, what kind of redundancy do we have for /home and /srv?
[16:28:05] none, unless you want it specifically
[16:28:14] OK. Just checking.
[16:28:39] We might want to talk about something later for valuable datasets that would be painful or impossible to regenerate.
[16:28:57] hdfs dfs -put :)
[16:28:58] :p
[16:29:08] that will get it replicated 3 times across the cluster by default
[16:29:13] or, we can backup to bacula
[16:29:15] if you like
[16:30:33] be back in a bit
[16:32:17] OK. That should clear up 400GB.
[17:07:37] yo
[17:07:58] wait, we have something called bacula?
[17:08:54] for backups?
[17:08:55] that's awesome.
[17:16:05] I can feel today is gonna be a great day.
[17:18:32] :)
[17:20:28] * Ironholds blinks
[17:20:37] leila, I think Dario forgot that people worked last week :D
[17:21:14] Ironholds, why?
[17:21:27] he cancelled our weekly meeting because last week was a short week
[17:21:30] we worked the days off! :D
[17:21:37] I completed /all the open mobile readership cards/
[17:21:58] in this case, you should ping him and keep the meeting
[17:22:16] I did!
[17:25:03] leila, on the other hand, we should probably not complain too hard
[17:25:08] I am glad Dario took some time. He needed it!
[17:26:14] My items can go to the standup report. no complaints for fewer meetings on my end. ;-)
[17:26:50] no. More meetings! I demand the universe allow me to get less stuff done!
[17:50:09] morning DarTar :)
[17:50:19] howdy
[17:50:49] Ironholds: the car battery experiments in Elder failed dramatically
[17:50:58] the guy is dead
[17:51:01] awww
[17:51:17] quick summary, then: please get dan and maryana to give feedback on the session thing and the UUID thing because those are done.
[17:51:22] I don't think I have any other current mobile asks
[17:51:37] I wrote a cryptographic hashing library that's an order of magnitude faster than digest, even when you add a salt
[17:51:38] I saw your note, that’s crazy
[17:51:59] and I got bored and am writing a generalised MaxMind GeoIP library, too.
[17:52:14] hey, let me run up to 6 for an admin thing I’ll forget otherwise and we can chat more
[17:52:16] (once I work out why the C I'm using for sampled log reading is broken, anyway)
[17:52:17] totally!
[18:13:14] hey Ironholds, I want to crunch some wikigrok data this morning before looking into other stuff, how about we chat early in the afternoon?
[18:24:00] DarTar, sure! Email?
[18:24:58] Ironholds: quick hangout would be more effective
[18:34:09] DarTar, I meant email me with timing for the hangout ;p
[18:34:53] Ironholds: any time in the afternoon is good (except 2-3)
[18:42:37] DarTar, kk
[19:28:34] halfak, https://en.wikipedia.org/w/index.php?title=List_of_people_by_Erd%C5%91s_number&diff=636209448&oldid=635431327
[19:28:47] Nice work :)
[19:29:01] I should also add you to a list
[19:29:39] https://en.wikipedia.org/w/index.php?title=Wikipedia%3AWikipedians_by_Erd%C5%91s_number&diff=636209562&oldid=635803614
[19:32:02] Woot! :)
[19:36:19] Ironholds: now you can use the co-authors list to increase the next section http://wikipapers.referata.com/wiki/Aaron_Halfaker
[19:57:45] halfak: ping? is anyone from https://wikitech.wikimedia.org/wiki/New_Project_Request/Wikipedia_Thanks_%26_Love_Research here?
[20:01:48] Hey YuviPanda, I don't think they hang out in IRC.
[20:01:50] What's up?
[20:02:14] halfak: just wondering if they need a new project or can get by with toollabs. ToolLabs means we take care of the underlying infra, while with their own project they have to.
[20:02:43] Gotcha. That's a good question YuviPanda
[20:03:00] I'll see if I can pull 'em in.
[20:03:09] halfak: ok! email would be ok too!
[20:03:25] that project just keeps coming up today :D
[20:03:37] heh
[20:08:51] halfak: so I'll put that on hold until we figure that out.
[20:09:17] OK. I just pinged. I expect we'll get Nate in IRC shortly if he is available.
[20:09:32] halfak: ah, cool!
[20:40:52] okay, I lied, I'm feeling tremendously out of it today. bleh.
[20:40:56] I blame the caffeine I accidentally had.
[20:41:00] * YuviPanda pats Ironholds
[20:41:07] I had 6 shots of espresso yesterday.
[20:41:17] and since I hate the taste of coffee, that included about 6-8 sugars per espresso
[20:41:24] err, per double espresso. 3 doubles.
[20:42:26] I accidentally coffee coffee buzz buzz buzz'd
[20:43:03] ow man, accidental drugging is no good
[20:51:59] Ironholds, exercise works for me when I get a little too much caffeine.
[20:52:17] a healthy distaste for the taste helps me!
[20:52:32] although my drink of choice has always been really, really cold water.
[20:52:59] halfak, not really an option at the moment but will bear in mind :)
[20:53:25] * halfak does pushups next to his desk.
[20:53:44] Also, Jenny filled up my punch bag with weight, so I can go hit it.
[20:53:51] Speaking of which -- BREAK TIEM
[20:55:53] ....aaaand
[20:55:58] I finally worked out the bug in my C++
[20:56:02] * Ironholds headdesks over and over again
[20:56:07] 0-indexing. NOT 1-INDEXING.
[20:57:05] tnegrin, your christmas present arrives at the office today, btw
[20:57:19] you moving back?
[20:57:34] Yeah, I'm writing this from the wheel hub of a '37
[20:57:44] keep your jacket on!
[20:57:52] naw, it's a book
[20:58:06] I was sat there thinking "what do you get the man who has everything?"
and then I realised this was an irrelevant use case because I was buying for you, and focused on that instead.
[21:13:34] I don't think the bag saw me coming. He just stood there and took it.
[21:16:42] hah
[21:27:56] Ironholds, we got covered in the signpost. https://blog.wikimedia.org/2014/12/01/research-newsletter-november-2014/#Wikipedia_user_session_timing_compared_with_other_online_activities
[21:28:47] *halfaker and team*
[21:28:59] see, this is exactly what I thought would happen, dude :D
[21:29:12] This is your Priedhorsky et al! :D
[21:29:58] still, yay, signposting!
[21:31:17] I can't parse that review. But then I'm tired.
[21:33:28] Yeah. The conclusion leaves a lot to be desired.
[21:33:36] But woo hoo for coverage :)
[21:33:43] also, I don't like that summary of the SO argument
[21:34:15] it's nothing to do with expected high quality, it's to do with the interplay between community-assessed quality and hierarchical position.
[21:36:14] Also, a goodness of fit measure would not help us with arguments about validity.
[21:36:30] halfak, Ironholds: Tilman was not particularly happy with the review FWIW
[21:37:24] hey halfak: I’m about to post a thing on the internal list that I’d like to talk to you about when you have a moment
[21:37:42] Sure. I'm free now.
[21:38:15] DarTar, gotcha. Also, again; metrics is /next thursday/?
[21:38:39] halfak: cool, give me a moment to summarize this
[21:39:00] Ironholds: yes, which is why I have been pinging everybody on that card or by mail for a few weeks
[21:39:14] and WMUtils is currently broken because Faidon. Shit. Okay, now I know how I'm spending my time, I guess >.>
[21:39:40] I asked tnegrin to reach out if he needs more data with the understanding that we won’t be able to do much new with a short notice
[21:40:04] he’s driving it
[21:40:17] yeah, but I at least need to get November's pageviews data in
[21:40:17] problem: the thing that's broken is "someone removed a dependency ua-parser needs".
[21:40:36] (on that note, WMUtils is not currently usable. Soz.)
[21:40:48] is Nov data that critical?
[21:41:39] I was asked for it, so I assume so
[21:41:40] it shouldn’t if the narrative is about trends, it should if it’s about having a tool to generate this data quickly at the push of a button :)
[21:41:53] hm check with da boss?
[21:42:00] well, the important thing is WMUtils includes sampled_logs
[21:42:17] currently-installed library includes a /broken/ version of sampled_logs.
[21:42:17] new version fixes this! Cannot currently install new version.
[21:42:22] so, goodbye all data requests.
[22:01:32] Ironholds: some of the requests in the card were not specifically about new or fresh data
[22:02:03] but I’m not prioritizing them since toby is driving
[22:02:47] I need more hadoops
[22:03:01] I need a lotta things :(
[22:03:05] DarTar, yeah, I've seen the new requests
[22:03:09] working on getting the ITU data loaded now.
[22:10:22] Ironholds: last week I made another scary discovery regarding HTTPS traffic, I managed to get redirected from Google over SSL to a SSL-less Wikipedia page
[22:10:37] cool!
[22:11:07] I don’t know if I’ll be able to reproduce this but the whole thing (if true) makes the whole referer analysis a bit unstable
[22:11:16] Ironholds: uh, the pageview definition is no longer being worked on?
[22:12:08] YuviPanda, what gave you that idea?
[22:12:21] Ironholds: uh, wait, some email thread on mobile-tech
[22:12:37] ...please do give me the name of that thread.
[22:12:39] so that I can read it.
[22:12:45] > this got rejected by the daemon. Weird!
[22:12:47] ...
[22:12:49] daemon
[22:12:51] I read that as damon
[22:12:53] Ironholds: in other words, if this behavior is common it will generate tons of referer-less requests
[22:12:58] nevermind me.
[22:13:10] * DarTar waves at YuviPanda
[22:13:15] hi DarTar
[22:13:36] YuviPanda, I'm not seeing what that says.
[22:13:47] DarTar, also cool!
[22:13:47] *where it says pageviews are not being worked on
[22:14:23] Ironholds: no, you said your email was rejected by the mail daemon, I... read that as if what you were referring (PV defs) was rejected by *Damon*, who is new VPE.
[22:14:36] Ironholds: so, it's me being an idiot and nothing more.
[22:15:23] ahh
[22:15:28] don't sweat it, dude
[22:15:43] :)
[22:16:44] oh god, this ITU data is a hellish format
[22:16:49] it's a microsoft access database
[22:16:51] this is not a joke
[22:17:12] * YuviPanda has enjoyed quite some time writing stuff in Access / VBA
[22:18:47] ohrly?
[22:18:52] if I give you a database can you turn it into a tsv?
[22:20:34] Ironholds: not anymore, sadly, but that's more about lack of tools (OS X, etc)
[22:20:45] Ironholds: but if I did have a windows machine, I would find that fairly trivial still, I'd like to think
[22:22:37] http://svitsrv25.epfl.ch/R-doc/library/Hmisc/html/mdb.get.html
[22:22:48] Ironholds, ^
[22:22:53] looks like that might work.
[22:22:56] "Assuming the mdb-tools package has been installed on your system"
[22:23:05] will try to do locally. It's not on stat1002 :(
[22:23:53] Ironholds: hmm, that shouldn't be too hard.
[22:23:55] * YuviPanda examines
[22:24:00] mdb-tools on stat1002 that is
[22:24:03] eh, not worth it
[22:24:09] heh, ok then
[22:24:16] I don't want to encourage this format
[22:24:16] ;p
[22:24:16] * Ironholds installs locally, adds
[22:24:29] hehe :)
[22:24:39] https://github.com/harrelfe/Hmisc WTF. NO. WHY. WTF.
[22:25:06] * halfak wants to punch the internet in the face.
[22:25:17] R AND ASSEMBLER?!
[22:25:19] * Ironholds cries
[22:25:26] that's like...
[22:25:57] that's like an army armed exclusively with kinetic kill vehicles and very, very tiny scalpels.
[22:26:02] both are useful but...maybe something mid-range, y'know? Maybe?
[22:26:05] halfak, why?
[22:26:55] Oh, the internet is full of useful things when it comes to free software, but as soon as you have a file built by a non-free thing -- NO. YOU MUST PAY MONEY.
[22:27:14] "But I already have the data. I just want to be able to make use of it."
[22:27:27] NO. NO YOU PAY MONEY AND MAYBE THIS WILL WORK FOR YOU IN WINDOWS.
[22:27:37] * halfak punches internet in the face.
[22:28:01] ahh
[22:28:07] more importantly this is just a really dumb database format.
[22:28:10] stupid ITU.
[22:28:21] at least... it's not tables in word.
[22:28:29] Fair point
[22:29:24] to be fair we have also produced terrifying data structures.
[22:29:39] I've basically given up on getting data out of the HTML tables on stats.*
[22:30:20] I don't think I've produced a dataset that wasn't a TSV or lines of JSON.
[22:30:28] I think I might have made a couple XML ones once.
[22:34:40] halfak, oh, totally. I think a few .RData files.
[22:34:44] and that was mostly for compression's sake.
[22:34:54] (and because you cannot write.table a list in a format anything understands)
[22:35:22] The only format I have figured out for lists is the repr in the code.
[22:35:34] list(foo = list(bar = 5))
[22:35:50] Otherwise, I generally don't mess around with lists.
[22:36:04] * YuviPanda produces data in sqlite formats
[22:36:15] * halfak rages and growls
[22:36:31] * halfak remembers that *EVERYTHING* knows how to read sqlite.
[22:36:37] * halfak remains calm
[22:36:59] halfak, use case for lists is stuff like intertimes, pre-aggregation
[22:37:15] (I know, easily handled with a df of uuid-inter)
[22:37:21] (but this was waybackwhen. January!)
[22:37:27] halfak: :) indeed!
[22:39:17] aw goddammit
[22:39:19] it's snowing tomorrow
[22:39:29] Come ooon thursday. Sunny thursday.
[22:39:55] Ironholds, is Trello down?
[22:40:35] not for me!
[22:41:27] DarTar, I have no idea how ComScore data works
[22:41:40] are you saying you're not involved in december metrics from the management end or the research end too?
[22:41:53] Ironholds: both
[22:41:59] okay. So, see above.
[22:42:11] The due date is Wednesday and I don't know how to interact with that data source properly, I don't think.
[22:42:20] slightly worried :/
[22:42:30] we can discuss this later when you have less of a helter-skelter day though
[22:42:30] sure, I can quickly brief you on that item, but it doesn’t involve comScore data at all
[22:42:39] "Estimate comScore's entries (Google referred traffic vs traffic with no referral)"?
[22:42:48] yup
[22:42:59] ookay
[22:43:07] let’s talk about this during 1:1
[22:43:26] That's on Thursday ;p
[22:43:48] I mean our 1:1 today
[22:43:52] aha
[22:49:11] halfak: just a fyi, I've moved the labs project request to https://phabricator.wikimedia.org/T76396, hopefully you can shoot off an email when you've the time :)
[22:50:54] YuviPanda, just sent an email letting him know.
[22:51:00] halfak: ty!
[22:51:04] It looks like Nate is in meeting until late.
[23:05:04] tnegrin, mobile checkin?
[23:06:04] coming