[02:49:24] it's the famous nettrom!
[15:00:27] o/
[16:14:02] _o/
[16:15:54] :D good morning quillom
[16:30:21] Woah! Looks like the enwiki revids just passed 700m
[16:30:51] Stuff that I just noticed :\ Looks like this happened a month ago
[16:30:59] I've been drawing samples from 2015 recently.
[16:34:47] halfak, how do you load a json file into hive?
[16:35:09] kjschiroo, each line is a JSON blob?
[16:36:40] halfak, is each line supposed to be an unnested dictionary, no all encompassing list?
[16:37:23] That's right
[16:37:46] So it isn't actually a valid json file?
[16:42:28] @halfak, did you need "ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'" is your CREATE TABLE?
[16:42:33] in*
[16:42:46] yes.
[16:43:01] Not sure what the "it" is that you are referring to. Which file?
[16:43:34] * guillom has fallen through the rabbit hole of sociology.
[16:44:14] "Ok so what does this paper mean?" "Ok so what are the authors basing their work on?" "Ok so what's this theory they're referring to?"
[16:44:29] And there you have it.
[16:44:39] 5 new books ordered.
[16:48:51] halfak, I was referring to the input file. A series of lines containing json dictionaries would not be considered actual json. I was just verifying that I was understanding it correctly. Is the jar for org.apache.hadoop.hive.contrib.serde2.JsonSerde already on the system somewhere? I am getting Cannot validate serde: org.apache.hadoop.hive.contrib.serde2.JsonSerde
[16:49:38] kjschiroo, indeed. it is a file containing valid JSON lines -- the entire file, if processed as a string, is not valid JSON.
[16:50:31] kjschiroo, here'
[16:50:36] s an example command: https://gist.github.com/halfak/3075794480e2e3b229d9
[16:53:14] halfak: Thanks, that is helpful!
[16:53:22] :D
[16:53:47] I just realized I never finished fixing the productivity table so that it has timestamps in it.
[16:53:50] * halfak fixes that now.
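Editor's note on the 16:36:40 to 16:49:38 exchange: the distinction between "a file of valid JSON lines" and "a valid JSON file" can be illustrated with a minimal Python sketch. The sample records and the helper names `parse_json_lines` and `is_valid_json` are hypothetical and chosen only for illustration; they are not taken from the gist linked in the log, and this is not the Hive loading command itself.

```python
import json

# Hypothetical sample: one unnested JSON object per line, no enclosing list.
json_lines = '{"rev_id": 1, "user": "a"}\n{"rev_id": 2, "user": "b"}\n'


def parse_json_lines(text):
    """Parse each non-empty line as its own JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def is_valid_json(text):
    """Return True if the whole string parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Line by line, every record parses.
records = parse_json_lines(json_lines)

# Taken as one string, the same content is rejected, because a second
# top-level object follows the first.
whole_file_ok = is_valid_json(json_lines)
```

Under these assumptions, `records` holds two dictionaries while `whole_file_ok` is false, which matches the point made at 16:49:38: each line is valid JSON, the file as a whole is not.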
[17:02:49] kjschiroo: also, check out the wmf_raw.webrequest table
[17:02:52] it is a JSON backed table
[17:03:10] https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest#wmf_raw.webrequest
[17:03:28] that data is snappy compressed sequence files
[17:03:30] but you don't need that
[18:08:21] morning, folks
[18:09:09] hey schana: we should get you an IRC cloak: process is here https://meta.wikimedia.org/wiki/IRC/Cloaks
[18:10:54] DarTar: I already went through that process...is it not working?
[18:11:03] I see a cloak
[18:11:11] oh weird, I see it too now
[18:11:23] maybe it was a temporary glitch with my client
[18:22:06] alright, schana: I’m inviting halAFK, leila and J-Mo as project leads to our short 1:1 today so we can have a brief discussion about high-level engineering asks
[18:22:27] okay DarTar
[18:22:46] I can make time later for anything else we should discuss on 1:1
[22:53:52] * yuvipanda waves
[22:55:27] * guillom particles.
[22:56:41] :D
[22:56:57] I shall try to bring PAWS back up today