[02:28:51] (PS1) Milimetric: Better information around the namespaces field [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/102871
[02:30:42] (CR) Milimetric: [C: 2 V: 2] Better information around the namespaces field [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/102871 (owner: Milimetric)
[13:21:41] (CR) Erik Zachte: [C: 2 V: 2] Make squid tests time-independent [analytics/wikistats] - https://gerrit.wikimedia.org/r/102292 (owner: QChris)
[13:21:48] (CR) jenkins-bot: [V: -1] Make squid tests time-independent [analytics/wikistats] - https://gerrit.wikimedia.org/r/102292 (owner: QChris)
[13:22:17] (CR) Erik Zachte: [C: 2 V: 2] Use WORKSPACE variable to determine $__CODE_BASE in fallback [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[13:22:26] (CR) jenkins-bot: [V: -1] Use WORKSPACE variable to determine $__CODE_BASE in fallback [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[13:22:27] (CR) jenkins-bot: [V: -1] use new wikivoyage logo on stats.wikimedia.org [analytics/wikistats] - https://gerrit.wikimedia.org/r/88978 (owner: Dzahn)
[13:22:28] (CR) jenkins-bot: [V: -1] Make squid tests time-independent [analytics/wikistats] - https://gerrit.wikimedia.org/r/102292 (owner: QChris)
[14:18:08] (CR) QChris: "Direct push, as Erik voted CR+2." [analytics/wikistats] - https://gerrit.wikimedia.org/r/102292 (owner: QChris)
[14:18:32] (CR) QChris: "recheck" [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[14:54:32] (PS4) Hashar: Use WORKSPACE variable to determine $__CODE_BASE in fallback [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[15:11:00] (CR) QChris: "Erik CR+2 on the identicat PS3." [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[15:11:08] (CR) QChris: [C: 2] Use WORKSPACE variable to determine $__CODE_BASE in fallback [analytics/wikistats] - https://gerrit.wikimedia.org/r/102299 (owner: QChris)
[15:14:29] (PS5) QChris: use new wikivoyage logo on stats.wikimedia.org [analytics/wikistats] - https://gerrit.wikimedia.org/r/88978 (owner: Dzahn)
[15:14:35] (PS4) QChris: Add some more author aliases [analytics/wikistats] - https://gerrit.wikimedia.org/r/92069 (owner: Nemo bis)
[15:14:38] (PS3) QChris: Typofix in squids report [analytics/wikistats] - https://gerrit.wikimedia.org/r/100368 (owner: Nemo bis)
[15:14:47] (PS5) QChris: Ignore sampled-1000 testdata generated when running tests [analytics/wikistats] - https://gerrit.wikimedia.org/r/102298
[15:15:30] (CR) QChris: [C: 2] use new wikivoyage logo on stats.wikimedia.org [analytics/wikistats] - https://gerrit.wikimedia.org/r/88978 (owner: Dzahn)
[15:48:43] qchris_away:
[15:48:45] ah, away!
[16:36:14] ottomata: chris_away back.
[16:41:52] ok i have discovered that I can indeed use camus to write snappy compressed sequence files and query them with external hive tables
[16:42:03] is this what we should do!?
[16:42:06] i do not know :;
[16:42:07] :p
[16:42:16] :-)
[16:43:40] Did you manage to add uncompressed files to hive and compress them afterwards upon need?
[16:45:03] no
[16:45:17] So we have to compress right away :-(
[16:45:26] options are:
[16:45:45] uncompressed + external table
[16:45:45] compressed sequence file + external table
[16:45:45] compressed internal table
[16:45:57] compressed internal table would only work if we imported the data into hive later
[16:46:02] rather than using the raw imports
[16:46:29] I also think that external table is the way to go.
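[Editor's note: for context on the options listed above, here is a minimal standalone sketch of writing a Snappy-compressed SequenceFile with the plain Hadoop API, which is what the "compressed sequence file + external table" option amounts to on the storage side. This is not the Camus code itself; the class name, output path, and key/value types are invented for illustration, and SnappyCodec needs the native snappy library available on the node.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.SnappyCodec;

    public class SnappySequenceFileExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // Hypothetical output path; Camus writes under its own destination directories.
            Path path = new Path("/tmp/example/webrequest.seq");

            // Create a block-compressed SequenceFile using the Snappy codec.
            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, path, LongWritable.class, Text.class,
                    SequenceFile.CompressionType.BLOCK, new SnappyCodec());
            try {
                // Key/value types here are placeholders for whatever the import job uses.
                writer.append(new LongWritable(1L), new Text("example log line"));
            } finally {
                writer.close();
            }
        }
    }

An external Hive table can then be declared over the directory holding such files (STORED AS SEQUENCEFILE, with LOCATION pointing at the import directory), so Hive queries the data in place instead of copying it into a managed table.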
[16:46:40] If we cannot compress later,
[16:46:50] we probably have to compress right away :-(
[16:47:04] yeah we want external tables for sure
[16:47:04] but, we will only keep 30 days of external table data
[16:47:18] i think the historical sanitized data will not be in external tables
[16:47:18] not sure though
[16:47:55] If it's not in external tables ... can we use it like plain hdfs files?
[16:48:21] (e.g.: consume from pig)
[16:52:04] sure, it is just stored in /user/hive/warehouse.../year=2013/month=12/...
[16:52:06] its just in files there too
[16:52:20] ok.
[16:52:20] and other hadoop tools should be able to read snappy sequence files too
[16:52:29] seq files are well supported in hadoop
[16:52:34] http://mail-archives.apache.org/mod_mbox/hive-user/201211.mbox/%3CCAPEqew+BCPX28gyQ9ThsRPyXR38hJ9y5jz6jW5XT3m=4yyjOBg@mail.gmail.com%3E
[16:52:46] Cool.
[17:52:01] qchris:
[17:52:03] java q for you
[17:52:16] [ERROR] /Users/otto/Projects/wm/analytics/camus/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/common/SequenceFileRecordWriterProvider.java:[76,72] incompatible types
[17:52:16] [ERROR] found : java.lang.Class
[17:52:16] [ERROR] required: org.apache.hadoop.io.compress.CompressionCodec
[17:52:29] I am doing
[17:52:31] FileOutputFormat.getOutputCompressorClass((JobConf)context, DefaultCodec.class);
[17:52:47] and here is the doc
[17:52:47] http://hadoop.apache.org/docs/current2/api/org/apache/hadoop/mapred/FileOutputFormat.html#setOutputCompressorClass(org.apache.hadoop.mapred.JobConf, java.lang.Class)
[17:53:02] not sure what i'm doing wrong there
[18:22:43] ottomata: Which camus repo are you using?
[18:23:28] Is everyithing on the same Hadoop version?
[18:35:45] i htink i got it
[18:35:54] Ok.
[18:36:02] What caused the problems?
[18:36:30] i was assigning th result of getOutputCompressorClass
[18:36:32] which is a class
[18:36:35] and I needed to use
[18:36:48] Ah.
[18:36:50] Ok.
[18:36:50] .getInstance()
[18:36:51] sorry
[18:36:51] .newInstance()
[18:37:08] it was confusing because the first error I was getting said I needed to pass use a Class
[18:37:10] and then I used a class
[18:37:16] and then the next error said it wanted an instance
[18:37:31] didn't realize I had actually solved the first error by using the .class
[18:37:31] :-D
[18:37:50] Stupid java!
[19:10:10] is rfaulkner on irc these days? nick?
[21:18:57] jeremb: Just saw your Q. I haven't seen rfaulkner on IRC recently.
[21:19:55] ottomata: Can you tell me which machines have /dev/mapper/stat1-a "/a" mounted. Is it just stat1?
[21:20:46] yes
[21:20:51] that's not nfs
[21:20:55] its just a lvm partition
[21:20:56] Gotcha.
[21:21:01] Fancy!
[21:21:02] :)
[21:21:11] I'm hoping to do a little cleanup.
[21:21:31] I see there are some database directories there (e.g. mysql & mongo) are those still in use?
[21:26:40] ottomata: ---^
[21:27:51] halfak, not sure, ummm
[21:27:54] check puppet?
[21:27:57] mysql almost certainly actually
[21:27:59] dunno about monog
[21:28:00] mongo
[21:28:17] That mongo could have been the old one that I used for Snuggle.
[21:28:38] Yeah. There's only test databases in there.
[21:29:20] * halfak goes to look through puppet config again.
[21:29:32] Thanks for your help.
[21:41:11] gotta run, latesr all!
[21:54:40] gah, missed halfak
[21:54:45] * jeremyb hands halfak a tab key
[22:59:34] halfak: you around?
[22:59:45] Hey! Yup
[22:59:47] hey
[22:59:55] Feel'n better?
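[Editor's note: the compile error discussed above arises because FileOutputFormat.getOutputCompressorClass() returns a Class<? extends CompressionCodec>, not a codec instance, so assigning its result directly to a CompressionCodec fails with "incompatible types". Below is a minimal sketch of the kind of fix ottomata describes; it is not the exact Camus code, the helper class name is made up, and ReflectionUtils.newInstance is used here as the usual Hadoop idiom for instantiating the returned codec class (the chat mentions calling .newInstance() on the class directly, which works too but skips Configuration wiring).]

    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.DefaultCodec;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.util.ReflectionUtils;

    public class CodecLookupExample {
        // Resolves the configured output compression codec for a job.
        public static CompressionCodec resolveCodec(JobConf conf) {
            // This returns a Class, not an instance -- the source of the
            // "found: java.lang.Class / required: CompressionCodec" error.
            Class<? extends CompressionCodec> codecClass =
                    FileOutputFormat.getOutputCompressorClass(conf, DefaultCodec.class);
            // Instantiate the codec; ReflectionUtils also passes the
            // Configuration to codecs that implement Configurable.
            return ReflectionUtils.newInstance(codecClass, conf);
        }
    }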
[23:00:14] meh, I think I've got some fever
[23:00:30] but not the end of the world
[23:00:48] do you have time for a quick check in with tnegrin and me?
[23:00:49] data fever?
[23:00:54] lol
[23:00:56] Sure.
[23:01:04] I have Growth standup in... now
[23:01:09] But i could meet in 15-20
[23:01:26] dartar's got chills, they're multiplyin'
[23:01:40] spreading some love
[23:02:04] DarTar, will 3:20ish work?
[23:02:16] yes, 15-20 should be fine, I think tnegrin is also afk now
[23:06:57] I'm available now
[23:07:15] Still doing growth standup. I should be ready soon
[23:07:52] tnegrin: I'm sending an invite and blocking Chambers if you need a room
[23:07:59] thanks D
[23:14:01] DarTar: OK. I'm good to go. I'll go hop into the hangout.
[23:14:14] Bah! I have no invite.
[23:16:53] got it
[23:36:04] * jeremyb hands halfak a tab key :P
[23:36:10] (jer)