[08:46:12] hey average, are you around?
[13:15:33] hey average
[13:15:42] need any help with git-buildpackage?
[13:16:51] drdee: I'm reading paravoid's git@github.com:paravoid/gdnsd.git
[13:17:08] drdee: he has a debian/gbp.conf there and I'm using it as an example
[13:17:16] great!
[13:17:29] drdee: he suggested this yesterday to me on the ops irc chan
[13:17:41] cool
[13:38:05] average: is this still accurate https://github.com/wikimedia/metrics/tree/master/pageviews/new_mobile_pageviews_report ?
[13:38:53] drdee: yes
[13:39:34] ok, so if I implement it in java for kraken can i easily compare my implementation against your implementation?
[13:40:42] drdee: yes
[13:40:51] how?
[13:41:24] drdee: well, I can make another run for New mobile pageviews and then hand you the report
[13:41:33] drdee: and then you can compare with the results of the java implementation
[13:41:54] ok
[13:42:28] brb 30m (getting lunch)
[14:03:03] yargh, stupid hadoop does not return any error message
[14:03:06] yarghyy
[14:14:22] what are you debugging?
[14:15:33] deduplicate
[14:15:38] just trying to run a single workflow
[14:15:45] typical situation
[14:15:51] runs fine as a standalone pig script
[14:15:55] weird errors in oozie :/
[14:21:46] that would not be the first time
[14:22:09] i know!
[14:22:09] hah
[14:29:26] back
[14:36:09] drdee: the new mobile pageviews reports
[14:36:14] drdee: the e-mail that you sent
[14:36:20] drdee: should I work on stat1 or stat1002 ?
[14:36:22] or stat1001 ?
[14:36:32] stat1
[14:36:43] ok, thanks
[14:41:54] do you have unit tests for your new mobile pageview report? if yes, can you give me the link to those tests?
[14:43:48] drdee: yes
[14:44:22] https://github.com/wikimedia/analytics-wikistats/tree/master/pageviews_reports/t
[14:44:45] drdee: https://github.com/wikimedia/analytics-wikistats/blob/master/pageviews_reports/t/02-accept-url.t
[14:47:44] ungh i don't even need an oozie job for this crap
[14:47:49] i'm just going to for loop it :)
[14:51:18] anything considerably different between *.tsv.log* and *.tab.log* ?
[14:51:32] or just the filename ?
[14:57:22] just the filename, .tsv. is what we should have named them in the first place
[14:57:32] it's also when we upgraded and repuppetized the server :)
[14:57:34] but yeah
[14:57:38] no difference in files
[14:57:41] just the nmae
[14:57:42] name
[14:57:57] thank you
[15:18:05] drdee: morning
[15:18:13] morning
[15:19:05] regarding zero stuff, i was thinking maybe we could make sure that if we remove the isPageView check that the numbers are large enough
[15:19:08] what do you think?
[15:19:49] sure, that is easy to test
[15:20:09] i'm wondering what the best environment for this is
[15:20:20] i guess grunt should work
[15:20:32] have you had success loading the webrequests as a hive table?
[15:20:35] drdee: ^^
[15:21:10] not yet
[15:21:13] k
[15:21:16] let's not worry about that then
[15:22:45] drdee: last question: where is that script you used to generate the tsv which we used to compare to the old method?
[15:30:59] milimetric: quick pig question
[15:31:10] shoot
[15:31:28] do you have an up to date pig script which parses log lines correctly?
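Earlier in this log (around 13:41), the plan for validating the Java/Kraken port of the new mobile pageviews report is simply to diff its output against another run of the existing implementation. A minimal sketch of what that comparison could look like, assuming both reports are tab-separated files with a key column followed by a count column (the filenames and column layout here are hypothetical, not taken from the actual reports):

    import csv
    import sys

    def load_report(path, key_col=0, count_col=1):
        """Load a TSV report into a {key: count} dict; column positions are an assumption."""
        counts = {}
        with open(path) as f:
            for row in csv.reader(f, delimiter="\t"):
                if not row or row[0].startswith("#"):
                    continue
                counts[row[key_col]] = int(row[count_col])
        return counts

    def compare(old_path, new_path):
        """Print every key where the two reports disagree."""
        old, new = load_report(old_path), load_report(new_path)
        for key in sorted(set(old) | set(new)):
            if old.get(key) != new.get(key):
                print("%s\told=%s\tnew=%s" % (key, old.get(key), new.get(key)))

    if __name__ == "__main__":
        # e.g. python compare_reports.py wikistats_report.tsv kraken_report.tsv  (hypothetical names)
        compare(sys.argv[1], sys.argv[2])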
[15:31:31] average: can you join me and adrian in https://plus.google.com/hangouts/_/2da993a9acec7936399e9d78d13bf7ec0c0afdbc
[15:31:53] I just need a really simple script and I am trying to find the most up to date boilerplate
[15:31:54] yeah, the mobile platform one parses the sampled log lines
[15:32:02] great
[15:32:14] also, you wouldn't happen to have written anything that parses x-cs would you?
[15:32:26] are you dealing with the geocoded/anonymized IPs?
[15:32:30] naw
[15:32:40] milimetric: I'm just intending to use the mobile stream
[15:32:49] ooohhhh those were david's hive queries
[15:32:51] not pig
[15:32:56] k, well the x-cs would be parsed along with everything else, lemme see
[15:33:31] drdee: I think you're right, but I looked at his hive script and didn't see any load commands
[15:33:42] drdee: https://github.com/wikimedia/kraken/blob/master/hive/scripts/zero_cache_status_xcs.sql
[15:33:56] milimetric: good point. That should be enough
[15:34:01] so erosen, if you have kraken/pig/mobile_platform.pig open, the LOAD_WEBREQUEST is the piece that
[15:34:12] parses out a log line
[15:34:17] milimetric: great
[15:34:18] thanks
[15:34:37] and I'm just using a few fields in there, though you can see the rest in include/load_webrequest.pig
[15:34:44] cool
[15:34:47] and yes, x_cs is in there
[15:37:26] milimetric: one last question: where do the jars live?
[15:37:57] well, right now they're where david was referencing them from
[15:38:06] i think i see them in /libs
[15:38:13] are you doing a one-off or oozie job?
[15:38:15] where was david referencing them?
[15:38:21] (looking that up)
[15:38:22] interactive grunt shell
[15:39:14] ${nameNode}/libs/kraken-0.0.2
[15:39:16] i think i found something that might work in /libs/kraken-0.0.2
[15:39:18] yeah
[15:39:25] k
[15:39:28] i'll try using those
[15:39:35] for the interactive grunt shell, here's what I do
[15:39:39] mvn package
[15:39:55] ~/bin/copy_kraken_jars_to_anxx 02
[15:40:06] i can send you that copy script
[15:40:14] interesting
[15:40:18] are you building on an01
[15:40:19] ?
[15:40:22] it puts them in my home directory
[15:40:26] no, building locally
[15:40:28] gotcha
[15:40:43] so then I can just import the jars by name
[15:40:51] and if it gets promoted to an oozie job I don't change anything
[15:41:02] i just make sure the jars are on its path
[15:42:05] nice
[15:42:12] okay I may go that route
[15:42:25] i'm tempted to trust that these jars are up to date
[15:42:31] but they probably aren't...
[15:42:49] no those should be, as far as I know
[15:43:03] but if you need to add udfs, etc, it's dangerous to change the real jars until you know you're good
[15:43:13] yeah
[15:43:14] totally
[15:43:16] that makes sense
[15:43:25] i'm hopefully just using the jars atm
[15:43:38] k, good luck, lemme know if you need to brain bounce pig (can be a pain)
[15:44:35] thanks a bunch
[15:44:37] will do
[15:53:24] ottomata, I looked into the migration thing for SQLAlchemy
[15:53:38] I'm not sure if this is the standard but Alembic looks ok: http://alembic.readthedocs.org/en/latest/tutorial.html
[15:53:50] it even has a handy auto-generate: http://alembic.readthedocs.org/en/latest/tutorial.html#auto-generating-migrations
[15:54:16] nice cool
[15:54:20] I'm not going to worry about it until we are ready to launch it though
[15:54:27] ja, i think whatever you guys pick will be fine
[15:54:32] because there will be no versioning until then anyway
[15:54:41] hmm, ja ok
[15:59:21] milimetric: speaking of versions...
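The Alembic auto-generate feature linked a few messages above works by diffing the SQLAlchemy model metadata against the live database schema and emitting a migration script. A rough sketch of driving it programmatically (it assumes an `alembic init` scaffold whose env.py sets target_metadata to the project's models; the config filename and revision message are placeholders):

    from alembic.config import Config
    from alembic import command

    # Load the alembic.ini created by `alembic init`; it must point at the
    # application's database URL, and env.py must expose the models' metadata
    # as target_metadata for autogenerate to have something to diff against.
    cfg = Config("alembic.ini")

    # Equivalent to:  alembic revision --autogenerate -m "add report table"
    command.revision(cfg, message="add report table", autogenerate=True)

    # Equivalent to:  alembic upgrade head
    command.upgrade(cfg, "head")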
[15:59:45] yes
[15:59:45] milimetric: i've built the kraken jars off what i thought was the up to date repo
[15:59:45] and they came out with 0.0.1
[15:59:52] any ideas on how that could happen
[15:59:53] ?
[15:59:57] some of the jars come out that way, you sure it was all of them?
[16:00:20] no, not all of them
[16:00:33] but the main ones at least
[16:01:14] when i do it the funnel one comes out as 0.0.1
[16:01:17] and the rest 0.0.2
[16:02:14] milimetric: I think it was my fault, I just called mvn compile
[16:02:18] so i do mvn package and then find -name *.jar
[16:02:21] oh ok
[16:02:21] which probably doesn't build jars
[16:02:24] yep
[16:03:37] milimetric: one more thing, you had to skip the tests right?
[16:03:47] no, they pass for me
[16:03:51] hrmmm
[16:03:52] interesting
[16:03:56] oh you probably don't have dclass or GeoIP
[16:04:22] i copied the GeoIP files off one of the an machines
[16:04:26] yeah I get an error for dclass: dClass JNI Wrapper ................................ FAILURE [1.470s]
[16:04:33] ok, so it's dclass then
[16:04:40] and then the rest are skipped
[16:04:45] there's a deb, can you install that with brew or no?
[16:04:53] hmm
[16:04:58] i don't think brew can install debs
[16:05:06] but I could always build on stat1 or something
[16:05:14] (<-- 0 knowledge of mac and proud of it)
[16:05:24] hehe
[16:05:33] oh, I can copy them to my home dir and you can pull them from there, wanna do that?
[16:05:39] i'll put them on an02
[16:05:48] sure
[16:05:56] i'm tempted to try to use the jars as is
[16:06:12] but that will probably just result in a bunch of failed pig jobs and log digging
[16:06:21] k, they're in /home/milimetric/
[16:06:25] on an02
[16:06:29] thanks
[16:07:05] did you mean I could just use the jars
[16:07:12] they look world readable so you can copy or use them
[16:07:19] awesome
[16:07:19] but let me know if I misread the permissions
[16:07:31] and they're fresh, i just built them
[16:07:37] awesome
[16:07:40] thanks for helping with this
[16:07:46] sorry for distracting you from other things
[17:01:15] ottomata: scrum
[17:01:18] average: scrum
[17:01:41] hokay!
[17:23:56] ok average, how can I help?
[17:25:21] so the weird thing is sqlalchemy seems to have versioning support built in too: http://docs.sqlalchemy.org/en/rel_0_8/orm/examples.html#versioned-objects
[17:25:28] k, gonna stop looking at that now :)
[17:26:06] milimetric: cool
[17:26:13] ottomata: hangout ?
[17:26:38] jajaja
[17:27:22] how the crap do you start a new hangout?!
[17:27:32] ottomata: interface changed
[17:27:35] ottomata: I'm calling you
[17:27:38] on hangout
[17:27:40] it's weird
[17:27:43] ottomata: can you see it?
[17:27:48] ottomata: https://plus.google.com/hangouts/_/
[17:27:54] that will create a new hangout
[19:10:40] hey erosen, how goes with x-cs?
[19:10:50] oh he must be at lunch :)
[19:50:09] holy cow sqlalchemy is amazing
[19:50:12] best ORM I've ever seen
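The closing enthusiasm for SQLAlchemy refers to its declarative ORM, which the team is evaluating above alongside Alembic migrations and the versioned-objects example. A self-contained sketch of the declarative style in question (the User model and in-memory SQLite URL are made up purely for illustration):

    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class User(Base):
        """Hypothetical mapped class; any table and columns would do."""
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        name = Column(String)

    # In-memory SQLite keeps the example self-contained.
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)

    Session = sessionmaker(bind=engine)
    session = Session()
    session.add(User(name="average"))
    session.commit()

    print(session.query(User).filter_by(name="average").first().name)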