[00:00:05] let's remove those
[00:00:15] argghh
[00:00:15] i used that one
[00:00:21] for just that change right/
[00:00:22] ?
[00:00:30] yeah
[00:01:08] oook, try again again
[00:02:41] nein
[00:02:47] poopers
[00:03:54] this is odd, we know the problem, we know the solution, but it isn't working, so either we have not diagnosed the problem correctly or our solution sucks
[00:04:26] welllll, we don't know the problem, do we?
[00:04:32] there is some classpath problem, right?
[00:04:34] what's the error again?
[00:05:09] java.lang.NoClassDefFoundError: org/apache/pig/Main
[00:05:25] which job are you running?
[00:05:25] the pig job
[00:05:40] and if you try the hive job then a hive class is missing
[00:05:41] ok
[00:05:55] gonna run one just so i have a job id
[00:30:22] yeah, drdee, this is a bug that will be fixed in a new version
[00:30:27] i followed these instructions
[00:30:31] https://issues.cloudera.org/browse/HUE-877?focusedCommentId=14983&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14983
[00:30:32] and got it to work
[00:30:46] Open the workflow, go to the "Properties" tab, then click on the "Advanced" link, then click on the 'Add' button of the 'Oozie parameters'. Specify "oozie.use.system.libpath" as "Name" and "true" as "Value" and save.
[00:34:51] drdee, oh yes, the jobs, they are succeeding
[00:54:38] nice, but it's weird because that property was already set in the oozie workflow itself
[00:54:38] f
[00:54:40] but that's great news!
[00:59:25] right, not sure if i understand why that works
[00:59:26] but it does
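A minimal sketch of the same share-lib fix done from the Oozie CLI instead of through Hue, for anyone hitting the NoClassDefFoundError above. Only oozie.use.system.libpath and the standard Oozie CLI are taken from the log; the hostnames, ports and workflow path below are placeholders.

# confirm the share lib (pig, hive, sqoop jars) actually exists in HDFS
hadoop fs -ls /user/oozie/share/lib

# hypothetical job.properties for the failing pig workflow; the last
# property is the same one the Hue "Oozie parameters" dialog sets
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.org:8020
jobTracker=resourcemanager.example.org:8032
oozie.wf.application.path=${nameNode}/user/example/workflows/pig-example
oozie.use.system.libpath=true
EOF

oozie job -oozie http://localhost:11000/oozie -config job.properties -run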
[01:07:20] heading upstairs for election shenanigans, brb
[14:33:22] hiooo
[14:40:01] morning ottomata, milimetric, average_1rifter
[14:40:08] drdee: morning
[14:40:11] i am a relieved man
[14:40:19] morning both
[14:40:23] three
[14:40:27] :)
[14:40:33] relieved?
[14:40:33] oh, obama
[14:40:47] seemed closer than I thought it'd be
[14:40:49] morning!
[14:40:57] oh, thought you were gonna say baby
[14:41:01] me too
[14:41:17] wait he didn't actually say anything yet
[14:41:24] maybe both!
[14:41:26] nope, we are still waiting
[14:42:01] milimetric: it was quite nerve-wracking early on but it's so much better for both the US and the world
[14:43:18] yeah, I hope you're right
[14:44:42] i stayed up until 2 am to watch the speeches so i neeeeeed my coffee first
[14:45:28] average_1rifter: have you asked ottomata to look at the puppet change?
[14:45:44] drdee: I added him to the reviewers
[14:45:58] oo, let's see if I get an email about that
[14:46:00] i'm still going through email
[14:46:04] k
[14:46:25] ottomata: hey, yesterday night, in a purely coincidental way, me and Daniel Zahn from operations were sitting somewhere on irc.debian.net
[14:46:39] ;)
[14:46:39] drdee: so I talked to him and told him about our need for puppet changes
[14:47:20] ottomata: so uhm, he guided me through the process of adding a package to puppet. I did not test it but he said it was ok if I just added the change for review
[14:48:39] nice!
[14:48:53] yeah adding a single package is a purty easy thing to do, probably doesn't need testing
[14:54:51] the answer is no! I do not get notification emails when I am added as a reviewer :(
[14:55:00] average_1rifter:
[14:55:09] 1. what happened to your 'd'?
[14:55:18] 2. gimme gerrit link and I will review
[14:55:51] ottomata: https://gerrit.wikimedia.org/r/#/c/32192/
[14:56:14] 1) that is a *very* good question
[14:56:28] ahhh ok, roles
[14:56:29] so
[14:56:35] 1) I closed my gnome-terminal with an open irssi, and had to open another one. there is one zombie irssi process right now on my system
[14:56:36] you don't need to make a 'role' class for this
[14:56:55] actually, this isn't even an analytics thing
[14:56:55] hehe
[14:56:57] ottomata: Daniel suggested that
[14:56:58] sorry this is really not clear in puppet
[14:57:02] i'll add some comments now for that
[14:57:08] ottomata: please do
[14:57:15] but the 'analytics' classes refer specifically to analytics/kraken cluster stuff
[14:57:36] the statistics classes are more relevant for your stuff…but actually, since this is just a package install
[14:57:36] hmm
[14:58:10] so, it took me a bit to understand role classes, as they seem to be something that WMF made up
[14:58:26] a role class is something like
[14:58:38] webservers, cache servers, database servers, analytics cluster servers, etc.
[14:58:46] a role is a particular function that a server might play
[14:58:50] a server can have multiple roles
[14:58:56] but roles are very high level
[14:59:20] so, a role for a 'server that has the openssl package' isn't quite right
[14:59:35] maybe we extend it in the near future?
[15:00:01] yeah, but this is specifically for building and jenkins stuff
[15:00:17] i would add
[15:00:19] in contint.pp
[15:00:24] misc::contint::openssl
[15:00:26] and put it there
[15:00:28] then
[15:00:59] or mayyyybe
[15:01:00] hm
[15:01:21] this is up to you
[15:01:39] i'd either put this in misc::contint:: classes, as this package has more to do with building things in jenkins, right?
[15:01:39] OR
[15:01:46] oh no, I can change it. if you think the semantics are better in misc::contint::openssl, then we use that
[15:02:06] create a series of misc::udp2log::build classes
[15:02:06] so maybe
[15:02:46] class misc::udp2log::build::packages {
[15:02:46]     package { "openssl": ensure => "installed" }
[15:02:46] }
[15:02:46] class misc::udp2log::build {
[15:02:46]     include misc::udp2log::build::packages
[15:02:46] }
[15:02:46] something like that
[15:02:55] then, if/when you need more packages for building udp2log and/or udp-filter
[15:03:06] you can add them to misc::udp2log::build::packages
[15:03:23] whatever machine you intend to use for building these packages, you can just include misc::udp2log::build
[15:03:43] or meh, since this is all udp-filter specific, maybe even just create a misc::udpfilter::build series of classes
[15:03:44] heheh
[15:03:47] options!
[15:03:51] heh
[15:04:11] but, the least effort
[15:04:38] would be to just add a class in contint.pp (or even in misc/statistics.pp) and include that wherever
[15:04:45] probably contint.pp is better
[15:06:57] sigh
[15:07:01] my favorite cafe
[15:07:05] has shut their power outlets
[15:07:11] this is a sad day
[15:07:14] obama may have won
[15:07:38] but I did not here one bit during the campaigns about what he's going to do about outlet-less cafes
[15:07:39] hear*
[15:11:53] bweerr bwerrr, gerrit down
[15:19:11] vagrant is really cool
[15:19:25] (or rather, looks cool, i haven't used it yet)
[15:19:29] i'd like to make a kraken vagrant dev box :)
[16:01:17] ottomata, indeed outlet-less cafes suck, it's also spreading here in toronto
[16:01:34] you wanna tinker a bit more with oozie before the Dells arrive?
[16:01:59] sure, more probs?
[16:02:09] (working on getting jmx ports organized atm)
[16:35:21] ottomata, so you need to add the /user/oozie/share/lib as a job parameter to an example in hue to make it work?
[16:35:51] no
[16:36:00] you need to set enable share lib to true
[16:36:00] ummm
[16:36:13] Open the workflow, go to the "Properties" tab, then click on the "Advanced" link, then click on the 'Add' button of the 'Oozie parameters'. Specify "oozie.use.system.libpath" as "Name" and "true" as "Value" and save.
[16:37:55] ty
[16:38:18] i just figured out how to parametrize oozie workflows
[16:45:09] louisdang, we finally got oozie to work :)
[16:45:18] drdee, cool
[16:47:00] ottomata, can you have a look at this log file: http://analytics1001.eqiad.wmnet:8888/oozie/list_oozie_workflow/0000006-121107001622628-oozie-oozi-W/
[16:47:08] and tell me why it doesn't work :)
[16:51:14] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
[16:51:14] ERROR is considered as FAILED for SLA
[16:51:18] no idea what that means
[16:51:21] sqoop died?
[16:51:22] ottomata, i have a suspicion: it is looking for the mysql connector jar
[16:51:36] and it can't find it, but this error is very unclear
[16:51:37] think that needs to be in share lib too?
[16:51:42] possibly
[16:51:52] or at least somewhere :)
[16:51:55] try manually putting it in /user/oozie/share/lib
[16:51:59] if nothing changes take it out
[16:52:05] k
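A sketch of the "try manually putting it in /user/oozie/share/lib" suggestion above. The local jar path and the per-action subdirectory layout of the share lib (pig/, hive/, sqoop/) are assumptions, not from the log; only the HDFS share-lib path itself is.

# push the MySQL connector jar where the SqoopMain launcher can pick it up
hadoop fs -put /usr/share/java/mysql-connector-java.jar /user/oozie/share/lib/sqoop/
hadoop fs -ls /user/oozie/share/lib/sqoop/

# and, as suggested above, take it out again if nothing changes:
# hadoop fs -rm /user/oozie/share/lib/sqoop/mysql-connector-java.jar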
[16:55:25] huh. it's snowing
[16:57:02] i see rain
[16:59:37] i see a shining sun
[16:59:50] erosen, are you a brave person?
[17:00:01] and what does that mean?
[17:00:17] do you have the courage to confront the kraken?
[17:00:22] hehe
[17:00:26] certainly
[17:00:33] AWESOME!
[17:00:40] what is the context
[17:00:45] i literally just opened my computer
[17:00:49] i shared a simple google doc with instructions on how to connect to kraken
[17:00:54] ohhh ok
[17:01:01] i wanna get you started today if you want to
[17:01:30] sounds good
[17:01:39] ping me when you are ready
[17:01:41] great
[17:01:48] we have the analyst meeting at 10
[17:01:50] and I have a lunch thing
[17:01:55] but otherwise my schedule is free
[17:02:03] k
[17:04:40] growl, i have 50 mins of battery left
[17:04:49] just enough time to run out before standup
[17:04:49] hmm
[17:04:54] guess i should move
[17:05:23] ottomata, this is the same issue i am having: http://stackoverflow.com/questions/11555344/sqoop-export-fail-through-oozie
[17:08:04] hmm, we should find the job log, I don't see one though
[17:08:05] hmm
[17:13:05] * drdee is searching as well
[17:14:20] ottomata, did you have a script to unzip the sampled traffic files and copy them to hdfs?
[17:14:52] no, just CLI'd it real quick
[17:14:56] in a for loop
[17:15:10] for f in $(ls); do file=${f%.gz}; echo $file; zcat $f | hadoop fs -put - /user/otto/logs/sampled/$file; done
[17:15:23] that'll work if you are in a dir with the gzipped sampled files
[17:15:56] ty
[17:29:11] hokay, battery almost dead, going to friend's place, gonna get wet!
[17:30:37] be back on before standup
[17:31:02] aight
[18:01:22] you guys in a parallel universe again?
[18:01:22] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[18:01:44] still talking to robla
[18:02:07] linkyyyyyy
[18:03:00] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[18:03:58] drdee, we are there
[18:04:09] coming....
[18:18:12] happy obama hangover day everyone.
[18:18:18] there a hangout still?
[18:18:36] sorry i'm late. i shall leave the reason as an exercise to the reader
[18:18:39] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[18:18:44] drdee: https://gerrit.wikimedia.org/r/#/c/32137/
[18:18:46] drdee: new patchset
[18:18:49] drdee: please review
[18:18:50] ty
[18:18:52] drdee: lintian empty output
[18:18:52] k
[18:19:04] awesome
[18:19:46] dschoon are you coming?
[18:19:48] yep
[18:29:50] average_1rifter: i think it's not a good idea to manually edit the 'configure' file
[18:30:05] drdee: it is autogenerated
[18:30:08] drdee: I don't alter it
[18:30:20] ok, just checking
[18:51:49] average_1rifter, but why does configure say 'DEBIANIZE VERSION PLACEHOLDER'? are you post-processing the 'configure' file?
[18:52:29] dschoon, ottomata: about connecting to kraken, ottomata feels it's a bit too soon to hand out VPN access, shall we just stick to proxy access for the moment?
[18:52:42] it is the only other way
[18:52:42] so yes.
[18:54:45] drdee: I just ran autoconf once on the development directory and so configure changed because it processed configure.ac and generated the new configure
[18:54:51] right :)
[18:55:05] I think configure itself should not be in the git repo as it is autogenerated from configure.ac
[18:55:42] don't know about that, most people expect a configure script when they download a C src package
[18:57:32] average_1rifter: and compile.sh still contains the DEBUG flag
[18:57:46] i thought that was deprecated and replaced with a command-line parameter
[18:58:23] need to double-check
[18:58:27] moment
[19:08:15] ottomata, could you install http://pypi.python.org/pypi/WebHDFS/ on stat1?
[19:09:30] yeah? does that work? have you tried it? did you know that you can try it directly from your compy if you are connected to the vpn?
[19:11:41] i didn't realize that you could use it via the VPN
[19:12:45] drdee, mind if I restart hadoop?
[19:12:53] go ahead
[19:15:37] dschoon:
[19:15:37] https://www.mediawiki.org/wiki/Analytics/Kraken/JMX_Ports
[19:15:41] ja
[19:15:42] they should all be up and puppetized
[19:15:51] hotness
[19:15:51] with JMX remote running
[19:16:21] since ganglia is giving me so much trouble, should we look into something else for JMX stuff?
[19:16:25] something nice and pretty?
[19:16:32] you have a page about this methinks already...
[19:16:34] that I have read
[19:16:38] or at least an email
[19:16:58] i have found it:
[19:16:58] https://www.mediawiki.org/wiki/Analytics/Kraken/JMX_Monitoring
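A sketch of what "puppetized ... with JMX remote running" typically means on the Hadoop daemons, e.g. appended to the daemon options in hadoop-env.sh. The port number and the no-auth/no-SSL settings here are assumptions; the real per-daemon ports are on the JMX_Ports page linked above.

# hadoop-env.sh: expose the NameNode JVM's MBeans over remote JMX (hypothetical port)
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9980 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"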
[19:35:15] be back in a min, getting coffee
[19:36:28] Amgine: Oh?
[19:36:35] Amgine: Where at? What about?
[19:36:55] Yes, at http://www.mcn.edu/2012/wikimedia-tech-workshop-bridging-gap
[19:37:02] Amgine: Awesome! Thanks for the pointer!
[19:37:08] yw
[19:39:12] dschoon: I'll ask this pre-emptively, since I'm sure someone will ask: will it be possible to aggregate views of an image at all resolutions?
[19:40:22] Hm. You're talking about pageviews of an image in (say) commons, yes?
[19:40:25] Amgine: is your question GLAM related?
[19:40:58] Yes. Most of the museums here have media files they have uploaded to commons.
[19:40:59] Amgine: That sounds like a reasonable feature request. Do we not do that now?
[19:41:25] afaik, determining the canonical image is non-trivial on commons
[19:41:34] No; BaGLAMa is pretty much the only reporting tool, and it only reports views of articles which have the image thumbnailed in.
[19:41:41] it is a feature request i know about and I am talking with the GLAM folks on how to detect that
[19:42:54] brb
[19:43:08] Interesting.
[19:43:21] I'll definitely add it to our list. We'll certainly look into it.
[19:43:35] Amgine: Tell them "yes, absolutely, though it might not come for a while". :)
[19:44:40] The GLAM report I wrote up had lots of fun data.
[19:45:26] dschoon, in Asana under Kraken under Core Jobs: 'Tag as GLAM'
[19:45:54] That's not the same request, drdee
[19:46:03] it totally is
[19:46:07] it'd be aggregating per-image.
[19:46:15] not merely tagging the request as being glam-related
[19:46:27] that's a subtask for doing the GLAM stuff
[19:46:36] as you said, canonicalization of different resolutions to bear a pointer to the original is a bit harder
[19:46:51] In any case. We'll work on it.
[19:46:52] :)
[20:08:05] ottomata: how's it looking out your window?
[20:14:33] purrrtyyyy wet
[20:54:36] drdeeeeeeeee
[20:54:36] drdee
[20:54:41] on an01
[20:54:45] 12G ./diederik
[20:54:52] 12G /home/diederik
[20:55:20] you copied all the zero log files to your home dir?
[20:56:21] i am moving them to /a/logs/zero
[20:58:12] ty
[21:15:11] um, ganglia is working
[21:15:16] i have not messed with it at all
[21:15:40] sweet sweet
[21:16:28] http://ganglia.wikimedia.org/latest/?c=Analytics%20cluster%20eqiad&h=analytics1006.eqiad.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2
[21:16:29] did you remove the metrics.properties file?
[21:16:36] nope
[21:16:37] can do though
[21:19:33] ottomata, can we look into configuring the hadoop log server?
[21:19:47] because we still can't look into the logs of a job that has finished
[21:19:55] we can only look into jobs that are running
[21:20:47] for example: http://analytics1004:8042/node/containerlogs/container_1352315593540_0003_01_000001/louisdang
[21:21:44] hmmm, yeah, hadoop log server?
[21:21:48] you can find them in the hadoop fs
[21:22:12] but yeah looking into it
[21:26:21] it might be this issue: https://issues.apache.org/jira/browse/MAPREDUCE-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412022#comment-13412022
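On the "hadoop log server" question: the usual way to make finished-job logs viewable on YARN is to enable log aggregation and point node managers at a history/log server. The property names below are standard YARN ones, but the values and hostname are assumptions, and whether this is also what MAPREDUCE-4428 covers is left to the ticket; the application id is derived from the container URL above.

# yarn-site.xml, shown here as comments rather than XML:
#   yarn.log-aggregation-enable         = true
#   yarn.nodemanager.remote-app-log-dir = /var/log/hadoop-yarn/apps
#   yarn.log.server.url                 = http://historyserver.example.org:19888/jobhistory/logs
# with aggregation on, finished-container logs can also be pulled from the CLI:
yarn logs -applicationId application_1352315593540_0003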