[00:04:07] so the kraken-pig, kraken-generic, and kraken-dclass jars that I need are in HDFS. So I can't access them in pig -x local mode. [00:05:07] But I would rather not work with pig in normal mode because it's very slow. [00:05:20] drdee, if you have any suggestion, I would love to hear ^ [00:05:36] and by very slow I mean it's taken me 3 hours to get absolutely nowhere [00:05:49] you can put the jars in your home folder [00:05:58] and use that for development [00:06:27] and if you want you can show me your console, maybe i can give some tips [00:06:50] oh yea, otto fixed my ssh so now I can do this maybe [00:06:52] k, i'll try [00:11:54] ok, I give up - how do I create the SNAPSHOT jars drdee? [00:12:07] mvn clean;mvn package; [00:12:13] and then copy them to an10 [00:12:26] i work on an02 on dschoon's instruction [00:12:35] doesn't matter [00:12:38] that's fine as well [00:12:58] cool, thanks, it compiled :) [00:13:49] awesome! [00:15:52] back [00:34:56] dschoon, how'd you build dclass jni thing? 
[00:35:04] ah yeah [00:35:26] you have to check out a different branch [00:35:31] i didn't realize that at first [00:36:19] it's called branch package [00:36:35] oh i see [00:36:43] and then init the submodules [00:36:44] and package [00:36:45] k [00:37:09] beyond that, i recall i had to also copy the dtrees into a system folder [00:37:14] for me, it was: sudo cp -R dtrees/* /usr/share/libdclass/dtrees/ [00:37:20] but it might be different for linux [00:37:39] yep, that makes sense, i'll put them where the maven error says [00:38:09] follow this in as far as it makes sense :: https://github.com/wikimedia/dClass/blob/package/README [00:38:13] that's for OSX [00:38:21] feel free to add Ubuntu specific instructions [00:39:14] glibtoolize [00:39:14] aclocal [00:39:14] autoheader [00:39:14] autoconf [00:39:14] automake --add-missing [00:39:16] ./configure [00:39:19] make [00:39:29] hello, this is how you compile the .so for the dclass lib [00:39:37] oh ok [00:39:43] thx average_drifter [00:39:43] but first, you must checkout the package branch from this repo [00:39:54] https://github.com/wikimedia/dClass/tree/package [00:39:55] i got that [00:40:23] i don't have glibtoolize... [00:40:29] milimetric: are you on a Mac ? [00:40:35] linux [00:40:36] i think for ubuntu it is libtoolize [00:40:57] yes [00:41:23] milimetric: sudo aptitude install libtool [00:41:27] oooh ok [00:41:39] general question: how do you guys know that? just search for libtoolize? [00:42:18] tinkering a lot :) [00:42:21] yes [00:42:23] hm, the submodule is not initializing average_drifter [00:42:28] but we are getting closer! [00:42:31] * milimetric is the worst tinkerer of all time [00:42:36] milimetric: you don't need the submodule [00:42:38] k [00:45:03] you can also consider just using the .deb http://garage-coding.com/releases/libdclass-dev/ [00:45:12] ugh [00:45:14] but there probably is a reason you're compiling it.. [00:45:37] drdee, were you aware that / was full on an03, an09, an26? 
[00:45:49] milimetric: ^^ [00:45:52] if no, i am emailing ot [00:45:53] otto [00:46:28] no, i was not :( but notpeter was setting up disk space monitoring today so we should also have our boxes monitored [00:46:30] ok, i solved that dclass problem, now another test fails [00:46:40] shoot [00:46:42] Failed tests: testIpad2(org.wikimedia.analytics.kraken.pig.UserAgentClassifierTest): expected: but was: [00:46:42] testIpod(org.wikimedia.analytics.kraken.pig.UserAgentClassifierTest): expected: but was: [00:47:22] milimetric: yes, that's because you need the openddr file in place [00:47:29] yep. [00:47:35] k, searching [00:47:37] what i said above [00:47:39] drdee: can milimetric use the .deb ? [00:47:44] copying the dtree files to the appropriate places [00:47:48] milimetric: can you use the deb ? [00:47:52] no idea bud :) [00:47:55] it does everything for you [00:47:56] i'm just trying to compile the jars [00:48:13] milimetric: are you on 32bit ? [00:48:20] the kraken-pig-blah-SNAPSHOT [00:48:26] deb should work if right architecture [00:48:38] milimetric: uname -a please ? [00:48:40] well, it matters what the analytics machines are on, not my local right? [00:48:43] i'm on 64 [00:48:46] great ! [00:48:48] wget http://garage-coding.com/releases/libdclass-dev/libdclass-dev_2.0.12_amd64.deb [00:48:54] dpkg -i libdclass*.deb [00:49:00] milimetric: did you copy the dtree files? [00:49:35] average_drifter: does the deb contain the dtree files? (IIRC i think it does) [00:49:35] milimetric: ^^ [00:49:39] drdee: yes [00:49:41] nope, nobody told me about dtree files [00:49:42] it contains everything needed [00:49:54] * milimetric is forever amazed at how people figure this out on their own [00:50:04] try the deb milimetric :) [00:50:04] trying! [00:50:04] :) [00:50:07] and, fwiw, i *did* tell you above. 
[00:50:24] average_drifter: put the deb on github.com/wikimedia/dclass in files section [00:50:32] drdee: ok [00:50:32] that is sort of useful :) [00:51:09] ok, build failure #3: [00:51:09] Tests in error: [00:51:10] testExec1(org.wikimedia.analytics.kraken.pig.GeoIpLookupTest): /usr/share/GeoIP/GeoIPCity.dat (No such file or directory) [00:51:33] i rsync'd them from the cluster. [00:51:34] sudo aptitude install libgeoip-dev libgeoip1 [00:51:37] milimetric: ^^ [00:51:44] no I saw you just told me dschoon, I meant before like before today [00:52:23] and I had already switched to trying the .deb when you said [00:52:53] mm same error after that average_drifter [00:53:05] you still need to copy the dat files from the cluster [00:53:07] i don't have that geoip database though [00:53:11] oh rsync it [00:53:11] k [00:53:16] milimetric: sudo su; cd / ; updatedb ; locate GeoIPCity.dat [00:53:27] no [00:53:31] i don't think i have it on my local box [00:53:33] milimetric: listen to drdee [00:53:44] just copy them from /usr/share/GeoIP/ on the cluster [00:53:47] k [00:53:57] copy all 4 dat files [00:54:25] when you say "cluster", do you mean any an** machine? [00:54:27] or... [00:54:29] yes [00:54:38] or at least any an1* machine [00:55:33] permission denied drdee [00:55:36] ? [00:55:55] try scp [00:56:16] rsync -Cavz an11:/usr/share/GeoIP/\*.dat ./ [00:56:46] I'm doing scp dandreescu@analytics1010.eqiad.wmnet:/usr/share/GeoIP/* /usr/share/GeoIP/ [00:57:10] try dschoon's command [00:57:30] yep, that works [00:57:53] is that because the rsync module has different permissions? [00:57:57] thanks dschoon [00:58:24] scp doesn't necessarily use your ssh config correctly [00:58:34] rsync is always better. [00:58:42] scp is more or less obsolete. [00:58:48] k so never use scp, got it [01:00:06] scp is ok too if you have this in your ~/.ssh/config [01:00:30] drdee: do you have any idea what's special about an03, an09? 
[01:00:39] even an26 has nothing listed in https://www.mediawiki.org/wiki/Analytics/Kraken/Infrastructure [01:00:42] this all worries me [01:01:03] no, i used the same hostname for scp and rsync average_drifter and scp said permission denied [01:01:20] i need to use dandreescu@analytics10**.eqiad.wmnet because my username is different on those machines [01:01:23] Host bastion1.pmtpa.wmflabs Hostname bastion.wmflabs.org ProxyCommand none [01:01:26] so maybe rsync gets that [01:01:26] Host bastion1.eqiad.wmflabs Hostname bastion2.wmflabs.org ProxyCommand none [01:01:29] Host *.pmtpa.wmflabs ProxyCommand ssh -a -W %h:%p bastion1.pmtpa.wmflabs [01:01:32] Host *.eqiad.wmflabs ProxyCommand ssh -a -W %h:%p bastion1.eqiad.wmflabs [01:01:35] Host *.wmflabs User spetrea [01:01:37] average_drifter: we know. we all have the appropriate proxy commands. [01:02:00] yep, i have all those, i think rsync must be smarter about my bad username situation - it's gonna be fixed at some point anyway [01:02:07] (still copying the .dat files) [01:02:17] https://gist.github.com/wsdookadr/fc50039b332fab2a85fd [01:02:42] dschoon: ah, ok [01:03:28] shit [01:03:34] drdee: they're all udp2log receivers [01:05:22] drdee: i think we should probably text otto [01:05:36] yes i agree [01:05:41] there's a chance we'll lose mobile data otherwise [01:05:44] okay. i'll do it. [01:06:03] milimetric: got the dat files? [01:06:17] nope, still copying [01:06:40] k [01:09:01] no response on phone [01:09:02] i also texted [01:09:10] :( [01:09:11] ^^ kraigparkinson, drdee [01:09:27] otto not responding? [01:09:28] i double-checked his number on https://office.wikimedia.org/wiki/Contact_list [01:09:36] correct [01:09:43] i believe there are 4 udp2log instances running in total on an03-an06 [01:09:57] the number is correct. I got a text from him.... 
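average_drifter's ~/.ssh/config paste above, reassembled into config-file layout for readability (hostnames and the User line are exactly as pasted; this is his labs bastion proxy setup, not necessarily what the analytics production hosts need):

```
Host bastion1.pmtpa.wmflabs
    Hostname bastion.wmflabs.org
    ProxyCommand none

Host bastion1.eqiad.wmflabs
    Hostname bastion2.wmflabs.org
    ProxyCommand none

Host *.pmtpa.wmflabs
    ProxyCommand ssh -a -W %h:%p bastion1.pmtpa.wmflabs

Host *.eqiad.wmflabs
    ProxyCommand ssh -a -W %h:%p bastion1.eqiad.wmflabs

Host *.wmflabs
    User spetrea
```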
[01:10:07] but no clue why / would fill up [01:10:10] dsc ~/w/w/k/tmp/dotfiles ❥ dsh -g kka -- pgrep -fl udp2log [01:10:10] an03: 25474 /usr/bin/udp2log --config-file=/etc/udp2log/webrequest --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=524288 [01:10:11] an04: 1685 /usr/bin/udp2log --config-file=/etc/udp2log/webrequest --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=524288 [01:10:11] not recently, but last week. [01:10:12] an05: 1516 /usr/bin/udp2log --config-file=/etc/udp2log/webrequest --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=524288 [01:10:14] an06: 30376 /usr/bin/udp2log --config-file=/etc/udp2log/webrequest --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=524288 [01:10:16] an09: 24170 /usr/bin/udp2log --config-file=/etc/udp2log --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=131072 [01:10:18] an08: 6562 /usr/bin/udp2log --config-file=/etc/udp2log --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=16384 [01:10:20] an26: 32283 /usr/bin/udp2log --config-file=/etc/udp2log --daemon -p 8420 --multicast 233.58.59.1 --recv-queue=524288 [01:10:29] so no. [01:10:37] all the listed boxes are udp2log receivers [01:10:46] and i checked on an26, it's looking for cp1044 [01:10:54] mmmmmm [01:10:55] that's a mobile varnish box [01:10:57] yup [01:11:04] have you mentioned this in the wikimedia-operations channel? [01:11:11] no. [01:11:16] they might be able to find someone that can help us in the short run, no? [01:11:20] only otto admins our boxes. [01:11:29] jesus. [01:11:34] txt from otto: [01:11:37] still worth asking. got the same advice from robla [01:11:47] "Hm. Ok Seder dinner ahhh" [01:11:49] (during the packetloss fun) [01:12:25] I deny everything. wait, what am I denying? [01:12:52] we're texting -- i'm going to call [01:12:56] lol, robla, I was suggesting that dschoon pings #wikimedia-operations to get help on the boxes with disk full warnings. [01:13:47] assuming ottomata is otherwise detained. 
[01:14:03] he says / won't affect udp2log [01:14:23] still weird that they are filling up [01:14:50] disks filling up never ends well [01:14:51] i will look into it [01:14:55] crisis is less critical [01:15:05] which machine is filling up, and which partition? [01:16:02] / on an03, an09 and an26 [01:18:04] root partitions filling up really never ends well. are those boxes expendable? [01:18:11] milimetric: were you able to build the jars? [01:18:31] they are not. [01:19:05] yeah, some help from the other opsen may be in order [01:19:13] i think that is wise. [01:21:29] yes drdee, jars are built! [01:21:35] AWESOME! [01:21:39] congrats [01:21:45] from now on it's easy breezy [01:21:45] ops is ignoring me. [01:22:09] * average_drifter is happy if you're happy [01:22:23] :) [01:22:35] thank you very much for all your help dschoon, average_drifter, drdee [01:24:16] ugh [01:24:47] /var/lib/hadoop is 700G [01:24:50] this is a conf bug [01:25:14] logs? [01:25:20] Leslie must be AFK at the moment [01:46:41] okay, i think we're okay now [01:46:47] only an03 was handling customer data [01:49:00] robla, drdee [01:49:02] ^^^ [01:49:35] k [01:50:09] * robla skims #wm-ops backlog [01:51:13] dschoon: great, glad to hear. [01:51:40] is an03 sorta limping along, or is it pretty much fixed? [01:51:43] we'll look into the root cause tomorrow [01:52:05] an03 has 1.5G free [01:52:11] sounds reasonable [01:52:15] this hasn't changed in the last 10m [01:52:42] i've kept otto up to date [01:53:04] I wonder if that torrent server I was running there had anything to do with it [01:53:11] >:( [01:53:16] whut? [01:53:29] ok...I'll stop :) [01:53:40] how many times have we told you? [01:53:49] an21-24 are the porn boxes? [01:53:50] INFORMATION JUST WANTS TO BE FREE [01:53:55] jesus. [01:54:10] time for a goddamn glass of wine. [01:54:36] yes, sounds like it. enjoy! [03:49:54] that your doing, ottomata? 
[03:50:31] i see disk on an09 and an26 resolving itself magically [03:54:24] yes [03:54:29] so [03:54:46] i was using those when I was hunting down udp2log packet loss problems (that turned out not to exist) [03:55:10] in order to capture short bursts of logs, I wrote unsampled logs to disk by turning udp2log on and then off again [03:55:18] looks like udp2log started back up [03:55:24] i've edited the config files there to turn it off [03:55:26] as for an03, i don't know [03:55:28] looking now [03:57:30] what did you delete from an03 to free up space? [03:58:32] ahh [03:58:42] an03? [03:58:49] ja [03:58:49] ottomata: i think mutante gzipped a log file [03:59:01] i recommend reading over the ops channel later [03:59:14] all the chat about it was there [03:59:25] plus, uh, your phone :) [03:59:43] i'm sure #ops will make more sense to you [04:00:13] and as a reminder, i deleted /var/lib/hadoop/data on an03 (because you ok'd it!), even tho that was silly [04:00:21] i agree it doesn't matter, though [04:00:40] an03 definitely doesn't show up as a datanode [04:00:48] milimetric: were you asking about profiling code in nodejs ? [04:00:55] dschoon: were you interested in this ? [04:01:05] nope. [04:01:08] ok [04:01:12] ottomata: lmk if you have other qs [04:02:09] hey average_drifter not recently [04:02:21] but I was trying to profile knockout js code [04:03:04] http://substack.net/heatwave_node_knockout_2011 [04:04:14] heh, that's funny [04:04:31] dschoon, we should get kraken webrequest loss into ganglia! :) [04:04:34] that would be real nice [04:04:37] it's the client side code that needs optimizing in limn though [04:04:40] unfortunately [04:04:54] more importantly, we should get / freespace into ganglia :P [04:04:59] ottomata: can I help with getting that into ganglia ? 
[04:05:08] i don't give any shits about aggregate disk [04:05:37] that should be in nagios, afaik / should be in nagios, but nagios alerts from the analytics cluster never worked right [04:05:44] yeah. [04:05:46] but i mean [04:06:00] the ganglia disk_free metric appears to be the aggregate of / + jbod [04:06:05] https://icinga.wikimedia.org/icinga/ [04:06:06] we should split them into /, jbod [04:06:12] + jbod?! [04:06:28] either that, or the numbers here make 0 sense: [04:06:41] aww, i can't link to icinga? [04:06:48] there we go [04:06:49] https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=analytics1003&service=Disk+space [04:07:13] http://ganglia.wikimedia.org/latest/graph_all_periods.php?title=&vl=&x=&n=&hreg%5B%5D=analytics10&mreg%5B%5D=disk_free>ype=line&glegend=show&aggregate=1 [04:07:24] that number means nothing to me [04:07:41] i mean, i'm sure i'm wrong and it means SOMETHING [04:07:50] but clearly the worker nodes are including more than just / [04:08:20] hm [04:08:21] http://ganglia.wikimedia.org/latest/graph.php?r=4hr&z=xlarge&c=Analytics+cluster+eqiad&h=analytics1003.eqiad.wmnet&v=1670.212&m=disk_free&jr=&js=&vl=GB&ti=Disk+Space+Available [04:08:29] i'm going to go back to beating spelunky now [04:08:33] we can talk more tomorrow [04:08:41] ooook bye [13:20:59] morning [13:50:33] mornin! [14:06:04] mooooorning [14:32:28] heya drdee [14:32:51] erggh, 1 sec... [14:33:17] better [14:33:21] yeah, drdee, you there? [14:37:28] yo [14:37:49] any idea what caused the full root partitions? 
[14:38:43] on 9 and 26, yes [14:39:00] errant udp2log procs leftover from the packet loss sleuthing, i turned them off, but you know what happens [14:39:07] i commented out the lines in udp2log config files there now [14:39:10] on 03, i don't know [14:39:23] but, i callllled you, [14:39:34] oh, and 09 and 26 do nothing right now [14:39:39] k [14:39:39] 03 is one of the 4 udp2log producers [14:39:43] so that is a worry if it busts [14:39:45] but ja [14:39:47] but [14:39:47] so [14:39:50] yup [14:39:55] the 5xx log on locke is huge in the last few days [14:40:10] historically it is approximately 5MB per day [14:40:24] on march 20-22, 50-100MB [14:40:25] and now [14:40:33] march 23-25, 350-550MB [14:40:42] that's compressed [14:41:00] how often does your SSH session freeze up? [14:41:03] i see mostly requests like this: [14:41:04] http://commons.wikimedia.org/w/index.php?title=MediaWiki:Filepage.css&action=raw&maxage=2678400&usemsgcache=yes&ctype=text%2Fcss&smaxage=2678400 [14:41:08] milimetric, not very often [14:41:20] mine stutters a lot but once every hour or so it freezes and I can't get control back, I have to quit [14:42:10] ottomata, i am looking, 1 sec [14:43:15] btw, I just moved 5xx over to gadolinium using udp-filter [14:43:54] k [14:45:33] it seems mostly to be the Mediawiki:Filepage.css file, definitely worth mentioning this in #mediawiki and #wikimedia-operations [14:48:53] yeah [14:48:55] ok [14:56:59] ok, drdee, in other news [14:57:25] all filters except for webstatscollector are on gadolinium now [14:57:40] that one is annoying, so many unknowns with that one [14:57:55] like what? [14:58:15] like, how do the files get to dumps.wikimedia.org? how is collector launched? [14:58:29] the first question we did answer a while back ...... [14:58:39] let me poke apergos about that (again) [14:59:05] ok…i poked around for it yesterday but didn't find it [14:59:35] just asked apergos [14:59:45] about the collector [15:00:14] no init.d no nothing? 
[15:01:14] ... happened again... [15:02:09] what happened? ssh timeout? [15:03:35] ja nothing [15:03:38] that i can see [15:03:54] drdee, can we just leave locke as the webstatscollector machine? :p [15:05:16] from apergos [15:05:16] pagecounts go via cron from spanoshot1 [15:05:33] snapshot 1 that is [15:05:41] snapshot! [15:05:49] yup [15:05:54] probably not in puppet though [15:06:04] ha, totally not [15:07:45] ok cool, I see it [15:07:48] I can make that work [15:09:33] drdee, we renamed webstatscollector master branch, right? [15:09:39] the one I should be looking at is master, not time_travel? [15:11:56] yes [15:11:59] look at master [15:12:59] k [15:19:45] drdee: I sent an e-mail about the csv [15:19:53] drdee: I still haven't found how to run WikiReports.pl [15:19:54] to erik? [15:19:59] drdee: to Erik, you, and Kraig [15:20:05] ok ty [15:25:02] ottomata1, drdee, so my SSH resets after every major dump statement I make. [15:25:02] or every 20 minutes [15:25:02] and by resets I mean I lose my work, have to kill the window and open a new one [15:25:03] but! I've got a pig script ready for oozie [15:25:18] that's annoying and weird! [15:25:24] well that is real annoying [15:25:31] if you have connection probs [15:25:35] work in a screen [15:25:41] that will help at least with not losing your work [15:26:12] but! I've got a pig script ready for oozie :) [15:26:29] so I'm reading your oozie tutorial again andrew and will ask you as I have questions [15:27:22] drdee, is this good enough for output? 
[15:27:24] (2013-03-25,-,185698) [15:27:24] (2013-03-25,Android,314583) [15:27:24] (2013-03-25,iPhone OS,566736) [15:27:24] (2013-03-25,Symbian OS,13242) [15:27:24] (2013-03-25,Windows CE,6) [15:27:25] (2013-03-25,BlackBerry ,105) [15:27:25] (2013-03-25,BlackBerry OS,144) [15:27:26] (2013-03-25,Windows Phone,6) [15:27:26] (2013-03-25,Windows Phone OS,17513) [15:27:27] (2013-03-25,Windows Mobile OS,366) [15:27:27] (2013-03-25,Linux Smartphone OS,9) [15:27:28] (2013-03-25,BREW,10) [15:28:50] I think tomasz wants it at the platform level, you can use the parentId property i believe [15:28:55] see https://mingle.corp.wikimedia.org/projects/analytics/cards/92 [15:29:39] oh right, doh [15:29:51] i got wrapped up in the pig stuff, forgot what I was doing :) [15:33:35] milimetric: if your ssh breaks every now and then.. try GNU screen or tmux [15:33:48] milimetric: that will allow you to continue where you left off when your ssh breaks [15:34:29] :) honestly, knowing how hard setting that up probably is, I'd rather just let it break every 20 minutes for now [15:34:50] thanks though, I'm sure that would help [15:38:55] it's not hard :) [15:38:57] wget http://garage-coding.com/.screenrc [15:39:03] put it in your $HOME [15:39:13] then you can do F9 => New Window [15:39:26] F10 => Detach [15:39:29] F12 => Quit [15:39:44] ottomata: did you improve the packet loss script by any chance? (https://mingle.corp.wikimedia.org/projects/analytics/cards/442) [15:39:55] CTRL+A P => previous window [15:40:01] CTRL+A N => next window [15:40:19] screen -x to attach to the screen [15:40:25] that's all [15:40:36] no [15:40:43] drdee, didn't know i was supposed to ? [15:40:56] not supposed to just checking if you worked on it [15:41:00] emit an event to ganglia?? [15:41:16] as part of all the work that you have been doing for gadolinium [15:41:20] ok [15:43:28] drdee, webstats collector always outputs to cwd dumps/, right? 
[15:44:17] yes i believe so, we add a conf param but that code never got deployed and is now rousting in the old_master branch [15:44:26] right [15:56:39] woa! If I wait about 10 minutes it gets unstuck [15:57:32] is diskfullageddon over? :-) [15:59:27] hey ottomata [15:59:43] why is the blog.sh script not updated on stat1 while the gerrit changeset has been merged [16:00:04] bwerrrr [16:00:07] where is it? [16:00:34] https://gerrit.wikimedia.org/r/#/c/55390/ [16:00:36] oh because of this [16:00:36] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find resource 'Class[Nrpe::Packages]' for relationship on 'Nrpe::Check[check_ [16:00:41] ergh [16:00:45] drdee, looking at your UserMetrics cards now. [16:00:50] something leslie and peter were working on, ghmmm [16:00:58] grumble [16:00:59] k [16:01:54] kraigparkinson: ok [16:02:37] will ask [16:03:49] how do I add progress information to a card on mingle ? [16:03:53] kraigparkinson: ^^ [16:04:12] as in change the status? or add information for additional context? [16:04:24] additional context, progress information [16:04:33] like.. where we're at right now [16:05:47] Can add as a comment, or create a new header (h1. Implementation Notes) at the bottom of the card and add details [16:21:47] drdee, I think #419 is weak. We should either get a really smart rationale for it or change the class of service to standard. [16:22:06] * drdee is looking [16:22:39] yes i agree, it seems to me that the admin by default can see all cohorts [16:23:05] yes, otherwise we're getting into user access issues that I think we should defer. [16:23:22] i certainly hope so [16:25:22] kraigparkinson: could you add a google hangout link to the Analytics Leadership Scrum meetings? [16:25:44] sure, but I figured we were going to use the stand-up scrum and roll through… that OK? [16:26:04] er scrum hangout. 
[16:27:17] sure, but can you give me modify rights to the event so I can add the scrum hangout link [16:27:17] ? [16:27:35] sure [16:28:00] done [16:37:08] drdee: I am currently blocked [16:37:30] I'm updating mingle about the progress on #60 [16:39:23] ok, look at 353 and detail what needs to happen to get wikistats run from master again [16:39:39] ok [16:39:40] it's not running from master right now but from a commit sometime back in october [16:45:14] drdee [16:45:15] https://gerrit.wikimedia.org/r/#/c/55900/ [16:45:57] any objections? [16:46:03] * drdee is looking [16:46:29] no, average_drifter might be able to help out [16:46:55] well, its good, i just want a quick and dirty deb [16:47:05] i've got it working [16:47:17] (mornin) [16:47:24] if i could install make and g++ and libdb-dev etc. on gadolinium without puppetizing, i would just clone and compile there [16:47:31] but, since people might get upset about that [16:47:40] i figured a quick and nasty deb package is better than any of that anyway [16:48:03] so, if you've got no objections, i'm going to merge, build, put in apt, and puppetize webstats collector [16:48:13] morning dschoon! [16:48:17] hihi [16:48:42] dschoon, we should really move the different stats in public zero into their own hierarchies [16:48:52] having multiple stats inside a single time bucket is weird [16:48:56] makes things weird for pig [16:48:59] and for limnification [16:55:45] hierarchies and cohorts [16:55:52] * average_drifter is confuzzled [16:58:23] ottomata: it's not bad, actually [16:58:26] you can glob [16:59:15] not with pig easily, also, concat_sort just uses the whole hierarchy [16:59:16] ottomata , average_drifter: scrum [16:59:23] on my way [16:59:28] average_drifter why are you confuzzled? [17:01:19] drdee: I dunno what cohorts are [17:01:30] scrum! :) [17:06:02] i mean, pig supports globs [17:06:43] ottomata: data = LOAD '/wmf/public/webrequest/zero_carrier_country/2013/03/*/*/carrier/*' AS ... 
[17:06:51] that does what you think it does [17:14:47] ja but its real annoying, you have to customize everything then [17:14:57] and, actually, i'm not sure if that works in oozie? [17:15:03] i've had trouble with wildcards [17:15:34] but still, how is one to know what cohorts(?) are in the hierarchy without going to the leaf dir [17:15:42] cohorts should be toplevel [17:16:35] any of these rooms good: [17:16:36] Add 6Flr - R62 - Jaucourt, Add 6Flr - R63 - Geoffrin, Add 6Flr - R64 - Minor, Add 6Flr - R67 - Inversion Table [17:16:57] 6Flr - R62 - Jaucourt 6Flr - R63 - Geoffrin 6Flr - R64 - Minor 6Flr - R67 - Inversion Table [17:19:29] ahh [17:19:30] fair. [17:32:56] omw [18:03:39] drdee, btw, puppet just ran on stat1, the blog script got updated [18:04:10] thx ottomataa [18:29:25] hey milimetric, can you merge https://gerrit.wikimedia.org/r/#/c/55924/? [18:30:36] looking YuviPanda [18:32:43] brb lunch [18:33:41] thanks milimetric [18:35:14] growl, my rebuild filter is segfaulting :( : ( :( [18:35:20] webstats filter [18:35:39] done YuviPanda [18:35:40] can i help? [18:35:45] milimetric: thank you :) [18:35:48] maybe? [18:35:50] I'll deploy :) [18:35:55] one comment - limnpy seems to generate annoying spaces at the end of the lines [18:36:03] milimetric: oh? [18:36:04] in the graph definition json files [18:36:13] i added the comment in gerrit [18:36:14] milimetric: I didn't generate this with limnpy, i just copied a prior graph and added [18:36:15] okay [18:36:20] oh gotcha [18:36:20] I'll clean them up [18:36:28] i had removed it from the existing graphs... hm [18:36:29] drdee, how do I debug filter? :/ [18:36:44] average_drifter: can you help ottomata with filter [18:36:54] it's segfaulting. did you push your latest fixes [18:36:54] ? [18:36:59] hm Program received signal SIGSEGV, Segmentation fault. 
[18:36:59] 0x000000000040092f in replace_space () [18:37:27] i believe that average_drifter fixed that one [18:37:30] maybe he forgot to push [18:37:35] so the next time this runs YuviPanda, it should automatically pick up your new script and generate the target csv. But yeah, you still need to deploy to get the graph up on kripke [18:37:42] yup [18:37:47] i'm doing some cosmetic changes too [18:37:50] (changing labels and stuff) [18:37:56] cool, good [18:37:59] so will deploy after that [18:38:12] and I haven't forgotten about documentation on nodeType(s) [18:38:31] I've been busy with looking at User Metrics API and Pig though [18:38:41] oh but maybe that's because I'm just inputting whatever [18:38:47] ottomata: can you show me what code segfaulted please? [18:38:48] it doesn't break with the entries-with-urls-with-spaces-2013-02-10.txt file [18:39:03] (back but eating) [18:39:08] milimetric: :) okay! Looking forward to it [18:39:11] I think we're a bit de-sync-ed on what branch we must use and stuff [18:39:38] average_drifter, i'm using master HEAD [18:39:39] we need to get udp-filters into CI again [18:39:42] milimetric: you need to hit the Publish and Submit button in gerrit [18:39:46] and this is webstatscollector [18:39:47] kraigparkinson, drdee: we did not discuss https://mingle.corp.wikimedia.org/projects/analytics/cards/469 but it seems to be in a sprint [18:39:49] filter.c [18:40:02] ottomata: webstatscollector, I'm going to pull [18:40:03] (just noting) [18:40:07] ottomata: can you provide me with segfaulting input ? [18:40:10] please ? [18:40:11] :) [18:40:23] don't have it yet! [18:40:25] haha, trying to find it myself too [18:40:33] it doesn't segfault on a tail of the sampled file [18:40:33] so hm [18:40:53] drdee, is that one we can punt? [18:41:06] drdee: yes [18:41:24] I think YuviPanda's still making changes drdee. But you're saying after +2-ing it, I still have to click "Submit Patch Set 1"? 
[18:41:36] milimetric: yes, you have to hit 'submit' [18:41:44] i think we need to discuss which features we are going to slot in the next 2-3 weeks [18:41:44] gerrit is not the brightest kid in the class, I think [18:41:55] gotcha, hitting [18:42:01] it seems that we have already more than enough in Queued for Dev [18:42:37] drdee, shall we chat about that this afternoon? [18:43:08] yes! would love to [18:44:39] ottomata: did it segfault on today's data ? [18:44:43] ottomata: or some other day ? [18:45:38] drdee, have you had the chance to look at the big ole list of defects and figure out roughly if/when they should go into the backlog or not? [18:45:47] ottomata: can you update https://www.mediawiki.org/wiki/Analytics/Kraken/Infrastructure with the current roles of the machines? [18:46:06] i realized last night i didn't know which were critical to the mobile features, and which were not [18:46:33] (as there were six or eight udp2log receivers, i guess many were just for diagnostic purposes?) [18:47:15] kraigparkinson: yes i have been looking at them many times :( but not sure what to do with them [18:47:41] huh. [18:47:41] ok, then let's chat about that in addition to the next 3-4 sprints of stories. [18:47:47] what a great word, dia-gnosis [18:47:50] apart-knowing [18:47:52] average_drifter, found it [18:47:52] drdee, I'll send an invitation for 1pm pacific. [18:48:08] drdee ^^ :) [18:48:15] k [18:48:26] https://gist.github.com/ottomata/5248039 [18:48:30] ottomata: looking [18:48:33] its very looooong [18:48:58] ottomata that's CRAZY! [18:49:09] ottomata, could you look at the Puppet/Redis thing again when you have a chance? https://gerrit.wikimedia.org/r/#/c/54970/ [18:49:10] i mean, there might be others [18:49:16] it segfaults on the live stream over few seconds [18:49:19] thats just one example I found [18:49:34] heh [18:49:46] > buffer size for the line? [18:49:51] or greater than the buffer for the URL? 
[18:50:00] probably the latter [18:50:10] line buffer is 64k [18:50:24] amusingly, IE6 has a URL buffer that is only 2k [18:50:36] ottomata: I'll create a test with that [18:50:37] it just drops everything after 2048 [18:51:01] 0x000000000040092f in replace_space (url=0x0) at filter.c:95 [18:51:01] 95 int len = strlen(url); [18:52:04] 4453 characters, it is. [18:52:06] url is 4455 chars [18:52:10] :) [18:52:14] JINX [18:52:46] i wonder if i can find it on the referer page [18:52:54] wait [18:52:55] kraigparkinson: can we do it after quarterly review meeting prep? i need to attend GLAM sprint at 1pm [18:53:14] it appears to be a URL appended with the *page content* [18:53:15] sure. can we push to 30 min after that? [18:53:16] http://www.invertia.com/mercados/bolsa/indices/mdo-continuo/resultados-ib011continu/4T12 [18:53:25] yes [18:53:37] OHH [18:53:57] "select text to get the definition with wikipedia" [18:53:58] ok. Every time I hear you mention GLAM, I have this mental image of you wearing a feather boa. [18:54:00] ah-hahahaha [18:54:15] think hotpants, kraigparkinson [18:54:17] hotpants [18:54:27] brbrb [18:54:27] dschoon, re infrastructure, done. [18:54:31] I'm going to stab my eyes with hot pokers over lunch. [18:56:30] average_drifter: did erik answer your questions? [18:56:36] and is it clear what needs to happen for 353? [18:59:27] drdee, I'm changing the values kraken assigns to wmf_mobile_app. For example, "Wikimedia App Android" to "Android" as card 92 says [18:59:30] drdee: first question, no [18:59:31] that ok? [18:59:37] yes! [18:59:40] drdee: I am still reading 353 and thinking about it [18:59:49] no need to think about it :) [19:00:09] you need to try to run wikistats from master and document the issues in card #353 [19:00:27] it's just about documenting what is not working [19:00:39] drdee: ok, I will do that [19:01:07] ottomata: is webstatscollector urgent? should I have a look on it, write a test, patch and send you a deb ? 
[19:08:44] back [19:08:47] ty, ottomata [19:09:10] ottomata: i recall you did some stuff for an09 an26 last night [19:09:31] what was the root cause again? something about test instances of udp2log? [19:09:35] oh, right [19:09:36] unsampled [19:09:39] for seq stuff [19:09:51] yeah [19:09:57] on an03 i dunno though [19:10:00] hm [19:10:07] that's a little worrisome [19:11:19] we gzipped kafka.log, iirc [19:11:28] do we have logrotate or something around that? [19:11:30] rather [19:11:32] it's java, obv [19:11:42] log4j has built-in rotation support [19:11:55] could you check what our conf looks like? [19:12:07] average_drifter [19:12:10] don't worry about sending me a .deb [19:12:19] but if it was quick to fix the segfault that would be very helpful [19:12:31] this is one of the last pieces in deploying gadolinium [19:12:37] if you can fix the segfault and push to master that will be good enough [19:14:35] ok [19:14:38] dschoon, an03 [19:14:49] /etc/kafka/log4j.properties [19:14:53] ok [19:15:18] ja [19:15:22] so, no rotator [19:15:27] i'll find the conf for that [19:16:24] log4j.appender.WHATEVER=org.apache.log4j.DailyRollingFileAppender [19:16:42] ok, will add [19:16:49] ottomata: check out log4j.properties for hadoop [19:16:55] and you'll see there's a lot more than just the class [19:17:06] max file size, backup schedules, etc [19:28:26] drdee, pushed the pig script I've been working on along with the kraken updates. Running it on 15 minutes of data gives this result: [19:28:26] (2013-03-25,Firefox OS,1) [19:28:27] (2013-03-25,BlackBerry PlayBook,5) [19:28:27] (2013-03-25,Android,47) [19:28:27] (2013-03-25,iOS,564719) [19:28:35] doesn't sound right... [19:28:45] 1. cool, grats! [19:28:50] I've gotta run pick up my car from the shop (drunk kid ran into it the other night) [19:28:55] 2. aiight.
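The DailyRollingFileAppender class mentioned above needs a few companion settings before anything actually rotates. A sketch of what the entry in /etc/kafka/log4j.properties could look like; the appender name, file path, and patterns here are assumptions for illustration, not the deployed config:

```properties
# Hypothetical rotating appender for /etc/kafka/log4j.properties --
# DailyRollingFileAppender rolls the file on the DatePattern boundary
# (here: once per day, suffixing the old file with '.yyyy-MM-dd').
log4j.rootLogger=INFO, kafkaAppender
log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaAppender.File=/var/log/kafka/kafka.log
log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
```

Note that DailyRollingFileAppender rotates purely by date; the max-file-size and backup-count knobs mentioned for the hadoop conf belong to org.apache.log4j.RollingFileAppender instead.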
i need to catch up on my scripts, so i can't look now [19:28:56] lame [19:28:58] wow [19:28:59] thx dschoon, and thx for all the help [19:29:00] that sucks [19:29:02] np [19:29:08] once i'm caught up, we can dig into it [19:29:16] yeah numbers seem a bit off :) [19:29:26] cool, maybe set up oozie [19:29:47] well, I used the probably false assumption written here about the UA string: http://www.mediawiki.org/wiki/Mobile/User_agents [19:30:47] which is that the iOS string is just Mozilla 5.0 (anything but Safari)* iPhone (anything but Safari)* [19:30:58] but yeah, i'll be back in an hour or so [19:54:38] ottomata: do we have an oozie-site.xml? [19:55:01] also, could we get the rest of the conf dirs into the kraken project? [19:55:39] and what machine is blessed for conf updates? [19:56:43] okay, answered my first question. it's all in /etc/oozie [19:57:20] kraigparkinson: we can do the release planning in 5 minutes if you want [19:57:39] dschoon, jajjaa [19:57:40] good idea [19:58:07] i think they are: /etc/{oozie,hive,pig,sqoop,hbase,zookeeper} [19:58:10] yeah [19:58:12] they are [19:58:18] well, hbase wasn't puppetized [19:58:23] dunno about the others [19:58:29] i have local copies, randomly pulled down from a server [19:58:30] but yeah [19:58:33] they all have useful stuff [19:58:43] i never did the symlink on an10, so we can choose any machine to be the conf deployer [20:05:05] ottomata: do an08 an09 do anything atm? [20:05:11] btw, the namenode is pretty low on space [20:05:16] like, <10% free [20:05:17] iirc [20:05:31] back [20:06:10] no [20:06:28] /dev/md2 99G 603M 93G 1% /var/lib/hadoop/name [20:06:30] looks ok to me dschoon [20:06:35] /dev/md0 19G 14G 3.5G 81% / [20:06:37] has 3.5G [20:06:48] this is just the default partitioning scheme ops gives us [20:06:50] small / dirs [20:06:59] if we expect things to write data we're supposed to create new partitions for them [20:12:43] oh [20:12:46] my bad.
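Since the quoted rule excludes "Safari" both before and after "iPhone", it collapses to a simple substring test: the UA starts with "Mozilla/5.0", mentions "iPhone", and never mentions "Safari". A sketch of that predicate; the function name is hypothetical, and kraken's real check is a Java regex inside a Pig UDF, not this C:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical predicate implementing the wiki-page rule quoted above:
 *   Mozilla 5.0 (anything but Safari)* iPhone (anything but Safari)*
 * "Anything but Safari" on both sides of "iPhone" means the token
 * "Safari" must not appear anywhere in the string. */
bool is_ios_app_ua(const char *ua)
{
    if (ua == NULL)
        return false;
    if (strncmp(ua, "Mozilla/5.0", strlen("Mozilla/5.0")) != 0)
        return false;
    return strstr(ua, "iPhone") != NULL && strstr(ua, "Safari") == NULL;
}
```

Note this also matches any non-Safari iPhone browser, which is consistent with the worry voiced later in the session about accidentally grabbing regular iOS pageviews.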
[20:12:52] i was probably looking at something else [20:12:59] ottomata: the env variable 'OOZIE_URL' is used as the default value for the '-oozie' option [20:13:05] * dschoon adds to .profile [20:13:40] haha, yeah [20:13:55] on my to-do list to make that all streamlined (oozie-env.sh?) [20:15:59] argh [20:16:00] Coordinator Rerun will not re-read the coordinator.xml in hdfs. [20:16:12] what does the -env.sh script do? [20:18:14] it's used by the oozie daemons to set some env stuff, but I betcha it could be used by the oozie cli too, dunno [20:18:17] that's why it's on my todo list [20:18:58] gotcha [20:19:01] so uh [20:19:13] ottomata, do you have any idea how to get oozie to refresh a coordinator.xml file? [20:19:38] because i'd like to update the extant jobs (as they have a pause time that matters for re-materializing the runs) [20:19:47] and if i kill-readd, i lose the pause info [20:20:02] i could work around it, but this seems so basic... [20:20:06] yeah, i don't think you can [20:20:08] you have to restart them [20:20:13] that's what i tried to do the other day [20:20:20] when i was messing with the oozie mysql db [20:20:20] and http://oozie.apache.org/docs/3.3.1/DG_CoordinatorRerun.html says no [20:20:24] boo [20:20:31] yeah, you can do it with workflows [20:20:35] but not coordinators [20:20:42] i am now reading about bundles [20:20:47] in case that lets me do it [20:22:16] yeah, it seems like bundles are what we want [20:22:22] oh? [20:22:27] they represent a coord+properties [20:22:53] and you can actually define any number of coordinators with a bundle [20:23:05] oo reading, haven't read this!
[20:23:18] back :) [20:24:18] it seems like it'd reread the coordinator definition(s) whenever it runs the bundle [20:24:29] http://oozie.apache.org/docs/3.3.1/BundleFunctionalSpec.html [20:25:11] oh [20:25:13] never mind [20:25:21] it says it just triggers coordinator rerun [20:25:43] meh [20:25:48] yeah [20:26:00] not quite sure what they are for…i guess just another abstraction? [20:26:04] yeah. [20:26:14] like if you have several related jobs with different frequencies [20:26:16] and time steps [20:26:37] like if i have an hourly job, and then a monthly rollup that runs less frequently [20:26:46] okay. [20:26:53] i guess i'll just delete the current coordinators [20:27:10] and resubmit with the correct start times after fixing their paths [20:30:19] ottomata: could you email me SMTP info to use for job-fail notifications? [20:31:45] ha [20:31:46] uhhh [20:31:56] sometimes I think you think I know more than I do :) [20:32:52] ah, i think this is all I've got [20:32:52] mchenry.wikimedia.org [20:32:57] is our outgoing smtphost [20:33:47] hm, that is actually in oozie-site.xml [20:33:57] okay. [20:34:03] <property> [20:34:04] <name>oozie.email.smtp.host</name> [20:34:04] <value>mchenry.wikimedia.org</value> [20:34:04] </property> [20:34:04] <property> [20:34:04] <name>oozie.email.smtp.port</name> [20:34:05] <value>25</value> [20:34:05] </property> [20:41:35] mk [20:54:01] hey milimetric [20:54:03] can you merge https://gerrit.wikimedia.org/r/55957 [20:54:08] minor copy editing patch [20:54:18] k [20:54:42] ottomata: ages ago, did you say dsync didn't work for you? [20:55:16] because it seems to work fine for me... [20:55:23] done [20:55:44] i haven't tried in months [20:55:47] dsync -m an02 -m an03 -v -- ./file* ./foo HOST:blah/ [20:55:56] i hadn't either [20:55:58] milimetric: thanks :) [20:56:01] but it worked right now when i tried it! [20:56:04] nice!
[20:56:47] i'll note you basically need to read the help [20:56:53] as it's a weird mashup of dsh and rsync [20:58:19] drdee: Erik answered [20:58:29] ok [20:58:31] drdee: looks like I have a formatting problem which I need to fix [20:58:36] ok [21:00:58] milimetric: I deployed to http://mobile-reportcard-dev.wmflabs.org/ but seeing no changes at all... [21:02:13] cool, ok, so this *looks* like it's still serving the old files: http://mobile-reportcard-dev.wmflabs.org/graphs/unique-uploaders.json?pretty=1 [21:02:21] let's take a look on kripke and see what's actually there [21:02:34] milimetric: hmm, okay [21:03:29] mmm, yeah, so the json wasn't updated [21:03:34] did the deployment succeed? [21:03:43] what command did you use? [21:03:47] YuviPanda: ^ [21:03:52] milimetric: fab mobile_dev deploy [21:04:01] and did you get any errors? [21:04:04] milimetric: no [21:04:13] i can pastebin the output if you want [21:04:14] let me try it then [21:04:18] no it's cool [21:05:01] are you going to delete the github repo anytime? [21:05:27] milimetric: http://pastebin.com/9Y2VXpRk [21:05:39] milimetric: i am unsure if i have delete rights [21:05:46] but yes, we should delete it to avoid confusion [21:05:50] milimetric: i'll ask brion to delete it now? [21:06:12] hm? [21:06:14] i'll delete it YuviPanda [21:06:17] ah okay [21:06:23] yay [21:08:05] ottomata: kindly check on that disk alert for an10 that just fired [21:08:20] we kinda need the namenode, i guess? [21:09:39] YuviPanda: The deployer didn't correctly handle *changing* the repository after initial deployment. So it was still doing git pull from the existing directory [21:09:48] aah! [21:09:50] right [21:09:50] ottomata: nm, it's not real [21:09:52] you're welcome to try to fix that, but it's not high priority [21:10:01] i'll delete the old directories in mobile and mobile_dev [21:10:04] and redeploy [21:10:09] milimetric: okay, thanks!
[21:11:29] oh try it now YuviPanda [21:12:24] milimetric: thanks :) [21:13:23] milimetric: <3 :) [21:13:44] heh, glad you're happy [21:13:54] it looks much better with your labels btw [21:14:12] YuviPanda, while I have you, I wanted to ask a question [21:14:18] sure! [21:14:27] do you have any sense of about how many "page views" you're expecting to see from each mobile app per day? [21:14:34] or per hour or anything [21:14:42] dschoon, what's up? [21:14:48] milimetric: not more than whatever was available in the older ua tables [21:14:51] i'm working on the pig script that's tracking this and I want a sanity check [21:14:56] milimetric: I do have data on how many users have the app installed [21:15:07] yeah, I had that as well [21:15:24] 6,512,000 [21:15:28] other than that...nope. [21:15:34] k, thanks [21:15:53] :) [21:36:25] anyone using sublime? [21:36:41] any tree navigator plugins for it? [21:44:09] i use it occasionally, but not really [21:44:22] average_drifter, any news on the segfault? :D [21:45:21] ottomata: I'm working on some wikistats thing to finish it and have something for tomorrow to show [21:45:46] ottomata: after I finish that (1h), I will get back to webstatscollector and fix that [21:47:46] ok cool, danke [22:11:03] kraigparkinson: can't work on slide deck, google app drive is unreachable :( [22:15:57] drdee: pushed an update to the iOS mobile app regex [22:16:06] nice! [22:16:09] these are the results now: [22:16:09] (2013-03-25,iOS,47533) [22:16:09] (2013-03-25,Android,47) [22:16:09] (2013-03-25,Firefox OS,1) [22:16:09] (2013-03-25,BlackBerry PlayBook,5) [22:16:14] mmmmm [22:16:16] I don't think they make too much sense still... [22:16:18] BUT [22:16:37] I'm pretty confident the regex is implementing the rule as laid out on that wiki page [22:17:00] SO, I think we can deliver this to them, along with the regex we used.
[22:17:15] and tell them that until that bug is fixed, we can't be super confident that we're not grabbing some regular iOS pageviews [22:17:32] so this data is for 15 minutes [22:17:48] ok [22:17:56] and for comparison, the total number of log fields that satisfy IS_PAGEVIEW is over 1 million [22:18:06] but still, both Android and Firefox look suspiciously low [22:18:18] what if you comment out the is_pageview [22:18:23] and just do raw web requests? [22:18:25] well, those regexes had "VERIFIED" next to them in the code [22:18:30] except the Android one which I changed [22:18:45] ok, I'll give that a shot in a bit. Gonna run out for 10 min. [22:18:53] i verified them :) [22:19:10] oh gotcha [22:19:21] it could also be an issue with the is_pageview function [22:55:16] back [22:56:01] milimetric: you might want to try a few other things [22:56:11] - count the total number of pageviews in the dataset [22:56:21] a bit over a million [22:56:32] - if using MATCHES, ensure your regex matches the whole line [22:56:39] (don't use ^$) [22:57:00] btw without the IS_PAGEVIEW it's: [22:57:00] (2013-03-25,iOS,77577) [22:57:00] (2013-03-25,Android,503) [22:57:00] (2013-03-25,Firefox OS,3) [22:57:00] (2013-03-25,BlackBerry PlayBook,49) [22:57:01] - also dump counts for the total number of iOS devices, android devices, etc [22:57:39] right, kraken was using patternInstance.matcher(UAstring) [22:57:47] so i left that [22:57:47] oh, right [22:57:48] it's in a UDF [22:57:50] aiight. [22:58:08] yeah, I'm compiling the jars as I change them so thanks again for that help last night to all you guys [22:58:13] it's very useful [22:58:35] but we're gonna talk to Brion and Yuvi via email and ask them what they think based on what we found so far [22:58:43] but yeah, i agree with what drdee said earlier, that it sounds like time to reach out to brion -- if we had some rough idea of installs, active devices, etc [22:58:45] that'd be great [22:58:50] word. [22:59:00] np [22:59:02] !
[22:59:34] i'm about to maybe break everything, so i'll let you go first with testing [22:59:43] lmk when you're done [23:01:17] ^^ milimetric [23:01:46] we know there are over 6 million installs [23:01:52] but don't know about active devices [23:02:00] i think that's the point of this feature :) [23:06:53] heh [23:06:56] hokay. [23:07:18] lmk when you're done enough that you won't be pissed if i monopolize the cluster for a while [23:07:28] tomorrow i can help you set up the oozie configs [23:07:30] ^^ milimetric [23:07:36] (unless you've already done it) [23:08:03] i haven't done the oozie stuff [23:08:08] um [23:08:14] as far as the cluster, give me one more shot? [23:08:14] :) [23:08:15] coolio [23:08:16] sure [23:08:18] making the regexes more generic [23:08:23] should be like... 10 mn. [23:08:24] *min [23:08:33] k [23:08:50] i have acquired new oozie-fu that has enabled some cool changes to my other jobs [23:08:55] i'll share tomorrow! [23:10:49] k, job's running [23:11:12] :) oozie-fu seems like some of the highest level fu one can gather [23:11:27] with proper oozie-fu I believe you can even beat crane-monkey style [23:12:15] i'll admit there are some things that could be desired [23:12:40] but it pwns cron pretty hardcore [23:19:31] ok, i'm done with the analysis except i'm trying to count the elements in a simple bag [23:19:43] dschoon, how do I do this? [23:19:45] i can only find references to counting groups [23:19:52] and I figured it out before but it's not intuitive [23:20:00] don't want to keep you waiting [23:20:03] no worries [23:20:18] you aren't counting a projection? 
[23:20:40] matching_log_fields = FILTER log_fields BY ( [23:20:40] (GET_DAY(timestamp) MATCHES '.*') [23:20:40] AND IS_PAGEVIEW(uri, referer, user_agent, http_status, remote_addr, content_type, request_method) [23:20:40] ); [23:20:48] i wanted to count matching_log_fields [23:20:57] learned it yesterday and forgot :) [23:21:11] ah [23:21:18] then you can count any field, really [23:21:23] yes [23:21:48] how? [23:21:49] :) [23:21:57] result = FOREACH matching_log_fields GENERATE COUNT(*) as num; [23:22:07] oh, derf [23:22:09] foreach matching_log_fields!! [23:22:09] sorry, I forgot [23:22:19] there's COUNT_ALL [23:22:21] I believe [23:22:37] there's count_star... no count_all I can find [23:22:56] http://pig.apache.org/docs/r0.11.0/func.html#count-star [23:23:41] "Use the COUNT_STAR function to compute the number of elements in a bag" [23:23:44] ^^ milimetric [23:24:00] yeah, it says that on count too [23:24:08] but i literally can't figure out how to use it in the simplest case [23:24:18] result = FOREACH matching_log_fields GENERATE COUNT($0) as num; [23:24:28] basically you can put any valid field where $0 is [23:25:01] because it's just adding 1 to the count every time it's passed something [23:25:15] COUNT_STAR will do it even for nulls [23:25:18] whereas COUNT will not [23:25:26] i don't think you have nulls, so it doesn't matter [23:25:36] though [23:25:39] you could go: [23:25:47] nope that doesn't work [23:25:53] result = FOREACH matching_log_fields GENERATE COUNT($0) as num, COUNT_STAR($0) as star_num, ; [23:25:59] even though matching_log_fields seemed flat, it isn't: [23:25:59] matching_log_fields: {kafka_byte_offset: double,hostname: chararray,sequence: long,timestamp: chararray,request_time: chararray,remote_addr: chararray,http_status: chararray,bytes_sent: long,request_method: chararray,uri: chararray,proxy_host: chararray,content_type: chararray,referer: chararray,x_forwarded_for: chararray,user_agent: chararray,accept_language: chararray,x_cs: 
chararray} [23:26:08] oh no that is flat [23:26:21] fine, use kafka_byte_offset? [23:26:26] trying [23:26:47] 2013-03-26 23:26:31,185 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: [23:26:47] Could not infer the matching function for org.apache.pig.builtin.COUNT as multiple or none of them fit. Please use an explicit cast. [23:26:55] i mean, it might be the case that you'll need a GROUP matching_log_fields ALL first [23:27:15] this makes very little sense from a set operation perspective [23:27:21] COUNT requires a preceding GROUP ALL statement for global counts and a GROUP BY statement for group counts [23:27:22] like... it's a flat bag [23:27:23] so yeah. [23:27:29] right. [23:28:09] result = FOREACH (GROUP matching_log_fields ALL) GENERATE COUNT($1) as num, COUNT_STAR($1) as star_num; [23:28:12] that should work. [23:28:18] now that i, you know, actually read the docs [23:29:29] yep, this is what i learned yesterday. My brain rejects this outright :) [23:29:35] i will forget it in about 30 seconds [23:30:04] it's just making the types match up. [23:30:12] it needs a bag. [23:30:16] so we give it a bag. [23:31:14] matching_log_fields is a bag [23:32:01] but the target of COUNT is not. [23:32:58] after all, you're saying "for each row in matching_log_fields..." [23:33:04] and each row isn't a bag [23:33:12] lol [23:33:14] that's insane [23:33:16] it's a tuple! [23:33:19] why are we saying "for each row" [23:33:19] yes well. [23:33:22] in the first place? [23:33:26] because COUNT is an expression [23:33:37] and there's no statement that just evaluates an expression across the whole input [23:33:57] the language is designed to handle the record-stream [23:34:10] so you have to trampoline your record-stream into a bag [23:34:12] *shrug* [23:34:19] you think this is silly, go write some haskell :) [23:35:08] matching_log_fields IS A BAG!
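Pulling the session's debugging together, the global-count idiom in Pig looks roughly like this, assuming matching_log_fields is the filtered relation from above (note the syntax is GROUP alias ALL, with no BY, and that after grouping the second field is the bag COUNT can consume):

```pig
-- Recap of the global-count idiom discussed above.
-- GROUP ... ALL collapses the whole relation into a single group
-- whose second field is a bag, which is what COUNT/COUNT_STAR expect.
grouped = GROUP matching_log_fields ALL;
result  = FOREACH grouped GENERATE
              COUNT(matching_log_fields)      AS num,       -- skips records whose first field is null
              COUNT_STAR(matching_log_fields) AS star_num;  -- counts every record, nulls included
DUMP result;
```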
[23:35:57] i'm just saying, this seems like a poorly thought out part of pig [23:36:00] sql has the same problem [23:36:38] k, signing off for the night [23:36:45] i'm done with the cluster, thanks for the help and patience [23:36:47] as always [23:36:54] dschoon ^ [23:37:07] wor [23:37:08] word [23:37:13] i shall commence brutalizing things [23:37:15] cheers