[00:20:17] no :) [01:12:26] dude, riemann is so cool [01:12:35] this thing is what i've always wanted out of a monitoring system [01:12:39] i am totally setting this up on the cluster. [01:14:30] go for it! and please also work your git foo ;) [01:45:57] average_drifter: what did you need? [01:46:53] we already requested access, rt ticket has been opened [01:47:00] waiting for approval [01:48:05] ah [01:48:10] ok [03:00:15] heading home [03:23:35] preilly: hi, related to your previous question about wikimedia-incubator [03:24:00] preilly: moving all our projects to wikimedia-incubator would entail renaming all the remotes in all the git repos on all the machines we have [03:24:05] preilly: and all the submodules to those git repos [03:24:23] preilly: and any submodules outside the github repos [03:29:58] preilly: oh also wikimedia-incubator doesn't have any repositories and it was created yesterday [03:30:36] but I don't know much about the incubator, maybe dschoon will tell me more about it tommorow [04:10:40] average_drifter: okay [11:04:05] I'm installing oozie locally [12:49:18] hey [12:49:32] milimetric: hi [12:49:48] milimetric: sorry haven't answered you on limn, been caught with dclass and stats scripts :( [12:49:51] but time will come soon [12:50:14] jeremyb: you deal with hadoop and pig and oozie ? [14:09:44] drdee: hi ! [14:10:05] drdee: I've set up locally oozie and hadoop and pig [14:10:10] we're gonna get to the bottom of this ! :) [14:13:45] I've also downloaded the cloudera demo VM [14:13:46] http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/ [14:13:49] the one described here [14:13:58] I think that's a good sandbox for me to try things out locally [14:17:16] oh morning everyone! [14:17:32] hi average_drifter - no problem [14:17:49] it's a side project, definitely only if you have free time [14:17:51] hey milimetric [14:25:02] java.lang.OutOfMemoryError: Java heap space [14:25:10] user@garage:~$ rm -rf /tmp/out1.txt/ ; JAVA_HEAP_MAX=-Xmx1000m hadoop jar /usr/share/hadoop/hadoop-examples-1.1.1.jar wordcount scanresult.txt /tmp/out1.txt [14:25:22] how can I fix this ? I tried setting the JAVA_HEAP_MAX variable [15:26:12] phew! leaving your keys at home puts a delay in your day! [15:37:38] morning! [15:38:40] drdee: hi [15:39:41] let's get you access to kraken, that way you don't have to fool around with OOM errors [15:39:59] morning! [15:40:03] MORNING! [15:40:15] ok [15:40:16] packet loss because of the unsample zero filter? [15:47:38] sooooooooo time for brain bounce [15:48:08] it has to do with shared C libraries, JavaLinkErrors and Kraken and single/vs multithreadedness and other goodies [15:48:12] who's in? [15:48:44] milimetric, ottomata, average_drifter ^^ [15:49:51] ha [15:49:52] um [15:50:02] i'm eating breakfast at the moment [15:50:06] but in 10 minutes I'll brainbounce [15:50:46] average_drifter; could you create a debian package for udp-filter based on the field_delim_param branch? [15:51:55] yes [15:53:44] ? [15:53:52] drdee: I'm in about what you wrote bove [15:53:54] *above [15:53:58] ah just saw that [15:54:00] just happened [15:54:24] https://plus.google.com/hangouts/_/2da993a9acec7936399e9d78d13bf7ec0c0afdbc [15:54:32] why must we hanguuoouuuuut [15:54:33] join me when you are ready to bounce [15:54:46] faster way of communicating [15:54:47] huh? [15:54:53] oh brain bounce [15:55:07] yup [15:55:07] this is not about oxygen? 
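The OutOfMemoryError above is the usual wordcount-on-a-laptop heap problem. A minimal sketch of the knobs that normally matter on Hadoop 1.x, assuming a local or pseudo-distributed setup like the one described: the bin/hadoop wrapper computes JAVA_HEAP_MAX itself, so exporting it from the shell is generally ignored; HADOOP_HEAPSIZE / HADOOP_CLIENT_OPTS raise the client-side JVM heap, and mapred.child.java.opts raises the map/reduce task JVMs (the -D form works because the examples jar accepts generic options). Values are illustrative, not a confirmed fix.

  export HADOOP_HEAPSIZE=2000                 # client/daemon JVM heap, in MB (read by bin/hadoop)
  export HADOOP_CLIENT_OPTS="-Xmx2000m"       # extra options for the client JVM only
  hadoop jar /usr/share/hadoop/hadoop-examples-1.1.1.jar wordcount \
    -D mapred.child.java.opts=-Xmx2000m \
    scanresult.txt /tmp/out1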
[15:56:45] i'm just going to disable that filter for now, and get the x-cs log file to erosen today [15:56:52] and see if he says its cool to disable all the others [15:58:33] ok! [16:01:06] no brain bounce is about [16:01:07] drdee: sooooooooo time for brain bounce [16:01:07] [10:48am] drdee: it has to do with shared C libraries, JavaLinkErrors and Kraken and single/vs multithreadedness and other goodies [16:01:16] haha, maaan i'm fixing oxygen [16:01:23] and i'm still sleuthing some data leslie gave me [16:01:25] aaaaggggg [16:01:40] gotta talk about shared C libs aaghhhh, ok ok ok ok [16:01:41] 3 mins [16:01:56] I'm restarting my machine but will be on the hangout in seconds [16:04:23] ottomata, it was optional :) [16:06:22] hahahahahah [16:06:28] i'm comiiinnngnnnnngngngngngn [16:06:31] brain bounce is good for all [16:08:03] it says I don't have permission [16:08:09] just tried to join [16:08:45] halp [16:09:15] finally joined [16:09:19] &authuser=2 [16:18:18] average_drifter: not recently. i know a little bit about them [16:18:35] well hadoop and maybe pig. definitely not oozie [16:18:41] ottomata: did you say you put the unsampled cs-header file in /user/erosen/tmp? [16:19:05] sorry [16:19:09] /home [16:19:10] on stat1 [16:19:18] didn't tell you the machine, oops [16:19:34] /home/erosen/tmp/zero-x-cs.log [16:19:44] aah [16:19:46] hehe [16:19:47] thanks [16:38:05] ottomata: another question about x-cs [16:38:22] is this an example? "ar;q=1.0,en;q=0.5,fr;q=0.5,pt;q=0.5" [16:41:40] no, that looks like an accept language header [16:41:51] so i'm capturing any output where the last field is not '-' [16:42:00] aah [16:42:11] i basically don't know what the cs codes look like [16:42:16] what exactly is the standard? [16:42:19] 406-12 [16:42:26] k [16:42:29] i see now [16:42:38] so 3 digits a dash and then 2 digits [16:42:41] and these are provider specific, right? [16:42:47] first 3 are country [16:42:49] http://en.wikipedia.org/wiki/Mobile_country_code [16:42:49] not provider X country specific [16:42:56] last 2 are carrier [16:43:01] aah [16:43:12] so this is the MCC-MNC [16:43:15] makes more sense [16:43:16] yes [16:43:43] seem sufficient for the moment, I'll mess around with a test report, now [16:43:46] ok [16:43:49] but i have bad news for turning off the old filters [16:43:55] amit wants to keep the old filters for the remainder of the month [16:43:59] I guess it isn't that bad of news [16:44:44] actually, drdee has someone constructed the mapping from carriers to mcc-mnc codes already? [16:44:50] no [16:44:56] like is there a file I can modify, so I don't have to go look at the names? [16:45:01] or is it just a full on library [16:45:12] scrape mcc-mnc [16:45:16] k [16:45:20] or scrape the wiki page [16:45:24] ya [16:45:41] well, yes [16:45:51] its in varnish .vcl [16:45:53] right? [16:46:01] yes [16:46:24] hmm [16:46:30] can you point me to that source? 
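A one-line sketch of the extraction erosen describes, assuming the X-CS value is the last whitespace-separated field and has the MCC-MNC shape shown above (e.g. 406-12: three-digit country code, then the carrier code, which can be two or three digits). The field position and file path come from the conversation; everything else is illustrative.

  awk '$NF != "-" { split($NF, cc, "-"); print cc[1], cc[2] }' /home/erosen/tmp/zero-x-cs.log \
    | sort | uniq -c | sort -rn     # request counts per MCC/MNC pair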
[16:46:37] if not i can scrape [16:46:48] https://gerrit.wikimedia.org/r/gitweb?p=operations/puppet.git;a=blob;f=templates/varnish/mobile-frontend.inc.vcl.erb;h=2b29d8742d40c4c18008ecb54612947ac11a1bab;hb=refs/heads/production [16:47:07] someone should probably add that to the Partner_IP_Ranges page [16:47:17] yay [16:48:01] erosen, i turned off that filter though, so we aren't collecting more data for it [16:48:06] yeah [16:48:17] you should tell amit that I haven't deployed his 4 new filters because oxygen is falling over [16:48:22] so turning them off will help [16:48:22] but [16:48:27] how about [16:48:31] I collect this data into kraken [16:48:38] that is better [16:48:39] instead of the unsampled IP based zero logs? [16:48:48] that's perfect [16:48:51] cool [16:48:54] that is sort of what I have been hoping for [16:49:04] that way I don't have to do ip matching in pig [16:55:30] relocating [16:55:31] b [16:55:31] rb [16:57:24] !log now collecting X-CS logs into kraken [16:57:26] Logged the message, Master [16:57:31] that was fast [16:57:42] well I didn't check up on my work :) [16:57:53] added filter, ran puppet, restarted udp2log [16:57:57] that should be it [16:58:02] and it should be consumed hourly automatically [17:03:21] http://imgs.xkcd.com/comics/regular_expressions.png [17:03:24] heh [17:03:50] that image should be the logo on wikistats :) [17:04:31] heheh [17:08:47] lol average_drifter [17:39:16] back [17:59:05] gooood morning [17:59:21] m'ning [17:59:27] let's do stand up real quick todayjjjjaaa? [17:59:31] ja [18:01:37] i am getting a new link whenever i click on hangout link in calendar [18:01:48] please share link [18:01:52] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [18:02:05] we are at [18:02:06] https://plus.google.com/hangouts/_/2da993a9acec7936399e9d78d13bf7ec0c0afdbc [18:02:09] oh. [18:02:13] dan and i ... bookmarked it [18:02:25] yeah i had it bookmarked too [18:02:27] i guess it changed [18:02:45] i think it changed after wednesday [18:25:47] ottomata, where do you want graphite? [18:31:10] drdee: https://gist.github.com/2925497#file-git-rewrite-history-sh [18:31:15] you should try first :) [18:31:20] if you get stuck, i'll help [18:31:24] ty [18:32:04] eroson, i'm going to sleuth a bit more, and then see what's up with the x-cs thing [18:32:08] its coming soon though, stay tuned [18:39:20] the office lost connectivity. :P [18:41:41] drdee [18:41:45] can you remove files from your home dir on an01? [18:41:52] YO [18:41:53] you are using 5.4G there [18:41:56] YEssir [18:42:00] if you need to store big files there [18:42:03] do it in /a/diederik [18:42:10] i am a data gobbler [18:42:15] gobbgllelelelel [18:42:19] that's my nature [18:42:25] will do [18:43:07] ottomata: updated, https://www.mediawiki.org/wiki/Analytics/Kraken/Infrastructure [18:43:29] can we make a second table [18:43:35] ideal/proposed rolls [18:43:38] haha [18:43:39] roles [18:43:41] vs current? [18:44:07] or I can just edit the purpose and add in parens [18:44:37] ottomata, better? [18:45:22] average_drifter, pig course? [18:48:40] eh? [18:51:21] yeah, ottomata, go for it. another col would proally be easier to read [18:53:51] aye cool [18:56:47] ottomata, i'm having intermittent problems connecting to the internet from an08 [18:57:08] my proxy is brewster [18:57:18] can i switch to an01? [18:57:22] if so, what port? [18:57:48] yeah, internet is tough on these guys, you trying to download stuff? [18:57:52] yeah. 
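The "added filter, ran puppet, restarted udp2log" step above boils down to one entry in a udp2log config. The real filter lives in puppet and is not quoted in the channel, so what follows is only a guess at its shape (an unsampled pipe that keeps hits whose trailing X-CS field is set and appends them where the hourly kraken import can pick them up); the command and path are placeholders.

  pipe 1 /usr/bin/awk '$NF != "-"' >> /a/squid/zero-x-cs.log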
[18:57:59] namely, all the packages and deps :P [18:58:00] if you are just http downloading [18:58:02] it should work [18:58:07] apt should work too [18:58:20] https doesn't, it seems [18:58:23] hm [18:58:27] yeah maybe not [18:58:29] super annoying eh? [18:58:29] the proxy times out regularly :/ [18:58:42] even my shell is super-slow :( [18:58:42] hm [18:58:52] sometimes I dl on an01 and then scp over :p [18:59:10] but yeah, i think doing this on an01 should be ok [18:59:22] i am running production udp2log stuff on ther ethough (event and blog streams) [19:00:30] ugh. [19:00:53] we need to not do that on the box with the public IP :P [19:01:49] ottomata, this is probably totally useless but I looked up how to capture UDP packets with TCPDump: http://java.dzone.com/articles/tcpdump-learning-how-read-udp [19:04:23] didn't have a choice [19:04:26] especially for event [19:04:34] asher said it had to go to a public IP [19:04:38] don't remember why [19:04:48] hm. [19:04:51] nice! i'm doing it now with this [19:04:59] sudo tcpdump udp port 8420 -A -w - | strings [19:05:03] is it being sent straight there? [19:05:11] (the event stream) [19:05:16] yes [19:05:23] it's not going through oxygen first? [19:06:52] no [19:07:32] jesus, the latency is pretty awful [19:24:30] guys, please delete your local kraken repo and clone again, i have removed some files from the history, don't push / pull without recloning [19:24:44] dschoon, average_drifter, milimetric, ottomata ^^ [19:25:00] ok [19:25:05] see :) i knew you had it in you [19:25:18] of course we had patch your script ;) [19:25:32] clearly [19:25:59] ok [19:29:19] ottomata, can we point graphite.analytics.wikimedia.org to an08:80? [19:29:36] in the haproxy proxy? [19:29:37] hehe [19:29:40] yes, but not the DNS [19:29:41] yes. [19:29:45] so you have to add your hosts entry [19:29:47] i know. [19:29:49] you can do it! [19:29:54] checkout haproxy.cfg.erb in puppet [19:29:56] you will see [19:30:00] okay. [19:30:07] oh [19:30:10] ? [19:30:11] from our github [19:30:19] kraken-puppet? [19:30:22] our analytics branch [19:30:24] https://github.com/wmf-analytics/operations-puppet [19:30:32] you can push directly to that [19:31:07] k [19:31:13] then how do i run it again? [19:31:20] sudo puppetd something test something? [19:31:25] if you have kraken repo [19:31:29] easiest way is [19:31:32] bin/kpuppet uprun an01 [19:31:43] bin/kpuppet uprun analytics1001.wikimedia.org (if an01 is not an ssh alias fo ryou) [19:31:54] (christ this thing is huge) [19:32:09] where do i run that? [19:32:11] locally? [19:32:14] or an01? [19:33:09] where do i run that? locally? or an01? [19:33:15] locally [19:33:19] erggggh [19:33:20] internneneet [19:33:42] errgh [19:33:45] oh well [19:33:46] i am #2 [19:34:26] mk. [19:34:34] ...does it matter where i run that from? [19:34:41] like, in the kraken repo? [19:34:45] or in the puppet one? 
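A condensed sketch of the analytics-puppet workflow being worked out here, combining the two ways described in this exchange and just below (the kpuppet wrapper from a local kraken checkout, or the puptest alias quoted further down when logged in on the node); hostnames and paths are the ones from the conversation.

  # from a local kraken checkout: update the analytics puppet repo on an01,
  # then trigger a run on the target node (hostname or ssh alias)
  bin/kpuppet uprun an01
  # or, on the node itself, after a git pull in /etc/puppet.analytics:
  sudo puppetd --test --verbose --server analytics1001.wikimedia.org \
    --vardir /var/lib/puppet.analytics --ssldir /var/lib/puppet.analytics/ssl \
    --confdir=/etc/puppet.analytics --rundir=/var/run/puppet.analytics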
[19:35:36] no its just a wrapper [19:35:37] using dsh [19:35:45] to log into where it needs and do whatever [19:35:50] kinda like a local deploy script for puppet [19:35:56] you could log into an01 [19:36:03] git pull on the puppet repo in /etc/puppet.analytics [19:36:15] and then run uhhhhhhhhh [19:36:25] there should be a shortcut [19:36:26] puptest [19:36:27] that would work [19:36:32] do not run puppetd —test, [19:36:36] that runs the production puppet [19:36:38] i mean, you can run that [19:36:40] it won't hurt anything [19:36:43] it just won't do what you want [19:36:47] alias puptest='sudo puppetd --test --verbose --server analytics1001.wikimedia.org --vardir /var/lib/puppet.analytics --ssldir /var/lib/puppet.analytics/ssl --confdir=/etc/puppet.analytics --rundir=/var/run/puppet.analytics' [19:36:48] is what you want [19:39:15] k [19:39:26] gotcha. [19:39:33] but [19:39:41] kpuppet uprun ... [19:39:46] wait. [19:39:55] will update the repo on an01, and then run puppet wherever you want [19:39:55] no you wait! [19:39:57] oh. [19:40:01] there's a branch [19:40:02] sec [19:40:08] sorry, took a while to follow along [19:40:11] there should only be two branches on the github one [19:40:14] as i had to clone that huge repo [19:40:15] but you only want the default [19:40:17] 'analytics' [19:40:21] right/ [19:40:23] ? [19:45:54] yes. [19:46:00] i was so confused [19:46:06] until i realized the default branch was not master [19:46:14] okay. looking for the haproxy setup [19:48:26] ottomata2: proxy.pp controls the variables? [19:49:05] yeah, they're set in class kraken::proxy::haproxy [19:49:05] kk [19:50:59] yup [19:51:09] they might grab variables defined elsewhere [19:51:15] but they are set locally for easy access in the template [19:51:20] s'all good [19:51:23] i think i've got it [19:51:30] (to access fqdn variables in the template you have to do scope.lookupvar() fanciness in ruby) [19:51:31] cool [19:51:43] local variables just make it more readable [19:56:46] ottomata2: i pushed. take a look? [19:57:16] https://github.com/wmf-analytics/operations-puppet/commit/02a16be7929513d0f928a52399c3a6cb75f1edb2 [19:58:10] looks gooood [19:59:45] brb [20:10:29] ahg, i am so confused [20:10:37] i have tried tcpdump and tshark [20:10:41] and I keep getting the same weirdness [20:10:46] i tell it to capture 10 packets [20:10:47] and it does [20:10:51] and I get 30 udp2log lines! [20:11:45] drdee: I pointed Brian Keegan to the analytics list, he's one of my favorite Wikipedia researchers, I can tell you more about his work if you're interested [20:12:35] drdee, i think I know why my UDPsource thing lost so many packets. [20:12:41] and dschoon^ [20:12:51] DO TELL! [20:12:54] lol [20:12:55] there are 3 udp2log lines per udp packet [20:13:04] DarTar:thx [20:13:07] uh. [20:13:23] how many copies of the process are there? [20:13:28] ? [20:13:31] what process [20:13:32] ? [20:13:39] here, check it on an01 [20:13:42] /tmp/t1 [20:13:47] that is a 10 packet file [20:13:51] so 1 dropped packet is 3 missing seq numbers? 
[20:13:52] of udp2log capture in ascii [20:14:03] maybe [20:14:08] but you can see in that capture [20:14:16] that every 3 lines you get a new IP/UDP header (in binary) [20:14:27] and that each of the 3 lines in between the headers is from the same upstream host [20:16:00] and you know, dschoon, that makes more sense [20:16:10] since on average, a udp2log line is about 300 or 400 bytes long [20:16:13] if we are close to the MTU [20:16:22] its because we are packing more than one line into a packet [20:16:38] ohh. [20:16:40] yes. [20:16:47] that does make sesne [20:16:49] how does UDP know to combine the packets? [20:16:56] dunno, it would be the upstream hosts [20:16:58] an ID in the header? [20:17:02] no [20:17:05] its not done by udp2log [20:17:09] of course not. [20:17:11] its done by the orignal sender [20:17:11] it's done by UDP. [20:17:21] so you should have something in the packet header [20:17:23] you think? its the protocl [20:17:25] hmmmmmm [20:17:41] http://en.wikipedia.org/wiki/User_Datagram_Protocol#IPv4_Pseudo_Header [20:17:53] maybe whatever is packing them just waits til it gets closed to the UDP MTU size and then sends it? [20:18:08] either that or cache software is packing things together [20:18:09] right? [20:18:21] i doubt udp does that [20:18:21] no way [20:18:21] right? [20:18:23] no way you do a send() call and it waits for more data [20:18:50] no no. [20:18:53] you send all at once [20:18:56] yeah, so [20:19:05] and the socket knows what protocol you're speaking [20:19:11] right [20:19:12] and splits up your data into the right number of packets [20:19:17] it also reconstructs it on the other side [20:19:17] so who is packing 3 into one? [20:19:24] that's if the packet is larger than the MTU [20:19:28] not if it is smaller [20:19:31] if it is smaller, it shoudl just send [20:19:35] right? [20:20:03] that's udp2log [20:20:16] 3 loglines -> 1 packet is in our software [20:20:22] no [20:20:24] correct. [20:20:27] this data is not from udp2log [20:20:36] the catpure file i'm looking at [20:20:38] is from tcpdump [20:21:01] frontend caches -> oxygen socat relay -> tcpdump on an01 [20:22:08] http://en.wikipedia.org/wiki/Maximum_transmission_unit#IP_.28Internet_protocol.29 [20:22:15] i know [20:22:20] but i'm saying the sender does the packing [20:22:24] ja [20:22:26] not udp2log [20:22:28] layer-3 apparently does packet fragmentation [20:22:39] fine. i guess the squid/varnish logger, then? [20:22:58] yeah that's the only thing it could be [20:23:50] we do patch those [20:23:51] drdee, I just realized (unrelated to packets), that you scheduled deploying the tab separtor on a friday (feb 1) [20:23:52] U CRAZY [20:24:12] there have been 5 mails suggesting the date :) [20:24:15] lets do Feb 4 [20:24:18] i didn't realize it was a friday [20:24:32] ok, [20:24:35] big ol' changes on friday == have to work friday night or saturday [20:24:36] BOOOO [20:24:43] ok ok ok [20:29:12] not going to mess with this at all right now [20:29:20] but now I feel better about Flume UDPsource not working! [20:29:25] it was not expecting multiple lines in a single packet! [20:31:36] is the 3 in 1 hardcoded or does it cramp as many log lines as possible in a single udp packet? 
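The back-of-the-envelope numbers from this discussion favor "pack until close to the MTU" over a hardcoded 3: a 1500-byte Ethernet MTU leaves roughly 1472 bytes of UDP payload (minus 20 bytes IPv4 header and 8 bytes UDP header), and at the 300-400 bytes per log line mentioned above that is 3-4 lines per datagram. A trivial illustrative check:

  echo $(( (1500 - 20 - 8) / 400 ))   # -> 3 lines per packet at ~400-byte lines
  echo $(( (1500 - 20 - 8) / 300 ))   # -> 4 lines per packet at ~300-byte lines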
[20:34:14] not sure [20:34:23] i think I am not going to look into that at the moment either [20:34:30] it is irrelevant to the 5% problem [20:34:40] it was just throwing me off in my search [20:34:50] for an hour or more [20:34:50] ha [20:35:29] but it was useful because now you might be able to fix the flume udpsource, right? [20:36:29] yes true! [20:36:55] how the crap does udp2log do that, i guess it examines every character [20:37:04] just like i'd have to (and was doing for a bit) in udpsource [20:37:32] i bloody hate apache [20:38:03] (late, but totally agreed with no-fridays, ottomata2) [20:39:55] i'm tired of being #2! [20:43:08] drdee: how long do you expect your dclass job to take? [20:43:43] am I #0? [20:43:43] yes! [20:43:45] oh are you waiting :D ? cause then it's time to start looking into queues for jobs :) [20:43:45] I am! [20:44:05] i think 30 more minutes [20:45:20] jesus, finally. http://graphite.analytics.wikimedia.org/ [20:45:35] no data yet. [20:46:52] ottomata, if i have a .deb file, what's the command to install it? [20:47:09] dpkg -i ? [20:47:43] woo [20:47:52] yup [20:48:42] cool [20:48:57] dschoon, it should be realllly easy to start pumping the existing jmxtrans data into it [20:49:07] yep. [20:49:11] you wanna do that? [20:49:33] actually [20:49:34] check it out [20:49:38] your into the puppet stuff right now [20:49:38] see [20:49:42] monitoring.pp line 8 [20:49:45] # set the default output writer to ganglia. [20:49:45] Jmxtrans::Metrics { [20:49:45] ganglia => "239.192.1.32:8649" [20:49:45] } [20:49:46] add [20:49:55] # set the default output writer to ganglia. [20:49:55] Jmxtrans::Metrics { [20:49:55] ganglia => "239.192.1.32:8649", [20:49:55] graphite => "host:port", [20:49:55] } [20:49:57] that should be it [20:50:15] aiight [20:50:17] what is the host/port for graphite? [20:50:17] i'll do so! [20:50:26] http://graphite.analytics.wikimedia.org/ [20:50:33] doesn't work for me [20:50:41] you need the host alias, dummy :) [20:50:54] add graphite.analytics.wikimedia.org and riemann.analytics.wikimedia.org [20:50:59] to your existing line [20:52:01] ok ty [21:02:34] erosen: kraken is ready for you [21:02:41] drdee: pong [21:02:59] hokay. [21:03:09] http://riemann.analytics.wikimedia.org/ [21:03:15] now to get data in [21:07:50] ottomata: how should i go about daemonizing this dashboard? [21:09:42] the thing is a sinatra app, so i'm not experienced in dealing with this. [21:11:12] upstart has been pretty easy for me so far, then it is easily puppetizeable too [21:11:31] k [21:11:37] check out [21:11:39] you have an example script in puppet? [21:11:39] for reference maybe [21:11:48] /etc/init/storm-ui.conf [21:14:01] those are checked in here [21:14:04] https://github.com/wmf-analytics/storm-deb-packaging [21:14:06] part of the storm .debs [21:14:13] but also on all the cisco nodes at that path [21:16:29] HRM [21:16:30] so [21:16:33] at the moment [21:16:40] in my most recent test on an01 [21:16:45] NIC gets all packets [21:16:48] udp2log does not [21:17:11] gotta go run some errands, will be back tonight. Have a good weekend all. It's snowing on the East Coast! :) [21:17:24] laters dudes! [21:17:48] laterz! [21:17:57] doooo dooo doooooOOOOo [21:17:58] hm. 
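As confirmed above, dpkg -i installs a local .deb; the one wrinkle worth noting is that dpkg does not resolve dependencies, so a follow-up apt-get -f install is the usual fix if it complains. The package filename below is purely hypothetical.

  sudo dpkg -i udp-filter_0.2.0-1_amd64.deb   # install the locally built package
  sudo apt-get -f install                     # pull in any missing dependencies dpkg reported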
[21:18:00] hm hm hm hm hm [21:18:02] how how how [21:18:02] hm [21:18:03] hm [21:18:04] hm [21:18:04] hm [21:18:06] so [21:18:54] the conclusion I am currently lead to believe [21:19:00] that udp2log drops packets systematically [21:34:55] new conclusion (with help from robla) [21:34:58] udp2log is not the problem [21:35:01] udp-filter is [21:35:27] *welp* [21:35:57] if I remove udp-filter from my udp2log logging on an01 [21:36:01] I don't lose any packets [21:36:11] which is the same behavior I saw on an09 [21:36:21] and I wasn't running udp-filter there [21:36:28] burn it! [21:36:44] hahah [21:36:59] seriously [21:37:05] hm [21:37:30] use awk or whatever to filter [21:37:32] I'm going to change the filter that is importing mobile logs to use awk or grep [21:37:34] and see what happens [21:37:35] yeah [21:49:10] ottomata, do the dell nodes have their memory settings in hadoop puppe? [21:50:16] is it possible to see mapper and reducer processes in a hadoop system ? [21:50:34] let me reformulate. is it possible to have a top/htop for a hadoop cluster ? [21:50:39] maybe someone wrote such a thing already [21:51:05] drdee: you said that the memory leak is in the reducer. can you please tell me how you reached that conclusion ? [21:51:38] it's not a memory leak [21:51:57] drdee, this udp-filter problem shoudl be fixeable, i think, it is reproduceable without udp2log [21:52:09] i think the current config is not right for the dell nodes [21:52:13] i get 5% loss from a static file | udp-filter -m m.wikipedia.org [21:52:29] BURN IT BURN IT BURN IT [21:53:03] but where does the packet loss on emery and oxygen come from? [21:53:24] interesting question [21:53:29] robla^ [21:53:33] that is relevant [21:53:35] perhaps not related [21:53:58] wow. [21:54:00] that is... wow. [21:54:08] who knew string parsing was so hard! [21:54:56] I was just telling ottomata my debugging technique for udp2log [21:55:07] stubbornly refuse to blame udp2log :) [21:55:24] it's usually something filter related [21:55:26] ottomata: can I reproduce the packet loss locally for udp-filter ? [21:55:46] shoudl be able to yes [21:55:53] they are going to switch off my internet here in 5 minutes [21:55:55] ottomata: tell me how you do it please [21:56:09] take an unsampled file [21:56:15] pipe it through udp-filter -d m.wikipedia.org [21:56:21] grep out a single host (cp1044 is my use) [21:56:29] find the first and last sequence number [21:56:34] ottomata: so if I make udp-filter faster, could that solve the packet loss ? [21:56:38] subtract them to see how many lines you should have [21:56:46] i think it is not a perf problem [21:56:47] but a bug [21:56:51] all instances drop the same lines [21:56:59] ohhhh [21:57:11] which lines are dropped? can we put that in a gist? [21:57:46] stripping out personal data before putting it in gist. right? right? :) [21:58:01] always a private gist :D [21:58:08] and just url's and seq numbers [21:58:19] ...and IP addresses [21:58:38] gusy, i have to go like righ tnow [21:58:41] drdee [21:58:46] the file I was using to test is on an01 [21:58:58] /a/otto/udp-sleuth/3/webrequest.log [22:01:27] have a good weekend all! [22:01:49] (am at a cowork space and they are kicking us out) [22:08:46] OTTOMATA [22:09:03] i think the reason is very very simple [22:10:22] i'm interested [22:16:05] drdee: what's the bug? 
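Spelling out the reproduction ottomata describes as a short script, assuming the usual webrequest layout where the sending host is the first field and its per-host sequence number the second (those positions are an assumption, not stated in the channel); the input file and the cp1044 host are the ones from the conversation.

  udp-filter -d m.wikipedia.org < /a/otto/udp-sleuth/3/webrequest.log | grep '^cp1044' > filtered.log
  first=$(head -n1 filtered.log | awk '{print $2}')   # first sequence number seen
  last=$(tail -n1 filtered.log | awk '{print $2}')    # last sequence number seen
  echo "expected $(( last - first + 1 )) lines, got $(wc -l < filtered.log)"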
[22:17:33] i think it was exactly a decision to drop lines either too few or too many fields [22:18:06] not 100% sure [22:18:15] but that's my best guess right now [22:21:20] i am getting closer [22:25:42] er [22:25:45] so [22:25:59] yes it's a bug [22:26:03] i just instructed puppet to update on an01 [22:26:28] and it gave me a bunch of errors about udp2log config [22:26:37] BECAUSE OF THE STUPID SPACE DELIMITER [22:27:36] heh [22:27:48] so what i is this [22:28:13] 1) run udp-filter against unsampled file with filter m.wikipedia.org, save output to udpfilter.log [22:28:41] 2) run grep against unsampled file with exact same filter but not greppin the referer, save to grep.log [22:28:46] so wait [22:28:53] are we simply incapable of properly escaping things? [22:29:03] 3) run diff against the files, save [22:29:08] wait dschoon [22:29:28] 4) input diff into udp-filter and turn on verbose logging to see why it drops it [22:30:01] 5) url field is set to the status field, which obviously does not match the m.wikipedia.org domain [22:30:16] udp-filter discards line as a mismatch [22:30:21] move on [22:36:10] hm [22:36:15] only 5% though? [22:36:22] shouldn't it basically be everything if that's true? [23:54:40] argh [23:54:44] haproxy is old! [23:54:45] goddamnit! [23:54:50] THAT's why this isn't working [23:55:38] dschoon i am withdrawing my prev explanation [23:55:49] there was an extra space in the diff file [23:55:50] duhhhhhh [23:55:52] ...i'm talking about riemann [23:55:53] heh [23:55:54] oh [23:55:59] yeah, i didn't think that was it [23:56:09] interesting enough most of the dropped log lines are HEAD requests [23:56:16] ... [23:56:18] really. [23:56:29] oHOHOHOHOH [23:56:35] that means body-size is 0? or maybe 0? [23:56:43] i am too wired [23:56:44] which might not match the regex? [23:56:46] i have to shut up [23:56:58] i am calling it weekend [23:57:02] LATERZ!!!! [23:58:41] ta
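For reference, the comparison drdee walks through above as shell, with the detail that tripped things up made explicit: raw diff output prefixes lines with "< " / "> ", which has to be stripped before feeding them back into udp-filter (the "extra space in the diff file" mentioned at the end). Flags and filenames are illustrative; in particular the verbose switch is assumed, not taken from udp-filter's actual usage, and the grep is simplified and ignores the referer caveat.

  udp-filter -d m.wikipedia.org < webrequest.log > udpfilter.log   # 1) what udp-filter keeps
  grep 'm\.wikipedia\.org' webrequest.log > grep.log               # 2) same domain match via grep
  diff grep.log udpfilter.log | sed -n 's/^< //p' > dropped.log    # 3) lines grep kept but udp-filter dropped, prefix stripped
  udp-filter -d m.wikipedia.org --verbose < dropped.log            # 4) re-run on just the dropped lines (assumed flag)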