[12:53:23] morning everyone
[12:53:37] Change merged: Milimetric; [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62098
[13:16:49] moorning!
[13:43:55] mooorning
[13:52:35] morrnnigngi
[13:59:05] sbt sbt sbt!
[14:05:24] restarting compy
[14:09:46] there he is! mr ooooooottttttooooooomaaaaaataaaaaa
[14:15:41] one more restart, trying to fix my ssh-agent
[14:17:38] milimetric
[14:17:40] around?
[14:17:46] yea
[14:17:49] average, around?
[14:17:51] working on the umapi stuff
[14:17:55] to the bat cave?
[15:02:25] New patchset: Milimetric; "fixing bugs from DarTar's feedback" [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62155
[15:02:43] bugs fixed ^ should I self-merge drdee?
[15:02:55] try git-review :)
[15:02:57] well, someone else can review i guess
[15:03:00] (seriously)
[15:03:15] no i mean, the patch is submitted
[15:03:26] oh i see the link
[15:03:26] great!
[15:03:34] let's add erosen as reviewer
[15:03:38] i spent 55 minutes fixing the bugs and 30 minutes fighting with gerrit, then giving up and re-cloning
[15:04:46] milimetric, drdee: thanks for this - I'm leaving home in a moment, will follow up on this + reply on feedback once I'm in the office
[15:04:47] New review: Diederik; "Ok." [analytics/user-metrics] (master); V: 1 C: 1; - https://gerrit.wikimedia.org/r/62155
[15:05:11] i gave it +1 looks good to e
[15:05:23] you read fast :)
[15:05:36] or trust me too much
[15:05:46] i saw that you fixed the things from the email
[15:12:09] milimetric, you got a sec for the batcave?
[15:13:03] ya, brt
[15:29:45] ottomata: can you kill your udp2log process on locke: otto 1199 0.0 0.0 20744 2000 ? S May02 0:01 /usr/bin/udp2log -p 8420 --config-file ./udp2log.conf --recv-queue=16384
[15:30:12] hmm
[15:30:29] weird
[15:30:30] done
[15:33:14] New review: Erosen; "(1 comment)" [analytics/user-metrics] (master); V: 1 C: 1; - https://gerrit.wikimedia.org/r/62155
[15:38:37] ottomata: drdee: x_cs spike?
[15:38:47] sure!
[15:38:59] i'm looking into this sbt stuff now, would rather help you at the moment :)
[15:39:49] let's do it
[15:39:52] i will join in a few minutes
[15:40:16] k
[15:40:32] ottomata: do you want to take a look at this ipython notebook i made on stat1
[15:40:51] sure
[15:41:43] * erosen checking ssh command
[15:41:57] k
[15:41:59] try this: ssh -N stat1 -L 7000:localhost:7000
[15:42:37] and then open up localhost:7000 and hopefully you'll see an ipython notebook browser
[15:42:47] click on the x_cs_forensics notebook
[15:43:15] there
[15:43:26] great
[15:43:49] the other data source of interest are the counts which david made from kraken
[15:44:34] which he put on the mingle card here: https://mingle.corp.wikimedia.org/projects/analytics/attachments/297
[15:44:53] hmm, can we count just zero tagged vs not zero tagged?
[15:44:56] we aren't checking for validity here
[15:45:03] just whether or not it gets tagged
[15:45:08] might make it easier to read and see problems
[15:45:18] sure
[15:45:33] that was part of my goal with the whole ipython notebook thing (which may or may not be that useful)
[15:45:58] if you replace fields = [….] with just fields = ['x_cs']
[15:46:01] that should simplifiy things
[15:47:27] also it hink we should drop hostname from the grouping as well
[15:47:35] we are pretty sure there aren't any host specific problems, i think
[15:47:47] fair,
[15:48:04] but really we should just be checking off each theory, one by one and marking it somewhere
[15:49:09] also, i think we should be using dtac thailand now
[15:49:46] where are the field names defined?
[15:50:25] hangout?
[15:50:40] mime_type
[15:50:41] :)
[15:51:01] ?
[15:51:10] i call it content-type
[15:51:12] :)
[15:51:14] fair
[15:51:32] hehe, i don't actually know anything about how the web works
[15:52:07] ok, hangout is tough atm, i'd have to change locations
[15:52:11] which I want to do soon anyway
[15:52:13] that's fine
[15:52:15] should I do that now or you wanna just chat?
[15:52:20] did you figure out how to change the fields
[15:52:23] yeah
[15:53:49] ok, you want to look at thailand?
[15:54:24] so
[15:54:24] for thailand, they have to be zero domans, right?
[15:54:27] i think so
[15:54:28] yes
[15:54:33] k
[15:56:22] ok so
[15:56:26] i'm looking at may 02
[15:56:31] 126472 lines
[15:56:36] 814 matches for zero.wiki
[15:56:37] is that right?
[15:56:47] dtac thailand
[15:57:51] let me check
[15:58:58] ottomata: which days are you running?
[16:01:39] just may 02
[16:01:51] zero-dtac-thailand.tab.log-20130502
[16:04:16] most of the untagged reqs there are from 302s
[16:04:25] but there are 200s as well
[16:05:49] 1% of reqs are not tagged
[16:06:06] .06% of untagged reqs are 302s
[16:06:07] but are you checking that they are m. or zero. domains?
[16:06:12] just zero
[16:06:13] not m.
[16:06:17] m. not valid for thaland
[16:06:22] aah
[16:06:23] good
[16:06:25] (I'm using awk)
[16:07:05] k
[16:07:22] i can't find any consistency though
[16:07:25] as to why these are untagged
[16:08:00] when I run against the whole month of march, I get zero untagged requets to zero. domains...
[16:08:48] oh weird
[16:08:50] but you get some in may?
[16:09:33] may is good to...
[16:09:46] oh maybe my check is bad
[16:10:05] oo
[16:10:12] i was filtering on the request being for an article
[16:10:19] may 2 you get 0 untagged zero domains?
[16:10:22] oh
[16:11:01] k, running on just may 2
[16:11:21] drdee: ottomata: Just sent you guys some email about reproducing data loss.
[16:11:28] can i share screen with you, or will that also overload your connection?
[16:12:07] sure, actually, i'm going to go sit outside, just dont' want to disturb my coworkers here
[16:12:09] will grab headphones
[16:12:19] gotcha
[16:12:33] it's fine to just have video, if you prefer
[16:13:56] in standup hangout
[16:14:01] man there is zero shade back here
[16:14:06] i can't really see my screen anyway :p
[16:14:10] hehe
[16:14:19] let's do screen share no audio inside?
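[editor's note] The ad-hoc awk counting discussed above (untagged requests to zero. domains, broken down by HTTP status) amounts to a small tallying loop. The sketch below is hypothetical: the field positions and the use of "-" for a missing X-CS tag are assumptions, not the actual squid/varnish log layout, so indices would need adjusting against the real format.

```python
from collections import Counter

def count_untagged_zero(lines, host_field, xcs_field, status_field):
    """Count requests to zero.* domains lacking an X-CS tag, grouped by status.

    Field indices are hypothetical; "-" standing for "no X-CS tag" is an
    assumption about the log format, not a documented convention.
    """
    untagged = Counter()
    total = 0
    for line in lines:
        parts = line.split()
        if len(parts) <= max(host_field, xcs_field, status_field):
            continue  # malformed line, skip
        if 'zero.' not in parts[host_field]:
            continue  # only zero. domains count (m. is not valid for dtac Thailand)
        total += 1
        if parts[xcs_field] in ('-', ''):
            # untagged: record the numeric status, e.g. TCP_MISS/302 -> 302
            untagged[parts[status_field].split('/')[-1]] += 1
    return total, untagged
```

With sample lines shaped like `host status domain xcs`, the function returns the total zero-domain request count and a Counter of untagged requests per status code, which mirrors the "302s vs 200s" breakdown above.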
[16:14:31] let's try thist first
[16:14:38] k
[16:32:47] drdee, rounce123: which hangout do you want to use?
[16:32:53] https://plus.google.com/hangouts/_/69179906f1bf2473b03a00a77eafe0e5f14e98f1
[16:33:00] otto and erosen are in the scrum one..
[16:33:05] yup
[16:51:26] erosen: logging.exception(str(e)) seemed to get me both the traceback and the exception message
[16:51:33] you're saying just do logging.exception()?
[16:51:33] interesting
[16:51:37] yeah
[16:51:40] i like it, done
[16:54:14] well, instead I changed it from str(e) to a descriptive unique message to help debug
[16:54:26] it looks like either way the stack trace is getting dumped: http://docs.python.org/2.6/library/logging.html#logging.exception
[16:56:14] New patchset: Milimetric; "exception fix" [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62165
[16:56:58] Change abandoned: Milimetric; "gerrit fail" [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62165
[16:57:44] erosen: ha, didn't know you were using ipython nbs on stat1, we're experimenting the same on vanadium (to access eventlogging data)
[16:58:06] yeah, i heard ori-l was looking into that
[16:58:22] works well, minus the lack of multi-user support issue
[16:58:45] let's chat about that, would be great to join forces
[16:59:47] oh god
[16:59:55] GERRRIT!!!!!
[17:00:14] i have now officially spent more time dealing with gerrit than fixing code
[17:01:36] milimetric, average: scrum
[17:02:19] milimetric ^^
[17:12:08] average, around?
[17:12:30] drdee: I am here, yes
[17:12:48] millimetric, average: wanna quickly talk about 356?
[17:13:04] sure
[17:13:32] ottomata, is this done: https://mingle.corp.wikimedia.org/projects/analytics/cards/385 ?
[17:14:17] ottomata, how busy are you this afternoon?
[17:14:58] New patchset: Milimetric; "fixing bugs from DarTar's feedback" [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62155
[17:15:16] finally god
[17:15:32] milimetric, hangout?
[17:16:23] TypeError: exception() takes at least 1 argument (0 given)
[17:16:37] average: hangout?
[17:17:15] which one ?
[17:17:31] the regular one
[17:17:33] ok
[18:05:27] erosen: got a moment?
[18:05:40] not right now
[18:05:42] in a meeting
[18:05:57] we can chat at noon?
[18:06:00] sure
[18:06:07] are you in the office?
[18:06:08] milimetric: what's up with vagrant these days?
[18:06:18] which vagrant
[18:06:24] ori's mediawiki one?
[18:06:25] erosen: ye[
[18:06:35] cool
[18:29:32] udp-filter is not compatible with file format in 2011 because it had IPs of the form
[18:29:40] 88.88.88.88|US
[18:30:00] udp-filter is now splitting on tabs instead of spaces
[18:30:37] ok, skip 2011
[18:30:50] so I have to write some code to detect the delimiter first
[18:31:06] so I can tell udp-filter what the delimiter is with -F
[18:31:14] no :)
[18:31:31] just determine the first file that has not 88.88.88.88|US
[18:32:12] these are two separate problems
[18:32:26] drdee: I understood your solution for 88.88.88.88|US and I will use it
[18:32:32] drdee: but now there's another one, the separator
[18:32:44] which is different pre/post 1st feb 2013
[18:32:49] right
[18:33:08] so pre 1st feb 2013, has space as delimiter
[18:33:40] yes, I'll have to adapt the script, to use this information
[18:34:24] remember that card I wrote about heterogenous data ? it was hard for me to describe it properly
[18:34:29] but this is what I wanted to put in it
[18:34:39] we have heterogenous data formats across time
[18:38:28] ottomata: I'm around if we need to discuss anything today (since I hear you're off next week).
[18:39:20] average: you are making it too complicated
[18:39:36] find the files that are space delimited and are not yet geocoded
[18:40:09] write a bash script to geocode those files
[18:40:20] yes
[18:46:10] xyzram: in a meeting uhhh, not sure
[18:46:19] drdee wants me to just try your new stuff
[18:46:32] i'll try to build it on analytics1008 and run some stuff there
[18:46:39] are you working in a branch?
[18:50:12] ottomata: No, it is just standalone code in ~ram/udp2log-local/src built with a trivial Makefile
[18:53:30] can you commit it to a branch of udp-filter repo?
[18:55:27] xyzram: ^?
[18:55:37] ok, sure.
[18:55:56] danke
[18:57:36] Just "git review" as for other projects OK or is there some other preferred way ?
[18:57:50] i think you can make a branch and just push to it
[18:57:55] we'll do review when we merge back into master
[18:58:00] git push ?
[18:58:07] git push origin
[18:58:09] i think will doi t
[18:58:20] Ok, thanks.
[19:00:42] mwalker: ping
[19:01:40] erosen: pong! I was wondering, now that I have stat1 access; a) how do I get onto stat1, and then b) that counting script you wrote; I was going to play around with it for that list of URLs so I was wondering if you had any pointers (/paths I needed to know to get to the log files)
[19:01:59] sure, want to chat in person?
[19:02:17] can do -- are you on 3 or 6?
[19:02:37] 6
[19:02:56] i can come to you, or you to me
[19:03:46] you can also meet midway at in between floor 4 and 5 in the elevator and test the wifi network
[19:09:56] xyzram: lemme know the branch if it works
[19:13:58] drdee: I wrote a script to find out what you wrote above
[19:14:07] cool!
[19:14:10] it's now running
[19:14:20] it detected 6 files so far
[19:14:31] there are ~400 more to process..
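[editor's note] The format-detection problem average and drdee debate above (space vs. tab delimiters across the Feb 2013 boundary, plus the old 88.88.88.88|US inline-geocoded IP form) only needs the first line of each file, as drdee points out. A hypothetical sketch of that heuristic -- not the actual wikistats script:

```python
def detect_format(first_line):
    """Guess a log file's delimiter and whether IPs carry inline |CC geocoding.

    Heuristic sketch: if the first line contains any tab, assume tab-delimited
    (post Feb 2013); otherwise assume space-delimited. An inline-geocoded IP
    (the old 88.88.88.88|US form) shows up as a '|' in the first field.
    The assumption that the IP is the first field is hypothetical.
    """
    delimiter = '\t' if '\t' in first_line else ' '
    first_field = first_line.split(delimiter, 1)[0]
    inline_geo = '|' in first_field
    return delimiter, inline_geo
```

Reading only one line per file turns a six-hour full-parse run into a near-instant scan over the ~400 files mentioned above.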
[19:17:30] ottomata: branch multiplexor pushed
[19:17:56] dankeee
[19:18:28] je vous en prie (not sure what that says actually)
[19:19:52] Just realized: You might have to hack one line of the code since the names of
[19:20:05] output files are automatically generated with: sprintf( cmd_buf, "%s >> out_%d.log", args.cmd, i );
[19:25:56] milimetric: i meant with vagrant not working for you
[19:28:00] away for a bit, back in about an hour.
[19:29:41] oh, I got it to work yesterday, jeremyb_
[19:29:57] average figured out the right versions of virtualbox and vagrant to use on Ubuntu
[19:30:18] I sent them to ori-l, and they're: Vagrant 1.1.2, VirtualBox 4.2.8
[19:30:31] any other combination I tried (higher or lower) did not work for me
[19:39:21] $ dpkg -l virtualbox | fgrep virtualbox; gem list vagrant | fgrep vagrant
[19:39:24] ii virtualbox 4.1.18-dfsg-2+deb7u1 amd64 x86 virtualization solution - base binaries
[19:39:28] vagrant (1.2.1)
[19:39:30] worksforme
[19:39:31] but i'm debian not ubuntu
[19:49:43] drdee: ETA 6h for that script
[19:49:49] drdee: and after that I can run the geocoding..
[19:49:50] :|
[19:54:06] https://gist.github.com/wsdookadr/fef34b127d4300935e5e
[19:54:15] this is what's running now on stat1
[21:06:37] heyaa drdee
[21:06:49] and xyzram
[21:06:53] hi
[21:07:13] afaict, we can use new udp-filter to anon and geocode produce unsampled mobile into kafka, yay
[21:07:18] and i think we can do it with a single producer
[21:07:19] hey
[21:07:31] WOOOT
[21:07:35] that's great news
[21:07:35] !
[21:07:45] without needing multiplexor ?
[21:08:09] what do you mean by "new" ?
[21:08:12] your thing
[21:08:20] i just tried the version with your branch
[21:08:22] Oh, interesting, great news!
[21:08:29] i mean, we never tried it with the old one either
[21:08:30] :)
[21:08:37] but I just tried yours and it seems good
[21:08:46] great to hear.
[21:09:06] you guys want to build a .deb next week so that when I get back we can start doing so?
[21:09:44] yes, we will do that
[21:10:01] ram, thank yooouuu so much! that's really awesome!
[21:10:16] looking forward to you joining our scrum meetings, 10AM PST
[21:10:18] welcome, glad I could help.
[21:10:50] Yes, I'll be there Monday though I may miss Tuesday due to some personal stuff.
[21:10:54] ok, np
[21:13:33] oh awesome!
[21:15:52] milimetric: have the umapi fixes been deployed?
[21:16:07] not yet
[21:16:13] blocked?
[21:18:46] hm, no, i'm just waiting for review
[21:18:51] you guys said i shouldn't self merge :)
[21:23:35] oink
[21:23:39] ok i will merge
[21:23:40] 1 sec
[21:24:20] New review: Diederik; "Ok." [analytics/user-metrics] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/62155
[21:24:24] Change merged: Diederik; [analytics/user-metrics] (master) - https://gerrit.wikimedia.org/r/62155
[21:24:45] milimetric ^^
[21:25:19] thanks
[21:25:37] so now it'll be deployed whenever puppet runs
[21:25:40] or whenever someone pulls it
[21:25:49] yes so that is within the next 30 minutes
[21:26:35] ottomata, what's the name of the new branch in udp-filter?
[21:26:47] and if it is working, shouldn't we merge it into master?
[21:26:55] and do a code review?
[21:26:57] sure, ja
[21:26:58] totally
[21:26:59] xyzram ^^
[21:27:39] xyzram: can you submit a patch to gerrit that merges your branch?
[21:27:43] Branch name is 'multiplexor'
[21:27:53] k
[21:28:08] can you merge and push to gerrit for review?
[21:28:30] ok
[21:28:38] ty
[21:34:51] drdee: I'm getting an error with git review: ! [remote rejected] HEAD -> refs/publish/master/add-multiplexor (no new changes)
[21:34:59] sweet!
[21:35:12] milimetric is really good in fixing these problems :)
[21:35:27] but i think you have to merge the branch locally
[21:35:31] and submit that as a patch
[21:36:05] I did that: created a new branch called add-multiplexor, merged multiplexor and I still get this error.
[21:36:19] git checkout master
[21:36:24] git merge multiplexor
[21:36:36] Oh, merge into master ?
[21:36:39] yes
[21:37:12] For the other projects I always create a separate feature branch and "git review" that.
[21:37:21] Why is it different here ?
[21:37:48] because you pushed directly your feature branch
[21:37:56] you can do that
[21:37:58] actually
[21:38:02] for udp-filter
[21:38:03] but yes in general that's a better practise
[21:38:04] it probably doesn't matter
[21:38:07] doing into master is good
[21:38:21] but xyzram, in this case what you would want is not a feature branch, but a topic branch of master
[21:38:22] so
[21:38:34] git checkout -b add-multiplexor origin/master
[21:38:39] But then if I want to do an independent change I have to reset hard on master.
[21:38:47] nope
[21:38:49] its a local branch
[21:38:55] that gets submitted for review to the remote master
[21:39:10] so you can make as many of those local topic branches as you want
[21:39:20] and each one tracks the origin/master remote
[21:39:36] whenever you git pull it pulls from origin/master
[21:40:03] that way you can do separate work locally on different branches, but still use the same remote for all of them (makes the review side easier)
[21:40:25] so
[21:40:45] average: your script https://gist.github.com/wsdookadr/fef34b127d4300935e5e is nifty but you don't have to parse the entire file, the first line of every file will tell you what you need to know
[21:40:50] git checkout -b multiplexor-merge origin/master
[21:40:51] git pull
[21:40:51] git merge multiplexor
[21:40:51] git review
[21:41:05] i think that would do it
[21:41:23] or you could just merge into master (and not do any more work til the review is done), but i think the topic branch way is better
[21:41:24] I think I did something similar: git checkout master
[21:41:43] git checkout -b add-multiplexor
[21:41:59] git merge multiplexor
[21:42:03] git review
[21:42:09] and got that error.
[21:42:30] maybe you pushed against origin/master
[21:42:46] don't think so, lemme see
[21:42:52] let met chck
[21:43:17] This is how I pushed: git push origin multiplexor
[21:44:04] hmmm
[21:44:09] but did you create the remote branch?
[21:44:28] No, not even sure how to
[21:44:54] multiplexor was my local branch
[21:45:22] That was, I think, your recommendation earlier today :-)
[21:45:31] yes :)
[21:45:39] okay so master does not have your patch yet: https://gerrit.wikimedia.org/r/gitweb?p=analytics%2Fudp-filters.git;a=shortlog;h=refs%2Fheads%2Fmaster
[21:45:51] and there is a remote branch called multiplexor
[21:45:55] so that's all good
[21:46:03] ja, xyzram git push origin multiplexor created that
[21:47:01] Seems like git review wants to create a new branch but finds there is already an identical branch in place.
[21:47:16] So it says, "no changes"
[21:47:50] should I try to git review it?
[21:47:54] we can delete the remote branch, would that solve it?
[21:47:58] don't thikn so
[21:48:16] xyzram: git branch -a
[21:48:24] hmm no
[21:48:31] ottomata: yes see if git review works for you.
[21:48:37] k
[21:48:43] New patchset: Ottomata; "Added src/multiplexor.c which runs multiple copies of a filter." [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/62198
[21:48:51] :D
[21:48:54] hilarious
[21:49:09] oh!, uhhhhh
[21:49:10] Ok, so looks like that worked.
[21:49:15] was I supposed to use multiplexor to test stuff today?
[21:49:17] i just used udp-filter
[21:49:42] i don't even know what this is, i thoguth you were just doing performance improvement work on udp-filter
[21:51:19] drdee?
[21:51:35] It is performance work but doesn't modify udp-filter directly; it adds a shim which is a standalone program that multiplexes to multiple copies of udp-filter.
[21:51:54] i am confused ottomata; this is what you said:
[21:51:58] xyzram: without needing multiplexor ?
[21:51:59] [5:08pm] xyzram: what do you mean by "new" ?
[21:52:00] [5:08pm] ottomata: your thing
[21:52:00] [5:08pm] ottomata: i just tried the version with your branch
[21:52:13] so i assumed you used the multiplexor branch
[21:52:13] * robla tries to follow along too
[21:52:15] i did
[21:52:24] So did you test by just using udp-filter or did you use multiplexor as a filter ?
[21:52:26] i compiled from multiplexor branch
[21:52:33] and then used udp-filter
[21:52:37] no, i didn't know t here was a new binary
[21:52:44] No but you also need to use the new binary.
[21:53:07] So, looks like your testing shows that the multiplexor is not needed !
[21:53:24] That udp-filter alone is adequate.
[21:54:14] at least for the mobile stream
[21:54:36] ottomata, can you test the multiplexor on the full stream?
[21:54:38] how do I use multiplexor? pipes?
[21:54:56] sure, but guys, this is why I said we should try it first, we never tried it :p
[21:54:58] pipe 1 ./src/multiplexor -proc 2 -cmd "/usr/bin/udp-filter ....options...."
[21:55:15] Suitably modified.
[21:55:28] it would be nice to have a use case that fails, and then see if multiplexor helps
[21:55:32] That's why I posted the comment here about the output files.
[21:55:35] yeah
[21:56:16] Exactly, I've been trying to find a realistic case that fails for days and could make no progress.
[21:56:17] -proc is the number of threads?
[21:56:32] Number of processes.
[21:56:36] k
[21:56:39] It creates child processes.
[21:56:45] all identical
[21:56:53] except for the output files.
[21:57:22] let's add these notes to a file :)
[21:57:25] so this is cool, cause this will let me shard the stream on a single machine without examining content
[21:57:29] So I created a synthetic failure case where I added a delay loop for my testing.
[21:57:30] if we need to
[21:57:39] basically its what i'm doing right now on 4 of the analytics nodes
[21:57:41] with awk
[21:57:57] this lets us use a single node's multiple processors to produce
[21:58:06] hmmm
[21:58:07] yes
[21:58:09] wait, curious
[21:58:11] how does this output?
[21:58:20] can I do
[21:58:27] forks a shell which redirects output.
[21:58:31] pipe 1 multplexor … | kafka-producer?
[21:59:11] Right now that part is in the C code and it writes to a generated file name.
[21:59:55] new test imminent
[22:00:53] so it only outputs to files?
[22:01:21] average: what do you mean? and we need to brain bounce more often, you need to keep me or milimetric in the loop, we need to communicate more to overcome the distributedness of the team
[22:01:41] Right now yes, I can add additional parameters for writing to another process if needed.
[22:02:06] yeah, the i think the point of this is to produce to kafka
[22:02:21] which right now we are doing via the shell producer which reads from stdin
[22:04:06] so that will double the number of processes since each child will have its own output process.
[22:04:33] we could use librdkafka. heheh drdee :)
[22:04:51] or could you unmultiplex?
[22:05:10] Will "-o /path/to/cmd" be adequate ?
[22:05:31] https://github.com/edenhill/librdkafka/blob/master/examples/rdkafka_example.c#L146
[22:06:41] i guess so?
[22:08:04] I'll need to look at that code carefully and also read about kafka ...
[22:08:36] well, i think we should all talk about what we are trying to accomplish here
[22:08:39] right?
[22:08:48] Agreed,
[22:09:01] if the point is to anon + geocode logs mobile logs that we produce into kafka
[22:09:06] bikeshed color contribution: "-c" (for command) seems more appropriate than "-o" (which is often used to output to file in other utilities)
[22:09:42] sure, I'll use -c
[22:09:46] if we can already do that with udp-filter as is
[22:10:05] then no point of multiplexor
[22:10:05] should we continue to work on this?
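[editor's note] The multiplexor described above -- a parent that fans the input stream out to N identical child filters, each with its own generated out_N.log -- can be imitated in a few lines. The real multiplexor.c forks udp-filter children from C; this Python stand-in is a simplified sketch (round-robin sharding is inferred from ottomata's "shard the stream" remark, and is an assumption about the exact distribution policy):

```python
import os
import subprocess
import tempfile

def multiplex(lines, nproc, cmd):
    """Fan input lines out round-robin to nproc copies of cmd.

    Each child's stdout goes to out_<i>.log in a temp dir, mimicking the
    sprintf("%s >> out_%d.log", ...) naming in the real multiplexor.
    Returns the list of output file paths.
    """
    tmpdir = tempfile.mkdtemp()
    procs, files, paths = [], [], []
    for i in range(nproc):
        path = os.path.join(tmpdir, 'out_%d.log' % i)
        f = open(path, 'w')
        # each child runs an identical copy of the filter command
        procs.append(subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=f))
        files.append(f)
        paths.append(path)
    for n, line in enumerate(lines):
        procs[n % nproc].stdin.write(line.encode())  # shard by line number
    for p in procs:
        p.stdin.close()  # EOF lets each child drain and exit
        p.wait()
    for f in files:
        f.close()
    return paths
```

With `cmd=['cat']` as a trivial stand-in filter, two children each receive every other line, which is the per-filter parallelism the chat is weighing against udp2log's serial fan-out.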
[22:10:16] ottomata: probably not
[22:10:49] it's good to have this in our back pocket, but we don't have to use it right right right now
[22:10:59] i mean, multiplexor is cool and all, but udp2log itself is a multiplexor, it just doesn't support sharding
[22:11:06] for a single filter
[22:11:13] multiplexor looks like that's basically what it does
[22:11:26] Right, and I understood CPU consumption in a single filter was a bottleneck
[22:11:34] allows multiple procs to run the same code on the same input
[22:11:39] That's why I wrote it.
[22:12:13] this might be real useful if/when we want to use udp2log to import the full unsampled firehose
[22:12:22] not sure
[22:12:28] one other minor benefit that the multiplexor might provide is a little bit of output buffering from the central udp2log instance
[22:13:19] i.e. if the multiplexor is taking packets quickly from udp2log, then it's the multiplexor getting blocked by a slow filter, rather than udp2log
[22:13:31] Yes, but if the output processes are sufficiently slow, the buffering will ultimately not help
[22:13:46] sure....minor benefit :)
[22:13:51] right.
[22:13:54] can we use the multiplexor on oxygen and spin up the other zero filters that we had to disable?
[22:13:58] it'd deal with a small spike
[22:14:08] exactly.
[22:14:19] a baby pope :D
[22:14:51] mui pequino
[22:14:55] (sp?)
[22:15:21] drdee, i think it woudlnt' only because there are lots of filters already, so we're already running into more filters than there are processor
[22:15:22] s
[22:15:39] 8 procs there
[22:15:55] ok, just checking
[22:16:22] one thing Tim suggested is that perhaps we should allocate of the 24 core Hadoop boxes to this
[22:16:30] But are all cores busy ?
[22:16:38] we are running 17 filters there
[22:16:46] how many cores ?
[22:16:48] 8
[22:17:05] looking at htop right now, not all cores are busy
[22:17:45] exactly, I'm seeing around 4 underutilized.
[22:17:56] so there it could help.
[22:18:23] hm, wonder why it isnt' using more procs though, udp2log is basically a multiplexor for its configured filters
[22:18:24] so maybe?
[22:18:27] i dunno
[22:18:29] I seem to recall there was some design problem that kept it from using all of the cores efficiently
[22:18:33] hm
[22:18:58] oh....I think I know what it is.
[22:19:08] all it takes is one of the filters to max out
[22:19:31] it blocks the central process
[22:19:40] If one or more filters are slow enough that is cannot consume the input stream data is lost.
[22:19:48] right, exactly
[22:20:18] That's the scenario where the multiplexor can help.
[22:21:15] even a chained udp2log instance might help on oxygen
[22:21:26] ...if you get the config right
[22:22:06] udp2log is consuming around 40% of CPU, one instance of udp-filter is using 12% all else is in single digits.
[22:22:41] the other scenario that you can hit with udp2log is that there may not be one single "slow" filter, but the accumulation of blocking from the other processes can still add up
[22:23:25] what do you mean by "accumulation of blocking" ?
[22:23:45] well, it runs each filter in serial, then starts over again
[22:23:49] right?
[22:24:03] yes
[22:24:41] maybe one of the filters gets behind a little and blocks, but not enough on its own to cause problems
[22:24:56] but the central process has to get through them all
[22:25:09] there is some code that checks that all output pipes are writable or something like that -- I need to dig into the code ...
[22:25:33] right, ok, that's possible.
[22:26:07] 2 cores are completly idle.
[22:26:20] almost
[22:28:10] seems like the consensus is to set this aside for now until we have a concrete failure scenario ?
[22:28:31] yeah, at least for now
[22:30:27] at some point, when Andrew is not on the verge of going on vacation, and we've got a demand to turn something back on on oxygen, that seems like a fine time to revisit this.
[22:30:52] ok
[22:32:13] robla: IIRC one of your concerns was that parallelizing the work over multiple cores could violate assumptions made by various analytic scripts that incoming requests are processed serially in the order they were received
[22:32:29] worth checking with erik zachte if that is the case (it may well be)
[22:34:07] ori-l: that is true in the general case. I think the nice thing about the multiplexor is that it's something we can do on a per-filter basis
[22:34:44] geocoding is a great, non-order dependent thing
[22:35:20] hrm, yeah. and probably among the most expensive filters.
[22:35:34] but you're right that we need to be careful not to just assume we can slap it on any filter
[22:36:55] Do we have numeric data on how expensive each filter is ?
[22:37:05] no
[22:37:32] I think we should -- it is hard to design a solution without hard numeric data for this type of problem.
[22:37:54] that's true, and it's something that could be done without touching production
[22:38:25] I ran a single geo coding filter on the full unsampled stream and saw no loss of data.
[22:38:35] on locke.
[22:39:18] how did you check?
[22:39:27] Next time we revisit this issue, that is the first thing I would tackle -- also it is not just the type of filter the arguments also make a difference.
[22:39:51] The longer the list of IP addresses for example the longer it takes
[22:40:12] yes, but those are very a-typical filters
[22:40:17] we are not trying to optimize those
[22:40:27] I ran /bin/cat concurrently to a file and wrote a script to analyze the sequence numbers in that file.
[22:40:28] we will deprecate them and use the X-CS header
[22:41:06] how about country list in geocoding ?
[22:41:17] also a-typical
[22:41:27] I think Ram's point still stands. command line options matter
[22:41:58] yes of course but we don't need to investigate that
[22:42:13] there is a clear alternative for the wikipedia zero filters by using the X-CS header
[22:42:24] we are waiting for evan to give the go
[22:42:31] anon + geo is what we are trying to do for kraken, right?
[22:42:35] yes
[22:42:35] so benching that would be uesful
[22:42:38] that
[22:42:41] I think it would really help if we have a list of "typical" filter options we want to use on a wiki page somewhere.
[22:42:46] that's the use case
[22:42:50] gotta run. sorry for crashing. i feel a cosmic yearning whenever udp logging is mentioned on irc.
[22:42:57] xyzram: 'typical' is hard
[22:42:57] i
[22:43:05] anon + geo is the typical filter option we are interested in right now
[22:43:09] but atypical is easy ?
[22:43:12] i've argued against that, as there are a ton of them and they change often enough
[22:43:26] but anon + geo is what we are trying to make awesome right now
[22:43:33] so we can store anoned data in kraken
[22:43:40] no atypical is hard and are exceptions,
[22:43:46] the kraken stuff is a must have
[22:43:58] i meant 'typical is hard to define'
[22:44:00] not hard tod o
[22:44:04] Ok that's a good start; let's characterize those with hard numbers next time we revisit.
[22:45:44] Ok but given an option looks like we are able to say with certainty that it is "atypical" .
[22:46:23] So let's list _all_ the options, identify those we can as "atypical"; what remains is "typical" for our purposes.
[22:46:36] i rather not do this :)
[22:46:42] ok
[22:46:43] the use case is this:
[22:47:07] make sure that udp-filter can geocode at country level and anonymize the unsampeld mobile varnish stream
[22:47:17] if we succeed in doing that then we are done
[22:47:44] that's the minimum viable feature we are looking for
[22:48:23] That's great, so did ottomata's test today show that we can already do that ?
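[editor's note] Ram's loss check above -- running /bin/cat concurrently on the stream and analyzing the sequence numbers in the output file -- boils down to looking for gaps in a monotonically increasing counter. A minimal sketch of that gap analysis (extracting the sequence numbers from the log lines is assumed to have happened already; the exact field position varies by stream):

```python
def count_lost(seqs):
    """Given the sequence numbers seen, in order, return how many were skipped.

    udp2log stamps lines with an increasing per-source counter, so any jump
    larger than 1 between consecutive observed values means dropped lines.
    Wraparound and out-of-order delivery are ignored in this sketch.
    """
    lost = 0
    for prev, cur in zip(seqs, seqs[1:]):
        if cur > prev + 1:
            lost += cur - prev - 1
    return lost
```

A run with no loss returns 0; a jump from 3 to 7 counts the three missing lines in between.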
[22:48:51] ottomata ^^
[22:51:51] i think so yes
[22:52:40] okay aweseom!
[22:54:44] so when you get back, we should enable the anonymization and geocoding
[22:54:50] on the mobile streams
[22:55:04] and reimport the old data and put them through the udp-filter as well
[22:55:19] mmk, we can probably do that with hadoop streaming, eh? :)
[22:55:26] yes
[22:55:28] i think so
[22:55:37] that will break the existing geocoding jobs though, need to be careful about those
[22:55:56] well easy to fix,
[22:56:03] disable geocoding function in pig
[22:56:13] and point to the new country field
[22:56:16] well, i mean, the jobs sort by stuff too
[22:56:17] eyah
[22:56:25] also, i think i'd like to run this along side of the existing stuff for a while…hmmm, i'll leave the one on an09 running while I'm gone ig uess
[22:56:30] it won't hurt, its just going into kafka
[22:56:46] sounds good
[22:56:56] drdee: I'm in the hangout
[23:04:49] laterz gusy!
[23:06:01] laterz for me too byyyeyeye
[23:12:39] New patchset: Stefan.petrea; "Fix for mingle 356" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/62206
[23:13:28] New patchset: Stefan.petrea; "Fix for mingle 356" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/62206
[23:13:47] ciao otto
[23:13:54] bye drdee