[00:06:06] dammit
[00:06:11] why does Toby have to be right all the time?
[04:03:40] http://feltron.com/ (he of the Personal Annual Reports with nice visuals) just helped make an iphone app. http://www.reporter-app.com/
[14:56:16] (PS6) Nuria: [WIP] Changes tu support wikimetrics in vagrant. [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/109676
[15:32:04] (PS10) Ottomata: Initial debian version [analytics/kafkatee] (debian) - https://gerrit.wikimedia.org/r/110620
[15:38:11] (CR) Edenhill: [C: 1] Initial debian version [analytics/kafkatee] (debian) - https://gerrit.wikimedia.org/r/110620 (owner: Ottomata)
[15:38:43] heyaaa qchris
[15:38:52] heyaaaa ottomata
[15:39:19] i've got kafkatee up on analytics1003 producing what should be a sampled 100 mobile tsv log
[15:39:28] i'd like to compare it to the one that is produced from udp2log
[15:39:34] to make sure it will work as a drop in replacement
[15:39:39] Awesome!
[15:39:47] but, since they are sampled, i can't just diff the log files
[15:39:57] what would be a good way of checking?
[15:40:17] counting hits to domains maybe?
[15:40:45] For most fields, counting different buckets should work fine.
[15:40:52] Like comparing ratios from hosts.
[15:41:05] Comparing ratios for HTTP result codes and the like.
[15:41:39] Stripping URLs down to the domain and comparing them sounds good as well.
[15:42:47] Looking over the fields ... I think one has to strip to the domain (maybe including the first part of the path) for the url and the referrer
[15:43:14] User Agent ... I am not too sure ... There I'd probably just look at the ratios for the top ten User-Agents.
[15:43:37] For the other fields, comparing plain ratios should be fine.
[15:44:03] well, i only want to do a rough check right now
[15:44:07] so one or two fields I guess is good?
[15:44:22] Go for hostname.
[15:44:27] That has only a few entries
[15:44:29] just machine hostname?
[15:44:32] That's a good first test.
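The rough check discussed above (comparing per-bucket ratios between the two sampled logs instead of diffing them) could be sketched roughly as follows. This is only an illustration, not what was actually run: the position of the hostname and URL columns, the file paths passed in, and all function names are assumptions made for the example.

```python
import csv
from collections import Counter
from urllib.parse import urlsplit

# Assumption: hostname is the first tab-separated field of each log line.
HOSTNAME_COLUMN = 0


def read_tsv(path):
    """Load a tab-separated log file as a list of rows (lists of fields)."""
    with open(path, newline="") as handle:
        # QUOTE_NONE keeps quote characters in fields such as User-Agent intact.
        return list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def bucket_ratios(rows, column, key=lambda value: value):
    """Return {bucket: share of rows} for one column of a sampled log."""
    counts = Counter(key(row[column]) for row in rows if len(row) > column)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()} if total else {}


def domain_and_first_path(url):
    """Strip a URL down to its domain plus the first path segment."""
    parts = urlsplit(url)
    first = parts.path.strip("/").split("/")[0]
    return parts.netloc + ("/" + first if first else "")


def max_ratio_gap(ratios_a, ratios_b):
    """Largest absolute difference in share over buckets seen in either log."""
    buckets = set(ratios_a) | set(ratios_b)
    return max(
        (abs(ratios_a.get(b, 0.0) - ratios_b.get(b, 0.0)) for b in buckets),
        default=0.0,
    )


def compare_logs(path_a, path_b, column=HOSTNAME_COLUMN, key=lambda v: v):
    """Compare one column of two sampled logs by bucket share."""
    ratios_a = bucket_ratios(read_tsv(path_a), column, key)
    ratios_b = bucket_ratios(read_tsv(path_b), column, key)
    return max_ratio_gap(ratios_a, ratios_b)
```

Calling `compare_logs(udp2log_path, kafkatee_path)` with two hypothetical file paths would give the largest hostname share difference; passing `key=domain_and_first_path` together with the URL column index would do the same for stripped URLs. Since both logs are independent samples, a small nonzero gap is expected, and what counts as "close enough" is a judgment call.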
[15:44:34] ok
[15:45:08] If you want to test for a second field, I'd use the URL.
[15:45:22] (Stripped to the domain+first part of path)
[15:46:37] can't you pipe a sampled log file through kafkatee and make sure those two match?
[15:47:02] * drdee goes back into his cave
[15:48:21] Can we guarantee that the sampling picks the same elements for both streams?
[15:48:44] Get out of your cave already! :-D
[15:49:57] it's winter - i am in hibernation
[15:51:13] It's almost spring over here ... my chives already show fresh green stems :-)
[15:51:41] drdee.dehibernate();
[15:51:46] yeah, naw we can't guarantee that, i don't think
[15:51:50] hey Snaps
[15:51:53] its working!
[15:51:58] q though, you there?
[15:52:03] sys.write( drdee.context.getTemperature().to_celsius )
[15:52:16] ottomata: splendid!
[15:52:19] shoot
[15:52:50] I see log messages like this
[15:52:51] Closing output "/var/spool/kafka/e/kafaktee.test/mobile-100.log.tsv": exited with status 0 (346 messages in queue)
[15:52:52] sys.write('Current temperature: %d' % -25)
[15:52:54] every 3 or 4 seconds
[15:52:58] is that ok
[15:52:58] ?
[15:53:53] huhm
[15:54:03] thats an output file X /var/spool... ?
[15:54:13] yes
[15:54:20] output file 100 /var/spool/kafka/e/kafaktee.test/mobile-100.log.tsv
[15:54:54] okay, bug.
[15:54:57] let me fix that
[15:54:59] k
[15:55:54] yeah, it doesn't seem to do it with pipe
[15:55:55] i just changed it to
[15:56:00] pipe 100 /bin/cat >> ...
[15:59:23] (PS1) Edenhill: File outputs were erroneously closed in outputs_check [analytics/kafkatee] - https://gerrit.wikimedia.org/r/112467
[16:01:56] ok!
[16:05:20] coooool looking good Snaps
[16:05:34] (CR) Ottomata: [C: 2 V: 2] File outputs were erroneously closed in outputs_check [analytics/kafkatee] - https://gerrit.wikimedia.org/r/112467 (owner: Edenhill)
[16:05:46] nice :) is it drinking from the firehose now?
[16:08:20] yup
[16:10:43] ottomata: cool, whats the cpu and memory usage like?
[16:11:56] mem super low
[16:12:02] 28M res
[16:12:16] i'm only running one output
[16:12:26] 10 top-par intpu
[16:12:38] sampled 100
[16:14:09] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=analytics1003.wikimedia.org&m=cpu_report&r=hour&s=descending&hc=4&mc=2&st=1392048769&g=cpu_report&z=large&c=Analytics%20cluster%20eqiad
[16:14:58] good :)
[16:23:20] ahhh Snaps
[16:23:21] also
[16:23:28] cpu was busy just now
[16:23:36] because it was consuming with offset
[16:23:38] so since feb 8
[16:23:41] so it had a lot to do
[16:23:49] i was wondering how my log file was so big already :p
[16:23:54] aha :) has it caught up yet?
[16:23:58] naw it hadn't
[16:24:04] but, i didn't want it to
[16:24:10] i just want to test latest data right now
[16:24:13] so i just removed offset files and restarted
[16:24:17] load is much lower now
[16:24:25] but, awesome that that worked!
[16:24:39] that is already a huuuuuuge improvement over udp2log
[16:26:22] very good!
[16:26:44] will you put kafkatee rdkafka stats in ganglia too?
[16:28:29] fo sho!
[16:28:40] sadly though, i didn't write a generic module for tailing json stats
[16:28:51] so I will have to copy/paste and make a custom one :/
[16:29:16] ok cool
[16:30:18] great
[16:30:21] sighup works great too
[16:30:29] ok Snaps
[16:30:33] this is looking good I think
[16:30:44] we'll need to roll a tag of 0.8.3 before I can officially deploy this
[16:30:58] i can do a pull request for librdkafka for that
[16:30:59] should I?
[16:31:16] well, i mean, pull request for debian branch
[16:31:19] I need Faidon to do a testbuild first to see that the new symbol thing works as expected
[16:31:22] you'll have to create a tag
[16:31:23] hmmmm
[16:32:40] https://github.com/ottomata/librdkafka/commit/487eaabb2f0ea2d2a5a296be5b25b80254d9eb79
[16:32:42] i will email faidon
[16:32:43] and you
[16:33:00] I mailed him a couple of days ago about it, but I think he's been very busy
[16:34:03] hmm oh ok
[16:34:50] MBO)BO) l-577iououtpju;;;;;; jt;;::
[16:35:02] but maybe we dont need to verify that, that commit looks rather good. I just need to fix a couple of things in master first, then I'll create the tag
[16:35:03] LL ?L L/
[16:37:54] drdee: so yeah that -25°Celsius is the main reason I dont relocate to eastern Canada. My wife find it is too cold for some reason
[16:37:59] have fun fixing your keyboard
[16:41:56] ottomata: there, 0.8.3 tagged and released
[16:46:15] oof so fast!
[16:46:16] awesome
[16:51:30] ottomata: anything else before I leave?
[16:52:03] don't think so, things are looking good, just need to get kafkatee debian reviewed and merged, faidon to push debian for 0.8.3 librdkafka
[16:52:05] and puppet stuff
[16:52:07] but ja, you are good
[16:52:08] thanks!
[16:54:34] okay, great! :) see ya
[17:05:30] (CR) Alexandros Kosiaris: [C: 2] Initial debian version [analytics/kafkatee] (debian) - https://gerrit.wikimedia.org/r/110620 (owner: Ottomata)
[17:53:07] (CR) Milimetric: [C: -1] "since this is done multiple times, we should DRY it up. I know this is just firefighting, sorry if it feels like nitpicking." (2 comments) [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112207 (owner: QChris)
[17:56:47] (CR) Milimetric: [C: -1] Drop three carriers from total sum (1 comment) [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112210 (owner: QChris)
[17:57:36] (CR) Milimetric: [C: 2 V: 2] Stop excluding a carrier's start date from the treated dates [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112211 (owner: QChris)
[17:58:12] (CR) Milimetric: [C: 2 V: 2] Use year to detect edge cases when downsampling months [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112212 (owner: QChris)
[17:58:43] (CR) Milimetric: [C: 2 V: 2] When counting bad days for a month, repspect the year [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112213 (owner: QChris)
[17:59:08] (CR) Milimetric: [C: 2 V: 2] Updating bad days [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112214 (owner: QChris)
[18:00:28] (CR) Milimetric: [C: 2 V: 2] "cool :)" [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112215 (owner: QChris)
[18:05:17] (CR) Milimetric: [C: -1] Fix treatment of good days, if no request was logged (1 comment) [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112216 (owner: QChris)
[18:05:52] (CR) Milimetric: [C: 2 V: 2] Updating bad dates [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112217 (owner: QChris)
[18:08:07] (CR) Milimetric: "Kinda funny that you did these in the same gerrit Change group. But I like it because it shows history properly. If you accept my refact" [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112218 (owner: QChris)
[18:09:18] (CR) Milimetric: "now I'm reviewing a bunch of code that's getting deleted..." [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112219 (owner: QChris)
[18:11:03] (CR) Milimetric: [C: 2 V: 2] Add graph that sums monthly graphs [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112220 (owner: QChris)
[18:12:15] (CR) Milimetric: [C: 2 V: 2] Stop dropping carrier's start date for summary graphs [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112221 (owner: QChris)
[18:12:36] (CR) Milimetric: [C: 2 V: 2] Mark 2013-08-14–2013-08-18 as bad days [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112222 (owner: QChris)
[18:13:02] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Orange Madagascar' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112223 (owner: QChris)
[18:13:34] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Banglalink Bangladesh' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112224 (owner: QChris)
[18:13:58] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Umniah Jordan' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112225 (owner: QChris)
[18:14:55] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Airtel Kenya' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112226 (owner: QChris)
[18:39:28] (CR) Milimetric: [C: -1] Strip markup from MCC-MNC loaded from wiki table (1 comment) [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112227 (owner: QChris)
[18:39:42] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Beeline Kazakhstan' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112228 (owner: QChris)
[18:40:00] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Tcell Tajikistan' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112229 (owner: QChris)
[18:40:48] (CR) Milimetric: [C: 2 V: 2] "fine assuming the related review is addressed." [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112230 (owner: QChris)
[18:41:04] (CR) Milimetric: [C: 2 V: 2] Mark 2014-01-05, and 2014-01-06 as bad dates [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112231 (owner: QChris)
[18:41:20] (CR) Milimetric: [C: 2 V: 2] Add carrier 'Grameenphone Bangladesh' [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112232 (owner: QChris)
[18:41:42] (CR) Milimetric: [C: 2 V: 2] Mark 2014-02-05, and 2014-02-06 as bad dates [analytics/wp-zero] - https://gerrit.wikimedia.org/r/112233 (owner: QChris)
[18:57:19] ottomata, ping
[19:01:00] poonnngg
[19:01:55] ottomata, so, toby and I were talking about hive bugs and issues, and he mentioned you're someone who follows up on them. I note you're not on the list of default subscribers for kraken bugs; would you like to be?
[19:02:04] (I appreciate you probably already get a metric frick-ton of email)
[19:04:14] ah, i barely even use bugzilla, didn't know there was a kraken bug list
[19:04:33] aha. then...is that a yes? ;p
[19:04:35] yes!
[19:04:39] cool!
[19:04:42] I shall add you
[19:04:53] TL;DR you'll be automatically told about kraken-related bugs unless you opt out of them
[19:05:15] (or go "Oliver, this is ridiculous, I haven't seen my significant other in a week and my parents are wondering if I've died or something, please turn off the firehose")
[19:05:35] haha ok
[19:14:31] nuria: yt?
[19:58:34] sayhar, lzia: lunch plans?
[19:58:46] DarTar: Let's chill.
[19:58:52] lzia %^
[19:59:24] DarTar: No plans though I have my lunch with me.
[22:43:26] ottomata: gerrit is complaining about missing 37c1ba61731779bfc621eb36b6ca55d445cc2a9d when replicating for operations/debs/kafka
[22:43:40] Since the commit is still on github, can I just teach gerrit about it,
[22:43:46] or is that on purpose missing?
[23:03:59] ottomata: You've had your chance, I now fixed it :-)