[14:30:30] morning guys
[14:37:20] morning
[17:15:03] mornin
[17:16:18] morning
[17:16:27] dschoon, good and bad news since our last chat
[17:16:35] increasing read buffer helped a alooooot
[17:16:38] morning
[17:16:39] yay!
[17:16:42] as well as using netty threads to handle messages
[17:16:46] I'm doing both now :)
[17:16:46] nice.
[17:16:52] was it as easy as hoped?
[17:16:54] but, still not good enough
[17:17:00] did netty fulfill its promise?
[17:17:00] yeah not bad, found a good stack overflow
[17:17:03] *nod*
[17:17:16] i tried to do some research into figuring out the interface properties
[17:17:18] and hardware
[17:17:18] the channelpipeline just had to have a threaded executor handler in front of the message receiver handler
[17:17:21] and i did not succeed
[17:17:25] *nod*
[17:17:36] http://stackoverflow.com/questions/9637436/lot-of-udp-requests-lost-in-udp-server-with-netty
[17:17:44] it's basically the exact same problem I was having :)
[17:18:36] nice!
[17:18:40] i will read it
[17:18:46] because i spent another hour with the docs
[17:18:54] and i also could not figure out how the puzzle pieces fit together
[17:18:58] usually with java that's kind of fun
[17:19:05] this time it was rather mysterious
[17:19:45] haha, yeah, if it wasn't for this post I would have no idea!
[17:22:48] i do like netty quite a bit though
[17:22:53] the more i read, the more i got it
[17:23:03] yeah it's pretty thorough
[17:23:17] after adding the threads, I see the load spread across the cpus more too
[17:23:21] br
[17:23:23] brb
[17:50:05] drdee, do you have time after scrum to sync up?
[17:50:31] YES!
[17:50:35] sweet
[17:50:46] we can also talk about the guy i interviewed yesterday
[17:53:08] i moved the appointment
[17:53:23] you need a room?
[17:56:42] nah
[17:56:44] i'm wfh
[17:56:54] i'm trying to avoid getting the flu
[17:57:08] as my doctor doesn't think i had it last week. i had SOMETHING ELSE
[17:57:17] which is terrifying.
i do not need to lose ANOTHER week to sickness
[17:59:17] yeah, I've been thinking of permanently moving to a rural area and working from home, for health reasons ;)
[18:01:36] come to toronto
[20:05:16] ottomata, just fyi, i'm looking into packetloss on oxygen atm
[20:05:24] mmk
[20:05:29] cool
[20:05:41] i kind of freaked out when drdee told me about his findings
[20:06:17] i am probably wrong
[20:06:26] i am hoping so.
[20:06:30] (at least about the part that it affects oxygen)
[20:06:39] but if you look at the convo i just had in MW-sec, you may be less optimistic
[20:08:16] dschoon: what is MW-sec, btw?
[20:08:26] #mediawiki_security
[20:08:37] i see
[20:08:39] invite only
[20:08:59] the name says it all, heh
[20:09:08] yeah
[20:09:24] (i only asked there because nagios is spamming #ops)
[20:35:12] hey drdee, dschoon, so
[20:35:18] sup
[20:35:27] re possible network packet loss in udp2log stream (dschoon I just told you this)
[20:35:34] yo
[20:35:46] i captured ~1 minute of logs from just cp1044 on an26 using udp2log
[20:35:49] and all of the seqs were there
[20:35:51] so
[20:36:02] that indicates to me that all of the packets can/do make it to the NICs
[20:36:10] pffffewwwww
[20:36:45] is that test easy to repeat?
[20:37:13] related: could you commit/gist the code?
[20:37:29] because i think we might want to do this regularly.
[20:37:50] i'm still going to continue reading through our packetloss ganglia stuff until i understand it all
[20:37:54] just to be sure
[20:42:41] dschoon
[20:42:42] https://gist.github.com/4568355
[20:42:47] ty!
[20:42:48] <3
[20:42:50] you are the best
[20:55:35] ottomata, can you quickly add dan to stat1, ct has approved
[20:55:53] dan needs to download my funnel jar from stat1 :)
[20:55:58] ah yeah
[20:58:22] ottomata: where'd you run that?
[20:58:50] analytics1026
[20:59:05] gotcha.
[21:01:10] it should work on any eqiad machine though
[21:01:16] (if udp2log is installed)
[21:03:03] dschoon: demo-fri?
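Editor's sketch of the sequence check described at 20:35 (capture a short run of udp2log output from a single host, then confirm no sequence numbers are missing). This is not the code in the linked gist, which is not reproduced in the log. The function name `find_missing_seqs`, the sample lines, and the assumption that the sequence number is the second whitespace-separated field are all hypothetical; adjust `seq_field` to match the real log layout.

```python
from typing import Iterable, List


def find_missing_seqs(lines: Iterable[str], seq_field: int = 1) -> List[int]:
    """Return sequence numbers absent between the lowest and highest seen.

    seq_field is the zero-based index of the sequence number in a
    whitespace-split line (an assumption, not taken from the log).
    """
    seen = set()
    for line in lines:
        fields = line.split()
        if len(fields) <= seq_field:
            continue  # skip blank or truncated lines
        try:
            seen.add(int(fields[seq_field]))
        except ValueError:
            continue  # field is not numeric; ignore this line
    if not seen:
        return []
    # A set lookup tolerates packets that arrive out of order.
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Made-up capture from one host: 103 and 104 never arrived.
sample = [
    "cp1044.example 100 first-request",
    "cp1044.example 102 third-request",
    "cp1044.example 101 second-request",
    "cp1044.example 105 sixth-request",
]
print(find_missing_seqs(sample))  # [103, 104]
```

An empty result for a capture would match the "all of the seqs were there" observation. The check is only meaningful on an unsampled stream from one host at a time, as in the test described above.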
[21:03:19] ja
[21:03:20] sec
[21:08:05] hey all
[21:31:48] drdee, ottomata: you need to use another character as a field separator in the mobile req log
[21:31:49] http://i.imgur.com/DCKJ9.png
[21:32:21] there are spaces in the character encoding so the breakdown by column is inconsistent from one row to the next
[21:33:00] hahaha, indeeeeEEEEd
[21:33:19] we have been talking about that for about 8 months now
[21:33:25] BUT
[21:33:31] those are varnish logs
[21:33:33] and there should not be spaces
[21:33:58] can you give me those few raw lines ori-l?
[21:34:18] the one you pasted in a pm
[21:34:25] k
[21:35:31] IIINnnnteresting
[21:35:34] application/json; charset=utf-8
[21:35:36] space in content type!
[21:36:24] yes.
[21:36:37] there's also a tab separating the seq id and the other fields
[21:36:48] so if you split on space, the host and seq id are combined
[21:36:51] 316554683463cp1043.wikimedia.org
[21:36:55] yes
[21:37:04] that is a hadoop kafka importer annoyance
[21:37:11] that first number isn't a seq
[21:37:17] it is the byte offset in the kafka buffer
[21:37:32] since I haven't needed to do any analysis that needed hostname
[21:37:34] i've ignored it
[21:37:41] split on space and drop first field
[21:37:42] but
[21:37:49] space in content-type is a problem fo sho
[21:38:06] we've been wanting to use tab as separator foreevuuuhhh now
[21:38:07] yeah.
and beeswax / hive only lets you split on a single character
[21:38:14] but also don't want to break downstream scripts (mainly erik z's)
[21:38:15] so our attempts to get clever are failing
[21:38:23] and sadly user agent is the last field
[21:38:29] pig will let you
[21:38:33] i haven't used hive very much
[21:38:40] ok, we'll try pig
[21:38:45] here are some samples
[21:38:59] https://github.com/wmf-analytics/kraken/tree/master/src/main/pig
[21:39:03] particularly, for mobile stuff
[21:39:17] for that hourly continent graph thing I was playing with
[21:39:17] https://github.com/wmf-analytics/kraken/blob/master/src/main/pig/geocode_group_by_continent.pig
[21:39:42] is there an interface for hive?
[21:39:43] brb
[21:40:02] i have barely used hive at all
[21:40:14] for pig, there is a shell
[21:40:21] http://hue.analytics.wikimedia.org/shell/create?keyName=pig
[21:40:39] or, if you have access to servers (which we can get you), you can run pig scripts via cli
[21:40:43] that's what i've been doing
[21:44:52] i have access to the servers, i think
[21:44:56] how do you invoke the CLI?
[21:48:53] ^ ottomata
[21:49:05] grunt
[21:52:41] ottomata, drdee: http://browsertoolkit.com/fault-tolerance.png
[21:53:06] ROFLOL
[21:54:26] ok, i think we're going to download the files for a particular day
[21:54:33] which day is least likely to have gaps?
[21:56:09] ^ ottomata / drdee
[21:57:24] unsampled is not yet working without packet loss, if you want just raw sampled data you can get it from stat1:/a/squids/archive/mobile
[22:00:12] pig
[22:01:24] ori-l
[22:01:25] https://www.mediawiki.org/wiki/Analytics/Kraken/Tutorial
[22:13:52] brb food
[22:37:02] !logbot yesterday, deployed updated filter for webstatscollector to collect page view counts for wikimediafoundation.org
[22:37:36] !analytics-logbot logbot yesterday, deployed updated filter for webstatscollector to collect page view counts for wikimediafoundation.org
[22:37:36] I am a logbot running on bots-analytics.
[22:37:36] Messages are logged to www.mediawiki.org/wiki/Analytics/Server_Admin_Log.
[22:37:36] To log a message, type !log .
[22:37:50] !log yesterday, deployed updated filter for webstatscollector to collect page view counts for wikimediafoundation.org
[22:37:53] Logged the message, cap'n
[22:56:08] back
[22:56:30] ori-l: lol
[22:56:44] god that's fantastic.
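Editor's sketch of the field-separator problem discussed from 21:31 to 21:38: the imported records carry a leading Kafka byte offset separated from the rest by a tab, and a content type such as `application/json; charset=utf-8` contains a space, so a plain split on space yields a different column count from row to row. The two sample records and the field order below are made up for illustration, and the helper names `split_naive` and `drop_offset` are hypothetical. Splitting off the offset on the tab first is a variation on the "split on space and drop first field" approach mentioned above that keeps the hostname usable.

```python
def split_naive(line: str) -> list:
    """Split on single spaces only, as a one-character delimiter would."""
    return line.split(" ")


def drop_offset(line: str) -> str:
    """Remove the leading byte offset, which is tab-separated from the record."""
    _offset, sep, rest = line.partition("\t")
    return rest if sep else line  # leave lines without a tab untouched


# Hypothetical records: offset<TAB>host seq content-type user-agent
with_space = "316554683463\tcp1043.wikimedia.org 42 application/json; charset=utf-8 Mozilla/5.0"
no_space = "316554683464\tcp1043.wikimedia.org 43 text/html Mozilla/5.0"

# Splitting on space glues the offset to the hostname ...
print(repr(split_naive(with_space)[0]))

# ... and the space in the content type shifts every later column.
print(len(split_naive(with_space)), len(split_naive(no_space)))

# Dropping the offset on the tab recovers the hostname as the first field.
print(drop_offset(with_space).split(" ")[0])
```

Even with the offset removed, the content-type space still breaks the column alignment, which is why the log keeps returning to switching the separator to a tab.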