[13:49:35] good morning
[13:52:41] good morning milimetric
[14:23:00] moooorning!!!!!!
[14:24:11] morning!
[14:24:17] it is time for a 3rd pair of socks!
[14:24:57] indeed!
[14:34:38] so, unsampled packet loss monitoring doesn't look very helpful
[14:34:38] http://ganglia.wikimedia.org/latest/?c=Analytics%20cluster%20eqiad&h=analytics1009.eqiad.wmnet&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2
[14:34:55] the m means thousandths
[14:35:00] so over here
[14:35:00] http://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Analytics%20cluster%20eqiad&h=analytics1009.eqiad.wmnet&v=0.01253&m=packet_loss_90th&r=hour&z=default&jr=&js=&st=1359037834&vl=%25&z=large
[14:35:59] so that hovers between .005 and .05 %
[14:36:14] whereas the sampled packet loss monitoring hovers around 5%
[14:36:33] now, a big difference there is that there is nothing else going on on analytics1009
[14:36:46] and on analytics1003 (where there is sampled packet loss monitoring), there's all those filters running
[14:40:56] i don't understand how this could be though, if I am getting the same numbers writing to a file as I am in hdfs and only have 0.05% loss
[14:41:02] very confusing
[14:41:53] how about running the packet loss python script against your local saved files
[14:41:54] a
[14:42:00] and compare the numbers to see if they match
[14:42:12] that at least both methods report the same packet loss
[14:43:07] hmmmmmmMmmmmm
[14:43:11] interesting, yeah I think I can do that
[14:43:17] good idea
[14:58:49] drdee, i'm looking at packet loss for those 4 hosts in the mobile logs as computed by packet-loss.cpp
[14:58:53] i'm saving all the results, but as it goes
[14:58:57] it looks pretty typical
[14:59:03] about 5 or 6% packet loss all around
[14:59:11] which is what I computed with my sleuthing manually as well
[15:08:38] ok
[15:14:41] EROSEN: http://analytics1010.eqiad.wmnet:19888/jobhistory/job/job_1355947335321_9786/mapreduce/job/job_1355947335321_9786
[15:14:43] FINISHED
[15:14:45] DONE
[15:14:46] DEAL
[15:14:48] IT WORKS
[15:14:49] niiiiice
[15:14:55] terribly slow
[15:15:00] 22 hours
[15:15:00] ya
[15:15:22] so please have a look at the data :)
[15:15:29] what was the output dir called?
[15:15:38] i didn't see anything in wikihadoop9
[15:15:57] data is on /user/diederik/wikihadoop02
[15:16:08] gotcha
[15:16:16] and this was the working command
[15:16:18] hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -libjars wikihadoop-0.2-CDH4.jar -D mapred.reduce.tasks=5 -file revision_differ.py -file diff_match_patch.py -file xml_simulator.py -input /user/diederik/arwiki-20130120-pages-meta-history.xml.bz2 -output /user/diederik/wikihadoop02 -inputformat org.wikimedia.wikihadoop.StreamWikiDumpInputFormat -mapper "revision_differ.py"
[15:16:20] lots of unicode hehe
[15:16:45] with the sys.path.append('.'), right?
[15:16:48] hey erosen, how do I make a limn graph? :)
[15:16:50] I have limnified data
[15:16:58] I just want to see a graph of it now
[15:17:10] if you want all of the rows in the DataSource to be graphed
[15:17:22] sure
[15:17:24] you can just call ds.write_graph()
[15:17:26] oh but
[15:17:27] i mean
[15:17:29] i don't have an instance
[15:17:30] haha
[15:17:33] like how do you see it
[15:17:35] yup
[15:17:38] i just have datafiles and datasources
[15:17:39] ...
[15:17:40] now what?
[15:17:41] haha
[15:17:44] can I use an existing instance?
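
An aside on drdee's cross-check above (run the packet loss python script over the locally saved files and compare with what kraken reports): it boils down to counting gaps in udp2log sequence numbers per host. A minimal sketch, assuming (hypothetically) that field 0 holds the hostname and field 1 the sequence number; the real log layout may differ:

    # Sketch: estimate packet loss from udp2log sequence numbers.
    # Assumes (hypothetically) field 0 = hostname, field 1 = sequence number.
    import sys
    from collections import defaultdict

    def seq_gap_loss(lines):
        seqs = defaultdict(list)
        for line in lines:
            fields = line.split()
            if len(fields) < 2:
                continue
            seqs[fields[0]].append(int(fields[1]))
        for host, seen in sorted(seqs.items()):
            expected = max(seen) - min(seen) + 1
            lost = expected - len(seen)
            print('%s: %d/%d missing (%.2f%%)' % (host, lost, expected, 100.0 * lost / expected))

    if __name__ == '__main__':
        seq_gap_loss(sys.stdin)
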
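On the streaming command above: the -file flags ship revision_differ.py and its helper modules into each task's working directory, which is why erosen's sys.path.append('.') matters — without it the mapper can't import the co-shipped diff_match_patch. A hypothetical skeleton of the mapper's shape (not the actual revision_differ.py):

    #!/usr/bin/env python
    # Hypothetical mapper skeleton for the streaming job above.
    # Hadoop streaming copies every -file artifact into the task's working
    # directory, so '.' must be on sys.path before importing co-shipped modules.
    import sys
    sys.path.append('.')

    import diff_match_patch  # shipped alongside via -file

    def main():
        # StreamWikiDumpInputFormat hands the mapper page/revision chunks on
        # stdin; the real revision_differ.py diffs consecutive revisions here.
        for chunk in sys.stdin:
            sys.stdout.write(chunk)  # placeholder passthrough

    if __name__ == '__main__':
        main()
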
[15:17:55] milimetric has the remote source working
[15:17:59] it works for the continent graph
[15:18:03] hmm
[15:18:07] can't I just put my files somewhere that they are accessible?
[15:18:14] and then somehow create a graph in one of the existing limn instances?
[15:18:15] this gets into the question of whether to use the new limn or old limn
[15:18:20] ha
[15:18:22] yeah, in theory
[15:18:23] yep
[15:18:31] I think you should use the new limn
[15:18:32] oh milimetric can tell me!
[15:18:35] and you just have to talk to milimetric
[15:18:36] yeah
[15:18:41] hehe
[15:18:42] :)
[15:18:43] milimetric, can I just do something on dev-reportcard?
[15:19:05] maybe without affecting the dashboard?
[15:19:05] yep
[15:19:18] wanna hangout and I can show you where to commit graphs/datasources?
[15:19:30] we can do it together
[15:19:31] i don't wanna commit!
[15:19:44] i just want to drop data somewhere public and then see a graph :)
[15:19:58] i think you can only use datasources remotely
[15:20:00] well, the graph definition has to go somewhere unfortunately
[15:20:06] not graph json files
[15:20:10] yeah
[15:20:40] erosen: with the sys.path.append('.'), right?
[15:20:42] yes
[15:20:45] but it's pretty easy ottomata, I can just do it for you
[15:20:59] yeah ok
[15:21:15] http://analytics1001.wikimedia.org:81/limn/
[15:22:02] k, i'll put it in a graph on dev-reportcard
[15:22:05] erosen: i will load the diffs into the revdiff db
[15:22:07] coooooooool
[15:22:10] can't wait till it's easier to do this
[15:22:17] word
[15:22:20] dschoon used to have a graph editor
[15:22:21] no good anymore?
[15:22:57] drdee: when are you doing this, let me know about the steps required, if you don't mind
[15:23:22] erosen, now?
[15:23:26] k
[15:23:47] ottomata, can you puppetize libdclassjni-dev and pypy on all an machines?
[15:24:22] sure
[15:25:14] if there are debs!
[15:27:21] seems like what we need: http://packages.debian.org/experimental/pypy-lib
[15:27:36] pypy i got
[15:27:43] you need lib?
[15:27:46] pypy-lib
[15:27:55] don't see libdclassjin-dev
[15:27:59] hmm
[15:28:05] good point
[15:28:13] this is probably the one: http://packages.debian.org/experimental/pypy
[15:28:23] my bad
[15:28:31] libdclassjin-dev is on stat1:/home/spetrea/releases/
[15:28:47] we need pypy binary
[15:29:14] erosen: building jar for revdiffdb
[15:29:25] k
[15:29:29] need to make some small changes
[15:29:48] to the pom?
[15:33:46] it isn't called jin-dev
[15:33:47] libdclass-dev
[15:33:51] right drdee?
[15:34:14] yup
[15:35:22] that should be installed on all hadoop nodes
[15:35:36] yup
[15:35:38] and pypy coming up
[15:35:56] awesome and thank you! and this will also be puppetized?
[15:35:58] yup
[15:36:16] libdclass-dev should go into the wikimedia apt repo
[15:36:31] it's in the kraken apt repo
[15:36:36] not in wikimedia
[15:36:40] do you need it in wikimedia?
[15:36:45] oh ok
[15:36:49] dunno
[15:36:51] maybe not
[15:38:42] erosen: built jar, now copying files to local fs
[15:38:48] from hdfs
[15:38:53] nice
[15:39:05] revdiff should be able to read straight from hdfs but it can't :)
[15:39:17] hmm
[15:39:17] weird
[15:39:24] ottomata: Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
[15:39:31] on an01 :D :D :D
[15:39:32] uh ohhhhh
[15:39:36] that's me
[15:39:52] or actually erosen ;)
[15:39:59] on an01?
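
For context on "datafiles and datasources" above: a limn datafile is just the CSV, the datasource is a small descriptor pointing at it, and the graph definition references the datasource — which is why the graph JSON still "has to go somewhere". An illustrative sketch of the first two pieces; the descriptor fields here are assumptions, not limn's exact schema:

    # Illustrative only: a CSV datafile plus a YAML-ish datasource descriptor.
    # The descriptor fields are assumptions, not limn's exact schema.
    import csv

    rows = [('2013-01-23', 5.1), ('2013-01-24', 5.6)]

    with open('mobile_packet_loss.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerow(('date', 'loss_pct'))
        writer.writerows(rows)

    with open('mobile_packet_loss.yaml', 'w') as f:
        f.write('id: mobile_packet_loss\n')
        f.write('name: Mobile packet loss\n')
        f.write('format: csv\n')
        f.write('url: /data/datafiles/mobile_packet_loss.csv\n')
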
[15:40:06] hmm
[15:40:07] i don't see any disks full
[15:40:23] i'm not doing anything on an01 right now
[15:40:27] diederik you are using 25% of /
[15:40:38] i can make some space if we realize where the issue is
[15:41:30] (it's deploying ottomata - had to change the date format but other than that it was good)
[15:41:50] hmmm, yeah ok, erosen, when I limnified, i thought it would change the date format to what limn expects
[15:42:04] no it's ok
[15:42:06] it is supposed to....
[15:42:10] it left it in this format:
[15:42:10] 2013-01-23T01:21:01
[15:42:11] that ok?
[15:42:14] but maybe the new date format is different somehow?
[15:42:15] hmm
[15:42:15] it is now :)
[15:42:16] weird
[15:42:27] not sure what the old one did
[15:42:37] but I made a regex for the new one that looks like this:
[15:42:41] i'm getting off the train now, so i'll be walking for a few min
[15:42:50] YYYY-MM-DD.HH:mm
[15:42:56] hmm
[15:42:59] so the dashes are mandatory but the T can be anything
[15:43:08] sorry! bbib
[15:43:09] oh ok
[15:43:16] i see, you changed what limn expects, not my file?
[15:43:41] yep
[15:43:51] the customer is always right
[15:43:51] cool
[15:43:57] maybe while you're at it
[15:43:59] make the regex do
[15:45:31] YYYY[-/]mm[-/]dd.HH[:-\.]MM[:-\.]SS
[15:45:35] but whatteevvaah
[15:46:22] no prob, will do
[15:46:50] in the meantime, feast thy eyes
[15:46:52] http://dev-reportcard.wmflabs.org/graphs/mobile_packet_loss
[15:46:58] (still bugs on xaxis)
[15:47:14] zat is crazy looking
[15:47:26] what is the time period?
[15:47:27] cool smoothing!
[15:47:48] two days' worth
[15:47:50] jan 23 and 24
[15:47:57] on an09?
[15:48:01] definitely bugs on callout :(
[15:48:26] no
[15:48:27] an01
[15:48:30] ok
[15:48:45] this is directly from the file I used to compute the seq gaps for udp2log data yesterday
[15:49:00] but isn't a 5% loss always the case?
[15:49:01] so, it's pretty uniform
[15:49:10] yup, that's what we are saying here
[15:49:19] but it is due to the network
[15:49:21] i mean also on locke/emery
[15:49:23] not to local buffers
[15:49:23] right
[15:49:26] which is not what we thought
[15:49:34] which also means
[15:49:35] I just made it YYYY.MM.DD.HH.mm.ss ottomata/erosen
[15:49:41] that partitioning the stream will not help
[15:50:23] so basically, here's what I think drdee
[15:50:24] so i think we should start storing the data for mobile, keep monitoring this, maybe using a pig / oozie / limn workflow? and as long as it hovers around 4-7% we are good
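
milimetric's tolerant pattern above (YYYY[-/]mm[-/]dd.HH[:-\.]MM[:-\.]SS — mandatory date separators, any single character between date and time) translates to roughly the following; a sketch, not limn's actual source:

    import re

    # From the YYYY[-/]mm[-/]dd.HH[:-.]MM[:-.]SS shorthand above: the date
    # separators must be - or /, while the '.' between date and time matches
    # any single character (so 'T' or ' ' both pass).
    TS = re.compile(r'^\d{4}[-/]\d{2}[-/]\d{2}.\d{2}[:\-.]\d{2}[:\-.]\d{2}$')

    for ts in ('2013-01-23T01:21:01', '2013/01/23 01.21.01', '20130123012101'):
        print('%s -> %s' % (ts, bool(TS.match(ts))))
    # the first two match, the last does not
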
[15:50:32] well, i mean
[15:50:39] we have this number in ganglia already
[15:50:49] it isn't unsampled packet loss numbers, but it is the same report
[15:50:57] aight
[15:51:06] i still think partitioning is worthwhile
[15:51:07] but this means we can't do funnel analysis with kraken
[15:51:14] i don't think it is worth it right now
[15:51:19] it won't help anything
[15:51:24] it might be nice organizationally
[15:51:32] it will reduce computation time significantly
[15:51:43] but for mobile we aren't having a problem
[15:51:58] we saw yesterday that all of the data that gets to the machines also gets into kraken
[15:51:59] because we can store traffic by source in separate folders
[15:52:08] so there is less filtering to be done
[15:52:20] but we're already doing that
[15:52:22] but filtering on our nodes
[15:52:26] ok
[15:52:31] i mean yes, our nodes will have to do less work
[15:52:36] we could probably use fewer of them to do it
[15:52:39] so that would be cool
[15:52:49] but
[15:52:51] at the moment
[15:52:55] funnel: we will use event logging data
[15:52:57] we're getting everything we can get for mobile data
[15:53:02] that is ideal for sure
[15:53:10] yeah, let me rephrase that
[15:53:13] we can't use webrequest data for funnel
[15:53:18] right
[15:53:25] but, if we are thinking about it
[15:53:31] well you could
[15:53:32] i bet we can turn off all other filters
[15:53:34] webrequest 100
[15:53:38] whatever else is there
[15:53:40] and just leave mobile
[15:53:43] since we are only working on mobile right now
[15:53:44] right?
[15:53:52] mobile and blog :)
[15:53:57] yeah blog is its own stream now
[15:53:58] so that's cool
[15:54:08] let's do it!
[16:00:04] erosen, do you use the unsampled zero logs in kraken?
[16:00:46] oh he's walking
[16:00:49] couldn't that just be part of the mobile stream
[16:00:56] probably so yeah!
[16:01:05] all zero requests are also to m.wikipedia domains?
[16:01:15] yes
[16:01:17] cool
[16:03:27] nope, sorry
[16:04:00] this is the regex: (^([a-zA-Z0-9-]+)\.zero|^zero)\.([a-zA-Z0-9-]+)\.org
[16:04:33] better to ask erosen probably
[16:05:19] oh it has to have zero in it?
[16:05:24] ok i'll wait for erosen
[16:05:27] yes
[16:05:44] maybe mobile logs should include zero domain
[16:05:50] it isn't zero.m.wikipedia
[16:05:50] ?
[16:06:18] i thought so but now i am not sure anymore
[16:08:38] ottomata: all zero requests are also to m.wikipedia domains? -- I don't think so
[16:08:41] but i'm not sure
[16:09:39] my q to you is!
[16:09:50] are you using the unsampled zero request logs in kraken?
[16:09:55] no
[16:10:05] basically i am waiting for x-carrier
[16:10:09] yeah man
[16:10:10] HAAHAHAHAHAHAHH
[16:10:16] are you trolling us?
[16:10:18] hehe
[16:10:29] but you wanna do zero in kraken, right?
[16:10:38] please say yes :)
[16:10:43] yeah for sure
[16:10:58] i mean I am already doing it, just hackily with the 1:10 sampled files
[16:11:38] apparently colloquy doesn't like hackily
[16:11:49] i got Orange Congo code from Amit
[16:12:18] ok cool
[16:12:40] i'm going to go ahead and merge the X-CS change now
[16:13:11] do you still need the tata code?
[16:13:21] sort of
[16:13:25] there are 40 tata codes in india
[16:13:32] each for their own geography
[16:13:36] so i have now as code
[16:13:51] 405-0*
[16:14:38] interesting
[16:14:53] just to clarify, these headers will stick around in the logs?
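
The zero regex quoted above only matches hostnames that actually carry a zero label, which is why plain m.wikipedia traffic from zero users slips past it. A quick check of the pattern:

    import re

    # The udp-filter zero pattern quoted above.
    ZERO = re.compile(r'(^([a-zA-Z0-9-]+)\.zero|^zero)\.([a-zA-Z0-9-]+)\.org')

    for host in ('zero.wikipedia.org',
                 'en.zero.wikipedia.org',
                 'en.m.wikipedia.org',   # mobile site, but no zero label
                 'en.wikipedia.org'):    # desktop
        print('%s -> %s' % (host, bool(ZERO.search(host))))
    # only the first two match
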
[16:15:36] yes
[16:16:49] k
[16:21:08] ottomata, the squids / nginx also need to log the CS field or set it to '-'
[16:21:16] else we have an inconsistent number of fields
[16:21:17] yeah well
[16:21:21] that is already happening
[16:21:30] we already deployed the x-carrier header to all frontends, ja?
[16:21:35] so, i changed the nginx one
[16:21:41] but deploying squid is more annoying (it isn't in puppet)
[16:21:46] so, i just left it at x-carrier
[16:21:54] both squid and nginx don't set this header anyway
[16:21:59] so the field will always be -
[16:22:00] form them
[16:22:01] from*
[16:22:02] ok
[16:22:11] when we deploy the changes that do change squid
[16:22:13] i'll fix it there too
[16:22:19] but asher might go bananas when he sees carrier :)
[16:22:23] (cookies and tabs)
[16:22:25] it's already there man
[16:22:29] he helped me deploy it originally
[16:22:34] and didn't say anything at the time
[16:22:34] aight
[16:22:41] about cookies and tabs
[16:22:43] first tab
[16:22:45] s
[16:22:48] fooo suuure
[16:22:51] yeah
[16:22:57] i'll get the patch in today
[16:23:27] because we need to figure out if it's possible to log only specific key/value pairs from a cookie and not the entire cookie
[16:23:39] sorry, ottomata you mean we won't get to the point where all the requests have been tagged?
[16:23:54] until something additional happens?
[16:23:54] eh?
[16:24:03] with x-cs?
[16:24:15] both squid and nginx don't set this header anyway so the field will always be -
[16:24:18] yes
[16:24:25] but all mobile requests are in varnish
[16:24:30] so no biggie
[16:24:31] hmm
[16:24:42] maybe, but a lot of the requests are not to the mobile site
[16:24:49] we are logging - in squid and nginx just so that the number of fields is consistent
[16:24:54] a lot of the zero requests?
[16:24:56] yeah
[16:25:02] then yup, those will not be tagged
[16:25:20] i mean it isn't clear that this is important, but the current logs are 90% main site requests
[16:25:21] ok
[16:25:31] 90% desktop?
[16:25:33] i think it will be fine, i'll talk with amit
[16:25:34] yeah
[16:25:35] ?
[16:25:37] haha
[16:25:42] well, for the mobile logs
[16:25:47] * drdee is confused
[16:25:54] we are only importing those that match m.wikipedia domains
[16:25:55] i mean they mostly only care about the mobile site visits
[16:26:09] well, if you are trying to know who is using the zero program
[16:26:17] and 90% of visits from zero clients are to desktop
[16:26:21] and we filter on the X-CS header
[16:26:25] you'll be missing 90% of your requests
[16:26:30] indeed
[16:26:31] yeah
[16:26:31] right now doing it by IP address will get them all
[16:26:46] but as i said this should still be okay because amit usually only cares about the mobile site requests
[16:26:52] are you sure that 90% goes to desktop?
[16:27:09] i'll show you a graph, one sec
[16:27:10] and why is this the case?
[16:27:24] nobody knows
[16:27:28] I keep asking
[16:27:51] i think it is either that nobody makes bugzilla requests when their phone goes to the main site
[16:28:00] or that there is a lot of dongle use
[16:28:19] or very broken redirection
[16:28:57] or that
[16:32:54] here is an example: http://global-dev.wmflabs.org/graphs/orange_kenya_versions
[16:33:36] X: desktop, M: m., Z: zero.
[16:34:00] yup
[16:34:35] those counts come from the current udp-filters
[16:47:54] i just looked at 10000 lines from tim-brasil
[16:47:57] hardly any m. domains
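
The X/M/Z split behind graphs like orange_kenya_versions classifies each request host into desktop, mobile, or zero site versions. A sketch of the idea (not the actual udp-filters code):

    # Sketch of the X/M/Z site-version split (not the actual udp-filters code).
    def site_version(host):
        labels = host.split('.')
        if 'zero' in labels:
            return 'Z'  # zero site
        if 'm' in labels:
            return 'M'  # mobile site
        return 'X'      # desktop / main site

    assert site_version('en.zero.wikipedia.org') == 'Z'
    assert site_version('en.m.wikipedia.org') == 'M'
    assert site_version('en.wikipedia.org') == 'X'
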
[16:50:03] or maybe mobile is highly over-rated in the global south :)
[16:50:19] i should say mobile internet
[16:52:16] drdee, well, if the IPs we are filtering on are supposed to be from mobile phones
[16:52:23] you'd think they would get directed to m. no matter what
[16:52:37] drdee, do you know if we have an RT for the tab change?
[16:53:08] 1 sec
[16:53:50] maybe not ottomata, probably a good idea to open one ;)
[17:00:17] ok ja
[17:13:29] poop drdee, packet-loss.cpp splits manually on space
[17:13:35] we'll need to deploy a new udp2log deb
[17:14:09] ugggh
[17:14:46] why is it splitting?
[17:16:01] it needs seq and timestamp
[17:16:03] from the log lines
[17:16:15] also, the udp-filter version we have deployed doesn't have the -F flag option yet
[17:16:25] so we need to deploy a new udp-filter too
[17:16:45] okay so let's do that first
[17:17:04] ok, should we just deploy a new one with the default as \t?
[17:17:14] i will ask stefan to build the deb
[17:17:32] let's make it a parameter
[17:17:40] ok, but which version?
[17:17:46] you guys have a lot of changes to udp-filter
[17:17:47] right?
[17:17:54] do we want the webstats changes?
[17:17:56] probably not, right?
[17:17:57] no
[17:17:59] it is already a parameter
[17:18:02] i will go back in time
[17:18:20] and make a new package with just the -F flag
[17:18:22] you should probably go to 67dcf53f0ee4214e24f75528ecee6ba37e0e58ca
[17:18:25] ok
[17:18:26] ty
[17:18:30] or actually
[17:18:34] 639d66215c5217f37c25ccf2d56053b2892d7356
[17:20:21] i'll work on the udp2log packet loss change
[17:20:28] awesome
[17:29:24] graph editor, ottomata erosen?
[17:29:48] oh, was that about limn?
[17:29:58] for some reason i immediately thought of graph theory
[17:30:01] heh
[17:54:32] lunchtime! aaahhhh 6 mins til standup
[17:54:33] scarftime
[18:04:44] guys
[18:04:46] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[18:04:50] https://plus.google.com/hangouts/_/2da993a9acec7936399e9d78d13bf7ec0c0afdbc
[18:05:07] whhaa where'd you get that one?
[18:05:12] dunno
[18:28:55] erosen: http://gp.wmflabs.org/
[18:29:42] nice thanks
[18:48:23] drdee: no luck sshing to oxygen. Is it easy to request / grant access? i just need it so I can see up to date logs so I can check in on the x-carrier status
[18:48:45] okidoki
[18:53:31] hey drdee, and anyone else: the mobile stats we're compiling might benefit from the general mobile stats that akamai just released: http://www.engadget.com/2012/08/09/akamai-peak-internet-speeds-jumped-25-percent-year-to-year-in-q1/
[19:06:18] hey erosen
[19:06:32] HEY
[19:06:35] hey*
[19:06:36] sup
[19:06:56] just saw your message - can you expand a bit?
[19:07:20] what I meant by read-only is that user:research can run queries on log + prod + staging
[19:07:24] ah sorry. question is whether you know if asher created a read-only account
[19:07:48] but it has write permissions on staging
[19:07:58] correct
[19:08:06] nothing changed to staging
[19:08:19] but if you want to write into prod, we should talk :)
[19:08:22] the goal for asaf was that he would have read-only access to staging alone
[19:08:34] he doesn't need prod
[19:08:54] hmm can't he just use research?
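
Why packet-loss.cpp cares about the separator: it pulls seq and timestamp out of each log line by position, so a hard-coded split on space breaks as soon as the stream moves to tabs. In Python terms, the parameterized separator being discussed (like udp-filter's -F flag) looks like the following; the field positions are assumptions for illustration:

    # Sketch: pull seq + timestamp out of a udp2log line with a configurable
    # separator (the -F idea). Field positions (1 = seq, 2 = timestamp) are
    # assumptions for illustration.
    def parse_line(line, sep='\t'):
        fields = line.rstrip('\n').split(sep)
        return int(fields[1]), fields[2]

    space_line = 'cp1001.eqiad.wmnet 12345 2013-01-25T14:34:38 ...'
    tab_line = space_line.replace(' ', '\t')

    # same record, either delimiter
    assert parse_line(space_line, sep=' ') == parse_line(tab_line, sep='\t')
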
[19:09:03] or you think he's going to drop your tables ;)
[19:09:06] yeah, it will work
[19:09:08] i don't think so
[19:09:16] k cool
[19:09:24] it just seemed like the best option
[19:09:43] but given the difficulty of adding an account, it is probably fine to leave him with write access
[19:10:21] I think we could ask for a shared read-only account, what ops is not willing to support/maintain is personal user credentials
[19:10:45] yeah
[19:10:49] that is all i was thinking
[19:10:59] ok, I'll bring this up with py
[19:10:59] it could be useful for scripts which are meant to be consumers only
[19:11:03] as a basic protection
[19:11:26] peter youngmeister?
[19:11:29] yup
[19:11:45] k, just checking--not yet on initial terms ;)
[19:16:06] ottomata: ping
[19:16:46] ottomata: why isn't https://github.com/wmf-analytics under our official git repo, e.g.?
[19:17:46] good question!
[19:18:04] i think because wikimedia is a bit more official and clean
[19:18:08] wmf-analytics is messier
[19:18:29] also, when we started it, it was just mobile people using wikimedia, and github was more controversial
[19:18:46] but we can move it, can't we?
[19:23:11] http://stat1.wikimedia.org/spetrea/new_pageview_mobile_reports/r19-parallel/pageviews.html
[19:23:18] new mobile pageview report
[19:23:39] didn't render the /wiki/ separated from /w/api.php yet
[19:23:42] but it is computed
[19:24:17] also have a few days with very low counts which are suspicious, so I have to check those too
[19:25:33] 174G of data in 118m
[19:25:40] ok
[19:26:51] so we are still about 500M page views too low :(
[19:27:22] drdee: you're comparing to wikistats right?
[19:27:27] ye
[19:27:28] s
[19:27:29] can you give me a file with
[19:27:35] discarded log lines
[19:27:36] ?
[19:28:44] so the actual discarded lines not just their count
[19:28:47] I need to write code for that
[19:28:54] yup
[19:28:56] and then do another run, and then I'll have a .gz for you
[19:29:07] perfect, but that should be quite easy, right?
[19:29:09] but it will be quite big..
[19:29:16] or just sample that
[19:29:22] or just run it for a day
[19:29:31] patrick and i just talked in the office
[19:29:35] i just need to look at some data to see what is being thrown away
[19:29:46] drdee: ok, will do a run for a day now
[19:29:56] just to put things here as well for the remote peeps, i said they're separate because a lot of the projects were forks
[19:30:19] it's more of a sandbox for experiments and collaboration than it is "wikimedia projects"
[19:30:42] so when we were experimenting with scribe, we forked it to fix something that was only an issue in our setup
[19:30:53] putting that under wikimedia is misleading
[19:35:59] dschoon, about the negative packet loss numbers, those are just confidence intervals, and if the mean is close to 0 then you could get a negative confidence interval, so i don't think this is a real issue
[19:36:08] no
[19:36:20] it's expressed as (value +/- interval)
[19:36:26] the *value* is often negative
[19:37:22] that should be easily traceable in the source code
[19:37:27] yep
[19:37:33] reading it over
[19:37:49] cool
[19:37:58] it is, in fact, what sent me there :)
[19:38:16] dschoon: https://github.com/wikimedia-incubator
[19:38:22] cool
[19:38:22] dschoon: you're an admin
[19:38:43] hmm.
[19:39:00] it still seems like a different purpose than what we're using wmf-analytics for
[19:39:10] i agree that if we had private repos, it wouldn't be an issue
[19:39:40] dschoon: do you want private repos on wikimedia-incubator?
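
On the negative numbers: if loss is reported as mean +/- a confidence interval, a mean near zero with any variance yields a band that dips below zero — though dschoon's point stands that the reported *value* itself going negative needs a look at the code. The arithmetic side, as a sketch:

    # Why a loss figure near 0% can print negative: mean +/- interval.
    import math

    samples = [0.02, -0.01, 0.00, 0.03, -0.02]  # per-interval loss %, noisy around 0
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)  # ~95% confidence interval

    print('%.3f +/- %.3f %%' % (mean, half_width))
    # the band (mean - half_width, mean + half_width) straddles zero
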
[19:39:47] otherwise, we really don't want to suggest we're working on a fork of, say, thrift
[19:40:01] i dunno. this has worked fine so far to disambiguate things
[19:40:09] i'll update the description on the group, for starters
[19:40:27] and we can move our experimental projects into the incubator (like sqoopy and gerrit-stats)
[19:41:45] dschoon: okay cool
[19:42:09] won't happen right away. i'll make a note.
[20:30:41] brb lunch
[21:23:06] back
[21:24:54] drdee, ottomata1 asher says varnish (including our patches) will not extract cookie KV-pairs
[21:25:02] yeah didn't think so
[21:25:07] nginx will do it
[21:25:29] ok
[21:26:04] but squid won't.
[21:26:10] so we'd have to patch both squid and varnish.
[21:26:40] and he says that it would need to be mindful of CPU use.
[21:27:00] to me, this sounds like a great way for us to do something awful to production in a degenerate case
[21:27:10] and i'd much rather postprocess our values out of the cookie string
[21:29:02] i agree
[21:29:08] they should just event log it anyway :)
[21:29:16] yep.
[21:29:29] is that actually on the table?
[21:29:39] i thought that wasn't going to happen due to noscript or whatever
[21:29:52] yes and the dep on jquery
[21:30:05] so it would only work on smartphones
[21:30:18] to be more precise it would only work on js enabled phones
[21:30:47] right.
[21:31:16] the takeaway, then, is that we should just log the full Cookie header
[21:31:29] that also has performance degradation issues
[21:31:37] it will make udp packets much bigger
[21:31:40] i agree with the concerns about data retention, so we could preprocess the files to throw away the irrelevant values
[21:31:47] i don't think that's a major concern.
[21:31:51] it totally is
[21:32:00] are we close to capacity with net?
[21:32:17] i read a bunch of whitepapers about UDP packet loss against packet size
[21:32:20] we couldn't log the x-carrier header (avg 10 characters) and had to go with the X-CS carrier (5 characters)
[21:32:26] and so long as you're under the MTU, it has no effect
[21:32:35] what.
[21:32:37] why??
[21:32:51] because a single udp packet is 1400 bytes large
[21:32:58] already!?
[21:33:07] and then it would cross that and you would have multiple packets for a single log line
[21:33:11] yes.
[21:33:15] i realize.
[21:33:19] and that also is a problem
[21:33:21] :D
[21:33:32] as 1500 is the MTU.
[21:33:34] so.
[21:33:39] that is a huge problem.
[21:33:57] i will discuss.
[21:34:03] aight
[21:35:07] also please poke asher/patrick about the tab separator issue, i haven't heard back from them
[21:35:18] sure. what's the issue?
[21:35:28] that they're not responding? heh
[21:35:33] yup :)
[21:35:37] k
[21:35:40] what do we need?
[21:35:47] their feedback / approval
[21:35:52] ...of?
[21:35:52] to go ahead with the change
[21:35:54] ah.
[21:36:02] to deploy the sep-change to prod?
[21:36:02] moving from space as delimiter to tab
[21:36:14] deploy is scheduled for feb 1st
[21:36:34] read the RFC on to engineering :D
[21:36:38] there is an RT
[21:36:55] https://rt.wikimedia.org/Ticket/Display.html?id=4400
[21:37:07] ty
[21:55:38] woo hoo bar graphs
[21:55:39] http://dev-reportcard.wmflabs.org/graphs/kbar
[21:55:57] !log Limn released a new version to dev with bar graph support
[21:55:59] Logged the message, Master
[21:58:09] woo!
[21:58:15] dan is the man!
[21:58:18] !!
[21:58:25] heh
[21:58:31] we should really get rid of that initial animation
[21:59:21] milimetric, does it also support timeseries with multiple columns?
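
drdee's 1400-byte figure against the 1500-byte ethernet MTU is the crux of the Cookie-logging objection: appending the full header can push one log line past the MTU, so a single line needs multiple datagrams and both fragmentation and loss get worse. A back-of-envelope check with illustrative sizes:

    # Back-of-envelope: does appending the full Cookie header push a udp2log
    # datagram past the ethernet MTU? Sizes are illustrative.
    MTU = 1500
    IP_UDP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte UDP header

    line = 'x' * 1400    # a typical log line, per the discussion above
    cookie = 'x' * 300   # a not-unusual Cookie header

    for payload in (line, line + '\t' + cookie):
        fits = len(payload) + IP_UDP_OVERHEAD <= MTU
        print('%d bytes -> %s' % (len(payload), 'fits' if fits else 'fragments'))
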
[21:59:44] initial animation rocks!
[21:59:47] don't touch it
[21:59:49] :P
[21:59:55] hehe
[21:59:56] * milimetric likes cartoons
[22:00:04] no timeseries for you!
[22:00:06] come back 2 sprint
[22:00:25] kk
[22:00:29] stacked bar charts would really be a different node I think
[22:00:40] i think you might have broken maps
[22:00:43] http://dev-reportcard.wmflabs.org/graphs/editors_by_geo
[22:00:43] I'm not sure how to marry the two
[22:00:46] oh no
[22:00:48] i forgot to check that
[22:01:00] i was excited the deploy would let me show that off :)
[22:01:26] ah, easy fix
[22:01:59] k, fixed locally
[22:02:06] ooooh! infobox
[22:02:11] dschoon is the man :)
[22:02:20] dude, you should !bang! log that
[22:19:53] ottomata, i figured i should upload the python thing i wrote to calc the diffs and percents
[22:19:53] https://gist.github.com/4628582
[22:23:51] milimetric: did you redeploy?
[22:24:06] nope, just pushed
[22:24:14] because i'm redeploying soon
[22:24:17] k
[22:24:22] when I get the annotation node box to pop up
[22:25:20] dschoon
[22:25:23] awesome!
[22:25:27] what does that take in?
[22:25:29] a seq file?
[22:25:36] yaml
[22:25:39] i have this bad feeling that trackHover is very expensive compared to on 'mouseover' but I'm loving how easy it is to program with it :)
[22:25:48] yaml?
[22:26:11] basically the output
[22:26:16] you usually print to stdout
[22:26:16] :)
[22:26:18] ohhhhhh
[22:26:19] ok
[22:26:21] i changed your script
[22:26:21] haha
[22:26:24] i guess that is kinda yaml
[22:26:24] haha
[22:26:25] nice
[22:26:30] it is literally yaml :)
[22:26:31] i love how yaml is kinda what you would write anyway
[22:26:34] yes!
[22:26:38] i was not thinking yaml at all when I wrote that
[22:26:39] same as markdown
[22:26:42] that's why i love them both
[22:27:01] nice thanks
[22:27:13] i'm still sleuthing this issue with leslie right now
[22:27:22] we're about to compare udp2log loss vs tcpdump loss
[22:30:57] cool.
[23:00:42] laters all!
[23:41:34] ok dClass lib works in the pig unit test environment, it passes the assert; now it still fails on kraken but this is pretty good news AFAIC
[23:52:13] drdee: pong
[23:52:17] good news :)
[23:52:27] drdee: does the unit test feature threads?
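
The gist above isn't reproduced here; a hypothetical sketch of the pattern being described — compute diffs and percents, print to stdout, and the output happens to be valid YAML:

    # Hypothetical sketch of the gist's pattern: diff two count files and
    # print YAML-compatible key/value output to stdout.
    import sys

    def load(path):
        counts = {}
        with open(path) as f:
            for line in f:
                key, value = line.split()
                counts[key] = int(value)
        return counts

    a, b = load(sys.argv[1]), load(sys.argv[2])
    for key in sorted(set(a) & set(b)):
        diff = b[key] - a[key]
        print('%s:' % key)
        print('  diff: %d' % diff)
        if a[key]:
            print('  percent: %.2f' % (100.0 * diff / a[key]))
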