[00:24:26] back
[00:24:46] back
[00:30:28] cool milimetric, and thanks for using the bot!
[00:49:05] dschoon: quick color brewer q
[00:49:12] sure
[00:49:15] shoot
[00:49:21] do you know of a canonical js file somewhere on the interwebs
[00:49:22] ?
[00:49:29] so I can use it in a gist?
[00:49:33] i think d3 has one checked in
[00:49:38] http://bl.ocks.org/4458705
[00:49:43] aa
[00:50:47] i did one for python
[00:50:52] yeah
[00:51:07] i am trying to get it to used in a bl.ock
[00:51:08] nice
[00:51:16] are you not using that?
[00:51:27] i'm not
[00:51:39] i mean, the one you have there
[00:51:55] https://github.com/mbostock/d3/blob/master/lib/colorbrewer/colorbrewer.js
[00:51:56] i tried but I found it in some rando's github repo
[00:52:01] raw:
[00:52:01] https://raw.github.com/mbostock/d3/master/lib/colorbrewer/colorbrewer.js
[00:52:16] excellent
[00:52:18] I think that is what I need
[00:52:20] thanks
[00:52:21] i get the feeling the one you're using is from me
[00:52:25] hehe
[00:52:27] as it has exports stuff
[00:52:48] yeah
[00:53:11] i can't find the page where I found it but it was someone else
[00:53:39] the website is colorbrewer2.org
[00:53:49] yeah
[00:53:49] but that JS impl is mike, iirc
[00:54:17] i looked around but weirdly i didn't see any source
[00:54:26] yay
[00:54:27] works
[00:54:29] woo
[00:54:33] http://bl.ocks.org/4458705
[00:54:47] (looks the same to me)
[00:55:00] chrome doesn't deal well with blocks refreshes
[00:55:04] i had to switch to firefox
[00:56:31] http://bl.ocks.org/d/4458705/
[00:56:37] that looks very red-blue to me
[00:57:10] you really ought to change those floats to %s
[00:57:11] :)
[00:57:16] hehe
[00:57:19] will do
[00:57:24] and make it clear that it's percent of traffic from that country
[00:57:31] the colors are just a random set
[00:57:42] i'll add a README.md
[00:58:00] er: make it clear that it's mobile as a fraction of all traffic from that country
[00:58:09] otherwise it looks like chad is 99% of our mobile traffic :)
[00:58:15] hehe
[00:58:41] btw
[00:59:00] d3 has a string formatter which is basically the same as the python curly syntax
[00:59:43] nice
[00:59:45] i'll look into it
[00:59:51] d3.format(',.2p')(0.92134) == '92.13%'
[01:00:03] that makes it easy
[01:00:29] sorry
[01:00:30] d3.format(',.2%')(0.92134) == '92.13%'
[01:00:40] ah
[01:00:42] so you can just create formatter once
[01:00:45] gotcha
[01:00:48] var formatter = d3.format(',.2%');
[01:00:51] ya
[01:00:56] it's function
[01:01:04] formatter(.92124) == '92.12%'
[03:13:11] at the moment I'm getting 1minute/day_of_data
[03:13:22] that's not bad, it could be better, still using just 1CPU..
[03:14:20] http://stat1.wikimedia.org/spetrea/new_pageview_mobile_reports/r7/pageviews.html
[03:15:48] this is another run, I'm getting data the same way (1st january, 1st february, 1st march , 1st april , 1st may ..) and I ran the report on those
[03:18:48] some of the percentages look off. that's because I still have as TODOs to select the period of the data to process ( currently it's getting the files mentioned above but those files don't contain data from that day only, they're just dumps from squid and they contain the previous day also, so there are problems with the first and the last month processed, but the fix is on the way)
[03:20:37] as you can see the first month processed has very few entries, that's because the file for it (/a/squid/archive/sampled/sampled-log-20120101.gz probably contains just a few entries for january, and it contains more from the previous day which is 2011-12-31)
[03:23:23] ottomata: would it be hard to customize the squid configuration to dump the data on disk according to the entries inside those files ?
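
(Editor's sketch, not from the chat.) The problem described above is that a dump file named for one day also holds entries from the previous day. One way to work around it on the reading side is to keep only the rows whose own timestamp falls on the wanted day. The sketch below is only an illustration under stated assumptions: the function name `entries_for_day`, the constant `TIMESTAMP_FIELD`, and the log layout (whitespace-separated fields with an ISO 8601 timestamp in the third field) are all hypothetical and are not confirmed anywhere in this log.

```python
from datetime import date
from typing import Iterable, Iterator

# ASSUMPTION: each row is whitespace-separated and its third field starts
# with an ISO 8601 date such as 2012-10-02T00:00:01. The real sampled squid
# log layout may differ; adjust the index accordingly.
TIMESTAMP_FIELD = 2


def entries_for_day(lines: Iterable[str], day: date) -> Iterator[str]:
    """Yield only the rows whose timestamp falls on `day`."""
    for line in lines:
        fields = line.split()
        if len(fields) <= TIMESTAMP_FIELD:
            continue  # too short to carry a timestamp, skip it
        try:
            # Only the leading YYYY-MM-DD part is needed to pick the day.
            row_day = date.fromisoformat(fields[TIMESTAMP_FIELD][:10])
        except ValueError:
            continue  # unparseable timestamp, skip it
        if row_day == day:
            yield line


# Made-up sample rows: one from the previous day, one from the wanted day,
# and one malformed row.
sample = [
    "host1 1 2012-10-01T23:59:58 rest-of-row",
    "host1 2 2012-10-02T00:00:01 rest-of-row",
    "malformed",
]
kept = list(entries_for_day(sample, date(2012, 10, 2)))
```

Filtering per row like this means the reader does not need to know at which row the wanted day starts, at the cost of scanning rows that end up discarded.
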
[03:23:45] looking now
[03:23:55] ottomata: I mean, currently it just dumps the data at 03:00 in the morning so one day labeled 2012-10-02 contains entries from 2012-10-01 also
[03:24:19] average_drifter: wikistats does contain some code to handle those issues
[03:24:26] maybe you can use that?
[03:24:32] ottomata: if we could do that, it would make my code more simple because I wouldn't have to check at which row the data I need starts
[03:24:51] drdee: yeah, it's no problem to implement it
[03:25:21] that's probably the fastest route
[03:25:26] ok
[03:27:16] 1m / day => 30m / month (on average) => 6h / year => 12h to process 2010,2012, and january 2013
[03:33:36] drdee: line 27 here => https://gerrit.wikimedia.org/r/#/c/41979/5/pageviews_reports/lib/PageViews/Model.pm
[03:33:44] drdee: what do you think of that regex ?
[03:34:00] drdee: my $re_valid_language = "([a-z\-]{1,8})";
[03:34:05] looking
[03:34:18] drdee: is this how we should verify if a language is valid ?
[03:34:31] i don't think it can be longer than 3 maybe 4 characters
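
(Editor's sketch, not from the chat.) To see what the pattern quoted at 03:34:00 accepts, here is a small Python check using the same character class and length bound. It assumes the pattern is meant to match the whole language code, which the quoted Perl line does not show. The helper name `looks_like_language` and the sample codes are chosen only for illustration.

```python
import re

# Same character class and length bound as the Perl pattern quoted above.
# ASSUMPTION: the whole string must match; the chat does not show how the
# Perl pattern is anchored where it is actually used.
RE_VALID_LANGUAGE = re.compile(r"[a-z\-]{1,8}")


def looks_like_language(code: str) -> bool:
    """Return True if `code` consists of 1 to 8 lowercase letters or hyphens."""
    return RE_VALID_LANGUAGE.fullmatch(code) is not None


# Illustrative inputs: short codes, a hyphenated code longer than 8
# characters, an uppercase code, and a string made only of hyphens.
samples = ["en", "simple", "zh-min-nan", "EN", "--"]
results = {code: looks_like_language(code) for code in samples}
```

Under these assumptions the pattern is loose in one direction and tight in another: it accepts a string made only of hyphens, yet rejects any hyphenated code longer than 8 characters. A lookup against a known list of language codes might be a stricter alternative if one is available.
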