[00:06:46] drdee: any ideas on pig OOM error?
[00:10:13] drdee: seems like it happened right at the end, fyi
[00:23:22] YuviPanda labs is dead
[00:23:23] :(
[00:23:29] we've been trying to figure out why
[00:23:42] whee :D
[00:24:09] Hey DarTar - sorry labs is down and we were in the middle of deploying
[00:24:20] but when it's back up, you'll have a custom datasources page on your own instance
[00:24:59] oh ok, labs list says it's because someone was running processes on Labs
[00:25:17] ask on -labs?
[00:25:27] also - apologies to everyone for not being responsive on IRC these last two weeks. Being in the office is Very distracting
[00:25:37] yeah, Yuvi, they're rebooting Bastion
[00:26:02] well, the process thing was 3 days ago.
[00:26:13] but yeah. labs is unavailable.
[00:30:41] YuviPanda, DarTar: Labs is having an issue with ssh. ryan's ETA is "hopefully soon"
[00:30:48] :)
[00:32:27] I still have a shell on kripke
[00:32:31] so it's possible i can work around it.
[00:32:50] oo, that is a good call, just keep a screen session on stat1 to kripke
[00:54:27] YuviPanda http://mobile-reportcard.wmflabs.org/
[00:54:38] wooohooo
[00:55:58] dschoon: I should probably clean that up a little :)
[00:56:01] also, http://mobile-reportcard-dev.wmflabs.org/
[00:56:05] *nod*
[00:56:12] sadly, there's no way into labs atm
[00:56:18] but you can test locally.
[01:09:10] dschoon: my local limn instance doesn't work, sadly (neither me nor milimetric were able to debug)
[01:09:56] okay, we'll try to take a look at that tomorrow (maybe)
[01:10:48] but thanks for getting this deployed :)
[01:12:35] dschoon: I suppose I can update the data / configs myself by logging into labs and updating them?
[01:12:39] (once labs is back up
[01:12:39] )
[01:13:24] if you update the data / configs, you'd have to pull from the labs instance and reset permissions
[01:13:33] (since www-data:www needs access to these files)
[01:13:45] so the easiest way to do that is with our fabric deployer
[01:13:50] ooh.
[01:14:04] is there a doc on how I can use the fabric deployer?
[01:15:30] YuviPanda, it's literally gonna be issuing "fab mobile_dev deploy"
[01:15:37] wheee :D
[01:15:40] but one sec, I'll get you the repository
[01:15:43] oh, sorry
[01:15:51] in your case it's fab mobile_dev deploy.only_data
[01:15:58] and if I change config?
[01:16:06] deploy does code and data, deploy.only_data does only data
[01:16:12] okay!
[01:16:13] what do you mean by config?
[01:16:24] graphs/* datasource/*
[01:16:37] (I am getting rid of a few timeseries, too cluttered)
[01:16:39] yep, those are all considered "data" and deployed by deploy.only_data
[01:17:09] ah
[01:17:10] okay :D
[01:17:11] thanks!
[01:17:17] still can't login to labs though
[01:17:33] milimetric: also I just figured that I have an rpi running linux. I should / could setup limn there and see how that goes :)
[01:17:48] it's not going to be blazing fast - I had to turn off half my vim plugins to get vim to a usable speed
[01:17:53] ok, my guess is that you're still hitting permissions issues
[01:18:05] on my local instance?
[01:18:15] yeah
[01:18:20] unless npm start does funky things with it...
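The "fab mobile_dev deploy" / "fab mobile_dev deploy.only_data" workflow discussed above boils down to a small Fabric file. The real fabfile lives in the deploy repository dschoon points YuviPanda to and is not shown in this log; the sketch below only illustrates the idea, and the hostname, paths and task layout are assumptions.

    # fabfile.py -- illustrative sketch only; the actual deploy tasks differ.
    from fabric.api import cd, env, sudo, task

    @task
    def mobile_dev():
        # assumed host and data directory for the mobile-reportcard-dev instance
        env.hosts = ['mobile-reportcard-dev.wmflabs.org']
        env.limn_data = '/var/lib/limn/mobile-data'

    @task
    def only_data():
        # "deploy.only_data" suggests the deploy tasks sit in a `deploy`
        # namespace (e.g. a deploy.py module) in the real fabfile layout.
        with cd(env.limn_data):
            sudo('git pull')
            # graphs/*, datasource/* and datafiles/* all count as "data",
            # and www-data:www needs to be able to read them
            sudo('chown -R www-data:www .')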
[01:18:37] yeah, it's very strange either way
[01:18:48] I'll have to get a blank VM and try it
[01:18:52] milimetric: ok
[01:19:00] milimetric: I see that everyone has 'r
[01:19:17] permissions are what i'd expect
[01:20:38] lemme paste what permissions I have
[01:20:57] ok
[01:21:42] drwxrwxr-x for all the directories, and -rw-rw-r-- for all the users
[01:21:51] *not users, files
[01:22:23] but you should get some sleep :) it's working on labs at least and we'll help you more tomorrow
[01:23:52] milimetric: yeah i have that
[01:23:53] too
[01:24:06] so weird
[01:24:15] milimetric: are you logged into labs?
[01:24:45] ok, maybe I'll try to make a .sh file that does all the necessary commands, then run it on a blank VM and see if I get the same issue
[01:24:59] yup!
[01:25:01] no, I can't get to labs yet, there's some magic ssh config change I can't figure out
[01:25:07] maybe i've an older version?
[01:26:12] milimetric: if I make a commit now, can it be deployed?
[01:26:25] yes, I'll deploy it to test out the fabric changes
[01:26:35] as soon as I get access to labs (shouldn't take long)
[01:27:03] I'm making changes blind
[01:27:05] * YuviPanda no likey
[01:27:39] do you guys put sugar in your coffee?
[01:28:13] I'm unsure if you're asking me, but I do put an average of 10 packets of sugar in a large cup of coffee
[01:28:18] washes out the bitterness
[01:29:09] yes, I thought so, I try to do that too
[01:29:22] lol
[01:29:44] it's ok Yuvi, hopefully Edit UI will get prioritized soon and we can all stop making blind changes
[01:29:53] :)
[01:29:55] milimetric: I pushed
[01:30:05] * milimetric looks
[01:30:23] milimetric: also json sucks for editing by hand :P
[01:30:31] i saw yaml there sometime. I guess it takes yaml too?
[01:31:02] yeah
[01:31:23] erosen: here?
[01:31:23] we prefer yaml when we do it by hand, we're ok with json if it has to be done that way
[01:31:35] yup
[01:31:35] sup?
[01:31:37] the datasources don't need to have their columns removed for the graphs to stop showing them
[01:31:37] erosen: do you track just *.m.wikipedia.org in your stats ?
[01:31:42] erosen: or also wikibooks
[01:31:46] erosen: erm wiktionary
[01:31:48] that's the idea of those
[01:31:53] erosen: and all those other ones
[01:31:58] they point to data and the graphs visualize data
[01:32:00] everything
[01:32:02] milimetric: yeah, but I was removing them from the SQL too
[01:32:03] so
[01:32:05] erosen: oooooOOOOooo
[01:32:11] hmmm
[01:32:11] gotcha
[01:32:14] i mean not in the number we're comparing though
[01:32:38] erosen: ok, in the number we're comparing you just count *.m.wikipedia.org right ?
[01:32:42] i just track them, as I have that information parsed out and I have files sitting around with counts for all those projects
[01:32:49] yes
[01:32:56] but now that you ask I might as well check
[01:33:00] erosen: so your input is preprocessed somehow ?
[01:33:05] erosen: how is it preprocessed ?
[01:33:15] no
[01:33:20] ok
[01:34:11] if you are referring to the files sitting around, those are just the results of my counting
[01:34:43] erosen: oh so that's the output
[01:34:47] yeah
[01:35:03] i'm checking the actual code, just to make sure ;)
[01:35:07] erosen: do you count images also ?
[01:35:23] i don't think so
[01:35:27] ok
[01:35:34] so you have a filter for that
[01:35:38] YuviPanda, the changes look ok, I'll ping when they're deployed
[01:35:41] erosen: do you filter on mimetype ?
[01:35:42] my decision criterion for a page view is just whether the url is a ….org/wiki/...
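erosen's criterion above (count only *.m.wikipedia.org requests whose url looks like ….org/wiki/…) amounts to a one-regex test. The sketch below just restates what the chat says; it is not the actual embr_py code and the exact pattern is an assumption.

    # A page view is counted only for mobile Wikipedia hosts and only for
    # /wiki/ article paths; no mime-type or status-code check, per the chat.
    import re

    MOBILE_WIKI_PAGEVIEW = re.compile(r'^https?://[^/]+\.m\.wikipedia\.org/wiki/.')

    def is_mobile_pageview(url):
        return MOBILE_WIKI_PAGEVIEW.match(url) is not None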
[01:35:46] milimetric: thanks
[01:35:51] no mime-type
[01:35:58] erosen: mm ok
[01:36:01] I tried that and it had a very small effect
[01:38:00] erosen: so you are counting /wiki/ but not /w/
[01:38:23] well I'll try with /wiki/ and see what happens
[01:38:47] yup, just wiki
[01:39:19] one last thing I should point out is that my code does periodically raise exceptions when parsing log lines
[01:39:23] and I skip them
[01:39:29] YuviPanda: all better: http://mobile-reportcard-dev.wmflabs.org/
[01:39:29] :)
[01:39:40] i should really know how often this happens
[01:39:44] erosen: invalid lines I suppose
[01:39:56] output them to a file in the catch block
[01:39:58] average_1rifter: yeah, well, by definition
[01:40:15] good call
[01:41:23] milimetric: "name" gets displayed as the title, and "slug" is for URL, right?
[01:42:07] yes YuviPanda, but slug must mirror the url, which comes from the file
[01:42:21] sure, that's fine
[01:42:21] id actually must mirror the url and slug must mirror id
[01:42:25] I just needed to know which one to change to change the display
[01:42:30] I can explain why but it's complicated :)
[01:42:38] change all three
[01:42:44] filename/slug/id
[01:42:53] for now, that's how it has to be unfortunately
[01:43:15] YuviPanda: pretty sure, yes
[01:43:34] milimetric: no, I don't want to touch those :D
[01:43:36] DarTar: dashboards are up
[01:43:37] just the "name" field
[01:43:48] \o/
[01:43:53] dschoon: link?
[01:44:00] DarTar: http://ee-dashboard.wmflabs.org/ is http://ee-dashboard.wmflabs.org/dashboards/metrics
[01:44:05] there's also http://ee-dashboard.wmflabs.org/dashboards/features
[01:45:55] that's great :)
[01:47:20] I'll start hacking the settings, glad that these dashboards have a permanent home
[01:48:55] sounds great. let us know if you have any issues, DarTar
[01:50:06] heading home
[01:50:08] ta~
[01:56:59] milimetric: i'm going to test fabric deploy now :)
[01:57:42] milimetric: I'm getting a 'command not found'
[01:57:44] (on kripke)
[01:58:29] oh well
[01:59:04] erosen: response status code check ?
[01:59:11] erosen: none right?
[01:59:18] i don't think so
[01:59:25] ok
[01:59:38] * average_1rifter is deleting code now to adapt to embr_py
[02:02:21] average_1rifter: just checked, no status code checking
[02:02:27] I also tried that and found minor differences
[02:03:02] cool
[02:08:59] does anyone know how to run fab on kripke?
[02:09:06] (it tells me currently that fabric is not installed)
[02:10:15] install fabric?
[02:10:17] with pip ?
[02:10:25] * average_1rifter hasn't used fabric
[02:10:54] average_1rifter: oh
[02:11:06] well, I also guess I need to know *where* the fab file is
[02:11:40] http://docs.fabfile.org/en/1.6/installation.html
[02:13:32] hmm
[02:13:39] clearly I'm missing some critical piece of info
[02:13:41] also it is 8AM
[02:13:44] I should go sleep :|
[02:20:29] ok
[05:18:54] milimetric: hey :)
[05:18:57] milimetric: you up ?
[05:30:03] drdee: hi
[05:30:10] drdee: new reports are on the way
[05:30:16] ok
[05:30:21] just december?
[05:30:31] december and november, so we can compare and see if the bump occurs
[05:31:12] drdee: can you let me know when you crunch december and november through kraken ?
[05:31:16] I'm really curious about the outcome
[05:32:09] I mean if you plan to do that
[05:32:22] sure but we first need to solve this problem :)
[05:32:26] ok
[05:32:51] when were tabs introduced ?
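milimetric's suggestion above — write the exception-raising lines to a file instead of silently skipping them — could look roughly like this; parse_line() is a hypothetical stand-in for whatever the real per-line parser is called.

    # Keep skipping unparseable lines, but save them so the skip rate can be
    # measured later ("i should really know how often this happens").
    def count_pageviews(log_path, error_path):
        matched = skipped = 0
        with open(log_path) as logs, open(error_path, 'w') as errors:
            for line in logs:
                try:
                    if parse_line(line):   # hypothetical parser
                        matched += 1
                except Exception:
                    errors.write(line)     # the "output them to a file" part
                    skipped += 1
        return matched, skipped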
[05:32:58] feb 1
[05:33:05] ok
[05:33:43] we should make like a timeline of events that produced changes in the datasets, I'll try to find some mediawiki extension that does this. I think there's a timeline
[05:46:15] erosen: can you link me up one more time to your results please ?
[05:46:20] the results of embr_py
[05:46:26] drdee: http://stat1.wikimedia.org/spetrea/new_pageview_mobile_reports/r31-embr-py-rules/pageviews.html
[05:46:29] this is the new one
[05:46:35] uhm still the gap is there
[05:47:06] the code now has exactly the same rules as Evan's, I'll have to look again on the languages to see if the arrays are the same
[05:47:25] i am not surprised
[05:49:22] ok
[05:49:32] hey average_1rifter , one sec
[05:49:39] erosen: ok
[05:52:35] switched to Evan's list of languages
[05:52:38] another run
[05:53:35] erosen: can we diff our input sets ? or md5sum their contents
[05:53:44] sure
[05:53:52] ok we'll check sizes too
[05:53:53] which file are you running it on exactly?
[05:54:14] erosen: /a/squid/archive/sampled-geocoded
[05:54:24] erosen: you're using /a/squid/archive/sampled right ?
[05:54:30] yup
[05:54:36] ok, I'm going to check sizes now
[05:55:24] average_1rifter: here is the output http://gp.wmflabs.org/data/datafiles/gp/daily_mobile_wp_views_by_country.csv
[05:55:54] you can visualize it in limn but it crashes because there are too many lines
[05:56:02] hey average_1rifter sorry missed your ping
[05:56:21] I'm up, trying to tie together some user metrics API stuff for DarTar
[05:56:35] hey milimetric
[05:56:42] hey Dario :)
[05:56:44] how goes
[05:56:50] good
[05:57:15] I'm working on the timeseries example, seems pretty simple
[05:57:26] nice :)
[05:57:39] hopefully it's not too late but I should have something by tomorrow morning
[05:57:56] my only impediment is I have to do laundry at some point because I have no more socks :)
[05:58:17] we kind of concluded we wouldn't showcase any dataviz but we can always add something at the very last minute if available
[05:59:16] socks: if you want to deprioritize them I can lend you a couple :D
[06:00:40] haha
[06:01:39] yeah, if it's ready, and we have time to add it to the presentation, it'd be nice to have
[06:02:38] http://diffchecker.com/4jPg3L7i
[06:02:43] erosen: LHS is you
[06:02:45] erosen: RHS is me
[06:02:59] :( I don't know why the output got considerably bigger
[06:03:05] woah
[06:03:06] weird
[06:03:17] all I did between the two was use udp-filter to geocode them
[06:03:18] average_1rifter where do your files come from?
[06:03:42] erosen: they're the files you're using passed either through udp-filter or through a oneliner to split the ip column for geocoding
[06:03:59] ya
[06:04:13] I'll check one last thing, the size in lines
[06:04:21] and after that I'm switching to your datasource
[06:04:33] the geocoding just adds one field, right?
[06:04:47] yes
[06:04:54] which looks about right then
[06:05:08] it does ?
[06:05:10] the factor is roughly 21 / 20
[06:05:34] unless they are just two letter country codes
[06:05:42] they're 2 letter
[06:05:44] hmm
[06:06:03] that would imply that the log line is only 40 characters long
[06:06:06] which can't be true
[06:06:16] okay, I agree that they are weird again
[06:34:51] erosen: the whole dataset differs by just 415 lines
[06:35:07] interesting
[06:35:23] i assume the geocoded dataset has fewer?
[06:35:31] because udp-filter eats a few?
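The "md5sum their contents / check sizes" comparison being set up above is just a hash-and-count over each gzipped sampled log; the sketch below is one way to do it, not anyone's actual script.

    # Hash and count the lines of a gzipped log so two input sets can be
    # compared quickly (identical md5 => identical contents; the line counts
    # explain differences like the 415-line delta mentioned above).
    import gzip
    import hashlib

    def summarize(path):
        digest, lines = hashlib.md5(), 0
        with gzip.open(path, 'rb') as f:
            for line in f:
                digest.update(line)
                lines += 1
        return digest.hexdigest(), lines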
[06:35:42] has more, for some reason
[06:35:56] I would expect it to have fewer also
[06:36:03] interesting
[06:36:17] well that clearly can't be throwing us off
[06:36:20] good to know
[06:36:36] tried with your list of languages
[06:36:37] http://stat1.wikimedia.org/spetrea/new_pageview_mobile_reports/r32-embr-py-logic-replaced-languages/pageviews.html
[06:36:50] difference is extremely small (I had one or two more languages)
[06:37:08] ok, now I'm switching to your datasource
[06:37:13] /a/squid/archive/sampled
[06:37:28] seems reasonable
[06:41:33] wait, I don't even need the geocoding for this
[06:41:45] I can directly switch to your datasource
[06:41:52] erosen: you have any field count checks ?
[06:41:58] yeah
[06:41:58] sorry, just double checking
[06:42:01] yes ?
[06:42:05] what's your minimum ?
[06:42:25] they aren't checks so much as custom ways of dealing with it when two of the fields get merged
[06:43:00] hm, I don't have that
[06:43:19] erosen: do you happen to know if a big percentage of lines have fields that get merged ?
[06:43:42] i don't actually know how significant an effect it has
[06:43:48] i could check though
[06:45:15] ok, what kind of logic do you have for that ?
[06:45:21] is embr_py still on github ?
[06:45:29] ya
[06:45:47] can't access it
[06:45:47] https://github.com/embr/embr_py
[06:46:06] https://github.com/wikimedia/metrics/tree/master/pageviews/embr_py
[06:46:11] oh sorry
[06:46:11] ok
[06:46:39] basically i delete the 11th token (zero-indexed) if i have too many tokens
[06:47:18] because it tends to arise from a space in the mime_type field
[06:47:25] ok
[06:48:13] i'm running a quick count on a recent file right now
[06:48:22] ok
[06:48:42] I'm switching datasources and doing another run
[06:49:58] k
[06:52:14] running
[06:52:23] eta 20m-30m
[06:52:27] less actually
[06:52:45] hopefully i'll have an answer about the effect of the extra field by then
[06:52:51] ok
[06:53:21] thanks btw, if a lot of lines need field-merging then I'll need to implement that as well
[06:53:58] erosen: how long did it take you to develop your python framework for making these reports ?
[06:54:38] they emerged over a few weeks off and on
[06:54:39] I think things went wrong on the Perl script I've written because I pumped too much logic without making reports for each new discarding rule..
[06:56:53] I also made another mistake
[06:57:02] I should've made tags for each new report I'd release
[06:57:05] so on a sample of 100k total lines from a few days ago, I found that I deleted the extra token 3.2k times
[06:57:17] so as to be able to return to a previous version of the code associated with a given report version
[06:57:26] good call
[06:57:51] I've had the same headaches because I can't figure out the difference between code versions
[06:58:36] so again, it shouldn't really matter
[06:59:31] the delta is 41%
[06:59:36] so 3% yes, won't account for it
[06:59:51] but looking closer it happens on 1677 / 22021.0, or 7%, of the total page view requests (after filtering)
[07:00:11] it did ?
[07:00:15] still not enough, i know, but interesting that there is a correlation
[07:02:33] in a few minutes the new reports are coming out
[07:02:47] cool
[07:03:20] could you run the counting of the merged-field lines on a bigger sample than 100k ?
[07:03:54] sure, I'm running it on an entire file now
[07:04:04] ok, thanks
[07:07:02] [travis-ci] develop/bb8b5b6 (#107 by milimetric): The build passed. http://travis-ci.org/wikimedia/limn/builds/5306716
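erosen's merged-field handling described above — drop the 11th token (zero-indexed) when a line splits into one token too many, since the extra split usually comes from a space inside the mime_type field — would look something like the sketch below. The expected field count is an assumption, not taken from embr_py.

    EXPECTED_FIELDS = 14   # assumed width of a sampled squid log line

    def repair_fields(line):
        # A space inside the mime_type field yields one extra token;
        # deleting token 11 (zero-indexed) undoes the bad split.
        tokens = line.rstrip('\n').split(' ')
        if len(tokens) == EXPECTED_FIELDS + 1:
            del tokens[11]
        return tokens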
[07:10:08] average_1rifter: Counter({(False,): 1630509, (True,): 105015, (None,): 515})
[07:10:23] read: no-merge, merge, other-error
[07:10:44] 0.0644
[07:11:02] that is on filtered requests, btw
[07:12:31] so it's 6.5% ?
[07:12:39] yeah
[07:12:52] ok, could be part of the explanation
[07:13:20] erosen: do you count discarded lines ?
[07:13:29] yeah
[07:13:32] that is the last number
[07:13:33] 515
[07:14:03] apologies for the cryptic formatting
[07:14:33] I changed the datasource
[07:14:34] http://stat1.wikimedia.org/spetrea/new_pageview_mobile_reports/r33-embr-py-logic-changed-datasource/pageviews.html
[07:14:46] lowered minimum fields to 9 instead of 10
[07:14:54] gap is still there
[07:20:46] ok let me re-check my code
[07:20:51] maybe I'm doing something wrong
[07:26:10] well obviously I am uhm..
[07:26:36] [travis-ci] master/2bcf24b (#72 by Diederik van Liere): The build has errored. http://travis-ci.org/wikimedia/kraken/builds/5306965
[07:53:37] average_1rifter: I'm off to bed, catch you tomorrow probably
[08:05:46] [travis-ci] master/a6b5c44 (#73 by Diederik van Liere): The build has errored. http://travis-ci.org/wikimedia/kraken/builds/5307429
[08:21:14] [travis-ci] master/2201d31 (#74 by Diederik van Liere): The build has errored. http://travis-ci.org/wikimedia/kraken/builds/5307613
[14:55:41] hi erosen
[14:55:46] hey sup?
[14:56:05] erosen: the data is beating me
[14:56:17] erosen: but I'm not tapping out
[14:56:38] erosen: http://www.mediawiki.org/wiki/User:Spetrea#2013-March-07
[14:56:42] average_1rifter hehe, well glad to hear you've still got some hope
[14:57:42] what do you think about doing a more granular analysis on a few thousand lines, so we can see how they differ?
[14:58:01] erosen: I am all for that
[14:58:39] i'm wondering what the best approach would be. maybe we just create a file with the lines that matched
[14:58:55] and then we can diff those, to see which you are counting and I'm not (if any)
[15:00:15] dschoon: seems like something you might like: http://www.brainpickings.org/index.php/2013/03/07/a-map-of-the-world-according-to-illustrators-and-storytellers/
[15:37:53] erosen: ok finished the last run I had to do, proved it's not related to the forks I'm doing either
[15:38:11] average_1rifter: that's good news
[15:38:37] average_1rifter: do you want to go ahead with comparing a specific file?
[15:38:40] erosen: yes
[15:38:49] presumably the one that is the first day of the bump
[15:39:00] erosen: I'll create a sampled input file with data from nov => dec
[15:39:08] sounds good
[15:39:11] erosen: then we'll use that in both my implementation and yours
[15:39:15] sounds good
[16:08:34] erosen: ls sampled-1000.log-201211* sampled-1000.log-201212* | xargs zcat | perl -ne 'print if $. % 10000 == 0' | gzip > /tmp/mix.gz
[16:08:55] I'm running that to produce a 1:10000 sampled mixed gzip for november and december
[16:09:14] hopefully it will be large enough to reproduce the problems
[16:09:17] but small enough to run fast
[16:09:36] if it doesn't work, we'll try again (with a different remainder in that modulo thing, or some other stuff)
[16:09:48] by "doesn't work" I mean that we don't reproduce the problem
[16:09:49] sounds good
[16:09:53] ya
[16:10:48] average_1rifter just one thought: to make it easy to tell whether it does reproduce the problem, might it make sense to create two files--one for nov and one for dec?
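The per-month variant erosen suggests here (and that gets agreed to just below, producing /tmp/mix_nov.gz and /tmp/mix_dec.gz) could look roughly like the sketch that follows: the same 1:10000 thinning as the perl one-liner above, but routed into one output per month based on the date in each source filename. The paths and filename pattern are assumptions.

    import glob
    import gzip

    def build_monthly_mixes(pattern='/a/squid/archive/sampled/sampled-1000.log-2012*',
                            step=10000):
        outputs = {'201211': gzip.open('/tmp/mix_nov.gz', 'wb'),
                   '201212': gzip.open('/tmp/mix_dec.gz', 'wb')}
        seen = 0
        for path in sorted(glob.glob(pattern)):
            month = path.rsplit('-', 1)[-1][:6]   # date suffix of the filename
            if month not in outputs:
                continue
            with gzip.open(path, 'rb') as source:
                for line in source:
                    seen += 1
                    if seen % step == 0:          # keep 1 line in 10000, like the one-liner
                        outputs[month].write(line)
        for out in outputs.values():
            out.close()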
[16:11:08] or do you already have code for grouping log lines by month
[16:11:20] it's not an issue for me
[16:12:06] erosen: I'll add code for that
[16:12:16] k
[16:12:37] average_1rifter i'm getting off a train now, i'll be back in 20
[17:10:55] erosen: sweet!
[17:12:29] glad you're excited
[17:28:41] good morning peoples
[17:29:07] aye
[17:38:08] erosen: got the datasets ready
[17:38:13] nice
[17:38:18] where do they live?
[17:38:18] /tmp/mix_nov.gz
[17:38:23] /tmp/mix_dec.gz
[17:38:25] stat1
[17:39:17] average_1rifter: k, should I create just one file for the matched page requests? or one for the matches and one for the non-matches
[17:40:48] erosen: two, matched and non-matched please
[17:40:53] then we will diff against mine
[17:41:00] yeah, that makes the most sense to me
[17:41:21] average_1rifter: ok, just finishing up something else, real quick but it should be done in 20 or 30 min
[17:43:07] ok
[18:23:52] erosen: I have some results ready
[18:24:02] erosen: when you have some time ping me
[18:26:55] average_1rifter: hey, just finished a surprise meeting
[18:26:59] erosen: great
[18:27:09] average_1rifter: looks like it will be another few minutes for my results
[18:27:31] erosen: my results are in stat1:/home/spetrea/comparative_nov_dec_sampled_results.zip
[18:27:38] great
[18:27:38] erosen: alright
[18:27:50] average_1rifter: I'll ping you when mine are ready
[18:27:57] ok
[18:28:17] average_1rifter: also, I thought it made sense to have a file with lines which had an error, unless you want to just count those as non-matches
[18:28:23] average_1rifter: what do you think
[18:28:24] ?
[18:28:39] erosen: I have
[18:28:48] erosen: erm wait
[18:28:57] erosen: so we need accepted/discarded lines right?
[18:29:08] yeah
[18:29:11] erosen: if we can make just this distinction it would be good
[18:29:16] sounds good
[18:29:36] erosen: do you have more than these two types ?
[18:30:00] so you have error, accepted, discarded ?
[18:30:17] well, I have ones which explicitly get discarded and ones which throw an error that I catch
[18:30:20] yeah
[18:31:15] it would be helpful if you could separate them at a granular level. although I should do the same in that case
[18:31:24] let's first try with just accepted/discarded
[18:31:29] if that doesn't work out, we can go more in-depth
[18:31:32] yeah
[18:31:36] sounds like a plan
[18:31:47] DarTar: just created RT ticket for stat1001. I'll setup crons, etc there once I get access.
[18:31:47] heh
[18:34:45] PissedPanda: thank you!
[18:59:32] average_1rifter: results!
[19:00:09] average_1rifter - I randomly found a cool pie chart example if you're still interested: http://theartofasking.com/question/wjucdg6k
[19:02:52] average_1rifter: /home/erosen/src/metrics/pageviews/embr_py/results
[19:04:31] just as a note, Andrew is off today (his brother is in town)
[19:23:11] milimetric: is it possible to directly have limn execute sql? IIRC no, right?
[19:24:19] YuviPanda: not yet …
[19:24:32] is it a planned thing?
[19:24:36] there is talk of a data broker which
[19:25:02] would expose saved views / queries / files over http
[19:25:59] but limn is trying to stay agnostic afaik
[19:26:16] and just point at urls
[19:28:05] average_1rifter: can you point me one more time to the filter code of your mobile pageviews?
[19:28:52] erosen: okay.
[19:29:02] erosen: sounds right to me. juliusz just wanted to check
[19:29:02] YuviPanda: sorry that was sort of vague
[19:29:11] nah, that was good enough.
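The comparison agreed on above — run each implementation over the shared /tmp/mix_nov.gz and /tmp/mix_dec.gz, writing accepted and discarded lines to separate files so the outputs can be diffed — could be driven by something like this. is_mobile_pageview() stands in for either side's real per-line test, and the url field index is an assumption.

    import gzip

    def split_matches(sample_path, accepted_path, discarded_path):
        with gzip.open(sample_path, 'rb') as sample, \
             open(accepted_path, 'w') as accepted, \
             open(discarded_path, 'w') as discarded:
            for raw in sample:
                line = raw.decode('utf-8', 'replace')
                try:
                    url = line.split(' ')[8]              # assumed position of the url field
                    matched = is_mobile_pageview(url)     # hypothetical per-line test
                except Exception:
                    matched = False                       # count error lines as non-matches, per the chat
                (accepted if matched else discarded).write(line)

    # e.g. split_matches('/tmp/mix_nov.gz', 'nov.accepted', 'nov.discarded')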
[19:29:21] there *might* be a databroker at some point, but right now this is good enough
[19:33:30] erosen: thanks
[19:33:55] average_1rifter: i just refreshed about 5min ago
[19:34:10] i had forgotten to add the wikipedia filter
[20:46:37] YuviPanda: are you happy with your dashboards? can we consider this as being finished?
[20:46:48] drdee: needs to auto-update data.
[20:46:58] drdee: I need stat1001 access for that, so waiting on that
[20:47:30] is there an RT ticket for that?
[20:47:38] yes
[20:47:45] drdee: https://rt.wikimedia.org/Ticket/Display.html?id=4687&results=dc1f8f2885139f8be90da1071f84e76f
[20:47:58] i get a no permission error
[20:48:22] to view that?
[20:48:24] :|
[20:48:25] yes
[20:48:39] drdee: https://rt.wikimedia.org/Ticket/Display.html?id=4687 any better?
[20:48:45] no
[20:48:48] make sure it's not private
[20:49:18] I don't see anything that could potentially mark it as private
[20:49:25] but the auto-update of data is more of a new request than something that is related to the dashboard, right?
[20:50:02] drdee: well, the dashboard is stale unless that happens, no?
[20:50:15] dashboard is stale but it works :)
[20:50:19] drdee: data currently lives inside the git repo, rather than accessed over http. milimetric said this is a 'temp solution' until we get that out.
[20:50:24] drdee: sure, the dashboard itself works great :)
[20:50:36] but won't be particularly useful until we have update setup.
[20:52:19] i understand
[20:53:03] any other acceptance criteria?
[21:06:02] hey, i clicked around on http://stat1.wikimedia.org:3307/ a bit after seeing it in metrics meeting. fyi: when going to "See all generated requests" and then hitting some, i get Tracebacks and BuildError: ('cohort', {}, None)
[21:06:30] rfaulkner: ^^
[21:06:36] drdee: I think that is it.
[21:06:47] YuvIPanda: awesome!
[21:06:53] :D
[21:07:22] drdee: you about?
[21:09:46] mutante, that instance was really just meant for the demo
[21:10:23] We're looking at having a stable version hosted at metrics-api.wikimedia.org this month
[21:10:41] rfaulkner: gotcha, no worries, i was just curious and wanted to report
[21:11:20] appreciate it :) You should be able to kick-off requests however
[21:11:46] some may take 10 minutes or so, you can see the status in the job queue
[21:12:30] picks "moodbar_confused" very randomly :)
[21:12:47] cool, thx
[21:26:55] erosen: hey! limnpy questions, is this a good time?
[21:28:09] hey YuviPanda - your app is down?
[21:28:19] the dashboard?
[21:28:21] it shouldn't be
[21:28:24] yeah I wanted to showcase it :)
[21:28:41] DarTar: no it isn't!
[21:28:42] http://mobile-reportcard-dev.wmflabs.org/
[21:28:42] YuviPanda: hey
[21:28:43] works
[21:29:20] oh I was thinking of http://stat1.wikimedia.org:1337/app-stats.html
[21:29:23] DarTar: code is now at github.com/wikimedia/limn-mobile-data, and is getting rather generic. People should be able to write SQL and get csvs out of it :)
[21:29:27] has it been decommissioned?
[21:29:30] DarTar: oh, no that has been dead for a while
[21:29:35] DarTar: yes, that was just a super-quick hack :)
[21:29:39] ok np
[21:30:27] erosen: hey! I see that limnpy is generating "columns" data for "datasources" by having separate "labels" and "types" fields
[21:30:39] erosen: but milimetric rewrote it by hand yesterday to have them together as a dictionary
[21:30:44] erosen: will the older format still work?
[21:30:49] interesting...
[21:30:57] I can show you a diff...
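The two datasource "columns" shapes being compared here, reconstructed from the chat alone (the actual diff is the pastebin linked a little further down, and the key names inside each column object are assumptions, not the limn schema): limnpy emits parallel "labels" and "types" lists, while the direction described below is an array of objects, one per column.

    # what limnpy emits at the time: separate, parallel lists
    old_style_columns = {
        'labels': ['date', 'unique visitors'],
        'types':  ['date', 'int'],
    }

    # the expected future shape: an array of objects, one per column
    # (field names here are illustrative guesses)
    new_style_columns = [
        {'id': 'date',    'label': 'date',            'type': 'date'},
        {'id': 'uniques', 'label': 'unique visitors', 'type': 'int'},
    ]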
[21:31:04] YuviPanda: the old format definitely works, but dan might have different plans for the future
[21:31:31] he is sitting next to me right now, let me just ask him what he thinks will happen
[21:31:40] erosen: okay!
[21:32:01] erosen: I'm trying to make the script in limn-mobile-data generic enough that anyone can give it some SQL and it'll spit out appropriate limn files and data csvs
[21:33:45] nice
[21:34:04] he says that the old format works
[21:34:07] okay
[21:34:10] but that he is expecting an array of objects
[21:34:20] erosen: http://pastebin.com/agnqKs5R is the diff from his hand-written one to limnpy ones
[21:34:37] erosen: okay, in that case I assume at some point limnpy will change to emit that, and I shouldn't worry :)
[21:34:42] indeed
[21:34:52] YuviPanda I might do it right now…
[21:34:58] hah, sweet :)
[21:35:11] erosen: if you can, also look at that diff to see other things I might be missing
[21:35:27] will do
[21:36:45] erosen: thanks :)
[21:36:50] np
[21:37:15] erosen: does limnpy also write datafiles/*?
[21:37:19] ya
[21:37:21] (just confirming)
[21:37:22] okay
[21:37:49] it is intended to do exactly what you're trying to do: generate graphs programmatically
[21:38:21] yeah, it's doing 99% of the work :)
[21:38:29] I'm just building a wrapper script on top of it
[21:46:52] milimetric, are you there?
[21:47:13] hey kraigparkinson I'm upstairs in the collab space
[21:47:33] attending the e3 workshop
[21:47:35] what's up?
[21:48:13] milimetric, could you make it possible for me to edit the recurring Sprint Planning meeting? I want to extend it by half an hour to include our improvement discussion time, so I can eliminate the afternoon meeting...
[21:48:27] sure, one sec
[21:48:51] or you could do it. :)
[21:49:38] ok, i'll just do it
[21:50:02] so the wednesday 2-3, make it 2-3:30, right?
[21:50:34] done, and made it editable by all
[21:50:40] You're talking Eastern Time, right? :)
[21:55:12] erosen: is ds.__source__ immutable?
[21:55:15] or rather, ds in general?
[21:55:26] no, you can mess with it before writing
[21:55:28] I see that the examples in the README construct new ones
[21:55:29] ah okay
[21:55:34] fine, so that's just the examples
[21:55:34] ok
[21:55:52] (urls for the datafiles are different, so...)
[21:55:59] i was wondering whether it was an abuse of the norms to use the __ naming
[21:56:14] it probably is. __ is magic stuff
[21:56:17] or rather
[21:56:21] __a__ is definitely magic stuff
[21:56:29] you could do _ or __ but they'd be mangled
[21:56:39] proper way would be to not do _, I think
[21:56:47] yeah I might change that as well
[21:57:25] I started out intending for it to be private, but realized that it wasn't a bad way to do advanced configuration
[22:30:08] erosen: added a few more commits. Now it is fully pushing data / config from limnpy :) one hack remains, I'll get rid of it once I get access to stat1001
[22:30:19] erosen: the 'url' field can be an actual external HTTP URL?
[22:30:29] or is it always supposed to be same domain?
[22:32:39] i think so
[22:32:47] but I admit I haven't messed around with it myself
[22:33:01] YuviPanda: ^^
[22:33:04] alright
[22:33:09] i'll try to poke milimetric tomorrow
[22:33:18] hi! :)
[22:33:26] url can be anything
[22:33:40] each limn instance has its own whitespace
[22:33:42] ugh
[22:33:44] whitelist
[22:33:49] aha!
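The generic limn-mobile-data wrapper described above ("give it some SQL and it'll spit out appropriate limn files and data csvs", with limnpy doing most of the work) reduces at its core to running a query and dumping the cursor to CSV. The sketch below only illustrates that idea, not the actual script in the repository; the MySQLdb driver and connection details are assumptions.

    import csv
    import MySQLdb   # assumed driver; connection parameters are placeholders

    def sql_to_csv(sql, out_path, **conn_args):
        conn = MySQLdb.connect(**conn_args)
        try:
            cur = conn.cursor()
            cur.execute(sql)
            with open(out_path, 'w') as f:
                writer = csv.writer(f)
                writer.writerow([col[0] for col in cur.description])  # header row
                writer.writerows(cur.fetchall())
        finally:
            conn.close()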
[22:33:49] for where the URL can point to
[22:33:50] right
[22:33:53] but by default they're all *
[22:33:54] but it can be external
[22:33:59] (because it's labs and we don't care yet)
[22:34:03] and is it fetched serverside or clientside?
[22:34:11] yeah, they can be external, they get processed through a proxy in Limn
[22:34:19] aaah, that makes sense
[22:34:19] streamed through the limn server
[22:34:35] we're looking at doing cooler things like CORS
[22:34:43] juliusz was talking about how you can actually have the URL point to a flask server running elsewhere :D
[22:34:47] and get, in theory, realtime stats
[22:35:06] does it just stream? is there any caching at all?
[22:35:15] no caching, it just streams
[22:35:17] okay
[22:35:31] Dario's dashboards all point to remote toolserver urls
[22:35:38] http://ee-dashboard.wmflabs.org
[22:35:47] i gotta run because my battery's about to die :(
[22:35:59] right
[22:40:38] * YuviPanda goes to sleep
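What milimetric describes above — each datasource url is fetched server-side, checked against a per-instance whitelist, and streamed through with no caching — is sketched below in Python purely for illustration; Limn itself is a Node app and its actual proxy code is not shown in this log.

    import fnmatch
    import urllib2

    WHITELIST = ['*']   # the labs default mentioned above: everything allowed

    def proxy_datasource(url, chunk_size=64 * 1024):
        # reject urls outside the instance's whitelist
        if not any(fnmatch.fnmatch(url, pattern) for pattern in WHITELIST):
            raise ValueError('url not in whitelist: %r' % url)
        remote = urllib2.urlopen(url)
        # no caching: just stream the remote datafile through in chunks
        while True:
            chunk = remote.read(chunk_size)
            if not chunk:
                break
            yield chunk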