[04:29:08] should proposals for new research that need data that's not currently public be sent first to rcom or who?
[04:29:25] i see no drdee nor dartar :(
[04:29:41] this is a request via OTRS
[04:30:31] what's the request?
[04:30:50] analytics-l is a good target
[04:33:56] i was thinking either analytics-l or the rcom list. couldn't decide which first
[04:34:34] ori-l: 2013041210001122 if you want to look firsthand
[04:38:48] jeremyb_: yeah, analytics@lists.wikimedia.org
[04:39:27] ori-l: you read it?
[04:43:29] ori-l: see also http://seclists.org/fulldisclosure/2013/Mar/166
[04:43:37] i did read it, yeah
[04:54:45] ok, replied
[04:55:01] he's Aussie so maybe it will hit the list soon
[13:37:02] yoyo
[13:44:26] yoyoo
[14:07:23] ping average_driftere
[14:07:33] nice job ottomata with emery!
[14:07:45] so what happened with the detour of lucid and precise?
[14:14:17] hang on, coffee/bfast
[14:26:05] heya
[14:26:08] so, i'm not entirely sure
[14:26:23] the installer couldn't load the precise image for some unknown reason
[14:26:29] so just to check, mutante tried installing lucid
[14:26:30] and it worked
[14:26:37] so, he just did the upgrade instead of the precise reinstall
[14:27:14] which was quite a headache apparently
[14:27:20] but he got it
[14:27:47] also, yay!
[14:27:47] http://localhost:8888/filebrowser/#/wmf/raw/webrequest/webrequest-all-sampled-1000
[14:27:58] geocoded anonymized sampled data every 4 hours
[14:28:22] awesome!
[14:29:27] ottomata: is that job stable?
[14:29:57] should be, it's pretty simple
[14:30:32] i'm wondering whether we can use that for country mobile page view counts
[14:31:28] is that doable with sampled data like that?
[14:31:39] that is sampled from all logs, so i guess the proportions will be the same?
[14:31:49] yeah
[14:31:58] and most country counts are relatively large
[14:32:06] so the effect of sampling isn't as significant
[14:33:17] hey erosen, drdee, quick question
[14:33:30] who should be on the list of "allowed to edit" for limn instances?
[14:33:36] i guess it varies by instance...
[14:33:49] yeah
[14:33:53] is it globally configured?
[14:33:55] I'll just start with the owners of the respective projects plus the analytics team?
[14:34:02] well...
[14:34:04] hm...
[14:34:05] seems like a good start
[14:34:24] is dschoon still up and around?
[14:34:38] I saw him email yesterday but is he still sick?
[14:34:52] I think we need to have a once-and-for-all limn configuration discussion. The config file to end them all
[14:35:14] hehe
[14:35:31] i'm happy to participate
[14:35:53] cool, that would be useful
[14:36:05] whenever dave shows signs of life
[14:36:13] hehe
[14:37:06] it's pretty cool right now, I just have to make it all client side and not refresh the whole page
[14:38:00] hehe
[15:27:56] ottomata, can you review https://gerrit.wikimedia.org/r/58904
[15:29:05] done
[15:30:16] ty
[16:07:06] ottomata, you wanna poke demon about the gerrit/github naming stuff?
[16:07:30] hmm, maybe in a bit
[16:07:39] it's not a blocker for anything we need to do
[16:08:53] good morning, excitement seekers!
[16:08:58] i bring forth exciting news
[16:08:59] first:
[16:09:11] ottomata, your hourly rsync will run for some time this hour
[16:09:25] because i moved the old device props data to device_class_geo
[16:09:40] so that way it's still accessible for whatever while the backfill job runs
[16:10:35] uhhh, ok!
[16:10:36] drdee: the new device class job is running.
[16:10:51] nice! ETA?
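A note on the 14:30 sampling exchange: with 1:1000 sampling the proportions are preserved, so an estimate is just the sampled count scaled by 1000, and for a country with n sampled hits the relative standard error is roughly 1/sqrt(n) — 10,000 sampled hits gives about 1% error, which is why large country counts are barely affected. A minimal Python sketch under that assumption; the tab separator and country-field index are hypothetical, not the actual webrequest layout:

```python
import sys
from collections import Counter

SAMPLE_RATE = 1000  # 1:1000 sampling, as discussed above


def country_counts(lines, country_field=1, sep='\t'):
    """Estimate per-country request counts from a sampled log.

    Each sampled hit stands in for ~SAMPLE_RATE real hits, so the
    estimate is the sampled count scaled back up. The field index
    and separator here are illustrative assumptions.
    """
    sampled = Counter()
    for line in lines:
        fields = line.rstrip('\n').split(sep)
        if len(fields) > country_field:
            sampled[fields[country_field]] += 1
    # For a country with n sampled hits the relative standard error
    # is roughly 1/sqrt(n), so big countries are fine.
    return {cc: n * SAMPLE_RATE for cc, n in sampled.items()}


if __name__ == '__main__':
    for cc, estimate in sorted(country_counts(sys.stdin).items()):
        print(cc, estimate)
```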
[16:11:00] well, therein lies the excitement
[16:11:16] i think i should kill it, and change the frequency to eat 6-hour chunks
[16:11:26] or maybe split it into a backfill and the hourly job
[16:11:35] because it is going to take a while otherwise
[16:12:33] estimating now...
[16:13:44] can your concat sub-workflow work with one large file? you could do the backfill as a single job
[16:13:47] even outside of oozie
[16:13:54] and just save the file into a place where your concat stuff will find it
[16:13:57] it can.
[16:13:59] well
[16:14:04] it creates daily files
[16:14:09] and a monthly file
[16:14:15] but that consumes the hourly output
[16:14:23] i'd have to write a new workflow to skip the hourly part
[16:14:27] but... that'd be a lot faster
[16:15:12] at 5min per job, it's 8 days to get to current
[16:16:16] also, it's running in queue=standard
[16:16:19] which worries me
[16:16:27] drdee, ottomata ^^
[16:16:29] thoughts?
[16:16:39] i think i should probably write the backfill job to run separately
[16:16:39] 8 days sounds awfully long
[16:16:42] and leave the current job
[16:16:43] yes.
[16:16:47] a lot of it is MR overhead
[16:16:56] each action runs for maybe 30s
[16:17:09] 5 actions = 2.5m, but the job takes like 12m to complete
[16:17:16] that's a TON of framework crap
[16:17:25] that's why i was thinking of changing the frequency to 6h
[16:17:28] to bite off more data
[16:17:33] why not 24h?
[16:17:33] thoughts?
[16:17:35] sure
[16:17:42] people seemed to like the freshness though
[16:17:58] the other option is to move a lot of the concat crap into a shell script running in cron
[16:18:18] as it's honestly just a series of HDFS commands, and there's no need to spawn a job for that
[16:18:21] let's first crank out the data; we can always make it more fresh
[16:18:27] that's my thought
[16:18:39] i'll split out the backfill
[16:18:46] and update the normal frequency to 6h
[16:21:15] okay, doing that now.
[16:24:38] also, ottomata, drdee -- i'm feeling increasingly like we might want to replace kraken/{oozie,pig} with kraken/jobs
[16:24:43] or maybe even a kraken-jobs repo
[16:25:03] i actually prefer separate repos... it makes it easier to track changes, and there's less to understand when you get started
[16:25:29] i would prefer a single repo :)
[16:25:52] thoughts?
[16:25:59] why?
[16:26:51] our stuff is already in too many places and a lot of the pig jobs depend on the kraken jars
[16:27:10] but that isn't solved by having them in one repo... the jars have to be in HDFS
[16:27:21] i agree that's actually a really annoying fact
[16:27:31] (i know we have a problem card about it. i'll find it and bump.)
[16:29:02] i should probably write an email about this so we can all weigh in
[16:41:28] drdee: scp -r user@stat1.wikimedia.org:/home/spetrea/bak.www/new_pageview_mobile_reports/r51-full-logic-run .
[16:48:09] http://i.imgur.com/ZGvvE4i.png
[16:48:27] hopefully this should describe the workflow times
[17:00:56] average: scrummy!!!!!
[17:01:10] ottomata: scrummy!!
[17:02:33] ottomata: SCRUMMY!!!! :)
[17:02:44] oh crap
[17:30:39] milimetric: http://localhost:8888/filebrowser/view//wmf/raw/webrequest/webrequest-all-sampled-1000/2013-04-12_16.00.00/part-1364239892421_10991-m-00000
[17:30:49] importantly, geo is now in the IP:
[17:30:57] 170.70.193.151|DK
[17:31:05] (that's a nonsense IP)
[17:31:18] so you'll have to split on pipe and extract or something
[17:38:59] oiy! my computer hard crashed again
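To put numbers on the 16:17 overhead discussion: if each run carries ~9.5 minutes of fixed framework overhead (12 m total minus 5 actions × 30 s of real work), bigger chunks amortize that overhead. A rough model, assuming compute time grows linearly with chunk size and using the itemized 12-minute figure (the "5min per job" estimate at 16:15 differs; both are ballpark):

```python
# Rough model of the backfill timing discussed above.
# Assumptions (not measurements): fixed per-run framework overhead,
# compute time linear in the amount of data per chunk.
OVERHEAD_MIN = 12 - 5 * 0.5      # ~9.5 min of framework overhead per run
COMPUTE_PER_HOUR_MIN = 5 * 0.5   # ~2.5 min of real work per hour of data
BACKLOG_HOURS = 8 * 24 * 60 / 12 # 8 days of catch-up at 12 min per hourly run

for chunk_hours in (1, 6, 24):
    runs = BACKLOG_HOURS / chunk_hours
    total_min = runs * (OVERHEAD_MIN + COMPUTE_PER_HOUR_MIN * chunk_hours)
    print('%2dh chunks: %4.0f runs, ~%.1f days wall-clock'
          % (chunk_hours, runs, total_min / (60 * 24)))
```

Under these assumptions, 6-hour chunks cut the 8-day catch-up to under 3 days and daily chunks to about 2, which is why splitting out the backfill and biting off more data per run pays off.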
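On the 16:17 idea of moving the concat step into a cron job: since it is "honestly just a series of HDFS commands", no MapReduce job needs to be spawned. A minimal sketch, assuming hypothetical HDFS paths and a `hadoop` CLI on the PATH; `fs -getmerge` does the concatenation into a local file, which is then pushed back up:

```python
#!/usr/bin/env python
"""Cron-driven stand-in for the Oozie concat sub-workflow.

Sketch only: the paths are hypothetical and error handling is minimal.
"""
import os
import subprocess
import tempfile

HOURLY_GLOB = '/wmf/data/device_class_geo/hourly/2013-04-12_*'  # hypothetical
DAILY_FILE = '/wmf/data/device_class_geo/daily/2013-04-12.tsv'  # hypothetical


def hdfs(*args):
    """Run a `hadoop fs` command; raise if it exits non-zero."""
    subprocess.check_call(('hadoop', 'fs') + args)


with tempfile.TemporaryDirectory() as workdir:
    merged = os.path.join(workdir, 'merged.tsv')
    # -getmerge concatenates every matching part file into one local file.
    hdfs('-getmerge', HOURLY_GLOB, merged)
    # Replace the daily file: remove the old one (ignore "not found")...
    subprocess.call(['hadoop', 'fs', '-rm', DAILY_FILE])
    # ...then upload the merged result.
    hdfs('-put', merged, DAILY_FILE)
```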
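And for the piped geo field shown at 17:30 ("170.70.193.151|DK ... split on pipe and extract"), a tiny helper; the fallback behavior for lines that were never geocoded is an assumption, not something the chat confirms:

```python
def split_ip_geo(field):
    """Split the 'ip|country' field of the sampled webrequest logs.

    The geocoder appends '|XX' to the IP, e.g. '170.70.193.151|DK'.
    Lines that were not geocoded may lack the suffix (an assumption),
    so the country defaults to None in that case.
    """
    ip, sep, country = field.partition('|')
    return ip, (country if sep else None)


assert split_ip_geo('170.70.193.151|DK') == ('170.70.193.151', 'DK')
assert split_ip_geo('170.70.193.151') == ('170.70.193.151', None)
```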
[17:39:04] sorry if i missed anything
[17:43:08] oh ok, I figured out what causes the hard crash, because it happened again
[17:43:15] analytics-logbot, you still keeping it together?
[17:43:16] I am a logbot running on tools-exec-02.
[17:43:16] Messages are logged to www.mediawiki.org/wiki/Analytics/Server_Admin_Log.
[17:43:16] To log a message, type !log <msg>.
[17:43:19] apparently I'm no longer allowed to press the left mouse button and space in combination
[17:44:26] !log pressing left mouse button + space on stat1
[17:44:28] Logged the message, Master
[17:45:52] haha, analytics-logbot doesn't get sarcasm, ori
[17:50:24] [travis-ci] develop/b50f65e (#125 by milimetric): The build was broken. http://travis-ci.org/wikimedia/limn/builds/6289198
[17:50:46] aw balls
[17:54:30] milimetric: is this card still relevant: https://mingle.corp.wikimedia.org/projects/analytics/cards/306 or is this basically part of #95?
[17:55:09] well, we hadn't scoped it yet
[17:55:26] but yeah, from the title it looks like we're covered by what we did on 95
[18:01:47] ok ty
[18:09:03] drdee: r51 in wikistats format is in stat1:/home/spetrea/bak.www/new_pageview_mobile_reports/r51-full-logic-run/
[18:09:09] drdee: r51 in wikistats format is in stat1:/home/spetrea/bak.www/new_pageview_mobile_reports/r51-full-logic-run/out_sp/
[18:09:10] ty
[18:54:39] drdee: may I have bugzilla access..... please?
[18:55:13] milimetric: dev-reportcard appears to not have been restarted
[18:55:16] or... maybe I can make an account
[18:55:16] just open an account
[18:55:18] the google stuff works now
[18:55:19] ok
[18:55:52] well, that problem is fixed, but something's still weird when I click sign in
[19:01:10] lunch with a friend
[19:01:12] back in a bit
[19:01:12] !
[19:02:15] I want to assign myself those tickets in bugzilla
[19:02:23] https://bugzilla.wikimedia.org/show_bug.cgi?id=46193
[19:02:35] https://bugzilla.wikimedia.org/show_bug.cgi?id=46273
[19:02:40] https://bugzilla.wikimedia.org/show_bug.cgi?id=46269
[19:02:45] https://bugzilla.wikimedia.org/show_bug.cgi?id=46205
[19:02:49] https://bugzilla.wikimedia.org/show_bug.cgi?id=46194
[19:03:06] drdee: could you assign all of them to me please?
[19:03:23] we only assign people to a defect once you are actively working on it
[19:04:01] ok
[19:04:08] I'll focus on the first item in 353 then
[19:04:23] let's first start with documenting what needs to happen
[19:14:09] hey drdee, authentication is live on dev
[19:14:10] http://dev-reportcard.wmflabs.org/
[19:14:38] ottomata, erosen, average, DarTar, you should all have rights to edit graphs
[19:14:43] yay
[19:14:44] (though most of them you can only edit the title)
[19:15:03] i have to find a way to make blank fields visible for editing
[19:15:31] so yeah, if anyone's interested, let me know. The big open question is: how annoying is the sign-in process (it redirects you back to the home page)
[19:15:42] is it a show-stopper or can we make it a separate card and move on for now?
[19:16:24] btw, erosen, if you change anything in the graph, it'll still save. So you can go all ninja-command-line on it
[19:16:44] ?
[19:17:08] like if you go limn.model().root().children()[7]...
[19:17:10] that stuff
[19:17:11] :)
[19:17:11] aah
[19:17:13] gotcha
[19:17:14] nice
[19:17:33] i still need to click save, right?
[19:17:36] well, not *nice* but ... nice is coming soon
[17:17:48] yeah, or just limn.model.save()
[19:19:11] moving to cafe, back on in a bit
[19:19:26] http://dev-reportcard.wmflabs.org/dashboards/donations is on that server too, for playing
[19:19:34] i should give matt walker rights
[19:35:55] milimetric: really neat
[19:36:24] so? is it annoying or passable for now?
[19:42:31] i would say passable
[20:01:44] milimetric: so is it possible to restrict edit access to Limn for now to @wikimedia.org emails?
[20:02:22] sure, I'd have to find a reliable way of parsing the domain out of the email address, but I don't see a reason why not
[20:02:48] that would leave out people like Stefan though
[20:03:31] but yeah, I'd have to change only one function so maybe think about it some more and come up with a scheme we'd want to use
[20:35:50] test /yes
[20:35:52] hmm
[20:35:52] ohh
[20:35:55] /path
[20:35:57] look at that
[20:35:58] ha
[20:38:15] back!
[21:19:48] i'm out, laters all!
[22:05:28] drdee:
[22:05:53] drdee: 1, 2, 4, 5 and 7 were solved from this list: https://bugzilla.wikimedia.org/show_bug.cgi?id=46265
[22:06:03] drdee: the remainder needs to be solved so we can close it
[22:07:07] are those bug fixes pushed?
[22:07:20] yes, they were merged
[22:08:59] so why is this report empty: http://stats.wikimedia.org/wikimedia/squids/SquidReportCountryData.htm
[22:10:03] this is also empty: http://stats.wikimedia.org/wikimedia/squids/SquidReportDevices.htm
[22:11:59] ok this is a problem
[22:12:10] drdee: did this happen recently?
[22:12:17] so what i would say is this:
[22:12:42] listening
[22:12:53] turn on the current wikistats on stat1 and generate reports, copy the reports to kripke
[22:13:01] and let's make a test instance there
[22:13:14] a test instance of wikistats on kripke?
[22:13:23] then we just investigate the reports as they currently are and update the bug reports
[22:13:24] yes
[22:13:34] because i don't know what is fixed and what is not fixed
[22:13:37] what is broken
[22:13:42] how much work it is to fix
[22:13:51] so let's just do a zero measurement
[22:14:02] can you start a run on stat1 right now?
[22:14:10] just run it as-is
[22:14:20] on monday we will set up kripke to show these reports
[22:14:29] I have to remember all the parameters
[22:14:45] christ, hopefully I have it saved somewhere
[22:14:48] or in bash history
[22:15:08] well, we should have a bash script for that :)
[22:16:20] ok
[22:28:41] can you start wikistats now?
[22:29:48] yes, I'm working on it
[22:30:25] ok, also put this in a bash script so simple folks like me can start it :)
[22:30:46] ok
[23:23:01] apparently the new wifi disconnects you every time you lock your machine?
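Returning to the 20:02 question about restricting Limn edits to @wikimedia.org addresses: the "only one function" milimetric mentions could look like the sketch below. Comparing the part after the last '@' case-insensitively is a reliable way to get the domain, and an explicit allowlist covers exceptions like Stefan. All names and the example address here are hypothetical, not Limn's actual API:

```python
ALLOWED_DOMAINS = {'wikimedia.org'}
# Explicit exceptions for trusted users outside the domain (e.g. Stefan).
ALLOWED_USERS = {'stefan@example.org'}  # hypothetical address


def may_edit(email):
    """Return True if this authenticated email address may edit graphs."""
    email = email.strip().lower()
    if email in ALLOWED_USERS:
        return True
    # Split on the last '@' so unusual local parts containing '@'
    # don't confuse the domain check.
    _, _, domain = email.rpartition('@')
    return domain in ALLOWED_DOMAINS
```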