[00:07:13] DarTar: You still around?
[00:07:25] kaldari: yes
[00:07:28] in Yongle
[00:07:45] having a bit of trouble generating the data on stat1
[00:08:00] you have any time to look at this with me?
[00:08:02] wanna come over? It's quiet and bright in R66
[00:08:04] sure
[00:08:07] sweet
[00:17:41] New review: Diederik; "Ok" [analytics/udp-filters] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/60222
[00:17:41] Change merged: Diederik; [analytics/udp-filters] (master) - https://gerrit.wikimedia.org/r/60222
[01:12:37] milimetric: OK, I think I got everything done: generated the data, registered the data source, created the graph, updated github. Any chance I could convince you to do the deploy part? Don't have to do it right now of course :)
[01:19:26] no prob, deploying now
[01:23:35] milimetric: so the next step is to set up a way to switch the graphs for different projects, like English Wikipedia, MediaWiki, German Wikiquote, etc. What would be your suggestion for that?
[01:23:50] switch them?
[01:24:01] or have some way to choose which dataset you want
[01:24:19] oh ok, maybe a separate dashboard with a tab for each
[01:24:35] yeah, that would make sense
[01:24:39] or we could brainstorm some cool way of swapping out data within one graph
[01:24:48] easy way or cool way - entirely up to you guys :)
[01:25:12] I like the idea of a separate dashboard for the project, since some projects won't need certain tabs
[01:27:06] milimetric: would there be any way to set that up before Thursday or is that a longer-term thing?
[01:27:45] setting up a dashboard is just a matter of making a json file like the dashboards/features one that you just edited
[01:27:56] so you can do it yourself, it'll be served automatically
[01:28:01] cool
[01:29:48] does this look good kaldari? http://ee-dashboard.wmflabs.org/dashboards/features/
[01:30:01] seems like it worked to me
[01:30:24] yep, thanks!
[01:30:34] cool, gj btw, told you it was easy :)
[01:30:51] or wait, I don't wanna presume, was anything weird about it?
[01:31:55] there was one thing I was wondering about... how do I edit a data source after I've created it on /datasources?
[01:32:16] ah yeah, not very friendly, but you can edit the file itself
[01:32:39] ah, that's fine
[01:32:43] for you it stores it in /srv/ee-dashboard.wmflabs.org/limn/var/data/datasources/
[01:32:47] on kripke
[01:33:05] so what I just did was take the files from there, and from var/data/graphs/, and add them to your repo
[01:33:08] then deployed
[01:33:10] that's useful, thanks!
[01:33:31] it's a little roundabout, we should have some way of committing the newly created stuff
[01:33:33] np
[01:33:43] k, now I'm gonna go eat some dinner, have a nice night :)
[06:06:57] analysis files?
[06:07:56] oh. that was from, like, an eon ago.
[06:08:13] silly scrollbars, hiding from me.
[13:33:51] speaking of puppet ottomata, how do we get the limn::instance config to show up on labs console?
[13:37:39] you add it yourself
[13:37:56] but, hmmm
[13:39:10] do you want it for reportcard or something else?
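[Editorial aside on the dashboard recipe milimetric gives above, around 01:27: a new dashboard is just another JSON file alongside dashboards/features, served automatically. The log never shows the actual Limn schema, so every field name and id in the sketch below is an illustrative guess at what a per-project dashboard file might look like, not the real format.]

    {
        "name": "German Wikiquote features",
        "headline": "Editor Engagement",
        "subhead": "German Wikiquote",
        "tabs": [
            {
                "name": "Features",
                "graph_ids": ["dewikiquote-feature-usage"]
            }
        ]
    }

[Per milimetric's note, dropping a file like this into the dashboards directory should be enough for it to show up; no deploy step is needed beyond committing it.]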
[13:51:58] heya milimetric, do you have the operations/puppet repo cloned?
[13:52:14] yes
[13:52:20] can you try something for me?
[13:52:24] sure
[13:52:36] git pull
[13:52:36] git checkout sandbox/ottomata/testA
[13:52:45] re: above, I was just thinking limn::instance would be on there for good so people can easily set up their own limn instances
[13:52:54] k, doing that ^^
[13:53:02] yeah, it sorta works that way, um, we'd have to create a role class that uses global variables
[13:53:10] labsconsole doesn't work with parameterized classes
[13:53:13] so
[13:53:19] misc::statistics::sites::reportcard will work
[13:53:23] yep, got your branch
[13:53:44] ok, make a change and try pushing
[13:54:43] prohibited by gerrit
[13:54:46] hm ko
[13:54:55] K.O. indeed
[13:54:56] :)
[13:56:38] hmmm, ok
[14:05:49] morning guys
[14:06:42] morning drdee
[14:46:08] hmmm, milimetric
[14:46:18] does this stuff need to be set up?
[14:46:18] __cohort_data_instance__ = 'cohorts'
[14:46:18] __cohort_db__ = 'usertags'
[14:46:18] __cohort_meta_db__ = 'usertags_meta'
[14:46:18] __cohort_meta_instance__ = 'prod'
[14:46:25] in mysql as a db for user_metrics to work?
[14:53:46] ottomata: I need to run some experiments on locke with udp2log ...
[14:54:09] so I need privileges to modify the config file and to send SIGHUP to that process.
[14:54:48] do you already have access to locke?
[14:54:59] yes, I'm logged in.
[14:55:02] hmmmm
[14:55:05] k
[14:56:05] hmmmm
[14:56:13] i can make it so you can edit… not sure of the best way to let you sighup or restart udp2log... hmm
[14:56:27] just curious, whatcha doin?
[14:56:56] udp-filter is sampling 10 to 1 currently
[14:57:21] drdee would like to reduce that or even eliminate it entirely.
[14:57:52] But it is not fast enough, so it loses data -- I'm trying to improve performance.
[14:58:52] Had a question too: the filtered files seem to get written to /a/...; are they transferred elsewhere somehow?
[15:01:34] ottomata, regarding user metrics, i think those variables never change
[15:02:18] on locke right now, no
[15:02:30] it's not in production anymore
[15:02:32] it's a deprecated box
[15:02:37] so you can mess with it
[15:02:48] No, I mean on production boxes
[15:02:52] oh
[15:02:54] yes
[15:02:58] not all of them
[15:03:09] but lots are rsynced to stat1:/a/squid/archive
[15:03:38] ah, ok, and then deleted on the udp2log boxes?
[15:04:03] just trying to understand how we keep disk usage under control
[15:04:20] yeah
[15:04:39] by a cron job?
[15:05:16] files are deleted on log boxes by logrotate
[15:05:20] they are rsynced to stat1 by a cron job
[15:05:35] I have to ask ops about how best to give you udp2log restart perms
[15:05:35] ah, ok.
[15:06:22] Ok, some sort of local root on locke would work.
[15:07:11] Since it is not in production, it shouldn't be an issue I would think, but I'm not a sysadmin :-)
[15:07:37] well, there are huge discussions about local root stuff on boxes
[15:07:39] so that is hard
[15:07:40] but
[15:07:44] Hmmm, you know, you can run udp2log procs as your user, i think I can shut the running one down and you can start it up
[15:07:53] lemme see
[15:16:58] ok xyzram
[15:17:03] i put a dir in your homedir on locke
[15:17:05] udp2log-local
[15:17:09] it has a conf file and a script
[15:17:15] you can edit that file to your heart's content
[15:17:17] and just run the script
[15:17:29] the script will start up udp2log as your user listening on the webrequest udp port
[15:17:34] try it out and see if it works
[15:17:43] the config file has a single 10000-sampled /bin/cat right now
[15:17:52] so if you just run ~/udp2log-local/udp2log
[15:17:56] you should see output
[15:18:07] awesome ottomata!
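[For context: udp2log filter configs of this era are plain text, one filter per line, in the form 'pipe <sampling-factor> <command>'. A sketch of what the conf file ottomata describes plausibly contained; the second, commented-out filter is purely a hypothetical example, with invented flags and paths.]

    # Emit 1 of every 10000 incoming request lines to stdout.
    pipe 10000 /bin/cat

    # Hypothetical lower-sampled filter piping through udp-filter and
    # appending to a file (flags and paths illustrative, not from the real box):
    # pipe 10 /usr/bin/udp-filter -d en.wikipedia.org >> /home/xyzram/filtered.log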
[15:18:30] getting breakfast and all
[15:22:17] ottomata: great, thanks, that works.
[15:25:51] seems like it's getting around 800 lines per minute; is that typical or does it spike up substantially at different times of the day?
[15:26:08] unsampled?
[15:26:27] Oh, sorry, forgot about that.
[15:26:46] back in 30 min
[15:26:49] yea the cat I put there is sampled 10000, you can edit and do whatever you need
[15:27:50] yes, it is around 12 lines per second sampled, which would be 120,000/s unsampled.
[15:29:35] milimetric: when you get back, we need to talk about the cohorts db
[15:29:49] i think it only exists on db1047
[15:30:01] and we need to get the schema checked into the project or something
[15:30:08] word
[15:30:20] ottomata: I also want to chat with you about x-cs stuff
[15:30:42] I was going to write an update to the ops thread but I thought we could do another round of due diligence
[15:31:20] yeah i think that's good
[15:31:28] i'm stuck on my other stuff right now, let's check it out now?
[15:33:36] word
[15:33:38] hangout
[15:33:48] ottomata: ^^
[15:34:56] kj
[15:35:00] in standup
[15:55:55] average: around?
[15:57:15] dschoon: poke
[15:57:31] about to leave
[15:57:37] for zee office. need zee foods also.
[15:57:41] ok, will talk in office
[16:19:46] drdee: question about sequence numbers: with a sampling factor of 10000 I'm expecting the seq numbers of each host to increment by exactly 10000, but I'm seeing almost random increments, e.g.
[16:20:00] cp1032.eqiad.wmnet: 11 lines
[16:20:01] 7451388
[16:20:01] 7451912 (524)
[16:20:01] 7456520 (4608)
[16:20:01] 7459847 (3327)
[16:20:01]
[16:20:21] where the parenthesized value is the increment over the previous value.
[16:23:56] so ottomata, let's talk cohorts db
[16:28:05] xyzram: you have to check by hostname
[16:28:06] ottomata: looks like there is a cron that starts up udp2log as user udp2log
[16:28:23] I'm not able to kill it.
[16:28:48] drdee: Yes, I'm checking the host cp1032.eqiad.wmnet
[16:29:09] in meeting
[16:29:15] ottomata can you help xyzram?
[16:30:05] oh, puppet is doing that xyzram, will fix, one sec
[16:30:08] stopping it manually for now
[16:30:30] milimetric: talking with erosen about x-cs
[16:30:33] ok
[16:30:35] after
[16:30:47] after standup maybe
[16:30:47] ottomata: do you know anything about those seq numbers?
[16:31:25] I mean, why am I seeing random increments rather than the expected 10000?
[16:32:38] back in a half hour or so.
[16:37:39] AG grooming today
[16:37:49] getting lunch as fast as I can
[16:47:21] drdee: pong
[16:47:23] just woke up
[16:47:32] gonna take a shower and be ready for scrum
[16:47:35] ok
[16:59:32] erosen, mark says that requests are only tagged on .zero. domains
[16:59:43] would you join #wikimedia-operations room?
[16:59:46] erosen ^
[16:59:48] word
[16:59:49] on it
[17:35:00] ottomata: has the puppet issue been fixed? (I'm still seeing a udp2log process I cannot kill)
[17:35:08] ack no, on it now
[17:44:27] average: you about?
[17:44:36] the backlog grooming hangout https://plus.google.com/hangouts/_/e920454707b309ebf66331e55e9d557eb5ed1caa
[17:44:41] join us!
[17:46:16] dschoon: ok
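[A note on xyzram's sequence-number puzzle above: assuming udp2log's 'pipe 10000' filter emits every 10000th packet of the combined stream from all senders, consecutive sampled lines from any single host should advance that host's sequence number by 10000 only on average, with large variance, because the hosts' streams are interleaved; exact increments of 10000 would occur only if one host were sending. A sketch of the per-host gap check drdee suggests, in Python; it assumes the webrequest log format puts the hostname in the first field and the per-host sequence number in the second, which may need adjusting.]

    #!/usr/bin/env python
    """Per-host sequence-number gap check for sampled udp2log output.

    Assumed log format: hostname in field 1, per-host sequence number
    in field 2; adjust the indexes if the real format differs.
    """
    import sys

    last_seq = {}  # hostname -> last sequence number seen

    for line in sys.stdin:
        fields = line.split()
        if len(fields) < 2:
            continue
        host = fields[0]
        try:
            seq = int(fields[1])
        except ValueError:
            continue  # skip malformed lines
        if host in last_seq:
            print("%s: %d (+%d)" % (host, seq, seq - last_seq[host]))
        last_seq[host] = seq

[Usage would be something like '~/udp2log-local/udp2log | python seq_gaps.py' (script name hypothetical); over enough lines the per-host gaps should average out near the sampling factor.]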
[18:34:55] erosen, we can just chat real quick
[18:56:20] New patchset: Ottomata; "Adding user_metrics.sql schema" [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/60454
[18:56:30] milimetric
[18:56:32] https://gerrit.wikimedia.org/r/#/c/60454/
[19:09:04] Change merged: Ottomata; [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/60454
[19:16:57] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/asana-stats] (master) - https://gerrit.wikimedia.org/r/60459
[19:19:00] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT).:" [analytics/asana-stats] (master) - https://gerrit.wikimedia.org/r/60459
[19:19:11] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/check-stats] (master) - https://gerrit.wikimedia.org/r/60460
[19:19:56] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/editor-geocoding] (master) - https://gerrit.wikimedia.org/r/60461
[19:20:03] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/glass] (master) - https://gerrit.wikimedia.org/r/60462
[19:20:04] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/gerrit-stats] (master) - https://gerrit.wikimedia.org/r/60463
[19:20:07] New patchset: Hashar; "Jenkins job validation (DO NOT SUBMIT)" [analytics/reportcard] (master) - https://gerrit.wikimedia.org/r/60464
[19:22:42] one sec ottomata1, working with evan
[19:24:14] it's ok, i merged it, just wanted you to see that I added the script there
[19:51:21] New review: Milimetric; "looks good to me, nice otto" [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/60454
[19:53:01] drdee: is the repo fixup done?
[19:53:08] nope :(
[19:53:09] sorry
[19:53:11] hold on
[19:54:57] asked for it, hopefully done soon
[19:55:46] * average is going to do it as soon as he wraps up what he agreed with Erik
[19:56:13] ETA tomorrow
[19:56:26] or late today
[20:53:54] ottomata1: you about?
[20:53:59] yup
[20:54:17] qq
[20:56:12] can you think of any reason our kafka importer would bitch about an = in the target path?
[20:56:21] ^^ ottomata1
[20:56:41] hm
[20:56:47] not that I know of
[20:57:22] then!
[20:57:25] i have a very clever idea
[20:57:59] basically, if we change the filename of our mobile imports slightly, the whole dataset will become a giant hive table
[20:58:52] haha
[20:58:53] oh yeah?
[21:02:23] i need to test the format first
[21:02:25] but yeah :)
[21:27:25] dschoon go for it
[21:27:40] ?
[21:27:44] https://mingle.corp.wikimedia.org/projects/analytics/cards/620?version=1
[21:27:50] the hive kafka thing
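[Context for dschoon's "giant hive table" idea: Hive maps key=value directory components under a table's location onto table partitions, which is also why an '=' in the kafka importer's target path matters here. A sketch of the kind of external table such a naming scheme would enable; the table name, columns, and paths below are all hypothetical, not the actual Kraken layout.]

    -- Hypothetical: data imported under paths like
    --   /wmf/raw/webrequest_mobile/year=2013/month=04/day=21/
    -- can be exposed as one partitioned external table.
    CREATE EXTERNAL TABLE webrequest_mobile (
      hostname    STRING,
      sequence    BIGINT,
      dt          STRING,
      ip          STRING,
      http_status STRING,
      uri         STRING
    )
    PARTITIONED BY (year INT, month INT, day INT)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/wmf/raw/webrequest_mobile';

    -- Partitions still need registering as new days land, e.g.:
    --   ALTER TABLE webrequest_mobile ADD PARTITION (year=2013, month=4, day=21);
    -- or in bulk: MSCK REPAIR TABLE webrequest_mobile;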
[22:08:27] drdee: http://localhost:8888/beeswax/results/155/0?context=design%3A86
[22:09:58] drdee: http://localhost:8888/beeswax/results/156/0?context=design%3A83
[22:09:59] also
[22:10:57] in meeting, will check soon
[22:18:39] it's the top referring search queries
[22:18:44] and the top referring domains
[22:19:30] the job for top referring keywords is almost done :)
[22:19:35] (i was curious)
[22:22:47] man. i really want to filter out stop words
[22:22:52] but i am going to stop fiddling.
[22:23:21] milimetric: you about?
[22:24:09] drdee: http://stats.wikimedia.org/kraken-public/webrequest/mobile/views/sessions/2013/04/
[22:30:47] david: google the biggest referer?
[22:31:48] result/155/ doesn't work, is that expected?
[22:32:57] sec
[22:33:37] and yeah about google. :P surprise! i was more surprised that wikipedia is the #2 referer
[22:34:07] Top Session Referring Search Keywords: http://localhost:8888/beeswax/results/160/0?context=design%3A88
[22:43:23] Top Mobile Session Entry Searches: http://localhost:8888/beeswax/watch/164?context=design%3A89
[22:43:41] Top Mobile Session Referring Domains: http://localhost:8888/beeswax/results/163/0?context=design%3A83
[23:29:51] New patchset: JGonera; "Generate JSON files in datasources using templates" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/60608
[23:45:26] New patchset: JGonera; "Update upload_web table" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/60610
[23:49:33] New patchset: JGonera; "Generate JSON files in datasources using templates" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/60608
[23:54:31] finishing out from home.
[23:54:31] brb
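[A closing note on JGonera's "Generate JSON files in datasources using templates" patchset above, which addresses the earlier complaint that newly created datasources live only as files on the server. The log doesn't show the implementation, so this is only a guess at the shape of such a generator: render one datasource JSON file per metric from a shared template. All field names, ids, and paths below are invented for illustration.]

    #!/usr/bin/env python
    """Sketch of generating Limn datasource JSON files from a template,
    in the spirit of the limn-mobile-data patchset; not the repo's code."""
    import json
    import os

    # Shared defaults every generated datasource starts from (assumed fields).
    TEMPLATE = {
        "format": "csv",
        "type": "timeseries",
    }

    def write_datasource(out_dir, ds_id, name, url):
        """Render one datasource JSON file from the template."""
        ds = dict(TEMPLATE, id=ds_id, name=name, url=url)
        os.makedirs(out_dir, exist_ok=True)
        path = os.path.join(out_dir, ds_id + ".json")
        with open(path, "w") as f:
            json.dump(ds, f, indent=4, sort_keys=True)
        return path

    if __name__ == "__main__":
        # Hypothetical example; the id, name, and CSV path are invented.
        print(write_datasource("datasources", "mobile_uploads",
                               "Mobile Uploads", "/data/mobile_uploads.csv"))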