[00:00:09] hehe yes [00:00:21] :) [00:00:32] and self-built NAS systems meant to use low power and no fan [00:00:38] (stupid me) [00:00:55] actually I had one of those, and it was quite fast performance for a cheap price [00:01:04] the eee-pc I mean [00:01:14] I carried a VGA / DVI cable with me so I could plug it into bigger displays [00:01:39] jorn: so are you running hadoop on it ? [00:01:51] well, the NAS performs quite nice as well, just not very good if you want to use crypt as the chip doesn't have AES-NI [00:02:05] average_drifter: haha no, it's a poor PhD's solution [00:02:15] :) [00:02:21] bash, sort and a python script [00:02:54] jorn: I wonder how effective it would be if you tried to make a Raspberry PI cluster [00:03:02] jorn: I mean cost-effective and performance [00:03:17] I always wondered about that [00:03:50] from what i saw i guess not very… you don't need all that graphics shit :D [00:04:09] if you're not using cuda or something [00:06:01] relocating, i'll be online later tonight [00:06:58] ah yes, CUDA [00:07:09] one of the things I have and I'm not using on my machine [00:12:21] well, not too bad… you can mine bitcoins with it quite efficiently but for most less special usecases it's mostly wasted developer time… unless someone already solved your problem on it and all tests pass on your card ;) [03:24:42] nite everyone [03:29:58] morning everyone :) [13:13:21] gooood morning! [13:21:48] http://i.imgur.com/kHvxv.jpg [13:22:05] might need some paperclips today [14:34:43] morning average_drifter, milimetric [14:34:47] good morning sir [14:40:51] hi [14:40:54] drdee: hi [14:40:55] :) [14:41:12] yo yo [14:41:28] i am still trying to figure out how to get traffic sent from blog.wikimedia.org to udp2log [14:41:36] what's the status of wikistats?
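[editor's note: the AES-NI point above is easy to check on Linux, where the `aes` entry in the `flags` line of `/proc/cpuinfo` indicates hardware AES support. A minimal sketch of that check; the helper name is hypothetical, not from the chat:]

```python
def has_aes_ni(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            if "aes" in flags.split():
                return True
    return False

# Usage (Linux): has_aes_ni(open("/proc/cpuinfo").read())
```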
[14:42:52] 01:55 < notpeter> I think that robh has mostly looked after the blog [14:43:12] just asked RobH, waiting for reply [14:44:24] drdee: for wikistats I'm re-reading the list Erik compiled [14:44:32] k [14:44:36] drdee: today I plan to start working on this point "new report shows strange domains, like 2620-, 2001-, 2a01, etc" [14:44:54] that look like parts of ip6 addresses to me [14:46:11] quite possible yes [14:47:50] rebooting brb [15:09:25] moooorning ottomata [15:09:35] how is the kafka consumer debianization? [15:10:09] morning! [15:10:54] well debianization was done monday, and puppetization mostly done yesterday it looks like [15:10:55] http://hue.analytics.wikimedia.org/filebrowser/view/wmf/raw/event-unknown?file_filter=any [15:11:00] they are being consumed into the wrong dir [15:11:02] i will fix that [15:11:04] but it is working! [15:14:51] and those files are small, i might want to change the import to happen daily rather than hourly for /event [15:19:12] ohh man it's daddy brain :D [15:19:28] but very cooooool! [15:19:57] so is the next step setting up unsampled streams for en.wikipedia.org/wiki/ and *.m.wikipedia.org ? [15:20:09] brb getting coffee [15:29:04] moin moin [15:29:27] ottomata: are you using git-buildpackage? [15:29:43] where's the kafka consumer packaging? [15:29:53] i am not, i'm using fpm [15:30:13] https://github.com/wmf-analytics/kafka-hadoop-consumer [15:32:08] ottomata: errr, wget to somewhere that's not WMF is not OK i thinks [15:32:22] wget? 
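[editor's note: the "2620-, 2001-, 2a01" domains above do look like IPv6 prefixes with `:` mangled into `-` (2620: and 2001: are well-known IPv6 allocation prefixes). A rough heuristic for flagging such rows might look like this; the helper is hypothetical, not part of wikistats:]

```python
import re

# A leading run of 1-4 hex characters, followed by '-', '.', or end of string.
_HEX_GROUP = re.compile(r"^([0-9a-fA-F]{1,4})(?:[-.]|$)")

def looks_like_ipv6_fragment(domain: str) -> bool:
    """Heuristic: does this 'domain' look like a mangled IPv6 address part?

    Requires the leading hex group to contain at least one digit, so ordinary
    labels such as 'en' or 'de' in en.wikipedia.org are not flagged.
    """
    m = _HEX_GROUP.match(domain)
    return bool(m) and any(c.isdigit() for c in m.group(1))
```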
[15:32:34] > cd /root/.m2/repository/asm/asm/3.1 && rm asm-3.1.jar && wget http://repo1.maven.org/maven2/asm/asm/3.1/asm-3.1.jar [15:32:38] oh yeah [15:32:39] that's fine [15:32:41] make it work then [15:32:42] it should just work [15:32:50] the .deb doesn't do that [15:32:53] this is only for building the package [15:32:58] i understand [15:33:11] so that doesn't happen on prod machines [15:33:11] debs need to be rebuildable [15:33:27] at least i can tell you for sure this wouldn't be accepted in debian proper [15:33:42] whether or not your other opsen will care i can't say for certain [15:34:09] so, dunno how you are going to get around that [15:34:13] java uses maven repos to build packages [15:34:21] they download from all over the place to satisfy deps [15:34:26] you need to make a local copy of that and download from there [15:34:37] sounds fun! [15:34:46] i am the giantist maven noob [15:34:47] (e.g. brewster) [15:35:00] and would rather spend time on making kraken work right now, than weeks on maven and java :p [15:35:06] best solution is probably just check how other packages for maven projects do it [15:35:19] someone must have already solved this [15:36:26] ottomata, didn't dschoon setup a nexus maven repo in labs? [15:36:27] heheh, would you like to figure that out for us? :D [15:36:36] yeah, he's got something going, but is busy with limn [15:36:59] ottomata: want to bug me about it in ~10 days? ;) [15:36:59] well we can fiddle with that as well ;) [15:37:20] nope [17:05:21] drdee: [17:05:22] http://hue.analytics.wikimedia.org/filebrowser/view/wmf/raw?file_filter=any [17:09:51] i'm not so sure I trust this producer as is though, we should check to make sure it can keep up [17:09:59] once we have some data we should verify that it looks good [17:10:42] aight [17:43:05] hey ottomata, feel like fiddling with udp2log ????? :D [17:43:55] sho what's up?
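[editor's note: the standard way to avoid ad-hoc wgets during a Maven build is to route all dependency downloads through an internal repository manager (such as the Nexus instance mentioned above, or a proxy on a host like brewster) via `~/.m2/settings.xml`. A sketch using real Maven mirror syntax; the internal hostname is an assumption:]

```xml
<!-- ~/.m2/settings.xml: route all Maven downloads through an internal mirror.
     The URL below is hypothetical; point it at the real Nexus/proxy host. -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-mirror</id>
      <name>Internal Maven mirror</name>
      <url>http://nexus.example.internal/repository/maven-public/</url>
      <mirrorOf>*</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With `<mirrorOf>*</mirrorOf>`, every repository request (including repo1.maven.org) goes to the mirror, so a rebuild never reaches the public internet directly.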
[17:43:56] we need to set up traffic from blog.wikimedia.org (host is marmontel) to be sent to udp2log so we can count blog page views in the webstatscollector [17:44:28] it's just one machine? [17:44:30] yes [17:45:03] RobH has offered to help getting it reviewed / deployed [17:45:37] is there cache in front of it? or just apache? [17:46:07] ah varnish, i see it [17:50:54] so this should put blog.wikimedia.org requests into the main stream, right? [17:51:08] yes [17:52:06] do we know who has set up blog.wikimedia.org? i'm not sure where or if this is puppetized [17:52:08] looking... [17:52:17] hola. [17:52:41] morning dschoon [17:52:48] mornin [17:53:07] ottomata, i think it is puppetized, RobH might be able to give quick directions [17:54:07] ah, found it I think [17:59:23] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [18:00:22] ottomata, milimetric, erosen ^^ [18:11:46] fyi, really interesting stuff recently in storm: https://github.com/nathanmarz/storm/wiki/Trident-tutorial [18:31:21] lunchtime, back in a bit! [19:35:35] ping average_drifter [20:20:03] milimetric, dschoon: any news on limn? [20:35:05] hey erosen [20:35:10] ayo [20:35:16] for the last couple of hours I think I have been duplicating the zero logs i'm feeding in [20:35:26] hehe [20:35:28] cool [20:35:34] i'm not using the stream yet actually [20:35:38] ok cool [20:35:46] but it does present a dilemma [20:35:57] were you planning to soon? [20:36:19] ideally they would be the basis of the ongoing mobile dashboards [20:36:33] aye [20:36:40] basically it is just so i don't have to copy over the zero-* files [20:36:43] manually [20:36:47] right [20:37:01] all the data is there, i just had duplicates for the last couple of hours.
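[editor's note: udp2log consumes newline-terminated log lines delivered as UDP datagrams, so relaying from a single host like marmontel conceptually boils down to sending each access-log line as one datagram. A minimal sketch of that idea; in production the existing varnish/apache log senders would be used rather than a custom script, and the addresses are placeholders:]

```python
import socket

def send_log_line(addr, line):
    """Send one access-log line to a udp2log collector as a single UDP datagram.

    udp2log treats each newline-terminated datagram as one log record, so we
    make sure the line ends with '\n' before sending.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if not line.endswith("\n"):
            line += "\n"
        sock.sendto(line.encode("utf-8"), addr)
    finally:
        sock.close()

# Usage: send_log_line(("udp2log-host.example", 8420), "blog.wikimedia.org GET /")
```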
I puppetized your producer and didn't turn off the old one [20:37:03] basically I'm very flexible on the issue [20:37:11] gotcha [20:37:32] i can just do a simple duplicate filter in pig [20:37:42] assuming the rows with timestamps are unique enough [20:38:15] drdee: do you know where erik moved his wikistats csv to? [20:38:23] i was using some of them for getting historical active editor counts [20:38:42] the wikistats git repo? [20:39:14] well i'm not sure when it became a repo [20:39:19] I'm not really in the loop on wikistats [20:39:36] but i was using a file at '/a/wikistats/csv/csv_wp/StatisticsMonthly.csv' but it moved and I think I found the same one at: [20:39:36] stats_fn='/a/wikistats_git/dumps/csv/csv_wp/StatisticsMonthly.csv' [20:40:11] yeah so that is the git repo [20:40:24] that is 99.9999999% sure the file you are looking for [20:40:41] cool [20:40:50] just checking cause it only had october counts in it [20:41:03] but now I remember that the perl scripts take some time [20:41:09] right but it's running right now [20:41:12] yeah [20:41:13] and they are fixing some bugs [20:41:17] interesting [20:41:22] so november data should come soon [20:41:25] cool [20:41:34] maybe ping me when you find out? so I can update things? [20:41:49] when all data is ready? [20:41:51] sure! [20:42:28] yeah [20:42:30] thanks [20:45:46] drdee: i was interviewing a guy for the senior position on platform [20:46:02] so that ate up everything but an hour since scrum. :/ [20:46:11] gonna eat some lunch now. then back at it. [20:46:19] dschoon: have a good lunch :) [20:47:13] aight [20:53:37] erosen, drdee: anybody using hue right now? i'm going to play with some settings [20:53:43] nope [20:53:45] go for it [20:55:15] go for it [21:23:31] back-ish [21:58:31] brb [22:11:15] drdee: here [22:11:27] drdee: workin on wikistats [23:39:48] hey average_drifter [23:42:51] hey drdee [23:42:56] yoyo [23:43:34] what's brewing? any progress on fixing erik's issues? 
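[editor's note: the duplicate filter erosen mentions, whether written in Pig (where `DISTINCT` does this over a relation) or anywhere else, is just "keep the first occurrence of each row". A minimal sketch in Python, assuming, as discussed above, that the rows are unique enough to use as keys:]

```python
def dedupe_lines(lines):
    """Yield each log line once, keeping first occurrences in input order.

    Uses the whole line as the key; in practice you might key on a
    (timestamp, url) pair if other fields can legitimately differ.
    """
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line
```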
[23:43:52] not yet, but I'm working on them [23:44:13] wikistats will be on jenkins as well with tests failing for part of Erik's findings [23:44:35] and of course we'll try to make them pass afterwards [23:45:29] drdee: is there a way to expose jenkins data to a web server ? [23:45:50] why? [23:46:05] drdee: I mean, we generate reports, and each test run will generate new reports, and it would be helpful to be able to read the reports generated in a jenkins build [23:46:16] that's only for wikistats [23:46:30] mmmmmmm you have to ask hashar tomorrow [23:46:36] hmm ok
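[editor's note: one common answer to the Jenkins question above is to copy the generated reports into the build workspace and let an "Archive the artifacts" post-build step pick them up; Jenkins serves archived artifacts over HTTP under the job's `…/artifact/` URL. A sketch of the "Execute shell" build step, with stand-in paths for wherever wikistats writes its reports:]

```shell
# Stand-in for the wikistats output directory (hypothetical path).
OUT=$(mktemp -d)
echo '<html><body>sample wikistats report</body></html>' > "$OUT/index.html"

# Copy reports into the workspace; an "Archive the artifacts" post-build step
# with pattern 'reports/**' would then expose them through the Jenkins UI.
mkdir -p reports
cp "$OUT"/*.html reports/
```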