[01:04:40] dschoon: kafka is not in maven
[01:04:41] fun times
[01:04:55] sigh.
[01:04:56] of course it's not.
[01:04:58] okay, i'll look into that.
[01:05:03] that's recockulous if not.
[01:05:12] but all the more reason we have to get our own nexus up and running.
[01:05:13] SIGH
[01:05:43] dschoon: this would work, if only it worked: http://dpaste.de/KNCZm/
[01:06:07] yep.
[01:06:11] we shall make it work!
[01:06:13] probably weds?
[01:06:15] this makes for a good read: https://issues.apache.org/jira/browse/KAFKA-133
[01:06:19] if that's ok with you
[01:06:21] i shall do so.
[01:06:34] i love how you are always an oracle of issues and mailing lists.
[01:06:39] it's always so thorough
[01:07:00] "Similar to other Apache projects, it will be nice to have a way to publish Kafka releases to a public maven repo."
[01:07:01] SIGH
[01:07:08] this was an issue with cassandra for ages
[01:07:12] well, the last few comments are pretty encouraging
[01:07:14] i have no idea why apache doesn't have a public nexus
[01:07:22] they do
[01:07:23] it's a fucking flat directory tree with xml files!
[01:07:26] oh!
[01:07:27] good!
[01:07:31] https://repository.apache.org/index.html
[01:07:36] kafka isn't on there.
[01:07:44] soon!
[01:07:51] yes, that does look encouraging
[01:07:58] well, see the last couple of comments, people seem to have gotten it to work
[01:08:05] i'm just not sure what resolver / repo URI they are using.
[01:08:08] yeah.
[01:08:14] i'll look into it!
[01:08:19] thanks
[01:08:20] i have some perf problems to fix in limn first
[01:08:24] then you're next on the list
[01:08:31] and generally we need our nexus situation cleaned up
[12:54:40] morning!
[13:01:30] hey milimetric ! :)
[13:01:44] what's going on ? :)
[13:01:46] howdy
[13:01:50] is the demo friday postponed ?
[13:02:10] no, I think drdee was trying to re-work it into more of a sprint demo
[13:02:50] oh
[13:24:42] hey drdee ! :)
[13:24:58] yoo!
[13:25:05] i need you for something quick and urgent
[13:25:14] webstatscollector.....
[13:25:25] i think we should do the following
[13:25:54] all ears
[13:25:59] eyes also
[13:26:04] 1) put the current head of origin master in a new branch called refactor or something like that
[13:26:30] 2) get the last checkin before you started rewriting and make that the new head of origin master
[13:26:57] 3) cherry pick some *small* fixes from the refactor branch (but not the big features), and maybe even nothing
[13:27:13] 4) add support for the wikivoyage domain to origin master
[13:27:16] 5) deploy
[13:27:40] there is quite some urgency with this, so maybe you can work on this today
[13:27:55] ok, can we bypass gerrit on that ? because we'd need to rebase the master branch and gerrit would cause delays on that
[13:30:25] why would gerrit cause delays?
[13:30:50] you mean the cherry picking requires rebase?
[13:31:07] if that's the case then let's just skip cherry picking and just add the wikivoyage domain
[13:32:50] drdee: in 2) you're saying to get a previous version of master to be the new master
[13:32:54] drdee: that's perfectly possible
[13:33:02] drdee: just haven't done that before with gerrit
[13:33:08] drdee: I hope it will go smoothly
[13:33:10] but this could be very easy
[13:33:12] hold on
[13:33:20] i was thinking this:
[13:33:36] git checkout -b refactor
[13:33:50] git push origin refactor ? :D
[13:33:56] git push gerrit origin refactor
[13:33:58] yes
[13:34:04] or whatever the command is
[13:34:27] and then dig through the logs, figure out the last id before the rewrite
[13:34:32] and then do
[13:34:37] (in master)
[13:35:06] git checkout GIT_ID
[13:35:33] yes, except that will tell you "you're on no branch now"
[13:35:43] if you checkout a specific commit
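A sketch of the rollback being described here, assuming the gerrit remote is named "origin" and that a direct, review-bypassing push is allowed; GIT_ID stands in for the last pre-rewrite commit they still have to dig out of the log:

    # 1) park the current head of master in a new branch
    git checkout master
    git checkout -b refactor
    git push origin refactor

    # 2) move master back to the last commit before the rewrite.
    # git reset --hard keeps you on the branch, unlike `git checkout GIT_ID`,
    # which leaves you on a detached HEAD ("you're on no branch now").
    git checkout master
    git reset --hard GIT_ID

    # the remote will reject this as a non-fast-forward unless forced
    git push --force origin master

This matches the stackoverflow answer linked below: reset the branch locally, then force-push, rather than checking out the old commit directly.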
[13:41:54] hold on, I'm trying this on a sandbox repo
[13:41:58] and then going for webstatscollector
[13:44:54] ok
[13:48:59] brb, we should also ask milimetric for advice
[13:49:03] drdee: that email from jmorgan lists toolserver links to even more data
[13:49:11] oh, I'll read up
[13:49:41] but re: jmorgan - this made me wonder whether we should do a big fanfare roundup of all the analytics going on everywhere in the org
[13:50:20] i don't think i have seen that email
[13:50:52] got it
[13:51:33] milimetric, can you chime in on how to execute the scenario described above with as little hassle as possible?
[13:51:48] trying to understand what you guys are trying to do
[13:52:05] basically we need to go back in time
[13:52:28] the current version of webstatscollector has a lot of improvements but we can't deploy that
[13:52:33] it will lead to packet loss
[13:52:47] but we need to deploy a small fix for webstatscollector
[13:52:53] http://stackoverflow.com/a/1625275/827519
[13:52:56] so i want to park the current master in a separate branch
[13:52:57] seems like this should work
[13:53:11] and go back in time, get a commit from before the refactoring
[13:53:14] the problem is when you push, because the git remote says "hey, you're pushing something, but I know your HEAD is newer"
[13:53:43] make that commit from back in time the new master
[13:53:53] add fix and deploy
[13:54:23] yes, that's what i was suggesting basically :)
[13:54:38] yep, that stackoverflow answer is kind of what I was going to say
[13:55:00] I'm trying it out on a sandbox github repo now
[13:55:27] cool. If gerrit gives you crap, can't you just ignore it and have drdee accept whatever you're pushing
[13:55:29] ?
[13:55:51] I can't believe we're using a code review tool that makes it harder to code
[13:55:57] let's try to use gerrit, i will review it right away
[13:56:07] if it's a nightmare then sure, just push
[13:57:47] brb coffeee!!!
[14:25:28] average_drifter: need help?
[14:28:24] drdee: got it made on the sandbox
[14:29:13] aight
[14:29:20] can you share your screen ?
[14:34:27] drdee: http://garage-coding.com/wikipedia-demo-revert.flv
[14:34:55] screen shared :)
[14:38:03] awesome!
[14:38:18] :)
[14:38:19] this looks good, let's have a look at the graph visualization in github
[14:39:08] https://github.com/wsdookadr/demo_revert/graphs
[14:39:14] https://github.com/wsdookadr/demo_revert/network
[14:39:48] I'm not sure if Gerrit will be upset about --force but we'll find out
[14:40:20] 1 sec
[14:40:26] milimetric
[14:40:36] can you also look at average_drifter's demo
[14:40:46] let's make sure that we don't create a huge mess :)
[14:40:55] :)
[14:41:23] so the github network does not fully reflect what happened, right?
[14:41:34] drdee: the second link reflects the changes
[14:41:39] drdee: https://github.com/wsdookadr/demo_revert/network
[14:41:46] it does not show that master is back to an older commit
[14:42:08] oh actually it does
[14:42:09] HAH
[14:42:11] :)
[14:42:12] ok
[14:42:14] I have no idea
[14:42:19] because gerrit is crazy
[14:42:25] this is all without gerrit
[14:42:29] should work
[14:42:30] I know
[14:42:34] please watch http://garage-coding.com/wikipedia-demo-revert.flv
[14:42:42] and then look at github
[14:42:48] it's only a couple of minutes
[14:42:48] I'm saying I don't know if it'll work in gerrit
[14:42:53] ohh right
[14:42:58] none of us do ;)
[14:43:57] could we have a gerrit sandbox ?
[14:45:50] i think that exists
[14:47:35] that would be useful
[14:48:08] * drdee is digging through mailing list archives
[14:48:30] how can you do --force to gerrit?
[14:48:42] https://www.mediawiki.org/wiki/Gerrit/personal_sandbox
[14:49:26] git review doesn't have any such option
[14:50:25] that just sounds like a place to push changes drdee, not a place to test git review madness
[14:51:17] can we just set up a new repo in gerrit? Call it gerrit-sucks or gerrit-testing? :)
[14:54:27] looks like git has revert http://www.kernel.org/pub/software/scm/git/docs/git-revert.html
[14:54:32] looking into that also
[14:55:04] you'd have to revert each commit individually
[14:55:18] I was gonna suggest that average_drifter but it'll make your history awful
[14:55:32] er.. average_1rifter
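For reference, a sketch of the git revert route they passed on: revert can take a range, but by default it creates one revert commit per reverted commit, which is the "awful history" concern; --no-commit stages the whole range so it can be squashed into a single commit. GIT_ID is again a placeholder for the last pre-rewrite commit:

    # revert everything between GIT_ID and the current master tip,
    # staging the changes instead of committing each one separately
    git revert --no-commit GIT_ID..master
    git commit -m "Revert post-GIT_ID refactoring"

The upside over a forced reset is that history stays fast-forward, so gerrit would have no reason to object; the downside is that the reverted commits remain in master's history.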
[14:56:47] milimetric: oneliners
[15:04:35] can we just set up a new repo in gerrit? Call it gerrit-sucks or gerrit-testing?
[15:04:36] yes
[15:04:39] let's do that
[15:04:43] just to be safe
[15:06:53] ok, can you create the repo please ?
[15:09:29] ye
[15:10:01] i will call it sandbox
[15:11:21] cool
[15:11:24] done, it's analytics/sandbox
[15:12:10] ok, will use in a bit
[15:52:25] drdee: chrome fixed the reload problem with Limn!
[15:52:38] nice!
[15:52:49] also, gzipping our JS alone gets us under 200K from 800K
[15:53:03] so only rendering and the memory leak in Chrome remain as problems
[15:53:08] sweet
[18:00:20] what was the chrome reload problem, milimetric?
[18:00:33] must've been a chrome bug
[18:00:34] are you guys getting a 404 on the hangout link?
[18:00:36] i just installed an update
[18:00:40] no
[18:00:41] hmm
[18:00:45] i'm on it
[18:00:55] here, I'll make a new one
[18:01:05] https://talkgadget.google.com/hangouts/_/calendar/ZHNjaG9vbm92ZXJAd2lraW1lZGlhLm9yZw.9nskdejqsjnt1p1bqdp70geu68 == 404!
[18:01:11] yeah
[18:01:12] i did also
[18:01:17] all hangouts 404 for me
[18:01:19] i can't find the raw link
[18:01:30] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90
[18:01:31] is talkgadget the final url or just a redirect?
[18:01:33] does that work?
[18:01:42] hmm 404 for me too :(
[18:01:45] 404 for me
[18:01:51] strange
[18:01:55] linux for the win
[18:01:56] you guys have been cut off from the internets
[18:02:08] been nice knowing you
[18:02:08] i'm going to try inviting you all from the plus UI
[18:02:09] k
[18:02:24] wow.
[18:02:26] it 404s.
[18:02:28] so yeah.
[18:02:35] skype?
[18:02:39] sure
[18:02:57] have you tried a different browser dschoon_ ?
[18:03:07] trying now.
[18:03:44] 404 on Firefox
[18:03:48] same
[18:03:50] yup
[18:03:51] so!
[18:03:53] ok
[18:03:57] Skype it is?
[18:04:04] yeah
[18:04:10] embrosen
[18:04:12] ok
[18:04:33] davidschoonover
[18:05:47] ottomata, can you join Skype for today's scrum?
[18:07:24] ACK
[18:07:26] skype?
[18:07:37] google is down
[18:07:37] heh
[18:07:45] i'm in the hangout
[18:07:56] but sure
[18:07:57] davi and I can't join
[18:09:04] my skype: ottomatona
[18:29:35] hi?
[18:50:39] ugh, is there another meeting right now drdee?
[18:50:45] Tracking usage of mobile beta site
[18:50:45] ?
[18:50:48] yes
[18:50:52] unnggghhhhh
[18:50:59] you've got 10 min
[18:51:04] i'm at a cafe, no food here
[18:51:05] uNNGHHhhh
[18:51:07] gotta run home
[18:51:15] yikes
[18:51:31] ok i can do it, be back in 10
[19:00:49] ottomata or ori-l: do either of you know the origin of the redis instance running on stat1?
[19:01:38] i need redis as the backend to pyrq (http://python-rq.org/)
[19:01:45] i can set up my own if this has some special purpose
[19:02:10] though it appears to be empty...
[19:02:12] redis 127.0.0.1:6379> keys *
[19:02:12] (empty list or set)
[19:02:13] hmmm, i set it up for someone once
[19:02:16] hmm, who was it?
[19:03:46] where is that? stat1001?
[19:03:53] stat1
[19:04:00] ah
[19:05:34] ottomata: is it currently being managed by launchd or something?
[19:09:46] ah ya
[19:09:47] for ori
[19:09:47] misc::statistics::eventlogging
[19:10:36] thanks
[19:10:42] I'll ping him to make sure it is okay, i guess
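A quick sanity check before repurposing the stat1 instance as the rq backend, sketched against the host/port shown in the paste above (127.0.0.1:6379):

    # is the instance alive, and is it really empty and idle?
    redis-cli -h 127.0.0.1 -p 6379 ping          # expect PONG
    redis-cli -h 127.0.0.1 -p 6379 dbsize        # expect (integer) 0 if truly empty
    redis-cli -h 127.0.0.1 -p 6379 info clients  # how many clients are connected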
[19:43:53] does anyone know what exactly the database size refers to on wikistats (http://stats.wikimedia.org/EN/TablesDatabaseSize.htm)?
[19:43:55] average_1rifter: ^^
[19:44:30] btw, average_1rifter, sorry if this isn't something you know about -- i'm never exactly sure which parts of wikistats you work on
[19:52:07] erosen: I think those numbers indicate, per month, the total size of log lines for a particular wikiproject
[19:52:46] erosen: so if you were to take all lines for en.wikipedia.org, for example, in December 2009, those would total up to 14G
[19:52:59] k
[19:53:04] this page: http://stats.wikimedia.org/EN/TablesWikipediaAR.htm
[19:53:05] erosen: the squid log lines
[19:53:12] seems to suggest it is related to the databases however
[19:53:31] but it looks like the number isn't even generated for most projects anyway
[19:55:38] erosen: best to ask ez about that one. I'm mostly on CountryReport, Mobile PageViews, Mobile Devices (and all the things you see here http://stat1.wikimedia.org/spetrea/wikistats/regression-tablets-discrepancy_for_config_editors/reports/2012-10/)
[19:55:45] erosen: aka wikistats/squids
[19:55:54] great
[19:55:57] good to know
[19:56:00] thanks
[19:56:28] you're welcome :)
[20:20:10] ottomata……i've got some C stuff :D
[20:20:36] i pushed a change to webstatscollector that adds wikivoyage domain support
[20:20:46] it's in a separate branch called time_travel
[20:20:54] it needs to be manually compiled
[20:21:02] tested on an an* box
[20:21:19] only filter needs to be recompiled
[20:21:36] so on locke the current filter binary should be saved and replaced with this new binary
[20:27:01] ottomata ^^
[20:27:30] yeah
[20:27:35] i'm looking for a lucid machine to compile it on
[20:27:48] what package is db.h in?
[20:28:35] berkeley-db 4.4
[20:28:39] or something like that
[20:29:07] to minimize disruption i would only replace the filter binary, not the collector binary
[20:34:02] drdee: did you want to meet now?
[20:34:12] yes
[20:35:13] robla, yes, erik and i are in the hangout
[20:35:16] https://plus.google.com/hangouts/_/7f9aacf6132dfbc92319c6ccfa39799005de283e
[20:35:43] hrm
[20:36:56] drdee, do I need 4.3?
[20:37:04] I was able to compile, but had to change the linker flags
[20:37:12] -ldb instead of -ldb-4.3
[20:37:32] yeah, that could be the right version
[20:37:42] can you use linker or something on the current binary
[20:37:52] to see what version was used to compile
[20:38:06] 4.8 is on locke
[20:38:41] ok
[20:38:54] and
[20:38:58] the binary on locke was built with 5.6
[20:39:00] 4.6
[20:39:11] but the filter binary doesn't use it
[20:39:13] only the collector
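One way to answer the "what version was the current binary built against" question, sketched here; the path to the deployed binary on locke is an assumption:

    # list the shared libraries the deployed collector links against;
    # the libdb entry shows the Berkeley DB version (e.g. libdb-4.6.so)
    ldd /usr/local/bin/collector | grep -i libdb

    # on the build machine, find candidate dev packages that ship db.h
    apt-cache search libdb | grep -- -dev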
[20:40:11] ok, i mean, i have it compiled, now what?
[20:42:06] try the collector / filter on an analytics box to make sure it works
[20:44:01] oh, on the stream, righhhht
[20:44:34] ungh, how does this work again?
[20:45:20] collector listens on stdin or a port?
[20:45:25] oh, 3815, got it
[20:50:23] :D
[20:52:12] ungghhh
[20:55:14] still trying to compile and run this
[20:57:43] ottomata: what errors do you get ?
[20:58:25] /tmp/cc8fYGxy.o: In function `initEmptyDB':
[20:58:26] /home/otto/scr/webstatscollecotr-wikivoyage/collector.c:169: undefined reference to `db_create'
[20:58:26] /tmp/cc8fYGxy.o: In function `produceDump':
[20:58:26] /home/otto/scr/webstatscollecotr-wikivoyage/collector.c:250: undefined reference to `pthread_create'
[20:58:26] /home/otto/scr/webstatscollecotr-wikivoyage/collector.c:256: undefined reference to `pthread_create'
[20:59:01] -lpthread
[20:59:03] -ldb
[20:59:08] those are in the makefile
[20:59:22] for collector ?
[20:59:51] ottomata: what url is the git repo at please ?
[21:00:04] https://gerrit.wikimedia.org/r/gitweb?p=analytics/webstatscollector.git;a=summary
[21:00:07] branch time_travel
[21:00:08] analytics/webstatscollector
[21:00:15] ok
[21:02:08] ok, compiled manually with
[21:02:09] cc -o collector collector.c collector.h export.c -ldb -lpthread
[21:15:10] heh, time_travel
[21:17:08] drdee ok, i think something is running on an09
[21:17:10] no idea what i'm doing though
[21:31:27] mmmm
[21:31:32] everything looking good
[21:32:23] ?
[21:38:21] after 1 hour it should output a txt file
[21:38:28] then we need to grep for ".voy" and inspect it; if all looks good then we are ready to go
[21:38:53] where is it running on an09 exactly?
[21:40:40] ottomata ^^
[21:41:19] /home/otto/scr/webstatscollector-wikivoyage
[21:49:24] ty
[21:53:05] ottomata, not sure if it is working on an09
[21:53:25] the db files should be written to /tmp
[21:53:29] but i don't see anything
[21:53:36] i don't think it's working either
[21:53:47] i have seen filter output stuff though, but only from a static file
[21:54:16] oh but
[21:54:19] it hasn't been running for an hour yet
[21:54:21] only 40 mins
[21:54:37] so is it keeping all the counts in memory then?
[21:55:55] i don't think that's the case
[21:55:57] IIRC
[21:56:00] this is what happens
[21:56:11] every hour collector creates a db in /tmp
[21:56:19] during that hour all counts are stored in that db file
[21:56:31] once the hour is over, it will export the data to a text file
[21:56:33] delete the db
[21:56:36] and create a new one
[21:56:56] do you have write permissions on /tmp?
[22:01:58] ottomata, maybe first check the flow from udp2log -> filter and see what filter outputs?
[22:04:41] ja, nothing
[22:05:23] OH
[22:05:24] doh
[22:05:28] udp2log isn't running on multicast here
[22:05:29] DOH
[22:06:28] ok, that should be better
[22:09:14] :D
[22:09:14] ;D
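A rough smoke test of the flow being debugged above, under the assumptions stated in the chat: filter reads squid log lines on stdin and writes aggregated lines on stdout, collector listens on UDP port 3815 and dumps its counts from /tmp to a text file hourly. The sample file name, dump file location, and the use of nc to bridge stdout to UDP are illustrative, not the actual udp2log wiring:

    # start collector, then replay a static log file through filter
    # and ship filter's output to collector's UDP port
    ./collector &
    cat sample-squid.log | ./filter | nc -u -w1 localhost 3815

    # after the hourly dump (file name/location assumed), verify that
    # wikivoyage traffic made it through
    grep '\.voy' /tmp/*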