[03:31:16] average_drifter: the todo for clicktracking is to kill it, it suffers from fundamental design problems [03:31:20] we're hoping to do that this week [03:33:00] ori-l: that's interesting [03:33:06] ori-l: so a rewrite is imminent ? [03:34:11] yes and no. it'll take some time before functionality is at parity with clicktracking. some of the features it implements are useful and don't currently have full-featured alternatives [03:34:42] but yes, in that lots of its basic functionality is obviated by things in E3Experiments [03:35:45] ori-l: so E3Experiments is a group of projects ? [03:35:46] what's the nature of your interest? are you interested in hacking on feature testing code? (if so, yay, there's lots to do!) or are you the administrator of a mediawiki deployment that is using it? [03:36:08] a little bit, yeah [03:36:22] ori-l: I'm just a guy interested in the clicktracking extension, I've done some stuff with clicktracking, but more simplistic (heatmaps, in Perl) [03:37:13] cool! happy to answer questions [03:37:23] ori-l: I wonder if heatmaps is one of the objectives of clicktracking.. I know for example Erik Zachte likes visualization a lot, I saw loads of stuff on his blog about visualizing data [03:37:33] ori-l: so let's go back to something concrete, you mentioned testing code [03:37:57] ori-l: would writing some QUnit tests for the clicktracking extension be a good thing ? [03:38:44] again, i think it'll be disabled this week, so i'm not sure how useful that would be, but E3Experiments could certainly use more testing [03:38:45] ori-l: that would basically just cover the modules/jQuery.clicktracking.js and the stuff in modules/ [03:39:49] you're free to contribute tests to clicktracking too if you want, of course, but if you give me a better feel for what kind of stuff you're interested in i could help identify opportunities in the codebase that are likely to see wide deployment if implemented [03:41:01] ori-l: well first of all, let's start off with what I know, so basically I know Javascript and Perl and C/C++ [03:41:28] ori-l: these are my main things. I can read PHP but I'm not proficient in it, however due to my Perl background I could somehow transfer that to PHP [03:41:37] my wife and i are just putting my son to bed so i have to go for a bit, but if you can bear to wait, i'll respond [03:41:39] ori-l: about what I'm interested in, I like TDD a lot [03:41:51] ori-l: no problem, I'll be here [03:41:59] thanks! and sorry to nip out in the middle of a conversation like that [03:42:13] alright [03:43:32] so as I was saying, I like TDD a lot and if it's possible to write tests for the ClickTracking extension and if you know some ideas where that would fit in, I'd be glad to give that a try [03:43:44] but before that, I'd like to ask what a UserBucket is ? [03:44:01] also, why does E3Experiments extension depend on the Clicktracking extension ? [04:48:38] average_drifter: sorry for the delay [04:49:21] so, if you look at the SkelJS extension on gerrit (under mediawiki/extensions/SkelJS) you'll see sample code for qunit tests [04:50:56] because most javascript code has dependencies on the mediawiki environment, you can't run them using just qunit. 
there's a standard way to declare test modules to mediawiki, though, and that'll cause your tests to be included when the test suite is run, using Special:JavaScriptTest or something like that [04:51:29] i think we have a couple in the E3Experiments extension too, but we haven't exactly been shining exemplars of TDD [04:52:27] if you want to write tests for E3, i'd follow the pattern laid out by the couple of tests we have already. that'd be a huge service, really, and i'd be personally grateful [04:52:47] if you prefer to target the clicktracking extension, i'd follow the example laid out by SkelJS to add qunit tests [04:53:11] the docs at http://www.mediawiki.org/wiki/Manual:JavaScript_unit_testing are pretty good [04:53:51] i'd be happy to answer any questions, review code, or help out. [05:15:36] ok, cool, thanks ! how about tests on server-side ? [13:25:33] morning otto [13:26:41] morning! [13:31:08] howdy [13:31:14] i am up early, and mucking about. [13:31:38] woa that is early [13:31:45] yes well. [13:31:57] it is due to my amazing dedication to the cause and/or periodic insomnia. [13:32:15] so~ [13:32:20] zo! [13:32:25] afaict, all nodes are up. [13:32:42] here are a few useful commands: [13:33:20] oooo [13:33:21] ssh an01 -- nodetool -h localhost info [13:33:21] dsetool ring [13:33:22] yay! [13:33:24] ssh an01 -- nodetool -h localhost ring [13:33:31] oh, they wrapped it? [13:33:34] i will look into that. [13:33:34] yup [13:33:42] probably just adds hadoop-related commands [13:33:42] same output from dsetool ring [13:33:44] looks great [13:33:47] yeah [13:33:49] awesome [13:33:54] did you do anything to make them happy? [13:34:08] i was getting some reported down or missing or something, and sometimes nodetool / dsetool would timeout [13:34:37] it looks like around 8p UTC, an07 wedged itself. [13:34:57] the jsvc processes were still up, but it wouldn't respond to anything [13:35:30] i asked them politely to quit, they didn't; i asked them rather impolitely to fuck off, they did; i restarted the service; all was well [13:36:05] from past experience, i am pretty sure i know exactly why you saw churn when you started it up. [13:36:22] have you done any reading about the seeds? [13:36:35] basically, the whole system runs on gossip, so there's no single point of failure [13:37:06] but when a node joins a cluster for the first time, there need to be some well-known points of contact for it to bootstrap its knowledge [13:37:30] that's what the seeds are. (apparently this is now pluggable with a service call, which is neat, but not really relevant for us) [13:37:59] hmm [13:38:07] seeds as in the IPs it starts with? [13:38:14] they should all be seeded with all 10 IPs [13:38:30] so right now, we have all 10 there [13:38:33] which is bad [13:38:35] oh [13:38:43] because it creates a race. [13:38:54] everybody is trying to contact everybody else, in order [13:39:01] and everybody is failing, because, well, everybody is down. [13:39:20] eventually, some number of unlucky nodes will throttle and wait [13:39:41] morning guys [13:39:47] and probably a small kernel of gossip will start between (i'd guess) a few early nodes, and a few late nodes [13:39:55] hmmmm [13:40:00] so you'd see an01 come up and an08 come up, something like that [13:40:01] anyway [13:40:05] so we should have what, 1 seed per 3 nodes or something? [13:40:06] the solution! [13:40:27] you usually have 1/rack, 2-3 per dc [13:40:31] oh [13:40:38] so maybe we only need two right now then? [13:40:41] yes. [13:40:42] an01 and05? 
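A rough sketch of what trimming the seed list down to a pair of nodes looks like on disk; the cassandra.yaml path is an assumption (DSE's packaged layout), and the IPs are inferred from the an0N -> 10.64.21.10N numbering that shows up later in this log:

    # DSE's cassandra.yaml (path assumed: /etc/dse/cassandra/cassandra.yaml);
    # the seed list shrinks from all ten nodes to just the chosen pair
    grep -A4 'seed_provider:' /etc/dse/cassandra/cassandra.yaml
    # seed_provider:
    #     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    #       parameters:
    #           - seeds: "10.64.21.101,10.64.21.106"   # e.g. an01, an06 (addresses assumed)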
[13:40:46] sure. [13:40:49] the other thing [13:40:49] an01 an06 [13:40:50] yeah [13:40:52] is to bring them up in order [13:41:02] what order? [13:41:03] seeds first? [13:41:07] let an01 and an05 come up, bootstrap, and chat up each other first [13:41:09] aye [13:41:13] then you can bring up the rest all at once. [13:41:31] ok cool, can change the seeds easy [13:41:53] since typical operation strives to avoid 100% down more than anything else, this case isn't exactly one of the main concerns. [13:42:24] you'll notice everything settled down eventually. i only had to restart one node, and i have no idea why it was wedged. [13:42:36] (logs showed nothing) [13:44:02] aye, i noticed 07 being weird too [13:44:22] lmk if you happen to notice anything else there [13:44:33] kinda worrisome, since there's absolutely nothing going on. [13:46:55] ottomata, you wanna make a user named "opscenter" on an01 only? [13:47:24] (or maybe show me how?) [13:47:43] (i am assuming `sudo adduser opscenter` is not an acceptable answer) [13:48:02] hm, why? [13:50:31] it runs as root otherwise? [13:53:15] dse? [13:53:18] doesn't it run as cassandra? [13:53:29] it does not. [13:53:41] not according to /etc/init.d/opscenterd, anyway [13:53:56] hmm, yup running as root, [13:54:00] let's make it run as cassandra [13:54:07] do as you will. [13:54:08] cassandra already exists and owns the data directories [13:54:15] well [13:54:20] this is just a monitoring webapp [13:54:23] what is opscenterd [13:54:24] ? [13:54:27] it'll connect an agent to each node [13:54:28] ohohh [13:54:29] opscenter [13:54:34] i thought you meant dse/cassandra [13:54:37] and collect instrumenting/perf/monitoring data [13:54:42] no no. [13:55:04] ah [13:55:16] is that running now? [13:55:44] the dashboard is, i think. [13:55:51] aye [13:55:53] http://analytics1001.wikimedia.org:8888/ [13:55:54] can I see it? [13:55:57] though that doesn't respond for me. [13:56:01] you have iptables running? [13:57:00] yeah you are supposed to proxy all web stuff on the cluster through an01:8085 [13:57:05] but it isn't working for me right now either.. [13:57:10] this prompt is about as confidence-building as "curl ... | bash": [13:57:11] http://www.datastax.com/docs/_images/agent_install_credentials2.png [13:57:34] can we open 8888 for now? [13:57:41] er. [13:57:42] fine [13:57:46] i'll set up a proxy. [13:57:58] i just dedicate opera to my cluster browser [13:58:00] and make it always proxy [13:58:06] :P [13:58:29] i don't get anything when I curl locally though [13:58:34] curl http://analytics1001.wikimedia.org:8888/ [13:58:42] 504 Gateway Time-out [13:58:47] hm. [13:58:48] oh i know [13:59:01] hm [13:59:01] HEAD http://analytics1001.wikimedia.org:8888/ [13:59:01] 500 Can't connect to analytics1001.wikimedia.org:8888 (connect: Connection refused) [13:59:51] hmm, i don't know [14:01:20] ack [14:01:21] hm [14:02:30] ok the proxy is working [14:02:43] :8888 is not [14:02:58] i just added a vhost to proxy 8086 to 8888... [14:03:43] well, the an01 8085 should just proxy everything nicely [14:03:51] so if you configure your browser to always use 8085 [14:04:01] then you should be able to enter any address and get it proxied through an01 [14:04:14] the cluster hosts all allow traffic from an01 [14:04:16] uh. [14:04:20] are we running a squid here? [14:04:33] no [14:04:36] i'm doing it with apache [14:04:43] also [14:04:49] you're logged into an01, right? 
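The kind of checks being poked at here boil down to a few one-liners (a sketch; only the 8888 port and the hostname come from the conversation):

    # who is opscenterd running as?
    ps -eo user,pid,args | grep [o]pscenterd
    # what address is it actually listening on?
    sudo netstat -tlnp | grep :8888
    # does it answer locally, and on the public name?
    curl -sI http://localhost:8888/
    curl -sI http://analytics1001.wikimedia.org:8888/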
[14:04:52] i can't HEAD or curl that port locally [14:04:57] 8888 [14:04:57] curl --url http://localhost:8888/ [14:05:11] you'll get a very ... interesting ... response [14:05:16] ah now it is working [14:05:19] well [14:05:20] HEAE is [14:05:22] HEAD is [14:05:26] buh? [14:05:26] curl shows nothing [14:06:46] i think maybe the python thing is not bound to the public IP [14:06:54] opscenter is not bound to public IP [14:06:59] cause I get 200 on localhost [14:07:05] 500 on analytics1001.wikimedia.org [14:07:10] tcp 0 0 127.0.0.1:8888 0.0.0.0:* LISTEN [14:07:14] hm [14:07:22] that does not look very public to me. [14:07:24] one sec. [14:07:35] aye [14:08:41] opscenter.conf [14:08:42] eh? [14:08:47] where? [14:09:07] /etc/opscenter [14:09:17] [webserver] [14:09:17] port = 8888 [14:09:18] interface = 127.0.0.1 [14:09:22] ...how [14:09:24] did i miss that [14:09:55] btw: service opscenterd restart [14:10:03] whenever you're done :) [14:10:34] i still want http://analytics.wikimedia.org/ :( [14:11:24] http://analytics1001.wikimedia.org:8888/opscenter/index.html [14:11:31] with browser 8085 proxy [14:11:45] should I go through this or do you want to? [14:11:52] i'm happy to [14:12:07] hm? [14:12:26] you mean, "dave, shut up and set up a socks proxy" [14:13:25] haha [14:13:25] no [14:13:33] the opscenter is asking questions [14:13:37] what are our hosts? [14:13:39] oh [14:13:46] however, i cannot get to it [14:13:58] i assume because i need to shut up and set up a socks proxy :) [14:15:03] it also wants to know our names [14:15:07] and about our first kisses [14:15:13] who are our parents? [14:15:20] why do we do things? [14:15:28] okay, "stfu-socks-proxy" running. [14:15:32] i don't see this wizard. [14:15:39] those are all excellent questions. [14:16:44] i need to eat a food, i think. man cannot live on espresso alone. [14:17:15] hopefully you'll be done recounting your romantic history and moral failings to ops center by the time i return. [14:17:32] ha, except [14:17:35] opera things look funny [14:17:38] I cannot see what I type [14:18:17] you have to be wearing opera glasses [14:18:33] or maybe you lack sufficient self-loathing to run that browser? [14:18:35] dunno. [14:18:38] brb foodz [14:18:41] mk [14:25:24] those *are* very good questions ;) [14:37:50] back [14:38:13] aye, i'm currently searching for a way to puppetize opscenter-agent [14:38:37] "0 of 10 agents connected" [14:39:03] right, the opscenter-agent is not running anywhere [14:39:10] i can do this through the web gui [14:39:15] ja [14:39:16] if I give it my private key [14:39:18] but i'd rather puppetize it [14:39:33] oo! [14:39:34] opscenter-agent.deb [14:39:41] is included in /usr/share/opscenter [14:39:56] yeah, i was gonna say... [14:40:05] also: http://www.datastax.com/docs/opscenter/agent/agent_index [14:40:16] aye yeah i'm reading all that [14:40:59] ah [14:41:02] this is the one we want http://www.datastax.com/docs/opscenter/agent/agent_manual [14:42:20] yup [14:42:35] i shall trust in your inordinate expertise :) [14:45:19] i don't like the fact that I have to run a script... [14:45:26] i just want to install a deb, and maybe modify a config file [14:45:28] but i'll get it [14:45:41] what's the script dooo? [14:46:21] create directories, copy some files around, install the .deb [14:46:24] i can do it with puppet [14:46:39] word. [14:46:40] ergh, i might not bother right now though... [14:46:48] heh [14:46:56] you sound like me all of a sudden... 
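A sketch of the corresponding fix; the filename is the one quoted above (some installs name it opscenterd.conf), and binding to 0.0.0.0 rather than a specific address is a choice, not something taken from the conversation:

    # point the [webserver] interface at something reachable, then bounce the daemon
    sudo sed -i 's/^interface = 127\.0\.0\.1/interface = 0.0.0.0/' /etc/opscenter/opscenter.conf
    sudo service opscenterd restart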
[14:47:51] btw: [14:47:57] jconsole-proxy analytics1001 7199 7199 analytics1001.wikimedia.org [14:48:01] works as expected. [14:51:33] oh. [14:51:44] i just realized why we were getting that weird squid page from the shell. [14:52:01] http_proxy=http://brewster.wikimedia.org:8080 [15:06:58] aye yeah [15:07:08] that should not be set on an01, but I think it is right now [15:07:19] man, i've got opscenter-agent up and running on 6 of the nodes [15:07:25] but still am seeing nothing in opscenter gui [15:09:35] hm. [15:10:05] trying gui installer... [15:11:29] want me to look first? [15:13:08] psh, that worked [15:14:06] snerk. [15:14:18] did you start the agent service everywhere afterward? [15:15:10] yeah [15:18:23] milimetric: you have mail? [15:18:27] what's his TZ? [15:18:44] mail? [15:18:50] i guess he's up because he said good morning! [15:19:02] milimetric: yes. the fast version not the snail version [15:19:27] I have a wikimedia.org account, is that what you mean [15:19:30] ottomata, is iptables allowing 80? [15:19:42] dandreescu@wikimedia.org [15:19:44] milimetric: have you checked your inbox recently? ;) [15:20:37] only from cluster IPs, i think [15:20:41] so you have to proxy [15:20:54] 8085 is the only port allowed from outside of cluster [15:21:07] jeremyb: what's my token? [15:21:12] milimetric: empty string [15:22:54] cool, thanks [15:24:12] ottomata: boo, even 80? [15:24:49] yup [15:25:01] why you got a prob? [15:25:13] just thinking about ways to make our lives easier. [15:33:48] hey jeremyb, why couldn't you create an account for stefan? [15:34:37] drdee: i asked sumanah about the case of "also wants an svn acct" and she said to leave it and she'd do the whole thing herself [15:34:54] k [15:34:56] ty [15:58:03] changing locations, eating food, be back on for standup [16:19:42] * milimetric lunch [16:25:40] hi average_drifter [16:25:49] I've created Stefan's Labs account but the Subversion account process requires an ssh key https://www.mediawiki.org/wiki/Developer_access/Subversion#Requesting_commit_access [16:47:25] average_drifter ^^^ [16:47:30] sumanah: thanks [16:56:46] https://plus.google.com/hangouts/_/747c161e540f46a13aa312b2f1ab701b9ec441b3 [16:56:56] https://plus.google.com/hangouts/_/747c161e540f46a13aa312b2f1ab701b9ec441b3 [17:02:10] https://plus.google.com/hangouts/_/747c161e540f46a13aa312b2f1ab701b9ec441b3 [17:02:25] milimetric ^^ erosen [17:21:04] sorry about that [17:21:10] got caught up in a meeting with jessie [17:22:08] she can get out of hand at times. [17:22:10] wild, even. [17:28:37] ok, dschoon, should I go ahead and push the seed change and restart cassandras? [17:28:40] i'll stop them all [17:28:44] and then start an01 and an06 [17:28:46] sure. [17:28:46] let them chat [17:28:49] then start each one [17:28:53] sounds great. [17:29:29] i'd like to try walking through the "automated deployment" myself, just to see wtf it says [17:29:52] the puppet thing? [17:29:54] yeah you should do it [17:30:01] well, that too, someday. [17:30:08] but at the moment, i was referring to the "fix" button [17:30:11] in the DSE UI [17:30:18] oh [17:30:27] it doesn't say much [17:30:36] (i'll also note that an01 has succeeded in finally finding its way to its own public interface, and its agent now shows up) [17:30:37] it asks for your username and private SSH key [17:30:46] (i have a key that I generated on an01 that I used) [17:30:54] i'd like to diff the conf files it generates [17:31:00] just curious how much of it comes from gossip. 
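Spelled out as a script, that restart plan looks roughly like this (a sketch; the dsh group name and the dse init script are borrowed from commands that appear later in this log, and the sleep is just an arbitrary settling period):

    # stop DSE/cassandra everywhere
    dsh -g k 'sudo service dse stop'
    # bring the two seed nodes up first and let them gossip with each other
    for h in an01 an06; do ssh "$h" 'sudo service dse start'; done
    sleep 120
    ssh an01 'nodetool -h localhost ring'
    # then start the remaining nodes
    for h in an02 an03 an04 an05 an07 an08 an09 an10; do ssh "$h" 'sudo service dse start'; done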
[17:31:10] hm [17:31:21] i noticed there's an opscenter_agent_monitor daemon [17:31:24] should I stop the opscenter-agents too? [17:31:24] which is kinda "buh?" [17:31:27] yes. [17:31:31] ok, how about [17:31:33] I stop everything [17:31:37] start up the cassandra cluster [17:31:40] and then let you mess with opscenter? [17:39:24] did the wifi in SF die? [17:40:52] ha, must have [17:50:53] what happened? network out? [17:53:33] ok dschoon, all cassandras restarted and reporting in [17:53:43] do you want to play with the gui installer? [17:53:49] for opscenter? [17:54:06] actually, so, when I had originally started opscenter [17:54:13] I had entered the IPs of each node in the gui [17:54:21] but then I wanted to see if I could make it display hostnames [17:54:26] so i edited the cluster and put in the hostnames [17:54:29] didn't seem to make a difference [17:54:30] so I left it [17:54:35] but maybe that is confusing things [17:54:37] I will change it back to IPs [17:55:40] dschoon, lemme know if you want to hit the fix button (if you don't I will :p ) [17:58:28] oook i'm doing it :p [18:01:09] go ottomata, go~ [18:17:54] interesting, dschoon, now that I've made it so that only an01 and an06 are listed as seeds [18:17:59] those are the only machines that are connected in opscenter [18:23:13] yargh blarg [18:27:17] lol [18:27:31] an10 cassandra was reporting as down [18:27:33] dunno why [18:27:35] restarting it now [18:27:46] and you are right about disabling ssl [18:27:48] i did that again [18:27:55] and log messages about opscenter are happier [18:28:01] even though it still doesn't seem to work [18:28:24] (hooray internet) [18:28:24] hokay, lemee look. [18:28:34] oh more are connecting now? [18:28:34] hmm [18:28:37] 4 of 10 connected [18:28:53] weird [18:29:49] btw, now that my ssh agent keys are better [18:29:52] dsh is awesome [18:30:19] dsh -g k 'sudo tail -f /var/log/{cassandra,opscenter,opscenter-agent}/*' [18:30:30] i think the entirety of the office is tethered over Ryan_Lane's mifi [18:30:35] hahhaa [18:30:43] yeah, i don't think the office can handle that right now [18:30:48] it's having trouble with ... jpg [18:32:04] maybe opscenter agents just take forever to start working together [18:32:09] now there are 5/10 [18:32:22] i think there's a polling threshold [18:33:38] sigh. http://www.datastax.com/docs/opscenter/configure/configure_opscenter_adv#stat-reporter-interval [18:33:51] the only documentation about this property is that you can set it to 0 to disable it? [18:33:52] really? [18:34:13] wha!? [18:34:17] it phones home to datastax?! [18:34:24] to ops-center :P [18:34:31] oh! [18:34:36] By default, OpsCenter periodically sends usage metrics about your cluster to DataStax Support [18:34:38] yeah. [18:34:42] i think to Datastax [18:34:42] that is, in fact, what it says. [18:34:43] booo [18:34:45] i think you are right. [18:34:51] TIME TO SET THAT TO ZERO [18:35:46] yeah, I will puppetize all of this once we get it [18:35:52] um, maybe we need to set cassandra seed_hosts? [18:35:53] no hm [18:36:02] that probably has nothing to do with opscenter-agent problems [18:36:03] right? [18:36:10] there are two ways opscenterd is getting its data [18:36:15] one is by talking to cassandra gossip stuff [18:36:18] and it looks like that is all working [18:36:22] since it reports 10 nodes [18:36:28] and the other is by talking to opscenter-agent [18:36:31] which is only half work [18:36:31] i cannot imagine they're related. [18:36:32] working [18:36:40] also: you only listed 2 seeds, right? 
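For the phone-home setting, the page linked above boils down to something like the following (a sketch; the [stat_reporter] section name follows that docs URL, but exactly which opscenter conf file it belongs in is an assumption):

    # disable periodic usage reporting to DataStax, then restart opscenterd
    sudo tee -a /etc/opscenter/opscenter.conf >/dev/null <<'EOF'
    [stat_reporter]
    interval = 0
    EOF
    sudo service opscenterd restart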
[18:36:46] yet we see 4 agents. [18:36:48] for cassandra, yes [18:36:54] not for opscenter [18:36:57] er [18:37:04] i didn't list any nodes for opscenter, except in the gui [18:37:08] where I listed them all [18:38:11] lemee look [18:38:29] (god iterm2 is great) [18:38:38] in the gui, I listed the IPs in Edit Cluster... [18:38:44] btw, do you know the watch command [18:38:45] it's super nice [18:38:55] ja [18:38:58] i'm doing this in a bit of my terminal [18:38:58] watch -d -n 3 "sudo nodetool -hlocalhost ring | sed -e '1,2d' | awk '{print \$1 \"\t\" \$4}' | sort" [18:39:11] prob don't need sudo [18:39:12] but whatever [18:39:25] (you don't) [18:39:33] what is this? [18:39:33] an06: ERROR [Jetty] 2012-09-18 18:39:10,381 Exception running Jetty, restarting: java.net.BindException: Address already in use [18:39:39] that is from an opscenter agent log [18:39:54] sooo... yeah. /etc/opscenter/clusters/KrakenAnalytics.conf has all 10 in seed_hosts [18:40:29] that file is put in place by the gui [18:40:47] *nod* [18:40:53] i think i agree with that at this point. [18:40:56] let's see if it changes :) [18:41:05] if we change it? [18:41:08] manually? [18:41:19] i deleted 09 [18:41:21] from the GUI [18:41:25] ok [18:41:28] yeah its gone [18:41:31] in the file [18:41:58] still says 10 active nodes in dashboard though [18:41:59] so that is good [18:42:02] hilariously [18:42:03] yes. [18:42:04] that. [18:42:08] take all of them out but an01 and an06 [18:42:16] although I doubt this will do anything to opscenter agent prob [18:42:29] done. [18:42:53] now let's give it a while. [18:43:03] so, i have no evidence [18:43:06] but i have a suspicion. [18:43:46] eh? [18:43:48] that suspicion involves jsvc and the ability for different applications to run in the same VM. [18:43:57] because the agent *needs* to run in the same VM. [18:44:02] hm [18:44:04] otherwise it cannot access the information it wants. [18:44:07] ok... [18:44:12] and it isn't? [18:44:21] this is why the agents MUST come up after dse does. [18:44:27] well, it would explain why they're "running" [18:44:44] but not sending anything to opctr [18:44:53] and therefore it thinks they're down. [18:45:01] hm [18:45:15] the only node that I restarted cassandra on since I started opscenter-agent was an10 [18:45:19] so I will restart opscenter-agent there [18:46:01] so there's this opscenter_agent_monitor process somewhere [18:46:02] how can you tell if they are running in the same vm? [18:46:11] i think that is restarted by the same init script, no? [18:46:22] i'm trying to figure that out now. i've not worked with jsvc before [18:47:16] oh, all opscenter monitor is doing [18:47:22] is monitoring the opscenter-agent proc [18:47:27] and restarting it if it is not running [18:48:10] basically: [18:48:23] while [ 0 ]; do [18:48:23] sleep 30 [18:48:23] [18:48:23] /etc/init.d/opscenter-agent start &>/dev/null [18:48:51] hm, anyway, iunno, let's watch this for a bit [18:48:53] fuck. [18:48:57] i bet you anything [18:49:01] oh? [18:49:03] it has to do with goddamn process privileges [18:49:11] and it can't attach or something [18:49:20] hm, you think? [18:49:20] 'cause they want us to run it as a certain user [18:49:22] all this crap [18:49:27] why would it work on some but not others then?
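Filled out, the watchdog loop paraphrased above amounts to something like this (a sketch of the idea, not the actual monitor script):

    # keep opscenter-agent alive; the init script should be a no-op when the agent is already running
    while true; do
        /etc/init.d/opscenter-agent start &>/dev/null
        sleep 30
    done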
[18:49:37] i'm just trying to think of things that might be different about setup [18:49:53] perhaps there's a race for the pidfile [18:49:55] they're all running as root right now anyway [18:50:01] no [18:50:04] dse forks [18:50:07] and becomes cassandra [18:50:09] yeah as cassandra [18:50:09] yeah [18:50:17] but i bet the pidfile is owned by root or something [18:50:19] cassandra seems to be fine though [18:50:24] and it depends on which one of them writes it [18:50:38] easy enough to test [18:51:06] -rw------- 1 root root 5 2012-09-18 18:27 /var/run/dse.pid [18:52:02] hrm. [18:52:31] all owned by root. [18:52:34] i see no pattern. [18:53:20] yeah, which is fine [18:53:23] i think [18:53:31] since the daemon is managed by root, but the proc runs as cassandra [18:53:40] *nod* [18:53:59] it was an ok hypothesis [18:54:49] you restart anything on 9? [18:54:51] it came back up. [18:55:29] it was down? [18:55:30] and no [18:55:39] i mean to say, its agent appears now [18:55:58] ah yeah [18:55:59] hm [18:56:00] weird [18:56:03] nope, didn't touch it [18:56:09] ok i unno [18:56:13] maybe opscenter is just slow to start up [18:56:19] mind if I start playing with hadoop? [18:56:22] cassandra seems fine [18:59:00] brb [19:06:20] weird, an05 cass is showing as down again [19:06:23] having trouble restarting it [19:06:24] hm [19:08:41] Hey - I have a stats.wikimedia question, to which I'm not sure if anyone in here will know the answer, but I thought I'd check :) [19:09:06] All the historical numbers for PTWP (http://stats.wikimedia.org/EN/TablesWikipediaPT.htm) changed today, and I'm not really sure why… [19:18:08] dschoon [19:18:13] so, two things [19:18:28] 1 i think hadoop/cassandra consistency level is set to something weird by default [19:18:35] because for some reason, nodes keep flapping [19:18:40] and with even 1 down [19:18:44] hadoop fs doesn't work: [19:18:49] ls: could not get get listing for 'cfs:/user/otto/logs' : UnavailableException() [19:19:15] 2 [19:19:19] why do they keep flapping!? [19:19:23] they look fine to me [19:19:45] the procs are still running, and there are no logs on the downed machines that show anything wrong [19:19:51] but they are gossiping as down [19:43:01] hokay. [19:43:05] back from a forced lunch. [19:44:16] i thought about the flapping during lunch, ottomata [19:44:22] yeah? [19:44:36] (btw i am running a pig job right now :) ) [19:44:38] i'm gonna do some jconsole archeology [19:46:37] what's the URL for the namenode? [19:51:07] WAAHHHh they are down again [19:51:09] ummmmmmmmm [19:51:11] good question [19:51:16] DSE abstracts that crap [19:51:24] job tracker is on an02 it looks like [19:53:00] okay, so. [19:53:05] this is definitely bunk. [19:53:14] *something* is actively wrong. [19:53:23] and i think we might have to think systemically about it [19:53:30] and test each brick in the wall [19:53:36] yeah [19:53:52] at the moment, i don't think it's dse [19:54:20] network being weird? [19:54:22] other than the problem bringing up the nodes in the veeeery beginning, i haven't seen any issues. [19:54:24] *nod* [19:54:34] current leading candidates are: [19:54:59] - the init crap that launches jsvc [19:55:06] - opscenter itself [19:55:08] - network [19:55:15] - os configuration [19:55:28] hm, well there are probs with cass too [19:55:29] not just opscenter [19:55:32] which is more worrisome [19:55:33] oh? [19:55:39] yeah, cass nodes keep flapping [19:55:42] at least they gossip as down [19:55:45] how do you know? 
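One way to answer that from every node's own point of view at once (a sketch reusing the dsh group seen earlier; -M just prefixes each output line with the host name):

    # each node reports which peers *it* currently gossips as Down
    dsh -g k -M 'nodetool -h localhost ring | grep Down'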
[19:55:51] i mean, opscenter reports them as down [19:55:53] i'm watching logs and nodetool ring [19:55:58] but i run nodetool ring, and they seem ok. [19:56:04] an08 was just down for about 3 mins [19:56:07] hm. [19:56:11] okay. that's a good lead. [19:56:12] right now, all up [19:56:21] ah [19:56:22] two other places to look. [19:56:23] an05 is down now [19:56:41] and, since there are no logs on the offending hosts about problems [19:56:42] there's the #1 suspect with any intermittent java problem: jvm config [19:56:45] namely: gc [19:56:47] hm [19:56:49] really? [19:56:51] yes. [19:56:56] gc is life or death. [19:56:56] i dunno, because this doesn't seem to actually go down [19:56:58] it stops the world. [19:56:59] as far as I can tell [19:57:02] the process keeps running [19:57:04] which means no network [19:57:05] no logs on the offending host [19:57:08] no anything happens. [19:57:13] it seems more like gossip just reports them as down [19:57:19] sure, the *process* keeps running [19:57:20] gossip is being too sensitive [19:57:21] or [19:57:27] there are network problems [19:57:27] but the jvm spends all its time cleaning up garbage [19:57:31] this is why tuning is so important [19:57:42] but! these nodes are idle [19:57:45] what, you think the process is just timing out because the jvm is busy? [19:57:50] so we'll take a look, but that seems REALLY unlikely. [19:57:50] so it's not reporting to gossip? [19:58:01] but it's never wise to ignore GC settings [19:58:02] an03 is down atm [19:58:07] and i've not looked closely at them [19:58:08] hm. [19:58:11] okok, one sec [19:58:14] one last idea [19:58:26] an01: INFO [HintedHandoff:1] 2012-09-18 19:58:04,131 HintedHandOffManager.java (line 284) Endpoint /10.64.21.103 died before hint delivery, aborting [19:58:32] who are you watching? [19:58:34] just an01? [19:58:41] all logs with dsh [19:58:48] dsh -g k 'tail -f /var/log/cassandra/*' [19:58:49] does everyone report that, or just an01? [19:58:59] more than just an01 [19:59:09] scrolling back [19:59:11] hm. and you're running a pig job right now? [19:59:22] yeah, but i can't imagine it is still working [19:59:27] whatever replication/consistency settings are the default [19:59:30] if any nodes are down [19:59:31] anyway -- i would also hazard that since we're seeing intermittent problems on every box, we need to consider things that touch every box. [19:59:34] hadoop fs fails [19:59:37] so any shared confs or crons [19:59:43] but in my console it looks like it is still running [19:59:51] yeah, it's every box [19:59:58] shared confs? [19:59:59] crons? [20:00:06] i doubt there *are* any [20:00:11] yeah don't think so [20:00:16] but i'm saying we're seeing periodic slowdown and dc [20:00:24] across all nodes [20:01:38] which either means a systemic problem: network; something faulty in the rack (power?); and places where they have the same configuration or software [20:02:25] so i guess we both know what to do. [20:02:33] first order of business is to start shutting things off [20:02:34] simplifying [20:02:38] reducing variables. [20:02:51] let's turn off opscenter and all the agents [20:02:54] and rerun the pig job [20:03:57] ok [20:04:02] shutting down opscenter [20:04:10] at least then we know it's not that stuff. [20:04:34] oops, i just stopped cassandra too [20:04:37] haha [20:04:39] s'ok. 
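Ruling the GC theory in or out is cheap to check; a sketch (the jsvc process lookup, the availability of a JDK's jstat, and the log path are assumptions):

    # watch GC activity on one node; a long jump in full-GC time would line up with a "down" report
    pid=$(pgrep -f jsvc | head -1)
    sudo -u cassandra jstat -gcutil "$pid" 1000
    # cassandra also logs long collections itself
    grep -i GCInspector /var/log/cassandra/system.log | tail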
[20:04:39] oh well, good way to start :p [20:04:44] a clean up/down is fine [20:04:48] probably desirable [20:04:51] yeah [20:05:34] i think if we still see the problem, we should next shut down all the nodes but two [20:05:36] and see if it happens [20:05:43] then keep adding one until we see it again. [20:06:51] actually. [20:07:01] if the test fails with only dse running [20:07:15] i think we should next try shutting off iptables everywhere [20:07:15] man, 3 of the cassandras won't stop [20:07:18] with service script [20:07:24] ok [20:07:27] give em time [20:07:38] hadoop takes forever to shut down. [20:07:50] i also worry about us using an01 as a bastion [20:07:55] yeah the logs i'm seeing are hadoop messages [20:07:57] i feel like we should take it out of the ring. [20:08:08] why? [20:08:19] i mean, we're streaming megs and megs of logs (each!) from everywhere, through there, then to us [20:08:30] oh hm, yeah true [20:08:32] it's noise that could be affecting the outcome [20:08:36] ok, i'm fine with that [20:08:38] schroedinger's box! [20:08:43] can we start over from the beginning then? [20:08:47] right now the clusters all know about an01 [20:08:48] because we observe it, it happens :) [20:08:50] how do you take it out? [20:08:54] nodetool [20:09:00] after the others have started up? [20:09:05] do we have to start up an01 and then take it out? [20:09:10] i have loaded a single sampled log in [20:09:11] well [20:09:14] easier solution [20:09:18] take everyone down [20:09:23] update the conf in puppet [20:09:25] and change the cluster name [20:09:29] hm [20:09:42] it means nobody will have a cached copy of anything [20:09:50] so they'll make no assumptions [20:09:59] they'll all figure out new ring positions, etc [20:10:08] assign new replicas [20:10:26] those are the two things that are affected by an01 in the cluster [20:10:30] hmmmmmmmmmmmmmmmm, ok [20:10:47] it also allows us to preserve this [busted] cluster [20:10:50] for forensics [20:10:53] hmm [20:11:07] since the new cluster will just live alongside the old one. [20:11:21] fyi, if we really wanted to take an01 out of the cluster, you'd run http://www.datastax.com/docs/1.1/references/nodetool#nodetool-decomission [20:11:29] $cassandra_cluster_name = "KrakenAnalytics2" [20:11:32] $cassandra_seeds = [ [20:11:32] "10.64.21.102", # an02 [20:11:33] "10.64.21.106", # an06 [20:11:33] ] [20:11:33] eh? [20:11:49] word [20:11:57] looks good. [20:13:18] still waiting for an07 to shut down [20:13:25] go go hadoop! [20:13:32] can I just kill it? [20:13:34] yes. [20:13:37] cassandra won't care. [20:13:45] hadoop might be pissy, but whatever. [20:14:04] ok, running puppet on 02 and 06 first, to start up seeds [20:14:05] (they are like the most temperamentally mismatched married couple ever) [20:14:29] ...wait, does puppet start things? [20:14:36] yes, but not restart [20:14:36] are we SURE it doesn't just run randomly and ruin things? [20:14:41] i turned that off [20:14:49] but ops-puppet does? [20:14:53] ? [20:15:00] ops puppet doesn't know about cassandra or dse [20:15:01] so no [20:15:02] but [20:15:04] currently our puppet [20:15:05] (don't we still have ops-puppet as well as anal-puppet?) [20:15:06] if you run puppet and cassandra is off [20:15:09] it will start dse [20:15:11] yes [20:15:11] heh [20:15:14] how nice of it. [20:15:15] okay. [20:15:20] we can turn that off too, but i thought that was fine [20:15:21] let's monitor that, just to be safe. 
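For the record, the decommission route linked a little earlier would look roughly like this (a sketch; it assumes an01 is 10.64.21.101, following the an0N numbering used above):

    # run on the node leaving the ring (an01), with the rest of the cluster up;
    # it streams its ranges to the remaining replicas before dropping out
    nodetool -h localhost decommission
    # afterwards, no other node should still list it
    dsh -g k 'nodetool -h localhost ring' | grep 10.64.21.101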
[20:15:28] previously the service was subscribed to the config files [20:15:30] so if you changed a config [20:15:34] puppet would restart dse [20:15:37] i don't like that [20:15:41] that's cool for things like virtual hosts [20:15:43] but not for databases [20:15:43] i mean, we have -Xms=8G or something, right? [20:16:13] hehe [20:16:14] an06: org.apache.cassandra.config.ConfigurationException: Saved cluster name KrakenAnalytics != configured name KrakenAnalytics2 [20:16:34] so if puppet was randomly trying to start another copy, that copy would first allocate 8G of RAM, write a bunch of files, and then attempt to grab interfaces, which would fail, and then it'd die. [20:16:47] that sounds like a good way to get giant, random pauses [20:16:56] well, our puppet is not running unless we tell it to [20:16:59] yeah. [20:17:08] so let's... monitor :) [20:17:10] ok [20:17:17] just you know, check the touch times every once in a while? [20:17:19] also it doesn't restart if the proc is running [20:17:23] (there's a lastrun file, right?) [20:17:25] of the .pid file? [20:17:48] dunno. is there a way to know when *our* puppet last ran? [20:18:56] (i guess maybe i should learn about puppet, eh?) [20:20:03] /var/lib/puppet.analytics/state [20:20:17] last_run_summary.yaml [20:20:33] anyway, i can't start this cluster with a different name [20:20:59] sweet. [20:21:01] also: sweet. [20:21:09] where are you? i'll come look. [20:21:14] um, an02 [20:21:19] also, tail the cassandra logs [20:21:40] sudo tail -f /var/log/cassandra/* [20:22:33] an02: org.apache.cassandra.config.ConfigurationException: Saved cluster name KrakenAnalytics != configured name KrakenAnalytics2 [20:22:39] word. [20:22:51] can we just keep the same name and decom an01? [20:22:58] it might take a while. [20:23:00] how much data is there? [20:23:00] hm [20:23:04] 2.5 G [20:23:06] is all I loaded [20:23:11] hm. [20:23:12] let's find out! [20:23:13] it's a fun task! [20:23:13] ha, ok [20:23:14] okok [20:23:15] step 1. [20:23:18] changing name back to orig [20:23:22] yes. [20:23:30] and let's both get some jconsole action pointed at an01 [20:23:34] (once it's up) [20:24:01] ok, i'm going to run puppet on an01 [20:24:09] this will start cass [20:24:26] you update the cluster name everywhere? [20:25:02] not really, but cass hasn't started there yet [20:25:05] whenev pup runs it will [20:25:09] mk. [20:25:51] ok, cass running on an01 [20:25:58] should I start an02 and an06? [20:26:10] we need to start them all to decom an01, right? [20:26:43] jc analytics1001 7199 7199 analytics1001.wikimedia.org & [20:26:54] yeah, you need them all up. [20:27:17] ok jconsole on [20:27:48] ok starting 02 then 06 [20:28:17] with updated conf? [20:28:26] actually. [20:28:27] wait. [20:28:31] why are we doing this? [20:28:40] ? [20:28:43] so we can decom 01 [20:28:44] ? [20:28:54] why don't we just wipe the state? [20:28:56] how? [20:29:05] you just delete the files. [20:29:08] they're in ... [20:29:13] the data? the commitlog? [20:29:19] /var/lib/cassandra [20:29:19] ? [20:29:36] yep! [20:29:40] i'm looking now. [20:29:57] awww [20:29:59] i wanna decom! [20:30:03] come oooon lets do it [20:30:06] it'll be fun [20:30:14] haha [20:30:16] fine. [20:30:55] ok starting 06 [20:33:03] waiting for 06 to show as up in ring [20:35:10] grrr [20:35:18] still shows as down :( [20:35:21] who does 6 know about? 
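That question maps fairly directly onto nodetool (a sketch; gossipinfo dumps every endpoint a node has heard about along with the status it believes they advertise):

    # from an06: which endpoints does it know, and what state does it think they're in?
    ssh an06 'nodetool -h localhost gossipinfo'
    ssh an06 'nodetool -h localhost ring'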
[20:35:24] cassandra is an exercise in patience [20:35:25] oh good q [20:35:43] (honestly, this stuff was all near-instant with an empty DB back when i did it last) [20:35:47] (it's all about good configuration) [20:36:01] well, you didn't do it with DSE, right? [20:36:06] so who knows what defaults they changed [20:36:16] also, i just realized we are running a slightly outdated version of DSE [20:36:23] heh [20:36:25] since I installed DSE originally back in may or whenever that was [20:36:31] god, i'd die if that was the reason [20:36:46] hmmm [20:36:46] nodetool -hlocalhost ring [20:36:47] Error connection to remote JMX agent! [20:36:50] java.net.SocketTimeoutException: Read timed out] [20:37:00] should I try to restart it [20:37:02] ? [20:37:03] cass? [20:37:17] no. [20:37:20] just chill for a bit. [20:38:27] was this one of the ones you forced down? [20:38:36] hmm, naw [20:38:38] i only forced 07 down [20:38:44] it did take a while to go down I think though [20:38:46] but I didn't force it [20:39:04] dstat says nothing is going on. [20:39:05] doo dee doo [20:39:05] 10.64.21.106 Down [20:39:15] lemme restart cass again [20:39:18] cmooon [20:39:21] impatience [20:39:23] give in [20:39:35] may i? [20:39:38] yup [20:40:04] huh [20:40:10] netstat says jconsole is up [20:40:12] i wanna look first. [20:43:04] k [20:44:03] no dice, it seems. [20:44:11] can't connect. [20:44:47] restart it! [20:45:53] yeah, that jvm is fucked. [20:47:03] thumbs are a-twiddlin [20:47:11] i'ma kill -9 it [20:48:42] jaaaaaaaaa do it [20:51:44] killed, then ran service dse start, and now we're at the same place as before. [20:51:49] yup [20:51:55] can we wipe em now? [20:52:00] fiiiine [20:52:09] aiight. [20:53:05] you wipin? [20:53:09] i dunno what to wipe [20:53:22] first we have to shut them all down. [20:53:46] hehe [20:53:53] just 1 and 2 up [20:53:54] ah [20:53:57] i see you shutdown 1 [20:53:58] hehe [20:53:58] dsh -g kk sudo killall -9 jsvc [20:54:02] oh nastay [20:54:08] crash only! [20:54:26] an03: jsvc: no process found [20:54:26] an07: jsvc: no process found [20:54:26] an04: jsvc: no process found [20:54:27] an09: jsvc: no process found [20:54:29] an08: jsvc: no process found [20:54:31] an05: jsvc: no process found [20:54:33] an10: jsvc: no process found [20:54:35] ;) [20:54:39] mmmmk [20:54:46] so what [20:55:02] k [20:55:04] rm -rf /var/lib/cassandra/commitlog/* /var/lib/cassandra/data/*/* [20:55:05] ? [20:55:05] they're all off [20:55:18] i was planning: [20:55:20] rm -rf /var/lib/cassandra/* [20:55:24] no don't do that [20:55:26] i checked -- no jars [20:55:27] they are mounted partitions [20:55:28] no config [20:55:30] oh. [20:55:32] LAME. [20:55:41] /dev/sde1 276G 368M 261G 1% /var/lib/cassandra/commitlog [20:55:41] (won't it just tell me to fuck off?) [20:55:41] /dev/sdf1 280G 5.1M 280G 1% /var/lib/cassandra/data/f [20:55:41] /dev/sdg1 280G 5.4M 280G 1% /var/lib/cassandra/data/g [20:55:41] /dev/sdh1 280G 4.3M 280G 1% /var/lib/cassandra/data/h [20:55:41] /dev/sdi1 280G 4.3M 280G 1% /var/lib/cassandra/data/i [20:55:41] /dev/sdj1 280G 4.3M 280G 1% /var/lib/cassandra/data/j [20:55:45] maybe? [20:55:46] (i *can't* do that, right?) [20:55:57] i dunno [20:56:01] just do [20:56:05] frumple. [20:56:06] fine fine. [20:56:16] rm -rfv /var/lib/cassandra/{commitlog,data/*/*} [20:56:26] or you! :) [20:56:27] o [20:56:29] ok i'll do it [20:56:37] i love rm -v [20:56:40] it's so ... satisfying [20:57:28] k done [20:57:29] dsh is great! 
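A quick way to confirm the wipe left the mounts empty across the board (a sketch, reusing the same dsh group):

    # data and commitlog mounts should report next to nothing used
    dsh -g k -M 'du -sh /var/lib/cassandra/commitlog /var/lib/cassandra/data/*'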
[20:58:08] I <3 it so much [20:58:17] i should fix dsync [20:58:20] ok [20:58:24] so if we start an02 and an06 now [20:58:27] they should be brand new? [20:58:31] and only know about each other? [20:59:32] yep. [20:59:39] ok, trying that [20:59:41] provided we've updated the conf everywhere [20:59:45] yes [20:59:51] well, not everywhere [20:59:54] but I am starting via puppet [21:00:02] so puppet will update confs before starting dse [21:00:17] the only conf change is that we are using an02 and an06 as seeds [21:00:17] right? [21:00:21] still the same cluster name [21:00:28] er [21:00:31] hm, is this a problem? [21:00:31] an02: INFO [main] 2012-09-18 21:00:19,906 Mx4jTool.java (line 72) Will not load MX4J, mx4j-tools.jar is not in the classpath [21:00:32] also remove an01 from the class [21:00:38] of machines that run dse [21:00:41] yeah [21:00:43] no, that's fine. [21:00:45] ok [21:00:56] i'm curious if it's on every machine though [21:01:57] brb [21:02:11] 06 did it [21:02:15] mx4j-tools.jar is not in the classpath [21:02:28] right. [21:02:34] i read that recently. [21:02:51] hmmmmmm [21:02:58] an02 and an06 are not talking to each other [21:03:09] give em a sec [21:03:15] this is the slowest part. [21:03:22] they're being careful not to overwhelm the network. [21:03:30] btw, do we have multicast enabled? [21:03:39] for? [21:03:40] cass? [21:03:45] iunno [21:04:23] i'm just thinking that knowing our people [21:04:28] everything is off by default [21:04:32] no matter how unreasonable [21:04:46] it's now up [21:04:52] yay cool they know about each other [21:04:53] cool [21:05:01] they see each other [21:05:06] they are friends! [21:05:21] should I start another? [21:05:29] you can start the other 7, sure. [21:05:33] k [21:05:35] errybdy but 01 [21:05:39] aye [21:07:59] hmm, an02 logs say an03 came up and is now part of cluster [21:08:02] ring doesn't show it [21:08:03] waiting... [21:08:22] who did you ask? [21:08:33] 2 and 3 [21:08:39] even 3 doesn't show itself...? [21:09:37] 3 isn't up [21:11:45] i think you are right sir [21:12:05] the proc is running though [21:12:07] grrr i don't like this [21:12:09] killing proc [21:14:06] hm, yeah still [21:14:08] what's up with 03? [21:15:03] let me look into it. [21:15:25] k, i'm probably heading out purty soon [21:16:45] jaaaa, i gotsta go [21:16:48] good luck fiddling [21:16:52] lemme know what you figure out [21:16:59] aiight. [21:17:05] i'm proally gonna call it quits soon. [21:17:11] and think. [21:17:20] thinking is often better than grinding with these things :) [21:17:37] mk cool [21:17:44] maybe i should just reinstall the whole thing [21:17:46] upgrade DSE anyway [21:18:44] i think that's probably going to be Thing #1 tomorrow [21:20:37] yeah [21:20:39] will do that then [21:20:40] oook bye [23:50:27] hello [23:50:34] had to crash, have a bad cold [23:50:39] drdee: you there ?