[13:02:50] Goooooooooooooood morning analytics! [13:26:05] finally, it's morning again! ! morning ottomata, milimetric, average_drifter [13:26:13] good morning good sir [13:26:46] hihaaaaaa [13:27:11] so…………. what's cooking today, lemme think [13:29:38] ottomata, can i help you with fixing the snappy byte counter on an11? [13:30:23] hmmm, i haven't looked at it, so I don't know [13:30:26] but I can look at it! [13:30:28] today! [13:30:35] in just a few minutes! as soon as the coffee is ready [13:30:37] which is soon! [13:33:09] nice, my coffee is also ready [13:33:31] do you have a nice coffee machine or you are a topic expert for that matter? [13:35:41] average_drifter, what is actually the origin of your handle? [13:36:01] drdee: hey [13:36:14] whhaaazzzzup? [13:36:14] drdee: drifter as in hobo :D [13:36:59] i beg to disagree :) [13:38:38] http://en.wikipedia.org/wiki/Hobo [13:38:47] i have a machina [13:38:52] for coffee [13:38:59] i grind the beans with cardamom pods [13:39:06] and heat up cream with maple syrup [13:41:06] sounds fancy to me and would love to try your coffee one day! [13:42:41] average_drifter, shall we go through the asana backlog? [13:42:46] i'll start a hangout [13:42:52] https://plus.google.com/hangouts/_/1962b20ec0622b81c0fd43d54ceb5fbbd372bccb [13:48:46] average_drifter ^^ [14:17:03] milimetric, git question in https://plus.google.com/hangouts/_/1962b20ec0622b81c0fd43d54ceb5fbbd372bccb [14:17:14] k [14:25:15] ok, drdee, hopefully inputcount is fixed [14:25:23] python was buffering the output [14:25:25] so it was all working [14:25:32] but since there isn't much output [14:25:34] (just hourly counts) [14:25:40] nothing every got written to the file [14:25:47] so I started python with -u [14:25:52] to make it unbuffered [14:26:19] :) [14:59:01] yoyo ottomata, on stat1, can you configure the 'stats' user to be able to push to gerrit for the /a/wikistats_git repo? [15:25:54] did you guys resolve it? [15:27:06] average_drifter, let me push some more changes form stat1 to gerrit [15:27:11] and then just start clean [15:29:07] drdee: ok [15:29:25] drdee: we agreed to just make a new git review [15:29:42] drdee: but I'm going to wait for the changes from stat1 [15:32:52] yes do a new git review [15:33:01] and pull changes from stat1 first [15:50:06] FYI: Already cleared this with Andrew, but the analytics1011-1022 systems are coming offline today (now) for shipment back and replacement by dell with far more stable less crappy R720XD systems [15:50:06] in case you guys pay attention to the nagios messages, you will see analytics machines go down shortly. [15:50:27] yup [15:50:49] can you wait for 5 minutes with an11? [15:50:53] of course [15:50:56] need to copy some stuff from it [15:50:57] 1011 i can unrack dead last [15:51:11] will be hour or more, as each one has to be popped open and have the SSD pulled [15:51:18] ottomata around? [15:51:25] thanks RobH! [15:51:25] c2100s are crap design, have to remove two dozen damned screws to do that [15:51:32] :) [15:51:46] i planned to leave it for last anyhow, so will say smoething in here before i pull it [15:52:06] the rest i am just going to pull all power and cables for in the same group [15:52:15] okay and can we do a ritual with the last one? [15:52:20] you want closure eh? [15:52:29] i think i am not the only one :D [15:52:44] these things have dark juju man, they pulled blood on pretty much everyone to touch them. 
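Editor's note: a minimal sketch of the buffering problem described above (illustrative only; the real inputcount filter is not shown in the log). When stdout is redirected to a file it is block-buffered, so a script that emits only one small line per hour may appear to write nothing for a long time. Starting the interpreter with `python -u`, or flushing explicitly, gets the counts onto disk right away.

```python
#!/usr/bin/env python
# Illustrative sketch, not the actual inputcount script: emit one small
# "hour<TAB>count" line per hour. With stdout redirected to a file, default
# block buffering can hold this output back indefinitely; running the script
# with `python -u` (unbuffered) or flushing after each write avoids the
# empty-file symptom described above.
import sys
import time

def emit_hourly_count(hour, count, out=sys.stdout):
    out.write("%s\t%d\n" % (hour, count))
    out.flush()  # has the same practical effect here as starting python with -u

if __name__ == "__main__":
    emit_hourly_count(time.strftime("%Y-%m-%dT%H:00:00"), 12345)
```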
[15:52:45] i think paravoid would also love a ritual :D [15:53:04] and you of course, and apergos and probably the entire ops team [15:53:58] brb coffee is calling [15:54:15] hrmm, actually.... these werent wiped [15:54:26] i dislike sending them back with all data [15:54:53] I am going to start a wipe on all disks (3pass) and then pull their network connections so they wont bitch about it on nagios [15:55:02] (i wont do this to 1011 until clearing with you guys and andrew later today) [15:55:07] thanks [16:24:16] average_drifter: what is the status of your patchset? [16:36:27] drdee: merging with master [16:36:42] mornin errybody [16:38:16] thanks average_drifter, morning dschooon! [16:51:39] hey drdee, what license do we use for source code? [16:51:47] gpl2 [16:51:59] ok thanks [16:56:46] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [17:00:14] drdee: can you please run this on wikistats/squids and push [17:00:41] drdee: I would need this because my diffs are red right now while merging with master , and I need to manually do fromdos on all the files so I can see the actual changes [17:00:49] drdee: this makes it very hard to merge [17:00:52] drdee: please run this on the wikistats/squids [17:00:54] drdee: find . -type d \( -name .git \) -prune -o \( -type f \( -name "*.pl" -o -name "*.pm" \) -print \) | xargs fromdos [17:01:12] drdee: this just converts all .pl and .pm files to Linux line-endings [17:09:50] ahhhhh poooooop sorry [17:09:53] i was online [17:09:58] waiting to see link to hangout [17:10:01] but my IRC wasn't connected [17:10:01] you still can [17:10:02] RARGHHH [17:10:05] link? [17:10:07] baae1df74153f25553c443bd351e90 [17:10:12] https://plus.google.com/hangouts/_/2e8127ccf7baae1df74153f25553c443bd351e90 [17:10:12] hehe [17:10:35] https://www.mediawiki.org/wiki/Analytics/Product_Codes [17:19:51] ottomata: please install package tofrodos on stat1 [17:26:27] ottomata: So I have 1012-1022 running wipe [17:26:31] i held off on 1011 until i heard from you [17:26:42] ok cool [17:26:42] since it has stuff that was marginally needed to keep [17:26:59] ummmm, lemme just move a little scripty [17:26:59] one sec [17:27:04] cool, then i can start wipe and remove its network [17:30:02] So Dell screwed up and reissued new servers with the improper internal config (no 2.5" internal slots, and cannot be field added later) [17:30:12] so we are making them fix it [17:30:14] doh [17:30:20] end result is another day or two added to the estimate on how soon these come in [17:30:32] they actually sent 12 to tampa and chris racked them already =[ [17:30:49] even though you guys dont plan to use the internal bays anytime soon [17:30:57] i dont like suddenly not having that option when they 'fix' their issues [17:31:10] yeah [17:31:10] for sure [17:31:13] we should get it right now [17:31:27] yep, so if you need to add ssd's for speed or whatever you have the option in the future [17:31:39] it'll be fixed [17:32:38] so will it be good to wipe in a few minutes? (or should i get lunch and plan to do after?) [17:32:52] (no pressure either works) [17:34:38] ottomata: ^ ? ;] [17:35:36] going to assume post lunch [17:37:20] sorry [17:37:20] hey [17:37:26] yeah its cool now [17:37:28] take it down! [17:37:51] cool [17:39:10] ottomata: https://www.mediawiki.org/wiki/Analytics/Kraken/JMX_Ports [17:39:27] c2100s are done man. [17:39:35] all d/c from network and wiping \o/ [17:52:28] RobH is awesome! 
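Editor's note: the `find ... | xargs fromdos` pipeline quoted above converts every `*.pl` and `*.pm` file to Unix (LF) line endings while skipping `.git`, which is why the red diffs disappear afterwards. A rough Python equivalent, purely as an illustration of what the command does (not a replacement for the tofrodos package being requested):

```python
#!/usr/bin/env python
# Sketch of what the `find ... -prune ... | xargs fromdos` pipeline does:
# rewrite CRLF line endings as LF in *.pl and *.pm files, skipping .git.
# Extensions and the pruned directory mirror the command in the log.
import os

def to_unix_line_endings(root="."):
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # prune, like the find -prune clause
        for name in filenames:
            if name.endswith((".pl", ".pm")):
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    data = f.read()
                converted = data.replace(b"\r\n", b"\n")
                if converted != data:
                    with open(path, "wb") as f:
                        f.write(converted)

if __name__ == "__main__":
    to_unix_line_endings()
```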
[17:52:29] <3 [17:52:41] ottomata, if you want to fill in the JMX ports as you go, that'd be sweet [17:52:44] i'll be doing the same whenever i look one up [17:53:41] can do [17:53:54] RobH is awesome! [17:54:23] RobH! if you are totally bored after wiping C2100s, wanna look into those misbehaving dells again? [17:54:46] (he signed off) [17:54:56] and i think you meant misbehaving ciscos [17:55:31] doh [17:55:36] thank you [17:55:51] drdee, gimme snappy on an26 [17:55:51] and I will start that thang again [17:58:42] okidoki [18:15:41] did you guys know dschoon did 16 years of work since February 2012? https://www.ohloh.net/p/limn [18:15:48] lol [18:16:01] ottomata, snappy is installed on an26 [18:16:16] danke [18:16:46] that's hilarious. [18:16:50] i'm totally tweeting that. [18:17:07] though they're definitely counting /static/vendor [18:17:15] limn "... [18:17:16] has a codebase with a very short history [18:17:17] maintained by a large development team [18:17:18] with stable year-over-year commits" [18:17:30] uhmmmmm if they say so [18:17:33] additionally, i was not paid $872,408 [18:17:40] crap [18:17:45] in the first 6 months of my employment [18:17:45] in case anyone was wondering. [18:21:19] ottomata did you install fromdos on stat1 or is still pending review [18:22:20] i puppetized nad merged [18:22:22] puppet is being weird there [18:22:27] notice: Skipping run of Puppet configuration client; administratively disabled; use 'puppet Puppet configuration client --enable' to re-enable. [18:23:06] mmmmmm [18:24:20] running now... [18:30:18] average_drifter [18:30:24] tofrodos installed [18:32:49] ottomata: which vpn client did you use? [18:33:13] ottomata: thanks man [18:33:27] viscosity [18:35:19] ah hm, annoying thing about vpn [18:35:33] since analytics1000.wikimedia.org is a public addy that our compys already know how to route [18:35:41] it goes through physical interface by default [18:35:42] hm [18:35:45] 1001, you mean [18:35:48] yeah [18:35:58] i thought traffic *couldn't* go through the physical interface, ever [18:36:09] isn't that the point of a vpn? [18:36:15] that all traffic is encrypted and proxied through the network? [18:38:09] it has a public interface [18:38:22] 1001 is hosting the vpn [18:38:51] it has a tun0 interface alias to eth0 [18:38:56] did you mean my machine's physical interface? [18:38:59] no [18:39:06] ah [18:39:08] one moment, engaging vpn [18:39:14] which i assume means i'll drop all my connections [18:39:21] here is a good test url: [18:39:21] http://analytics1003.eqiad.wmnet:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=208.80.154.154:8020 [18:39:21] no! it shouldn't! [18:39:36] i *think* it shoudl only be configured to route to the two analytics subnets [18:39:51] hm. [18:40:09] under networking, in the client config [18:40:12] i see nothing [18:40:20] no default gateway, no defined routes [18:40:29] btw [18:40:29] https://gist.github.com/693deb34f703d92c69dd [18:40:35] it is pushed from the server [18:40:44] the routes………..agh i dunno, i'm making this up as I go [18:40:45] ahh [18:40:50] i tried lots of different things [18:40:50] okay, i'll connect. [18:40:50] sec. 
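Editor's note on the routing point raised here: the VPN only pushes routes for the two analytics subnets (10.64.21.0/24 and 10.64.36.0/24, per the push directives quoted just below), so a host addressed by its public IP, such as analytics1001.wikimedia.org, is still reached over the physical interface. The sketch below simply classifies an address against those subnets; the public address is taken from the namenode URL in the log, while the internal address is an illustrative example.

```python
#!/usr/bin/env python
# Illustration only: is a given destination covered by the routes the
# OpenVPN server pushes (i.e. will it go over tun0), or will it be reached
# over the physical interface because it is a public address?
import ipaddress

VPN_ROUTED_SUBNETS = [
    ipaddress.ip_network("10.64.21.0/24"),   # pushed by the server (see below)
    ipaddress.ip_network("10.64.36.0/24"),   # pushed by the server (see below)
]

def goes_over_vpn(ip_str):
    ip = ipaddress.ip_address(ip_str)
    return any(ip in net for net in VPN_ROUTED_SUBNETS)

if __name__ == "__main__":
    print(goes_over_vpn("10.64.21.101"))    # True: internal addr (example value), routed via the VPN
    print(goes_over_vpn("208.80.154.154"))  # False: public addr from the log, physical interface
```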
[18:40:54] was just confirming first [18:41:00] in case i *do* get dropped :) [18:41:08] heh, k [18:41:08] whatever it is at right now works for me well [18:41:17] i'm signed onto it right now [18:41:54] no error message [18:42:00] yeah, the openvpn server is pushing this to clients: [18:42:00] # this pushes routes to the client [18:42:01] push "route 10.64.21.0 255.255.255.0" [18:42:01] push "route 10.64.36.0 255.255.255.0" [18:42:08] it just spun for a bit, and then said, "the connection has been disconnected [18:42:08] hmm [18:42:53] k try now [18:42:55] i'm watching logs [18:43:15] connecting [18:43:35] disconnected, again [18:43:38] hmm, didn't see anything [18:44:08] Oct 23 11:43:12 vex.local openvpn[61561]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. [18:44:08] Oct 23 11:43:12 vex.local openvpn[61561]: NOTE: OpenVPN 2.1 requires '--script-security 2' or higher to call user-defined scripts or executables [18:44:08] Oct 23 11:43:12 vex.local openvpn[61561]: Cannot load certificate file cert.crt: error:0906D06C:PEM routines:PEM_read_bio:no start line: error:140AD009:SSL [18:44:34] hmmmmmm [18:44:43] i suspect the import didn't work [18:44:53] when i imported the profile, i selected the folder [18:44:57] was that the right thing to do? [18:45:03] yeah [18:45:18] ottomata, i am running a pretty big MR job on the cluster so please do not restart hadoop for now [18:45:32] ok good to know! i was thinking about doing that for ganglia stuff in a bit [18:45:38] oh [18:45:47] i gave you a bad cert.crt somehow... [18:45:54] ... [18:46:02] hmmmmmmm [18:46:07] stick the new one in my home on an01 [18:46:10] and i'll update [18:46:31] okhmmmmmmmmmm, yeah weird [18:46:46] cert.crt is 0 bytes [18:47:45] lmk when to look for a new one [18:47:54] i think i have to regenerate the whole thing for you, one sec [18:48:33] mk. [18:48:35] lmk. [18:49:19] i'm gonna go grab lunch a little early, to avoid lines. [18:49:21] i'll be back in a few. [18:49:29] ok [18:50:37] ok new one in the same place [18:50:58] try that [19:02:45] back [19:02:45] trying [19:03:44] cert.crt is still 0 bytes [19:03:44] fyi [19:03:47] ottomata ^^ [19:03:47] bwa [19:03:59] oh it is generated that way [19:04:00] waaaa [19:04:01] hm [19:04:08] trying anyway [19:04:14] nopes [19:04:28] my cert is not 0 bytes [19:04:28] hmmmmm [19:04:28] wha? [19:04:29] hm [19:04:32] can't you just give me the cert file? [19:04:45] the one I generated is 0 bytes too [19:05:32] buh [19:05:41] looking into it... [19:07:15] ~/Library/Application\ Support/Viscosity/OpenVPN/1 [19:07:30] should contain the files that Viscosity is using [19:07:45] (maybe the number will be different for you. dunno why they'd start at 1 :P) [19:09:54] ottomata, dschoon: http://etherpad.wikimedia.org/avro initial proposal avro schema for traffic logs based on current server log config [19:10:14] cool [19:10:23] will look in a moment [19:10:29] ok dschoon, one more time [19:11:22] cert.crt isn't 0! [19:11:28] drdee, i have no idea, but should some of that data be converted to more binary forms? [19:11:32] timestamp and IP could be converted [19:11:37] connecting! [19:12:13] hmm, what happened, it started to look good [19:12:16] still connecting! [19:12:20] ok... [19:12:27] so after deserialization, you still want people to work with binary fields? [19:12:57] i dunno, deserialization could de-binary them?
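Editor's note: to make the "more binary forms" question concrete (a sketch only; the proposed schema keeps both fields as strings): a timestamp packs naturally into an Avro long as epoch milliseconds, and an IPv4 address fits in 32 bits, but an IPv6 address needs 128 bits and so cannot be stored as a single long, which is the objection raised shortly afterwards.

```python
# Illustrative conversions only; the schema under discussion keeps timestamp
# and IP as strings because Avro has no datetime or IP primitive type.
import calendar
import ipaddress
import time

def timestamp_to_long(ts):
    # epoch milliseconds fit comfortably in an Avro long (64-bit signed)
    return calendar.timegm(time.strptime(ts, "%Y-%m-%dT%H:%M:%S")) * 1000

def ip_to_int(ip):
    # a 32-bit value for IPv4; an IPv6 address yields a 128-bit integer,
    # which does not fit in an Avro long
    return int(ipaddress.ip_address(ip))

if __name__ == "__main__":
    print(timestamp_to_long("2012-10-23T19:10:14"))   # 1351019414000
    print(ip_to_int("216.38.130.161"))                # IPv4 address from the sample log line
    print(ip_to_int("2001:db8::1").bit_length())      # example IPv6 addr: well over 64 bits
```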
[19:13:12] that's what the current proposal does [19:13:12] looks unhappy in logs, dschoon [19:13:12] is it working? [19:13:15] by what? [19:13:17] i don't think so [19:13:20] it's still "connecting" [19:13:27] hmmmm [19:13:32] tyr now [19:13:43] i disconnected, i wanna see if it does something different [19:13:45] well, what i gather ottomata is saying is that it's way more compact to store TS and IP as longs [19:13:48] k [19:13:59] nope, same logs [19:14:13] Oct 23 19:13:55 analytics1001 ovpn-openvpn[32162]: 216.38.130.167:51676 Connection reset, restarting [0] [19:14:13] Oct 23 19:13:55 analytics1001 ovpn-openvpn[32162]: 216.38.130.167:51676 SIGUSR1[soft,connection-reset] received, client-instance restarting [19:14:13] Oct 23 19:13:55 analytics1001 ovpn-openvpn[32162]: TCP/UDP: Closing socket [19:14:32] you got anything in your client logs? [19:14:43] TS and IP are both strings in the schema [19:14:58] because avro does not have date time or custom ip field [19:14:58] emailAddress=otto@wikimedia.org [19:14:58] Oct 23 12:13:55: TLS_ERROR: BIO read tls_read_plaintext error: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed [19:15:05] hm [19:15:15] oh hmmmmmmmm, hm. [19:15:32] sorry, missed part of that [19:15:33] Oct 23 12:13:55: VERIFY ERROR: depth=0, error=unable to get local issuer certificate: /C=US/ST=CA/L=SanFrancisco/O=Wikimedia-Foundation/OU=Analytics/CN=server/name=wikimedia-kraken/emailAddress=otto@wikimedia.org [19:15:49] interesting, hmmmmm [19:16:15] ah! [19:16:16] Oct 23 12:13:53: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. [19:16:36] you could convert a timestamp to long, but you can't convert an ip6 address to a long [19:16:49] drdee: can we hold off on this convo for a bit [19:16:52] but i was under the impression that in the ETL phase as little as possible should be done [19:16:57] so ottomata and i can work this out? [19:17:51] hmmmmmmmmm [19:17:51] we'll take it to PM, but i can't do two things at once, and i think the avro stuff is super important [19:18:01] try adding that in your conf then? the server cert was built like that link says [19:18:05] ottomata: hangout? [19:18:11] ▪ remote-cert-tls server [19:18:11] k [19:18:26] https://plus.google.com/hangouts/_/cfca1c6e90797a92ee2bfb5f04da029c15b58160 [20:04:44] I'm going down a bit of a rabbit hole. [20:05:07] I upgraded to Precise Pangolin and it messed up a VIM plugin that I had compiled with ruby 187 [20:05:28] So I spent over an hour trying to fix this and it seems to be getting worse [20:05:57] yikes [20:05:58] anyone know how to compile with ruby 191? [20:06:14] i tried installing rvm and installing ruby 191 from that but no go [20:06:46] yeah, normally I'd just reformat my computer and put windows back on it, fire up Visual Studio and call it a productive 30 minutes :) [20:07:18] now I'm kinda torn 'cause that plugin happens to be command-t and it's very useful [20:09:25] milimetric: any chance you can pathogen to do this for you? 
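Editor's note on the cert.crt troubleshooting above: the two symptoms in the logs (a 0-byte cert.crt, and the client's "PEM ... no start line" error) can be caught before importing the profile with a trivial sanity check. The filename matches the one in the log; everything else in this sketch is illustrative.

```python
#!/usr/bin/env python
# Sanity-check a client certificate file before importing the OpenVPN bundle:
# it should be non-empty and contain a PEM certificate header. An empty file
# is exactly the failure mode hit above.
import os

def looks_like_pem_cert(path="cert.crt"):
    if os.path.getsize(path) == 0:
        return False  # 0-byte cert.crt, as seen in the log
    with open(path, "rb") as f:
        return b"-----BEGIN CERTIFICATE-----" in f.read()

if __name__ == "__main__":
    print(looks_like_pem_cert("cert.crt"))
```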
[20:09:29] you can get* [20:09:45] i'm managing it in pathogen but I still have to compile it manually [20:09:49] aah [20:10:04] https://github.com/wincent/Command-T [20:10:18] btw, if you use vim and don't have that, I highly recommend it [20:10:41] \t + type in-order parts of a file path and it opens it [20:10:53] nice [20:11:08] except of course when it poops all over my head [20:11:15] i am trying to switch to vim, but I haven't found a good python indentation plugin [20:11:24] hehe [20:12:58] you try all these things that come up in googling: http://henry.precheur.org/vim/python, http://www.protocolostomy.com/2009/11/13/way-better-python-indentation-in-vim/ ? [20:14:53] yeah [20:14:56] i've got the basic indentation set up [20:14:57] which is nice [20:15:29] but I like the way emacs lets you cycle through the indentations [20:15:37] i could probably just give it up [20:15:52] cool, yeah, I definitely overlook a lot of vim's faults because of its sexiness [20:16:21] but you don't happen to know how to make with a particular ruby version do you? [20:16:51] when I ruby extconf.rb I get a makefile that points to 1.8 and it seems like it should be easy to make it point to 1.9... [20:17:14] nope [20:17:35] sorry [20:17:39] not much of a ruby developer [20:17:59] milimetric: 1.9 is not backwards compatible with 1.8.7 [20:18:20] i thought it was just 'cause i was on windows but i literally have never been able to do anything with ruby because I always run into setup issues [20:18:20] how does vim call ruby? [20:18:28] does it have an ENV file it uses? [20:18:39] well i'm not sure about that but this is different [20:18:44] because you need to make sure you have ruby 1.8.7 installed and then set the path to ruby [20:18:51] i just need to compile the plugin with the same version of ruby that vim was compiled with [20:18:51] RUBY_HOME i believe [20:19:04] ...they embedded ruby? [20:19:04] it's not just a shell call? [20:19:12] and precise pangolin changed my vim to one that's compiled against 1.9.1 [20:19:22] i don't think that's what happened [20:19:27] i think if you jump on the shell [20:19:35] and type `ruby --version` [20:19:35] command-t uses ruby AND some C stuff [20:19:35] you'll see 1.9.1 [20:19:44] ruby --version is 1.8.7 [20:20:04] so :P [20:20:05] :) [20:20:05] `which ruby` [20:20:19] ubuntu's? [20:20:24] ... [20:20:26] run the command [20:20:35] hehe [20:20:35] basically, i'm guessing there's another ruby on the path somewhere [20:21:06] huh? [20:21:10] I ran literally [20:21:12] ruby --version [20:21:15] try this [20:21:15] I got: [20:21:15] bash --norc -c 'which ruby' [20:21:23] ruby 1.8.7 (2012-02-08 patchlevel 358) [x86_64-linux] [20:21:27] oh lol [20:21:35] bash --norc --noprofile -c 'which ruby' [20:21:43] /usr/bin/ruby [20:21:52] and what do you get when you run it normally [20:22:05] run what normally? [20:22:08] `which ruby` [20:22:14] me [20:22:19] *same [20:22:34] /usr/bin/ruby [20:22:36] so recap [20:22:39] weird. [20:22:57] vim is compiled with +ruby and ruby1.9 shows up in the version part [20:23:34] I have ruby 1.8.7 installed and I have to ruby extconf.rb && make inside the plugin directory [20:23:48] but I need that to be ruby 1.9.1 [20:24:08] do you *have* ruby 1.9.1? [20:24:12] no [20:24:18] so i tried installing rvm to help me install that [20:24:20] well. [20:24:22] yeah. 
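Editor's note: the Command-T problem being chased here is a version mismatch; its C extension has to be built with the same ruby that vim itself was linked against, and (as noted above) the ruby version shows up in vim's `--version` output. A purely illustrative diagnostic sketch, not part of any existing tooling, that makes the mismatch visible:

```python
#!/usr/bin/env python
# Illustration only: compare the ruby version mentioned in `vim --version`
# output with the ruby found on PATH. Assumes vim's version output names the
# linked ruby (e.g. a "-lruby-1.9.1" linking flag on Ubuntu builds).
import re
import subprocess

def first_ruby_version(text):
    m = re.search(r"ruby\D{0,3}(\d+\.\d+(?:\.\d+)?)", text, re.IGNORECASE)
    return m.group(1) if m else None

def main():
    vim_out = subprocess.check_output(["vim", "--version"]).decode("utf-8", "replace")
    ruby_out = subprocess.check_output(["ruby", "--version"]).decode("utf-8", "replace")
    vim_ruby = first_ruby_version(vim_out)
    shell_ruby = first_ruby_version(ruby_out)
    print("ruby mentioned by vim --version:", vim_ruby)
    print("ruby on PATH:", shell_ruby)
    if vim_ruby and shell_ruby and not shell_ruby.startswith(vim_ruby):
        print("mismatch: run Command-T's `ruby extconf.rb && make` with vim's ruby")

if __name__ == "__main__":
    main()
```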
[20:24:25] and that fails because of some openssl error [20:24:33] but i don't know much about things after this point [20:24:39] i've also tried using rvm before [20:24:39] k [20:24:48] but i know so little about ruby i just straight up failed [20:24:53] (i *did* succeed in installing it, at least) [20:25:09] moral of the story: all things that ruby touches turn to shit and it therefore is condemned to the lowest depths of hell for me and I shall never look at or speak to it again. [20:25:23] thx :) [20:25:24] hehe [20:33:15] rvm is not the best solution for what it does [20:33:23] there is rubyenv and ruby-buildd [20:38:18] aiight drdee -- wanna talk about the avro schema? [20:38:37] let's do it [20:39:08] hangout? [20:39:16] so, what's "hostname"? [20:39:20] i think text is easier. [20:39:59] and what does URL actually contain? [20:40:20] so hostname is cache server actually serving the request [20:40:27] i think we can drop it [20:40:44] ehhh. [20:40:49] i think we should keep it. [20:40:51] URL is the full URL including all parameters, everything [20:40:56] it's important instrumentation data [20:41:10] so URL includes the requested hostname [20:41:28] ?? [20:41:42] a GET request does not have a hostname. [20:41:53] GET /foo/bar/baz?derf=1 [20:42:11] you resolve the hostname via DNS and connect directly to the HTTP server on 80 iv IP [20:42:13] *via [20:42:28] so i'm making sure that the request URL has been composed by this point [20:42:31] that is not how it is currently outputted by the cache servers [20:42:40] example log line [20:42:43] sq18.wikimedia.org 1715898 1169499304.066 0 216.38.130.161 TCP_MEM_HIT/400 13208 GET http://en.wikipedia.org/wiki/Main_Page NONE/- text/html - - Mozilla/4.0%20(compatible;%20MSIE%206.0;%20Windows%20NT%205.1;%20.NET%20CLR%201.1.4322) [20:42:46] okay, that's fine [20:43:06] so sq18.wikiedia.org is the hostname [20:43:06] what is TCP_MEM_HIT? [20:43:06] squid code [20:43:10] ah. [20:43:22] we should drop those things as well [20:43:26] yes. [20:43:27] and just keep the http status code [20:43:36] let's keep hostname, and rename it "server" [20:43:38] i think that *is* valuable [20:43:45] because varnish and squid use different codes [20:43:54] errors are more interesting than successes :) [20:44:08] what is x_forwarded? [20:44:11] i am sure it can be, but it's not our primary use case to supply instrumentation to ops [20:44:14] vs IP? [20:44:23] no, but it'd be nice to be able to do so. [20:44:34] x-forwared can contain a list of ip addresses [20:44:42] for what? [20:44:49] this is mostly an opera-mini issue [20:44:55] as opera has forward proxies [20:45:03] ah. [20:45:03] ideally this field would be dropped [20:45:07] so IP is the requesting IP [20:45:17] and the ip field would match the real origin [20:45:18] right. [20:45:25] however this requires some processing [20:45:32] so i agree. kill that. [20:45:32] yes, we can do it in ETL [20:45:33] we can't [20:45:38] oh? [20:45:42] i mean what do you mean by kill? [20:46:01] hold on. why would x_forwarded by a ist? [20:46:01] *list [20:46:16] OH WAIT [20:46:17] this is important [20:46:18] "doc": "Default schema for Wikimedia web traffic from squid/varnish/nginx.", [20:46:21] no no. 
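Editor's note: to pin down the fields being debated, here is a sketch that splits the sample log line quoted above into named parts. The names follow the conversation (cache hostname, squid code plus HTTP status, client IP, full URL, x_forwarded, user agent); the remaining names, in particular "sequence" and "service_time" and the ordering of the two "-" fields, are my reading rather than something stated in the log.

```python
# Sketch only: split the space-delimited sample line from the chat into the
# fields under discussion. Several field names below are assumptions, as
# flagged in the comments.
SAMPLE = ("sq18.wikimedia.org 1715898 1169499304.066 0 216.38.130.161 "
          "TCP_MEM_HIT/400 13208 GET http://en.wikipedia.org/wiki/Main_Page "
          "NONE/- text/html - - "
          "Mozilla/4.0%20(compatible;%20MSIE%206.0;%20Windows%20NT%205.1;%20.NET%20CLR%201.1.4322)")

FIELDS = [
    "hostname",      # cache server that served the request ("server" in the discussion)
    "sequence",      # assumption: per-host sequence counter
    "timestamp",     # epoch seconds with milliseconds
    "service_time",  # assumption
    "ip",            # requesting client IP
    "squid_status",  # e.g. TCP_MEM_HIT/400 -- squid code + HTTP status, to be split
    "bytes",
    "method",
    "url",           # full URL, including the requested hostname and all parameters
    "hierarchy",
    "mime_type",
    "referer",       # ordering of referer/x_forwarded is my reading of the format
    "x_forwarded",   # may hold a list of IPs (Opera Mini forward proxies)
    "user_agent",    # percent-encoded, so splitting on spaces is safe here
]

def parse_line(line):
    return dict(zip(FIELDS, line.split(" ")))

if __name__ == "__main__":
    record = parse_line(SAMPLE)
    squid_code, http_status = record["squid_status"].split("/")
    print(record["hostname"], record["ip"], http_status, record["url"])
```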
[20:46:30] this is the schema for storing ALL requests [20:46:51] so that way everything that represents a request would be stored in the same format [20:47:09] well my scope was a bit more limited and just attempts to get the current sampled log lines in avro format [20:47:15] then we expand [20:47:28] well, here's my reasoning [20:47:51] event data sent into the cluster from the pixel service needs to have some fields transposed into the right places [20:47:57] like referrer [20:48:20] right [20:48:41] heh, it's a lot worse than I thought. All my plugins are dead [20:48:55] this is what you get for upgrading! [20:48:56] I feel... violated [20:49:17] so, drdee, the idea is that "canonicalization" during ETL involves moving things into the right places for event data [20:49:42] for other data (normal requests) we'll have less processing to do [20:49:46] as we don't need to parse the URL [20:50:07] but we will still need to parse the request to create the record [20:50:07] right so i was focusing on normal requests first [20:50:24] do you think there's a reason to use two different structures? [20:50:32] they contain almost all the same information [20:50:52] no probably we can use one because avro can handle missing data [20:51:21] the only reason i wanted to start 'easy' was so i could test it against the current sampled files [20:51:54] but we should add a field params, which would be a map of pairs for the event stuff [20:52:16] yes, but hold off on that for a sec [20:52:23] i need to refresh myself on the avro datatypes [20:52:55] they are quite basic, http://avro.apache.org/docs/current/spec.html#schema_primitive [20:53:09] i'm reading now. [20:53:19] k [21:02:38] oh. zigzag encoding is rather clever. [21:03:02] i'm very surprised i never thought of that [21:03:07] for integers, right? [21:03:11] yeah [21:03:21] pretty neat indeed [21:03:21] and longs [21:03:26] it's totally a trick to make counts more compression [21:03:44] ottomata, did you restart hadoop by any chance? [21:03:50] because most of the instances of int/float in the serialized binary are offsets or counts for fields [21:04:22] a string is [ BYTE_LENGTH:int | BYTES ] [21:04:34] that int is more frequently a small number than not [21:04:49] but! same with negative numbers! [21:04:57] where sign is used to give absolute sizes for arrays [21:05:18] but 2-complement would end up with a bunch of 1's instead of 0's for negative numbers [21:05:36] whereas with zig-zag, small negative numbers look very similar to small positive numbers [21:05:48] so the whole thing compresses a lot better [21:06:21] this is the same reason bz2's file format uses *tallies* for encoding shorts [21:06:45] 0x01 = 10; 0x02 = 110; 0x03 = 1110; [21:08:56] average_drifter: good call, rvm fails to do what it wants. Just apt-get install(ing) ruby 1.9.1-full fixes all my problems. [21:09:19] nice [21:09:26] I'm sure only to cause others yet unfound but that be another day's worry :) [21:09:32] though it probably broke any scripts you had that depended on 187 [21:09:48] i did not restart hadoop [21:09:48] everything seems ok with vim, just exhaustively checked [21:11:14] i am trying to start a MR job and get: There are 0 datanode(s) running and no node(s) are excluded in this operation. [21:11:50] snerk. 
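Editor's note: the zig-zag trick described above can be shown in a few lines. This mirrors how Avro maps signed ints/longs to unsigned values before varint packing, so that numbers near zero, positive or negative, encode small and compress well; it is a worked illustration, not code from the project.

```python
# Zig-zag encoding: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
# Small negative numbers end up looking like small positive numbers,
# instead of a two's-complement run of 1 bits.
def zigzag_encode(n, bits=64):
    return (n << 1) ^ (n >> (bits - 1))   # arithmetic shift supplies the sign

def zigzag_decode(z):
    return (z >> 1) ^ -(z & 1)

if __name__ == "__main__":
    for n in (0, -1, 1, -2, 2, -64, 63):
        z = zigzag_encode(n)
        assert zigzag_decode(z) == n
        print(n, "->", z)   # 0->0, -1->1, 1->2, -2->3, 2->4, -64->127, 63->126
```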
[21:12:06] ottomata, it had occurred to me right before drdee said that [21:12:22] that if we made changes resolve.conf on an01 [21:12:28] they effect hadoop [21:12:34] because the namenode is on an01 [21:12:57] welp [21:13:06] (and also my previous job was killed at 50%) [21:14:00] anyways [21:14:30] but indeed there are 0 active nodes right now [21:14:36] so maybe you could restart hadoop? [21:14:39] see http://analytics1001.wikimedia.org:8088/cluster/nodes [21:21:11] nawww, but we didn't make changes [21:21:19] only to vpn [21:21:19] bmmm [21:21:19] hmm [21:21:48] hmmmmm [21:21:57] yup things are unhappy [21:23:01] iiinnterrestin [21:23:44] hmmmmmmm [21:23:49] check namenode logs [21:23:54] i betcha it's in there somewhere [21:24:13] naw its networking for sure, [21:24:23] i took down the eth0:0 alias [21:24:30] the .101 (analytics1001.eqiad.wmnet) thing [21:24:35] now I can ping an03 from an01 again [21:25:16] hmmmmmmmmmmmmmmmmmmmmmmmm, interesting, ok I will have to play with that more then [21:25:25] all nodes are back up [21:25:25] yeah [21:25:29] ok, i won't play with this right now [21:25:40] continue playing [21:25:44] the job has been killed :) [21:25:46] but I think I need a manual route for backend addies, naw, I gotta run [21:25:48] ha, sorry about that [21:25:54] run it over night? [21:25:54] heh [21:25:55] yup [21:28:50] man, i love avro [21:39:38] dschoon: could you guys live without an ssd in your replacement dells? [21:39:45] yes. [21:40:07] especially if we were able to get some spinning disks out of the deal [21:40:11] (they don't even need to be fast!) [21:40:28] ok. you might still be able to get one, but dell screwed up and shipped half of the r720's without internal 2.5" bays. but you could indeed get more spinning disk [21:40:44] <3 [21:40:49] that sounds great. [21:40:56] who should we follow up with on that? [21:42:29] see security - it turns out as part of dell's mistake, they have 12 2TB drives instead of 8 [21:42:44] so as long as those go to analytics, you should be set [21:47:19] that, my friend, is some hot shit. [21:47:20] <3 [22:01:41] https://bugzilla.wikimedia.org/show_bug.cgi?id=41326 [22:01:51] I cc'd a few people on that bug. [22:02:00] i would call it a feature request [22:02:10] As would I. [22:03:04] How is the analytics infrastructure coming along? Relying on stats.grok.se still is really nasty. :-/ [22:03:17] I totally agree that's a good idea, though [22:03:36] it's coming along quite well! [22:03:40] Nice. :-) [22:03:46] you should check out https://www.mediawiki.org/wiki/Analytics/Kraken [22:03:55] there's significantly more information there since last we spoke. [22:04:10] Cool. [22:04:34] But we have a ton of ideas for data products as well. We're really excited about things like that. [22:05:03] Heh, "firehose of banality." [22:05:05] I may steal that. [22:05:20] Sometimes I am inspired. [22:05:51] Another example: what if every logged-in user had a dashboard showing traffic to articles they've edited? That'd be actual feedback showing how people are seeing their contribution [22:07:20] Aye, that'd be cool. [22:07:30] People have done some work on that in the past, kind of. [22:07:45] Looking at how many eyeballs saw vandalism to U.S. Senator articles, for example. [22:07:47] That research was a bit political. 
;-) [22:08:00] *nod* [22:08:35] I'm sure there are better ideas than this, but it's just one example of how data can make the editing process more exciting [22:09:13] And this is to say nothing for readers, as recommendations, auto-categorization, etc is all pretty obvious and interesting. [22:12:48] I can't say I'm worried about people finding uses for the data. Any idea when it'll be available to the public in a usable form (API or similar)? [22:13:08] Just looking for a ballpark. January 2013? July 2013? Earlier? Later? [22:13:11] No idea. That's a long way off, because it implies things about stability. [22:13:17] Hmm. [22:13:26] The best I could say, with a HUGE grain of salt, would be mid-2013. [22:13:37] There's a lot on the table before then. [22:13:43] All right. [22:13:52] And we want to make sure we do it right, with API keys and rate-limiting and all that. [22:14:30] Yeah. And I guess data dumps for larger analysis. [22:14:42] Not raw, but maybe aggregated. [22:14:42] yeah. [22:14:45] Exactly. [22:15:20] What's the primary bottleneck at this point? Money? Staff? [22:15:34] Aggregated datasets are probably one of the first things we'd start providing once we felt the platform was stable, and were comfortable with the integrity of the data. [22:15:49] Heh, Staff, which is solved by Money. [22:15:52] Wikipedia has never been about the integrity of its data. Just throw a disclaimer on this. ;-) [22:16:04] on this --> on it [22:16:22] Yes well. Perhaps a thing I'd like to change. [22:16:25] Lead by example and all. [22:16:35] Well, I'm wondering if it's a "we don't have enough people" or "we don't have enough servers" or "we need a big grant" kind of thing. [22:18:06] For a public API? [22:18:15] It's definitely staff. [22:18:34] Partially because there's a lot of thinking that is in the future about anonymity. [22:19:13] Partially because of infrastructure that doesn't exist and needs to. [22:19:35] Partially because we didn't plan for hardware for an API or Query gateway. [22:20:02] And all that could be improved with money. [22:30:39] Ah, got it. [22:31:04] Right. If privacy weren't an issue, we could just stick some Google Analytics JS in the footer and call it a day. :D [22:31:16] exactly. [22:31:18] (mostly) [22:31:37] (Google would fucking love us forever if we did that, jesus.) [23:25:33] brb [23:29:54] mk [23:29:55] dschoon, i checked the avro schema in python, it now parses, there was one syntax error in the enum field [23:30:11] extra comma? [23:30:22] er, [23:30:29] no, the enum needs to be nested in a "type" key [23:30:29] what you changed isn't equivalent to what i had... [23:30:41] what you had didn't' validate [23:30:52] i don't think that's true. [23:30:56] i checked the docs [23:31:05] the nested type bit [23:31:06] i agree it probably didn't validate [23:31:13] but that's because the type was a union before [23:31:23] it was ["enum", "string"] [23:31:40] not probably didn't validate, it did not validate, this is not bayesian thing ;) [23:31:57] i mean the union didn't [23:32:18] did you just `pip install avro`? [23:32:32] yup [23:32:38] k, one sec [23:33:04] wait i think i am blind as well [23:33:16] the fix is wrong [23:34:13] in python [23:34:13] from avro import schema [23:34:18] parsedSchema = schema.parse(mySchema) [23:34:28] k [23:34:37] okay, i am going to make dinner [23:34:47] before i am being lynched [23:35:15] k [23:35:17] we'll talk later [23:43:37] the problem is that you forgot a " in the sub-record. 
[23:47:09] that fixes the syntax error in any case. [23:47:16] now it says that enum isn't a type. [23:53:31] that was the original error as well [23:53:41] yeah [23:53:44] you were right. [23:53:46] i figured it out. [23:53:51] sec, updating. [23:55:52] that should work now [23:56:26] without the comments, obviously. [23:57:46] nice! [23:58:00] that was a pretty effective afternoon, IMHO [23:58:12] still gotta do the event request schema. [23:58:18] which is actually a lot more complicated. [23:58:34] i need to interview potential housemates tonight though [23:58:41] this was a good warmup :) [23:58:44] let's continue tomorrow [23:59:55] word
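Editor's note: for the record, a minimal sketch of the shape the schema discussion ended on: an enum defined inline, nested under the field's "type" key rather than given as a bare union member. Field and symbol names here are placeholders, not the actual traffic-log schema; the parse call matches the one used in the chat (`pip install avro`, then `avro.schema.parse`).

```python
# Minimal illustration of an Avro record field whose type is an inline enum,
# nested under "type" as discussed above. Names are placeholders only.
import json
from avro import schema

my_schema = json.dumps({
    "type": "record",
    "name": "ExampleRequest",
    "fields": [
        {"name": "url", "type": "string"},
        {
            "name": "http_method",
            # the enum definition lives inside the field's "type" key
            "type": {
                "type": "enum",
                "name": "HttpMethod",
                "symbols": ["GET", "POST", "HEAD"],
            },
        },
    ],
})

parsed = schema.parse(my_schema)  # raises a parse exception if the schema is invalid
print(parsed)
```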