[00:02:48] If I add a reviewer in gerrit do I spam them? [00:03:24] I don't know if spam is the right word, but yes [00:04:26] Well sorry for 'spamming' then :P [00:04:49] I suppose technically you subscribed to it so it's not un-soliciated therefore not spam [00:05:27] https://github.com/WhiteHouse/petition < oh shit, they use drupal [00:05:31] Explains the state of the goverment [00:32:37] oh diaf [00:32:56] Ryan_Lane: Does the upgrade fix the stupid dns records missing issue? :) [00:33:58] that's already fixed [00:34:03] I put a live-hack in [00:35:02] 'fixed' doesn't seem so fixed [00:35:15] maybe I missed some instances [00:35:25] or are you talking about one that was deleted/recreated? [00:35:33] Yeah [00:35:43] It was there, deleted it, removed fine, created, no a record [00:36:04] Alternativly a re-install with same config would be awesome :D [00:36:06] currently there's no instances with broken ds [00:36:08] *dns [00:36:26] after the upgrade I can add the ability to resize instances [00:36:28] and rename instances [00:37:02] so reinstall with same config will be less necessary [00:37:09] :D [00:37:11] Weird... [00:37:22] dig hostname = no A record, dig full hostname = A record [00:37:26] Damn you pdsn packet cache [00:37:28] also, if you use project storage rather than local storage, and use puppet them a reinstall with the same config is already possible ;) [00:37:33] s/pdsn/pdns/ [00:37:50] dig requires fqdn [00:37:53] host doesn't [00:38:11] also. you'll love this [00:38:14] hostnames will be instance names [00:38:16] Well ssh doesn't work either ;P [00:38:17] not instance ids [00:38:30] which hostname isn't working right now? 
[00:38:46] delete/recreate with the same name causes dns cache issues [00:38:47] bots-cb-dev, might just be the cache though as I'm being impacient [00:38:57] And yeah that would be awesome because id-xxxx is really un-useful :D [00:39:05] err i- not id- [00:40:03] yeah [00:40:07] it'll still be that for puppet [00:40:28] but, hostname -fqdn will be based on the instance name [00:40:53] ok. gotta run [00:41:11] sounds energetic [00:41:12] have fun [11:40:13] <^demon> A-ha! https://gerrit-review.googlesource.com/#/c/37660/ will make it into 2.5 :) [11:40:17] * ^demon does a happy dance [11:40:39] <^demon> (got it in master last night) [11:41:14] when is the closing bell? [11:41:21] need to rebase my stuff :( [11:41:52] <^demon> rc0 is out, so need to get it in master before rc1 prolly. [11:42:29] ok [11:42:31] <^demon> Dunno what the timeline is--Edwin is doing release management for this one. [11:42:39] need to drop another project and do that [15:13:21] why is it painfully slow to simply untar a file? [15:33:10] paravoid: Have a minute to answer puppet questions? [15:38:14] I'm stumped by the fact that when I reference a class I get an error that it's unknown. But when I include the manifest that defines the class, puppet instantiates the class before I'm ready. [15:44:48] shoot [15:50:12] paravoid: By 'shoot' do you mean 'go ahead and ask the question'? Because I just did :) [15:51:01] I think my general question is, "Is it OK for me to think of puppet as having classes and objects and distinguishing between them, just like in any other OO language?" [15:54:14] no :) [15:54:46] Damn. [15:55:14] I've been reading the puppet language guide again but am still unclear on when a class is merely defined and when it is actually… instantiated. [15:55:34] Partly because the docs seem to use the word 'declare' to mean 'instantiate'... 
[15:56:02] All I want right now is to be able to /refer/ to the name of a class, without having to first cause that class to actually create all of the files that it creates. [15:57:31] paravoid: http://pastebin.com/AKvmnF12 <- doesn't work because puppet doesn't know what gerrit::jetty is. [15:58:14] is there a class gerrit::jetty? [15:58:29] if so, it needs to be $declared first for that manifest to work [15:58:34] and by $declared I mean "include" [15:58:53] Right, but when I include it, puppet goes ahead and does everything that the class does. [15:58:55] Before I'm ready. [15:59:09] Creates files and such. [15:59:58] …hence my confusion between 'define' and 'instantiate'. I want puppet to know about gerrit::jetty but I don't want to actually create one yet. [16:01:38] paravoid: Here is gerrit::jetty: http://pastebin.com/M57nKXUW [16:02:27] I don't understand what you're trying to achieve [16:02:34] class foo { } makes the class known to puppet [16:02:44] but if you don't include it it won't do anything [16:02:54] and hence before => on something that's not included does not make sense [16:03:01] you're saying "before" something that will never happen [16:03:14] OK. [16:04:00] I want gerrit::jetty to happen after the 'labs' class is created. [16:04:08] So maybe my problem is just misunderstanding what 'before' does. [16:04:25] I thought that it was the same as 'require' only reordered. [16:04:37] you want all the stuff in gerrit::jetty to happen after all the stuff in labs happened? [16:05:16] Well… gerrit::jetty creates files from templates, and those templates do variable lookups on role::gerrit::labs. [16:05:27] So I need role::gerrit::labs to be defined beforehand. [16:05:45] I'm refactoring code (not mine) which does things in the wrong order, hence all the template refs from gerrit::jetty are broken. [16:06:03] template expansion happens at the end iirc [16:06:10] so it shouldn't matter [16:06:19] but that's wrong nevertheless [16:06:19] really? 
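A minimal sketch of the define-versus-declare distinction discussed above. The two classes are hypothetical stand-ins (the real gerrit::jetty and labs code lives in the pastebins and is not reproduced here), and the file paths are made up for illustration.

```puppet
# Hypothetical stand-ins; resource names and paths are illustrative only.

# Defining a class only makes its name known to Puppet.
# Nothing is applied to the node at this point.
class labs {
  file { '/tmp/labs-marker':
    ensure  => present,
    content => "labs configured\n",
  }
}

class gerrit::jetty {
  file { '/tmp/gerrit-jetty-marker':
    ensure  => present,
    content => "jetty configured\n",
  }
}

# Declaring a class is what adds its resources to the catalog.
include labs

# Resource-like declaration with an ordering metaparameter:
# everything in gerrit::jetty is applied after everything in labs.
class { 'gerrit::jetty':
  require => Class['labs'],
}
```

Writing `before => Class['gerrit::jetty']` on the labs declaration would express the same ordering from the other side. Either way, both classes have to be declared somewhere, otherwise the relationship points at something that never enters the catalog.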
[16:06:24] Dammit, I wonder what's happening then [16:06:36] role::gerrit::labs should call gerrit::jetty parameterized [16:06:56] the templates in gerrit::jetty should not reference variables off role::gerrit::labs [16:07:00] it might work, but it's a layering violation [16:07:03] of our structure [16:07:15] we're supposed to layer role classes on top of "base" classes [16:08:08] ok. So, gerrit:jetty should take parameters from role:gerrit::labs… and then the templates used by gerrit:jetty should refer to values in gerrit:jetty that were filled in by those params. [16:08:10] Yes? [16:10:25] * Damianz yawn [16:11:09] paravoid: It may be that my problem is assuming that the existing code is "probably mostly right" rather than "probably completely messed up." [16:11:32] yes what you said above is correct [16:12:05] you might want to use gerrit::jetty from role::gerrit::foo in the future [16:12:19] which could parametize gerrit::jetty a bit differently [16:12:29] and role::gerrit::foo should have nothing to with role::gerrit::labs [16:12:54] paravoid: If you have some free time later I managled https://gerrit.wikimedia.org/r/#/c/21307/ up last night, I'll be surprised if it works :) [16:12:55] am I making any sense? [16:13:00] Yeah, in fact there are three different role:gerrits defined already, so I'm already forced to do that part right :) [16:13:08] Also it would be cool if someone merged https://gerrit.wikimedia.org/r/#/c/21297/ so nagios will get fixed and become useful again. [16:14:21] paravoid: And, I was already confused/annoyed at the circularity of the existing code, so it's reasurring that you regard that as unacceptable. [16:14:33] right [16:15:00] well defined layers help readability/debuggability too [16:15:14] I can only guess how messed up that must be [16:15:17] * andrewbogott nods [16:15:47] I'm an incrementalist by nature, so I often find myself mired in tarpits not of my own making when I should just be starting from scratch. 
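The layering described above (a role class passing parameters down to a base class) might look roughly like this. It is only a sketch: the parameter names `url` and `db_host`, their values, and the file path are assumptions, not taken from the actual manifests.

```puppet
# Hypothetical parameter names and values; the real ones are not in the log.

# Base class: everything it needs arrives as parameters,
# so it never has to look inside a role class.
class gerrit::jetty ($url, $db_host) {
  file { '/tmp/gerrit-jetty-example.conf':
    ensure  => present,
    content => "url = ${url}\ndb_host = ${db_host}\n",
  }
}

# Role class: owns the site-specific values and passes them down.
class role::gerrit::labs {
  class { 'gerrit::jetty':
    url     => 'https://gerrit.example.org/',
    db_host => 'db.example.org',
  }
}

include role::gerrit::labs
```

A different role on another node could declare the same base class with its own values, without either role knowing about the other.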
[16:16:17] Well, especially when working in a language that I barely understand :( [16:28:18] <^demon> andrewbogott: That tarpit was even worse before ;-) [16:28:58] Yeah, it's clear that I need to rise to your level of boldness when refactoring :) [16:30:03] <^demon> Be bold [when updating puppet manifests] :) [17:21:42] ^demon: The more I look at this the less I know. [17:22:17] What does it mean that gerrit::instance is defined /inside/ the 'gerrit' class, whereas gerrit::jetty is declared outside the class but with the explicit gerrit:: prefix? [17:22:50] <^demon> That shouldn't make a difference, afaik. A lot of that stuff is old. [17:23:04] hm... [17:23:06] * andrewbogott tries rearranging [17:24:25] <^demon> Might be nice to just move it all inside gerrit. [17:25:47] I just moved jetty inside but it didn't help, I still get Invalid resource type gerrit::jetty [17:25:50] wtf? [17:26:33] <^demon> If you move it inside but don't rename, it'd be gerrit::gerrit::jetty? [17:26:41] I moved and renamed. [17:26:45] <^demon> Hrm. [17:26:52] Why can it find puppet::instance but not puppet::jetty? [17:26:59] I must be making some dumb syntactic mistake [17:27:24] http://pastebin.com/GfV93Jwb <- obvious syntactic mistake? [17:30:00] <^demon> I've never seen it done like that--with the instance passed to the other classes. [17:30:18] ok, but it's not getting as far as caring about the param is it? [17:30:18] <^demon> I have no idea anymore--I'm in way over my head. [17:31:01] Adding the instance as a param is an attempt to follow paravoid's advice, earlier. [17:31:21] But I'm stuck at roughly the same place as before, which is that puppet keeps claiming to have never heard of puppet::jetty [17:33:13] paravoid: help? 
[18:01:56] Ryan_Lane: Because I'm asking everyone… here is my current puppet puzzle: http://pastebin.com/JJEzPRHD [18:15:06] andrewbogott: because one is a definition and the other is a class [18:15:15] you invoke parameterized classes differently [18:15:26] class { "class::name": [18:15:51] * andrewbogott looks for an example [18:17:29] * Damianz pokes Ryan_Lane and wonders if he wants to merge a change into puppet [18:18:24] Ryan_Lane: OK, that makes sense, and seems to get me unstuck (or, rather, back to the problem I had before, at least.) [18:18:35] Damianz: which change? [18:19:42] https://gerrit.wikimedia.org/r/#/c/21297/ [18:19:46] fixes the ip for nrpe [18:20:07] Also can we actually delete the test branch now :P [18:21:01] I thought it was already ficed [18:21:03] *fixed [18:21:12] 120 isn't correct? [18:21:28] I don't believe so [18:21:33] it is [18:21:36] it definitely is [18:21:39] Really? [18:21:41] 10.4.0.120 [18:21:44] I'm on the system right now [18:21:51] ahhhh [18:21:58] wait no [18:21:59] Sure it's the right one [18:22:13] 120 is the correct address, though [18:22:37] hmm mk [18:22:53] 5666 is allowed from 10.4.0.0/24 [18:22:58] in the security groups [18:23:16] allowed_hosts=10.4.0.120 should be right then [18:23:21] yes [18:23:48] So why do I see no traffic from that host to an instance and/or why is nrpe not connecting to the instance properly [18:24:59] I'm thinking nrpe is not running properly on the server, maybe [18:25:08] because I just tested connectivity and it's working [18:25:18] (No output returned from plugin) [18:25:21] That could be a valid reason [18:25:22] that's not the same as can't connect [18:25:31] I just restarted nagios3 on the server [18:27:47] Hmm I'll check it in a while then, weird though as it's not /all/ servers just a large chunk of them [18:28:03] [2012-08-24 18:27:43] Warning: Return code of 127 for check of service 'Free ram' on host 'nagios' was out of bounds. 
Make sure the plugin you're trying to run actually exists. [18:28:10] would explain something though [18:28:19] yeah [18:28:20] * Damianz didn't realise the log was actually open now [18:28:22] that somehow went away [18:28:28] but the other checks shoudl work [18:28:35] it was lost in the test->prod merge [18:28:49] Ah, yeah that went more messy than it should have done :P [18:29:30] Hmm, what timezone is this box actually in? [18:29:50] It's like utc but I'd of thought sf [18:30:36] Oh I seee [18:31:06] Ryan_Lane: Looks like the syntax for check_nrpe is wrong, looking at status info anyway. [18:31:26] it's utc [18:31:29] all are in utc [18:31:48] the servers are in tampa, why would they be in sf time? :) [18:31:51] <3 utc [18:32:03] our servers are global, so utc is the only time that makes any sense [18:32:07] tampa is near enough sf, you do realise our entire country is like downtown for you? :P [18:32:31] <^demon> Near as in "3 timezones away?" [18:32:33] sf -> tampa is the equivalent of one side of europe to the other. heh [18:33:49] At least they're not in russia :) [18:34:30] Hah [18:34:34] It's actually significantly farther [18:34:41] SF-Tampa is 2,393 miles [18:34:48] Belfast-Istanbul is 1,856 miles [18:35:12] hmm [18:35:21] Lisbon-Moscow works, that's 2,436 [18:35:24] *2,437 mi [18:35:32] <^demon> RoanKattouw: WHY DO YOU KNOW THESE THINGS? [18:35:43] ^demon: Cause I'm looking them up as I type [18:35:50] I've just realised... bleh I was right [18:36:04] Dublin office *is* closer to me than the London office *profit* [18:36:14] By about 13miles... [18:37:13] <^demon> RoanKattouw: It's more fun when I imagine you just spouting off distances between two arbitrary locations like anyone should know it :p [18:37:48] I was thinking more stood infront of a huge map with a pointy stick wearing nerdy glasses like old school news [18:39:28] "Analyzing CFEngine Logs with Splunk" < bad subject to read quickly [18:40:36] <^demon> CF? As in...ColdFusion? 
[18:41:19] Nah, it's a config management system written in I believe C [18:41:51] <^demon> Ah, much better. I was like...people still use ColdFusion? [18:42:09] Linode still uses ColdFusion, never liked it personally [18:42:43] Yo Lane. Can I have a wikitech account? [18:42:59] lol [18:43:00] http://wikitech.wikimedia.org/view/Event_tracking <- my team wants help with docs like these [18:43:13] Why does "Yo Lane" sounds like the start of a rap song [18:43:17] <^demon> "Adobe claims that people are continuously adopting it, but the general attitude seems to be that Adobe is full of shit in that regard." [18:43:25] LOL [18:43:26] Everything I say sounds like a rap song, dawg. [18:43:46] ;) [18:43:47] <^demon> StevenW: E-mail address? Account name? [18:43:58] swalling@wikimedia.org, Steven Walling [18:44:25] <^demon> "A randomly generated password for Steven Walling has been sent to swalling@wikimedia.org." [18:44:33] You're a prince among men. [18:44:40] I really need to stop eating on my bed when I have a perfectally good desk and couch. Damn crumbs [18:45:05] <^demon> StevenW: Have fun :) [18:48:54] Hmm, can you have multiple alternative accounts on wikipedia without breaking the sockpuppet rules? [19:04:39] Ryan_Lane, paravoid: The gerrit::jetty class (created here http://pastebin.com/Zs7GHETJ) creates files from templates, and those templates need access to variables defined in the 'instance' that is passed into the jetty class. [19:04:45] Any idea what scope.lookupvar magic will get me that? [19:05:12] you should make those local variables [19:05:26] meaning, in the manifest, pull the variables from the class into local variables [19:05:58] You mean, make them local to the jetty class? [19:06:18] yes [19:06:27] And then how do I refer to them in the template? [19:06:34] as local variables :) [19:06:54] Oh, so, just ${varname} [19:07:09] I can't do something like ${instance::varname}? 
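A sketch of the "pull them into local variables" suggestion above, under stated assumptions: `gerrit::config`, `db_host`, and the file path are hypothetical names, and `inline_template` stands in for a real template file so the fragment is self-contained.

```puppet
# Hypothetical config class holding a shared setting.
class gerrit::config {
  $db_host = 'db.example.org'
}

class gerrit::jetty {
  include gerrit::config

  # Copy the value into this class's own scope...
  $db_host = $gerrit::config::db_host

  # ...so the template can refer to it as a plain local variable.
  # Newer Puppet releases expect the instance-variable form,
  # <%= @db_host %>, instead of the bare name used here.
  file { '/tmp/gerrit-template-example.conf':
    ensure  => present,
    content => inline_template("db_host = <%= db_host %>\n"),
  }
}

include gerrit::jetty
```

The template then depends only on the class that renders it, which keeps lookups into other scopes out of the template itself.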
[19:07:12] well, <%= varname %> [19:07:41] clearly I need to read the template docs [19:07:44] heh [19:08:23] I was nearly impressed with how well erb can handle hases/arrays. Though it was annoying that it seemed to randomly sort the keys causing template changes when the data was the same. [19:08:25] But, my question about looking inside $instance (which is a local variable) still stands. Possible? [19:11:16] why not just pass the variables into the jetty class? [19:11:41] you're passing them into the instance class, right? [19:12:18] <^demon> Prolly could get rid of the common class so we don't have to pass through that. [19:12:49] Ryan_Lane: At the moment, the instance class generates a bunch of things [19:13:34] it's likely best to have all the config in the role [19:13:43] that way you can pass it in where needed [19:13:56] Maybe. Trying to share some of the config work between multiple roles. [19:14:25] see how I'm doing it in the ldap role [19:14:33] I put the config in a separate class [19:14:40] then I can reference it from multiple places [19:16:07] I'm sure your design suggestions are right, but I still want to know the answer to my question :) [19:16:20] The docs say I can do it but then don't really say how :( [19:16:24] * andrewbogott reads more [19:24:32] there's a scope lookup [19:24:35] inside of templates [19:24:42] the docs used to recommend it [19:24:45] they don't any more [19:25:31] "Lance Armstrong takes drugs and wins the hardest bike race in the world 7 times. Programmers take drugs and what do we get? Node.js." [19:29:29] <^demon> s/Node.js/Ruby/ [19:30:53] I think node falls into the same group as ruby :D [20:20:48] how to mount project storage? [20:31:54] Ryan_Lane: I imagine, perhaps foolishly, that scopes are nested. So, scope lookup gets things from the global scope, but I am wondering about whether the local scope itself has subscopes. 
 local::subscope::variablethatisinsubscope [20:32:15] um… also maybe I should be saying 'namespace' instead of 'scope' :( [20:47:59] Platonides: it's automount [20:48:11] Platonides: just access inside of /data/project [20:48:16] it'll mount itself [20:49:39] bash: cd: /data/project: No such file or directory [20:52:13] Change on mediawiki a page Developer access was modified, changed by Sumanah link https://www.mediawiki.org/w/index.php?diff=575510 edit summary: /* User:Nbprithv */ [20:52:19] https://fbcdn-sphotos-h-a.akamaihd.net/hphotos-ak-ash3/575763_383677978346951_1587375987_n.jpg [20:53:03] Change on mediawiki a page Developer access was modified, changed by Sumanah link https://www.mediawiki.org/w/index.php?diff=575512 edit summary: /* User:Solon */ [20:53:44] Change on mediawiki a page Developer access was modified, changed by Sumanah link https://www.mediawiki.org/w/index.php?diff=575514 edit summary: /* User:Euloiix */ [20:59:47] Ryan_Lane, it doesn't automount [21:00:43] Platonides: which instance? [21:00:59] sube [21:01:14] I created it a few hours earlier [21:03:32] so it shouldn't have broken... [21:03:59] OTOH, if at some point automounting broke, old instances may have continued running [21:05:30] ok. gimme a min [21:06:18] hm [21:06:21] auth failed [21:06:27] I wonder if the script is failing [21:07:01] this process is running: /usr/sbin/automount -DGSERVNAME=projectstorage.pmtpa.wmnet -DSERVNAME=labs-nfs1... [21:07:18] and there's also /usr/sbin/glusterfs [21:08:04] there's a waiting time when I do a ls of /data/project [21:08:17] so I guess it's trying to access but failing [21:08:47] could this be related? 
[514.012471] svc: failed to register lockdv1 RPC service (errno 97) [21:10:30] I have now six copies of /usr/sbin/glusterfs --volfile-id=/tor-project --volfile-server=projectstorage.pmtpa.wmnet /data/project [21:10:42] seems there's one spawned each time I try to access it [21:18:02] seems to be a simple /bin/mount call in the background [21:19:58] why is it called with -f (aka. --fake) ? [21:20:36] mkdir: cannot create directory `/data/project': No such file or directory [21:20:39] this is crazy [21:20:51] is it possible that the project storage folder itself is missing? [22:37:14] Platonides: crap, sorry. I found a security vulnerabilty in something and needed to handle it upstream [22:37:21] let me look at that issue again [22:38:52] this is crazy < Damn you for putting that song in my head [22:39:13] Platonides: the automount is failing because the server is rejecting it [22:40:23] seems that it's in the auth.allow lsit [22:40:25] *list [22:40:45] whoops [22:40:57] of course, security vulns are more important :) [22:41:17] if you can see why the server doesn't like this instance... [22:43:23] seems this is yet another fucking glusterfs bug [22:44:10] lol [22:44:50] occasionally it loses its shit and ignores new entries into the auth.allow list [22:45:00] I hate this filesystem [22:45:50] btw, I found /home (nfs) really slow [22:45:57] yes [22:46:04] it's on a single instance [22:46:08] extracting a 17M tbz [22:46:13] don't use home :) [22:46:20] 46m31.764s [22:46:31] /tmp: 0m13.281s [22:46:32] home should only be used for environment [22:46:42] it'll eventually switch to glusterfs project storage [22:46:50] right now it all goes to a single underpowered instance [22:47:06] it was a temporary solution while we wanted for gluster to be stable [22:47:08] now, when project storage works... [22:47:11] and here we are. 
still waiting [22:47:54] I'll try again tomorrow [22:48:04] good night [22:48:52] night [22:49:25] I guess I can switch the servers to use iptables rather than gluster's broken auth.allow [22:49:28] since it's always broken [22:49:43] that's a lot of iptables rules, though [22:51:05] andrewbogott: sooooo :) [22:51:21] want to work on the gluster manage script some more? [22:51:36] this may also be something you want to do in the gluster plugin for openstack [22:51:37] sure. [22:51:52] replace auth.allow with iptables rules [22:51:59] back in a few mins [22:52:11] 'k [23:11:20] Ryan_Lane: Is it possible to have per-volume security using iptables? Aren't all the volumes accessed via the same port? [23:19:47] andrewbogott: back [23:19:52] andrewbogott: each share uses a port [23:20:21] now the question is, how do you get the port number... [23:20:49] ugh. I locked gluster up on labstore1 [23:28:28] What's the specific problem with auth.allow? [23:29:06] it doesn't work [23:29:15] there's some bug with it [23:29:20] it works for a while, then stops working [23:29:26] Oh, perfect :( [23:29:56] technically it just stops accepting changes to the list [23:31:02] Any chance we could just have a cron restart gluster frequently? [23:31:23] I'm looking at ways to get the port for a given volume… but I fear we'll be treading on an unpublished (hence unstable) api [23:33:19] no. that could cause serious issues [23:34:23] for instance. gluster is locked up on labstore1 because I tried to do just that [23:34:40] but, I'm pretty sure the shares always use a specific port [23:35:04] Looks like they do. I'm just not sure how to determine what it is. [23:35:10] people in the #gluster channel are already doing it [23:35:12] yeah [23:35:14] that's the hard part [23:35:25] I really wish I could get this damn service to start [23:36:09] part of the handshake between the gluster client and server involves discovering the port. 
 So we can pretend to be a gluster client… that's where the 'unpublished api' comes in [23:36:49] oh. awesome. a segfault? [23:37:16] I bet it's defined in the brick config [23:37:28] ugh. we'd need to do this on all nodes, too [23:37:31] this sucks [23:37:39] The brick port is different from the share port, though [23:38:07] Um… isn't it? Or does each share need its own private bricks? I can't remember [23:38:14] there's brick config per share [23:38:53] Oh, which is written out someplace by 'gluster volume create'? [23:39:21] yeah [23:39:24] it's all in config files [23:40:55] There are config files in e.g. /etc/glusterd/vols/juju-home/bricks [23:41:03] But each just says "listen-port=0" [23:41:06] heh [23:42:05] well, wait, a few of these config files have valid port #s in them. [23:42:12] I wonder why some and not all or none... [23:44:34] For each share I've looked at, there's a port specified for the brick on labstore2. The port is marked as 0 for every other host. [23:45:27] bah / was full on labstore1 [23:45:32] no wonder it wouldn't start [23:45:44] lol [23:45:45] maybe that's also the reason that auth.allow wasn't working [23:46:00] * andrewbogott holds breath [23:46:04] I hope [23:46:32] bah apple-q [23:46:40] We need a quick fix, you're due at the pub in 10 minutes. [23:46:56] heh [23:47:23] seems the auth.allow was wiped when I restarted too [23:47:33] the script is fixing that [23:47:58] Gluster just throws away all its security settings when the server restarts? [23:49:34] it shouldn't [23:49:46] I have no clue what's up with this shitty filesystem [23:50:32] maybe this has something to do with / filling on labstore1, though [23:50:40] I'm going to restart services everywhere [23:50:44] we need an alert on that [23:52:25] Awesome I just managed to kill mysql with locks [23:52:33] heh [23:53:52] Turns out piping the raw feed of 3 wikis in there is a bad idea, time to use redis for the queue. 
[23:59:23] andrewbogott: gluster volume status shows the port [23:59:52] So it does! I didn't know there was 'status' in addition to 'info' [23:59:58] it's new