[00:03:54] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.65, 0.89, 4.23
[00:05:05] Helllz yeah
[00:05:17] Styled Gerrit set up at http://gerrit.pmtpa.wmflabs:8080/
[00:05:20] Ah, a Krinkle !
[00:05:45] Firefox can't find the server at gerrit.pmtpa.wmflabs.
[00:05:55] It's not a public IP
[00:05:57] sumanah: Yeah it doesn't have a public IP, you need to proxy to it to get there
[00:05:59] :)
[00:06:02] Hey Roan
[00:06:07] I could ask Ryan_Lane for one
[00:06:13] * Damianz straightens Krinkle
[00:06:17] Krinkle: You wanna help me out doing some CSS and JS tweaks to Gerrit?
[00:06:25] One of the OpenStack guys found a way to do it
[00:06:27] Damianz: Gooood morning
[00:06:33] RoanKattouw: I read about that
[00:06:46] Gerrit needs a whole new face, not just a facelift :P
[00:06:49] I know
[00:06:53] Damianz: btw have you heard about https://www.mediawiki.org/wiki/Berlin_Hackathon_2012 ?
[00:06:55] RoanKattouw: could you add some more squids to deployment-prep please? it's running quite slow
[00:06:55] But I'm gonna do two trivial fixups
[00:06:56] Can't find the exact link, but I remember seeing a mockup of sorts that openstack ppl were proud of
[00:07:05] Thehelpfulone: I'm not involved with that project at all
[00:07:13] Oooh berlin :D
[00:07:16] Krinkle: http://review-dev.openstack.org
[00:07:20] Thehelpfulone: Not at the moment
[00:07:27] I meant a blogpost, but this was the preview indeed
[00:07:35] http://www.linuxjedi.co.uk/2012/03/changes-coming-to-gerrits-style.html
[00:07:37] that one
[00:07:46] My Gerrit looks exactly the same sans the logo
[00:07:52] Oh, LinuxJedi blogged about it?
[00:07:54] * RoanKattouw reads
[00:07:54] bah, all these sysadmins that aren't involved in the projects!
[00:08:03] I'm not even a sysadmin :)
[00:08:15] you are, and a netadmin too
[00:08:23] your name is catrope right?
[00:08:28] Thehelpfulone: he's being modest
[00:08:40] Yes
[00:08:44] Deployment prep needs some other stuff doing -- like a proxy put in front of it to distribute to the squids before it can have moar.
[00:08:45] I'm sort of a sysadmin, but that's not my main job
[00:08:49] The openstack skin is a bit more breathy, which is nice. The colors are bad imho though, but that's just because they're going with the puristic logo-colors only
[00:08:49] anyway, RoanKattouw, wanna show me how to ... proxy ... thing?
[00:08:50] w
[00:08:53] which is a good start
[00:08:59] RoanKattouw: how are you connecting to this via a socks-proxy?
[00:09:03] sumanah: [[labsconsole:Access]] , see the section about SOCKS proxy
[00:09:04] yeah, I'll need to set up that proxy as well
[00:09:04] I want to see your changes :D
[00:09:05] Ryan_Lane: Yes
[00:09:11] it isn't working for me
[00:09:16] https://labsconsole.wikimedia.org/wiki/Help:Access#Accessing_web_services_using_a_SOCKS_proxy
[00:09:27] What's "it" and how is it "not working"?
[00:09:44] sumanah: Essentially that means doing ssh bastion.wmflabs.org -D 8080 and then setting up some FoxyProxy rules
[00:10:15] -L 8080:internalip:8080 http://localhost:8080. Screw SOCKS
[00:10:22] where do I go to verify that this fingerprint is that of bastion?
[00:10:28] Ryan_Lane: I really just did the top two file additions @ https://review.openstack.org/#change,5234,patchset=4
[00:10:47] I don't know where we have these fingerprints
[00:10:51] RoanKattouw: I'm connected to a socks proxy on bastion-restricted
[00:10:57] Puppet probably knows them
[00:11:01] and when I try to connect to your instance on 8080 it doesn't work :(
[00:11:25] we don't have the fingerprints available yet
[00:11:32] I have some ideas on how to include them
[00:11:38] we have an open bug, actually
[00:11:41] Ryan_Lane: nc gerrit 8080 GET / works for me
[00:11:42] with implementation details
[00:11:46] from bastion
[00:11:59] Oh wait, that was from gerrit
[00:12:15] Same deal on bastion1 though
[00:12:21] I can get there from bastion-restricted1 too
[00:12:43] Wait, why did bastion-restricted1 boot me?
[00:12:46] Is that because it's ops-only?
[00:12:53] well...
[00:12:58] (for now)
[00:13:01] I haven't switched over all shell people yet
[00:13:16] I mentioned I'd send a follow-up email when I do that
[00:13:30] Right
[00:14:19] Hmm
[00:14:27] Ryan_Lane: Could I just get a public IP for the gerrit dev instance?
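[editor's note: the bastion fingerprints weren't published yet at this point in the conversation. As a hypothetical sketch, this is the kind of check sumanah was asking for; a throwaway key stands in for the bastion's real host key.]

```shell
# Start clean, then generate a throwaway RSA key to stand in for the
# server's host key (this is NOT the real bastion key).
rm -f /tmp/demo_host_key /tmp/demo_host_key.pub
ssh-keygen -q -t rsa -b 2048 -N '' -f /tmp/demo_host_key

# Print its fingerprint. Against a live host you would compare this line
# with the fingerprint published by the admins, or with what a key scan
# reports, e.g.:
#   ssh-keyscan bastion.wmflabs.org 2>/dev/null | ssh-keygen -lf -
ssh-keygen -lf /tmp/demo_host_key.pub
```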
[00:15:38] Krinkle: I'll add you to the project so you can SSH into the machine and edit the stylesheet
[00:15:47] sure
[00:15:56] you should really proxy to gerrit
[00:16:00] from apache
[00:16:07] Right
[00:16:12] I'll look that up in the gerrit puppet manifest
[00:16:14] I can't get the thing to work in the browser
[00:16:27] That's OK, we're setting up a public IP
[00:16:37] ok
[00:17:20] :8080 doesn't match my foxyproxy rule :)
[00:17:27] heh
[00:17:42] RoanKattouw: ok, you can allocate an IP
[00:17:46] Thanks
[00:17:54] note, the manage addresses special page takes an eternity
[00:20:14] 03/15/2012 - 00:20:13 - Creating a home directory for krinkle at /export/home/gerrit/krinkle
[00:20:25] Got it working at http://gerrit.pmtpa.wmflabs:8080/
[00:20:33] yay
[00:20:37] * Ryan_Lane wants to see
[00:20:40] $ ssh @bastion.wmflabs.org -D 8080; and in Mac OS X proxy prefs, localhost:8080 in Socks
[00:21:13] ah
[00:21:15] 03/15/2012 - 00:21:15 - Updating keys for krinkle
[00:21:24] I changed my proxy settings to always use labs :)
[00:21:52] Krinkle: You can ssh into bastion.wmflabs.org and then from there into gerrit, the relevant files are /var/lib/gerrit2/review_site/etc/{GerritSite.css,GerritSiteHeader.html}
[00:22:20] Once I set up a public IP you can ssh in directly too
[00:23:36] Change on 12mediawiki a page Wikimedia Labs was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=510957 edit summary: /* Documents */
[00:24:16] Krinkle: 208.80.153.233 is the public IP
[00:24:35] * Krinkle updated https://labsconsole.wikimedia.org/wiki/Help:Access#Accessing_web_services_using_a_SOCKS_proxy
[00:24:47] RoanKattouw: add a hostname to it :)
[00:24:56] I'm trying to
[00:25:01] yeah, slow as shit
[00:25:02] But like you said, that address page is hella slow
[00:25:08] And I have to reload that every time
[00:25:08] really poor code there
[00:25:12] Actually associating it was quick
[00:25:16] too many lookups
[00:26:11] gerrit-dev.wmflabs.org
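[editor's note: a sketch of the two tunnel styles discussed above (SOCKS via -D, plain local forward via -L). `ssh -G` prints the resolved client config without connecting, so the flags can be sanity-checked offline; hostnames are the ones from the conversation.]

```shell
# SOCKS proxy on localhost:8080 (then point FoxyProxy or the OS proxy
# prefs at localhost:8080). -G only prints the effective config.
ssh -G -D 8080 bastion.wmflabs.org | grep -iE '^(dynamic|local)forward'

# Plain local forward straight to the instance ("Screw SOCKS"):
# localhost:8080 then maps to gerrit.pmtpa.wmflabs:8080 through the bastion.
ssh -G -L 8080:gerrit.pmtpa.wmflabs:8080 bastion.wmflabs.org | grep -iE '^(dynamic|local)forward'
```

Dropping -G (and the grep) runs the actual tunnels.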
[00:26:47] So full link: http://gerrit-dev.wmflabs.org:8080/
[00:27:08] Can I log in with something?
[00:27:12] Ryan_Lane: has db replication been set up on labs yet?
[00:27:13] (onto gerrit)
[00:27:25] Thehelpfulone: nope
[00:27:34] Krinkle: Click "become" in the top right corner
[00:27:41] It's in development mode where anyone can become any user
[00:27:44] we're about to have public datasets available to all instances read-only, though
[00:27:44] okay, I imagine there's a roadmap somewhere that I should look for..
[00:27:50] I may or may not want to disable that now that the instance has a public IP
[00:27:55] we don't have a timeframe for it right now
[00:28:24] I don't think I like the red text
[00:28:37] it makes me think of mediawiki red-links
[00:28:43] I just stole the OpenStack stylesheet
[00:28:47] Krinkle was also critical of it
[00:29:00] the red is awful
[00:29:13] I'm hoping he'll channel that criticism into a better stylesheet :)
[00:29:23] JRWR: what error are you getting?
[00:29:43] I'm gonna mess with the JS now
[00:30:22] Ryan_Lane: There were no Nova credentials found for your user account. Please ask a Nova administrator to create credentials for you.
[00:30:26] Ryan_Lane: okay, is there an ETA for that, is labs intended to be able to be used to host web tools that might be hosted on the toolserver?
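[editor's note: the "become any user" behavior Roan describes is Gerrit's development authentication mode, controlled by the auth section of review_site/etc/gerrit.config. A sketch of the fragment; switching the type to a real auth backend is what disables "become".]

```ini
# review_site/etc/gerrit.config (fragment)
[auth]
    type = DEVELOPMENT_BECOME_ANY_ACCOUNT
```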
[00:30:35] JRWR: you need to log out and log in again
[00:30:40] that's happened to me a couple of times
[00:30:42] from the wiki that is
[00:32:05] yay it works
[00:34:18] http://gerrit-dev.wmflabs.org:8080/ is supposed to work now?
[00:34:27] I just shut down the proxy
[00:34:37] but this URL isn't working for me yet
[00:34:38] Krinkle: works for me
[00:34:52] and me
[00:35:22] Yes, that's supposed to work
[00:35:24] It works for me
[00:35:55] oh, the stupid socks proxy set in Mac OS takes everything over
[00:35:59] I need to disable it first
[00:36:04] can't even access wikipedia
[00:36:18] working now :)
[00:36:22] ok, with testing the response times (using pingdom) the avg response time of this page: http://en.wikipedia.org/wiki/Index_of_Windows_games_(A) is 3.77s
[00:36:41] My wiki with nginx cache response time is 2.32s avg
[00:36:46] over 10 tests for both
[00:37:07] I tested http://pcgamingwiki.com/wiki/Home
[00:39:24] Looks like the nginx is able to connect faster and handle the cache lookup faster than whatever you guys are using, needs more testing, but its looking good
[00:41:28] we're very unlikely to switch away from varnish for caching
[00:41:57] I know
[00:42:04] too much control over varnish
[00:42:05] but… it's good to have the info available, especially with stats
[00:42:09] !log gerrit Installing Apache
[00:42:10] Logged the message, Mr. Obvious
[00:42:24] it's good for single-server setups
[00:42:24] and, even when we switch to hiphop, we still need a server for proxying
[00:42:32] it may be good to use nginx rather than apache for it
[00:42:46] !account-questions | paravoid
[00:42:46] paravoid: I need the following info from you: 1. Your preferred wiki user name. This will also be your git username, so if you'd prefer this to be your real name, then provide your real name. 2. Your preferred email address. 3. Your SVN account name, or your preferred shell account name, if you do not have SVN access.
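[editor's note: a rough local stand-in for the pingdom comparison above, averaging curl's total transfer time over a few requests. A file:// URL is used so the snippet runs offline; point URL at the page under test instead.]

```shell
URL='file:///etc/hostname'   # placeholder; use the real page URL
total=0
for i in 1 2 3 4 5; do
  # %{time_total} is curl's wall-clock time for the whole transfer
  t=$(curl -o /dev/null -s -w '%{time_total}' "$URL")
  total=$(awk -v a="$total" -v b="$t" 'BEGIN{print a+b}')
done
awk -v s="$total" 'BEGIN{printf "avg over 5 requests: %.3fs\n", s/5}'
```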
[00:42:46] Apache is the replayer of Web servers :)
[00:43:06] RoanKattouw: I'll try to wing it towards Vector for starters
[00:43:18] OK
[00:43:30] I've rewritten the JS using jQuery, now I have to install Apache so I can actually serve jQuery
[00:43:45] silly question: how do I reply to the bot? 1. paravoid 2. etc.?
[00:43:52] can reply to me :)
[00:43:54] I plan on making a guide on how to make a mediawiki install handle a slashdotting, nowadays it's called RedditDoS
[00:43:54] Hm.. you're changing the actual logic?
[00:43:58] it's just a helper for me
[00:44:13] JRWR: cool
[00:44:14] or is the logic not well separated with JS and do you need to change it in order to make other changes
[00:44:35] oh and php5.4.0 is a blazin on Mediawiki 1.18.1
[00:45:48] paravoid: ok, made an account for you. it'll email you a password
[00:45:53] !initial-login | paravoid
[00:45:53] paravoid: https://labsconsole.wikimedia.org/wiki/Access#Initial_log_in
[00:46:12] you can ignore the part about resetting your password, since it's sending you one
[00:50:46] paravoid: so, I've made you a full admin on labsconsole
[00:51:15] * Krinkle opens inbox to see 390 emails from toolserver because SGE is crashing (or rather, it's not up and cronsub commands fall on their face)
[00:51:48] admin in the wiki, and a cloudadmin in nova. the former allows you to add people to cloudadmin, the latter lets you do all of the openstack related things in the interface
[00:52:17] that said, no one except the ops team should be in either of those things :)
[00:52:57] Toolserver is dodgy, has issues pretty much every week.
[00:53:11] Krinkle: No, not changing logic
[00:53:18] Krinkle: JS exists only to apply the "patch" class to patch views
[00:54:09] RoanKattouw: There is a lot more JS in gerrit though, right? You mean that's the only change you're making.
[00:54:19] Damianz: labs only has issues when I totally fuck it up!
[00:54:20] :D
[00:54:57] though a hardware failure would be problematic, depending on the node
[00:55:02] Krinkle: No, this is separate from Gerrit's JS. Gerrit's JS is compiled from Java or something, this is injected elsewhere
[00:55:16] wtf
[00:55:19] ok
[00:56:21] Ryan_Lane: I'd like to see our ram graphs if a node failed :D
[00:58:08] well, the instances wouldn't restart
[00:58:17] so, it would stay about the same ;)
[00:59:14] openstack's availability is kind of crap
[00:59:42] I could manually cause the instances to start on another node
[01:00:07] but I'd have to modify the database :D
[01:00:47] RAWR
[01:00:51] SSLCertificateFile: file '/etc/ssl/certs/star.wikimedia.org.pem' does not exist or is empty
[01:01:00] RoanKattouw: add it via puppet
[01:01:03] oh
[01:01:04] heh
[01:01:06] right
[01:01:10] Apparently even the Apache proxy config for Gerrit is WMF-specific
[01:01:11] * RoanKattouw STABS
[01:01:24] lemme turn it into a template :)
[01:01:35] I'll just rip out the SSL part
[01:01:39] locally
[01:02:03] Wait
[01:02:06] :80 redirects to :443
[01:02:26] RedirectMatch ^/$ https://gerrit.wikimedia.org/r/
[01:02:28] Of course
[01:02:47] I'm fixing it :)
[01:02:48] Can't that just redirect to relative paths
[01:02:48] gimme a sec
[01:03:04] no, it can't
[01:03:25] Boo
[01:05:16] [Thu Mar 15 01:05:12 2012] [warn] NameVirtualHost *:80 has no VirtualHosts
[01:05:18] Is that normal?
[01:05:32] hm
[01:05:33] maybe?
[01:05:34] :)
[01:06:14] PROBLEM Disk Space is now: CRITICAL on mobile-enwp mobile-enwp output: DISK CRITICAL - free space: / 259 MB (2% inode=79%):
[01:07:10] ok, pushing to prod, then cherry-picking to test
[01:07:25] Meh
[01:07:36] Ryan_Lane: How do I open port 80 to the world on my instance?
[01:07:47] security group
[01:07:54] did you read the docs? :)
[01:07:56] !instances
[01:07:56] https://labsconsole.wikimedia.org/wiki/Help:Instances
[01:07:59] !security
[01:07:59] https://labsconsole.wikimedia.org/wiki/Help:Security_Groups
[01:08:07] https://labsconsole.wikimedia.org/wiki/Special:NovaSecurityGroup gives me a 404 !!
[01:08:13] seriously?
[01:08:24] wtf
[01:08:24] PROBLEM Free ram is now: WARNING on mobile-enwp mobile-enwp output: Warning: 10% free memory
[01:08:26] Something is screwed up there
[01:08:38] something is screwed up on labsconsole as a whole
[01:08:51] php is broken somehow
[01:09:13] did someone puke on apache again?
[01:09:30] Ryan_Lane: Wait, did you just DELETE files/apache/sites/gerrit.wikimedia.org ?
[01:09:37] That looks wrong
[01:09:41] it is. kind of
[01:09:43] !g 3179
[01:09:43] https://gerrit.wikimedia.org/r/3179
[01:09:53] I didn't merge it all the way through
[01:10:15] ok, what the fuck is up with php on labsconsole?
[01:10:22] and why all of a sudden did it stop working?
[01:10:57] check error logs
[01:11:14] RECOVERY Disk Space is now: OK on mobile-enwp mobile-enwp output: DISK OK
[01:11:23] Well, did you deploy 3179 to labsconsole?
[01:11:28] no
[01:11:39] I didn't even merge it through to production
[01:11:43] via sockpuppet
[01:12:00] I had this issue at some point before....
[01:12:13] php is installed...
[01:12:20] ok, that's good
[01:12:25] do the files exist?
[01:12:30] Try gracefulling apache, see what errors you get
[01:12:35] rc libapache2-mod-php5
[01:12:40] that's fucked up
[01:12:45] whoops
[01:12:58] Oops
[01:13:42] how the hell did *that* happen?
[01:14:02] wow, the way the core stylesheet is generated is just freaking me out
[01:14:08] I didn't even try to find out
[01:14:18] -size:9pt;}.GCLMTUVDOB .GCLMTUVDGG{border-left:1px solid ',X0d=';white-spa
[01:14:23] that's the second time that's happened too
[01:14:47] some kind of javascript interpreter because the class names can potentially change
[01:15:00] from what I gather the classnames are chosen as short as possible, so if gerrit adds an element somewhere, all classnames change
[01:15:43] I was wondering why the stylesheet from openstack is so generic, not styling any individual elements (like the annoying double border when you're logged one where a star is in-between when logged in), that column has a class but it's not stable
[01:15:57] logged in*
[01:16:04] the default green stylesheet removes that border
[01:16:08] openstack doesn't
[01:16:11] I'll take my chances
[01:17:18] Krinkle: We can always add JS to find those elements and add classes
[01:17:29] yeah, we'll see
[01:17:55] http://i.imgur.com/kqPXd.png
[01:21:38] hehe, easiest way to get my hands on the dynamic stylesheet is prompt('x', document.styleSheets[1].ownerNode.innerHTML)
[01:21:48] can't even access it through web inspector, it's in a sub document
[01:21:49] weird
[01:22:00] Yay
[01:22:06] http://gerrit-dev.wmflabs.org is up
[01:22:14] Proxy is working, and port 80 is open to the world now
[01:22:39] what happened Ryan_Lane, maybe it's something that we can help prevent
[01:23:42] PROBLEM Free ram is now: CRITICAL on mobile-enwp mobile-enwp output: Critical: 5% free memory
[01:25:13] it's because we have something set to ensure => latest in puppet, and it didn't install cleanly
[01:25:33] More boxes going down now
[01:25:43] Oh, singer is back up but now ekrem is down
[01:25:56] (Apache down, that is)
[01:31:28] well, I surely hope the php on the cluster doesn't have this issue
[01:31:28] heh
[01:31:59] RoanKattouw: ekrem has other issues
[01:32:09] it's just overloaded by that AppleDictionary thingie
[01:32:32] RoanKattouw: I've messed with chmod a little bit on /var/lib/gerrit2/review_site/etc so that I can edit the file as 'krinkle' in my code editor over ssh
[01:32:44] 777 for now, just fyi
[01:33:43] PROBLEM Current Load is now: WARNING on bots-sql3 bots-sql3 output: WARNING - load average: 4.42, 6.04, 5.45
[01:35:07] OK, sure
[01:35:18] I am editing the .html file now
[01:36:27] ok
[01:37:13] RoanKattouw: So I reverse-engineered the default stylesheet (as you might have noticed) and will work from here instead. The openstack one used too many !important and #id-heavy selectors to fight against
[01:38:05] Kept a copy of the default one before I made changes so that if these class names change we can diff GerritSite-gerrit-default-extraction-tidy.css to GerritSite.css and should be able to re-create it
[01:38:17] OK, thanks
[01:38:33] which we'll need with classnames all looking alike "GCLMTUVDGI"
[01:38:36] :D
[01:38:38] So am I right that almost all classes and IDs are generated sequentially
[01:38:43] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 1.20, 2.93, 4.27
[01:38:53] Or are class names like GCLMTUVDGI actually stable?
[01:39:09] I'm fine with having nondescript class names as long as they're stable
[01:39:16] and map to one and only one concept
[01:40:29] that's the most ridiculous class name I've ever seen
[01:41:18] Hey it's shorter than gerrit-review-state-plustwo
[01:41:35] So clearly the shorter class name improves performance, right?
[01:42:15] Krinkle: Is review_site/etc/GerritSite.css in a non-broken deployable state? I want to restart Gerrit so it picks up my JS changes, but that picks up your CSS changes in that file as well
[01:42:31] deployable where
[01:43:54] gerrit-dev
[01:44:04] Hmm it looks like it's already sort of on there
[01:44:06] nm, restarting
[01:44:18] yeah, I've been editing live for the past half hour
[01:44:26] seems to apply right away
[01:44:29] Yeah
[01:44:36] The HTML changes don't but the CSS changes do
[01:44:38] apparently
[01:44:47] ok
[01:46:06] sumanah: http://gerrit-dev.wmflabs.org/r/#change,publish,2,3
[01:46:16] It's buggy cause it also changes the 0 for verified
[01:46:48] That's because of the way it's implemented. I'll put it on pastebin, and reading it should make you realize what a terrible terrible person that code makes me
[01:50:00] Behold: http://pastebin.com/R1vG7cz4
[01:50:05] One of the most terrible hacks I've ever written
[01:50:21] And the fact that it also changes the Verified box tells you why
[01:56:43] PROBLEM Current Load is now: WARNING on bots-sql3 bots-sql3 output: WARNING - load average: 7.26, 7.04, 5.90
[01:58:41] RoanKattouw: weird, it seems the default stylesheet comes after GerritSite.css
[01:58:51] I'm copying a rule and changing a value, it's overridden
[01:58:51] *sigh*
[01:59:00] argh
[01:59:48] Krinkle: I'll file a bug for that with Gerrit
[02:00:52] OK https://code.google.com/p/gerrit/issues/detail?id=864 exists for human-readable class names
[02:01:12] let me send you a screenshot of the bug
[02:01:24] Please do
[02:02:28] http://i.imgur.com/zA2Ej.png
[02:02:30] RoanKattouw:
[02:04:00] Thanks
[02:10:02] https://code.google.com/p/gerrit/issues/detail?id=1290
[02:10:05] Please star
[02:10:57] And star https://code.google.com/p/gerrit/issues/detail?id=864 too
[02:20:21] +3 stars on both :)
[02:20:47] nice now that googlecode supports easy google account switch
[02:21:55] haha
[02:25:12] grmpf, this gerrit stuff is getting ugly. I'll swap the openstack one and my working copy. It's too much work to make little changes.
[02:25:28] not worth the time to wing it, probably better to instead design off-code first
[02:26:19] what did you change in the html?
[02:26:29] added "Gerrit Code Review" ?
[02:26:30] I'm surprised you didn't go with something like GitHub
[02:26:59] Yes
[02:27:02] 03/15/2012 - 02:27:02 - Updating keys for andrew
[02:27:05] 03/15/2012 - 02:27:04 - Updating keys for andrew
[02:27:06] 03/15/2012 - 02:27:06 - Updating keys for andrew
[02:27:14] 03/15/2012 - 02:27:14 - Updating keys for andrew
[02:36:40] Krinkle: Yeah, added that and a bunch of JS
[02:36:46] k
[03:29:34] any known issues with labs bastion at the moment?
[03:31:57] you guys can still log in as normal?
[03:44:01] !log bastion - getting my Connection closed after the "If you are having access problems.." message. just me or all ?
[03:44:02] Logged the message, Master
[04:14:40] PROBLEM Current Load is now: WARNING on bots-sql3 bots-sql3 output: WARNING - load average: 2.03, 4.73, 5.04
[04:19:40] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 2.63, 3.70, 4.51
[04:33:50] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Socket timeout after 10 seconds.
[04:38:50] PROBLEM Current Load is now: WARNING on mobile-enwp mobile-enwp output: WARNING - load average: 8.60, 8.40, 8.45
[05:18:45] * jeremyb stumbles in
[05:19:43] yeah, so i am being disconnected by the bastion host
[05:19:51] mutante: ssh -vvv ?
[05:20:01] but AFTER i already get the "If you are having access problems.." message.
[05:20:08] true, let's see
[05:21:00] i get
[05:21:04] debug3: input_userauth_banner
[05:21:04] If you are having access problems, please see: https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[05:21:07] debug1: Authentications that can continue: publickey
[05:21:15] so, you seeing the banner means nothing
[05:23:06] mutante: paste it?
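[editor's note: a toy version of the snapshot-and-diff workflow Krinkle describes for tracking Gerrit's generated class names; the file contents and names below are made up for illustration.]

```shell
# Snapshot of the generated default stylesheet, taken before local edits:
cat > /tmp/gerrit-default-old.css <<'EOF'
.GCLMTUVDGI { border-left: 1px solid #eee; }
EOF

# The edited copy after an upgrade shifted the obfuscated class names:
cat > /tmp/GerritSite.css <<'EOF'
.GCLMTUVDGJ { border-left: 1px solid #eee; }
EOF

# The diff shows which class names moved, so rules can be re-mapped.
# (diff exits non-zero when files differ, hence the `|| true`.)
diff /tmp/gerrit-default-old.css /tmp/GerritSite.css || true
```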
[05:23:34] i can't say stick it in my $HOME because you can't connect ;-P
[05:24:19] re, distracted
[05:24:23] debug2: we sent a publickey packet, wait for reply
[05:24:23] debug1: Server accepts key: pkalg ssh-rsa blen 535
[05:24:23] debug2: input_userauth_pk_ok: fp 15:23:97:1a:2a:fc:c0:f3:6a:3b:f6:5f:71:d8:86:fd
[05:24:26] Connection closed by 208.80.153.194
[05:26:57] that's -vvv ?
[05:28:38] -vv
[05:28:50] debug1: Server accepts key: pkalg ssh-rsa blen 535
[05:28:50] debug2: input_userauth_pk_ok: fp 15:23:97:1a:2a:fc:c0:f3:6a:3b:f6:5f:71:d8:86:fd
[05:28:53] debug3: sign_and_send_pubkey: RSA 15:23:97:1a:2a:fc:c0:f3:6a:3b:f6:5f:71:d8:86:fd
[05:28:56] Connection closed by 208.80.153.194
[05:29:07] mutante: try -vvv ?
[05:29:19] oh
[05:29:21] you just did
[05:29:22] that is -vv now, debug3
[05:29:27] -vvv
[05:32:59] $ ssh-keygen -l -F bastion.wmflabs.org | tail -n 1
[05:33:02] 2048 26:02:b9:20:f7:6c:5c:c8:2d:58:a2:4c:27:f8:f5:a7 bastion.wmflabs.org (RSA)
[05:33:02] mutante: ^ ?
[05:34:38] mutante: you can log into the wiki?
[05:37:40] hah
[05:37:40] export INSTANCENAME=bots-apache1
[05:37:41] PS1='${debian_chroot:+($debian_chroot)}\u@${INSTANCENAME}:\w\$ '
[05:38:53] * jeremyb pokes mutante
[05:38:59] * jeremyb is sleeping sooooon
[05:41:43] > A place where magic happens
[05:41:49] i wonder when that showed up
[05:43:40] PROBLEM Free ram is now: WARNING on mobile-enwp mobile-enwp output: Warning: 7% free memory
[05:43:42] jeremyb: sorry, re
[05:43:47] jeremyb: yes, i can log in to the wiki
[05:43:53] mutante: add yourself to bots
[05:44:07] mutante: try ssh -vvv bots.wmflabs.org
[05:44:31] (shouldn't work but it does. let's use it since we need it, as long as it's open!)
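[editor's note: a self-contained sketch of the known_hosts lookup jeremyb used above (`ssh-keygen -l -F`). A toy known_hosts file is built from a throwaway key; normally -F reads ~/.ssh/known_hosts.]

```shell
# Throwaway key standing in for the bastion's host key (NOT the real one).
rm -f /tmp/demo_key /tmp/demo_key.pub
ssh-keygen -q -t rsa -b 2048 -N '' -f /tmp/demo_key

# known_hosts entries are "hostname keytype base64key"; the .pub file is
# "keytype base64key comment", so take the first two fields.
printf 'bastion.wmflabs.org %s\n' "$(cut -d' ' -f1,2 /tmp/demo_key.pub)" \
  > /tmp/known_hosts_demo

# Look up the host and print the recorded key's fingerprint, like the
# "2048 26:02:b9:..." line above.
ssh-keygen -l -F bastion.wmflabs.org -f /tmp/known_hosts_demo
```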
[05:45:35] mutante: (and hurry up ;) )
[05:46:53] on it
[05:47:09] 03/15/2012 - 05:47:09 - Creating a home directory for dzahn at /export/home/bots/dzahn
[05:48:09] 03/15/2012 - 05:48:09 - Updating keys for dzahn
[05:48:09] 03/15/2012 - 05:48:09 - Updating keys for jeremyb
[05:48:12] 03/15/2012 - 05:48:12 - Updating keys for jeremyb
[05:48:16] 03/15/2012 - 05:48:15 - Updating keys for jeremyb
[05:48:16] 03/15/2012 - 05:48:16 - Updating keys for jeremyb
[05:48:19] jeremyb: that works
[05:48:22] 03/15/2012 - 05:48:22 - Updating keys for jeremyb
[05:48:27] labs-home-wm: quiet
[05:48:38] all i did was change my password not my keys...
[05:50:21] trying to remove and add myself from/to bastion
[05:50:31] it looks like i don't have a valid shell there anymore
[05:51:08] $ getent passwd dzahn
[05:51:08] dzahn:x:2075:550:Dzahn:/home/dzahn:/bin/bash
[05:51:16] next try?
[05:51:58] doesn't change
[05:53:02] well, you were about to leave and all i wanted was to confirm it is not a global problem .. for now
[05:53:11] so don't worry for now
[05:53:57] you can't look at auth.log i suppose
[05:54:45] i could if i could sudo!
[05:54:59] (so no)
[05:55:16] * mutante nods
[05:55:18] mutante: you don't have the root key?
[05:55:25] or password or whatever
[05:55:40] no, not for labs bastion
[05:55:51] should be the same for all labs hosts
[05:55:55] or not that i know of where it is
[05:56:54] well, i could sudo if i was logged in with this user
[05:57:10] and then check what it is :)
[05:57:23] yeah, that much i got
[05:58:40] PROBLEM Free ram is now: CRITICAL on mobile-enwp mobile-enwp output: Critical: 3% free memory
[06:00:24] I'm guessing labs-nagios-wm has just informed me why my attempt to create some monsters has failed.
[06:01:17] it's a ryan!
[06:01:19] My fault.
[06:01:27] just so you know ahead of time.
[06:01:41] Ryan_Lane: bastion1 won't let mutante in and i can't figure out why. he can get in to other hosts
[06:01:51] i'm attempting to allocate 3x m1.XL for cassandra and hadoop, as instructed by CT
[06:01:56] mutante: did you miss the email?
[06:02:02] unfortunately, it appears we lack sufficient RAM.
[06:02:04] it was labeled "Important" :)
[06:02:16] dschoon: yes, we lack sufficient ram
[06:02:27] until the ciscos replace the current hardware
[06:02:32] hokay.
[06:02:34] then we'll have more than enough ram
[06:02:39] Ryan_Lane: ouch :) i see
[06:02:46] mutante: you must log into bastion-restricted
[06:03:03] another reason to not dick about with labs when we should just be working on bare metal.
[06:03:08] but i digress. ty for help.
[06:03:12] yw
[06:03:33] i agree, labs should only be used for testing the configurations and such
[06:03:40] PROBLEM Free ram is now: WARNING on mobile-enwp mobile-enwp output: Warning: 8% free memory
[06:03:41] it's a poor environment for performance testing
[06:03:44] oh yay. i got myself another one.
[06:04:10] PROBLEM Total Processes is now: CRITICAL on kant1 kant1 output: Connection refused by host
[06:04:47] hm. i need to come up with more philosophers that start with k.
[06:05:00] PROBLEM dpkg-check is now: CRITICAL on kant1 kant1 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:05:03] heh
[06:05:12] Ryan_Lane: what's the difference with the bastions?
[06:05:27] let's try pmtpa.
[06:05:28] nope!
[06:05:29] well, try to log into bastion-restricted ;)
[06:05:30] PROBLEM Current Load is now: CRITICAL on kant1 kant1 output: Connection refused by host
[06:05:38] dschoon: what error are you getting?
[06:05:45] "Failed to create instance"
[06:05:54] what size are you trying?
[06:06:07] Ryan_Lane: i did... same error mutante was getting with bastion.
[06:06:09] gargantuan, and gargantuan-1
[06:06:12] most likely we've hit the limit of the current hardware
[06:06:15] Ryan_Lane: sorry, that was marked as read but black out :p .. the cool thing about this, i don't have to ssh-add -D all other keys anymore before connecting to labs bastion?
[06:06:30] PROBLEM Current Users is now: CRITICAL on kant1 kant1 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:06:33] m1.M worked.
[06:06:34] mutante: well, you don't *have* to, but it's still a good idea ;)
[06:06:38] * Ryan_Lane nods
[06:06:40] PROBLEM host: storm11 is DOWN address: storm11 CRITICAL - Host Unreachable (storm11)
[06:06:44] it may fail to build
[06:06:50] PROBLEM host: storm1 is DOWN address: storm1 CRITICAL - Host Unreachable (storm1)
[06:07:00] PROBLEM Disk Space is now: CRITICAL on kant1 kant1 output: Connection refused by host
[06:07:01] it seems medium instances hit the "failed to create" bug more often
[06:07:03] no clue why
[06:07:03] (i cannibalized storm11 and storm1 for RAMz half an hour ago.)
[06:07:16] yeah, we need to move hardware soon
[06:07:26] can't you all just not forward agents? kthx
[06:07:26] I have a new cisco box waiting to be installed
[06:07:36] can haz?
[06:07:40] PROBLEM Free ram is now: CRITICAL on kant1 kant1 output: Connection refused by host
[06:07:40] i only want 4 :(
[06:07:45] I'm having a hard time installing it
[06:07:47] it's for labs
[06:08:06] of course I'm the guinea pig for the damn ciscos
[06:08:24] i'd be happy to help :3
[06:08:33] of course, not really so much with the labs part
[06:08:46] heh
[06:08:48] as the part where i try to turn them into a smoking crater via java.
[06:08:51] I'm having issues pxe booting it
[06:08:57] m1.M no longer succeeds
[06:08:58] boo
[06:09:04] failed to create?
[06:09:08] yesh
[06:09:15] yep, we're out of ram
[06:09:22] (both availability zones)
[06:09:34] well, they're both the same zone
[06:09:39] i figured.
[06:09:47] it's a bug from when I moved the controller
[06:10:06] Fetched 3673kB in 0s (7744kB/s)
[06:10:06] Failed to fetch http://ubuntu.wikimedia.org/ubuntu/pool/main/d/debconf/debconf-utils_1.5.28ubuntu4_all.deb Hash Sum mismatch
[06:10:07] E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
[06:10:14] that's different
[06:10:14] that's kripke1
[06:10:18] delete/recreate
[06:10:19] https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&project=reportcard&instanceid=i-000001a8
[06:10:22] feh.
[06:10:25] ok. momentarily.
[06:10:27] try a few times
[06:10:35] that's the issue I mentioned gets hit more with mediums
[06:10:45] this is an XML
[06:10:47] XL
[06:10:56] (though i'm sure there are many angle-brackets involved)
[06:11:01] heh
[06:12:44] PROBLEM Current Load is now: WARNING on bots-sql3 bots-sql3 output: WARNING - load average: 4.57, 5.54, 5.07
[06:14:24] PROBLEM Current Load is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:14:24] PROBLEM Current Load is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[06:15:14] PROBLEM Current Users is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:15:14] PROBLEM Current Users is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[06:15:44] PROBLEM Disk Space is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:15:44] PROBLEM Disk Space is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[06:16:34] PROBLEM Free ram is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:16:34] PROBLEM Free ram is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[06:17:54] PROBLEM Total Processes is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:17:59] PROBLEM Total Processes is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[06:18:44] PROBLEM dpkg-check is now: CRITICAL on kierkegaard1 kierkegaard1 output: Connection refused by host
[06:18:44] PROBLEM dpkg-check is now: CRITICAL on kripke1 kripke1 output: Connection refused by host
[07:03:09] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 1.26, 1.75, 4.03
[07:33:59] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:38:49] PROBLEM Current Load is now: WARNING on mobile-enwp mobile-enwp output: WARNING - load average: 7.89, 8.80, 9.19
[08:08:59] PROBLEM host: kripke1 is DOWN address: kripke1 CRITICAL - Host Unreachable (kripke1)
[08:14:02] PROBLEM Current Load is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:14:02] PROBLEM Current Load is now: CRITICAL on kripke kripke output: Connection refused by host
[08:14:52] PROBLEM Current Users is now: CRITICAL on kripke kripke output: Connection refused by host
[08:14:52] PROBLEM Current Users is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:15:22] PROBLEM Disk Space is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:15:22] PROBLEM Disk Space is now: CRITICAL on kripke kripke output: Connection refused by host
[08:16:12] PROBLEM Free ram is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:16:12] PROBLEM Free ram is now: CRITICAL on kripke kripke output: Connection refused by host
[08:17:32] PROBLEM Total Processes is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake.
[08:17:37] PROBLEM Total Processes is now: CRITICAL on kripke kripke output: Connection refused by host [08:18:22] PROBLEM dpkg-check is now: CRITICAL on kierkegaard kierkegaard output: CHECK_NRPE: Error - Could not complete SSL handshake. [08:18:22] PROBLEM dpkg-check is now: CRITICAL on kripke kripke output: Connection refused by host [09:06:52] PROBLEM dpkg-check is now: CRITICAL on bots-apache1 bots-apache1 output: DPKG CRITICAL dpkg reports broken packages [09:11:52] RECOVERY dpkg-check is now: OK on bots-apache1 bots-apache1 output: All packages OK [09:13:23] RECOVERY dpkg-check is now: OK on kierkegaard kierkegaard output: All packages OK [09:14:02] RECOVERY Current Load is now: OK on kierkegaard kierkegaard output: OK - load average: 0.10, 0.06, 0.01 [09:14:52] RECOVERY Current Users is now: OK on kierkegaard kierkegaard output: USERS OK - 0 users currently logged in [09:15:22] RECOVERY Disk Space is now: OK on kierkegaard kierkegaard output: DISK OK [09:16:12] RECOVERY Free ram is now: OK on kierkegaard kierkegaard output: OK: 94% free memory [09:17:32] RECOVERY Total Processes is now: OK on kierkegaard kierkegaard output: PROCS OK: 93 processes [09:53:48] !log deployment-prep petrb: scheduling auto replication of sql server [09:53:49] Logged the message, Master [10:16:52] !log deployment-prep petrb: failed auth on db server reboot was required [10:16:53] Logged the message, Master [10:27:12] PROBLEM dpkg-check is now: CRITICAL on deployment-sql deployment-sql output: DPKG CRITICAL dpkg reports broken packages [10:37:12] RECOVERY dpkg-check is now: OK on deployment-sql deployment-sql output: All packages OK [11:01:10] 03/15/2012 - 11:01:09 - Creating a home directory for dzahn at /export/home/nagios/dzahn [11:02:10] 03/15/2012 - 11:02:10 - Updating keys for dzahn [11:11:12] PROBLEM Current Load is now: WARNING on bots-sql3 bots-sql3 output: WARNING - load average: 6.14, 6.82, 5.69 [11:23:58] PROBLEM Current Load is now: CRITICAL on deployment-web 
deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:24:37] PROBLEM Free ram is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:24:37] PROBLEM Current Users is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:24:37] PROBLEM Total Processes is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:24:42] PROBLEM Disk Space is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:25:09] PROBLEM dpkg-check is now: CRITICAL on deployment-web deployment-web output: CHECK_NRPE: Socket timeout after 10 seconds. [11:28:31] PROBLEM SSH is now: CRITICAL on deployment-web deployment-web output: CRITICAL - Socket timeout after 10 seconds [11:35:06] RECOVERY dpkg-check is now: OK on deployment-web deployment-web output: All packages OK [11:38:36] RECOVERY SSH is now: OK on deployment-web deployment-web output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [11:39:26] RECOVERY Free ram is now: OK on deployment-web deployment-web output: OK: 71% free memory [11:39:26] RECOVERY Current Users is now: OK on deployment-web deployment-web output: USERS OK - 0 users currently logged in [11:39:26] RECOVERY Total Processes is now: OK on deployment-web deployment-web output: PROCS OK: 97 processes [11:39:31] RECOVERY Disk Space is now: OK on deployment-web deployment-web output: DISK OK [11:44:06] PROBLEM Current Load is now: WARNING on deployment-web deployment-web output: WARNING - load average: 0.11, 7.80, 17.80 [12:01:26] 03/15/2012 - 12:01:26 - Creating a project directory for varnish [12:01:27] 03/15/2012 - 12:01:26 - Creating a home directory for mark at /export/home/varnish/mark [12:02:27] 03/15/2012 - 12:02:27 - Updating keys for mark [12:04:01] RECOVERY Current Load is now: OK on deployment-web deployment-web output: OK - load average: 0.12, 0.25, 
4.94 [12:09:08] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CHECK_NRPE: Socket timeout after 10 seconds. [12:14:06] PROBLEM Current Load is now: WARNING on mobile-enwp mobile-enwp output: WARNING - load average: 9.37, 9.61, 9.88 [12:15:06] PROBLEM Total Processes is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. [12:15:46] PROBLEM dpkg-check is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. [12:16:26] PROBLEM Current Load is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. [12:17:16] PROBLEM Current Users is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. [12:17:56] PROBLEM Disk Space is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. [12:18:36] PROBLEM Free ram is now: CRITICAL on varnish varnish output: CHECK_NRPE: Error - Could not complete SSL handshake. 
[12:25:06] RECOVERY Total Processes is now: OK on varnish varnish output: PROCS OK: 85 processes [12:25:46] RECOVERY dpkg-check is now: OK on varnish varnish output: All packages OK [12:26:26] RECOVERY Current Load is now: OK on varnish varnish output: OK - load average: 1.13, 0.83, 0.58 [12:27:16] RECOVERY Current Users is now: OK on varnish varnish output: USERS OK - 1 users currently logged in [12:27:56] RECOVERY Disk Space is now: OK on varnish varnish output: DISK OK [12:28:36] RECOVERY Free ram is now: OK on varnish varnish output: OK: 92% free memory [13:13:15] PROBLEM Disk Space is now: UNKNOWN on testing-ioerror testing-ioerror output: Invalid host name testing-ioerror [13:18:15] PROBLEM Disk Space is now: CRITICAL on testing-ioerror testing-ioerror output: DISK CRITICAL - free space: / 0 MB (0% inode=22%): [14:10:58] New patchset: Mark Bergsma; "Move misc::package-builder into a separate file" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3196 [14:11:09] New patchset: Mark Bergsma; "Puppetize pbuilder" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3197 [14:11:20] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3196 [14:11:20] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3197 [14:14:09] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3196 [14:14:27] Change abandoned: Mark Bergsma; "(no reason)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3197 [14:21:00] New patchset: Mark Bergsma; "Merge package-builder changes into test" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3198 [14:21:11] New review: gerrit2; "Lint check passed." 
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3198 [14:21:31] New review: Mark Bergsma; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3198 [14:21:34] Change merged: Mark Bergsma; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3198 [14:25:23] New patchset: Mark Bergsma; "Fix dependency cycle" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3200 [14:25:34] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3200 [14:25:51] New review: Mark Bergsma; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3200 [14:25:54] Change merged: Mark Bergsma; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3200 [14:37:35] PROBLEM Current Load is now: CRITICAL on bots-cb bots-cb output: CRITICAL - load average: 72.10, 30.32, 12.03 [14:40:06] New patchset: Mark Bergsma; "Fix othermirrors, setup default dist link" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3201 [14:40:17] New review: gerrit2; "Lint check passed." 
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3201 [14:41:04] New review: Mark Bergsma; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3201 [14:41:07] Change merged: Mark Bergsma; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3201 [14:47:35] PROBLEM Current Load is now: WARNING on bots-cb bots-cb output: WARNING - load average: 0.88, 16.35, 17.75 [15:12:38] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.70, 0.85, 4.30 [15:15:22] how to start a new project/instance (sorry I'm not familiar with these terms) [15:19:20] PROBLEM dpkg-check is now: UNKNOWN on deployment-nfs-memc deployment-nfs-memc output: Invalid host name deployment-nfs-memc [15:24:23] PROBLEM dpkg-check is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CHECK_NRPE: Socket timeout after 10 seconds. [15:24:42] PROBLEM SSH is now: CRITICAL on deployment-nfs-memc deployment-nfs-memc output: CRITICAL - Socket timeout after 10 seconds [15:29:32] RECOVERY SSH is now: OK on deployment-nfs-memc deployment-nfs-memc output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [16:24:18] PROBLEM dpkg-check is now: CRITICAL on p-b p-b output: CHECK_NRPE: Socket timeout after 10 seconds. [16:29:08] RECOVERY dpkg-check is now: OK on p-b p-b output: All packages OK [17:30:18] New patchset: Mark Bergsma; "Build with source by default" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3212 [17:30:29] New review: gerrit2; "Lint check passed." 
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3212 [17:31:07] New review: Mark Bergsma; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3212 [17:31:10] Change merged: Mark Bergsma; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3212 [17:37:47] New patchset: Mark Bergsma; "Fix creates file name" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3213 [17:37:59] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3213 [17:38:07] New review: Mark Bergsma; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/3213 [17:38:09] Change merged: Mark Bergsma; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3213 [18:11:54] Git is slow :O (99 Kib/s) [18:13:03] hmm, SQL seems to be down on labs again, (Can't contact the database server: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 (deployment-sql)) [18:13:14] any planned maintenance Ryan_Lane? [18:14:04] no. I don't do any maintenance in that project [18:14:09] I'm also not doing any right now [18:14:51] !project deployment-prep [18:14:51] https://labsconsole.wikimedia.org/wiki/Nova_Resource:deployment-prep [18:14:58] hm. it's really slow logging into it [18:15:04] I wonder if it OOM'd [18:15:49] it's failing to start [18:17:22] ah. neat. he moved mysql's datadir to the gluster storage [18:17:37] Do I have a full LDAP account? I have a labs account. [18:17:43] yes [18:17:55] he updated apparmor, so it isn't that.... [18:18:04] petan: any clue what's up with mysql on deployment-sql? [18:18:33] (or petan|wk) ;) [18:20:11] Ryan_Lane: why git clone is slow? [18:20:41] *why is git clone slow? [18:21:19] for which repo? 
[18:22:48] test [18:23:09] /test/mediawiki/core [18:23:15] 03/15/2012 - 18:23:15 - Creating a home directory for asher at /export/home/deployment-prep/asher [18:24:16] 03/15/2012 - 18:24:16 - Updating keys for asher [18:37:08] Thehelpfulone: binasher fixed mysql for us :) [18:39:12] great [18:39:50] hmm new error for me, (Can't contact the database server: Host 'i-00000162.pmtpa.wmflabs' is not allowed to connect to this MySQL server (deployment-sql)) [18:49:25] New patchset: Jgreen; "starting to puppetize the pediapress project" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3220 [18:49:36] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3220 [18:50:33] New review: Jgreen; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/3220 [18:50:36] Change merged: Jgreen; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3220 [18:50:55] I don't know the mysql root pass [18:51:01] we'll need to wait for petan [18:51:07] we need some way to document this stuff [18:51:20] um.. the wiki? [18:51:28] no. the mysql root pass [18:51:32] oh [18:51:37] yeah, no. not the wiki :) [18:51:43] A file on the HDD in /root? [18:51:44] I don't want people storing passwords in there. heh [18:51:56] keeping it in /root/.my.cnf is likely fine [18:54:49] Ryan_Lane: MediaWiki has over 1 million lines of code (normal installation) [18:55:13] ? [18:55:35] 1081171 lines in 1513 files [18:55:47] I have counted them. [18:56:22] oh, I have 2-3 extensions installed :O [18:56:46] PROBLEM host: pediapress-ocg2 is DOWN address: pediapress-ocg2 check_ping: Invalid hostname/address - pediapress-ocg2 [18:57:32] IWorld: I don't know what this is in reference to [18:58:20] the lines in the code of mediawiki [19:23:29] is this a known issue? 
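(Editor's note: the "1081171 lines in 1513 files" count quoted above is easy to reproduce. A minimal sketch, counting only .php files; the path in the usage comment is illustrative, not taken from the log.)

```shell
# count_php_lines DIR: total lines across all .php files under DIR,
# roughly how the "1 million lines" figure above would be obtained.
count_php_lines() {
    find "$1" -name '*.php' -print0 | xargs -0 cat | wc -l
}

# e.g. count_php_lines /usr/local/apache/common-local/php-trunk
```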
[19:23:31] (Can't contact the database server: Host 'i-00000176.pmtpa.wmflabs' is not allowed to connect to this MySQL server (deployment-sql)) [19:23:35] for abs.wikimedia.beta.wmflabs.org/wiki/Global_Requests [19:23:42] http://labs.wikimedia.beta.wmflabs.org/wiki/Global_Requests [19:23:55] btw hi all [19:27:09] Which name has the instance? [19:27:44] PROBLEM host: pediapress-ocg2 is DOWN address: pediapress-ocg2 check_ping: Invalid hostname/address - pediapress-ocg2 [19:30:19] Ryan_Lane: can you help me track down a disappearing instance? labsconsole says pediapress-ocg2 is running, but the bastion host doesn't seem to know it exists [19:44:22] PROBLEM Total Processes is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:45:03] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:45:42] PROBLEM Current Load is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:46:32] PROBLEM Current Users is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:47:12] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:48:02] PROBLEM Free ram is now: CRITICAL on pediapress-ocg3 pediapress-ocg3 output: Connection refused by host [19:49:18] Raymond_: yep, waiting for petan to be able to fix that one [19:49:47] Thehelpfulone: thanks [19:50:13] sure, and I granted you your admin on commons, if you need any more flags anywhere, feel free to ask :) [19:53:32] RECOVERY host: pediapress-ocg2 is UP address: pediapress-ocg2 PING OK - Packet loss = 0%, RTA = 565.90 ms [19:55:03] Thehelpfulone: one more thanks :) [19:56:15] okay, tell me which wiki and rights when it works I'll give them to you [19:56:32] PROBLEM Current Load is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [19:56:35] Jeff_Green: did you delete/recreate it? 
[19:56:42] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [19:57:18] Ryan_Lane: yes, just once, and after recreating it appears to have booted but is not accessible [19:57:52] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [19:57:53] initially the bastion host didn't recognize the hostname, then it started recognizing it maybe 10 minutes later [19:58:02] PROBLEM Current Users is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [19:58:52] PROBLEM Free ram is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [20:00:52] PROBLEM Total Processes is now: CRITICAL on pediapress-ocg2 pediapress-ocg2 output: Connection refused by host [20:01:01] Ryan_Lane: maybe I'm just impatient--console output suggests it's been slowly building over the past hour [20:01:09] petan: perhaps best to talk in here [20:01:36] petan: ah. ok [20:01:57] petan: can you add the grants back in? [20:02:15] when you copied the databases across to gluster, you missed the mysql database :) [20:02:23] binasher fixed it, kind of [20:02:26] but the grants are missing [20:02:44] Jeff_Green: that's weird. [20:02:47] lemme look [20:03:42] seems to have stalled on dhcp for a while for one [20:03:51] o.O [20:03:56] ah [20:04:02] Ryan_Lane: I surely did cp -r [20:04:11] if there is anything missing, it's wrong [20:04:31] dunno. 
would have to ask binasher [20:04:34] maybe not--i just see it renewing every 60s for maybe 20 min with nothing else happening in between [20:04:40] when he gets back I'll ask him to come into the table [20:04:42] *channel [20:05:27] let me check [20:05:41] but I am pretty sure it used to work [20:05:45] Jeff_Green: weird I wonder why this is taking so long to build [20:06:09] n to MySQL server at 'reading initial communication packet', system error: [20:06:15] this is failure of mysql service [20:09:06] petrb@bastion1:~$ ssh deployment-sql [20:09:08] If you are having access problems, please see: https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [20:09:12] Permission denied (publickey). [20:09:16] so [20:09:20] problem is in machine [20:10:07] should be fixed in min [20:10:14] petan: no [20:10:18] petan: we logged into it [20:10:21] ah [20:10:23] I just rebooted it [20:10:25] mysql wasn't running [20:10:27] yes [20:10:28] it wouldn't start [20:10:31] it goes on oom [20:10:34] it was working before [20:10:40] because it was missing the mysql database [20:10:42] today after move to gluster it worked [20:10:47] actually [20:10:49] yesterday [20:10:57] I tested the sites after I moved it [20:11:02] it was ok [20:12:31] !log bots added DeltaQuad to bots [20:12:32] Logged the message, Master [20:13:09] 03/15/2012 - 20:13:09 - Creating a home directory for deltaquad at /export/home/bots/deltaquad [20:13:18] Host 'i-00000176.pmtpa.wmflabs' is not allowed to connect to this MySQL server (deployment-sql)) [20:13:21] this is new problem now [20:13:25] it didn't happen before [20:14:05] that's because the mysql database was missing [20:14:10] 03/15/2012 - 20:14:09 - Updating keys for deltaquad [20:14:17] binasher copied something across, and now the grants are missing [20:14:37] !log deployment-prep root: restoring database from backup [20:14:38] Logged the message, Master [20:15:07] I wonder why puppet is running so slowly in labs [20:15:31] 
hm. the ldap server is getting slammed [20:15:35] ruby? [20:15:36] :P [20:15:41] yes it is [20:15:45] I can't login some times [20:15:49] ldap error [20:15:53] really? [20:15:55] yes [20:15:55] that makes no sense [20:15:59] yeah, I'm finally in but it took 1:14 to build ocg2 [20:16:09] I do ssh and receive "error can't auth, ldap err..." [20:16:17] something like [20:16:21] don't remember [20:16:45] ocg3 (started after ocg2) built in 34 minutes [20:17:00] that's crazy [20:17:09] it should *never* take longer than like 5 minutes [20:17:16] there's something wrong [20:17:27] yeah ~5 has been my previous experience [20:17:58] !log deployment-prep root: restored ok [20:17:59] Logged the message, Master [20:18:22] fixed [20:18:42] I restored only mysql and mysql_upgrade_info db's [20:18:55] petan: could you add some more squids too? it was running slow yesterday [20:18:57] so wiki data should be latest [20:19:07] that's because of config [20:19:24] we don't have prod v. [20:19:30] but some we made [20:19:36] and http://labs.wikimedia.beta.wmflabs.org/wiki/Global_Requests#Update_Commonswiki if you have a minute for that too [20:19:42] ehj? [20:19:44] what's there [20:19:58] I'm going to have to write log analysis tools for ldap again [20:20:01] * Ryan_Lane sighs [20:20:05] I knew I should have kept those [20:20:10] they'd like an update to a new r of commons [20:20:14] btw next time [20:20:16] create a ticket [20:20:17] :D [20:20:19] "... Commons wiki is currently running r112965. Any chance to update Commons wiki to r113591 at least? ..." [20:20:19] in bugzilla [20:20:23] we have a ticketing system? [20:20:25] oh :P [20:20:25] I respond to that quickly, not to irc [20:20:27] yes [20:20:39] there is product beta wmf [20:20:43] or something like that [20:22:10] ok [20:22:10] yes it's slow [20:22:16] I suspect memc [20:22:30] Raymond_: which other rights would you like? 
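(Editor's note: the "Host '...' is not allowed to connect" error discussed above means the restored mysql database lost its grant rows for that client host. A hedged sketch of the kind of statement that restores access; the database, user, and password names below are placeholders — only the client host i-00000176.pmtpa.wmflabs appears in the log.)

```shell
# grant_sql HOST: emit GRANT statements (MySQL 5.x-era syntax) that would
# re-allow HOST to connect. wikidb/wikiuser/secret are hypothetical names.
grant_sql() {
    printf "GRANT ALL PRIVILEGES ON wikidb.* TO 'wikiuser'@'%s' IDENTIFIED BY 'secret';\n" "$1"
    printf "FLUSH PRIVILEGES;\n"
}

# Once the root password is known, pipe it into the server:
#   grant_sql i-00000176.pmtpa.wmflabs | mysql -u root -p
```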
[20:22:31] there is a bug in mediawiki great to debug :D [20:22:54] root@i-000000d0:/home/petrb# exit [20:22:56] LDAP close_session failed [20:22:58] su: Authentication service cannot retrieve authentication info [20:23:00] Ryan_Lane: ^ [20:23:08] another one [20:23:08] which instance is this? [20:23:14] i-000000d0 [20:23:18] Thehelpfulone: the right to update the wiki :P [20:23:18] instance name? [20:23:23] -sql [20:23:41] now it was when I wanted to logout [20:23:44] instead of login [20:23:49] Thehelpfulone: ok, joke. but sysop on dewiki would be helpful [20:23:50] hm [20:23:54] Raymond_: I grant you global permissions to bug petan for that. :D [20:23:55] it doesn't even see my user [20:24:03] what [20:24:05] perfect :) [20:24:06] Thehelpfulone: ? [20:24:25] petan: it was Raymond_ who posted http://labs.wikimedia.beta.wmflabs.org/wiki/Global_Requests#Update_Commonswiki [20:24:30] aha [20:24:31] ok [20:24:34] petan: it isn't an issue with load on the ldap server [20:24:41] if you see that, it's an issue with nslcd [20:24:44] ok [20:24:49] I did some updates on machine [20:24:56] it asked if I want to manage ldap by some tool [20:24:59] I answered no [20:25:02] to fix that, as root: nscd -i passwd; nscd -i group; /etc/init.d/nslcd restart [20:25:03] Raymond_: done :) [20:25:38] an easy way to know if it is broken is to use id on a user that hasn't logged in [20:25:48] or not.. hmm looks like it's going to time out [20:25:49] if it can't find the user, then nslcd is fucked up [20:25:49] thanks a lot [20:26:14] ok [20:26:21] I think I don't see id's often [20:26:22] unfortunately nslcd seems less stable than nss-ldap :( [20:26:24] on files [20:26:32] processes too [20:26:44] no. 
I mean use the id command [20:26:47] ah [20:26:48] as root [20:26:50] id laner [20:26:50] ok [20:26:55] will show my uid and all my gids [20:27:05] is there a ganglia for gluster I think it's loaded a bit [20:27:29] it's not [20:27:34] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=load_one&s=by+name&c=Glusterfs+cluster+pmtpa&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4 [20:27:39] Thehelpfulone: petrb@deployment-webs1:~$ svn up /usr/local/apache/common-local/php-* [20:27:42] waiting for it [20:27:45] mounting is slow [20:27:46] gluster is slow [20:27:53] oh wait [20:27:58] this isn't on gluster [20:27:59] :D [20:28:01] heh [20:28:01] ok [20:28:07] instance storage is slow [20:28:11] right [20:28:14] we should move it [20:28:14] and the less people use it, the faster it'll become [20:28:24] is there a way to move and keep /usr/local/apache [20:28:32] like remount /data/project [20:28:41] why do you need /usr/local/apache? [20:28:46] same as on prod [20:28:55] ah [20:28:57] goal is to make identical setting [20:28:57] for the code? [20:28:59] yes [20:29:01] for testing [20:29:07] it should be same [20:29:12] more same, better for testing [20:29:13] you can link [20:29:24] php-cli has troubles with links [20:29:29] dunno why [20:29:33] it was described in php doc [20:29:36] ah [20:29:47] hardlink would be ok [20:29:52] but that's not possible here [20:29:58] well, you can mount the gluster storage via fstab, yeah [20:30:19] I was thinking of a symlink of nfs server [20:30:29] so that /export would link to /data/project [20:30:34] that might work [20:30:46] but it would be slower [20:30:48] php has an issue with softlinked directories? 
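(Editor's note: the nslcd recovery Ryan gives above — invalidate the nscd passwd and group caches, restart nslcd, then sanity-check with `id <user>` — can be wrapped in a small script. Sketch only; the DRY_RUN guard is added here so it can be previewed without root.)

```shell
# flush_ldap_caches: the cache-flush + nslcd restart sequence described
# above. Set DRY_RUN=1 to only print the commands; run as root for real.
flush_ldap_caches() {
    for cmd in 'nscd -i passwd' 'nscd -i group' '/etc/init.d/nslcd restart'; do
        if [ -n "$DRY_RUN" ]; then echo "$cmd"; else $cmd; fi
    done
}

# Preview without touching the system:
#   DRY_RUN=1 flush_ldap_caches
# Afterwards, `id <some-ldap-user>` should resolve again if nslcd recovered.
```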
[20:30:58] I think it has a bit [20:31:02] php-cli does [20:31:07] php itself not [20:31:28] command line version is sometimes working with another directory than it should [20:31:39] when it's symlink [20:32:36] oh yes [20:32:41] include has bug in there [20:33:19] when you include relative path which is symlinked then you can have troubles [20:33:25] so, yeah, you can mount the storage in as many locations as you want, just remember that fstab mounts can cause problems [20:33:33] even if it's a bug, I think having testing same as prod would be better [20:34:01] well... [20:34:09] in production the mediawiki code isn't on shared-storage [20:34:17] it's rsync'd to all apaches [20:34:21] oh [20:34:24] doesn't it suck? [20:34:32] you are using php-apc on the apaches, right? [20:34:43] I use php-cli on dbdump only [20:34:51] for updating stuff [20:34:58] and imports of db [20:35:10] apaches run only apache :) [20:35:16] nothing else [20:35:40] no, it's better to have a local copy of mediawiki [20:35:50] hm, if nfs is down, yes [20:35:53] if it was on shared storage, if the shared storage dies, then everything else does too [20:35:59] I know [20:35:59] also, it can be a bottleneck [20:36:14] maybe having more shared storages would do [20:36:22] like 1 storage per 20 apaches :P [20:36:34] or we could just rsync mediawiki to them :) [20:36:48] which is reliable [20:37:23] we may switch to using git, rather than using rsync at some point [20:37:32] +1 [20:37:33] someone will eventually rewrite our shitty deploy system [20:37:38] Ryan_Lane: have you looked at labs puppet yet? i see it's running on virt0 which does not seem exceedingly loaded, but it's taking forever to finish any runs [20:37:55] Jeff_Green: that's really odd. lemme try running it on an instance [20:38:04] Ryan_Lane: I have sent you a CSV file of svn authors with their real emails [20:38:10] i've been watching puppetd --test on pediapress-ocg2 for about 10 minutes [20:38:11] ah. 
right [20:38:13] Ryan_Lane: should let you mass create gerrit accounts :-] [20:38:35] lemme log into that system and look at some things [20:38:45] sure [20:38:54] ldap is fine on it... [20:39:03] puppet runs fast on other instances [20:39:05] load is nil, puppet proc is snoring, it's just slow as death [20:39:06] does it? [20:39:09] yeah [20:39:19] I wonder why it's so slow on this one [20:39:20] it's sucking on both instances in this project [20:39:54] hah, now I can't get on to ocg3--it ate my key since last login [20:40:03] this project is full of beelzebub [20:40:09] heh [20:40:24] actually. it's slow on nova-production1 [20:40:33] could it be network? [20:40:40] lemme see [20:41:12] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=virt2.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1331844049&g=network_report&z=large&c=Virtualization%20cluster%20pmtpa [20:41:16] shouldn't be [20:41:17] but..... [20:41:23] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=virt2.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1331844049&g=cpu_report&z=large&c=Virtualization%20cluster%20pmtpa [20:41:28] that's *terrible* [20:41:43] why the hell is the wait-io so bad? [20:41:50] did we enable "network circa 1994" ? [20:43:02] Ryan_Lane: can you type top on storage [20:43:03] it's the network node... [20:43:03] :D [20:43:10] petan: eh? [20:43:11] I guess io is like 100% [20:43:18] storage? [20:43:21] which storage? [20:43:22] instance storage [20:43:23] server [20:43:29] eh? [20:43:34] I have no clue what you mean [20:43:34] is it gluster server [20:43:40] where is instance storage [20:43:49] it's mounted on virt? 
[20:43:51] directly [20:43:52] it's on all of the compute nodes [20:43:59] oh [20:44:16] !load [20:44:16] http://ganglia.wikimedia.org/2.2.0/graph_all_periods.php?h=virt2.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1327006829&g=load_report&z=large&c=Virtualization%20cluster%20pmtpa [20:44:19] !load-all [20:44:20] http://ganglia.wikimedia.org/2.2.0/?c=Virtualization%20cluster%20pmtpa&m=load_one&r=hour&s=by%20name&hc=4&mc=2 [20:45:01] doesn't show storage load [20:45:06] only cpu [20:45:20] it would be cool to monitor io [20:45:25] traffic etc [20:45:27] writes [20:45:45] however when ls takes 10 seconds [20:45:52] ... [20:45:53] :D [20:46:01] I guess it's wrong [20:46:18] btw Thehelpfulone it's updated [20:46:25] I would log it but terminal doesn't respond [20:46:35] heh use !log in here then :) [20:46:38] hm... [20:46:49] lazy :D [20:46:50] and how did you update it out of interest? [20:46:56] svn up [20:47:12] oh wait [20:47:13] it's not [20:47:19] only -19 is :D [20:47:22] now it's running on trunk [20:47:34] we really need to set up network-node per host [20:47:38] heh [20:47:42] hi RoanKattouw [20:47:48] how's weather in sf [20:47:58] I want to move too :D [20:47:59] PROBLEM Current Load is now: WARNING on essex-test-l essex-test-l output: WARNING - load average: 26.74, 18.57, 8.88 [20:48:06] all you had to type was svn up? [20:48:11] sort of [20:48:17] I sent you how I did it [20:48:26] read up [20:48:53] It's been raining since Tuesday, it's gonna continue to rain through Sunday [20:48:55] :( [20:49:01] Although it wasn't raining when we got lunch today [20:49:41] ok [20:49:50] ah yes see it [20:49:56] here is still same european weather :D ugly [20:50:32] but warmer [20:50:54] virt2 is very overloaded [20:51:02] I need to live-migrate some instances [20:51:16] virt1 and virt3 are hardly loaded at all [20:51:50] Ryan_Lane: I can haz new project? 
[20:51:56] gimme a min [20:51:59] (For moving stuff from prototype) [20:52:02] Sure, take your time [20:52:05] I need to un-fuck virt2 [20:52:15] OK [20:52:21] I have to deploy stuff to the cluster first anyway [20:55:10] great. the fucking driver won't let me migrate because it thinks there's a lack of disk space [20:55:18] it is shared storage! [20:56:43] petan: also, http://labs.wikimedia.beta.wmflabs.org/wiki/Special:GlobalGroupPermissions - global_sysop and global sysop seem to be the same group, duplicated? [20:57:07] Thehelpfulone: remove it [20:57:10] create again [20:57:15] both [20:57:27] ok [20:57:41] actually, it's saying lack of memory [20:58:14] commenting *that* check out [20:59:16] do you need to remove users from the group first petan? [21:00:23] no [21:00:51] New patchset: Jgreen; "added misc::mwlib::users and misc::mwlib::groups" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3225 [21:01:03] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/3225 [21:01:08] and of course the live migration failed [21:01:22] New review: Jgreen; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/3225 [21:01:24] Change merged: Jgreen; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/3225 [21:03:00] well, it worked. [21:03:10] though I wonder if it'll fuck me whenever I restart the host [21:04:30] oh. cool. nova-manage vm list command actually shows migrating status [21:04:51] so does euca command. didn't notice that before [21:06:11] heh. 
I love watching the network spike when I do migrations :) [21:06:17] http://ganglia.wikimedia.org/latest/graph_all_periods.php?h=virt2.pmtpa.wmnet&m=load_one&r=hour&s=by%20name&hc=4&mc=2&st=1331845532&g=network_report&z=large&c=Virtualization%20cluster%20pmtpa [21:07:07] migrations actually seem to be working fine [21:09:51] iowait slowly getting better on virt2 [21:16:20] PROBLEM host: nginx-ffuqua-doom1-3 is DOWN address: nginx-ffuqua-doom1-3 CRITICAL - Host Unreachable (nginx-ffuqua-doom1-3) [21:18:17] RECOVERY Current Load is now: OK on essex-test-l essex-test-l output: OK - load average: 0.02, 0.66, 4.08 [21:18:45] meh, this doesn't seem to be helping any :() [21:27:43] PROBLEM Current Load is now: CRITICAL on mobile-enwp mobile-enwp output: CRITICAL - load average: 33.57, 25.07, 16.54 [21:29:26] PROBLEM Current Load is now: WARNING on bots-cb bots-cb output: WARNING - load average: 4.44, 22.72, 14.59 [21:29:26] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 2.07, 4.70, 3.39 [21:29:46] RECOVERY host: nginx-ffuqua-doom1-3 is UP address: nginx-ffuqua-doom1-3 PING OK - Packet loss = 0%, RTA = 3.28 ms [21:32:16] PROBLEM Current Load is now: WARNING on mobile-enwp mobile-enwp output: WARNING - load average: 8.04, 15.78, 15.06 [21:34:26] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 0.12, 1.92, 2.55 [21:36:56] weird. almost all of the storage is on virt2 and virt3 [21:49:26] RECOVERY Current Load is now: OK on bots-cb bots-cb output: OK - load average: 0.77, 1.20, 4.56 [21:54:56] RECOVERY Current Load is now: OK on bots-sql3 bots-sql3 output: OK - load average: 2.36, 2.96, 4.81 [22:07:29] ok. disabling profiling, and doing some migrations seemed to help some [22:07:57] we *really* need to set up multi-host network nodes [22:08:03] I think most of this wait-io is due to that [22:21:44] Jeff_Green: is puppet running any faster now? 
[22:21:55] I really think it was due to virt2 being severely overloaded [22:22:33] seriously, look at this crap: http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&h=virt2.pmtpa.wmnet&m=load_one&s=by+name&mc=2&g=cpu_report&c=Virtualization+cluster+pmtpa [22:28:56] hmm… logstash uses jruby by default [22:29:12] and can be run from a monolithic jar with all the dependencies [22:29:21] this actually makes me hate ruby slightly less [22:29:28] oh. wrong channel [22:29:29] heh [22:59:39] ok, this is odd, on http://meta.wikimedia.beta.wmflabs.org/wiki/Special:GlobalGroupPermissions [22:59:46] there's 2 global_sysop groups [23:00:13] I tried to delete them by removing all the rights from them, but they haven't been deleted.. [23:37:57] New patchset: Bhartshorne; "needed to escape the %s for cron to play nice." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3231 [23:38:09] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/3231 [23:38:20] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/3231 [23:38:23] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3231 [23:54:27] New patchset: Bhartshorne; "putting the location of the swiftcleaner script into the config file" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3232 [23:54:39] New review: gerrit2; "Lint check passed." [operations/puppet] (production); V: 1 - https://gerrit.wikimedia.org/r/3232 [23:55:38] New review: Bhartshorne; "(no comment)" [operations/puppet] (production); V: 1 C: 2; - https://gerrit.wikimedia.org/r/3232 [23:55:41] Change merged: Bhartshorne; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/3232