[00:05:05] New patchset: Ryan Lane; "More followup to Ida74297b" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6356
[00:05:18] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/6356
[00:05:47] New patchset: Ryan Lane; "Adding proxyagent pass for keystone" [labs/private] (master) - https://gerrit.wikimedia.org/r/6357
[00:06:08] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/6356
[00:06:11] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6356
[00:06:26] New review: Ryan Lane; "(no comment)" [labs/private] (master); V: 1 C: 2; - https://gerrit.wikimedia.org/r/6357
[00:06:28] Change merged: Ryan Lane; [labs/private] (master) - https://gerrit.wikimedia.org/r/6357
[00:10:56] New patchset: Ryan Lane; "The variable definitions were flipped and slightly redundant" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6358
[00:11:01] spammy spammy spam
[00:11:09] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/6358
[00:11:10] I'll never be able to fucking merge all these changes into production
[00:11:25] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/6358
[00:11:28] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6358
[00:20:54] New patchset: Ryan Lane; "Use the right variables and remove needless subscribes" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6359
[00:20:59] will *this* be the magical merge!?
[00:21:07] New review: gerrit2; "Lint check passed."
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/6359
[00:21:18] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/6359
[00:21:21] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6359
[00:54:09] New patchset: Ryan Lane; "Change around nova config some and fix keystone database usernames" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6363
[00:54:22] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/6363
[00:54:37] New review: Ryan Lane; "(no comment)" [operations/puppet] (test); V: 0 C: 2; - https://gerrit.wikimedia.org/r/6363
[00:54:40] Change merged: Ryan Lane; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/6363
[06:56:11] Damianz hey
[06:56:23] can you open a ticket for that
[07:01:01] !log nagios rebooting
[07:01:02] Logged the message, Master
[07:06:43] RECOVERY Current Load is now: OK on opengrok-web i-000001e1 output: OK - load average: 0.00, 0.00, 0.00
[07:06:44] RECOVERY Current Load is now: OK on pediapress-ocg2 i-00000234 output: OK - load average: 0.05, 0.03, 0.05
[07:06:44] RECOVERY Free ram is now: OK on precise-test i-00000231 output: OK: 94% free memory
[07:07:23] RECOVERY Current Users is now: OK on pediapress-ocg2 i-00000234 output: USERS OK - 0 users currently logged in
[07:07:28] RECOVERY Current Load is now: OK on swift-be2 i-000001c8 output: OK - load average: 0.09, 0.03, 0.01
[07:07:28] RECOVERY Total Processes is now: OK on test2 i-0000013c output: PROCS OK: 86 processes
[07:07:53] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 96% free memory
[07:07:53] RECOVERY Disk Space is now: OK on pediapress-ocg2 i-00000234 output: DISK OK
[07:08:03] RECOVERY Total Processes is now: OK on utils-abogott i-00000131 output: PROCS OK: 75 processes
[07:08:33] RECOVERY Free ram is now: OK on opengrok-web i-000001e1 output: OK: 89% free
memory
[07:08:33] RECOVERY Free ram is now: OK on pediapress-ocg2 i-00000234 output: OK: 91% free memory
[07:08:53] RECOVERY Disk Space is now: OK on deployment-feed i-00000118 output: DISK OK
[07:09:03] RECOVERY dpkg-check is now: OK on labs-lvs1 i-00000057 output: All packages OK
[07:09:13] RECOVERY SSH is now: OK on opengrok-web i-000001e1 output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[07:09:23] PROBLEM Total Processes is now: CRITICAL on deployment-dbdump i-000000d2 output: Connection refused by host
[07:09:28] RECOVERY Current Load is now: OK on bots-apache1 i-000000b0 output: OK - load average: 0.30, 0.09, 0.02
[07:09:28] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.18, 0.18, 0.17
[07:09:33] RECOVERY Total Processes is now: OK on dumps-2 i-00000174 output: PROCS OK: 93 processes
[07:09:43] RECOVERY Total Processes is now: OK on opengrok-web i-000001e1 output: PROCS OK: 85 processes
[07:09:53] RECOVERY Total Processes is now: OK on pediapress-ocg2 i-00000234 output: PROCS OK: 78 processes
[07:10:03] PROBLEM dpkg-check is now: CRITICAL on deployment-dbdump i-000000d2 output: Connection refused by host
[07:10:23] RECOVERY dpkg-check is now: OK on opengrok-web i-000001e1 output: All packages OK
[07:10:23] RECOVERY dpkg-check is now: OK on pediapress-ocg2 i-00000234 output: All packages OK
[07:10:23] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[07:10:23] RECOVERY Current Users is now: OK on precise-test i-00000231 output: USERS OK - 0 users currently logged in
[07:10:33] RECOVERY Total Processes is now: OK on swift-be2 i-000001c8 output: PROCS OK: 90 processes
[07:11:03] RECOVERY Disk Space is now: OK on precise-test i-00000231 output: DISK OK
[07:11:03] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[07:11:13] RECOVERY Disk Space is now: OK on utils-abogott i-00000131 output: DISK OK
[07:11:13] PROBLEM Current
Load is now: CRITICAL on deployment-dbdump i-000000d2 output: Connection refused by host
[07:11:33] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[07:11:52] meh
[07:11:53] PROBLEM Current Users is now: CRITICAL on deployment-dbdump i-000000d2 output: Connection refused by host
[07:12:43] PROBLEM Disk Space is now: CRITICAL on deployment-dbdump i-000000d2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:13:03] PROBLEM Free ram is now: CRITICAL on deployment-dbdump i-000000d2 output: Connection refused by host
[07:13:13] PROBLEM dpkg-check is now: CRITICAL on dumps-2 i-00000174 output: DPKG CRITICAL dpkg reports broken packages
[07:14:23] RECOVERY Total Processes is now: OK on deployment-dbdump i-000000d2 output: PROCS OK: 100 processes
[07:15:03] RECOVERY dpkg-check is now: OK on deployment-dbdump i-000000d2 output: All packages OK
[07:16:13] RECOVERY Current Load is now: OK on deployment-dbdump i-000000d2 output: OK - load average: 0.17, 0.51, 0.25
[07:16:53] RECOVERY Current Users is now: OK on deployment-dbdump i-000000d2 output: USERS OK - 0 users currently logged in
[07:17:33] RECOVERY Disk Space is now: OK on deployment-dbdump i-000000d2 output: DISK OK
[07:18:03] RECOVERY Free ram is now: OK on deployment-dbdump i-000000d2 output: OK: 94% free memory
[07:36:51] hi Ryan_Lane
[07:36:56] what is etherpad project for?
[07:36:58] in labs
[07:37:07] フォrえてぇrぱd
[07:37:12] err
[07:37:13] for etherpad
[07:37:46] specifically, for installing and testing etherpad, for moving changes to production
[07:40:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[07:41:50] hello
[07:42:03] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[07:42:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[08:10:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[08:12:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[08:12:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[08:34:53] hashar: we have a lot of troubles on deployment
[08:35:05] there used to be a redirect from beta to deployment
[08:35:12] it doesn't work for some reason
[08:40:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[08:42:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[08:42:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[08:54:25] petan|wk: back, I was deploying a change
[08:54:36] petan|wk: so that sounds like just one issue isn't it ?
:-]
[08:57:09] also http://beta.wmflabs.org/ redirects me to http://deployment.wikimedia.beta.wmflabs.org/
[09:07:56] that's right
[09:08:03] http://labs.wikimedia.beta.wmflabs.org/wiki/Special:RecentChanges
[09:08:08] this isn't
[09:08:12] hashar: ^
[09:08:14] fix it
[09:08:15] :P
[09:08:18] log it
[09:08:32] ohhh
[09:10:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[09:12:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[09:12:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[09:27:56] petan|wk: I think I will end up proposing a new name
[09:28:05] deployment-preparation is too long :-∏
[09:28:10] beta not specific enough
[09:28:18] I need to figure out a nice acronym :-]
[09:31:41] hashar: BZ is now 4.0.6
[09:32:16] hashar: nevermind, wrong nick :P. but hi :)
[09:33:01] well done mutante :-]
[09:33:43] hashar: whatever just so that it's short
[09:33:50] like beta
[09:33:56] and correspond to logo
[09:33:59] so only "beta"
[09:34:04] there is no other choice :P
[09:34:34] we can change the logo
[09:37:48] ok, let's make the logo first then I can change the dns
[09:38:12] https://bugzilla.wikimedia.org/36414
[09:38:16] to find a better name :°
[09:38:25] if you have any idea
[09:40:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[09:41:39] Mid-air collision detected!
[09:41:44] :)
[09:42:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[09:42:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[09:43:47] petan|wk: finding a better name does not block the redirection issue
[09:43:57] but I feel we need a real project name :-]]
[09:44:02] but you flagged it as blocker
[09:44:18] it is a blocker to create the git repositories in Gerrit yes
[09:44:24] eh: Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
[09:44:25] cause we will want to use the 'new' name
[09:44:30] this is another bug
[09:44:34] 2 bugs in one
[09:44:38] ohh let me fix that
[09:47:28] mutante: I like staging :-]
[09:48:19] I thought about "preprod" for "pre production"
[09:48:48] i think it fits pretty good in http://en.wikipedia.org/wiki/Staging_%28websites%29
[09:48:57] "The staging server will resemble the production environment where the clients can do the user acceptance testing"
[09:49:31] that would be it yes
[09:49:44] usually you do something like: dev -> stag -> preproduction -> production
[09:49:54] I guess preproduction would be test.wikipedia.org :-]
[09:50:29] "get it up on the stage"
[09:53:27] area52 :-]
[09:54:34] hah, unstable.wm
[09:54:45] !log deployment-prep hashar: /usr/local/apache/conf is now an independant git repository
[09:54:46] Logged the message, Master
[09:55:02] that would be the debian way of doing things :-]
[09:55:30] yea, Debian fan
[09:58:28] so "sid.wm" would always be the latest
[09:59:22] but we need to replace toy story naming scheme with wm related stuff
[10:04:27] lets name it Roan so :-]
[10:05:22] hashar: i just imagined a link on a BZ ticket that automagically adds/integrates a voting template somewhere on meta
[10:05:54] or is BZ itself good for actually voting
[10:06:56] haha nice, i guess he will agree
[10:10:43] PROBLEM host: analytics is DOWN address: i-000000e2
CRITICAL - Host Unreachable (i-000000e2)
[10:12:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[10:12:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[10:40:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[10:42:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[10:42:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[10:43:07] hmm
[10:43:16] somehow there is nothing set up for http://labs.wikimedia.beta.wmflabs.org/
[10:43:24] so it most probably ends up in a default virtual host
[10:57:13] petan|wk: do you have any script to reload the Apaches?
[11:05:00] New patchset: Hashar; "classes for deployment preparation project (beta)" [operations/puppet] (test) - https://gerrit.wikimedia.org/r/5790
[11:05:14] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/5790
[11:05:21] New review: Hashar; "Adds `dsh` package" [operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/5790
[11:05:53] mutante: I have added you as a review for https://gerrit.wikimedia.org/r/5790
[11:06:17] which is an utility class for the deployment-pre (beta) / staging / R.O.A.N project
[11:08:22] !log deployment-prep hashar: install dsh package on deployment-dbdump
[11:08:23] Logged the message, Master
[11:10:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[11:12:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[11:12:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[11:16:16] New review: Dzahn; "wait for the naming bug you opened before you add the class name now?"
[operations/puppet] (test); V: 0 C: 0; - https://gerrit.wikimedia.org/r/5790
[11:17:08] New review: Hashar; "Pending better group name : https://bugzilla.wikimedia.org/36414" [operations/puppet] (test); V: 0 C: -1; - https://gerrit.wikimedia.org/r/5790
[11:20:42] ahhhh
[11:20:54] we need net groups :-D
[11:30:26] broke the site :-D
[11:30:28] (again)
[11:35:06] hashar: there is no script so far because of weird sudo config
[11:35:22] we need a bug report so :-)
[11:35:24] well two
[11:35:28] if sudo didn't need a password every time it would be great
[11:35:44] but now it's a problem
[11:38:10] hashar: so stage?
[11:38:20] mutante: can you register dns
[11:38:32] stage.wmflabs.org
[11:39:53] btw hashar I deployed udp logs
[11:39:58] it works fine
[11:40:43] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[11:41:45] Change on mediawiki a page Wikimedia Labs was modified, changed by OrenBochman link https://www.mediawiki.org/w/index.php?diff=531502 edit summary: /* Proposals */
[11:42:13] PROBLEM host: ganglia-test is DOWN address: i-00000202 CRITICAL - Host Unreachable (i-00000202)
[11:42:15] petan|wk: I have noticed that
[11:42:20] ok
[11:42:21] :-)
[11:42:23] PROBLEM host: salt is DOWN address: i-000001c1 CRITICAL - Host Unreachable (i-000001c1)
[11:42:59] sudo: no tty present and no askpass program specified
[11:43:02] ohhh
[11:43:08] DISK CRITICAL - free space: / 0 MB (0% inode=53%):
[11:43:12] meh
[11:43:24] hashar: where
[11:43:42] on bastion1 while doing : ssh deployment-web3 sudo service apache restart
[11:45:59] we will need to find out a way to remotely connect using a passwordless key
[11:46:02] or something like that
[11:46:51] !log deployment-prep petrb: purged stuff on transcoding and freed some 119468kb
[11:46:52] Logged the message, Master
[11:47:33] oh yes I think we need a ticket for that
[11:49:23] petan|wk: https://bugzilla.wikimedia.org/36422 "easily reload all apaches"
[11:49:32] I meant sudo
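Editor's sketch, not part of the log: the "easily reload all apaches" request and the `no tty present` error could be approached roughly as follows. This is a minimal illustration under stated assumptions, not the script anyone in the channel wrote. The function names `reload_commands` and `reload_all` are hypothetical, the service name `apache2` is an assumption (the log shows `apache`), and only `deployment-web3` is a host name taken from the log. `ssh -t` forces pseudo-terminal allocation, which gives sudo a terminal to prompt on; it does not remove the password prompt itself.

```python
import shlex
import subprocess


def reload_commands(hosts, service="apache2"):
    """Build one ssh argv per host.

    -t forces a pseudo-terminal so that sudo on the remote side has a tty.
    The service name is an assumption; adjust it to the init script in use.
    """
    return [["ssh", "-t", host, "sudo", "service", service, "restart"]
            for host in hosts]


def reload_all(hosts, service="apache2", dry_run=True):
    """Restart the service on every host, or just print the commands.

    Returns a dict mapping host -> exit code (None in dry-run mode).
    """
    results = {}
    for argv in reload_commands(hosts, service):
        host = argv[2]
        if dry_run:
            # Show what would be executed without touching any host.
            print(shlex.join(argv))
            results[host] = None
        else:
            results[host] = subprocess.run(argv).returncode
    return results


if __name__ == "__main__":
    # Dry run only: prints the command for the one host named in the log.
    reload_all(["deployment-web3"])
```

Run with `dry_run=True` (the default) to inspect the commands first. An SSH key loaded in an agent, as suggested later in the log, would avoid the SSH login prompt for each host but not the sudo password prompt.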
[11:49:42] it's labs issue not project related
[11:50:42] man sudoers , see 'visiblepw'
[11:50:43] RECOVERY Disk Space is now: OK on deployment-transcoding i-00000105 output: DISK OK
[11:50:50] looks like it is intended
[11:51:05] it should work without pw
[11:51:20] need to set 'visiblepw' in /etc/sudoers so :-]
[11:51:32] anyway that does not scale
[11:51:39] we need ssh keys so we can load them in an agent
[11:52:40] !log deployment-prep j: add transcoding settings to CommonSettings.php again
[11:52:41] Logged the message, Master
[11:52:57] !log deployment-prep petrb: changed sudo policies on web to test if puppet override it
[11:52:58] Logged the message, Master
[11:53:21] hashar: it doesn't require pw on -web now
[11:53:31] I manualy changed policies
[11:53:37] but I think puppet will override it
[11:54:43] run puppet you will see :-]
[11:54:46] sudo puppetd -tv
[11:54:49] ok
[11:55:22] yes
[11:55:23] it did
[11:55:36] -%sudo ALL=(ALL) NOPASSWD: ALL
[11:55:36] +%sudo ALL=(ALL) ALL
[11:55:36] #
[11:55:36] #includedir /etc/sudoers.d
[11:55:43] meh
[11:57:29] hashar: we could create a local script with setuid
[11:57:46] 4755
[11:57:55] but that may not work on gluster dunno
[11:58:11] mutante: some idea?
[12:05:54] production uses $wgUseTidy = true?
[12:06:04] mutante: yes
[12:06:07] eh
[12:06:10] j^: yes
[12:06:11] mutante: nvm
[12:06:39] and that considers