[00:02:16] PROBLEM Puppet freshness is now: CRITICAL on bots-apache1 i-000000b0 output: Puppet has not run in last 20 hours
[00:02:16] PROBLEM Puppet freshness is now: CRITICAL on secondinstance i-0000015b output: Puppet has not run in last 20 hours
[00:04:39] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[00:05:14] PROBLEM Puppet freshness is now: CRITICAL on ganglia-collector i-000000b7 output: Puppet has not run in last 20 hours
[00:08:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[00:10:04] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[00:10:04] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[00:12:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-deb i-000002b5 output: Puppet has not run in last 20 hours
[00:12:14] PROBLEM Puppet freshness is now: CRITICAL on mwreview-test6 i-000002b9 output: Puppet has not run in last 20 hours
[00:12:14] PROBLEM Puppet freshness is now: CRITICAL on queue-wiki1 i-000002b8 output: Puppet has not run in last 20 hours
[00:12:14] PROBLEM Puppet freshness is now: CRITICAL on redis1 i-000002b6 output: Puppet has not run in last 20 hours
[00:12:14] PROBLEM Puppet freshness is now: CRITICAL on zeromq1 i-000002b7 output: Puppet has not run in last 20 hours
[00:13:14] PROBLEM Puppet freshness is now: CRITICAL on shop-analytics-main i-000001e6 output: Puppet has not run in last 20 hours
[00:27:08] PROBLEM Puppet freshness is now: CRITICAL on building i-0000014d output: Puppet has not run in last 20 hours
[00:32:12] PROBLEM Puppet freshness is now: CRITICAL on swift-fe1 i-000001d2 output: Puppet has not run in last 20 hours
[00:35:03] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[00:38:42] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[00:40:32] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[00:40:42] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[00:42:12] PROBLEM Puppet freshness is now: CRITICAL on firstinstance i-0000013e output: Puppet has not run in last 20 hours
[00:42:12] PROBLEM Puppet freshness is now: CRITICAL on log1 i-00000239 output: Puppet has not run in last 20 hours
[00:45:12] PROBLEM Puppet freshness is now: CRITICAL on pybal-precise i-00000289 output: Puppet has not run in last 20 hours
[00:47:12] PROBLEM Puppet freshness is now: CRITICAL on reportcard2 i-000001ea output: Puppet has not run in last 20 hours
[01:01:25] PROBLEM Puppet freshness is now: CRITICAL on deployment-imagescaler01 i-0000025a output: Puppet has not run in last 20 hours
[01:01:25] PROBLEM Puppet freshness is now: CRITICAL on worker1 i-00000208 output: Puppet has not run in last 20 hours
[01:05:10] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[01:08:44] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[01:09:14] PROBLEM Puppet freshness is now: CRITICAL on migration1 i-00000261 output: Puppet has not run in last 20 hours
[01:10:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-bits i-00000264 output: Puppet has not run in last 20 hours
[01:10:44] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[01:11:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[01:18:14] PROBLEM Puppet freshness is now: CRITICAL on catsort-pub i-000001cc output: Puppet has not run in last 20 hours
[01:20:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache21 i-0000026d output: Puppet has not run in last 20 hours
[01:20:14] PROBLEM Puppet freshness is now: CRITICAL on labs-build1 i-0000006b output: Puppet has not run in last 20 hours
[01:22:14] PROBLEM Puppet freshness is now: CRITICAL on bots-sql2 i-000000af output: Puppet has not run in last 20 hours
[01:22:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache20 i-0000026c output: Puppet has not run in last 20 hours
[01:22:14] PROBLEM Puppet freshness is now: CRITICAL on dumps-2 i-00000257 output: Puppet has not run in last 20 hours
[01:24:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-sql i-000000d0 output: Puppet has not run in last 20 hours
[01:24:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-wmsearch i-000000e1 output: Puppet has not run in last 20 hours
[01:24:14] PROBLEM Puppet freshness is now: CRITICAL on hugglewa-1 i-000001e0 output: Puppet has not run in last 20 hours
[01:24:14] PROBLEM Puppet freshness is now: CRITICAL on master i-0000007a output: Puppet has not run in last 20 hours
[01:24:14] PROBLEM Puppet freshness is now: CRITICAL on mingledbtest i-00000283 output: Puppet has not run in last 20 hours
[01:26:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-syslog i-00000269 output: Puppet has not run in last 20 hours
[01:26:14] PROBLEM Puppet freshness is now: CRITICAL on p-b i-000000ae output: Puppet has not run in last 20 hours
[01:27:14] PROBLEM Puppet freshness is now: CRITICAL on feeds i-000000fa output: Puppet has not run in last 20 hours
[01:27:14] PROBLEM Puppet freshness is now: CRITICAL on incubator-bot0 i-00000296 output: Puppet has not run in last 20 hours
[01:28:14] PROBLEM Puppet freshness is now: CRITICAL on asher1 i-0000003a output: Puppet has not run in last 20 hours
[01:28:14] PROBLEM Puppet freshness is now: CRITICAL on pediapress-ocg2 i-00000234 output: Puppet has not run in last 20 hours
[01:28:14] PROBLEM Puppet freshness is now: CRITICAL on udp-filter i-000001df output: Puppet has not run in last 20 hours
[01:29:14] PROBLEM Puppet freshness is now: CRITICAL on bob i-0000012d output: Puppet has not run in last 20 hours
[01:29:14] PROBLEM Puppet freshness is now: CRITICAL on bots-dev i-00000190 output: Puppet has not run in last 20 hours
[01:29:14] PROBLEM Puppet freshness is now: CRITICAL on e3 i-00000291 output: Puppet has not run in last 20 hours
[01:29:14] PROBLEM Puppet freshness is now: CRITICAL on testing-groupchange i-00000205 output: Puppet has not run in last 20 hours
[01:30:14] PROBLEM Puppet freshness is now: CRITICAL on bots-labs i-0000015e output: Puppet has not run in last 20 hours
[01:30:14] PROBLEM Puppet freshness is now: CRITICAL on demo-deployment1 i-00000276 output: Puppet has not run in last 20 hours
[01:30:14] PROBLEM Puppet freshness is now: CRITICAL on fundraising-db i-0000015c output: Puppet has not run in last 20 hours
[01:30:14] PROBLEM Puppet freshness is now: CRITICAL on otrs-jgreen i-0000015a output: Puppet has not run in last 20 hours
[01:30:14] PROBLEM Puppet freshness is now: CRITICAL on ve-nodejs i-00000245 output: Puppet has not run in last 20 hours
[01:31:14] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test4 i-000002a2 output: Puppet has not run in last 20 hours
[01:31:14] PROBLEM Puppet freshness is now: CRITICAL on incubator-common i-00000254 output: Puppet has not run in last 20 hours
[01:31:14] PROBLEM Puppet freshness is now: CRITICAL on varnish i-000001ac output: Puppet has not run in last 20 hours
[01:32:15] PROBLEM Puppet freshness is now: CRITICAL on tw-next i-0000027e output: Puppet has not run in last 20 hours
[01:33:18] PROBLEM Puppet freshness is now: CRITICAL on bots-nfs i-000000b1 output: Puppet has not run in last 20 hours
[01:33:18] PROBLEM Puppet freshness is now: CRITICAL on outreacheval i-0000012e output: Puppet has not run in last 20 hours
[01:33:18] PROBLEM Puppet freshness is now: CRITICAL on webserver-lcarr i-00000134 output: Puppet has not run in last 20 hours
[01:35:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-jobrunner05 i-0000028c output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on bots-sql3 i-000000b4 output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on build-precise1 i-00000273 output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on en-wiki-db-lucid i-0000023b output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on exim-test i-00000265 output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on fr-wiki-db-precise i-0000023e output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on incubator-apache i-00000211 output: Puppet has not run in last 20 hours
[01:37:15] PROBLEM Puppet freshness is now: CRITICAL on labs-realserver i-00000104 output: Puppet has not run in last 20 hours
[01:37:16] PROBLEM Puppet freshness is now: CRITICAL on nginx-ffuqua-doom1-3 i-00000196 output: Puppet has not run in last 20 hours
[01:37:16] PROBLEM Puppet freshness is now: CRITICAL on pediapress-ocg1 i-00000233 output: Puppet has not run in last 20 hours
[01:37:17] PROBLEM Puppet freshness is now: CRITICAL on pediapress-packager i-000001e4 output: Puppet has not run in last 20 hours
[01:37:17] PROBLEM Puppet freshness is now: CRITICAL on scribunto i-0000022c output: Puppet has not run in last 20 hours
[01:37:18] PROBLEM Puppet freshness is now: CRITICAL on simplewikt i-00000149 output: Puppet has not run in last 20 hours
[01:37:18] PROBLEM Puppet freshness is now: CRITICAL on tutorial-mysql i-0000028b output: Puppet has not run in last 20 hours
[01:37:19] PROBLEM Puppet freshness is now: CRITICAL on vumi-gw1 i-0000008f output: Puppet has not run in last 20 hours
[01:37:43] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[01:38:44] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[01:39:14] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test3 i-0000025b output: Puppet has not run in last 20 hours
[01:39:14] PROBLEM Puppet freshness is now: CRITICAL on incubator-bot1 i-00000251 output: Puppet has not run in last 20 hours
[01:39:14] PROBLEM Puppet freshness is now: CRITICAL on pageviews i-000000b2 output: Puppet has not run in last 20 hours
[01:40:54] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[01:41:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[01:42:14] RECOVERY Disk Space is now: OK on ipv6test1 i-00000282 output: DISK OK
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on bastion-restricted1 i-0000019b output: Puppet has not run in last 20 hours
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-mc i-0000021b output: Puppet has not run in last 20 hours
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on nova-dev3 i-000000e9 output: Puppet has not run in last 20 hours
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on patchtest i-000000f1 output: Puppet has not run in last 20 hours
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on swift-aux2 i-0000024c output: Puppet has not run in last 20 hours
[01:43:14] PROBLEM Puppet freshness is now: CRITICAL on vumi i-000001e5 output: Puppet has not run in last 20 hours
[01:44:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-squid i-000000dc output: Puppet has not run in last 20 hours
[01:44:14] PROBLEM Puppet freshness is now: CRITICAL on embed-sandbox i-000000d1 output: Puppet has not run in last 20 hours
[01:44:14] PROBLEM Puppet freshness is now: CRITICAL on opengrok-web i-000001e1 output: Puppet has not run in last 20 hours
[01:44:14] PROBLEM Puppet freshness is now: CRITICAL on swift-aux1 i-0000024b output: Puppet has not run in last 20 hours
[01:44:14] PROBLEM Puppet freshness is now: CRITICAL on vivek-puppet i-000000ca output: Puppet has not run in last 20 hours
[01:46:14] PROBLEM Puppet freshness is now: CRITICAL on dev-solr i-00000152 output: Puppet has not run in last 20 hours
[01:47:15] PROBLEM Puppet freshness is now: CRITICAL on demo-web2 i-00000285 output: Puppet has not run in last 20 hours
[01:47:15] PROBLEM Puppet freshness is now: CRITICAL on en-wiki-db-precise i-0000023c output: Puppet has not run in last 20 hours
[01:47:15] PROBLEM Puppet freshness is now: CRITICAL on ubuntu1-pgehres i-000000fb output: Puppet has not run in last 20 hours
[01:48:14] PROBLEM Puppet freshness is now: CRITICAL on maps-test2 i-00000253 output: Puppet has not run in last 20 hours
[01:49:14] PROBLEM Puppet freshness is now: CRITICAL on bots-1 i-000000a9 output: Puppet has not run in last 20 hours
[01:50:14] PROBLEM Disk Space is now: WARNING on ipv6test1 i-00000282 output: DISK WARNING - free space: / 72 MB (5% inode=58%):
[01:51:14] PROBLEM Puppet freshness is now: CRITICAL on build1 i-000002b3 output: Puppet has not run in last 20 hours
[01:51:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-nfs-memc i-000000d7 output: Puppet has not run in last 20 hours
[01:52:15] PROBLEM Puppet freshness is now: CRITICAL on hugglewa-db i-00000188 output: Puppet has not run in last 20 hours
[01:52:15] PROBLEM Puppet freshness is now: CRITICAL on maps-test3 i-0000028f output: Puppet has not run in last 20 hours
[01:52:15] PROBLEM Puppet freshness is now: CRITICAL on nova-ldap2 i-00000238 output: Puppet has not run in last 20 hours
[01:53:14] PROBLEM Puppet freshness is now: CRITICAL on translation-memory-1 i-0000013a output: Puppet has not run in last 20 hours
[02:08:44] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[02:08:44] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[02:10:54] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[02:11:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[02:39:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[02:39:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[02:40:54] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[02:41:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[02:44:44] RECOVERY Current Load is now: OK on mobile-wlm i-000002bc output: OK - load average: 0.48, 0.43, 0.54
[02:45:04] RECOVERY Current Users is now: OK on mobile-wlm i-000002bc output: USERS OK - 0 users currently logged in
[02:45:44] RECOVERY Free ram is now: OK on mobile-wlm i-000002bc output: OK: 83% free memory
[02:46:34] RECOVERY Disk Space is now: OK on mobile-wlm i-000002bc output: DISK OK
[02:48:34] RECOVERY Total Processes is now: OK on mobile-wlm i-000002bc output: PROCS OK: 91 processes
[02:48:44] RECOVERY dpkg-check is now: OK on mobile-wlm i-000002bc output: All packages OK
[02:53:40] PROBLEM dpkg-check is now: CRITICAL on nova-daas-1 i-000000e7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[02:58:40] RECOVERY dpkg-check is now: OK on nova-daas-1 i-000000e7 output: All packages OK
[03:01:51] PROBLEM Disk Space is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:01:51] PROBLEM Current Load is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:01:51] PROBLEM SSH is now: CRITICAL on bots-sql2 i-000000af output: CRITICAL - Socket timeout after 10 seconds
[03:01:51] PROBLEM Current Users is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:02:57] PROBLEM Total Processes is now: CRITICAL on rds i-00000207 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:03:02] PROBLEM Current Users is now: CRITICAL on ganglia-test5 i-000002a7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:03:16] PROBLEM Total Processes is now: CRITICAL on ganglia-test5 i-000002a7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:04:21] 06/03/2012 - 03:04:21 - Updating keys for laner at /export/home/deployment-prep/laner
[03:04:28] PROBLEM Disk Space is now: CRITICAL on ganglia-test5 i-000002a7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:04:28] PROBLEM Free ram is now: CRITICAL on ganglia-test5 i-000002a7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:04:28] PROBLEM dpkg-check is now: CRITICAL on ganglia-test5 i-000002a7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:05:01] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 7.89, 8.69, 5.50
[03:06:34] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 4.75, 10.03, 7.25
[03:07:01] PROBLEM Total Processes is now: CRITICAL on ganglia-test6 i-000002af output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:07:06] PROBLEM dpkg-check is now: CRITICAL on ganglia-test6 i-000002af output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:07:06] PROBLEM Free ram is now: CRITICAL on ganglia-test6 i-000002af output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:07:15] RECOVERY Disk Space is now: OK on rds i-00000207 output: DISK OK
[03:07:15] RECOVERY Current Load is now: OK on rds i-00000207 output: OK - load average: 10.01, 7.60, 4.36
[03:07:15] RECOVERY Current Users is now: OK on rds i-00000207 output: USERS OK - 0 users currently logged in
[03:10:02] RECOVERY Disk Space is now: OK on ganglia-test5 i-000002a7 output: DISK OK
[03:10:02] RECOVERY Free ram is now: OK on ganglia-test5 i-000002a7 output: OK: 84% free memory
[03:10:02] RECOVERY dpkg-check is now: OK on ganglia-test5 i-000002a7 output: All packages OK
[03:10:06] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[03:10:42] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[03:11:23] PROBLEM Current Users is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:23] PROBLEM Disk Space is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:11:23] PROBLEM Free ram is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds.
[03:12:27] RECOVERY SSH is now: OK on bots-sql2 i-000000af output: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[03:13:53] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[03:13:53] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[03:13:58] RECOVERY Total Processes is now: OK on ganglia-test5 i-000002a7 output: PROCS OK: 181 processes
[03:14:03] RECOVERY Current Users is now: OK on ganglia-test5 i-000002a7 output: USERS OK - 0 users currently logged in
[03:14:03] RECOVERY Total Processes is now: OK on rds i-00000207 output: PROCS OK: 75 processes
[03:15:18] RECOVERY Total Processes is now: OK on ganglia-test6 i-000002af output: PROCS OK: 80 processes
[03:15:28] RECOVERY dpkg-check is now: OK on ganglia-test6 i-000002af output: All packages OK
[03:15:28] RECOVERY Free ram is now: OK on ganglia-test6 i-000002af output: OK: 91% free memory
[03:16:08] RECOVERY Current Users is now: OK on upload-wizard i-0000021c output: USERS OK - 0 users currently logged in
[03:16:08] RECOVERY Disk Space is now: OK on upload-wizard i-0000021c output: DISK OK
[03:16:13] RECOVERY Free ram is now: OK on upload-wizard i-0000021c output: OK: 94% free memory
[03:19:38] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 1.16, 3.29, 4.83
[03:25:21] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 0.77, 1.06, 3.57
[03:30:18] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 1.27, 0.87, 2.76
[03:38:18] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 16% free memory
[03:40:28] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[03:42:58] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[03:43:56] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[03:43:56] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[03:44:51] PROBLEM Puppet freshness is now: CRITICAL on blamemaps-m1small i-000002a1 output: Puppet has not run in last 20 hours
[03:56:53] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 16% free memory
[03:58:23] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 3% free memory
[04:02:00] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 15% free memory
[04:03:30] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 96% free memory
[04:10:10] PROBLEM Puppet freshness is now: CRITICAL on mobile-testing i-00000271 output: Puppet has not run in last 20 hours
[04:11:01] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[04:12:10] PROBLEM Puppet freshness is now: CRITICAL on precise-test i-00000231 output: Puppet has not run in last 20 hours
[04:14:10] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[04:14:10] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[04:14:10] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[04:15:20] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 17% free memory
[04:17:00] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 3% free memory
[04:20:26] PROBLEM Puppet freshness is now: CRITICAL on bots-3 i-000000e5 output: Puppet has not run in last 20 hours
[04:22:03] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 96% free memory
[04:23:13] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test5 i-000002a7 output: Puppet has not run in last 20 hours
[04:23:13] PROBLEM Puppet freshness is now: CRITICAL on mwreview i-000002ae output: Puppet has not run in last 20 hours
[04:24:13] PROBLEM Puppet freshness is now: CRITICAL on incubator-bot2 i-00000252 output: Puppet has not run in last 20 hours
[04:25:13] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test6 i-000002af output: Puppet has not run in last 20 hours
[04:26:13] PROBLEM Puppet freshness is now: CRITICAL on localpuppet1 i-0000020b output: Puppet has not run in last 20 hours
[04:27:03] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory
[04:33:13] PROBLEM Puppet freshness is now: CRITICAL on maps-tilemill1 i-00000294 output: Puppet has not run in last 20 hours
[04:37:03] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 94% free memory
[04:40:23] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 3% free memory
[04:41:03] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[04:44:13] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[04:44:13] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[04:44:13] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[04:45:23] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 97% free memory
[04:50:13] PROBLEM Puppet freshness is now: CRITICAL on mailman-01 i-00000235 output: Puppet has not run in last 20 hours
[04:53:56] RECOVERY Disk Space is now: OK on ipv6test1 i-00000282 output: DISK OK
[05:01:53] PROBLEM Disk Space is now: WARNING on ipv6test1 i-00000282 output: DISK WARNING - free space: / 72 MB (5% inode=58%):
[05:11:03] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[05:14:13] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[05:14:13] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[05:14:13] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[05:41:03] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[05:44:13] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[05:44:13] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[05:44:13] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[06:11:03] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[06:14:13] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[06:14:13] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[06:14:13] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[06:21:13] PROBLEM Puppet freshness is now: CRITICAL on bots-cb i-0000009e output: Puppet has not run in last 20 hours
[06:32:29] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:32:29] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:34:34] PROBLEM Total Processes is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:37:19] RECOVERY Disk Space is now: OK on pediapress-ocg2 i-00000234 output: DISK OK
[06:37:19] RECOVERY dpkg-check is now: OK on pediapress-ocg2 i-00000234 output: All packages OK
[06:38:12] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 4.24, 4.62, 2.64
[06:38:12] RECOVERY Total Processes is now: OK on e3 i-00000291 output: PROCS OK: 94 processes
[06:38:17] PROBLEM dpkg-check is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:17] PROBLEM Current Load is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:18] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg1 i-00000233 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:18] PROBLEM Current Users is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:18] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg1 i-00000233 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:18] PROBLEM Current Users is now: CRITICAL on pediapress-ocg1 i-00000233 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:52] PROBLEM Disk Space is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:38:52] PROBLEM Free ram is now: CRITICAL on nova-precise1 i-00000236 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:41:19] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d)
[06:41:29] PROBLEM Current Load is now: CRITICAL on ganglia-test4 i-000002a2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:41:29] PROBLEM Total Processes is now: CRITICAL on ganglia-test4 i-000002a2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:43:06] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 1.17, 2.56, 2.29
[06:43:07] RECOVERY Current Load is now: OK on nova-precise1 i-00000236 output: OK - load average: 8.54, 7.58, 4.23
[06:43:07] RECOVERY Current Users is now: OK on nova-precise1 i-00000236 output: USERS OK - 0 users currently logged in
[06:43:07] RECOVERY dpkg-check is now: OK on nova-precise1 i-00000236 output: All packages OK
[06:43:22] PROBLEM Current Users is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:43:22] PROBLEM dpkg-check is now: CRITICAL on deployment-jobrunner05 i-0000028c output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:43:36] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 7.57, 7.39, 5.50
[06:43:36] RECOVERY Disk Space is now: OK on nova-precise1 i-00000236 output: DISK OK
[06:43:36] RECOVERY Free ram is now: OK on nova-precise1 i-00000236 output: OK: 76% free memory
[06:43:46] PROBLEM Disk Space is now: CRITICAL on ganglia-test4 i-000002a2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:43:46] PROBLEM dpkg-check is now: CRITICAL on ganglia-test4 i-000002a2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:44:07] PROBLEM Current Load is now: CRITICAL on pediapress-ocg1 i-00000233 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:44:07] PROBLEM Total Processes is now: CRITICAL on pediapress-ocg1 i-00000233 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:45:32] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250)
[06:45:32] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c)
[06:45:32] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293)
[06:46:18] PROBLEM Total Processes is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:48:56] RECOVERY Disk Space is now: OK on ipv6test1 i-00000282 output: DISK OK
[06:49:06] PROBLEM Total Processes is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:49:14] PROBLEM Free ram is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:49:14] PROBLEM Current Load is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:49:14] PROBLEM HTTP is now: CRITICAL on mailman-01 i-00000235 output: Connection refused
[06:49:14] RECOVERY Disk Space is now: OK on pediapress-ocg1 i-00000233 output: DISK OK
[06:49:14] RECOVERY Current Users is now: OK on pediapress-ocg1 i-00000233 output: USERS OK - 0 users currently logged in
[06:49:14] RECOVERY dpkg-check is now: OK on pediapress-ocg1 i-00000233 output: All packages OK
[06:49:14] RECOVERY Current Load is now: OK on pediapress-ocg1 i-00000233 output: OK - load average: 1.97, 3.95, 3.31
[06:49:15] RECOVERY Total Processes is now: OK on pediapress-ocg1 i-00000233 output: PROCS OK: 90 processes
[06:49:28] PROBLEM dpkg-check is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:49:28] PROBLEM Free ram is now: CRITICAL on deployment-jobrunner05 i-0000028c output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:49:28] PROBLEM Disk Space is now: CRITICAL on incubator-bot0 i-00000296 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:50:30] PROBLEM dpkg-check is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:51:55] PROBLEM Free ram is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:51:56] PROBLEM Free ram is now: CRITICAL on e3 i-00000291 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:51:56] PROBLEM Current Users is now: CRITICAL on ganglia-test4 i-000002a2 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:54:26] RECOVERY Disk Space is now: OK on deployment-transcoding i-00000105 output: DISK OK
[06:56:18] PROBLEM Total Processes is now: CRITICAL on zeromq1 i-000002b7 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:56:38] PROBLEM Free ram is now: CRITICAL on en-wiki-db-precise i-0000023c output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:56:38] PROBLEM Free ram is now: CRITICAL on maps-test2 i-00000253 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:56:38] PROBLEM dpkg-check is now: CRITICAL on maps-test2 i-00000253 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:56:38] PROBLEM Disk Space is now: CRITICAL on maps-test2 i-00000253 output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:57:11] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 3.85, 7.58, 5.91
[06:57:11] RECOVERY dpkg-check is now: OK on e3 i-00000291 output: All packages OK
[06:57:11] RECOVERY Free ram is now: OK on e3 i-00000291 output: OK: 91% free memory
[06:57:14] RECOVERY Disk Space is now: OK on ganglia-test4 i-000002a2 output: DISK OK
[06:57:14] RECOVERY Current Users is now: OK on ganglia-test4 i-000002a2 output: USERS OK - 0 users currently logged in
[06:57:14] RECOVERY dpkg-check is now: OK on ganglia-test4 i-000002a2 output: All packages OK
[06:58:36] PROBLEM Current Load is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds.
[06:59:17] PROBLEM Disk Space is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:02:27] PROBLEM dpkg-check is now: CRITICAL on pediapress-ocg2 i-00000234 output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:02:27] PROBLEM Disk Space is now: CRITICAL on en-wiki-db-precise i-0000023c output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:02:27] PROBLEM Current Users is now: CRITICAL on en-wiki-db-precise i-0000023c output: CHECK_NRPE: Socket timeout after 10 seconds.
[07:03:11] PROBLEM Current Load is now: WARNING on deployment-jobrunner05 i-0000028c output: WARNING - load average: 6.59, 9.63, 9.04 [07:04:13] PROBLEM Current Load is now: WARNING on ganglia-test4 i-000002a2 output: WARNING - load average: 5.50, 6.13, 5.45 [07:04:14] RECOVERY Total Processes is now: OK on ganglia-test4 i-000002a2 output: PROCS OK: 198 processes [07:04:23] RECOVERY Free ram is now: OK on en-wiki-db-precise i-0000023c output: OK: 84% free memory [07:04:24] RECOVERY Free ram is now: OK on maps-test2 i-00000253 output: OK: 89% free memory [07:04:24] RECOVERY dpkg-check is now: OK on maps-test2 i-00000253 output: All packages OK [07:04:29] RECOVERY Disk Space is now: OK on maps-test2 i-00000253 output: DISK OK [07:04:29] RECOVERY Free ram is now: OK on deployment-jobrunner05 i-0000028c output: OK: 87% free memory [07:04:29] RECOVERY Current Users is now: OK on incubator-bot0 i-00000296 output: USERS OK - 0 users currently logged in [07:04:29] RECOVERY dpkg-check is now: OK on deployment-jobrunner05 i-0000028c output: All packages OK [07:07:20] RECOVERY Free ram is now: OK on zeromq1 i-000002b7 output: OK: 88% free memory [07:07:26] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 5.26, 7.28, 7.96 [07:08:47] RECOVERY Disk Space is now: OK on en-wiki-db-precise i-0000023c output: DISK OK [07:08:48] RECOVERY Current Users is now: OK on en-wiki-db-precise i-0000023c output: USERS OK - 0 users currently logged in [07:08:53] PROBLEM Disk Space is now: WARNING on ipv6test1 i-00000282 output: DISK WARNING - free space: / 77 MB (5% inode=57%): [07:09:14] RECOVERY Current Load is now: OK on ganglia-test4 i-000002a2 output: OK - load average: 0.20, 2.66, 4.18 [07:09:19] PROBLEM Free ram is now: CRITICAL on pybal-precise i-00000289 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:19] PROBLEM Disk Space is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:09:19] PROBLEM Current Users is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:19] PROBLEM Current Load is now: WARNING on pybal-precise i-00000289 output: WARNING - load average: 7.28, 9.00, 6.65 [07:09:24] PROBLEM Free ram is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:29] PROBLEM Current Load is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:09:29] PROBLEM Current Users is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:09] PROBLEM Total Processes is now: CRITICAL on upload-wizard i-0000021c output: CHECK_NRPE: Socket timeout after 10 seconds. [07:10:13] PROBLEM Disk Space is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:11:29] RECOVERY Total Processes is now: OK on zeromq1 i-000002b7 output: PROCS OK: 82 processes [07:11:40] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [07:11:45] PROBLEM dpkg-check is now: CRITICAL on bots-sql2 i-000000af output: CHECK_NRPE: Socket timeout after 10 seconds. [07:12:05] PROBLEM Total Processes is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:12:17] PROBLEM Free ram is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:12:17] PROBLEM dpkg-check is now: CRITICAL on mobile-testing i-00000271 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:12:58] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 9.52, 17.20, 11.64 [07:13:19] PROBLEM Total Processes is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. 
[07:14:12] PROBLEM Current Load is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:14:12] PROBLEM Current Users is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:14:13] RECOVERY Free ram is now: OK on pybal-precise i-00000289 output: OK: 79% free memory [07:14:13] PROBLEM Disk Space is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:14:13] PROBLEM Free ram is now: CRITICAL on worker1 i-00000208 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:14:58] PROBLEM Current Load is now: CRITICAL on maps-tilemill1 i-00000294 output: CHECK_NRPE: Socket timeout after 10 seconds. [07:15:47] PROBLEM Current Load is now: WARNING on upload-wizard i-0000021c output: WARNING - load average: 4.70, 6.77, 5.89 [07:15:47] RECOVERY dpkg-check is now: OK on bots-sql2 i-000000af output: All packages OK [07:15:47] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [07:15:47] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [07:15:47] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [07:17:53] RECOVERY Total Processes is now: OK on worker1 i-00000208 output: PROCS OK: 76 processes [07:18:29] RECOVERY Current Load is now: OK on worker1 i-00000208 output: OK - load average: 0.46, 3.83, 4.57 [07:18:29] RECOVERY Disk Space is now: OK on worker1 i-00000208 output: DISK OK [07:18:29] RECOVERY Current Users is now: OK on worker1 i-00000208 output: USERS OK - 0 users currently logged in [07:18:29] RECOVERY Free ram is now: OK on worker1 i-00000208 output: OK: 95% free memory [07:18:29] RECOVERY Disk Space is now: OK on upload-wizard i-0000021c output: DISK OK [07:18:29] RECOVERY Current Users is now: OK on upload-wizard i-0000021c output: USERS OK - 0 users currently logged in [07:18:30] RECOVERY Free ram is now: OK 
on upload-wizard i-0000021c output: OK: 94% free memory [07:18:30] RECOVERY Current Load is now: OK on pybal-precise i-00000289 output: OK - load average: 0.26, 2.64, 4.56 [07:18:31] RECOVERY Total Processes is now: OK on upload-wizard i-0000021c output: PROCS OK: 83 processes [07:18:35] PROBLEM Current Load is now: WARNING on mobile-testing i-00000271 output: WARNING - load average: 5.08, 11.91, 14.24 [07:18:37] RECOVERY Current Users is now: OK on mobile-testing i-00000271 output: USERS OK - 0 users currently logged in [07:18:37] RECOVERY Disk Space is now: OK on mobile-testing i-00000271 output: DISK OK [07:18:38] PROBLEM Current Load is now: WARNING on ganglia-test5 i-000002a7 output: WARNING - load average: 5.42, 9.84, 9.14 [07:19:25] PROBLEM Current Load is now: WARNING on maps-tilemill1 i-00000294 output: WARNING - load average: 0.38, 3.29, 5.03 [07:20:28] RECOVERY Current Load is now: OK on upload-wizard i-0000021c output: OK - load average: 0.21, 3.02, 4.54 [07:20:28] RECOVERY Total Processes is now: OK on mobile-testing i-00000271 output: PROCS OK: 131 processes [07:20:33] RECOVERY Free ram is now: OK on mobile-testing i-00000271 output: OK: 91% free memory [07:20:33] RECOVERY dpkg-check is now: OK on mobile-testing i-00000271 output: All packages OK [07:20:48] PROBLEM Current Load is now: WARNING on jenkins2 i-00000102 output: WARNING - load average: 0.70, 6.07, 6.89 [07:23:32] O_o [07:24:18] RECOVERY Current Load is now: OK on maps-tilemill1 i-00000294 output: OK - load average: 0.22, 1.35, 3.70 [07:25:48] RECOVERY Current Load is now: OK on jenkins2 i-00000102 output: OK - load average: 0.00, 2.23, 4.98 [07:28:28] RECOVERY Current Load is now: OK on ganglia-test5 i-000002a7 output: OK - load average: 0.02, 1.45, 4.93 [07:28:28] RECOVERY Current Load is now: OK on deployment-jobrunner05 i-0000028c output: OK - load average: 3.12, 3.47, 4.59 [07:32:28] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.17, 0.58, 3.74 
[07:38:28] RECOVERY Current Load is now: OK on mobile-testing i-00000271 output: OK - load average: 0.93, 0.99, 4.48 [07:43:28] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [07:47:48] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [07:48:28] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [07:48:28] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [07:54:08] PROBLEM Puppet freshness is now: CRITICAL on ipv6test1 i-00000282 output: Puppet has not run in last 20 hours [07:56:08] PROBLEM Puppet freshness is now: CRITICAL on rds i-00000207 output: Puppet has not run in last 20 hours [07:56:08] PROBLEM Puppet freshness is now: CRITICAL on upload-wizard i-0000021c output: Puppet has not run in last 20 hours [08:13:28] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [08:18:28] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [08:18:28] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [08:18:28] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [08:43:28] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [08:45:05] Change on mediawiki a page Wikimedia Labs was modified, changed by Husky link https://www.mediawiki.org/w/index.php?diff=546063 edit summary: What about having the URL in this document somewhere?
:) [08:48:28] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [08:48:28] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [08:48:28] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [09:08:08] PROBLEM Puppet freshness is now: CRITICAL on cn-wiki-db-lucid i-00000241 output: Puppet has not run in last 20 hours [09:13:28] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [09:15:08] PROBLEM Puppet freshness is now: CRITICAL on wikidata-dev-1 i-0000020c output: Puppet has not run in last 20 hours [09:16:08] PROBLEM Puppet freshness is now: CRITICAL on demo-mysql1 i-00000256 output: Puppet has not run in last 20 hours [09:18:08] PROBLEM Puppet freshness is now: CRITICAL on dumps-1 i-00000170 output: Puppet has not run in last 20 hours [09:18:08] PROBLEM Puppet freshness is now: CRITICAL on wikistream-1 i-0000016e output: Puppet has not run in last 20 hours [09:18:28] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [09:18:28] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [09:18:28] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [09:19:08] PROBLEM Puppet freshness is now: CRITICAL on swift-be3 i-000001c9 output: Puppet has not run in last 20 hours [09:23:08] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-upload i-00000263 output: Puppet has not run in last 20 hours [09:23:08] PROBLEM Puppet freshness is now: CRITICAL on publicdata-administration i-0000019e output: Puppet has not run in last 20 hours [09:23:08] PROBLEM Puppet freshness is now: CRITICAL on resourceloader2-apache i-000001d7 output: Puppet has not run in last 20 hours [09:23:08] PROBLEM Puppet freshness is now: CRITICAL on test2 i-0000013c output: Puppet 
has not run in last 20 hours [09:23:08] PROBLEM Puppet freshness is now: CRITICAL on wikidata-dev-3 i-00000225 output: Puppet has not run in last 20 hours [09:24:08] PROBLEM Puppet freshness is now: CRITICAL on venus i-000000ea output: Puppet has not run in last 20 hours [09:24:08] PROBLEM Puppet freshness is now: CRITICAL on wmde-test i-000002ad output: Puppet has not run in last 20 hours [09:25:08] PROBLEM Puppet freshness is now: CRITICAL on wikisource-web i-000000fe output: Puppet has not run in last 20 hours [09:30:08] PROBLEM Puppet freshness is now: CRITICAL on bots-sql1 i-000000b5 output: Puppet has not run in last 20 hours [09:30:08] PROBLEM Puppet freshness is now: CRITICAL on wikistats-01 i-00000042 output: Puppet has not run in last 20 hours [09:31:08] PROBLEM Puppet freshness is now: CRITICAL on bastion1 i-000000ba output: Puppet has not run in last 20 hours [09:31:08] PROBLEM Puppet freshness is now: CRITICAL on demo-web1 i-00000255 output: Puppet has not run in last 20 hours [09:31:08] PROBLEM Puppet freshness is now: CRITICAL on kripke i-00000268 output: Puppet has not run in last 20 hours [09:31:08] PROBLEM Puppet freshness is now: CRITICAL on labs-nfs1 i-0000005d output: Puppet has not run in last 20 hours [09:34:08] PROBLEM Puppet freshness is now: CRITICAL on bots-4 i-000000e8 output: Puppet has not run in last 20 hours [09:35:08] PROBLEM Puppet freshness is now: CRITICAL on labs-relay i-00000103 output: Puppet has not run in last 20 hours [09:36:08] PROBLEM Puppet freshness is now: CRITICAL on gerrit i-000000ff output: Puppet has not run in last 20 hours [09:37:08] PROBLEM Puppet freshness is now: CRITICAL on deployment-backup i-000000f8 output: Puppet has not run in last 20 hours [09:38:08] PROBLEM Puppet freshness is now: CRITICAL on bots-2 i-0000009c output: Puppet has not run in last 20 hours [09:38:08] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache22 i-0000026f output: Puppet has not run in last 20 hours [09:38:08] PROBLEM 
Puppet freshness is now: CRITICAL on grail i-0000021e output: Puppet has not run in last 20 hours [09:38:08] PROBLEM Puppet freshness is now: CRITICAL on jenkins2 i-00000102 output: Puppet has not run in last 20 hours [09:38:08] PROBLEM Puppet freshness is now: CRITICAL on swift-be2 i-000001c8 output: Puppet has not run in last 20 hours [09:39:08] PROBLEM Puppet freshness is now: CRITICAL on nginx-dev1 i-000000f0 output: Puppet has not run in last 20 hours [09:41:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache23 i-00000270 output: Puppet has not run in last 20 hours [09:41:15] PROBLEM Puppet freshness is now: CRITICAL on memcache-puppet i-00000153 output: Puppet has not run in last 20 hours [09:42:15] PROBLEM Puppet freshness is now: CRITICAL on php5builds i-00000192 output: Puppet has not run in last 20 hours [09:43:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [09:44:15] PROBLEM Puppet freshness is now: CRITICAL on mobile-feeds i-000000c1 output: Puppet has not run in last 20 hours [09:45:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-dbdump i-000000d2 output: Puppet has not run in last 20 hours [09:45:15] PROBLEM Puppet freshness is now: CRITICAL on robh2 i-000001a2 output: Puppet has not run in last 20 hours [09:45:15] PROBLEM Puppet freshness is now: CRITICAL on swift-be4 i-000001ca output: Puppet has not run in last 20 hours [09:47:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-thumbproxy i-0000026b output: Puppet has not run in last 20 hours [09:47:15] PROBLEM Puppet freshness is now: CRITICAL on fundraising-civicrm i-00000169 output: Puppet has not run in last 20 hours [09:48:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [09:48:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [09:48:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL 
- Host Unreachable (i-00000250) [09:49:15] PROBLEM Puppet freshness is now: CRITICAL on wep i-000000c2 output: Puppet has not run in last 20 hours [09:51:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-feed i-00000118 output: Puppet has not run in last 20 hours [09:55:15] PROBLEM Puppet freshness is now: CRITICAL on ee-prototype i-0000013d output: Puppet has not run in last 20 hours [09:57:15] PROBLEM Puppet freshness is now: CRITICAL on hugglewiki i-000000aa output: Puppet has not run in last 20 hours [09:59:11] PROBLEM Puppet freshness is now: CRITICAL on deployment-transcoding i-00000105 output: Puppet has not run in last 20 hours [10:03:15] PROBLEM Puppet freshness is now: CRITICAL on bots-apache1 i-000000b0 output: Puppet has not run in last 20 hours [10:03:15] PROBLEM Puppet freshness is now: CRITICAL on secondinstance i-0000015b output: Puppet has not run in last 20 hours [10:06:15] PROBLEM Puppet freshness is now: CRITICAL on ganglia-collector i-000000b7 output: Puppet has not run in last 20 hours [10:13:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-deb i-000002b5 output: Puppet has not run in last 20 hours [10:13:15] PROBLEM Puppet freshness is now: CRITICAL on mwreview-test6 i-000002b9 output: Puppet has not run in last 20 hours [10:13:15] PROBLEM Puppet freshness is now: CRITICAL on queue-wiki1 i-000002b8 output: Puppet has not run in last 20 hours [10:13:15] PROBLEM Puppet freshness is now: CRITICAL on redis1 i-000002b6 output: Puppet has not run in last 20 hours [10:13:15] PROBLEM Puppet freshness is now: CRITICAL on zeromq1 i-000002b7 output: Puppet has not run in last 20 hours [10:13:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [10:14:15] PROBLEM Puppet freshness is now: CRITICAL on shop-analytics-main i-000001e6 output: Puppet has not run in last 20 hours [10:16:45] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free 
space: / 78 MB (5% inode=52%): [10:17:15] PROBLEM Disk Space is now: WARNING on nagios 127.0.0.1 output: DISK WARNING - free space: /home/petrb 3569 MB (20% inode=77%): /home/dzahn 3569 MB (20% inode=77%): [10:18:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [10:18:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [10:18:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [10:26:06] mutante: can you have a look on puppet check [10:27:15] RECOVERY Disk Space is now: OK on nagios 127.0.0.1 output: DISK OK [10:28:15] PROBLEM Puppet freshness is now: CRITICAL on building i-0000014d output: Puppet has not run in last 20 hours [10:33:15] PROBLEM Puppet freshness is now: CRITICAL on swift-fe1 i-000001d2 output: Puppet has not run in last 20 hours [10:34:04] can someone tell mutante to check irc :D [10:43:15] PROBLEM Puppet freshness is now: CRITICAL on firstinstance i-0000013e output: Puppet has not run in last 20 hours [10:43:15] PROBLEM Puppet freshness is now: CRITICAL on log1 i-00000239 output: Puppet has not run in last 20 hours [10:43:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [10:46:15] PROBLEM Puppet freshness is now: CRITICAL on pybal-precise i-00000289 output: Puppet has not run in last 20 hours [10:48:15] PROBLEM Puppet freshness is now: CRITICAL on reportcard2 i-000001ea output: Puppet has not run in last 20 hours [10:48:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [10:48:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [10:48:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [10:51:07] petan|hackaton: you will have to stand up, all ops are busy in one of the small rooms [10:51:27] hashar:
are you with them [10:51:30] nop [10:51:54] hashar: btw hashar how do I access the labs puppet repo, what I have is probably not labs [10:52:14] I did git clone ssh://gerrit.wikimedia.org:29418/operations/puppet.git [10:52:23] you need the 'test' branch [10:52:28] so: git checkout test [10:52:29] there is no test branch atm [10:52:37] there is only 1 branch and that is production [10:52:40] oh [10:52:41] well [10:52:41] hmm [10:52:47] git checkout -b test -t origin/test [10:52:50] I did checkout test it created new branch [10:52:52] -b is to create a local branch [10:53:03] -t is to make it track changes from another branch [10:53:08] yes, but that created a new branch I need to checkout the current files [10:53:12] origin/test is an alias for refs/remotes/origin/test [10:53:35] so you might want to delete your local test [10:53:41] git checkout master # come back to master [10:53:48] git branch -D test # delete your local test branch [10:53:49] hm I did actually that command you sent [10:53:54] git checkout -b test -t origin/test [10:53:57] yes [10:53:58] that [10:54:00] that should do it [10:54:20] Branch test set up to track remote branch test from origin. [10:54:20] Switched to a new branch 'test' [10:55:20] ok [10:56:03] I am almost sure this is not config of labs there is no realm [10:56:47] damn that documentation on labs sucks [10:58:17] as any other documentation created by hackers themselves...
;-) [11:02:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-imagescaler01 i-0000025a output: Puppet has not run in last 20 hours [11:02:15] PROBLEM Puppet freshness is now: CRITICAL on worker1 i-00000208 output: Puppet has not run in last 20 hours [11:07:37] Change on mediawiki a page Wikimedia Labs was modified, changed by Hydriz link https://www.mediawiki.org/w/index.php?diff=546095 edit summary: Remove link, breaks templates [11:10:15] PROBLEM Puppet freshness is now: CRITICAL on migration1 i-00000261 output: Puppet has not run in last 20 hours [11:11:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-bits i-00000264 output: Puppet has not run in last 20 hours [11:13:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [11:15:15] PROBLEM Disk Space is now: WARNING on nagios 127.0.0.1 output: DISK WARNING - free space: /home/petrb 3588 MB (20% inode=77%): [11:18:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [11:18:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [11:18:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [11:19:15] PROBLEM Puppet freshness is now: CRITICAL on catsort-pub i-000001cc output: Puppet has not run in last 20 hours [11:21:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache21 i-0000026d output: Puppet has not run in last 20 hours [11:21:15] PROBLEM Puppet freshness is now: CRITICAL on labs-build1 i-0000006b output: Puppet has not run in last 20 hours [11:23:15] PROBLEM Puppet freshness is now: CRITICAL on bots-sql2 i-000000af output: Puppet has not run in last 20 hours [11:23:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache20 i-0000026c output: Puppet has not run in last 20 hours [11:23:15] PROBLEM Puppet freshness is now: CRITICAL on dumps-2 i-00000257 output: Puppet has not
run in last 20 hours [11:25:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-sql i-000000d0 output: Puppet has not run in last 20 hours [11:25:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-wmsearch i-000000e1 output: Puppet has not run in last 20 hours [11:25:15] PROBLEM Puppet freshness is now: CRITICAL on hugglewa-1 i-000001e0 output: Puppet has not run in last 20 hours [11:25:15] PROBLEM Puppet freshness is now: CRITICAL on master i-0000007a output: Puppet has not run in last 20 hours [11:25:15] PROBLEM Puppet freshness is now: CRITICAL on mingledbtest i-00000283 output: Puppet has not run in last 20 hours [11:27:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-syslog i-00000269 output: Puppet has not run in last 20 hours [11:27:15] PROBLEM Puppet freshness is now: CRITICAL on p-b i-000000ae output: Puppet has not run in last 20 hours [11:28:15] PROBLEM Puppet freshness is now: CRITICAL on feeds i-000000fa output: Puppet has not run in last 20 hours [11:28:15] PROBLEM Puppet freshness is now: CRITICAL on incubator-bot0 i-00000296 output: Puppet has not run in last 20 hours [11:29:15] PROBLEM Puppet freshness is now: CRITICAL on asher1 i-0000003a output: Puppet has not run in last 20 hours [11:29:15] PROBLEM Puppet freshness is now: CRITICAL on pediapress-ocg2 i-00000234 output: Puppet has not run in last 20 hours [11:29:15] PROBLEM Puppet freshness is now: CRITICAL on udp-filter i-000001df output: Puppet has not run in last 20 hours [11:30:15] PROBLEM Puppet freshness is now: CRITICAL on bob i-0000012d output: Puppet has not run in last 20 hours [11:30:15] PROBLEM Puppet freshness is now: CRITICAL on bots-dev i-00000190 output: Puppet has not run in last 20 hours [11:30:15] PROBLEM Puppet freshness is now: CRITICAL on e3 i-00000291 output: Puppet has not run in last 20 hours [11:30:15] PROBLEM Puppet freshness is now: CRITICAL on testing-groupchange i-00000205 output: Puppet has not run in last 20 hours [11:31:15] PROBLEM 
Puppet freshness is now: CRITICAL on bots-labs i-0000015e output: Puppet has not run in last 20 hours [11:31:15] PROBLEM Puppet freshness is now: CRITICAL on demo-deployment1 i-00000276 output: Puppet has not run in last 20 hours [11:31:15] PROBLEM Puppet freshness is now: CRITICAL on fundraising-db i-0000015c output: Puppet has not run in last 20 hours [11:31:15] PROBLEM Puppet freshness is now: CRITICAL on otrs-jgreen i-0000015a output: Puppet has not run in last 20 hours [11:31:15] PROBLEM Puppet freshness is now: CRITICAL on ve-nodejs i-00000245 output: Puppet has not run in last 20 hours [11:32:23] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test4 i-000002a2 output: Puppet has not run in last 20 hours [11:32:24] PROBLEM Puppet freshness is now: CRITICAL on incubator-common i-00000254 output: Puppet has not run in last 20 hours [11:32:24] PROBLEM Puppet freshness is now: CRITICAL on varnish i-000001ac output: Puppet has not run in last 20 hours [11:33:15] PROBLEM Puppet freshness is now: CRITICAL on tw-next i-0000027e output: Puppet has not run in last 20 hours [11:34:15] PROBLEM Puppet freshness is now: CRITICAL on bots-nfs i-000000b1 output: Puppet has not run in last 20 hours [11:34:15] PROBLEM Puppet freshness is now: CRITICAL on outreacheval i-0000012e output: Puppet has not run in last 20 hours [11:34:15] PROBLEM Puppet freshness is now: CRITICAL on webserver-lcarr i-00000134 output: Puppet has not run in last 20 hours [11:36:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-jobrunner05 i-0000028c output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on bots-sql3 i-000000b4 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on build-precise1 i-00000273 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on en-wiki-db-lucid i-0000023b output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet 
freshness is now: CRITICAL on exim-test i-00000265 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on fr-wiki-db-precise i-0000023e output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on incubator-apache i-00000211 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on labs-realserver i-00000104 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on nginx-ffuqua-doom1-3 i-00000196 output: Puppet has not run in last 20 hours [11:38:16] PROBLEM Puppet freshness is now: CRITICAL on pediapress-ocg1 i-00000233 output: Puppet has not run in last 20 hours [11:38:17] PROBLEM Puppet freshness is now: CRITICAL on pediapress-packager i-000001e4 output: Puppet has not run in last 20 hours [11:38:17] PROBLEM Puppet freshness is now: CRITICAL on scribunto i-0000022c output: Puppet has not run in last 20 hours [11:38:18] PROBLEM Puppet freshness is now: CRITICAL on simplewikt i-00000149 output: Puppet has not run in last 20 hours [11:40:15] PROBLEM Puppet freshness is now: CRITICAL on ganglia-test3 i-0000025b output: Puppet has not run in last 20 hours [11:40:15] PROBLEM Puppet freshness is now: CRITICAL on incubator-bot1 i-00000251 output: Puppet has not run in last 20 hours [11:40:15] PROBLEM Puppet freshness is now: CRITICAL on pageviews i-000000b2 output: Puppet has not run in last 20 hours [11:43:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [11:44:15] PROBLEM Puppet freshness is now: CRITICAL on bastion-restricted1 i-0000019b output: Puppet has not run in last 20 hours [11:44:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-mc i-0000021b output: Puppet has not run in last 20 hours [11:44:15] PROBLEM Puppet freshness is now: CRITICAL on nova-dev3 i-000000e9 output: Puppet has not run in last 20 hours [11:44:15] PROBLEM Puppet 
freshness is now: CRITICAL on patchtest i-000000f1 output: Puppet has not run in last 20 hours [11:44:15] PROBLEM Puppet freshness is now: CRITICAL on swift-aux2 i-0000024c output: Puppet has not run in last 20 hours [11:44:15] PROBLEM Puppet freshness is now: CRITICAL on vumi i-000001e5 output: Puppet has not run in last 20 hours [11:45:05] RECOVERY Puppet freshness is now: OK on pediapress-ocg1 i-00000233 output: puppet ran at Sun Jun 3 11:45:02 UTC 2012 vdsga [11:45:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-squid i-000000dc output: Puppet has not run in last 20 hours [11:45:15] PROBLEM Puppet freshness is now: CRITICAL on embed-sandbox i-000000d1 output: Puppet has not run in last 20 hours [11:45:15] PROBLEM Puppet freshness is now: CRITICAL on opengrok-web i-000001e1 output: Puppet has not run in last 20 hours [11:45:15] PROBLEM Puppet freshness is now: CRITICAL on swift-aux1 i-0000024b output: Puppet has not run in last 20 hours [11:45:15] PROBLEM Puppet freshness is now: CRITICAL on vivek-puppet i-000000ca output: Puppet has not run in last 20 hours [11:47:15] PROBLEM Puppet freshness is now: CRITICAL on dev-solr i-00000152 output: Puppet has not run in last 20 hours [11:48:15] PROBLEM Puppet freshness is now: CRITICAL on demo-web2 i-00000285 output: Puppet has not run in last 20 hours [11:48:15] PROBLEM Puppet freshness is now: CRITICAL on en-wiki-db-precise i-0000023c output: Puppet has not run in last 20 hours [11:48:15] PROBLEM Puppet freshness is now: CRITICAL on ubuntu1-pgehres i-000000fb output: Puppet has not run in last 20 hours [11:48:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [11:48:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [11:48:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [11:49:15] PROBLEM Puppet freshness is now: CRITICAL on maps-test2 i-00000253 output: 
Puppet has not run in last 20 hours [11:50:15] PROBLEM Puppet freshness is now: CRITICAL on bots-1 i-000000a9 output: Puppet has not run in last 20 hours [11:52:15] PROBLEM Puppet freshness is now: CRITICAL on build1 i-000002b3 output: Puppet has not run in last 20 hours [11:52:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-nfs-memc i-000000d7 output: Puppet has not run in last 20 hours [11:53:15] PROBLEM Puppet freshness is now: CRITICAL on hugglewa-db i-00000188 output: Puppet has not run in last 20 hours [11:53:15] PROBLEM Puppet freshness is now: CRITICAL on maps-test3 i-0000028f output: Puppet has not run in last 20 hours [11:53:15] PROBLEM Puppet freshness is now: CRITICAL on nova-ldap2 i-00000238 output: Puppet has not run in last 20 hours [11:54:15] PROBLEM Puppet freshness is now: CRITICAL on translation-memory-1 i-0000013a output: Puppet has not run in last 20 hours [11:56:40] Damianz: you know if there is a simple way to check if there is anything coming from instances to nagios [11:57:16] like the snmpd is running, command is working but I guess there is no traffic coming from instances to nagios server [12:02:15] PROBLEM Puppet freshness is now: CRITICAL on deployment-bastion i-000002bd output: Puppet has not run in last 20 hours [12:02:15] PROBLEM Puppet freshness is now: CRITICAL on dumps-incr i-000002bb output: Puppet has not run in last 20 hours [12:02:15] PROBLEM Puppet freshness is now: CRITICAL on mobile-wlm i-000002bc output: Puppet has not run in last 20 hours [12:08:36] Thehelpfulone: can you do @trustadd .*@conference\/mediawiki.* trusted [12:08:46] or maybe just conference/mediawiki [12:11:22] Ryan_Lane: can you poke mutante [12:11:40] petan|hackaton: he isn't here [12:13:06] yay [12:13:18] where is he :O [12:13:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [12:13:36] dunno [12:18:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable 
(i-00000293) [12:18:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [12:18:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [12:43:37] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [12:48:37] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [12:48:37] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [12:48:37] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [13:13:37] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [13:15:48] Ryan_Lane: when you see mutante tell him to check nagios checks [13:16:01] He just walked back in [13:16:11] I don't know how he looks [13:16:18] Male [13:16:19] German [13:16:48] I can recognize male, how do I recognize it's a German [13:16:48] mutante: ping ? [13:16:57] he is tall [13:17:03] handling a bottle of beer ? [13:17:17] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 13% free memory [13:18:37] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [13:18:37] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [13:18:37] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [13:20:17] RECOVERY Disk Space is now: OK on nagios 127.0.0.1 output: DISK OK [13:20:19] !log deployment-prep petrb: installing git on a bastion [13:20:20] Logged the message, Master [13:20:28] hashar: hi [13:20:33] ohh [13:20:44] mutante: would you like to meet petan ? [13:21:12] whats up with nagios checks? labs nagios i suppose? 
checking [13:22:12] hashar: yea, brb [13:24:27] RECOVERY Puppet freshness is now: OK on bots-sql3 i-000000b4 output: puppet ran at Sun Jun 3 13:24:17 UTC 2012 [13:24:36] !log nagios starting snmptrapd [13:24:37] Logged the message, Master [13:24:57] RECOVERY Puppet freshness is now: OK on ganglia-test5 i-000002a7 output: puppet ran at Sun Jun 3 13:24:45 UTC 2012 [13:25:11] petan|hackaton: still same thing as last times. rebooted? [13:25:27] RECOVERY Puppet freshness is now: OK on dumps-incr i-000002bb output: puppet ran at Sun Jun 3 13:25:12 UTC 2012 [13:25:37] RECOVERY Puppet freshness is now: OK on queue-wiki1 i-000002b8 output: puppet ran at Sun Jun 3 13:25:26 UTC 2012 [13:25:49] petan|hackaton: where are you currently? [13:26:07] RECOVERY Puppet freshness is now: OK on deployment-wmsearch i-000000e1 output: puppet ran at Sun Jun 3 13:25:56 UTC 2012 [13:26:07] RECOVERY Puppet freshness is now: OK on nova-ldap2 i-00000238 output: puppet ran at Sun Jun 3 13:25:57 UTC 2012 [13:26:07] RECOVERY Puppet freshness is now: OK on mailman-01 i-00000235 output: puppet ran at Sun Jun 3 13:25:58 UTC 2012 [13:26:37] RECOVERY Puppet freshness is now: OK on mwreview-test6 i-000002b9 output: puppet ran at Sun Jun 3 13:26:20 UTC 2012 [13:27:07] RECOVERY Puppet freshness is now: OK on en-wiki-db-lucid i-0000023b output: puppet ran at Sun Jun 3 13:26:59 UTC 2012 [13:27:07] RECOVERY Puppet freshness is now: OK on bastion-restricted1 i-0000019b output: puppet ran at Sun Jun 3 13:27:01 UTC 2012 [13:27:19] mutante: i am in first table near projector [13:27:27] RECOVERY Puppet freshness is now: OK on rds i-00000207 output: puppet ran at Sun Jun 3 13:27:09 UTC 2012 [13:27:39] Ryan_Lane: can you create new project cs-labs for me and Danny_B|backup [13:27:57] RECOVERY Puppet freshness is now: OK on deployment-nfs-memc i-000000d7 output: puppet ran at Sun Jun 3 13:27:37 UTC 2012 [13:27:57] RECOVERY Puppet freshness is now: OK on bots-dev i-00000190 output: puppet ran at Sun Jun 3 13:27:42 
UTC 2012 [13:28:07] RECOVERY Puppet freshness is now: OK on udp-filter i-000001df output: puppet ran at Sun Jun 3 13:27:55 UTC 2012 [13:28:07] RECOVERY Puppet freshness is now: OK on hugglewa-1 i-000001e0 output: puppet ran at Sun Jun 3 13:28:02 UTC 2012 [13:28:17] PROBLEM Disk Space is now: WARNING on nagios 127.0.0.1 output: DISK WARNING - free space: /home/dzahn 3586 MB (20% inode=77%): [13:28:27] RECOVERY Puppet freshness is now: OK on zeromq1 i-000002b7 output: puppet ran at Sun Jun 3 13:28:14 UTC 2012 [13:28:58] petan|hackaton: i'll catch you [13:29:07] RECOVERY Puppet freshness is now: OK on translation-memory-1 i-0000013a output: puppet ran at Sun Jun 3 13:28:57 UTC 2012 [13:29:18] petan|hackaton: what's it for? [13:29:37] RECOVERY Puppet freshness is now: OK on kripke i-00000268 output: puppet ran at Sun Jun 3 13:29:25 UTC 2012 [13:29:57] RECOVERY Puppet freshness is now: OK on deployment-apache21 i-0000026d output: puppet ran at Sun Jun 3 13:29:46 UTC 2012 [13:30:27] RECOVERY Puppet freshness is now: OK on tutorial-mysql i-0000028b output: puppet ran at Sun Jun 3 13:30:17 UTC 2012 [13:30:37] RECOVERY Puppet freshness is now: OK on bots-2 i-0000009c output: puppet ran at Sun Jun 3 13:30:30 UTC 2012 [13:30:57] RECOVERY Puppet freshness is now: OK on maps-test3 i-0000028f output: puppet ran at Sun Jun 3 13:30:41 UTC 2012 [13:31:07] RECOVERY Puppet freshness is now: OK on log1 i-00000239 output: puppet ran at Sun Jun 3 13:30:56 UTC 2012 [13:31:07] RECOVERY Puppet freshness is now: OK on scribunto i-0000022c output: puppet ran at Sun Jun 3 13:31:04 UTC 2012 [13:31:27] RECOVERY Puppet freshness is now: OK on bots-labs i-0000015e output: puppet ran at Sun Jun 3 13:31:07 UTC 2012 [13:31:27] RECOVERY Puppet freshness is now: OK on reportcard2 i-000001ea output: puppet ran at Sun Jun 3 13:31:19 UTC 2012 [13:31:37] RECOVERY Puppet freshness is now: OK on en-wiki-db-precise i-0000023c output: puppet ran at Sun Jun 3 13:31:23 UTC 2012 [13:32:07] RECOVERY Puppet 
freshness is now: OK on shop-analytics-main i-000001e6 output: puppet ran at Sun Jun 3 13:31:58 UTC 2012 [13:32:37] RECOVERY Puppet freshness is now: OK on otrs-jgreen i-0000015a output: puppet ran at Sun Jun 3 13:32:29 UTC 2012 [13:32:57] RECOVERY Puppet freshness is now: OK on vumi-gw1 i-0000008f output: puppet ran at Sun Jun 3 13:32:37 UTC 2012 [13:32:57] RECOVERY Puppet freshness is now: OK on bots-nfs i-000000b1 output: puppet ran at Sun Jun 3 13:32:42 UTC 2012 [13:33:03] petan|hackaton: poke poke [13:33:07] RECOVERY Puppet freshness is now: OK on demo-mysql1 i-00000256 output: puppet ran at Sun Jun 3 13:33:03 UTC 2012 [13:33:25] Ryan_Lane: hey [13:33:27] RECOVERY Puppet freshness is now: OK on swift-aux1 i-0000024b output: puppet ran at Sun Jun 3 13:33:18 UTC 2012 [13:33:27] RECOVERY Puppet freshness is now: OK on bots-4 i-000000e8 output: puppet ran at Sun Jun 3 13:33:18 UTC 2012 [13:33:41] Ryan_Lane: Danny_B|backup is coming to you [13:34:07] RECOVERY Puppet freshness is now: OK on hugglewiki i-000000aa output: puppet ran at Sun Jun 3 13:34:01 UTC 2012 [13:34:27] RECOVERY Puppet freshness is now: OK on deployment-cache-upload i-00000263 output: puppet ran at Sun Jun 3 13:34:16 UTC 2012 [13:34:37] RECOVERY Puppet freshness is now: OK on publicdata-administration i-0000019e output: puppet ran at Sun Jun 3 13:34:26 UTC 2012 [13:35:27] RECOVERY Puppet freshness is now: OK on nova-dev3 i-000000e9 output: puppet ran at Sun Jun 3 13:35:11 UTC 2012 [13:35:37] RECOVERY Puppet freshness is now: OK on mwreview i-000002ae output: puppet ran at Sun Jun 3 13:35:23 UTC 2012 [13:35:57] RECOVERY Puppet freshness is now: OK on mobile-feeds i-000000c1 output: puppet ran at Sun Jun 3 13:35:38 UTC 2012 [13:35:57] RECOVERY Puppet freshness is now: OK on wep i-000000c2 output: puppet ran at Sun Jun 3 13:35:40 UTC 2012 [13:36:07] RECOVERY Puppet freshness is now: OK on mobile-wlm i-000002bc output: puppet ran at Sun Jun 3 13:35:53 UTC 2012 [13:36:07] RECOVERY Puppet freshness is 
now: OK on deployment-bastion i-000002bd output: puppet ran at Sun Jun 3 13:35:54 UTC 2012 [13:36:07] RECOVERY Puppet freshness is now: OK on deployment-backup i-000000f8 output: puppet ran at Sun Jun 3 13:35:56 UTC 2012 [13:36:07] RECOVERY Puppet freshness is now: OK on wikisource-web i-000000fe output: puppet ran at Sun Jun 3 13:35:57 UTC 2012 [13:36:17] RECOVERY Puppet freshness is now: OK on wikistats-01 i-00000042 output: puppet ran at Sun Jun 3 13:36:05 UTC 2012 [13:36:27] RECOVERY Puppet freshness is now: OK on deployment-apache22 i-0000026f output: puppet ran at Sun Jun 3 13:36:09 UTC 2012 [13:36:37] RECOVERY Puppet freshness is now: OK on tw-next i-0000027e output: puppet ran at Sun Jun 3 13:36:33 UTC 2012 [13:36:54] petan|hackaton: why not use deployment-prep for this? [13:36:57] RECOVERY Puppet freshness is now: OK on ganglia-collector i-000000b7 output: puppet ran at Sun Jun 3 13:36:42 UTC 2012 [13:37:07] RECOVERY Puppet freshness is now: OK on opengrok-web i-000001e1 output: puppet ran at Sun Jun 3 13:36:57 UTC 2012 [13:37:07] RECOVERY Puppet freshness is now: OK on resourceloader2-apache i-000001d7 output: puppet ran at Sun Jun 3 13:37:00 UTC 2012 [13:37:07] RECOVERY Puppet freshness is now: OK on upload-wizard i-0000021c output: puppet ran at Sun Jun 3 13:37:04 UTC 2012 [13:37:17] RECOVERY Puppet freshness is now: OK on memcache-puppet i-00000153 output: puppet ran at Sun Jun 3 13:37:06 UTC 2012 [13:37:37] RECOVERY Puppet freshness is now: OK on firstinstance i-0000013e output: puppet ran at Sun Jun 3 13:37:21 UTC 2012 [13:37:37] RECOVERY Puppet freshness is now: OK on bastion1 i-000000ba output: puppet ran at Sun Jun 3 13:37:34 UTC 2012 [13:37:39] Ryan_Lane: where are you [13:37:48] in the back [13:37:57] RECOVERY Puppet freshness is now: OK on localpuppet1 i-0000020b output: puppet ran at Sun Jun 3 13:37:45 UTC 2012 [13:38:07] RECOVERY Puppet freshness is now: OK on asher1 i-0000003a output: puppet ran at Sun Jun 3 13:38:01 UTC 2012 [13:38:37] 
RECOVERY Puppet freshness is now: OK on ve-nodejs i-00000245 output: puppet ran at Sun Jun 3 13:38:28 UTC 2012 [13:38:37] RECOVERY Puppet freshness is now: OK on labs-realserver i-00000104 output: puppet ran at Sun Jun 3 13:38:30 UTC 2012 [13:38:47] RECOVERY Puppet freshness is now: OK on master i-0000007a output: puppet ran at Sun Jun 3 13:38:37 UTC 2012 [13:38:57] RECOVERY Puppet freshness is now: OK on deployment-imagescaler01 i-0000025a output: puppet ran at Sun Jun 3 13:38:42 UTC 2012 [13:38:57] RECOVERY Puppet freshness is now: OK on fundraising-db i-0000015c output: puppet ran at Sun Jun 3 13:38:49 UTC 2012 [13:39:07] RECOVERY Puppet freshness is now: OK on deployment-squid i-000000dc output: puppet ran at Sun Jun 3 13:38:52 UTC 2012 [13:39:07] RECOVERY Puppet freshness is now: OK on ganglia-test4 i-000002a2 output: puppet ran at Sun Jun 3 13:38:53 UTC 2012 [13:39:07] RECOVERY Puppet freshness is now: OK on venus i-000000ea output: puppet ran at Sun Jun 3 13:38:58 UTC 2012 [13:39:27] RECOVERY Puppet freshness is now: OK on labs-relay i-00000103 output: puppet ran at Sun Jun 3 13:39:14 UTC 2012 [13:39:27] RECOVERY Puppet freshness is now: OK on build1 i-000002b3 output: puppet ran at Sun Jun 3 13:39:16 UTC 2012 [13:39:57] RECOVERY Puppet freshness is now: OK on ganglia-test3 i-0000025b output: puppet ran at Sun Jun 3 13:39:42 UTC 2012 [13:40:07] RECOVERY Puppet freshness is now: OK on dumps-1 i-00000170 output: puppet ran at Sun Jun 3 13:39:53 UTC 2012 [13:40:07] RECOVERY Puppet freshness is now: OK on deployment-mc i-0000021b output: puppet ran at Sun Jun 3 13:39:56 UTC 2012 [13:40:07] RECOVERY Puppet freshness is now: OK on demo-web1 i-00000255 output: puppet ran at Sun Jun 3 13:39:59 UTC 2012 [13:40:07] RECOVERY Puppet freshness is now: OK on vumi i-000001e5 output: puppet ran at Sun Jun 3 13:40:03 UTC 2012 [13:40:07] RECOVERY Puppet freshness is now: OK on incubator-common i-00000254 output: puppet ran at Sun Jun 3 13:40:03 UTC 2012 [13:40:57] RECOVERY 
Puppet freshness is now: OK on redis1 i-000002b6 output: puppet ran at Sun Jun 3 13:40:40 UTC 2012 [13:41:07] RECOVERY Puppet freshness is now: OK on ipv6test1 i-00000282 output: puppet ran at Sun Jun 3 13:40:51 UTC 2012 [13:41:07] RECOVERY Puppet freshness is now: OK on incubator-apache i-00000211 output: puppet ran at Sun Jun 3 13:40:52 UTC 2012 [13:41:07] RECOVERY Puppet freshness is now: OK on build-precise1 i-00000273 output: puppet ran at Sun Jun 3 13:40:58 UTC 2012 [13:41:17] RECOVERY Puppet freshness is now: OK on cn-wiki-db-lucid i-00000241 output: puppet ran at Sun Jun 3 13:41:05 UTC 2012 [13:41:27] RECOVERY Puppet freshness is now: OK on building i-0000014d output: puppet ran at Sun Jun 3 13:41:15 UTC 2012 [13:41:57] RECOVERY Puppet freshness is now: OK on outreacheval i-0000012e output: puppet ran at Sun Jun 3 13:41:41 UTC 2012 [13:41:57] RECOVERY Puppet freshness is now: OK on mingledbtest i-00000283 output: puppet ran at Sun Jun 3 13:41:42 UTC 2012 [13:41:57] RECOVERY Puppet freshness is now: OK on fundraising-civicrm i-00000169 output: puppet ran at Sun Jun 3 13:41:43 UTC 2012 [13:41:57] RECOVERY Puppet freshness is now: OK on deployment-transcoding i-00000105 output: puppet ran at Sun Jun 3 13:41:46 UTC 2012 [13:41:57] RECOVERY Puppet freshness is now: OK on deployment-dbdump i-000000d2 output: puppet ran at Sun Jun 3 13:41:47 UTC 2012 [13:42:09] RECOVERY Puppet freshness is now: OK on php5builds i-00000192 output: puppet ran at Sun Jun 3 13:41:58 UTC 2012 [13:42:27] RECOVERY Puppet freshness is now: OK on e3 i-00000291 output: puppet ran at Sun Jun 3 13:42:10 UTC 2012 [13:42:27] RECOVERY Puppet freshness is now: OK on swift-be4 i-000001ca output: puppet ran at Sun Jun 3 13:42:13 UTC 2012 [13:42:57] RECOVERY Puppet freshness is now: OK on deployment-jobrunner05 i-0000028c output: puppet ran at Sun Jun 3 13:42:36 UTC 2012 [13:42:57] RECOVERY Puppet freshness is now: OK on deployment-apache20 i-0000026c output: puppet ran at Sun Jun 3 13:42:46 UTC 
2012 [13:42:57] RECOVERY Puppet freshness is now: OK on p-b i-000000ae output: puppet ran at Sun Jun 3 13:42:49 UTC 2012 [13:43:37] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [13:44:07] RECOVERY Puppet freshness is now: OK on wmde-test i-000002ad output: puppet ran at Sun Jun 3 13:43:52 UTC 2012 [13:44:27] RECOVERY Puppet freshness is now: OK on migration1 i-00000261 output: puppet ran at Sun Jun 3 13:44:10 UTC 2012 [13:44:57] RECOVERY Puppet freshness is now: OK on incubator-bot1 i-00000251 output: puppet ran at Sun Jun 3 13:44:38 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on pediapress-ocg2 i-00000234 output: puppet ran at Sun Jun 3 13:44:51 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on demo-deployment1 i-00000276 output: puppet ran at Sun Jun 3 13:44:52 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on swift-be1 i-000001c7 output: puppet ran at Sun Jun 3 13:44:53 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on vivek-puppet i-000000ca output: puppet ran at Sun Jun 3 13:44:54 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on robh2 i-000001a2 output: puppet ran at Sun Jun 3 13:44:57 UTC 2012 [13:45:07] RECOVERY Puppet freshness is now: OK on hugglewa-db i-00000188 output: puppet ran at Sun Jun 3 13:45:03 UTC 2012 [13:45:07] PROBLEM Puppet freshness is now: CRITICAL on blamemaps-m1small i-000002a1 output: Puppet has not run in last 20 hours [13:45:27] RECOVERY Puppet freshness is now: OK on maps-test2 i-00000253 output: puppet ran at Sun Jun 3 13:45:14 UTC 2012 [13:45:37] RECOVERY Puppet freshness is now: OK on labs-build1 i-0000006b output: puppet ran at Sun Jun 3 13:45:23 UTC 2012 [13:45:57] RECOVERY Puppet freshness is now: OK on webserver-lcarr i-00000134 output: puppet ran at Sun Jun 3 13:45:37 UTC 2012 [13:45:57] RECOVERY Puppet freshness is now: OK on deployment-deb i-000002b5 output: puppet ran at Sun Jun 3 13:45:44 UTC 2012 [13:45:57] RECOVERY 
Puppet freshness is now: OK on maps-tilemill1 i-00000294 output: puppet ran at Sun Jun 3 13:45:48 UTC 2012 [13:46:27] RECOVERY Puppet freshness is now: OK on fr-wiki-db-precise i-0000023e output: puppet ran at Sun Jun 3 13:46:12 UTC 2012 [13:46:37] RECOVERY Puppet freshness is now: OK on mobile-testing i-00000271 output: puppet ran at Sun Jun 3 13:46:28 UTC 2012 [13:46:45] !project configtest [13:46:45] https://labsconsole.wikimedia.org/wiki/Nova_Resource:configtest [13:46:51] 06/03/2012 - 13:46:50 - Creating a project directory for configtest [13:46:51] 06/03/2012 - 13:46:51 - Creating a home directory for danny_b at /export/home/configtest/danny_b [13:46:57] RECOVERY Puppet freshness is now: OK on varnish i-000001ac output: puppet ran at Sun Jun 3 13:46:45 UTC 2012 [13:47:27] RECOVERY Puppet freshness is now: OK on swift-fe1 i-000001d2 output: puppet ran at Sun Jun 3 13:47:08 UTC 2012 [13:47:37] RECOVERY Puppet freshness is now: OK on bots-sql1 i-000000b5 output: puppet ran at Sun Jun 3 13:47:27 UTC 2012 [13:47:37] !log configtest Created configtest project for testing configuration changes to wikis before requesting changes on the sites. [13:47:37] configtest is not a valid project. 
[13:47:46] -_- [13:47:50] 06/03/2012 - 13:47:49 - Updating keys for danny_b at /export/home/configtest/danny_b [13:47:57] RECOVERY Puppet freshness is now: OK on ubuntu1-pgehres i-000000fb output: puppet ran at Sun Jun 3 13:47:38 UTC 2012 [13:47:57] RECOVERY Puppet freshness is now: OK on nginx-ffuqua-doom1-3 i-00000196 output: puppet ran at Sun Jun 3 13:47:42 UTC 2012 [13:47:57] RECOVERY Puppet freshness is now: OK on gerrit i-000000ff output: puppet ran at Sun Jun 3 13:47:46 UTC 2012 [13:47:57] RECOVERY Puppet freshness is now: OK on bob i-0000012d output: puppet ran at Sun Jun 3 13:47:47 UTC 2012 [13:48:27] RECOVERY Puppet freshness is now: OK on exim-test i-00000265 output: puppet ran at Sun Jun 3 13:48:07 UTC 2012 [13:48:37] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [13:48:37] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [13:48:37] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [13:48:37] RECOVERY Puppet freshness is now: OK on bots-sql2 i-000000af output: puppet ran at Sun Jun 3 13:48:29 UTC 2012 [13:48:56] !log configtest Created configtest project for testing configuration changes to wikis before requesting changes on the sites. [13:48:56] configtest is not a valid project. 
[13:49:07] RECOVERY Puppet freshness is now: OK on wikistream-1 i-0000016e output: puppet ran at Sun Jun 3 13:48:53 UTC 2012 [13:49:07] RECOVERY Puppet freshness is now: OK on worker1 i-00000208 output: puppet ran at Sun Jun 3 13:48:54 UTC 2012 [13:49:07] RECOVERY Puppet freshness is now: OK on deployment-cache-bits i-00000264 output: puppet ran at Sun Jun 3 13:49:02 UTC 2012 [13:49:17] RECOVERY Puppet freshness is now: OK on patchtest i-000000f1 output: puppet ran at Sun Jun 3 13:49:05 UTC 2012 [13:49:27] RECOVERY Puppet freshness is now: OK on ee-prototype i-0000013d output: puppet ran at Sun Jun 3 13:49:09 UTC 2012 [13:49:27] RECOVERY Puppet freshness is now: OK on bots-apache1 i-000000b0 output: puppet ran at Sun Jun 3 13:49:16 UTC 2012 [13:49:37] RECOVERY Puppet freshness is now: OK on test2 i-0000013c output: puppet ran at Sun Jun 3 13:49:26 UTC 2012 [13:49:47] RECOVERY Puppet freshness is now: OK on swift-be2 i-000001c8 output: puppet ran at Sun Jun 3 13:49:35 UTC 2012 [13:49:57] RECOVERY Puppet freshness is now: OK on incubator-bot0 i-00000296 output: puppet ran at Sun Jun 3 13:49:46 UTC 2012 [13:49:57] RECOVERY Puppet freshness is now: OK on labs-nfs1 i-0000005d output: puppet ran at Sun Jun 3 13:49:49 UTC 2012 [13:50:03] !log configtest Created configtest project for testing configuration changes to wikis before requesting changes on the sites. [13:50:04] configtest is not a valid project. [13:50:07] RECOVERY Puppet freshness is now: OK on bots-1 i-000000a9 output: puppet ran at Sun Jun 3 13:49:56 UTC 2012 [13:50:28] RECOVERY Puppet freshness is now: OK on embed-sandbox i-000000d1 output: puppet ran at Sun Jun 3 13:50:20 UTC 2012 [13:50:37] RECOVERY Puppet freshness is now: OK on pageviews i-000000b2 output: puppet ran at Sun Jun 3 13:50:30 UTC 2012 [13:50:45] !log configtest Created configtest project for testing configuration changes to wikis before requesting changes on the sites. 
[13:50:46] Logged the message, Master [13:50:57] RECOVERY Puppet freshness is now: OK on secondinstance i-0000015b output: puppet ran at Sun Jun 3 13:50:37 UTC 2012 [13:50:57] RECOVERY Puppet freshness is now: OK on dev-solr i-00000152 output: puppet ran at Sun Jun 3 13:50:43 UTC 2012 [13:51:07] RECOVERY Puppet freshness is now: OK on ganglia-test6 i-000002af output: puppet ran at Sun Jun 3 13:50:55 UTC 2012 [13:51:27] RECOVERY Puppet freshness is now: OK on deployment-sql i-000000d0 output: puppet ran at Sun Jun 3 13:51:13 UTC 2012 [13:51:57] RECOVERY Puppet freshness is now: OK on nginx-dev1 i-000000f0 output: puppet ran at Sun Jun 3 13:51:42 UTC 2012 [13:52:07] RECOVERY Puppet freshness is now: OK on wikidata-dev-1 i-0000020c output: puppet ran at Sun Jun 3 13:51:52 UTC 2012 [13:52:27] RECOVERY Puppet freshness is now: OK on catsort-pub i-000001cc output: puppet ran at Sun Jun 3 13:52:08 UTC 2012 [13:52:37] RECOVERY Puppet freshness is now: OK on jenkins2 i-00000102 output: puppet ran at Sun Jun 3 13:52:22 UTC 2012 [13:52:57] RECOVERY Puppet freshness is now: OK on deployment-thumbproxy i-0000026b output: puppet ran at Sun Jun 3 13:52:44 UTC 2012 [13:53:27] RECOVERY Puppet freshness is now: OK on simplewikt i-00000149 output: puppet ran at Sun Jun 3 13:53:06 UTC 2012 [13:53:57] RECOVERY Puppet freshness is now: OK on grail i-0000021e output: puppet ran at Sun Jun 3 13:53:35 UTC 2012 [13:54:07] RECOVERY Puppet freshness is now: OK on incubator-bot2 i-00000252 output: puppet ran at Sun Jun 3 13:53:57 UTC 2012 [13:54:37] RECOVERY Puppet freshness is now: OK on swift-aux2 i-0000024c output: puppet ran at Sun Jun 3 13:54:27 UTC 2012 [13:54:57] RECOVERY Puppet freshness is now: OK on dumps-2 i-00000257 output: puppet ran at Sun Jun 3 13:54:43 UTC 2012 [13:55:07] RECOVERY Puppet freshness is now: OK on pediapress-packager i-000001e4 output: puppet ran at Sun Jun 3 13:54:52 UTC 2012 [13:55:07] RECOVERY Puppet freshness is now: OK on wikidata-dev-3 i-00000225 output: 
puppet ran at Sun Jun 3 13:55:00 UTC 2012 [13:56:37] RECOVERY Puppet freshness is now: OK on swift-be3 i-000001c9 output: puppet ran at Sun Jun 3 13:56:23 UTC 2012 [13:56:37] RECOVERY Puppet freshness is now: OK on bots-3 i-000000e5 output: puppet ran at Sun Jun 3 13:56:26 UTC 2012 [13:58:37] RECOVERY Puppet freshness is now: OK on deployment-syslog i-00000269 output: puppet ran at Sun Jun 3 13:58:20 UTC 2012 [14:05:27] RECOVERY Puppet freshness is now: OK on feeds i-000000fa output: puppet ran at Sun Jun 3 14:05:06 UTC 2012 [14:10:27] RECOVERY Puppet freshness is now: OK on demo-web2 i-00000285 output: puppet ran at Sun Jun 3 14:10:17 UTC 2012 [14:13:07] PROBLEM Puppet freshness is now: CRITICAL on precise-test i-00000231 output: Puppet has not run in last 20 hours [14:13:37] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [14:16:27] RECOVERY Puppet freshness is now: OK on deployment-feed i-00000118 output: puppet ran at Sun Jun 3 14:16:10 UTC 2012 [14:16:37] RECOVERY Puppet freshness is now: OK on testing-groupchange i-00000205 output: puppet ran at Sun Jun 3 14:16:21 UTC 2012 [14:18:37] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [14:18:37] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [14:18:37] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [14:30:46] 06/03/2012 - 14:30:46 - Creating a home directory for catrope at /export/home/demo/catrope [14:31:45] 06/03/2012 - 14:31:45 - Updating keys for catrope at /export/home/demo/catrope [14:42:34] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 6.82, 6.18, 5.41 [14:43:44] PROBLEM Current Load is now: CRITICAL on mwreview-test7 i-000002be output: Connection refused by host [14:44:24] PROBLEM Current Users is now: CRITICAL on mwreview-test7 i-000002be output: Connection 
refused by host [14:45:05] PROBLEM Disk Space is now: CRITICAL on mwreview-test7 i-000002be output: Connection refused by host [14:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [14:45:44] PROBLEM Free ram is now: CRITICAL on mwreview-test7 i-000002be output: Connection refused by host [14:46:54] PROBLEM Total Processes is now: CRITICAL on mwreview-test7 i-000002be output: CHECK_NRPE: Error - Could not complete SSL handshake. [14:47:34] PROBLEM dpkg-check is now: CRITICAL on mwreview-test7 i-000002be output: CHECK_NRPE: Error - Could not complete SSL handshake. [14:48:44] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [14:48:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [14:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [14:57:34] RECOVERY Current Load is now: OK on bots-sql2 i-000000af output: OK - load average: 3.49, 3.76, 4.66 [15:00:07] New review: Dzahn; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/7255 [15:00:09] Change merged: Dzahn; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/7255 [15:07:09] Krinkle: Got the account [15:07:56] OK [15:08:49] 06/03/2012 - 15:08:49 - Creating a project directory for wikitrust [15:08:50] 06/03/2012 - 15:08:50 - Creating a home directory for gwicke at /export/home/wikitrust/gwicke [15:08:50] 06/03/2012 - 15:08:50 - Creating a home directory for oren at /export/home/wikitrust/oren [15:08:50] 06/03/2012 - 15:08:50 - Creating a home directory for laner at /export/home/wikitrust/laner [15:08:50] 06/03/2012 - 15:08:50 - Creating a home directory for krinkle at /export/home/wikitrust/krinkle [15:08:59] :O [15:09:09] Krinkle: ;) [15:09:15] Awesome [15:09:38] Krinkle: you may want to wait a while before starting importing the data [15:09:48] since we have IO 
issues right now [15:09:52] 06/03/2012 - 15:09:51 - Updating keys for oren at /export/home/wikitrust/oren [15:09:52] 06/03/2012 - 15:09:51 - Updating keys for gwicke at /export/home/wikitrust/gwicke [15:09:52] 06/03/2012 - 15:09:51 - Updating keys for krinkle at /export/home/wikitrust/krinkle [15:09:52] 06/03/2012 - 15:09:51 - Updating keys for laner at /export/home/wikitrust/laner [15:09:52] srsly.. I don't even know where to start [15:10:25] Ryan_Lane: But will you be so kind to allocate a nice little 2 petabyte for the project? [15:10:26] drive [15:10:30] :D [15:10:34] PROBLEM Current Load is now: WARNING on bots-sql2 i-000000af output: WARNING - load average: 4.72, 5.49, 5.28 [15:10:50] chicocvenancio: You're now admin on en.wikipedia beta.wmflabs [15:11:18] :) [15:11:20] chicocvenancio: http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special%3ALog&page=User:Chicocvenancio [15:11:59] Krinkle: heh. they need about 10TB [15:12:07] oh, ok [15:12:51] what's the total size of all wiki db's text storage? 
[15:15:01] about 300GB [15:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [15:18:44] RECOVERY Current Load is now: OK on mwreview-test7 i-000002be output: OK - load average: 0.63, 0.30, 0.22 [15:19:24] RECOVERY Current Users is now: OK on mwreview-test7 i-000002be output: USERS OK - 0 users currently logged in [15:19:44] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [15:19:44] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [15:20:04] RECOVERY Disk Space is now: OK on mwreview-test7 i-000002be output: DISK OK [15:20:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [15:20:44] RECOVERY Free ram is now: OK on mwreview-test7 i-000002be output: OK: 91% free memory [15:21:54] RECOVERY Total Processes is now: OK on mwreview-test7 i-000002be output: PROCS OK: 79 processes [15:22:34] RECOVERY dpkg-check is now: OK on mwreview-test7 i-000002be output: All packages OK [15:37:14] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 5% free memory [15:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [15:50:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [15:50:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [15:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [16:12:14] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 9% free memory [16:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [16:20:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [16:20:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c 
CRITICAL - Host Unreachable (i-0000010c) [16:20:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [16:26:44] PROBLEM Current Load is now: WARNING on bots-3 i-000000e5 output: WARNING - load average: 6.89, 6.49, 5.47 [16:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [16:50:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [16:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [16:50:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [17:07:14] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 5% free memory [17:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [17:20:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [17:20:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [17:20:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [17:22:14] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 6% free memory [17:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [17:50:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [17:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [17:50:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [18:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [18:20:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [18:20:34] 
PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [18:20:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [18:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [18:50:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [18:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [18:50:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [19:02:14] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 4% free memory [19:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [19:20:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [19:20:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [19:20:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [19:42:14] PROBLEM Puppet freshness is now: CRITICAL on deployment-apache23 i-00000270 output: Puppet has not run in last 20 hours [19:45:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [19:50:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [19:50:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [19:50:34] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [20:15:34] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [20:20:34] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [20:20:34] PROBLEM host: aggregator1 is 
DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [20:20:34] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [20:28:18] yay [20:28:19] back [20:31:55] New patchset: Andrew Bogott; "Further apache/mediawiki flailings." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10071 [20:32:13] New review: Andrew Bogott; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10071 [20:32:13] Change merged: Andrew Bogott; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10071 [20:36:00] New patchset: Andrew Bogott; "Rename new mw class." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10072 [20:36:16] New review: gerrit2; "Lint check passed." [operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/10072 [20:36:23] New review: Andrew Bogott; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10072 [20:36:26] Change merged: Andrew Bogott; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10072 [20:39:48] New patchset: Andrew Bogott; "And so on..." [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10073 [20:40:06] New review: gerrit2; "Lint check passed." 
[operations/puppet] (test); V: 1 - https://gerrit.wikimedia.org/r/10073 [20:40:15] New review: Andrew Bogott; "(no comment)" [operations/puppet] (test); V: 1 C: 2; - https://gerrit.wikimedia.org/r/10073 [20:42:20] Change merged: Andrew Bogott; [operations/puppet] (test) - https://gerrit.wikimedia.org/r/10073 [20:45:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [20:47:15] PROBLEM Puppet freshness is now: CRITICAL on pybal-precise i-00000289 output: Puppet has not run in last 20 hours [20:48:45] RECOVERY Disk Space is now: OK on ipv6test1 i-00000282 output: DISK OK [20:50:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [20:50:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [20:50:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [20:56:45] PROBLEM Disk Space is now: WARNING on ipv6test1 i-00000282 output: DISK WARNING - free space: / 72 MB (5% inode=58%): [21:15:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [21:20:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [21:20:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [21:20:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [21:45:35] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [21:50:35] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [21:50:35] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [21:50:35] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [22:18:29] PROBLEM host: aggregator-test is DOWN 
address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [22:20:39] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [22:20:39] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [22:20:39] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [22:22:19] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 6% free memory [22:48:39] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [22:50:39] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [22:50:39] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [22:50:39] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [22:52:19] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 5% free memory [22:57:19] PROBLEM Free ram is now: WARNING on bots-3 i-000000e5 output: Warning: 8% free memory [23:18:39] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [23:20:39] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [23:20:39] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable (i-00000250) [23:20:39] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [23:46:09] PROBLEM Puppet freshness is now: CRITICAL on blamemaps-m1small i-000002a1 output: Puppet has not run in last 20 hours [23:48:39] PROBLEM host: aggregator-test is DOWN address: i-0000024d CRITICAL - Host Unreachable (i-0000024d) [23:50:39] PROBLEM host: aggregator-test3 is DOWN address: i-00000293 CRITICAL - Host Unreachable (i-00000293) [23:50:39] PROBLEM host: ganglia-test2 is DOWN address: i-00000250 CRITICAL - Host Unreachable 
(i-00000250) [23:50:39] PROBLEM host: aggregator1 is DOWN address: i-0000010c CRITICAL - Host Unreachable (i-0000010c) [23:57:19] PROBLEM Free ram is now: CRITICAL on bots-3 i-000000e5 output: Critical: 5% free memory