[01:43:12] PROBLEM Puppet freshness is now: CRITICAL on puppet-lucid i-00000080 output: Puppet has not run in last 20 hours
[02:48:42] RECOVERY Current Load is now: OK on shop-analytics-main i-000001e6 output: OK - load average: 0.16, 0.08, 0.03
[02:48:52] RECOVERY Disk Space is now: OK on shop-analytics-main i-000001e6 output: DISK OK
[02:49:32] RECOVERY Current Users is now: OK on shop-analytics-main i-000001e6 output: USERS OK - 0 users currently logged in
[02:50:42] RECOVERY Free ram is now: OK on shop-analytics-main i-000001e6 output: OK: 88% free memory
[02:52:42] RECOVERY Total Processes is now: OK on shop-analytics-main i-000001e6 output: PROCS OK: 81 processes
[02:52:47] RECOVERY dpkg-check is now: OK on shop-analytics-main i-000001e6 output: All packages OK
[02:54:22] RECOVERY Current Users is now: OK on login-test i-000001e9 output: USERS OK - 0 users currently logged in
[02:55:02] RECOVERY Disk Space is now: OK on login-test i-000001e9 output: DISK OK
[02:55:42] RECOVERY Free ram is now: OK on login-test i-000001e9 output: OK: 62% free memory
[02:57:42] RECOVERY dpkg-check is now: OK on login-test i-000001e9 output: All packages OK
[02:57:42] RECOVERY Total Processes is now: OK on login-test i-000001e9 output: PROCS OK: 80 processes
[02:58:42] RECOVERY Current Load is now: OK on login-test i-000001e9 output: OK - load average: 0.00, 0.01, 0.00
[03:23:46] RECOVERY Current Load is now: OK on aggregator2 i-000001e8 output: OK - load average: 0.13, 0.05, 0.02
[03:23:56] PROBLEM Disk Space is now: WARNING on aggregator1 i-0000010c output: DISK WARNING - free space: / 401 MB (4% inode=93%):
[03:24:26] RECOVERY Current Users is now: OK on aggregator2 i-000001e8 output: USERS OK - 0 users currently logged in
[03:25:16] RECOVERY Disk Space is now: OK on aggregator2 i-000001e8 output: DISK OK
[03:25:56] RECOVERY Free ram is now: OK on aggregator2 i-000001e8 output: OK: 93% free memory
[03:27:36] RECOVERY Total Processes is now: OK on aggregator2 i-000001e8 output: PROCS OK: 93 processes
[03:27:41] RECOVERY dpkg-check is now: OK on aggregator2 i-000001e8 output: All packages OK
[03:28:56] PROBLEM Disk Space is now: CRITICAL on aggregator1 i-0000010c output: DISK CRITICAL - free space: / 205 MB (2% inode=93%):
[03:37:16] PROBLEM Free ram is now: WARNING on utils-abogott i-00000131 output: Warning: 17% free memory
[03:37:16] PROBLEM Free ram is now: WARNING on orgcharts-dev i-0000018f output: Warning: 15% free memory
[03:47:16] PROBLEM Free ram is now: WARNING on test-oneiric i-00000187 output: Warning: 15% free memory
[03:52:16] PROBLEM Free ram is now: WARNING on nova-daas-1 i-000000e7 output: Warning: 13% free memory
[03:57:16] PROBLEM Free ram is now: CRITICAL on orgcharts-dev i-0000018f output: Critical: 3% free memory
[03:57:16] PROBLEM Free ram is now: CRITICAL on utils-abogott i-00000131 output: Critical: 4% free memory
[04:02:16] RECOVERY Free ram is now: OK on orgcharts-dev i-0000018f output: OK: 96% free memory
[04:02:16] RECOVERY Free ram is now: OK on utils-abogott i-00000131 output: OK: 96% free memory
[04:02:26] PROBLEM Free ram is now: CRITICAL on test-oneiric i-00000187 output: Critical: 5% free memory
[04:12:16] RECOVERY Free ram is now: OK on test-oneiric i-00000187 output: OK: 97% free memory
[04:12:16] PROBLEM Free ram is now: CRITICAL on nova-daas-1 i-000000e7 output: Critical: 5% free memory
[04:17:16] RECOVERY Free ram is now: OK on nova-daas-1 i-000000e7 output: OK: 93% free memory
[04:29:06] RECOVERY Free ram is now: OK on bots-2 i-0000009c output: OK: 20% free memory
[06:12:06] PROBLEM Free ram is now: WARNING on bots-2 i-0000009c output: Warning: 19% free memory
[06:48:56] RECOVERY Disk Space is now: OK on aggregator1 i-0000010c output: DISK OK
[08:56:35] PROBLEM Current Load is now: CRITICAL on nagios 127.0.0.1 output: CRITICAL - load average: 6.41, 6.30, 4.07
[09:01:50] PROBLEM Current Load is now: WARNING on nagios 127.0.0.1 output: WARNING - load average: 2.86, 3.92, 3.63
[09:07:41] PROBLEM Current Load is now: WARNING on deployment-nfs-memc i-000000d7 output: WARNING - load average: 9.47, 8.87, 6.22
[09:11:42] RECOVERY Current Load is now: OK on nagios 127.0.0.1 output: OK - load average: 1.65, 1.62, 2.53
[09:21:47] RECOVERY Disk Space is now: OK on deployment-transcoding i-00000105 output: DISK OK
[09:27:48] PROBLEM Disk Space is now: WARNING on aggregator1 i-0000010c output: DISK WARNING - free space: / 355 MB (3% inode=93%):
[09:29:37] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 75 MB (5% inode=53%):
[09:32:46] PROBLEM Disk Space is now: CRITICAL on aggregator1 i-0000010c output: DISK CRITICAL - free space: / 149 MB (1% inode=93%):
[10:02:41] RECOVERY Current Load is now: OK on deployment-nfs-memc i-000000d7 output: OK - load average: 0.81, 1.52, 4.36
[10:55:53] PROBLEM Disk Space is now: CRITICAL on deployment-transcoding i-00000105 output: CHECK_NRPE: Socket timeout after 10 seconds.
[11:08:19] PROBLEM Current Load is now: WARNING on bots-cb i-0000009e output: WARNING - load average: 5.53, 8.42, 5.48
[11:13:32] RECOVERY Current Load is now: OK on bots-cb i-0000009e output: OK - load average: 0.47, 3.37, 4.12
[11:13:32] PROBLEM Current Load is now: WARNING on deployment-nfs-memc i-000000d7 output: WARNING - load average: 6.70, 7.47, 5.62
[11:13:42] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 74 MB (5% inode=53%):
[11:16:01] PROBLEM Free ram is now: CRITICAL on bots-2 i-0000009c output: CHECK_NRPE: Socket timeout after 10 seconds.
[11:21:51] PROBLEM Free ram is now: WARNING on bots-2 i-0000009c output: Warning: 9% free memory
[11:36:24] RECOVERY Disk Space is now: OK on deployment-transcoding i-00000105 output: DISK OK
[11:39:45] Grr
[11:39:46] that sucks
[11:40:19] something died, resulting in the memory filling up with work to do, and now the processes that need to do the work won't run
[11:43:06] PROBLEM Free ram is now: CRITICAL on bots-2 i-0000009c output: Critical: 1% free memory
[11:44:34] PROBLEM Puppet freshness is now: CRITICAL on puppet-lucid i-00000080 output: Puppet has not run in last 20 hours
[11:46:11] PROBLEM Current Load is now: WARNING on deployment-web2 i-00000125 output: WARNING - load average: 8.89, 8.08, 6.24
[11:46:41] PROBLEM Current Load is now: WARNING on deployment-web4 i-00000163 output: WARNING - load average: 14.44, 11.07, 7.11
[11:58:15] PROBLEM Current Load is now: WARNING on deployment-web i-000000cf output: WARNING - load average: 5.53, 5.87, 5.22
[11:58:16] PROBLEM Current Load is now: WARNING on bots-sql3 i-000000b4 output: WARNING - load average: 8.57, 8.78, 6.55
[12:02:42] RECOVERY Current Load is now: OK on deployment-web i-000000cf output: OK - load average: 2.91, 4.76, 4.96
[12:17:36] RECOVERY Current Load is now: OK on bots-sql3 i-000000b4 output: OK - load average: 0.11, 2.34, 4.90
[12:20:56] RECOVERY Current Load is now: OK on deployment-web2 i-00000125 output: OK - load average: 0.04, 0.77, 3.74
[12:20:56] RECOVERY Current Load is now: OK on deployment-nfs-memc i-000000d7 output: OK - load average: 0.02, 1.25, 4.53
[12:22:36] RECOVERY Current Load is now: OK on deployment-web4 i-00000163 output: OK - load average: 0.01, 0.63, 3.68
[13:20:56] PROBLEM Disk Space is now: WARNING on deployment-transcoding i-00000105 output: DISK WARNING - free space: / 74 MB (5% inode=53%):
[14:09:08] PROBLEM Current Load is now: WARNING on nova-production1 i-0000007b output: WARNING - load average: 11.88, 8.38, 6.01
[14:44:09] RECOVERY Current Load is now: OK on nova-production1 i-0000007b output: OK - load average: 3.88, 3.73, 4.74
[14:46:19] PROBLEM HTTP is now: CRITICAL on deployment-web2 i-00000125 output: CRITICAL - Socket timeout after 10 seconds
[14:51:19] RECOVERY HTTP is now: OK on deployment-web2 i-00000125 output: HTTP OK: HTTP/1.1 302 Found - 561 bytes in 3.154 second response time
[15:02:07] hmm ftp isn't working on bots-3
[15:50:52] hi Krinkle
[15:51:22] is there any way to find the public IP of the bots running on bots-3? I'm not sure if my bot is logged in or if it's just not making any edits
[15:53:40] https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&showmsg=setfilter
[15:54:24] thanks Reedy, so I imagine "floating IP" is the one I want?
[15:54:31] No
[15:54:37] That's assigned to a specific instance
[15:54:49] the instance IPs are local, like 10.4.0.59
[15:55:09] unless that's an IP
[15:55:49] it's an internal ip
[15:56:09] yeah that's what I thought
[15:56:13] https://labsconsole.wikimedia.org/wiki/Nova_Resource:I-000000e5
[15:56:27] that's down as both the public and private IP
[15:57:59] Thehelpfulone: most instances do not (and should not) have public IP addresses
[15:58:12] they're only accessible through bastion with the ssh key
[15:58:20] They can get out though
[15:58:25] yep
[15:58:26] simplest way would be making a http request to a server and ask that what it thinks your ip is
[15:58:46] ok, and how do I do that?
[15:58:49] but connecting back to that outgoing IP won't bring you to that instance
[15:59:02] it's a bit like having a router in your home
[15:59:07] he's not wanting that
[15:59:11] OK
[15:59:13] he's wanting to see if his bot is editing anonymously
[15:59:16] yeah that's fine Krinkle, I just wanted to check if my bot is running but logged out
[15:59:19] yeah exactly Reedy
[15:59:20] oh, ok
[15:59:42] You could do curl or wget of something like "what is my ip.com" from the command line of that instance
[16:00:20] or create a page somewhere that does echo $_SERVER["REMOTE_ADDR"]; and access that with curl
[16:01:10] wget http://toolserver.org/~reedy/ip.php
[16:01:39] Reedy: fail
[16:01:50] Reedy: that's one of the ts webservers
[16:01:54] load balancer :)
[16:01:58] use x forwarded for
[16:02:02] got it thanks
[16:02:07] no
[16:02:19] will it always edit through that IP or does that change?
[16:02:38] Thehelpfulone: "91.198.174.204" is the IP of the toolserver webserver
[16:02:39] try again
[16:02:43] :)
[16:02:51] 208.80.153.194
[16:02:54] (is bastion)
[16:02:55] I get 208.80.153.192
[16:02:59] good
[16:03:06] Looks sensible
[21:44:29] PROBLEM Puppet freshness is now: CRITICAL on puppet-lucid i-00000080 output: Puppet has not run in last 20 hours
[22:10:31] PROBLEM Puppet freshness is now: CRITICAL on deployment-nfs-memc i-000000d7 output: Puppet has not run in last 20 hours
[22:44:57] !initial-login | Darkpsy
[22:44:57] Darkpsy: https://labsconsole.wikimedia.org/wiki/Access#Initial_log_in
[22:45:16] !projects | Darkpsy
[22:45:16] Darkpsy: https://labsconsole.wikimedia.org/wiki/Special:Ask/-5B-5BResource-20Type::project-5D-5D/-3F/-3FMember/-3FDescription/mainlabel%3D-2D
[22:46:56] Thanks Ryan
[22:47:31] yw
[23:48:57] Ryan_Lane: how much spam did you get? I imagine it must have been a lot per day, sorry about that :)
[23:49:06] heh
[23:49:09] it was a bit
[23:49:22] I filter cronspam, but I need to stop it occasionally
[23:50:12] heh
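
Note on the 15:58 to 16:03 exchange: the first attempt reported "91.198.174.204" because the page sat behind a load balancer, so REMOTE_ADDR was the webserver's own address; reading X-Forwarded-For gave the real outgoing IP. A minimal sketch of that server-side logic follows, in Python rather than the PHP mentioned in the channel. The function name `client_ip` and the plain-dict header format are assumptions for illustration, not the actual ip.php script.

```python
from ipaddress import ip_address


def client_ip(remote_addr, headers):
    """Best guess at the address a request came from (hypothetical helper).

    remote_addr: peer address seen by the web server (may be a load balancer).
    headers: request headers as a plain dict (assumed format).
    """
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")

    # By convention the leftmost valid entry is the original client.
    for candidate in forwarded.split(","):
        candidate = candidate.strip()
        if not candidate:
            continue
        try:
            return str(ip_address(candidate))
        except ValueError:
            continue  # skip entries that are not IP addresses

    # No usable forwarded header: fall back to the direct peer address.
    return remote_addr
```

X-Forwarded-For is supplied by the client or by intermediate proxies, so a sketch like this is only suitable for a quick check such as the one in the channel, not for access control.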