[00:43:23] 2013/06/30 00:42 WARN wolfsbane / DISK WARNING - free space: / 3313 MB (11% inode=92%):
[00:44:23] 2013/06/30 00:43 CRIT wolfsbane / DISK CRITICAL - free space: / 3290 MB (10% inode=92%):
[00:44:23] 2013/06/30 00:43 WARN wolfsbane /tmp DISK WARNING - free space: / 3303 MB (11% inode=92%):
[00:45:23] 2013/06/30 00:44 WARN wolfsbane / DISK WARNING - free space: / 3314 MB (11% inode=92%):
[00:45:23] 2013/06/30 00:44 CRIT wolfsbane /tmp DISK CRITICAL - free space: / 3290 MB (10% inode=92%):
[00:46:23] 2013/06/30 00:45 WARN wolfsbane /tmp DISK WARNING - free space: / 3304 MB (11% inode=92%):
[00:49:24] 2013/06/30 00:48 CRIT wolfsbane / DISK CRITICAL - free space: / 3273 MB (10% inode=92%):
[00:50:26] 2013/06/30 00:49 CRIT wolfsbane /tmp DISK CRITICAL - free space: / 3285 MB (10% inode=92%):
[00:56:10] Can anyone else connect to enwiki-p.rrdb? Elsie?
[02:47:32] 2013/06/30 02:44 WARN z-dat-s5-b MySQL slave SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1896
[02:50:33] 2013/06/30 02:44 WARN z-dat-s5-b wikidata replag QUERY WARNING: 'SELECT ts_rc_age()' returned 1893.000000
[03:07:33] 2013/06/30 03:07 WARN wolfsbane /tmp DISK WARNING - free space: / 4206 MB (14% inode=92%):
[03:08:33] 2013/06/30 03:07 WARN wolfsbane / DISK WARNING - free space: / 4215 MB (14% inode=92%):
[03:27:34] 2013/06/30 03:27 OK z-dat-s5-b wikidata replag QUERY OK: 'SELECT ts_rc_age()' returned 1791.000000
[03:28:34] 2013/06/30 03:28 OK z-dat-s5-b MySQL slave Uptime: 569911 Threads: 4 Questions: 694998952 Slow queries: 2795 Opens: 358515 Flush tables: 1 Open tables: 256 Queries per second avg: 1219.486 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1668
[04:10:37] 2013/06/30 04:10 OK wolfsbane /tmp DISK OK - free space: / 9020 MB (30% inode=92%):
[04:11:37] 2013/06/30 04:10 OK wolfsbane / DISK OK - free space: / 9016 MB (30% inode=92%):
[05:27:39] Dispenser: enwiki_p seems fine.
[05:27:47] rosemary
[05:28:26] that's s1-user, what about sql-s1-rr
[05:31:54] $ mysql -hsql-s1-rr
[05:31:56] Works for me. ^
[05:32:23] $ mysql -hsql-s1-rr
[05:32:24] mysql> select @@hostname;
[05:32:24] +------------+
[05:32:24] | @@hostname |
[05:32:24] +------------+
[05:32:26] | z-dat-s1-b |
[05:32:29] +------------+
[05:32:32] 1 row in set (0.01 sec)
[05:33:12] I get "has exceeded the max_user_connections", but nothing seems to be running
[07:48:28] [[User talk:Teles]] !N https://wiki.toolserver.org/w/index.php?oldid=8050&rcid=22015 * Teles * (+27) (Redirected page to [[m:User:Teles]])
[09:06:54] 2013/06/30 09:05 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 3.782 second response time
[09:11:54] 2013/06/30 09:11 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.653 second response time
[09:12:54] 2013/06/30 09:12 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 2.144 second response time
[09:13:55] 2013/06/30 09:13 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.830 second response time
[09:14:55] 2013/06/30 09:14 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.144 second response time
[09:34:56] 2013/06/30 09:28 WARN z-dat-s2-b MySQL slave SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2156
[10:02:00] 2013/06/30 10:01 OK z-dat-s2-b MySQL slave Uptime: 3634715 Threads: 6 Questions: 3344970308 Slow queries: 67726 Opens: 26088285 Flush tables: 1 Open tables: 256 Queries per second avg: 920.284 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1780
[10:22:00] 2013/06/30 10:17 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 7.475 second response time
[10:28:00] 2013/06/30 10:27 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.897 second response time
[10:30:00] 2013/06/30 10:29 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds
[10:37:02] 2013/06/30 10:29 CRIT wolfsbane toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds
[10:44:03] 2013/06/30 10:37 WARN nightshade Load avg. WARNING - load average: 25.42, 19.82, 11.62
[10:47:03] 2013/06/30 10:46 CRIT nightshade Load avg. CRITICAL - load average: 30.23, 24.28, 14.77
[10:47:03] 2013/06/30 10:40 WARN yarrow Load avg. WARNING - load average: 25.87, 19.81, 11.03
[10:51:04] 2013/06/30 10:50 CRIT yarrow Load avg. CRITICAL - load average: 31.30, 25.44, 15.28
[11:25:36] Looks like one of the webservers just died
[11:26:03] Not sure why ortelius isn't kicked out of the load balancer if it's down
[11:27:59] Oh, both are down....
[11:32:05] Maarten Dammers * [Toolserver-l] Both webservers down
[12:08:08] 2013/06/30 12:07 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.005 second response time
[12:08:08] 2013/06/30 12:07 OK wolfsbane toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.007 second response time
[12:09:42] hello
[12:09:49] web service should be back
[12:09:58] seems nfs for the user-store died
[12:26:05] Maarten Dammers * Re: [Toolserver-l] Both webservers down
[12:26:08] 2013/06/30 12:25 ?? hemlock /aux0 CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.
[12:27:08] 2013/06/30 12:26 CRIT hemlock /aux0 Connection refused by host
[12:32:08] 2013/06/30 12:25 CRIT hemlock /home Connection refused by host
[12:32:08] 2013/06/30 12:25 CRIT hemlock Environment IPMI Connection refused by host
[12:32:08] 2013/06/30 12:25 CRIT hemlock Load avg. Connection refused by host
[12:32:08] 2013/06/30 12:25 CRIT hemlock NTP NTP CRITICAL: No response from NTP server
[12:32:08] 2013/06/30 12:25 CRIT hemlock SMTP Connection refused
[12:32:09] 2013/06/30 12:25 CRIT hemlock SSH Connection refused
[12:33:08] 2013/06/30 12:26 CRIT hemlock / Connection refused by host
[12:33:08] 2013/06/30 12:26 CRIT hemlock /tmp Connection refused by host
[12:35:08] 2013/06/30 12:34 CRIT hemlock PING CRITICAL - Host Unreachable (hemlock)
[12:37:09] 2013/06/30 12:36 OK hemlock PING PING OK - Packet loss = 0%, RTA = 0.18 ms
[12:41:09] 2013/06/30 12:40 OK hemlock NTP NTP OK: Offset 0.011139 secs
[12:56:10] 2013/06/30 12:55 OK hemlock /home DISK OK - free space: /home 15604 MB (31% inode=83%):
[12:56:10] 2013/06/30 12:55 OK hemlock Environment IPMI ok: temperature ok fan ok voltage ok chassis ok
[12:56:10] 2013/06/30 12:55 OK hemlock SSH SSH OK - OpenSSH_5.8p2-hpn13v11 (protocol 2.0)
[12:57:10] 2013/06/30 12:56 OK hemlock / DISK OK - free space: / 5441 MB (27% inode=88%):
[12:57:10] 2013/06/30 12:56 OK hemlock /tmp DISK OK - free space: / 5422 MB (27% inode=88%):
[12:57:10] 2013/06/30 12:56 OK hemlock Load avg. OK - load average: 1.48, 0.49, 0.38
[12:57:10] 2013/06/30 12:56 OK hemlock SMTP SMTP OK - 3.409 sec. response time
[13:24:13] 2013/06/30 13:17 WARN willow / DISK WARNING - free space: / 16056 MB (15% inode=98%):
[13:24:13] 2013/06/30 13:17 WARN willow /tmp DISK WARNING - free space: / 16292 MB (15% inode=98%):
[13:29:11] 2013/06/30 13:28 CRIT willow / DISK CRITICAL - free space: / 11441 MB (10% inode=98%):
[13:30:11] 2013/06/30 13:29 CRIT willow /tmp DISK CRITICAL - free space: / 10794 MB (10% inode=97%):
[13:32:11] 2013/06/30 13:31 WARN yarrow Load avg. WARNING - load average: 0.13, 0.53, 19.82
[13:35:12] 2013/06/30 13:34 WARN nightshade Load avg. WARNING - load average: 1.97, 3.57, 19.15
[13:37:12] 2013/06/30 13:36 OK yarrow Load avg. OK - load average: 0.37, 0.35, 14.40
[13:38:18] Hello all
[13:39:12] 2013/06/30 13:32 WARN wolfsbane / DISK WARNING - free space: / 6182 MB (20% inode=92%):
[13:40:12] 2013/06/30 13:39 OK nightshade Load avg. OK - load average: 0.93, 1.95, 14.14
[13:40:12] 2013/06/30 13:33 WARN wolfsbane /tmp DISK WARNING - free space: / 6173 MB (20% inode=92%):
[14:15:14] 2013/06/30 14:08 WARN z-dat-s2-b MySQL slave SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2130
[14:28:14] 2013/06/30 14:27 OK wolfsbane / DISK OK - free space: / 10754 MB (35% inode=92%):
[14:29:14] 2013/06/30 14:28 OK wolfsbane /tmp DISK OK - free space: / 10749 MB (35% inode=92%):
[14:39:16] 2013/06/30 14:32 WARN rosemary wikidata replag QUERY WARNING: 'SELECT ts_rc_age()' returned 2022.000000
[14:44:16] 2013/06/30 14:43 CRIT z-dat-s2-b MySQL slave SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3609
[14:47:16] 2013/06/30 14:40 WARN z-dat-s4-a /sql DISK WARNING - free space: /sql 67080 MB (10% inode=99%):
[15:02:17] 2013/06/30 15:01 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.930 second response time
[15:03:17] 2013/06/30 15:02 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.120 second response time
[15:04:17] 2013/06/30 15:03 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.280 second response time
[15:16:17] 2013/06/30 15:15 CRIT rosemary wikidata replag QUERY CRITICAL: 'SELECT ts_rc_age()' returned 3633.000000
[15:22:18] 2013/06/30 15:15 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 7.885 second response time
[15:27:18] 2013/06/30 15:26 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.326 second response time
[15:37:18] 2013/06/30 15:36 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.819 second response time
[15:39:18] 2013/06/30 15:38 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.488 second response time
[15:48:18] 2013/06/30 15:47 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.775 second response time
[15:49:18] 2013/06/30 15:48 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.244 second response time
[15:58:19] 2013/06/30 15:57 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.002 second response time
[16:05:20] 2013/06/30 15:58 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.345 second response time
[16:17:21] 2013/06/30 16:16 WARN ortelius toolserver.org HTTP HTTP WARNING: HTTP/1.1 200 OK - 239 bytes in 0.747 second response time
[16:18:21] 2013/06/30 16:17 OK willow / DISK OK - free space: / 38567 MB (36% inode=99%):
[16:18:21] 2013/06/30 16:17 OK willow /tmp DISK OK - free space: / 38567 MB (36% inode=99%):
[16:19:21] 2013/06/30 16:18 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.451 second response time
[16:21:21] 2013/06/30 16:14 WARN ptolemy Load avg. WARNING - load average: 21.34, 20.23, 13.06
[16:23:21] 2013/06/30 16:22 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.300 second response time
[16:30:21] 2013/06/30 16:23 CRIT ortelius toolserver.org HTTP HTTP CRITICAL: HTTP/1.1 200 OK - 239 bytes in 1.764 second response time
[16:31:21] 2013/06/30 16:30 CRIT yarrow APT APT CRITICAL: 65 packages available for upgrade (61 critical updates).
[16:32:21] 2013/06/30 16:31 CRIT nightshade APT APT CRITICAL: 61 packages available for upgrade (61 critical updates).
[16:33:21] 2013/06/30 16:32 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.252 second response time
[16:35:21] 2013/06/30 16:34 OK ptolemy Load avg. OK - load average: 8.71, 13.88, 14.84
[16:45:22] 2013/06/30 16:44 WARN z-dat-s2-b MySQL slave SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3079
[16:51:21] 2013/06/30 16:44 CRIT ortelius toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds
[16:52:15] getting 502's again
[16:53:22] 2013/06/30 16:52 OK z-dat-s2-b MySQL slave Uptime: 3659375 Threads: 8 Questions: 3361523417 Slow queries: 68002 Opens: 26227412 Flush tables: 1 Open tables: 256 Queries per second avg: 918.605 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1765
[16:55:22] 2013/06/30 16:48 CRIT wolfsbane toolserver.org HTTP CRITICAL - Socket timeout after 10 seconds
[17:04:23] 2013/06/30 16:57 WARN nightshade Load avg. WARNING - load average: 25.29, 19.27, 11.41
[17:07:24] 2013/06/30 17:00 WARN yarrow Load avg. WARNING - load average: 25.59, 19.46, 10.90
[17:08:24] 2013/06/30 17:07 CRIT nightshade Load avg. CRITICAL - load average: 31.47, 25.25, 15.55
[17:08:24] 2013/06/30 17:07 WARN rosemary wikidata replag QUERY WARNING: 'SELECT ts_rc_age()' returned 3572.000000
[17:10:24] 2013/06/30 17:09 CRIT yarrow Load avg. CRITICAL - load average: 30.20, 24.10, 14.18
[17:27:00] Danny_B: I will take a look after dinner (if my connection is ok again hopefully)
[17:39:25] 2013/06/30 17:39 OK rosemary wikidata replag QUERY OK: 'SELECT ts_rc_age()' returned 1772.000000
[18:51:05] Krinkle * Re: [Toolserver-l] Both webservers down
[19:01:20] thank you DaBPunkt for installing lxml!
[19:25:27] np. sorry for the delay
[19:43:55] DaBPunkt: Maybe you can give the webservers nfs a kick again?
[19:44:01] Down again....
[19:44:20] mm, let me see
[19:46:10] looks like the mounted partition is away
[19:48:08] I will send nosy a message. I have no idea how to fix this
[19:54:26] message sent
[20:03:34] 2013/06/30 20:02 OK wolfsbane toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.003 second response time
[20:04:34] 2013/06/30 20:03 OK ortelius toolserver.org HTTP HTTP OK: HTTP/1.1 200 OK - 239 bytes in 0.009 second response time
[20:05:34] nosy repaired it, Cheers to her!
[20:09:35] 2013/06/30 20:08 CRIT hemlock / Connection refused or timed out
[20:09:35] 2013/06/30 20:08 CRIT hemlock /home Connection refused or timed out
[20:09:35] 2013/06/30 20:08 CRIT hemlock /tmp Connection refused or timed out
[20:09:35] 2013/06/30 20:08 CRIT hemlock Environment IPMI Connection refused or timed out
[20:09:35] 2013/06/30 20:08 CRIT hemlock Load avg. Connection refused or timed out
[20:09:36] 2013/06/30 20:08 CRIT hemlock NTP CRITICAL - Socket timeout after 10 seconds
[20:09:36] 2013/06/30 20:08 CRIT hemlock PING CRITICAL - Host Unreachable (hemlock)
[20:09:37] 2013/06/30 20:08 CRIT hemlock SMTP CRITICAL - Socket timeout after 10 seconds
[20:09:37] 2013/06/30 20:08 CRIT hemlock SSH No route to host
[20:12:35] 2013/06/30 20:11 OK hemlock PING PING OK - Packet loss = 0%, RTA = 0.17 ms
[20:15:35] 2013/06/30 20:15 OK hemlock /aux0 DISK OK - free space: / 5575 MB (27% inode=88%):
[20:33:36] 2013/06/30 20:26 CRIT rosemary /mnt user-store DISK CRITICAL - free space: /mnt 95059 MB (1% inode=62%):
[20:37:36] 2013/06/30 20:30 WARN willow / DISK WARNING - free space: / 19187 MB (18% inode=98%):
[20:37:36] 2013/06/30 20:30 WARN willow /tmp DISK WARNING - free space: / 19187 MB (18% inode=98%):
[20:47:37] 2013/06/30 20:46 CRIT willow / DISK CRITICAL - free space: / 11287 MB (10% inode=98%):
[20:47:38] 2013/06/30 20:46 CRIT willow /tmp DISK CRITICAL - free space: / 11440 MB (10% inode=98%):
[20:56:38] 2013/06/30 20:56 WARN yarrow Load avg. WARNING - load average: 0.02, 0.12, 19.27
[20:57:38] 2013/06/30 20:56 WARN nightshade Load avg. WARNING - load average: 0.00, 0.09, 18.97
[21:00:39] 2013/06/30 21:00 OK yarrow Load avg. OK - load average: 0.35, 0.19, 14.92
[21:01:39] 2013/06/30 21:00 OK nightshade Load avg. OK - load average: 0.72, 0.23, 14.70
[21:10:39] 2013/06/30 21:09 OK z-dat-s4-a /sql DISK OK - free space: /sql 67661 MB (11% inode=99%):
[21:15:05] Marlen Caemmerer * Re: [Toolserver-l] Both webservers down
[21:26:39] 2013/06/30 21:25 WARN willow / DISK WARNING - free space: / 11581 MB (11% inode=98%):
[21:26:39] 2013/06/30 21:25 WARN willow /tmp DISK WARNING - free space: / 11625 MB (11% inode=98%):
[21:32:40] 2013/06/30 21:31 OK willow / DISK OK - free space: / 23765 MB (22% inode=99%):
[21:33:40] 2013/06/30 21:32 OK willow /tmp DISK OK - free space: / 24501 MB (23% inode=99%):
[22:17:41] 2013/06/30 22:10 WARN z-dat-s2-b MySQL slave SLOW_SLAVE WARNING: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2044
[22:19:45] night, ts
[22:24:41] 2013/06/30 22:23 OK z-dat-s2-b MySQL slave Uptime: 3679235 Threads: 10 Questions: 3382777973 Slow queries: 68148 Opens: 26327309 Flush tables: 1 Open tables: 256 Queries per second avg: 919.424 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1791