[00:02:28] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 20.73, 19.14, 17.36 [00:03:15] RECOVERY - mwtask181 Puppet on mwtask181 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [00:04:22] RECOVERY - mw151 Puppet on mw151 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [00:04:28] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.94, 17.37, 16.92 [00:04:56] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 3.76, 5.86, 6.73 [00:06:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.02, 21.57, 23.53 [00:07:04] RECOVERY - mw162 Puppet on mw162 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures [00:07:58] RECOVERY - mwtask171 Puppet on mwtask171 is OK: OK: Puppet is currently enabled, last run 15 seconds ago with 0 failures [00:21:20] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:22:52] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.09, 6.78, 6.56 [00:23:15] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.285 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:24:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.26, 18.70, 20.22 [00:28:51] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 5.96, 6.71, 6.66 [00:30:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.84, 22.57, 21.39 [00:32:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.22, 2.97, 3.98 [00:34:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.17, 3.68, 4.12 [00:34:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.05, 23.47, 21.95 [00:36:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.50, 22.56, 21.78 [00:38:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.26, 3.20, 3.81 [00:38:48] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.12, 6.76, 6.67 [00:38:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.33, 23.46, 22.22 [00:40:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.56, 4.89, 4.36 [00:40:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.94, 23.10, 22.23 [00:42:48] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 5.28, 6.60, 6.67 [00:42:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.51, 23.82, 22.58 [00:50:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.92, 3.39, 3.95 [00:50:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.45, 23.35, 23.05 [00:52:31] 
PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.16, 3.85, 4.05 [00:54:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.41, 3.40, 3.84 [00:56:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.60, 23.27, 22.93 [00:57:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:58:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.88, 4.44, 4.07 [01:00:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.05, 3.93, 3.95 [01:00:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.31, 23.17, 23.08 [01:02:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.06, 24.00, 23.38 [01:04:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.28, 23.00, 23.10 [01:08:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 28.42, 24.64, 23.64 [01:10:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.86, 3.36, 3.53 [01:12:41] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:12:52] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:14:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.75, 22.93, 23.30 [01:14:51] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [01:16:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.80, 3.80, 3.87 [01:18:33] PROBLEM - prometheus151 Current Load on 
prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.40, 4.55, 4.14 [01:20:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.87, 3.73, 3.91 [01:22:27] PROBLEM - ie.unigon.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - ie.unigon.net All nameservers failed to answer the query. [01:22:28] PROBLEM - wiki.consid.vn - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.consid.vn All nameservers failed to answer the query. [01:22:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.83, 22.30, 22.62 [01:24:58] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.073 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:27:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:28:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.11, 3.44, 3.60 [01:28:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 16.89, 21.31, 22.31 [01:30:01] PROBLEM - wiki.tulpa.info - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.tulpa.info All nameservers failed to answer the query. [01:30:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.14, 23.30, 22.91 [01:37:24] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:39:24] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 5.053 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:42:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:44:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.27, 23.00, 23.57 [01:47:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:50:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.01, 3.23, 3.98 [01:51:55] RECOVERY - ie.unigon.net - reverse DNS on sslhost is OK: SSL OK - ie.unigon.net reverse DNS resolves to cp36.wikitide.net - CNAME OK [01:51:57] RECOVERY - wiki.consid.vn - reverse DNS on sslhost is OK: SSL OK - wiki.consid.vn reverse DNS resolves to cp36.wikitide.net - CNAME OK [01:56:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.45, 3.78, 3.94 [01:58:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.36, 3.24, 3.71 [01:59:30] RECOVERY - wiki.tulpa.info - reverse DNS on sslhost is OK: SSL OK - wiki.tulpa.info reverse DNS resolves to cp36.wikitide.net - CNAME OK [02:00:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.78, 22.27, 22.48 [02:02:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.90, 4.13, 3.90 [02:02:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 17.75, 20.70, 21.90 [02:04:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.75, 3.54, 3.72 [02:06:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.64, 4.09, 3.91 [02:07:09] PROBLEM - 
prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:09:11] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 8.249 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:10:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.16, 18.36, 20.24 [02:12:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:13:31] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:16:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 28.23, 22.43, 21.19 [02:17:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:17:55] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 20.62, 18.45, 15.81 [02:18:10] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 20.49, 18.51, 16.29 [02:18:37] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:19:45] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 9.935 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:19:55] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 18.74, 18.77, 16.26 [02:20:10] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 16.98, 17.95, 16.34 [02:21:04] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:21:04] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:22:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:23:09] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 36 minutes ago with 0 failures [02:23:09] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:26:54] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [02:27:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:30:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.23, 22.94, 23.04 [02:46:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.47, 22.27, 22.08 [02:48:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.25, 21.55, 21.81 [02:50:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.41, 22.86, 22.22 [02:52:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.89, 22.95, 22.35 [02:54:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.19, 24.47, 22.98 [02:58:32] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:00:26] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.072 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:02:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:04:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.61, 23.57, 23.64 [03:06:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.90, 3.39, 3.94 [03:07:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:10:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.99, 4.54, 4.20 [03:10:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 22.98, 23.25 [03:12:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.76, 3.79, 3.98 [03:14:51] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 20.45, 18.61, 17.17 [03:16:46] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 18.10, 17.87, 17.04 [03:20:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.96, 23.54, 23.78 [03:22:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.38, 3.60, 3.64 [03:24:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.12, 3.16, 3.46 [03:26:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.09, 4.30, 3.84 [03:26:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.77, 22.70, 23.13 [03:27:38] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:28:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.54, 21.98, 22.81 [03:31:38] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.086 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:34:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.79, 3.47, 3.77 [03:36:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.57, 4.06, 3.93 [03:42:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 14.39, 17.53, 19.98 [03:42:57] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:44:52] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.065 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:50:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.29, 20.64, 20.55 [03:52:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.51, 3.57, 4.00 [03:52:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.37, 21.47, 20.83 [03:54:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.44, 3.74, 4.00 [03:54:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.28, 22.36, 21.26 [03:56:33] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.23, 3.17, 3.73 [03:58:18] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.38, 5.97, 4.06 [03:58:26] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 20.85, 18.80, 16.80 [03:58:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.72, 23.19, 21.78 [04:00:26] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 20.04, 18.79, 17.02 
[04:00:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.03, 2.49, 3.38 [04:04:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.25, 22.97, 22.35 [04:06:18] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 6.16, 6.44, 5.15 [04:07:23] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.64, 3.24, 3.48 [04:09:19] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.86, 3.91, 3.70 [04:10:54] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query. [04:11:15] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.58, 3.36, 3.53 [04:18:05] PROBLEM - tl.awiki.org - LetsEncrypt on sslhost is CRITICAL: Name or service not known HTTP CRITICAL - Unable to open TCP socket [04:18:36] PROBLEM - aman.awiki.org - reverse DNS on sslhost is WARNING: NXDOMAIN: The DNS query name does not exist: aman.awiki.org.
[04:18:54] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.74, 2.64, 3.19 [04:22:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:22:49] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.39, 4.36, 3.69 [04:23:56] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:24:20] PROBLEM - tl.awiki.org - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for tl.awiki.org could not be found [04:24:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.38, 23.83, 22.51 [04:25:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:25:53] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:28:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.91, 23.48, 22.76 [04:30:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:35:16] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.22, 6.69, 6.04 [04:37:16] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 6.01, 6.51, 6.06 [04:40:52] PROBLEM - aman.awiki.org - LetsEncrypt on sslhost is CRITICAL: Name or service not known HTTP CRITICAL - Unable to open TCP socket [04:44:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.18, 22.34, 22.13 [04:45:14] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 6.98, 6.80, 6.35 [04:48:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.31, 22.89, 22.42 [04:49:13] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 6.27, 
6.80, 6.48 [04:49:24] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:49:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:51:19] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.085 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:53:12] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.35, 7.17, 6.72 [04:54:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:54:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.62, 24.21, 23.05 [04:58:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.95, 23.99, 23.33 [04:59:11] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 5.72, 6.67, 6.69 [05:00:44] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:01:05] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:02:39] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.185 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:02:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.05, 23.19, 23.05 [05:03:10] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 6.72, 6.92, 6.81 [05:04:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:08:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.33, 22.25, 22.87 [05:09:39] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [05:10:53] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found [05:11:38] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:12:09] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [05:14:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:15:07] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 3.76, 5.95, 6.55 [05:16:00] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[05:16:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.01, 22.56, 22.63 [05:18:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.41, 21.50, 22.24 [05:20:58] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:21:06] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:21:06] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 6.90, 6.84, 6.77 [05:22:53] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.483 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:23:02] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:26:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.17, 22.60, 22.43 [05:29:20] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:30:28] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.13, 19.79, 17.83 [05:31:26] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:32:28] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 17.70, 18.52, 17.58 [05:38:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.05, 23.07, 23.41 [05:44:08] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [05:44:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.53, 23.40, 23.31 [05:50:10] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:51:00] RECOVERY - os162 Current Load 
on os162 is OK: LOAD OK - total load average: 5.06, 6.14, 6.69 [05:54:21] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:54:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:55:24] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:59:29] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 6.264 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:00:57] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.14, 6.77, 6.75 [06:04:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:04:57] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 4.51, 6.25, 6.60 [06:06:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.70, 23.04, 23.77 [06:09:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:10:55] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:12:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.19, 22.84, 23.32 [06:14:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.94, 21.53, 22.80 [06:16:54] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.24, 6.55, 6.47 [06:18:53] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 6.64, 6.48, 6.45 [06:22:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.48, 23.12, 23.04 [06:24:13] PROBLEM - airlineinsider.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'airlineinsider.org' expires in 15 day(s) (Sat 31 Aug 2024 06:03:53 AM GMT +0000). [06:24:17] [mw-config] BlankEclair opened pull request #5638: Reenable Evelution for MediaWiki 1.42 - https://github.com/miraheze/mw-config/pull/5638 [06:24:26] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/630cef2158d6...3e5cc4a43048 [06:24:29] [ssl] WikiTideSSLBot 3e5cc4a - Bot: Update SSL cert for airlineinsider.org [06:25:22] miraheze/mw-config - BlankEclair the build passed.
[06:26:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.42, 2.83, 3.82 [06:30:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.51, 22.45, 23.23 [06:32:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.65, 2.38, 3.34 [06:34:50] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 6.92, 6.72, 6.58 [06:36:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.48, 4.18, 3.83 [06:37:19] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:37:29] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:38:49] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 5.61, 6.55, 6.58 [06:38:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.19, 21.09, 22.17 [06:39:16] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:42:48] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 6.98, 6.93, 6.74 [06:42:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.41, 21.93, 22.39 [06:44:47] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 2.29, 5.48, 6.26 [06:45:38] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.465 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:46:55] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[06:49:47] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:50:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.63, 17.54, 20.06 [06:51:43] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:53:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 11.82, 19.28, 23.95 [06:53:57] RECOVERY - airlineinsider.org - LetsEncrypt on sslhost is OK: OK - Certificate 'airlineinsider.org' will expire on Wed 13 Nov 2024 05:25:50 AM GMT +0000. [06:54:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:54:45] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.27, 5.60, 5.52 [06:55:06] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:55:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.94, 22.28, 24.49 [06:56:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.77, 19.29, 19.79 [06:58:26] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:58:27] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:58:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.79, 20.15, 20.10 [06:59:13] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 8.303 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:59:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:01:05] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 16 minutes ago with 0 failures [07:01:06] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [07:04:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:04:43] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 6.48, 6.58, 6.19 [07:11:20] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:13:18] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset 9.831786156e-05 secs [07:18:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.35, 3.09, 3.97 [07:19:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:19:53] PROBLEM - zh.tardis.wiki - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'zh.tardis.wiki' expires in 15 day(s) (Sat 31 Aug 2024 07:07:46 AM GMT +0000). 
[07:20:05] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/3e5cc4a43048...627d3eb90437 [07:20:08] [ssl] WikiTideSSLBot 627d3eb - Bot: Update SSL cert for zh.tardis.wiki [07:22:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.05, 3.25, 3.77 [07:23:01] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o