[00:01:32] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.99, 21.21, 20.13
[00:03:03] PROBLEM - cp36 Disk Space on cp36 is WARNING: DISK WARNING - free space: / 8978MiB (10% inode=98%);
[00:03:26] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 16.73, 19.59, 19.68
[00:03:38] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:05:03] RECOVERY - cp36 Disk Space on cp36 is OK: DISK OK - free space: / 21226MiB (23% inode=98%);
[00:05:03] PROBLEM - wiki.ouro.one - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.ouro.one All nameservers failed to answer the query.
[00:05:34] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:05:35] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.263 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[00:05:56] PROBLEM - threedomwiki.pcast.site - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - threedomwiki.pcast.site All nameservers failed to answer the query.
[00:06:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.39, 20.81, 19.27
[00:07:41] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[00:08:29] PROBLEM - wiki.maribelhearn.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.maribelhearn.com All nameservers failed to answer the query.
[00:08:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.56, 19.03, 18.84
[00:18:10] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.08, 20.56, 19.47
[00:18:12] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 26.39, 23.47, 20.43
[00:18:32] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 23.82, 20.83, 19.06
[00:19:11] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:20:04] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 17.08, 19.41, 19.20
[00:20:08] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 20.21, 22.04, 20.27
[00:20:28] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 19.75, 19.85, 18.89
[00:20:34] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 23.71, 21.97, 20.31
[00:21:07] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.060 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[00:23:26] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found
[00:23:59] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.25, 21.84, 20.52
[00:25:55] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.95, 21.06, 20.40
[00:29:21] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:29:42] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[00:29:46] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 18.48, 19.98, 20.16
[00:30:30] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:32:02] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[00:32:26] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.080 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[00:33:32] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[00:33:36] RECOVERY - wiki.ouro.one - reverse DNS on sslhost is OK: SSL OK - wiki.ouro.one reverse DNS resolves to cp36.wikitide.net - CNAME OK
[00:35:22] RECOVERY - threedomwiki.pcast.site - reverse DNS on sslhost is OK: SSL OK - threedomwiki.pcast.site reverse DNS resolves to cp36.wikitide.net - CNAME OK
[00:35:35] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.32, 22.46, 21.11
[00:37:26] RECOVERY - wiki.maribelhearn.com - reverse DNS on sslhost is OK: SSL OK - wiki.maribelhearn.com reverse DNS resolves to cp36.wikitide.net - CNAME OK
[00:37:31] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 18.14, 20.64, 20.61
[00:41:22] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 16.82, 18.86, 19.95
[00:47:12] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.14, 21.31, 20.99
[00:47:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.34, 2.93, 3.93
[00:48:10] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.34, 20.52, 20.34
[00:49:12] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 16.94, 19.67, 20.45
[00:50:02] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[00:50:05] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.72, 19.90, 20.18
[00:51:12] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 17.28, 19.15, 20.18
[00:51:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.85, 3.44, 3.89
[00:53:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.70, 3.77, 3.97
[00:57:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.80, 3.65, 3.82
[01:00:32] PROBLEM - mw172 Current Load on mw172 is CRITICAL: LOAD CRITICAL - total load average: 26.70, 20.71, 18.69
[01:02:15] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[01:02:31] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 19.81, 21.21, 19.19
[01:03:38] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.54, 20.66, 20.32
[01:03:47] PROBLEM - wiki.alathramc.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.alathramc.com All nameservers failed to answer the query.
[01:04:29] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 17.39, 19.92, 18.97
[01:04:36] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[01:05:02] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:05:34] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 18.15, 20.07, 20.16
[01:09:09] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:13:14] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[01:15:02] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:16:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.72, 20.88, 19.76
[01:16:42] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 23.96, 21.03, 19.77
[01:17:36] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:18:17] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 21.93, 21.80, 19.99
[01:19:12] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.51, 21.72, 20.25
[01:19:33] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[01:20:04] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.39, 22.12, 20.65
[01:20:16] PROBLEM - mw172 Current Load on mw172 is CRITICAL: LOAD CRITICAL - total load average: 24.71, 22.79, 20.57
[01:20:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 25.13, 22.83, 20.80
[01:21:12] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 23.29, 22.88, 20.90
[01:22:00] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.76, 22.09, 20.82
[01:22:15] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 17.97, 21.58, 20.44
[01:22:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:23:56] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 27.93, 24.20, 21.73
[01:24:13] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 15.77, 19.50, 19.81
[01:24:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 18.72, 21.60, 20.86
[01:27:47] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.06, 23.39, 22.00
[01:28:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.29, 22.24, 21.21
[01:29:48] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.51, 2.87, 3.96
[01:30:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.22, 21.40, 20.97
[01:31:12] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 25.95, 23.04, 21.79
[01:33:12] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 17.58, 20.80, 21.13
[01:33:34] PROBLEM - wiki.alathramc.com - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['damon.ns.cloudflare.com.', 'gigi.ns.cloudflare.com.'], 'CNAME': 'wiki.alathra.com.'}
[01:33:49] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.21, 3.60, 3.98
[01:36:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 18.55, 19.75, 20.37
[01:40:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 18.99, 19.99, 20.41
[01:41:17] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 14.55, 18.22, 19.85
[01:41:45] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 14.95, 18.38, 19.83
[01:42:09] icinga is having a great day
[01:42:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 15.85, 18.52, 19.81
[01:43:12] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 16.43, 18.72, 20.16
[01:43:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.72, 3.46, 3.99
[01:47:12] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 21.05, 20.95, 20.80
[01:47:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:47:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.51, 4.50, 4.23
[01:49:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.65, 3.77, 4.00
[01:51:12] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 16.10, 18.77, 19.98
[01:51:48] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.95, 4.35, 4.20
[02:06:24] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[02:06:36] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:08:20] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.093 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[02:08:34] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[02:15:06] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.67, 20.96, 19.26
[02:18:54] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 21.46, 21.97, 20.12
[02:20:01] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.21, 20.80, 19.37
[02:20:24] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 22.60, 21.40, 19.67
[02:20:52] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.48, 23.06, 20.75
[02:21:57] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 23.74, 21.79, 19.91
[02:22:42] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.71, 22.80, 20.84
[02:23:52] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 27.59, 23.49, 20.73
[02:24:37] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 20.49, 21.74, 20.67
[02:24:40] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 19.60, 22.26, 21.01
[02:27:44] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.92, 23.60, 21.43
[02:28:28] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 28.81, 24.81, 22.24
[02:28:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.54, 22.84, 21.32
[02:29:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.52, 3.19, 3.78
[02:30:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 17.57, 21.09, 20.88
[02:31:58] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 16.70, 19.91, 20.12
[02:32:16] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 18.82, 22.78, 22.10
[02:33:31] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.23, 21.81, 21.15
[02:34:10] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 26.77, 24.18, 22.66
[02:34:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 28.01, 23.88, 21.99
[02:35:46] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.27, 3.39, 3.62
[02:36:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 21.37, 23.54, 22.15
[02:37:22] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.85, 21.78, 21.34
[02:37:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.07, 3.05, 3.45
[02:37:47] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 20.72, 21.17, 20.65
[02:39:46] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.84, 2.76, 3.28
[02:39:53] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.31, 23.79, 23.16
[02:40:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 27.65, 23.85, 22.43
[02:41:38] PROBLEM - mw161 Current Load on mw161 is CRITICAL: LOAD CRITICAL - total load average: 25.62, 22.75, 21.35
[02:41:47] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.32, 23.85, 23.25
[02:42:29] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 21.81, 20.41, 18.90
[02:43:09] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.16, 22.59, 21.69
[02:43:33] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 22.26, 22.96, 21.62
[02:43:41] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.22, 23.64, 23.27
[02:44:27] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 17.58, 19.00, 18.54
[02:45:05] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.07, 21.38, 21.36
[02:46:44] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.46, 4.32, 3.65
[02:48:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 18.20, 22.02, 22.54
[02:48:42] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.30, 3.64, 3.47
[02:51:32] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 18.57, 19.06, 20.28
[02:52:40] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.50, 4.46, 3.80
[02:58:09] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:58:44] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.54, 18.16, 19.89
[02:59:12] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 14.94, 17.01, 19.93
[03:00:17] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.38, 6.08, 5.37
[03:01:00] RECOVERY - db161 Backups SQL on db161 is OK: FILE_AGE OK: /var/log/sql-backup.log is 59 seconds old and 0 bytes
[03:01:17] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mwtask171
[03:01:32] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:01:33] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw172
[03:01:37] RECOVERY - db181 Backups SQL on db181 is OK: FILE_AGE OK: /var/log/sql-backup.log is 96 seconds old and 0 bytes
[03:01:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:01:54] RECOVERY - mw172 APT on mw172 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:01:59] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw161
[03:02:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:02:15] RECOVERY - mwtask171 APT on mwtask171 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[03:02:18] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[03:02:22] RECOVERY - mw161 APT on mw161 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:02:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 17.19, 17.88, 20.14
[03:02:36] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw151
[03:02:46] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:03:02] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mwtask181
[03:03:12] RECOVERY - mwtask181 APT on mwtask181 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[03:03:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:03:20] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw152
[03:03:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:03:39] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw162
[03:03:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:03:51] RECOVERY - mw151 APT on mw151 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:03:56] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw171
[03:04:03] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:04:17] RECOVERY - os162 Current Load on os162 is OK: LOAD OK - total load average: 3.94, 5.68, 5.43
[03:04:21] RECOVERY - mw152 APT on mw152 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:04:21] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw182
[03:04:22] RECOVERY - mw171 APT on mw171 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:04:27] RECOVERY - mw182 APT on mw182 is OK: APT OK: 65 packages available for upgrade (0 critical updates).
[03:04:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:04:42] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on mw181
[03:05:00] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:05:15] RECOVERY - mw162 APT on mw162 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[03:05:22] !log [void@puppet181] Upgraded packages libswscale6, libavdevice59, libavformat59, libavfilter8, libavcodec59, libavutil57, libpostproc56, libswresample4, and ffmpeg on test151
[03:05:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:05:47] RECOVERY - mw181 APT on mw181 is OK: APT OK: 65 packages available for upgrade (0 critical updates).
[03:06:54] RECOVERY - test151 APT on test151 is OK: APT OK: 74 packages available for upgrade (0 critical updates).
[03:07:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:09:46] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:12:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:13:48] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.139 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[03:16:40] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.70, 21.57, 20.63
[03:19:19] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.22, 21.15, 20.06
[03:20:28] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 15.03, 19.61, 20.13
[03:21:13] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 18.03, 19.83, 19.70
[03:22:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:25:24] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.11, 22.08, 19.93
[03:26:12] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 21.89, 20.76, 20.41
[03:27:20] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.60, 22.21, 20.24
[03:28:06] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 18.23, 19.73, 20.08
[03:29:15] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 14.28, 19.34, 19.42
[03:33:08] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.68, 21.18, 20.18
[03:35:04] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.29, 19.30, 19.64
[03:36:29] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.18, 3.44, 3.98
[03:39:31] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.23, 21.79, 20.59
[03:40:26] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.56, 4.19, 4.16
[03:40:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.37, 22.07, 20.52
[03:40:53] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 18.38, 20.48, 20.08
[03:41:25] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 18.97, 20.78, 20.37
[03:42:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 15.08, 19.69, 19.85
[03:42:49] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.11, 19.46, 19.75
[03:43:19] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 15.39, 19.12, 19.82
[03:46:44] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.70, 20.83, 20.24
[03:52:44] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.62, 19.66, 19.92
[03:57:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:02:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:05:03] PROBLEM - wiki.ouro.one - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.ouro.one All nameservers failed to answer the query.
[04:05:56] PROBLEM - threedomwiki.pcast.site - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - threedomwiki.pcast.site All nameservers failed to answer the query.
[04:08:11] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.39, 3.13, 3.85
[04:08:27] PROBLEM - wiki.maribelhearn.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.maribelhearn.com All nameservers failed to answer the query.
[04:10:12] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.46, 4.05, 4.09
[04:11:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[04:11:47] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:13:22] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 9.086 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[04:13:44] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[04:16:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 21.27, 19.67, 18.87
[04:17:11] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[04:18:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 16.80, 18.27, 18.45
[04:26:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.26, 21.86, 19.84
[04:28:08] PROBLEM - wiki.overwood.xyz - LetsEncrypt on sslhost is CRITICAL: Name or service not known HTTP CRITICAL - Unable to open TCP socket
[04:28:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 19.45, 20.43, 19.53
[04:29:09] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[04:29:18] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:31:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.095 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[04:31:15] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[04:32:25] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:32:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.54, 18.52, 19.11
[04:33:35] RECOVERY - wiki.ouro.one - reverse DNS on sslhost is OK: SSL OK - wiki.ouro.one reverse DNS resolves to cp36.wikitide.net - CNAME OK
[04:34:23] PROBLEM - wiki.overwood.xyz - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for wiki.overwood.xyz could not be found
[04:35:22] RECOVERY - threedomwiki.pcast.site - reverse DNS on sslhost is OK: SSL OK - threedomwiki.pcast.site reverse DNS resolves to cp36.wikitide.net - CNAME OK
[04:37:25] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:37:25] RECOVERY - wiki.maribelhearn.com - reverse DNS on sslhost is OK: SSL OK - wiki.maribelhearn.com reverse DNS resolves to cp36.wikitide.net - CNAME OK
[04:45:53] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.35, 3.23, 3.95
[04:46:30] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.60, 22.41, 20.47
[04:47:51] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.34, 3.93, 4.11
[04:49:50] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.30, 3.34, 3.85
[04:50:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 17.12, 21.03, 20.46
[04:51:49] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.83, 4.47, 4.20
[04:52:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 13.80, 18.89, 19.77
[04:55:48] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.61, 3.70, 3.95
[05:01:48] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.91, 4.16, 3.99
[05:04:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.07, 22.18, 23.71
[05:04:28] PROBLEM - archive.stellurgists.wiki - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - archive.stellurgists.wiki All nameservers failed to answer the query.
[05:20:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 28.32, 22.30, 21.75
[05:20:55] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:22:52] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:22:56] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[05:23:01] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:24:44] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.20, 20.15, 17.91
[05:24:48] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[05:25:02] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 8 minutes ago with 0 failures [05:26:44] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 16.44, 18.61, 17.62 [05:33:57] RECOVERY - archive.stellurgists.wiki - reverse DNS on sslhost is OK: SSL OK - archive.stellurgists.wiki reverse DNS resolves to cp36.wikitide.net - CNAME OK [05:36:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.48, 22.80, 23.57 [05:38:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.81, 24.11, 23.96 [05:40:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.88, 22.16, 23.27 [05:44:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.25, 24.10, 23.76 [05:46:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.97, 22.39, 23.21 [05:50:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.72, 23.41, 23.42 [05:54:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.91, 23.06, 23.38 [05:56:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.77, 23.70, 23.55 [06:01:05] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:01:40] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:02:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.56, 23.13, 23.45 [06:03:16] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:03:36] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.216 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:04:07] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:06:35] !log [reception@mwtask181] sudo -u www-data php /srv/mediawiki/1.42/maintenance/run.php /srv/mediawiki/1.42/maintenance/importImages.php --wiki=haremhotelwiki --search-recursively --summary=Imported from harem-hotel.fandom.com /home/reception/haremhotel (START) [06:06:41] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [06:07:47] !log [reception@mwtask181] sudo -u www-data php /srv/mediawiki/1.42/maintenance/run.php /srv/mediawiki/1.42/maintenance/importImages.php --wiki=haremhotelwiki --search-recursively --summary=Imported from harem-hotel.fandom.com /home/reception/haremhotel (END - exit=0) [06:07:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [06:08:56] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). 
[06:12:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.28, 22.49, 22.66 [06:17:19] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.06, 19.87, 17.82 [06:23:32] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 21.28, 19.85, 17.96 [06:23:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.71, 2.79, 3.86 [06:25:30] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 16.77, 18.65, 17.76 [06:26:50] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 18.00, 19.95, 19.18 [06:27:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.16, 3.47, 3.85 [06:29:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.85, 3.36, 3.76 [06:31:27] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:31:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.56, 4.30, 4.07 [06:32:25] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:32:41] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o ] [06:33:23] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.068 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:33:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.62, 3.58, 3.83 [06:34:30] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 17.78, 20.79, 19.92 [06:34:41] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset -0.000789731741 secs [06:36:30] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.16, 18.39, 19.15 [06:37:25] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:37:48] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.67, 4.19, 3.96 [06:39:48] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.70, 3.64, 3.80 [06:42:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:43:46] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.24, 4.13, 3.96 [06:44:38] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:45:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.05, 3.63, 3.79 [06:45:50] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [06:47:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.02, 3.74, 3.78 [06:48:03] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.58, 21.86, 23.73 [06:50:55] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: 
check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o ] [06:51:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.80, 3.63, 3.81 [06:53:06] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:53:46] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.88, 4.02, 3.92 [06:55:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.99, 3.35, 3.67 [06:57:16] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset 0.0009208619595 secs [06:57:47] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.15, 2.75, 3.40 [07:00:03] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.19, 17.73, 20.30 [07:01:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.70, 3.92, 3.76 [07:04:00] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:07:04] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [07:07:13] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [07:07:22] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:07:25] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:09:24] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:10:00] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 24 minutes ago with 0 failures [07:10:08] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.077 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:12:25] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:14:47] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [07:19:37] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.79, 22.77, 20.83 [07:21:34] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.53, 22.82, 21.09 [07:27:24] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.17, 22.98, 21.52 [07:29:21] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.12, 22.31, 21.47 [07:32:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:35:51] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.50, 3.16, 3.85 [07:37:08] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.53, 19.09, 20.40 [07:37:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:39:50] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - 
total load average: 6.51, 4.10, 4.00 [07:40:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:41:48] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.50, 3.82, 3.94 [07:43:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.28, 4.40, 4.13 [07:45:49] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.18, 22.67, 21.32 [07:47:45] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.26, 21.06, 20.90 [07:49:11] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:49:31] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:51:08] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:51:26] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.064 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:53:36] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.10, 22.60, 21.57 [07:55:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:59:43] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:00:46] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:03:45] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.053 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:05:01] PROBLEM - wiki.ouro.one - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.ouro.one All nameservers failed to answer the query. [08:05:16] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.97, 23.76, 23.11 [08:05:46] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:05:56] PROBLEM - threedomwiki.pcast.site - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - threedomwiki.pcast.site All nameservers failed to answer the query. [08:07:13] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.56, 24.02, 23.26 [08:08:27] PROBLEM - wiki.maribelhearn.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.maribelhearn.com All nameservers failed to answer the query. [08:11:07] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.61, 23.03, 23.01 [08:13:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.01, 23.23, 23.05 [08:15:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.53, 3.22, 3.78 [08:17:33] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[08:19:49] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o ] [08:21:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.76, 3.62, 3.72 [08:21:49] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset -0.0004480779171 secs [08:23:46] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.96, 3.26, 3.60 [08:24:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:27:47] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.04, 4.24, 3.87 [08:29:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.25, 3.52, 3.64 [08:31:48] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.45, 2.95, 3.40 [08:32:31] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.14, 21.70, 23.45 [08:33:35] RECOVERY - wiki.ouro.one - reverse DNS on sslhost is OK: SSL OK - wiki.ouro.one reverse DNS resolves to cp36.wikitide.net - CNAME OK [08:35:22] RECOVERY - threedomwiki.pcast.site - reverse DNS on sslhost is OK: SSL OK - threedomwiki.pcast.site reverse DNS resolves to cp36.wikitide.net - CNAME OK [08:37:24] RECOVERY - wiki.maribelhearn.com - reverse DNS on sslhost is OK: SSL OK - wiki.maribelhearn.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [08:37:47] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.31, 3.22, 3.42 [08:39:47] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.41, 2.83, 3.25 [08:47:49] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.47, 
4.70, 3.91 [08:49:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:52:07] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [08:54:08] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [08:55:53] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:58:03] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.89, 22.41, 21.46 [08:58:03] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o ] [08:58:55] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:59:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [09:00:04] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset 0.0004426538944 secs [09:00:50] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.090 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [09:02:03]