[01:05:22] RECOVERY - Host cp1 is UP: PING WARNING - Packet loss = 96%, RTA = 0.37 ms
[01:06:33] PROBLEM - cp1 Current Load on cp1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[01:07:13] PROBLEM - cp1 PowerDNS Recursor on cp1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[01:07:18] PROBLEM - cp1 APT on cp1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[01:07:53] PROBLEM - cp1 Nginx Backend for matomo21 on cp1 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[01:10:42] PROBLEM - Host cp1 is DOWN: PING CRITICAL - Packet loss = 100%
[02:11:30] !log [universalomega@jobrunner21] sudo -u www-data php /srv/mediawiki/1.40/maintenance/run.php /srv/mediawiki/1.40/extensions/ManageWiki/maintenance/toggleExtension.php --wiki=lhmnwiki pageproperties --disable (END - exit=0)
[02:11:32] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[02:25:30] !log [universalomega@jobrunner21] sudo -u www-data php /srv/mediawiki/1.40/maintenance/run.php /srv/mediawiki/1.40/extensions/ManageWiki/maintenance/toggleExtension.php --wiki=lhmnwiki lingo --disable (END - exit=0)
[02:25:32] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[04:26:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.02, 3.30, 3.90
[04:30:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.03, 3.74, 3.93
[04:32:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.93, 3.70, 3.89
[04:34:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.45, 3.92, 3.95
[04:42:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.41, 3.91, 3.97
[04:52:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.20, 4.05, 3.98
[06:12:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.49, 3.54, 3.94
[06:14:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.54, 3.84, 4.00
[06:26:48] RECOVERY - mail21 webmail.wikitide.net HTTPS on mail21 is OK: HTTP OK: HTTP/1.1 200 OK - 143553 bytes in 0.094 second response time
[06:27:54] RECOVERY - mail21 Puppet on mail21 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[07:06:55] PROBLEM - jobrunner21 APT on jobrunner21 is CRITICAL: APT CRITICAL: 60 packages available for upgrade (3 critical updates).
[07:12:44] RECOVERY - db21 APT on db21 is OK: APT OK: 0 packages available for upgrade (0 critical updates).
[07:14:55] RECOVERY - jobrunner21 APT on jobrunner21 is OK: APT OK: 0 packages available for upgrade (0 critical updates).
[07:15:23] PROBLEM - jobrunner21 Puppet on jobrunner21 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 28 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/srv/mediawiki/1.40/skins/Femiwiki/node_modules],File[/srv/mediawiki/1.41/skins/Femiwiki/node_modules]
[07:16:55] PROBLEM - mw21 Puppet on mw21 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 17 minutes ago with 2 failures. Failed resources (up to 3 shown): File[/srv/mediawiki/1.40/skins/Femiwiki/node_modules],File[/srv/mediawiki/1.41/skins/Femiwiki/node_modules]
[07:17:23] PROBLEM - mw22 APT on mw22 is CRITICAL: APT CRITICAL: 60 packages available for upgrade (3 critical updates).
[07:19:22] RECOVERY - mw22 APT on mw22 is OK: APT OK: 0 packages available for upgrade (0 critical updates).
[07:19:44] PROBLEM - mw22 Puppet on mw22 is CRITICAL: CRITICAL: Puppet has 19 failures. Last run 47 seconds ago with 19 failures. Failed resources (up to 3 shown): Package[php8.2-common],Package[php8.2-opcache],Package[php8.2-cli],Package[php8.2-fpm]
[07:22:26] huh what happened to PHP 8.2 on mw22 @agentisai
[08:23:13] RECOVERY - newcascadia.net - reverse DNS on sslhost is OK: SSL OK - newcascadia.net reverse DNS resolves to cp5.wikitide.net - NS RECORDS OK
[08:45:31] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.95, 6.35, 5.46
[08:49:31] RECOVERY - puppet21 Current Load on puppet21 is OK: LOAD OK - total load average: 6.47, 6.78, 5.86
[08:54:44] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.
[10:04:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 2.92, 3.35, 3.91
[10:06:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.39, 3.81, 4.01
[10:26:21] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 49%
[10:28:21] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 31%
[10:36:21] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 49%
[10:38:21] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 28%
[10:47:13] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 61%
[10:49:09] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 46%
[10:51:05] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 70%
[10:53:02] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 45%
[11:04:41] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 37%
[11:10:56] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 56%
[11:14:57] RECOVERY - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is OK: OK - NGINX Error Rate is 34%
[11:18:21] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 81%
[11:20:21] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 42%
[11:22:21] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 20%
[11:42:56] PROBLEM - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is WARNING: WARNING - NGINX Error Rate is 43%
[11:44:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.81, 3.69, 3.96
[11:44:56] RECOVERY - cp6 HTTP 4xx/5xx ERROR Rate on cp6 is OK: OK - NGINX Error Rate is 31%
[11:48:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.58, 4.02, 4.03
[11:50:51] RECOVERY - newcascadia.net - reverse DNS on sslhost is OK: SSL OK - newcascadia.net reverse DNS resolves to cp2.wikitide.net - NS RECORDS OK
[11:53:49] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 51%
[11:55:45] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 65%
[11:57:42] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 51%
[12:03:31] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 77%
[12:05:28] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 41%
[12:11:17] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 22%
[12:15:13] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 40%
[12:17:09] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 28%
[12:22:31] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.
[13:21:12] RECOVERY - newcascadia.net - reverse DNS on sslhost is OK: SSL OK - newcascadia.net reverse DNS resolves to cp5.wikitide.net - NS RECORDS OK
[14:23:10] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.401 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.
[15:36:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.82, 3.72, 4.00
[15:40:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.66, 4.16, 4.10
[16:49:15] PROBLEM - cloud4 Puppet on cloud4 is UNKNOWN: NRPE: Unable to read output
[16:58:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.65, 3.64, 3.99
[17:00:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.17, 3.86, 4.03
[17:02:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.49, 3.71, 3.95
[17:04:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.35, 4.21, 4.12
[17:17:15] RECOVERY - cloud4 Puppet on cloud4 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:22:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.45, 3.71, 4.00
[17:24:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 5.39, 4.39, 4.21
[18:12:38] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/ferm/conf.d/02_main]
[18:38:59] RECOVERY - cp4 Puppet on cp4 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[20:10:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.60, 3.57, 3.97
[20:14:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.11, 3.80, 3.95
[20:20:20] RECOVERY - newcascadia.net - reverse DNS on sslhost is OK: SSL OK - newcascadia.net reverse DNS resolves to cp5.wikitide.net - NS RECORDS OK
[20:28:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.20, 3.62, 3.97
[20:36:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.07, 3.78, 3.90
[20:48:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.32, 3.84, 3.97
[20:52:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.57, 3.96, 3.97
[22:10:45] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.15, 3.71, 3.99
[22:14:45] PROBLEM - mem21 Current Load on mem21 is CRITICAL: LOAD CRITICAL - total load average: 4.64, 3.90, 3.98
[23:25:45] PROBLEM - mw21 MediaWiki Rendering on mw21 is UNKNOWN: HTTP UNKNOWN: Failed to unchunk message body
[23:27:45] PROBLEM - mw21 MediaWiki Rendering on mw21 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.119 second response time
[23:37:37] !log [universalomega@jobrunner21] starting deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:37:38] [mw21, mw22, jobrunner21]
[23:37:39] !log [universalomega@jobrunner21] finished deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:37:39] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[23:37:40] [mw21, mw22, jobrunner21] - SUCCESS in 0s
[23:37:41] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[23:37:41] !log [universalomega@jobrunner21] starting deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:37:42] [mw21, mw22, jobrunner21]
[23:37:43] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[23:38:19] !log [universalomega@jobrunner21] finished deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:38:20] [mw21, mw22, jobrunner21] - SUCCESS in 42s
[23:38:21] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[23:38:21] !log [universalomega@jobrunner21] starting deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:38:22] [mw21, mw22, jobrunner21]
[23:38:23] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log
[23:39:30] !log [universalomega@jobrunner21] finished deploy of {'versions': ['1.40', '1.41'], 'show_tags': True, 'upgrade_extensions': ['CreateWiki', 'DataDump', 'GlobalNewFiles', 'ImportDump', 'IncidentReporting', 'ManageWiki', 'MatomoAnalytics', 'PDFEmbed', 'RemovePII', 'RottenLinks', 'SpriteSheet', 'WikiDiscover', 'WikiTideMagic', 'YouTube'], 'upgrade_skins': []} to
[23:39:31] [mw21, mw22, jobrunner21] - SUCCESS in 113s
[23:39:32] Logged the message at https://meta.wikitide.org/wiki/Tech:Server_admin_log