[00:02:49] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:13:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.98, 20.65, 23.72 [00:14:30] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:15:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.07, 22.00, 23.80 [00:16:24] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [00:21:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.42, 22.48, 23.71 [00:25:52] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:27:57] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 6.762 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:33:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.01, 22.66, 22.47 [00:36:22] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:36:47] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:38:42] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [00:44:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:44:49] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.273 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:51:26] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[00:53:26] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [00:54:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:57:33] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:00:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:01:43] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.831 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:05:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:10:20] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:23:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.97, 3.17, 3.88 [01:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.33, 3.72, 3.74 [01:32:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.70, 19.14, 17.08 [01:34:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.85, 19.80, 17.60 [01:35:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 3.16, 2.99, 3.40 [01:38:39] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD 
CRITICAL - total load average: 5.87, 5.09, 4.20 [01:40:38] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.065 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:54:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:59:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:03:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:08:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:20:20] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out. 
[02:26:48] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:28:42] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [02:29:26] [puppet] The-Voidwalker created branch The-Voidwalker-patch-1 - https://github.com/miraheze/puppet [02:29:28] [puppet] The-Voidwalker pushed 1 commit to The-Voidwalker-patch-1 [+0/-0/±1] https://github.com/miraheze/puppet/commit/d8a21803f976 [02:29:31] [puppet] The-Voidwalker d8a2180 - port some cloudflare rules to varnish nginx [02:30:45] [puppet] The-Voidwalker opened pull request #3900: port some cloudflare rules to varnish nginx - https://github.com/miraheze/puppet/pull/3900 [02:30:50] [puppet] coderabbitai[bot] commented on pull request #3900: port some cloudflare rules to varnish nginx - https://github.com/miraheze/puppet/pull/3900#issuecomment-2308627533 [02:37:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:37:55] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:39:16] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.062 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:39:54] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [02:47:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:47:44] RECOVERY - jobchron171 APT on jobchron171 is OK: APT OK: 61 packages available for upgrade (0 critical updates). [02:48:05] RECOVERY - eventgate181 APT on eventgate181 is OK: APT OK: 44 packages available for upgrade (0 critical updates). 
[02:48:17] RECOVERY - cp36 APT on cp36 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:48:42] RECOVERY - cp37 APT on cp37 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:48:46] RECOVERY - bots171 APT on bots171 is OK: APT OK: 62 packages available for upgrade (0 critical updates). [02:48:52] RECOVERY - db161 APT on db161 is OK: APT OK: 58 packages available for upgrade (0 critical updates). [02:49:03] RECOVERY - graphite151 APT on graphite151 is OK: APT OK: 39 packages available for upgrade (0 critical updates). [02:49:06] RECOVERY - db151 APT on db151 is OK: APT OK: 59 packages available for upgrade (0 critical updates). [02:49:17] RECOVERY - bast161 APT on bast161 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:49:32] RECOVERY - cloud16 APT on cloud16 is OK: APT OK: 101 packages available for upgrade (0 critical updates). [02:49:49] RECOVERY - matomo151 APT on matomo151 is OK: APT OK: 62 packages available for upgrade (0 critical updates). [02:49:50] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:49:59] RECOVERY - cloud15 APT on cloud15 is OK: APT OK: 109 packages available for upgrade (0 critical updates). [02:50:02] RECOVERY - db171 APT on db171 is OK: APT OK: 58 packages available for upgrade (0 critical updates). [02:50:33] RECOVERY - db182 APT on db182 is OK: APT OK: 59 packages available for upgrade (0 critical updates). [02:50:40] RECOVERY - graylog161 APT on graylog161 is OK: APT OK: 55 packages available for upgrade (0 critical updates). [02:51:42] RECOVERY - db181 APT on db181 is OK: APT OK: 59 packages available for upgrade (0 critical updates). [02:51:44] RECOVERY - cloud17 APT on cloud17 is OK: APT OK: 101 packages available for upgrade (0 critical updates). 
[02:51:56] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 35 minutes ago with 0 failures [02:52:10] RECOVERY - ldap171 APT on ldap171 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:52:17] RECOVERY - bast181 APT on bast181 is OK: APT OK: 52 packages available for upgrade (0 critical updates). [02:52:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:52:51] RECOVERY - kafka181 APT on kafka181 is OK: APT OK: 37 packages available for upgrade (0 critical updates). [02:53:00] RECOVERY - cp26 APT on cp26 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:53:16] RECOVERY - cp27 APT on cp27 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:53:26] RECOVERY - mem151 APT on mem151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:53:29] RECOVERY - cloud18 APT on cloud18 is OK: APT OK: 102 packages available for upgrade (0 critical updates). [02:53:59] RECOVERY - mem161 APT on mem161 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:54:28] RECOVERY - cp41 APT on cp41 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:54:32] RECOVERY - mw151 APT on mw151 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:54:52] RECOVERY - mwtask171 APT on mwtask171 is OK: APT OK: 62 packages available for upgrade (0 critical updates). [02:54:58] RECOVERY - mon181 APT on mon181 is OK: APT OK: 57 packages available for upgrade (0 critical updates). [02:54:58] RECOVERY - os162 APT on os162 is OK: APT OK: 53 packages available for upgrade (0 critical updates). 
[02:55:11] RECOVERY - puppet181 APT on puppet181 is OK: APT OK: 58 packages available for upgrade (0 critical updates). [02:55:11] RECOVERY - ns1 APT on ns1 is OK: APT OK: 50 packages available for upgrade (0 critical updates). [02:55:44] RECOVERY - os161 APT on os161 is OK: APT OK: 52 packages available for upgrade (0 critical updates). [02:55:59] RECOVERY - mw162 APT on mw162 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:56:15] RECOVERY - mw161 APT on mw161 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:56:18] RECOVERY - phorge171 APT on phorge171 is OK: APT OK: 53 packages available for upgrade (0 critical updates). [02:56:29] RECOVERY - mw152 APT on mw152 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:56:38] RECOVERY - os151 APT on os151 is OK: APT OK: 53 packages available for upgrade (0 critical updates). [02:56:58] RECOVERY - mw171 APT on mw171 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:57:34] RECOVERY - reports171 APT on reports171 is OK: APT OK: 63 packages available for upgrade (0 critical updates). [02:57:46] RECOVERY - mw181 APT on mw181 is OK: APT OK: 65 packages available for upgrade (0 critical updates). [02:57:51] RECOVERY - rdb151 APT on rdb151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:58:10] RECOVERY - mw172 APT on mw172 is OK: APT OK: 64 packages available for upgrade (0 critical updates). [02:58:12] RECOVERY - mwtask181 APT on mwtask181 is OK: APT OK: 62 packages available for upgrade (0 critical updates). [02:58:17] !log [void@puppet181] Upgraded packages libaom3 on swiftobject151 [02:58:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:58:27] RECOVERY - cp51 APT on cp51 is OK: APT OK: 51 packages available for upgrade (0 critical updates). 
[02:58:31] !log [void@puppet181] Upgraded packages libaom3 on test151 [02:58:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:58:40] RECOVERY - mw182 APT on mw182 is OK: APT OK: 65 packages available for upgrade (0 critical updates). [02:58:47] !log [void@puppet181] Upgraded packages libaom3 on swiftac171 [02:58:53] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:59:01] !log [void@puppet181] Upgraded packages libaom3 on swiftobject171 [02:59:05] RECOVERY - swiftobject171 APT on swiftobject171 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:59:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:59:12] RECOVERY - swiftac171 APT on swiftac171 is OK: APT OK: 40 packages available for upgrade (0 critical updates). [02:59:15] !log [void@puppet181] Upgraded packages libaom3 on swiftproxy161 [02:59:21] RECOVERY - swiftobject151 APT on swiftobject151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:59:21] RECOVERY - swiftobject161 APT on swiftobject161 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [02:59:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:59:31] !log [void@puppet181] Upgraded packages libaom3 on swiftobject181 [02:59:32] !log [void@bots171] restart irclogserverbot [02:59:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:59:38] RECOVERY - swiftobject181 APT on swiftobject181 is OK: APT OK: 52 packages available for upgrade (0 critical updates). [02:59:42] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [02:59:44] RECOVERY - swiftproxy161 APT on swiftproxy161 is OK: APT OK: 40 packages available for upgrade (0 critical updates). 
[02:59:46] !log [void@puppet181] Upgraded packages libaom3 on swiftproxy171 [02:59:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [03:00:03] RECOVERY - test151 APT on test151 is OK: APT OK: 74 packages available for upgrade (0 critical updates). [03:00:46] RECOVERY - mon181 Backups Grafana on mon181 is OK: FILE_AGE OK: /var/log/grafana-backup.log is 23 seconds old and 93 bytes [03:01:13] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/87d37448b8d3...775f3f7607f8 [03:01:14] [ssl] WikiTideSSLBot 775f3f7 - Bot: Update SSL cert for wiki.tdrweb.top [03:01:31] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:01:39] RECOVERY - swiftproxy171 APT on swiftproxy171 is OK: APT OK: 54 packages available for upgrade (0 critical updates). [03:01:46] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:03:02] !log [void@puppet181] Upgraded packages libaom3 on prometheus151 [03:03:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [03:03:14] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/775f3f7607f8...b1719ff9288a [03:03:17] [ssl] WikiTideSSLBot b1719ff - Bot: Update SSL cert for project-patterns.com [03:03:29] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.074 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:03:40] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [03:07:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:07:55] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:07:57] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:08:30] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [03:09:57] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [03:09:58] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 4.112 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:12:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:17:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:20:17] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - puritwiki.p-e.kr All nameservers failed to answer the query. [03:25:39] RECOVERY - project-patterns.com - LetsEncrypt on sslhost is OK: OK - Certificate 'project-patterns.com' will expire on Sat 23 Nov 2024 02:04:38 AM GMT +0000. [03:26:48] RECOVERY - www.project-patterns.com - LetsEncrypt on sslhost is OK: OK - Certificate 'project-patterns.com' will expire on Sat 23 Nov 2024 02:04:38 AM GMT +0000. 
[03:30:02] RECOVERY - wiki.tdrweb.top - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.tdrweb.top' will expire on Sat 23 Nov 2024 02:02:36 AM GMT +0000. [03:32:25] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:43:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.75, 2.96, 3.75 [03:49:07] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/commit/0d7eb3a33f8a [03:49:09] [ImportDump] Universal-Omega 0d7eb3a - Don't rerun a job that is already completed. [03:49:11] [ImportDump] Universal-Omega created branch fix-completed-job-duplication - https://github.com/miraheze/ImportDump [03:49:14] [ImportDump] Universal-Omega opened pull request #118: Don't rerun a job that is already completed. - https://github.com/miraheze/ImportDump/pull/118 [03:49:22] [ImportDump] coderabbitai[bot] commented on pull request #118: Don't rerun a job that is already completed. 
- https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399 [03:49:34] [ImportDump] Universal-Omega edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118 [03:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.47, 3.46, 3.66 [03:50:17] [ImportDump] coderabbitai[bot] edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118 [03:50:59] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/compare/0d7eb3a33f8a...364bbad8da0c [03:51:00] [ImportDump] Universal-Omega 364bbad - Add fallback check [03:51:03] [ImportDump] Universal-Omega synchronize pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118 [03:51:40] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399 [03:51:42] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399 [03:52:27] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:53:39] [ImportDump] coderabbitai[bot] edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118 [03:53:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.42, 3.23, 3.50 [03:55:35] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - 
https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399 [03:55:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.77, 2.85, 3.32 [03:59:23] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/compare/364bbad8da0c...b37d1aea3b21 [03:59:24] [ImportDump] Universal-Omega b37d1ae - Also add a notify fallback [03:59:25] [ImportDump] Universal-Omega synchronize pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118 [03:59:40] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399 [04:00:51] PROBLEM - wiki.corgicam.tv - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out. 
[04:03:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.83, 22.65, 23.91 [04:04:54] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.41, 3.91, 3.73 [04:05:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.85, 23.29, 23.98 [04:06:51] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.45, 4.88, 4.10 [04:07:36] PROBLEM - wiki.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [04:08:16] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:08:43] PROBLEM - wiki.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for wiki.graalmilitary.com could not be found [04:09:07] miraheze/ImportDump - Universal-Omega the build passed. [04:10:24] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 9.007 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:12:42] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.89, 3.81, 3.95 [04:16:35] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.21, 4.10, 4.01 [04:17:27] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:18:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.76, 3.34, 3.73 [04:18:43] PROBLEM - sarovia.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for sarovia.graalmilitary.com could not be found [04:18:55] PROBLEM - pt.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [04:20:30] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.26, 3.92, 3.89 [04:21:25] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:21:55] PROBLEM - pt.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for pt.graalmilitary.com could not be found [04:23:18] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:24:23] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.58, 3.81, 3.88 [04:28:19] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.54, 3.87, 3.87 [04:29:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:30:49] PROBLEM - wiki.corgicam.tv - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.corgicam.tv All nameservers failed to answer the query. 
[04:32:13] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.20, 3.70, 3.84 [04:34:10] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.78, 4.45, 4.10 [04:35:28] PROBLEM - sarovia.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [04:35:36] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:36:07] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.30, 3.74, 3.90 [04:37:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.46, 22.25, 23.89 [04:38:04] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.76, 4.13, 4.02 [04:39:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.32, 23.46, 24.14 [04:40:01] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.14, 3.50, 3.81 [04:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.65, 4.35, 4.10 [04:47:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.57, 22.45, 23.55 [04:48:25] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [04:49:16] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:51:16] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 1.461 second response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:51:20] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). 
[04:51:46] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:56:46] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:01:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 29.25, 24.68, 23.47 [05:03:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.12, 23.15, 23.04 [05:03:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:05:22] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:08:14] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:08:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:09:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.19, 23.42, 23.04 [05:09:23] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:10:13] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.071 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:10:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:11:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.96, 22.68, 22.78 [05:15:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.83, 23.50, 22.94 [05:15:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:35:54] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [05:35:56] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [05:39:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:39:28] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:39:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:40:53] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 22 minutes ago with 0 failures [05:40:54] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [05:41:34] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [05:43:37] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.270 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:44:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:10:52] PROBLEM - swiftac171 PowerDNS Recursor on swiftac171 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:11:17] PROBLEM - swiftac171 APT on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:12:19] PROBLEM - swiftac171 Current Load on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:12:21] PROBLEM - swiftac171 Puppet on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:13:06] RECOVERY - swiftac171 PowerDNS Recursor on swiftac171 is OK: DNS OK: 0.392 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:13:18] RECOVERY - swiftac171 APT on swiftac171 is OK: APT OK: 40 packages available for upgrade (0 critical updates). [06:13:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.43, 3.05, 3.86 [06:14:20] RECOVERY - swiftac171 Puppet on swiftac171 is OK: OK: Puppet is currently enabled, last run 19 minutes ago with 0 failures [06:18:03] PROBLEM - swiftac171 Current Load on swiftac171 is WARNING: LOAD WARNING - total load average: 3.40, 10.29, 8.32 [06:19:57] RECOVERY - swiftac171 Current Load on swiftac171 is OK: LOAD OK - total load average: 2.47, 7.78, 7.63 [06:20:20] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out. 
[06:21:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.31, 3.22, 3.58 [06:22:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:23:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.32, 3.49, 3.66 [06:29:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.61, 2.82, 3.28 [06:32:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:33:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:45:44] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.42, 5.18, 3.92 [06:48:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:50:17] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - puritwiki.p-e.kr All nameservers failed to answer the query. [06:51:21] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:53:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:53:20] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:57:26] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query. [07:08:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:24:45] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.99, 3.23, 3.95 [07:26:12] RECOVERY - wiki.thesimswiki.com - reverse DNS on sslhost is OK: SSL OK - wiki.thesimswiki.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:27:07] RECOVERY - wiki.tmyt105.leyhp.com - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.tmyt105.leyhp.com' will expire on Thu 12 Sep 2024 12:27:49 PM GMT +0000. [07:28:38] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.81, 4.03, 4.08 [07:29:01] RECOVERY - wiki.sheepservermc.net - reverse DNS on sslhost is OK: SSL OK - wiki.sheepservermc.net reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:17] RECOVERY - wiki.mobilityengineer.com - reverse DNS on sslhost is OK: SSL OK - wiki.mobilityengineer.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:27] RECOVERY - puritwiki.p-e.kr - LetsEncrypt on sslhost is OK: OK - Certificate 'puritwiki.p-e.kr' will expire on Sat 19 Oct 2024 02:09:03 PM GMT +0000. 
[07:29:40] RECOVERY - wiki.tmyt105.leyhp.com - reverse DNS on sslhost is OK: SSL OK - wiki.tmyt105.leyhp.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:53] RECOVERY - wiki.joust.ro - reverse DNS on sslhost is OK: SSL OK - wiki.joust.ro reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:30:33] RECOVERY - wiki.nowchess.org - reverse DNS on sslhost is OK: SSL OK - wiki.nowchess.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:30:46] RECOVERY - wiki.corgicam.tv - reverse DNS on sslhost is OK: SSL OK - wiki.corgicam.tv reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:32:36] RECOVERY - wiki.astralprojections.org - reverse DNS on sslhost is OK: SSL OK - wiki.astralprojections.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:33:59] RECOVERY - wiki.villagecollaborative.net - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.villagecollaborative.net' will expire on Wed 09 Oct 2024 09:20:12 PM GMT +0000. [07:35:42] RECOVERY - wiki.digitalcandela.com - reverse DNS on sslhost is OK: SSL OK - wiki.digitalcandela.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:37:50] RECOVERY - wiki.yahia.xyz - reverse DNS on sslhost is OK: SSL OK - wiki.yahia.xyz reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:00] RECOVERY - tno.wiki - reverse DNS on sslhost is OK: SSL OK - tno.wiki reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:00] RECOVERY - wiki.junkstore.xyz - reverse DNS on sslhost is OK: SSL OK - wiki.junkstore.xyz reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:03] RECOVERY - wiki.ate42.ru - reverse DNS on sslhost is OK: SSL OK - wiki.ate42.ru reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:39:47] RECOVERY - wiki.wikimedia.cat - reverse DNS on sslhost is OK: SSL OK - wiki.wikimedia.cat reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:40:20] RECOVERY - www.kinitopedia.lol - reverse DNS on sslhost is OK: SSL OK - www.kinitopedia.lol reverse DNS resolves to 
cp36.wikitide.net - CNAME OK [07:42:07] RECOVERY - wiki.villagecollaborative.net - reverse DNS on sslhost is OK: SSL OK - wiki.villagecollaborative.net reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:42:23] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:44:22] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.106 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:45:30] RECOVERY - www.thesimswiki.com - reverse DNS on sslhost is OK: SSL OK - www.thesimswiki.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:47:18] RECOVERY - minescape.wiki - reverse DNS on sslhost is OK: SSL OK - minescape.wiki reverse DNS resolves to cp36.wikitide.net - CNAME FLAT [07:47:19] RECOVERY - vise.dayid.org - reverse DNS on sslhost is OK: SSL OK - vise.dayid.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:49:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:50:15] RECOVERY - puritwiki.p-e.kr - reverse DNS on sslhost is OK: SSL OK - puritwiki.p-e.kr reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:51:33] RECOVERY - wiki.joust.ro - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.joust.ro' will expire on Wed 09 Oct 2024 09:25:14 PM GMT +0000. [07:51:42] RECOVERY - wiki.wikimedia.cat - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.wikimedia.cat' will expire on Thu 10 Oct 2024 07:43:36 AM GMT +0000. [07:51:44] RECOVERY - wiki.eggsdstudios.com - reverse DNS on sslhost is OK: SSL OK - wiki.eggsdstudios.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:52:51] RECOVERY - gufengcheng.top - reverse DNS on sslhost is OK: SSL OK - gufengcheng.top reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:54:20] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:57:27] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found [08:09:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.75, 2.94, 3.70 [08:13:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.94, 2.59, 3.38 [08:16:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:17:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.85, 4.20, 3.90 [08:19:20] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:24:51] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [08:29:06] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:29:20] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:31:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.068 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:34:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:44:20] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:57:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:57:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.70, 2.88, 3.78 [09:01:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.11, 3.60, 3.87 [09:04:35] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:09:52] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:10:51] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [09:11:52] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [09:19:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.03, 3.21, 3.85 [09:23:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.57, 3.51, 3.81 [09:25:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.34, 3.32, 3.73 [09:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.64, 3.64, 3.78 [09:29:45] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.08, 3.76, 3.85 [09:33:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.76, 4.29, 4.04 [09:35:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.97, 3.76, 3.91 [09:37:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 3.53, 4.06, 4.01 [09:38:10] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.069 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [09:42:02] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.33, 19.08, 18.02 [09:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.97, 3.68, 3.87 [09:44:01] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.54, 17.63, 17.60 [09:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.56, 3.88, 3.89 [09:46:42] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [09:47:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.92, 3.24, 3.64 [09:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.84, 4.32, 4.00 [09:53:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.34, 3.82, 3.92 [09:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.25, 4.13, 4.00 [09:57:54] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.12, 21.32, 19.59 [09:58:20] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:59:53] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.80, 19.92, 19.33 [10:02:27] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.197 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [10:11:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.80, 3.21, 3.98 [10:13:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.62, 4.26, 4.28 [10:15:32] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [10:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.34, 3.32, 3.76 [10:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.32, 3.85, 3.85 [10:33:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.26, 3.60, 3.79 [10:35:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.16, 4.15, 3.96 [10:37:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.94, 3.58, 3.80 [10:40:01] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.19, 4.38, 4.05 [10:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.24, 3.74, 3.87 [10:42:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.79, 4.16, 4.00 [10:45:59] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.52, 3.67, 3.83 [10:48:01] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.76, 4.74, 4.20 [10:49:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.66, 4.00, 
3.98 [10:51:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.20, 3.99, 3.97 [10:55:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.88, 3.52, 3.84 [10:59:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.22, 4.11, 3.98 [11:00:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:01:00] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:02:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:03:04] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [11:07:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:11:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.79, 3.25, 3.85 [11:13:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:17:20] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 21.48, 19.10 [11:18:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:19:19] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.11, 21.37, 19.32 [11:21:19] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.29, 20.40, 19.23 [11:21:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.08, 3.73, 3.72 [11:23:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.53, 3.35, 3.59 [11:25:58] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.69, 2.75, 3.34 [11:29:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 8.28, 5.33, 4.22 [11:33:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.01, 3.55, 3.74 [11:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.16, 3.55, 3.62 [11:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.03, 3.17, 3.47 [11:43:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:45:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.02, 3.54, 3.56 [11:48:52] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[11:52:23] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:54:16] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [12:03:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.77, 3.17, 3.93 [12:05:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.21, 3.86, 4.10 [12:09:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.34, 3.44, 3.86 [12:13:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:17:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.26, 4.19, 3.83 [12:21:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.91, 3.66, 3.77 [12:23:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:25:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.26, 3.74, 3.72 [12:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.80, 3.22, 3.53 [12:29:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.69, 4.18, 3.85 [12:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.19, 3.46, 3.61 [12:35:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.80, 4.93, 4.10 [12:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.61, 3.56, 3.76 [12:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.27, 3.56, 3.71 [12:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.00, 3.24, 3.57 [12:47:12] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [12:47:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.04, 2.87, 3.40 [12:51:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.11, 4.25, 3.82 [12:53:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:55:48] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[12:57:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [12:57:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:58:12] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [12:58:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:03:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:08:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:09:47] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [13:09:59] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.063 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [13:13:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:18:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:23:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:27:07] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [13:29:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.092 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [13:29:20] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.74, 21.80, 19.67 [13:31:19] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.38, 20.22, 19.35 [13:33:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:42:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.28, 3.39, 3.97 [13:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.25, 3.55, 3.65 [14:12:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:14:07] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [14:16:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.073 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [14:29:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.14, 3.32, 3.93 [14:30:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:33:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.20, 3.59, 3.84 [14:35:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:35:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.92, 3.86, 3.93 [14:37:07] !log reverse some migrations that aren't needed [14:37:19] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [14:37:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.06, 4.02, 3.96 [14:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.93, 3.87, 3.93 [14:43:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.52, 4.42, 4.12 [14:45:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception 
rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:47:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.31, 3.65, 3.91 [14:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.82, 3.94, 3.96 [14:50:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:53:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.69, 3.29, 3.70 [14:57:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.06, 3.87, 3.83 [15:03:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.37, 3.35, 3.70 [15:09:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.43, 2.96, 3.40 [15:16:55] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.93, 4.02, 3.71 [15:19:28] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.10, 20.37, 17.84 [15:20:38] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:21:23] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:21:27] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.58, 19.36, 17.73 [15:26:23] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:26:39] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [15:27:23] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.73, 20.47, 18.69 [15:29:48] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing 
system call [15:31:21] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.55, 22.93, 20.07 [15:32:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:32:56] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:33:20] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.96, 21.45, 19.87 [15:34:04] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 8.642 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [15:34:54] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [15:39:17] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 20.31, 19.61, 19.54 [15:42:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:43:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:44:15] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.42, 3.49, 4.00 [15:46:11] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.87, 4.14, 4.16 [15:47:43] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[15:49:12] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.70, 19.07, 18.86 [15:51:11] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.66, 18.10, 18.53 [15:53:03] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [15:53:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:55:02] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.069 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [15:58:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:59:12] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:28] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [16:03:12] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [16:03:14] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [16:03:14] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[16:06:00] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 11 minutes ago with 0 failures [16:06:01] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [16:08:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:09:01] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.64, 20.54, 19.46 [16:11:00] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.97, 20.11, 19.42 [16:13:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:16:27] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.114 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [16:18:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:22:52] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.44, 23.16, 20.84 [16:24:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.91, 22.17, 20.74 [16:30:49] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.57, 23.59, 21.78 [16:32:48] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.00, 22.73, 21.68 [16:40:44] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.76, 19.50, 20.36 [16:48:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:50:57] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [16:52:55] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.056 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [16:53:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:55:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:57:41] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [16:58:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:58:56] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [16:58:56] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [16:59:28] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [17:01:29] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.166 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [17:04:23] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 12 minutes ago with 0 failures [17:04:24] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [17:08:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [17:08:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.55, 20.25, 18.91 [17:10:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.84, 20.07, 19.01 [17:13:44] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [17:13:59] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.68, 3.06, 4.00 [17:19:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.51, 3.82, 3.95 [17:20:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.14, 22.11, 20.41 [17:21:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.37, 3.44, 3.82 [17:22:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.80, 20.20, 19.88 [17:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.80, 3.36, 3.60 [17:29:58] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 3.07, 2.94, 3.40 [17:38:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.71, 21.83, 20.85 [17:40:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 28.19, 24.01, 21.75 [17:43:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.46, 3.78, 3.61 [17:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.13, 3.91, 3.67 [17:46:38] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.91, 19.94, 17.22 [17:47:57] RECOVERY - prometheus151 
Current Load on prometheus151 is OK: LOAD OK - total load average: 2.34, 3.13, 3.40 [17:48:36] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.11, 18.50, 17.00 [17:50:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.03, 23.36, 22.64 [17:56:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.60, 23.37, 22.92 [17:58:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.73, 21.76, 22.41 [18:00:18] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:02:15] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset -0.0001960992813 secs [18:02:45] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.05, 4.74, 3.85 [18:06:39] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.09, 3.51, 3.55 [18:08:37] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.41, 2.98, 3.36 [18:10:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.68, 17.97, 19.96 [18:12:00] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:12:14] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [18:12:34] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 8.79, 5.63, 4.32 [18:13:57] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [18:14:18] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 5.866 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [18:18:27] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.18, 20.43, 17.28 [18:18:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [18:18:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.90, 21.33, 20.70 [18:20:23] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 19.25, 20.06, 17.54 [18:20:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.78, 22.96, 21.34 [18:22:14] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:22:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.29, 21.34, 20.92 [18:23:03] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [18:24:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.47, 22.52, 21.37 [18:25:02] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.232 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [18:30:05] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 29.29, 22.92, 19.41 [18:30:19] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [18:32:01] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.53, 21.74, 19.34 [18:33:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [18:35:33] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 20.05, 20.56, 18.65 [18:35:46] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.11, 21.21, 19.21 [18:37:33] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 19.13, 20.04, 18.68 [18:37:46] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 16.96, 20.14, 19.10 [18:39:45] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 16.52, 19.72, 19.50 [18:42:38] PROBLEM - cp37 Varnish Backends on cp37 is CRITICAL: 1 backends are down. 
mw171 [19:03:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.48, 3.60, 3.84 [19:04:34] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [19:07:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.42, 3.68, 3.75 [19:08:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [19:08:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.72, 22.35, 22.92 [19:08:53] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.50, 22.40, 20.80 [19:09:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.09, 3.11, 3.52 [19:10:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.10, 22.13, 22.75 [19:10:49] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 13.40, 19.14, 19.81 [19:13:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.06, 2.86, 3.35 [19:16:21] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[19:16:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.50, 22.23, 22.45 [19:19:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.56, 4.33, 3.84 [19:25:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.42, 3.52, 3.65 [19:28:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.31, 22.71, 23.09 [19:29:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.41, 3.59, 3.63 [19:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.51, 3.40, 3.54 [19:32:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.96, 23.14, 23.02 [19:34:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.03, 23.12, 23.05 [19:37:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.07, 4.33, 3.82 [19:38:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [19:38:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.53, 23.28, 23.00 [19:40:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.44, 23.04, 22.99 [19:43:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [19:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.99, 3.88, 3.82 [19:45:25] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.80, 4.14, 3.89 [19:48:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [19:52:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.02, 17.30, 20.15 [19:55:10] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [19:56:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.97, 19.77, 20.59 [19:58:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:01:29] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 3.792 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [20:02:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.89, 19.75, 20.40 [20:03:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:13:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:13:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.82, 3.07, 3.83 [20:15:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.21, 4.84, 4.39 [20:17:27] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [20:18:27] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:23:27] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:23:37] PROBLEM - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query uk.eu.org. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [20:23:43] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.063 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [20:27:22] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 29.77, 25.08, 21.79 [20:29:21] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.22, 22.86, 21.36 [20:30:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:35:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:37:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:38:27] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [20:39:09] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:40:26] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.102 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [20:40:45] [mw-config] OAuthority pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/84350d787b8e...a182ac555cb2 [20:40:46] [mw-config] OAuthority a182ac5 - Disable TSPortal verification at Harej's request [20:41:11] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [20:41:15] !log [oa@mwtask181] starting deploy of {'config': True, 'world': True, 'versions': '1.42'} to all [20:41:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:41:41] miraheze/mw-config - OAuthority the build passed. [20:42:20] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:42:58] !log [oa@mwtask181] finished deploy of {'config': True, 'world': True, 'versions': '1.42'} to all - SUCCESS in 102s [20:43:07] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:44:52] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [20:46:11] !log [@mwtask181] starting deploy of {'config': True} to all [20:46:21] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:46:30] !log [@mwtask181] finished deploy of {'config': True} to all - SUCCESS in 19s [20:46:36] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:46:51] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.562 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [20:47:01] !log [oa@mwtask181] starting deploy of {'config': True, 'world': True, 'versions': '1.42'} to all [20:47:07] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:47:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:48:42] !log [oa@mwtask181] finished deploy of {'config': True, 'world': True, 'versions': '1.42'} to all - SUCCESS in 101s [20:48:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:49:12] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.93, 22.62, 21.95 [20:51:11] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.20, 22.28, 21.95 [20:52:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:52:46] RECOVERY - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is OK: SSL OK - wiki.andreijiroh.uk.eu.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [20:54:02] !log [@test151] starting deploy of {'config': True} to test151 [20:54:03] !log [@test151] finished deploy of {'config': True} to test151 - SUCCESS in 0s [20:54:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:54:21] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:55:33] [RemovePII] OAuthority opened pull request #90: Fix issue with CentralAuthDatabaseManager not being passed - https://github.com/miraheze/RemovePII/pull/90 [20:55:55] [RemovePII] coderabbitai[bot] commented on pull request #90: Fix issue with CentralAuthDatabaseManager not being passed - https://github.com/miraheze/RemovePII/pull/90#issuecomment-2308990605 [20:57:08] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.45, 23.60, 22.35 [20:59:07] PROBLEM 
- mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.66, 22.01, 21.94 [21:03:16] !log [oa@test151] starting deploy of {'versions': '1.42', 'upgrade_extensions': 'RemovePII'} to test151 [21:03:17] !log [oa@test151] finished deploy of {'versions': '1.42', 'upgrade_extensions': 'RemovePII'} to test151 - SUCCESS in 0s [21:03:19] [RemovePII] Universal-Omega commented on pull request #90: Fix issue with CentralAuthDatabaseManager not being passed - https://github.com/miraheze/RemovePII/pull/90#issuecomment-2308992764 [21:03:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:03:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:03:36] miraheze/RemovePII - OAuthority the build has errored. [21:04:11] [RemovePII] Universal-Omega commented on pull request #90: Fix issue with CentralAuthDatabaseManager not being passed - https://github.com/miraheze/RemovePII/pull/90#issuecomment-2308992994 [21:04:41] !log [oa@test151] starting deploy of {'versions': '1.42', 'upgrade_extensions': 'RemovePII'} to test151 [21:04:42] !log [oa@test151] DEPLOY ABORTED: Canary check failed for meta.mirabeta.org@localhost [21:04:49] !log [oa@test151] starting deploy of {'force': True, 'versions': '1.42', 'upgrade_extensions': 'RemovePII'} to test151 [21:04:50] !log [oa@test151] finished deploy of {'force': True, 'versions': '1.42', 'upgrade_extensions': 'RemovePII'} to test151 - SUCCESS in 0s [21:04:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:05:00] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:05:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:05:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:05:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.69, 3.28, 3.89 [21:07:20] !log [@mwtask171] starting 
deploy of {'config': True} to all [21:07:26] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:07:36] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 16s [21:07:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:08:23] [RemovePII] OAuthority synchronize pull request #90: Fix issue with CentralAuthDatabaseManager not being passed - https://github.com/miraheze/RemovePII/pull/90 [21:09:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.51, 3.35, 3.74 [21:11:55] miraheze/RemovePII - OAuthority the build has errored. [21:11:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.73, 3.35, 3.71 [21:13:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.10, 2.58, 3.39 [21:18:58] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.74, 23.13, 21.94 [21:20:54] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.94, 3.38, 3.53 [21:22:51] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.96, 4.05, 3.76 [21:24:48] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.27, 3.65, 3.68 [21:24:55] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.44, 23.09, 22.57 [21:26:45] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.68, 4.69, 4.04 [21:27:58] [TSPortal] Universal-Omega pushed 1 commit to migrate-underage [+0/-0/±1] https://github.com/miraheze/TSPortal/compare/9a5711ebea88...71e95b81d834 [21:28:01] [TSPortal] Universal-Omega 71e95b8 - Bump version [21:28:02] [TSPortal] Universal-Omega synchronize pull request #20: Add migration to 
change dpas underage field to TEXT rather than VARCHAR - https://github.com/miraheze/TSPortal/pull/20 [21:50:43] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 28.45, 24.54, 22.40 [21:52:42] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.50, 23.89, 22.44 [21:57:20] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [21:57:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.35, 3.02, 3.76 [21:58:39] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.25, 23.14, 22.16 [22:00:38] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.08, 21.53, 21.68 [22:01:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.26, 4.12, 4.01 [22:03:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.77, 3.85, 3.95 [22:09:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.29, 3.37, 3.59 [22:10:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.85, 18.65, 20.28 [22:11:41] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [22:15:48] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [22:20:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.92, 21.32, 20.45 [22:22:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.54, 22.67, 21.02 [22:22:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [22:24:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.82, 22.70, 21.29 [22:26:35] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.58, 23.06, 21.56 [22:27:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [22:30:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.39, 22.23, 21.71 [22:42:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 14.72, 18.69, 20.14 [22:47:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [22:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.96, 3.37, 3.94 [22:51:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.48, 3.65, 3.96 [22:53:17] PROBLEM - wiki.stag.lol - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.stag.lol All nameservers failed to answer the query. 
[22:53:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.48, 3.49, 3.88 [22:54:20] PROBLEM - wiki.tulpa.info - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.tulpa.info All nameservers failed to answer the query. [22:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.23, 3.93, 3.98 [22:57:08] PROBLEM - wiki.cubestudios.xyz - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.cubestudios.xyz All nameservers failed to answer the query. [22:57:54] PROBLEM - ao90.pinho.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - ao90.pinho.org All nameservers failed to answer the query. [22:58:14] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:59:56] PROBLEM - data.nonbinary.wiki - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - data.nonbinary.wiki All nameservers failed to answer the query. [23:00:08] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [23:00:50] PROBLEM - legacygt.wiki - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - legacygt.wiki All nameservers failed to answer the query. [23:02:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [23:02:57] PROBLEM - ru-teirailway.f5.si - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - ru-teirailway.f5.si All nameservers failed to answer the query. 
[23:03:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.86, 3.58, 3.92
[23:05:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.93, 4.21, 4.12
[23:07:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[23:07:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.92, 3.53, 3.87
[23:09:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.90, 4.46, 4.17
[23:19:37] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[23:21:41] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[23:21:42] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[23:22:26] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.73, 20.32, 19.26
[23:22:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[23:24:25] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.80, 20.63, 19.50
[23:26:24] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.59, 20.11, 19.45
[23:27:03] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 39 minutes ago with 0 failures
[23:27:03] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[23:28:02] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.063 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[23:30:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.89, 22.73, 20.86
[23:32:21] RECOVERY - ru-teirailway.f5.si - reverse DNS on sslhost is OK: SSL OK - ru-teirailway.f5.si reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:32:33] PROBLEM - wiki.nowchess.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.nowchess.org All nameservers failed to answer the query.
[23:32:45] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[23:32:46] PROBLEM - wiki.corgicam.tv - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.corgicam.tv All nameservers failed to answer the query.
[23:34:44] PROBLEM - wiki.astralprojections.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.astralprojections.org All nameservers failed to answer the query.
[23:37:50] PROBLEM - wiki.digitalcandela.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.digitalcandela.com All nameservers failed to answer the query.
[23:38:08] PROBLEM - www.kinitopedia.lol - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - www.kinitopedia.lol All nameservers failed to answer the query.
[23:39:56] PROBLEM - wiki.yahia.xyz - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.yahia.xyz All nameservers failed to answer the query.
[23:40:09] PROBLEM - tno.wiki - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - tno.wiki All nameservers failed to answer the query.
[23:40:09] PROBLEM - wiki.ate42.ru - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.ate42.ru All nameservers failed to answer the query.
[23:40:09] PROBLEM - wiki.junkstore.xyz - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.junkstore.xyz All nameservers failed to answer the query.
[23:40:34] PROBLEM - wiki.wikimedia.cat - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.wikimedia.cat All nameservers failed to answer the query.
[23:40:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:42:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.89, 19.33, 20.12
[23:44:09] PROBLEM - wiki.villagecollaborative.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.villagecollaborative.net All nameservers failed to answer the query.
[23:46:48] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[23:47:33] PROBLEM - www.thesimswiki.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - www.thesimswiki.com All nameservers failed to answer the query.
[23:49:24] PROBLEM - minescape.wiki - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - minescape.wiki All nameservers failed to answer the query.
[23:49:24] PROBLEM - vise.dayid.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - vise.dayid.org All nameservers failed to answer the query.
[23:49:55] PROBLEM - wiki.tmyt105.leyhp.com - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolutionHTTP CRITICAL - Unable to open TCP socket
[23:51:18] RECOVERY - wiki.stag.lol - reverse DNS on sslhost is OK: SSL OK - wiki.stag.lol reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:52:22] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - puritwiki.p-e.kr All nameservers failed to answer the query.
[23:53:44] PROBLEM - wiki.joust.ro - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolutionHTTP CRITICAL - Unable to open TCP socket
[23:53:45] PROBLEM - wiki.eggsdstudios.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.eggsdstudios.com All nameservers failed to answer the query.
[23:53:53] PROBLEM - wiki.wikimedia.cat - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolutionHTTP CRITICAL - Unable to open TCP socket
[23:54:01] RECOVERY - wiki.tulpa.info - reverse DNS on sslhost is OK: SSL OK - wiki.tulpa.info reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:55:00] PROBLEM - gufengcheng.top - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - gufengcheng.top All nameservers failed to answer the query.
[23:55:46] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.59, 20.03, 17.26
[23:56:22] RECOVERY - wiki.cubestudios.xyz - reverse DNS on sslhost is OK: SSL OK - wiki.cubestudios.xyz reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:56:51] PROBLEM - wiki.joust.ro - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.joust.ro All nameservers failed to answer the query.
[23:57:26] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query.
[23:57:28] RECOVERY - ao90.pinho.org - reverse DNS on sslhost is OK: SSL OK - ao90.pinho.org reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:57:29] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.01, 22.10, 20.75
[23:57:45] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[23:57:46] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.25, 18.76, 17.12
[23:58:16] PROBLEM - wiki.thesimswiki.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.thesimswiki.com All nameservers failed to answer the query.
[23:58:46] RECOVERY - legacygt.wiki - reverse DNS on sslhost is OK: SSL OK - legacygt.wiki reverse DNS resolves to cp36.wikitide.net - CNAME OK
[23:58:50] RECOVERY - data.nonbinary.wiki - reverse DNS on sslhost is OK: SSL OK - data.nonbinary.wiki reverse DNS resolves to cp36.wikitide.net - CNAME OK