[00:02:49] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[00:13:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.98, 20.65, 23.72
[00:14:30] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:15:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.07, 22.00, 23.80
[00:16:24] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[00:21:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.42, 22.48, 23.71
[00:25:52] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:27:57] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 6.762 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[00:33:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.01, 22.66, 22.47
[00:36:22] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[00:36:47] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:38:42] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[00:44:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[00:44:49] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.273 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[00:51:26] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[00:53:26] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures
[00:54:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[00:57:33] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[01:00:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:01:43] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.831 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[01:05:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:10:20] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:23:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.97, 3.17, 3.88
[01:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.33, 3.72, 3.74
[01:32:35] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.70, 19.14, 17.08
[01:34:35] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.85, 19.80, 17.60
[01:35:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 3.16, 2.99, 3.40
[01:38:39] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[01:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.87, 5.09, 4.20
[01:40:38] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.065 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[01:54:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[01:59:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[02:03:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[02:08:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[02:20:20] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.
[02:26:48] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:28:42] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[02:29:26] [puppet] The-Voidwalker created branch The-Voidwalker-patch-1 - https://github.com/miraheze/puppet
[02:29:28] [puppet] The-Voidwalker pushed 1 commit to The-Voidwalker-patch-1 [+0/-0/±1] https://github.com/miraheze/puppet/commit/d8a21803f976
[02:29:31] [puppet] The-Voidwalker d8a2180 - port some cloudflare rules to varnish nginx
[02:30:45] [puppet] The-Voidwalker opened pull request #3900: port some cloudflare rules to varnish nginx - https://github.com/miraheze/puppet/pull/3900
[02:30:50] [puppet] coderabbitai[bot] commented on pull request #3900: port some cloudflare rules to varnish nginx - https://github.com/miraheze/puppet/pull/3900#issuecomment-2308627533
[02:37:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[02:37:55] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:39:16] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.062 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[02:39:54] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[02:47:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[02:47:44] RECOVERY - jobchron171 APT on jobchron171 is OK: APT OK: 61 packages available for upgrade (0 critical updates).
[02:48:05] RECOVERY - eventgate181 APT on eventgate181 is OK: APT OK: 44 packages available for upgrade (0 critical updates).
[02:48:17] RECOVERY - cp36 APT on cp36 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:48:42] RECOVERY - cp37 APT on cp37 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:48:46] RECOVERY - bots171 APT on bots171 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[02:48:52] RECOVERY - db161 APT on db161 is OK: APT OK: 58 packages available for upgrade (0 critical updates).
[02:49:03] RECOVERY - graphite151 APT on graphite151 is OK: APT OK: 39 packages available for upgrade (0 critical updates).
[02:49:06] RECOVERY - db151 APT on db151 is OK: APT OK: 59 packages available for upgrade (0 critical updates).
[02:49:17] RECOVERY - bast161 APT on bast161 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:49:32] RECOVERY - cloud16 APT on cloud16 is OK: APT OK: 101 packages available for upgrade (0 critical updates).
[02:49:49] RECOVERY - matomo151 APT on matomo151 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[02:49:50] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[02:49:59] RECOVERY - cloud15 APT on cloud15 is OK: APT OK: 109 packages available for upgrade (0 critical updates).
[02:50:02] RECOVERY - db171 APT on db171 is OK: APT OK: 58 packages available for upgrade (0 critical updates).
[02:50:33] RECOVERY - db182 APT on db182 is OK: APT OK: 59 packages available for upgrade (0 critical updates).
[02:50:40] RECOVERY - graylog161 APT on graylog161 is OK: APT OK: 55 packages available for upgrade (0 critical updates).
[02:51:42] RECOVERY - db181 APT on db181 is OK: APT OK: 59 packages available for upgrade (0 critical updates).
[02:51:44] RECOVERY - cloud17 APT on cloud17 is OK: APT OK: 101 packages available for upgrade (0 critical updates).
[02:51:56] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 35 minutes ago with 0 failures
[02:52:10] RECOVERY - ldap171 APT on ldap171 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:52:17] RECOVERY - bast181 APT on bast181 is OK: APT OK: 52 packages available for upgrade (0 critical updates).
[02:52:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[02:52:51] RECOVERY - kafka181 APT on kafka181 is OK: APT OK: 37 packages available for upgrade (0 critical updates).
[02:53:00] RECOVERY - cp26 APT on cp26 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:53:16] RECOVERY - cp27 APT on cp27 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:53:26] RECOVERY - mem151 APT on mem151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:53:29] RECOVERY - cloud18 APT on cloud18 is OK: APT OK: 102 packages available for upgrade (0 critical updates).
[02:53:59] RECOVERY - mem161 APT on mem161 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:54:28] RECOVERY - cp41 APT on cp41 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:54:32] RECOVERY - mw151 APT on mw151 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:54:52] RECOVERY - mwtask171 APT on mwtask171 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[02:54:58] RECOVERY - mon181 APT on mon181 is OK: APT OK: 57 packages available for upgrade (0 critical updates).
[02:54:58] RECOVERY - os162 APT on os162 is OK: APT OK: 53 packages available for upgrade (0 critical updates).
[02:55:11] RECOVERY - puppet181 APT on puppet181 is OK: APT OK: 58 packages available for upgrade (0 critical updates).
[02:55:11] RECOVERY - ns1 APT on ns1 is OK: APT OK: 50 packages available for upgrade (0 critical updates).
[02:55:44] RECOVERY - os161 APT on os161 is OK: APT OK: 52 packages available for upgrade (0 critical updates).
[02:55:59] RECOVERY - mw162 APT on mw162 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:56:15] RECOVERY - mw161 APT on mw161 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:56:18] RECOVERY - phorge171 APT on phorge171 is OK: APT OK: 53 packages available for upgrade (0 critical updates).
[02:56:29] RECOVERY - mw152 APT on mw152 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:56:38] RECOVERY - os151 APT on os151 is OK: APT OK: 53 packages available for upgrade (0 critical updates).
[02:56:58] RECOVERY - mw171 APT on mw171 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:57:34] RECOVERY - reports171 APT on reports171 is OK: APT OK: 63 packages available for upgrade (0 critical updates).
[02:57:46] RECOVERY - mw181 APT on mw181 is OK: APT OK: 65 packages available for upgrade (0 critical updates).
[02:57:51] RECOVERY - rdb151 APT on rdb151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:58:10] RECOVERY - mw172 APT on mw172 is OK: APT OK: 64 packages available for upgrade (0 critical updates).
[02:58:12] RECOVERY - mwtask181 APT on mwtask181 is OK: APT OK: 62 packages available for upgrade (0 critical updates).
[02:58:17] !log [void@puppet181] Upgraded packages libaom3 on swiftobject151
[02:58:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:58:27] RECOVERY - cp51 APT on cp51 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:58:31] !log [void@puppet181] Upgraded packages libaom3 on test151
[02:58:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:58:40] RECOVERY - mw182 APT on mw182 is OK: APT OK: 65 packages available for upgrade (0 critical updates).
[02:58:47] !log [void@puppet181] Upgraded packages libaom3 on swiftac171
[02:58:53] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:59:01] !log [void@puppet181] Upgraded packages libaom3 on swiftobject171
[02:59:05] RECOVERY - swiftobject171 APT on swiftobject171 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:59:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:59:12] RECOVERY - swiftac171 APT on swiftac171 is OK: APT OK: 40 packages available for upgrade (0 critical updates).
[02:59:15] !log [void@puppet181] Upgraded packages libaom3 on swiftproxy161
[02:59:21] RECOVERY - swiftobject151 APT on swiftobject151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:59:21] RECOVERY - swiftobject161 APT on swiftobject161 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[02:59:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:59:31] !log [void@puppet181] Upgraded packages libaom3 on swiftobject181
[02:59:32] !log [void@bots171] restart irclogserverbot
[02:59:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:59:38] RECOVERY - swiftobject181 APT on swiftobject181 is OK: APT OK: 52 packages available for upgrade (0 critical updates).
[02:59:42] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[02:59:44] RECOVERY - swiftproxy161 APT on swiftproxy161 is OK: APT OK: 40 packages available for upgrade (0 critical updates).
[02:59:46] !log [void@puppet181] Upgraded packages libaom3 on swiftproxy171
[02:59:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:00:03] RECOVERY - test151 APT on test151 is OK: APT OK: 74 packages available for upgrade (0 critical updates).
[03:00:46] RECOVERY - mon181 Backups Grafana on mon181 is OK: FILE_AGE OK: /var/log/grafana-backup.log is 23 seconds old and 93 bytes
[03:01:13] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/87d37448b8d3...775f3f7607f8
[03:01:14] [ssl] WikiTideSSLBot 775f3f7 - Bot: Update SSL cert for wiki.tdrweb.top
[03:01:31] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:01:39] RECOVERY - swiftproxy171 APT on swiftproxy171 is OK: APT OK: 54 packages available for upgrade (0 critical updates).
[03:01:46] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:03:02] !log [void@puppet181] Upgraded packages libaom3 on prometheus151
[03:03:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[03:03:14] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/775f3f7607f8...b1719ff9288a
[03:03:17] [ssl] WikiTideSSLBot b1719ff - Bot: Update SSL cert for project-patterns.com
[03:03:29] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.074 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[03:03:40] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[03:07:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:07:55] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[03:07:57] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[03:08:30] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[03:09:57] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[03:09:58] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 4.112 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[03:12:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:17:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:20:17] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - puritwiki.p-e.kr All nameservers failed to answer the query.
[03:25:39] RECOVERY - project-patterns.com - LetsEncrypt on sslhost is OK: OK - Certificate 'project-patterns.com' will expire on Sat 23 Nov 2024 02:04:38 AM GMT +0000.
[03:26:48] RECOVERY - www.project-patterns.com - LetsEncrypt on sslhost is OK: OK - Certificate 'project-patterns.com' will expire on Sat 23 Nov 2024 02:04:38 AM GMT +0000.
[03:30:02] RECOVERY - wiki.tdrweb.top - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.tdrweb.top' will expire on Sat 23 Nov 2024 02:02:36 AM GMT +0000.
[03:32:25] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:43:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.75, 2.96, 3.75
[03:49:07] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/commit/0d7eb3a33f8a
[03:49:09] [ImportDump] Universal-Omega 0d7eb3a - Don't rerun a job that is already completed.
[03:49:11] [ImportDump] Universal-Omega created branch fix-completed-job-duplication - https://github.com/miraheze/ImportDump
[03:49:14] [ImportDump] Universal-Omega opened pull request #118: Don't rerun a job that is already completed. - https://github.com/miraheze/ImportDump/pull/118
[03:49:22] [ImportDump] coderabbitai[bot] commented on pull request #118: Don't rerun a job that is already completed. - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399
[03:49:34] [ImportDump] Universal-Omega edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118
[03:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.47, 3.46, 3.66
[03:50:17] [ImportDump] coderabbitai[bot] edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118
[03:50:59] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/compare/0d7eb3a33f8a...364bbad8da0c
[03:51:00] [ImportDump] Universal-Omega 364bbad - Add fallback check
[03:51:03] [ImportDump] Universal-Omega synchronize pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118
[03:51:40] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399
[03:51:42] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399
[03:52:27] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[03:53:39] [ImportDump] coderabbitai[bot] edited pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118
[03:53:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.42, 3.23, 3.50
[03:55:35] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399
[03:55:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.77, 2.85, 3.32
[03:59:23] [ImportDump] Universal-Omega pushed 1 commit to fix-completed-job-duplication [+0/-0/±1] https://github.com/miraheze/ImportDump/compare/364bbad8da0c...b37d1aea3b21
[03:59:24] [ImportDump] Universal-Omega b37d1ae - Also add a notify fallback
[03:59:25] [ImportDump] Universal-Omega synchronize pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118
[03:59:40] [ImportDump] coderabbitai[bot] edited a comment on pull request #118: Don't rerun a job that is already completed - https://github.com/miraheze/ImportDump/pull/118#issuecomment-2308642399
[04:00:51] PROBLEM - wiki.corgicam.tv - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.
[04:03:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.83, 22.65, 23.91
[04:04:54] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.41, 3.91, 3.73
[04:05:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.85, 23.29, 23.98
[04:06:51] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.45, 4.88, 4.10
[04:07:36] PROBLEM - wiki.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket
[04:08:16] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[04:08:43] PROBLEM - wiki.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for wiki.graalmilitary.com could not be found
[04:09:07] miraheze/ImportDump - Universal-Omega the build passed.
[04:10:24] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 9.007 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[04:12:42] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.89, 3.81, 3.95
[04:16:35] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.21, 4.10, 4.01
[04:17:27] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:18:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.76, 3.34, 3.73
[04:18:43] PROBLEM - sarovia.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for sarovia.graalmilitary.com could not be found
[04:18:55] PROBLEM - pt.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket
[04:20:30] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.26, 3.92, 3.89
[04:21:25] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:21:55] PROBLEM - pt.graalmilitary.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for pt.graalmilitary.com could not be found
[04:23:18] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[04:24:23] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.58, 3.81, 3.88
[04:28:19] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.54, 3.87, 3.87
[04:29:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:30:49] PROBLEM - wiki.corgicam.tv - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.corgicam.tv All nameservers failed to answer the query.
[04:32:13] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.20, 3.70, 3.84
[04:34:10] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.78, 4.45, 4.10
[04:35:28] PROBLEM - sarovia.graalmilitary.com - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket
[04:35:36] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[04:36:07] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.30, 3.74, 3.90
[04:37:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.46, 22.25, 23.89
[04:38:04] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.76, 4.13, 4.02
[04:39:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.32, 23.46, 24.14
[04:40:01] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.14, 3.50, 3.81
[04:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.65, 4.35, 4.10
[04:47:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.57, 22.45, 23.55
[04:48:25] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[04:49:16] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[04:51:16] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 1.461 second response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[04:51:20] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[04:51:46] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[04:56:46] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:01:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 29.25, 24.68, 23.47
[05:03:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.12, 23.15, 23.04
[05:03:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:05:22] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:08:14] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[05:08:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:09:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.19, 23.42, 23.04
[05:09:23] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[05:10:13] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.071 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[05:10:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:11:15] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.96, 22.68, 22.78
[05:15:15] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.83, 23.50, 22.94
[05:15:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:35:54] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:35:56] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:39:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:39:28] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[05:39:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:40:53] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 22 minutes ago with 0 failures
[05:40:54] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates).
[05:41:34] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0)
[05:43:37] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 2.270 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[05:44:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[06:10:52] PROBLEM - swiftac171 PowerDNS Recursor on swiftac171 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[06:11:17] PROBLEM - swiftac171 APT on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[06:12:19] PROBLEM - swiftac171 Current Load on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[06:12:21] PROBLEM - swiftac171 Puppet on swiftac171 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[06:13:06] RECOVERY - swiftac171 PowerDNS Recursor on swiftac171 is OK: DNS OK: 0.392 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[06:13:18] RECOVERY - swiftac171 APT on swiftac171 is OK: APT OK: 40 packages available for upgrade (0 critical updates).
[06:13:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.43, 3.05, 3.86
[06:14:20] RECOVERY - swiftac171 Puppet on swiftac171 is OK: OK: Puppet is currently enabled, last run 19 minutes ago with 0 failures
[06:18:03] PROBLEM - swiftac171 Current Load on swiftac171 is WARNING: LOAD WARNING - total load average: 3.40, 10.29, 8.32
[06:19:57] RECOVERY - swiftac171 Current Load on swiftac171 is OK: LOAD OK - total load average: 2.47, 7.78, 7.63
[06:20:20] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.
[06:21:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.31, 3.22, 3.58 [06:22:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:23:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.32, 3.49, 3.66 [06:29:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.61, 2.82, 3.28 [06:32:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:33:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:45:44] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.42, 5.18, 3.92 [06:48:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:50:17] PROBLEM - puritwiki.p-e.kr - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - puritwiki.p-e.kr All nameservers failed to answer the query. [06:51:21] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:53:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:53:20] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:57:26] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query. [07:08:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:24:45] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.99, 3.23, 3.95 [07:26:12] RECOVERY - wiki.thesimswiki.com - reverse DNS on sslhost is OK: SSL OK - wiki.thesimswiki.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:27:07] RECOVERY - wiki.tmyt105.leyhp.com - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.tmyt105.leyhp.com' will expire on Thu 12 Sep 2024 12:27:49 PM GMT +0000. [07:28:38] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.81, 4.03, 4.08 [07:29:01] RECOVERY - wiki.sheepservermc.net - reverse DNS on sslhost is OK: SSL OK - wiki.sheepservermc.net reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:17] RECOVERY - wiki.mobilityengineer.com - reverse DNS on sslhost is OK: SSL OK - wiki.mobilityengineer.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:27] RECOVERY - puritwiki.p-e.kr - LetsEncrypt on sslhost is OK: OK - Certificate 'puritwiki.p-e.kr' will expire on Sat 19 Oct 2024 02:09:03 PM GMT +0000. 
[07:29:40] RECOVERY - wiki.tmyt105.leyhp.com - reverse DNS on sslhost is OK: SSL OK - wiki.tmyt105.leyhp.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:29:53] RECOVERY - wiki.joust.ro - reverse DNS on sslhost is OK: SSL OK - wiki.joust.ro reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:30:33] RECOVERY - wiki.nowchess.org - reverse DNS on sslhost is OK: SSL OK - wiki.nowchess.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:30:46] RECOVERY - wiki.corgicam.tv - reverse DNS on sslhost is OK: SSL OK - wiki.corgicam.tv reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:32:36] RECOVERY - wiki.astralprojections.org - reverse DNS on sslhost is OK: SSL OK - wiki.astralprojections.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:33:59] RECOVERY - wiki.villagecollaborative.net - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.villagecollaborative.net' will expire on Wed 09 Oct 2024 09:20:12 PM GMT +0000. [07:35:42] RECOVERY - wiki.digitalcandela.com - reverse DNS on sslhost is OK: SSL OK - wiki.digitalcandela.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:37:50] RECOVERY - wiki.yahia.xyz - reverse DNS on sslhost is OK: SSL OK - wiki.yahia.xyz reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:00] RECOVERY - tno.wiki - reverse DNS on sslhost is OK: SSL OK - tno.wiki reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:00] RECOVERY - wiki.junkstore.xyz - reverse DNS on sslhost is OK: SSL OK - wiki.junkstore.xyz reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:38:03] RECOVERY - wiki.ate42.ru - reverse DNS on sslhost is OK: SSL OK - wiki.ate42.ru reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:39:47] RECOVERY - wiki.wikimedia.cat - reverse DNS on sslhost is OK: SSL OK - wiki.wikimedia.cat reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:40:20] RECOVERY - www.kinitopedia.lol - reverse DNS on sslhost is OK: SSL OK - www.kinitopedia.lol reverse DNS resolves to 
cp36.wikitide.net - CNAME OK [07:42:07] RECOVERY - wiki.villagecollaborative.net - reverse DNS on sslhost is OK: SSL OK - wiki.villagecollaborative.net reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:42:23] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:44:22] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.106 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:45:30] RECOVERY - www.thesimswiki.com - reverse DNS on sslhost is OK: SSL OK - www.thesimswiki.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:47:18] RECOVERY - minescape.wiki - reverse DNS on sslhost is OK: SSL OK - minescape.wiki reverse DNS resolves to cp36.wikitide.net - CNAME FLAT [07:47:19] RECOVERY - vise.dayid.org - reverse DNS on sslhost is OK: SSL OK - vise.dayid.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:49:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:50:15] RECOVERY - puritwiki.p-e.kr - reverse DNS on sslhost is OK: SSL OK - puritwiki.p-e.kr reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:51:33] RECOVERY - wiki.joust.ro - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.joust.ro' will expire on Wed 09 Oct 2024 09:25:14 PM GMT +0000. [07:51:42] RECOVERY - wiki.wikimedia.cat - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.wikimedia.cat' will expire on Thu 10 Oct 2024 07:43:36 AM GMT +0000. [07:51:44] RECOVERY - wiki.eggsdstudios.com - reverse DNS on sslhost is OK: SSL OK - wiki.eggsdstudios.com reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:52:51] RECOVERY - gufengcheng.top - reverse DNS on sslhost is OK: SSL OK - gufengcheng.top reverse DNS resolves to cp36.wikitide.net - CNAME OK [07:54:20] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:57:27] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found [08:09:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.75, 2.94, 3.70 [08:13:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.94, 2.59, 3.38 [08:16:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:17:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.85, 4.20, 3.90 [08:19:20] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:24:51] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [08:29:06] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:29:20] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:31:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.068 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:34:20] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:44:20] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:57:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:57:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.70, 2.88, 3.78 [09:01:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.11, 3.60, 3.87 [09:04:35] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:09:52] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [09:10:51] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [09:11:52] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [09:19:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.03, 3.21, 3.85 [09:23:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.57, 3.51, 3.81 [09:25:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.34, 3.32, 3.73 [09:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.64, 3.64, 3.78 [09:29:45] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.08, 3.76, 3.85 [09:33:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.76, 4.29, 4.04 [09:35:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.97, 3.76, 3.91 [09:37:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 3.53, 4.06, 4.01 [09:38:10] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.069 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [09:42:02] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.33, 19.08, 18.02 [09:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.97, 3.68, 3.87 [09:44:01] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.54, 17.63, 17.60 [09:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.56, 3.88, 3.89 [09:46:42] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [09:47:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.92, 3.24, 3.64 [09:49:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.84, 4.32, 4.00 [09:53:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.34, 3.82, 3.92 [09:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.25, 4.13, 4.00 [09:57:54] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.12, 21.32, 19.59 [09:58:20] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [09:59:53] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.80, 19.92, 19.33 [10:02:27] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.197 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [10:11:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.80, 3.21, 3.98 [10:13:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.62, 4.26, 4.28 [10:15:32] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [10:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.34, 3.32, 3.76 [10:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.32, 3.85, 3.85 [10:33:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.26, 3.60, 3.79 [10:35:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.16, 4.15, 3.96 [10:37:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.94, 3.58, 3.80 [10:40:01] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.19, 4.38, 4.05 [10:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.24, 3.74, 3.87 [10:42:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.79, 4.16, 4.00 [10:45:59] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.52, 3.67, 3.83 [10:48:01] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.76, 4.74, 4.20 [10:49:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.66, 4.00, 
3.98 [10:51:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.20, 3.99, 3.97 [10:55:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:55:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.88, 3.52, 3.84 [10:59:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.22, 4.11, 3.98 [11:00:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:01:00] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:02:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:03:04] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [11:07:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:11:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.79, 3.25, 3.85 [11:13:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:17:20] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 21.48, 19.10 [11:18:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:19:19] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.11, 21.37, 19.32 [11:21:19] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.29, 20.40, 19.23 [11:21:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.08, 3.73, 3.72 [11:23:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.53, 3.35, 3.59 [11:25:58] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.69, 2.75, 3.34 [11:29:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 8.28, 5.33, 4.22 [11:33:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.01, 3.55, 3.74 [11:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.16, 3.55, 3.62 [11:41:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.03, 3.17, 3.47 [11:43:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:45:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.02, 3.54, 3.56 [11:48:52] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[11:52:23] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:54:16] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [12:03:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.77, 3.17, 3.93 [12:05:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.21, 3.86, 4.10 [12:09:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.34, 3.44, 3.86 [12:13:30] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:17:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.26, 4.19, 3.83 [12:21:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.91, 3.66, 3.77 [12:23:30] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:25:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.26, 3.74, 3.72 [12:27:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.80, 3.22, 3.53 [12:29:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.69, 4.18, 3.85 [12:31:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.19, 3.46, 3.61 [12:35:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.80, 4.93, 4.10 [12:39:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.61, 3.56, 3.76 [12:43:57] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.27, 3.56, 3.71 [12:45:57] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.00, 3.24, 3.57 [12:47:12] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [12:47:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.04, 2.87, 3.40 [12:51:58] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.11, 4.25, 3.82 [12:53:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [12:55:48] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[12:57:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [12:57:35] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:58:12] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [12:58:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:03:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:08:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:09:47] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [13:09:59]