[00:02:55] (LogstashKafkaConsumerLag) firing: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [00:07:55] (LogstashKafkaConsumerLag) resolved: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [00:32:17] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2035 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [00:32:33] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4040 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [00:41:23] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6015 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [00:48:55] (LogstashKafkaConsumerLag) firing: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [00:53:55] (LogstashKafkaConsumerLag) resolved: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [00:57:53] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2031 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:13:23] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1078 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:17:11] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5011 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:20:35] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4045 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:24:51] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1085 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:25:09] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: monitor_refine_event.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [01:29:31] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4051 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:33:39] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2033 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:36:11] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4038 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:36:57] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3057 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:38:53] (JobUnavailable) firing: (8) Reduced availability for job nginx in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [01:43:41] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5006 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [01:48:53] (JobUnavailable) firing: (8) Reduced availability for job nginx in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [01:53:53] (JobUnavailable) firing: (9) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [02:08:53] (JobUnavailable) resolved: (5) Reduced availability for job gitaly in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable [02:14:11] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5003 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:19:57] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1081 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:20:07] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4050 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:24:27] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6006 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:36:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6005 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:47:01] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5004 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:52:53] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6008 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:53:21] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3065 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [02:55:25] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2041 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:07:21] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3055 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:11:47] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6001 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:11:57] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4039 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:26:53] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6009 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:27:15] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2027 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:39:01] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5016 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:44:53] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6011 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:47:45] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3061 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:51:37] PROBLEM - MediaWiki exceptions and fatals per minute for api_appserver on alert1001 is CRITICAL: 117 gt 100 https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=18&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops [03:52:43] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6004 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [03:53:31] RECOVERY - MediaWiki exceptions and fatals per minute for api_appserver on alert1001 is OK: (C)100 gt (W)50 gt 1 https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=18&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops [03:59:17] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2029 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:01:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1075 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:14:29] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3056 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:33:01] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6003 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:35:17] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5008 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:44:53] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2040 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:47:31] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1084 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:50:23] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3052 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:57:55] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5012 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:58:11] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4049 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:58:47] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5002 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [04:59:47] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5007 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:00:59] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3053 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:01:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6007 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:14:05] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4048 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:16:37] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1090 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:18:15] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3063 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:26:15] (MjolnirUpdateFailureRateExceedesThreshold) firing: Data shipping to CirrusSearch in eqiad is experiencing abnormal failure rates - TODO - https://grafana.wikimedia.org/d/000000591/elasticsearch-mjolnir-bulk-updates - https://alerts.wikimedia.org/?q=alertname%3DMjolnirUpdateFailureRateExceedesThreshold [05:28:39] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2042 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:31:15] (MjolnirUpdateFailureRateExceedesThreshold) resolved: Data shipping to CirrusSearch in eqiad is experiencing abnormal failure rates - TODO - https://grafana.wikimedia.org/d/000000591/elasticsearch-mjolnir-bulk-updates - https://alerts.wikimedia.org/?q=alertname%3DMjolnirUpdateFailureRateExceedesThreshold [05:40:29] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2039 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:48:41] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3050 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:51:43] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6013 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [05:55:59] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1089 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:08:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2028 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:23:25] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5009 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:27:05] (CR) Reedy: [C: -1] Add w/api/index.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/856030 (https://phabricator.wikimedia.org/T273179) (owner: Ladsgroup) [06:27:35] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4041 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:37:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6012 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:42:11] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2037 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [06:44:11] PROBLEM - Check systemd state on cumin2002 is CRITICAL: CRITICAL - degraded: The following units failed: httpbb_hourly_appserver.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [06:48:23] PROBLEM - Check unit status of httpbb_hourly_appserver on cumin2002 is CRITICAL: CRITICAL: Status of the systemd unit httpbb_hourly_appserver https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [07:00:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5005 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:00:55] (LogstashKafkaConsumerLag) firing: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [07:03:19] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5013 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:05:55] (LogstashKafkaConsumerLag) resolved: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [07:06:49] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6016 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:15:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4046 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:24:09] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3062 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:26:57] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4037 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:28:51] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1077 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:39:19] RECOVERY - Check systemd state on cumin2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [07:43:13] RECOVERY - Check unit status of httpbb_hourly_appserver on cumin2002 is OK: OK: Status of the systemd unit httpbb_hourly_appserver https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [07:48:31] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4042 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:52:33] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp5010 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:53:23] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp3060 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [07:53:54] SRE, Traffic, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (elukey) @ssingh Hi! I am seeing the following error for update-ocsp-all on various cp nodes: ` Nov 13 07:23:00 cp1077 update-ocsp-all[42808]: /usr/local/sbin/update-ocsp:52: DeprecationWarn... [07:55:33] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp1088 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [08:00:04] Deploy window No deploys all day! See Deployments/Emergencies if things are broken. 
(https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20221113T0800) [08:05:13] (PS1) Elukey: sslcert: add decode() after popen's communicate toupdate-ocsp.py [puppet] - https://gerrit.wikimedia.org/r/856126 (https://phabricator.wikimedia.org/T321309) [08:12:52] (PS2) Elukey: sslcert: add decode() after popen's communicate to update-ocsp.py [puppet] - https://gerrit.wikimedia.org/r/856126 (https://phabricator.wikimedia.org/T321309) [08:13:48] (PS3) Elukey: sslcert: add text=True to Popen for update-ocsp.py [puppet] - https://gerrit.wikimedia.org/r/856126 (https://phabricator.wikimedia.org/T321309) [08:24:05] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp4047 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-rsa-unified.ocsp is more than 259500 secs old! https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [08:26:33] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp2034 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2022-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [08:29:28] (PS1) Vgutierrez: haproxy: Remove digicert-2021 TLS material [puppet] - https://gerrit.wikimedia.org/r/856129 (https://phabricator.wikimedia.org/T313328) [08:31:28] (CR) Vgutierrez: [V: +1] "PCC SUCCESS (DIFF 2): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/38120/console" [puppet] - https://gerrit.wikimedia.org/r/856129 (https://phabricator.wikimedia.org/T313328) (owner: Vgutierrez) [08:36:32] (CR) Vgutierrez: [C: +2] sslcert: add text=True to Popen for update-ocsp.py [puppet] - https://gerrit.wikimedia.org/r/856126 (https://phabricator.wikimedia.org/T321309) (owner: Elukey) [08:36:45] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1075 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [08:36:59] (CR) Vgutierrez: [V: +1 C: +2] haproxy: Remove digicert-2021 TLS material [puppet] - https://gerrit.wikimedia.org/r/856129 (https://phabricator.wikimedia.org/T313328) (owner: Vgutierrez) [08:46:31] PROBLEM - Freshness of OCSP Stapling files -HAProxy- on cp6010 is CRITICAL: CRITICAL: File /var/cache/ocsp/digicert-2021-ecdsa-unified.ocsp is more than 259500 secs old! 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [08:52:01] (CirrusSearchJobQueueBacklogTooBig) firing: CirrusSearch job topic eqiad.mediawiki.job.cirrusSearchLinksUpdate is heavily backlogged with 214.5k messages - TODO - https://grafana.wikimedia.org/d/CbmStnlGk/jobqueue-job?orgId=1&var-dc=eqiad%20prometheus/k8s&var-job=cirrusSearchLinksUpdate - https://alerts.wikimedia.org/?q=alertname%3DCirrusSearchJobQueueBacklogTooBig [09:02:01] (CirrusSearchJobQueueBacklogTooBig) resolved: CirrusSearch job topic eqiad.mediawiki.job.cirrusSearchLinksUpdate is heavily backlogged with 203.9k messages - TODO - https://grafana.wikimedia.org/d/CbmStnlGk/jobqueue-job?orgId=1&var-dc=eqiad%20prometheus/k8s&var-job=cirrusSearchLinksUpdate - https://alerts.wikimedia.org/?q=alertname%3DCirrusSearchJobQueueBacklogTooBig [09:04:17] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2027 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:04:17] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2028 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:04:39] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2029 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:06:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2034 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:06:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2033 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:06:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2031 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:06:35] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2035 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:08:25] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2041 is OK: OK 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:08:25] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2039 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:08:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2040 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:08:35] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2042 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:08:35] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp2037 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:09:01] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6003 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:09:39] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6001 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:09:41] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6005 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:10:05] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6006 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:10:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6004 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:10:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6008 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:10:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6009 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:10:31] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6010 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:11:13] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6007 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:11:41] 
RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6012 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:12:03] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6013 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:12:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6015 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:12:29] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6011 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:13:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1078 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:13:41] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp6016 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:13:43] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1077 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:14:11] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1081 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:14:41] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1084 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:14:49] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1085 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:14:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1083 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:15:41] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1089 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:16:23] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1086 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:16:49] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1088 is OK: OK 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:16:59] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5004 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:17:07] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5002 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:17:37] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5006 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:17:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp1090 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:18:03] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5007 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:18:13] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5003 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:18:57] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5005 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:19:41] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5008 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:01] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5009 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:01] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5012 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:09] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5010 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:11] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5013 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:47] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3050 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:57] 
RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5016 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:20:57] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp5011 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:22:17] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3052 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:22:47] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3055 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:22:47] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3053 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:24:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3063 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:24:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3057 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:24:15] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3056 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:24:45] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3065 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:24:45] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3060 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:25:07] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3062 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:25:23] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4038 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:25:25] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp3061 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:25:57] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4037 is OK: OK 
https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:23] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4040 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:23] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4039 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:45] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4047 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:45] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4045 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4048 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4042 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4046 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:27:55] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4041 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:29:21] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4050 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:29:53] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4049 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [09:30:33] RECOVERY - Freshness of OCSP Stapling files -HAProxy- on cp4051 is OK: OK https://wikitech.wikimedia.org/wiki/HTTPS/Unified_Certificates [13:34:03] (ProbeDown) firing: Service centrallog2002:6514 has failed probes (tcp_rsyslog_receiver_ip6) - https://wikitech.wikimedia.org/wiki/TLS/Runbook#centrallog2002:6514 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - 
https://alerts.wikimedia.org/?q=alertname%3DProbeDown [13:39:03] (ProbeDown) resolved: Service centrallog2002:6514 has failed probes (tcp_rsyslog_receiver_ip6) - https://wikitech.wikimedia.org/wiki/TLS/Runbook#centrallog2002:6514 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [15:14:59] PROBLEM - mailman list info on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring [15:15:21] PROBLEM - mailman archives on lists1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring [15:16:53] RECOVERY - mailman list info on lists1001 is OK: HTTP OK: HTTP/1.1 200 OK - 8572 bytes in 2.235 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring [15:17:11] RECOVERY - mailman archives on lists1001 is OK: HTTP OK: HTTP/1.1 200 OK - 48974 bytes in 0.093 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring [15:43:17] PROBLEM - Check systemd state on cumin2002 is CRITICAL: CRITICAL - degraded: The following units failed: httpbb_hourly_appserver.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [15:48:25] PROBLEM - Check unit status of httpbb_hourly_appserver on cumin2002 is CRITICAL: CRITICAL: Status of the systemd unit httpbb_hourly_appserver https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [16:39:13] RECOVERY - Check systemd state on cumin2002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state [16:43:23] RECOVERY - Check unit status of httpbb_hourly_appserver on cumin2002 is OK: OK: Status of the systemd unit httpbb_hourly_appserver https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [18:23:54] (PS2) Ladsgroup: Add w/api/index.html [mediawiki-config] - 
https://gerrit.wikimedia.org/r/856030 (https://phabricator.wikimedia.org/T273179) [18:24:34] (PS3) Ladsgroup: Add w/api/index.html [mediawiki-config] - https://gerrit.wikimedia.org/r/856030 (https://phabricator.wikimedia.org/T273179) [18:24:47] (CR) Ladsgroup: Add w/api/index.html (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/856030 (https://phabricator.wikimedia.org/T273179) (owner: Ladsgroup) [20:02:55] (LogstashKafkaConsumerLag) firing: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [20:07:55] (LogstashKafkaConsumerLag) resolved: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [20:10:49] PROBLEM - SSH on mw1330.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [20:19:54] (PS1) Hashar: Add CI results to a tab [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/856182 [20:20:25] (CR) Hashar: "Redone as a standalone JavaScript plugin at https://gerrit.wikimedia.org/r/c/operations/software/gerrit/+/856182" [puppet] - https://gerrit.wikimedia.org/r/756685 (owner: Hashar) [20:20:31] (CR) CI reject: [V: -1] Add CI results to a tab [software/gerrit] (deploy/wmf/stable-3.4) - https://gerrit.wikimedia.org/r/856182 (owner: Hashar) [21:11:39] RECOVERY - SSH on mw1330.mgmt is OK: SSH OK - OpenSSH_7.0 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook [22:58:51] 
PROBLEM - Citoid LVS codfw on citoid.svc.codfw.wmnet is CRITICAL: /api (Zotero and citoid alive) timed out before a response was received https://wikitech.wikimedia.org/wiki/Citoid [22:59:55] (LogstashKafkaConsumerLag) firing: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag [23:00:47] RECOVERY - Citoid LVS codfw on citoid.svc.codfw.wmnet is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Citoid [23:04:55] (LogstashKafkaConsumerLag) resolved: Too many messages in kafka logging - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-codfw&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag