[00:26:02] (PS1) Paladox: grant access to Javamelody Monitoring for ldap/ops [All-Projects] (refs/meta/config) - https://gerrit.wikimedia.org/r/508068
[00:26:17] thcipriani when you're around later, mind reviewing ^^ please? :)
[00:38:38] Gerrit, Release-Engineering-Team, Operations, serviceops: cobalt is experiencing frequent icmp/errs causing high threads in gerrit - https://phabricator.wikimedia.org/T222498 (Paladox)
[00:45:08] Gerrit, Release-Engineering-Team, Operations, serviceops: cobalt is experiencing frequent icmp/errs causing high threads in gerrit - https://phabricator.wikimedia.org/T222498 (Paladox) p:Triage→High
[01:03:45] PROBLEM - Puppet staleness on deployment-logstash2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[01:20:07] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[01:25:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.028 second response time
[01:31:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[01:41:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.026 second response time
[02:37:55] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<30.00%)
[05:30:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[06:05:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.551 second response time
[06:11:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[06:46:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.080 second response time
[06:52:51] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:47:03] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[08:15:34] Gerrit, Release-Engineering-Team, Operations, serviceops: cobalt is experiencing frequent icmp/errs causing high threads in gerrit - https://phabricator.wikimedia.org/T222498 (hashar) Open→Invalid Essentially it is the other way around ;) The TCP errors are due to client connections being...
[08:47:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.026 second response time
[08:53:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[09:53:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.270 second response time
[09:59:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[10:19:05] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.029 second response time
[10:35:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[10:50:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.023 second response time
[12:59:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[13:18:30] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[13:34:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.037 second response time
[13:45:02] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[14:05:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.195 second response time
[14:32:50] Continuous-Integration-Infrastructure: CI: upgrade tox or allow to override its version per-project - https://phabricator.wikimedia.org/T222512 (Volans)
[15:50:13] (PS1) Kosta Harlan: sonar-scanner: Adjust polling script [integration/config] - https://gerrit.wikimedia.org/r/508086 (https://phabricator.wikimedia.org/T218598)
[15:52:39] (CR) Kosta Harlan: "it looks like the first part of the polling (getting the analysis response to = SUCCESS) worked, but piping the quality gate response to J" [integration/config] - https://gerrit.wikimedia.org/r/508086 (https://phabricator.wikimedia.org/T218598) (owner: Kosta Harlan)
[15:53:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[15:58:03] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.036 second response time
[20:51:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[21:01:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.024 second response time
[21:22:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[21:47:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.027 second response time
[22:13:18] Release-Engineering-Team, Code-Health, Epic: [EPIC] Encourage developers to increase code coverage - https://phabricator.wikimedia.org/T100294 (Aklapper)