[00:01:51] https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php70-docker/3761/console npm stalled like 10 minutes ago :o [00:03:14] 10Continuous-Integration-Infrastructure, 10Quibble: npm install stalled as part of quibble - https://phabricator.wikimedia.org/T195641#4233244 (10Legoktm) [00:03:59] 10Continuous-Integration-Infrastructure, 10Quibble: npm install stalled as part of quibble - https://phabricator.wikimedia.org/T195641#4233244 (10Legoktm) [00:21:07] PROBLEM - SSH on integration-slave-docker-1014 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:25:56] RECOVERY - SSH on integration-slave-docker-1014 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0) [00:31:17] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [00:39:04] (03PS1) 10Legoktm: doc: Add XMPReader [integration/docroot] - 10https://gerrit.wikimedia.org/r/435341 [00:39:39] (03CR) 10Legoktm: [C: 032] doc: Add XMPReader [integration/docroot] - 10https://gerrit.wikimedia.org/r/435341 (owner: 10Legoktm) [00:40:18] (03Merged) 10jenkins-bot: doc: Add XMPReader [integration/docroot] - 10https://gerrit.wikimedia.org/r/435341 (owner: 10Legoktm) [00:40:38] (03CR) 10jenkins-bot: doc: Add XMPReader [integration/docroot] - 10https://gerrit.wikimedia.org/r/435341 (owner: 10Legoktm) [11:00:07] PROBLEM - Puppet errors on deployment-snapshot01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [12:15:53] 10Phabricator (Upstream), 10Upstream: Phabricator does not let me to add tag to task and move it on tag's workboard at once - https://phabricator.wikimedia.org/T195638#4233476 (10Aklapper) p:05Triage>03Lowest [12:59:14] PROBLEM - Puppet staleness on deployment-puppetdb01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [43200.0] [13:01:41] PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 85.71% of data above the 
critical threshold [0.0] [13:04:14] RECOVERY - Puppet staleness on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [3600.0] [13:16:40] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0] [13:23:57] PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:24:22] 10Continuous-Integration-Infrastructure, 10Quibble: npm install stalled as part of quibble - https://phabricator.wikimedia.org/T195641#4233244 (10hashar) From https://status.npmjs.org/ / https://status.npmjs.org/incidents/t3j62lxb7jg3 > **Registry maintenance in progress** > > Monitoring - Current registry... [13:24:33] PROBLEM - Puppet errors on deployment-sca04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [13:29:45] PROBLEM - Puppet errors on deployment-certcentral is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:30:13] 10Beta-Cluster-Infrastructure, 10Puppet, 10Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#4233678 (10Krenair) [13:30:16] 10Beta-Cluster-Infrastructure, 10Operations, 10Puppet: Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76) - https://phabricator.wikimedia.org/T187736#4233675 (10Krenair) 05Open>03Resolved a:03Krenair It's back and Puppet is behaving. 
[13:33:56] RECOVERY - Puppet errors on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0] [13:39:46] RECOVERY - Puppet errors on deployment-certcentral is OK: OK: Less than 1.00% above the threshold [0.0] [13:51:59] RECOVERY - Puppet errors on deployment-certcentral-testclient is OK: OK: Less than 1.00% above the threshold [0.0] [14:04:33] RECOVERY - Puppet errors on deployment-sca04 is OK: OK: Less than 1.00% above the threshold [0.0] [14:08:58] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:09:28] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:09:40] PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:09:46] PROBLEM - Puppet errors on deployment-cumin is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:10:04] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:11:08] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:11:10] PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:12:19] PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:12:47] PROBLEM - Puppet errors on deployment-imagescaler02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:12:59] PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:13:01] PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:13:25] ^ me [14:14:14] PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: 
CRITICAL: 33.33% of data above the critical threshold [0.0] [14:14:14] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:14:14] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:14:56] PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:15:08] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:22:27] PROBLEM - Puppet errors on deployment-dumps-puppetmaster is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:22:59] Krenair I run puppetdb in cloud using stretch :) [14:23:11] yeah but we've got jessie [14:23:23] are you using role::puppetmaster::standalone? [14:23:34] Yep I think [14:23:44] yeah [14:24:00] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [14:24:13] But I’m running it on the same instance as puppetmaster [14:27:50] 10Project-Admins: Create two projects: Wikimedia-Technical-Conference-2018 & Wikimedia-Technical-Conference-2018-Organization - https://phabricator.wikimedia.org/T195398#4233749 (10Aklapper) [14:27:52] 10Project-Admins, 10Developer-Relations (Apr-Jun-2018): Create 4 event projects for Wikimania Hackathon 2018 and 2018 Wikimedia Technical Conference - https://phabricator.wikimedia.org/T191372#4233752 (10Aklapper) [14:40:00] RECOVERY - Puppet errors on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:44:46] RECOVERY - Puppet errors on deployment-cumin is OK: OK: Less than 1.00% above the threshold [0.0] [14:46:06] RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0] [14:47:58] RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0] [14:49:11] RECOVERY - 
Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [14:49:25] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0] [14:49:39] RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0] [14:50:05] RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [14:50:09] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0] [14:51:09] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0] [14:52:21] RECOVERY - Puppet errors on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [14:52:48] RECOVERY - Puppet errors on deployment-imagescaler02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:53:00] RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0] [14:54:14] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:54:14] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0] [14:57:29] RECOVERY - Puppet errors on deployment-dumps-puppetmaster is OK: OK: Less than 1.00% above the threshold [0.0] [15:14:26] 10Continuous-Integration-Infrastructure, 10Quibble: npm install stalled as part of quibble - https://phabricator.wikimedia.org/T195641#4233845 (10Legoktm) 05Open>03Resolved a:03Legoktm Ooh, I should have thought to check that, nice :) Yep, I did not see any other failures, and libraryupgrader triggered... 
[15:55:18] PROBLEM - Puppet errors on deployment-puppetmaster03 is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0] [16:05:18] RECOVERY - Puppet errors on deployment-puppetmaster03 is OK: OK: Less than 1.00% above the threshold [0.0] [16:21:19] PROBLEM - Puppet errors on deployment-puppetmaster03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [16:30:00] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:09:59] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [18:18:54] 10Beta-Cluster-Infrastructure: en-rtl listed under other projects on beta.wmflabs.org - https://phabricator.wikimedia.org/T195675#4234027 (10Dvorapa) [19:41:52] PROBLEM - SSH on integration-slave-docker-1020 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:43] RECOVERY - SSH on integration-slave-docker-1020 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0) [20:43:51] PROBLEM - Puppet errors on integration-slave-jessie-android is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [20:56:20] RECOVERY - Puppet errors on deployment-puppetmaster03 is OK: OK: Less than 1.00% above the threshold [0.0] [21:06:00] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:08:22] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:16:16] 10MediaWiki-Codesniffer: Warn if using + to concat strings - https://phabricator.wikimedia.org/T195683#4234171 (10Reedy) [21:20:58] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [21:31:58] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [21:42:19] PROBLEM - Puppet errors on deployment-puppetmaster03 is CRITICAL: CRITICAL: 66.67% of 
data above the critical threshold [0.0] [21:56:42] 10Beta-Cluster-Infrastructure: Move puppetmaster to Stretch - https://phabricator.wikimedia.org/T195686#4234225 (10Krenair) [22:02:58] PROBLEM - Puppet errors on deployment-certcentral-testclient is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:12:40] PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:15:12] PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:15:47] PROBLEM - Puppet errors on deployment-certcentral is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:16:05] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:16:07] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:17:07] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:17:45] PROBLEM - Puppet errors on deployment-maps03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:18:22] PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:18:29] sigh [22:18:47] PROBLEM - Puppet errors on deployment-imagescaler02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:19:01] PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:19:01] PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:19:13] PROBLEM - Puppet errors on deployment-cassandra3-02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:19:15] rip [22:19:35] (03PS1) 10Legoktm: Sync full clover.xml.bz2 to 
doc.wikimedia.org [integration/config] - 10https://gerrit.wikimedia.org/r/435660 [22:20:05] (03CR) 10Legoktm: [C: 032] Sync full clover.xml.bz2 to doc.wikimedia.org [integration/config] - 10https://gerrit.wikimedia.org/r/435660 (owner: 10Legoktm) [22:20:14] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:20:14] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:20:58] PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:22:09] (03Merged) 10jenkins-bot: Sync full clover.xml.bz2 to doc.wikimedia.org [integration/config] - 10https://gerrit.wikimedia.org/r/435660 (owner: 10Legoktm) [22:22:12] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:22:24] PROBLEM - Puppet errors on deployment-ircd is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:22:56] PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:22:58] PROBLEM - Puppet errors on deployment-sentry01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:23:26] PROBLEM - Puppet errors on deployment-dumps-puppetmaster is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:24:03] PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:24:37] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:24:39] PROBLEM - Puppet errors on deployment-apertium02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:24:59] PROBLEM - Puppet errors on deployment-memc05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold 
[0.0] [22:25:17] PROBLEM - Puppet errors on deployment-chromium01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:25:34] PROBLEM - Puppet errors on deployment-sca04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:26:19] PROBLEM - Puppet errors on deployment-mediawiki-09 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:27:39] PROBLEM - Puppet errors on deployment-jobrunner03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:28:53] PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:29:15] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:29:50] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:29:56] I think I know what I did [22:29:57] probably [22:30:02] :| [22:30:30] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:30:34] PROBLEM - Puppet errors on deployment-mcs01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:30:53] PROBLEM - Puppet errors on deployment-ores01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:32:22] PROBLEM - Puppet errors on deployment-pdfrender02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:32:26] PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:32:34] PROBLEM - Puppet errors on deployment-poolcounter04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:33:08] PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [22:33:50] ok [22:33:58] PROBLEM - Puppet errors on 
deployment-mediawiki-07 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [22:34:03] when moving puppetdb to a different puppetmaster [22:34:05] PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:34:11] PROBLEM - Puppet errors on deployment-prometheus01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:34:11] PROBLEM - Puppet errors on deployment-kafka-main-2 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:34:15] remember to stop the current puppetmaster connecting to it first [22:34:43] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:34:45] PROBLEM - Puppet errors on deployment-redis06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:35:02] and ensure the puppetmaster's /etc/puppet/routes.yaml that puppetmaster::puppetdb::client would've put there is gone [22:35:03] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:35:08] otherwise this happens [22:35:17] PROBLEM - Puppet errors on deployment-deploy1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [22:35:22] PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:35:24] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:35:26] PROBLEM - Puppet errors on deployment-memc07 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [22:37:21] and it's not pretty [22:37:22] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:37:54] PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold 
[0.0] [22:38:02] PROBLEM - Puppet errors on deployment-kafka-main-1 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [22:38:07] wat [22:39:09] yeah [22:39:25] PROBLEM - Puppet errors on deployment-cpjobqueue is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:39:49] PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [22:46:00] RECOVERY - Puppet errors on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:47:06] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:48] RECOVERY - Puppet errors on deployment-imagescaler02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:03] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:13] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [22:50:47] RECOVERY - Puppet errors on deployment-certcentral is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:05] RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:09] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:57] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:40] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:52:46] RECOVERY - Puppet errors on deployment-maps03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:53:00] RECOVERY - Puppet errors on deployment-certcentral-testclient is OK: OK: Less than 1.00% above the threshold [0.0] [22:53:20] RECOVERY - Puppet errors on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:00] RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% 
above the threshold [0.0] [22:54:01] RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:15] RECOVERY - Puppet errors on deployment-cassandra3-02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:54:15] also to get the puppetdb server working on the new puppetmaster, service nginx restart after all the normal stuff is done [22:55:12] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0] [22:55:12] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:55:16] RECOVERY - Puppet errors on deployment-chromium01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:12] RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:18] RECOVERY - Puppet errors on deployment-puppetmaster03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:58] RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:57:59] RECOVERY - Puppet errors on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:27] RECOVERY - Puppet errors on deployment-dumps-puppetmaster is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:41] PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:58:59] PROBLEM - Puppet errors on deployment-certcentral-testclient is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0] [22:59:03] RECOVERY - Puppet errors on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:37] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:59:57] RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0] [23:01:48] PROBLEM - Puppet errors on deployment-certcentral is CRITICAL: 
CRITICAL: 60.00% of data above the critical threshold [0.0] [23:02:22] RECOVERY - Puppet errors on deployment-ircd is OK: OK: Less than 1.00% above the threshold [0.0] [23:02:36] RECOVERY - Puppet errors on deployment-jobrunner03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:03] RECOVERY - Puppet errors on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:15] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:04:41] RECOVERY - Puppet errors on deployment-apertium02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:31] RECOVERY - Puppet errors on deployment-mcs01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:05:33] RECOVERY - Puppet errors on deployment-sca04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:06:18] RECOVERY - Puppet errors on deployment-mediawiki-09 is OK: OK: Less than 1.00% above the threshold [0.0] [23:06:49] !log beta-mediawiki-config-update-eqiad jobs have been stuck in Zuul for 17 hours [23:06:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [23:07:24] RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:26] RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:07:34] RECOVERY - Puppet errors on deployment-poolcounter04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:52] RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:08:56] RECOVERY - Puppet errors on deployment-mediawiki-07 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:09] RECOVERY - Puppet errors on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:11] RECOVERY - Puppet errors on deployment-kafka-main-2 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:41] RECOVERY - Puppet errors on 
deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:47] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:09:52] !log Killed a bunch of stuck beta-mediawiki-config-update-eqiad jobs in Jenkins [23:09:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [23:10:21] RECOVERY - Puppet errors on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:23] RECOVERY - Puppet errors on deployment-memc07 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:29] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:52] RECOVERY - Puppet errors on deployment-ores01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:11:45] Project beta-scap-eqiad build #209211: 04FAILURE in 2 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209211/ [23:11:48] RECOVERY - Puppet errors on deployment-certcentral is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:22] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:54] RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0] [23:12:58] RECOVERY - Puppet errors on deployment-kafka-main-1 is OK: OK: Less than 1.00% above the threshold [0.0] [23:13:06] RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [23:13:39] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:14:23] RECOVERY - Puppet errors on deployment-cpjobqueue is OK: OK: Less than 1.00% above the threshold [0.0] [23:14:43] RECOVERY - Puppet errors on deployment-redis06 is OK: OK: Less than 1.00% above the threshold [0.0] [23:14:47] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:15:21] RECOVERY - 
Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:16:28] Project beta-scap-eqiad build #209212: 04STILL FAILING in 2 min 0 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209212/ [23:17:10] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure: operations-mw-config-composer-test-docker has composer version constraint regression - https://phabricator.wikimedia.org/T195688#4234284 (10Reedy) [23:18:20] PROBLEM - Puppet errors on deployment-puppetmaster03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [23:20:16] Project beta-update-databases-eqiad build #25740: 04FAILURE in 14 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/25740/ [23:25:49] Project beta-scap-eqiad build #209213: 04STILL FAILING in 1 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209213/ [23:28:58] RECOVERY - Puppet errors on deployment-certcentral-testclient is OK: OK: Less than 1.00% above the threshold [0.0] [23:32:49] Project mwext-phpunit-coverage-publish build #4841: 04FAILURE in 1 min 34 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/4841/ [23:35:47] Project beta-scap-eqiad build #209214: 04STILL FAILING in 2 min 1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209214/ [23:38:53] Project mwext-phpunit-coverage-publish build #4842: 04STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/4842/ [23:43:27] Yippee, build fixed! 
[23:43:27] Project mwext-phpunit-coverage-publish build #4843: 09FIXED in 56 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/4843/ [23:45:49] Project beta-scap-eqiad build #209215: 04STILL FAILING in 1 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209215/ [23:47:02] Project mwext-phpunit-coverage-publish build #4845: 04FAILURE in 24 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/4845/ [23:55:36] Project beta-scap-eqiad build #209216: 04STILL FAILING in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209216/ [23:58:20] RECOVERY - Puppet errors on deployment-puppetmaster03 is OK: OK: Less than 1.00% above the threshold [0.0]
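Editor's sketch, not part of the channel log: the notes posted between 22:34 and 22:54 describe an order of operations for moving puppetdb to a different puppetmaster. The snippet below only restates those notes as an ordered checklist. It runs nothing on the host and changes no files. The function name `puppetdb_move_plan` and the step wording are assumptions made for illustration; only the file path `/etc/puppet/routes.yaml`, the class name `puppetmaster::puppetdb::client`, and the command `service nginx restart` come from the log.

```python
from pathlib import Path
from typing import List


def puppetdb_move_plan(routes_file: Path) -> List[str]:
    """Return the steps from the log, in the order they were given.

    Hypothetical helper: it only builds a checklist, it does not
    stop services, delete files, or restart anything.
    """
    steps = ["stop the current puppetmaster connecting to puppetdb"]
    # The log says routes.yaml, as placed by puppetmaster::puppetdb::client,
    # must be gone from the old puppetmaster; list it only if still present.
    if routes_file.exists():
        steps.append(f"remove {routes_file}")
    steps.append("do the normal puppetdb setup on the new puppetmaster")
    # Final step quoted from the 22:54:15 message.
    steps.append("service nginx restart")
    return steps


if __name__ == "__main__":
    plan = puppetdb_move_plan(Path("/etc/puppet/routes.yaml"))
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step}")
```

Run on the old puppetmaster, this would list the routes.yaml removal only while that file still exists, which is the condition the 22:35 message warns about.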