[01:06:40] Project beta-scap-eqiad build #247877: FAILURE in 2 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247877/
[01:16:00] Project beta-scap-eqiad build #247878: STILL FAILING in 1 min 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247878/
[01:26:15] Yippee, build fixed!
[01:26:15] Project beta-scap-eqiad build #247879: FIXED in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247879/
[01:36:10] Project beta-scap-eqiad build #247880: FAILURE in 1 min 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247880/
[01:46:09] Yippee, build fixed!
[01:46:10] Project beta-scap-eqiad build #247881: FIXED in 1 min 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247881/
[01:55:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[02:05:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.030 second response time
[02:11:03] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[02:16:16] Project beta-scap-eqiad build #247884: FAILURE in 1 min 55 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247884/
[02:26:15] Project beta-scap-eqiad build #247885: STILL FAILING in 1 min 52 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247885/
[02:36:24] Project beta-scap-eqiad build #247886: STILL FAILING in 2 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247886/
[02:46:06] Project beta-scap-eqiad build #247887: STILL FAILING in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247887/
[02:53:15] PROBLEM - Puppet errors on deployment-cache-text05 is CRITICAL: CRITICAL: 3.37% of data above the critical threshold [3.0]
[02:56:22] Yippee, build fixed!
[02:56:23] Project beta-scap-eqiad build #247888: FIXED in 2 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247888/
[03:06:21] Project beta-scap-eqiad build #247889: FAILURE in 2 min 1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247889/
[03:15:56] Project beta-scap-eqiad build #247890: STILL FAILING in 1 min 37 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247890/
[03:26:23] Project beta-scap-eqiad build #247891: STILL FAILING in 2 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247891/
[03:28:52] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<40.00%)
[03:34:25] Project beta-scap-eqiad build #247892: STILL FAILING in 4.3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247892/
[03:44:29] Project beta-scap-eqiad build #247893: STILL FAILING in 7.6 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247893/
[03:56:10] Yippee, build fixed!
[03:56:10] Project beta-scap-eqiad build #247894: FIXED in 1 min 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/247894/
[04:44:36] PROBLEM - Content Translation Server on deployment-sca02 is CRITICAL: connect to address 172.16.5.112 and port 8080: Connection refused
[04:49:38] RECOVERY - Content Translation Server on deployment-sca02 is OK: HTTP OK: HTTP/1.1 200 OK - 904 bytes in 0.027 second response time
[04:51:08] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.041 second response time
[04:53:15] RECOVERY - Puppet errors on deployment-cache-text05 is OK: OK: Less than 1.00% above the threshold [2.0]
[05:03:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[06:13:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.037 second response time
[06:19:07] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[06:25:59] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: (Service Check Timed Out)
[07:08:51] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:14:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.048 second response time
[07:20:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[07:50:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.027 second response time
[10:58:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[11:18:11] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 71.43% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[11:38:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.594 second response time
[11:44:03] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[11:55:05] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:09:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.053 second response time
[14:44:27] PROBLEM - Host deployment-ms-be03 is DOWN: CRITICAL - Host Unreachable (172.16.5.51)
[14:45:20] PROBLEM - Host deployment-ms-be04 is DOWN: CRITICAL - Host Unreachable (172.16.4.129)
[15:41:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[15:46:16] Hey all, how to deal with case someone uploaded keys to Gerrit (except revocation of such keys, of course)? Can we delete a patchset permanently? Thanks!
[15:46:39] (I can provide you links, but due to the nature of this incident, I'd prefer PM)
[15:52:07] Patchsets cannot be deleted from the ui, though we now have a script that can delete stuff (not sure if it will delete everything created by that user though).
[15:52:21] changes can be deleted though
[15:52:38] so the user could upload a new change (and have the one where they uploaded a key to, deleted)
[15:55:39] paladox, thanks. The problem is they've just added another patchset, without the key, and merged the patch :(
[15:55:47] oh
[15:56:22] Yeh, get them to revoke the key (as it should be made useless)
[15:56:39] okay
[16:00:46] Urbanecm: yeah, best bet there is to make the key useless. mistakes happen.
[16:01:33] that's something that should happen for sure, just wanted to know if there's some way to delete the patchset in gerrit
[16:01:37] looks like it's not, so let's revoke :)
[16:01:42] let's "just" revoke
[16:01:53] there is, but the key is out there on email archives etc
[16:02:01] so it wouldn't be a cure-all
[16:02:48] hmm, didn't think about the email thing
[16:16:37] Urbanecm, generally speaking once you make something public around here it's exposed permanently
[16:16:51] regardless of attempts to restrict it going forward
[16:17:30] good idea to revoke even if they do manage to purge it from gerrit
[16:59:22] (PS10) Urbanecm: Setup gate-and-submit-l10n for wikimedia-cz/tracker [integration/config] - https://gerrit.wikimedia.org/r/479738
[17:11:02] Continuous-Integration-Config, WMCZ-Tracker: Setup gate-and-submit-l10n for wikimedia-cz/tracker - https://phabricator.wikimedia.org/T222544 (Urbanecm)
[17:11:16] (PS11) Urbanecm: Setup gate-and-submit-l10n for wikimedia-cz/tracker [integration/config] - https://gerrit.wikimedia.org/r/479738 (https://phabricator.wikimedia.org/T222544)
[17:41:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.023 second response time
[17:53:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[18:16:03] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 53.85% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[19:07:39] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[20:20:15] Gerrit, translatewiki.net: Running translatewiki export for MediaWiki extensions: Too many concurrent connections (4) - max. allowed: 4 - https://phabricator.wikimedia.org/T222546 (Raymond)
[20:22:40] Gerrit, translatewiki.net: Running translatewiki export for MediaWiki extensions: Too many concurrent connections (4) - max. allowed: 4 - https://phabricator.wikimedia.org/T222546 (Paladox) This is due to https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/504973/
[20:40:31] PROBLEM - Citoid on deployment-sca01 is CRITICAL: connect to address 172.16.5.13 and port 1970: Connection refused
[20:55:30] RECOVERY - Citoid on deployment-sca01 is OK: HTTP OK: HTTP/1.1 200 OK - 921 bytes in 0.027 second response time
[21:04:13] Gerrit, translatewiki.net: Running translatewiki export for MediaWiki extensions: Too many concurrent connections (4) - max. allowed: 4 - https://phabricator.wikimedia.org/T222546 (Reedy) >This makes a complete export of MediaWiki extensions impossible Well, maybe as currently written, but it could be a...
[22:03:05] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.160 second response time
[22:14:03] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[22:19:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.023 second response time
[22:25:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[22:35:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.025 second response time