[14:18:03] Project mediawiki-core-doxygen build #19526: FAILURE in 2.1 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen/19526/
[14:18:36] FIRING: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:18:46] Release-Engineering-Team, collaboration-services: ProbeDown - https://phabricator.wikimedia.org/T423034 (phaultfinder) NEW
[14:23:04] Project beta-code-update-eqiad build #595782: FAILURE in 3.9 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/595782/
[14:23:31] RESOLVED: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:24:31] FIRING: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:24:35] * A_smart_kitten speculates about whether the above alerts for Gerrit may be related to T423027
[14:25:02] (https://phabricator.wikimedia.org/T423027: DiskSpace)
[14:31:29] Gerrit, collaboration-services: DiskSpace - https://phabricator.wikimedia.org/T423027#11811829 (A_smart_kitten) p:Triage→Unbreak! Gerrit currently seems completely down & I would assume that this is why. ` upstream connect error or disconnect/reset before headers. reset reason: connection timeout `
[14:33:03] Project beta-code-update-eqiad build #595783: STILL FAILING in 3.1 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/595783/
[14:39:31] FIRING: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:40:08] see -operations for the alerts
[14:40:41] Gerrit, collaboration-services, Wikimedia-Incident: DiskSpace - https://phabricator.wikimedia.org/T423027#11811833 (RhinosF1)
[14:43:03] Project beta-code-update-eqiad build #595784: STILL FAILING in 3 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/595784/
[14:44:31] FIRING: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:45:12] Gerrit, collaboration-services, Wikimedia-Incident: DiskSpace - https://phabricator.wikimedia.org/T423027#11811836 (Clement_Goubert) ` cgoubert@gerrit2003:/var/log/apache2$ sudo lvextend -L+20G -r /dev/vg0/root Size of logical volume vg0/root changed from 74.50 GiB (19073 extents) to 94.50 GiB (24...
[14:46:16] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11811850 (RhinosF1)
[14:49:31] RESOLVED: [2x] ProbeDown: Service gerrit2003:443 has failed probes (http_gerrit_tls_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#gerrit2003:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[14:55:20] Yippee, build fixed!
[14:55:20] Project beta-code-update-eqiad build #595785: FIXED in 2 min 20 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/595785/
[15:04:37] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11811897 (IKhitron) Much better, thank you. It has logical problems now, but at least it works. (For example, a click on "save" saves the new version, but shows an...
[15:07:44] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11811898 (Clement_Goubert) p:Unbreak!→High Temporary fix was to extend the root LV by 20GB. This should hold us over until Monday when #collaboration-servi...
[15:30:36] Yippee, build fixed!
[15:30:36] Project mediawiki-core-doxygen build #19527: FIXED in 12 min: https://integration.wikimedia.org/ci/job/mediawiki-core-doxygen/19527/
[18:16:03] Continuous-Integration-Infrastructure, Test-Coverage: docker-registry.wikimedia.org/releng/quibble-coverage:1.16.0-s2 not found - https://phabricator.wikimedia.org/T421596#11812151 (A_smart_kitten) Open→Resolved I believe this will now have been resolved following the work done by @dancy in T...
[18:19:58] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11812156 (Jdforrester-WMF) Zuul doesn't seem to be picking up events from gerrit. I'll have a poke.
[18:21:21] !log jforrester@contint1002:~$ sudo /usr/sbin/service zuul restart && tail -f -n100 /var/log/zuul/zuul.log # T423027
[18:21:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:21:23] T423027: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027
[18:30:59] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11812176 (Jdforrester-WMF) I think the issue is in the gerrit<->contint connection. I've gracefully-restarted zuul but it isn't picking up any events. A manual `zu...
[18:58:18] (merge) jforrester: build: Upgrade dependencies to latest [repos/ci-tools/grunt-stylelint] - https://gitlab.wikimedia.org/repos/ci-tools/grunt-stylelint/-/merge_requests/7 (owner: volker-e)
[19:43:51] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11812236 (IKhitron) Looks like it's more than that. I tried to send recheck many times, but there is no way to write a replay.
[19:48:33] Phabricator, Regression: Mails do not arrive any more - https://phabricator.wikimedia.org/T423055 (IKhitron) NEW
[19:52:51] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11812257 (Jdforrester-WMF) >>! In T423027#11812236, @IKhitron wrote: > Looks like it's more than that. I tried to send recheck many times, but there is no way to w...
[19:55:54] Gerrit, collaboration-services, Wikimedia-Incident: 2026-04-12 Gerrit Outage (was: DiskSpace) - https://phabricator.wikimedia.org/T423027#11812259 (IKhitron) Thank you. Sure, I will stop, if you think it's better. But as it looks for me, it's not that the recheck reply does not invoke the Jenkins bot...
[20:41:15] Phabricator, Infrastructure-Foundations, Mail, Regression: Mails do not arrive any more - https://phabricator.wikimedia.org/T423055#11812334 (A_smart_kitten)
[21:15:16] Phabricator, Infrastructure-Foundations, Mail, Regression: Mails do not arrive any more - https://phabricator.wikimedia.org/T423055#11812373 (IKhitron) I just changed my Phabricator mail address to another one, to see if the problem will be there too or it is not for all.