[04:10:47] Project selenium-MultimediaViewer » safari,beta,OS X 10.9,BrowserTests build #396: FAILURE in 14 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/396/
[04:13:32] Project selenium-MultimediaViewer » firefox,beta,Linux,BrowserTests build #396: FAILURE in 17 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/396/
[04:36:08] Continuous-Integration-Infrastructure, ContentTranslation: Jenkins fails to pass ContentTranslation mwext-qunit-jessie with "Disconnected (1 times), because no message in 60000 ms." error - https://phabricator.wikimedia.org/T165647#3274583 (santhosh) Actually tests after this ext.cx.tools.mtabuse was cau...
[07:41:37] Continuous-Integration-Infrastructure, Page-Previews, Reading-Web-Backlog: Execute mwgate-composer workers in CI while building Popups extension - https://phabricator.wikimedia.org/T165521#3274696 (Jdlrobson)
[07:58:27] (PS1) Phedenskog: Add Linux location for WebPageTest [integration/config] - https://gerrit.wikimedia.org/r/354385
[08:20:20] Release-Engineering-Team (Kanban), MediaWiki-extensions-OATHAuth, SQLite: unittest_oathauth_users table not created under sqlite - https://phabricator.wikimedia.org/T69297#3274785 (tstarling)
[08:20:27] Release-Engineering-Team (Kanban), MediaWiki-extensions-Other, SQLite: EditPageTracking does not pass Jenkins tests (sqlite compatibility) - https://phabricator.wikimedia.org/T68191#3274789 (tstarling)
[08:20:30] Release-Engineering-Team (Kanban), MediaWiki-extensions-TitleKey, SQLite: TitleKey does not pass Jenkins unit tests (sqlite compatibility) - https://phabricator.wikimedia.org/T67896#3274791 (tstarling)
[08:20:35] Release-Engineering-Team (Kanban), MediaWiki-extensions-CreditsSource, SQLite: CreditsSource does not pass Jenkins unit tests (sqlite compatibility) - https://phabricator.wikimedia.org/T67877#3274794 (tstarling)
[08:20:37] Release-Engineering-Team (Kanban), MediaWiki-extensions-Other, SQLite: CommunityVoice does not pass Jenkins unit tests (sqlite compatibility) - https://phabricator.wikimedia.org/T67876#3274796 (tstarling)
[09:17:14] (PS1) Hashar: integration/quibble: tox job [integration/config] - https://gerrit.wikimedia.org/r/354436
[09:49:54] Continuous-Integration-Infrastructure, Page-Previews, Reading-Web-Backlog: Execute mwgate-composer workers in CI while building Popups extension - https://phabricator.wikimedia.org/T165521#3275142 (Legoktm) Open>Invalid Those tests are already being run through the mwext-testextension-(php55|...
[10:06:03] Release-Engineering-Team (Kanban), Wikimedia-Hackathon-2017, Browser-Tests, JavaScript, User-zeljkofilipin: Selenium/WebdriverIO tests in JavaScript/Node.js - https://phabricator.wikimedia.org/T159945#3275276 (zeljkofilipin) Scheduled for Friday 18-19 in Wiaschtl! See you there!
[10:10:28] (CR) Hashar: [C: 2] integration/quibble: tox job [integration/config] - https://gerrit.wikimedia.org/r/354436 (owner: Hashar)
[10:12:25] (CR) Hashar: "The email has to be added to two lists. I will amend and deploy" [integration/config] - https://gerrit.wikimedia.org/r/354098 (owner: Sebastian Berlin (WMSE))
[10:13:03] (PS2) Hashar: Whitelist Eugene233 [integration/config] - https://gerrit.wikimedia.org/r/354098 (owner: Sebastian Berlin (WMSE))
[10:13:04] (PS2) Hashar: Whitelist Jcasariego [integration/config] - https://gerrit.wikimedia.org/r/351863 (owner: Mholloway)
[10:14:55] PROBLEM - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [140.0]
[10:18:36] ACKNOWLEDGEMENT - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [140.0] amusso Lot of mw extensions l10n reverts going on.
[10:36:11] (CR) Hashar: [C: 2] Whitelist Jcasariego [integration/config] - https://gerrit.wikimedia.org/r/351863 (owner: Mholloway)
[10:36:16] (CR) Hashar: [C: 2] Whitelist Eugene233 [integration/config] - https://gerrit.wikimedia.org/r/354098 (owner: Sebastian Berlin (WMSE))
[10:40:38] (CR) Sebastian Berlin (WMSE): "Ok, I didn't know that. Thanks for fixing it." [integration/config] - https://gerrit.wikimedia.org/r/354098 (owner: Sebastian Berlin (WMSE))
[10:47:58] (Merged) jenkins-bot: integration/quibble: tox job [integration/config] - https://gerrit.wikimedia.org/r/354436 (owner: Hashar)
[10:50:56] (Merged) jenkins-bot: Whitelist Eugene233 [integration/config] - https://gerrit.wikimedia.org/r/354098 (owner: Sebastian Berlin (WMSE))
[10:50:58] (Merged) jenkins-bot: Whitelist Jcasariego [integration/config] - https://gerrit.wikimedia.org/r/351863 (owner: Mholloway)
[10:54:42] What's up with Zuul? There are very few jobs running
[10:54:51] Everything's backed up 45 mins
[10:55:29] Only 2 jobs running right now, 19 patches waiting to merge
[10:55:32] ( hashar )
[10:55:47] RoanKattouw: let me check
[10:55:56] Oh seems to be recovering a little
[10:56:03] Of course that has to happen just as I ping you
[10:56:09] demo bis has sent a bunch of patches to revert l10n updates on mediawiki extensions
[10:56:18] which overloaded the CI system slightly
[10:56:42] maybe we should raise the quota for the hackathon
[10:58:40] As we don't run CI when merging l10n changes, I was wondering why we do so when reverting them?
[11:03:54] eddiegp: We can't tell that it's a revert of an l10n change
[11:04:27] The way we determine that something is an l10n change is checking if its author is l10n-bot
[11:04:51] Also, the most common reason for reverting l10n changes is because they break CI, so I'm not that happy about l10n-bot merges bypassing CI :/
[11:05:19] (I think the way l10n-bot actually bypasses CI is by having the right to merge changes directly, which is something only the CI system and l10n-bot can do)
[11:10:27] poor jenkins
[11:16:36] RoanKattouw: According to https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki,access everybody with +2 can also set Verified to +2 and hit the "Submit" button afterwards, bypassing CI. So that's not exclusive to CI & l10n-bot. I've also already seen this a few times (when CI fails were definitely unrelated, e.g. only a comment was changed).
[11:16:51] Hmm
[11:16:54] In some repos that is not allowed
[11:17:01] But those must just be the repos I frequent I suppose
[11:17:35] Yeah, this is for the mediawiki/* repos, but that is where l10n bot is most active ;)
[11:32:55] RECOVERY - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0]
[11:50:42] Gerrit, Release-Engineering-Team (Kanban), Wikidata, User-Ladsgroup, Wikidata-Sprint: [Task] Move PropertySuggester extension to gerrit - https://phabricator.wikimedia.org/T104309#3275642 (greg) a:Ladsgroup>demon
[11:55:37] Release-Engineering-Team (Next), Wikimedia-Logstash, Wikimedia-log-errors: Fatalmonitor on logstash still includes deprecated channel:wfLogDBError - https://phabricator.wikimedia.org/T165675#3275647 (greg)
[11:55:57] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Release-Engineering-Team (Backlog), Jenkins: Upgrade jenkins server and jenkins slaves to java 8 - https://phabricator.wikimedia.org/T162828#3275650 (greg)
[11:56:01] Continuous-Integration-Infrastructure, Release-Engineering-Team (Backlog), Zuul: Upgrade pbr for zuul - https://phabricator.wikimedia.org/T162787#3275652 (greg)
[11:56:08] Continuous-Integration-Infrastructure, Release-Engineering-Team (Backlog), Jenkins: Jenkins Web UI error - Backend fetch failed - https://phabricator.wikimedia.org/T162505#3275654 (greg)
[11:56:32] Beta-Cluster-Infrastructure, Release-Engineering-Team (Next), Labs, Puppet, User-Joe: Re-think puppet management for deployment-prep - https://phabricator.wikimedia.org/T161675#3275656 (greg)
[12:17:10] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#3275691 (thcipriani)
[12:17:24] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#2096570 (thcipriani) p:Triage>Normal
[12:47:53] Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase2): Deploy logstash/plugins with scap3 - https://phabricator.wikimedia.org/T165748#3275815 (thcipriani)
[12:48:27] Scap (Scap3-Adoption-Phase1), releng-201516-q4: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3275829 (thcipriani)
[12:48:29] Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase2): Deploy logstash/plugins with scap3 - https://phabricator.wikimedia.org/T165748#3275828 (thcipriani)
[12:48:58] Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase2): Deploy logstash/plugins with scap3 - https://phabricator.wikimedia.org/T165748#3275815 (thcipriani) p:Triage>Normal
[12:49:05] :)
[12:50:32] greg-g: how are the offsites and such going? (Sorry if off-topic, I'm just curious; feel free to disregard if you have more important things to do)
[12:51:39] tl;dr: good. :)
[12:53:05] greg-g: that's great, if you need anything done that I have access to, just know all you have to do is yell.
[12:53:51] Deployment-Systems, Release-Engineering-Team, User-greg: Require an associated task with each SWAT item - https://phabricator.wikimedia.org/T145255#3275878 (greg) Open>declined >>! In T145255#3126857, @greg wrote: > meh? meh.
[13:25:19] is it ok if I start sending some test alerts from beta prometheus here during the hackathon? :P
[13:26:15] godog: suuuuure
[13:26:33] I'll silence it if we need to, but, we're mostly all here so it's not a big deal
[13:28:18] greg-g: nice, thanks! yeah LMK for sure if it becomes obnoxious (and the alerts are false that is)
[13:28:45] :)
[13:38:57] Release-Engineering-Team (Kanban), Phabricator, Project-Admins, User-greg: Create project/tag User-Aude as personal work board - https://phabricator.wikimedia.org/T165735#3276080 (greg) Open>Resolved a:greg Done #user-aude
[13:43:08] zomg
[13:44:58] well deployment-phab02 is down alright, deployment-sca04 has puppet disabled since forever
[13:46:55] Yippee, build fixed!
[13:46:55] Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #402: FIXED in 2 min 54 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/402/
[13:52:20] Gerrit, Labs, wikitech.wikimedia.org: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624#3276168 (demon) >>! In T165624#3274279, @bd808 wrote: > User `MacFan4000` already exists in LDAP. The account was created 2016-08-25T22:53:04Z. Both accounts are r...
[13:52:45] ugh. bots using notice?
[13:53:39] bd808: oops, I forgot that bot used notice heh, it is a test during the hackathon though
[13:54:26] godog: :) it only annoys me because my irc client does a desktop alert on all notices in all channels. Kinda dumb client config.
[13:54:32] * bd808 looks for a way to turn it off
[13:55:36] it annoys me too, honestly
[13:56:17] heheh reminds me I tried this (this == bots using notice) before and gave up
[13:56:25] greg-g: what does your client do out of curiosity?
[13:56:26] * bd808 has changed his config to be less randomly grumpy
[13:57:42] godog: shameless plug, if your bot is in python check out -- https://python-ib3.readthedocs.io/en/latest/
[13:57:59] lots of python irc bot goodness
[13:58:03] godog: I'm on irssi
[14:01:17] bd808: nice, thanks! the bot isn't by me and it is golang, though it is so simple (receive a json webhook on http, format it and send to irc) that it might as well be python
[14:01:39] godog: https://phabricator.wikimedia.org/F8122889
[14:01:53] godog: but types!
[14:02:44] haha TYPE ALL THE TYPES
[14:02:52] :)
[14:05:12] greg-g: ugh yeah that's loud, this is what I saw https://phabricator.wikimedia.org/F8122921
[14:05:31] note that I also format messages from bots as NOTICE, but yeah
[14:05:47] shameless plug there because that's the irssi theme I wrote
[14:11:09] bd808: tl;dr receiving this thing and sending it on to irc https://prometheus.io/docs/alerting/configuration/#
[14:12:54] godog: *nod* would be pretty trivial in python probably. I haven't tried mixing flask and the irc lib yet though, so my helper lib might not be of much use.
[14:29:37] PROBLEM - Host deployment-phab02 is DOWN: CRITICAL - Host Unreachable (10.68.19.232)
[14:32:47] I wonder, are deployment-phab01 and 02 shut down? Or deleted?
[14:33:00] Should they be removed from shinken if they have been deleted?
[14:35:51] Release-Engineering-Team, User-greg: Improve per-person assigned list - https://phabricator.wikimedia.org/T137513#3276379 (greg) Open>declined Can't do much with it since tabs are limited to 6 (T76532).
[14:43:16] Continuous-Integration-Config, Wikidata, User-aude: Wikidata build jenkins failure: Command "test" is not defined and don't run composer update in the job - https://phabricator.wikimedia.org/T165316#3276436 (aude)
[14:44:35] Release-Engineering-Team (Kanban): Follow-ups from project refactor - https://phabricator.wikimedia.org/T165596#3276438 (greg)
[14:45:26] Release-Engineering-Team (Kanban): Follow-ups from project refactor - https://phabricator.wikimedia.org/T165596#3270886 (greg) a:mmodell Assigning to @mmodel to do the last couple things I can't do (see description).
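Editorial sketch (not part of the channel log): the relay described at 14:01:17 ("receive a json webhook on http, format it and send to irc") is simple enough to illustrate its formatting step in Python, as the discussion suggests. This is not the actual bot, which the log says is written in Go. The payload shape (a top-level `alerts` list whose entries carry `status` and `labels`) is an assumption based on the Prometheus Alertmanager webhook format, and the function names are hypothetical. The output wording mirrors the alert quoted later in the log. The HTTP listener and the IRC connection are deliberately left out.

```python
import json


def format_alerts(payload):
    """Turn one webhook payload (already parsed into a dict) into IRC-ready lines.

    Assumes an Alertmanager-style payload: {"alerts": [{"status": ..., "labels": {...}}]}.
    Missing fields fall back to placeholder text instead of raising.
    """
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        name = labels.get("alertname", "UnknownAlert")
        instance = labels.get("instance", "unknown instance")
        status = alert.get("status", "unknown")
        lines.append(f"Alert {name} on {instance} is {status}")
    return lines


def handle_webhook_body(body):
    """Parse a raw JSON request body and return the lines to send to the channel."""
    return format_alerts(json.loads(body))


if __name__ == "__main__":
    # Hypothetical sample body, shaped like the assumption described above.
    sample = json.dumps({
        "alerts": [
            {
                "status": "firing",
                "labels": {
                    "alertname": "InstanceDown",
                    "instance": "deployment-phab02:9100",
                },
            }
        ]
    })
    for line in handle_webhook_body(sample):
        print(line)
```

Keeping the formatting in a pure function means it can be checked without a running HTTP server or IRC connection; whichever web framework and IRC library are used would only need to call `handle_webhook_body` and send each returned line.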
[14:50:41] Continuous-Integration-Config, Wikidata, User-aude: Wikidata build jenkins failure: Command "test" is not defined and don't run composer update in the job - https://phabricator.wikimedia.org/T165316#3262820 (Paladox) See https://gerrit.wikimedia.org/r/#/c/353564/ please.
[14:54:48] Release-Engineering-Team (Kanban), Wikimedia-Hackathon-2017: Building Better Software (Hack-a-thon session) - https://phabricator.wikimedia.org/T165729#3276531 (greg) a:Jrbranaa
[15:05:09] Continuous-Integration-Config, Wikidata, User-aude: Wikidata build jenkins failure: Command "test" is not defined and don't run composer update in the job - https://phabricator.wikimedia.org/T165316#3276551 (Paladox) with that fix, it now shows this error 15:02:23 [RuntimeException]...
[15:05:12] I think I broke puppet in beta, fixing
[15:08:49] PROBLEM - Puppet errors on deployment-eventlogging03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:09:11] PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:09:59] should be recovering now
[15:10:20] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:11:06] PROBLEM - Puppet errors on deployment-puppetmaster02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:13:50] (Draft1) Paladox: Wikidata: Make composer test command non voting [integration/config] - https://gerrit.wikimedia.org/r/354510
[15:13:52] (PS2) Paladox: Wikidata: Make composer test command non voting [integration/config] - https://gerrit.wikimedia.org/r/354510
[15:14:08] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:14:15] (PS3) Paladox: Wikidata: Make composer test command non voting [integration/config] - https://gerrit.wikimedia.org/r/354510 (https://phabricator.wikimedia.org/T165316)
[15:17:41] PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:21:03] RECOVERY - Puppet errors on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:22:43] PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[15:27:48] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:30:30] PROBLEM - Puppet errors on deployment-redis02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:32:15] sigh, trusty hosts fail, checking those
[15:32:39] PROBLEM - Puppet errors on deployment-zookeeper01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:33:59] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:37:25] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:43:46] PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:44:35] ok now it should be recovering for real
[15:46:08] Yippee, build fixed!
[15:46:08] Project selenium-MobileFrontend » chrome,beta,Linux,BrowserTests build #428: FIXED in 24 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/428/
[15:49:10] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:50:21] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:51:55] (PS1) Aude: Add composer-install & use in composer-test-mwextension [integration/config] - https://gerrit.wikimedia.org/r/354522
[15:52:28] (PS2) Aude: Add composer-install & use in composer-test-mwextension [integration/config] - https://gerrit.wikimedia.org/r/354522
[15:55:27] Yippee, build fixed!
[15:55:27] Project selenium-MobileFrontend » firefox,beta,Linux,BrowserTests build #428: FIXED in 33 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/428/
[15:55:30] RECOVERY - Puppet errors on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:57:14] Release-Engineering-Team (Kanban), Wikimedia-Hackathon-2017, Browser-Tests, JavaScript, User-zeljkofilipin: Selenium/WebdriverIO tests in JavaScript/Node.js - https://phabricator.wikimedia.org/T159945#3276754 (zeljkofilipin) Scheduled also for Saturday, Café Wien, 10-11.
[16:02:45] RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0]
[16:02:48] RECOVERY - Puppet errors on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:05:00] Project beta-update-databases-eqiad build #17211: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17211/
[16:07:39] RECOVERY - Puppet errors on deployment-zookeeper01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:08:59] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:09:22] (CR) Aude: "problem is:" [integration/config] - https://gerrit.wikimedia.org/r/354510 (https://phabricator.wikimedia.org/T165316) (owner: Paladox)
[16:10:10] (Abandoned) Paladox: Wikidata: Make composer test command non voting [integration/config] - https://gerrit.wikimedia.org/r/354510 (https://phabricator.wikimedia.org/T165316) (owner: Paladox)
[16:12:26] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[16:18:49] RECOVERY - Puppet errors on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:19:07] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[16:22:41] RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:23:45] RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:05:00] Project beta-update-databases-eqiad build #17212: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17212/
[17:28:06] what is "Alert InstanceDown on deployment-phab02:9100 is firing"?
[18:05:00] Project beta-update-databases-eqiad build #17213: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17213/
[19:03:05] is alerts-beta-wm supposed to replace icinga-wm?
[19:03:27] Nope
[19:03:36] they are testing bots for the hackathon it seems
[19:05:00] Project beta-update-databases-eqiad build #17214: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17214/
[19:19:44] RECOVERY - Puppet errors on deployment-phab01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:20:17] ^ :)
[19:22:33] :)
[19:51:57] PROBLEM - Puppet errors on deployment-ores-redis-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[20:03:28] fancy alerts
[20:04:10] Is someone sending those? Or is the bot actually getting its data from shinken?
[20:05:00] Project beta-update-databases-eqiad build #17215: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17215/
[20:25:05] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:31:56] RECOVERY - Puppet errors on deployment-ores-redis-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:00:07] RECOVERY - Puppet errors on deployment-kafka01 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:05:00] Project beta-update-databases-eqiad build #17216: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17216/
[21:13:42] PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[21:53:43] RECOVERY - Puppet errors on deployment-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:05:00] Project beta-update-databases-eqiad build #17217: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17217/
[22:52:57] PROBLEM - Puppet errors on deployment-ores-redis-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:05:00] Project beta-update-databases-eqiad build #17218: ABORTED in 45 min: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/17218/
[23:26:06] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:32:57] RECOVERY - Puppet errors on deployment-ores-redis-01 is OK: OK: Less than 1.00% above the threshold [0.0]