[00:03:31] RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:18:58] legoktm: Yay, PHPCS merged upstream the autoloader stuff.
[00:20:58] Release-Engineering-Team (Kanban), Phabricator (2016-10-05), Regression, Upstream: Regression: Ability to comment on phame posts not working - https://phabricator.wikimedia.org/T144338#3433820 (Luke081515) Resolved>Open I can't comment again. Example: Go to: https://phabricator.wikimedia....
[00:43:29] RainbowSprinkles: sweet. Saw. Yep all looks good. Not sure why MF has merge conflicts though.. will look at that tomorrow
[01:44:41] PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[01:55:16] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[02:10:15] RECOVERY - Puppet errors on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:14:36] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[02:24:40] RECOVERY - Puppet errors on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:49:35] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:18:43] PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[04:18:43] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[04:33:42] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:47:10] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<55.56%)
[04:53:43] RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:54:19] Continuous-Integration-Infrastructure, Collaboration-Team-Sprint-P-2015-02-11/q3, Collaboration-Team-Triage, Reading-Web-Backlog, and 2 others: Thanks is broken again (Mobile Thanks needs qunit tests) - https://phabricator.wikimedia.org/T86687#3434100 (Krinkle)
[04:54:21] Continuous-Integration-Infrastructure (Little Steps Sprint), Release-Engineering-Team (Next): Merge extensions PHPUnit and QUnit jobs - https://phabricator.wikimedia.org/T88207#3434098 (Krinkle) Open>declined The qunit job was instead combined with the selenium job. I don't think combining it wit...
[05:03:31] Continuous-Integration-Config, VisualEditor, User-Ryasmeen: Intermittent failures of mwext-qunit-jessie - https://phabricator.wikimedia.org/T163123#3434131 (Krinkle) Open>Resolved a: Krinkle Nope.
[05:05:23] Continuous-Integration-Infrastructure: Jenkins: Assert no PHP errors (notices, warnings) were raised or exceptions were thrown - https://phabricator.wikimedia.org/T50002#3434145 (Krinkle)
[05:18:18] Continuous-Integration-Infrastructure: Jenkins: Assert no PHP errors (notices, warnings) were raised or exceptions were thrown - https://phabricator.wikimedia.org/T50002#3434165 (Krinkle) Recent runs: * **mediawiki-phpunit-hhvm-jessie**
RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is OK: OK: Less than 100.00% above the threshold [0.0]
[05:47:17] RECOVERY - Check for valid instance states on labnodepool1001 is OK: nodepool state management is OK
[05:54:16] PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:07:59] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[06:24:18] RECOVERY - Puppet errors on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:28:02] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[07:12:10] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:18:59] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[07:30:33] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[07:30:41] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[08:02:48] PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[08:09:18] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[08:12:49] RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:14:01] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[08:17:09] Release-Engineering-Team, Documentation: Require vagrant role for extensions wanting review for WMF deployment - https://phabricator.wikimedia.org/T170488#3434363 (Aklapper) Should be documented by adding to https://www.mediawiki.org/wiki/Review_queue if wanted/agreed on
[08:25:33] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:33:47] Beta-Cluster-Infrastructure: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521#3434408 (fgiunchedi)
[08:35:46] Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (fgiunchedi)
[08:36:51] Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3434437 (fgiunchedi)
[08:39:20] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:39:29] if anyone is around, I don't think logstash is working in beta ATM ^
[08:42:52] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[09:15:43] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[09:22:04] PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:37:54] RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:42:39] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[10:13:21] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[10:17:06] RECOVERY - Puppet errors on deployment-mediawiki05 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:19:42] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[10:32:40] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[10:39:43] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:53:22] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:59:31] PROBLEM - Puppet errors on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[11:00:20] PROBLEM - Free space - all mounts on deployment-kafka01 is CRITICAL: CRITICAL: deployment-prep.deployment-kafka01.diskspace.root.byte_percentfree (<100.00%)
[11:20:45] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:54:31] I guess the CI issues are resolved?
[11:54:34] greg-g: ^
[11:58:56] Browser-Tests-Infrastructure, Release-Engineering-Team, Ruby, User-zeljkofilipin: Update Selenium/Ruby documentation - https://phabricator.wikimedia.org/T170543#3434902 (zeljkofilipin)
[12:02:10] Browser-Tests-Infrastructure, Release-Engineering-Team, Ruby, User-zeljkofilipin: Update Selenium/Ruby documentation - https://phabricator.wikimedia.org/T170543#3434925 (zeljkofilipin) p: Triage>Low
[12:25:44] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:37:57] (CR) Zfilipin: "All green:" [integration/config] - https://gerrit.wikimedia.org/r/361012 (https://phabricator.wikimedia.org/T166750) (owner: Jdlrobson)
[12:38:08] Continuous-Integration-Config, MinervaNeue, Patch-For-Review, Ruby, User-zeljkofilipin: Setup CI on Minerva repo - https://phabricator.wikimedia.org/T166750#3435029 (zeljkofilipin) All green: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/
[12:44:05] PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[12:44:44] PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:51:23] PROBLEM - Puppet staleness on deployment-kafka01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[13:04:48] PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[13:08:26] PROBLEM - Puppet staleness on deployment-eventlogging03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0]
[13:49:06] RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:17:18] Continuous-Integration-Infrastructure, Cloud-VPS, Nodepool, Patch-For-Review: figure out if nodepool is overwhelming rabbitmq and/or nova - https://phabricator.wikimedia.org/T170492#3435469 (Andrew) So far things seem quieter and more stable with the nodepool rate change. If we get another good...
[14:31:33] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[14:40:22] PROBLEM - Free space - all mounts on deployment-logstash2 is CRITICAL: CRITICAL: deployment-prep.deployment-logstash2.diskspace._var_lib_elasticsearch.byte_percentfree (No valid datapoints found) deployment-prep.deployment-logstash2.diskspace._srv.byte_percentfree (No valid datapoints found) deployment-prep.deployment-logstash2.diskspace._mnt.byte_percentfree (<100.00%)
[14:48:42] PROBLEM - Puppet errors on deployment-sca04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:53:43] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[14:55:11] PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:57:48] PROBLEM - Puppet errors on deployment-prometheus01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:00:41] PROBLEM - Puppet errors on deployment-trending01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:10:15] PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:12:41] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:13:22] PROBLEM - Puppet errors on deployment-mcs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:13:38] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:15:48] PROBLEM - Puppet errors on deployment-sca03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:18] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:27:31] PROBLEM - Puppet errors on deployment-eventlogging03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:31:31] Sagan: :) yeah
[15:32:01] irc topics are hard
[15:38:43] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[15:41:16] RECOVERY - Puppet errors on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:59:02] FYI in case it has been missed, looks like logstash ran out of disk space in beta T170521
[15:59:03] T170521: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521
[15:59:22] godog: who maintains logstash? :)
[16:06:42] Beta-Cluster-Infrastructure: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521#3436147 (greg) Based on https://tools.wmflabs.org/sal/production?p=0&q=logstash&d= and https://tools.wmflabs.org/sal/releng?p=0&q=logstash&d= I'm adding some CC's here of the people who de facto ma...
[16:07:20] Beta-Cluster-Infrastructure: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521#3436149 (greg) p: Triage>High
[16:09:30] Beta-Cluster-Infrastructure, Analytics: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3436168 (greg) kafka -> #analytics Analytics: please help diagnose/fix the beta cluster kafka host.
[16:10:32] Beta-Cluster-Infrastructure, Analytics: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3436174 (greg) eventlogging -> #analytics Please help here as well. Thanks!
[16:11:32] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:13:57] Browser-Tests-Infrastructure, Release-Engineering-Team, Ruby, User-zeljkofilipin: Update Selenium/Ruby documentation - https://phabricator.wikimedia.org/T170543#3436198 (greg) p: Low>Lowest Talked with Zeljko about this in our 1:1. Given the active migration to webdriver.io/node tests this...
[16:14:05] Browser-Tests-Infrastructure, Release-Engineering-Team (Backlog), Ruby, User-zeljkofilipin: Update Selenium/Ruby documentation - https://phabricator.wikimedia.org/T170543#3436203 (greg)
[16:16:05] jdlrobson: I won't be able to do 11am, I have a doc appt. Could we do it after the train, like 1pm?
[16:21:27] Beta-Cluster-Infrastructure: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521#3436241 (EBernhardson) Index sizes: ``` yellow open logstash-2017.06.13 1w0X6t80SASEnV6Y1kDTUw 1 2 1770573 0 924.8mb 924.8mb yellow open logstash-2017.06.14 cAUWl9U2QYefGMsi07sW4w 1 2 1373470 0...
[16:24:45] Beta-Cluster-Infrastructure: deployment-logstash2 out of disk space - https://phabricator.wikimedia.org/T170521#3436264 (EBernhardson) It looks like there was also 20G of old indexes from a previous version of elasticsearch in /mnt. Clearing that out along with the 7 oldest days of logs has brought us 43G of...
[16:28:32] thanks ebernhardson, is "status: red" normal after you delete thoses indexes? https://logstash-beta.wmflabs.org/app/kibana#?_g=%28%29
[16:28:46] thoses? :)
[16:29:10] greg-g: it should fix itself in a few minutes, if not i can poke at it
[16:29:38] kk, thanks!
[16:29:40] <3
[16:31:57] hrm :S
[16:35:02] PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[16:35:22] RECOVERY - Free space - all mounts on deployment-logstash2 is OK: OK: deployment-prep.deployment-logstash2.diskspace._var_lib_elasticsearch.byte_percentfree (No valid datapoints found) deployment-prep.deployment-logstash2.diskspace._srv.byte_percentfree (No valid datapoints found)
[16:36:36] hopefully the index isn't completely borked ... but there are some log messages about failing to write index state for .kibana due to no space left on device. In a typical elasticsearch cluster this is protected against by having replicas, but this is a single node cluster. Worst case i can import the prod dashboards...
[16:37:08] oh awesome
[16:37:12] :/
[16:37:44] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T167893#3436341 (thcipriani)
[16:39:35] bouncing elasticsearch seems to have triggered it to pick the index back up off disk, seems ok
[16:41:48] ebernhardson: thanks man
[16:54:44] PROBLEM - Puppet errors on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
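(For context on the cleanup EBernhardson describes above on T170521, around 16:21–16:41: dropping the oldest daily logstash indices is just DELETE calls against the node's Elasticsearch REST API. Below is a minimal sketch, assuming a single-node Elasticsearch on localhost:9200 and a 7-day retention — both assumptions, not taken from the log; the actual cleanup may well have been done by hand or with a tool like curator.)

```python
#!/usr/bin/env python3
"""Sketch: delete the oldest daily logstash-YYYY.MM.DD indices to free disk.
Host/port and the 7-day retention are assumptions for illustration only."""
import re
import requests

ES = "http://localhost:9200"  # assumed: Elasticsearch listening locally on the logstash host
KEEP_DAYS = 7                 # assumed retention

# _cat/indices lists one index per line; the index name is the third column,
# matching the "yellow open logstash-2017.06.13 ..." output quoted on the task.
lines = requests.get(f"{ES}/_cat/indices/logstash-*").text.splitlines()
names = sorted(line.split()[2] for line in lines if line.strip())

daily = [n for n in names if re.match(r"^logstash-\d{4}\.\d{2}\.\d{2}$", n)]

# Lexical order of logstash-YYYY.MM.DD is chronological, so everything
# before the last KEEP_DAYS entries is old enough to drop.
for name in daily[:-KEEP_DAYS]:
    print("deleting", name)
    requests.delete(f"{ES}/{name}").raise_for_status()
```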
[16:54:56] RainbowSprinkles: 1pm should be fine. might be closer to 1.30 though
[16:55:02] ok
[16:57:03] PROBLEM - Puppet errors on deployment-memc05 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[16:59:00] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:00:03] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:02:17] PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:03:12] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:04:03] !log restarting jenkins for updates
[17:04:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:04:19] greg-g: heh was in a meeting, looks like we know the answer now :) thanks for shepherding the other tasks too!
[17:04:21] ...once the current g&s backlog is out of the way
[17:11:42] hi, ci looks down
[17:11:43] at
[17:11:45] https://integration.wikimedia.org/zuul/
[17:11:52] i see nothing running. Or am i wrong?
[17:11:54] TabbyCat: do you particularly care about preserving the edit histories of the redirects? It would probably be easier for me to just mass-create redirects and then delete the broken ones
[17:11:56] Project mediawiki-core-code-coverage build #2881: ABORTED in 2 hr 11 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/2881/
[17:12:07] (probably how we should've done this to begin with, but time makes fools of us all)
[17:12:18] harej: if the history is just the redirect, I'd just nuke
[17:12:28] paladox: you are correct, restarting for upgrades
[17:12:37] ok thanks :)
[17:13:04] PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:13:44] should be back now
[17:14:00] thanks :)
[17:17:02] RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:18:12] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:21:50] godog: :)
[17:22:49] Release-Engineering-Team (Kanban), CirrusSearch, Discovery, Discovery-Search, and 2 others: Figure out why browser tests can't create suggestion box - https://phabricator.wikimedia.org/T162966#3181640 (debt) We need to apply this fix in production.
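(On the 17:11 redirect exchange between harej and TabbyCat: a minimal pywikibot sketch of the "mass-create redirects, delete the broken ones afterwards" approach. The target wiki, the title pairs, and the edit summary are placeholders for illustration, not the actual migration data.)

```python
#!/usr/bin/env python3
"""Sketch of mass-creating redirects after page moves. Hypothetical data only."""
import pywikibot

site = pywikibot.Site("meta", "meta")  # placeholder wiki

# (old_title, new_title) pairs would come from whatever mapping drove the moves.
moves = [
    ("WikiProject X/Old page", "WikiProject X/New page"),  # placeholder pair
]

for old_title, new_title in moves:
    page = pywikibot.Page(site, old_title)
    if page.exists():
        continue  # don't clobber anything that still has content or history
    page.text = "#REDIRECT [[%s]]" % new_title
    page.save(summary="Mass-creating redirects after page moves")
```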
[17:24:58] RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:27:59] PROBLEM - Free space - all mounts on deployment-eventlogging03 is CRITICAL: CRITICAL: deployment-prep.deployment-eventlogging03.diskspace.root.byte_percentfree (<100.00%)
[17:29:01] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:33:48] PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[17:34:47] RECOVERY - Puppet errors on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:35:05] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:47:16] RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0]
[17:53:05] RECOVERY - Puppet errors on deployment-mediawiki05 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:06:01] PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:10:58] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T168051#3436999 (greg) a: demon
[18:11:15] Release-Engineering-Team (Kanban), Release, Train Deployments: MW 1.30.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T168050#3437001 (greg) a: demon
[18:13:24] RECOVERY - Puppet staleness on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [3600.0]
[18:14:21] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[18:18:48] RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:21:00] RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:24:45] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[18:25:59] PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[18:38:25] TabbyCat: redirect creation in process. when that's all done i'll delete the Broken/ pages. I think we'll be just fine :)
[18:38:37] harej: cool :)
[18:42:11] Browser-Tests-Infrastructure, Release-Engineering-Team, Reading-Web-Backlog: MobileFrontend Chrome browser test job has become unstable - https://phabricator.wikimedia.org/T167994#3437097 (Jdlrobson) stalled>Resolved a: Jdlrobson good enough ! :)
[18:49:22] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:04:05] PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[19:19:43] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[19:23:25] Hi relengers!
[19:23:47] I figured i'd lighten the load of fr-tech's CiviCRM tests a bit
[19:24:11] Project selenium-MinervaNeue » chrome,beta,Linux,BrowserTests build #7: FAILURE in 35 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/7/
[19:24:38] by folding the json linting (currently done by npm) into the composer test
[19:24:46] which is currently linting the yaml
[19:25:18] We've got the patch ready to go in the crm repo, we just need to get rid of the npm test in zuul:
[19:25:34] https://gerrit.wikimedia.org/r/364804
[19:26:12] Anyone have a second to +2 that test removal ^^ ?
[19:27:22] Release-Engineering-Team (Kanban), releng-201617-q4, MediaWiki-General-or-Unknown, MW-1.29-release, Release: Release MediaWiki 1.29 - https://phabricator.wikimedia.org/T153271#3437375 (demon) Open>Resolved [[ https://lists.wikimedia.org/pipermail/mediawiki-announce/2017-July/000212.ht...
[19:30:59] RECOVERY - Puppet errors on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:31:02] Project selenium-MinervaNeue » firefox,beta,Linux,BrowserTests build #7: FAILURE in 42 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/7/
[19:31:56] PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:49:07] RECOVERY - Puppet errors on deployment-mediawiki05 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:56:44] Continuous-Integration-Config, FR-Smashpig, Fundraising-Backlog: SmashPig CI should run phpunit tests - https://phabricator.wikimedia.org/T127879#3437464 (Ejegg) Open>Resolved
[19:58:34] RainbowSprinkles: Woo-hoo.
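(A sketch of what "folding the json linting into the composer test" at 19:24 could look like in a repository's composer.json: a lint-json script invoked as part of `composer test`. The package choice and the find invocation are assumptions for illustration; the real patch lives in the crm repo, and change 364804 only removes the corresponding npm job from the zuul config.)

```json
{
    "require-dev": {
        "seld/jsonlint": "^1.6"
    },
    "scripts": {
        "lint-json": "find . -name '*.json' -not -path './vendor/*' -print0 | xargs -0 -n1 vendor/bin/jsonlint",
        "test": [
            "@lint-json"
        ]
    }
}
```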
[20:01:41] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:12:35] * paladox https://phabricator.wikimedia.org/D717
[20:14:59] PROBLEM - Puppet errors on deployment-stream is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:15:23] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[20:25:43] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:35:43] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[20:36:41] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[20:42:51] Project selenium-Echo » firefox,beta,Linux,BrowserTests build #454: FAILURE in 1 min 50 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/454/
[20:42:53] Project selenium-Echo » chrome,beta,Linux,BrowserTests build #454: FAILURE in 1 min 52 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/454/
[20:50:21] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:51:43] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[20:57:59] PROBLEM - Puppet errors on deployment-memc05 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[21:01:44] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:03:36] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[21:05:02] RECOVERY - Puppet errors on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[21:16:23] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[21:23:00] RECOVERY - Puppet errors on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:29:15] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3437807 (Dzahn) a: mmodell>RobH @robh @mmodell It seems this system isn't actually up and running but has an issue. I noticed that i could not ssh to it,...
[21:30:25] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3437811 (mmodell) @Dzahn hmm, I thought it was working before? I think I even logged into it but maybe I am remembering incorrectly.
[21:33:00] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3437812 (RobH) busy box is a failed hdd spin up during post, I'd try rebooting and I bet it comes back.
[21:37:57] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T167893#3437816 (thcipriani)
[21:48:36] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[21:51:25] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:12:35] PROBLEM - Puppet errors on deployment-tmh01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[22:15:36] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:30:35] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:40:03] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:45:34] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[22:53:34] Reedy: do you see any security issues with a JavaScript gadget that reads content out of a page with a JSON content model?
[22:53:50] There's potentially loads
[22:54:01] But the API should be doing read restrictions...
[22:54:19] I'm sure there are, potentially, but is there a "right" way that avoids those issues?
[22:54:23] Possible cache issues if target page is priveleged
[22:54:43] Depends what you're trying to do :)
[22:55:00] harej: I have a lua module on enwiki that parses a jsoncontent page
[22:55:03] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:55:11] I do something comparable weird on Meta
[22:55:34] Anyways, the idea is a new version of FormWizard that (among other things) reads form schemas from JSON-modeled pages anywhere and not just the MediaWiki namespace
[22:55:39] So as to make form building available for more people
[22:55:50] (Which rests on the assumption that non-admins can change content model, which, er.))
[23:07:34] RECOVERY - Puppet errors on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:09:38] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:14:25] PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[23:16:36] Yippee, build fixed!
[23:16:39] Project selenium-MinervaNeue » chrome,beta,Linux,BrowserTests build #8: FIXED in 20 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/8/
[23:20:33] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:23:23] PROBLEM - Puppet errors on castor is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[23:24:25] RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:24:37] RECOVERY - Puppet errors on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[23:25:23] Yippee, build fixed!
[23:25:23] Project selenium-MinervaNeue » firefox,beta,Linux,BrowserTests build #8: FIXED in 29 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/8/
[23:57:52] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T167893#3438073 (thcipriani)
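(On the gadget question harej raises at 22:53–22:55: reading a page that uses the JSON content model is an ordinary action=query content fetch; a gadget would issue the same request through mw.Api and parse the result. Below is a Python sketch of the equivalent API call; the wiki URL and the page title are placeholders, not an actual FormWizard schema page.)

```python
#!/usr/bin/env python3
"""Sketch: fetch and parse a JSON-content-model page via the MediaWiki action API.
The wiki and title are placeholders for illustration."""
import json
import requests

API = "https://meta.wikimedia.org/w/api.php"    # placeholder wiki
TITLE = "WikiProject X/Form schema.json"        # placeholder JSON-model page

resp = requests.get(API, params={
    "action": "query",
    "prop": "revisions",
    "rvprop": "content",
    "titles": TITLE,
    "format": "json",
    "formatversion": "2",
}).json()

page = resp["query"]["pages"][0]
if "missing" in page:
    raise SystemExit("page does not exist")

# The stored text of a JSON-content-model page is the JSON document itself,
# so the fetched content can be parsed directly.
schema = json.loads(page["revisions"][0]["content"])
print(schema)
```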