[03:17:12] Project browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce build #379: FAILURE in 16 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce/379/ [03:17:27] Project browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce build #274: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce/274/ [03:27:05] (03CR) 1020after4: [C: 031] phabricator job to run arc lint on all repo [integration/config] - 10https://gerrit.wikimedia.org/r/183094 (https://phabricator.wikimedia.org/T85123) (owner: 10Hashar) [04:02:45] Yippee, build fixed! [04:02:45] Project browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce build #415: FIXED in 21 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce/415/ [04:17:15] Yippee, build fixed! [04:17:16] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #409: FIXED in 8 min 33 sec: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/409/ [04:23:26] Yippee, build fixed! [04:23:26] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce build #223: FIXED in 41 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce/223/ [04:34:38] Yippee, build fixed! [04:34:39] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #274: FIXED in 8 min 18 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/274/ [04:39:06] Yippee, build fixed! 
[04:39:07] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #431: FIXED in 40 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/431/ [04:39:58] Yippee, build fixed! [04:39:58] Project browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce build #306: FIXED in 51 sec: https://integration.wikimedia.org/ci/job/browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce/306/ [04:40:48] Yippee, build fixed! [04:40:48] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #47: FIXED in 49 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/47/ [04:58:19] Yippee, build fixed! [04:58:20] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #438: FIXED in 41 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/438/ [04:59:21] Yippee, build fixed! [04:59:21] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce build #47: FIXED in 1 min 1 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce/47/ [05:10:23] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [05:13:14] Yippee, build fixed! 
[05:13:15] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #442: FIXED in 19 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/442/ [05:21:13] Project beta-scap-eqiad build #37289: FAILURE in 17 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37289/ [05:33:20] 3Multimedia, Quality-Assurance, MediaWiki-extensions-MultimediaViewer: Navigation browser test no longer works with Safari driver - https://phabricator.wikimedia.org/T85802#961794 (10Gilles) 5Open>3Resolved [05:35:23] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [05:44:14] Yippee, build fixed! [05:44:14] Project beta-scap-eqiad build #37292: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37292/ [06:27:35] (03PS1) 10Gilles: Make rubocop voting on UploadWizard [integration/config] - 10https://gerrit.wikimedia.org/r/183438 [06:28:43] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:28:53] (03PS1) 10Gilles: Make rubocop voting on Media Viewer [integration/config] - 10https://gerrit.wikimedia.org/r/183439 [06:39:34] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #393: FAILURE in 24 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/393/ [06:41:02] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #387: FAILURE in 14 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/387/ [07:18:45] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [08:58:45] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 
55.56% of data above the critical threshold [0.0] [09:06:53] (03CR) 10Hashar: [C: 032] Make rubocop voting on UploadWizard [integration/config] - 10https://gerrit.wikimedia.org/r/183438 (owner: 10Gilles) [09:08:04] (03Merged) 10jenkins-bot: Make rubocop voting on UploadWizard [integration/config] - 10https://gerrit.wikimedia.org/r/183438 (owner: 10Gilles) [09:08:06] (03CR) 10Hashar: [C: 032] Make rubocop voting on Media Viewer [integration/config] - 10https://gerrit.wikimedia.org/r/183439 (owner: 10Gilles) [09:09:16] (03Merged) 10jenkins-bot: Make rubocop voting on Media Viewer [integration/config] - 10https://gerrit.wikimedia.org/r/183439 (owner: 10Gilles) [09:23:39] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [10:06:30] 3Continuous-Integration: Jenkins: Set up lint and phpunit jobs for cdb repo - https://phabricator.wikimedia.org/T75541#962249 (10hashar) 5Open>3Resolved This has been resolved. We have a Jenkins job template to run `composer test` and the cdb repo invokes parallel PHP linting + PHPUnit. [10:08:16] 3Continuous-Integration, Release-Engineering: Jenkins: Implement hhvm based voting jobs for mediawiki and extensions (tracking) - https://phabricator.wikimedia.org/T75521#962254 (10hashar) [10:08:19] 3operations, Continuous-Integration: [OPS] Jenkins: Slaves running Ubuntu Trusty should have hhvm installed - https://phabricator.wikimedia.org/T75356#962252 (10hashar) 5Open>3Resolved Patch https://gerrit.wikimedia.org/r/#/c/178806/ is still pending review but otherwise has been already deployed. We have J... 
[10:10:23] 3Continuous-Integration, Release-Engineering: Jenkins: Implement hhvm based voting jobs for mediawiki and extensions (tracking) - https://phabricator.wikimedia.org/T75521#962263 (10hashar) [10:30:31] (03PS6) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 [10:31:17] (03CR) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (owner: 10Adrian Lang) [10:31:47] 3Release-Engineering: scap.ssh.cluster_ssh() only returns the last line of error - https://phabricator.wikimedia.org/T84986#962273 (10hashar) My bad, I have been confused because only the first line is colored :-) [10:35:34] (03CR) 10jenkins-bot: [V: 04-1] Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (owner: 10Adrian Lang) [10:38:56] (03PS7) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 [10:56:57] PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [10:57:39] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [11:03:40] PROBLEM - Puppet failure on deployment-restbase03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [11:03:52] (03PS8) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 [11:11:02] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [11:11:47] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [11:20:44] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 60.00% of data above the critical 
threshold [0.0] [11:20:55] (03PS1) 10Zfilipin: Make ContentTranslation RuboCop job voting [integration/config] - 10https://gerrit.wikimedia.org/r/183471 [11:22:01] (03CR) 10Amire80: [C: 031] Make ContentTranslation RuboCop job voting [integration/config] - 10https://gerrit.wikimedia.org/r/183471 (owner: 10Zfilipin) [11:22:26] (03CR) 10Zfilipin: "The job is green: https://gerrit.wikimedia.org/r/#/c/183470/" [integration/config] - 10https://gerrit.wikimedia.org/r/183471 (owner: 10Zfilipin) [11:23:07] 3Continuous-Integration: Investigate npm cache-min option to speed up npm install - https://phabricator.wikimedia.org/T85961#962337 (10Krinkle) The http requests to npm cloud aren't packages. It's the registry itself. Package cache has no expiry as they're versioned and behind an npm built-in local HTTP 304 prox... [11:23:19] (03CR) 10Hashar: [C: 032] Make ContentTranslation RuboCop job voting [integration/config] - 10https://gerrit.wikimedia.org/r/183471 (owner: 10Zfilipin) [11:24:16] (03Merged) 10jenkins-bot: Make ContentTranslation RuboCop job voting [integration/config] - 10https://gerrit.wikimedia.org/r/183471 (owner: 10Zfilipin) [11:25:51] 3Parsoid, RESTBase, Continuous-Integration, Services: Move testing to our own hardware - https://phabricator.wikimedia.org/T78410#962357 (10Krinkle) The generic solution for test isolation is T47499. Note that, once implemented, we could (though I'm not yet convinced we'll have to) adopt a `.travis.yml`-like con... 
[11:26:56] RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0] [11:27:40] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [11:27:59] 3Parsoid, RESTBase, Continuous-Integration, Services: Move testing to our own hardware - https://phabricator.wikimedia.org/T78410#962360 (10Krinkle) For the moment, let's try and fulfil this request by adding the necessary infrastructure to our labs slaves in general (we already have node v0.10 on Ubuntu Trusty)... [11:28:24] 3Parsoid, RESTBase, Continuous-Integration, Services: Move Parsoid testing from Travis CI to our Jenkins - https://phabricator.wikimedia.org/T78410#962361 (10Krinkle) [11:32:49] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [11:33:41] RECOVERY - Puppet failure on deployment-restbase03 is OK: OK: Less than 1.00% above the threshold [0.0] [11:33:41] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [11:35:50] 3Continuous-Integration: Install and use load based balancer plugin - https://phabricator.wikimedia.org/T84911#962377 (10Krinkle) Gearman, Zuul and the default Jenkins features for load balancing should be good enough. There are often cases where our stack gets stuck or where it stops distributing jobs. However... 
[11:35:57] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [11:42:57] (03CR) 10Krinkle: Make mwext-WikibaseJavaScriptApi-qunit voting (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (owner: 10Adrian Lang) [11:43:04] (03CR) 10Krinkle: [C: 04-1] Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (owner: 10Adrian Lang) [11:43:39] 3Continuous-Integration: Install and use load based balancer plugin - https://phabricator.wikimedia.org/T84911#962396 (10Krinkle) [11:45:29] 3Continuous-Integration: Jenkins: Figure out long term solution for /tmp management - https://phabricator.wikimedia.org/T74011#962398 (10Krinkle) [11:45:42] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [11:46:28] Project beta-scap-eqiad build #37327: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37327/ [11:47:34] hasharSilent: What does "Ready" mean for the CI workboard? [11:47:45] (as opposed to Done or Resolved) [11:47:55] https://phabricator.wikimedia.org/T73062 [11:49:57] 3Continuous-Integration: Puppet broken on integration slaves: install_zuul - https://phabricator.wikimedia.org/T84917#962401 (10Krinkle) 5Open>3Resolved I think the puppet logic for this is now working for existing and new nodes. Re-open if otherwise. 
[11:50:19] 3Continuous-Integration: Chromium user profiles sometimes get left behind in /tmp - https://phabricator.wikimedia.org/T75966#962403 (10Krinkle) p:5High>3Low [11:52:58] (03PS1) 10Zfilipin: Run VisualEditor screenshot job on a Mac [integration/config] - 10https://gerrit.wikimedia.org/r/183480 (https://phabricator.wikimedia.org/T78648) [11:54:16] (03CR) 10Amire80: [C: 031] Run VisualEditor screenshot job on a Mac [integration/config] - 10https://gerrit.wikimedia.org/r/183480 (https://phabricator.wikimedia.org/T78648) (owner: 10Zfilipin) [11:56:07] (03PS9) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 [11:56:13] (03CR) 10Adrian Lang: Make mwext-WikibaseJavaScriptApi-qunit voting (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (owner: 10Adrian Lang) [11:56:47] RECOVERY - Puppet failure on deployment-eventlogging02 is OK: OK: Less than 1.00% above the threshold [0.0] [11:59:58] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » en,contintLabsSlave && UbuntuTrusty build #1: FAILURE in 20 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=en,label=contintLabsSlave%20&&%20UbuntuTrusty/1/ [12:02:48] RECOVERY - Puppet failure on deployment-cache-bits01 is OK: OK: Less than 1.00% above the threshold [0.0] [12:03:40] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [12:14:15] 3Continuous-Integration: yamllint no longer fails on invalid syntax - https://phabricator.wikimedia.org/T76508#962436 (10Krinkle) [12:14:31] (03CR) 10Krinkle: "Caused T76508. Somehow safe_load_all isn't throwing an exception." [integration/jenkins] - 10https://gerrit.wikimedia.org/r/57304 (owner: 10Hashar) [12:16:12] Yippee, build fixed! 
[12:16:13] Project beta-scap-eqiad build #37328: FIXED in 28 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37328/ [12:17:48] 3Quality-Assurance, Release-Engineering: Advanced Topics in Browser Test Automation - https://phabricator.wikimedia.org/T86070#962451 (10Qgil) p:5Triage>3Normal [12:19:00] (03PS1) 10Krinkle: yamllint: Use safe_load instead of safe_load_all [integration/jenkins] - 10https://gerrit.wikimedia.org/r/183483 (https://phabricator.wikimedia.org/T76508) [12:19:16] (03CR) 10Krinkle: "Fixed in I14b37801c5d35fa062a38c3034083844b1eb0237." [integration/jenkins] - 10https://gerrit.wikimedia.org/r/57304 (owner: 10Hashar) [12:21:12] 3Continuous-Integration: CI browser test dashboard takes 100 seconds to appear on first load - https://phabricator.wikimedia.org/T72671#962466 (10Krinkle) 5Open>3Resolved a:3hashar [12:22:13] 3Continuous-Integration: Add --no-autoloader_layout-check to operations-puppet-puppetlint-lenient - https://phabricator.wikimedia.org/T75117#962471 (10Krinkle) [12:24:37] 3Continuous-Integration: Add regression tests for slave-script tools - https://phabricator.wikimedia.org/T86158#962486 (10Krinkle) 3NEW [12:24:58] 3Continuous-Integration: Add regression tests for slave-script tools - https://phabricator.wikimedia.org/T86158#962486 (10Krinkle) [12:25:20] 3Continuous-Integration: /tmp/sess_* left behind on Jenkins slaves (hhvm php sessions) - https://phabricator.wikimedia.org/T65611#962497 (10Krinkle) p:5Normal>3Low [12:46:19] 3Continuous-Integration, MediaWiki-Unit-tests: Evaluate using vfsStream for file-system interaction in PHPUnit tests - https://phabricator.wikimedia.org/T86163#962548 (10Krinkle) 3NEW [12:48:22] 3Continuous-Integration, MediaWiki-Unit-tests: Evaluate using vfsStream for file-system interaction in PHPUnit tests - https://phabricator.wikimedia.org/T86163#962557 (10Krinkle) [13:17:57] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - 
https://phabricator.wikimedia.org/T55978#962605 (10Krinkle) [13:18:11] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - https://phabricator.wikimedia.org/T55978#578885 (10Krinkle) p:5Triage>3Normal [13:23:01] 3Continuous-Integration: Jenkins: JSDuck should run on Ruby 1.9 instead of Ruby 1.8 - https://phabricator.wikimedia.org/T62138#962614 (10Krinkle) [13:23:03] 3Continuous-Integration, MediaWiki-Documentation: Doxygen 1.8.x is too slow on Trusty labs instance - https://phabricator.wikimedia.org/T75311#962613 (10Krinkle) 5Open>3Resolved [13:23:34] 3Continuous-Integration: /tmp/MWDocGen-* files are left behind on Jenkins slaves - https://phabricator.wikimedia.org/T84973#962615 (10Krinkle) p:5Normal>3Low [13:23:54] 3Continuous-Integration: AbuseFilter requires the AntiSpoof extension - https://phabricator.wikimedia.org/T84859#962616 (10Krinkle) [13:24:23] 3Continuous-Integration: Jenkins should flag usage of deprecated features - https://phabricator.wikimedia.org/T53908#962617 (10Krinkle) [13:25:33] 3Continuous-Integration: /tmp/bundler* directories left behind on Jenkins slaves - https://phabricator.wikimedia.org/T84974#962618 (10Krinkle) p:5Triage>3Low [13:26:17] 3Continuous-Integration: Set up Jenkin jobs for phabricator/* repos - https://phabricator.wikimedia.org/T70263#962619 (10Krinkle) [13:26:25] 3Continuous-Integration: Set up Jenkins jobs for phabricator/* repos - https://phabricator.wikimedia.org/T70263#723397 (10Krinkle) [13:26:42] 3Continuous-Integration: Jenkins: Set up perceptual diffs (visual regression testing) - https://phabricator.wikimedia.org/T64633#962625 (10Krinkle) p:5Normal>3Volunteer? [13:32:40] 3Continuous-Integration: Allow tests to specify what extensions and or what order things are loaded in - https://phabricator.wikimedia.org/T72250#962634 (10Krinkle) I see a lot of proposals for implementation but no specific use cases or issues it would solve. Would the following not work? 
* Include extension... [13:33:02] 3Continuous-Integration: Allow tests to specify what extensions and or what order things are loaded in - https://phabricator.wikimedia.org/T72250#962637 (10Krinkle) [13:33:11] 3Wikidata, Continuous-Integration: Allow tests to specify what extensions and or what order things are loaded in - https://phabricator.wikimedia.org/T72250#742187 (10Krinkle) [13:33:53] 3Continuous-Integration: Remove dependency on git.wikimedia.org - https://phabricator.wikimedia.org/T74001#962640 (10Krinkle) p:5Normal>3Low [13:52:42] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - https://phabricator.wikimedia.org/T55978#962689 (10hashar) I am merging this into its parent task {T47499}. Will file new tasks for each of the items listed there. [13:52:54] 3Continuous-Integration: Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#514898 (10hashar) [13:52:56] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - https://phabricator.wikimedia.org/T55978#962691 (10hashar) [13:56:17] 3Continuous-Integration: [upstream] Jenkins: jsduck test is sometimes passing when the build contains warnings - https://phabricator.wikimedia.org/T57668#962703 (10Krinkle) 5Open>3Resolved a:3Krinkle Change-Id: [Ic01c83ffec3c135c44f1a3df40314f6935c72507](https://gerrit.wikimedia.org/r/#/c/113033/4) [13:56:23] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: Create labs project for CI disposables instances - https://phabricator.wikimedia.org/T86167#962706 (10hashar) 3NEW [13:56:26] 3Continuous-Integration: [upstream] Jenkins: jsduck test is sometimes passing when the build contains warnings - https://phabricator.wikimedia.org/T57668#962714 (10Krinkle) [13:58:48] 3Continuous-Integration: Empty .git/config for mediawiki/core.git clone in mediawiki-phpunit workspace on gallium - https://phabricator.wikimedia.org/T78474#962717 (10Krinkle) 
5Open>3Resolved a:3Krinkle Too many variables and not enough data left to investigate now. Let's attribute this to random error and... [14:00:17] 3Continuous-Integration: Isolate contintcloud labs project from rest of the labs project - https://phabricator.wikimedia.org/T86168#962720 (10hashar) 3NEW [14:00:33] 3Continuous-Integration: Isolate contintcloud labs project from rest of the labs project - https://phabricator.wikimedia.org/T86168#962720 (10hashar) [14:02:22] 3Continuous-Integration: run i18n message checks in Jenkins (tracking) - https://phabricator.wikimedia.org/T57456#962735 (10Krinkle) [14:02:25] 3Continuous-Integration: Create generic banana jenkins job template for comparing en.json and qqq.json entries - https://phabricator.wikimedia.org/T66045#962736 (10Krinkle) [14:02:28] 3Continuous-Integration: run i18n message checks in Jenkins (tracking) - https://phabricator.wikimedia.org/T57456#597953 (10Krinkle) [14:04:25] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#962741 (10hashar) 3NEW [14:04:40] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: Create labs project for CI disposables instances - https://phabricator.wikimedia.org/T86167#962749 (10hashar) [14:05:02] 3Continuous-Integration: Create generic banana jenkins job template for comparing en.json and qqq.json entries - https://phabricator.wikimedia.org/T66045#962750 (10Krinkle) 5Open>3declined Yep. Projects should control what tasks they execute during a build. That makes for a faster and more scalable infrastru... [14:06:12] 3Continuous-Integration: Jenkins: Add a lint job for JSON - https://phabricator.wikimedia.org/T60279#962752 (10Krinkle) 5Open>3Resolved Whether ideal for the long term or not, it was resolved by Antoine by including it in the jslint job. For the long term, projects should include a jsonlint task in their ow... 
[14:06:28] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: Create labs project for CI disposables instances - https://phabricator.wikimedia.org/T86167#962706 (10hashar) [14:06:40] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#962755 (10hashar) [14:07:35] 3Continuous-Integration: Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#962758 (10hashar) [14:08:07] 3Continuous-Integration: Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#514898 (10hashar) [14:08:27] 3Continuous-Integration: Jenkins: Fix "PHP Warning: Unable to load dynamic library '/usr/lib/php5/20090626/apc.so' " on integration slaves - https://phabricator.wikimedia.org/T68093#962763 (10Krinkle) 5Open>3Invalid a:3Krinkle Can't reproduce this error. Things have changed since and we're no longer trigge... [14:09:38] PROBLEM - Puppet failure on deployment-fluoride is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [14:11:00] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - https://phabricator.wikimedia.org/T55978#962770 (10hashar) [14:11:44] 3Continuous-Integration, Wikimedia-Labs-General: Create labs project for continuous integration nodepool - https://phabricator.wikimedia.org/T55978#578885 (10hashar) I have filed new tasks for the list of items that were there: {T86168} {T86170} {T84989} [14:12:53] 3Continuous-Integration: Jenkins: JSDuck should run on Ruby 1.9 instead of Ruby 1.8 - https://phabricator.wikimedia.org/T62138#962777 (10Krinkle) [14:13:59] 3Continuous-Integration: Design the Jenkins isolation architecture - https://phabricator.wikimedia.org/T86171#962780 (10hashar) 3NEW a:3hashar [14:16:33] 3Continuous-Integration: Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#962806 (10hashar) [14:27:24] (03CR) 
10Nikerabbit: "Nice!" [integration/config] - 10https://gerrit.wikimedia.org/r/183471 (owner: 10Zfilipin) [14:29:05] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: Create labs project for CI disposables instances - https://phabricator.wikimedia.org/T86167#962706 (10hashar) [14:34:10] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #116: FAILURE in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/116/ [14:34:40] RECOVERY - Puppet failure on deployment-fluoride is OK: OK: Less than 1.00% above the threshold [0.0] [14:34:42] (03PS2) 10Krinkle: Clean up phpcs macros and jobs (remove strict/lenient split) [integration/config] - 10https://gerrit.wikimedia.org/r/166071 (https://bugzilla.wikimedia.org/48420) (owner: 10Jforrester) [14:34:58] (03CR) 10Krinkle: "Rebased to resolve merge conflict." [integration/config] - 10https://gerrit.wikimedia.org/r/166071 (https://bugzilla.wikimedia.org/48420) (owner: 10Jforrester) [14:39:04] hashar: I learned today that JJB has a 'remove jobs' option [14:39:10] to remove jobs not in jjb [14:39:32] Aside from debug jobs for us manually, do we have anything left? [14:39:54] (03CR) 10Krinkle: [C: 031] Clean up phpcs macros and jobs (remove strict/lenient split) [integration/config] - 10https://gerrit.wikimedia.org/r/166071 (https://bugzilla.wikimedia.org/48420) (owner: 10Jforrester) [14:40:11] (03CR) 10Krinkle: "Note that in the long run this is going away anyway, with local entry points. But this is a good clean up for now." [integration/config] - 10https://gerrit.wikimedia.org/r/166071 (https://bugzilla.wikimedia.org/48420) (owner: 10Jforrester) [14:42:21] (03CR) 10Hashar: [C: 032] "Nice catch, thanks a lot!" 
[integration/jenkins] - 10https://gerrit.wikimedia.org/r/183483 (https://phabricator.wikimedia.org/T76508) (owner: 10Krinkle) [14:42:25] (03Merged) 10jenkins-bot: yamllint: Use safe_load instead of safe_load_all [integration/jenkins] - 10https://gerrit.wikimedia.org/r/183483 (https://phabricator.wikimedia.org/T76508) (owner: 10Krinkle) [14:44:00] !log on gallium and lanthanum, pushing integration/jenkins.git which would: 1b6a290 - Upgrade JSHint from v2.5.6 to 2.5.11 [14:44:03] Krinkle: ^^^ [14:44:05] Logged the message, Master [14:44:16] Krinkle: the JSHint update added to integration/jenkins did not get git deployed :D [14:44:30] Krinkle: congratulations on figuring out the issue with the yaml linter! [14:45:16] (03CR) 10Hashar: "Deployed with git-deploy on gallium/lanthanum. The labs slaves will be updated by puppet." [integration/jenkins] - 10https://gerrit.wikimedia.org/r/183483 (https://phabricator.wikimedia.org/T76508) (owner: 10Krinkle) [14:45:48] 3Continuous-Integration: yamllint no longer fails on invalid syntax - https://phabricator.wikimedia.org/T76508#962859 (10hashar) a:3Krinkle [14:46:08] 3Continuous-Integration: yamllint no longer fails on invalid syntax - https://phabricator.wikimedia.org/T76508#962860 (10hashar) 5Open>3Resolved Deployed with git-deploy on gallium/lanthanum. The labs slaves will be updated by puppet. [14:48:28] 3Continuous-Integration: Jenkins: jshint should not use wrongly-inherited integration/docroot/.jshintrc by default - https://phabricator.wikimedia.org/T54456#962867 (10Krinkle) 5Open>3declined a:3Krinkle * JSHint does nothing without a config file. * Do not use JSHint without a .jshintrc file in your repos... 
[14:48:42] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:49:43] 3Continuous-Integration: Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#962874 (10yuvipanda) [14:49:44] 3Continuous-Integration: Isolate contintcloud labs project from rest of the labs project - https://phabricator.wikimedia.org/T86168#962875 (10yuvipanda) [14:49:46] 3Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#962876 (10yuvipanda) [14:51:33] 3Continuous-Integration: Migrate JSDuck jobs to Ubuntu Trusty - https://phabricator.wikimedia.org/T86174#962879 (10Krinkle) 3NEW [14:51:46] 3Continuous-Integration: Migrate jsduck-publish jobs to run in labs via integration-publisher - https://phabricator.wikimedia.org/T86175#962885 (10Krinkle) 3NEW [14:52:51] 3Continuous-Integration: Extension unit tests do not run due to not being able to load entry file - https://phabricator.wikimedia.org/T71247#962899 (10Krinkle) p:5Normal>3Volunteer? [14:53:36] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [14:55:27] (03PS10) 10Tobias Gritschacher: Make mwext-WikibaseJavaScriptApi-qunit voting [integration/config] - 10https://gerrit.wikimedia.org/r/180418 (https://phabricator.wikimedia.org/T86176) (owner: 10Adrian Lang) [14:58:34] !log [[Nova_Resource:Contintcloud|contintcloud labs project]] has been created! {{bug|T86170}}. Added Krinkle and 20after4 as project admins. 
[14:58:36] Logged the message, Master [14:58:54] 3Continuous-Integration: Provide a way to have a demo directory alongside the documentation on doc.wikimedia.org - https://phabricator.wikimedia.org/T62143#962924 (10Krinkle) [14:59:07] 3Continuous-Integration: Provide a way to have a demo directory alongside the documentation on doc.wikimedia.org - https://phabricator.wikimedia.org/T62143#665972 (10Krinkle) Once we have dev.wikimedia.org up and running, we should migrate documentation to there, which will help remove the awkward terminology ov... [15:00:55] 3Continuous-Integration: Provide a way to have a demo directory alongside the documentation on doc.wikimedia.org - https://phabricator.wikimedia.org/T62143#962939 (10hashar) Can you ship the demo directly in the doc ? Ie: https://doc.wikimedia.org/visualeditor/v0.1.0/doc/demos/ ? They could be linked from the ma... [15:18:48] RECOVERY - Puppet failure on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [15:20:55] (03PS2) 10Krinkle: Change parsoidsvc-jslint back to UbuntuPrecise [integration/config] - 10https://gerrit.wikimedia.org/r/183277 [15:20:58] (03PS3) 10Krinkle: Change parsoidsvc-jslint back to UbuntuPrecise [integration/config] - 10https://gerrit.wikimedia.org/r/183277 [15:21:26] (03PS1) 10Krinkle: cleanup: Remove redundant job-group 'mediawiki-jobs' [integration/config] - 10https://gerrit.wikimedia.org/r/183507 [15:22:44] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [15:25:18] (03CR) 10jenkins-bot: [V: 04-1] cleanup: Remove redundant job-group 'mediawiki-jobs' [integration/config] - 10https://gerrit.wikimedia.org/r/183507 (owner: 10Krinkle) [15:29:45] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [15:38:44] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] 
[15:47:43] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0] [15:50:27] (CR) Krinkle: [C: 2] Change parsoidsvc-jslint back to UbuntuPrecise [integration/config] - https://gerrit.wikimedia.org/r/183277 (owner: Krinkle) [15:54:46] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [15:57:27] (Merged) jenkins-bot: Change parsoidsvc-jslint back to UbuntuPrecise [integration/config] - https://gerrit.wikimedia.org/r/183277 (owner: Krinkle) [15:59:45] Parsoid, RESTBase, Continuous-Integration, Services: Move Parsoid testing from Travis CI to our Jenkins - https://phabricator.wikimedia.org/T78410#963104 (GWicke) @krinkle, this issue was / is about restbase testing, which requires cassandra. [15:59:46] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [16:00:10] Parsoid, RESTBase, Continuous-Integration, Services: Move Parsoid and RESTBase testing from Travis CI to our Jenkins - https://phabricator.wikimedia.org/T78410#963105 (GWicke) [16:08:42] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [16:11:52] Continuous-Integration: Provide a way to have a demo directory alongside the documentation on doc.wikimedia.org - https://phabricator.wikimedia.org/T62143#963115 (Krinkle) >>! In T62143#962939, @hashar wrote: > Can you ship the demo directly in the doc ? Ie: https://doc.wikimedia.org/visualeditor/v0.1.0/doc/d... [16:13:18] Continuous-Integration: Write a cronjob to compress old Jenkins builds' logs - https://phabricator.wikimedia.org/T65939#963128 (Krinkle) [16:13:59] hashar: did you enable yamllint as voting in twn repo? 
https://integration.wikimedia.org/ci/job/translatewiki-yamllint/2972/console [16:14:26] Continuous-Integration: Gallium must be backed up (tracking) - https://phabricator.wikimedia.org/T65934#963135 (Krinkle) [16:14:28] Continuous-Integration: Write a cronjob to compress old Jenkins builds' logs - https://phabricator.wikimedia.org/T65939#710931 (Krinkle) Open>declined a:Krinkle Per Antoine, let's instead use Jenkins' built-in mechanism to purge old builds in general. Thus removing them entirely instead of compressing. [16:15:01] (CR) Nikerabbit: "Translatewiki.net uses:" [integration/jenkins] - https://gerrit.wikimedia.org/r/183483 (https://phabricator.wikimedia.org/T76508) (owner: Krinkle) [16:15:05] Continuous-Integration: Jenkins: Figure out long term solution for /tmp management - https://phabricator.wikimedia.org/T74011#963138 (Krinkle) [16:15:07] MediaWiki-Unit-tests, Continuous-Integration: /tmp/sites-******.json files are left behind on Jenkins slaves - https://phabricator.wikimedia.org/T84970#963136 (Krinkle) Open>Resolved a:aude>Krinkle [16:15:12] MediaWiki-Unit-tests, Continuous-Integration: /tmp/sites-******.json files are left behind on Jenkins slaves - https://phabricator.wikimedia.org/T84970#935648 (Krinkle) [16:16:43] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:23:15] Krinkle: ^ [16:24:54] Nikerabbit: No, it's not new as voting. It's been voting for a while [16:25:06] Nikerabbit: The difference is that it was previously always passing regardless of a syntax error [16:25:10] now it is actually checking the syntax [16:25:18] Nikerabbit: However for TWN the failure is something else [16:25:28] Nikerabbit: You have more than one yaml blob in a single file. [16:26:01] Krinkle: yes we do [16:26:04] Nikerabbit: Do you know a way to validate that? Or perhaps it's worth removing that from TWN and consolidating it into one keyed by something? 
[16:26:25] Nikerabbit: e.g. { a: {}, b: {}, c } instead of [doc: a] {}, doc: b: {}, doc: c {} [16:26:51] Krinkle: it's working well as it is, I'd rather keep it [16:27:04] Krinkle: not directly related, but I'm working on extended validation for Translate: https://phabricator.wikimedia.org/T86000 [16:27:25] the lint never worked afaik, patches welcome for validating that kind of yaml file. otherwise, should I make it non-voting? [16:28:13] Krinkle: I guess that is the best option for now, it's blocking other work and at minimum I need a migration period if I choose to change anything [16:29:02] (PS3) Jforrester: Clean up phpcs macros and jobs (remove strict/lenient split) [integration/config] - https://gerrit.wikimedia.org/r/166071 (https://phabricator.wikimedia.org/T50420) [16:30:15] Krinkle: if it has not been validating stuff till now, it's been of dubious help anyway [16:30:47] in twn repo... of course other repos are different in that sense [16:31:08] translatewiki.net, Continuous-Integration: Support multiple documents in yamlllint - https://phabricator.wikimedia.org/T86194#963167 (Krinkle) NEW [16:32:32] (PS1) Krinkle: Disable translatewiki-yamllint [integration/config] - https://gerrit.wikimedia.org/r/183522 (https://phabricator.wikimedia.org/T86194) [16:32:53] (CR) Krinkle: [C: 2] Disable translatewiki-yamllint [integration/config] - https://gerrit.wikimedia.org/r/183522 (https://phabricator.wikimedia.org/T86194) (owner: Krinkle) [16:33:48] (Merged) jenkins-bot: Disable translatewiki-yamllint [integration/config] - https://gerrit.wikimedia.org/r/183522 (https://phabricator.wikimedia.org/T86194) (owner: Krinkle) [16:34:53] !log Reload Zuul to deploy I9bed999493feb715 [16:34:55] Logged the message, Master [16:34:59] Nikerabbit: ^ [16:36:38] Krinkle: thank you very much [16:37:07] (PS1) Hashar: Test for /tools/yamllint.py [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 
(https://phabricator.wikimedia.org/T86158) [16:37:10] (CR) jenkins-bot: [V: -1] Test for /tools/yamllint.py [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) (owner: Hashar) [16:37:41] pfff [16:39:09] (CR) Hashar: "That is not really smart. PHPUnit 3.7 does not support assertNotFalse, and the bad yaml fixture causes the yamllint job to choke :-D" [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) (owner: Hashar) [16:39:43] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [16:42:35] Labs-Team, Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#963208 (hashar) Dear #Labs-Team , do you have any idea how to provide OpenStack API credentials for the `contintcloud` project ? Would use i... [16:43:47] Labs-Team, Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#963215 (yuvipanda) Where are you going to call the API from? From inside labs or from inside production? [16:45:28] hashar: not-assertions are an anti-pattern. Assert explicitly. If you don't care what the result is, then the test shouldn't care either. Give it something explicit :) [16:46:05] e.g. True( strlen() > 0 ) [16:46:23] hashar: Hm.. nice catch. [16:46:35] yeah should just assert it is a string [16:46:40] hashar: Add integration/jenkins.git:/composer.json with phpunit dev-dependency and scripts.tests? 
[16:46:41] realpath() returns false when it can't find the file [16:46:42] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [16:47:04] Hehe :) [16:47:13] (to use newer version if you like) [16:48:47] (PS2) Hashar: Test for /tools/yamllint.py [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) [16:49:01] yeah could use composer instead :d [16:49:27] zuul is deadlocked :-/ [16:51:43] (CR) Hashar: "recheck" [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) (owner: Hashar) [16:52:18] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #475: ABORTED in 4 min 0 sec: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/475/ [16:52:19] (CR) jenkins-bot: [V: -1] Test for /tools/yamllint.py [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) (owner: Hashar) [16:53:15] (CR) Hashar: "replaced assertNotFalse by asserting realpath() returned a string." [integration/jenkins] - https://gerrit.wikimedia.org/r/183525 (https://phabricator.wikimedia.org/T86158) (owner: Hashar) [17:00:33] * greg-g notices that Krinkle was busy overnight in Phab [17:03:36] Labs-Team, Continuous-Integration, Wikimedia-Labs-Infrastructure: OpenStack API account to control `contintcloud` labs project - https://phabricator.wikimedia.org/T86170#963332 (chasemp) p:Triage>Normal [17:04:25] operations, Continuous-Integration: Acquire old production API servers for use in CI - https://phabricator.wikimedia.org/T84940#963335 (hashar) We might have a use for them to set some CI supporting servers straight inside the labs infrastructure. That would host the Zuul mergers which provides the patch set... 
[17:04:44] Continuous-Integration: Design the Jenkins isolation architecture - https://phabricator.wikimedia.org/T86171#963337 (hashar) p:Triage>High [17:06:19] translatewiki.net, Continuous-Integration: Support multiple documents in yamlllint - https://phabricator.wikimedia.org/T86194#963342 (hashar) The tools/yamllint does not use safe_load_all() because it does not raise an exception when one of the documents is invalid. It probably needs an extra checks to valid... [17:08:30] greg-g: I have reorganized the tasks related to running jobs in disposable VMs \O/ [17:08:36] greg-g: bunch of blockers to https://phabricator.wikimedia.org/T47499 [17:08:42] yay! [17:08:47] * greg-g is catching up on bug mail [17:08:51] er task mail [17:08:52] will do the arch documentation tomorrow / monday hopefully [17:08:53] whatever [17:08:55] cool [17:12:06] I am off. Take care! [17:20:36] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [17:30:40] Release-Engineering: Update Beta Cluster status documentation (re Q3 intradepartamental priority) - https://phabricator.wikimedia.org/T1000#963383 (greg) [17:33:23] sorry for in-coming bug spam [17:34:47] (or maybe it doesn't announce column changes?) [17:34:48] greg-g is a Phab MADMAN :) [17:35:12] OCD is more like it :) [17:35:36] greg-g: as an update, we have probably tracked down the DNS issue, and are working on a fix. [17:35:42] YuviPanda: yay! [17:36:33] Release-Engineering: l10nupdate broken in production with scap related errors - https://phabricator.wikimedia.org/T1383#963465 (greg) p:Unbreak!>Normal [17:36:54] bah! [17:37:37] Release-Engineering: l10nupdate broken in production with scap related errors - https://phabricator.wikimedia.org/T1383#24196 (greg) p:Normal>Unbreak! [17:40:29] Yippee, build fixed! 
[17:40:30] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #476: FIXED in 47 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/476/ [17:41:03] Release-Engineering, Phabricator, § Phabricator-Sprint-Extension: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#963565 (Christopher) Thanks Hashar and Andre. Just to clarify. I have been using Scrutinizer for a build which runs the tests an... [17:45:36] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:35] * greg-g decides not to bug spam the #contint project... just yet [17:47:45] I did enough with the beta cluster/deployment systems one today :) [17:48:18] (beta cluster still has a ton more to move, but I think I should move on to more useful things) [17:49:54] MediaWiki-Core-Team, Librarization, Continuous-Integration: Set up composer validate job for operations/mediawiki-config - https://phabricator.wikimedia.org/T76621#963607 (bd808) p:Normal>Low [17:50:38] MediaWiki-Core-Team, Librarization, Continuous-Integration: Set up composer validate job for operations/mediawiki-config - https://phabricator.wikimedia.org/T76621#807514 (bd808) p:Low>Normal [17:54:48] Project beta-scap-eqiad build #37365: FAILURE in 1.4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37365/ [17:57:51] Release-Engineering, Phabricator, § Phabricator-Sprint-Extension: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#963673 (mmodell) @christopher: I didn't realize that only master gets replicated to github. I'll merge production into master so... 
[17:58:17] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:58:49] Release-Engineering: Puppet failure on deployment-sentry2 - https://phabricator.wikimedia.org/T78411#963692 (greg) p:Triage>Normal [18:00:53] Release-Engineering: Code Deploy Dashboard - https://phabricator.wikimedia.org/T280#963715 (greg) p:Low>Volunteer? [18:02:10] Release-Engineering, Phabricator, § Phabricator-Sprint-Extension: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#963719 (Christopher) actually, the production branch does get pushed to github... [18:03:35] operations, Release-Engineering: performance testing environment - https://phabricator.wikimedia.org/T67394#963721 (greg) [18:05:01] bd808: did sudo fail for you for a few mins? [18:05:35] YuviPanda: *shrug* I haven't been on a beta box for a few days \o/ [18:05:48] bd808: hmm, do you have a cron running as you? [18:05:53] Release-Engineering: Create phpunit test in mediawiki-config repo to validate Parsoid settings - https://phabricator.wikimedia.org/T70532#963734 (greg) p:Triage>Low [18:05:53] yeah [18:05:59] bd808: ah, ok [18:06:03] the hhvm core cleaner script [18:06:22] bd808: oh, is it cleaning them off /var/tmp? [18:06:36] hmm, I’ve set the kernel param for core dumps to /data/project/cores [18:06:39] but I don’t see any there [18:06:43] I suppose hhvm sets its own location [18:06:52] I think it may [18:07:08] Release-Engineering, Phabricator, § Phabricator-Sprint-Extension: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#963738 (mmodell) Actually, the production branch is on github: https://github.com/wikimedia/phabricator-phabricator/tree/producti... 
[18:07:58] (PS4) Krinkle: Clean up phpcs macros and jobs (remove strict/lenient split) [integration/config] - https://gerrit.wikimedia.org/r/166071 (https://phabricator.wikimedia.org/T50420) (owner: Jforrester) [18:08:12] greg-g: ^5 as in ‘to the power of 5’ or ‘5 lines up’? [18:08:45] YuviPanda: as in "high five" [18:08:50] greg-g: hahaa! :D [18:08:52] :) [18:09:00] greg-g: sort of a fib. I did log in to look at something broken yesterday or the day before but did not fix it myself [18:09:08] close enough [18:09:13] greg-g: we cleaned up the DNS issue, so puppet should be happier [18:09:20] puppet AND scap [18:09:22] wheeeee thanks YuviPanda [18:09:28] a bunch of hosts failed while we restarted the DNS server [18:09:42] greg-g: apparently dnsmasq defaulted to setting TTL of 0 for everything, and so nothing cached the DNS responses... [18:09:55] yeah I think I saw a 503 from the API on beta labs briefly [18:09:57] that will make things suck [18:10:00] YuviPanda: that's.... awesome [18:10:17] we were looking at other solutions because it did not occur to us that… it would be this bad [18:10:17] 0 ttl is stupid [18:10:36] so we set up a local recursor for the proxies (which were responsible for about 40% of dns traffic) [18:10:56] and then the local recursor also did not cache, but told us why, and then a simple dig confirmed 0 ttl and ‘ewwwww'w [18:11:57] anyway, we made it set a 300 ttl [18:11:57] now [18:12:08] I wonder how long ttl has been set to 0. [18:12:26] chrismcmahon: since the beginning [18:12:28] of labs [18:12:57] yeah. I have no problem believing it was set that way when I first encountered beta labs 3 years ago [18:13:12] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [18:15:45] Yippee, build fixed! 
[18:15:45] Project beta-scap-eqiad build #37367: FIXED in 1 min 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/37367/ [18:16:21] ^ I'll take that as the first sign of good things to come from the DNS issue [18:16:44] * greg-g crosses fingers/knocks on wood/throws salt over left shoulder/runs away from a black cat [18:24:39] greg-g: more details on https://phabricator.wikimedia.org/T72076. [18:26:03] !log purged nscd cache on all deployment-prep hosts [18:26:06] Logged the message, Master [18:34:46] greg-g: chrismcmalunch everything seems to have recovered, at least for deployment-prep. I’ll leave the bug open for another day [18:35:12] YuviPanda: thanks sir! [18:35:19] yw [18:39:48] Beta-Cluster: Rename all occurences of "deployment-prep" to "beta-cluster" - https://phabricator.wikimedia.org/T74694#963815 (yuvipanda) *poke*. We should decline this if the releng team thinks it isn't worth the hassle. [18:44:26] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce build #405: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce/405/ [18:59:22] Krinkle: https://gerrit.wikimedia.org/r/#/c/183450/ doesn't appear in the Zuul status board… jenkins/zuul/gearman/whatever having issues again? [19:01:28] Continuous-Integration: test and gate-and-submit jobs are usually redundant - https://phabricator.wikimedia.org/T78328#963906 (Krinkle) [19:05:44] !log Zuul/Gearman stuck [19:05:46] Logged the message, Master [19:05:51] Release-Engineering, Phabricator, § Phabricator-Sprint-Extension: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#963913 (Christopher) Do I still trigger the request for deployment with a change to operations/puppet/manifests/role/phabricator.p... 
[19:06:13] James_F: The best indicator is 2nd (top-right) graph on the bottom of the Zuul dashboard [19:06:19] if you see a build up of blue, that's stuck [19:06:46] Krinkle: Ah. But those charts don't auto-refresh, unlike the rest of the interface, right? OK, will remember. [19:06:52] And the 4th (bottom-right) graph being flat (Gearman not responding, so it appears app values are 0) [19:07:19] the latter is better now that I think about it :) [19:07:27] Right. [19:08:07] !log Relaunched Gearman from Jenkins manager [19:08:10] Logged the message, Master [19:21:12] operations, Release-Engineering: /usr/local/bin/deploy2graphite broken on tin due to nc command syntax - https://phabricator.wikimedia.org/T1387#963965 (Reedy) ``` reedy@ubuntu64-web-esxi:~/git/operations/puppet$ grep -R MW_STATSD_HOST * files/misc/scripts/deploy2graphite: echo "deploy.${1}:1|c" |... [19:21:14] !log Gearman is back up but Zuul itself still stuck (no longer processing new events, doing "Updating information for .." 
for the same three jobs over and over again) [19:21:16] Logged the message, Master [19:21:18] !log Force restart Zuul [19:21:19] Logged the message, Master [19:25:57] operations, Release-Engineering: /usr/local/bin/deploy2graphite broken on tin due to nc command syntax - https://phabricator.wikimedia.org/T1387#963987 (Reedy) https://gerrit.wikimedia.org/r/#/c/183568/ [19:33:30] (PS1) Krinkle: zuul: Update graphite graphs together with Zuul status [integration/docroot] - https://gerrit.wikimedia.org/r/183614 [19:33:36] James_F: ^ [19:33:45] (CR) Krinkle: [C: 2] zuul: Update graphite graphs together with Zuul status [integration/docroot] - https://gerrit.wikimedia.org/r/183614 (owner: Krinkle) [19:33:48] And live :) [19:33:57] Continuous-Integration: test and gate-and-submit jobs are usually redundant - https://phabricator.wikimedia.org/T78328#964012 (awight) To clarify, I'm not suggesting we skip the gate-submit jobs, I'm actually saying we should skip or kill the test jobs and rely entirely on gate-submit jobs instead. No proble... [19:34:10] (Merged) jenkins-bot: zuul: Update graphite graphs together with Zuul status [integration/docroot] - https://gerrit.wikimedia.org/r/183614 (owner: Krinkle) [19:35:31] Krinkle: You're awesome. :-) [19:54:00] Krinkle: If you have a second, could you tell what's causing https://gerrit.wikimedia.org/r/#/c/183170/ to V-1? [19:55:08] James_F: npm cache compat break. Possibly one of the upstream dependencies force-pushed a public tag or didn't follow semver. [19:55:14] will flush manually [19:55:29] Krinkle: Ick. Thanks. Any way I could have fixed this without wasting your time, for future reference? [19:55:58] Nope. Digging on the live slaves in a deep directory structure that's prolly not worth documenting. 
ssh [20:07:26] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #439: FAILURE in 42 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/439/ [20:26:04] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #443: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/443/ [20:56:16] Yippee, build fixed! [20:56:16] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #383: FIXED in 47 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/383/ [21:06:22] (PS1) Krinkle: mediawiki-core-npm: Use default entry point instead of tests/frontend [integration/config] - https://gerrit.wikimedia.org/r/183635 [21:15:25] Reedy: Tried to continue in #wikimedia-releng but you aren't there. :) [21:15:46] when did that pop up? :P [21:16:04] The pyc files in tin:/srv/deployment/scap/scap/scap have wonky looking permissions [21:16:34] I'm not sure why they've got me as the owner either [21:17:09] They happen at run time, you ran scap and python built them [21:17:22] ah [21:17:26] the long term problem is that they may not be rebuilt when code changes [21:18:31] (PS1) Merlijn van Deen: + wikibugs2 auto-pull after merge [integration/config] - https://gerrit.wikimedia.org/r/183636 [21:19:54] (PS2) Merlijn van Deen: + wikibugs2 auto-pull after merge [integration/config] - https://gerrit.wikimedia.org/r/183636 [21:26:58] (CR) Legoktm: "I think this will need to be pinned to a labs slave?" 
[integration/config] - https://gerrit.wikimedia.org/r/183636 (owner: Merlijn van Deen) [21:32:21] Beta-Cluster: Rename all occurences of "deployment-prep" to "beta-cluster" - https://phabricator.wikimedia.org/T74694#964297 (greg) It's definitely low (or even "needs volunteer") priority, but... if there is ever the occasion to do it (for whatever reason) we should. [21:33:49] (CR) Krinkle: "It's a public end point. However I'm concerned about whether we should start allowing this kind of postmerge job." [integration/config] - https://gerrit.wikimedia.org/r/183636 (owner: Merlijn van Deen) [21:45:22] (CR) Legoktm: "the PHP file is just:" [integration/config] - https://gerrit.wikimedia.org/r/183636 (owner: Merlijn van Deen) [21:46:46] Collaboration-Team, Beta-Cluster, MediaWiki-extensions-Flow: Beta labs Special:Contributions lags by a long time and notes slow Flow queries - https://phabricator.wikimedia.org/T78671#964335 (Spage) [21:50:29] operations, Release-Engineering: Determine Trebuchet/git-deploy maintenance plan - https://phabricator.wikimedia.org/T85008#964352 (greg) [21:59:10] Yippee, build fixed! [21:59:11] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #388: FIXED in 14 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/388/ [22:13:28] Phabricator.org, Release-Engineering, Phabricator: Answer questions about ongoing maintenance of phabricator customizations/extensions - https://phabricator.wikimedia.org/T78464#964431 (Qgil) Is there any action pending in this task? This is ultimately a matter of priorities and resources, just like in any... [22:39:45] (CR) Merlijn van Deen: "I considered rate-limiting, but that's somewhat complicated. An IP whitelist might be useful. On the other hand... 
it's just a git pull, n" [integration/config] - https://gerrit.wikimedia.org/r/183636 (owner: Merlijn van Deen) [23:22:38] (PS1) Mattflaschen: Have Thanks depend on MobileFrontend [integration/config] - https://gerrit.wikimedia.org/r/183724 [23:25:23] (PS2) Mattflaschen: Have Thanks depend on MobileFrontend [integration/config] - https://gerrit.wikimedia.org/r/183724 [23:46:29] PROBLEM - Free space - all mounts on deployment-cache-upload02 is CRITICAL: CRITICAL: deployment-prep.deployment-cache-upload02.diskspace._srv_vdb.byte_percentfree.value (<100.00%)