[00:00:34] chrismcmahon: ^
[00:02:45] (CR) Cmcmahon: [C: 2] Fixed token error handling [ruby/api] - https://gerrit.wikimedia.org/r/156480 (https://bugzilla.wikimedia.org/70066) (owner: Dduvall)
[00:02:47] (Merged) jenkins-bot: Fixed token error handling [ruby/api] - https://gerrit.wikimedia.org/r/156480 (https://bugzilla.wikimedia.org/70066) (owner: Dduvall)
[00:03:35] marxarelli: thanks! do you have rights to publish the gem and bump the version?
[00:06:00] Project beta-scap-eqiad build #18929: FAILURE in 1 min 52 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18929/
[00:06:03] (PS1) Dduvall: Releasing patch version 0.2.1 [ruby/api] - https://gerrit.wikimedia.org/r/156485
[00:10:02] (CR) Cmcmahon: [C: 2] Releasing patch version 0.2.1 [ruby/api] - https://gerrit.wikimedia.org/r/156485 (owner: Dduvall)
[00:10:04] (Merged) jenkins-bot: Releasing patch version 0.2.1 [ruby/api] - https://gerrit.wikimedia.org/r/156485 (owner: Dduvall)
[00:10:06] (PS1) Dduvall: Bumped runtime dependency for mediawiki_api [selenium] - https://gerrit.wikimedia.org/r/156486
[00:12:05] (CR) Cmcmahon: [C: 2] Bumped runtime dependency for mediawiki_api [selenium] - https://gerrit.wikimedia.org/r/156486 (owner: Dduvall)
[00:12:07] (Merged) jenkins-bot: Bumped runtime dependency for mediawiki_api [selenium] - https://gerrit.wikimedia.org/r/156486 (owner: Dduvall)
[00:13:59] (PS1) Dduvall: Releasing patch version 0.3.2 [selenium] - https://gerrit.wikimedia.org/r/156487
[00:14:58] chrismcmahon: one more ^ :)
[00:15:18] chrismcmahon: and then the fun part: updating every repo's Gemfile :/
[00:15:55] Yippee, build fixed!
[00:15:56] Project beta-scap-eqiad build #18930: FIXED in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18930/
[00:17:00] (CR) Cmcmahon: [C: 2] Releasing patch version 0.3.2 [selenium] - https://gerrit.wikimedia.org/r/156487 (owner: Dduvall)
[00:17:02] (Merged) jenkins-bot: Releasing patch version 0.3.2 [selenium] - https://gerrit.wikimedia.org/r/156487 (owner: Dduvall)
[00:17:49] chrismcmahon: cool. i just built and pushed the gem
[00:18:10] marxarelli: kk. I can update Gemfiles tomorrow, I think I only have one meeting (as opposed to today's 5)
[00:18:35] chrismcmahon: sounds good. let me know if you want to divide and conquer
[00:19:17] marxarelli: let em fail, I'll sort them in the morning. I cleaned up debt in the MF repo today, VE is square, some odds and ends of other stuff I'd be looking at anyway
[00:21:15] chrismcmahon: roger that. on that note, i'm signing off!
[00:21:25] marxarelli: thanks for making the gem go zoom zoom, see you tomorrow
[00:21:45] no problem! see ya
[00:26:14] Project beta-scap-eqiad build #18931: FAILURE in 2 min 13 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18931/
[00:36:12] Yippee, build fixed!
[00:36:13] Project beta-scap-eqiad build #18932: FIXED in 1 min 55 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18932/
[00:47:08] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #599: SUCCESS in 1 min 8 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/599/
[00:53:44] Project UploadWizard-api-commons.wikimedia.org build #481: SUCCESS in 44 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.org/481/
[01:45:27] Project beta-scap-eqiad build #18939: FAILURE in 1 min 17 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18939/
[01:56:08] Yippee, build fixed!
[01:56:09] Project beta-scap-eqiad build #18940: FIXED in 2 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18940/
[02:36:08] Project beta-scap-eqiad build #18944: FAILURE in 2 min 9 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18944/
[02:45:56] Yippee, build fixed!
[02:45:57] Project beta-scap-eqiad build #18945: FIXED in 1 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18945/
[03:05:16] Project browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs build #101: STILL FAILING in 15 sec: https://integration.wikimedia.org/ci/job/browsertests-ZeroBanner-en.m.wikipedia.org-linux-phantomjs/101/
[03:14:33] Project browsertests-TwnMainPage-sandbox.translatewiki.net-linux-firefox-sauce build #98: STILL FAILING in 10 min: https://integration.wikimedia.org/ci/job/browsertests-TwnMainPage-sandbox.translatewiki.net-linux-firefox-sauce/98/
[03:15:47] Project beta-scap-eqiad build #18948: FAILURE in 1 min 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18948/
[03:15:53] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #69: SUCCESS in 1 min 19 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/69/
[03:17:27] Project browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce build #5: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-test2.wikipedia.org-linux-chrome-sauce/5/
[03:17:36] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #125: SUCCESS in 1 min 42 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/125/
[03:19:27] Project browsertests-UniversalLanguageSelector-language-browsertests.wmflabs.org-linux-firefox-sauce build #126: STILL FAILING in 1 min 27 sec: 
https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-language-browsertests.wmflabs.org-linux-firefox-sauce/126/
[03:23:37] Project browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce build #111: STILL FAILING in 22 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-windows_8-internet_explorer-sauce/111/
[03:25:47] Yippee, build fixed!
[03:25:48] Project beta-scap-eqiad build #18949: FIXED in 1 min 37 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18949/
[03:28:06] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce build #136: STILL FAILING in 4 min 27 sec: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce/136/
[03:30:03] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #102: SUCCESS in 1 min 55 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/102/
[03:35:57] Project beta-scap-eqiad build #18950: FAILURE in 1 min 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18950/
[03:38:39] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #131: FAILURE in 19 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/131/
[03:42:42] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #182: STILL FAILING in 25 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/182/
[03:43:40] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_xp-firefox-sauce build #142: STILL FAILING in 13 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_xp-firefox-sauce/142/
[03:44:00] Project 
browsertests-UniversalLanguageSelector-sandbox.translatewiki.net-linux-firefox-sauce build #98: SUCCESS in 1 min 16 sec: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-sandbox.translatewiki.net-linux-firefox-sauce/98/
[03:45:40] Yippee, build fixed!
[03:45:40] Project beta-scap-eqiad build #18951: FIXED in 1 min 25 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18951/
[03:47:13] Project browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce build #42: FAILURE in 1 min 12 sec: https://integration.wikimedia.org/ci/job/browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce/42/
[03:49:14] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #19: STILL FAILING in 5 min 32 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/19/
[04:02:50] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #224: STILL FAILING in 13 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/224/
[04:05:34] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #58: SUCCESS in 2 min 42 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/58/
[04:07:19] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #143: STILL FAILING in 1 min 44 sec: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/143/
[04:08:57] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #161: STILL FAILING in 24 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/161/
[04:12:01] Project browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #102: SUCCESS in 3 
min 2 sec: https://integration.wikimedia.org/ci/job/browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/102/
[04:12:03] Project browsertests-ContentTranslation-language-stage.wmflabs.org-linux-firefox-sauce build #87: STILL FAILING in 1.6 sec: https://integration.wikimedia.org/ci/job/browsertests-ContentTranslation-language-stage.wmflabs.org-linux-firefox-sauce/87/
[04:15:42] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #5: STILL FAILING in 3 min 38 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/5/
[04:16:34] Project browsertests-Translate-meta.wikimedia.org-linux-firefox-sauce build #126: SUCCESS in 50 sec: https://integration.wikimedia.org/ci/job/browsertests-Translate-meta.wikimedia.org-linux-firefox-sauce/126/
[04:17:44] Project browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce build #101: SUCCESS in 1 min 9 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce/101/
[04:18:36] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #69: SUCCESS in 52 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/69/
[04:35:55] Project beta-scap-eqiad build #18956: FAILURE in 1 min 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18956/
[04:45:06] Yippee, build fixed!
[04:45:06] Project beta-scap-eqiad build #18957: FIXED in 1 min 5 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18957/
[04:47:06] Project browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce build #147: STILL FAILING in 1 hr 8 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce/147/
[04:56:58] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #166: STILL FAILING in 49 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/166/
[05:06:48] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #171: STILL FAILING in 9 min 48 sec: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/171/
[05:16:16] Project beta-scap-eqiad build #18960: FAILURE in 2 min 13 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18960/
[05:17:04] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #196: STILL FAILING in 58 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/196/
[05:24:15] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #116: STILL FAILING in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/116/
[05:26:34] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #126: SUCCESS in 2 min 17 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/126/
[05:26:38] Yippee, build fixed!
[05:26:39] Project beta-scap-eqiad build #18961: FIXED in 2 min 28 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18961/
[05:37:46] Yippee, build fixed!
[05:37:46] Project browsertests-Echo-test2.wikipedia.org-linux-firefox-sauce build #5: FIXED in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-test2.wikipedia.org-linux-firefox-sauce/5/
[05:39:31] Yippee, build fixed!
[05:39:31] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #127: FIXED in 1 min 45 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/127/
[05:44:00] Project browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce build #131: STILL FAILING in 56 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce/131/
[05:44:31] Project browsertests-Flow-test2.wikipedia.org-linux-firefox-sauce build #113: STILL FAILING in 27 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-linux-firefox-sauce/113/
[05:47:08] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #128: SUCCESS in 2 min 35 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/128/
[05:48:51] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #127: SUCCESS in 1 min 43 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/127/
[06:14:23] Project browsertests-Translate-sandbox.translatewiki.net-linux-firefox-sauce build #97: STILL FAILING in 25 min: https://integration.wikimedia.org/ci/job/browsertests-Translate-sandbox.translatewiki.net-linux-firefox-sauce/97/
[06:30:30] Project browsertests-Flow-test2.wikipedia.org-linux-chrome-sauce build #114: STILL FAILING in 50 min: 
https://integration.wikimedia.org/ci/job/browsertests-Flow-test2.wikipedia.org-linux-chrome-sauce/114/
[06:32:22] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #70: SUCCESS in 1 min 50 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/70/
[06:38:22] Yippee, build fixed!
[06:38:22] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #130: FIXED in 23 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/130/
[06:38:48] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #600: SUCCESS in 47 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/600/
[06:46:39] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #601: SUCCESS in 39 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/601/
[06:53:30] Project UploadWizard-api-commons.wikimedia.org build #482: SUCCESS in 30 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.org/482/
[06:59:08] Project browsertests-VisualEditor-test2.wikipedia.org-linux-firefox-sauce build #144: STILL FAILING in 1 hr 15 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-test2.wikipedia.org-linux-firefox-sauce/144/
[07:46:21] Project beta-scap-eqiad build #18975: FAILURE in 1 min 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18975/
[07:55:47] Yippee, build fixed!
[07:55:48] Project beta-scap-eqiad build #18976: FIXED in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18976/
[08:36:33] Project beta-scap-eqiad build #18980: FAILURE in 2 min 0 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18980/
[08:47:06] Yippee, build fixed!
[08:47:07] Project beta-scap-eqiad build #18981: FIXED in 2 min 26 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18981/
[08:47:34] (PS3) Hashar: DonationInterface passes jslint now [integration/zuul-config] - https://gerrit.wikimedia.org/r/156179 (owner: Awight)
[08:48:12] (CR) Hashar: [C: 2] "Well done!" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156179 (owner: Awight)
[08:48:28] (Merged) jenkins-bot: DonationInterface passes jslint now [integration/zuul-config] - https://gerrit.wikimedia.org/r/156179 (owner: Awight)
[08:50:29] (CR) Hashar: "Adam: note the change to zuul-config needs to be deployed manually on gallium.wikimedia.org once they have been merged :D" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156194 (owner: Jforrester)
[08:51:55] (CR) Hashar: "Adam thank you. The change is definitely under my radar. I want to write an integration test to ensure both list of emails are coherent a" [integration/zuul-config] - https://gerrit.wikimedia.org/r/151249 (owner: Awight)
[08:53:49] (PS11) Hashar: Enable browser tests for the GettingStarted extension [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/150172 (owner: Phuedx)
[08:54:55] (PS12) Hashar: Enable browser tests for the GettingStarted extension [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/150172 (owner: Phuedx)
[08:57:17] (PS5) Hashar: The first GettingStarted test [selenium] - https://gerrit.wikimedia.org/r/144978 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[08:57:49] (CR) Hashar: [C: 2] "Rebased / fixed conflict. Landing this in, I don't think it is worth a new release since that just updates the README.md file." 
[selenium] - https://gerrit.wikimedia.org/r/144978 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[08:57:51] (Merged) jenkins-bot: The first GettingStarted test [selenium] - https://gerrit.wikimedia.org/r/144978 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[08:58:12] (PS5) Hashar: WIP Running Ruby linter for GettingStarted [integration/zuul-config] - https://gerrit.wikimedia.org/r/152795 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[08:59:52] (PS6) Hashar: Running Ruby linter for GettingStarted [integration/zuul-config] - https://gerrit.wikimedia.org/r/152795 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[09:02:24] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #1: FAILURE in 1 min 4 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/1/
[09:03:14] (CR) Hashar: "Job deployed:" [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/150172 (owner: Phuedx)
[09:03:34] (CR) Hashar: [C: 2] "Merging anyway since I have deployed the job." 
[integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/150172 (owner: Phuedx)
[09:03:53] \o/
[09:03:55] thanks hashar_
[09:04:04] (Merged) jenkins-bot: Enable browser tests for the GettingStarted extension [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/150172 (owner: Phuedx)
[09:04:26] Project beta-code-update-eqiad build #21755: FAILURE in 1 min 26 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/21755/
[09:07:39] (PS1) Hashar: Ruby linter for GettingStarted [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156510 (https://bugzilla.wikimedia.org/52246)
[09:08:29] (CR) Hashar: [C: 2] "Deployed :)" [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156510 (https://bugzilla.wikimedia.org/52246) (owner: Hashar)
[09:08:45] (PS7) Hashar: Running Ruby linter for GettingStarted [integration/zuul-config] - https://gerrit.wikimedia.org/r/152795 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[09:08:58] (CR) Hashar: [C: 2] "Ruby linter job created by https://gerrit.wikimedia.org/r/156510" [integration/zuul-config] - https://gerrit.wikimedia.org/r/152795 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[09:09:11] (Merged) jenkins-bot: Running Ruby linter for GettingStarted [integration/zuul-config] - https://gerrit.wikimedia.org/r/152795 (https://bugzilla.wikimedia.org/52246) (owner: Zfilipin)
[09:11:13] (Merged) jenkins-bot: Ruby linter for GettingStarted [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156510 (https://bugzilla.wikimedia.org/52246) (owner: Hashar)
[09:14:18] Yippee, build fixed!
[09:14:18] Project beta-code-update-eqiad build #21756: FIXED in 1 min 17 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/21756/
[09:19:53] (PS3) Hashar: Switch from experimental to test and gate for wikibase [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[09:21:31] (CR) Hashar: [C: 1] "The jobs have been deployed already despite the config change still being pending in Gerrit." [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[09:44:48] (PS2) Hashar: Split mw-run-update-script from mw-setup-extension [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156170 (owner: Addshore)
[09:45:41] Project beta-scap-eqiad build #18986: FAILURE in 1 min 26 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18986/
[09:54:39] (CR) Hashar: Split mw-run-update-script from mw-setup-extension (1 comment) [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156170 (owner: Addshore)
[09:56:13] (PS3) Hashar: Split mw-run-update-script from mw-setup-extension [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156170 (owner: Addshore)
[09:57:03] hashar: you spotted it :D
[09:57:34] addshore: yeah eventually figured out it was an ordering issue hehe
[09:57:39] I was just about to comment :P
[09:57:55] waiting for diff
[09:58:00] https://integration.wikimedia.org/ci/job/integration-jjb-config-diff/839/console
[10:01:36] (CR) Hashar: [C: 2] "noop per https://integration.wikimedia.org/ci/job/integration-jjb-config-diff/839/console" [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156170 (owner: Addshore)
[10:01:44] :D
[10:02:46] (CR) Hashar: Add jobs for wikibase (1 comment) [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156114 (owner: Addshore)
[10:02:51] (PS18) Hashar: Add jobs for wikibase 
[integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156114 (owner: Addshore)
[10:04:01] (Merged) jenkins-bot: Split mw-run-update-script from mw-setup-extension [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156170 (owner: Addshore)
[10:07:10] Yippee, build fixed!
[10:07:11] Project beta-scap-eqiad build #18988: FIXED in 2 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18988/
[10:08:45] now hashar enable the wikibase jobs? :P
[10:11:13] addshore: refreshing the jobs :D
[10:11:21] then I am out for lunch
[10:11:28] will add the triggers after lunch :D
[10:11:56] kk, make sure you poke aude or tobi to turn the wdjenkins current ones off for wikibase
[10:12:01] im going out for a surf :) tata for now!
[10:12:28] addshore: have fun!
[10:12:46] (CR) Hashar: [C: 2] "Following the rebase, I have force updated the configuration of the jobs:" [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156114 (owner: Addshore)
[10:13:00] will check them this afternoon
[10:13:06] and that is fine, poke #wikidata about it
[10:13:10] and enable them :D
[10:13:13] addshore: thank you a ton
[10:15:02] (Merged) jenkins-bot: Add jobs for wikibase [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156114 (owner: Addshore)
[10:15:46] Project beta-scap-eqiad build #18989: FAILURE in 1 min 31 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18989/
[10:26:19] Yippee, build fixed!
[10:26:20] Project beta-scap-eqiad build #18990: FIXED in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/18990/
[11:08:29] re
[11:18:57] (CR) Hashar: "I have triggered the jobs on a dummy change https://gerrit.wikimedia.org/r/#/c/70395/" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[11:20:08] (PS2) Hashar: WIP kill jenkins slaves WIP [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156455
[11:21:13] (PS3) Hashar: WIP kill jenkins slaves WIP [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156455
[11:27:32] (PS4) Hashar: WIP kill jenkins slaves WIP [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156455
[11:52:26] (PS5) Hashar: WIP kill jenkins slaves WIP [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156455
[11:53:19] (PS6) Hashar: WIP kill jenkins slaves WIP [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156455
[11:54:57] (CR) Aude: [C: 1] "looks great" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[12:24:17] (PS1) Hashar: Experiment extensions with Zuul cloner [integration/zuul-config] - https://gerrit.wikimedia.org/r/156525
[12:25:24] (CR) Hashar: [C: 2] "That is an experiment :D" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156525 (owner: Hashar)
[12:25:33] (Merged) jenkins-bot: Experiment extensions with Zuul cloner [integration/zuul-config] - https://gerrit.wikimedia.org/r/156525 (owner: Hashar)
[12:46:53] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #602: SUCCESS in 52 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/602/
[12:53:33] Project UploadWizard-api-commons.wikimedia.org build #483: SUCCESS in 33 sec: 
https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.org/483/
[13:06:34] Project beta-scap-eqiad build #19006: FAILURE in 2 min 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19006/
[13:16:31] Yippee, build fixed!
[13:16:32] Project beta-scap-eqiad build #19007: FIXED in 2 min 27 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19007/
[14:06:18] Project beta-scap-eqiad build #19010: FAILURE in 4 min 35 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19010/
[14:16:05] Yippee, build fixed!
[14:16:06] Project beta-scap-eqiad build #19012: FIXED in 2 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19012/
[14:26:01] (PS1) Hashar: Drop some jobs that no more exist in Jenkins [integration/zuul-config] - https://gerrit.wikimedia.org/r/156537
[14:28:35] (PS1) Hashar: Typo in unicodejs Jsduck publisher [integration/zuul-config] - https://gerrit.wikimedia.org/r/156538
[14:29:07] (CR) Hashar: [C: 2] Drop some jobs that no more exist in Jenkins [integration/zuul-config] - https://gerrit.wikimedia.org/r/156537 (owner: Hashar)
[14:29:15] (Merged) jenkins-bot: Drop some jobs that no more exist in Jenkins [integration/zuul-config] - https://gerrit.wikimedia.org/r/156537 (owner: Hashar)
[14:29:25] (CR) Hashar: [C: 2] Typo in unicodejs Jsduck publisher [integration/zuul-config] - https://gerrit.wikimedia.org/r/156538 (owner: Hashar)
[14:29:33] (Merged) jenkins-bot: Typo in unicodejs Jsduck publisher [integration/zuul-config] - https://gerrit.wikimedia.org/r/156538 (owner: Hashar)
[14:31:37] (PS1) Hashar: Missing mwext-PdfHandler-ruby1.9.3lint [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156539
[14:33:31] (CR) Hashar: [C: 2] Missing mwext-PdfHandler-ruby1.9.3lint [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156539 (owner: Hashar)
[14:35:57] (Merged) jenkins-bot: Missing 
mwext-PdfHandler-ruby1.9.3lint [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156539 (owner: Hashar)
[14:36:26] Project beta-scap-eqiad build #19014: FAILURE in 2 min 14 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19014/
[14:36:39] (PS1) Hashar: jenkins-jobs-list.py dump list of jenkins jobs [integration/jenkins] - https://gerrit.wikimedia.org/r/156540
[14:37:00] (CR) Hashar: [C: 2] jenkins-jobs-list.py dump list of jenkins jobs [integration/jenkins] - https://gerrit.wikimedia.org/r/156540 (owner: Hashar)
[14:37:02] (Merged) jenkins-bot: jenkins-jobs-list.py dump list of jenkins jobs [integration/jenkins] - https://gerrit.wikimedia.org/r/156540 (owner: Hashar)
[14:39:35] back hashar :P
[14:40:39] addshore: I am adding a new validation job on Zuul conf :D
[14:40:53] I found out that Zuul layout validator can optionally be given a list of jobs defined
[14:40:54] extra validation? :P
[14:41:01] :O
[14:41:09] and will bails out whenever we configure a trigger to a non existing job :D
[14:41:20] :D awesome
[14:42:42] (CR) Addshore: [C: 1] "I would say we are ready for this :>" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[14:43:03] (PS1) Hashar: Ensure Zuul points to existing jobs [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156543
[14:43:57] (CR) Hashar: "recheck" [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[14:44:35] hashar: our jobs seem to take 5 mins less now :P
[14:44:48] win!
[14:44:57] addshore: yeah and the first 2 minutes or so are spent setting up composer dependencies
[14:45:02] yup :P
[14:45:03] that can most probably be speeded up somehow
[14:45:16] Yippee, build fixed!
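The check described at 14:40 to 14:41 (hand the Zuul layout validator a list of defined jobs, and fail whenever a trigger names a job that does not exist) can be sketched roughly as below. This is only an illustration of the idea, not Zuul's actual validator: the function name `find_undefined_jobs`, the simplified layout shape (one dict per project with a flat list of job names per pipeline), and every project and job name are hypothetical.

```python
def find_undefined_jobs(layout, known_jobs):
    """Return {project name: [job names not present in known_jobs]}.

    `layout` is a simplified, hypothetical stand-in for a parsed Zuul
    layout: {"projects": [{"name": ..., "<pipeline>": [job, ...]}, ...]}.
    """
    known = set(known_jobs)
    missing = {}
    for project in layout.get("projects", []):
        for key, jobs in project.items():
            if key == "name":
                continue  # every other key is treated as a pipeline
            for job in jobs:
                if job not in known and job not in missing.get(project["name"], []):
                    missing.setdefault(project["name"], []).append(job)
    return missing


# Hypothetical data: one project that triggers a job nobody defined.
layout = {
    "projects": [
        {
            "name": "example/repo",
            "test": ["example-lint", "example-unit"],
            "gate": ["example-unit", "example-typo"],
        }
    ]
}
known_jobs = ["example-lint", "example-unit"]

undefined = find_undefined_jobs(layout, known_jobs)
if undefined:
    print("layout references undefined jobs:", undefined)
```

A CI job built on this idea would exit non-zero when the returned dict is non-empty, which corresponds to the -1 vote a change triggering a non-existent job would receive.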
[14:45:17] Project beta-scap-eqiad build #19015: FIXED in 1 min 6 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19015/
[14:45:38] probably, well, definitely if the wmf get a /local/ composer repo
[14:46:45] (PS1) Hashar: Triggering a non existent job [integration/zuul-config] - https://gerrit.wikimedia.org/r/156546
[14:46:55] (CR) jenkins-bot: [V: -1] Triggering a non existent job [integration/zuul-config] - https://gerrit.wikimedia.org/r/156546 (owner: Hashar)
[14:47:00] :D Win!
[14:47:56] (CR) Hashar: [C: 2] "Tested by adding a trigger to a non existing job ( https://gerrit.wikimedia.org/r/#/c/156546/ )" [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/156543 (owner: Hashar)
[14:48:09] (Abandoned) Hashar: Triggering a non existent job [integration/zuul-config] - https://gerrit.wikimedia.org/r/156546 (owner: Hashar)
[14:48:35] very awesome
[14:48:36] addshore: lets enable wikibase test :D
[14:48:41] woo! yes
[14:48:47] Ill go and disable the other jenkins
[14:49:12] (PS4) Hashar: Switch from experimental to test and gate for wikibase [integration/zuul-config] - https://gerrit.wikimedia.org/r/156270 (owner: Addshore)
[14:49:20] (CR) Hashar: [C: 2] "Lets deploy!" 
[integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156270 (owner: 10Addshore) [14:49:52] (03CR) 10jenkins-bot: [V: 04-1] Switch from experimental to test and gate for wikibase [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156270 (owner: 10Addshore) [14:49:58] (03Merged) 10jenkins-bot: Ensure Zuul points to existing jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156543 (owner: 10Hashar) [14:50:43] Failure: URLError () :p [14:50:51] (03CR) 10Addshore: "recheck" [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156270 (owner: 10Addshore) [14:51:15] +2Ved :) [14:53:15] (03CR) 10Hashar: [C: 032] Switch from experimental to test and gate for wikibase [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156270 (owner: 10Addshore) [14:53:29] (03Merged) 10jenkins-bot: Switch from experimental to test and gate for wikibase [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156270 (owner: 10Addshore) [14:56:02] Project beta-scap-eqiad build #19016: FAILURE in 1 min 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19016/ [14:57:11] (03PS1) 10Hashar: Wikibase: check-only -> check [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156549 [14:57:38] (03CR) 10Addshore: [C: 031] Wikibase: check-only -> check [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156549 (owner: 10Hashar) [14:58:12] (03CR) 10Hashar: [C: 032] Wikibase: check-only -> check [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156549 (owner: 10Hashar) [14:58:22] (03Merged) 10jenkins-bot: Wikibase: check-only -> check [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156549 (owner: 10Hashar) [15:07:24] Yippee, build fixed! 
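The check hashar describes above (the Zuul layout validator being given a list of defined jobs, and bailing out when a trigger points at a job that does not exist) can be sketched roughly as below. This is only an illustration of the idea, not Zuul's own code: the function names, the `layout` dictionary shape (project -> pipeline -> job names) and the job names in the usage note are assumptions made for the example.

```python
# Hypothetical sketch: names and data shapes are illustrative, not Zuul's real API.
from typing import Dict, Iterable, List, Set


def find_undefined_jobs(layout: Dict[str, Dict[str, List[str]]],
                        defined_jobs: Iterable[str]) -> Set[str]:
    """Return every job name the layout triggers that is not in defined_jobs.

    `layout` maps project name -> pipeline name -> list of job names.
    """
    known = set(defined_jobs)
    referenced = {
        job
        for pipelines in layout.values()
        for jobs in pipelines.values()
        for job in jobs
    }
    return referenced - known


def validate(layout: Dict[str, Dict[str, List[str]]],
             defined_jobs: Iterable[str]) -> None:
    """Exit with an error message if the layout references an unknown job."""
    missing = find_undefined_jobs(layout, defined_jobs)
    if missing:
        raise SystemExit(
            "layout triggers undefined jobs: " + ", ".join(sorted(missing))
        )
```

With a made-up layout such as `{"some/project": {"test": ["job-a"], "gate": ["job-a", "job-b"]}}` and a job list containing only `job-a`, `validate` would stop with an error naming `job-b`, which matches the V: -1 that r/156546 was meant to provoke.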
[15:07:24] Project beta-scap-eqiad build #19018: FIXED in 2 min 55 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19018/ [15:09:14] (03PS1) 10Addshore: Make mwext-Wikibase-* jobs concurrent [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156551 [15:25:36] (03PS1) 10Addshore: Add Wikibase qunit job [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156554 [15:27:02] (03CR) 10jenkins-bot: [V: 04-1] Add Wikibase qunit job [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156554 (owner: 10Addshore) [15:34:53] (03PS2) 10Addshore: Add Wikibase qunit job [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156554 [15:36:49] (03CR) 10jenkins-bot: [V: 04-1] Add Wikibase qunit job [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156554 (owner: 10Addshore) [15:41:26] (03CR) 10Addshore: "recheck" [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156554 (owner: 10Addshore) [15:49:59] (03PS1) 10Addshore: Trigger wikibase-qunit job [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156560 [15:54:58] 3Wikimedia / 3Continuous integration: Jenkins: Use node-jscs as checkstyle for javascript coding style - 10https://bugzilla.wikimedia.org/54218#c12 (10Krinkle) 5NEW>3RESO/FIX The bug is to figure out how to properly do it. Which I define by hacking it together for one repo, and as you repeat it for more... 
[16:07:51] (03PS1) 10Addshore: wdjenkins > wikidata [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156566 [16:09:34] (03CR) 10jenkins-bot: [V: 04-1] wdjenkins > wikidata [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156566 (owner: 10Addshore) [16:20:02] (03PS1) 10Addshore: Add jobs for Wikidata build repo [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156567 [16:20:04] (03PS1) 10Addshore: Remove wikidata-testextensions jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 [16:20:57] (03CR) 10jenkins-bot: [V: 04-1] Remove wikidata-testextensions jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 (owner: 10Addshore) [16:20:59] (03CR) 10jenkins-bot: [V: 04-1] Add jobs for Wikidata build repo [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156567 (owner: 10Addshore) [16:24:48] (03PS1) 10Addshore: Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 [16:24:57] (03CR) 10jenkins-bot: [V: 04-1] Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 (owner: 10Addshore) [16:28:19] (03PS2) 10Addshore: wdjenkins > wikidata [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156566 [16:29:25] (03CR) 10jenkins-bot: [V: 04-1] wdjenkins > wikidata [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156566 (owner: 10Addshore) [16:30:14] (03PS3) 10Addshore: wdjenkins > wikidata [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156566 [16:33:15] (03PS2) 10Addshore: WIP DNM Remove wikidata-testextensions jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 [16:33:17] (03PS2) 10Addshore: Add jobs for Wikidata build repo [integration/jenkins-job-builder-config] - 
10https://gerrit.wikimedia.org/r/156567 [16:33:49] (03PS2) 10Addshore: Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 [16:34:11] (03CR) 10jenkins-bot: [V: 04-1] WIP DNM Remove wikidata-testextensions jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 (owner: 10Addshore) [16:34:14] (03CR) 10jenkins-bot: [V: 04-1] Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 (owner: 10Addshore) [16:34:27] (03CR) 10jenkins-bot: [V: 04-1] Add jobs for Wikidata build repo [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156567 (owner: 10Addshore) [16:35:29] (03PS3) 10Addshore: Add jobs for Wikidata build repo [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156567 [16:35:53] (03PS3) 10Addshore: WIP DNM Remove wikidata-testextensions jobs [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 [16:38:53] (03PS3) 10Addshore: Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 [16:39:01] (03CR) 10jenkins-bot: [V: 04-1] Add experimental triggers for Wikidata build repo [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156572 (owner: 10Addshore) [17:32:40] chrismcmahonbrb: since you're in the SoS, I'll leave and let you take it on, I'm too much of a chicken :) [17:33:10] greg-g: ok [17:36:04] 3Wikimedia Labs / 3deployment-prep (beta): wikidata beta (item pages, etc.) inaccessible with 503 errors - 10https://bugzilla.wikimedia.org/69708#c5 (10Aude) now we are getting Fatal error: Argument 1 passed to Wikibase\\EntityHandler::getTitleForId() must be an instance of Wikibase\\EntityId, Wikibase\\Dat... 
[17:38:45] (03CR) 10Aude: [C: 031] Make mwext-Wikibase-* jobs concurrent [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156551 (owner: 10Addshore) [18:30:19] chrismcmalunch: Flow browsertests still often failing with getaddrinfo failures and timeouts https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/ [18:30:59] 3Wikimedia / 3Continuous integration: Jenkins: Use node-jscs as checkstyle for javascript coding style - 10https://bugzilla.wikimedia.org/54218#c13 (10Antoine "hashar" Musso) Sounds good to me Timo. That nicely scale up CI to developers :-) [19:22:08] chrismcmalunch: can you weigh in on the inline comment here https://gerrit.wikimedia.org/r/#/c/155781/ [19:33:58] marxarelli: the "progress bar" only appears for browsers that support it. iirc it shows up in chrome but not ff [19:34:38] marxarelli: yeah, that test was set to run only for chrome and not for FF because the MF folk were attached to testing the progress bar showing up [19:50:03] 3Wikimedia Labs / 3deployment-prep (beta): wikidata beta (item pages, etc.) inaccessible with 503 errors - 10https://bugzilla.wikimedia.org/69708 (10John F. Lewis) p:5Unprio>3Normal [20:06:41] marxarelli: ping :) [20:07:07] * greg-g assumes the mushroom kingdom is being squatted [20:10:07] greg-g: sh!tballs! sorry about that [20:10:19] let me check. i think someone is in there [20:56:22] (03PS1) 10Cscott: Run `npm test` on Parsoid [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156690 [20:59:31] 3Wikimedia Labs / 3deployment-prep (beta): Rename labswiki to deploymentwiki - 10https://bugzilla.wikimedia.org/70108 (10Andrew Bogott) 3NEW p:3Unprio s:3normal a:3None I'm in the middle of some work to get wikitech to function as part of the deployment system: https://gerrit.wikimedia.org/r/#/c/155... [21:04:37] (03PS1) 10Cscott: parsoidsvc-npm shouldn't vote (yet). 
[integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 [21:05:21] (03CR) 10Hashar: [C: 04-1] Run `npm test` on Parsoid (031 comment) [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156690 (owner: 10Cscott) [21:16:28] anyone else having lag issues on beta cluster [21:19:29] 3Wikimedia Labs / 3deployment-prep (beta): Rename labswiki to deploymentwiki - 10https://bugzilla.wikimedia.org/70108#c3 (10Andrew Bogott) Renaming requires a manual rename of the database (probably via a dump and recreation) and the attached patch. And... anything else? [21:19:55] greg-g: beta seems reasonably zippy to me, got any details? [21:20:17] chrismcmahon: https://bugzilla.wikimedia.org/show_bug.cgi?id=70103 [21:21:04] manybubbles: are y'all indexing stuff on beta right now or something like that? ^^ [21:21:14] greg-g: this is why we need a beta2 [21:21:34] yes :( [21:22:00] greg-g: beta3 and 4 will come soon [21:22:28] manybubbles: seems to be that search is slow but apache is reasonably not-slow [21:22:49] manybubbles: I'm fine with that, actually ;) [21:23:08] right now we have too few, I'll be happy to switch the needle to too many and figure out the "right" number [21:23:45] also, hardwarez :-) [21:24:49] (03CR) 10Hashar: [C: 04-1] "The job needs to be triggered now :-)" (032 comments) [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 (owner: 10Cscott) [21:24:58] chrismcmahon: no idea why - can look [21:26:29] 3Wikimedia Labs / 3deployment-prep (beta): VisualEditor: All kind of search is extremely slow in Betalabs which includes searching for matching link target names, images, category names, templates etc - 10https://bugzilla.wikimedia.org/70103 (10Greg Grossmeier) p:5Unprio>3Highes a:3None [21:26:51] chrismcmahon: please please no beta2 . 
We need a better name :] [21:27:14] if we were to use an equivalent of beta cluster that is updated only once per day, I would call it 'nightly' [21:27:46] name to be decided by bikeshed later [21:27:50] hehe [21:27:52] :) [21:27:56] one huge impact [21:27:57] I'm not sure I'd call apache "fast" simple.wikipedia.beta.wmflabs.org took forever to load [21:28:15] is that we use puppet to install everything, and the manifests currently vary the configuration between prod and beta solely based on $::realm [21:28:36] so the new cluster will be in the same $::realm as beta cluster which would not let us customize it [21:28:44] that part needs a solution. Maybe hiera() [21:28:46] we still have both hhvm servers serving traffic? [21:28:57] hiera probably there, yeah [21:29:28] 3Wikimedia Labs / 3deployment-prep (beta): All kind of search is extremely slow in Betalabs which includes searching for matching link target names,images, category names, templates etc - 10https://bugzilla.wikimedia.org/70103 (10Greg Grossmeier) [21:31:13] 3Wikimedia Labs / 3deployment-prep (beta): All kind of search is extremely slow in Betalabs which includes searching for matching link target names, images, category names, templates etc - 10https://bugzilla.wikimedia.org/70103#c1 (10Nik Everett) I see it taking a long time too. The load on the search serve... [21:31:23] hashar: beta ganglia busted? [21:31:48] manybubbles: mine might have been cached, checking... [21:32:40] manybubbles: yeah, my bad, I was hitting the cache. we should probably talk to ori [21:33:52] chrismcmahon: doesn't look like it - where do the logs go? 
[21:35:02] manybubbles: yeah [21:35:14] manybubbles: it is on a 1 cpu / 2GB memory box so it keeps dieing [21:35:22] and we can't resize it in nova [21:35:33] potentially we could rebuild one but it is not entirely puppetized [21:35:38] so I gave up:/ [21:36:22] manybubbles: @deployment-bastion:/data/project/logs$ [21:36:56] director hhvm_appservers random { [21:36:56] { [21:36:56] .backend = ipv4_10_68_17_208; [21:37:00] greg-g: only one hhvm box pooled [21:37:04] so we filled the partition again? [21:37:13] chrismcmahon: yeah - it doesn't look like elasticsearch is slow here [21:37:26] greg-g: that comes straight from varnish config on depoyment-cache-text02 in /etc/varnish/director hhvm_appservers random { [21:37:26] { [21:37:26] .backend = ipv4_10_68_17_208; [21:37:29] aeherha [21:37:32] I am too tired sorry [21:38:22] and yeah mediawiki02 is full again [21:39:24] !log deployment-prep deleted hhvm core files on mediawiki02:/tmp {{bug|69979}} [21:39:28] the bug is https://bugzilla.wikimedia.org/show_bug.cgi?id=69979 [21:39:59] hashar: can we save those core files instead of deleting them (and tell Ori where they are)? HHVM should not be dumping core at this point [21:40:05] boo. _joe_ and I tried to fix that earlier today [21:40:06] no [21:40:14] they disrupt the service so I delete them [21:40:18] 3Wikimedia Labs / 3deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - 10https://bugzilla.wikimedia.org/69979#c5 (10Antoine "hashar" Musso) I have deleted the 2GB+ core files on mediawiki02:/tmp/ [21:40:29] if we want to keep them, they should be written to some place that has enough disk space i.e. 
/srv local to the instance [21:40:34] I made a dir a data/project/hhvm-cores for the ones that cleaned up this morning [21:40:40] but I have no clue how to change hhvm core file directory :( [21:40:45] hashar: they are our only documentation of hhvm bugs [21:41:06] yeah I should probably have copied those somewhere [21:41:17] meanwhile, they cause beta to go havoc and me to work over the weekend [21:41:20] The cores are *supposed* to go to /var/log/hhvm and we made that a symlink to /srv today. Must not work though [21:43:27] and the symlink is gone now. Thanks puppet [21:44:55] aaaaaaa :-( [21:45:34] (03PS2) 10Cscott: Run `npm test` on Parsoid [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156690 [21:45:36] (03PS1) 10Cscott: Remove obsolete parsoid.yaml [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156703 [21:45:39] at least there are stack traces there [21:47:19] bd808: sorry if I sound angry. I am just tired :D [21:50:17] hashar: no worries. You should sleep more :) [21:51:05] bd808: yeah but I really really want to finish up a few tasks before child++ [21:52:37] hashar: omg we are going to lose you to your family !!! [21:52:58] that will make my wife happy :] [21:53:01] :) we will survive [21:53:07] I am sure [21:53:17] but we will be glad when you come back [21:53:19] if in doubt: restart jenkins hehe [21:53:35] twice, because the first restart never works [21:53:46] yeah that is an issue in the init script I think [21:53:53] I have filled a bug about it but never investigated :( [21:54:00] apparently the init script kill the java process [21:54:08] wait some fixed amount of time, and give up [21:54:19] leaving the old java process running and starting a new jenkins [21:54:55] that starves a CPU until it is kill -9 [21:55:22] (03PS2) 10Cscott: Swap in the new parsoidsvc-(source|deploy) jobs. 
[integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 [21:55:33] jenkins is a fickle beast; I have battled it before [21:55:36] (03CR) 10jenkins-bot: [V: 04-1] Swap in the new parsoidsvc-(source|deploy) jobs. [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 (owner: 10Cscott) [21:55:58] ^^^ \O/ [21:56:50] (03CR) 10Hashar: "Layout validation fails because the jobs have not been deployed to Jenkins yet:" [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 (owner: 10Cscott) [21:58:56] if hhvm is going to dump core multiple times per day under just browser test load, there is just no way it is even close to ready for prod. I'm not sure what to do about that. [22:00:00] (03CR) 10Hashar: [C: 032] "Confirmed those jobs are no more triggered in Zuul. I am leaving them in Jenkins for now since their history might still be useful, I wil" [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156703 (owner: 10Cscott) [22:00:11] from yesterday: 14:53 < ori> i spent about an hour on labs yesterday and got the second instance re-pooled [22:00:16] that's no longer true? [22:00:23] bd808: ^ [22:01:10] greg-g: I think it is true. But hmmm... 
let me check something [22:01:27] cause 02 is the one that keeps getting cores and filling / [22:02:04] (03Merged) 10jenkins-bot: Remove obsolete parsoid.yaml [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156703 (owner: 10Cscott) [22:03:03] 3Wikimedia Labs / 3deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - 10https://bugzilla.wikimedia.org/69979 (10Greg Grossmeier) p:5Unprio>3Highes [22:03:43] 3Wikimedia Labs / 3deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - 10https://bugzilla.wikimedia.org/69979 (10Greg Grossmeier) [22:03:58] (03CR) 10Hashar: [C: 032] "Deployed:" [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156690 (owner: 10Cscott) [22:05:15] 3Wikimedia Labs / 3deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - 10https://bugzilla.wikimedia.org/69979#c6 (10Greg Grossmeier) a:3Bryan Davis Since Bryan and Giuseppe were working on this issue this morning, I'm assigning to Bryan. :) [22:05:36] greg-g: gee thanks [22:05:40] ;) [22:05:59] bd808: but in seriousness, if you hit a wall, let me know [22:06:43] bd808: 01 is unpolled with a puppet local hack I think [22:06:58] greg-g: You mean if Deskana comes at me for not getting SUL done [22:07:16] bd808: want to assign it to _joe_ or ori? 
[22:07:20] hashar: Looks like that patch was removed but I'm wondering if the varnishes were all restarted [22:07:22] (serious) [22:07:48] bd808: deployment-cache-text02 did not have the second node has a director backend [22:07:54] though puppet might well be broken on that instance [22:08:17] I'm going to cycle the varnishes and put a bandaid in cron to sweep the cores into nfs [22:09:27] hashar: 10.68.17.96 is in /etc/varnish/wikimedia_text-backend.vcl [22:09:31] (03Merged) 10jenkins-bot: Run `npm test` on Parsoid [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156690 (owner: 10Cscott) [22:09:33] maybe varnish just needs kicked [22:09:58] 3Wikimedia Labs / 3deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - 10https://bugzilla.wikimedia.org/70103#c2 (10Greg Grossmeier) Not this slow. [22:10:02] !log deployment-prep restarted varnish on deployment-cache-text02 [22:10:22] (03CR) 10Hashar: "I have deployed the Jenkins job cscott crafted. Test should pass now." [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 (owner: 10Cscott) [22:10:28] (03CR) 10Hashar: "recheck" [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156693 (owner: 10Cscott) [22:10:50] bd808: \O/ [22:11:06] bd808: yeah puppet probably does not notify varnish [22:11:17] what ops do is that they push the new varnish package / config [22:11:29] unpool a random prod varnish [22:11:33] upgrade package / conf [22:11:35] test it [22:11:39] pool it back and monitor [22:11:45] then they generalize to the other caches [22:12:02] yeah. Do you know how to ask a live varnish instance what config it is running? [22:12:10] no idea :-/ [22:12:59] bd808: who besides Ori is in a position to stop hhvm from dumping core at all? [22:13:56] well... nobody including Ori. 
Each core needs to be inspected to find out why and then we need to fix the bug either in our php or their interpreter [22:14:11] Ori, Tim and Brett were doing such things [22:14:35] with occasional help from Max and Aaron [22:14:39] if we have them written to some other partition than / we can keep them :] [22:14:59] that's what bd808 and _joe_ tried to do this "morning" [22:15:28] Yeah that didn't work. But I can write a cleaner script to move them and cron it [22:15:40] Project beta-scap-eqiad build #19059: FAILURE in 1 min 33 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19059/ [22:16:08] still, why is beta cluster so slow right now? [22:16:14] was it hhvm dieing? [22:16:31] afternoon load? [22:16:43] more testing both automated and manual late in the day [22:16:56] especially on a Wednesday [22:16:57] never used to be this bad [22:17:11] show me graphs to prove that [22:17:15] "never".... right [22:17:30] well, how about an increase in user complaints? [22:17:32] greg-g bd808 browser test builds kick off at 11AM Pacific [22:17:33] see Steven's email [22:17:49] and Rummana's bug report, and... [22:17:53] greg-g bd808 hhvm starts dumping core during browser test builds [22:18:10] I think I broked varnish. [22:18:17] I see 2 master processes [22:18:22] trying again [22:18:48] (and maryana's email) [22:19:48] thoughts: hashar will be MIA with baby pretty darn soon, we need to have *someone* who can coddle beta cluster when he's gone, without worrying about who to blame for the current situation. 
[22:21:03] greg-g: well ton of folks have been babysitting beta for the past six months [22:21:15] greg-g: keep in mind that the actual issue right now is hhvm dumping core when it should not be dumping core [22:21:16] I have not been that much involved since the pmtpa -> eqiad migration [22:21:34] my role been mostly to give clue to other folks / offer some expertise with the tech debt or specific configuration there [22:21:54] greg-g: the small partition is not a problem when hhvm does not dump core [22:22:27] bd808: you can get varnish help from bblack [22:22:28] greg-g hashar even I have a couple of merges to mediawiki-config :-) [22:22:42] now it's broke good [22:22:52] bd808: you blew it up good? [22:23:00] bd808: he even went to fix in varnish source code an issue that was hitting beta often and prod from time to time! [22:23:02] unable to connect [22:23:05] varnish runnign 500 for all requests [22:23:16] yeah there is no backend [22:23:44] bd808: there are two services: varnish varnish-frontend [22:23:57] yupo. 
fixed now [22:24:00] one handle the request from the client and (I think) has the memory cache [22:24:10] (03PS1) 10Addshore: Triggers for Wikidata jobs out of experimental [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156715 [22:24:10] the second as a disk cache and interact with the backend appservers [22:24:26] (03CR) 10jenkins-bot: [V: 04-1] Triggers for Wikidata jobs out of experimental [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156715 (owner: 10Addshore) [22:24:44] still slow as crap [22:24:51] (03PS4) 10Addshore: Remove wikidata-testextensions job [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156568 [22:25:04] bd808: if you want to see request flowing in you can: varnishncsa -n frontend [22:25:34] for the request made by the backend, you have to look up the name under /var/lib/varnish/ [22:25:37] should be the hostname [22:26:04] so: varnishncsa -n `hostname` [22:27:15] Yippee, build fixed! [22:27:16] Project beta-scap-eqiad build #19060: FIXED in 3 min 16 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19060/ [22:27:19] oh [22:28:24] (03CR) 10Addshore: "This chain of patches should be merged in this order:" [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/156560 (owner: 10Addshore) [22:28:26] (03CR) 10Addshore: "This chain of patches should be merged in this order:" [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/156551 (owner: 10Addshore) [22:28:55] hashar: lovely chain of 8 patches there for whenever you feel like it ;p [22:29:24] addshore: lovely! [22:29:37] For some currently unknown reason varnish is not caching anything [22:29:57] X-Cache:deployment-cache-text02 miss (0), deployment-cache-text02 frontend miss (0) [22:30:07] For http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page as an anon [22:30:14] addshore: to late for now though. 
"Heading to bed soon" © [22:30:23] yep, I agree :) [22:30:28] 3Wikimedia Labs / 3deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - 10https://bugzilla.wikimedia.org/70103#c3 (10Greg Grossmeier) 18:29 < bd808> For some currently unknown reason varnish is not caching anything [22:30:41] Although I might try and add a few more to the chain ;p [22:31:18] hmmm.. better from an incognito window [22:31:28] bd808: seems cached for me now [22:31:45] yeah, mine was quick [22:32:36] "deployment-cache-text02 miss (0), deployment-cache-text02 frontend hit (20)" [22:32:49] look at age: also [22:32:57] yeah. dunnon what was up with my primary browser [22:33:00] *dunno [22:34:53] gah [22:34:53] Service Temporarily Unavailable [22:34:53] Due to heavy load on the server, connections may be temporarily blocked from locations that fetch an unusually high number of pages. If you've just been heavily browsing, go get a cup of coffee and come back and reload in a minute. :) [22:35:15] yeah I noticed that one earlier today in the sauce labs build results [22:35:20] slave lag? [22:35:20] results -> screenshots [22:35:37] not sure if it is some rate limiting [22:35:50] slave lag was an issue but I believe sam upped it to 300 seconds [22:36:16] I think it may happen when the hhvm fcgi process is dumping core and hasn't respawed yet [22:36:21] hashar: yeah, we upped the lag from 5s to 300s a couple of weeks ago [22:36:25] bah and udp2log-mw keep crashing on deployment-bastion now :( [22:37:01] STAHP DUMPING CORRRE PLZ KTHXBAI [22:37:02] !log deployment-prep restarting udp2log-mw on deployment-bastion. 
It keeps crashing since fiarly recently [22:37:10] wrong chann [22:37:28] special:random on both en.wiki and commons is taking for ever to return the resulting page (not even counting loading the page) cc manybubbles [22:37:34] on beta, of course [22:37:40] it was fast a bit ago [22:37:59] Special:Random is nasty one [22:38:44] fast on enwiki now, slow on commons [22:39:14] there is also a bunch of exception being logged [22:39:17] 2014-08-27 22:38:36 deployment-mediawiki01 enwiki: [fc90d1d4] /w/index.php/dfe9a"%3b761d0?title=Special:RecentChangesLinked&hideminor=1&target=Main_Page Exception from line 220 of /srv/common-local/php-master/extensions/Scribunto/engines/LuaSandbox/Engine.php: The luasandbox extension is not present, this engine cannot be used. [22:39:18] :D [22:39:42] ack [22:39:55] :( [22:40:15] hhvm-luasandbox is installed there.... [22:40:15] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #7: STILL FAILING in 13 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/7/ [22:40:37] I think there is some bot doing some security vulnerability tests [22:40:52] there is a bunch of very suspicious hits [22:41:13] hashar: there is, a volunteer working with Chris Steipp is running fuzzers against beta labs periodically [22:41:25] well [22:41:39] that surely doesn't help with load on the single hhvm box [22:41:44] no shit [22:41:48] :/ [22:41:57] we could get that stopped [22:42:01] cause it spam dozen of them per second [22:42:01] there are 2 now. maybe we need a 3rd? 
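For reference, the `X-Cache` values bd808 pasted a few minutes earlier (for example `deployment-cache-text02 miss (0), deployment-cache-text02 frontend hit (20)`) can be split into per-layer results with a small parser. This is a sketch that assumes the header is always a comma-separated list of `<host> [frontend] <hit|miss> (<count>)` entries, as in the two samples quoted in the log; the real header may take other forms, and the names `CacheLayer` and `parse_x_cache` are made up for the example.

```python
import re
from typing import List, NamedTuple


class CacheLayer(NamedTuple):
    host: str       # cache host name, e.g. deployment-cache-text02
    frontend: bool  # True when the entry carries the "frontend" marker
    result: str     # "hit" or "miss"
    count: int      # number in parentheses, kept as reported


# Assumed entry shape, inferred only from the samples in the log.
_ENTRY = re.compile(
    r"^(?P<host>\S+)(?P<frontend> frontend)? (?P<result>hit|miss) \((?P<count>\d+)\)$"
)


def parse_x_cache(value: str) -> List[CacheLayer]:
    """Parse an X-Cache header value into one CacheLayer per cache layer."""
    layers = []
    for part in value.split(","):
        match = _ENTRY.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognised X-Cache entry: {part!r}")
        layers.append(
            CacheLayer(
                host=match["host"],
                frontend=match["frontend"] is not None,
                result=match["result"],
                count=int(match["count"]),
            )
        )
    return layers
```

Checking `any(layer.result == "hit" for layer in parse_x_cache(header))` is then a quick way to tell the "miss, miss" response from the later "miss, frontend hit" one.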
[22:42:03] uncached [22:42:08] (obviously it is uncached) [22:42:08] but I'd rather just get hhvm to not suck [22:42:23] bd808: I'll take a few, please [22:42:36] but, ugh, graphs [22:44:04] 3Wikimedia Labs / 3deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - 10https://bugzilla.wikimedia.org/69979#c7 (10Bryan Davis) Bandaid solution: $ cat cleanup-hhvm-cores #!/usr/bin/env bash sudo mv /tmp/hhvm.*.core /data/project/hhvm-cores sudo mv /v... [22:45:08] yep, hashar, sleep time, while I mull over in my head the whole composer cloning 20 repos * 4 tests = 80 clones per patchset [22:45:54] hack time? :> [22:45:56] hehe, night [22:46:44] bd808: "18:40 < bd808> hhvm-luasandbox is installed there...." still being looked into? [22:47:07] it's there. I can see the .so in the right directory [22:47:27] let me double check that the ini loads it [22:48:04] it's loaded by /etc/hhvm/fcgi/config.hdf too [22:51:08] greg-g, chrismcmahon, hashar: I croned a script to move cores to the nfs drive every 2 mintues [22:51:22] That should keep things going I hope [22:51:42] Not a fix but a bandaid [22:51:44] bd808: I saw that, and is it true that we had some core files pretty much instantly? 
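The bandaid bd808 mentions (a cron job that moves hhvm core files off the small root partition onto the shared disk) could look something like the sketch below. The actual script in the bug comment is a bash `sudo mv` and is truncated in the log, so this Python version only covers the `/tmp/hhvm.*.core` part that is visible; the function name and its parameters are assumptions for illustration.

```python
import glob
import os
import shutil
from typing import List


def sweep_cores(src_dir: str, dest_dir: str,
                pattern: str = "hhvm.*.core") -> List[str]:
    """Move files matching `pattern` from src_dir into dest_dir.

    Returns the list of destination paths, in sorted order.
    Files that do not match the pattern are left alone.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
        target = os.path.join(dest_dir, os.path.basename(path))
        # shutil.move falls back to copy-and-delete across filesystems,
        # which is the case when going from a local disk to NFS.
        shutil.move(path, target)
        moved.append(target)
    return moved
```

Called as `sweep_cores("/tmp", "/data/project/hhvm-cores")` from a job scheduled every two minutes, this would mirror what the log describes. It would need the same privileges as the `sudo mv` in the original, and like the original it is a workaround rather than a fix for hhvm dumping core.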
[22:51:55] * chrismcmahon could go look I guess [22:52:19] There was one on mediawiki01 [22:52:33] others in there were from when I did that manually this morning [22:52:53] 20140825 stacktrace.25424.log.20140826 stacktrace.29703.log.20140825 stacktrace.8861.log.20140826 [22:52:53] cmcmahon@deployment-bastion:/data/project/hhvm-cores$ ls -al *core [22:52:53] -rw------- 1 pybal-check apache 2291052544 Aug 26 09:41 hhvm.10679.core [22:52:55] -rw------- 1 pybal-check apache 2193469440 Aug 26 15:56 hhvm.2335.core [22:52:57] -rw------- 1 pybal-check apache 4096 Aug 26 18:29 hhvm.23584.core [22:52:59] -rw------- 1 pybal-check apache 2085355520 Aug 27 22:01 hhvm.23646.core [22:53:01] -rw------- 1 pybal-check apache 2111422464 Aug 27 22:34 hhvm.23914.core [22:53:03] -rw------- 1 pybal-check apache 208896 Aug 27 06:02 hhvm.25140.core [22:53:05] -rw------- 1 pybal-check apache 8192 Aug 27 07:11 hhvm.31168.core [22:53:07] -rw------- 1 pybal-check apache 696348672 Aug 26 17:08 hhvm.3154.core [22:53:09] -rw------- 1 pybal-check apache 8192 Aug 27 08:05 hhvm.32672.core [22:53:11] -rw------- 1 pybal-check apache 4096 Aug 26 19:06 hhvm.4039.core [22:53:13] -rw------- 1 pybal-check apache 2214969344 Aug 26 05:13 hhvm.4289.core [22:53:15] -rw------- 1 pybal-check apache 1032192 Aug 26 17:28 hhvm.8861.core [22:53:17] sorry for the flood [22:53:54] bd808: +1 on using a cron as a workaround :] [22:54:11] * bd808 has a box full of bandaids [22:54:19] chrismcmahon: yeah /data/project is a huge shared disk [22:54:29] though that starves resource from a shared labs disk [22:54:39] guess labs ops will complain if it grows to big [22:54:40] only 11T free :) [22:54:59] hashar: we simply should not have that many hhvm cores. this is just unreasonable. [22:55:08] bd808: there was some security audit tool running on beta that caused a massive amount of query [22:55:35] bd808: I have contacted the utility operator. 
Chris Steip cced me to some email and I have cced you to my reply [22:56:36] Project beta-scap-eqiad build #19063: FAILURE in 2 min 23 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19063/ [22:57:43] 3Wikimedia Labs / 3deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - 10https://bugzilla.wikimedia.org/70103#c4 (10Bryan Davis) That may have been my user-agent doing something strange. Antoine also found that we were undergoing a high rate vulnerabilit... [23:01:38] chrismcmahon: yeah hhvm is still a fairly young project and we are hitting code paths / bugs that Facebook probably does not encounter [23:01:44] or they might be new bugs [23:03:27] thanks hashar [23:03:40] and bd808, sorry for roping you in [23:05:40] Yippee, build fixed! [23:05:40] Project beta-scap-eqiad build #19064: FIXED in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/19064/ [23:05:55] hashar: I understand that, but a dozen cores in 24 hours on Wed, after having a debug build of hhvm on beta labs all weekend since Friday is just not supportable. this does not seem even ready for beta much less prod. [23:06:41] depends [23:06:43] 3Wikimedia Labs / 3deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - 10https://bugzilla.wikimedia.org/70103#c5 (10Antoine "hashar" Musso) We have some security audit being run on the beta cluster. Unfortunately the script is not throttled and cause a fa... [23:06:51] a dozen of cores out of a million request is not that bad :D [23:06:58] but yeah [23:07:01] that doesn't help testing [23:07:13] what could be done is unpool hhvm entirely [23:07:25] and repool the Zend PHP application servers ( apache01 and apache02 ) [23:07:28] (03CR) 10Awight: "Absolutely, thx! I can continue to kick the can for as long as it's helpful." 
[integration/zuul-config] - https://gerrit.wikimedia.org/r/151249 (owner: Awight) [23:07:33] Wikimedia Labs / deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - https://bugzilla.wikimedia.org/69979#c8 (Bryan Davis) a:Bryan Davis>None Unlicking this cookie. The core (ha punny) problem remains but hopefully someone on the hhvm team ca... [23:07:46] then figure out with hhvm folks a strategy to only have some specific traffic to hhvm [23:07:52] chrismcmahon: there were 12 of them today? [23:09:04] greg-g: did you see my channel flood of 12 minutes ago? 12 cores dated since 26aug that bd808 shipped to @deployment-bastion:/data/project/hhvm-cores [23:09:25] :( [23:09:25] 17 minutes ago actually [23:09:31] the disk space is such that 2-3 are enough to cause things to go sideways [23:10:04] and ori told he expected none. I are concerneded. [23:10:17] apache01 and 02 are gone but we could rebuild them [23:11:01] Assuming that the puppet roles can be made to work ... which may be a large assumption [23:11:20] we just need to get focus on triage and fixing the hhvm crash bugs [23:11:40] and "we" was very sarcastic there [23:12:23] I am very close to saying loudly that we need to stop work on hhvm until we have somewhere to test it that does not bring the work of 6 feature teams to a screeching halt every day. [23:12:32] Wikimedia Labs / deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - https://bugzilla.wikimedia.org/69979#c9 (Greg Grossmeier) Resseting assignee and priority as this is now no longer an OMG! situation. For the record: gjg@deployment-bastion:/data...
[23:12:42] chrismcmahon: just do that [23:12:43] Wikimedia Labs / deployment-prep (beta): hhvm creates core file in /tmp/ filling mediawiki02 labs instance root partition - https://bugzilla.wikimedia.org/69979 (Greg Grossmeier) p:Highes>Normal [23:12:44] :) [23:14:13] chrismcmahon: as I said during our weekly checkin, I should have spotted that all traffic was going to be solely relying on hhvm and cause browsertests / feature to be largely impacted by the ongoing hhvm integration [23:14:29] but I thought we were going to use a pool of both Zend / hhvm app servers [23:14:50] and have route requests based on an opt-in cookie or whatever other routing system [23:15:00] hashar: I knew that, but my expectation was that we would be in final phases by now, and that does not seem to be true [23:15:10] that was my understanding as well, unsure when that plan changed [23:15:25] move fast and break things is cool and all, but... [23:15:30] depends on who you ask, so QA should definitely speak up [23:15:41] yeah [23:15:56] and the security audit spam was probably not helping either [23:16:03] QA should have had a more active role in this all along [23:16:13] not blaming or shaming, just saying [23:16:24] so potentially: we could rebuild Zend app servers and have them serve the traffic by default [23:16:40] figure out a way to route some specific kind of traffic to different app servers [23:16:54] ie: hhvm test traffic to the hhvm appserver [23:16:59] The default prod server image is trusty as of today. Puppet will make all trusty app servers hhvm servers [23:17:06] bd808: can we still make that happen? (getting QA more closely involved) [23:17:07] and security audit spam to some dedicated (zend|hhvm) appserver [23:17:47] bd808: wait... really? so if a machine goes down/gets rebuilt *in production* it'll be hhvm?
[23:18:01] na [23:18:08] they can rebuild servers using a Precise image [23:18:13] It will take special attention from Ops to ensure it isn't [23:18:16] it is just not the default anymore [23:18:32] which was pointed out on ops-l right after the trusty announcement [23:18:38] bd808: can you verify that with robh/mutante? [23:18:40] oh [23:18:44] so there is a slight chance we end up pulling a Trusty app server in production if someone reinstall it using the defaults :-D [23:19:06] bd808: nvm [23:19:19] * greg-g reads email [23:19:57] but Ori and Giuseppe are working on the end-game steps for an hhvm cluster. So if QA thinks it is unstable they should jump and scream and make sure they are heard [23:20:25] ori-mtng: yes [23:20:26] er [23:20:30] yes [23:20:32] How have test/test2 been? [23:20:42] the thing is that we're not doing anything even particularly weird. browser tests are all happy-path. and hhvm dumps core multiple times every time the builds kick off. [23:21:29] no idea about test/test2, haven't looked. we're hitting test2 with stuff, but we don't care much [23:21:30] I guess it's just test that is hhvm right now. test2 is still php5 [23:21:34] chrismcmahon: can you start writing up your "why I think this needs a rethink" ideas? [23:21:45] mostly, the evidence pieces [23:22:00] ori will want numbers/dates/time/etc [23:22:17] "Note I had actually set the tests to run since last Friday." from security scanner guy [23:22:30] so... maybe not just nice tests running [23:22:44] huh [23:22:50] greg-g: multiple core dumps per day basically. they've all been deleted, but it is well known. [23:23:11] ohhh [23:23:11] yeah, but timing would be good, if they correspond to the time of browser tests [23:23:21] chrismcmahon: ^ [23:23:23] so that would mean that since friday we have a hugeeee number of requests hitting hhvm [23:23:28] bd808: there was a bogus debug hhvm build on beta from Friday afternoon until yesterday. another annoyance.
[23:23:29] and thus causing all the havoc we had :-D [23:23:44] plus point: we gather a lot of hhvm.core files which are definitely useful [23:23:46] greg-g: they do correspond [23:24:04] I think the bar to set is a week of beta not blowing up. And active support from the hhvm team [23:24:08] hashar: do you have a log of those requests? can you give a "this many req/s from sherif's tool" answer? [23:25:29] greg-g: yeah /data/project/logs/apache-access.log [23:25:42] and they are most probably log rotated under /data/project/logs/archive [23:25:57] I did send a mail to Sherif / Chris Steip [23:26:07] asking the utility to be throttled and a user agent added [23:26:10] yeah, just wanting numbers [23:26:13] and re blacklisted it [23:26:16] thanks for that [23:26:19] just use grep -c ? :D [23:26:20] will do [23:26:24] ;) [23:26:45] Holy hell [23:26:48] ok I count that [23:26:51] then go to bed [23:27:01] I can do it [23:27:01] cause /data/project/swift --> 20GB doh [23:27:43] greg-g: /data/project/logs/archive/apache-access.log* [23:27:49] yeah [23:27:55] doesn't make sense though [23:27:57] apparently log rotated once per week [23:28:22] yeah we are missing a bunch of headers in there :( [23:28:47] you can see requests such as "GET /wiki/alert(1) HTTP/1.1" [23:29:05] which check whether the title would be injected as is in some javascript and cause an alert box [23:29:07] lovely [23:29:57] (see pm) [23:30:35] greg-g: I think it would be reasonable to set a threshold for hhvm before release like "zero core dumps for seven days straight in beta labs". I don't really care if they replace a call to dump with an error log, but this is not reasonable. [23:30:49] * greg-g nods [23:31:23] greg-g: the implications for prod seem pretty severe to me [23:32:57] gjg@deployment-bastion:/data/project/logs$ grep -c 87.81.152.249 xff.log [23:33:01] shit [23:33:33] anyways...
[23:33:42] trying to zgrep the archived logs gives me: gzip: xff.log-20140820.gz: Permission denied [23:34:27] greg-g: logrotate is bugged though [23:34:34] so some files did not get compressed [23:34:38] ah [23:34:48] got 200k hits for some of those files [23:35:21] AHHHH [23:35:41] I now understand why the .log files suddenly disappear [23:35:51] http://paste.debian.net/117923/ [23:35:57] for the curious ^ [23:36:26] there's your problem right there :) [23:38:09] only 28435 hits per day on average [23:38:59] but all uncached... so yeah [23:39:27] wait no, those are days, not weeks [23:39:45] Wikimedia Labs / deployment-prep (beta): udp2log logs are not properly rotated / daemon restarted - https://bugzilla.wikimedia.org/70112 (Antoine "hashar" Musso) NEW p:Unprio s:normal a:None We have udp2log-mw service on deployment-bastion which write to log files under /data/project/logs/... [23:39:51] and I filed a bug because udp2log are not rotated properly :( [23:40:13] yeah. days [23:40:42] add 200k uncached hits to tiny labs boxes, see things melt [23:42:10] yep [23:42:32] Only 2 req/second... But yeah, may have pushed us over some edge [23:42:53] greg-g: Can you count the lines in those files? I'm curious what % of traffic it was.. [23:45:08] csteipp: http://paste.debian.net/117924/ [23:45:19] so, 99% [23:45:59] Hah, ok, so that does explain it [23:46:31] :) [23:46:45] highest previous day in logs is: 1730 xff.log-20140715.gz [23:47:04] wait... [23:47:31] wc of a gz file no worky right, right? [23:48:03] greg-g: found another one [23:48:05] greg-g: No.
`zcat | wc -l` [23:48:09] * greg-g nods [23:48:15] greg-g: /var/log/udp2log/udp2log.log [23:48:28] logs every 5 minutes the average number of udp packet received :D [23:48:42] usual traffic: 0.0xx k/s [23:49:02] bumps to 2.500 k/s [23:49:29] that gives an indication of the order of magnitude the security audit is causing on beta [23:50:58] Wikimedia Labs / deployment-prep (beta): udp2log logs are not properly rotated / daemon restarted - https://bugzilla.wikimedia.org/70112#c1 (Antoine "hashar" Musso) /var/log/udp2log/udp2log.log may have some clue. Not we had a huge number of requests send for the last few days, so that might have prev... [23:51:28] Wikimedia Labs / deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - https://bugzilla.wikimedia.org/70103#c6 (Greg Grossmeier) gjg@deployment-bastion:/data/project/logs$ grep -c REDACTED xff.log 4960 gjg@deployment-bastion:/data/project/logs/archiv... [23:52:55] we need more test clusters y'all [23:53:10] word [23:53:22] and monitoring [23:53:28] *And* .. yeah, what hashar said [23:53:28] Wikimedia Labs / deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - https://bugzilla.wikimedia.org/70103#c7 (Antoine "hashar" Musso) The udp2log-mw service on deployment-bastion.eqiad.wmflabs logs the average number of packets it receives per secon... [23:54:26] auditor confirmed he started last friday :] [23:54:59] bad timing [23:55:00] So we need to get hiera into ops/puppet so we can do per-project config changes, cleanup a bunch of horrible hacks and create some new labs projects for alternate testing environments. _joe_ is working on the first bit. [23:56:13] Wikimedia Labs / deployment-prep (beta): Search and page loads extremely slow on beta cluster (cause being investigated) - https://bugzilla.wikimedia.org/70103#c8 (Greg Grossmeier) NEW>RESO/FIX Closing this.
Thanks Rummana for the heads up and for all who helped debug this multilayered issue. [23:57:32] clean up == antoine and _joe_ (he offered) and ori? he seems motivated enough ;). creating new == antoine [23:58:19] ok, I'm out, dinner time [23:58:24] thanks a ton hashar [23:58:27] go sleep [23:58:46] greg-g: sure Mr. Grossmeier :-D [23:58:54] at least we found out the root cause [23:58:56] :) [23:58:57] I am quite happy [23:59:03] good :) [23:59:09] and have a long list of things to implement / improve / fix [23:59:14] overall, that is constructive. [23:59:16] I love outages [23:59:54] bd808: yeah hiera definitely a prerequisite imho
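Editor's note: the counting done in the log above (`grep -c` on the live xff.log, `zcat | wc -l` on the rotated .gz files, then "what % of traffic was it") could be sketched roughly as below. This is only an illustration under stated assumptions, not what was actually run on deployment-bastion: the helper names `open_log`, `count_hits` and `share_of_traffic` are hypothetical, matching is a plain substring test rather than a grep regex, and the file paths and address are placeholders.

```python
import gzip
from pathlib import Path


def open_log(path):
    """Open a log as text; rotated .gz files are decompressed on the fly."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def count_hits(path, needle):
    """Return (matching_lines, total_lines), like grep -c plus wc -l."""
    hits = total = 0
    with open_log(path) as handle:
        for line in handle:
            total += 1
            if needle in line:  # fixed-string match, not a regex
                hits += 1
    return hits, total


def share_of_traffic(paths, needle):
    """Percentage of all lines across the given files that contain needle."""
    hits = total = 0
    for path in paths:
        file_hits, file_total = count_hits(path, needle)
        hits += file_hits
        total += file_total
    return 100.0 * hits / total if total else 0.0


if __name__ == "__main__":
    # Placeholder file name pattern and documentation-range address.
    logs = sorted(Path(".").glob("xff.log*"))
    print(f"{share_of_traffic(logs, '192.0.2.1'):.1f}% of logged requests")
```

Reading uncompressed and gzipped files through the same helper sidesteps the "wc of a gz file" problem raised in the log, and it would also cope with rotated files that logrotate left uncompressed.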