[00:33:24] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #218: FAILURE in 23 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/218/ [03:54:30] Yippee, build fixed! [03:54:30] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #229: FIXED in 12 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/229/ [03:59:15] 3Wikimedia / 3Continuous integration: Jenkins: Set $wgHTTPProxy in mediawiki config - 10https://bugzilla.wikimedia.org/59253 (10Krinkle) [04:00:18] (03PS1) 10Krinkle: mwconf: Remove references to pmtpa [integration/jenkins] - 10https://gerrit.wikimedia.org/r/166529 [05:36:44] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #105: FAILURE in 8 min 12 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/105/ [05:40:14] !log Setting up integration-slave1004 and integration-slave1009 ({{bug|71873}} fixed) [05:40:20] Logged the message, Master [05:44:06] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #301: ABORTED in 4 min 11 sec: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/301/ [05:44:34] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #265: ABORTED in 24 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/265/ [05:54:29] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #64: ABORTED in 3 min 26 sec: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/64/ 
[05:54:45] Project browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce build #227: ABORTED in 16 sec: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce/227/ [06:20:16] 3Wikimedia / 3Continuous integration: Jenkins: Figure out long term solution for /tmp management - 10https://bugzilla.wikimedia.org/72011 (10Krinkle) [06:20:18] 3Wikimedia / 3Continuous integration: Jenkins: Figure out long term solution for /tmp management - 10https://bugzilla.wikimedia.org/72011 (10Krinkle) 3NEW p:3Unprio s:3major a:3None While the VM testing will make this problem obsolete, depending on how far away this is, this is a high priority proble... [06:30:18] 3Wikimedia / 3Quality Assurance: QA: Rename /tmp/out.jpg to something else - 10https://bugzilla.wikimedia.org/72012 (10Krinkle) 3NEW p:3Unprio s:3major a:3None I'm not a 100% sure, but I see this on all contint slaves and suspect it comes from browser tests. If that's the case, please: 1) [required... [06:36:32] 3Wikimedia / 3Continuous integration: [Regression] Puppet failing on CI slaves (Trusty) "Unable to locate package python-elasticsearch" - 10https://bugzilla.wikimedia.org/72014 (10Krinkle) 3NEW p:3Unprio s:3critic a:3None https://gerrit.wikimedia.org/r/#/c/163945/ Error: /Stage[main]/Elasticsearch::... [06:43:11] !log Pooled integration-slave1004 [06:43:13] Logged the message, Master [06:43:50] !log Keeping the new integration-slave1009 unpooled because setup could not be completed due to {{bug|72014}}. [06:43:52] Logged the message, Master [06:45:16] 3Wikimedia / 3Continuous integration: [Regression] Puppet failing on CI slaves (Trusty) "Unable to locate package python-elasticsearch" - 10https://bugzilla.wikimedia.org/72014#c1 (10Krinkle) This is affecting integration-slave1006, integration-slave1007, integration-slave1008. But not critical there since t... [06:45:22] And fucking again. 
Can't do shit because ppl keep breaking stuff without testing. [06:45:52] Another day wasted. [06:47:11] :( [06:49:22] !log Did a slow-rotating graceful depool/reboot/repool of all integration-slave's over the past hour to debug problems whilst waiting for puppet to unblock and set up new slaves. [06:49:24] Logged the message, Master [06:49:29] nn o/ [08:06:15] 3Wikimedia / 3Continuous integration: [Regression] Puppet failing on CI slaves (Trusty) "Unable to locate package python-elasticsearch" - 10https://bugzilla.wikimedia.org/72014#c2 (10Antoine "hashar" Musso (WMF)) p:5Unprio>3Normal s:5critic>3normal Indeed https://gerrit.wikimedia.org/r/#/c/163945/ ad... [08:10:15] 3Wikimedia / 3Continuous integration: Jenkins: Figure out long term solution for /tmp management - 10https://bugzilla.wikimedia.org/72011#c1 (10Antoine "hashar" Musso (WMF)) Seems like a duplicate of Bug 68563 - Jenkins: point TMP/TEMP to workspace and delete it after build completion [08:11:30] 3Wikimedia / 3Continuous integration: ci/jenkins: remove dependency on git.wikimedia.org - 10https://bugzilla.wikimedia.org/72001#c2 (10Antoine "hashar" Musso (WMF)) We have two slave scripts (integration/jenkins.git) which still depends on git.wikimedia.org bin/mw-core-get.sh tools/fetch-mw-ext Most job... [08:19:15] 3Wikimedia / 3Quality Assurance: QA: Rename /tmp/out.jpg to something else - 10https://bugzilla.wikimedia.org/72012#c1 (10Željko Filipin) As far as I know, this is not created by browser tests. (But I could be wrong.) I am adding more people to this bug that could know. [08:30:08] hashar: good morning :) [08:30:13] feeling better? 
[08:30:34] (03PS4) 10Zfilipin: Create mediawiki-selenium-bundle-rspec [integration/config] - 10https://gerrit.wikimedia.org/r/166029 (owner: 10Hashar) [08:31:15] zeljkof: yeah much better thanks [08:31:38] zeljkof: one of the rspec job cause ruby 2 to spurt out a stacktrace [08:32:13] hashar: hm, saw that somewhere in gerrit and/or mail [08:33:26] I have +2d a few commits in integration/config yesterday, but I did not deploy any jobs, just in case something breaks :) [08:33:34] (03CR) 10Zfilipin: [C: 032] Create mediawiki-selenium-bundle-rspec [integration/config] - 10https://gerrit.wikimedia.org/r/166029 (owner: 10Hashar) [08:33:50] waiting for the last one to get merged, and we can deploy from master ^ [08:34:12] hashar: this one breaks? https://gerrit.wikimedia.org/r/#/c/159644/ [08:35:41] zeljkof: if you +2 a jjb change, you should refresh the job :] [08:35:58] and yeah https://gerrit.wikimedia.org/r/#/c/159644/ cause a stacktrace [08:36:59] hashar: will do, as soon as the last one merges [08:37:08] (03CR) 10Hashar: "Note that the rspec job for mediawiki/selenium cause a stacktrace with proposed patch https://gerrit.wikimedia.org/r/#/c/159644/" [integration/config] - 10https://gerrit.wikimedia.org/r/166029 (owner: 10Hashar) [08:37:26] I haven't looked at the trace [08:37:51] 00:00:55.992 /mnt/jenkins-workspace/workspace/gems/gems/json-1.8.1/lib/json/common.rb:67: [BUG] Segmentation fault [08:37:51] :D [08:37:52] hashar: I did, and looks like ruby crashed [08:38:00] (03Merged) 10jenkins-bot: Create mediawiki-selenium-bundle-rspec [integration/config] - 10https://gerrit.wikimedia.org/r/166029 (owner: 10Hashar) [08:38:32] hashar: ok, everything merged, deploying the jobs [08:38:52] \O/ [08:44:11] hashar: all jobs updated, let's see if anything breaks :) [08:46:30] \O/ [08:46:39] mediawiki-selenium-bundle-rspec is not triggered though [08:46:49] I think it is only in the experimental pipeline in Zuul conf [08:47:05] i.e. 
needs to be manually triggered by commenting in Gerrit 'check experimental' [08:54:21] hashar: by the way, are you interested in attending selenium workshop? https://www.mediawiki.org/wiki/QA/Selenium_Workshop [08:54:31] (the day after all hands, friday) [08:59:23] yeah probably [08:59:33] have to setup some workshop for Zuul as well [09:20:54] hashar: great, please sign up for the workshop, the space is limited [09:21:18] and please do consider a CI workshop (JJB, zuul and friends) [09:23:31] yeah [09:23:46] probably need to sign up with my wmf account [09:23:52] for which I lost the pass :D [09:24:32] hashar: reset password :) [09:25:00] I would recommend https://lastpass.com/ as a password manager [09:25:06] I use it for years and I love it [09:26:00] !log renamed deployment-cxserver02 node slaves to 03 and updated the ip address [09:26:03] kart_: ^^^ :D [09:26:05] Logged the message, Master [09:26:26] hashar: what ::) [09:36:47] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #201: FAILURE in 21 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/201/ [09:37:28] hashar: uh oh ^ [09:37:36] not sure what is wrong there, checking [09:40:12] oops, looks like it was my mistake [09:40:24] I have refreshed the jobs, but forgot to pull the latest jjb :( [09:40:26] fixing [09:46:39] hashar: how come this job is running on precise, not trusty? 
[09:46:43] https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/202/console [09:46:52] 00:00:00.041 Building remotely on integration-slave1001 (hasPhpUnit hasPhpcs contintLabsSlave UbuntuPrecise) [09:47:23] hmm [09:47:31] Wikimedia / Continuous integration: [Regression] Puppet failing on CI slaves (Trusty) "Unable to locate package python-elasticsearch" - https://bugzilla.wikimedia.org/72014#c3 (Filippo Giunchedi) NEW>RESO/FIX uploaded python-elasticsearch 1.0.0-2chl1~trusty1 to trusty-wikimedia [09:47:35] we haven't migrated them to Trusty yet? :D [09:47:54] hashar: didn't you say that they are migrated? [09:48:04] or did I misunderstand that? [09:48:04] only the *bundle* ones [09:48:13] what do you mean by that? [09:48:18] if mediawiki/selenium now requires ruby2 I guess we need to migrate all the browsertests jobs as well [09:48:41] * 0a8f087 - Migrate bundler based jobs to ruby2.0 (on Trusty) (21 hours ago) [09:48:51] that only changes the bundler job template [09:48:56] not the browser tests [09:49:17] uh [09:49:22] i am slightly lost :) [09:49:28] https://gerrit.wikimedia.org/r/#/c/166049/3/jjb/ruby-jobs.yaml,unified :D [09:49:42] the 'bundle' jjb macro is changed [09:49:49] but that macro is not used by browsertests jobs [09:49:54] I see [09:49:58] they use a copy-pasted version [09:50:21] so we need to update the macro? [09:50:54] so if mediawiki/selenium requires ruby 2.0 we need to adjust all the copy-pasted code [09:51:12] hashar: ok, will work on that [09:51:32] zeljkof: yeah update everything :-D [09:51:34] http://www.yodaquotes.net/try-not-do-or-do-not-there-is-no-try/ [09:51:49] before releasing v1.0 of the gems, I guess we want to enforce ruby2.0 [09:52:18] hashar: I am not sure if that is doable from the gem [09:52:24] https://www.youtube.com/watch?v=BQ4yd2W50No ! [09:52:33] can't Gemfile enforce a ruby version? 
[09:52:35] but I think I saw it can be specified in the gemfile [09:52:48] I think so, saw it somewhere, but never used [09:52:55] the time has come to try it out [09:54:29] bundler as http://bundler.io/v1.7/gemfile_ruby.html [09:54:30] has [09:59:36] Project browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce build #198: FAILURE in 19 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce/198/ [10:01:23] :D [10:01:56] zeljkof: you only changed BROWSER_TIMEOUT='' [10:01:56] https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce/jobConfigHistory/showDiffFiles?timestamp1=2014-10-10_09-24-42&timestamp2=2014-10-14_09-51-02 [10:01:57] :D [10:03:34] hashar: yes, looking :) [10:03:34] but not sure why the jobs fail now [10:04:12] test2 is dead? :D [10:04:20] 00:00:16.226 Scenario: Search suggestions # features/smoke.feature:15 [10:04:20] 00:00:16.644 Timeout::Error (Timeout::Error) [10:04:24] Browser request was cancelled before a Sauce Labs virtual machine was found [10:04:28] https://saucelabs.com/tests/922d43ce06cc4e5883cdc402df16f78a [10:04:34] I wish that message showed the URL it attempts to reach [10:05:47] could it be that export BROWSER_TIMEOUT= sets a timeout of 0 seconds? [10:05:56] hashar: yes, testing [10:06:07] timeout = ENV["BROWSER_TIMEOUT"].to_i [10:06:07] hehe [10:06:38] which would end up setting a timeout of 0 [10:06:50] probably doesn't play well with client = Selenium::WebDriver::Remote::Http::Default.new [10:08:08] Yippee, build fixed! 
[10:08:09] Project browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce build #199: FIXED in 1 min 30 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce/199/ [10:10:17] hashar: yes, reverting the timeout commit fixed the problem [10:11:18] browser_timeout: 360 [10:11:22] it is only set in one place [10:11:44] hashar: yes, it should not be set elsewhere [10:11:55] will see if the fix is easier to do in jjb or mw_sel [10:12:10] so either mediawiki/selenium should only set it when it is actually a number [10:12:21] (i.e. skip setting the timeout if the env variable is an empty string) [10:12:33] or our jjb macro should only use export BROWSER_TIMEOUT={browser_timeout} [10:12:43] when {browser_timeout} is not an empty string [10:12:52] which is the default: jjb/job-templates-browsertests.yaml: browser_timeout: '' [10:12:56] I would wrap the export [10:13:04] and leave mediawiki/selenium untouched [10:13:24] something like: [10:13:26] hashar: yes, releasing the new version of the gem is harder than fixing jjb [10:13:52] if [ -n "{browser_timeout}" ]; then export BROWSER_TIMEOUT={browser_timeout}; fi [10:13:52] Tobi_WMDE_SW: timeout change broke all the builds :) [10:14:12] that export is set in two places [10:20:24] hashar: thanks for the suggestion, it fixed the problem https://integration.wikimedia.org/ci/view/BrowserTests/view/CirrusSearch/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/203/console [10:20:31] pushing the change to gerrit [10:20:45] https://integration.wikimedia.org/ci/view/BrowserTests/view/CirrusSearch/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/jobConfigHistory/showDiffFiles?timestamp1=2014-10-14_09-42-06&timestamp2=2014-10-14_10-18-56 :D [10:20:49] Yippee, build fixed! 
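The guard proposed at 10:13:52 can be sketched as a small shell function. This is only an illustration under assumptions: the function name `export_browser_timeout` is hypothetical, and its argument stands in for the value JJB would substitute for `{browser_timeout}` (empty by default, per the log). It is not the macro that was merged.

```shell
#!/bin/sh
# Hypothetical helper: $1 plays the role of the value JJB substitutes
# for {browser_timeout}. The template default is the empty string.
export_browser_timeout() {
    browser_timeout="$1"
    # Only export when a value was actually configured, so the consumer
    # sees the variable as absent rather than as an empty string.
    if [ -n "$browser_timeout" ]; then
        export BROWSER_TIMEOUT="$browser_timeout"
    fi
}
```

Calling `export_browser_timeout ""` leaves the environment untouched, while `export_browser_timeout 360` exports `BROWSER_TIMEOUT=360`, matching the one job that sets `browser_timeout: 360`.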
[10:20:49] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #203: FIXED in 1 min 21 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/203/ [10:20:59] well done [10:21:07] going to lunch with my wife [10:21:12] will be back later on [10:21:44] feel free to start the work on migrating the browsertests to Trusty / ruby2 :] [10:22:04] hashar: gladly, but I have a ton of other stuff to do, probably will not have the time [10:22:33] (03PS1) 10Zfilipin: Fix broken jobs when BROWSER_TIMEOUT is not set [integration/config] - 10https://gerrit.wikimedia.org/r/166547 [10:24:14] zeljkof: well ruby2 is probably going to be a requirement nothlesss isn't it ? [10:24:21] hashar: soon [10:24:24] but not today :) [10:24:43] (03CR) 10Hashar: [C: 031] "Spotted by Zeljkof and confirmed to work." [integration/config] - 10https://gerrit.wikimedia.org/r/166547 (owner: 10Zfilipin) [10:24:45] :D [10:24:46] there is just stuff that I needed to do last week/month [10:24:53] hehe [10:24:58] I am off for lunch [10:25:03] before taking on a new project [10:27:39] (03PS1) 10Zfilipin: Update documentation on how to use jenkins-jobs [integration/config] - 10https://gerrit.wikimedia.org/r/166548 [10:42:38] (03PS1) 10Zfilipin: ZeroPortal has browser tests [selenium] - 10https://gerrit.wikimedia.org/r/166549 [10:43:00] (03CR) 10jenkins-bot: [V: 04-1] ZeroPortal has browser tests [selenium] - 10https://gerrit.wikimedia.org/r/166549 (owner: 10Zfilipin) [11:27:43] hashar: welcome back :) [11:28:07] I am just finishing config commit for zero portal, not sure if I did everything [11:28:12] will push it in a minute [11:29:00] (03PS1) 10Zfilipin: ZeroPortal has browser tests [integration/config] - 10https://gerrit.wikimedia.org/r/166553 [11:30:16] hashar: ^ [11:30:32] also not sure why this one fails https://gerrit.wikimedia.org/r/#/c/166549/ [11:31:13] looks like bundle exec yard [11:31:17] breaks 
ruby 2.0 [11:31:19] or something [11:32:08] arghh [11:32:24] (03CR) 10Zfilipin: "recheck" [selenium] - 10https://gerrit.wikimedia.org/r/166549 (owner: 10Zfilipin) [11:32:28] 00:00:20.491 /mnt/jenkins-workspace/workspace/gems/gems/json-1.8.1/lib/json/common.rb:67: [BUG] Segmentation fault [11:36:35] this looks related https://bugs.ruby-lang.org/issues/9444 [11:38:16] looking at the json common.rb file there is: [11:38:17] const_set :SAFE_STATE_PROTOTYPE, State.new [11:38:22] might need to be const_set :SAFE_STATE_PROTOTYPE, State.new() [11:38:42] (03PS2) 10Zfilipin: ZeroPortal has browser tests [integration/config] - 10https://gerrit.wikimedia.org/r/166553 [11:39:17] Project browsertests-ZeroPortal-zero.wikimedia.org-linux-firefox-sauce build #1: FAILURE in 1.3 sec: https://integration.wikimedia.org/ci/job/browsertests-ZeroPortal-zero.wikimedia.org-linux-firefox-sauce/1/ [11:39:19] AHHH [11:39:42] zeljkof: soo [11:39:51] we maintain a gem cache on a per instance basis [11:39:58] but that gem cache does not vary by ruby version :D [11:40:01] so [11:40:06] hashar: :) [11:40:16] yes, that would be a problem [11:40:23] we probably end up using something compiled against ruby 1.9.3 [11:40:28] which would definitely not work with ruby2 [11:41:08] (03CR) 10Zfilipin: "The jobs fails:" [integration/config] - 10https://gerrit.wikimedia.org/r/166553 (owner: 10Zfilipin) [11:41:15] we need to add the ruby version in GEM_HOME [11:41:45] $ ruby --version [11:41:45] ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0] [11:41:46] pff [11:41:55] so stupidly long [11:42:05] (03CR) 10Zfilipin: "Trying to format the message, one more time." 
[integration/config] - 10https://gerrit.wikimedia.org/r/166553 (owner: 10Zfilipin) [11:43:10] $ ruby -e 'puts RUBY_VERSION' [11:43:10] 2.1.2 [11:43:11] \O/ [11:43:37] exactly what I wanted to propose :) [11:43:42] http://ruby.about.com/od/advancedruby/a/How-To-Display-Your-Ruby-Version-In-Your-Bash-Prompt.htm [11:44:06] I am amending the JJB copy pasted macros [11:46:24] great [11:46:28] * zeljkof is out of lunch [11:48:37] (03CR) 10Hashar: "So the stracktrace is because the json gem has some compiled module. But the cached version in GEM_HOME is for ruby1.9.x which obviously d" [selenium] - 10https://gerrit.wikimedia.org/r/159644 (owner: 10Dduvall) [11:51:05] (03PS1) 10Hashar: Namespace GEM_HOME based on ruby version [integration/config] - 10https://gerrit.wikimedia.org/r/166555 [11:52:53] (03PS2) 10Hashar: Namespace GEM_HOME based on ruby version [integration/config] - 10https://gerrit.wikimedia.org/r/166555 [11:52:59] (03CR) 10Hashar: "recheck" [selenium] - 10https://gerrit.wikimedia.org/r/166549 (owner: 10Zfilipin) [11:57:12] (03PS3) 10Hashar: Namespace GEM_HOME based on ruby version [integration/config] - 10https://gerrit.wikimedia.org/r/166555 [12:00:37] zeljkof: hashar: oh.. [12:00:40] what? [12:00:42] no [12:00:54] damn [12:00:56] :) [12:00:59] sorry for that [12:01:43] is it that export BROWSER_TIMEOUT= sets the timeout to 0? [12:01:53] why? wouldn't that unset the env var? 
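The GEM_HOME idea discussed from 11:39 onward (one gem cache per Ruby version, using `ruby -e 'puts RUBY_VERSION'`) might look roughly like the sketch below. The helper name `versioned_gem_home` and the `<base>/<version>` layout are assumptions made for illustration; the actual change is https://gerrit.wikimedia.org/r/166555 and may differ.

```shell
#!/bin/sh
# Hypothetical helper: print a gem directory that varies by Ruby version.
#   $1  base directory of the gem cache
#   $2  optional Ruby version; when omitted, ask the ruby on PATH
versioned_gem_home() {
    base_dir="$1"
    # The command substitution only runs when no version was passed in.
    ruby_version="${2:-$(ruby -e 'puts RUBY_VERSION')}"
    printf '%s/%s\n' "$base_dir" "$ruby_version"
}
```

A job could then run something like `export GEM_HOME="$(versioned_gem_home /mnt/jenkins-workspace/workspace/gems)"`, so that gems compiled against ruby 1.9.3 and ruby 2.x end up in separate directories.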
[12:02:54] Tobi_WMDE_SW: yeah [12:03:11] so in JJB BROWSER_TIMEOUT='{browser_timeout}' would define the env variable [12:03:14] set to an empty string [12:03:33] and mediawiki selenium casts that empty string to an integer (using to_i) [12:03:40] which for an empty string yields 0 [12:04:26] (CR) Hashar: "check experimental" [selenium] - https://gerrit.wikimedia.org/r/159644 (owner: Dduvall) [12:07:03] (CR) Hashar: "recheck" [selenium] - https://gerrit.wikimedia.org/r/159644 (owner: Dduvall) [12:13:00] (CR) Hashar: "So the Jenkins job issue should be fixed by varying the gem cache based on the ruby version being used. https://gerrit.wikimedia.org/r/#/" [selenium] - https://gerrit.wikimedia.org/r/159644 (owner: Dduvall) [12:13:58] (CR) Hashar: "Gave it a try for the job based on the bundle macro (i.e. the ones using ruby2 / gem2.0): https://gerrit.wikimedia.org/r/#/c/159644/ and h" [integration/config] - https://gerrit.wikimedia.org/r/166555 (owner: Hashar) [12:14:31] (CR) Hashar: "Magically fixed by varying the gem cache with the ruby version being used: https://gerrit.wikimedia.org/r/#/c/166555/" [selenium] - https://gerrit.wikimedia.org/r/166549 (owner: Zfilipin) [12:14:45] * zeljkof is back [12:17:40] (CR) Hashar: Namespace GEM_HOME based on ruby version (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/166555 (owner: Hashar) [12:19:48] (CR) Zfilipin: Namespace GEM_HOME based on ruby version (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/166555 (owner: Hashar) [12:24:13] (CR) Hashar: Namespace GEM_HOME based on ruby version (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/166555 (owner: Hashar) [12:30:19] hashar: is there a way to move this commit to the correct repo? https://gerrit.wikimedia.org/r/#/c/152926/ [12:30:31] or should I recreate the patch in the correct repo? [12:31:01] yeah [12:31:05] cherry pick it :D [12:31:17] hashar: how? 
from gerrit? [12:31:23] cd /path/to/your/checkout/of/integration/config [12:31:33] then on the change https://gerrit.wikimedia.org/r/#/c/152926/ [12:31:41] below the Patch set information there is a Download: [12:31:51] choose cherry-pick and anonymous HTTP [12:31:57] that craft a command that you can copy paste [12:32:03] and thus easily fetch and apply the patch [12:32:12] hashar: thanks! [12:32:13] doing it [12:32:27] git might be smart enough to detect the file renaming [12:34:05] it was [12:34:07] fixing conflict [12:34:29] Project beta-scap-eqiad build #25466: FAILURE in 0.43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/25466/ [12:35:20] 12:34:28 12:34:28 scap failed: LockFailedError Failed to lock /var/lock/scap: [Errno 11] Resource temporarily unavailable (duration: 00m 00s) [12:37:46] hashar: no, it was not [12:38:05] the repo now has both layout.yaml and zuul/layout.yaml [12:38:14] and I am not sure how to merge them [12:41:31] 3Wikimedia / 3Continuous integration: [upstream] Jenkins: jobs created via JJB are not properly registered in Zuul Gearman server - 10https://bugzilla.wikimedia.org/63758#c4 (10Antoine "hashar" Musso (WMF)) I have upgraded Jenkins Gearman plugin to fix jobs registrations: * cherry picked https://review.opens... [12:53:15] Yippee, build fixed! 
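The cherry-pick command that Gerrit crafts (12:31:23 to 12:32:03) fetches a change ref and applies it. As a rough sketch, the ref follows Gerrit's `refs/changes/<last two digits>/<change>/<patch set>` layout; the helper name `gerrit_change_ref` is hypothetical, and the patch set number is not stated in the log, so it is left as a parameter.

```shell
#!/bin/sh
# Hypothetical helper: build the Gerrit ref for a change and patch set.
#   $1  change number (e.g. 152926)
#   $2  patch set number (not given in the log; the caller must supply it)
gerrit_change_ref() {
    change="$1"
    patchset="$2"
    # Gerrit shards refs by the last two digits of the change number.
    shard=$(printf '%02d' $((change % 100)))
    printf 'refs/changes/%s/%s/%s\n' "$shard" "$change" "$patchset"
}
```

From a checkout of integration/config, the copy-pasted command would then be along the lines of `git fetch <anonymous HTTP URL of the source repo> "$(gerrit_change_ref 152926 <patch set>)" && git cherry-pick FETCH_HEAD`, with both placeholders taken from the Download box on the change page.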
[12:53:16] Project beta-scap-eqiad build #25467: FIXED in 8 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/25467/ [12:56:25] (03PS1) 10Zfilipin: Added job template and builder that runs rubocop, Ruby linter [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [12:57:04] (03Abandoned) 10Zfilipin: Added job template and builder that runs rubocop, Ruby linter [integration/jenkins-job-builder-config] - 10https://gerrit.wikimedia.org/r/152918 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [12:57:20] (03CR) 10Zfilipin: "Moved from https://gerrit.wikimedia.org/r/#/c/166563/" [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:02:59] (03PS2) 10Zfilipin: WIP Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [13:03:23] (03CR) 10Zfilipin: "Moved to https://gerrit.wikimedia.org/r/#/c/166563/" [integration/zuul-config] - 10https://gerrit.wikimedia.org/r/152926 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:07:18] (03CR) 10jenkins-bot: [V: 04-1] WIP Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:17:28] (03PS3) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [13:20:33] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:25:56] (03PS4) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 
10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [13:29:10] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:35:21] (03PS5) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [13:38:29] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:39:16] ;:( [13:39:39] zeljkof: you have to run jjb to define the jobs [13:39:55] hashar: ah, that is why it is failing? [13:40:01] will do [13:40:09] yeah look at bottom of https://integration.wikimedia.org/ci/job/integration-zuul-layoutvalidation/1885/console [13:40:22] Job VisualEditor-rubocop not defined [13:40:29] maybe the message should be made nicer :D [13:40:42] I took a look but did not notice that :) [13:40:50] it should be RED! :) [13:41:56] that is build in zuul server though [13:45:08] what does that mean? [13:45:26] (03PS6) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [13:47:39] zeljkof: sorry [13:47:58] zeljkof: I meant the message reported "Job foobar not defined", is a message in Zuul server [13:48:04] we can't really tweak it easily [13:48:51] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [13:49:20] hashar: hm, looks like the above patch does not _create_ any jobs :) [13:49:25] looking... 
[13:49:52] https://integration.wikimedia.org/ci/job/integration-jjb-config-diff/1437/console [13:50:57] well you are invoking 'mwext-{name}-rubocop' [13:51:07] so that surely creates a bunch of 'mwext-*-rubocop' jobs [13:51:15] you have to run jenkins job builder now [13:51:20] to have the job created in jenkins [13:51:42] hashar: but take a look at this https://integration.wikimedia.org/ci/job/integration-jjb-config-diff/1437/console [13:51:48] no new jobs are created [13:51:55] oh [13:52:02] that is a bug :] [13:52:16] I have tried it locally, not jobs created either [13:52:26] I have messed up something somewhere [13:52:50] ah yeah [13:52:59] so you are creating a template name: '{name}-rubocop' [13:53:01] in JJB [13:53:06] but it is not applied to any JJB project [13:53:19] trying to figure out how to move the commit to use job-template '{name}-bundle-{bundlecommand}' as you have suggested here https://gerrit.wikimedia.org/r/#/c/152918/ [13:53:24] though you applied those new jobs in Zuul layout config https://gerrit.wikimedia.org/r/#/c/166563/6/zuul/layout.yaml [13:54:03] the template '{name}-rubocop' should be invoked in mediawiki-extensions.yaml [13:54:36] ie under a: [13:54:52] - project: [13:54:52] name: mwext-ContentTranslation [13:54:52] jobs: [13:54:52] - '{name}-rubocop' [13:55:59] hashar: I see [13:56:05] doing... [14:02:03] (03PS7) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [14:05:17] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [14:06:28] hashar: ok, extensions are here https://integration.wikimedia.org/ci/job/integration-jjb-config-diff/1438/consoleFull [14:06:31] deploying [14:07:47] what is rubocop and should I be afraid? 
[14:08:00] Nikerabbit: you should be _very_ afraid :P [14:08:09] it is a ruby linter [14:08:25] helping is have beautiful ruby code by complaining if you commit ugly code [14:08:49] not deployed yet, when deployed will be non voting until we are sure it works fine [14:09:07] (03CR) 10Zfilipin: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [14:11:36] hashar: progress, down to just a few missing jobs https://integration.wikimedia.org/ci/job/integration-zuul-layoutvalidation/1888/console [14:15:47] (03CR) 10Hashar: "The wiki page https://github.com/sebastianbergmann/phpunit/wiki/ChangeLog-for-PHPUnit-3.7" [integration/phpunit] - 10https://gerrit.wikimedia.org/r/164683 (owner: 10BryanDavis) [14:16:14] (03CR) 10Hashar: "recheck" [integration/phpunit] - 10https://gerrit.wikimedia.org/r/164683 (owner: 10BryanDavis) [14:18:20] (03PS8) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [14:18:40] hashar: the only thing I can not figure out is where to define VisualEditor-rubocop job [14:19:48] there is VisualEditor-ruby1.9.3lint, so when I find it, I will put rubocop job there [14:19:52] _when_ I find it... [14:20:52] found it! 
[14:21:02] in browsertests.yaml, not intuitive at all [14:21:22] but I do remember a problem with it, so it got moved from another file [14:21:26] now it does make sense [14:21:35] (CR) jenkins-bot: [V: -1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin) [14:22:40] (PS9) Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [14:33:09] chrismcmahon: welcome back :) [14:36:13] zeljkof: yeah the browser tests are defining the jjb VisualEditor project [14:36:18] yet another thing to refactor [14:36:19] :D [14:36:47] hashar: the commit is now all green :) https://gerrit.wikimedia.org/r/#/c/166563/ [14:37:06] but I am not sure how to make it use job-template '{name}-bundle-{bundlecommand}' [14:37:11] like you have suggested here [14:37:15] https://gerrit.wikimedia.org/r/#/c/152918/ [14:37:42] so there https://gerrit.wikimedia.org/r/#/c/166563/9/jjb/macro.yaml [14:37:49] the macro is largely a duplicate of the bundle macro [14:38:31] so it could be a '{name}-bundle-{bundlecommand}' template [14:38:40] being passed 'rubocop' as the bundlecommand [14:38:51] that will do the GEM_HOME hack and gem install [14:39:24] though my bundle command macro does not let you pass additional parameters to the exec [14:39:34] i.e. --format progress --format html --out log/rubocop.html [14:39:40] hashar: ok, that is optional [14:39:49] we can improve the macro later [14:40:00] now is a good time to reduce some duplication [14:40:40] with python tox [14:40:59] the target env runs a command which is specified in a tox.ini file and can thus take extra args [14:41:19] ok, looking at mediawiki-misc.yaml, I think I have an idea on how to use [14:41:26] the bundlecommand [14:42:32] will amend the commit [14:42:43] if we could 
define default arguments passed per command that would be nice [14:42:53] i.e. jenkins will invoke bundle exec rubocop [14:43:14] and would read some .bundlerc or .rubocoprc to have some extra args defined [14:45:07] do we have something similar, or should this be implemented? [14:45:16] needs to be implemented [14:45:25] can we define our own command with bundler? [14:45:42] if we could define a : bundle doc [14:45:51] hashar: hm [14:45:53] and have that 'doc' execute rubocop --some --parameters [14:45:53] not sure [14:45:55] that would be nice :D [14:46:04] will check [14:46:11] rubocop probably has a configuration file [14:46:17] ok, for now will move everything to the bundlecommand [14:46:17] so we can just run bundle exec rubocop [14:46:29] hashar: yes, there is a config file there [14:46:38] and expect the rubocop file to publish the report at the same place everything [14:46:44] not sure what options are available, but it should have what we need [14:47:51] yeah [14:47:52] .rubocop.yml [14:48:09] https://github.com/bbatsov/rubocop#defaults [14:49:06] hashar: yes, we use it to ignore violations we do not care about [14:49:17] $ rubocop --format html --out log/rubocop.html [14:49:17] No such file or directory @ rb_sysopen - log/rubocop.html [14:49:18] bah [14:49:18] we could just extend the file with formatters [14:49:28] it doesn't know how to create the directory :-/ [14:49:46] probably expects the directory to exist [14:52:45] https://github.com/bbatsov/rubocop/issues/1389 :D [14:52:46] yeah [14:52:47] filed [14:53:32] hi zeljkof, nice to be back [14:53:44] at worst we could have the output dir set to / [14:53:50] and look it up there [14:57:27] hashar: do we even need this job-template https://gerrit.wikimedia.org/r/#/c/166563/9/jjb/job-templates.yaml,unified [14:57:38] if we use '{name}-bundle-{bundlecommand}' template [14:57:40] ?
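(Editor's sketch, not from the channel.) The `No such file or directory @ rb_sysopen` error at 14:49:17 comes from the `log/` directory not existing when the HTML report is opened for writing. One possible workaround, until the upstream issue is addressed, is to create the directory before invoking rubocop. The helper names `ensure_report_dir` and `rubocop_command` are hypothetical; only the `--format` and `--out` flags are taken from the log above.

```ruby
require 'fileutils'

# Hypothetical helper: create the parent directory of the report path
# (no-op if it already exists) and return that directory.
def ensure_report_dir(report_path)
  dir = File.dirname(report_path)
  FileUtils.mkdir_p(dir)
  dir
end

# Hypothetical helper: build the argument list quoted in the log,
# after making sure the output directory is there. Nothing is executed here.
def rubocop_command(report_path)
  ensure_report_dir(report_path)
  ['bundle', 'exec', 'rubocop',
   '--format', 'progress',
   '--format', 'html', '--out', report_path]
end
```

A job could then run `system(*rubocop_command('log/rubocop.html'))`; whether the Jenkins macro would be the right place for this is left open in the discussion.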
[14:57:58] yeah that template could be removed [14:58:16] the issue is that for bundlecommand == rubocop , you still need to pass extra arguments [14:58:18] hashar: done [14:58:33] :-/ [14:58:34] hashar: for now just having console output is fine [14:58:44] we can think of html reports later [15:05:30] (03PS10) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [15:07:08] hmm [15:07:14] I found a way but I am not happy with it [15:08:09] * hashar blames python .format() [15:10:07] (03CR) 10jenkins-bot: [V: 04-1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [15:13:31] (03PS11) 10Zfilipin: Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) [15:13:42] hashar: ok, that should be it ^ [15:14:03] but I think job names now include "bundle", so I should fix that [15:14:12] waiting to see if gerrit complains :) [15:14:16] * zeljkof brb [15:33:36] Reedy: heya, can you help moritz get this merged? https://gerrit.wikimedia.org/r/#/c/166410/ [15:34:40] hmm [15:34:53] zeljkof left oh no [15:35:45] * zeljkof is back [15:37:28] (03CR) 10Hashar: [C: 031] "Good enough for now." [integration/config] - 10https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: 10Zfilipin) [15:37:35] zeljkof: !!!!!!!!!!!!!!!! [15:37:42] zeljkof: so [15:37:55] zeljkof: I wanted to be able to pass extra args to the bundle macro [15:38:05] i.e. 
--format html --out log/rubocop.html [15:38:32] which means changing the job template and builder macro to expand a variable such as {extraargs} [15:38:50] which in turn means extraargs would always need to be set even when we don't need to pass anything [15:39:03] (since JJB throws a stacktrace whenever a variable is not set) [15:43:27] ok [15:43:47] but it can be empty string, right? [15:45:37] (CR) Zfilipin: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin) [15:49:06] (CR) jenkins-bot: [V: -1] Run rubocop, Ruby linter, for all repositories that have Ruby code [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin) [15:52:02] (CR) Zfilipin: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin) [16:01:29] hashar: twentyafterfour meeting ping :) [16:02:50] (CR) Zfilipin: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/166563 (https://bugzilla.wikimedia.org/69245) (owner: Zfilipin) [16:06:33] error [16:06:35] stupid hangout [16:06:47] (this web page has a redirection loop) [16:06:55] hashar: clear cache? [16:07:00] another browser? [16:07:09] na my session has been invalidated [16:07:41] someone at google cleared my session id ! [16:07:50] :/ [16:10:31] and etherpad is dead [16:15:48] I'm getting the stupid redirect loop too [16:15:58] yeah my session got invalidated [16:16:02] i logged out / logged in [16:16:05] and that worked again [16:21:08] I can hear you all [16:21:17] people are waking up and getting on wifi at the hotel and it suuuucks [16:21:45] ok, I can't connect anymore :/ [16:21:47] greg-g: let's do it by phone?
[16:22:05] bah, just update the etherpad with sentences so i can read it [16:22:20] lemme add some things real quick [16:22:30] hashar: have everyone look in here, please :) [16:22:40] greg-g: can we call you? [16:22:59] greg-g: we need your phone number to call you [16:23:17] oh, sure [16:23:38] just pm'd to marxarelli [16:23:58] forgot hangouts could do that for free [16:30:56] Reedy: any clue about mathoid status ? [16:31:08] Yeah, I'll deploy it later [16:31:15] The commit was only made yesterday :P [16:31:18] oh, mathoid? [16:31:22] yeah [16:31:55] Reedy: greg is wondering in the audio whether it should be deployed to prod this week [16:32:11] since apparently a bunch of code hasn't run / been tested on beta cluster yet [16:32:19] (hearsay, I am lacking all the background) [16:32:47] I noticed Moritz's mail from yesterday but was sick and did not follow up today :/ [16:33:48] I've honestly no idea [16:35:47] As for production, is it even ready? [16:35:55] i have no clue [16:36:10] in my opinion: if we have no idea whether it is prod ready, we shouldn't deploy it [16:36:14] and wait +1 week [16:36:18] heh, yeah [16:36:23] I'm sure alex was only just working on it [16:36:44] from a week ago... [16:36:45] [15:39:33] RECOVERY - mathoid on sca1002 is OK: HTTP OK: HTTP/1.1 200 OK - 301 bytes in 0.038 second response time [16:36:45] [15:39:37] yes! [16:36:45] [15:40:02] RECOVERY - mathoid on sca1001 is OK: HTTP OK: HTTP/1.1 200 OK - 301 bytes in 0.022 second response time [16:36:45] [15:40:05] where physikerwelt ? he would be thrilled with this... [16:36:46] [15:40:35] <_joe_> a nodejs app responding correctly to health checks? I'm impressed as well [16:36:52] just have to be honest with Moritz [16:37:55] If it hasn't been tested in beta either...
[16:38:14] yeah I have no idea whether it has been tested [16:38:27] it probably doesn't have any feature switch since that sounds to me like a huge rewrite /change [16:38:33] maybe it can be done out of the main wmf train [16:38:41] in a dedicated deployment slot [16:38:45] sounds safer to me [16:38:57] but then , I have no idea what are the impacts of their change [16:40:04] Depends if using that rendering becomes default etc [16:41:59] yeah I guess we need more details [16:42:03] and a better plan :-D [16:42:23] mind following up on ops list , not sure whether Moritz is subscribed to it [16:42:37] I am attending an other org board meeting tonight so can't really handle it :( [16:44:41] He usually ends up PMing me when he wants things doing :P [16:47:02] Reedy: yeah, see Moritz' email i forwarded to ops, hopefully someone will approve his reply to the list soon [16:47:24] I can forward you his reply [16:47:37] oh, no, you alrady got it [16:47:44] so, see that email from him :) [16:47:56] alright, i should run, I'll be contactable via email mostly [16:48:22] Moritz quote a change to enable it on beta https://gerrit.wikimedia.org/r/#/c/166410/ [16:48:26] that needs a review + 2 [16:48:31] would apparently let him test on beta [16:48:38] Right [16:48:45] Which I'll deploy in my window in just over an hour [16:48:46] then I would probably get it deployed on both wmf branches but outside of the usual wmf train [16:48:56] so we can have gabriel / moritz / math expert present [16:49:02] It missed SWAT :P [16:49:04] and make sure it does not screw up a wmf train [16:49:11] yeah SWAT it :-] [16:49:38] I mean for prod [16:49:43] cause https://gerrit.wikimedia.org/r/#/c/166410/ can probably land in anytime [16:50:19] Reedy: I am rushing out, be back tomorrow. Please follow up with moritz on ops list :] [16:50:29] we can talk about it again tomorrow morning [16:51:04] greg-g: backing up labs instance is supported by openstack but it has never been done on labs. 
Everything should be in puppet anyway so I guess it is not that much of ya priorit [16:51:08] a priority [16:51:39] getting stuff in puppet seems a better idea (if it isn't) [16:51:47] almost everything is in puppet [16:52:07] but when creating a new instance the IP change which has some impacts in our puppet and mediawiki-config repos [16:52:10] not a big deal [16:52:39] the data backend are not backed up either (redis, mysql db, elasticsearch ..) [16:52:40] we have the same issue in production [16:52:46] then it takes ages before someone notices :P [16:53:01] well, only mysql really needs backing up [16:53:08] elasticsearch can be rebuilt [16:53:14] redis is temporary data [16:53:33] so we need a bug to backup beta mysql dbs :D [16:54:01] off to board meeting. see you tomorrow [17:27:44] OK, I think I have dismissed all my email from vacation. If I owe anyone anything, please say something! [19:07:15] Project beta-scap-eqiad build #25509: FAILURE in 1 min 8 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/25509/ [19:08:08] chrismcmalunch, marxarelli: how can browsertest n00bs figure out where things are documented? I believe features/support/pages is the cheezy thing, but e.g. is when_present() part of cucumber, watir, selenium_driver, WebDriver ? [19:10:54] Yippee, build fixed! 
[19:10:55] Project beta-scap-eqiad build #25510: FIXED in 58 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/25510/ [19:15:16] spagewmf: the browser-testing articles are probably the best place to start (https://www.mediawiki.org/wiki/Quality_Assurance/Browser_testing/Writing_tests) [19:17:44] spagewmf: though it seems we might need a more cohesive section that goes over the various libs in use, and what each of them provides [19:23:22] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #120: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/120/ [19:41:07] spagewmf: google "when_present" and the top hit for me is http://watirwebdriver.com/waiting/ [19:42:25] my mistake was adding 'cucumber ', I got The downy mildew of the cucumber: what it is and how to ... :) [19:44:38] chrismcmahon: BTW I have workaround for the timeout you reported, https://gerrit.wikimedia.org/r/166625 , should reduce Flow test flapping [19:45:00] chrismcmahon: I thought you were on vacation [19:45:19] spagewmf: I'm back from vacation today, catching up on stuff. [19:53:06] spagewmf: I was resisting increasing the timeout for creating a new topic because I wanted to be reminded that there is still an underlying performance issue I think [19:56:10] chrismcmahon: right. Could tests "fail" performance if actions take more than X seconds, but still pass if they take less than Y seconds? Or I think Jon Robson suggested failing test immediately retries. [20:00:42] in my experience browser tests are not a good tool for testing performance... although they can uncover pain points [20:03:12] Nikerabbit: right. It's interesting that add topic takes a long time, it's really interesting if the time increases and stays higher. But extracting useful performance info from test runs seems like A.I. :) [20:11:06] Yippee, build fixed! 
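(Editor's sketch, not from the channel.) One reading of the two-threshold idea raised at 19:56:10 is to time a step and classify it as fine, slow, or failed, so a slow run is reported without failing the build. The helper `timed_step` and its `warn_after` / `fail_after` parameters are hypothetical and are not part of watir-webdriver or any library named in the log.

```ruby
# Hypothetical helper: run a block, measure wall-clock time with a
# monotonic clock, and classify the result against two thresholds (seconds).
def timed_step(warn_after:, fail_after:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  status = if elapsed > fail_after
             :failed   # slower than the hard limit
           elsif elapsed > warn_after
             :slow     # passes, but worth reporting
           else
             :ok
           end
  [status, elapsed]
end
```

A step definition could wrap a wait call in `timed_step(warn_after: 5, fail_after: 30) { ... }` and log the `:slow` cases. As noted at 20:00:42, timings gathered this way from browser tests are noisy and are better treated as hints than as measurements.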
[20:11:07] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #106: FIXED in 9 min 32 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/106/ [20:13:46] bd808S: does the 'S' stand for 'secure'? :) [20:13:56] also I can't ssh into deployment-sca01, and it's not sending metrics either... [20:13:58] is it dead? [20:14:24] greg-g: ^ do you know? [20:14:41] it looks like a mathoid/citoid box... [20:14:53] there's http://watirwebdriver.com/page-performance/ , but seems no way to get timings from wait_until_present, when_not_visible, etc. [20:17:12] YuviPanda: yeah, can't ssh into it either [20:17:20] is eet deaadd? [20:17:46] Icinga says so [20:17:58] http://icinga.wmflabs.org/cgi-bin/icinga/extinfo.cgi?type=1&host=deployment-sca01.eqiad.wmflabs [20:18:32] JohnLewis: yeah, that's what I'm investigating :) [20:18:40] Oh :p [20:19:29] YuviPanda: just looked at the console output for it [20:21:01] Doesn't seem too happy. I'm wondering whether a reboot of it is a) useful b) not going to kill other stuff [20:21:19] JohnLewis: well, it's dead. might as well reboot :) [20:22:56] rebooted. Do we have the loggie bots in here (as I saw it was being discussed at one point) or? :p [20:24:05] oh qa-morebots I assume it is [20:24:08] yeah [20:24:13] !log rebooted deployment-sca01 [20:24:17] Logged the message, Master [20:27:04] Yippee, build fixed! [20:27:04] Project browsertests-VisualEditor-production-linux-firefox-sauce build #24: FIXED in 1 hr 27 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-production-linux-firefox-sauce/24/ [20:30:05] YuviPanda: changed nothing so I just poked andrew to take a peak and see if he can help root the issue down :) [20:30:26] JohnLewis: ok :) might want to poke gwicke as well [20:31:46] YuviPanda: andrewbogott said kill and recreate so I just poked gwicke in -operations [20:31:54] ok! 
[20:33:18] YuviPanda: no idea re -sca (/me is at MW Core Team offsite) [20:33:25] arthur doesn't let us use our laptops [20:33:42] greg-g: hah! ok :) JohnLewis has been helpful and poked gwicke [20:34:03] :) [20:34:09] * greg-g goes [20:42:44] !log deleted and recreated deployment-sca01 (still needs puppet set up) [20:42:46] Logged the message, Master [20:45:21] JohnLewis: cool, let me archive the metrics from it [20:45:31] YuviPanda: alright [20:45:38] done [20:45:42] RECOVERY - BetaLabs: Puppet failure events on labmon1001 is OK: OK: All targets OK [20:46:03] do beta labs HHVM errors get logged anywhere? I don't see any in deployment-bastion:/data/project/logs [20:46:36] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #266: STILL FAILING in 54 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/266/ [20:46:37] spagewmf: I think they got logged to the mediawiki instances at one point. Check there? [20:48:55] JohnLewis: thx but I don't know which one. I have the HTML of Wikimedia Error page in a browser test failure, doesn't seem to have any "Served by mw-labs-foobar-prep-02" in it [20:50:14] a successful response has 'mw.config.set({"wgBackendResponseTime":682,"wgHostname":"deployment-mediawiki01"});' in it, but not an error page [20:52:08] spagewmf: about which time did it fail? That's probably the only real way to find a log by looking on -02 and 01 [20:54:09] YuviPanda: just added the puppet roles to sca01 [20:55:47] JohnLewis: yup, looking now. (And hhvm log uses yet another date format... yet people complain about systemd journal) [20:56:20] 02 has a few errors from a quick skim. Those are GWToolSet and CA ones. [20:56:38] logstash maybe? [20:56:47] ^ that too :p [20:57:08] JohnLewis: but icinga is green for betalabs now, so yay! 
:) [20:57:40] JohnLewis: could be 'Oct 13 05:36:09 deployment-mediawiki01 hhvm: #012Warning: timed out after 0.25 seconds when connecting to 10.68.16.146 [110]: Connection timed out [20:58:10] how would I figure out what server is '10.68.16.146' [20:58:45] a quick 'host' returns ottomata-worker4.eqiad.wmflabs [20:59:21] thanks. Obviously a bitcoin operation :) [20:59:31] hehe [20:59:41] YuviPanda: so; everything working well-ish now? [20:59:56] JohnLewis: well, I've no idea what sca01 did, so unsure if it 'works' :) but icinga is happy... [21:00:22] !log icinga says deployment-sca01 is good (yay) [21:00:23] Logged the message, Master [21:00:49] JohnLewis: well, I purged the old sca01 ;) [21:00:52] so that's why it's good [21:01:04] :p [21:11:54] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #302: STILL FAILING in 57 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/302/ [21:12:31] PROBLEM - BetaLabs: Puppet failure events on labmon1001 is CRITICAL: CRITICAL: deployment-prep.deployment-sca01.puppetagent.failed_events.value (30.00%) [21:13:05] JohnLewis: ^ haha, apparently not [21:13:27] heh [21:13:44] trying to debug mailman issues on labs right now as well as this :p [21:22:03] 3Wikimedia / 3Quality Assurance: QA: mediawiki_api doesn't report HTTP errors - 10https://bugzilla.wikimedia.org/72056 (10spage) 3NEW p:3Unprio s:3normal a:3None An Echo browser test trying to create an account failed with 503 Service unavailable. That's understandable, but the indication of failure... 
[21:22:35] marxarelli: ^ I'm confused, you fixed bug 70193 which should fix this, but I don't see your fix in the mediawiki_api gem [21:26:33] Project browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce build #228: STILL FAILING in 50 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce/228/ [21:31:13] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #65: STILL FAILING in 1 hr 4 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/65/ [21:42:47] spagewmf: yeah, if the server responded with a 503, an exception should have been raised [21:43:02] spagewmf: where don't you see the fix? [21:43:40] marxarelli: bug 72056 [21:44:14] spagewmf: right, but when you say "you don't see the fix in the gem," what do you mean? [21:44:16] marxarelli: as it says, both in the CI run and my local gem, the line seems missing. [21:44:57] marxarelli: is there a way to figure out the git version of a gem? [21:45:16] greg-g: sca stuff resolved btw :) [21:45:46] spagewmf: oh, wait... [21:46:05] spagewmf: yeah, we haven't done a release since that fix [21:46:36] spagewmf: i'll do one today [21:49:05] marxarelli: thanks. github says "our actual code is hosted with Gerrit", and it's merged there. Can the gem update from gerrit or git.wikimedia.org ? [21:50:01] spagewmf: that's up to you. it's mirrored so either should work [21:52:52] RECOVERY - BetaLabs: Puppet failure events on labmon1001 is OK: OK: All targets OK [21:55:35] (03PS1) 10Dduvall: Releasing minor version 0.3.0 [ruby/api] - 10https://gerrit.wikimedia.org/r/166679 (https://bugzilla.wikimedia.org/72056) [21:56:13] spagewmf, chrismcmahon: ^ [21:56:31] JohnLewis: yay! ^ :) thanks! 
[21:56:50] YuviPanda: woo [22:00:52] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #66: FAILURE in 7 min 19 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/66/ [22:05:59] JohnLewis: thanks sir :) [22:17:28] (03CR) 10Spage: [C: 04-1] "Most Gemfile.lock files have:" [ruby/api] - 10https://gerrit.wikimedia.org/r/166679 (https://bugzilla.wikimedia.org/72056) (owner: 10Dduvall) [22:18:28] marxarelli: ^ don't leave it "up to me" when I don't understand the inner workings of Gemfile, gemspec, bundle, RVM :) [22:20:27] marxarelli: how come github says "Aug 7 [22:20:51] Releasing minor version 0.3.0 " but you just released it now? [22:22:09] marxarelli: ignore that, that's mediawiki_selenium. I should eat something [22:22:10] (03CR) 10Dduvall: "The "~> 0.2" version spec actually means anything between 0.2.0 and 1.0 so we should be ok there." [ruby/api] - 10https://gerrit.wikimedia.org/r/166679 (https://bugzilla.wikimedia.org/72056) (owner: 10Dduvall) [22:22:54] (03CR) 10Spage: [C: 032] "OK!" [ruby/api] - 10https://gerrit.wikimedia.org/r/166679 (https://bugzilla.wikimedia.org/72056) (owner: 10Dduvall) [22:23:04] (03Merged) 10jenkins-bot: Releasing minor version 0.3.0 [ruby/api] - 10https://gerrit.wikimedia.org/r/166679 (https://bugzilla.wikimedia.org/72056) (owner: 10Dduvall) [22:23:20] spagewmf: no, it hasn't been released yet. as soon as that commit is merged, i'll run `gem build` and `gem push` to release [22:23:35] spagewmf: thanks! [22:25:53] spagewmf: ok, pushed it. `bundle update mediawiki_api` should give you 0.3.0 [22:28:08] 3Wikimedia / 3Quality Assurance: QA: mediawiki_api doesn't report HTTP errors - 10https://bugzilla.wikimedia.org/72056#c3 (10Dan Duvall) 5PATC>3ASSI a:3Dan Duvall Built and released the gem. Running `bundle update mediawiki_api` in your tests/browser directory should update you to 0.3.0. 
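(Editor's sketch, not from the channel.) The claim in the 22:22:10 review comment, that `~> 0.2` accepts anything from 0.2.0 up to but not including 1.0, can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The wrapper name `allowed?` is hypothetical.

```ruby
require 'rubygems'

# Hypothetical wrapper: does `version` satisfy the requirement string `spec`?
def allowed?(spec, version)
  Gem::Requirement.new(spec).satisfied_by?(Gem::Version.new(version))
end

# Print which of a few sample versions the pessimistic constraint accepts.
%w[0.1.9 0.2.0 0.3.0 0.9.9 1.0.0].each do |version|
  puts "#{version}: #{allowed?('~> 0.2', version)}"
end
```

Under this constraint 0.3.0 is accepted, which is why a Gemfile pinning `~> 0.2` picks up the new release on `bundle update mediawiki_api`.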
[22:30:19] marxarelli: it does, but it also changes childprocess|ffi|mime-types versions in Echo/tests/browser/Gemfile.lock Are we supposed to check in the results of `bundle update` in 13 extensions using mediawiki_api ? [22:31:45] spagewmf: unfortunately bundler isn't that good at updating just a single dependency :/ [22:32:08] Krinkle, hashar: So… is https://gerrit.wikimedia.org/r/#/c/163791/ good to go? [22:32:51] spagewmf: it's usually safe to commit all the updates as long as the dependent gem uses tight enough versions [22:33:15] spagewmf: but you can try just editing Gemfile.lock yourself and see if that works [22:33:21] James_F: sleeping sorry [22:33:38] hashar: Tomorrow? :-) [22:33:48] James_F: it is deployed on our puppetmaster already [22:34:01] hashar: Oh, so it just needs +2ing for clean-up? [22:34:14] James_F: timo did on Oct 6th :] [22:34:32] someone from ops need to +2 / submit it so it lands in puppet git repo [22:34:48] but the change is already applied on the labs slaves via our puppetmaster [22:35:02] James_F: We've had Chrome and Firefox on jenkins for a week as you know ) [22:35:04] sleeping for real now :] [22:35:06] hashar: Does that mean we can switch to Chromium rather than PhantomJS for extensions? [22:35:09] Bah. [22:35:11] James_F: Yes [22:35:12] Krinkle: ^^^ [22:35:15] Krinkle: Now? :-) [22:35:17] Yes [22:35:19] Krinkle: At least for VE. [22:35:26] Oh, wait [22:35:27] well [22:35:29] No [22:35:36] Krinkle: Why not? [22:35:36] but that's because of something else [22:36:23] James_F: as of https://github.com/wikimedia/integration-config/commit/238d4c4fec6d86d9084a6152cdbc79411851459c Chrome and Firefox can be used in any -npm job [22:36:37] Krinkle: But not for non-npm. [22:36:39] https://github.com/wikimedia/integration-jenkins/commit/0b85d48e603ef64ac03c05959eae465bde6e5c43 [22:36:49] sets up the Xvfb link and aliases Chrome/Chromium [22:37:02] James_F: Well, we can add it to other jobs too [22:37:03] but... [22:37:10] Krinkle: But? 
[22:37:15] extensions-qunit needs mediawiki to be installed [22:37:21] Yeah. [22:37:41] and thus can't use Grunt [22:37:47] so mwext-qunit never uses Grunt [22:37:55] It uses a custom job with its script on the server. [22:38:07] which does lend parts of Grunt to do the job, but it uses grunt-contrib-qunit [22:38:07] Yeah. [22:38:21] Which is PhantomJS [22:38:55] Can we change that script? [22:38:58] https://github.com/wikimedia/integration-jenkins/blob/master/bin/wmfgrunt [22:39:06] https://github.com/wikimedia/integration-jenkins/blob/master/tools/Gruntfile.js [22:39:40] https://github.com/wikimedia/integration-config/blob/79ad0192b050d15740d3ea6e1a918552205f311b/jjb/macro.yaml#L344 [22:39:43] It's possible yes [22:40:10] * Krinkle files bug with details [22:43:26] James_F: https://bugzilla.wikimedia.org/show_bug.cgi?id=72063 [22:43:39] 3Wikimedia / 3Continuous integration: Jenkins: Convert mwext qunit from grunt-contrib-qunit (PhanttomJS) to grunt-karma (Chromium) - 10https://bugzilla.wikimedia.org/72063 (10Krinkle) 3NEW p:3Unprio s:3normal a:3None Relevant files: https://github.com/wikimedia/integration-jenkins/blob/0b85d48e60/b... [22:43:47] wikibugs: you're getting old [22:45:53] * James_F grins. [23:06:00] (03CR) 10Krinkle: [C: 04-1] "webproxy.pmtpa.wmnet / carbon.wikimedia.org is still up as of writing. Will keep around for now." [integration/jenkins] - 10https://gerrit.wikimedia.org/r/166529 (owner: 10Krinkle)