[00:31:16] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[02:41:06] PROBLEM - SSH on integration-slave-docker-1014 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:45:57] RECOVERY - SSH on integration-slave-docker-1014 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[04:05:20] PROBLEM - SSH on integration-slave-docker-1013 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:20:10] RECOVERY - SSH on integration-slave-docker-1013 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[07:37:08] PROBLEM - SSH on integration-slave-docker-1014 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:59:44] Could someone point me at information on how to run specific Selenium tests? I have a test failure for a patch (https://gerrit.wikimedia.org/r/c/434071/), which is running the job mwskin-mw-selenium-jessie, and I'd like to work out what's going wrong...
[08:01:57] (The general `npm run selenium` approach takes a very very long time, and seemed to be failing a bunch with timeouts on vagrant anyway, so it's not so helpful for debugging. Trying to run the job on Jenkins without my patch applied just turned up that I don't know what all those ZUUL variables should be.)
[08:01:58] RECOVERY - SSH on integration-slave-docker-1014 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[08:02:25] Kemayo: should be documented at https://www.mediawiki.org/wiki/Selenium/Node.js
[08:02:43] zeljkof: is here at the hackathon venue now.... somewhere
[08:04:00] Kemayo: I am in sessions until lunch, but we can talk during lunch, meet you under "selenium help" sign? :D
[08:04:12] Kemayo: or even better, come to Selenium session at 11!
[08:04:25] (thanks greg-g)
[08:09:42] Kemayo: I took a quick look, one of the selenium tests in (ruby framework, that is deprecated) is timing out, it should be reproducible in mw-vagrant
[08:23:45] zeljkof: I did try to run that on vagrant, but the readme instructions didn't actually get it to a workable state. Actually testing it myself in a browser also doesn't obviously seem to fail.
[08:24:39] Kemayo: ruby selenium framework is deprecated, so docs might be out of date :(
[08:25:35] we could probably get it working, but I would need 10-20 minutes to refresh my memory, is 1pm a good time for you to pair on this? is it urgent?
[08:28:08] zeljkof: It's not urgent. I'll show up at the Selenium session.
[08:28:48] Kemayo: we will probably not have the time for it at the session, it will be about selenium+node.js, but we can talk then
[08:34:18] zeljkof: There are some questions on https://gerrit.wikimedia.org/r/#/c/164049/ about where to put a manual test page. I think that volunteer might appreciate some help understanding how to structure a better test for their change.
[08:38:58] zeljkof: Okay, so... on deeper investigation, I don't think this test could pass even without my patch, because the module it's waiting for is just never loaded. So I'm going to do one of those fun fix-the-test patches instead of working more on running everything.
[08:41:36] bd808: thanks, will take a look
[08:41:47] Gerrit, Release-Engineering-Team: Gerrit plugin "zuul" failed to load - https://phabricator.wikimedia.org/T195176#4218757 (Krinkle)
[08:47:53] Seems integration-slave-jessie-1001 has a full drive. Making the operations-mw-config-typos job fail on all commits.
[08:50:33] greg-g:
[08:53:58] yeah, wanted to say the same thing. php linters are failing as well.
[08:54:06] bd808: took a look, phab task is from 2010 (how is that even possible, import?), commit is from 2014 with 188 patch sets! that's amazing :) did not find the test page discussion yet, it's a busy patch
[09:15:56] bugzilla stuff got imported into phab
[09:16:21] Krenair: yeah, I was just about to do that before the wifi dropped, now I'm on someone's phone...
[09:16:38] here's the oldest: https://phabricator.wikimedia.org/T2001
[09:18:36] !log gjg@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ sudo rm -rf *
[09:18:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:18:57] Krinkle: done (ps sorry for the usual mis-ping Krenair ;) )
[09:19:10] np :P
[09:28:07] RECOVERY - Free space - all mounts on integration-slave-jessie-1001 is OK: OK: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)
[10:25:22] commit-message-validator: commit-message-validator prints a strange 'true' and 'false' in the middle of the message - https://phabricator.wikimedia.org/T195078#4217517 (bd808) I helped @rafidaslam look at this problem briefly during the #wikimedia-hackathon-2018. It seems that the `true` comes from the code:...
[10:30:51] PROBLEM - SSH on integration-slave-docker-1015 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:45:41] RECOVERY - SSH on integration-slave-docker-1015 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[11:00:06] PROBLEM - Puppet errors on deployment-snapshot01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[11:11:29] Release-Engineering-Team, MediaWiki-Core-Tests, User-zeljkofilipin: Q4 Selenium framework improvements - https://phabricator.wikimedia.org/T190994#4219100 (zeljkofilipin)
[11:11:32] Release-Engineering-Team (Kanban), Wikimedia-Hackathon-2018, JavaScript, User-zeljkofilipin: Write Selenium tests in JavaScript/Node.js workshop - https://phabricator.wikimedia.org/T190046#4219096 (zeljkofilipin) Open>Resolved Done. Thanks to everyone that came to the workshop. I am avail...
[11:15:42] Release-Engineering-Team (Kanban), Quibble, Wikimedia-Hackathon-2018, User-zeljkofilipin: Breakout session: Quibble a test runner for MediaWiki - https://phabricator.wikimedia.org/T194970#4219115 (zeljkofilipin) Open>Resolved a:zeljkofilipin
[11:25:32] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.32.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T191051#4219137 (Ryasmeen)
[11:58:24] Release-Engineering-Team (Watching / External), Operations, hardware-requests: eqiad: replacement tin/deployment server - https://phabricator.wikimedia.org/T174452#4219191 (Reedy)
[11:58:33] Release-Engineering-Team (Watching / External), Operations, Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4219190 (Reedy)
[11:58:54] Release-Engineering-Team (Watching / External), Operations, hardware-requests: eqiad: replacement tin/deployment server - https://phabricator.wikimedia.org/T174452#3562461 (Reedy)
[11:59:02] Release-Engineering-Team (Watching / External), Operations, Patch-For-Review: setup/install/deploy deploy1001 as deployment server - https://phabricator.wikimedia.org/T175288#4004075 (Reedy)
[11:59:17] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)
[12:01:56] commit-message-validator, Patch-For-Review: tests/sample_repository.py throws an TypeError exception - https://phabricator.wikimedia.org/T195076#4219201 (rafidaslam) Open>Resolved Merged :)
[12:05:17] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<11.11%)
[14:16:04] Continuous-Integration-Infrastructure, Math: mediawiki-core-qunit-selenium-jessie tests for math broken - https://phabricator.wikimedia.org/T195206#4219460 (Physikerwelt)
[14:53:17] PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76)
[15:23:49] Continuous-Integration-Infrastructure, WikimediaMessages, Patch-For-Review, User-Zoranzoki21: Failing quibble-vendor-mysql-hhvm-docker in WikimediaMessages repository - https://phabricator.wikimedia.org/T195210#4219582 (Zoranzoki21)
[15:55:23] PROBLEM - Puppet errors on deployment-ircd is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[16:00:16] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)
[16:06:15] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<22.22%)
[16:38:55] PROBLEM - Host deployment-secureredirexperiment is DOWN: CRITICAL - Host Unreachable (10.68.17.132)
[16:50:00] (PS1) Addshore: Add quibble for Wikibase experimental [integration/config] - https://gerrit.wikimedia.org/r/434198
[16:52:00] (CR) Addshore: [C: 2] Add quibble for Wikibase experimental [integration/config] - https://gerrit.wikimedia.org/r/434198 (owner: Addshore)
[16:54:09] (Merged) jenkins-bot: Add quibble for Wikibase experimental [integration/config] - https://gerrit.wikimedia.org/r/434198 (owner: Addshore)
[16:55:06] !log reload zuul for (Merged) jenkins-bot: Add quibble for Wikibase experimental [integration/config] - https://gerrit.wikimedia.org/r/434198 (owner: Addshore)
[16:55:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:26:54] MediaWiki-Codesniffer: Add or enable rule for spaces inside index brackets - https://phabricator.wikimedia.org/T195123#4218298 (thiemowmde) My personal preference is `$array['key']`. I feel this illustrates the intent better than having the key separated off with extra spaces. It's not like the array key is...
[17:36:17] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)
[17:47:15] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace.root.byte_percentfree (<22.22%)
[18:28:16] Gerrit, Release-Engineering-Team: Gerrit plugin "zuul" failed to load - https://phabricator.wikimedia.org/T195176#4219837 (Paladox) I think this may be if the internet is slow, or unreliable this happens. With polygerrit it has a timeout but does not affect the ui, just shows a popup at the bottom not to...
[18:28:37] i think that is the internet for why zuul fails to load
[18:28:43] i think they have a timeout or something.
[18:29:12] ah
[18:29:16] plugins.jsLoadTimeout
[18:29:20] it's set to 5s
[18:31:27] Gerrit, Release-Engineering-Team: Gerrit plugin "zuul" failed to load - https://phabricator.wikimedia.org/T195176#4219838 (Paladox) Aha, https://gerrit-review.googlesource.com/Documentation/config-gerrit.html#plugins.jsLoadTimeout is set to 5s so that will be the likely cause for users who are on a slow i...
[18:31:59] i filed this polygerrit bug to make timeout configurable https://bugs.chromium.org/p/gerrit/issues/detail?id=9045
[18:49:27] Release-Engineering-Team (Kanban), Wiki-Setup (Close): Close chairwiki - https://phabricator.wikimedia.org/T184961#3901664 (Urbanecm) Any progress here?
[19:15:22] https://opensource.googleblog.com/2018/05/introducing-git-protocol-version-2.html
[19:48:55] RECOVERY - Host deployment-secureredirexperiment is UP: PING OK - Packet loss = 0%, RTA = 1.93 ms
[20:21:40] PROBLEM - Host deployment-secureredirexperiment is DOWN: CRITICAL - Host Unreachable (10.68.17.132)
[20:43:51] PROBLEM - Puppet errors on integration-slave-jessie-android is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:08:22] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[22:25:07] Beta-Cluster-Infrastructure: Secure deployment-prep sudo access to prevent member -> projectadmin escalation by dns-manager credentials - https://phabricator.wikimedia.org/T190781#4220001 (Krenair) might not need to do this if we do T194998 instead (or we might decide we still want to do it, but it's lower p...
[22:35:17] PROBLEM - Puppet errors on deployment-deploy1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[23:06:55] (PS1) Physikerwelt: Revert "Run Selenium tests for Math" [integration/config] - https://gerrit.wikimedia.org/r/434276
[23:30:19] (PS2) Physikerwelt: Revert "Run Selenium tests for Math" [integration/config] - https://gerrit.wikimedia.org/r/434276 (https://phabricator.wikimedia.org/T195206)
[23:33:49] Continuous-Integration-Infrastructure, Math: mediawiki-core-qunit-selenium-jessie tests for math broken - https://phabricator.wikimedia.org/T195206#4220035 (Physikerwelt) I did discuss with @zeljkofilipin to [disable quit selenium tests](https://gerrit.wikimedia.org/r/#/c/434276/) for now. However, the t...