[00:14:15] (PS7) Dduvall: WIP Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583)
[00:21:05] Yippee, build fixed!
[00:21:05] Project beta-update-databases-eqiad build #1282: FIXED in 1 min 5 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/1282/
[00:29:10] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1432596 (thcipriani) So I created all restbase instances (`staging-restbase{01-10}`); however, in trying to get the cassandra cluster built I'm running into some issues. First I ran `sudo...
[00:29:39] (PS8) Dduvall: WIP Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583)
[00:30:53] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1432603 (GWicke) @thcipriani, you need to temporarily add one node to its own seeds in the cassandra config in order to let it start up. All other nodes (and itself, when it re-joins later...
[03:34:15] (PS1) 20after4: Increment deployment stats after sync-wikiversions [tools/scap] - https://gerrit.wikimedia.org/r/223236 (https://phabricator.wikimedia.org/T104635)
[03:34:38] (CR) jenkins-bot: [V: -1] Increment deployment stats after sync-wikiversions [tools/scap] - https://gerrit.wikimedia.org/r/223236 (https://phabricator.wikimedia.org/T104635) (owner: 20after4)
[04:30:58] Yippee, build fixed!
[04:30:58] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #492: FIXED in 38 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/492/
[04:40:46] PROBLEM - Puppet staleness on deployment-restbase01 is CRITICAL 20.00% of data above the critical threshold [43200.0]
[05:35:46] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #472: FAILURE in 33 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/472/
[06:30:14] Yippee, build fixed!
[06:30:15] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #672: FIXED in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/672/
[06:40:17] RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK
[07:41:56] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce build #91: FAILURE in 32 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-10-sauce/91/
[07:55:53] (CR) Zfilipin: [C: 1] "Looks good to me. Is there something left to do here, or can WIP be removed?" [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583) (owner: Dduvall)
[08:30:13] (CR) Zfilipin: [C: 1] Generalize MW-Selenium job for MediaWiki extensions [integration/config] - https://gerrit.wikimedia.org/r/223189 (https://phabricator.wikimedia.org/T103039) (owner: Dduvall)
[08:41:09] Beta-Cluster, Wikimedia-Site-requests, Patch-For-Review: Enable the possibility to block users by the AbuseFilter at the deployment wiki at the beta cluster - https://phabricator.wikimedia.org/T103060#1433217 (Luke081515) @Krenair : Thanks, it works, filter created.
[08:57:48] Continuous-Integration-Isolation: Create a Jessie image with diskimage-builder suitable for nodepool - https://phabricator.wikimedia.org/T102878#1433255 (hashar)
[08:57:50] Continuous-Integration-Isolation, Patch-For-Review: Remove /etc/nodepool/elements/devuser on labnodepool1001 once diskimage-builder is updated - https://phabricator.wikimedia.org/T102882#1433254 (hashar)
[08:57:53] Continuous-Integration-Isolation, operations, Patch-For-Review: Backport python-diskimage-builder 0.1.46 from testing to jessie-wikimedia - https://phabricator.wikimedia.org/T102880#1433252 (hashar) Open>Resolved I have upgraded the package on labnodepool1001 to `0.1.46-1+wmf1`. Not sure why we h...
[08:59:21] Continuous-Integration-Isolation, Patch-For-Review: Remove /etc/nodepool/elements/devuser on labnodepool1001 once diskimage-builder is updated - https://phabricator.wikimedia.org/T102882#1433261 (hashar) Open>Resolved a:hashar
[09:04:15] Continuous-Integration-Isolation, operations, Patch-For-Review: Backport python-diskimage-builder 0.1.46 from testing to jessie-wikimedia - https://phabricator.wikimedia.org/T102880#1433269 (MoritzMuehlenhoff) >>! In T102880#1433252, @hashar wrote: > I have upgraded the package on labnodepool1001 to `0...
[09:16:10] Beta-Cluster, MediaWiki-API: mw: interwiki prefix missing on beta cluster, so API's "complete documentation" is a 404. - https://phabricator.wikimedia.org/T104504#1433305 (Spage) >>! In T104504#1419533, @Legoktm wrote: > Interwikis are a local site customization, so messages should not depend on it and s...
[09:54:12] Continuous-Integration-Isolation, operations: Backport python-os-client-config 1.3.0-1 from Debian Sid to jessie-wikimedia - https://phabricator.wikimedia.org/T104967#1433420 (hashar) NEW
[09:57:34] Continuous-Integration-Isolation, operations: Backport python-os-client-config 1.3.0-1 from Debian Sid to jessie-wikimedia - https://phabricator.wikimedia.org/T104967#1433433 (hashar)
[09:59:09] zeljkof: yo
[09:59:39] aharoni: I just wanted to comment all the browser tests bots :)
[09:59:48] we even have gender parity
[10:00:14] gender parity ftw
[10:01:46] aharoni: any news from vikas about his visa?
[10:02:10] also, we should prepare the talk
[10:12:45] Continuous-Integration-Isolation, operations, Nodepool: Bump our Nodepool package to 1.0.0 - https://phabricator.wikimedia.org/T104971#1433466 (hashar) NEW
[10:25:39] Continuous-Integration-Isolation: Investigate non blocking fs resizing when instance is booted - https://phabricator.wikimedia.org/T104974#1433499 (hashar) NEW
[13:06:06] Yippee, build fixed!
[13:06:06] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #709: FIXED in 34 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/709/
[13:16:35] (CR) Hashar: [C: 2] "Seems to work as expected. Merging in so we get Zuul to populate EXT_NAME" [integration/config] - https://gerrit.wikimedia.org/r/223189 (https://phabricator.wikimedia.org/T103039) (owner: Dduvall)
[13:18:33] (Merged) jenkins-bot: Generalize MW-Selenium job for MediaWiki extensions [integration/config] - https://gerrit.wikimedia.org/r/223189 (https://phabricator.wikimedia.org/T103039) (owner: Dduvall)
[13:25:25] (PS1) Hashar: zuul: set ext dependencies on mwext-mw-selenium [integration/config] - https://gerrit.wikimedia.org/r/223291
[13:25:49] (CR) Hashar: [C: 2] zuul: set ext dependencies on mwext-mw-selenium [integration/config] - https://gerrit.wikimedia.org/r/223291 (owner: Hashar)
[13:27:40] (Merged) jenkins-bot: zuul: set ext dependencies on mwext-mw-selenium [integration/config] - https://gerrit.wikimedia.org/r/223291 (owner: Hashar)
[13:34:05] (CR) Hashar: "I have deleted the job mwext-MobileFrontend-mw-selenium" [integration/config] - https://gerrit.wikimedia.org/r/223189 (https://phabricator.wikimedia.org/T103039) (owner: Dduvall)
[13:46:31] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: Define JJB builder for running a subset of integration MW-Selenium tests - https://phabricator.wikimedia.org/T103039#1433910 (hashar) Gave it a try on the opened change for MobileFrontend https://gerrit.wikimedia.org/r/#/c/221313/...
[13:55:50] Continuous-Integration-Isolation, operations, Nodepool: flapping "permission denied" disk space alarm for temporary image on labnodepool1001 - https://phabricator.wikimedia.org/T104975#1433956 (hashar)
[13:56:57] Continuous-Integration-Isolation, operations, Patch-For-Review: Backport python-diskimage-builder 0.1.46 from testing to jessie-wikimedia - https://phabricator.wikimedia.org/T102880#1433963 (hashar) >>! In T102880#1433269, @MoritzMuehlenhoff wrote: > I've changed the version number since that build is s...
[13:57:31] PROBLEM - Free space - all mounts on deployment-videoscaler01 is CRITICAL deployment-prep.deployment-videoscaler01.diskspace._var.byte_percentfree (<40.00%)
[14:12:57] Release-Engineering: Organize browsertests/Selenium training - https://phabricator.wikimedia.org/T100170#1434039 (zeljkofilipin)
[14:12:58] Browser-Tests, Wikimania-Hackathon-2015: Workshop: write the first browsertests/Selenium test - https://phabricator.wikimedia.org/T94024#1434038 (zeljkofilipin) Resolved>Open
[14:13:00] Release-Engineering: Organize browsertests/Selenium training - https://phabricator.wikimedia.org/T100170#1307198 (zeljkofilipin)
[14:13:05] Browser-Tests, Wikimania-Hackathon-2015: Workshop: Fix broken browsertests/Selenium Jenkins jobs - https://phabricator.wikimedia.org/T94299#1434041 (zeljkofilipin) Resolved>Open
[14:14:36] (PS2) 20after4: Increment deployment stats after sync-wikiversions [tools/scap] - https://gerrit.wikimedia.org/r/223236 (https://phabricator.wikimedia.org/T104635)
[14:17:34] Release-Engineering: Prepare Making Translated Screenshots With No Effort talk - https://phabricator.wikimedia.org/T104985#1434075 (Qgil)
[14:18:38] Release-Engineering: Prepare Making Translated Screenshots With No Effort talk - https://phabricator.wikimedia.org/T104985#1433726 (Qgil) There is no #Wikimania project in Phabricator. Check the change of projects. I just want to keep the #Wikimania-Hackathon-2015 workboard up to date.
[14:29:44] Browser-Tests, Wikimania-Hackathon-2015: Fix broken browsertests/Selenium Jenkins jobs - https://phabricator.wikimedia.org/T94299#1434146 (zeljkofilipin)
[14:29:51] Browser-Tests, Wikimania-Hackathon-2015: Investigate using the sikuli-like Applitools framework for visual testing - https://phabricator.wikimedia.org/T90884#1434147 (zeljkofilipin)
[14:29:52] (PS1) Hashar: Drop sartoris [integration/config] - https://gerrit.wikimedia.org/r/223303
[14:30:34] Browser-Tests, Wikimania-Hackathon-2015: Write the first browsertests/Selenium test - https://phabricator.wikimedia.org/T94024#1434152 (zeljkofilipin)
[14:31:39] (PS2) Hashar: Drop sartoris [integration/config] - https://gerrit.wikimedia.org/r/223303 (https://phabricator.wikimedia.org/T105005)
[14:31:46] (CR) Hashar: [C: 2] Drop sartoris [integration/config] - https://gerrit.wikimedia.org/r/223303 (https://phabricator.wikimedia.org/T105005) (owner: Hashar)
[14:33:56] (Merged) jenkins-bot: Drop sartoris [integration/config] - https://gerrit.wikimedia.org/r/223303 (https://phabricator.wikimedia.org/T105005) (owner: Hashar)
[14:38:27] Project browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce build #183: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/183/
[14:49:10] Continuous-Integration-Isolation, operations, Nodepool: Bump our Nodepool package to 1.0.0 - https://phabricator.wikimedia.org/T104971#1434279 (hashar) http://backports.debian.org/Instructions/#index3h2 claims: > apt-get -t jessie-backports install "package" But we should probably go with apt preferenc...
[14:53:23] Release-Engineering, Commons, MediaWiki-File-management, MediaWiki-Tarball-Backports, and 7 others: InstantCommons broken by switch to HTTPS - https://phabricator.wikimedia.org/T102566#1434294 (Tau) I downloaded 2 php.ini files via FileZilla from directories /etc/php5/apache2/ and /etc/php5/cli/....
[14:55:44] Continuous-Integration-Isolation, operations, Nodepool: Bump our Nodepool package to 0.1.0 or just before 0.1.1 - https://phabricator.wikimedia.org/T104971#1434320 (hashar)
[15:04:24] Browser-Tests, Wikimania-Hackathon-2015: Fix broken browsertests/Selenium Jenkins jobs - https://phabricator.wikimedia.org/T94299#1434393 (zeljkofilipin) p:Normal>Low
[15:10:10] Release-Engineering, Wikimania-Hackathon-2015, Tracking: Update repositories that use mediawiki_selenium Ruby gem 1.x (tracking) - https://phabricator.wikimedia.org/T94083#1434414 (zeljkofilipin)
[15:12:31] Browser-Tests, Patch-For-Review: Wikidata browser test jobs fail since upgrading to mediawiki-selenium 1.2.1 - https://phabricator.wikimedia.org/T102458#1434425 (dduvall) Open>Resolved The updated dependency is included in the 1.4 release.
[15:13:13] Browser-Tests, Patch-For-Review: MW-Selenium's PageFactory chokes when `on(Page)` is used before the first use of `visit(Page)` - https://phabricator.wikimedia.org/T103746#1434428 (dduvall) Open>Resolved a:dduvall Included in the 1.4.0 release.
[15:13:43] Browser-Tests, VisualEditor: Update VisualEditor repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99661#1434437 (dduvall) p:High>Normal
[15:15:30] Browser-Tests, MediaWiki-extensions-UniversalLanguageSelector, Patch-For-Review: Fix failed UniversalLanguageSelector browsertests Jenkins job - https://phabricator.wikimedia.org/T94158#1434446 (zeljkofilipin)
[15:16:28] Browser-Tests, MediaWiki-extensions-UniversalLanguageSelector, Patch-For-Review: Fix failed UniversalLanguageSelector browsertests Jenkins job - https://phabricator.wikimedia.org/T94158#1156479 (zeljkofilipin) @amire80: I see the job is still broken. Do you plan to work on this in the near future?
[15:16:42] Browser-Tests, MediaWiki-extensions-UniversalLanguageSelector: Fix failed UniversalLanguageSelector browsertests Jenkins job - https://phabricator.wikimedia.org/T94158#1434460 (zeljkofilipin)
[15:20:29] marxarelli: the browser test job triggered from Gerrit for MobileFrontend passed :-))))))))))))))))))))))))))))
[15:20:44] gotta bump everything now :-/
[15:23:28] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL 100.00% of data above the critical threshold [0.0]
[15:25:02] Browser-Tests, Phabricator, Phabricator-Sprint-Extension, Upstream: Create Browser Tests for Phabricator - https://phabricator.wikimedia.org/T87359#1434505 (zeljkofilipin)
[15:26:14] Browser-Tests, Phabricator, Phabricator-Sprint-Extension, Upstream: Create Browser Tests for Phabricator - https://phabricator.wikimedia.org/T87359#989638 (zeljkofilipin) Removed release-engineering, looks like this is not a task for that board. Removed patch-for-review, the only patch was merged....
[15:27:16] Yippee, build fixed!
[15:27:16] Project browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce build #184: FIXED in 10 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/184/
[15:28:11] Browser-Tests, Collaboration-Team, Flow: QA: mediawiki_selenium's "no ResourceLoader errors" step should not run for tests of the no-JS experience - https://phabricator.wikimedia.org/T85654#1434515 (dduvall) Open>Invalid a:dduvall The requested behavior would be an antipattern. If you don't w...
[15:29:13] Browser-Tests, Patch-For-Review: Support headless gem's video recording feature for headless Jenkins jobs - https://phabricator.wikimedia.org/T104583#1434530 (dduvall) a:dduvall
[15:30:34] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: It takes about 20 seconds just to start a Sauce Labs browser - https://phabricator.wikimedia.org/T92613#1434534 (zeljkofilipin)
[15:31:45] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: It takes about 20 seconds just to start a Sauce Labs browser - https://phabricator.wikimedia.org/T92613#1116195 (zeljkofilipin) Updating to the latest version of watir-webdriver should solve the problem.
[15:31:56] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: It takes about 20 seconds just to start a Sauce Labs browser - https://phabricator.wikimedia.org/T92613#1434543 (zeljkofilipin) a:zeljkofilipin
[15:33:18] Browser-Tests, Continuous-Integration-Infrastructure, Patch-For-Review: It takes about 20 seconds just to start a Sauce Labs browser - https://phabricator.wikimedia.org/T92613#1116195 (zeljkofilipin) Make sure to check the status of related patch https://gerrit.wikimedia.org/r/#/c/200767/
[15:41:16] Browser-Tests: Use rspec-expectations "expect" syntax instead of "should" syntax - https://phabricator.wikimedia.org/T68369#1434587 (zeljkofilipin)
[15:41:50] Continuous-Integration-Infrastructure, Jenkins: Upgrade Jenkins to 1.609.1 - https://phabricator.wikimedia.org/T101884#1434591 (hashar) Let's upgrade Jenkins during our Wednesday pairing session. [[ https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=169423&oldid=169421 | Scheduled at 10:00...
[15:42:02] Browser-Tests: Use rspec-expectations "expect" syntax instead of "should" syntax - https://phabricator.wikimedia.org/T68369#686543 (zeljkofilipin) Removed release-engineering, since it does not belong there. Removed patch-for-review, since all related patches (as far as I can see) are merged or abandoned.
[15:44:10] Browser-Tests, Collaboration-Team, Flow: Flow's Edit existing post test fails if post not in view - https://phabricator.wikimedia.org/T59702#1434614 (zeljkofilipin)
[15:44:30] Browser-Tests, Collaboration-Team, Flow: Flow's Edit existing post test fails if post not in view - https://phabricator.wikimedia.org/T59702#620746 (zeljkofilipin) Removed patch-for-review because related patch has been abandoned.
[15:45:02] !log Cherry-picked https://gerrit.wikimedia.org/r/#/c/223301/
[15:45:06] Logged the message, Master
[15:45:12] Browser-Tests, Collaboration-Team, Flow: Flow's Edit existing post test fails if post not in view - https://phabricator.wikimedia.org/T59702#1434629 (zeljkofilipin) @spage: is this still a problem, or can we close the task?
[15:46:30] Browser-Tests: mediawiki_selenium always use the same default xvfb display 99 - https://phabricator.wikimedia.org/T73602#1434632 (dduvall) Open>declined a:dduvall There's an outstanding upstream bug which prevents us from reusing a display started by another user, but the maintainer has not responde...
[15:46:31] Browser-Tests, Continuous-Integration-Infrastructure, Release-Engineering, Jenkins: Jenkins: browser test host performance issue for timed builds - https://phabricator.wikimedia.org/T68449#1434635 (dduvall)
[15:49:33] Browser-Tests, Collaboration-Team, Echo: 503 on Echo tests - https://phabricator.wikimedia.org/T103437#1434665 (dduvall) Open>declined a:dduvall We can't avoid errors due to underlying infrastructure, and we're already working on initiatives to provide a more stable staging cluster and to imp...
[16:02:07] Browser-Tests: mediawiki_selenium always use the same default xvfb display 99 - https://phabricator.wikimedia.org/T73602#1434774 (hashar) I looked at your solution and agree assigning the displays ourselves is better than relying on upstream unknown spaghetti code :-) Thanks!
[16:17:32] Browser-Tests, Wikimania-Hackathon-2015: Fix broken browsertests/Selenium Jenkins jobs - https://phabricator.wikimedia.org/T94299#1434850 (zeljkofilipin) p:Low>Normal
[16:46:44] ostriches: the phab svn bug is hotfixed
[16:47:38] ty
[16:48:01] Hmm
[16:48:05] https://phabricator.wikimedia.org/diffusion/SVN/
[16:48:13] Oh, needs apache kick prolly, apc cache
[17:08:14] Deployment-Systems, Security-Other: By default, have /usr/local/bin/sql use a read-only account when connecting to mysql - https://phabricator.wikimedia.org/T105046#1435123 (Krinkle)
[17:16:16] Deployment-Systems, Security-Other: By default, have /usr/local/bin/sql use a read-only account when connecting to mysql - https://phabricator.wikimedia.org/T105046#1435148 (jcrespo) This is particularly harmful in the case of `sqldump`, where with the current defaults it will block all writes (`--lock-tabl...
[17:17:04] Deployment-Systems, Security-Other: By default, have /usr/local/bin/sql use a read-only account when connecting to mysql - https://phabricator.wikimedia.org/T105046#1435153 (Krenair) a:Krenair
[17:40:16] Continuous-Integration-Isolation, Labs, Labs-Infrastructure, Labs-Sprint-103, and 3 others: Instances without a shared NFS storage suffers from a 3 minutes boot delay - https://phabricator.wikimedia.org/T102544#1435228 (Andrew) The new trusty image now has all the updated changes and should start up...
[18:08:40] ostriches: yep, restarted apache, error gone
[18:10:51] !log puppet broken on deployment-cache-* by https://gerrit.wikimedia.org/r/#/c/222124/
[18:10:53] Logged the message, Master
[18:14:58] !log Changed role::protoproxy::ssl::beta to role::tlsproxy::ssl::beta for deployment-cache-*
[18:15:00] Logged the message, Master
[18:19:48] ostriches, thcipriani: ^ the new role has some failures (varnishxcps, varnishrls, varnishstatsd-default) that look like "oops this only works on jessie" problems
[18:21:29] hmm
[18:21:46] bd808: we have some time scheduled to work cache problems in beta this afternoon. There have been quite a few varnish issues on beta recently.
[18:22:06] ideally we'd update the beta caches to jessie to match production. There's a ticket for that somewhere.
[18:22:19] cool. You may want to find bblack and beat him with a trout too :)
[18:23:53] I just noticed it because I'm trying to get traffic off of logstash1 and on to logstash2 and the varnish servers were still sending stuff to logstash1
[18:31:39] (PS9) Dduvall: Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583)
[18:32:59] (CR) jenkins-bot: [V: -1] Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583) (owner: Dduvall)
[18:37:06] (PS10) Dduvall: Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583)
[18:40:03] Beta-Cluster, Wikimedia-Site-requests, Patch-For-Review: Enable the possibility to block users by the AbuseFilter at the deployment wiki at the beta cluster - https://phabricator.wikimedia.org/T103060#1435465 (hashar) >>! In T103060#1432193, @MGChecker wrote: > Is it active for deploymentwiki or for a...
[18:47:06] Continuous-Integration-Infrastructure, Jenkins: Upgrade Jenkins to 1.609.1 - https://phabricator.wikimedia.org/T101884#1435471 (Krinkle)
[18:51:29] !log restarted puppetmaster on deployment-salt to pick up logging config changes
[18:51:32] Logged the message, Master
[19:31:41] (PS11) Dduvall: Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583)
[19:34:53] (CR) Dduvall: [C: 1] "The latest patches switch the codec to `libx264` for better compression, only start captures if `headless_capture_path` is defined, and on" [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583) (owner: Dduvall)
[19:41:35] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review: deployment-logstash02 fails puppet: Apache2 can't start, mod_authz_groupfile not enabled on Jessie - https://phabricator.wikimedia.org/T103804#1435690 (bd808) Open>Resolved a:bd808
[19:42:13] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review, User-Bd808-Test: Build jessie based elasticsearch/logstash/kibana (ELK) host for beta testing - https://phabricator.wikimedia.org/T101541#1435693 (bd808)
[19:51:52] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1435720 (thcipriani) @GWicke thanks for the tips, finally got everything up and running. `staging-restbase{01..10}` is now up and running on debian jessie. All are in the same cassandra cl...
[19:55:05] Deployment-Systems, RESTBase: Setup staging for testing RESTBase deploys - https://phabricator.wikimedia.org/T104276#1435721 (GWicke) @thcipriani, to test our current ansible system, you need to 1) add a new inventory file similar to https://github.com/wikimedia/ansible-deploy/blob/master/staging ('labs...
[19:58:09] !log cherry-picked https://gerrit.wikimedia.org/r/#/c/223391/
[19:58:11] Logged the message, Master
[20:07:51] !log Forced puppet run on deployment-restbase01; run picked up changes that should have been applied yesterday, not sure why puppet wasn't running from cron properly
[20:07:54] Logged the message, Master
[20:15:48] RECOVERY - Puppet staleness on deployment-restbase01 is OK Less than 1.00% above the threshold [3600.0]
[20:21:59] RECOVERY - Puppet failure on deployment-restbase01 is OK Less than 1.00% above the threshold [0.0]
[20:33:34] Deployment-Systems, operations, Patch-For-Review: install/deploy mira as codfw deployment server - https://phabricator.wikimedia.org/T95436#1435862 (Dzahn) on mira: Notice: /Stage[main]/Role::Deployment::Server/Package[mysql-client]/ensure: ensure changed 'purged' to 'present' on tin: Notice: /Stage...
[20:38:06] Beta-Cluster, Wikimedia-Site-requests, Patch-For-Review: Enable the possibility to block users by the AbuseFilter at the deployment wiki at the beta cluster - https://phabricator.wikimedia.org/T103060#1435880 (MGChecker) @hashar I thought so too, but I have seen that in ß-dewiki it is active now altho...
[20:48:02] !log cherry-picking https://gerrit.wikimedia.org/r/#/c/158016/ on deployment-salt
[20:48:05] Logged the message, Master
[21:10:29] (CR) Zfilipin: [C: 2] Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583) (owner: Dduvall)
[21:11:39] (Merged) jenkins-bot: Video recording of headless execution [selenium] - https://gerrit.wikimedia.org/r/222346 (https://phabricator.wikimedia.org/T104583) (owner: Dduvall)
[21:15:17] PROBLEM - Puppet failure on deployment-cache-text03 is CRITICAL 11.11% of data above the critical threshold [0.0]
[21:18:27] PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 100.00% of data above the critical threshold [0.0]
[21:20:21] RECOVERY - Puppet failure on deployment-cache-text03 is OK Less than 1.00% above the threshold [0.0]
[21:23:35] bd808: Fwiw, I don't see any reason not to just rebuild them as jessie. Puppet seems to Work Just Fine in beta + jessie + cache roles. Only PITA is that the IPs/hostnames are hardcoded in puppet and not hiera-ized yet.
[21:23:48] !log deleted instance deployment-logstash1
[21:23:51] Logged the message, Master
[21:24:04] ostriches: *nod*
[21:24:17] getting them to jessie seems like a good idea
[21:24:52] I just did the "track down all the references" dance for logstash1. PITA
[21:25:18] sneaksy parsoid had a config repo I didn't know about/think of
[21:25:44] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review, User-Bd808-Test: Build jessie based elasticsearch/logstash/kibana (ELK) host for beta testing - https://phabricator.wikimedia.org/T101541#1436030 (bd808)
[21:26:05] Meh, spoke a tad too fast.
[21:26:18] PROBLEM - Puppet failure on deployment-cache-text03 is CRITICAL 22.22% of data above the critical threshold [0.0]
[21:26:18] nginx isn't starting
[21:26:25] it never has
[21:26:37] we don't have ssl certs
[21:26:49] and that makes it sad (and useless)
[21:27:12] PROBLEM - Host deployment-logstash1 is DOWN: CRITICAL - Host Unreachable (10.68.16.134)
[21:27:43] How do I tell shinken that I wanted logstash1 dead?
[21:28:00] Ah, star.wmflabs.org
[21:28:40] there is an epic phab bug about the certs
[21:29:17] I know that story.
[21:29:37] epic indeed
[21:30:32] FYI everyone: I just killed deployment-logstash1. deployment-logstash2 is the new hotness. If you go looking for something in kibana and can't find it it is possible that I missed a config change somewhere.
[21:31:14] I think I got everything and tcpdump for 20 minutes was empty but that's not complete proof
[21:32:55] Yippee, build fixed!
[21:32:55] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #693: FIXED in 1 hr 6 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/693/
[21:39:13] Beta-Cluster, Security-Team, operations, Blocked-on-Operations, and 2 others: Setup a dedicated mediawiki host in Beta Cluster that we can use for security scanning - https://phabricator.wikimedia.org/T72181#1436051 (dduvall) Paired with @demon and @thcipriani in rewriting the patch as much of the P...
[21:49:58] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review, User-Bd808-Test: Build jessie based elasticsearch/logstash/kibana (ELK) host for beta testing - https://phabricator.wikimedia.org/T101541#1436078 (bd808)
[21:50:20] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review, User-Bd808-Test: Build jessie based elasticsearch/logstash/kibana (ELK) host for beta testing - https://phabricator.wikimedia.org/T101541#1436079 (bd808) Open>Resolved
[21:56:17] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<11.11%)
[22:12:37] PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL 20.00% of data above the critical threshold [0.0]
[22:15:39] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 50.00% of data above the critical threshold [0.0]
[22:23:42] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL 40.00% of data above the critical threshold [0.0]
[22:26:18] RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK
[22:36:33] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 60.00% of data above the critical threshold [0.0]
[22:39:11] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 22.22% of data above the critical threshold [0.0]
[22:48:39] RECOVERY - Puppet failure on deployment-parsoid05 is OK Less than 1.00% above the threshold [0.0]
[22:51:19] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[22:51:29] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL 40.00% of data above the critical threshold [0.0]
[22:52:16] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL 44.44% of data above the critical threshold [0.0]
[22:52:30] PROBLEM - Puppet failure on deployment-urldownloader is CRITICAL 60.00% of data above the critical threshold [0.0]
[22:53:12] PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 66.67% of data above the critical threshold [0.0]
[23:00:38] RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0]
[23:01:34] RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0]
[23:02:36] RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0]
[23:07:05] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 55.56% of data above the critical threshold [0.0]
[23:07:07] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 55.56% of data above the critical threshold [0.0]
[23:09:11] RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0]
[23:09:33] PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 30.00% of data above the critical threshold [0.0]
[23:11:27] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #187: FAILURE in 14 min: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/187/
[23:16:21] RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0]
[23:16:31] RECOVERY - Puppet failure on deployment-zotero01 is OK Less than 1.00% above the threshold [0.0]
[23:17:19] RECOVERY - Puppet failure on deployment-restbase02 is OK Less than 1.00% above the threshold [0.0]
[23:17:27] RECOVERY - Puppet failure on deployment-urldownloader is OK Less than 1.00% above the threshold [0.0]
[23:18:11] RECOVERY - Puppet failure on deployment-sca02 is OK Less than 1.00% above the threshold [0.0]
[23:20:09] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 44.44% of data above the critical threshold [0.0]
[23:27:08] RECOVERY - Puppet failure on deployment-logstash2 is OK Less than 1.00% above the threshold [0.0]
[23:28:26] PROBLEM - Puppet failure on deployment-stream is CRITICAL 40.00% of data above the critical threshold [0.0]
[23:31:12] PROBLEM - Puppet failure on deployment-db2 is CRITICAL 55.56% of data above the critical threshold [0.0]
[23:32:04] RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0]
[23:32:23] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 55.56% of data above the critical threshold [0.0]
[23:37:36] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 60.00% of data above the critical threshold [0.0]
[23:49:24] (CR) BryanDavis: [C: 2] Increment deployment stats after sync-wikiversions [tools/scap] - https://gerrit.wikimedia.org/r/223236 (https://phabricator.wikimedia.org/T104635) (owner: 20after4)
[23:49:45] (Merged) jenkins-bot: Increment deployment stats after sync-wikiversions [tools/scap] - https://gerrit.wikimedia.org/r/223236 (https://phabricator.wikimedia.org/T104635) (owner: 20after4)
[23:53:53] Yippee, build fixed!
[23:53:54] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #188: FIXED in 14 min: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/188/
[23:55:52] PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 20.00% of data above the critical threshold [0.0]
[23:56:12] RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0]
[23:56:19] PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL 20.00% of data above the critical threshold [0.0]
[23:56:27] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 40.00% of data above the critical threshold [0.0]
[23:57:19] RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0]
[23:58:09] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 33.33% of data above the critical threshold [0.0]
[23:58:22] !log updated scap to 303e72e (Increment deployment stats after sync-wikiversions)
[23:58:25] Logged the message, Master
[23:58:27] RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0]
[23:59:37] RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0]
[23:59:49] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL 60.00% of data above the critical threshold [0.0]