[00:00:25] Deployment-Systems, Release-Engineering-Team, Operations, Mediawiki SWAT Deployments: Clarify SWAT process for testing maintence script changes (to not use mwdebug* hosts) - https://phabricator.wikimedia.org/T153316#2963567 (greg)
[01:21:34] Scap3: autolog scap3 deployments in beta - https://phabricator.wikimedia.org/T156079#2963728 (bd808) You would need a tcpircbot process in beta cluster for this. That bot is a python process that is provisioned by `role::tcpircbot` in production which listens for messages on a socket and then replays them as...
[06:37:49] Yippee, build fixed!
[06:37:50] Project selenium-Wikibase » chrome,test,Linux,contintLabsSlave && UbuntuTrusty build #248: FIXED in 1 hr 56 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=test,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/248/
[06:59:19] Yippee, build fixed!
[06:59:20] Project selenium-Wikibase » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #248: FIXED in 2 hr 18 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/248/
[07:48:27] Continuous-Integration-Config, Operations, Operations-Software-Development, Patch-For-Review: tox-jessie is failing on operations/software - https://phabricator.wikimedia.org/T152549#2964240 (hashar) Open>Resolved
[07:53:27] Continuous-Integration-Config, Goal, I18n, Patch-For-Review: Configure banana checker for i18n files to run on all MediaWiki extensions and skins - https://phabricator.wikimedia.org/T94547#2964258 (hashar) >>! In T94547#2963477, @Psychoslave wrote: > On a side note, what is the origin of the bana...
[08:16:06] Continuous-Integration-Config, Goal, I18n, Patch-For-Review: Configure banana checker for i18n files to run on all MediaWiki extensions and skins - https://phabricator.wikimedia.org/T94547#2964289 (Psychoslave) Lucky him. :)
[08:20:44] Scap3: autolog scap3 deployments in beta - https://phabricator.wikimedia.org/T156079#2964296 (mmodell) @bd808: Maybe it doesn't matter but that seems like a really convoluted path to get a message into stash. Do you think it would be possible, practical to bypass tcpircbot & stashbot? Is irc the only way t...
[08:28:01] Deployment-Systems, Release-Engineering-Team, Mediawiki SWAT Deployments: Clarify SWAT process for testing maintence script changes (to not use mwdebug* hosts) - https://phabricator.wikimedia.org/T153316#2964306 (MoritzMuehlenhoff)
[08:48:13] Continuous-Integration-Config, Goal, I18n, Patch-For-Review: Configure banana checker for i18n files to run on all MediaWiki extensions and skins - https://phabricator.wikimedia.org/T94547#2964363 (Legoktm) I filed to allow us to set u...
[09:18:39] Browser-Tests-Infrastructure, User-zeljkofilipin: Browser test Jenkins videos do not always play in-browser - https://phabricator.wikimedia.org/T155794#2964448 (zeljkofilipin) Open>stalled
[09:35:05] PROBLEM - puppet last run on contint2001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[09:37:05] RECOVERY - puppet last run on contint2001 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures
[10:31:15] Browser-Tests-Infrastructure, Jenkins, User-zeljkofilipin: Browser test Jenkins videos do not always play in-browser - https://phabricator.wikimedia.org/T155794#2964548 (hashar) Something might have been corrupted in Jenkins. The video/artifacts are deleted after 3 days hence https://integration.wik...
[10:32:32] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2964550 (Aklapper) >>! In T156024#2961886, @Paladox wrote: > Setting to high as this is a blocker for any future upgrades we do. So this task onl...
[10:37:05] Browser-Tests-Infrastructure, Jenkins, User-zeljkofilipin: Browser test Jenkins videos do not always play in-browser - https://phabricator.wikimedia.org/T155794#2964566 (hashar)
[10:38:51] Browser-Tests-Infrastructure, Jenkins, User-zeljkofilipin: Browser test Jenkins videos do not always play in-browser - https://phabricator.wikimedia.org/T155794#2954941 (hashar) So in short, we might want to set `media-src: self`. But I cant tell about the security implication.
[10:43:03] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2964571 (hashar) p:High>Low @Paladox mind filling a task to upgrade to Gerrit 2.14 and add this one as a sub task? Similar to {T146350}....
[10:45:18] (CR) Hashar: "Ah that is solely for the job mwext-php70-phan-jessie . Aren't all dependencies supposedly already in mediawiki/vendor already?" [integration/config] - https://gerrit.wikimedia.org/r/333686 (owner: WMDE-leszek)
[10:50:39] (CR) WMDE-leszek: "If I get it right mediawiki/vendor has all core's dependencies." [integration/config] - https://gerrit.wikimedia.org/r/333686 (owner: WMDE-leszek)
[10:52:27] (PS1) Hashar: Change publish proxy instance [integration/config] - https://gerrit.wikimedia.org/r/333883 (https://phabricator.wikimedia.org/T156064)
[10:54:38] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure, Patch-For-Review: Migrate integration-publisher service to use a Jessie instance - https://phabricator.wikimedia.org/T156064#2964629 (hashar) Jobs updated INFO:jenkins_jobs.builder:Number of jobs generated: 31 INFO:jenkins_jobs.b...
[10:55:20] (PS2) Hashar: Change publish proxy instance [integration/config] - https://gerrit.wikimedia.org/r/333883 (https://phabricator.wikimedia.org/T156064)
[10:56:17] (CR) Hashar: [C: 2] Change publish proxy instance [integration/config] - https://gerrit.wikimedia.org/r/333883 (https://phabricator.wikimedia.org/T156064) (owner: Hashar)
[10:57:44] (Merged) jenkins-bot: Change publish proxy instance [integration/config] - https://gerrit.wikimedia.org/r/333883 (https://phabricator.wikimedia.org/T156064) (owner: Hashar)
[11:02:55] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2964648 (Paladox) Oh yep, though the 2.14 update haven't been released yet. But I can create the task in advanced. I herd that upstream are plann...
[11:04:35] !log Deleting integration-publisher (Precise) replaced by integration-publishing (Jessie). T156064 T143349
[11:04:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:04:41] T156064: Migrate integration-publisher service to use a Jessie instance - https://phabricator.wikimedia.org/T156064
[11:04:41] T143349: Deprecate precise instances in Labs by 03/31/2017 - https://phabricator.wikimedia.org/T143349
[11:04:56] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure, Patch-For-Review: Migrate integration-publisher service to use a Jessie instance - https://phabricator.wikimedia.org/T156064#2964651 (hashar) Open>Resolved a:hashar Validated by triggering the job operations-puppet-doc...
[11:05:30] PROBLEM - Host integration-publisher is DOWN: CRITICAL - Host Unreachable (10.68.16.255)
[11:05:53] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964656 (Paladox)
[11:06:06] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964670 (Paladox)
[11:06:09] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2961872 (Paladox)
[11:06:37] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964673 (hashar) p:Triage>Normal
[11:06:38] paladox: thanks :)
[11:07:05] You need to use Java 8 and Node.js for building gerrit.
[11:07:08] that sounds fun
[11:10:25] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964676 (Paladox) p:Normal>Triage
[11:10:42] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964656 (Paladox) p:Triage>Normal
[11:11:31] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964656 (Paladox)
[11:13:34] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2964685 (Paladox) >>! In T156024#2964550, @Aklapper wrote: >>>! In T156024#2961886, @Paladox wrote: >> Setting to high as this is a blocker for an...
[11:22:48] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964694 (Paladox)
[11:23:58] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2964656 (Paladox)
[12:16:19] PROBLEM - App Server Main HTTP Response on deployment-mediawiki05 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 22575 bytes in 1.618 second response time
[12:17:17] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 18105 bytes in 1.458 second response time
[12:20:19] PROBLEM - App Server Main HTTP Response on deployment-mediawiki04 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 22575 bytes in 1.089 second response time
[12:20:19] PROBLEM - App Server Main HTTP Response on deployment-mediawiki06 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 22575 bytes in 1.084 second response time
[12:20:19] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 22982 bytes in 1.320 second response time
[12:30:18] RECOVERY - App Server Main HTTP Response on deployment-mediawiki04 is OK: HTTP OK: HTTP/1.1 200 OK - 45737 bytes in 1.758 second response time
[12:30:20] RECOVERY - App Server Main HTTP Response on deployment-mediawiki06 is OK: HTTP OK: HTTP/1.1 200 OK - 45710 bytes in 3.705 second response time
[12:30:23] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 46166 bytes in 1.404 second response time
[12:31:23] RECOVERY - App Server Main HTTP Response on deployment-mediawiki05 is OK: HTTP OK: HTTP/1.1 200 OK - 45713 bytes in 3.658 second response time
[12:31:39] (PS1) Zfilipin: WIP Let node-config know it should use Jenkins configuration [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740)
[12:32:17] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 33502 bytes in 1.816 second response time
[12:44:31] (PS2) Zfilipin: WIP Let node-config know it should use Jenkins configuration [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740)
[12:47:04] (PS3) Zfilipin: Let node-config know it should use Jenkins configuration file for running Selenium tests. [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740)
[12:51:45] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2964932 (Aklapper) >>! In T156024#2964648, @Paladox wrote: > though the 2.14 update haven't been released yet. So this task wasn't urgent (= hig...
[13:08:16] (CR) Hashar: [C: -1] Let node-config know it should use Jenkins configuration file for running Selenium tests. (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740) (owner: Zfilipin)
[13:21:37] Release-Engineering-Team, Phabricator, Wikimedia-Blog: Write a blog post on the up coming phabricator update on wmf - https://phabricator.wikimedia.org/T155896#2964994 (Aklapper) >>! In T155896#2961979, @EdErhart-WMF wrote: > Hey @Paladox, can you send me an email? Happy to send you a Google Doc with...
[13:28:19] (CR) Zfilipin: Let node-config know it should use Jenkins configuration file for running Selenium tests. (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740) (owner: Zfilipin)
[13:47:30] Yippee, build fixed!
[13:47:30] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #284: FIXED in 2 min 30 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/284/
[13:54:30] zeljkof: did review your selenium/nodejs patch for mw https://gerrit.wikimedia.org/r/#/c/323401/37 :]
[13:55:05] my major concern is the code duplication for beforeEach() and afterEach() in both page.js and user.js
[13:55:15] that should be a global hook really
[13:56:24] zeljkof: I think you just have to put them in a tests/selenium/helper.js
[13:56:36] and given the beforeEach/afterEach are not in a describe
[13:56:41] they will be executed before tests
[13:56:45] and applied to all of them
[14:02:10] hashar: yes, that is the next step
[14:02:22] I was extracting configuration so far
[14:52:58] zeljkof: but that looks great :]
[14:53:06] I like the "config" node module
[14:53:12] yeah
[14:53:18] took a bit to find, but I like that one
[14:57:10] but maybe we could just define each of the task with the proper global variables
[14:57:13] directly in the gruntfile js
[14:57:15] so one would have
[14:57:18] grunt mocha:vagrant
[14:57:46] though to run the tests against prod / beta we would need various other targets
[14:57:49] might end up being messy :D
[15:05:41] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2965328 (Paladox) >>! In T156024#2964932, @Aklapper wrote: >>>! In T156024#2964648, @Paladox wrote: >> though the 2.14 update haven't been release...
[15:10:50] hashar: oh, so define globals in the gruntfile...
[15:11:05] hm, I think using node-config is a cleaner way
[15:31:52] Scap3: autolog scap3 deployments in beta - https://phabricator.wikimedia.org/T156079#2965465 (bd808) >>! In T156079#2964296, @mmodell wrote: > @bd808: Maybe it doesn't matter but that seems like a really convoluted path to get a message into stash. Do you think it would be possible, practical to bypass tcpi...
[15:43:41] Release-Engineering-Team, Phabricator, Wikimedia-Blog: Write a blog post on the up coming phabricator update on wmf - https://phabricator.wikimedia.org/T155896#2965510 (Paladox) The update has been deployed at https://phab-01.wmflabs.org which should be going out hopefully to production phabricator t...
[16:16:17] Release-Engineering-Team, Elasticsearch, Phabricator: Add support for elasticsearch 5 - https://phabricator.wikimedia.org/T155299#2965604 (Paladox) I have tested elasticsearch 5 on phab-01, and it is working. Note that we had to delete the indexes and do a full reindex, but that's better then trying...
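Editor's sketch for the [14:57:10] to [15:11:05] exchange (per-target globals in the Gruntfile versus a config module): the idea that makes a config module tidier is that one set of defaults is overlaid with a small per-environment override, so each new target needs only the values that differ. The snippet below only illustrates that overlay idea and is not node-config's actual code; `mergeConfig`, `configFor`, the environment names and every value are hypothetical.

```javascript
'use strict';

// Hypothetical defaults and per-environment overrides (all values are made up).
const defaults = {
  baseUrl: 'http://127.0.0.1:8080',
  user: { name: 'Admin' },
  timeout: 20000
};
const overrides = {
  beta: { baseUrl: 'https://beta.example.org' },
  jenkins: { baseUrl: 'http://localhost:9412', user: { name: 'WikiAdmin' } }
};

// True for plain key/value objects, false for arrays, null and primitives.
function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Recursively overlay `override` on `base` without mutating either argument.
function mergeConfig(base, override) {
  const result = Object.assign({}, base);
  for (const key of Object.keys(override || {})) {
    const value = override[key];
    if (isPlainObject(value) && isPlainObject(base[key])) {
      result[key] = mergeConfig(base[key], value);
    } else {
      result[key] = value;
    }
  }
  return result;
}

// Pick the overlay by environment name; unknown names get the plain defaults.
function configFor(envName) {
  return mergeConfig(defaults, overrides[envName]);
}

console.log(configFor('jenkins'));
```

With this shape, adding a prod or beta target means adding one small override object rather than another block of globals in the Gruntfile, which seems to be the messiness raised at [14:57:49].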
[16:16:29] Release-Engineering-Team, Elasticsearch, Phabricator: Add support for elasticsearch 5 - https://phabricator.wikimedia.org/T155299#2965606 (Paladox) Open>Resolved a:Paladox
[16:16:58] zeljkof: yeah node-config is probably better :]
[16:17:07] easier than digging in the huge Gruntfile.js
[16:18:47] hashar: having some trouble in moving the hooks (before, after) to a separate file
[16:18:59] but there should be a way
[16:19:00] https://mochajs.org/#root-level-hooks
[16:19:09] it's probably just my javascript fu
[16:24:49] ok, so a couple of jobs fail https://gerrit.wikimedia.org/r/#/c/323401/
[16:25:24] because I have moved node-config config files from the default config folder to tests/selenium/config
[16:26:14] and now npm has to be invoked with NODE_CONFIG_DIR=./tests/selenium/config
[16:27:16] will amend https://gerrit.wikimedia.org/r/#/c/333896/ with the fix
[16:28:37] (PS4) Zfilipin: WIP Let node-config know it should use Jenkins configuration [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740)
[16:31:28] zeljkof: have grunt set env.NODE_CONFIG_DIR ?
[16:31:49] hashar: done ^
[16:31:53] :D
[16:32:07] na
[16:32:14] not in the job, I meant in the Gruntfile :]
[16:32:18] or, do you think I should do it in the Gruntfile?
[16:32:28] ok, let me try
[16:32:49] the less we have in the jenkins job, the happier I am :]
[16:32:56] agreed
[16:43:27] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[16:47:28] (CR) Zfilipin: "#4 tries to fix failing Jenkins jobs in https://gerrit.wikimedia.org/r/#/c/323401/" [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740) (owner: Zfilipin)
[16:49:20] (PS5) Zfilipin: Let node-config know it should use Jenkins configuration file for running Selenium tests. [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740)
[16:50:28] (CR) Zfilipin: "#5 reverts #4, we are back where we were with #3." [integration/config] - https://gerrit.wikimedia.org/r/333896 (https://phabricator.wikimedia.org/T139740) (owner: Zfilipin)
[16:51:07] hashar: do you have any complaints about https://gerrit.wikimedia.org/r/#/c/333896/
[16:51:12] could we merge it?
[16:51:58] no clue
[16:52:02] in conf call pipeline right now
[17:29:05] Gerrit, Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2965857 (demon) p:Normal>Lowest
[17:29:25] Gerrit, Release-Engineering-Team, Upstream: Convert its-phabricator upstream repo which wmf maintains to bazel - https://phabricator.wikimedia.org/T156024#2965862 (demon) p:Low>Lowest
[19:10:49] Im going to move phab-01 to phabricator which i am apply the role now. (Just moving it to the role, i will hopefully be able to migrate data including db) :)
[19:11:03] i was speaking to mutante about this :)
[19:11:29] greg-g twentyafterfour ^^
[19:11:50] ok
[19:11:57] i will keep phab-01 as it is until i have managed to get everything in working order on phabricator
[19:12:23] https://phabricator.wikimedia.org/P4797#25209 <- with puppet role = good
[19:12:48] without puppet role, let's replace
[19:47:03] Yippee, build fixed!
[19:47:03] Project selenium-RelatedArticles » chrome,beta-mobile,Linux,contintLabsSlave && UbuntuTrusty build #286: FIXED in 1 min 2 sec: https://integration.wikimedia.org/ci/job/selenium-RelatedArticles/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta-mobile,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/286/
[20:49:34] Release-Engineering-Team, Reading-Admin, Zero, Mobile, Technical-Debt: Pull WikipediaMobileFirefoxOS from mediawiki-config - https://phabricator.wikimedia.org/T107172#2966560 (demon) Open>declined Per what I said [[#1567872|a year and a half ago]]
[21:05:02] Release-Engineering-Team, Phabricator, Wikimedia-Blog: Write a blog post on the up coming phabricator update on wmf - https://phabricator.wikimedia.org/T155896#2966673 (EdErhart-WMF) Hey @Aklapper and @Dzahn, I asked for an email because I don't follow my Phabricator notifications on a daily basis. T...
[21:25:11] Hi, im wondering does anyone know why im getting this error https://phabricator.wikimedia.org/P4801 for scap
[21:33:45] paladox: it looks like you are rendering a config file on a target that is consumes a file that is not valid yaml format.
[21:33:56] Oh.
[21:34:24] The configs im using are
[21:34:25] "scap::deployment_server": localhost
[21:34:33] "scap::server::sources":
[21:34:33] phabricator/deployment:
[21:34:33] repository: phabricator/deployment
[21:34:49] then i run /usr/bin/scap deploy-local --repo phabricator/deployment -D log_json:False
[21:36:13] do you have a deployment server for this project?
[21:36:29] ah, no, localhost
[21:37:03] yep, running it on https://wikitech.wikimedia.org/wiki/Nova_Resource:Phabricator.phabricator.eqiad.wmflabs
[21:37:37] ah, so on phab-tin in /srv/deployment/phabricator/deployment/scap/config_files.yaml
[21:38:31] the remote_vars key for a file in there is pointing to an invalid yaml file
[21:38:40] docs: https://doc.wikimedia.org/mw-tools-scap/scap3/quickstart/setup.html?highlight=config_deploy#remote-variable-files
[21:38:47] oh
[21:40:09] But, im using phabricator instance and not phab-tin. On phabricator, this file does not exist /srv/deployment/phabricator/deployment/scap/config_files.yaml
[21:40:16] the repo exists. though
[21:40:58] Im wondering is it this file
[21:40:59] phabricator-targets
[21:41:18] it points to phab-scap.eqiad.wmflabs
[21:42:35] so in the scap.cfg file you should see what file it's looking at to get a list of targets for a domain
[21:42:54] you may need to update that file to point to phab-01
[21:43:02] oh
[21:45:52] Ok, ive updated phabricator-targets since that is what it points to it
[21:45:53] dsh_targets: phabricator-targets
[21:45:59] Release-Engineering-Team (Deployment-Blockers), Release: MW-1.29.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T155525#2966831 (mmodell) Doh, ref'd the wrong task!
[21:46:03] still fails though
[21:48:42] so mostly the way we deploy is from tin -> targets. So on phab-tin you should be able to run: scap deploy -v inside the /srv/deployment/phabricator/deployment directory
[21:49:03] thcipriani, that wont work on labs though
[21:49:20] due to some problems with how it finds hostnames.
[21:49:35] but it does work on beta
[21:49:51] so I'm not sure what you mean
[21:50:47] oh.
[21:51:09] so for deployment_server i put deployment_server: phab-tin.phabricator.eqiad.wmflabs?
[21:51:27] that seems correct
[21:51:59] ah
[21:52:02] that works
[21:52:08] doing it under the folder and doing scap deploy -v
[21:52:51] Fails with this at the end
[21:52:51] https://phabricator.wikimedia.org/P4802
[21:53:06] Release-Engineering-Team (Deployment-Blockers), Release: MW-1.29.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T154683#2966859 (mmodell)
[21:56:05] 21:55:05 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'phabricator/deployment', '-g', 'default', 'fetch', '--refresh-config'] on phabricator.phabricator.eqiad.wmflabs returned [255]: Host key verification failed.
[21:56:09] fails with ^^
[21:57:02] add, add the ssh hostkey as your user. So just try to ssh to that machine as the ssh_user from the scap.cfg file
[21:57:09] and then accept the hostkey
[21:59:17] oh
[21:59:19] ok thanks
[21:59:21] Continuous-Integration-Config, Release-Engineering-Team, Analytics-Kanban, EventBus, Wikimedia-Stream: Improve tests for KafkaSSE - https://phabricator.wikimedia.org/T150436#2966905 (Nuria) Open>Resolved
[22:00:55] 22:00:48 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'phabricator/deployment', '-g', 'default', 'fetch', '--refresh-config'] on phabricator.phabricator.eqiad.wmflabs returned [255]: Permission denied (publickey).
[22:01:02] fails with permission denied public key
[22:02:48] yeah, we use keyholder in beta and production for this
[22:03:10] https://wikitech.wikimedia.org/wiki/Keyholder
[22:03:22] oh
[22:03:45] it may be overkill for what you're trying to achieve. Basically you just need some way for the ssh_user to ssh to the targets
[22:04:01] using a key
[22:05:50] oh yep
[22:06:02] would i need to add it to service::deploy::scap::public_key_file
[22:06:08] or do i set
[22:06:09] "keyholder::require_encrypted_keys": 'no'
[22:06:22] im just looking at how it is done on https://wikitech.wikimedia.org/wiki/Hiera:Deployment-prep
[22:07:14] scap::server::keyholder_agents
[22:07:19] oh i see theres ^^
[22:07:40] yep that's the one you want to setup keyholder
[22:08:33] hrm, not sure how this will work without a self-hosted puppetmaster though :\
[22:09:04] it looks for a secret file (using the puppet secret funciton) named whatever your key is named
[22:09:27] oh
[22:11:03] alternatively setting up a key for this user directly on this machine should work. I don't think puppet will remove the key you put in place on the target.
[22:11:27] have not read backlog, but if you need secrets in labs, use labs/private repo to put a fake/test key
[22:11:44] that's where the secret function will look
[22:12:21] Scap3: Support using scap on localhost without needing ssh and self hosting puppet masters - https://phabricator.wikimedia.org/T156197#2966988 (Paladox)
[22:13:04] mutante i need to add a ssh key to the labs private repo on the self hosting puppet. I guess i will just setup a self hosting one. Would a small be ok?
[22:13:15] twentyafterfour ^^
[22:13:55] PROBLEM - Puppet run on repository is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[22:15:54] paladox: there are 2 options, add fake key in labs/private (and keep using central puppetmaster) OR using self-hosted and not having to use labs/private
[22:16:04] i'd do the first unless you run into more problems
[22:16:07] thcipriani do i need to use wikitech to create the ssh key? Or can i create the ssh key locally?
[22:16:19] oh
[22:16:31] mutante ok, i will add the pub key to labs/private
[22:16:55] sounds good, yep
[22:17:13] well, the private key
[22:17:20] but since it just exists for testing.. it's ok
[22:17:30] ok
[22:17:34] you can just make a key
[22:18:36] ok
[22:18:49] what does this do
[22:18:50] "scap::server::keyholder_agents":
[22:18:51] phabricator:
[22:18:52] trusted_groups:
[22:18:53] - project-deployment-prep
[22:18:54] thcipriani ^^
[22:19:14] thats an example, but what do i do for phabricator. Does it trusty my ssh key if i do project-phabricator
[22:20:15] trusty = trust
[22:22:20] ah, so keyholder checks your user's group before it allows you to use a key
[22:22:29] so yeah, you'll want to change that to project-phabricator
[22:22:43] and it'll trust folks in the project to use that key that's held by keyholder
[22:24:08] oh
[22:24:10] thanks
[22:24:30] also it dosent seem that keyholder is installed.
[22:25:58] hrm, it should get installed when you have an agent defined
[22:26:10] paladox: "keyholder status" as root ?
[22:26:29] root@phabricator:/srv/deployment/phabricator/deployment# keyholder status
[22:26:29] bash: keyholder: command not found
[22:26:32] mutante ^^
[22:26:40] paladox: on phabricator-tin ?
[22:27:01] that's the equivalent of a deployment server
[22:27:05] it works on there
[22:27:10] and the keyholder would be on the deployment box
[22:27:16] to connect from there to the box it deploys to
[22:27:28] so that part is normal
[22:27:30] role::deployment::server
[22:27:33] yep
[22:27:36] that gets you keyholder
[22:28:00] paladox: try "keyholder arm" on phab-tin
[22:28:14] eh, /usr/local/sbin/keyholder status ?
[22:29:48] ok
[22:30:01] root@phabricator:/srv/deployment/phabricator/deployment# /usr/local/sbin/keyholder status
[22:30:01] bash: /usr/local/sbin/keyholder: No such file or directory
[22:30:16] Creating directory '/nonexistent'.
[22:30:17] Identity added: /etc/keyholder.d/mwdeploy (rsa w/o comment)
[22:30:33] root@phab-tin:/home/paladox# keyholder arm
[22:30:33] Creating directory '/nonexistent'.
[22:30:33] Identity added: /etc/keyholder.d/mwdeploy (rsa w/o comment)
[22:30:56] it says /etc/keyholder.d/mwdeploy is not an acceptable key. Does it have a passphrase?
[22:32:03] mutante ^^
[22:33:30] paladox: it wants you to set a passphrase on your key
[22:33:38] when you create one, use the -p option
[22:33:45] and set a passphrase
[22:33:51] Oh, so i create a new on on phab-tin
[22:33:53] this is during key creation
[22:33:56] yes
[22:34:02] set a random passphrase
[22:34:11] and when doing "keyholder arm" it will ask you to enter it
[22:34:54] Oh, i did that but it fails
[22:35:24] paladox: you can test it on your local machine with
[22:35:28] ssh-add
[22:35:34] ok
[22:35:46] make key, ssh-add /path/to/key
[22:35:48] ssh-add -l
[22:35:59] Identity added: /Users/xxxx/.ssh/id_rsa (/Users/xxx/.ssh/id_rsa)
[22:36:02] keyholder just uses ssh-add internally
[22:36:21] paladox: does it list it with ssh-add -l ?
[22:36:33] root@phab-tin:/home/paladox# ssh-add
[22:36:33] Could not open a connection to your authentication agent.
[22:36:36] and in that list, does it show the full path to it?
[22:36:56] ssh-add by itself doesnt do anything
[22:37:02] -l to list keys
[22:37:05] XXX@XXX-MBP:~$ ssh-add -l
[22:37:05] 2048 SHA256:2a2XnGm8Cb4qN/Kf8JyXjgYoXqPN/3vWxk5yJmXYD2U /Users/xxxx/.ssh/id_rsa (RSA)
[22:37:05] or path to key to add one
[22:37:27] ok, and did it ask you to type a passphrase when you added it?
[22:37:27] root@phab-tin:/home/paladox# ssh-add -l
[22:37:27] Could not open a connection to your authentication agent.
[22:37:31] nope
[22:37:49] then try creating one again
[22:37:53] with a passphrase
[22:38:15] oh ok. can i have mutiple ssh keys?
[22:38:18] yes
[22:39:04] ok
[22:39:24] it's created now
[22:39:43] but doing ssh-key -I still dosent ask for a passphrase
[22:39:50] but i did set one this time
[22:42:04] I don't know "ssh-key"
[22:42:28] oh
[22:42:32] ssh-keygen -p
[22:42:47] oh sorry
[22:43:10] what was the full command you used to create the key?
[22:43:17] what was -I
[22:44:19] ssh-keygen
[22:44:36] anyways, doing ssh-keygen -p changes a passphase
[22:44:48] but i doint need to change the passphase on my new ssh key
[22:44:51] as i now have mutiple;s
[22:44:54] mutiple's
[22:47:14] paladox: however you do it, the one you want to use must have a passphrase
[22:47:25] just type ssh-keygen then, do you get asked this, or not:
[22:47:34] Enter passphrase (empty for no passphrase): *enter your passphrase here*
[22:47:40] do _not_ just hit enter there
[22:47:48] enter a random password
[22:47:50] ok, yep i got asked that
[22:47:52] i did
[22:48:06] then if you load that key with ssh-add it must ask you for that passphrase too
[22:50:16] yep
[22:50:18] it did
[22:51:19] good, then "keyholder arm" should work the same way now
[22:51:48] ok
[22:53:06] root@phab-tin:/home/paladox# ssh-add ~/.ssh/id_rsa
[22:53:06] Could not open a connection to your authentication agent.
[22:53:11] doing that fails still ^^
[22:53:30] that's not the same thing
[22:53:37] use "keyholder arm"
[22:53:47] ssh-add was for testing on your local machine
[22:53:56] if the key has a passphrase
[22:53:56] RECOVERY - Puppet run on repository is OK: OK: Less than 1.00% above the threshold [0.0]
[22:54:16] root@phab-tin:/home/paladox# keyholder arm
[22:54:17] Identity added: /etc/keyholder.d/mwdeploy (rsa w/o comment)
[22:54:23] :)
[22:54:23] root@phab-tin:/home/paladox# keyholder arm
[22:54:23] /etc/keyholder.d/mwdeploy is not an acceptable key. Does it have a passphrase?
[22:54:23] Identity added: /etc/keyholder.d/mwdeploy (rsa w/o comment)
[22:54:27] duh
[22:54:41] added or not added?
[22:54:44] confusing [22:55:04] it shows a key in /etc/keyholder.d/mwdeploy [22:55:13] what does keyholder status say now? [22:55:25] yes, is that the key you made? [22:55:34] root@phab-tin:/home/paladox# keyholder status [22:55:34] keyholder-agent: active [22:55:35] - 4096 cb:43:c3:76:3a:68:b0:29:1a:8f:3f:31:1f:87:f8:7c rsa w/o comment (RSA) [22:55:35] keyholder-proxy: active [22:55:35] - 4096 cb:43:c3:76:3a:68:b0:29:1a:8f:3f:31:1f:87:f8:7c rsa w/o comment (RSA) [22:55:44] mwdeploy is for mediawiki, not phab [22:55:46] in prod [22:56:07] ok [22:56:11] paladox: ok, so the "w/o comment" part is the problem [22:56:17] oh [22:56:17] and i had the same problem in production [22:56:23] when i tried changing the passphrases [22:56:31] oh [22:56:34] and i wish i had the solution [22:56:50] it has to do with the version of ssh-add or something [22:56:51] oh? [22:57:06] I thought that was not a problem with the new version of tin/mira [22:57:08] Oh [22:57:16] the w/o comment thing [22:57:22] keyholder expects the full path to the key to be there [22:57:28] where it says "w/o comment" here [22:57:30] yeep [22:57:40] in the moment i touched the existing keys [22:57:46] and changed the passphrase on them [22:57:51] huh, weird [22:57:52] the comment disappeared [22:57:55] and i had to revert it [22:57:58] because it broke [22:58:08] i even talked in #openssh for a while about it [22:58:11] so we had this problem in beta, but we didn't have the problem in production when we moved deployment servers [22:58:23] it somehow depends where we create the keys then [22:58:26] so I abandoned my patch [22:58:27] i guess [22:58:38] hold on, finding notes [22:59:04] https://phabricator.wikimedia.org/T154943#2955264 [22:59:34] https://gerrit.wikimedia.org/r/#/c/312947/ [23:00:03] ^ so we actually have that patch cherry-picked in beta [23:00:32] oh, this is needed as soon as you touch the keys [23:00:41] and i did never copy them away from tin/mira [23:00:52] oh, wait. 
to be correct [23:01:04] they are in the private repo on the puppetmaster [23:01:14] so i ran ssh-keygen where the private repo is [23:01:48] oh [23:02:28] different keyholder in labs vs prod will add to the confusion [23:02:34] and already did [23:02:49] i guess we can use git::clone on labs. [23:03:02] no, don't [23:03:24] oh [23:04:13] that adds yet another different setup and is what we want to avoid [23:04:31] trying to find the context here, but IIRC hashar/moritz reimaged deployment-{tin,mira} keyholder check was failing, I wrote this, then when mira and tin prod were reimaged the problem was gone [23:05:20] so I abandoned this patch; however, un-cherry-picking would have led to alerts so it stayed cherry-picked on beta puppetmaster [23:05:34] they just did not do the "keyholder arm" dance i guess [23:05:47] then faidon looked at it [23:05:55] and noticed there is a different passphrase for each key [23:06:03] and then asked me to change them to use the same passphrase [23:06:17] during reimaging they never touched the key files itself [23:06:26] they just get installed from puppet private repo [23:07:03] so as long as the files were not touched, they still showed the full path as comment [23:07:10] interesting [23:07:36] i think we should merge this in prod then [23:08:15] yarp, unabandoning now [23:09:15] :) [23:15:46] i checked that it works on tin and then merged [23:15:51] thanks [23:16:12] paladox: but for your actual phab deployment it should not matter [23:16:21] phaOk [23:16:31] thanks [23:17:04] paladox: in /etc/keyholder.d/ you would expect multiple keys [23:17:10] at least that is the case on tin [23:17:30] one is "mwdeploy" but another one is "phabricator" [23:17:44] oh [23:18:02] in prod we have also "analytics" "service" "dumps" and so on [23:18:13] one for each thing that gets deployed [23:18:36] so /etc/keyholder.d/phabricator (RSA) [23:18:50] i would expect that deployment-tin has that too [23:19:10] yep [23:19:59] 
./secret/secrets/phabricator/phab_deploy_private_key [23:20:06] this is the private key that matters [23:20:18] so for you that is labs/private ^ [23:20:23] oh [23:20:45] oops, actually there are 2 paths [23:20:50] ./secret/secrets/keyholder/phabricator [23:21:20] i believe the one in ./keyholder/ is the right one [23:21:43] can you locate this stuff in labs/private? [23:23:12] https://github.com/wikimedia/labs-private/blob/master/modules/secret/secrets/phabricator/phab_deploy_private_key [23:23:16] mutante ^^ [23:24:05] ah, that could have been me :) [23:24:16] i usually put SNAKEOIL :P [23:24:25] this was to make puppet compiler run [23:24:46] if you actually want to deploy now, needs an actual working key there instead [23:24:53] replace my placeholder [23:25:10] root@phab-tin:/home/paladox# keyholder arm [23:25:19] root@phab-tin:/home/paladox# keyholder arm [23:25:19] /etc/keyholder.d/*: No such file or directory [23:25:42] oh [23:25:54] well, now you lost me, because earlier: [23:25:55] 14:57 < paladox> root@phab-tin:/home/paladox# keyholder status [23:25:55] 14:57 < paladox> keyholder-agent: active [23:26:19] not sure why it is failing now. [23:26:30] i didn't really touch anything. [23:26:33] you even pasted how you have the mwdeploy key there [23:26:34] may have been https://gerrit.wikimedia.org/r/312947 [23:26:55] no, that is just an Icinga check [23:26:59] oh [23:27:02] it doesn't create or remove /etc/keyholder.d [23:27:13] module/keyholder does [23:27:13] Beta-Cluster-Infrastructure: beta cluster: Warning: failed to mkdir "/srv/mediawiki/php-master/images/thumb/... - https://phabricator.wikimedia.org/T145496#2967258 (Tgr) [23:29:40] Beta-Cluster-Infrastructure: beta cluster: Warning: failed to mkdir "/srv/mediawiki/php-master/images/thumb/... - https://phabricator.wikimedia.org/T145496#2967264 (Tgr) What would be the correct directory to use?
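Editor's note: two failure modes came up in the exchange above. The key file in labs/private was a SNAKEOIL placeholder rather than a real key, and `keyholder arm` later reported that nothing exists under /etc/keyholder.d/. A minimal Python sketch of a pre-flight check for both cases follows. It is not part of keyholder; the function name `inspect_key_dir` is hypothetical, and treating any file whose first line is not a "-----BEGIN ... PRIVATE KEY-----" header as a placeholder is an assumption about how such files look.

```python
import os
import tempfile

# Assumption: a real private key file starts with a "-----BEGIN ... PRIVATE KEY-----" line.
PEM_PREFIX = "-----BEGIN"


def inspect_key_dir(key_dir):
    """Classify files in key_dir as 'key' or 'placeholder'.

    Returns None when the directory itself is missing, which mirrors the
    "/etc/keyholder.d/*: No such file or directory" situation in the log.
    """
    if not os.path.isdir(key_dir):
        return None
    report = {}
    for name in sorted(os.listdir(key_dir)):
        path = os.path.join(key_dir, name)
        # Skip subdirectories and public key halves (hypothetical convention).
        if not os.path.isfile(path) or name.endswith(".pub"):
            continue
        with open(path, "r", errors="replace") as handle:
            first_line = handle.readline().strip()
        is_key = first_line.startswith(PEM_PREFIX) and "PRIVATE KEY" in first_line
        report[name] = "key" if is_key else "placeholder"
    return report


def demo():
    # Build a throwaway directory containing only a SNAKEOIL-style placeholder.
    with tempfile.TemporaryDirectory() as key_dir:
        with open(os.path.join(key_dir, "phabricator"), "w") as handle:
            handle.write("SNAKEOIL\n")
        print(inspect_key_dir(key_dir))
        print(inspect_key_dir(os.path.join(key_dir, "missing")))


if __name__ == "__main__":
    demo()
```

Pointing `inspect_key_dir` at the real key directory before running `keyholder arm` would separate "directory missing" from "placeholder content", which the chat could only tell apart by trial and error.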
[23:30:47] mutante, i will try again tomorrow, we managed to get puppet working correctly on phabricator, just scap won't clone the phabricator/deployment repo. [23:30:48] getting late here. [23:30:49] root@phab-tin:/srv# keyholder add /root/.ssh/id_rsa [23:30:51] root@phab-tin:/srv# keyholder add /root/.ssh/id_rsa [23:30:52] /root/.ssh/id_rsa: Permission denied [23:32:59] paladox: that key should be in /etc/keyholder.d/ [23:33:04] paladox: ok, continue next time [23:33:14] oh ok, thanks.
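Editor's note: the recurring problem in this session was the "rsa w/o comment" text in the `keyholder status` output, where (per the discussion) the full path of the key was expected as the comment. The following Python sketch parses status-style lines and flags keys whose comment is not an absolute path. The regular expression and the helper names `parse_key_lines` and `has_path_comment` are assumptions made for illustration, based only on the output pasted in the log; this is not how keyholder or its Icinga check is implemented.

```python
import re

# Matches lines such as:
#   - 4096 cb:43:...:7c rsa w/o comment (RSA)
# Layout inferred from the pasted output: bits, fingerprint, comment, (type).
KEY_LINE = re.compile(
    r"^\s*-?\s*(?P<bits>\d+)\s+(?P<fingerprint>\S+)\s+(?P<comment>.*?)\s+\((?P<type>[A-Z0-9-]+)\)\s*$"
)


def parse_key_lines(status_text):
    """Return one dict per key line; header lines like 'keyholder-agent: active' are skipped."""
    keys = []
    for line in status_text.splitlines():
        match = KEY_LINE.match(line)
        if match:
            keys.append(match.groupdict())
    return keys


def has_path_comment(key):
    """True when the key's comment looks like an absolute path to the key file."""
    return key["comment"].startswith("/")


# Status output as pasted in the log above.
SAMPLE = """keyholder-agent: active
- 4096 cb:43:c3:76:3a:68:b0:29:1a:8f:3f:31:1f:87:f8:7c rsa w/o comment (RSA)
keyholder-proxy: active
- 4096 cb:43:c3:76:3a:68:b0:29:1a:8f:3f:31:1f:87:f8:7c rsa w/o comment (RSA)
"""

if __name__ == "__main__":
    for key in parse_key_lines(SAMPLE):
        state = "ok" if has_path_comment(key) else "comment is not a path"
        print(key["fingerprint"], "->", state)
```

Both entries in the pasted sample carry the comment "rsa w/o comment", so both would be flagged.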