[01:03:22] 06Release-Engineering-Team, 10ArchCom-RfC, 06Developer-Relations, 06WMF-Legal, 07RfC: Create formal process for CREDITS files - https://phabricator.wikimedia.org/T139300#2432255 (10RobLa-WMF) [01:12:06] https://wikitech.wikimedia.org/wiki/Labs_Baremetal_Lifecycle aaah <-- Pinged 'cause of "le aaah". :P [01:22:14] heh, what is "le aaah" , pinged because of mutante [01:30:15] (03PS3) 10Madhuvishy: Add maven release job template and analytics-refinery-release project [integration/config] - 10https://gerrit.wikimedia.org/r/290597 (https://phabricator.wikimedia.org/T132182) [01:31:05] (03CR) 10jenkins-bot: [V: 04-1] Add maven release job template and analytics-refinery-release project [integration/config] - 10https://gerrit.wikimedia.org/r/290597 (https://phabricator.wikimedia.org/T132182) (owner: 10Madhuvishy) [01:57:24] Project browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce build #98: 04FAILURE in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce/98/ [04:10:15] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #65: 04FAILURE in 14 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/65/ [06:21:08] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [06:50:19] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T136042#2432485 (10Tgr) [06:50:21] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T135559#2432486 (10Tgr) [07:49:45] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2430096 (10mobrovac) As per {T138561}, the plan is to switch to the new LTS (4.4.6) nex... [08:16:39] 07Browser-Tests, 06Reading-Web-Backlog, 13Patch-For-Review, 03Reading-Web-Sprint-75-Quantitative-Paralysis, 05WMF-deploy-2016-07-05_(1.28.0-wmf.9): Run browser tests on beta cluster from desktop domain - https://phabricator.wikimedia.org/T130429#2432566 (10phuedx) >>! In T130429#2419299, @Jdlrobson wrote... [08:32:16] 10Browser-Tests-Infrastructure, 10Continuous-Integration-Config, 07Upstream, 15User-zeljkofilipin: Firefox v47 breaks mediawiki_selenium - https://phabricator.wikimedia.org/T137561#2432590 (10hashar) The Ubuntu Mozilla Security Team is preparing a package for Firefox 47.0.1. It has been proposed in their s... [08:37:02] 06Release-Engineering-Team, 06Commons, 10MediaWiki-File-management, 06Multimedia, and 4 others: InstantCommons broken by switch to HTTPS - https://phabricator.wikimedia.org/T102566#2432595 (10Tau) How do I check the http log channel? In /mediawiki/LocalSettings.php I have following lines: ``` ## To enab... [08:53:47] hashar hi, could i have help with zuul please [09:00:38] 05Continuous-Integration-Scaling, 06Labs, 10Labs-Infrastructure, 13Patch-For-Review: Nodepool can not delete/spawn instances anymore - https://phabricator.wikimedia.org/T139285#2432601 (10hashar) 05Open>03Resolved It is solved. What @paladox noticed yesterday was the pool of instances being exhausted a... 
[09:01:12] paladox: not today sorry got bunch of things to handle :( [09:01:18] Ok [09:02:13] hashar i got zuul being triggered by gerrit :) [09:07:16] (03PS1) 10Hashar: operations/software/service-checker: add tox-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/297559 [09:08:34] (03CR) 10Hashar: [C: 032] operations/software/service-checker: add tox-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/297559 (owner: 10Hashar) [09:09:22] (03Merged) 10jenkins-bot: operations/software/service-checker: add tox-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/297559 (owner: 10Hashar) [09:28:39] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432635 (10hashar) [09:29:33] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432651 (10Joe) a:03Joe [09:30:10] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432635 (10Joe) This is... [09:30:42] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432653 (10hashar) That... [09:32:53] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432655 (10hashar) It st... [09:45:28] 10Continuous-Integration-Config: Allow jenkins tests upon uploading a patch on gerrit for wikimedia/portals repo - https://phabricator.wikimedia.org/T139345#2432663 (10MarcoAurelio) p:05Triage>03Normal Okay, I think I now see the problem. I've checked the `layout.yaml` (https://phabricator.wikimedia.org/diff... [09:46:40] 10Continuous-Integration-Config: Whitelist @mxn on zuul.yaml so his patches can be CI-tested - https://phabricator.wikimedia.org/T139345#2432666 (10MarcoAurelio) [09:58:48] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432669 (10hashar) The c... [10:18:16] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2432678 (10Joe) If what... 
[10:37:38] PROBLEM - Parsoid on deployment-parsoid06 is CRITICAL: Connection refused [10:55:31] 03releng-201516-q4, 10Malu (Malu-Prototype), 07Surveys, 15User-zeljkofilipin: Send out browser testing user satisfaction survey - https://phabricator.wikimedia.org/T131123#2432705 (10zeljkofilipin) >>! In T131123#2422322, @Capt_Swing wrote: > Hi @zeljkofilipin. Could you add this survey to the community en... [10:56:28] 03releng-201516-q4, 10Malu (Malu-Prototype), 07Surveys, 15User-zeljkofilipin: Send out browser testing user satisfaction survey - https://phabricator.wikimedia.org/T131123#2432706 (10zeljkofilipin) >>! In T131123#2423267, @Risker wrote: > What's the privacy policy for this survey? Apologies for forgetti... [12:14:30] 10Browser-Tests-Infrastructure: We should check for ResourceLoader errors after any page opens - https://phabricator.wikimedia.org/T59304#2432859 (10zeljkofilipin) This might be interesting: [[ https://watirmelon.com/2016/06/29/using-webdriver-to-automatically-check-for-javascript-errors-on-every-page-2016-edit... [13:02:32] hashar, hi does the gerrit command have to be installed [13:02:36] since im getting this [13:02:37] Exception: Gerrit error executing gerrit review --project testing/test --message "Main test build succeeded. [13:02:37] - noop http://gerrit-jenkins.wmflabs.org/job/noop/None/console : SUCCESS [13:02:39] In log [13:02:46] paladox: no it is on the Gerrit server [13:02:57] Gerrit has a built in ssh daemon [13:03:02] which has a bunch of commands [13:03:05] so you do something like: [13:03:16] sudo su - zuul -s /bin/bash [13:03:18] to become zuul [13:03:20] then [13:03:28] ssh -p 29418 jenkins@localhost [13:03:38] ( 29418 is the TCP port on which Gerrit ssh daemon listen to) [13:03:42] then you can: [13:03:53] ssh -p 29418 jenkins@localhost query "is:open owner:paladox" [13:04:03] ssh -p 29418 jenkins@localhost gerrit review --code-review 2 12345,42 [13:04:04] etc [13:06:52] Oh [13:09:13] hashar shows this zuul@gerrit-test:~$ ssh -p 29418 jenkins@localhost query "is:open owner:Administrator" [13:09:13] Gerrit Code Review: query: not found [13:10:37] But ssh -p 29418 jenkins@localhost gerrit review --code-review 2 12345,42 works [13:13:47] paladox: the command is "gerrit" [13:13:53] try [13:13:55] ssh -p 29418 jenkins@localhost gerrit [13:13:59] Ok [13:14:08] ssh -p 29418 jenkins@localhost gerrit query "is:open" [13:14:12] something like that [13:14:20] the Yep that works [13:14:21] ssh -p 29418 jenkins@localhost gerrit query "is:open" [13:15:57] But doing [13:15:58] zuul@gerrit-test:~$ ssh -p 29418 jenkins@localhost gerrit review --project testing/test --message "Main test build succeeded. - noop http://gerrit-jenkins.wmflabs.org/job/noop/None/console : SUCCESS " --verified 2 5,2 [13:15:58] fatal: "test" is not a valid patch set [13:16:03] dosent work [13:17:08] hashar ^^ [13:29:23] Seems to be an escaping issue [13:44:31] hashar how can i do '" in the layout file [13:44:43] where Main test build succeeded message is [13:44:50] since it has an escaping problem [13:45:01] https://gerrit.googlesource.com/gerrit/+/9c0cfc22f626a7e9a70c121ff0cf389edfd23894%5E%21/ [13:45:22] please [13:47:53] Yippee, build fixed! 
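On the escaping question just above: the message Zuul posts is defined in the pipeline section of layout.yaml as an ordinary YAML string, and Zuul does its own quoting when it runs `gerrit review` over SSH. A minimal, hypothetical Zuul v2 pipeline sketch (names and messages are illustrative, not the actual wmflabs configuration):

```yaml
# Hypothetical Zuul v2 layout.yaml pipeline: success-message / failure-message
# are plain quoted YAML strings; Zuul escapes them itself when it posts the
# review, so no extra shell quoting is needed in the layout file.
pipelines:
  - name: test
    manager: IndependentPipelineManager
    trigger:
      gerrit:
        - event: patchset-created
    success:
      gerrit:
        verified: 1
    failure:
      gerrit:
        verified: -1
    success-message: "Main test build succeeded."
    failure-message: "Main test build failed."
```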
[13:47:53] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #72: 09FIXED in 3 min 51 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/72/ [14:01:22] paladox: sorry [14:01:33] ssh -p 29418 jenkins@localhost gerrit review --project testing/test --message "Main test build succeeded. - noop http://gerrit-jenkins.wmflabs.org/job/noop/None/console : SUCCESS " --verified 2 5,2 [14:01:41] if that fails maybe you need to quote again the message [14:02:06] oh no that is generated by zuul [14:02:16] and yeah for "noop" job there is no build [14:02:33] so the url is wrong, try with a real job [14:04:11] Ok [14:04:42] hashar i tryed with quotes and it worked [14:04:48] how do i get zuul to do that [14:04:49] please [14:05:34] hashar yay i got it working [14:05:34] http://gerrit-test.wmflabs.org/gerrit/#/c/5/2 [14:06:02] It was v+ was the problem it was set at 2 when i havent set it that high only +1 [14:07:02] (03PS1) 10Hashar: Whitelist Minh Nguyen [integration/config] - 10https://gerrit.wikimedia.org/r/297596 (https://phabricator.wikimedia.org/T139345) [14:07:11] hashar would doing something like [14:07:12] - job: [14:07:12] name: 'noop' [14:07:12] node: master [14:07:12] concurrent: true [14:07:12] triggers: [14:07:14] - zuul [14:07:16] builders: [14:07:18] - shell: "echo $ZUUL_PROJECT > extensions_load.txt" [14:07:20] - shell: "echo success" [14:07:22] work [14:07:33] i set it in mediawiki-extension* yaml file as a test. [14:08:13] But i still need quetoes to be passed to zuul [14:08:20] so that success is coloured in green [14:09:24] paladox: "noop" is a job that is built-in Zuul [14:09:29] create a real job [14:09:31] Oh [14:09:34] ok [14:09:41] you can even create it directly in Jenkins web interface if that is easier [14:09:44] Probaly something like test-gerrit [14:14:00] (03CR) 10Hashar: [C: 032] Whitelist Minh Nguyen [integration/config] - 10https://gerrit.wikimedia.org/r/297596 (https://phabricator.wikimedia.org/T139345) (owner: 10Hashar) [14:14:49] (03Merged) 10jenkins-bot: Whitelist Minh Nguyen [integration/config] - 10https://gerrit.wikimedia.org/r/297596 (https://phabricator.wikimedia.org/T139345) (owner: 10Hashar) [14:18:50] hashar_ yay it works [14:18:55] jenkins is triggered now [14:19:31] But on http://gerrit-jenkins.wmflabs.org/job/test-gerrit/1/ [14:19:33] it does [14:19:36]
[14:19:36] Triggered by change: 5,2
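For reference, an end-to-end sketch of the setup being worked out here: a real Jenkins Job Builder job triggered by Zuul (along the lines of paladox's paste above, minus the reserved name "noop"), plus the Zuul layout stanza that wires the repository to it. All names are hypothetical:

```yaml
# Hypothetical JJB definition ("noop" is reserved by Zuul itself and never
# reaches Jenkins, so a real job such as test-gerrit is defined instead).
- job:
    name: 'test-gerrit'
    node: master
    concurrent: true
    triggers:
      - zuul
    builders:
      - shell: 'echo "$ZUUL_PROJECT $ZUUL_CHANGE,$ZUUL_PATCHSET"'

# Hypothetical Zuul layout.yaml stanza running that job in the test pipeline.
projects:
  - name: testing/test
    test:
      - test-gerrit
```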
[14:19:49] 10Continuous-Integration-Config, 13Patch-For-Review: Whitelist @mxn on zuul.yaml so his patches can be CI-tested - https://phabricator.wikimedia.org/T139345#2433180 (10hashar) 05Open>03Resolved a:03hashar Should be good. Please reopen if that is not the case. [14:28:38] hashar where do i get zuul-clonemap.yaml from [14:28:42] https://github.com/wikimedia/integration-config/blob/fbe0fa10332cb9c3ed2072cc3bca0e3671e8ae8b/jjb/macro-scm.yaml#L30 [14:28:49] http://gerrit-jenkins.wmflabs.org/job/test-gerrit/2/console [14:29:36] please [14:47:20] !log attempting to refresh ci-jessie-wikimedia image to get librdkafka-dev included for T133779 [14:47:21] T133779: Event Logging doesn't handle kafka nodes restart cleanly - https://phabricator.wikimedia.org/T133779 [14:47:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [14:47:41] paladox: the clone map is in integration/jenkins.yaml [14:47:46] paladox: you most probably dont need it [14:48:00] paladox: it is for zuul-cloner to clone repositories under specific paths [14:48:10] eg mediawiki/core to /src/ [14:48:20] and mediawiki/extensions/Foobar to /src/extensions/Foobar [14:49:28] paladox: try it with out passing --map ? [14:50:51] Ok [14:50:55] Yep [14:51:04] But fails with http://gerrit-jenkins.wmflabs.org/job/test-gerrit/5/console [14:51:05] now [14:51:07] hashar ^^ [14:52:10] 14:45:15 stderr: 'fatal: remote error: Git repository not found' [14:52:24] I have no idea how the Gerrit git repos are exposed publicly [14:52:26] over apache [14:52:30] maybe that needs some proxy [14:53:13] paladox: in puppet there is ./modules/gerrit/manifests/proxy.pp [14:53:22] and ./modules/gerrit/templates/gerrit.wikimedia.org.erb [14:53:33] Oh [14:53:42] ProxyPass /r/ http://127.0.0.1:8080/r/ retry=0 nocanon [14:53:47] AllowEncodedSlashes On [14:53:54] and probably other directives are needed [14:54:36] Oh [14:54:54] So i change it from /gerrit/ to /r/ [14:54:57] !log Image ci-jessie-wikimedia-1467816381 in wmflabs-eqiad is ready T133779 [14:54:58] T133779: Event Logging doesn't handle kafka nodes restart cleanly - https://phabricator.wikimedia.org/T133779 [14:55:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [14:56:03] hashar ^^ [14:58:17] 10Continuous-Integration-Config, 13Patch-For-Review: Whitelist @mxn on zuul.yaml so his patches can be CI-tested - https://phabricator.wikimedia.org/T139345#2433250 (10mxn) Thanks! [14:58:18] paladox: /gerrit/ is the web interface [14:58:25] Yep [14:58:26] paladox: not sure where it is redirected [14:58:32] paladox: maybe you can just use /r/ [14:58:36] Ok [14:58:39] I will do that [14:59:00] or use ProxyPath /gerrit/r/ http://127.0.0.1:8080/r/ retry=0 nocanon [14:59:06] I am not sure really, you want to test it :-} [14:59:10] Do i remove these bit's [14:59:11] ProxyRequests Off [14:59:11] ProxyVia Off [14:59:11] ProxyPreserveHost On [14:59:14] adjust the job then [14:59:20] Ok [14:59:31] and you can look at Gerrit interface as to which URL it uses to download a patch [14:59:37] Ok [14:59:41] Proxy* directives, you probably need all of them [14:59:49] Ok [15:01:34] Oh wait i forgot the gerrit in the url [15:01:38] that's why it failed [15:06:42] 07Browser-Tests, 06Reading-Web-Backlog, 13Patch-For-Review, 03Reading-Web-Sprint-75-Quantitative-Paralysis, 05WMF-deploy-2016-07-05_(1.28.0-wmf.9): Run browser tests on beta cluster from desktop domain - https://phabricator.wikimedia.org/T130429#2433274 (10zeljkofilipin) >>! 
In T130429#2412536, @Jdlrobso... [15:13:31] hashar it works now :) [15:13:49] we can now build zuul from the version you merged in precise and test it [15:13:50] :) [15:14:58] I am not sure I got one for Jessie though [15:15:02] hashar i get this error now http://gerrit-jenkins.wmflabs.org/job/test-gerrit/8/console [15:15:06] hashar nope we doint [15:15:10] or do you have the Zuul server on Precise? [15:15:19] I have it on jessie [15:15:43] and in prod Gerrit is 2.8 [15:16:01] Yep i use gerrit 2.12 [15:16:14] I think hopefully it will be updated in two weeks but not time frame for that. [15:17:52] hashar im not sure why it is erroring in http://gerrit-jenkins.wmflabs.org/job/test-gerrit/8/console dosent clearing [15:17:59] show the error except from trace back. [15:20:21] paladox: try deleting the workspace entirely ? [15:20:27] Ok [15:20:29] and you can try doing: [15:20:31] git clone https://gerrit-test.wmflabs.org/gerrit/p/testing/test [15:20:32] cd test [15:20:36] git remote prune origin [15:20:37] Oh [15:20:39] Ok [15:20:41] thanks [15:20:42] :) [15:23:25] hashar i get [15:23:25] root@gerrit-test:/srv/zuul/git/testing/test# git remote prune origin [15:23:25] The authenticity of host '[127.0.0.1]:29418 ([127.0.0.1]:29418)' can't be established. [15:23:25] RSA key fingerprint is 7b:33:37:9a:86:55:e0:9c:5d:c9:9e:81:7f:cd:0e:af. [15:23:25] Are you sure you want to continue connecting (yes/no)? yes [15:23:26] Warning: Permanently added '[127.0.0.1]:29418' (RSA) to the list of known hosts. [15:23:27] Permission denied (publickey). [15:23:29] fatal: Could not read from remote repository. [15:23:31] Please make sure you have the correct access rights [15:23:33] and the repository exists. [15:23:57] looks like it tries to use ssh [15:24:01] Yep [15:24:06] you can see the remote URL with: git remote -v [15:24:21] Ok [15:24:31] the jenkins job should probably just use https:// [15:24:33] git remote -v [15:24:38] origin ssh://jenkins@127.0.0.1:29418/testing/test (fetch) [15:24:38] origin ssh://jenkins@127.0.0.1:29418/testing/test (push) [15:24:41] yeah [15:25:03] and the jenkins user running the job does not have the credentials since that is only for the Zuul merger [15:25:32] if you got the zuul merger running properly, the job should git clone using $ZUUL_URL/r/$ZUUL_PROJECT [15:25:41] Oh [15:25:51] look at the jjb conf in default.yaml [15:25:56] Ok [15:26:01] that has some git scm with the proper parameters [15:26:05] ok [15:26:10] the $ZUUL_ * parameters are passed to Zuul [15:26:16] are passed to Jenkins by Zuul [15:26:24] and should point to the zuul::merger instance [15:26:40] heading to audio train for the next two hours [15:27:15] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2430096 (10ssastry) One of our jenkins jobs continues to fail ... https://integration.w... [15:28:50] Ok [15:32:38] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2433400 (10greg) >>! In T139374#2432528, @mobrovac wrote: > As per {T138561}, the plan... [15:35:01] hashar would this be the cause [15:35:02] zuul_url=git://localhost [15:39:01] Yippee, build fixed! 
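A simplified sketch of the git scm block hashar points at in default.yaml: the job checks out from the zuul-merger (via $ZUUL_URL) rather than from Gerrit, using the parameters Zuul passes to Jenkins. This is a stand-in, not a verbatim copy of the integration/config macro:

```yaml
# Sketch of a JJB scm entry that fetches the zuul-merger's prepared ref
# instead of cloning over SSH from Gerrit.
- scm:
    name: git-zuul
    scm:
      - git:
          url: '$ZUUL_URL/$ZUUL_PROJECT'
          refspec: '$ZUUL_REF'
          branches:
            - '$ZUUL_COMMIT'
          wipe-workspace: false
```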
[15:39:02] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #71: 09FIXED in 17 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/71/ [15:49:09] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2433460 (10thcipriani) >>! In T139374#2433368, @ssastry wrote: > One of our jenkins job... [15:51:12] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2433465 (10ssastry) >>! In T139374#2433460, @thcipriani wrote: >>>! In T139374#2433368,... [16:00:43] (03CR) 10JanZerebecki: Add RevisionSlider to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/297412 (https://phabricator.wikimedia.org/T138943) (owner: 10Addshore) [16:01:43] (03CR) 10JanZerebecki: [C: 031] Add RevisionSlider to make-wmf-branch [tools/release] - 10https://gerrit.wikimedia.org/r/297412 (https://phabricator.wikimedia.org/T138943) (owner: 10Addshore) [16:18:42] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2433543 (10mobrovac) >>! In T139374#2433400, @greg wrote: > At the same time, let's red... [16:55:19] Is there a page on mw.o about Diffusion repo creation policies/practices? I want a new repo with diffusion as the primary and gerrit and github mirrors to put my Striker project into. [16:55:52] I have the right permissions to do the Diffusion part myself, but I'm not sure that I know how things should be named etc [16:56:06] twentyafterfour ^^ [16:59:01] 10Continuous-Integration-Config, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-NewFunctionality: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#1417808 (10mmodell) I just ran across http:... [17:14:41] 10Continuous-Integration-Config, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-NewFunctionality: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#2433775 (10jcrespo) >>! In T104459#2433664,... [17:33:56] bd808: I don't think policies / practices are well documented. [17:34:08] that's something I'm trying to improve [17:34:26] *nod* [17:34:39] for now should I just open a task about what I want setup? [17:34:49] bd808: the github mirroring is easy to configure since we have the credentials in phabricator's credential store [17:35:08] bd808: feel free to create the repo yourself if you have permission to do so [17:35:28] but if you have specific questions I'll gladly answer them either here or on a task [17:35:45] and I can help set up the mirroring to github / gerrit [17:36:35] do we have any guidelines for organizing things? 
This repo is a python app that will end up deployed in production via scap3 once it passes security review [17:37:34] gerrit had its hierarchy rather than the flat namespace of github & phab [17:38:18] I'm not clear on if we are maintaining some pseudo hierarchy for newer project generally or not [17:38:38] I know we put a several newer php project right in the "root" namespace [17:38:50] wow, grammar [17:41:58] bd808: I think we actually have no policys. Maybe because we have less repo admins that project admins [17:42:24] It's a brave new world ;) [17:43:12] in gerrit there was all this inheriting permissions from parent repos and stuff.. but on the other hand we mostly just used that one "wmf" group for everything afaict [17:43:14] bd808: heh, I like that world, because there was nothing bad, so we don't need policys to prevent it ;) [17:43:15] bd808: hierarchy isn't strictly necessary but it doesn't hurt to maintain some organization [17:43:24] so maybe we dont need that much hierarchy [17:43:49] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T138555#2433924 (10matmarex) [17:44:14] almost every part of phabricator avoids hierarchy, generally [17:44:28] *nod* I'll think on it a bit. I'm in no super rush to switch this over from the personal github repo I've been using up to now. [17:44:33] but that's because of upstream not us [17:44:39] I just need to get it moved before any prod deploys [17:45:16] twentyafterfour hi, could i have help with zuul please. [17:45:22] Im getting this error http://gerrit-jenkins.wmflabs.org/job/test-gerrit/36/console [17:45:34] from patch http://gerrit-test.wmflabs.org/gerrit/#/c/6/2 [17:45:37] paladox: looking [17:45:40] Thanks [17:46:21] bd808: btw, phabricator supports slashes in the 'name' field for repositories, but also has a 'short name' which doesn't allow slashes. So feel free to use hierarchy by putting slashes in the name [17:46:33] We could use it for testing any jenkins/zuul changes for phabricator on phab-01 [17:47:20] bd808: I think actually we are using - insted of /? So we have mediawiki-core, and mediawiki-extensions-, that's the actual situation, IIRC [17:47:26] twentyafterfour: ^ [17:47:35] (sry, was mainly to 20after4) [17:47:38] Luke081515: yes but that was a silly choice. [17:47:45] why? [17:47:49] I think that phabricator used to reject slashes [17:48:08] why is it silly? because it breaks the mapping between our existing repos and those in phabricator [17:48:10] the dashes make things the same on github too which is not horrible [17:48:35] we had to define a big list mapping the names in gerrit to the names in phabricator and then maintain that list [17:48:57] I think we should decide, what we want. because actually I'm changing names to - instead of / if I see them, because I thought that was wanted, and most of them are already at - [17:49:03] twentyafterfour maybe we can make that list editable from phabricator gui [17:49:13] bd808: the "short name" field works for matching up with github [17:49:15] making it easy for any changes to be done. [17:49:31] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 06Services, 07Easy, 13Patch-For-Review: npm-node-4.3 jobs are failing because node is version 4.4.6 - https://phabricator.wikimedia.org/T139374#2433933 (10Krinkle) For Node, major versions are the top level numbers. There shouldn't... [17:49:49] twentyafterfour: ah neat. 
best of both worlds [17:50:00] paladox: I want to move to using the convention of name=gerrit_name, short name='name-with-dashes' [17:50:11] Oh yep that would be nice [17:50:12] :) [17:50:27] Also we could probaly do that at the same time as adding github mirror. [17:50:31] 'short name' is part of the url so it becomes the directory name when you clone the repo. 'name' is just the label and it can contain anything [17:52:10] twentyafterfour comparing http://gerrit-jenkins.wmflabs.org/job/test-gerrit/36/console and https://integration.wikimedia.org/ci/job/parsoidsvc-source-npm-node-4.3/281/consoleFull [17:52:12] it shows [17:52:13] that [17:52:22] hudson.plugins.git.GitException: Command "git -c core.askpass=true fetch --tags --progress http://localhost/gerrit/testing/test refs/zuul/master/Z154535efb67d4b349f62f73a361ddf6b" returned status code 128: [17:52:35] is doing [17:52:36] git -c core.askpass=true fetch --tags --progress git://scandium.eqiad.wmnet/mediawiki/services/parsoid +refs/heads/*:refs/remotes/origin/* [17:52:47] is doing = should be doingh [17:52:50] doing [17:53:13] paladox: you need to adjust the jenkins job config [17:53:17] oh [17:53:24] there is a place where you can configure the refspec [17:53:31] What do i ajust it to please, yep there is. [17:53:38] I used jenkins job builder [17:53:50] that i git cloned from repo in phabricator [17:54:51] hmm [17:57:46] twentyafterfour the config file looks like [17:57:47] [core] [17:57:47] repositoryformatversion = 0 [17:57:47] filemode = true [17:57:47] bare = false [17:57:48] logallrefupdates = true [17:57:50] [remote "origin"] [17:57:52] url = ssh://jenkins@127.0.0.1:29418/testing/test [17:57:54] fetch = +refs/heads/*:refs/remotes/origin/* [17:57:56] [branch "master"] [17:57:58] remote = origin [17:58:00] merge = refs/heads/master [17:58:02] [user] [17:58:04] email = zuul-merger@gerrit-test.wmflabs.org [17:58:06] name = jenkins [17:58:08] for .git [17:58:10] /srv/zuul/git/testing/test/.git [17:58:12] paladox: pastebin [17:58:22] Sorry [18:00:42] paladox: this might work? refs/*:refs/remotes/origin/* [18:00:51] I don't know much about zuul [18:00:53] Ok [18:00:55] thanks [18:17:06] ostriches: https://www.mediawiki.org/wiki/Release_checklist#Update_MediaWiki.org - [[Template:MW version/status]] was still 1.27=unknown, 1.26=stable, 1.25=legacy. Updated now to stable/legacy/unsupported. [18:17:54] * Krinkle created that template in 2013 - hoped it would be easy to re-do in Lua [18:18:37] Krinkle: Ok thx [18:43:44] 07Browser-Tests, 06Reading-Web-Backlog, 05WMF-deploy-2016-07-05_(1.28.0-wmf.9): Run browser tests on beta cluster from desktop domain - https://phabricator.wikimedia.org/T130429#2434273 (10Jdlrobson) [18:44:12] 07Browser-Tests, 06Reading-Web-Backlog, 05WMF-deploy-2016-07-05_(1.28.0-wmf.9): Run browser tests on beta cluster from desktop domain - https://phabricator.wikimedia.org/T130429#2135772 (10Jdlrobson) Okay. Updated description to describe the problem and moved. 
[19:07:45] (03Abandoned) 10Subramanya Sastry: Hide the read-more block in the default mediawiki rendering [integration/visualdiff] - 10https://gerrit.wikimedia.org/r/269335 (owner: 10Subramanya Sastry) [19:37:50] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T138555#2434546 (10mmodell) [19:37:52] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2434545 (10mmodell) [19:49:06] 06Release-Engineering-Team (Deployment-Blockers), 10MediaWiki-extensions-Translate, 05Release, 07Wikimedia-log-errors: Notice: Undefined index: 0 in /srv/mediawiki/php-1.28.0-wmf.9/extensions/Translate/tag/TranslatablePage.php on line XXX - https://phabricator.wikimedia.org/T139447#2434585 (10mmodell) Remo... [19:59:26] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T138555#2434623 (10mmodell) [20:08:04] !log beta: on db1 and db2 move the MariaDB 'syslog' setting under [mysqld_safe] section. Cherry picked https://gerrit.wikimedia.org/r/#/c/296713/3 and reloaded mysql on both instances. T119370 [20:08:05] T119370: Send deployment-db1 deployment-db2 syslog to beta cluster logstash - https://phabricator.wikimedia.org/T119370 [20:08:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [20:30:50] !log beta: restarted mysql on both db1 and db2 so it takes in account the --syslog setting T119370 [20:30:51] T119370: Send deployment-db1 deployment-db2 syslog to beta cluster logstash - https://phabricator.wikimedia.org/T119370 [20:30:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [20:35:13] 10Beta-Cluster-Infrastructure, 10DBA, 13Patch-For-Review, 07WorkType-NewFunctionality: Send deployment-db1 deployment-db2 syslog to beta cluster logstash - https://phabricator.wikimedia.org/T119370#2434798 (10hashar) Both instances show entries for mysql in their /var/log/syslog. The beta cluster logstas... 
[20:37:25] hashar hi, im getting this error http://gerrit-jenkins.wmflabs.org/job/test-gerrit/46/console [20:37:38] It clones but carn't find ref refs/zuul/** [20:38:27] paladox: because you are cloning from Gerrit [20:38:33] Yep [20:38:35] I fixed it [20:38:35] but the ref has been generated on the zuul merger [20:38:39] Oh [20:38:54] Im not sure how i do that [20:39:00] the zuul-merger process takes the patch, attempt to merge it on tip of the branch [20:39:01] since i used jenkins job builder [20:39:31] then report back to zuul-server which pass the info (merge commit / ref ) to Jenkins with $ZUUL_COMMIT $ZUUL_REF [20:39:39] so Jenkins has to fetch from the zuul-merger repo [20:39:39] Yep i have [20:39:41] those in jenkins [20:39:48] I mean parameters [20:39:50] in jenkins [20:39:52] with role::zuul::merger that should have started a git-daemon process [20:39:57] that exposes the git repos [20:40:05] Oh [20:40:33] apparently it is not around [20:40:58] role::zuul::merger [20:41:04] # Serves Zuul git repositories [20:41:05] class { 'contint::zuul::git_daemon': [20:41:05] zuul_git_dir => $role::zuul::configuration::merger[$::realm]['git_dir'], [20:41:06] } [20:41:15] Oh [20:41:24] I have installed that role [20:41:27] which role do you have applied for the zuul merger? [20:41:27] in gerrit-test [20:41:29] oh [20:41:31] role::zuul::merge [20:41:35] https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=dfd84b2b-3f9a-420e-8ac7-16cd63991189&project=git®ion=eqiad [20:41:43] guess it is missing a dir maybe [20:41:47] Oh [20:41:53] isent it /srv/zuul/git [20:41:55] ? [20:42:04] hmm [20:42:05] Yippee, build fixed! [20:42:05] Project selenium-Echo » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #77: 09FIXED in 1 min 3 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/77/ [20:42:19] paladox: I ran puppet and that did a bunch of changes eeeeeek [20:42:27] Oh [20:42:40] Yep it keeps reverting my changes done in /etc/apache2/ [20:42:44] and etc/zuul [20:42:54] for apache2 you can put them in a different file [20:43:03] for /etc/zuul etc [20:43:06] Oh [20:43:15] But then it keeps reverting it [20:43:20] that is because it uses the parameters from the puppet classes :( [20:43:22] after adding it to a seperate file [20:43:28] but maybe they can be overriden by hiera [20:43:32] We could remove the role after installing [20:46:25] paladox: https://phabricator.wikimedia.org/P3348 [20:46:29] paladox: did a copy [20:46:36] Ok thanks [20:46:56] so for gerrit_proxy [20:47:08] the redirect match / directory directives [20:47:17] they should be put in a file in /etc/apache2/sites-available [20:47:20] then a2ensite [20:47:22] to enable it [20:47:30] Yep, but then it reverts it [20:47:41] in sites-available ? 
[20:47:50] Nope in sites-e* [20:48:08] Since i was using test-red.conf [20:48:15] which is the file i created [20:48:20] grbmblblb [20:48:41] so yeah should use puppet everywhere :D [20:49:01] Yep [20:49:12] I ln -s ../sites*/test-red.conf test-red.conf [20:49:47] and maybe use gerrit::proxy [20:50:10] maybe ostriches has a nice puppet class to easily setup Gerrit on a labs instance with sane defaults [20:50:35] gerrit::instance [20:50:41] Will install gerrit::proxy and gerrit::jetty [20:50:46] Ok thanks [20:50:50] It should *mostly* work since my cleanup last week [20:50:51] do i do [20:50:53] role::gerrit::proxy [20:50:55] Might trip up on SSL a bit [20:50:59] oh [20:51:21] paladox: for zuul settings. Most are borrowed from modules/role/manifests/zuul/configuration.pp which is from before we had hiera() [20:51:30] Oh [20:51:38] role::gerrit::server should work [20:51:46] Ok thanks [20:51:58] Or doing gerrit::instance :) [20:52:03] paladox: and they are then passed in role::zuul::server to the zuul::server class. And I dont know whether hiera can override them [20:52:09] Probably the latter actually. Former is kinda prod specific. [20:52:15] The role still has some hardcoded prod stuff [20:52:19] Oh [20:52:20] But the rest should be fine [20:52:53] hashar ostriches [20:52:53] gerrit::proxy [20:52:58] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Reading data from Git failed: TypeError: Data retrieved from Git is String not Hash at /etc/puppet/manifests/realm.pp:68 on node gerrit-test.git.eqiad.wmflabs [20:52:58] Warning: Not using cache on failed catalog [20:52:58] Error: Could not retrieve catalog; skipping run [20:53:18] Yeah you need gerrit::instance. [20:53:26] Oh [20:53:27] It'll install gerrit::jetty & gerrit::proxy [20:53:29] You need both [20:53:35] So just do ::instance one [20:53:47] Everything is in hiera now too [20:53:53] Will that overwrite my current gerrit install [20:53:58] Yeah probably [20:54:16] Oh [20:54:27] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Reading data from Git failed: TypeError: Data retrieved from Git is String not Hash at /etc/puppet/manifests/realm.pp:68 on node gerrit-test.git.eqiad.wmflabs [20:54:27] Warning: Not using cache on failed catalog [20:54:27] Error: Could not retrieve catalog; skipping run [20:54:34] ositrcues ^^ does that still [20:54:41] puppetmaster died? [20:55:18] That seems like a busted puppet to me [20:55:22] I haven't tried my new work in labs [20:55:30] It was basically a massive no-op refactor for prod [20:55:42] Oh [20:56:34] paladox: git-daemon is running now [20:56:42] Yay thanks [20:56:51] paladox: but a bunch of the conf has been overriden by puppet ( output is https://phabricator.wikimedia.org/P3348 ) [20:57:01] Oh yep [20:57:04] i am doing that now [20:57:12] If i disable the role::zule [20:57:15] sorry :( [20:57:15] zuul [20:57:17] thcipriani, I'm thinking of doing live hacks on labs to debug - am I allowed to break beta-sync for a while? :D [20:57:28] class will that remove it [20:57:33] when we do puppet next [20:57:58] nop [20:58:04] Ok thanks [20:58:07] it is just that puppet will no more manage it [20:58:12] I have disabled it as it is installed. [20:58:15] i.e. not update whatever changes are made in puppet [20:58:22] Yep [20:58:27] we would want to migrate to use hiera instead of the configuration class [20:58:28] but agan [20:58:31] As a way to prevent any breakage. 
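A sketch of the Hiera override approach hashar floats here, which gets worked out below in T139527: class parameters set on the instance project's Hiera page (Hiera:Git on wikitech) win over the values hard-coded in role::zuul::configuration. The keys follow the zuul::server / zuul::merger class parameters; the values are only illustrative for a labs test instance:

```yaml
# Hypothetical Hiera overrides for a labs Zuul test instance; Puppet's
# automatic parameter lookup applies these to the zuul::server and
# zuul::merger classes.
zuul::server::gerrit_server: 127.0.0.1
zuul::server::gerrit_user: jenkins
zuul::server::gearman_server: 127.0.0.1
zuul::merger::gerrit_server: 127.0.0.1
zuul::merger::git_dir: /srv/zuul/git
```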
[20:58:32] Yep [20:58:43] maybe the settings passed to zuul::server can be overriden via hiera() [20:58:48] Oh [20:58:49] would need to test it [20:58:52] Yep [20:59:14] MaxSem: sure, you might end up fighting with jenkins a bit. Might be easier to use x-wikimedia-debug + mw1017 [20:59:50] I prevously just chowned it to root. ppl didn't like :P [20:59:52] hashar i still get [20:59:53] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Reading data from Git failed: TypeError: Data retrieved from Git is String not Hash at /etc/puppet/manifests/realm.pp:68 on node gerrit-test.git.eqiad.wmflabs [20:59:53] Warning: Not using cache on failed catalog [20:59:53] Error: Could not retrieve catalog; skipping run [21:00:04] after disabling role::gerrit and zuul in puppet [21:00:08] and running the puppet run [21:00:28] paladox: the lame config file https://github.com/wikimedia/operations-puppet/blob/production/modules/role/manifests/zuul/configuration.pp [21:00:40] Oh [21:00:42] role::zuul::server is https://github.com/wikimedia/operations-puppet/blob/production/modules/role/manifests/zuul/server.pp [21:00:45] it spawn a git-daemon [21:00:51] and invoke the zuul::server class [21:00:55] Oh :) [21:01:03] passing parameters read from the configuration class [21:01:03] eg [21:01:04] gerrit_server => $role::zuul::configuration::shared[$::realm]['gerrit_server'], [21:01:11] Oh [21:01:14] so MAYBE hiera can be used to override them [21:01:18] Oh [21:01:27] Is that also set in zuul_merger [21:01:30] or zuul [21:01:32] conf files [21:01:38] but really, I have no idea how it can be done [21:01:42] Oh [21:01:56] maybe create the hash of has role::zuul::configuration::shared in the hiera page and see what happens [21:02:08] but really [21:02:15] Oh [21:02:23] role::zuul::configuration should be phased out and replaced by hiera. That is worth filling a task [21:02:25] how do you do the hash of role::zuul [21:02:28] please [21:02:31] yep [21:02:53] that would be some kind of yaml [21:03:29] hashar https://phabricator.wikimedia.org/T139527 [21:03:30] 10Continuous-Integration-Infrastructure, 06Labs, 10Zuul: role::zuul::configuration should be replaced by hiera - https://phabricator.wikimedia.org/T139527#2434874 (10Paladox) [21:04:14] 10Continuous-Integration-Infrastructure, 10Zuul, 07Puppet: role::zuul::configuration should be replaced by hiera - https://phabricator.wikimedia.org/T139527#2434888 (10hashar) [21:05:04] hashar how do you do maybe create the hash of has role::zuul::configuration::shared in the hiera page and see what happens please [21:05:56] 10Continuous-Integration-Infrastructure, 10Zuul, 07Puppet, 07Technical-Debt: role::zuul::configuration should be replaced by hiera - https://phabricator.wikimedia.org/T139527#2434874 (10hashar) role::zuul::configuration is really just a hash of hash of settings that are used by the role class to invoke th... [21:06:10] paladox: need some yaml magic hyeah [21:06:14] Yep [21:06:21] hashar i tryed building zuul today [21:06:28] first time doing dpkg packaging [21:06:32] for debian/jessie [21:07:37] but kept failing. 
The errors were not obvous, the /debian/rules has tabs not spaces [21:08:59] hashar is it something like https://wikitech.wikimedia.org/wiki/Hiera:Git [21:09:28] Yay it works now [21:10:11] 10Continuous-Integration-Infrastructure, 10Zuul, 07Puppet, 07Technical-Debt: role::zuul::configuration should be replaced by hiera - https://phabricator.wikimedia.org/T139527#2434900 (10hashar) Maybe something like: ``` lang=yaml role::zuul::configuration::shared: labs: gerrit_server: fooobar.wmflab... [21:10:16] paladox: maybe https://phabricator.wikimedia.org/T139527#2434900 [21:10:22] Oh [21:10:23] it worked [21:10:28] It ran puppet [21:10:30] OH REALLY [21:10:33] Yep [21:10:43] hashar i did https://wikitech.wikimedia.org/wiki/Hiera:Git [21:10:43] can you copy paste that Hiera:Git content to the task ? [21:10:48] Ok yep [21:10:55] so apparently hiera override whatever variables are in the puppet manifests [21:10:59] it is great [21:11:27] Yep [21:11:28] :) [21:11:28] 10Continuous-Integration-Infrastructure, 10Zuul, 07Puppet, 07Technical-Debt: role::zuul::configuration should be replaced by hiera - https://phabricator.wikimedia.org/T139527#2434901 (10Paladox) I did this "zuul::server::gerrit_server": 127.0.0.1 "zuul::server::gerrit_user": jenkins "zuul::server::gearma... [21:11:42] hashar im not sure if it works though, i wonder how to test now [21:11:45] git daemon [21:11:51] so you can tweak them until you end up with good zuul-server.conf and zuul-merger.conf [21:11:53] progresses [21:12:03] Oh yay [21:12:30] modules/role/manifests/zuul/merger.pp [21:12:36] Oh [21:12:41] it also invoke class { 'contint::zuul::git_daemon': [21:12:42] zuul_git_dir => $role::zuul::configuration::merger[$::realm]['git_dir'], [21:12:46] which set up a git-daemon [21:13:02] Oh [21:13:03] so that Jenkins jobs can clone and fetch the zuul-merger patches [21:13:07] Oh [21:13:11] is the git daemon setup [21:13:19] note the parameter [21:13:23] I will ne to go re edit those zuul-* file again [21:13:35] Yep i am using wikimedia jenkins job builder [21:13:41] the same value is used for the git_dir setting in zuul-merger.conf [21:13:45] Oh [21:13:53] Yep is that /srv/zuul/ [21:13:57] cause you want git-daemon to serve the files that zuul-merger crreates [21:13:58] yeah [21:14:01] :) [21:14:04] /srv/zuul/ [21:14:12] that looks good enough [21:14:13] so [21:14:20] zuul-scheduler receive a patch for a repo [21:14:23] Ok [21:14:27] it triggers a job merger:merge [21:14:33] that is processed by a zuul-merger daemon [21:14:34] I will need to re do the ssh key [21:14:41] and update the .conf files again [21:14:48] the daemon attempt to clone the repo from gerrit under /srv/zuul/ [21:14:58] hashar: re: SoS call for participants, want to throw a calendar invite at me for friday or so? maybe you / me / thcipriani / and some other opsen [21:15:04] can work through things [21:15:30] zuul-merger then fetch the patch in Gerrit, attempt to merge it and tag a reference with /refs/zuul/master/Zxxxxxx . Then reply back to zuul-server with the merge commit sha1 / referecen [21:15:41] Oh yay let's test [21:15:49] I am editing the apache conf [21:15:55] to add the missing bits back [21:15:56] :) [21:16:02] chasemp: re gallium replacement or re CI/nodepool resource needs? [21:16:14] paladox: the parameters are then passed to Jenkins which can clone from the zuul merger via git clone $ZUUL_URL/$ZUUL_PROJECT && git fetch $ZUUL_REF && git checkout -f $ZUUL_COMMIT (something like that) [21:16:25] Oh yep [21:16:29] chasemp: ah great!!!!!!! 
;-) [21:16:33] I think the SoS was the first but it seems like a good segue for me to get a better handle on teh second too [21:16:39] I mainly need context etc [21:16:52] hashar woops i accidentally linked to a .json file [21:16:55] in apache [21:17:22] chasemp: I guess we will et some agenda set up tomorrow an sent to all. But yeah context is definitely needed, then we have open questions / interrogations. [21:17:27] I have a mental model that is most likely a year old, combined with what seems like a lot of hot potato and adjusted plans [21:18:02] sure, formal or informal all is well to me [21:18:08] thcipriani: still around ? [21:18:12] hashar: yup [21:18:39] I'm on somewhat dimished internets so it may be better as an irc meeting, however completely willing to try hangout [21:18:47] (un related but we will need to rename npm tests to use the number 4 instead of 4.3. [21:18:48] thcipriani: following SoS chase seems to have been designed to help on the phase out gallium hot potato :} [21:18:55] Instead of having to rename it everytime [21:19:16] I can at least attempt to help close the gap whatever that may be [21:19:34] chasemp: thcipriani can we do that check in tomorrow on your mornings? [21:19:51] tomorrow works better for me than Friday [21:20:09] (considering I'm out Friday :)) [21:20:14] greg-g on the CI resourcing stuff I'm not sure what to make of things yet, is there a task or outline on where you guys intend to be afa depending on labs usage growth? [21:20:21] i.e. the 10 back to 20 to x and why [21:20:47] I don't have a handle on the underlying reasoning but I feel like you guys may have your eye on a scenario never considered fully from our side [21:20:49] and that's scary [21:21:01] chasemp: well, to be honest, we asked :) [21:21:26] about growth or help? [21:21:29] last year, we asked about using labs for nodepool instances and were told "don't worry about it" [21:21:33] growth [21:21:48] ah yes well but a year is a long time and that wasn't a very good answer on our side [21:21:51] but... now we're here, and we should tell you what our pool size should be for what reasons [21:22:20] but, I just want to have separate meetings about these things [21:22:21] I honestly don't mean this catty but the spoiler is we do need to worry as it's very finite [21:22:24] yeah [21:22:32] chasemp: thcipriani: would 14:00 UTC tomorrow work or is that too early? else 15:30 UTC. [21:22:35] 1 is gallium replacement, 2 is nodepool pool size/labs needs [21:22:58] where is that gallium task? [21:23:08] even better: https://phabricator.wikimedia.org/project/view/1966/ [21:23:17] chasemp: thcipriani: thinking about 15 minutes context pres by myself , then tyler nice summary of open question. Then conclude with looping in more folks and another checkin next week [21:23:40] I could do 14:00 UTC [21:23:54] I can't commit to a weekly yet but let's see where this ends up and maybe shoot for biweekly [21:23:55] but yeah [21:24:20] no need to be a weekly [21:24:35] chasemp: not just a task, we have a workboard https://phabricator.wikimedia.org/project/view/1966/ :) [21:24:37] I can't tomorrow at 10 a.m. CDT which is where my brain lands 1400 UTC [21:24:47] but I could an hour later [21:24:48] 10releng-201516-q3, 10Continuous-Integration-Infrastructure (phase-out-gallium), 07Jenkins: [keyresult] Migrate Jenkins to Jessie (gallium -> cobalt) - https://phabricator.wikimedia.org/T124121#2434949 (10greg) 05Open>03Invalid (this was a meta task for this work in q3, it is now redudant with T95757, cl... 
[21:24:56] that is literally the one meeting already booked [21:25:21] ah I am so wrong with the tz :D [21:25:35] I am -5 at CDT afaik [21:25:40] so is thcipriani I think? [21:25:46] ah so 11 [21:25:50] man oh man ok [21:25:51] so [21:25:55] yeah that works :) [21:25:58] I'm in mountain timezone [21:26:03] hipster [21:26:42] -06:00 for life [21:26:49] (or until daylight savings) [21:27:01] which ever comes first [21:27:04] so it's 10 am for you and 11 for me [21:27:05] seems good [21:27:15] I'll even be dressed by then [21:27:46] thcipriani: chasemp I have sent an invite hoping Google calendar sort it out for us [21:27:53] ok neat [21:28:04] I think 14:00UTC would be 8 am for me [21:28:11] if that does not work / it is too early, we can move it after chase other meeting. Aka one hour and a half later [21:28:37] so the question is, how much do you guys want to do config wise? I mean I can be an ops buddy and try to direct to a person or +2 as I understand whats up [21:28:50] config wise I am not too worried [21:28:54] but I won't have time to figure out whatever new varnish thing for foo/a foo/b [21:29:05] it is all in puppet, beside jenkins conf that would need to be rsynced [21:29:32] it is more about where to put each of the components from gallium which is currently just a huge mess of several services [21:29:42] ok fair I think I can help w/ that reasoning [21:29:46] varnish misc I can handle it [21:30:02] and the puppet config changes tyler and I can definitely handle them. It should be all about changing few ips [21:30:21] and I think we can set up all the services in parallel then atomically switch when ready [21:30:27] ok just wanting to set an expectation there [21:30:41] I have not much time in all honesty but I'm around [21:30:52] till the morrow then [21:30:56] with a bit of sidebar [21:31:02] what we are looking for is really a validation of the architecture and clear out assumption we have regarding the labs support network / what can land in there and the network flow / matrix [21:31:02] re: CI and nodepool [21:31:18] understood I think [21:31:25] back to the other thing :) [21:31:26] I will prep a very short pres / summary with an ether pad [21:31:26] hashar im almost finished updating the conf [21:31:27] :) [21:31:31] so you're at 10 concurrent nodes I think right [21:31:41] does that mean atm w have 100 tests and it's running 10 at once? [21:31:59] paladox: awesome! ostriches looks like we have a Gerrit/Zuul/CI test ground to prepare Zuul / Gerrit upgrade ;} [21:32:07] Yep [21:32:15] chasemp: yup [21:32:17] We can also test with phabricator [21:32:22] with phab-01 [21:32:37] chasemp: Zuul queue the jobs requests. They are honored as instances are spawned in the labs. 
[21:32:45] hashar: thcipriani ok so a few concerns I have seen and this isn't a beat up you guys thing or anything but we know that we have a DNS leak issue [21:32:49] and CI is like 99% of it [21:32:54] and then take a look at something like this [21:33:01] audit of resources by project [21:33:08] look at bastion [21:33:08] | bastion | 4 | 5505028.47 | 2688.0 | 53760.04 | [21:33:14] chasemp: the idea from back in november was to start with a small pool and raise it up as we migrate from the permanent slaves (integration labs tenant) to nodepool instances [21:33:16] 4 instances, ram, yadda yadda [21:33:18] now look at ci [21:33:28] | contintcloud | 27456 | 34764134.5 | 16974.68 | 339493.5 | [21:33:30] ah yeah the DNS leaks [21:33:37] bunch of PTR stick around [21:33:38] AFA nova thinks we have 30k instances almost in CI [21:33:48] because I think that there isa bug somewhere or an issue w/ race [21:33:55] nova or rabbit or something is not keeping up with CI in general [21:34:02] and it's an issue we keep seeing [21:34:08] which is not only from the contintcloud project. But since we have spawned 170k instances contintcloud exacerbate the leak [21:34:49] chasemp: so all those resources add up stress to the openstack infra? [21:35:26] well, I don't really get what's going on, but the primary difference is instance spin/up teardown logic and rate [21:35:46] and that seems to have some gaps catching on dns setup/teardown and maybe confusing internal usage [21:35:50] hashar i forgot how to ssh into gerrit to add the ip to know_host [21:36:06] this is all theory but is meant to convey, this is new waters and your the only ones in it [21:36:07] chasemp: is that correct that when creating an instance, that triggers the creation of the DNS entries? [21:36:14] or is that a separate step? [21:36:24] cause I dont think Nodepool ask for creation/deletion of DNS entries [21:36:29] well...there is a hook in designate that says to do it [21:36:38] whenever an instance comes into being [21:36:41] so you don't ask for it [21:36:48] but you get it and then maybe not get it [21:36:53] and maybe get it and don't get it removed [21:36:54] so one ask for /v2/server/create , and internally the hook triggers to have designate to add the instance? [21:37:01] yes [21:37:03] maybe we can skip the DNS update for contintcloud project [21:37:19] I dont think we rely on the instance IP to have proper DNS entries [21:37:23] if that can help / is doable [21:37:42] (task is https://phabricator.wikimedia.org/T115194 ) [21:37:50] I mean, maybe? it seems first we should figure out waht's the deal as I would imagine this isn't the only oddity [21:37:55] but anyways [21:38:40] I'm sorry you guys felt like you had the go ahead and now here we are, tbf to us a year and mutliples of growth are revising factors I guess [21:39:07] but either way, the thing as it is puts more load on teh system than pretty much all other operations outside of compute itself :) [21:39:16] it always have been very clear to me that labs resources are scare / limited [21:39:17] combined even [21:39:37] and you and andrew firmly repeated all of last year that bumping the pool needs to be done very carefully with due notice a head of time [21:39:38] hashar ok ready now [21:39:40] it is all fine to me [21:39:48] one question I don't get is why 20 or 10 now [21:39:57] as in why not 11 or 12 or 25 [21:40:00] I clearly understand that we can't suddenly ask for 200 more instances (or whatever x number) [21:40:05] how did we arrive at this concurrency? 
[21:40:11] we started with 10 [21:40:23] I eventually filled a request for 20 since I was migrating a buch of jobs [21:40:39] but why 20? [21:40:41] and from the # of jobs and their duration my rough queue model estimated that 20 would be nice [21:40:41] hashar http://gerrit-jenkins.wmflabs.org/job/test-gerrit/47/console it works but fails [21:40:45] ah [21:40:46] 21:39:20 > git rev-parse origin/master^{commit} # timeout=10 [21:40:46] 21:39:20 ERROR: Couldn't find any revision to build. Verify the repository and branch configuration for this job. [21:40:46] 21:39:20 Finished: FAILURE [21:40:49] with some delay during peak hours [21:41:04] then we have a bunch of over jobs to migrate and I filled another task to raise it from 20 to 40 [21:41:23] right but...why 40? [21:41:23] but that is too much stress for labs infra and it is pending new hardware [21:41:25] is it a like for like? [21:41:30] which is a perfectly legitimate reason [21:41:30] are you migrating 20 jobs i.e. 20 to 40? [21:41:48] it is more in term of build we want to run in parrallel [21:42:12] a mediawiki/core change for example triggers ten jobs [21:42:23] how many jobs do we run a day now? [21:42:25] so if you get five patches to mediawiki/core that is 50 jobs or 50 instances [21:42:35] if we get 10 instances, we process 1 change [21:42:41] hashar http://gerrit-jenkins.wmflabs.org/job/test-gerrit/47/console [21:42:43] with 50 instances we process all 10 changes in parralle [21:43:10] sure gotcha I get hte gist of the math on queue reduction [21:43:13] https://grafana.wikimedia.org/dashboard/db/releng-zuul?panelId=20&fullscreen [21:43:20] and since consuming instances is an heavy operation, all of last year we have aggregated the jobs so that a change triggers less jobs and thus consume less instanes [21:43:25] but I'm not getting the target for increase [21:43:31] are we trying ot keep queue length at x? [21:43:35] number of builds per job type per day ^^ (grafana link) [21:43:48] that's nice [21:43:55] yeah that one is the labels in Jenkins [21:44:15] though a build can be accounted more than once [21:44:26] so are all of those summed the total builds i.e. vms in use exclusively per day? [21:44:28] chasemp: generally, we want to reduce the time for merges/test results, lemme find that graph [21:44:39] if a build run on a slave having labels Foo and Bar. That will count +1 toward each of Foo and Bar [21:44:46] I get that but...are you targeting a build time or a queue length time? [21:45:07] the ci-jessie-wikimedia and ci-trusty-wikimedia re the labels used for Nodepools instances [21:45:14] how do we know we need more concurrency [21:45:18] thats the crux [21:45:28] say we add way more stuff and don't up our concurrency [21:45:30] what makes us say [21:45:33] and UbuntuTrusty is the label for the permanent slaves ( integration labs tenant) [21:45:34] we need more because x [21:45:41] hashar seems i am getting ERROR: Couldn't find any revision to build. Verify the repository and branch configuration for this job. [21:45:45] now [21:45:49] paladox: one second please [21:45:54] Ok sorry [21:45:54] if I'm looking at https://grafana.wikimedia.org/dashboard/db/releng-kpis?panelId=2&fullscreen [21:46:03] so for a random day I get 900 build on Nodepool instances vs 800 on Permanent Trusty slave. 
[21:46:05] I /think/ that includes the drop from 20 to 10 there
[21:46:11] which led me to believe we want to double the pool
[21:46:13] but I don't see a meaningful change etc
[21:46:27] paladox: busy sorry
[21:46:32] Ok
[21:46:55] so I'm wondering how to interpret the end results of the 20 to 10 change
[21:46:59] where are we feeling that pain and how much
[21:47:10] chasemp: that graph is the time a MediaWiki change stays in Zuul from the time the event is received from Gerrit until the time the change is submitted (merged)
[21:47:38] so that is only for CR+2
[21:47:42] so is that time to queue or time to test start?
[21:47:45] and that has high priority
[21:47:46] eg
[21:47:55] Zuul will process the CR+2 before everything else
[21:48:03] the rest waits
[21:48:06] so you are saying that is artificially good no matter what
[21:48:12] so where are the ones who get bumped :)
[21:48:27] that is what KPIs are for ? ;-}
[21:48:32] keep them green!
[21:48:46] https://grafana.wikimedia.org/dashboard/db/releng-zuul?panelId=18&fullscreen
[21:49:11] that one shows the maximum time waiting for an instance to be available to run a job
[21:49:20] a metric for each of the two images we have (trusty vs jessie)
[21:49:40] over 7 days there are two spikes
[21:49:43] we've been increasing since yesterday... will that increase stop increasing?
[21:50:03] one on July 4th which corresponds to rabbit-mq dying / libvirt having a PCI bus error
[21:50:21] it may imply at 10 we are not processing jobs as fast as they are being produced, but then again
[21:50:25] too small of a sample to know
[21:50:31] and the other from ~ 24 hours ago which is libvirt1011 (the sole host in the compute pool) which lost network
[21:50:43] solved when the other libvirts were confirmed to work fine and added back
[21:51:10] right so excluding both of those as anomalies etc
[21:51:28] ci-jessie seems to be handling things possibly, but is ci-trusty in the beginnings of a death spiral?
[21:51:42] that metric is only available since late May. Should be made a KPI eventually
[21:51:55] and that would be directly influenced by the # of instances we can consume in parallel
[21:53:03] we have fewer jobs on Trusty
[21:53:13] and fewer trusty instances immediately available
[21:53:25] so this is titled max launch wait per nodepool label; is this a graph of the longest waiting job in the queue at that time then?
[21:53:31] do we have an average and is that going up?
[21:55:26] max has been a few minutes (less than 4 ?) for pretty much all of June
[21:55:29] but
[21:55:42] migrating the rest of the jobs will most definitely make it worse
[21:55:54] cause more builds are going to be scheduled on the nodepool instances
[21:56:07] so the pool gets consumed faster and jobs will have to wait more
[21:56:22] so I have held the migration of the other jobs till the pool is raised
[21:56:30] that is more like trial and error than rocket science :(((
[21:56:39] idk, it looks pretty highly variable from what I see
[21:56:43] looking at 6/8 and now
[21:56:48] idk if something was up 6/8 as well
[21:56:52] I don't have a firm / robust mathematical model to properly estimate the # of instances needed
[21:57:08] sure, I get that and I'm not intending to bust chops about that
[21:57:13] what I am looking for is
[21:57:21] what is your "we are ok" max wait time
[21:57:26] what is your ok average wait time
[21:57:30] how do you know you are doing well?
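One way to turn "how do you know you are doing well?" into something checkable: pull the launch-wait series behind that grafana panel from Graphite and compare its average and maximum against agreed thresholds. The metric path and the threshold values below are placeholders, not the real series name or an agreed SLA.

```python
# Sketch of a "are we doing well?" check: fetch a launch-wait series and compare it
# against example thresholds. The metric path and thresholds are placeholders -- the
# real series backing the "max launch wait per nodepool label" panel would go here.
import json
import urllib.request

GRAPHITE = "https://graphite.wikimedia.org/render"
METRIC = "zuul.nodepool.launch_wait.ci-jessie-wikimedia.max"  # hypothetical path

def fetch_series(target, hours=24):
    url = f"{GRAPHITE}?target={target}&from=-{hours}h&format=json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return [v for v, _ts in data[0]["datapoints"] if v is not None]

def evaluate(samples, max_ok=240.0, avg_ok=60.0):
    """Return (ok, avg, max) for wait times in seconds against example thresholds."""
    if not samples:
        return False, 0.0, 0.0
    avg = sum(samples) / len(samples)
    worst = max(samples)
    return (worst <= max_ok and avg <= avg_ok), avg, worst

if __name__ == "__main__":
    ok, avg, worst = evaluate(fetch_series(METRIC))
    print(f"avg wait {avg:.0f}s, max wait {worst:.0f}s -> {'ok' if ok else 'over threshold'}")
```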
[21:57:39] gallium was dead
[21:57:48] that's not a real answer though
[21:57:59] unless it really is, in which case oh crap
[21:58:16] sorry
[21:58:24] I was referring to the 6/8 gap / peak
[21:58:29] hah ok
[21:58:38] there seems to be no methodology on when to adjust
[21:58:45] say you want to spray and pray on instance allocation
[21:58:46] ok then
[21:58:56] but how do you know to pull back as it's past the threshold of limited retuns?
[21:58:56] processing the other questions :D
[21:58:59] returns
[21:59:23] so for the "are we ok" times: we do not have any firm SLA
[21:59:39] can someone (e.g. ostriches, bd808) comment on T125031 and how it relates to the proposed migration from gerrit to diffusion/diffusion?
[21:59:40] T125031: PHP libraries as Gerrit top-level projects - https://phabricator.wikimedia.org/T125031
[21:59:45] for most of CI's life it has been "do our best to keep it under a time when devs start complaining"
[21:59:51] aside from SLAs I just mean between labs and releng
[22:00:02] see, that doesn't work though when we start talking between-teams service offerings
[22:00:16] I have been putting off creating new libraries in gerrit until it is resolved, and I can't do that for much longer
[22:00:35] I suppose the idea of a big flat namespace at the top level is to follow the github convention?
[22:02:04] I don't really care what the resolution is so much as just having a resolution
[22:02:31] twentyafterfour: ^ T125031
[22:02:32] T125031: PHP libraries as Gerrit top-level projects - https://phabricator.wikimedia.org/T125031
[22:02:33] chasemp: I guess I will have to loop back with the rest of the releng team and figure out acceptable wait times / average times etc
[22:03:10] hashar: it's cool, it's not a right-this-moment consideration, but we need to figure that out I think to collaborate on this
[22:03:21] chasemp: I don't have a good answer now, besides trying to keep the CR+2 patches landing in a reasonable time
[22:03:26] 06Release-Engineering-Team, 10ArchCom-RfC, 06Developer-Relations, 06WMF-Legal, 07RfC: Create formal process for CREDITS files - https://phabricator.wikimedia.org/T139300#2426179 (10Krinkle) **Research** jQuery * AUTHORS.txt file, containing all commit authors ([link](https://github.com/jquery/jquery/blo...
[22:03:33] when people complain, it's really just all the time for us :)
[22:03:44] I have noticed :((((
[22:04:14] you guys are really in a bad position: at the end of all the pipelines and having to make sure the water flows properly through all those pipes
[22:04:41] greg-g: guess we will need to find out some new KPIs to track for CI :-}
[22:05:02] nah it's ok, that wasn't a sideways complaint at all; I meant it more like, let's figure out where our thresholds are for this so that we can both speak the same language
[22:05:17] atm the only real bad thing I know for sure is, if the queue isn't ever going down it's bad
[22:05:23] but even that could be over the course of an hour or a day, who knows
[22:05:32] and that's not a good thing we can share
[22:05:48] that is also because we have 10 m1.large instances handling roughly half the CI workload
[22:06:01] (the "integration" tenant)
[22:06:19] the aim is to get rid of all those permanent instances and shift the load to the "contint" tenant (the nodepool instances)
[22:06:35] so we are kind of in the middle of the bridge
[22:06:48] right
[22:06:54] even more reason to figure out what good looks like :)
[22:07:09] back in spring I was hoping labs infra would have enough capacity to handle both in parallel
[22:07:18] with a spike of resource allocation while jobs are moved
[22:07:28] but it is bad timing in the end :-(
[22:07:33] I don't know that this will happen anytime soon
[22:07:55] we have explosive growth to manage of our own accord, but either way it's a long-tail problem to reduce things
[22:08:12] yeah, I fully understand that
[22:08:29] there is at least one sure thing
[22:08:38] labs infra keeps getting better and better overall
[22:09:59] we were talking about our cycle of masochism: we keep trying to make it more stable, making people want to do more, making it less stable
[22:10:17] chasemp: we have 15 Trusty instances with 8 GBytes; most of them can be phased out ;-}
[22:10:30] in the integration project?
[22:10:42] + a 16 GBytes one for Android testing
[22:10:48] and a few more 4 GBytes ones
[22:10:57] actually maybe I can start dropping some already
[22:11:39] can you make a list
[22:11:41] and I can drop them
[22:11:58] usage of those Trusty instances https://integration.wikimedia.org/ci/label/UbuntuTrusty/load-statistics?type=hour
[22:12:00] be sure though, no coming back
[22:12:19] creating a task
[22:12:33] tomorrow I will prep the pres for our quick meeting
[22:12:40] hashar Hi, (whenever you're ready or free) how do I create the .deb package for https://phabricator.wikimedia.org/diffusion/CIZU/browse/debian%252Fjessie-wikimedia/
[22:12:45] if I get time I will further analyse the demand and probably just drop a few of them
[22:12:45] i'm running it but getting errors
[22:13:14] debian/rules:32: recipe for target 'override_dh_virtualenv' failed
[22:13:25] TimStarling: Responded.
[22:13:25] hashar: great, and thanks for taking my questions to heart, I think I understand more now
[22:16:20] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Drop some Trusty permanent slaves from integration labs project - https://phabricator.wikimedia.org/T139535#2435262 (10hashar)
[22:16:35] chasemp: filed https://phabricator.wikimedia.org/T139535 to get rid of some 8 GBytes instances ;}
[22:16:50] chasemp: thanks for asking questions!!!!
[22:17:02] chasemp: that has been very helpful last year for the Nodepool project
[22:17:20] and looks like I am going to be able to reclaim some memory from labs quite easily!
[22:17:41] will get you guys some more room / time
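A sketch of how the "list of Trusty instances to drop" (T139535) could be assembled from the same Jenkins data as the load-statistics link above, using the standard /computer/api/json endpoint. The field names assume a stock Jenkins payload and may differ by version.

```python
# Sketch: list UbuntuTrusty-labelled Jenkins slaves and whether they are currently idle,
# as raw material for deciding which permanent instances to phase out (T139535).
import json
import urllib.request

JENKINS = "https://integration.wikimedia.org/ci"

def trusty_slaves():
    url = f"{JENKINS}/computer/api/json?tree=computer[displayName,idle,offline,assignedLabels[name]]"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    for node in data["computer"]:
        labels = {label["name"] for label in node.get("assignedLabels", [])}
        if "UbuntuTrusty" in labels:
            yield node["displayName"], node["idle"], node["offline"]

if __name__ == "__main__":
    for name, idle, offline in trusty_slaves():
        state = "offline" if offline else ("idle" if idle else "busy")
        print(f"{name}: {state}")
```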
[22:23:41] Looks like viewing of refs/meta/config is broken in Gerrit
[22:23:45] Links to Diffusion, which is unable to view it
[22:24:13] The redirect might not work
[22:24:17] There *is* a way to see it though
[22:24:18] Hmm
[22:24:25] Lemme finish this other thing and I'll look
[22:26:56] Krinkle know problem
[22:27:00] *known
[22:27:11] https://phabricator.wikimedia.org/r/browse/WrappedString;refs/meta/config;project.config
[22:27:11] Krinkle it is viewable but you need to add the ref on
[22:27:36] Krinkle need to do something like
[22:27:44] https://phabricator.wikimedia.org/r/browse/WrappedString;refs/meta/config;project.config;{COMMIT_ID}
[22:27:50] paladox: for the deb package, you need to jump into the Debian packaging toolchain. We might have some doc on wikitech, but honestly it is too late to touch that topic :D
[22:28:04] Ok
[22:28:16] hashar could you quickly help me if you have time with
[22:28:19] the jenkins error please
[22:28:29] http://gerrit-jenkins.wmflabs.org/job/test-gerrit/47/console
[22:29:00] paladox: for .deb packaging there is the puppet class package_builder that sets up everything needed, and a README.md explaining how to use it to build packages
[22:29:07] paladox: but really, that is far from trivial :/
[22:29:10] Oh
[22:30:04] 06Release-Engineering-Team, 15User-greg: Determine timing of 2016 RelEng team offsite - https://phabricator.wikimedia.org/T137720#2435350 (10greg) a:05Ettorre>03greg
[22:30:07] Krinkle https://phabricator.wikimedia.org/diffusion/GWST/browse/refs%252Fmeta%252Fconfig/project.config;3e37ab85871b036a382815158dee35c56c4db946
[22:30:09] paladox: for the jenkins error, it seems you have a bunch of legit parameters http://gerrit-jenkins.wmflabs.org/job/test-gerrit/47/parameters/
[22:30:13] Yep
[22:30:14] try to reproduce manually ?
[22:30:18] Ok
[22:30:24] git clone git://localhost/testing/test
[22:30:27] Ok
[22:30:31] git fetch refs/zuul/master/Z551fcfcbccc54c65b02b473c15c4c03a
[22:30:31] thanks
[22:30:37] git checkout 358bcdd26001a32a2641c0738d971c1fcd97dfa6
[22:30:47] most probably the Git configuration in the Jenkins job is wrong / off
[22:30:54] or not using the proper $ZUUL_x params
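hashar's reproduce-it-manually suggestion, written out as a small script using the example ref and commit quoted above. If the clone itself fails (as it does for paladox below), the problem is on the zuul-merger / git-daemon side rather than in the job's git configuration; the URL, ref and commit here are specific to this gerrit-test setup.

```python
# Reproduce by hand what the Jenkins git step should be doing, using the example
# values quoted above. Each step narrows down where the job is tripping up.
import subprocess
import tempfile

REPO = "git://localhost/testing/test"                               # zuul-merger copy
ZUUL_REF = "refs/zuul/master/Z551fcfcbccc54c65b02b473c15c4c03a"     # example from above
ZUUL_COMMIT = "358bcdd26001a32a2641c0738d971c1fcd97dfa6"            # example from above

def run(*cmd, cwd=None):
    print("$", " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

if __name__ == "__main__":
    workdir = tempfile.mkdtemp(prefix="zuul-repro-")
    run("git", "clone", REPO, workdir)                      # step 1: can we even clone?
    run("git", "fetch", "origin", ZUUL_REF, cwd=workdir)    # step 2: does the Zuul ref exist?
    run("git", "checkout", ZUUL_COMMIT, cwd=workdir)        # step 3: is the commit reachable?
    print("manual reproduction succeeded; suspect the job's git plugin settings instead")
```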
[22:31:23] hashar
[22:31:24] git clone git://localhost/testing/test
[22:31:28] zuul@gerrit-test:/srv/zuul/git$ rm -rf test*
[22:31:28] zuul@gerrit-test:/srv/zuul/git$ git clone git://localhost/testing/test
[22:31:28] Cloning into 'test'...
[22:31:28] fatal: remote error: access denied or repository not exported: /testing/test
[22:32:11] :-(
[22:32:20] Krinkle: https://phabricator.wikimedia.org/diffusion/GWST/browse/refs%252Fmeta%252Fconfig/project.config;refs/meta/config
[22:32:25] (Works, but is ugly as heck)
[22:32:39] Actually, https://phabricator.wikimedia.org/diffusion/GWST/browse/HEAD/project.config;refs/meta/config works too
[22:32:52] paladox: Jul 6 22:31:07 gerrit-test git-daemon[1766]: '/srv/zuul/git/testing/test' does not appear to be a git repository
[22:32:56] the ref in between browse and the filename only matters if it's a branch or tag :)
[22:33:01] Oh
[22:33:03] paladox: tail -F /var/log/daemon.log
[22:33:07] while you test it
[22:33:08] Yep
[22:33:10] Ok
[22:33:57] paladox: and look also at /var/log/zuul/zuul-merger.log
[22:34:06] Ok
[22:34:18] Resetting repository /srv/zuul/git/testing/test
[22:34:25] but /srv/zuul/git is empty :(
[22:34:29] Oh
[22:34:40] I deleted the testing repo
[22:34:45] so yeah
[22:34:46] in /srv/
[22:34:51] Jenkins can't fetch from it
[22:34:55] trigger a change again
[22:34:59] Ok
[22:35:01] zuul-merger will recreate it
[22:35:10] 06Release-Engineering-Team, 15User-greg: Document tech leads for RelEng projects - https://phabricator.wikimedia.org/T139539#2435378 (10greg)
[22:35:21] Ok
[22:35:29] paladox: and maybe use a real project name instead of testing/test :-}
[22:35:30] hashar http://gerrit-jenkins.wmflabs.org/job/test-gerrit/48/console
[22:35:41] Yep, i will once i get everything working :)
[22:35:47] 06Release-Engineering-Team, 15User-greg: Identify RelEng projects 'worthy' of a tech lead - https://phabricator.wikimedia.org/T139540#2435395 (10greg)
[22:35:58] /srv/zuul/git/testing/test exists now
[22:36:42] paladox: most probably the git plugin lacks some parameter such as the branch to build
[22:36:49] Oh
[22:37:15] paladox: Resetting repository /srv/zuul/git/testing/test
[22:37:17] grr
[22:37:24] paladox: http://docs.openstack.org/infra/zuul/launchers.html?highlight=jenkins#jenkins-git-plugin-configuration
[22:37:30] that hints at how to configure the git plugin in the job
[22:37:40] Oh
[22:37:40] or look at jjb/macro-scm.yaml
[22:37:45] Ok
[22:37:56] the branch specifier is what is going to be checked out
[22:38:17] the ref spec is what to fetch ( eg: refs/zuul/master/Z123456789 )
[22:38:17] Yep i need to set defaults
[22:38:55] and the branch specifier has to be set to $ZUUL_COMMIT
[22:39:11] I am off
[22:39:22] chasemp: kudos and thank you a ton for all the questions ;-)
[22:39:37] paladox: I am heading to bed. Congrats on the Hiera:git trick !!!
[22:40:01] Ok
[22:40:04] You're welcome :)
[22:40:18] Would you be able to help me some more tomorrow when you have time please
[22:40:20] hashar ^^
[22:40:32] not at all
[22:40:39] got a bunch of documents to fill in tomorrow
[22:40:56] and I am not working tomorrow evening
[22:41:00] Ok
[22:41:21] you can try polishing up the hiera hack until puppet is all happy
[22:41:28] Ok
[22:41:34] sort out the jenkins issue
[22:41:37] Yep
[22:41:43] maybe you can try with plain git commands instead of the Jenkins git plugin
[22:41:48] or even use zuul-cloner
[22:41:51] Oh
[22:41:55] Yep
[22:42:12] which is looking at the ZUUL_ parameters to do the right thing most of the time
[22:42:22] ;-)
[22:42:27] I'm going to re
[22:42:30] regenerate
[22:42:34] jobs now
[22:42:39] using jenkins job builder
[22:43:00] we can even add them to integration/config , maybe in a different directory or in a branch
[22:43:12] I am asleep now!
[22:43:23] Yep
[22:43:25] Ok bye
[22:43:26] :)
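For reference, the git plugin settings discussed above (refspec = $ZUUL_REF, branch specifier = $ZUUL_COMMIT) boil down to roughly the following checkout step driven by the ZUUL_* parameters; zuul-cloner, mentioned as the alternative, does this plus more fallbacks for you. The fallback to ZUUL_BRANCH when no ref is supplied is an assumption about what a job would want, not a description of zuul-cloner's exact behaviour.

```python
# A checkout step driven by Zuul's standard job parameters, roughly equivalent to
# configuring the Jenkins git plugin with refspec $ZUUL_REF and branch $ZUUL_COMMIT.
import os
import subprocess

def zuul_checkout(workspace="src"):
    base = os.environ["ZUUL_URL"]          # e.g. the zuul-merger URL, git://gerrit-test/
    project = os.environ["ZUUL_PROJECT"]   # e.g. testing/test
    ref = os.environ.get("ZUUL_REF")       # e.g. refs/zuul/master/Z1234...
    branch = os.environ.get("ZUUL_BRANCH", "master")

    subprocess.run(["git", "clone", f"{base.rstrip('/')}/{project}", workspace], check=True)
    if ref:
        # A Zuul-triggered build: fetch the speculative merge ref and pin the commit.
        subprocess.run(["git", "fetch", "origin", ref], cwd=workspace, check=True)
        subprocess.run(["git", "checkout", os.environ["ZUUL_COMMIT"]], cwd=workspace, check=True)
    else:
        # Manually triggered build with no Zuul ref: just build the branch tip (assumed behaviour).
        subprocess.run(["git", "checkout", branch], cwd=workspace, check=True)

if __name__ == "__main__":
    zuul_checkout()
```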
[22:43:59] ostriches: Hm.. ok. Can we fix the Gerrit pointers?
[22:44:06] hashar http://gerrit-jenkins.wmflabs.org/job/test-gerrit/1/console
[22:44:14] Seems to happen again
[22:44:20] Krinkle: Probably needs some mangling in the redirect script to handle that ref better.
[22:44:22] :-(
[22:44:28] Since it's not a standard branch/tag name
[22:44:29] ostriches: Short of cloning locally (which involves even more hacks to see this weird branch), there is basically no normal way to view this right now.
[22:44:47] It should be fixable
[22:44:48] And makes ACL editing hard, without history or even basic content viewing.
[22:44:50] Cool :)
[22:44:58] paladox mentioned it was known. Is there a task?
[22:45:07] Krinkle yep florian filed it
[22:45:10] https://phabricator.wikimedia.org/T137354
[22:45:10] I will go find it
[22:45:23] Yep, that's it
[22:45:39] Krinkle if you switch on the github mirror on the repo you're using
[22:45:45] you can view refs/meta/ on github now
[22:46:49] paladox: No, they are not.
[22:46:57] Yes they are
[22:47:01] That's also not ideal, since the link is from Gerrit to Diffusion :)
[22:47:03] Only for one repo I think?
[22:47:24] Yeah, we don't replicate all of refs/* from gerrit to github
[22:47:24] https://github.com/wikimedia/at-ease/tree/refs/meta/config
[22:47:29] 404
[22:47:31] It only does that for ones that go via Phab
[22:47:32] :)
[22:47:34] only one solution guys, bring back gitblit
[22:47:43] * ostriches stabs chasemp
[22:47:48] chasemp: No, gerrit-web!
[22:47:50] * ostriches stabs again, just in case
[22:48:02] Krinkle https://github.com/wikimedia/mediawiki-extensions-GoogleLogin/tree/refs/meta/config
[22:48:29] Krinkle but mw-core will need to have refs/changes/* cut or added manually to github
[22:48:40] same for operations-puppet since it hits the github limit
[22:48:43] There's no need to mirror this to GitHub
[22:48:48] We only need it in 1 place
[22:49:00] It's on Phab, we just gotta fix the redirector :)
[22:49:03] Then we're good!
[22:49:07] yep.
[22:49:08] Yep
[22:49:10] Rest of that crap can skip Github :)
[22:49:14] Only needs heads and tags
[22:49:18] It's not discoverable on GitHub anyway, since it doesn't list refs/* in the branch selector
[22:49:33] Nope, not visible
[22:50:38] If you create a branch 'refs/meta/config' on github it'll become refs/heads/refs/meta/config
[22:50:43] Which is silly :)
[22:51:37] Yep
[22:51:46] Any refs outside of heads or tags are pretty silly, but it's too late to debate that now :p
[22:51:55] Krinkle but we are switching the mirror on, on the phabricator repos
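The "mangling in the redirect script" being discussed amounts to building a Diffusion browse URL with the ref percent-encoded twice, as in the working links pasted above (refs%252Fmeta%252Fconfig). A minimal sketch; the helper name is made up, and the real redirector will likely need to handle more cases than this.

```python
# Build the Diffusion browse URL for a file on refs/meta/config, double-encoding the
# ref as in the working examples above. This only mirrors the URLs pasted in channel.
from urllib.parse import quote

def diffusion_meta_config_url(callsign, path="project.config", commit=None):
    # '/' -> %2F -> %252F : the ref segment has to be percent-encoded twice.
    ref = quote(quote("refs/meta/config", safe=""), safe="")
    url = f"https://phabricator.wikimedia.org/diffusion/{callsign}/browse/{ref}/{path}"
    if commit:
        url += f";{commit}"
    return url

if __name__ == "__main__":
    # Reproduces the GWST example link from the conversation above.
    print(diffusion_meta_config_url("GWST", commit="3e37ab85871b036a382815158dee35c56c4db946"))
```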
[23:09:07] 06Release-Engineering-Team: Identify inaugural SWAT members for the European SWAT window - https://phabricator.wikimedia.org/T139544#2435541 (10greg) [23:09:38] 06Release-Engineering-Team: Update new SWAT member process and deploying documentation in prep for adding European SWAT window - https://phabricator.wikimedia.org/T139545#2435555 (10greg) [23:09:52] 06Release-Engineering-Team: Add a European mid-day SWAT window - https://phabricator.wikimedia.org/T137970#2385565 (10greg) [23:11:30] 06Release-Engineering-Team: Update new SWAT member process and deploying documentation in prep for adding European SWAT window - https://phabricator.wikimedia.org/T139545#2435582 (10greg) [23:30:10] 10scap, 03Scap3 (Scap3-MediaWiki-MVP): Implement MediaWiki pre-promote checks - https://phabricator.wikimedia.org/T121597#2435629 (10Krinkle) [23:30:12] 10Deployment-Systems, 10scap, 07WorkType-NewFunctionality: Create canary deploy process for MediaWiki - https://phabricator.wikimedia.org/T136883#2435628 (10Krinkle) [23:36:26] 06Release-Engineering-Team: Add a European mid-day SWAT window - https://phabricator.wikimedia.org/T137970#2385565 (10Krinkle) I'd like to propose to change the SWAT process to **require** syncing to a canary server first. We already require that a peer verify the fix after the fact. Instead of having this pers... [23:46:25] 10scap, 03Scap3 (Scap3-MediaWiki-MVP): Implement MediaWiki pre-promote checks - https://phabricator.wikimedia.org/T121597#2435670 (10Krinkle)