[00:10:41] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1871714 (greg) >>! In T121168#1871640, @Krenair wrote: > You really shouldn't be testing security on public wikis. As long as it's not with sensitive info, what are we worried about?
[00:15:24] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1871732 (Krenair) >>! In T121168#1871714, @greg wrote: >>>! In T121168#1871640, @Krenair wrote: >> You really shouldn't be testing security on public wikis. > > As long as it's not with sensitive in...
[00:28:55] ugh
[00:29:00] why is the sql command on deployment-bastion broken
[00:30:05] krenair@deployment-bastion:/srv/mediawiki-staging$ host=`echo $hostcode | /usr/local/bin/mwscript eval.php --wiki="$lookupdb"`
[00:30:05] krenair@deployment-bastion:/srv/mediawiki-staging$ echo $host
[00:30:06] Warning: array_merge(): Argument #1 is not an array in /mnt/srv/mediawiki/wmf-config/CommonSettings.php on line 2137 deployment-db2
[00:30:06] what
[00:30:42] that warning comes up every time you run eval.php
[00:31:19] $wgExtractsRemoveClasses = array_merge( $wgExtractsRemoveClasses, $wmgExtractsRemoveClasses );
[00:31:58] Project beta-scap-eqiad build #81925: FAILURE in 29 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81925/
[00:35:56] (PS4) Krinkle: scm: Enable shallow-clone for git-remoteonly-zuul [integration/config] - https://gerrit.wikimedia.org/r/195021
[00:36:37] (Abandoned) Krinkle: scm: Enable shallow-clone for git-remoteonly-zuul [integration/config] - https://gerrit.wikimedia.org/r/195021 (owner: Krinkle)
[00:42:53] Yippee, build fixed!
[00:42:53] Project beta-scap-eqiad build #81926: FIXED in 7 min 50 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81926/
[00:43:35] Krenair: ohhhh, that's me. I think I just broke that.
[00:44:18] Did you just change it to extension registration?
[00:45:11] Yep: https://gerrit.wikimedia.org/r/#/c/258065/
[00:45:12] gj
[00:45:18] better fix it before the next branch cut
[00:45:35] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1871814 (greg) Right, good point if the process is discernable.
[00:46:13] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #3129: FAILURE in 12 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/3129/
[00:47:42] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1871818 (Peachey88)
[01:04:16] (PS1) Dduvall: Support all SauceLabs provided browsers [selenium] - https://gerrit.wikimedia.org/r/258394 (https://phabricator.wikimedia.org/T114362)
[01:06:02] (PS1) Legoktm: Whitelist MtDu [integration/config] - https://gerrit.wikimedia.org/r/258395
[01:06:13] (CR) Legoktm: [C: 2] Whitelist MtDu [integration/config] - https://gerrit.wikimedia.org/r/258395 (owner: Legoktm)
[01:07:11] (CR) jenkins-bot: [V: -1] Support all SauceLabs provided browsers [selenium] - https://gerrit.wikimedia.org/r/258394 (https://phabricator.wikimedia.org/T114362) (owner: Dduvall)
[01:12:18] (Merged) jenkins-bot: Whitelist MtDu [integration/config] - https://gerrit.wikimedia.org/r/258395 (owner: Legoktm)
[01:12:41] !log deploying https://gerrit.wikimedia.org/r/258395
[01:12:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[01:22:10] Continuous-Integration-Infrastructure: jenkins-bot is not running for MtDu's patches - https://phabricator.wikimedia.org/T121062#1871849 (Legoktm) Open>Resolved a:Legoktm I don't know why jenkins wouldn't run the +1 tests, but I "fixed" it by whitelisting them: {4ac106ca5b8c16779239fe61657b52e970765...
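Editorial sketch for the array_merge() warning quoted at 00:30:06: the message says the first argument was not an array, which would fit the default not being defined yet at the point where CommonSettings.php merges it (the "extension registration" explanation at 00:44:18 points that way, but the log does not show the actual fix). As a rough Python analogue only, with `merge_class_lists` and the sample values being hypothetical names invented for illustration, a defensive merge can treat an unset default as an empty list:

```python
def merge_class_lists(defaults, overrides):
    """Concatenate two lists, treating None (an unset value) as an empty list."""
    return list(defaults or []) + list(overrides or [])


# Hypothetical values, for illustration only.
extension_defaults = None        # stands in for a default that is not loaded yet
site_overrides = [".noprint"]    # stands in for the site-specific additions

merged = merge_class_lists(extension_defaults, site_overrides)
print(merged)
```

This only shows the shape of the guard; whether the real configuration should guard the merge or move it later in the load order is not something the log settles.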
[01:28:11] (CR) John Vandenberg: "needs rebase" [integration/jenkins] - https://gerrit.wikimedia.org/r/247047 (owner: XZise)
[01:44:05] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #467: FAILURE in 27 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/467/
[02:46:52] (PS1) Tim Starling: Add total area metric and fix mask and moved metrics [integration/uprightdiff] - https://gerrit.wikimedia.org/r/258411
[02:46:54] (PS1) Tim Starling: Make it compile without C++11 [integration/uprightdiff] - https://gerrit.wikimedia.org/r/258412
[02:47:32] (CR) Tim Starling: [C: 2 V: 2] Add total area metric and fix mask and moved metrics [integration/uprightdiff] - https://gerrit.wikimedia.org/r/258411 (owner: Tim Starling)
[02:48:39] (CR) Tim Starling: [C: 2 V: 2] Make it compile without C++11 [integration/uprightdiff] - https://gerrit.wikimedia.org/r/258412 (owner: Tim Starling)
[03:51:36] Project beta-scap-eqiad build #81944: FAILURE in 15 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81944/
[06:35:55] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<10.00%)
[06:45:58] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[06:46:20] Yippee, build fixed!
[06:46:21] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #3130: FIXED in 19 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/3130/
[06:56:45] PROBLEM - Host angry-caching-proxy is DOWN: CRITICAL - Host Unreachable (10.68.19.184)
[08:11:48] Deployment-Systems, Architecture, Wikimedia-Developer-Summit-2016-Organization, Availability: WikiDev 16 working area: Software engineering - https://phabricator.wikimedia.org/T119032#1872344 (Qgil) p:Triage>Normal
[08:21:11] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #793: FAILURE in 1 min 10 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/793/
[08:57:23] PROBLEM - Puppet failure on wmfbranch is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[09:32:27] RECOVERY - Puppet failure on wmfbranch is OK: OK: Less than 1.00% above the threshold [0.0]
[09:38:32] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #700: FAILURE in 1 min 30 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/700/
[10:26:11] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872602 (Luke081515) I discovered some of the bugs by the way, so unwanted, like the both bugs concerning mediawiki currently triaged as low. For the bug triaged as high you need just the right to vi...
[10:42:19] hmmm
[10:42:21] 10:30:05 rm: cannot remove ‘/mnt/home/jenkins-deploy/tmpfs/jenkins-1/lessphp_2116pao1z3hcc88400gc04so40ogcc8.lesscache’: Permission denied
[10:42:30] hashar: ^
[10:42:43] arhghg
[10:42:44] again
[10:42:58] thedj: yeah we have some race condition :-/
[10:43:13] thedj: any specific job ?
[10:43:18] I need to get the instance name
[10:43:36] hashar: from https://gerrit.wikimedia.org/r/#/c/254039/
[10:44:17] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872648 (Luke081515) Added to that, I can't help users with their testing anymore, for example, if an abusefilter interrupts a user at his testing, I can't modify this filter anymore.
[10:44:50] thedj: going to mass purge
[10:45:24] !log salt --show-timeout '*slave*' cmd.run 'rm -fR /mnt/home/jenkins-deploy/tmpfs/jenkins-?/*'
[10:45:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[10:46:56] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1872655 (hashar)
[10:47:48] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1862091 (hashar) I have put integration-slave-trusty-1011 after deleting all the tmpfs leftover files with:...
[10:48:12] thedj: thank you for the notification. The actual bug is https://phabricator.wikimedia.org/T120824
[10:49:47] np. thx for taking care of all that stuff
[10:50:26] thedj: yeah it is surprisingly mostly working fine but there are some crazy issues from time to time that hard to diagnose :-(
[10:50:34] * thedj runs buildslaves too and it's a hassle...
[10:51:00] * thedj runs macosx build slaves with ios developer signing..
it be HELL
[10:56:17] thedj: we actually had a meeting about testing iOS on our own infra back in May 2015 (lyon hackathon)
[10:56:42] thedj: ended up with the agreement that using Travis (which has support for xCode apparently) would be a better use of everyone time :D
[10:56:50] after trying myself for a long time, i'm convinced: "buy iut"
[10:56:52] it
[10:57:08] I dont think anyone was willing to put mac-ini in one of the prod datacenter
[10:57:12] then have to deal with security update
[10:57:17] and get puppet to run on them :D
[10:57:20] that's the least of your problems. trust me
[10:58:09] the wmf mobile wanted to be good citizen and self host / reuse the current ci
[10:58:11] which totally make sense
[10:58:18] and we (releng) were definitely willing to do it
[10:58:34] but in the end the cost/benefit and complication was not worth it
[10:58:42] and mobile already had a proof of concept via Travis
[10:58:50] so they moved iOS dev to github/Travis :-}
[10:58:58] wich
[10:59:00] works!!!!!
[10:59:21] exactly. much easier.
[11:00:48] unfortunately, we are a bitbucket company... so no travis...
[11:09:12] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872691 (Luke081515) >>! In T121168#1871640, @Krenair wrote: > I'm told you've been evading IRC bans? No, I talked with a op at a channel at a query, and he said I can try to join this channel, (I t...
[11:14:35] Phabricator is down at the moment
[11:14:44] "Our servers are currently experiencing a technical problem. "
[11:15:57] hm, works again now
[11:31:56] (CR) Aklapper: "@Daniel: Any plans to pick up Nemo's comment? And anybody who could review this patch that's been waiting for 10 months?"
[tools/code-utils] - https://gerrit.wikimedia.org/r/190825 (owner: Daniel Kinzler)
[11:39:14] MediaWiki-Releasing, Developer-Relations, Wikimedia-Blog-Content, DevRel-December-2015, MW-1.26-release: Write blog post announcing MW 1.26 - https://phabricator.wikimedia.org/T112842#1872791 (Qgil) (Grrr browser sessions) Thank you very much! I just edited the signatures, because having four n...
[12:09:10] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1872891 (hashar) Random thoughts from this morning To follow up on @JanZerebecki , the post build action do...
[12:33:06] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872953 (Vogone) >>! In T121168#1872602, @Luke081515 wrote: > I discovered some of the bugs by the way, so unwanted, like the both bugs concerning mediawiki currently triaged as low. For the bug tria...
[12:42:17] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1872977 (hashar) On gallium I have looked for mediawiki-* jobs having: ``` rm: cannot remove ‘/mnt/home/jenkin...
[12:55:33] kart_: mobrovac about cxserver devdependencies https://gerrit.wikimedia.org/r/#/c/258435/
[12:55:55] I am debugging some nasty stuff on ci infra, so unlikely to play with it today
[12:55:56] sorry, i've gotta leave
[12:56:00] hehe
[12:56:10] 00:02:25.573 [*] NPM devDependencies Installation [*]
[12:56:10] 00:02:25.574 + npm test
[12:56:15] hashar: please leave a comment, or create a ticket
[12:56:26] looks like the job has to npm install the dev
[12:56:27] yeah will reply
[12:56:29] good lunchu
[12:56:30] ah hashar, that's problem we've had with graphoid, right?
[12:56:43] leave leave -:}
[12:56:50] kk, we will need to pair up on this next week and fix it
[12:56:53] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<11.11%)
[12:56:54] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872990 (Luke081515) >>! In T121168#1872953, @Vogone wrote: > Perhaps indeed not in this case, but I presume this was meant to be a general comment. Generally, testing security bugs on a public wiki...
[12:56:55] ciao ciao
[12:56:57] :)
[12:56:59] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #686: FAILURE in 2 min 57 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/686/
[13:00:11] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1872999 (Steinsplitter) Open>Invalid a:Steinsplitter Res ipsa loquitur
[13:00:40] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873002 (Steinsplitter) a:Steinsplitter>None
[13:17:32] hashar: mobrovac thanks!
[13:20:54] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873073 (Luke081515) Invalid>Open
[13:23:32] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1873097 (hashar) Based on previous comment forensic, I have looked back at an issue reported earlier: integra...
[13:24:13] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873102 (Hydriz) @Luke081515 You reopened this task.
Do you have any actionables that can be carried out in doing so? I believe @Vogone has provided sufficient evidence for this case. Do you have add...
[13:26:36] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873107 (Luke081515) As I sais above * I can't modify global filters anymore, so I can't help users, if a filter matched as a false positive * I can't run specific tests (at new features, to find nor...
[13:55:03] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873137 (Luke081515) As the actual recent changes at beta says, edit global filter is very useful, there was a IP with crosswiki-spamming, spamming always the same, so this is a point, where I could...
[14:12:26] 13:55:51 chmod: changing permissions of ‘/mnt/home/jenkins-deploy/tmpfs/jenkins-3’: Operation not permitted
[14:12:39] hashar: is this known?
[14:14:22] Nikerabbit: yes
[14:14:32] Nikerabbit: been working on it since I woke up , can't figure it out :-/
[14:15:14] integration-slave-trusty-1011.integration.eqiad.wmflabs:
[14:15:14] drwxr-xr-x 2 www-data www-data 40 Dec 11 10:45 /mnt/home/jenkins-deploy/tmpfs/jenkins-3/
[14:15:21] belong to the wrong user ( www-data )
[14:15:42] it's https://phabricator.wikimedia.org/T120824 right?
[14:15:51] yes
[14:17:50] and I guess there is a reason why we are storing LC in filesystem rather than in db?
[14:18:05] none
[14:18:08] it is not configured
[14:18:14] so I guess the default is to use the filesystem
[14:18:26] and the other default is to have them written in the tmp directory
[14:18:38] should we go for the DB insteaD?
[14:19:19] hashar: that can cause other problems...
so maybe not
[14:19:44] we also have jobs using sqlite
[14:19:46] might be super slow
[14:20:07] so anyway
[14:20:14] I am trying to reproduce / find the root cause
[14:20:17] but can't figure it out so far :-(
[14:20:27] $ok = mkdir( $dir, $mode, true ); // PHP5 <3
[14:20:29] hmm
[14:21:05] what we do is create a temp dir /mnt/home/jenkins-deploy/tmpfs/jenkins
[14:21:13] which belong to the jenkins-deploy user
[14:21:34] then when some jobs hit MediaWiki over Apache, the directory ends up belonging to www-data (apache user)
[14:21:46] The only recent cache change I know about is https://phabricator.wikimedia.org/T113092#1868005
[14:22:10] I don't suppose there is any code which would remove the dir and then recreate it on purpose
[14:22:42] none I can think of
[14:22:42] :(
[14:22:46] maybe I can try putting the l10n cache in a subdirectory
[14:22:54] $TMP_DIR/l10ncache
[14:23:13] yeah that would be interesting
[14:24:15] maybe mkdir() is acting strangely :-D
[14:24:54] you never know
[14:25:17] hashar: and hhvm had bugs related to '.'
but I don't think that's relevant for this
[14:27:58] hmm
[14:28:38] yeah I am quite convinced that *something* is deleting that directory and l10ncache just happens to recreate it
[14:30:43] yeah that is my assumption as well
[14:30:53] but I can't figure out what is deleting the dir :-(((((((((
[14:32:31] (CR) Zfilipin: "recheck" [selenium] - https://gerrit.wikimedia.org/r/258394 (https://phabricator.wikimedia.org/T114362) (owner: Dduvall)
[14:34:49] (PS1) Hashar: Point l10n cache to a subdirectory [integration/jenkins] - https://gerrit.wikimedia.org/r/258450 (https://phabricator.wikimedia.org/T120824)
[14:35:00] Nikerabbit: Nemo_bis: maybe https://gerrit.wikimedia.org/r/258450 would help
[14:35:16] then there is also lessphp files being created by www-data there
[14:35:27] so maybe that is the less compiler changing the files :-/
[14:43:19] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873189 (Luke081515) beta is to test MediaWiki in a production like environment, so sometimes it is useful too, to test a specific configuration, to test behaviour. In this case, you can simulate a s...
[14:43:48] (CR) Zfilipin: [C: 2] Support all SauceLabs provided browsers [selenium] - https://gerrit.wikimedia.org/r/258394 (https://phabricator.wikimedia.org/T114362) (owner: Dduvall)
[14:45:53] (Merged) jenkins-bot: Support all SauceLabs provided browsers [selenium] - https://gerrit.wikimedia.org/r/258394 (https://phabricator.wikimedia.org/T114362) (owner: Dduvall)
[14:46:40] hashar: it could be any unit test...
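Editorial sketch for the ownership problem discussed from 14:12 onward: the per-executor temp directory is expected to belong to jenkins-deploy but turns up owned by www-data, and the working theory is that something deletes it and a recursive mkdir recreates it under the web server's user. One way to spot such entries is to walk the tree and compare owners. This is a minimal Python illustration, not the check actually used on the CI slaves; `find_foreign_owned` and `demo` are hypothetical names, the throwaway directory stands in for the real tmpfs path, and it assumes a POSIX system where `os.getuid()` is available.

```python
import os
import tempfile


def find_foreign_owned(root, expected_uid):
    """Return paths under root (root included) whose owner is not expected_uid."""
    mismatched = []
    if os.lstat(root).st_uid != expected_uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched


def demo():
    # A throwaway directory stands in for the per-executor temp dir.
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = os.path.join(tmp, "l10ncache")
        # Recursive creation: missing parents are created too, and every
        # directory created this way is owned by the user running the process.
        os.makedirs(cache_dir)
        return find_foreign_owned(tmp, os.getuid())


if __name__ == "__main__":
    print(demo())
```

Run as the user that is supposed to own the tree, an empty result means nothing was created by another account; any path listed is a candidate for the kind of stray www-data entry described above.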
[14:47:50] yup
[14:51:51] I found one failing https://integration.wikimedia.org/ci/job/mediawiki-extensions-qunit/22834/consoleFull
[14:52:02] and trying to guess which build before it could have created the dir
[14:53:24] and I am pretty sure it is mwext-mw-selenium
[14:56:55] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873197 (doctaxon) As Vogone said here above and it's my opinion, too, your irc behaviour does not show that trustworthiness, as it is needed to be a steward. Looking for almightiness in wiki world a...
[14:58:54] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873198 (doctaxon) Added: the phab is no discussion container for admin or steward rights on wikis.
[15:04:02] !log jenkins-deploy@integration-slave-precise-1011:/mnt/jenkins-workspace/workspace/mwext-Wikibase-client-tests-mysql-zend/src/extensions/Wikibase$ rm .git/refs/heads/mw1.21-wmf6.lock
[15:04:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[15:09:37] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1873209 (hashar) Been running mwext-mw-selenium while watching for files belonging to www-data with: ``` while...
[15:11:52] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873217 (Krenair) >>! In T121168#1873198, @doctaxon wrote: > Added: the phab is no discussion container for admin or steward rights on wikis.
is for beta
[15:14:14] !log ssh integration-slave-trusty-1017.eqiad.wmflabs 'sudo -u jenkins-deploy rm -rf /mnt/home/jenkins-deploy/tmpfs/jenkins-1'
[15:14:17] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[15:15:32] jzerebecki: I am trying to figure out the root cause :(
[15:25:10] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873241 (Luke081515) No, sometimes it take to long time, till someone is reachable. If you look at the RC at deployment, sometimes their is something urgently, like today, there was crosswiki-spam by...
[15:25:42] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873243 (doctaxon) ya, for beta, too
[15:30:38] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873246 (doctaxon) No, there are more trustworthy stewards in beta cluster, you can contact really fast. You don't need wait so long. And you can test in beta, too, without to be a steward.
[15:33:30] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873247 (Hydriz) Two questions: # Why do you need to modify global groups? If there is a need, a task in Phabricator is preferred so discussion can take place. # Testing? Do you //really// need...
[15:33:55] hmm
[15:34:59] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873248 (Vogone) >>! In T121168#1873247, @Hydriz wrote: > Two questions: > > # Why do you need to modify global groups? If there is a need, a task in Phabricator is preferred so discussion can tak...
[15:37:01] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873258 (Hydriz) >>! In T121168#1873248, @Vogone wrote: >>>!
In T121168#1873247, @Hydriz wrote: >> Two questions: >> >> # Why do you need to modify global groups? If there is a need, a task in Pha...
[15:39:51] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873265 (doctaxon) Ya, thank you ...
[15:42:59] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873268 (Luke081515) Modifing a global group must be nothing permanent. You can create a temp testgroup, were you can test a right config, for tests at one wiki. After you made the tests, you can rem...
[16:07:45] well
[16:08:18] I give up with the /mnt/home/jenkins-deploy/tmpfs/jenkins-*/ dir being owned by www-data :-/
[16:09:23] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1873288 (hashar) So I have been digging into it more and I am out of idea. I can't even think of a good way t...
[16:59:58] Release-Engineering-Team, DBA, user-notice: Requests to globally reset a user's skin preferences - https://phabricator.wikimedia.org/T119206#1873407 (demon) >>! In T119206#1870832, @NickK wrote: >>>! In T119206#1838886, @demon wrote: >> No that isn't possible I'm afraid. It will only set the preference...
[17:04:46] https://integration.wikimedia.org/ci/job/harbormaster-test/
[17:05:11] Luke081515 helped me figure out what was wrong with jenkins->harbormaster reporting
[17:05:11] Heh, https://phabricator.wikimedia.org/D77#1794
[17:05:46] ostriches: the jenkins user in phabricator wasn't a member of the releng group so it couldn't report back
[17:06:03] now it's working
[17:06:19] Now we just need wikibugs informing about the diffs too, then it's very simalar to gerrit ;)
[17:06:59] Luke081515: indeed
[17:07:11] twentyafterfour: Couldn't comment unless in releng?
[17:07:14] wikibugs could be fixed pretty easily, but I'm not very familiar with the wikibugs code
[17:07:43] ostriches: no, it's the harbormaster build status at the top of the diff. It never made it to the comment step because the harbormaster step failed
[17:08:19] I'm missing what's special about being in releng here.
[17:08:38] harboarmaster app is restricted currently
[17:08:55] Ah, we should change that permission to a new group for the bot then perhaps?
[17:08:56] mostly for the sake of not bothering people who don't care about harbormaster
[17:09:01] That's where I'm getting confuzzled
[17:09:17] yeah possibly an acl*project
[17:09:45] that's the part that Luke081515 helped me with, because it had simply slipped my mind
[17:11:24] acl* sounds good for jenkins bot permissions to be managed with.
[17:11:33] If we keep doing it with releng, https://tools.wmflabs.org/bash/quip/AU7VTzhg6snAnmqnK_pc :p
[17:11:56] ostriches: I agree
[17:12:27] wait, did we already create an acl*releng project or is that still controversial?
[17:12:36] twentyafterfour: That observation is from like 8 years ago ;-)
[17:12:40] 8...or more
[17:13:13] ostriches: I imagine it's been observed long before that
[17:15:01] https://phabricator.wikimedia.org/tag/acl*releng/
[17:16:34] ok I adjusted harbormaster policies. Now it's can use app: public, create build plans: acl*releng, default edit policy: acl*releng
[17:16:59] default view policy: public
[17:17:22] and I added jenkins user to acl*releng
[17:17:40] so we should be good to go with harbormaster->jenkins->harbormaster
[17:18:08] currently it's only set up for scap builds though
[17:18:28] ostriches: Am I right, that there was a policy, that callsigns are only allowed with four letters?
[17:18:38] s/was/is/
[17:18:53] But it's a silly policy because callsigns don't scale when you have 800+ repositories.
[17:19:02] Waiting on upstream there
[17:19:42] ostriches: But I guess the callsing TESTREVSCORINGAGAIN is a bit to long?
[17:20:05] I don't care enough to enforce the policy anymore since they'll hopefully go away soon :p
[17:20:17] ah ok :)
[17:22:25] by the way: Can someone help me with PHP? I tried to install it at windows, but it can't start, because VCRUNTIME140.dll is missing
[17:22:37] twentyafterfour: I was thinking last night...does Phab's git daemons do anything to trigger a gc/repack manually? We should probably fiddle with the system gitconfig so they *do* get triggered on git operations more frequently than like every 250+ commits.
[17:22:53] Luke081515: I haven't used PHP on Windows in years. Last time I did I used XAMPP.
[17:23:17] twentyafterfour: If not, we should...that is
[17:30:24] (CR) Chad: [C: 2] "+1 to hashar, we should rewrite this in python." [tools/release] - https://gerrit.wikimedia.org/r/258074 (owner: Thcipriani)
[17:30:46] ostriches: I'm not sure
[17:31:03] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873499 (doctaxon) But now Luke081515 misuses his channel and user rights in irc to let me part the channel due to autojoin. During this discussion here he's demonstrating again, how to play intentio...
[17:31:12] (Merged) jenkins-bot: Better usage. No positional args. [tools/release] - https://gerrit.wikimedia.org/r/258074 (owner: Thcipriani)
[17:32:37] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873510 (demon) >>! In T121168#1873499, @doctaxon wrote: > But user rights, both in irc and in beta cluster ARE NO TOYS TO PLAY WITH! > > Till Luke081515 does understand how to use user rights, he s...
[17:36:52] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873514 (doctaxon) But there's no hand for steward rights and that's the topic here!
[17:46:39] twentyafterfour: can I remove jenkins from #together or does it need to be there too?
[17:46:54] * greg-g is just a bit ocd this morning, he was wrapping presents last night
[17:48:27] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873590 (demon) This whole thing seems like a tempest in a teacup really. We're talking about //beta// here. It's meant for people to experiment, try out new things, and that includes having access t...
[17:49:50] greg-g: doesn't need to be in there
[17:51:01] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873595 (greg) For the record, from my perspective as the manager of the team that owns the Beta Cluster (where these wikis live, which is used for pre-production deployment testing by bots and peopl...
[17:51:17] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873596 (Luke081515) There is, as I sayed above. I don't abused my rights, so in my opinion there is no reason for removal. If you said "he not need them at the moment", you can remove near to all ri...
[17:53:58] greg-g: I hates that phab has no edit conflict detection ;) But thanks for your comment :)
[17:54:49] :)
[17:58:53] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873612 (doctaxon) >>! In T121168#1873595, @greg wrote: > Luke081515 has been immensely helpful with task triage, task/issue reporting, and general cleanup along the way. I appreciate and welcome his...
[18:01:58] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873624 (Luke081515) You don't read that, what I wrote above? As I said it is very useful for tests to modify global group rights. counterquestion: Why were the rights removed? You can't remove righ...
[18:04:30] twentryafterfour: So, if the jenkins build failed, the bot posts a message, but does the build at phab stops then, at the moment?
[18:05:34] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873627 (MGChecker) I don't think any behaivour in //private// channel is subject in this task, to be honest. I actually think the beta-cluster is a famous place to do some tests for things in prod...
[18:07:09] twentyafterfour: Ok, the build "successful failed" in D76, that's a good sign (for jenkins, not for your diff) ;)
[18:07:46] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873633 (doctaxon) @Luke081515 Your counterquestion is already answered in Vogone's long comment above ...
[18:08:25] Maybe a good idea for the jenkins bot: It looks better, if he use bold letter instead of caps
[18:09:35] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873636 (Luke081515) No. He don't said something about the temp global groups, needed for testing a specific situation.
[18:12:35] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873641 (doctaxon) you can ask for temp global groups if you really need 'em
[18:17:57] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1873665 (Luke081515) I don't want to say it again. Why not, you can read here: T121168#1873268. The second point is, that your logic is not right. Image somebody removed your sysop bit at dewiki, an...
[18:21:40] twentyafterfour: Did you read my comment from 18:08 UTC?
Maybe a good stylistic enhancement
[18:23:41] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1873691 (hashar) //Random notes from evening digging, probably not worth reading// Bunch of console logs hav...
[18:25:50] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1873693 (hashar) The last four Wikibase builds ran on integration-slave-trusty-1012 on Dec 8 around 14:00 UTC...
[18:30:24] Luke081515: in a meeting .. I'll look in a minute
[18:30:35] ah, ok, sry
[18:30:43] Nothing urgent ;)
[18:33:58] twentyafterfour: Why doesn't this 'text/x-php': 'text/x-php; charset=utf-8' work for desktops when it works for mobiles? This is to do with viewing raw files on phabricator.
[18:34:24] paladox: I'm not sure
[18:34:41] Continuous-Integration-Config, Community-Tech: mediawiki/extensions/PageAssessments history should be cleaned and reimported + other concerns - https://phabricator.wikimedia.org/T121157#1873706 (DannyH) p:Triage>High
[18:35:03] Continuous-Integration-Config, Community-Tech: mediawiki/extensions/PageAssessments history should be cleaned and reimported + other concerns - https://phabricator.wikimedia.org/T121157#1873711 (kaldari) @hashar: Definitely agree, just wasn't sure how to do that.
[18:35:11] twentyafterfour: Could I have help getting raw PHP file viewing to work on phabricator, please?
[18:39:46] paladox: right now, please just comment on the relevant task in the WMF phabricator instance
[18:40:39] greg-g: Ok.
[18:41:02] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[18:44:58] Scap3: Remove apache dependency from scap3 deployment host - https://phabricator.wikimedia.org/T116630#1754302 (thcipriani)
[18:45:01] Deployment-Systems, Scap3, Phabricator: Deploy Phabricator with scap3 - https://phabricator.wikimedia.org/T114363#1873781 (thcipriani)
[18:45:04] Deployment-Systems, Scap3: enforcing deployment from `/srv/deployment` is wrong - https://phabricator.wikimedia.org/T116207#1873779 (thcipriani) Open>Resolved
[18:46:49] Deployment-Systems, Scap3: create an environment object that centralizes the file and directory lookup logic for scap3 - https://phabricator.wikimedia.org/T119643#1873800 (thcipriani) a:mmodell>dduvall @dduvall has some things for this in D70
[18:47:03] Gitblit-Deprecate, Diffusion, Patch-For-Review: redirect gerrit repo paths to diffusion callsigns - https://phabricator.wikimedia.org/T110607#1873803 (Paladox) Gerrit links now work. Just branches don't since they use for example refs/heads/REL1_26 a redirector script needs creating to redirect bran...
[18:50:38] Gitblit-Deprecate, Diffusion, Patch-For-Review: redirect gerrit repo paths to diffusion callsigns - https://phabricator.wikimedia.org/T110607#1873852 (demon) >>! In T110607#1873803, @Paladox wrote: > Gerrit links now work. > > Just branches don't since they use for example refs/heads/REL1_26 a redir...
[19:05:20] Deployment-Systems, Scap3, Epic: Future Deployment Tooling (tracking) - https://phabricator.wikimedia.org/T101023#1873927 (thcipriani)
[19:06:30] Deployment-Systems, Scap3, Epic: EPIC: Future Deployment Tooling - https://phabricator.wikimedia.org/T101023#1873934 (thcipriani)
[19:07:42] Deployment-Systems, Scap3, Epic: Scap3 should implement the services team requirements - https://phabricator.wikimedia.org/T109535#1873944 (thcipriani)
[19:07:56] Deployment-Systems, Scap3, Epic: EPIC: Scap3 should implement the services team requirements - https://phabricator.wikimedia.org/T109535#1873945 (thcipriani)
[19:08:58] Deployment-Systems, Scap3: create an environment object that centralizes the file and directory lookup logic for scap3 - https://phabricator.wikimedia.org/T119643#1873947 (greg)
[19:27:13] Deployment-Systems, operations: install/deploy mira as codfw deployment server - https://phabricator.wikimedia.org/T95436#1874032 (demon)
[19:27:15] Deployment-Systems, Scap3, Patch-For-Review: [scap] Add support for syncing /srv/mediawiki-staging including fully working git data to warm spare deploy server - https://phabricator.wikimedia.org/T104826#1874030 (demon) Open>Resolved I think we're done here folks.
[19:28:00] Deployment-Systems, operations: install/deploy mira as codfw deployment server - https://phabricator.wikimedia.org/T95436#1874037 (Dzahn) woohoo :) all blockers closed? when are we deploying from mira ?
[19:28:39] Deployment-Systems: [scap] multi datacenter aware without (major) performance hit - https://phabricator.wikimedia.org/T71572#1874043 (demon)
[19:28:41] Deployment-Systems, operations: install/deploy mira as codfw deployment server - https://phabricator.wikimedia.org/T95436#1874040 (demon) Open>Resolved And mira's happy too, afaict.
[19:28:47] Deployment-Systems, operations: install/deploy mira as codfw deployment server - https://phabricator.wikimedia.org/T95436#1874045 (demon)
[19:35:11] Deployment-Systems, Release-Engineering-Team, Scap3: Cleanup things we're not deploying anymore. - https://phabricator.wikimedia.org/T120157#1874063 (demon) For comparison, this is the list on mira: ``` abacist cassandra cxserver elasticsearch fluoride grafana iegreview jobrunner kibana l...
[19:38:12] Gitblit-Deprecate, Diffusion: Replicate open patchsets to diffusion - https://phabricator.wikimedia.org/T89940#1874079 (Paladox) I think phabricator may be causing this problem. Since it looks like Why doesn't this 'text/x-php': 'text/x-php; charset=utf-8' work for desktops but works for mobiles. This...
[19:42:53] Gitblit-Deprecate, Diffusion: Replicate open patchsets to diffusion - https://phabricator.wikimedia.org/T89940#1874083 (Paladox) Would refs/remotes/changes or something similar work?
[19:45:45] Gitblit-Deprecate, Diffusion: Replicate open patchsets to diffusion - https://phabricator.wikimedia.org/T89940#1874092 (demon) These issues are all unrelated.
[20:22:20] paladox: I'll look into the redirect issue
[20:22:33] I'm not sure what to do about the raw files
[20:22:39] twentyafterfour: Thanks.
[20:27:01] PROBLEM - Puppet failure on integration-dev is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:27:35] twentyafterfour: Could you review https://gerrit.wikimedia.org/r/#/c/258506/ ? It is a minor update to the redirect script.
[20:35:40] Deployment-Systems, Release-Engineering-Team, Scap3: Cleanup things we're not deploying anymore. - https://phabricator.wikimedia.org/T120157#1874229 (bd808) ``` $ diff -uw dirs.tin dirs.mira --- dirs.tin 2015-12-11 13:34:43.000000000 -0700 +++ dirs.mira 2015-12-11 13:34:57.000000000 -0700 @@ -1,6...
[20:49:40] Yippee, build fixed!
[20:49:41] Project beta-scap-eqiad build #82034: FIXED in 20 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/82034/
[20:51:32] paladox: looking
[20:51:46] twentyafterfour: Thanks.
[20:53:44] ok +2
[20:56:06] twentyafterfour: Ok thanks, but it needs another +2 since it didn't go through.
[20:57:06] paladox: I'll deploy that in a bit
[20:59:10] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1874251 (hashar) Found the sequence of three jobs running on slave-trusty-1012 executor #2 by grepping jenkins...
[20:59:30] I'm going to deploy https://gerrit.wikimedia.org/r/#/c/258462/ for chasemp/coren
[20:59:35] (PS1) Paladox: [phabricator/extensions] Add Jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/258512
[21:01:59] twentyafterfour: Could you review https://gerrit.wikimedia.org/r/#/c/258512/ so that phabricator/extensions can have some jenkins tests, please?
[21:02:10] RECOVERY - Puppet failure on integration-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[21:02:52] (CR) 20after4: [C: 2] [phabricator/extensions] Add Jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/258512 (owner: Paladox)
[21:03:02] paladox: done
[21:03:26] twentyafterfour: Thanks.
[21:03:48] (Merged) jenkins-bot: [phabricator/extensions] Add Jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/258512 (owner: Paladox)
[21:05:26] PROBLEM - Puppet failure on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[21:15:25] RECOVERY - Puppet failure on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:21:32] Beta-Cluster-Infrastructure: Review rights removal by user Vogone - https://phabricator.wikimedia.org/T121168#1874308 (doctaxon) Open>stalled I could manage a talk with Luke081515 and Vogone these days to find a solution for the problems ourselves. So it's my opinion to stall this task up to now, tempo...
[21:22:08]
[21:22:12] oh sorry
[21:22:23] that was the wrong active window
[21:37:54] thanks twentyafterfour seems good
[21:37:59] twentyafterfour: Could the tests for phabricator/extensions be deployed please?
[21:40:27] paladox: I don't know how to do that
[21:40:43] twentyafterfour: Oh ok.
[21:41:17] hashar: Could you deploy https://gerrit.wikimedia.org/r/#/c/258512/ please?
[21:43:19] :-P
[21:43:41] I don't think it's really terribly urgent
[21:43:52] we deploy as soon as a change is merged
[21:43:55] fab deploy_zuul
[21:44:01] from the root of integration/config
[21:44:07] or the long way:
[21:44:15] ssh gallium.wikimedia.org
[21:44:25] sudo -u zuul -s
[21:44:34] cd /etc/zuul/wikimedia
[21:44:35] git pull
[21:44:45] /etc/init.d/zuul reload
[21:44:47] * twentyafterfour doesn't seem to have access to gallium
[21:44:50] but the fab script is wayyyy easier :}
[21:44:52] oh
[21:45:15] gotta reach it via the bastion
[21:45:27] I have something like:
[21:45:40] Host gallium.wikimedia.org
[21:45:45] ProxyCommand ssh -a -W %h:%p bast1001.wikimedia.org
[21:45:48] (in ~/.ssh/config )
[21:46:00] hashar: Ok Thanks.
[21:46:16] paladox: well you surely don't have access to it -:}
[21:46:30] hashar: Yes.
[21:47:08] (Abandoned) Hashar: Point l10n cache to a subdirectory [integration/jenkins] - https://gerrit.wikimedia.org/r/258450 (https://phabricator.wikimedia.org/T120824) (owner: Hashar)
[21:51:00] hashar: hello! i have a JJB question for you :) do we have any fine examples of jobs that create a git commit and push it? i'd like to set up some periodic jobs that run an update script and commit and push it for review.
[21:52:08] twentyafterfour: want me to deploy the Zuul change?
[21:52:45] niedzielski: there is only one usage and I am willing to phase it out
[21:52:56] niedzielski: the trouble is passing the credentials needed to push :/
[21:53:36] hashar: hm, i was thinking we could just use jenkins-deploy or if there's a default jenkins account
[21:54:15] niedzielski: ah the job is 'mwext-VisualEditor-sync-gerrit' in jjb/mediawiki-extensions.yaml
[21:54:32] it is running on gallium and jenkins-deploy sudo as 'jenkins' which has the ssh key
[21:54:36] should probably refactor that
[21:54:45] Jenkins has a credential store where one can put ssh keys
[21:54:54] and then we can apply those credentials to a specific job
[21:55:23] hashar: hm, so is this a security risk as it is currently implemented?
[21:55:38] !log Reloading Zuul to deploy 385ddd9dd906865e7e61c3c5ea85eae0bb522c8d
[21:55:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[21:55:52] niedzielski: not really :-}
[21:56:07] but it is dumb because the job still has to run on gallium the prod server
[21:56:36] Scap3: Bring co-master / fanout capabilities to `deploy` and friends - https://phabricator.wikimedia.org/T121276#1874432 (demon) NEW
[21:56:43] hashar: there seems to be some git user named "jenkins-bot@gerrit.wikimedia.org" that does our merges. maybe i can use that instead?
[21:56:58] no way :-}
[21:57:05] that one is really used by Zuul
[21:57:10] and is a privileged account
[21:57:25] it has +2 / merge rights on almost every repo
[21:57:39] I would create a dedicated user for your project
[21:57:50] generate a ssh key pair for it
[21:58:06] then in Jenkins you can import the key in the credential store at https://integration.wikimedia.org/ci/credential-store/
[21:58:35] hashar: ok, hm. well maybe i will work towards a patch with some other account and, when it's ready, i can work with you to finish the setup?
[21:58:59] I am not entirely sure how the credential store works to be honest :(
[21:59:24] yeah sure
[21:59:43] hey looks like you can even use password protected keys
[21:59:45] hashar: ok, well i will move forward and assume we'll figure it out one way or another :)
[22:00:06] we did use it for the Browser tests
[22:00:15] they have some Wiki username/password
[22:00:40] http://docs.openstack.org/infra/jenkins-job-builder/wrappers.html#wrappers.credentials-binding
[22:00:52] lets you fetch a credential via its ID
[22:01:01] hashar: hm, that sounds promising for this task. i'm not sure how secure the whole setup is considered
[22:01:19] no clue ;-:}
[22:01:46] hashar: we have some high security tasks we'd like to automate but i think they'll have to live some place private
[22:02:24] well
[22:02:31] I would first do a quick proof of concept
[22:02:38] hashar: The jenkins tests seem to report a merge conflict even on patches whose status says "can merge: yes".
For example please see https://gerrit.wikimedia.org/r/#/c/258506/ and https://gerrit.wikimedia.org/r/#/c/236417/
[22:02:39] hashar: yeah :)
[22:02:39] create a user in wikitech for gerrit
[22:02:45] add a ssh key pair
[22:02:49] add the ssh key to Jenkins credential
[22:03:08] write a single job that attempts to use the Jenkins credentials to ssh to gerrit (done via the Web UI )
[22:03:11] etc
[22:03:25] hashar: ok, different JJB question: is there any way to inhibit automatic merges on patches? in other words, a patch has to be based on master HEAD to merge.
[22:03:56] paladox: sounds like something is broken on Zuul side. Will look
[22:04:06] hashar: Ok thanks.
[22:05:35] niedzielski: nop
[22:06:00] niedzielski: but what could be done is that in test pipeline, instead of fetching the merge from Zuul, one could fetch the patch from Gerrit directly
[22:06:15] niedzielski: but on gate-and-submit, you will want to use the Zuul ref. So two jobs
[22:06:21] the reason for the merge check is for gate
[22:06:36] there is no point in running tests for a change if it is not amerceable.
[22:06:39] mergeable
[22:06:59] and when you have changes A, B, C that are +2 ed
[22:07:08] what Zuul does is attempt to merge A on branch
[22:07:16] then merge B on (A + branch)
[22:07:24] and C on (B + A + branch)
[22:07:34] so it is able to reject C if it can't merge on top of B+A
[22:08:39] paladox: that is a namespace issue
[22:08:43] hashar: i kind of get what you're saying. i'll try that out. thanks!!
[22:09:19] hashar: Oh, how would we fix that?
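The gating order hashar describes for the +2'ed changes A, B and C (merge A on the branch, then B on top of A, then C on top of both, rejecting whatever does not merge) can be modelled with a toy sketch. This is not Zuul's code. The `try_merge` and `gate` names are hypothetical, and so is the representation of a change as `path -> (expected old content, new content)`. Real Zuul performs actual git merges, and skipping a rejected change while carrying on with the rest of the queue is a simplification made here.

```python
from typing import Dict, List, Optional, Tuple

Tree = Dict[str, str]                 # path -> file content
Change = Dict[str, Tuple[str, str]]   # path -> (expected old content, new content)


def try_merge(tree: Tree, change: Change) -> Optional[Tree]:
    """Apply `change` on top of `tree`; return None on a (toy) merge conflict."""
    for path, (old, _new) in change.items():
        # A missing file is treated as empty content in this toy model.
        if tree.get(path, "") != old:
            return None
    merged = dict(tree)
    for path, (_old, new) in change.items():
        merged[path] = new
    return merged


def gate(branch: Tree, queue: List[Tuple[str, Change]]):
    """Merge queued changes in order, each on top of the ones accepted before it."""
    state = dict(branch)
    accepted: List[str] = []
    rejected: List[str] = []
    for name, change in queue:
        merged = try_merge(state, change)
        if merged is None:
            rejected.append(name)      # cannot merge on top of earlier changes
        else:
            state = merged
            accepted.append(name)
    return accepted, rejected, state


if __name__ == "__main__":
    branch = {"a.txt": "1"}
    queue = [
        ("A", {"a.txt": ("1", "2")}),
        ("B", {"b.txt": ("", "x")}),
        ("C", {"a.txt": ("1", "3")}),  # written against the old a.txt, clashes with A
    ]
    print(gate(branch, queue))
```

In the example queue, C was written against the branch as it was before A, so it is the one that gets rejected once A has been merged ahead of it.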
[22:09:38] paladox: twentyafterfour: the Zuul daemon that handles the merge clones the repos and since we have a repo phabricator/extensions/Sprint.git , phabricator/extensions/ already exists
[22:09:58] paladox: twentyafterfour thus Zuul can not git clone to phabricator/extensions because the dir exists :-}
[22:10:21] niedzielski: feel free to raise that on the QA mailing list maybe
[22:10:44] so should we create one test for those repos so instead of phabricator/extensions/Sprint, phabricator/extensions/BurnDownCharts and phabricator/extensions/security it would be phabricator/extensions which i am hoping would then run tests for sub folders?
[22:11:20] hashar: Or if that's not possible, then how do we do that?
[22:12:29] niedzielski: Zuul passes a bunch of parameters, among them the change number as ZUUL_CHANGE , unfortunately not the patchset number :/
[22:12:37] oh no
[22:12:38] ZUUL_PATCHSET
[22:13:08] hashar: is this something that would change when we switch to diffusion?
[22:13:16] niedzielski: so you can git clone https://gerrit.wikimedia.org/r/p/$ZUUL_PROJECT && git fetch refs/changes/XX/$ZUUL_CHANGE/$ZUUL_PATCHSET
[22:13:21] hashar: er, differential?
[22:13:29] niedzielski: where XX are the last two digits of ZUUL_CHANGE
[22:14:24] !log On Zuul merger, nuking /srv/ssd/zuul/git/phabricator/extensions so zuul-merger can properly clone phabricator/extensions.git (dir exists because of phabricator/extensions/Sprint.git among others )
[22:14:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[22:14:38] niedzielski: no clue :-}
[22:16:24] paladox: fixed https://gerrit.wikimedia.org/r/#/c/236417/ :D
[22:16:46] hashar: Thanks.
[22:18:03] !log Stopped zuul merger on gallium to have phabricator/extensions populated on scandium (namespacing issue). Restarted zuul-merger on gallium once done.
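The ref layout hashar gives at 22:13 (`refs/changes/XX/$ZUUL_CHANGE/$ZUUL_PATCHSET`, with XX being the last two digits of the change number) can be computed as in the sketch below. The `gerrit_change_ref` name is hypothetical. Zero-padding change numbers below 10 is Gerrit's convention as far as I know and is not stated in the log. The result is the ref that would be handed to `git fetch <remote> <ref>` inside the clone.

```python
def gerrit_change_ref(change: int, patchset: int) -> str:
    """Build the Gerrit ref for one patchset of a change.

    `change` and `patchset` stand in for the ZUUL_CHANGE and ZUUL_PATCHSET
    job parameters mentioned in the log.
    """
    if change <= 0 or patchset <= 0:
        raise ValueError("change and patchset must be positive integers")
    shard = change % 100               # last two digits of the change number
    return f"refs/changes/{shard:02d}/{change}/{patchset}"


if __name__ == "__main__":
    # Change 258506 is one of the patches discussed above; patchset 1 is illustrative.
    print(gerrit_change_ref(258506, 1))
```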
[22:18:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[22:18:36] niedzielski: anyway gotta sleep sorry :-(
[22:18:45] niedzielski: do reach QA list as needed!
[22:18:48] hashar: that's cool! thanks for all the help!
[22:18:54] hashar: have a good night !
[22:19:10] paladox: the issue is fixed. That is a corner case bug in Zuul :-(
[22:19:26] hashar: Ok thanks :)
[22:32:31] (PS1) Paladox: Update two repo Jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/258534
[22:38:26] twentyafterfour: We should move all the phab/* repos to Differential.
[22:38:32] No need to host Phab from Gerrit anymore imho.
[23:11:24] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #352: FAILURE in 14 min: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/352/
[23:12:42] ostriches: the only problem with that is I need to figure out how to deploy phabricator when phabricator is offline.
[23:12:55] since the deployment uses git
[23:13:02] it's a bit of a challenge
[23:15:11] Continuous-Integration-Infrastructure, Patch-For-Review: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1874600 (JanZerebecki) >>! In T120824#1873097, @hashar wrote: > So far all good. Then we invoke the selenium...
[23:15:25] (PS1) JanZerebecki: Don't leave one of the TMPDIRs around when it is switched inbetween [integration/jenkins] - https://gerrit.wikimedia.org/r/258634 (https://phabricator.wikimedia.org/T120824)
[23:16:04] twentyafterfour: Maybe something still to do with jenkins: a) don't let phab write "Unit Tests OK" if the build failed, b) let jenkins request changes to that revision
[23:16:10] what do you think?
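The Zuul corner case hashar mentions at 22:19 (a project named phabricator/extensions cannot be cloned because phabricator/extensions/Sprint already created that directory) comes down to one project name being a path prefix of another. A minimal sketch for spotting such pairs in a list of project names follows. The `nested_project_names` helper is hypothetical and is not part of Zuul or integration/config.

```python
from typing import Iterable, List, Tuple


def nested_project_names(projects: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (parent, child) pairs where `child` lives under `parent`'s path."""
    names = sorted(set(projects))
    collisions: List[Tuple[str, str]] = []
    for parent in names:
        prefix = parent + "/"          # the trailing slash avoids partial-name matches
        for child in names:
            if child.startswith(prefix):
                collisions.append((parent, child))
    return collisions


if __name__ == "__main__":
    # Project names taken from the discussion above, plus one unrelated repo.
    for parent, child in nested_project_names([
        "phabricator/extensions",
        "phabricator/extensions/Sprint",
        "phabricator/extensions/security",
        "integration/config",
    ]):
        print(f"{child} is nested under {parent}")
```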
[23:18:46] Luke081515: I think that's how it's supposed to work already
[23:18:53] ah, ok
[23:19:20] twentyafterfour: Hehe, trueeee
[23:19:42] Although ideally, we wouldn't be deploying from Phab while it's down, but pulling the update and *then* bouncing the server.
[23:21:07] ostriches: Maybe one way to solve that:
[23:21:35] we could use phab's ability to create mirrors, so we have one problem less if phab is down
[23:23:10] I mean yeah that's kind of always been the problem with our deploy systems. SPOF on the git machine. scap & co really should have a way to fall back somewhere.
[23:23:48] be cool if there was an environment switch and you could deploy to codfw or eqiad from either
[23:23:53] and it just defaulted to the local
[23:24:17] it deploys to both from either by default.
[23:24:59] ostriches: scap kind of has that now. The deploy server tells the mirrors to fetch from it and then tells the MWs to fetch from the closest mirror that can be pinged
[23:25:27] This is more "I can't fetch things to deploy master because git is f'd"
[23:25:34] but there is a last ditch "fetch from tin" too (or whatever is in scap.cfg as the default)
[23:26:19] Yeah, I know deploy targets have several possible targets to choose from.
[23:26:21] ah.. that would be either remote rewriting or using a service hostname in the git remote that can be changed in dns easily
[23:27:03] ostriches: phabricator requires the web server to be offline while doing the update
[23:27:05] but...
[23:27:21] You can break that up though.
[23:27:34] fetch, kill the server, merge + checkout, bring server up
[23:27:34] we could use ssh, maybe? I suspect that phab git over ssh still requires apache on the back end
[23:27:52] ostriches: right, but currently scap isn't that flexible
[23:27:59] at least I don't think it is
[23:28:10] If only we knew some scap devs!
:p
[23:28:19] it should be after we refactor the sequence of operations
[23:28:30] to use a graph resolution
[23:28:35] (algorithm)
[23:30:15] heh I guess I could just hack it to use local disk git access
[23:30:25] since the repo is on disk for sure, just no server running
[23:31:44] remote local=/srv/phab/repos/PHDEP
[23:35:39] Continuous-Integration-Infrastructure, Wikidata, Wikidata-Sprint-2015-12-01: Cannot find site jenkins_u3_mw - https://phabricator.wikimedia.org/T121083#1874674 (JanZerebecki)