[00:19:39] nodepool not working?
[00:27:25] Yippee, build fixed!
[00:27:26] Project selenium-Flow » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #21: FIXED in 11 min: https://integration.wikimedia.org/ci/job/selenium-Flow/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/21/
[00:28:37] thcipriani: around? ^^
[00:28:57] I am. /me looks
[00:31:38] hmm, more timeout waiting for deletion things...
[01:01:54] !log nodepool CI was down, slowly recovering
[01:13:00] RECOVERY - Puppet run on integration-slave-precise-1012 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:14:33] added nodepool section to our quick CI troubleshooting doc: https://www.mediawiki.org/wiki/Continuous_integration/Architecture/Troubleshooting#Nodepool
[01:57:44] Yippee, build fixed!
[01:57:44] Project browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce build #50: FIXED in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce/50/
[02:38:22] PROBLEM - Parsoid on deployment-parsoid06 is CRITICAL: Connection refused
[03:58:13] Yippee, build fixed!
[03:58:13] Project selenium-MultimediaViewer » firefox,mediawiki,Linux,contintLabsSlave && UbuntuTrusty build #14: FIXED in 2 min 12 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/14/
[04:17:34] Yippee, build fixed!
[04:17:35] Project selenium-MultimediaViewer » chrome,beta,OS X 10.9,contintLabsSlave && UbuntuTrusty build #14: FIXED in 21 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=contintLabsSlave%20&&%20UbuntuTrusty/14/
[04:38:04] Continuous-Integration-Infrastructure, Upstream, WorkType-Maintenance, Zuul: Zuul deadlocks if unknown repo has activity in Gerrit - https://phabricator.wikimedia.org/T128569#2299873 (TerraCodes)
[04:59:22] mobrovac: I want to test git fetch https://gerrit.wikimedia.org/r/mediawiki/services/cxserver/deploy refs/changes/05/288905/2 && git cherry-pick FETCH_HEAD in beta, so usual cherry-pick in /srv/deployment/cxserver/deploy should work?
[04:59:47] eh, https://gerrit.wikimedia.org/r/#/c/288905/ one
[05:29:29] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:04:25] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[06:16:20] kart_: no need to cherry-pick, you can just go onto deployment-tin in /srv/deployment/cxserver/deploy and do a normal deploy in beta
[06:16:29] that won't do anything to prod
[06:16:34] and you can always revert
[06:19:16] mobrovac: thanks. I thought cherry-pick is quick, but deploy seems better.
[06:19:43] git pull is surely shorter to write :)
[06:25:17] mobrovac: The repository is dirty. Please commit or revert any uncommitted changes.
[06:25:59] mobrovac: I did git pull and started deploy start.
[06:26:07] wait wait kart_
[06:26:07] Is there anything I'm missing?
[06:26:17] if the repo is dirty you didn't actually do a git pull
[06:26:24] i.e. it didn't go through
[06:26:41] do a git diff and inspect the local changes
[06:27:05] if it's something harmless, just do git checkout --
[06:27:20] and then git pull && git submodule update --init
[06:28:34] mobrovac: yep.
needed submodule bump after pull :)
[06:28:43] deploying
[06:29:36] kk
[07:23:18] mobrovac: despite updating code in beta, restarting cxserver and checking that the code is updated, it isn't reflected; for example, https://cxserver-beta.wmflabs.org/v1/list/mt/nb/nn still returns {}, while it should be Apertium.
[07:23:25] mobrovac: any way to debug?
[07:23:55] did the deploy succeed?
[07:24:06] yes
[07:24:13] mobrovac: where to check log?
[07:24:34] which log are you referring to?
[07:26:41] mobrovac: http://pastebin.com/5rKGdS1E
[07:26:54] mobrovac: in case failed to deploy..
[07:27:24] mobrovac: it looks like we need sca02 there?
[07:27:34] cxserver03 doesn't exist
[07:27:41] kart_: you have to use scap, not trebuchet for deploying ...
[07:27:49] remember that we switched to it the other day?
[07:27:53] mobrovac: beta too?
[07:27:57] yes
[07:27:57] mobrovac: that's beta.
[07:28:03] OK :)
[07:28:15] but i should note that deploys are unfortunately broken in beta ATM
[07:33:23] mobrovac: ah :/
[07:56:32] kart_: the only external calls cxserver does are to yandex, right?
[08:19:34] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2300218 (Tgr)
[08:23:31] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2300245 (Tgr)
[08:23:34] mobrovac: in Production, yes.
[08:23:42] kk thnx
[08:35:47] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2300267 (Tgr)
[08:48:10] Browser-Tests, Wikidata, Tracking: [tracking] make Wikidata browsertests non-flaky - https://phabricator.wikimedia.org/T92619#2300299 (adrianheine) There are three non-flaky failing browser tests currently. They all pass locally against local instance for me. * Edit sitelinks.Remove multiple sitelin...
[09:13:52] PROBLEM - Puppet run on integration-slave-trusty-1015 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0]
[09:23:52] RECOVERY - Puppet run on integration-slave-trusty-1015 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:46:52] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, Ops-Access-Requests, Patch-For-Review: Allow RelEng nova log access - https://phabricator.wikimedia.org/T133992#2300553 (hashar) >>! In T133992#2298968, @Dzahn wrote: >> [labnodepool1001:~] $ id thcipriani >> uid=116...
[09:52:00] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2300563 (Tgr)
[10:14:29] Gerrit-Migration, Gitblit-Deprecate, Diffusion, MediaWiki-General-or-Unknown, Phabricator: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#2300678 (Paladox)
[10:29:15] Scap3, scap, Patch-For-Review: scap::target shouldn't allow users to redefine the user's key - https://phabricator.wikimedia.org/T132747#2300690 (Ladsgroup) @thcipriani Thanks! I removed that line from [[https://gerrit.wikimedia.org/r/280403|my commit]]. Please re-cherry-pick it once you're done. Tha...
[10:37:46] Continuous-Integration-Infrastructure, Release-Engineering-Team, Jenkins, WorkType-Maintenance: Upgrade Jenkins from 1.642.3 to 1.651.2 - https://phabricator.wikimedia.org/T133737#2300718 (hashar) James E. Blair (OpenStack) kindly replied on http://lists.openstack.org/pipermail/openstack-infra/2016...
[11:04:17] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, Ops-Access-Requests, Patch-For-Review: Allow RelEng nova log access - https://phabricator.wikimedia.org/T133992#2300765 (Joe) Neither labnet1001 nor labnet1002 have glance logs, so I consider that out of scope for n...
[11:04:24] PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:06:22] PROBLEM - Puppet run on integration-slave-trusty-1024 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[12:16:20] mobrovac: what do we need to fix 'scap deploy' in beta?
[12:16:42] mobrovac: any task/bug for it?
[12:16:58] kart_: it's a deployment-user-ssh-key problem, we have to wait for releng people on this
[12:17:13] mobrovac: yes. I got that error too!
[12:17:17] kart_: https://phabricator.wikimedia.org/T132747
[12:17:20] mobrovac: Okay!
[12:19:24] Gerrit-Migration, Gitblit-Deprecate, Diffusion, MediaWiki-General-or-Unknown, and 2 others: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#2300899 (Danny_B) May {T135403} be related?
[12:24:23] Gerrit-Migration, Gitblit-Deprecate, Diffusion, MediaWiki-General-or-Unknown, and 2 others: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#2300907 (Paladox) @Danny_B nope because you set the mirror link in diffusion for each repo. Any r...
[13:04:30] Yippee, build fixed!
[13:04:31] Project selenium-Math » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #21: FIXED in 29 sec: https://integration.wikimedia.org/ci/job/selenium-Math/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/21/
[13:04:39] Yippee, build fixed!
[13:04:40] Project selenium-Math » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #21: FIXED in 39 sec: https://integration.wikimedia.org/ci/job/selenium-Math/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/21/
[13:57:09] Continuous-Integration-Infrastructure, Jenkins: Have Jenkins to strip build parameters that are not explicitly defined in jobs - https://phabricator.wikimedia.org/T135506#2301119 (hashar)
[13:59:37] Continuous-Integration-Infrastructure, Release-Engineering-Team, Jenkins, Patch-For-Review, WorkType-Maintenance: Upgrade Jenkins from 1.642.3 to 1.651.2 - https://phabricator.wikimedia.org/T133737#2301134 (hashar) Open>Resolved Jenkins is upgraded and the build parameters are kept pr...
[14:04:43] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2301145 (zeljkofilipin) >>! In T134492#2298975, @Jdlrobson wrote: > #wikipedia-mobile would be great. Especially if it can ping team members when it fails. Mob...
[14:19:54] Continuous-Integration-Infrastructure, Jenkins, Upstream: Have Jenkins to strip build parameters that are not explicitly defined in jobs - https://phabricator.wikimedia.org/T135506#2301215 (hashar) Filed upstream bug https://issues.jenkins-ci.org/browse/JENKINS-34885 //Gearman plugin should whitelis...
[14:20:27] Continuous-Integration-Infrastructure, Jenkins, Upstream: Have Jenkins to strip build parameters that are not explicitly defined in jobs - https://phabricator.wikimedia.org/T135506#2301119 (hashar) p:Triage>Normal
[14:25:23] RECOVERY - Puppet run on integration-slave-trusty-1024 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:41:42] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2301256 (zeljkofilipin)
[14:42:32] Project beta-update-databases-eqiad build #8607: FAILURE in 3 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/8607/
[14:42:33] Project beta-code-update-eqiad build #104744: FAILURE in 3 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/104744/
[14:44:40] Yippee, build fixed!
[14:44:40] Project beta-code-update-eqiad build #104745: FIXED in 1 min 39 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/104745/
[14:45:14] (PS1) Zfilipin: James is owner of selenium-VisualEditor Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/289220 (https://phabricator.wikimedia.org/T134492)
[14:52:50] thcipriani: hrrrrrraaaaa
[14:53:02] thcipriani: been looking at deployment-tin and mwdeploy issue
[14:53:06] there is a local group for mwdeploy
[14:53:35] ugh. I cleaned up passwd and shadow but not group
[14:56:18] Beta-Cluster-Infrastructure: deployment-tin ssh: Connection closed by UNKNOWN - https://phabricator.wikimedia.org/T134777#2301308 (hashar) Might or might not be related, I have noticed puppet being weird: Notice: /Stage[main]/Mediawiki::Scap/File[/srv/mediawiki]/group: group changed 'mwdeploy' to 'mwdep...
[14:56:46] thcipriani: and I found out the old bug I was talking about yesterday
[14:56:49] (CR) Jforrester: [C: 1] James is owner of selenium-VisualEditor Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/289220 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[14:57:01] https://phabricator.wikimedia.org/T73480 Prevent puppet from creating local user when they are defined in LDAP
[14:58:26] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ownership of Selenium tests - https://phabricator.wikimedia.org/T134492#2301314 (zeljkofilipin)
[14:59:20] were the weird beta-update-databases-eqiad errors related to the ssh problem?
[14:59:32] (CR) Zfilipin: [C: 2] James is owner of selenium-VisualEditor Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/289220 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[14:59:36] (probably, me thinks)
[15:00:34] (Merged) jenkins-bot: James is owner of selenium-VisualEditor Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/289220 (https://phabricator.wikimedia.org/T134492) (owner: Zfilipin)
[15:01:21] matt_flaschen: Hi, would you be able to review https://gerrit.wikimedia.org/r/#/c/279957/ please.
[15:01:21] !log beta: salt -v '*' cmd.run 'groupdel mwdeploy'
[15:01:28] I've tested it myself and it worked.
[15:02:01] I'm working on another issue right now.
[15:02:47] Ok
[15:20:34] Yippee, build fixed!
[15:20:35] Project beta-update-databases-eqiad build #8608: FIXED in 33 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/8608/
[15:27:52] ostriches: Would this https://git-lfs.github.com/ help with phabricator mirroring mw core, since it is a big repo?
[15:28:14] Since it looks like mw core mirroring is still broken. And I'm not sure why.
[15:28:26] I've created this task https://phabricator.wikimedia.org/T135494
[15:46:04] getting ready to cut the branch
[15:55:11] (PS1) Zfilipin: Moritz is owner of selenium-Math Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/289231 (https://phabricator.wikimedia.org/T134492)
[16:01:58] !log branching wmf/1.28.0-wmf.2
[16:02:02] paladox: I'm pretty sure it's the same thing it's always been, git-upload-pack on github's side craps out because the push is requesting too many refs to be updated at once. It basically explodes because the resulting system call in the end is too long.
[16:02:46] ostriches: Oh, is there a way we can do it in batches?
[16:03:00] Probably, I haven't had the time to play with it though
[16:03:17] Once we get the refs all pushed, it should work just fine for subsequent pushes
[16:03:29] Oh ok.
[16:04:05] ostriches: I'm just wondering whether git-lfs is worth it. Would it allow more uploading?
[16:04:13] Or would it limit the file sizes more
[16:04:20] but repo sizes expanded.
[16:05:12] git-lfs wouldn't affect this at all.
[16:05:30] That has to do with storing large files (mostly binary blobs)
[16:05:38] Not a large underlying repo (which isn't the problem either)
[16:05:52] Our problem is too many refs/* at once to update, which are small, they're just pointers :)
[16:06:21] Oh
[16:07:03] Maybe as a workaround we could go in manually and mirror ref by ref, for example do refs/heads first, then tags and changes.
[16:07:55] ostriches ^^
[16:08:24] refs/heads/* and refs/tags/* should already be in sync, it's just refs/changes/*, and yeah, we'll wanna try to do it in batches.
[16:09:31] Ok.
[16:19:35] ostriches: Would following this https://confluence.atlassian.com/bitbucketserverkb/git-push-fails-fatal-the-remote-end-hung-up-unexpectedly-779171796.html help
[16:19:40] It is not github
[16:19:43] But related.
[16:20:10] No, not related.
[16:20:18] Again, that's about large repos. This isn't a large repo problem.
[16:20:24] The packed objects are all there already
[16:21:01] Oh yep sorry.
[16:21:33] ostriches: I managed to do https://github.com/paladox/testrepo-mw but I can't seem to get it to push there now
[16:21:39] It's still importing.
[16:21:40] More like....
[16:21:42] http://git.661346.n2.nabble.com/Git-is-not-scalable-with-too-many-refs-td6456443.html
[16:21:57] Specifically the post from Shawn where he outlines that having a bajillion refs doesn't scale
[16:22:18] Oh
[16:23:44] Yeah that thread basically describes (sorta) what's happening.
[16:23:53] Or at least a related case.
[16:25:17] Ok
[16:30:23] ostriches: Would this output saying it failed
[16:30:25] Connection to github.com closed by remote host.
[16:30:25] fatal: The remote end hung up unexpectedly
[16:30:25] fatal: sha1 file '' write error: Broken pipe
[16:30:25] error: failed to push some refs to 'ssh://xxxxx@github.com/paladox/testrepo-mw.git'
[16:30:54] be what you said, that it can't handle the amount of refs with git-upload-pack?
[16:32:02] Yep, that's it.
[16:32:10] Actually it's git-receive-pack I think
[16:32:14] I always get them mixed up :)
[16:33:02] Oh
[16:39:57] Scap3, scap, Patch-For-Review: scap::target shouldn't allow users to redefine the user's key - https://phabricator.wikimedia.org/T132747#2301580 (mmodell) I've reworked my previous keyholder patch, the whole thing is still a bit of a mess due to so many callers using scap::target differently. I wante...
[16:49:05] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2301617 (Tgr)
[16:52:58] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2301664 (Tgr)
[16:53:30] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2301669 (mmodell) This task is about coming up with a new process, not just cleaning up the current list of patches. It's not always so straightforward. Sometimes the...
[16:53:42] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2301670 (Eevans) >>! In T135427#2298455, @thcipriani wrote: > As of today we have 13 patches cherry picked to beta of various ages by various authors: > > ``` > thcipr...
[16:57:35] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2301706 (thcipriani) >>! In T135427#2301670, @Eevans wrote: > This one should be getting merged RSN (today? tomorrow?). Removing it would downgrade the Cassandra confi...
[16:59:30] Project beta-scap-eqiad build #102891: FAILURE in 74 ms: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/102891/
[17:04:39] Project beta-scap-eqiad build #102892: STILL FAILING in 99 ms: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/102892/
[17:11:18] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2301757 (Eevans) >>! In T135427#2301706, @thcipriani wrote: >>>! In T135427#2301670, @Eevans wrote: >> This one should be getting merged RSN (today? tomorrow?). Removi...
[17:15:49] Yippee, build fixed!
[17:15:50] Project beta-scap-eqiad build #102893: FIXED in 1 min 8 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/102893/
[17:28:54] Continuous-Integration-Config, Front-end-Standards-Group: Devise a recommended grunt configuration for linting and style-checking CSS files that isn't CSSlint - https://phabricator.wikimedia.org/T130721#2301853 (Jdforrester-WMF) Open>Resolved OK, this is now Resolved. * Use stylelint via `grunt-...
[17:43:38] Hello folks, newbie question about scap - I am trying to deploy from Tin to the AQS service and I get "Agent admitted failure to sign using the key.", that seems to be the same problem that I had with Beta (wasn't in the deploy-service group and wasn't able to use the key holder)
[17:44:15] so I am wondering which group I should be in to make scap work with my credentials (if any)
[17:44:26] I didn't find the proper docs probably :(
[17:51:09] (just re-joined, IRC issues)
[18:10:32] hashar, have you cut the new branch yet for the extensions?
[18:20:38] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2302067 (TheDJ)
[18:23:03] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2302090 (mmodell) >>! In T135427#2301757, @Eevans wrote: > Any policy/process moving forward that put a hard upper-bound on the amount of time a cherry-picked changeset...
[18:31:40] Browser-Tests, MobileFrontend, Reading-Web-Backlog: `Generic special page features.Search from Watchlist` test failing - https://phabricator.wikimedia.org/T130971#2302193 (MBinder_WMF)
[18:31:56] ostriches: I'm wondering, would manually removing changes from refs/changes inside the mw core repo on diffusion work? Or will that break everything? I mean just remove it so we can mirror first to github.
[18:32:50] Browser-Tests-Infrastructure, MobileFrontend, Reading-Web-Backlog, Upstream: Upstream: Issue with Chrome driver with resizing window - https://phabricator.wikimedia.org/T88288#2302231 (MBinder_WMF)
[18:33:06] If we remove them that would undo the work we did to put them there to begin with...
[18:33:24] If we add them back, it would introduce the same problem.
[18:33:30] We have now
[18:34:27] Oh.
[18:37:52] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2302398 (hashar) @Eevans Playing a maintenance on beta before rolling it on production is a perfectly legitimate use case and the whole point of beta cluster. Limiting...
[18:44:20] Continuous-Integration-Config, Front-end-Standards-Group: Devise a recommended grunt configuration for linting and style-checking CSS files that isn't CSSlint - https://phabricator.wikimedia.org/T130721#2144519 (hashar) @Jdforrester-WMF & @Esanders that looks great :-} Maybe you could announce it on wik...
[18:46:39] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2302494 (Anomie)
[18:47:00] ostriches: I was talking about temporarily. Only to start mirroring.
[18:47:15] And then it would re create refs/changes
[18:47:16] again
[18:47:33] That doesn't make sense.
[18:47:55] What would doing it temporarily solve?
[18:48:08] It would start it off.
[18:48:20] But then again it already is.
[18:48:38] Would git clone --mirror url on github work
[18:48:54] No that's not what we want.
[18:49:14] Oh ok
[18:49:45] Basically we need to iterate over all refs/changes/ and push them one by one to Github
[18:50:02] ostriches: Oh.
[18:50:03] That will have to be done manually-ish
[18:50:08] We can script it
[18:50:16] Oh, yep.
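The "iterate over all refs/changes/ and push them in batches" idea from [16:08:24] and [18:49:45] could be scripted roughly as below. This is only a sketch, not what was actually run: the remote name `github`, the batch size of 100, and every function name are assumptions made for illustration, and whether the remote accepts pushes to refs/changes/* was not verified in the discussion. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import Iterable, Iterator, List


def list_refs(prefix: str = "refs/changes") -> List[str]:
    """Return every ref name under `prefix` in the current repository."""
    out = subprocess.run(
        ["git", "for-each-ref", "--format=%(refname)", prefix],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def batches(refs: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` refs."""
    batch: List[str] = []
    for ref in refs:
        batch.append(ref)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def push_command(remote: str, refs: List[str]) -> List[str]:
    """Build one `git push`; each refspec maps a ref onto the same name."""
    return ["git", "push", remote] + [f"{ref}:{ref}" for ref in refs]


def push_in_batches(remote: str = "github", size: int = 100,
                    dry_run: bool = True) -> None:
    """Push refs/changes/* in small groups instead of all at once."""
    for group in batches(list_refs(), size):
        cmd = push_command(remote, group)
        if dry_run:
            print(" ".join(cmd))  # show what would be pushed
        else:
            subprocess.run(cmd, check=True)  # stop at the first failed batch
```

Keeping each push small is the whole point: per the discussion, the failure comes from the number of refs updated in one push, not from the size of the objects, so the batch size is the knob to tune.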
[18:50:23] Thanks for explaining
[18:50:23] Once that's done once replication should just work
[18:50:26] Yep
[18:50:32] Since the number of refs to push would be sane
[18:51:14] Yep
[18:51:26] Phabricator actually got this right-ish by only processing one ref at a time for import
[18:51:56] It's Github that's broken here, although for totally understandable reasons and I would prolly do the same if I were them
[18:52:09] ostriches: yep, once we migrate from gerrit to differential the problem should go away
[18:52:25] Since we can stop refs/changes
[18:52:37] Or they'll stop growing at least.
[18:52:44] Yep
[18:52:55] They've cluttered us permanently. I'm pretty convinced it's a flawed model
[18:53:13] Yep
[18:55:42] ostriches: When refs/changes start showing for me on the repo
[18:55:48] it slows diffusion
[18:56:13] On my personal laptop
[18:56:23] That doesn't entirely surprise me.
[18:56:28] http://www.test-random-wikisaur.tk/diffusion/8/
[18:56:32] cc twentyafterfour ^
[18:56:46] (you were right, ofc, importing all refs was a resource drain)
[18:56:58] Yep. It seems to work for wikimedia phabricator
[18:57:09] Well we have stronger hardware ;-)
[18:57:19] But on my personal pc it doesn't work.
[18:57:20] Yep
[18:57:36] But honestly, I dunno the alternative though if people really want the ability to view a random sha1 that may never actually end up being committed in a *different* git browser.
[18:57:51] Oh
[18:57:59] I think it's mainly just because gerrit is lacking in its diffs...but I don't think using $random_other_tool to display that is the right fix.
[18:58:09] Oh
[18:58:15] Whether it's gitblit or phabricator.
[18:59:46] Yep, not sure since I think phabricator looks better and is faster than gitblit
[19:00:10] Well I don't disagree that Phabricator is a better solution than Gitblit for git browsing.
[19:00:23] I wonder if we can block refs/changes from being mirrored
[19:00:25] I'm more talking about the (diffusion) or (gitblit) links in Gerrit
[19:00:25] ostriches: yeah. having all those changesets is a huge resource drain. it's slowed down repo importing by a lot, now all our repos lag, sometimes hours instead of seconds to see a commit show up in diffusion
[19:00:31] I've never found *those* terribly useful.
[19:00:36] IMO it was a big mistake
[19:01:03] it's really nice to see a change in diffusion rather than gerrit, but I don't really think it's worth the cost overall
[19:01:16] plus it's a lot of noise in phab
[19:06:17] for my money it's not worth having but I know some like it
[19:15:02] twentyafterfour: ostriches: Is there a way we can add refs/changes to the ignored list?
[19:15:12] for mirroring
[19:16:32] so it will still import refs/changes into diffusion, but when pushing to the mirror it will ignore refs/changes
[19:17:02] Because I like viewing changes from refs/changes in diffusion because gerrit sometimes crashes
[19:17:09] if it is a big file.
[19:18:59] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2302660 (Tgr)
[20:26:38] ostriches how would we manually add refs/changes to the mirror
[20:26:49] I would like to try that please
[20:34:43] the evidence is building that it's unwise to try to publish all these refs
[20:55:20] Release-Engineering-Team, Release: MW-1.28.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T135559#2303065 (Luke081515)
[20:55:31] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2265934 (Luke081515)
[20:57:52] twentyafterfour: You going to cut branches etc today? wikitech:deployments says hashar would? Is the second one out of date?
[20:58:25] Luke081515 I did it today
[20:58:50] twentyafterfour: ok, then you get the assignment, of the blocker, ok?
;)
[21:00:20] Luke081515: what blocker?
[21:00:24] it's already deployed :D
[21:01:02] just to make a kind of order :D
[21:01:08] https://phabricator.wikimedia.org/T134450
[21:01:10] is it
[21:01:23] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2303089 (Luke081515) p:Triage>Normal
[21:01:27] that one
[21:04:35] twentyafterfour: Luke081515 : Mukunda has been scheduled for this week, as I understood it during our weekly meeting yesterday
[21:04:47] I guess the Deployment page hasn't been updated to reflect that
[21:04:48] OR
[21:05:00] I have misunderstood and skipped branching :(
[21:05:36] which means I owe Mukunda a few of his favorite beverages and an apology on wikitech-l if that resulted in some delay
[21:05:47] -operations said he did it ;)
[21:06:01] so he gets the task now ;)
[21:06:13] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2303093 (Luke081515) a:mmodell
[21:08:11] Deployment-Systems, Scap3, Patch-For-Review: Scap scripts on mw1017 are incorrect - https://phabricator.wikimedia.org/T135206#2303096 (thcipriani) Open>Resolved a:thcipriani
[21:11:28] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2303099 (mmodell) Open>Resolved
[21:11:38] Release-Engineering-Team, Release: MW-1.28.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T134450#2265934 (mmodell) Resolved>Open
[21:13:07] Continuous-Integration-Scaling, Labs, Labs-Infrastructure, Patch-For-Review: Bump quota of Nodepool instances (contintcloud tenant) - https://phabricator.wikimedia.org/T133911#2303104 (Andrew) a:Andrew
[21:13:26] Luke081515: one day it will be automatic / safe enough that you could be the one cutting the branch and pushing to prod :)
[21:13:58] :D
[21:14:24] Guess I need a bit more mediawiki
programming experience before ;)
[21:14:49] the first time I met greg-g face to face was at the Amsterdam hackathon in 2013
[21:15:04] and we had a discussion about aiming at a one click deploy button
[21:15:40] we are not so far now :]
[21:15:52] a lot of enhancements have been made over the last three years
[21:16:19] it's getting closer
[21:16:22] Luke081515: usually https://logstash-beta.wmflabs.org/#/dashboard/elasticsearch/fatalmonitor is all you need :)
[21:16:32] Luke081515: that one is for beta but there is the exact same one in production
[21:16:46] Luke081515: if you see bars: press roll back :)
[21:16:52] * twentyafterfour doesn't have much mediawiki-specific experience, though I know the intricacies of how extensions and branches interact now
[21:17:14] I have lost most of my mediawiki knowledge to be honest
[21:17:16] ah, ok :)
[21:17:25] hard to keep up with all the changes that have been going on
[21:18:04] I don't even know how to log an error :/ though t.g.r reviewed one of my patches, pasting a nice obscure PHP oneliner for me
[21:18:34] anyway I am not proud of my current lack of mediawiki knowledge
[21:18:46] then for deployment, I don't think much is needed besides some basic knowledge
[21:19:09] what one really needs is knowing the different breaks involved and, as twentyafterfour said, how the extensions/branches interact
[21:19:43] and security patching :(
[21:19:55] which is currently the biggest pain point in the weekly process
[21:20:18] yeah
[21:20:30] I am confident we can get a script to streamline that part
[21:20:34] even get it part of scap
[21:21:01] but I haven't found out how to simulate applying several patches that depend on each other
[21:23:36] we do have the 'scap security-check' command. That was as far as I got on it.
https://github.com/wikimedia/scap/blob/master/scap/main.py#L161
[21:24:06] git apply --index --cached , would apply the patches to the index, not touching the working tree
[21:24:34] so once all patches have been applied properly to the index you know they are working or have caught errors and can ditch the index
[21:24:54] then if all passed, brute git apply already knowing it is going to pass
[21:27:46] anyway sleep time!
[21:28:23] oh forgot to say: I got Jenkins upgraded after all
[21:29:06] there is a bunch of interesting bits from https://phabricator.wikimedia.org/T133737#2290669
[21:29:26] including how to trigger jobs using gearman CLI :D
[21:30:02] echo '{"SOME_PARAM":3,"OFFLINE_NODE_WHEN_COMPLETE":1}'|gearman -h 127.0.0.1 -p 4730 -v -f build:some-job
[21:30:08] my test jenkins is currently at 2.5 :D
[21:30:15] my private one
[21:30:16] ooh neat.
[21:30:26] the biggest change:
[21:30:36] Successful builds have had a blue icon since 2.0 \o/
[21:30:41] Jenkins 2.x is apparently backward compatible backend / API wise
[21:31:05] seems it "just" adds a bunch more stuff on top of it, the most important one being the pipeline/workflow plugin which is now integrated
[21:31:10] and the second is the huge UI revamp
[21:31:45] Luke081515: iirc success icons always have been blue. We have a plugin to turn them green :D
[21:32:12] hashar: ah, ok. What's the name of that plugin?
:D
[21:34:06] Luke081515: https://wiki.jenkins-ci.org/display/JENKINS/Green+Balls
[21:34:44] Luke081515: and yeah it is definitely installed on wmf setup
[21:35:01] merci :)
[21:35:33] https://wiki.jenkins-ci.org/display/JENKINS/AnsiColor+Plugin transforms ansi sequences to html for colorified console output
[21:35:45] https://wiki.jenkins-ci.org/display/JENKINS/Timestamper adds a time prefix
[21:37:56] I already have the last installed, but the first ones sound good :)
[21:38:04] Luke081515: thcipriani and I have a pairing session, maybe he would be interested in upgrading to Jenkins 2.X :)
[21:38:41] :)
[21:40:20] * robla thinks he needs a new Phab permission to add #ArchCom-Approved as a milestone within #ArchCom-RFC (per T133803)
[21:40:32] Luke081515: and here are the plugins that have been installed on 1000+ setup http://stats.jenkins-ci.org/jenkins-stats/svg/201603-top-plugins1000.svg :D
[21:41:30] robla: are you a project admin?
[21:41:37] then you can. Otherwise you can ask me ;)
[21:42:55] Luke081515: I'm a project admin and have project create permission, but as near as jaufrecht and I could tell when we were looking at the interface, I didn't have the permission. maybe we weren't looking in the right spot
[21:43:15] * robla double checks that he's listed as a project admin...maybe that's his problem
[21:43:36] have a good day *wave*
[21:43:47] bye, hashar :)
[21:44:34] robla: Use the links here: https://phabricator.wikimedia.org/project/subprojects/52/
[21:45:20] oh, look at that!
[21:45:28] * robla attempts to do the deed
[21:46:12] robla: the "subprojects" menu entry was disabled, so it was not easy to find it, but I reenabled it ;)
[21:49:00] Luke081515: thanks!
it's created now: https://phabricator.wikimedia.org/project/view/2002/
[21:55:47] Continuous-Integration-Infrastructure, ArchCom-RfC (ArchCom-Approved), RfC: RFC: Extensions continuous integration - https://phabricator.wikimedia.org/T1350#2303226 (RobLa-WMF)
[22:03:53] twentyafterfour: We can manually add refs/changes in github per ostriches
[22:04:24] Which would be a workaround.
[22:04:43] We run git push --mirror url
[22:35:22] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:52:33] Scap3, scap, Patch-For-Review: scap::target shouldn't allow users to redefine the user's key - https://phabricator.wikimedia.org/T132747#2303453 (thcipriani) a:thcipriani>mmodell >>! In T132747#2301580, @mmodell wrote: > I've reworked my previous keyholder patch, the whole thing is still a bi...
[23:00:22] RECOVERY - Puppet run on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
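The index-only rehearsal suggested at [21:24:06] for a stack of security patches that depend on each other might look something like the sketch below. It is an illustration under stated assumptions, not part of scap: the function names are hypothetical, the patch paths are supplied by the caller, and it assumes the index matches HEAD beforehand because the cleanup step throws away anything staged. It uses `git apply --cached`, which applies to the index only and leaves the working tree untouched.

```python
import subprocess
from typing import List, Sequence


def rehearsal_commands(patches: Sequence[str]) -> List[List[str]]:
    """Commands that apply each patch, in order, to the index only."""
    return [["git", "apply", "--cached", patch] for patch in patches]


def patches_apply_cleanly(patches: Sequence[str]) -> bool:
    """Rehearse a patch stack against the index, then restore the index.

    Later patches see the result of earlier ones because each is applied
    on top of the already-patched index. Assumes nothing else is staged.
    """
    ok = True
    try:
        for cmd in rehearsal_commands(patches):
            if subprocess.run(cmd).returncode != 0:
                ok = False  # this patch does not apply on top of the others
                break
    finally:
        # Mixed reset: index goes back to HEAD, working tree is left alone.
        subprocess.run(["git", "reset", "--quiet"], check=True)
    return ok
```

If the rehearsal returns True, the same patches can then be applied for real in the same order, already knowing they are expected to go through.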