[00:36:09] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<55.56%)
[03:31:03] Yippee, build fixed!
[03:31:04] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #526: FIXED in 12 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/526/
[03:34:41] Project beta-scap-eqiad build #44527: FAILURE in 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44527/
[03:44:27] Yippee, build fixed!
[03:44:28] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #359: FIXED in 37 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/359/
[03:54:47] Yippee, build fixed!
[03:54:48] Project beta-scap-eqiad build #44529: FIXED in 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44529/
[06:36:03] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0]
[06:36:11] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[07:01:06] RECOVERY - Puppet failure on deployment-apertium01 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:56:33] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:46:49] hello
[08:56:44] hashar: sorry, will be 5 or so minutes late, have to get something to eat, crazy morning
[08:58:48] zeljkof: take your time :)
[09:10:11] hashar: I am in the hangout
[09:12:15] good
[09:12:20] zeljkof: joining
[09:54:33] Project beta-scap-eqiad build #44565: FAILURE in 37 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44565/
[09:59:31] aharoni: ready?
[09:59:47] in a couple of minutes
[09:59:57] aharoni: ok, ping me when ready
[10:02:36] zeljkof: in hangout
[10:04:30] aharoni: me too, but it is all black
[10:04:34] reloading
[10:15:05] Yippee, build fixed!
[10:15:05] Project beta-scap-eqiad build #44567: FIXED in 1 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44567/
[10:16:53] 10Continuous-Integration: mediawiki-extensions-hhvm failed on a patchset with "RuntimeException: Cannot override frozen service "storage"." - https://phabricator.wikimedia.org/T91888#1099692 (10hashar)
[10:17:12] 10Continuous-Integration, 10Flow: Jenkins reports test failures in current master: Cannot override frozen service "storage" - https://phabricator.wikimedia.org/T91951#1099472 (10hashar)
[10:17:39] 10Continuous-Integration: mediawiki-extensions-hhvm failed on a patchset with "RuntimeException: Cannot override frozen service "storage"." - https://phabricator.wikimedia.org/T91888#1098226 (10hashar) Seems there is some kind of race condition. I have marked your bug as a duplicate of T91951 which has the test...
[12:37:35] PROBLEM - SSH on deployment-lucid-salt is CRITICAL: Connection refused
[13:08:12] (03CR) 10Jdlrobson: "@hashar ping! :)" [integration/config] - 10https://gerrit.wikimedia.org/r/191046 (https://phabricator.wikimedia.org/T74794) (owner: 10Hashar)
[14:33:20] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[14:34:03] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[14:34:25] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[14:34:37] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[14:34:48] <^d> YuviPanda: Where'd we get on making sure -palladium is ok?
[14:35:01] <^d> (and by extension: am I ok to mess with mc[1-3] again?)
[14:35:11] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[14:35:11] PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[14:35:47] ^d: yup, yup.
[14:35:51] ^d: we figured out the salt mess
[14:35:54] PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[14:36:02] ^d: and that we have to 1. remove the public key and 2. restart salt-minion on each client as we go ahead
[14:36:05] palladium itself is ok
[14:36:29] <^d> Yeah we knew that step still
[14:36:42] YuviPanda: followup. Where'd we land on automating setting the salt master public key? Is that a good idea? Should I do that?
[14:36:43] PROBLEM - Puppet failure on deployment-stream is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[14:36:55] thcipriani: let's give it a shot, yeah.
[14:37:30] I'm going to try get rid of role::parsoid::beta today
[14:38:40] neat. So many beta roles :(
[14:38:47] PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 71.43% of data above the critical threshold [0.0]
[14:39:23] thcipriani: it's the last one, I think :D
[14:39:29] but I've been hacking away at these for months...
[14:39:44] thcipriani: scap is the only significant one left
[14:40:22] nice!
[14:41:37] PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:41:55] thcipriani: I moved the scap prod code from some strange places into the scap/ module. Now just need to re-org it a bit and kill beta/scap
[14:45:20] one thing I started to work on a little bit is combining role::deployment::deployment_servers::{production,labs}. I feel like staging-tin is going to be a bit…tricky.
[14:48:48] ^d: do you remember how to add someone to a wmf ldap group?
[14:48:55] thcipriani: yeah...
[14:48:57] <^d> Yep
[14:49:38] <^d> Who?
[14:50:23] ^d: joal
[14:50:30] analytics, there's a wmfall email..
[14:51:00] <^d> k
[14:51:24] <^d> joal is already a member of the group, skipping.
[14:51:25] <^d> No changes to make; exiting.
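For context on the LDAP exchange above: membership can be double-checked with a plain ldapsearch query before assuming the group change worked. The sketch below is illustrative only; the server, base DN and group name are placeholders, not the real Wikimedia values.

    # Hedged sketch: confirm that a user's DN shows up as a member of a group.
    # Host, base DN and group are placeholders; only the username comes from the log.
    ldapsearch -x -LLL -H ldap://ldap.example.org \
      -b 'ou=groups,dc=example,dc=org' '(cn=somegroup)' member \
      | grep -i 'uid=joal'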
[14:58:21] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:59:39] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0]
[15:00:13] RECOVERY - Puppet failure on deployment-parsoid05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:00:13] RECOVERY - Puppet failure on deployment-mediawiki02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:01:31] ^d: hmm, so I added him on friday, and thought it was ok, but he can't access graphite...
[15:01:41] RECOVERY - Puppet failure on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[15:01:42] RECOVERY - Puppet failure on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:02:06] <^d> YuviPanda: Using CN to login or SN? You use CN.
[15:02:19] oh
[15:02:21] good point
[15:04:14] ^d: he did use cn (joal), and that didn't really work...
[15:04:26] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:05:14] <^d> YuviPanda: Ask another ops? wfm :p
[15:05:38] ^d: where will I go find someone from ops? those people never are around… :P
[15:05:40] I'll poke later
[15:05:41] thanks ^d
[15:05:47] <^d> yw
[15:05:56] RECOVERY - Puppet failure on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:08:46] RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]
[15:20:23] zeljkof: did you change something in Jenkins? as of the last build, no test results for e.g. https://integration.wikimedia.org/ci/view/BrowserTests/view/-All/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/
[15:20:41] chrismcmahon: no, did not touch jenkins today
[15:21:12] chrismcmahon: https://integration.wikimedia.org/ci/view/BrowserTests/view/-All/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/45/consoleFull
[15:21:18] there was something wrong with the job
[15:21:29] 00:00:15.421 Failed to load 'jpg' programming language for file features/support/exif.jpg: cannot load such file -- cucumber/jpg_support/jpg_language
[15:21:48] looks like somebody committed jpeg file to the repo and confused cucumber
[15:23:08] weird, OK
[15:23:51] zeljkof: btw, since Sauce changed the default version of Chrome, every test of ours that uses an "overlay" fails. it is annoying.
[15:24:38] chrismcmahon: we explicitly use the latest supported chrome, I do not think sauce did any major changes
[15:24:54] thcipriani: ^d we should also talk about how deploys are going to happen in the 'staging' cluster.
[15:25:04] doing the thing with jenkins makes me somewhat uncomfortable
[15:25:13] because it's no longer a 'true' prod environment then
[15:25:18] chrismcmahon: https://github.com/wikimedia/integration-config/blob/master/jjb/macro-browsertests.yaml#L49-L50
[15:25:38] hrm.
[15:26:17] (03Abandoned) 10Hashar: Merge branch 'upstream-debian-sid' into debian [integration/zuul] (debian) - 10https://gerrit.wikimedia.org/r/191770 (owner: 10Hashar)
[15:26:19] I don't have any solutions / suggestions atm.
[15:29:46] (03PS1) 10Hashar: .gitreview for debian/precise-wikimedia branch [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195269
[15:30:04] zeljkof: this thing with the .jpg in the build, it was not a problem until a few hours ago https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/44/. I'm looking...
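A note on the 'jpg' failure quoted above: cucumber tries to load everything under features/support, so a binary fixture in that directory trips it up. One possible workaround (a hedged sketch, not necessarily what was done here) is to exclude such files from loading; the pattern below is an example.

    # Hedged sketch: stop cucumber from trying to require a binary fixture.
    # --exclude skips matching files; adjust the pattern to the repo's layout.
    bundle exec cucumber --exclude '\.jpg$' features/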
[15:31:58] chrismcmahon: maybe cucumber upgrade change something
[15:32:46] (03PS1) 10Hashar: Merge Zuul upstream 2.0.0-304-g685ca22 [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195270
[15:37:25] (03CR) 10Hashar: [C: 032 V: 032] .gitreview for debian/precise-wikimedia branch [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195269 (owner: 10Hashar)
[15:37:37] (03CR) 10Hashar: [C: 032 V: 032] Merge Zuul upstream 2.0.0-304-g685ca22 [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195270 (owner: 10Hashar)
[15:38:24] (03PS1) 10Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[15:38:55] (03Abandoned) 10Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian-precise-venv) - 10https://gerrit.wikimedia.org/r/194520 (https://phabricator.wikimedia.org/T48552) (owner: 10Hashar)
[15:39:03] (03Abandoned) 10Hashar: Vcs-* points to openstack-infra now [integration/zuul] (upstream-debian-sid) - 10https://gerrit.wikimedia.org/r/194398 (owner: 10Hashar)
[15:39:29] (03Abandoned) 10Hashar: Add .gitreview [integration/zuul] (upstream-debian-sid) - 10https://gerrit.wikimedia.org/r/194397 (owner: 10Hashar)
[15:41:53] zeljkof: I sent email with the Sauce announcement about the change that affected our tests that use overlays.
[15:43:02] (03PS2) 10Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[16:04:10] (03PS1) 10Hashar: Apply wmf patches [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195279
[16:05:05] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-salt is OK: OK: Less than 100.00% above the threshold [0.0]
[16:05:18] (03PS1) 10Hashar: wmf: soften requirements [integration/zuul] (patch-queue/debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195280
[16:05:20] (03PS1) 10Hashar: Merger: ensure_cloned() now looks for '.git' [integration/zuul] (patch-queue/debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195281
[16:05:22] (03PS1) 10Hashar: Update merge status after merge:merge is submitted [integration/zuul] (patch-queue/debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195282
[16:05:24] (03PS1) 10Hashar: Ensure the repository configuration lock is released [integration/zuul] (patch-queue/debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195283
[16:14:48] Project beta-scap-eqiad build #44602: FAILURE in 44 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44602/
[16:24:31] (03PS2) 10Hashar: Apply wmf patches [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195279
[16:24:33] (03PS3) 10Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[16:27:56] zeljkof: I think I figured out the issue with the jpg in the MF repo https://git.wikimedia.org/commit/mediawiki%2Fextensions%2FMobileFrontend/53ea70c609ad6fd7dfa0301fe076d28eb8c07ed0
[16:28:24] now I have to figure out what to do about it.
[16:30:00] zeljkof: would it make sense to revert https://gerrit.wikimedia.org/r/#/c/195037/ ? Or more sense to actually clean up those useless tests?
[16:30:22] thcipriani: ^d so the parsoid role is stalled until thursday, because I don't want to futz with beta VE testing until then. anything I can do to help you guys?
[16:31:04] chrismcmahon: Gemfile.lock should be committed
[16:31:13] they did not know what they were doing
[16:31:25] zeljkof: right. OK, let's revert that now then
[16:31:35] * YuviPanda looks at staging project
[16:31:38] YuviPanda: if you want to look into staging-tin I think that's the machine that has the most roles I'm worried about.
[16:31:45] heh, looking
[16:32:11] thcipriani: we can ignore a fair bit of them
[16:32:30] zeljkof: I added you to review https://gerrit.wikimedia.org/r/#/c/195294/
[16:32:32] admin, releases::upload, labsdb::manager
[16:32:48] there are still some bogus tests in that repo though that should be removed.
[16:33:08] chrismcmahon: please also add other people that were on the review list that did the revert
[16:33:44] I did
[16:34:38] YuviPanda: there is at least one role on there that requires some labs-private files that don't exist, can't remember which one :\
[16:34:54] thcipriani: I'm investigating / looking into the errors.
[16:35:00] Yippee, build fixed!
[16:35:00] Project beta-scap-eqiad build #44604: FIXED in 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44604/
[16:43:36] wat. https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm/5980/console "Unable to fork, can't test merge" mean anything to you hashar YuviPanda anyone?
[17:02:30] zeljkof: Finally got it running in "http://176.58.110.226:4567/wiki/References?veaction=edit", thank you for http://www.installationpage.com/selenium/how-to-run-selenium-headless-firefox-in-ubuntu/
[17:02:39] chrismcmahonbrb: hey. no idea, sorry
[17:03:12] vikasyaligar: great
[17:03:41] zeljkof: screenshots can be found here => http://176.58.110.226:8000/screenshots/
[17:03:54] zeljkof: github: https://github.com/vikassy/Screenshot-recorder
[17:24:18] (03CR) 10Krinkle: "For the main 'wikimedia-fundraising-crm' job this would ignore the submodules and use latest master. Which will most likely lead to two is" [integration/config] - 10https://gerrit.wikimedia.org/r/195074 (https://phabricator.wikimedia.org/T91905) (owner: 10Awight)
[17:36:35] greg-g: sent an email to Sarah asking for schedule tetris help.
[17:36:58] bd808: thanks
[17:37:11] sorry to kick it back, I had no chance
[17:40:13] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[17:49:48] hey deployment-preppers, I have a question for you if you use git deploy at all
[17:51:20] hi apergos
[17:51:24] hey
[17:51:31] is git deploy the same as trebuchet?
[17:51:39] uh yes sort of
[17:51:46] so trebuchet does deploy a lot of things
[17:51:54] scap itself is deployed via trebuchet
[17:51:58] and so is the jobrunner code, etc
[17:52:03] what I mean is the specific interface to trebuchet where you type at the command line 'git deploy start', 'git deploy sync' etc
[17:52:13] other than that I don't probably care
[17:52:15] I've had to use it a couple of times.
[17:52:20] (to deploy jobrunner updates)
[17:52:29] ah
[17:52:40] o you have root on the project?
[17:52:55] yup
[17:53:08] * bd808 grudgingly acknowledges using trebuchet
[17:53:25] * YuviPanda tags bd808
[17:53:28] YOU ARE IT!
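For readers who have not used the 'git deploy start' / 'git deploy sync' interface mentioned above, a manual Trebuchet deploy looks roughly like the sketch below. The repository path is a placeholder, not one taken from this log, and the comments describe the usual intent rather than the exact internals.

    # Hedged sketch of a manual Trebuchet (git-deploy) run on the deploy host.
    cd /srv/deployment/example/repo   # placeholder path
    git deploy start                  # take the deployment lock and record the start state
    git pull                          # bring in the commits to ship
    git deploy sync                   # ask the salt minions to fetch/check out, then poll for their reports
    # git deploy abort                # back out if something looks wrong before syncing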
[17:53:41] The git-deploy interface is a pita
[17:54:10] it was weird the only two times I had to use it, I'll admit
[17:54:18] I'm glad scap comes through from provider => trebuchet
[17:54:19] I blame salt
[17:54:38] indeed, that's why I eat all the sugar instead
[17:54:46] * YuviPanda looks for a pack of M&Ms
[17:55:01] the async reality of salt combined with the sync expectations of deployment are not a great combo
[17:55:29] so trebuchet tries to cover it all up with polling
[17:55:54] but then makes the deployer actually trigger each round of polling
[17:56:20] anyway, apergos did you need help with something git-deploy related?
[17:56:21] yeah well the git deploy interface is *cough* broken on deployment prep right now, I'll have to rebuild the trigger package
[17:56:28] ah
[17:56:45] no I need to feel sorry for myself that the new salt version has a teeny tiny bug about parsing arguments... meeeehhhh
[17:56:56] yes, we need it if that's the question
[17:57:13] well the question was going to be whether uh
[17:57:22] any non root people use it
[17:57:33] and whether I could sucker people into
[17:57:41] sec I have 3 mins to join a meeting, brb
[17:57:49] none of the automated deploys use git-deploy but several manual processes do
[17:59:14] yeah then my best bet is indeed to rebuild the package, or someone will be sad
[17:59:43] nothing uses git-deploy, why do we mess with git-deploy
[18:14:03] apergos: btw, if you're messing around with more salt packages, putting in salt-syndic would also be nice :D
[18:14:58] uh it's in the repo
[18:15:03] is it?
[18:15:07] yes
[18:15:09] oh
[18:15:18] my bad I am grabbing from the salt ppa for deployment-prep
[18:15:28] it will go into the repo along with all the rest, when we update on prod
[18:15:32] The following packages will be DOWNGRADED:
[18:15:32] salt-common salt-master salt-minion
[18:15:35] apergos: coool :)
[18:15:45] if you want to try to upgrade it on salt-master
[18:15:51] nah, no rush...
[18:16:12] (remember I couldn't get it to uninstall the old one, nor could I get the upgrade to work because it claimed the old one was and yet was not installed)
[18:16:42] just do the pinning, I might have left the pinning file around, so all you would need is to add the salt ppa repo
[18:16:58] deployment-prep only though ;-)
[18:17:32] if you are able to get rid of the old package I will install the new one for you though, free of charge :-P but configuration would be up to you
[18:19:03] apergos: ah, I killed the old package. purge let it die.
[18:19:14] apergos: so deployment-salt has no salt-syndic now.
[18:20:24] ok sweet
[18:20:28] lemme get that going for you
[18:21:02] (03PS10) 10Awight: Jenkins job builder definition for CRM job [integration/config] - 10https://gerrit.wikimedia.org/r/195063 (https://phabricator.wikimedia.org/T91895)
[18:21:06] thankyou apergos
[18:21:13] yw
[18:21:18] (03Abandoned) 10Awight: CRM job can be run on submodules [integration/config] - 10https://gerrit.wikimedia.org/r/195074 (https://phabricator.wikimedia.org/T91905) (owner: 10Awight)
[18:23:35] YuviPanda: done
[18:23:40] apergos: \o/
[18:23:50] have fun!
[18:25:10] so anyone using git deploy sync today will find that it doesn't actually do the checkout
[18:25:12] sadly.
[18:25:26] they can either a) poke me about the repo and I can do it or
[18:25:39] they can b) wait until I get the trigger package rebuilt and out there, hopefully tomorrow.
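The 'pinning' mentioned a little earlier is ordinary apt preference pinning so that one host prefers the upstream salt PPA packages. A rough sketch follows; the PPA name, file path and priority are assumptions, not copied from the project.

    # Hedged sketch: prefer salt packages from the upstream PPA on a single host.
    sudo add-apt-repository ppa:saltstack/salt
    printf '%s\n' 'Package: salt-*' 'Pin: release o=LP-PPA-saltstack-salt' 'Pin-Priority: 1001' \
      | sudo tee /etc/apt/preferences.d/salt.pref
    sudo apt-get update && sudo apt-get install salt-syndic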
[18:25:55] (tonight I will probably be too crispy to finish that up)
[18:35:33] hashar: fyi, I think the CiviCRM job is ready to go: https://gerrit.wikimedia.org/r/195063
[18:36:00] I'm using zuul-cloner for the submodules as you suggested, looking forward to testing the job on those repos as well!
[18:50:37] thcipriani: I see what you mean by tin being problematic...
[18:50:38] * YuviPanda ponders
[18:51:07] I think class role::deployment::deployment_servers::labs {
[18:51:09] needs to go
[18:51:10] and be unified
[18:51:12] and then we just use that
[18:54:44] YuviPanda: yeah, I had a patch where I started to combine deployment_servers::labs and production into commons as best I could, got lost somewhere along the way though.
[18:54:55] heh
[18:55:00] do you still have it?
[18:55:17] I do, I could push it up to gerrit if you like.
[18:56:42] thcipriani: please do!
[18:57:25] I'd like to test my CI job on changes for a new repo--is there a way to do that manually?
[18:58:09] YuviPanda: kk, digging through some conflicts.
[18:58:17] hashar: I just realized there's an issue with the zuul-cloner approach for submodules. If the parent repo patch includes a submodule bump, that won't be picked up by the cloner. ?
[18:58:34] thcipriani: yeah, cool. I think that's where we should start.
[19:05:05] whenever somebody uses the word "submodule" it is that it's causing a problem we would not have without them
[19:05:26] hm.. wikibugs gone?
[19:05:36] check if it's still in the -lab
[19:05:39] -labs
[19:07:28] mutante: err, agreed, git submodules are poorly supported. But without them, what we would have are symlinks.
[19:08:38] awight: fair, i don't even claim i have solutions, i just had to point out the pattern i notice on IRC
[19:08:50] (on more than one channel)
[19:10:02] (03CR) 10Ori.livneh: [C: 031] Ensure the repository configuration lock is released [integration/zuul] (patch-queue/debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195283 (owner: 10Hashar)
[19:10:02] mutante: That is totally a thing, it's true! But I think supporting any semi-standard solution is going to be better than homebrew SHA1s and symlinks, plus external configuration hell that we would probably have instead...
[19:10:42] The rumor is that Linus was forced to add submodules to git, and doesn't want to implement them well :)
[19:11:02] I'll have to find an article about that before I slander much further tho...
[19:12:40] The latest example of submodules hurting ordinary people such as myself is, when I tried to commit a composer vendor/ dir. argh.
[19:17:24] (03PS1) 10Krinkle: Enable 'mediawiki-extensions-{hhvm,zend}' in all extensions it covers [integration/config] - 10https://gerrit.wikimedia.org/r/195335 (https://phabricator.wikimedia.org/T91968)
[19:18:12] This is how bad things happen to good people ;) http://www.gelato.unsw.edu.au/archives/git/0612/index.html
[19:20:44] YuviPanda: https://gerrit.wikimedia.org/r/#/c/195336/ sorry that took so long. Like I was solving a riddle with that rebase :)
[19:22:07] zuul stuck?
[19:22:39] (03CR) 10Jforrester: Enable 'mediawiki-extensions-{hhvm,zend}' in all extensions it covers (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/195335 (https://phabricator.wikimedia.org/T91968) (owner: 10Krinkle)
[19:23:40] thcipriani: :) I'm looking at it now
[19:23:53] thcipriani: mind if I make some changes?
[19:24:30] hashar, Krinkle: Zuul's looking a bit… yeah, what legoktm said.
[19:24:35] YuviPanda: not at all, go for it.
[19:26:03] (03CR) 10Krinkle: Enable 'mediawiki-extensions-{hhvm,zend}' in all extensions it covers (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/195335 (https://phabricator.wikimedia.org/T91968) (owner: 10Krinkle)
[19:27:19] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce build #357: ABORTED in 32 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce/357/
[19:27:19] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #514: ABORTED in 1 hr 17 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/514/
[19:27:20] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #552: ABORTED in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/552/
[19:27:41] James_F: looking
[19:30:53] !log Re-established Gearman connection from Jenkins
[19:30:56] Logged the message, Master
[19:31:08] !log Restarted slave agent on gallium
[19:31:10] Logged the message, Master
[19:31:27] Krinkle: Thanks.
[19:35:24] !log Delete integration-slave1010
[19:35:26] Logged the message, Master
[19:36:07] thcipriani: hah! actually, I found one of my older patches for this, in a branch. let me merge that with yours, and push that
[19:36:28] heh, nice.
[19:38:23] everyday I'm rebasing...
[19:39:36] (03CR) 10Krinkle: [C: 032] Enable 'mediawiki-extensions-{hhvm,zend}' in all extensions it covers [integration/config] - 10https://gerrit.wikimedia.org/r/195335 (https://phabricator.wikimedia.org/T91968) (owner: 10Krinkle)
[19:40:42] (03Merged) 10jenkins-bot: Enable 'mediawiki-extensions-{hhvm,zend}' in all extensions it covers [integration/config] - 10https://gerrit.wikimedia.org/r/195335 (https://phabricator.wikimedia.org/T91968) (owner: 10Krinkle)
[19:42:00] !log Reloading Zuul to deploy I48cb4db87
[19:42:02] Logged the message, Master
[19:42:56] (03PS1) 10Awight: Job passes now, let's keep it that way [integration/config] - 10https://gerrit.wikimedia.org/r/195341
[19:44:01] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:44:56] (03PS1) 10Awight: WIP: Set to voting once the job passes [integration/config] - 10https://gerrit.wikimedia.org/r/195343
[19:50:35] I don't understand why this job stopped running on Mar 3... https://integration.wikimedia.org/ci/job/wikimedia-fundraising-crm-jslint/
[20:20:12] awight: we started consolidating jobs into simple generic ones like "phplint" instead of "{name}-phplint", I don't know if Krinkle did that to jslint as well
[20:21:06] legoktm: The strangest thing... That job stopped on Mar 3, which doesn't correspond to any integration-config commit worth mentioning. Then about half an hour after I complained here, it started again :D
[20:21:11] (03PS1) 10Krinkle: Remove config of legacy jobs no longer used [integration/jenkins] - 10https://gerrit.wikimedia.org/r/195357
[20:21:15] oh, fantastic
[20:21:18] * awight eyes bots warily
[20:21:30] awight: well, unless someone touched a *.js or *.json file, the job wouldn't have run
[20:21:31] legoktm: No, not yet. And we can't for the moment due to exceptions.
[20:21:41] We can't consolidate jslint jobs yet
[20:22:06] ah, right
[20:22:16] It changed a .json file (composer.json) so it runs jslint
[20:22:22] which also checks json for legacy reasons
[20:22:39] (03CR) 10Krinkle: [C: 032] Remove config of legacy jobs no longer used [integration/jenkins] - 10https://gerrit.wikimedia.org/r/195357 (owner: 10Krinkle)
[20:23:41] legoktm: oh :) /me sheepishly wanders off. Thanks for pointing that out.
[20:36:38] deadlocked again I think
[20:37:03] * legoktm fixes
[20:37:34] !log doing the gearman shuffle dance thing
[20:37:37] Logged the message, Master
[20:38:22] * awight picks C++ threads out of the teeth...
[20:39:08] well I did that but the queue doesn't appear to be moving...
[20:39:51] running again
[20:45:42] (03Merged) 10jenkins-bot: Remove config of legacy jobs no longer used [integration/jenkins] - 10https://gerrit.wikimedia.org/r/195357 (owner: 10Krinkle)
[20:47:19] (03PS1) 10Awight: Enable PHP_CodeSniffer job for DonationInterface [integration/config] - 10https://gerrit.wikimedia.org/r/195371
[20:47:21] (03PS1) 10Awight: Fix the alphabet [integration/config] - 10https://gerrit.wikimedia.org/r/195372
[20:48:01] marxarelli: Feel like giving https://github.com/senchalabs/jsduck/issues/525 a crack?
[20:48:12] * ^demon|away finds something stabby and trusty shaped
[20:48:38] We're working around it for the moment, though concurrency would speed things up a fair bit. Curious if it's an easy fix or not. I'm not too familiar in the parallel ruby stack. Maybe you are?
[20:48:57] ^demon|away: Ubuntu :P ?
[20:48:57] ^demon|away: I'm taking a stab at staging-tin, btw. Is gettin ugly (https://gerrit.wikimedia.org/r/#/c/195340/), mostly from… trebuchet.
[20:49:06] I'll live to fight another day!
[20:49:19] (03CR) 10Legoktm: [C: 04-1] "extension-phpcs is deprecated. See on how to set it up using" [integration/config] - 10https://gerrit.wikimedia.org/r/195371 (owner: 10Awight)
[20:50:45] <^demon|away> Krinkle: yep :p
[20:51:02] <^demon|away> YuviPanda|zzz: I've got a package missing that's killing -mc* :(
[20:51:12] * ^demon|away shall file a task
[20:51:15] <^demon|away> *phile
[20:51:37] ^demon|away: oh? What package?
[20:51:37] (03Abandoned) 10Awight: Enable PHP_CodeSniffer job for DonationInterface [integration/config] - 10https://gerrit.wikimedia.org/r/195371 (owner: 10Awight)
[20:51:43] <^demon|away> memkeys
[20:51:48] Aaah
[20:51:49] Right
[20:51:54] Yeah file task
[20:52:05] you are using precise hosts right?
[20:52:12] Prod memcache is still precise
[20:52:46] Anyway off for realz now
[20:53:52] (03PS2) 10Awight: Fix the alphabet [integration/config] - 10https://gerrit.wikimedia.org/r/195372
[20:54:48] lol
[20:56:12] <^demon|away> lol no trusty
[20:57:33] jessie
[21:06:50] ^demon|away: nah. Let's mirror prod. Use precise!
[21:08:32] Krinkle: sure, i can take a look
[21:21:48] 10Continuous-Integration, 6operations, 3Continuous-Integration-Isolation, 5Patch-For-Review, 7Upstream: Create a Debian package for Zuul - https://phabricator.wikimedia.org/T48552#1101593 (10hashar)
[21:22:22] 10Continuous-Integration, 6operations, 3Continuous-Integration-Isolation, 5Patch-For-Review, 7Upstream: Create a Debian package for Zuul - https://phabricator.wikimedia.org/T48552#489927 (10hashar)
[21:27:09] hashar: Hey. How's it going? Anything I should know / or need?
[21:27:17] hashar: Hey. How's it going? Anything I should know / or need something from me?
[21:34:52] Krinkle: I am still busy packaging zuul :(
[21:36:11] (03PS3) 10Hashar: zuul: properly sort DonationInterface [integration/config] - 10https://gerrit.wikimedia.org/r/195372 (owner: 10Awight)
[21:37:49] (03CR) 10Hashar: [C: 032] zuul: properly sort DonationInterface [integration/config] - 10https://gerrit.wikimedia.org/r/195372 (owner: 10Awight)
[21:39:06] (03Merged) 10jenkins-bot: zuul: properly sort DonationInterface [integration/config] - 10https://gerrit.wikimedia.org/r/195372 (owner: 10Awight)
[22:14:02] (03PS4) 10Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[22:19:46] Yippee, build fixed!
[22:19:46] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #18: FIXED in 4 min 23 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/18/
[22:34:18] 10Continuous-Integration, 6operations, 3Continuous-Integration-Isolation, 5Patch-For-Review, 7Upstream: Create a Debian package for Zuul - https://phabricator.wikimedia.org/T48552#1102046 (10hashar) From a mail I sent to the private ops list: Hello, To package Zuul [T48552], I gave dh_virtualenv a try....
[22:34:49] (03PS1) 10Krinkle: Make apps-ios-wikipedia-jslint voting [integration/config] - 10https://gerrit.wikimedia.org/r/195468 (https://phabricator.wikimedia.org/T71838)
[22:38:40] Project beta-scap-eqiad build #44636: FAILURE in 16 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44636/
[22:47:00] (03CR) 10Krinkle: [C: 032] Make apps-ios-wikipedia-jslint voting [integration/config] - 10https://gerrit.wikimedia.org/r/195468 (https://phabricator.wikimedia.org/T71838) (owner: 10Krinkle)
[22:48:11] (03Merged) 10jenkins-bot: Make apps-ios-wikipedia-jslint voting [integration/config] - 10https://gerrit.wikimedia.org/r/195468 (https://phabricator.wikimedia.org/T71838) (owner: 10Krinkle)
[22:49:49] !log Reloading Zuul to deploy I229d24c57d90ef
[22:49:54] Logged the message, Master
[23:05:29] Yippee, build fixed!
[23:05:30] Project beta-scap-eqiad build #44639: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44639/
[23:19:40] 10Continuous-Integration: Generic phplint job is extremely slow for mediawiki/core - https://phabricator.wikimedia.org/T92042#1102177 (10Legoktm) 3NEW
[23:19:51] 10Continuous-Integration: Generic phplint job is extremely slow for mediawiki/core - https://phabricator.wikimedia.org/T92042#1102184 (10Legoktm) p:5Triage>3High
[23:23:18] legoktm: Question about this "composer test" thing--what's the recommended way to make one of those jobs "non-voting"? For example, getting phpcs output but not failing the test because of errors? I guess I could just end the shell command with "phpcs .... || echo 'warning: phpcs errors'"?
[23:24:17] hmm... not sure that's been thought of yet
[23:24:31] hehe
[23:24:34] The `||` trick should work
[23:24:44] ok, will do.
[23:25:13] And any tricks for commenting .json? I could have a shell command that starts with "#" :D
[23:31:12] awight: I would just create a custom command like "phpcs" that people can run and then once it's passing, move it under "test"
[23:32:15] legoktm: that works, but I was greedily imagining this would give feedback under zuul/jenkins
[23:34:21] I guess the || will work then
[23:37:09] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<37.50%)
[23:45:59] Krinkle: https://phabricator.wikimedia.org/T92042 should we revert mw-core to having its own job with a workspace? Or not having lint block tests?
[23:47:36] legoktm: Hm.. I was afraid the lint would be slow for linting all files, but it's still only doing changed-in-head. And the lint itself only took a few seconds
[23:47:46] It's doing a shallow clone
[23:47:50] what's taking it so long?
[23:48:02] I think it's just the cloning part that's slow
[23:48:15] I'm rebuilding it
[23:48:23] legoktm: It shouldn't block though
[23:48:27] I've removed that from most jobs
[23:48:45] making it block is optimising for the uncommon case
[23:48:53] https://github.com/wikimedia/integration-config/blob/master/zuul/layout.yaml#L2041 no it's still set to block
[23:49:06] well that link is to test, but same with gate-and-submit
[23:50:29] Yeah, that's wrong
[23:50:32] Let's do that first
[23:50:47] I'll do a quick check into why it's cloning slow. If no result, we can give mwcore its own one for now
[23:51:00] Though the shallow clone was supposed to mitigate that concern
[23:52:07] (03PS1) 10Legoktm: Don't block mediawiki/core phpunit jobs on phplint [integration/config] - 10https://gerrit.wikimedia.org/r/195486
[23:54:03] (03CR) 10Legoktm: [C: 032] Don't block mediawiki/core phpunit jobs on phplint [integration/config] - 10https://gerrit.wikimedia.org/r/195486 (owner: 10Legoktm)
[23:55:15] (03Merged) 10jenkins-bot: Don't block mediawiki/core phpunit jobs on phplint [integration/config] - 10https://gerrit.wikimedia.org/r/195486 (owner: 10Legoktm)
[23:56:59] !log deployed https://gerrit.wikimedia.org/r/195486
[23:57:01] Logged the message, Master
[23:58:17] legoktm: OK. There's definitely a bug
[23:58:19] https://integration.wikimedia.org/ci/job/phplint/421/console
[23:58:24] I monitored it over ssh
[23:58:31] It does a shallow clone with only 3 commits
[23:58:41] And that's before it says "23:49:29 Fetching upstream changes from origin "
[23:58:49] Then it's stuck on that for a while
[23:59:01] and afterwards there's full history damn it
[23:59:06] :|
[23:59:20] so our shallow clones aren't actually shallow?
[23:59:22] ><
[23:59:25] Yeah!
[23:59:34] well, that would explain why it's so slow :P
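On the question the log ends with: a quick way to see whether a workspace really stayed shallow, and to keep it that way on later fetches, is sketched below. The repository URL and refs are examples only, not taken from the job configuration being debugged.

    # Hedged sketch: check shallowness of a clone and keep later fetches shallow.
    git clone --depth=1 https://gerrit.wikimedia.org/r/p/mediawiki/core.git ws && cd ws
    test -f .git/shallow && echo "still shallow"   # .git/shallow lists the cut-off commits
    git rev-list --count HEAD                      # a small count means history really is truncated
    # passing --depth on later fetches as well is one way to make sure it stays truncated:
    git fetch --depth=1 origin master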