[00:01:53] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2984083 (10bd808) >>! In T152801#2984055, @Paladox wrote: > Oh ok, i guess then the only thing here is to build from source as suggested here https://bu...
[00:04:03] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2984088 (10Paladox) @bd808 hi, we upgraded to gerrit 2.13 just near that date. See https://gerrit.wikimedia.org/r/#/c/323545/ and https://gerrit.wikimed...
[00:15:44] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2984115 (10Paladox) >>! In T152801#2984083, @bd808 wrote: >>>! In T152801#2984055, @Paladox wrote: >> Oh ok, i guess then the only thing here is to buil...
[00:16:21] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2984116 (10Paladox) Also do you curl the commit-msg too?
[00:44:25] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2984138 (10Tgr) I think the realistic alternatives at this point are shallow cloning and using GitHub. Shallow cloning breaks git log and blame (and ma...
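Tgr's point above about shallow clones can be demonstrated end to end with throwaway local repos (every path and repo name below is an illustration, not the actual mediawiki/core clone from the task):

```shell
# Demonstrates why a shallow clone "breaks git log and blame": only one
# commit of history is visible until the clone is unshallowed.
# Throwaway repos in a temp dir; nothing here touches gerrit.wikimedia.org.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
git -C src -c user.name=t -c user.email=t@t commit -q --allow-empty -m first
git -C src -c user.name=t -c user.email=t@t commit -q --allow-empty -m second
git clone -q --depth 1 "file://$tmp/src" shallow   # file:// forces a real shallow fetch
cd shallow
git rev-list --count HEAD    # prints 1: git log/blame only see the tip
git fetch -q --unshallow     # pulls down the rest of the history
git rev-list --count HEAD    # prints 2: full history restored
```

In a real checkout, `git fetch --unshallow` against the server would be the escape hatch if the full history is needed later.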
[01:16:03] Phab admins, we've got a spammer, see #-dev
[01:16:47] MaxSem: Looks like about 50 tickets need to get purged
[01:16:58] at least these: https://phabricator.wikimedia.org/p/GuerellaNuke23/
[01:17:15] and these: https://phabricator.wikimedia.org/p/SimonWalkerAlt/
[01:17:48] greg-g: ryasmeen ^^
[01:20:14] MaxSem: I'm already on it
[01:20:32] ostriches: Thanks for blocking account creation (didn't realize you were user:demon on phab)
[01:20:43] That *should* handle it for now
[01:20:46] I hope
[01:21:17] ostriches: kinda vague question to ask, but any other wikis that you think might have this kind of loop hole?
[01:21:23] ostriches: seems like wikitech doesn't have torblock on
[01:22:14] Wikitechwiki isn't used for Phab login
[01:22:20] It's mw.org (so SUL'd wikis)
[01:24:03] ostriches: PMed you instead of chan chat
[01:53:51] !log https://integration.wikimedia.org/zuul/ showing huge backlogs but https://integration.wikimedia.org/ci/ looks mostly idle
[01:53:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[01:54:49] legoktm: about? I could use some help figuring out what has zuul all gummed up
[01:55:00] Ish
[01:55:07] jobs suck for 3+ hours :/
[01:55:10] Hmm
[01:55:27] There were issues with nodepool not deleting slaves I think earluer
[01:55:30] Earlier*
[01:55:42] yeah... looks like nodepool stupidity
[01:55:52] *-jessie queued
[01:56:14] and no jessie exec nodes pooled
[01:56:19] https://integration.wikimedia.org/ci/computer/
[01:56:30] No nodepool instances
[01:56:49] I fucking hate nodepool
[01:59:22] !log nodepool is full of instance stuck in "delete"
[01:59:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[02:00:03] https://phabricator.wikimedia.org/T156636
[02:00:40] blerg
[02:01:57] chasemp: about? We've got a pile of fscked noodppol instances stuck in delete.
Possibly related to T156636
[02:01:57] T156636: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636
[02:03:41] and of course we have made this unstable pile of nodepool critical to pretty much every zuul job queue at this point instead of ripping it out entirely
[02:05:08] It's ok, we're going to just add more nodes ;-)
[02:05:34] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636#2981991 (10bd808) The whole pool is full of instances stuck in delete now. Here's a bit more info: ``` nodepool@labnodepool1001:~$ no...
[02:05:48] this doesn't look good "novaclient.exceptions.Unauthorized: Unauthorized (HTTP 401) (Request-ID: req-12dc7525-9b7a-4045-be36-4f4ac2dd5587)"
[02:06:38] bd808: I get teh same error actually using novaadmin
[02:06:42] so yeah, we should call andrew
[02:07:20] So we think it's an openstack issue?
[02:07:45] I'm not sure, but that slants towards yes
[02:09:53] possibly related to the problems with scaling the uwsgi container that he's been fighting?
[02:10:09] chasemp: you wanna dial or should I?
[02:10:34] talking now
[02:11:20] should I start moving CI jobs off of nodepool or wait for a bit?
[02:11:48] we should move all jobs off of nodepool forever (my $0.02USD)
[02:11:52] andrew will be online in 15 (he's out to dinner) I'm gogin to try a few things first
[02:18:35] +10000 to bd808
[02:45:10] does anyone know ( ostriches?) the last time nodepool was doing things successfully?
[02:46:26] My best guess is ~4h ago when those jobs in zuul got stuck
[02:46:36] But I haven't been doing any gerrit/jenkins/ci work today
[02:46:39] So that's just a guess
[02:46:49] kk
[02:47:30] is there any chance it was much longer, like several days?
[02:49:56] ostriches: ^
[02:50:56] I think it was fine as of thursday.
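For reference, a hedged sketch of counting the instances stuck in "delete" from `nodepool list`-style output. The sample table below is fabricated, and the exact column layout of the real command's output is an assumption based on the truncated snippet quoted in the task comment:

```shell
# Count instances stuck in "delete" from `nodepool list`-style output.
# The table below is fabricated sample data; the real command's column
# layout is an assumption, so adjust the field index to match.
cat > /tmp/np_sample.txt <<'EOF'
| 498353 | ci-jessie-wikimedia-498353 | delete | 4.20 |
| 508783 | ci-jessie-wikimedia-508783 | delete | 6.91 |
| 510456 | ci-jessie-wikimedia-510456 | ready  | 0.07 |
EOF
# Field 4 is the State column in this sample layout:
awk -F'|' '{gsub(/ /, "", $4)} $4 == "delete" {n++} END {print n+0}' /tmp/np_sample.txt
# prints 2
```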
[02:51:02] I was out friday and out of town all weekend
[02:51:33] that's… a big window :(
[02:52:27] I mean I'm pretty sure it was fine earlier today or people would've complained.
[02:52:29] andrewbogott: fwiw event I think you mean was Thu Jan 26 19:56:33 2017 +0000
[02:52:37] Granted, it took (at least) 3.5h for us to notice
[02:52:51] yeah seems like it can't be parked for any real length, but that doesn't mean some cache didn't expire or something
[03:58:57] ostriches (or anyone else following along), I'd expect the nova/keystone apis to work normally now, probably nodepool will sort itself out shortly
[03:59:13] andrewbogott: I've been watching nodepool churn nodes for awhile now
[03:59:16] seems to be building etc
[03:59:22] Q. What was wrong? A: I'm not sure but restarting the nova-api endpoint seems to have fixed it
[04:04:11] CI is moving again :)
[04:04:16] thanks andrewbogott and chasemp
[04:04:36] Yeah, +1 on the thank you
[04:08:54] This was either a cache-invalidation bug or an 'after this service works for X number of days it doesn't work no more' bug
[04:09:00] we reverted the change that may've introduced the former
[04:09:05] as for the latter… time will tell :(
[05:07:10] So guys, this is not the first time nodepool has just been the canary in a coal mine with OpenStack infra issues yet it still gets the stink eye when it acts as a better monitor of Labs health than the monitoring system (like most users do!)
[05:07:21] of course, at this hour I'm talk to myself
[06:39:54] Project selenium-Wikibase » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #255: 04FAILURE in 1 hr 58 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/255/
[06:50:05] 10MediaWiki-Releasing, 10MediaWiki-Vendor: 1.27 tarball: Unnecessary library "ruflin/elastica 2.3.1" requirement - https://phabricator.wikimedia.org/T156637#2984818 (10Osnard)
[06:50:38] 10MediaWiki-Releasing, 10MediaWiki-Vendor: 1.27 tarball: Unnecessary library "ruflin/elastica 2.3.1" requirement - https://phabricator.wikimedia.org/T156637#2982017 (10Osnard) @Aklapper 1.27.1; I've added the download link in the task description
[07:12:46] (follow-ed up in -labs-admin)
[07:16:58] 10MediaWiki-Releasing, 10MediaWiki-Vendor: 1.27 tarball: Unnecessary library "ruflin/elastica 2.3.1" requirement - https://phabricator.wikimedia.org/T156637#2984861 (10Osnard) @Legoktm Thanks for the explanation. I understand now why there is the mediawiki/vendor repo [1]. The problem with this approach is tha...
[07:43:26] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is OK: OK: Less than 100.00% above the threshold [0.0]
[08:29:46] 10MediaWiki-Releasing, 10MediaWiki-Vendor: 1.27 tarball: Unnecessary library "ruflin/elastica 2.3.1" requirement - https://phabricator.wikimedia.org/T156637#2982017 (10hashar) I cant find a reference right now, but I thought mediawiki/vendor had REL branches populated via the composer merge plugin. Maybe we ca...
[08:33:33] 10Continuous-Integration-Config, 10BlueSpice, 13Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2984981 (10Osnard) Okay, it's merged.
Go ahead :smile:
[08:40:17] 10Continuous-Integration-Config, 10BlueSpice, 13Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2985012 (10Paladox) Ok thanks :)
[08:54:39] 10Continuous-Integration-Config, 10BlueSpice, 13Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2985022 (10Paladox) This is the file https://github.com/wikimedia/mediawiki-extensions-BlueSpiceExtensions/blob/28fb12d1d04557d78f1ce26c5d706a27cfec2ad6/Checklis...
[08:58:23] 10Continuous-Integration-Config, 10BlueSpice, 13Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2985025 (10Paladox) Actually it's this file https://github.com/wikimedia/mediawiki-extensions-BlueSpiceExtensions/blob/master/Checklist/tests/phpunit/BSApiCheckl...
[09:50:40] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636#2985545 (10hashar)
[09:51:50] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636#2981991 (10hashar)
[10:04:07] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636#2985716 (10hashar) Not sure what happened with 508783 but eventually it has been deleted: Logs show that some other instances/proj...
[10:16:22] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 06Operations, 07Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2985781 (10Gilles)
[10:17:57] 10Continuous-Integration-Config, 10BlueSpice, 13Patch-For-Review: Enable unit tests on BlueSpice* repos - https://phabricator.wikimedia.org/T130811#2985792 (10Osnard) Strange.
Looks like some magic word does not work. It is initiated by https://github.com/wikimedia/mediawiki-extensions-BlueSpiceExtensions/bl...
[10:38:16] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2985831 (10hashar) >>! In T152801#2983841, @bd808 wrote: > ... > It's still pretty suspicious that this all showed up around the same time as {T151676}....
[12:27:42] 06Release-Engineering-Team, 06Developer-Relations (Oct-Dec-2016), 07Documentation: Merge Wikimedia's "Deployment checklist for new extensions" doc pages - https://phabricator.wikimedia.org/T142081#2522000 (10Nemo_bis) For reference: https://www.mediawiki.org/w/index.php?title=Review_queue&type=revision&diff=...
[12:43:31] hashar: car to look at something git and deploy related quickly for me? :D
[12:45:06] addshore: having lunch sorry :(( paste question here
[12:45:14] will follow up once I am done
[12:45:48] okay, well hashar it relates to my notes in the reverting section @ https://wikitech.wikimedia.org/wiki/User:Addshore/Deployments , im sure there should be a git rebase in there somewhere, but I can't tell where!
[12:47:39] (03PS1) 10Aleksey Bekh-Ivanov (WMDE): Fix absence of dev dependencies for Wikibase in jenkins job [integration/config] - 10https://gerrit.wikimedia.org/r/335215
[13:10:38] PROBLEM - Puppet run on deployment-elastic08 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[13:12:50] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2986079 (10Paladox)
[13:15:29] 10Gerrit, 06Release-Engineering-Team: Update gerrit to 2.14 - https://phabricator.wikimedia.org/T156120#2986085 (10Paladox)
[13:15:35] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2986084 (10Paladox)
[13:17:05] 10Gerrit, 10BlueSpice, 13Patch-For-Review, 07Upstream: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2986087 (10Paladox)
[13:17:15] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2824332 (10Paladox)
[13:18:46] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2986089 (10Paladox) This should hopefully be fixed in gerrit 2.14. Though I doint know if there will be permenant damage. We...
[13:25:38] RECOVERY - Puppet run on deployment-elastic08 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:30:10] addshore: k back around
[13:30:29] addshore: the submodules have a setting to autorebase
[13:31:25] $ git config --list|grep Wikidata.update
[13:31:26] submodule.extensions/Wikidata.update=rebase
[13:31:59] your git log commands can be made: git log HEAD..HEAD@{u}
[13:32:16] {u} or {upstream} refers to the tracked branch
[13:32:48] portals has an extra shell script for deployment
[13:33:12] gotcha: if updating wikiversion.json, on mwdebug1001 one need to compile the wikivesion.PHP
[13:33:25] should be: mwdebug1001$ scap compile-wikiversions
[13:33:56] but yeah looks more or less fine :]
[13:42:13] (03PS1) 10Aude: Bump Wikidata to wmf/1.29.0-wmf.10 [tools/release] - 10https://gerrit.wikimedia.org/r/335225
[13:43:06] aude: that wikidata bump should happen today right?
[13:47:30] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #291: 04FAILURE in 2 min 30 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/291/
[13:48:43] hashar: yes
[13:49:09] PROBLEM - Puppet run on buildlog is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[13:49:20] (03CR) 10Hashar: [C: 032] Bump Wikidata to wmf/1.29.0-wmf.10 [tools/release] - 10https://gerrit.wikimedia.org/r/335225 (owner: 10Aude)
[13:49:42] thanks
[13:50:17] (03Merged) 10jenkins-bot: Bump Wikidata to wmf/1.29.0-wmf.10 [tools/release] - 10https://gerrit.wikimedia.org/r/335225 (owner: 10Aude)
[13:50:33] aude: addshore: and eventually I think we should update wikidatawiki during our wednesday afternoon
[13:50:51] instead of bumping it with group1 at 20:00/21:00
[13:51:59] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems -
https://phabricator.wikimedia.org/T151676#2986228 (10hashar)
[13:52:03] 10Gerrit, 10BlueSpice, 13Patch-For-Review, 07Upstream: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#2986227 (10hashar)
[13:53:50] hashar: maybe
[13:54:07] aude: it is really all up to you :]
[13:54:16] though if there is ever a problem we find on testwikidata on tuesday
[13:54:21] then we need time to fix it
[13:54:23] but if there is an interest in bumping wikidatawiki during european day, I am all for it
[13:54:31] * aude is not in europe, btw
[13:54:38] iohhhh
[13:54:50] would have to be before european swat
[13:55:23] probably not good for hoo though
[13:55:41] well it just a random idea really :]
[13:55:48] yeah
[13:57:14] messed up my irc...
[15:09:59] 10Gerrit, 06Operations, 07Beta-Cluster-reproducible, 13Patch-For-Review, 07Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2986472 (10Paladox) @hashar hi this task T153079 has nothing to do with the task here as the problem there was that the branc...
[15:30:25] 10Deployment-Systems, 03Scap3, 13Patch-For-Review, 07Wikimedia-Incident: Include fatal log rate check in scap canary test - https://phabricator.wikimedia.org/T154646#2986573 (10thcipriani) 05Open>03Resolved a:03thcipriani
[15:31:57] 10Deployment-Systems, 10Wikimedia-Logstash, 13Patch-For-Review, 07Wikimedia-Incident: Check same set of errors/warnings/fatals in scap logstash_checker.py as there is in `fatalmonitor` on fluorine - https://phabricator.wikimedia.org/T142784#2986592 (10thcipriani) 05Open>03Resolved New scap release is n...
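hashar's `git log HEAD..HEAD@{u}` range from the conversation above can be reproduced with throwaway local repos (all repo names below are illustrative, not the real deployment checkout):

```shell
# Reproduces `git log HEAD..HEAD@{u}`: commits the tracked upstream branch
# has that the local checkout does not yet have.
# Throwaway repos in a temp dir; names are illustrative.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q upstream
git -C upstream -c user.name=t -c user.email=t@t commit -q --allow-empty -m one
git clone -q upstream local
git -C upstream -c user.name=t -c user.email=t@t commit -q --allow-empty -m two
git -C local fetch -q
git -C local log --oneline HEAD..HEAD@{u}   # shows the pending "two" commit
# The submodule auto-rebase setting quoted above would be set with e.g.:
#   git config submodule.extensions/Wikidata.update rebase
```

Reversing the range (`HEAD@{u}..HEAD`) shows the opposite: local commits not yet pushed to the tracked branch.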
[16:15:02] 10Continuous-Integration-Infrastructure, 06Labs, 10Labs-Infrastructure: Labs instance ci-jessie-wikimedia-498353 can not be deleted - https://phabricator.wikimedia.org/T156636#2986837 (10bd808) The nodepool issues on 2017-01-30 and 31 were very likely caused by a nova-api failure which itself may or may not...
[16:48:42] 05Gerrit-Migration, 10releng-201617-q2, 07Documentation: Document workflow and creation of CI jobs in Differential - https://phabricator.wikimedia.org/T130952#2986995 (10Aklapper) #releng-201617-q2 is over. Should this be #releng-201617-q3 or not?
[17:10:06] 03Scap3, 06Services (later), 15User-mobrovac: Delay repooling trending service after a restart - https://phabricator.wikimedia.org/T156687#2987059 (10thcipriani) >>! In T156687#2984052, @mobrovac wrote: > We need to establish if that would be possible with Scap3. I figure we could do a `sleep 30` //check// s...
[18:44:02] 10Browser-Tests-Infrastructure, 07Ruby, 15User-zeljkofilipin: Release mediawiki_api 0.7.1 - https://phabricator.wikimedia.org/T156837#2987336 (10zeljkofilipin)
[18:45:04] 10Browser-Tests-Infrastructure, 07Ruby, 15User-zeljkofilipin: Release mediawiki_api 0.7.1 - https://phabricator.wikimedia.org/T156837#2987336 (10zeljkofilipin) p:05Triage>03Normal
[18:49:38] 03Scap3, 10Parsoid: Saying yes (y) continues to all groups - https://phabricator.wikimedia.org/T156839#2987377 (10Arlolra)
[18:53:40] (03PS1) 10Zfilipin: Release patch version 0.7.1 [ruby/api] - 10https://gerrit.wikimedia.org/r/335264
[18:54:57] (03CR) 10Zfilipin: [C: 032] Release patch version 0.7.1 [ruby/api] - 10https://gerrit.wikimedia.org/r/335264 (owner: 10Zfilipin)
[18:55:20] (03Merged) 10jenkins-bot: Release patch version 0.7.1 [ruby/api] - 10https://gerrit.wikimedia.org/r/335264 (owner: 10Zfilipin)
[18:55:43] (03CR) 10jenkins-bot: Release patch version 0.7.1 [ruby/api] - 10https://gerrit.wikimedia.org/r/335264 (owner: 10Zfilipin)
[18:55:49] 10Browser-Tests-Infrastructure, 07Ruby, 15User-zeljkofilipin:
Release mediawiki_api 0.7.1 - https://phabricator.wikimedia.org/T156837#2987412 (10zeljkofilipin) Forgot to tag the task in the commit message: https://gerrit.wikimedia.org/r/#/c/335264/
[19:00:30] 10Browser-Tests-Infrastructure, 07Ruby, 15User-zeljkofilipin: Release mediawiki_api 0.7.1 - https://phabricator.wikimedia.org/T156837#2987433 (10zeljkofilipin) 05Open>03Resolved Done! https://rubygems.org/gems/mediawiki_api
[19:07:39] 05Gerrit-Migration, 10releng-201617-q2, 07Documentation: Document workflow and creation of CI jobs in Differential - https://phabricator.wikimedia.org/T130952#2987481 (10greg) It's not a quarterly goal level thing, so no ;)
[19:08:30] (03CR) 10Abartov: "Thank you! :)" [ruby/api] - 10https://gerrit.wikimedia.org/r/335264 (owner: 10Zfilipin)
[19:49:58] (03PS1) 10Ejegg: Change SmashPig tests from PHP 5.3 to 5.5 [integration/config] - 10https://gerrit.wikimedia.org/r/335271
[20:56:27] thcipriani: about?
[20:56:43] chasemp: yep, what's up?
[20:57:28] looking through the mess from last night w/ the nova-api freeze up and I was considering an alert that was some rough sanity marker for nodepool, it stores internals but nothing detailed other than age of instances and state
[20:57:39] I was thinking alert if newest nodepool instance is x old
[20:57:55] and that led me to wondering if there is any dummy jobs in the ci pipeline we could rely on
[20:58:11] to flush out failures in a predictive way and if not could there be?
[20:58:17] not sure how hard, maybe you have an idea
[20:58:25] it's also possible I've gone full ramble
[20:58:34] :)
[20:58:51] you're looking for a job that runs on nodepool frequently? Or?
well, yes is there a job that runs predictably as in every n minutes to watch for
[20:59:27] and if not and it's not a good idea
[20:59:39] thoughts on alerting if nodepool doesn't have new instances in y time
[21:00:18] hrm, I'm not sure if we have nodepool jobs that run in intervals rather than being triggered by patch sets
[21:00:40] it seems like a predictable and seeded dummy job every 10m would be a good idea
[21:00:43] I don't think it's not a good idea, I just don't think we have any right now
[21:00:50] right
[21:01:17] * thcipriani digs in integration/config
[21:01:49] I do have the beginnings of a full stack test (create/test/delete) a vm but that's sort of in the neighborhood of watching a to know about b since there are a lot of variables between any test cae and nodepool
[21:01:55] I think both paths are appropriate
[21:03:23] thcipriani: one issue we have every time and I think this will translate to any medium is since all jobs are adhoc it's difficult to know when a problem as begun
[21:03:51] and right behind that is there is no deterministic test case to lean on when you are wondering if things are ok now
[21:03:57] yup, that makes sense
[21:04:46] the only problem I could think is that if we have some dummy job we rely on that gets burried under patch sets it may not be super reliable
[21:05:08] sure that's ok though to find out I think
[21:05:22] and would possibly be the first real holistic view we have :)
[21:05:27] yeah, may be a non-issue provided we queue it correctly
[21:05:50] is there a task for this?
[21:06:08] we have tasks for nova but probably not this angle as I was just working through the idea
[21:06:15] also, I'm not sure
[21:06:30] there are nodepool tasks scattered about
[21:06:52] yeah nodepool has a mean phab presence
[21:07:11] I can make one :) doyou mind if I toss it your way even for just reasoning and some ci details on feasibility?
[21:07:58] no problem. I'll add some thoughts to the task.
Seems like it'd be trivial to make the actual job.
[21:08:28] I have to believe so
[21:08:57] * thcipriani says before really considering the weight of jjb on his soul
[21:09:19] thcipriani: I have way too many stalk words ;p
[21:09:22] jb was one of them
[21:09:24] * JustBerry removes jb
[21:09:32] well some insane person has nodepool reporting age of instances in decimal?
[21:09:33] 0.07
[21:09:48] JustBerry: heh, sorry :)
[21:09:54] nah not your fault
[21:10:07] 0.07 time
[21:10:13] time units
[21:10:40] Age (hours)
[21:10:46] it's hour in decimal?
[21:10:53] that hurts me deeply
[21:12:05] good thing this convo only took .09 of an hour
[21:12:16] could have been worse. Age (12 minutes)
[21:40:35] 10Deployment-Systems, 03Scap3, 10scap: scap wikiversions compile happening too late in scap sync - https://phabricator.wikimedia.org/T156851#2987851 (10thcipriani)
[21:41:16] 10Deployment-Systems, 03Scap3, 10scap: scap wikiversions compile happening too late in scap sync - https://phabricator.wikimedia.org/T156851#2987865 (10thcipriani) p:05Triage>03High
[21:48:37] 21:43:55 Building remotely on integration-slave-precise-1012 (phpflavor-php53 contintLabsSlave phpflavor-zend UbuntuPrecise) in workspace /srv/jenkins-workspace/workspace/mwext-testextension-php53
[21:48:43] 21:44:21 ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
[21:59:25] hrm, mysql was stopped there
[21:59:33] I poked it, it's going now
[22:01:24] thanks
[22:06:36] thcipriani: dead on 1011 too
[22:06:52] hrm, looks like a job for salt
[22:06:57] :)
[22:07:17] even better thcipriani age is not age of the instance but age in that state only
[22:07:24] in decimal hours
[22:07:51] wat
[22:08:24] lol
[22:08:52] it resets between build and ready at least and ready adn delete
[22:09:01] which is honestly not terrible as a counter but who would hve guessed?
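The decimal-hours Age field being mocked above is still enough for the staleness alert chasemp describes (alert when even the newest instance is too old, i.e. nothing fresh has been built lately). A rough sketch, with fabricated sample rows, an invented threshold, and an assumed column layout:

```shell
# Alert if even the *newest* nodepool instance exceeds an age threshold.
# Sample data and threshold are invented; the Age column is decimal hours
# as in the conversation above.
threshold=0.5   # hours (0.5 h = 30 min); illustrative, not a tuned value
cat > /tmp/np_age.txt <<'EOF'
| ci-jessie-wikimedia-510456 | ready | 0.07 |
| ci-jessie-wikimedia-510123 | ready | 1.32 |
EOF
awk -F'|' -v t="$threshold" '
  {gsub(/ /, "", $4); age = $4 + 0; if (NR == 1 || age < min) min = age}
  END {
    if (min > t + 0) print "ALERT: newest instance is " min "h old"
    else print "OK"
  }' /tmp/np_age.txt    # prints OK: 0.07 h (about 4 minutes) is fresh enough
```

The same scan with `age > max` instead would feed the stuck-in-delete threshold discussed later in the evening.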
[22:12:18] !log started mysql on all integration precise instances via salt -- was stopped for some reason
[22:12:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:15:39] progress
[22:15:41] 22:13:57 Building remotely on ci-jessie-wikimedia-510456 (ci-jessie-wikimedia) in workspace /home/jenkins/workspace/npm-node-6-jessie
[22:15:49] 22:14:02 npm ERR! install Couldn't read dependencies
[22:15:50] etc
[22:15:53] https://integration.wikimedia.org/ci/job/npm-node-6-jessie/2859/console
[22:16:57] Potentially REL1_23 thing
[22:17:37] Roll on May 2017
[22:18:40] Oh, I see
[22:20:16] 10Continuous-Integration-Config, 10MediaWiki-extensions-LiquidThreads: npm-node-6-jessie fails on LiquidThreads on REL1_23 - https://phabricator.wikimedia.org/T156859#2988046 (10Reedy)
[22:20:24] 10Continuous-Integration-Config, 10MediaWiki-extensions-LiquidThreads: npm-node-6-jessie fails on LiquidThreads on REL1_23 - https://phabricator.wikimedia.org/T156859#2988058 (10Reedy) p:05Triage>03Lowest
[22:35:59] 10Continuous-Integration-Config, 10MediaWiki-extensions-LiquidThreads: npm-node-6-jessie fails on LiquidThreads on REL1_23 - https://phabricator.wikimedia.org/T156859#2988144 (10hashar) 05Open>03declined Yup the package.json with a test script has been introduced in a later branch. Zuul has support to ski...
[22:42:21] lol
[22:44:49] 10Deployment-Systems, 03Scap3, 10scap: scap wikiversions compile happening too late in scap sync - https://phabricator.wikimedia.org/T156851#2988199 (10bd808) Moving `tasks.sync_common` after `wikiversions-compile` certainly would have caused this. Moving the sync **before** calling wikiversions-compile was...
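For the recurring "mysql died on a precise slave" symptom above, the first check is whether the socket from the Jenkins error even exists. A small sketch (the salt one-liner in the comment is only an illustration of the !log entry, with an assumed minion glob):

```shell
# Check for the exact symptom in the Jenkins error above: a missing
# /var/run/mysqld/mysqld.sock on the slave.
check_mysql_sock() {
  if [ -S "$1" ]; then
    echo "mysql socket present"
  else
    echo "mysql down: $1 missing"
  fi
}
check_mysql_sock /var/run/mysqld/mysqld.sock
# Fleet-wide restart along the lines of the !log entry (glob is assumed):
#   salt 'integration-slave-precise-*' service.start mysql
```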
[23:09:04] 10Gerrit, 10MediaWiki-Vagrant, 13Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#2988333 (10Paladox) Bug report already filled upstream on gerrit https://bugs.chromium.org/p/gerrit/issues/detail?id=2295
[23:19:59] thcipriani: there's some bug related to mysql randomly dying on precise slaves
[23:20:31] I had some half-memory of that.
[23:20:42] I don't remember if there is a resolution?
[23:22:53] [14:17:37] Roll on May 2017
[23:34:52] thcipriani: https://gerrit.wikimedia.org/r/335373
[23:40:31] chasemp: neat. Logic looks reasonable.
[23:40:48] the actual thresholds are from my limited watching just this afternoon
[23:41:07] so guaranteed to be not ideal but in theory if we tweak we will find a place that it's only outliers
[23:42:32] yup. The delete one will definitely come in handy -- I think that's where issues make themselves known. I'm not sure about the used value, but, as you say, will rough-hewn closer to correct over time.
[23:42:44] well my thinking on used is
[23:42:52] stuck tests and/or tests that are invalid in duration
[23:42:59] that's probably me being bullheaded tho
[23:43:20] I think there should be a max test run time tbh