[00:20:00] Continuous-Integration-Config, Fundraising-Backlog, Patch-For-Review, WorkType-Maintenance: Tests on deployment branches of wikimedia/fundraising/crm falling causing to force merge (and deadlock of Zuul) - https://phabricator.wikimedia.org/T117062#1809607 (DStrine)
[00:23:37] Deployment-Systems, Release-Engineering-Team, Scap3: scap creating directories owned by root on mira - https://phabricator.wikimedia.org/T118691#1809619 (Krinkle) When syncing a file: ``` 00:21:21 Started sync-masters 00:21:29 ['/srv/deployment/scap/scap/bin/sync-master', 'tin.eqiad.wmnet'] on mira.co...
[01:12:03] (PS1) Tim Starling: Add .gitreview [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253519
[01:12:05] (PS1) Tim Starling: Optimisations [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253520
[01:13:18] (CR) Tim Starling: [C: 2] Add .gitreview [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253519 (owner: Tim Starling)
[01:15:39] (CR) Tim Starling: [V: 2] Add .gitreview [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253519 (owner: Tim Starling)
[01:15:53] (CR) Tim Starling: [C: 2 V: 2] Optimisations [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253520 (owner: Tim Starling)
[02:06:54] Continuous-Integration-Infrastructure, MediaWiki-Unit-tests: MediaWiki PHPUnit tests skips HtmlFormatterTest because "Tidy extension not installed" - https://phabricator.wikimedia.org/T118814#1809962 (Krinkle) NEW
[02:08:10] Continuous-Integration-Infrastructure, MediaWiki-Unit-tests: MediaWiki PHPUnit tests skips TidyTest because "Tidy not found" - https://phabricator.wikimedia.org/T118814#1809969 (Krinkle)
[02:29:24] PROBLEM - Host integration-labsvagrant is DOWN: CRITICAL - Host Unreachable (10.68.16.4)
[03:04:58] Project beta-scap-eqiad build #78855: FAILURE in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/78855/
[03:27:17] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #884: FAILURE in 45 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/884/
[06:27:21] greg-g: twentyafterfour: ostriches: Almost 100 "Production impact" issues in Wikimedia-log-errors. May be time for a sprint or some other kind of highlight.
[06:38:22] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[06:40:13] Krinkle: I know.
[06:40:25] Half of them are probably fixed/no longer issues
[06:40:31] The other half nobody owns.
[06:40:36] And the last half might get fixed :)
[09:47:40] Continuous-Integration-Scaling, operations: Upload new Zuul packages on apt.wikimedia.org for Precise / Trusty / Jessie - https://phabricator.wikimedia.org/T118340#1810354 (hashar) The packaging work is held in our Gerrit repo `integration/zuul.git` with the following branches: | `upstream` | 1cc37f7b469a...
[09:57:26] PROBLEM - Host deployment-cache-parsoid04 is DOWN: CRITICAL - Host Unreachable (10.68.19.197)
[10:02:55] Continuous-Integration-Config, Continuous-Integration-Infrastructure, MobileFrontend: MobileFrontend is failing mwext-mw-selenium test - https://phabricator.wikimedia.org/T118771#1810389 (hashar)
[10:27:43] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 0.54 ms
[11:02:01] Gerrit-Migration, Analytics-Tech-community-metrics: Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#1810527 (Aklapper)
[11:52:15] PROBLEM - Host deployment-parsoidcache02 is DOWN: CRITICAL - Host Unreachable (10.68.16.145)
[12:37:35] Continuous-Integration-Config, ArticlePlaceholder, Wikidata, Patch-For-Review, and 2 others: [Task] add CI to extension ArticlePlaceholder - https://phabricator.wikimedia.org/T113049#1810631 (Tobi_WMDE_SW)
[12:42:48] (CR) Hashar: [C: 2] Set up CI for eventlogging (python) repo [integration/config] - https://gerrit.wikimedia.org/r/253359 (https://phabricator.wikimedia.org/T118761) (owner: Ottomata)
[12:43:40] (Merged) jenkins-bot: Set up CI for eventlogging (python) repo [integration/config] - https://gerrit.wikimedia.org/r/253359 (https://phabricator.wikimedia.org/T118761) (owner: Ottomata)
[12:55:00] Yippee, build fixed!
[12:55:01] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #662: FIXED in 59 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/662/
[13:17:08] hashar: new date works for me
[13:18:04] jzerebecki: sorry for the late notification :(
[13:18:32] jzerebecki: andrew proposed that time slot to get a new zuul-merger instance deployed and there is no other good time slot this week
[13:27:28] (PS1) Hashar: Dependencies install notes for Mac/Homebrew [integration/uprightdiff] - https://gerrit.wikimedia.org/r/253598
[14:28:37] Yippee, build fixed!
[14:28:37] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #693: FIXED in 2 min 36 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/693/
[14:51:43] Continuous-Integration-Scaling, operations: Upload new Zuul packages on apt.wikimedia.org for Precise / Trusty / Jessie - https://phabricator.wikimedia.org/T118340#1810828 (Andrew) ok, good enough for me :)
[15:02:37] hashar: this’ll do
[15:03:00] :-}
[15:03:00] there is a lot of other ops stuff going on
[15:05:07] andrewbogott: so that is a two-step process
[15:05:16] first get zuul-merger installed / set up on scandium
[15:05:25] then once happy / completed
[15:05:45] add scandium to the iptables rule on gallium which prevents it from joining the pool
[15:06:34] ok, so installing zuul-merger is https://gerrit.wikimedia.org/r/#/c/252336/3 right?
[15:06:49] yes
[15:07:08] which will create the /etc related files to have the service running
[15:07:25] and should create a disk mount under /srv/ssd
[15:08:12] once happy
[15:08:19] we can enable the iptables rule ( https://gerrit.wikimedia.org/r/252337 )
[15:08:36] hm, "parent directory /srv/ssd/zuul does not exist"
[15:08:42] do you want to fix that in puppet or shall I?
[15:08:49] oh man
[15:08:55] I forgot to mount the ssd
[15:09:10] on gallium that is done via site.pp
[15:09:16] file { '/srv/ssd':
[15:09:16] mount { '/srv/ssd':
[15:09:29] * hashar copy paste
[15:10:26] hmm
[15:10:58] /dev/md2 139G 33M 139G 1% /srv
[15:11:05] andrewbogott: I don't know how that mount got realized
[15:11:07] maybe on setup
[15:11:36] most of our partman recipes put extra drives at /srv
[15:11:54] so, it’s no surprise. You can move it, or move your stuff to use /srv instead
[15:12:10] I would move it to /srv/ssd for consistency
[15:12:18] or agrrg
[15:12:25] yeah
[15:12:25] it is easier
[15:12:34] otherwise I will have to vary the mount point between gallium and scandium
[15:13:02] PROBLEM - puppet last run on scandium is CRITICAL: CRITICAL: Puppet has 1 failures
[15:13:57] andrewbogott: https://gerrit.wikimedia.org/r/253611
[15:14:00] copy-pasted from gallium
[15:14:30] wrong disk
[15:15:53] made it mount /dev/md2
[15:16:01] ready for me to merge that?
[15:20:28] yeah
[15:20:37] I think you still need to mkdir /srv/ssd/zuul someplace
[15:21:04] looks like the zuul puppet manifest is not magic
[15:21:46] file { $git_dir:
[15:21:46] ensure => directory,
[15:21:47] owner => 'zuul',
[15:21:47] }
[15:21:53] zuul::merger should create it
[15:23:29] oh I got it
[15:23:34] so
[15:23:43] the zuul::merger class is being passed `/srv/ssd/zuul/git`
[15:23:48] but only /srv/ssd exists :(
[15:23:57] and puppet doesn't mkdir -p
[15:24:09] yeah, puppet still lacks proper recursive mkdir I think
[15:25:06] The role should probably create all the parent dirs before invoking the module
[15:25:06] should we just create it manually and call it an end? :D
[15:25:37] or I can exec {} mkdir -p
[15:25:46] let me look...
[15:27:53] andrewbogott: and I had the same issue with nodepool actually
[15:28:55] https://gerrit.wikimedia.org/r/#/c/253616/1/modules/zuul/manifests/merger.pp
[15:29:39] I took it from the nodepool manifest https://github.com/wikimedia/operations-puppet/blob/production/modules/nodepool/manifests/init.pp#L135-L149
[15:31:27] hashar: won’t https://gerrit.wikimedia.org/r/#/c/253617/ do it?
[15:31:56] yup
[15:31:59] though a couple of lines below
[15:32:00] 'git_dir' => '/srv/ssd/zuul/git',
[15:32:05] that is a configurable dir
[15:32:13] so in our specific case that is going to work
[15:32:25] but if we ever change the git_dir that can cause some trouble
[15:32:26] oh, I see...
[15:32:44] yeah, yours is better :)
[15:33:07] must have been suggested to me by filipo when reviewing the nodepool manifest
[15:33:14] I deserve no credit :-}
[15:34:54] hashar: ok, puppet is happy now. Want to make sure that things are doing what you’d expect?
[15:35:08] checking
[15:35:22] RECOVERY - puppet last run on scandium is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[15:35:25] the zuul-merger did not start
[15:35:34] but maybe it lacks ensure => started
[15:35:36] or something similar
[15:37:10] PROBLEM - zuul_merger_service_running on scandium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-merger
[15:37:25] started it manually
[15:40:27] hashar: let me know when you want me to merge that last patch
[15:40:31] so
[15:40:34] I got it running
[15:40:44] although maybe we should puppetize the running state first?
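A minimal sketch of the two fixes worked out above, mounting the md2 array at /srv/ssd and pre-creating the parents of the configurable git_dir. The filesystem type and mount options are assumptions (the log confirms the device but not the fs), and the actual changes live in the Gerrit patches linked above:

```puppet
# Sketch only, not the merged patches. Puppet's file resource will not
# mkdir -p missing parents, so each level is declared explicitly.
file { '/srv/ssd':
    ensure => directory,
}

mount { '/srv/ssd':
    ensure  => mounted,
    device  => '/dev/md2',
    fstype  => 'ext4',              # assumption: fs type not stated in the log
    options => 'defaults,noatime',  # assumption
    require => File['/srv/ssd'],
}

# The other option mentioned above is an exec resource:
# exec { 'mkdir -p /srv/ssd/zuul/git': creates => '/srv/ssd/zuul/git' }
file { ['/srv/ssd/zuul', '/srv/ssd/zuul/git']:
    ensure  => directory,
    owner   => 'zuul',
    require => Mount['/srv/ssd'],
}
```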
[15:41:03] running puppet
[15:41:06] to make sure it starts the service
[15:41:25] there might be some weird interaction between systemd and the .pid file
[15:42:00] and puppet just refuses to start it
[15:42:03] :(-
[15:43:00] RECOVERY - zuul_merger_service_running on scandium is OK: PROCS OK: 1 process with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-merger
[15:43:17] service { 'zuul-merger':
[15:43:18] name => 'zuul-merger',
[15:43:19] enable => true,
[15:43:19] hasrestart => true,
[15:44:11] I see in the debug
[15:44:12] Debug: Executing '/usr/sbin/service zuul-merger status'
[15:44:13] Debug: Executing '/bin/systemctl show -pSourcePath zuul-merger'
[15:45:53] 5$ that puppet doesn't manage systemd properly :/
[15:46:04] it works in lots of other cases
[15:46:13] could be that the systemd setup in the package isn’t quite right...
[15:46:55] and puppet doesn't start the git-daemon either
[15:48:03] the package for Jessie doesn't have systemd
[15:48:42] PROBLEM - zuul_merger_service_running on scandium is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-merger
[15:48:53] ok — so sounds like either we need to add proper systemd scripts to the package, or add them externally with puppet
[15:49:15] yeah
[15:49:25] or figure out what is happening with puppet
[15:49:29] (CR) Alexandros Kosiaris: [C: 1] "LGTM, but I am not the best one around to judge that" [integration/config] - https://gerrit.wikimedia.org/r/252716 (https://phabricator.wikimedia.org/T110019) (owner: Zfilipin)
[15:49:30] service { 'git-daemon':
[15:49:30] ensure => running,
[15:49:31] enable => true,
[15:49:31] hasrestart => true,
[15:49:35] doesn't start it either :-(
[15:51:27] If there’s no systemd script then I wouldn’t expect anything to work. It’s not puppet’s fault, is it?
[15:54:18] I think it auto-detects the provider
[15:54:18] so on Jessie it assumes everything is systemd
[15:54:18] but then
[15:54:18] it executes /usr/sbin/service zuul-merger status
[15:54:18] and probably ends up being confused while trying to parse the output
[15:54:19] so
[15:54:19] I guess it is fail
[15:54:19] PROBLEM - App Server Main HTTP Response on deployment-mediawiki01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:54:19] till I figure out what is happening on the puppet side
[15:54:19] if you look in modules/openstack/manifests/designate/service.pp
[15:54:19] you can see me hacking around a similar problem, where a .deb doesn’t have systemd scripts
[15:54:19] look for 'These would be automatically included in a correct designate package'
[15:54:19] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:54:53] goood morninnniiing
[15:54:57] and you use base::service_unit
[15:55:12] ostriches: do you use scap3 in beta labs (deployment-prep)?
[15:55:53] hashar: fixing the packages so they work properly on Debian would be better, but I don’t really know how to do that :/
[15:56:06] i think I had a patch floating around
[15:56:25] but that means diverging the one for Jessie from the ones for Precise/Trusty
[15:56:25] but yeah
[15:56:27] will have to do that
[15:56:34] though, git-daemon doesn't work either
[15:57:54] RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 38934 bytes in 0.614 second response time
[15:57:55] andrewbogott: so I am calling it an end. Let's not pool it
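A hedged sketch of the designate-style workaround andrewbogott points at above (modules/openstack/manifests/designate/service.pp, or base::service_unit) for a .deb that ships no systemd unit on Jessie: Puppet installs the unit itself, reloads systemd, then manages the service. The unit file path and source are hypothetical, not the actual operations/puppet code:

```puppet
# Assumption: the package installs /usr/bin/zuul-merger but no unit file,
# so we ship one from the module (hypothetical source path).
file { '/etc/systemd/system/zuul-merger.service':
    ensure => present,
    owner  => 'root',
    group  => 'root',
    mode   => '0444',
    source => 'puppet:///modules/zuul/zuul-merger.service',  # hypothetical
    notify => Exec['zuul-merger systemd reload'],
}

exec { 'zuul-merger systemd reload':
    command     => '/bin/systemctl daemon-reload',
    refreshonly => true,
}

service { 'zuul-merger':
    ensure  => running,  # the behaviour being chased at this point in the log
    enable  => true,
    require => File['/etc/systemd/system/zuul-merger.service'],
}
```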
[15:58:10] ok :(
[15:58:25] will try to reproduce on labs
[15:58:33] and figure out what the heck is happening in puppet :D
[15:58:38] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 39292 bytes in 0.747 second response time
[15:58:40] zuul is workable
[15:58:44] but the git-daemon is not
[16:02:08] Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: Zuul-cloner should use hard links when fetching from cache-dir - https://phabricator.wikimedia.org/T97106#1811009 (hashar)
[16:02:09] Continuous-Integration-Scaling, operations, Patch-For-Review: install/deploy scandium as zuul merger (ci) server - https://phabricator.wikimedia.org/T95046#1811010 (hashar)
[16:02:11] Continuous-Integration-Scaling, operations: Upload new Zuul packages on apt.wikimedia.org for Precise / Trusty / Jessie - https://phabricator.wikimedia.org/T118340#1811007 (hashar) Open>Resolved Andrew uploaded them all :-} Thank you!
[16:02:12] andrewbogott: I started it manually to clear the icinga alarm.
[16:02:21] thanks
[16:02:25] will file a bunch of follow-up tasks
[16:02:33] and I guess we will want to reschedule something next week :-(
[16:02:51] RECOVERY - zuul_merger_service_running on scandium is OK: PROCS OK: 1 process with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-merger
[16:08:13] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:08:13] PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:08:14] andrewbogott: one last thing, do we set up the Icinga monitoring probe in the modules or on the role?
[16:08:25] hashar: modules, usually
[16:08:40] oh
[16:08:42] modules/zuul/manifests/monitoring/merger.pp !!
[16:12:36] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 1.71 ms
[16:26:29] Continuous-Integration-Scaling, operations, Puppet: On Jessie, puppet does not start zuul-merger via init scripts - https://phabricator.wikimedia.org/T118861#1811101 (hashar) NEW a: hashar
[16:27:02] andrewbogott: can I grab your hand to facepalm self?
[16:27:14] having zuul-merger not be run by puppet is intended
[16:27:25] just remembered about it when I proposed the task
[16:27:36] the idea is to be able to manually stop it without having puppet interfere
[16:28:56] was confused by enable => true
[16:29:00] which is really "start at boot"
[16:29:00] hashar: does that mean we’re done?
[16:29:11] and it lacks ensure => running,
[16:29:56] Continuous-Integration-Scaling, operations, Patch-For-Review: install/deploy scandium as zuul merger (ci) server - https://phabricator.wikimedia.org/T95046#1811113 (hashar)
[16:29:58] Continuous-Integration-Scaling, operations, Puppet: On Jessie, puppet does not start zuul-merger via init scripts - https://phabricator.wikimedia.org/T118861#1811111 (hashar) Open>Resolved zuul-merger does not have `ensure => running,` so we can stop it manually without having puppet to start...
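The enable/ensure distinction the resolved task hinges on, spelled out; a sketch consistent with the snippets quoted earlier in the log rather than the exact manifest:

```puppet
# 'enable' and 'ensure' are independent in a Puppet service resource.
service { 'zuul-merger':
    enable     => true,   # only registers the service to start at boot
    hasrestart => true,
    # ensure => running,  # deliberately omitted (T118861): without it,
    #                     # Puppet neither starts nor restarts the daemon,
    #                     # so an op can stop zuul-merger by hand and
    #                     # Puppet will not bring it straight back.
}
```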
[16:30:00] oh yeah :) So puppet was doing what we told it to do
[16:30:00] andrewbogott: yeah pretty much
[16:30:09] I will get the iptables rule added tomorrow with European ops
[16:30:12] yeah
[16:30:13] as usual
[16:30:32] the problem was between the keyboard / chair and the poor semantics used by puppet (enable vs ensure)
[16:30:40] ok — ping me tomorrow if things aren’t done by the time I’m awake
[16:30:59] + side
[16:31:07] the git-daemon will now be monitored
[16:31:53] and the parent directory of /srv/ssd/zuul/git is now created
[16:32:11] want me to merge https://gerrit.wikimedia.org/r/#/c/253622/ ?
[16:32:47] andrewbogott: yeah I think it is fine
[16:32:55] merely copy-pasted
[16:33:01] I can get the iptables rule lifted tomorrow since I got access to ferm rules on gallium
[16:33:28] and test everything works fine. If so the iptables patch can just be merged
[16:33:30] \O/
[16:34:25] thank you very much andrewbogott!
[16:35:24] Browser-Tests, Continuous-Integration-Config, Wikidata, Wikidata-Sprint-2015-11-03: create a Wikibase browser test job running against a fresh MediaWiki installation - https://phabricator.wikimedia.org/T118284#1811138 (JanZerebecki) Patch in wikibase that adds an initial browsertest: https://gerrit...
[16:36:15] Browser-Tests, Continuous-Integration-Config, Wikidata, Wikidata-Sprint-2015-11-03: create a Wikibase browser test job running against a fresh MediaWiki installation - https://phabricator.wikimedia.org/T118284#1811143 (JanZerebecki) Job: https://integration.wikimedia.org/ci/job/mwext-mw-selenium-co...
[16:47:46] PROBLEM - Host deployment-parsoidcache02 is DOWN: CRITICAL - Host Unreachable (10.68.16.145)
[16:52:13] (PS1) JanZerebecki: Add set_ext_dependencies to mwext-mw-selenium-composer [integration/config] - https://gerrit.wikimedia.org/r/253636 (https://phabricator.wikimedia.org/T118284)
[16:52:16] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 0.89 ms
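Judging by the check_procs output in the Icinga messages above, the module-level probe hashar found (modules/zuul/manifests/monitoring/merger.pp) plausibly looks like the sketch below; the exact parameters of the nrpe::monitor_service define are assumptions:

```puppet
# Hedged sketch of a module-level Icinga probe matching the
# "zuul_merger_service_running" check seen above; parameters assumed.
class zuul::monitoring::merger {
    nrpe::monitor_service { 'zuul_merger_service_running':
        description  => 'zuul_merger_service_running',
        nrpe_command => "/usr/lib/nagios/plugins/check_procs -c 1: --ereg-argument-array '^/usr/share/python/zuul/bin/python /usr/bin/zuul-merger'",
    }
}
```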
[16:55:18] (CR) JanZerebecki: "That job is not whitelisted in test_zuul_layout.py for the check pipeline." [integration/config] - https://gerrit.wikimedia.org/r/253343 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:01:33] (CR) Dduvall: [C: 2] Add set_ext_dependencies to mwext-mw-selenium-composer [integration/config] - https://gerrit.wikimedia.org/r/253636 (https://phabricator.wikimedia.org/T118284) (owner: JanZerebecki)
[17:02:46] (Merged) jenkins-bot: Add set_ext_dependencies to mwext-mw-selenium-composer [integration/config] - https://gerrit.wikimedia.org/r/253636 (https://phabricator.wikimedia.org/T118284) (owner: JanZerebecki)
[17:12:55] (CR) JanZerebecki: [C: 2] Run Ruby jobs using Rake [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:13:03] (PS5) JanZerebecki: Run Ruby jobs using Rake [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:13:13] (CR) JanZerebecki: [C: 2] Run Ruby jobs using Rake [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:14:06] !log Reloading Zuul to deploy I902e9dace28a6e5f42a71f90c86891cfb645b232
[17:14:39] (Merged) jenkins-bot: Run Ruby jobs using Rake [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:18:29] (CR) JanZerebecki: [C: -1] "Would make CI for that repo fail: https://gerrit.wikimedia.org/r/#/c/253637/1" [integration/config] - https://gerrit.wikimedia.org/r/252716 (https://phabricator.wikimedia.org/T110019) (owner: Zfilipin)
[17:19:07] (CR) JanZerebecki: [C: -1] "Would make the repo fail CI: https://gerrit.wikimedia.org/r/#/c/253637/1" [integration/config] - https://gerrit.wikimedia.org/r/252689 (https://phabricator.wikimedia.org/T110019) (owner: Zfilipin)
[17:24:33] PROBLEM - Puppet failure on deployment-eventlogging03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:27:27] (PS5) JanZerebecki: Code review by whitelisted users should triggers tests [integration/config] - https://gerrit.wikimedia.org/r/184886 (https://phabricator.wikimedia.org/T64429) (owner: Hashar)
[17:29:23] (CR) JanZerebecki: [C: 2] Code review by whitelisted users should triggers tests [integration/config] - https://gerrit.wikimedia.org/r/184886 (https://phabricator.wikimedia.org/T64429) (owner: Hashar)
[17:30:34] (CR) Legoktm: [C: 1] "Yay! Please announce this somewhere, as it is a pretty drastic behavior change (CR +1 is not useless anymore)" [integration/config] - https://gerrit.wikimedia.org/r/184886 (https://phabricator.wikimedia.org/T64429) (owner: Hashar)
[17:30:40] (Merged) jenkins-bot: Code review by whitelisted users should triggers tests [integration/config] - https://gerrit.wikimedia.org/r/184886 (https://phabricator.wikimedia.org/T64429) (owner: Hashar)
[17:31:48] (CR) JanZerebecki: [V: -1] "Waiting to get dependent patches merged." [integration/config] - https://gerrit.wikimedia.org/r/248663 (owner: Gergő Tisza)
[17:37:44] !log reload zuul for 339b575..a2e0173
[17:42:43] PROBLEM - Host deployment-parsoidcache02 is DOWN: CRITICAL - Host Unreachable (10.68.16.145)
[17:45:37] (CR) JanZerebecki: "Two of the changed repos are now failing." [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[17:45:47] zeljkof: here? ^^
[17:46:06] *sigh* probably means I need to revert.
[17:49:30] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<33.33%)
[18:21:19] (CR) JanZerebecki: "Tested on https://gerrit.wikimedia.org/r/#/c/251924/ . Mail sent to wikitech-l." [integration/config] - https://gerrit.wikimedia.org/r/184886 (https://phabricator.wikimedia.org/T64429) (owner: Hashar)
[18:22:05] (PS1) JanZerebecki: Revert "Run Ruby jobs using Rake" [integration/config] - https://gerrit.wikimedia.org/r/253649
[18:23:04] (PS2) JanZerebecki: Revert "Run Ruby jobs using Rake" [integration/config] - https://gerrit.wikimedia.org/r/253649
[18:23:12] (CR) JanZerebecki: [C: 2] Revert "Run Ruby jobs using Rake" [integration/config] - https://gerrit.wikimedia.org/r/253649 (owner: JanZerebecki)
[18:33:41] (Merged) jenkins-bot: Revert "Run Ruby jobs using Rake" [integration/config] - https://gerrit.wikimedia.org/r/253649 (owner: JanZerebecki)
[18:36:32] !log reloading zuul for a2e0173..9f35c8d
[18:38:30] Continuous-Integration-Infrastructure, Patch-For-Review: Zuul: run 'test' jobs on jenkins when trusted user votes +1 and only 'check' jobs was ran - https://phabricator.wikimedia.org/T64429#1811676 (JanZerebecki) Open>Resolved
[18:40:39] (CR) JanZerebecki: "Please reupload. We can try again when the repos under test are changed so that they will pass the job." [integration/config] - https://gerrit.wikimedia.org/r/252690 (https://phabricator.wikimedia.org/T114860) (owner: Zfilipin)
[18:44:16] hm, i'm having trouble setting up a new trebuchet deploy target
[18:44:22] things seem to work... but nothing happens
[18:44:24] bd808: ?
[18:44:36] i've done this
[18:44:36] https://gerrit.wikimedia.org/r/#/c/253637/
[18:44:44] i'm testing in both beta labs and in prod
[18:44:51] in prod, i see the new pillars get added on palladium
[18:45:07] then i run puppet on tin, but /srv/deployment/eventlogging/eventlogging never shows up
[18:46:57] ottomata: I can try to take a look in a few minutes
[18:47:05] k
[18:47:05] thanks
[18:47:49] hoping it's not due to case insensitivity
[18:47:53] maybe i'll try removing the old target
[18:49:10] don't really want to, as i'm not ready to force prod deploys from this new repo yet...
[19:05:21] Browser-Tests, Continuous-Integration-Config, Wikidata, Wikidata-Sprint-2015-11-17: [Task] Move Wikidata browsertests into Wikibase repository - https://phabricator.wikimedia.org/T118727#1811830 (JanZerebecki)
[19:05:43] Browser-Tests, Continuous-Integration-Config, Wikidata, Patch-For-Review, Wikidata-Sprint-2015-11-17: create a Wikibase browser test job running against a fresh MediaWiki installation - https://phabricator.wikimedia.org/T118284#1811831 (JanZerebecki)
[19:06:06] Continuous-Integration-Config, Wikidata, Patch-For-Review, Wikidata-Sprint-2015-11-17: [Task] Add Wikidata to extension-gate in CI - https://phabricator.wikimedia.org/T96264#1811834 (JanZerebecki)
[19:08:11] Browser-Tests, Continuous-Integration-Config, Wikidata, Patch-For-Review, Wikidata-Sprint-2015-11-17: create a Wikibase browser test job running against a fresh MediaWiki installation - https://phabricator.wikimedia.org/T118284#1811836 (JanZerebecki) a: JanZerebecki
[19:22:32] bd808: ping again, am a little lost atm. recommend another helper? :)
[19:22:45] ottomata: I think I figured it out
[19:22:52] oh!
[19:22:54] k...
[19:22:55] I believe you have to add your new repo in https://wikitech.wikimedia.org/wiki/Hiera:Deployment-prep
[19:22:56] is it case sensitive?
[19:23:04] oh, but in prod it's not moving either...
[19:23:15] no new repo checked out on tin
[19:23:25] will edit deployment prep and try
[19:24:01] ottomata: this gets to a place that needs root powers pretty quickly for debugging so I won't be much help outside of beta cluster
[19:24:21] when things are right your new repo should show up in /srv/pillars/deployment/repo_config.sls on the salt master
[19:24:30] yes, it is there
[19:24:37] on palladium
[19:24:55] the next thing to try then on tin would be `sudo salt-call deploy.deployment_server_init` and see if it gets a mention
[19:25:08] https://wikitech.wikimedia.org/wiki/Trebuchet#Repo_doesn.27t_exist_on_tin
[19:26:02] for beta cluster the new repo isn't listed on the salt master because it hasn't been added to the weird on-wiki hiera settings
[19:26:11] i just added it
[19:26:21] is puppetmaster salt master there?...
[19:26:24] Yippee, build fixed!
[19:26:25] Project beta-scap-eqiad build #78954: FIXED in 41 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/78954/
[19:26:33] no, -salt
[19:26:51] * bd808 forces puppet on deployment-salt
[19:27:02] ah ha!
[19:27:03] [ERROR ] Command '/usr/bin/git clone https://gerrit.wikimedia.org/r/eventlogging/.git /srv/deployment/eventlogging/eventlogging' failed with return code: 128
[19:27:06] thank you, now i'm onto something
[19:27:22] cool. permissions problem?
[19:27:45] [ERROR ] output: fatal: could not create work tree dir '/srv/deployment/eventlogging/eventlogging'.: Permission denied
[19:27:46] yeah
[19:28:06] drwxrwsr-x 3 sartoris wikidev 4096 Mar 16 2015 .
[19:28:09] vs trebuchet?
[19:28:34] should I just chown it?
[19:28:41] gonna try...
[19:28:42] yeah. sartoris was before the trebuchet rename
[19:29:13] i think that works.
[19:29:38] yes cool, and it is on deployment-bastion now too
[19:29:41] thank you bd808!
[19:29:50] ottomata: yw
[19:30:00] i shoulda just found that ref in the wiki myself, apologies for bugging, help much appreciated though!
[19:30:26] sure. I have that page pretty much memorized :/
[19:30:56] ha
[19:31:13] one more q bd808. does puppetmaster self not work on beta labs? i tried to apply it to a node so I can more easily test a puppet patch
[19:31:22] i guess i can cherry pick on puppetmaster...
[19:32:05] the beta cluster uses deployment-puppetmaster as the "self hosted" puppet
[19:32:24] right, but i should be able to override it for an individual node, no?
[19:32:28] so the way to test is by cherry picking your patches there (on top of the current stack)
[19:32:31] ok
[19:32:48] i'll just do that
[19:33:00] I think the hiera stuff we have set up makes overriding per-node hard
[19:33:26] ah, k
[19:33:42] yeah
[19:33:42] hiera > the configure instance page
[19:33:42] probably
[19:33:44] that is probably why
[19:34:13] yeah, I think we do it in the Hiera namespace on wikitech and that trumps all the other config locations
[19:50:22] hiera is first value winner-take-all for the most part, most specific value first
[20:04:35] RECOVERY - Puppet failure on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:07:01] (PS1) Gilles: Configure thumbor/exif-optimizer [integration/config] - https://gerrit.wikimedia.org/r/253668 (https://phabricator.wikimedia.org/T111722)
[20:25:30] PROBLEM - Puppet failure on deployment-eventlogging03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[21:57:21] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 0.81 ms
[22:03:47] PROBLEM - Host deployment-parsoidcache02 is DOWN: CRITICAL - Host Unreachable (10.68.16.145)
[22:10:38] (PS1) Krinkle: mediawiki/conf: Use wgDebugLogGroups['ratelimit'] instead of wgRateLimitLog [integration/jenkins] - https://gerrit.wikimedia.org/r/253762
[22:10:57] (CR) Krinkle: [C: 2] mediawiki/conf: Use wgDebugLogGroups['ratelimit'] instead of wgRateLimitLog [integration/jenkins] - https://gerrit.wikimedia.org/r/253762 (owner: Krinkle)
[22:13:43] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 1.08 ms
[22:19:45] (Merged) jenkins-bot: mediawiki/conf: Use wgDebugLogGroups['ratelimit'] instead of wgRateLimitLog [integration/jenkins] - https://gerrit.wikimedia.org/r/253762 (owner: Krinkle)
[22:37:05] PROBLEM - Host deployment-parsoidcache02 is DOWN: CRITICAL - Host Unreachable (10.68.16.145)
[23:42:18] RECOVERY - Host deployment-parsoidcache02 is UP: PING OK - Packet loss = 0%, RTA = 0.56 ms
[23:54:54] PROBLEM - Puppet failure on pmcache is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
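To make the hiera precedence bd808 describes at 19:50 concrete, a hedged sketch with a hypothetical key name; Hiera returns the value from the most specific hierarchy level that defines the key and ignores everything below it:

```puppet
# Hypothetical lookup. If Hiera:Deployment-prep on wikitech sets
# puppetmaster: deployment-puppetmaster, that level wins project-wide,
# which is why the per-node override attempted above never took effect.
$puppetmaster = hiera('puppetmaster', 'puppet.example.org')  # default only if no level defines it

notify { "using puppetmaster: ${puppetmaster}": }
```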