[00:24:16] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, Ops-Access-Requests: Allow RelEng nova log access - https://phabricator.wikimedia.org/T133992#2271484 (Dzahn) re: "the releng team". Are we talking about the "contint-admins" , "contint-users" groups? These are the ones...
[00:37:20] bd808, hi, do you know why this is failing? https://gerrit.wikimedia.org/r/#/c/287300/
[00:38:11] "puppet/modules/role/manifests/graph.pp:8 WARNING: variable not enclosed in {}" -- https://integration.wikimedia.org/ci/job/rake-jessie/36733/console
[00:38:45] it's the double quotes
[00:40:17] it thinks you're trying to access puppet variables
[00:41:06] also that config block is pretty gross. You should probably move it to a file or actually write it using puppet syntax
[00:42:09] I wouldn't have merged the config it is replacing either actually. Multiline strings aren't easy to deal with
[00:42:20] thanks! what is the puppet syntax for nested arrays?
[00:42:47] (i probaby should use google for taht :)
[00:43:18] { foo => { bar => { ... } }
[00:43:23] or my brain :)
[00:43:25] thanks
[00:44:23] look at how the centralauth role does it's settings block
[00:44:41] I think everything you are doing there can be in native puppet data structs
[00:47:02] updated
[00:48:17] now it's going to yell at you about your => not being aligned
[00:48:32] that puppet lint rule is a bit annoying
[00:48:41] people who don't align => are a bit annoying!
[00:49:06] the => for http, https, wikirawupload, and the otehr one need to line up vertically
[00:49:17] ori: touche
[00:49:44] at the first level I like it but in deep structures it gets harder to do the first tiem
[00:49:47] *time
[00:50:05] I suppose that means I need a better autoformat plugin
[00:50:26] bd808, meh, ok, will do. How can i auto-insert the port number for 'localhost:8080' ?
[00:51:05] ${::port_fragment}
[00:51:23] it's a global
[00:52:23] it includes the leading : too
[00:52:41] because it is possible to config the wiki to no need it
[00:53:11] so is it 'localhost:${::port_fragment}' or 'localhost${::port_fragment}'
[00:53:21] the second version
[00:55:27] meh, that thing is by far more OCD than i ever was
[00:55:51] actually "localhost${::port_fragment}"
[00:56:01] double quotes :)
[00:56:22] he did it right in the patch
[00:56:37] success!
[00:56:46] w00t
[00:56:59] want me to merge for you?
[00:57:05] yes please )
[00:57:45] thx!
[00:57:48] yw
[01:57:39] Project browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce build #40: FAILURE in 17 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-Group0-SmokeTests-linux-firefox-sauce/40/
[02:38:20] PROBLEM - Parsoid on deployment-parsoid06 is CRITICAL: Connection refused
[03:40:37] alex@alex-laptop:~$ ssh deployment-logstash2
[03:40:38] Permission denied (publickey).
[03:41:15] bah, it has one of these puppet confs with "server = deployment-puppetmaster.eqiad.wmflabs"
[03:43:04] How do we keep finding these old broken instances?!
[03:43:09] sigh
[03:50:32] Beta-Cluster-Infrastructure, Parsoid: deployment-parsoid06 puppet fails due to having role::parsoid::beta (requiring upstart) on jessie - https://phabricator.wikimedia.org/T134226#2271724 (Krenair)
[03:51:18] Beta-Cluster-Infrastructure, Services: deployment-restbase01 puppet fails due to issues with cassandra package - https://phabricator.wikimedia.org/T134630#2271728 (Krenair)
[04:15:47] Beta-Cluster-Infrastructure, Puppet: /etc/puppet/puppet.conf sometimes gets the old puppetmaster FQDN and breaks puppet - https://phabricator.wikimedia.org/T134631#2271750 (Krenair)
[04:16:52] Beta-Cluster-Infrastructure, Puppet: /etc/puppet/puppet.conf sometimes gets the old puppetmaster FQDN and breaks puppet - https://phabricator.wikimedia.org/T134631#2271764 (Krenair) This appears to have happened on deployment-memc02 since 13th April, I also found -logstash2 with the issue and -mathoid wi...
[04:18:12] twentyafterfour, phab-beta makes me sad because I have an SSH rule sending phab-* to .phabricator.eqiad.wmflabs :(
[04:18:48] also it has this puppet issue, which I haven't filed yet since you're probably still working on it: Error: /Stage[main]/Ganglia::Monitor::Service/Service[ganglia-monitor]: Could not evaluate: Could not find init script or upstart conf file for 'ganglia-monitor'
[04:30:09] Krenair: I have it in deployment-prep because I need to deploy via scap and cross-project deploy was broken in difficult to debug ways. It's named as it is because I deleted deployment-phab and recreating it with the same name did not work out.
[04:31:24] The ganglia error is a mystery to me - it's not anything I caused afaik, and I haven't figured out why it's happening. is there any way for you to just ignore all my phab related machines? they are used only by me and you don't need to concern yourself with their puppet failures.
[04:31:47] if they are bothering you with alerts I'd be glad to try to avoid that I just don't know what might be triggering them
[06:57:58] Beta-Cluster-Infrastructure, Puppet: /etc/puppet/puppet.conf sometimes gets the old puppetmaster FQDN and breaks puppet - https://phabricator.wikimedia.org/T134631#2271750 (hashar) Supposed to be fixed by https://gerrit.wikimedia.org/r/#/c/284852/ for T132689
[06:58:31] Beta-Cluster-Infrastructure, Puppet: /etc/puppet/puppet.conf sometimes gets the old puppetmaster FQDN and breaks puppet - https://phabricator.wikimedia.org/T134631#2271819 (hashar)
[06:58:33] Beta-Cluster-Infrastructure, Labs, Patch-For-Review, Puppet: /etc/puppet/puppet.conf keeps getting double content - first for labs-wide puppetmaster, then for the correct puppetmaster - https://phabricator.wikimedia.org/T132689#2206880 (hashar)
[11:18:13] PROBLEM - Host integration-dev is DOWN: CRITICAL - Host Unreachable (10.68.17.81)
[11:27:49] (PS3) Addshore: Add another pass test file with a namespace [tools/codesniffer] - https://gerrit.wikimedia.org/r/287281
[11:28:45] (PS1) Addshore: Change my author tag to addshore everywhere [tools/codesniffer] - https://gerrit.wikimedia.org/r/287401
[11:43:15] Release-Engineering-Team, MW-1.27-release, Performance, User-notice: First paint time regression on Wikimedia page views with wmf.23 roll-out - https://phabricator.wikimedia.org/T134553#2272037 (Jdforrester-WMF) Provisionally tagging this as blocking the MW 1.27 release (in case it's a fault in c...
[11:52:05] (PS1) Addshore: Remove reflection that doesn't seem to be needed any more [tools/codesniffer] - https://gerrit.wikimedia.org/r/287402
[12:51:44] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[13:09:21] MediaWiki-Codesniffer, Google-Summer-of-Code-2016: Improving an static analysis tools for MediaWiki - Weekly reports - https://phabricator.wikimedia.org/T134225#2272134 (Lethexie)
[13:12:58] RECOVERY - Puppet run on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0]
[15:05:24] PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:33:35] PROBLEM - Puppet run on deployment-memc03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:47:29] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #3: FAILURE in 25 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/3/
[16:35:58] twentyafterfour, if I leave machines with puppet errors alone are they all just going to become more and more broken or will you take care of them?
[16:45:02] Beta-Cluster-Infrastructure, Puppet: /etc/puppet/puppet.conf sometimes gets the old puppetmaster FQDN and breaks puppet - https://phabricator.wikimedia.org/T134631#2272446 (Krenair) I'm not sure how this is a duplicate?
[17:01:12] and what other phab-related machines are there in deployment-prep?
[20:14:35] Krenair: no other ones
[20:41:19] Browser-Tests, Wikidata: Sitelink browser test sometimes fails with firefox because of rate limit? - https://phabricator.wikimedia.org/T126585#2272652 (JanZerebecki)
[20:42:54] Project selenium-Echo » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #14: FAILURE in 1 min 53 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/14/
[20:42:58] Project selenium-Echo » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #14: FAILURE in 1 min 57 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/14/
[20:48:42] I'll try to fix the ganglia error, I think it must be related to one of the patches currently cherry-picked on the deployment puppetmaster
[21:22:37] RECOVERY - Puppet run on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:32:54] Release-Engineering-Team, Developer-Relations, Team-Practices, Patch-For-Review, User-greg: Set up Code Review office hours - https://phabricator.wikimedia.org/T128371#2272683 (mmodell)
[21:35:17] Release-Engineering-Team, Developer-Relations, Team-Practices, Patch-For-Review, User-greg: Set up Code Review office hours - https://phabricator.wikimedia.org/T128371#2272685 (mmodell) p:Low>Normal