[00:07:01] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL
[00:07:01] Notice: /Stage[main]/Phabricator/Phabricator::Conf_env[vcs]/File[/srv/phab/phabricator/conf/local/vcs.json]/group: group changed 'vcs' to 'phd'
[00:07:11] Notice: Finished catalog run in 21.16 seconds
[00:07:23] twentyafterfour: ^ all good, just fyi
[00:08:41] phabricator works on labs
[00:08:51] time to deprecate phabricator labs class
[00:09:06] just remember to use scap on localhost on labs.
[00:11:41] :))
[00:14:50] Deployment-Systems, Release-Engineering-Team, Operations: Trebuchet targets for test/testrepo are out of date - https://phabricator.wikimedia.org/T149180#2833693 (fgiunchedi) p:Triage>Low
[00:21:49] Beta-Cluster-Infrastructure, Patch-For-Review: Mobile view url broken on beta cluster (redirect, mobile view, etc.) - https://phabricator.wikimedia.org/T151894#2833715 (Jdforrester-WMF) Open>Resolved Looks like this is now fixed. Thanks!
[00:23:53] We are now getting into fixing startup issue of the phab-ssh daemon on systemd
[00:24:29] that will help with labs and phab2001 too i think
[00:34:53] it seems the systemd unit file for the ssh-phab.service does not get created on the labs jessie instance
[00:35:02] but on phab2001 we do have it
[00:35:50] which is also jessie. but it could be that it used to be installed by puppet in the past and then it stopped due to another change later
[00:36:35] since puppet is deactivated on phab2001.
i'd like to enable it and remove that file and run it and see if it comes back or not
[00:39:04] There seems to be a difference between ssh phabricator and phd (service)
[00:39:06] https://github.com/wikimedia/operations-puppet/blob/a9b55135045b9b7edd7dbc75dbec7fbe8097ca87/modules/phabricator/manifests/phd.pp#L27
[00:39:14] https://github.com/wikimedia/operations-puppet/blob/8d5ac3337641041ae92e2ed7fab4e5e5b30f3f15/modules/phabricator/manifests/vcs.pp#L104
[00:39:18] mutante ^^
[00:39:31] most likely just phd redirects to phabricator bin
[00:39:37] one moment paladox, be right back, testing something
[00:39:43] ok
[00:42:37] mutante i believe this https://github.com/wikimedia/operations-puppet/commit/6b6a5849e13b572fa64149925e313b6ad39a681f
[00:42:39] broke it
[00:42:46] can you confirm if git-ssh from phab is working?
[00:42:58] Oh how do i test that?
[00:43:04] good find, will get back to that in a minute
[00:43:06] is that from phabricator.wikimedia.org
[00:43:40] git-ssh.wikimedia.org
[00:44:01] phabricator ssh works
[00:44:06] good
[00:44:15] i ran puppet on phab2001
[00:44:25] ok
[00:44:26] :)
[00:44:27] and remember how last time it set the wrong IP
[00:44:33] and broke this
[00:44:33] Yep
[00:44:50] this means that phab2001 got a bunch of updates that had accumulated
[00:44:57] yep
[00:45:46] (PS1) Krinkle: Replace visualeditor-jsduck-jessie with npm-run-doc-jessie template [integration/config] - https://gerrit.wikimedia.org/r/324368
[00:45:58] so let me tell you the details from prod right away
[00:46:13] since you have the same issue in labs
[00:46:14] well, similar
[00:46:20] (PS2) Krinkle: Replace visualeditor-jsduck-jessie with npm-run-doc-jessie template [integration/config] - https://gerrit.wikimedia.org/r/324368
[00:46:48] phab2001 has 2 IPs, the first is phab2001.codfw.wmnet and the second phab2001-vcs.codfw.wmnet
[00:46:59] (CR) Krinkle: "Fixed in Idd396f1acaec78f."
[integration/config] - https://gerrit.wikimedia.org/r/323872 (owner: Jforrester)
[00:47:03] the second is the one your second SSHD runs on
[00:47:10] yep
[00:47:17] on iridium this is:
[00:47:46] yep
[00:47:57] iridium.eqiad.wmnet and phab1001-vcs.eqiad.wmnet
[00:48:02] yep
[00:48:08] you should imagine that iridium is already phab1001 for this
[00:48:36] now what you need on labs is a second private IP
[00:48:39] no need for public
[00:48:41] yep
[00:48:49] but you just need a second IP on the interface
[00:48:53] oh
[00:49:05] and then the second ssh server should start
[00:49:18] BUT .. it also does not right now because you dont have the systemd unit file
[00:49:30] and this brings us back to what you pasted above
[00:49:40] yep
[00:49:40] why does the unit file not get created by puppet for you
[00:50:01] Not sure I think it may be https://github.com/wikimedia/operations-puppet/commit/6b6a5849e13b572fa64149925e313b6ad39a681f
[00:51:49] i am moving that file on phab2001 and running puppet
[00:52:30] ok
[00:52:36] it does not come back
[00:52:41] and i see the behaviour you see
[00:52:46] Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/phabricator/sshd-phab.service
[00:53:00] wait.. that path
[00:53:05] Oh
[00:53:50] sshd-phab.service
[00:53:52] vs
[00:53:56] ssh-phab.service
[00:54:25] there is the extra "d" in the error but not in repo
[00:54:27] marxarelli: my vbguest plugin became a bigger deal today when we found out that Vagrant 1.9.0 is broken with our current plugin loading scheme -- https://gerrit.wikimedia.org/r/#/c/320277/
[01:01:04] i believe the fix is just moving the file https://gerrit.wikimedia.org/r/#/c/324369/
[01:02:20] just because that matches " $init_source = 'puppet:///modules/phabricator/sshd-phab.service'
[01:03:33] twentyafterfour: ^ sounds right?
i'll go ahead with that since it would not influence trusty iridium anyways
[01:03:51] but it should fix 2001 and labs on jessie
[01:03:56] ok
[01:14:13] Release-Engineering-Team, Operations, Parsoid: Provide a /parsoid directory on releases.wikimedia.org - https://phabricator.wikimedia.org/T150672#2833841 (fgiunchedi) p:Triage>Normal
[01:14:52] in iridium: nothing
[01:15:00] no more puppet errors on the phabricator class in labs
[01:15:01] on phab2001: service ssh-phab started
[01:15:03] :)
[01:15:13] started on the test instance too
[01:15:17] nice
[01:16:17] yep
[01:17:07] well, and we have phab2001 "back"
[01:17:18] puppet running i mean
[01:17:41] yep :)
[01:17:42] now let's see what else we need there
[01:17:49] ok
[01:18:16] (PS2) Reedy: Stop reference of string $content as an array [tools/codesniffer] - https://gerrit.wikimedia.org/r/283384 (https://phabricator.wikimedia.org/T127572) (owner: Aashaka)
[01:18:24] T137928
[01:18:25] (CR) jenkins-bot: [V: -1] Stop reference of string $content as an array [tools/codesniffer] - https://gerrit.wikimedia.org/r/283384 (https://phabricator.wikimedia.org/T127572) (owner: Aashaka)
[01:18:42] is the bug for that right
[01:18:57] i expected a bot to turn that into full URL
[01:19:48] mutante the bot is silent in this channel
[01:20:15] it wont reply to you in this channel but it will publish it to the task if you log it.
[01:20:44] ah, ok
[01:20:55] well then, at this point we take a break and continue more tomorrow
[01:21:05] yep
[01:35:28] (CR) Jforrester: [C: 1] Replace visualeditor-jsduck-jessie with npm-run-doc-jessie template [integration/config] - https://gerrit.wikimedia.org/r/324368 (owner: Krinkle)
[01:44:27] MediaWiki-Codesniffer: Should we require documentation for constructors? - https://phabricator.wikimedia.org/T146388#2659448 (Samwilson) You can just document the constructor parameters, and it'll pass.
I think it's worth documenting parameters for every method, including constructors. But no need, as you s...
[01:59:20] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 2 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2833920 (tstarling) >>! In T151702#2831448, @Joe wrote: > From a quick look, most threads seem effectively blocked in a very simple function: > > ``` > je...
[02:02:38] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2833926 (Krinkle) @cicalese If you are passing `ext.C...
[02:17:24] Yippee, build fixed!
[02:17:25] Project selenium-QuickSurveys » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #234: FIXED in 4 min 24 sec: https://integration.wikimedia.org/ci/job/selenium-QuickSurveys/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/234/
[02:48:57] (PS1) Samwilson: Return earlier when testing scope fields [tools/codesniffer] - https://gerrit.wikimedia.org/r/324376 (https://phabricator.wikimedia.org/T146439)
[02:52:56] (PS2) Samwilson: Return earlier when testing scope fields [tools/codesniffer] - https://gerrit.wikimedia.org/r/324376 (https://phabricator.wikimedia.org/T146439)
[02:57:58] MediaWiki-Codesniffer, Patch-For-Review: Undefined index: parenthesis_closer in SpaceBeforeControlStructureBraceSniff.php - https://phabricator.wikimedia.org/T146439#2833984 (Samwilson) a:Samwilson The above changes fix this and one other similar problem. phpcs runs fine on TextExtracts (well, there'...
[03:40:08] (CR) Legoktm: "Thanks, could you add a test case with code that previously would have triggered the warning?"
[tools/codesniffer] - https://gerrit.wikimedia.org/r/324376 (https://phabricator.wikimedia.org/T146439) (owner: Samwilson)
[04:18:20] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #219: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/219/
[05:13:32] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 2 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2834167 (tstarling) Filed upstream bug https://github.com/facebook/hhvm/issues/7515 , but we're not blocked on it, we can use the MALLOC_CONF environment v...
[06:01:21] Beta-Cluster-Infrastructure, Patch-For-Review: Mobile view url broken on beta cluster (redirect, mobile view, etc.) - https://phabricator.wikimedia.org/T151894#2831435 (phuedx) Thanks @Krenair! --- >>! In T151894#2833266, @Krenair wrote: > Oh, no, maybe not, I misunderstood the hackery going on here: >...
[06:40:48] Yippee, build fixed!
[06:40:48] Project selenium-Wikibase » chrome,test,Linux,contintLabsSlave && UbuntuTrusty build #193: FIXED in 2 hr 0 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=test,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/193/
[09:13:49] zeljkof: !;-)
[09:14:39] hashar: what's up? 🤔
[09:14:47] sorry about yesterday
[09:14:57] could not really assist on the npm/selenium job
[09:15:01] what about yesterday?
[09:15:08] ended up swamped trying to add tests for a hundred or so of extensions
[09:15:14] and it took slightly longer than expected
[09:15:19] oh, no problem, will continue with that today
[09:15:25] seen your patch to do the symlinks under /usr/local/bin
[09:15:37] lets try it!
[09:15:45] yeah, I have no idea if that would work
[09:18:27] it would :)
[09:19:04] it took me a while to figure out how it's done, the first hit on google was something completely different
[09:19:07] * zeljkof was confused
[09:19:35] yeah puppet is messy
[09:19:40] wanna deploy it?
[09:26:53] hashar: sorry, just saw your comment
[09:27:01] sure! let's deploy!
[09:27:14] meeting in the usual hangout?
[09:27:45] in a coworking place so that is not convenient
[09:27:59] I have no idea how to deploy :)
[09:28:04] * zeljkof is searching for docs
[09:28:12] puppet standalone
[09:28:14] basically
[09:28:16] ssh to integration-puppetmaster01.integration.eqiad.wmflabs
[09:28:17] sudo su -
[09:28:22] cd /var/lib/git/operations/puppet
[09:28:38] (that is the local checkout of ops/puppet.git that is read by the puppet master running on that instance)
[09:28:51] then git fetch && git cherry-pick
[09:29:02] the fetch url / reference is listed in Gerrit. Top right under "Download" link
[09:36:05] hashar: uh oh, never done that, let's see... :)
[09:37:56] Release-Engineering-Team, Operations, Parsoid: Provide a /parsoid directory on releases.wikimedia.org - https://phabricator.wikimedia.org/T150672#2792988 (Legoktm) A new directory can be created by defining it in puppet: https://github.com/wikimedia/operations-puppet/blob/production/modules/releases/...
[10:31:06] zeljkof: internet went down
[10:31:27] welcome back! 🎉
[10:32:23] hashar: I've left a comment at https://gerrit.wikimedia.org/r/#/c/324203/2
[10:32:39] I did cherry pick, but not sure if I have to run puppet manually, or if it would run automatically
[10:32:41] reading https://wikitech.wikimedia.org/wiki/Puppet
[10:32:49] puppet runs from a cron
[10:32:57] every maybe 20 minutes or so
[10:33:37] it was more than 20 minutes ago, so it should be applied then?
[10:33:58] I'll rerun one of the jobs and see if it can see chromedriver
[10:34:53] Browser-Tests-Infrastructure, Patch-For-Review, User-zeljkofilipin: Ensure ChromeDriver is installed for jobs that run Selenium tests - https://phabricator.wikimedia.org/T117418#2834427 (hashar) ``` $ ssh integration-saltmaster.integration.eqiad.wmflabs hashar@integration-saltmaster:~$ sudo su - roo...
[10:35:05] zeljkof: and salt lets one mass verify https://phabricator.wikimedia.org/T117418#2834427
[10:35:37] the next trick is that it is solely for the permanent slaves
[10:35:42] hashar: so it worked?!
[10:35:50] for the nodepool slaves, they are booting out of a snapshot that got generated yesterday
[10:35:56] so lack the link :D
[10:36:32] ok, but it will be there tomorrow?
[10:37:29] zeljkof: or we can refresh the images
[10:37:37] nodepool does it automatically at 14:14 UTC
[10:37:49] (on other news, nodepool get more instances to spawn ! https://grafana.wikimedia.org/dashboard/db/nodepool?panelId=1&fullscreen&from=now-24h&to=now )
[10:37:54] from 12 to 19!
[10:38:10] 19? why not 20?
[10:39:13] it is complicated :D
[10:39:29] goes with the quota
[10:39:37] we had up to 12 instances against a quota of 15
[10:39:48] leaving extra room for 3 instances
[10:40:06] two days ago, we had 3 instances leaked. They were in the openstack project but not known to nodepool
[10:40:09] so it worked fine
[10:40:26] when changing the quota to 20 instances
[10:40:43] that means nodepool could have 20 instances, add to that the 3 leaked instances and that is 23 instances
[10:40:54] or 23 instances * 2 CPU/instance = 46 CPU
[10:40:59] but the quota is 44 CPU
[10:41:05] hence openstack refused to boot an extra
[10:41:24] moving the quota down to 19 let us allow for up to 3 leaked instances as before
[10:41:27] (sorry all confusing)
[10:41:35] what is a leaked instance?
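Editor's note: the quota arithmetic from 10:39 to 10:41 above can be restated as a small Python sketch. The numbers (2 CPUs per instance, a 44-CPU quota, 3 leaked instances) come from the chat; the function and constant names are hypothetical and only illustrate the reasoning, they are not part of Nodepool or OpenStack.

```python
# Figures quoted in the chat; names are illustrative, not a real API.
CPUS_PER_INSTANCE = 2   # "23 instances * 2 CPU/instance"
CPU_QUOTA = 44          # "but the quota is 44 CPU"
LEAKED_INSTANCES = 3    # instances OpenStack runs but Nodepool no longer tracks


def cpus_needed(nodepool_max, leaked=LEAKED_INSTANCES,
                cpus_per_instance=CPUS_PER_INSTANCE):
    """Worst-case CPU usage: every Nodepool slot used plus the leaked instances."""
    return (nodepool_max + leaked) * cpus_per_instance


def fits_quota(nodepool_max, cpu_quota=CPU_QUOTA):
    """True if the worst case stays within the project's CPU quota."""
    return cpus_needed(nodepool_max) <= cpu_quota


def max_nodepool_instances(cpu_quota=CPU_QUOTA, leaked=LEAKED_INSTANCES,
                           cpus_per_instance=CPUS_PER_INSTANCE):
    """Largest Nodepool limit that still leaves room for the leaked instances."""
    return cpu_quota // cpus_per_instance - leaked


if __name__ == "__main__":
    for limit in (20, 19):
        print(limit, cpus_needed(limit), fits_quota(limit))
    print("max safe limit:", max_nodepool_instances())
```

With these figures a limit of 20 needs 46 CPUs and exceeds the quota, while 19 needs exactly 44, which matches the choice of 19 described in the chat.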
[10:41:36] the fix is to get Nodepool to detect leaked instances and delete the
[10:41:38] m
[10:41:40] ah leaked
[10:41:46] an instance that nodepool asked to spawn
[10:41:50] which get spawned by openstack
[10:41:58] but that nodepool erroneously forget/stop tracking
[10:42:13] so the instance is idling/doing nothing in openstack, and consumes its quota
[10:42:30] but nodepool hasn't acknowledged it / knows about it
[10:42:52] so eventually you could have 19 instances spawned in openstack. Nodepool would know about none and will try to spawn instances over and over
[10:43:10] only to get refused by openstack because the labs project has 19/19 instances used
[10:43:13] but why is 3 the magic number?
[10:43:21] that is what we had 2 days ago
[10:43:27] so merely set the same
[10:43:33] it is arbitrary really
[10:43:34] why wouldn't there be more, or less?
[10:43:37] I see
[10:44:17] meanwhile
[10:44:18] https://gerrit.wikimedia.org/r/#/c/324203/ is ready
[10:44:21] hashar: all *-jessie jobs are on nodepool, right?
[10:44:27] gotta drop the WIP, maybe add some more info to the commit message
[10:44:32] and we can get ops to review/merge the patch
[10:44:44] hashar: will do
[10:45:01] the -jessie -trusty jobs are on nodepool yes
[10:45:11] once the patch is merged, it is quite trivial to refresh nodepool snapshots
[10:45:15] oh, and -trusty too
[10:45:24] https://wikitech.wikimedia.org/wiki/Nodepool#Manually_generate_a_new_snapshot
[10:45:30] but the patch needs to be merged first?
[10:45:31] ssh labnodepool1001.eqiad.wmnet
[10:45:33] become-nodepool
[10:45:39] git -C /etc/nodepool/wikimedia/ pull
[10:45:42] nodepool image-update wmflabs-eqiad ci-jessie-wikimedia
[10:45:48] ^^^^4 lines :}
[10:45:54] and the patch has to be merged yes
[10:46:09] ok, working on the commit message
[10:46:11] the provisioning script will git pull from operations/puppet.git and does not support cherry pick / local hacks
[10:46:19] that is a limitation :(
[10:48:32] hashar: better?
https://gerrit.wikimedia.org/r/#/c/324203/
[10:48:40] * zeljkof is back in a few minutes
[10:51:38] coffee etc
[10:53:02] le café
[10:53:24] mafk: indeed :-}
[10:54:27] bien sûr
[11:03:08] back
[11:04:25] hashar hi
[11:06:15] lo
[11:07:33] hashar im not sure if you saw me wrote this last night
[11:07:34] http://snapshot.debian.org/package/python-shade/0.6.1-1/
[11:07:40] ^^ we could use that
[11:07:55] yeah maybe or maybe not
[11:08:11] as I said yesterday, I am not sure I am going to spend time trying to upgrade nodepool
[11:08:17] gotta look at the patches that fix leaked instances
[11:08:25] if I have confidence I can just cherry pick them I will just do that
[11:09:14] ok
[11:10:13] hashar if we decide to scap nodepool, are we going with Docker?
[11:14:52] most probably
[11:15:00] Dan did a proof of concept
[11:15:08] basically the source repo has a Dockerfile
[11:15:20] he has setup a basic Jessie instance that really just have docker installed
[11:15:26] clone the repo, run the docker command
[11:15:26] and report
[11:15:49] https://integration.wikimedia.org/ci/job/differential-docker-test/
[11:16:18] and the Dockerfile example is https://phabricator.wikimedia.org/D455
[11:18:48] !log Gerrit mediawiki/extensions/CentralNotice/BannerProxy.git Empty since 2014
[11:18:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:20:52] !log Gerrit made mediawiki/extensions/GuidedTour/guiders read-only (per README.md, no more used)
[11:20:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:22:11] !log Gerrit hide mediawiki/extensions/JsonData/JsonSchema Empty since 2013
[11:22:14] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[11:22:34] yep
[11:22:51] hashar would that allow us to use one instance and run many Docker's in it?
[11:22:59] yes
[11:23:07] with that proof of concept
[11:23:12] oh :) :)
[11:23:17] we would have permanent instances in wmflabs
[11:23:26] allowing X builds to run in parallel
[11:23:36] and the build being run in their own docker container
[11:23:37] s
[11:23:43] oh, i guess that would be the benefit allow multiple runs on the same instance.
[11:23:49] would it be as secure as nodepool?
[11:24:17] so we can remove the whitelist as planned?
[11:29:13] ideally yes
[11:31:51] Oh :)
[11:32:32] hashar i clean up the look of the ext dependencies in paramater_function in https://gerrit.wikimedia.org/r/#/c/323540/
[11:32:59] it is now all clear, making it easier for future users to add deps without jenkins failing hopefully
[11:37:59] Deployment-Systems, Scap3 (Scap3-Adoption-Phase1), scap, MediaWiki-JobRunner: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#2834598 (hashar)
[11:42:15] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 - https://phabricator.wikimedia.org/T151996#2834610 (hashar)
[11:43:14] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 - https://phabricator.wikimedia.org/T151996#2834625 (hashar) @dcausse @Gehel I am going to try to migrate the jobrunner service first ( T129148 ), then I guess jump into tr...
[11:43:34] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 (Trebuchet elasticsearch/plugins) - https://phabricator.wikimedia.org/T151996#2834627 (hashar)
[11:44:29] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 (Trebuchet elasticsearch/plugins) - https://phabricator.wikimedia.org/T151996#2834628 (Gehel) @hashar I only have very limited experience with scap3 and even less with treb...
[11:53:09] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 (Trebuchet elasticsearch/plugins) - https://phabricator.wikimedia.org/T151996#2834637 (hashar) I will level up myself on jobrunner then brain dump what I know and lead the...
[11:57:01] (PS1) Zfilipin: WIP Run experimental Node.js Selenium job for mediawiki/core in experimental pipeline [integration/config] - https://gerrit.wikimedia.org/r/324416 (https://phabricator.wikimedia.org/T139740)
[11:57:46] (Abandoned) Zfilipin: WIP mediawiki-core-qunit-jessie Jenkins job needs Vector skin [integration/config] - https://gerrit.wikimedia.org/r/324178 (https://phabricator.wikimedia.org/T139740) (owner: Zfilipin)
[11:58:11] hashar: how does this look? https://gerrit.wikimedia.org/r/#/c/324416/
[11:59:02] Scap3 (Scap3-Adoption-Phase1), scap, Discovery, Discovery-Search, Elasticsearch: Deploy elasticsearch plugins with scap3 (Trebuchet elasticsearch/plugins) - https://phabricator.wikimedia.org/T151996#2834644 (dcausse) Thanks @hashar! Let me know if I can help, I know the existing process but I...
[12:03:52] Yippee, build fixed!
[12:03:53] Project selenium-RelatedArticles » chrome,beta-mobile,Linux,contintLabsSlave && UbuntuTrusty build #228: FIXED in 2 min 51 sec: https://integration.wikimedia.org/ci/job/selenium-RelatedArticles/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta-mobile,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/228/
[12:09:04] (PS1) Hashar: test: add qa to list Gerrit repos not in Zuul [integration/config] - https://gerrit.wikimedia.org/r/324420
[12:18:37] (CR) Hashar: [C: 2] test: add qa to list Gerrit repos not in Zuul [integration/config] - https://gerrit.wikimedia.org/r/324420 (owner: Hashar)
[12:19:30] (Merged) jenkins-bot: test: add qa to list Gerrit repos not in Zuul [integration/config] - https://gerrit.wikimedia.org/r/324420 (owner: Hashar)
[12:21:11] (PS1) Hashar: Add debian-glue-non-voting to four repos [integration/config] - https://gerrit.wikimedia.org/r/324422
[12:22:35] (CR) Hashar: [C: 2] Add debian-glue-non-voting to four repos [integration/config] - https://gerrit.wikimedia.org/r/324422 (owner: Hashar)
[12:23:46] (Merged) jenkins-bot: Add debian-glue-non-voting to four repos [integration/config] - https://gerrit.wikimedia.org/r/324422 (owner: Hashar)
[13:08:57] sync-masters: 0% (ok: 0; fail: 0; left: 1)
[13:09:01] Shouldn't that be doing both?
[13:09:08] ie a sync-common on tin too?
[13:15:59] Release-Engineering-Team, Scap3: /srv/mediawiki on tin not being updated when using scap sync-file - https://phabricator.wikimedia.org/T152005#2834815 (Reedy)
[13:16:24] (PS1) Hashar: Tweak integration-config-qa email notification [integration/config] - https://gerrit.wikimedia.org/r/324437
[13:21:36] Reedy: sync-masters is just to sync /srv/mediawiki-staging between the deployment hosts
[13:22:04] sync-common I guess the deployment servers are targets of deployment and they are populated just like other mw app servers
[13:22:21] (CR) Hashar: [C: 2] "Looks nicer now :-}" [integration/config] - https://gerrit.wikimedia.org/r/324437 (owner: Hashar)
[13:23:44] (Merged) jenkins-bot: Tweak integration-config-qa email notification [integration/config] - https://gerrit.wikimedia.org/r/324437 (owner: Hashar)
[13:47:02] Yippee, build fixed!
[13:47:02] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #229: FIXED in 3 min 1 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/229/
[14:27:02] (PS1) Hashar: [EditPageTracking] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324450
[14:27:04] (PS1) Hashar: [OpenIDConnect] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324451
[14:27:06] (PS1) Hashar: [Auth_remoteuser] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324452
[14:34:08] (CR) Hashar: [C: 2] [EditPageTracking] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324450 (owner: Hashar)
[14:34:12] (CR) Hashar: [C: 2] [Auth_remoteuser] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324452 (owner: Hashar)
[14:34:15] (CR) Hashar: [C: 2] [OpenIDConnect] make tests voting [integration/config] -
https://gerrit.wikimedia.org/r/324451 (owner: Hashar)
[14:36:08] (Merged) jenkins-bot: [EditPageTracking] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324450 (owner: Hashar)
[14:36:55] (Merged) jenkins-bot: [OpenIDConnect] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324451 (owner: Hashar)
[14:36:57] (Merged) jenkins-bot: [Auth_remoteuser] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324452 (owner: Hashar)
[14:44:58] (PS1) Hashar: mediawiki-extensions-* jobs on Nodepool [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001)
[14:45:28] (CR) Paladox: [C: 1] "Yay" [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001) (owner: Hashar)
[14:45:38] Continuous-Integration-Config, Continuous-Integration-Scaling, releng-201516-q3, Patch-For-Review, WorkType-NewFunctionality: Migrate PHPUnit MediaWiki core jobs to Nodepool - https://phabricator.wikimedia.org/T135001#2835039 (hashar)
[14:50:04] (PS1) Hashar: mediawiki HHVM job from Trusty to Jessie [integration/config] - https://gerrit.wikimedia.org/r/324457
[14:54:41] (CR) Hashar: [C: 2] mediawiki HHVM job from Trusty to Jessie [integration/config] - https://gerrit.wikimedia.org/r/324457 (owner: Hashar)
[14:55:41] (Merged) jenkins-bot: mediawiki HHVM job from Trusty to Jessie [integration/config] - https://gerrit.wikimedia.org/r/324457 (owner: Hashar)
[14:57:18] (PS2) Hashar: mediawiki-extensions-* jobs on Nodepool [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001)
[14:59:27] (PS1) Hashar: [CryoKey] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324463
[15:03:42] (CR) Hashar: [C: 2] [CryoKey] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324463
(owner: Hashar)
[15:05:32] PROBLEM - Host deployment-elastic08 is DOWN: CRITICAL - Host Unreachable (10.68.21.29)
[15:06:33] (Merged) jenkins-bot: [CryoKey] make tests voting [integration/config] - https://gerrit.wikimedia.org/r/324463 (owner: Hashar)
[15:44:17] (PS3) Hashar: mediawiki-extensions-* jobs on Nodepool [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001)
[15:44:19] (PS1) Hashar: Drop mediawiki-phpunit-hhvm-jessie from experimental [integration/config] - https://gerrit.wikimedia.org/r/324470
[15:44:26] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:47:59] (CR) Hashar: [C: 2] Drop mediawiki-phpunit-hhvm-jessie from experimental [integration/config] - https://gerrit.wikimedia.org/r/324470 (owner: Hashar)
[15:48:53] (Merged) jenkins-bot: Drop mediawiki-phpunit-hhvm-jessie from experimental [integration/config] - https://gerrit.wikimedia.org/r/324470 (owner: Hashar)
[15:50:00] (CR) Hashar: [C: 2] mediawiki-extensions-* jobs on Nodepool [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001) (owner: Hashar)
[15:51:42] (Merged) jenkins-bot: mediawiki-extensions-* jobs on Nodepool [integration/config] - https://gerrit.wikimedia.org/r/324456 (https://phabricator.wikimedia.org/T135001) (owner: Hashar)
[16:07:55] hashar: does beta not have a beta restbase?
[16:08:17] (let me find out how to give you back an answer that does not sound like trolling)
[16:08:24] :D
[16:08:34] maybe they have docker images running on rackspace?
[16:08:48] honestly, I have no idea
[16:09:06] okay! :D
[16:09:17] maybe it is not quite possible to setup kafka/cassandra/whatever trendy tech on labs instance
[16:09:58] addshore: there is deployment-restbase01.deployment-prep.eqiad.wmflabs !!
[16:10:20] isnt VisualEditor relying on restbase to reach parsoid nowadays ?
[16:10:48] and there is
[16:10:49] wmf-config/LabsServices.php:$wmfAllServices['eqiad']['restbase'] = 'http://10.68.17.189:7231'; // deployment-restbase02.deployment-prep.eqiad.wmflabs
[16:11:25] that later url seems to respond addshore !
[16:12:11] oooh
[16:13:11] https://deployment.wikimedia.beta.wmflabs.org/api/rest_v1/page/html/User%3AErikaHerzog?redirect=false
[16:13:17] oooh, okay it is there
[16:13:26] Hi! I am active on the Wikimedia and Wikipedia projects as User:BrillLyle on English Wikipedia and am part of Wikimedia NYC.
[16:13:36] ;D
[16:13:37] but the doc page is a 404 https://deployment.wikimedia.beta.wmflabs.org/api/rest_v1
[16:13:41] wrong user
[16:13:53] haha, that was just a sample page ;)
[16:13:59] trailing slash issue I guess
[16:14:00] https://deployment.wikimedia.beta.wmflabs.org/api/rest_v1/
[16:14:01] works
[16:14:12] TIL restbase is available on beta :}}}
[16:14:22] ahh epic, so it does exist for beta, and it's in the same place! woo!
[16:14:23] there are some entry point that would not be available
[16:14:32] I have seen a task related to setting up page views api on beta
[16:14:44] and eventually I think folks will use a mock/fake database
[16:14:56] instead of trying to replicate the whole analytics cluster
[16:15:55] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 3 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2835239 (Joe) I have set arenas for jemalloc to be equal to the number of processors seen by the OS, the bandaid fix should be in the process of being remo...
[16:20:30] (PS1) Hashar: Switch to mediawiki-extensions-* jobs [integration/config] - https://gerrit.wikimedia.org/r/324477 (https://phabricator.wikimedia.org/T135001)
[16:21:00] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 3 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2835255 (Joe) So, with the HHVM part "solved" we still should take the prevention measures I named here: - Check the concurrency/retry/timeout rates of al...
[16:21:00] (PS2) Hashar: Switch to Nodepool mediawiki-extensions-* jobs [integration/config] - https://gerrit.wikimedia.org/r/324477 (https://phabricator.wikimedia.org/T135001)
[16:22:49] (CR) jenkins-bot: [V: -1] Switch to Nodepool mediawiki-extensions-* jobs [integration/config] - https://gerrit.wikimedia.org/r/324477 (https://phabricator.wikimedia.org/T135001) (owner: Hashar)
[16:31:28] (PS3) Hashar: Switch to mediawiki-extensions-* jobs [integration/config] - https://gerrit.wikimedia.org/r/324477 (https://phabricator.wikimedia.org/T135001)
[16:41:09] releng folks, gerrit seems to be broken
[16:41:34] giuseppe is looking at it, as of a minute ago in _security
[16:41:46] see -operations
[16:41:53] andrewbogott ^^
[16:41:54] heh, ok :)
[16:41:59] :)
[16:42:27] paladox: I just thought the releng people might want to know
[16:42:37] Oh ok
[16:42:46] :)
[16:56:38] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835374 (cicalese) Thank you @K...
[17:01:41] (CR) Hashar: "Should work (checked via experimental pipeline) but one never know. Will do tomorrow I guess."
[integration/config] - https://gerrit.wikimedia.org/r/324477 (https://phabricator.wikimedia.org/T135001) (owner: Hashar) [17:04:42] Project beta-code-update-eqiad build #132416: FAILURE in 1 min 41 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/132416/ [17:06:07] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835411 (cicalese) OK, tests pa... [17:09:17] Yippee, build fixed! [17:09:17] Project beta-code-update-eqiad build #132417: FIXED in 1 min 47 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/132417/ [17:11:01] !log rolling restart of deployment-elastic0* - upgrade to Java 8 - T151325 [17:11:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:12:53] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835448 (hashar) There is no ro... [17:13:06] I am off [17:16:00] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835453 (cicalese) Great! I see... [17:16:40] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 3 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2835456 (greg) >>! In T151702#2835255, @Joe wrote: > So, with the HHVM part "solved" we still should take the prevention measures I named here: > > - Chec...
[17:22:24] !log restart of logstash on deployment-logstash2 - upgrade to Java 8 - T151325 [17:22:28] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:23:50] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835488 (cicalese) One more que... [17:27:25] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835507 (hashar) Sure thing! Th... [17:29:52] Release-Engineering-Team, Operations, Parsing-Team, HHVM, and 3 others: API cluster failure / OOM - https://phabricator.wikimedia.org/T151702#2835543 (Joe) @greg yeah I know, I'll do my homework, promised :) I'm just waiting to see if the issue happens again in the next couple of days before clo... [17:30:03] Continuous-Integration-Config, MediaWiki-extensions-Other, Mobile, Patch-For-Review: CommentStreams: The module 'ext.CommentStreams' must not have target 'mobile' because its dependency 'jquery.ui.dialog' does not have it - https://phabricator.wikimedia.org/T151863#2835544 (cicalese) Awesome! Tha...
[18:18:20] Scap3: scap version flag - https://phabricator.wikimedia.org/T147155#2835730 (mmodell) Open>Resolved a:mmodell resolved by D448 [20:07:31] PROBLEM - Puppet run on deployment-cache-upload04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [20:14:29] Scap3, Parsoid: Scap rollback fails after promote completes - https://phabricator.wikimedia.org/T149012#2836204 (dduvall) Open>Resolved a:dduvall Implemented in {D439} [20:21:26] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590514 (fgiunchedi) After some discussion in https://gerrit.wikimedia.org/r/#/c/323559/ I've changed my vote to "automatica... [20:47:29] RECOVERY - Puppet run on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [21:07:32] Gerrit, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 - https://phabricator.wikimedia.org/T148478#2836410 (Dzahn) today we disabled gc on gerrit completely https://gerrit.wikimedia.org/r/#/c/323655/ this was linked to T151676 a related ticket [21:08:17] Gerrit, Operations, Beta-Cluster-reproducible, Patch-For-Review: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#2824332 (Dzahn) now gc is disabled.
also see T148478 [21:09:14] Gerrit, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 30/11/2016 - https://phabricator.wikimedia.org/T148478#2724179 (Dzahn) [21:22:17] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:22:48] PROBLEM - Puppet run on deployment-mx is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [21:23:14] PROBLEM - Puppet run on deployment-mediawiki04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:24:24] PROBLEM - Puppet run on integration-slave-trusty-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:24:24] PROBLEM - Puppet run on deployment-puppetmaster02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:24:24] PROBLEM - Puppet run on integration-slave-docker-1000 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:25:05] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:25:30] PROBLEM - Puppet run on deployment-stream is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:26:07] PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:27:05] PROBLEM - Puppet run on deployment-kafka05 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:27:27] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:27:38] PROBLEM - Puppet run on deployment-pdf01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [21:27:45] this will also be https://gerrit.wikimedia.org/r/#/c/256890/11 [21:27:48] and was reverted [21:28:10] PROBLEM - Puppet run on deployment-restbase02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold
[0.0] [21:28:18] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:28:22] PROBLEM - Puppet run on integration-slave-jessie-android is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:30:10] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:30:30] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:30:34] PROBLEM - Puppet run on deployment-poolcounter04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:31:10] PROBLEM - Puppet run on integration-publisher is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:32:05] PROBLEM - Puppet run on deployment-kafka01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:32:11] PROBLEM - Puppet run on deployment-ircd is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:32:17] PROBLEM - Puppet run on deployment-sentry01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:32:19] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:32:25] PROBLEM - Puppet run on deployment-secureredirexperiment is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:33:13] PROBLEM - Puppet run on deployment-memc04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:33:35] PROBLEM - Puppet run on deployment-db03 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [21:34:07] PROBLEM - Puppet run on deployment-ms-be01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:34:14] PROBLEM - Puppet run on deployment-kafka03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [21:40:36] Deployment-Systems,
Architecture, Availability: WikiDev 16 working area: Software engineering - https://phabricator.wikimedia.org/T119032#2836513 (daniel) [22:00:05] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0] [22:01:05] RECOVERY - Puppet run on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:02:03] RECOVERY - Puppet run on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0] [22:02:19] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:02:47] RECOVERY - Puppet run on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0] [22:03:14] RECOVERY - Puppet run on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:03:24] RECOVERY - Puppet run on integration-slave-jessie-android is OK: OK: Less than 1.00% above the threshold [0.0] [22:04:24] RECOVERY - Puppet run on integration-slave-docker-1000 is OK: OK: Less than 1.00% above the threshold [0.0] [22:04:24] RECOVERY - Puppet run on integration-slave-trusty-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [22:04:26] RECOVERY - Puppet run on deployment-puppetmaster02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:05:16] we good?
[22:05:31] RECOVERY - Puppet run on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0] [22:05:31] ah, I see daniel's comment [22:05:33] RECOVERY - Puppet run on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:05:33] RECOVERY - Puppet run on deployment-poolcounter04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:05:50] this == that == the puppet failures [22:06:20] it was a change in base, that's why it affected prod and labs [22:06:36] * greg-g nods [22:06:55] i will also restart the prod icinga bot now, or you would have seen spam in -operations too [22:07:10] RECOVERY - Puppet run on deployment-ircd is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:18] RECOVERY - Puppet run on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:20] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:26] RECOVERY - Puppet run on deployment-secureredirexperiment is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:28] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [22:07:36] RECOVERY - Puppet run on deployment-pdf01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:08:08] RECOVERY - Puppet run on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0] [22:08:12] RECOVERY - Puppet run on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:08:15] RECOVERY - Puppet run on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:09:04] RECOVERY - Puppet run on deployment-ms-be01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:09:14] RECOVERY - Puppet run on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:10:12] RECOVERY - Puppet run on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:11:09] RECOVERY - Puppet run on integration-publisher is OK: 
OK: Less than 1.00% above the threshold [0.0] [22:12:05] RECOVERY - Puppet run on deployment-kafka01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:13:35] RECOVERY - Puppet run on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:03:18] Gerrit, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 30/11/2016 - https://phabricator.wikimedia.org/T148478#2836728 (Paladox) The cpu seems to be still very high https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&c=Miscellane... [23:10:08] (PS11) Paladox: Support extension and skin dependacies in the skin pipeline and extension pipeline [integration/config] - https://gerrit.wikimedia.org/r/323540 (https://phabricator.wikimedia.org/T151593) [23:11:40] (PS12) Paladox: Support extension and skin dependacies in the skin pipeline and extension pipeline [integration/config] - https://gerrit.wikimedia.org/r/323540 (https://phabricator.wikimedia.org/T151593)