[01:51:00] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:56:40] <shinken-wm>	 PROBLEM - Puppet failure on deployment-test is CRITICAL 30.00% of data above the critical threshold [0.0]
[01:57:40] <shinken-wm>	 PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 40.00% of data above the critical threshold [0.0]
[01:58:10] <shinken-wm>	 PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[01:58:10] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[01:58:14] <shinken-wm>	 PROBLEM - Puppet failure on deployment-db1 is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:58:22] <shinken-wm>	 PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL 22.22% of data above the critical threshold [0.0]
[01:58:30] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:58:54] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 40.00% of data above the critical threshold [0.0]
[01:58:58] <shinken-wm>	 PROBLEM - Puppet failure on deployment-db2 is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:59:14] <shinken-wm>	 PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL 22.22% of data above the critical threshold [0.0]
[01:59:26] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 40.00% of data above the critical threshold [0.0]
[01:59:28] <shinken-wm>	 PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:59:30] <shinken-wm>	 PROBLEM - Puppet failure on deployment-bastion is CRITICAL 20.00% of data above the critical threshold [0.0]
[01:59:38] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluoride is CRITICAL 60.00% of data above the critical threshold [0.0]
[01:59:40] <shinken-wm>	 PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL 30.00% of data above the critical threshold [0.0]
[02:00:07] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluorine is CRITICAL 22.22% of data above the critical threshold [0.0]
[02:00:21] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 50.00% of data above the critical threshold [0.0]
[02:00:39] <shinken-wm>	 PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL 30.00% of data above the critical threshold [0.0]
[02:00:43] <shinken-wm>	 PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 40.00% of data above the critical threshold [0.0]
[02:01:37] <shinken-wm>	 PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 40.00% of data above the critical threshold [0.0]
[02:01:39] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 70.00% of data above the critical threshold [0.0]
[02:01:49] <shinken-wm>	 PROBLEM - Puppet failure on deployment-salt is CRITICAL 30.00% of data above the critical threshold [0.0]
[02:02:03] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 30.00% of data above the critical threshold [0.0]
[02:02:05] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL 44.44% of data above the critical threshold [0.0]
[02:02:07] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 44.44% of data above the critical threshold [0.0]
[02:02:07] <shinken-wm>	 PROBLEM - Puppet failure on deployment-upload is CRITICAL 44.44% of data above the critical threshold [0.0]
[02:02:07] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 44.44% of data above the critical threshold [0.0]
[02:03:09] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL 33.33% of data above the critical threshold [0.0]
[02:03:23] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL 50.00% of data above the critical threshold [0.0]
[02:03:23] <shinken-wm>	 PROBLEM - Puppet failure on deployment-stream is CRITICAL 60.00% of data above the critical threshold [0.0]
[02:13:10] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic06 is OK Less than 1.00% above the threshold [0.0]
[02:13:22] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic08 is OK Less than 1.00% above the threshold [0.0]
[02:13:32] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0]
[02:14:26] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoidcache02 is OK Less than 1.00% above the threshold [0.0]
[02:15:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-zookeeper01 is OK Less than 1.00% above the threshold [0.0]
[02:16:37] <shinken-wm>	 RECOVERY - Puppet failure on deployment-redis01 is OK Less than 1.00% above the threshold [0.0]
[02:16:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0]
[02:16:40] <shinken-wm>	 RECOVERY - Puppet failure on deployment-test is OK Less than 1.00% above the threshold [0.0]
[02:17:05] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mathoid is OK Less than 1.00% above the threshold [0.0]
[02:17:07] <shinken-wm>	 RECOVERY - Puppet failure on deployment-upload is OK Less than 1.00% above the threshold [0.0]
[02:17:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0]
[02:18:06] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0]
[02:18:12] <shinken-wm>	 RECOVERY - Puppet failure on deployment-apertium01 is OK Less than 1.00% above the threshold [0.0]
[02:18:22] <shinken-wm>	 RECOVERY - Puppet failure on deployment-videoscaler01 is OK Less than 1.00% above the threshold [0.0]
[02:18:56] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[02:19:36] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluoride is OK Less than 1.00% above the threshold [0.0]
[02:20:40] <shinken-wm>	 RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0]
[02:21:00] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0]
[02:22:08] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0]
[02:23:12] <shinken-wm>	 RECOVERY - Puppet failure on deployment-db1 is OK Less than 1.00% above the threshold [0.0]
[02:23:26] <shinken-wm>	 RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0]
[02:24:14] <shinken-wm>	 RECOVERY - Puppet failure on deployment-restbase01 is OK Less than 1.00% above the threshold [0.0]
[02:24:29] <shinken-wm>	 RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0]
[02:24:41] <shinken-wm>	 RECOVERY - Puppet failure on deployment-kafka02 is OK Less than 1.00% above the threshold [0.0]
[02:25:07] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluorine is OK Less than 1.00% above the threshold [0.0]
[02:25:23] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0]
[02:26:47] <shinken-wm>	 RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0]
[02:27:03] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0]
[02:27:03] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoid05 is OK Less than 1.00% above the threshold [0.0]
[02:28:59] <shinken-wm>	 RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0]
[02:29:27] <shinken-wm>	 RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0]
[04:02:24] <shinken-wm>	 PROBLEM - SSH on deployment-logstash1 is CRITICAL - Socket timeout after 10 seconds
[04:07:21] <shinken-wm>	 RECOVERY - SSH on deployment-logstash1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[04:40:59] <wmf-insecte>	 Yippee, build fixed!
[04:40:59] <wmf-insecte>	 Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #447: FIXED in 33 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/447/
[04:41:39] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki01 is CRITICAL - Socket timeout after 10 seconds
[04:46:34] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 47739 bytes in 2.448 second response time
[05:35:22] <wmf-insecte>	 Yippee, build fixed!
[05:35:23] <wmf-insecte>	 Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #422: FIXED in 33 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/422/
[06:17:49] <shinken-wm>	 PROBLEM - Puppet failure on deployment-salt is CRITICAL 20.00% of data above the critical threshold [0.0]
[06:43:45] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK
[06:47:49] <shinken-wm>	 RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0]
[07:31:16] <shinken-wm>	 PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused
[07:34:17] <wikibugs>	 Continuous-Integration-Infrastructure, Parsoid, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1292767 (Amire80) NEW
[07:36:16] <shinken-wm>	 RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.019 second response time
[08:05:51] <etonkovidova>	 zeljkof: can you see my message?
[08:06:02] <zeljkof>	 etonkovidova: yes! :)
[08:06:11] <etonkovidova>	 zeljkof: perfect!
[08:07:00] <greg-g>	 hola!
[08:10:51] <zeljkof>	 etonkovidova: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/graphs/contributors
[08:10:59] <zeljkof>	 greg-g: welcome to irccloud! :)
[08:13:41] <greg-g>	 zeljkof: I'm not on irccloud, I'm on my bouncer (via ssh) ;)
[08:13:52] <zeljkof>	 greg-g: cool :)
[08:13:58] <greg-g>	 irssi ftw ;)
[08:21:43] <hashar>	 greg-g: you are such a geek
[08:22:13] <wikibugs>	 Continuous-Integration-Infrastructure, Parsoid, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1292834 (mobrovac) This is caused by a problem in the VE <-> RESTBase communication. There seem to be two problems:  1. VE in Beta is trying to use th...
[08:22:16] <shinken-wm>	 PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused
[08:25:25] <etonkovidova>	 zeljkof: https://phabricator.wikimedia.org/T97364
[08:27:57] <wikibugs>	 Beta-Cluster, RESTBase, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1292841 (mobrovac) p:Triage>Unbreak! a:mobrovac
[08:37:17] <shinken-wm>	 RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.020 second response time
[08:39:32] <wikibugs>	 Continuous-Integration-Infrastructure: PHP Warning:  Module 'apc' already loaded in Unknown on line 0 - https://phabricator.wikimedia.org/T99413#1292854 (Legoktm)
[08:39:49] <wikibugs>	 Continuous-Integration-Infrastructure: PHP Warning:  Module 'apc' already loaded in Unknown on line 0 on zend slaves - https://phabricator.wikimedia.org/T99413#1292857 (Legoktm)
[08:48:52] <wikibugs>	 Continuous-Integration-Infrastructure: PHP Warning:  Module 'apc' already loaded in Unknown on line 0 on zend slaves - https://phabricator.wikimedia.org/T99413#1292872 (Nemo_bis) Legoktm said: > ...no, it was just failing on "08:11:01 42 | WARNING | Line exceeds 100 characters; contains 107 characters" and c...
[08:56:40] <shinken-wm>	 PROBLEM - Content Translation Server on deployment-sca02 is CRITICAL: Connection refused
[09:03:17] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 100.00% of data above the critical threshold [0.0]
[09:10:06] <wikibugs>	 Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1292920 (mmodell) @krenair:  that was me testing out a sprint, this is proposal for a real permanent project.
[09:12:25] <wikibugs>	 Beta-Cluster, Parsoid: Parsoid not binding to any port in Beta Cluster - https://phabricator.wikimedia.org/T99505#1292925 (mobrovac) NEW
[09:14:14] <wikibugs>	 Beta-Cluster, RESTBase, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1292935 (mobrovac)
[09:14:18] <wikibugs>	 Beta-Cluster, Parsoid: Parsoid not binding to any port in Beta Cluster - https://phabricator.wikimedia.org/T99505#1292936 (mobrovac)
[09:16:51] <wikibugs>	 Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1292945 (Legoktm) > It looks like we will start organizing SWAT deployments on maniphest  Says who and why? Where was this discussed with SWAT deployers / people who request SWAT deploys / people who someho...
[09:20:22] <wikibugs>	 Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1292948 (greg) Don't worry :) I just started an email thread with the people listed on Deployments page as SWATers after a suggestion from Jon R. Nothing decided yet, we just started the conversation.  When...
[09:22:24] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mx is CRITICAL 100.00% of data above the critical threshold [0.0]
[09:55:48] <wikibugs>	 Beta-Cluster, RESTBase, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1292985 (mobrovac) >>! In T99496#1292834, @mobrovac wrote: > This is caused by a problem in the VE <-> RESTBase communication. There seem to be two problems: >  > 1. VE in Be...
[09:56:07] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-restbase02 is OK Less than 1.00% above the threshold [3600.0]
[10:12:30] <wikibugs>	 Deployment-Systems, Services: Evaluate Ansible as a deployment tool - https://phabricator.wikimedia.org/T93433#1293000 (JeanFred) Just in case that’s useful, I’m a long time Ansible user and use it for deployments − happy to help out if I can.
[10:31:21] <shinken-wm>	 PROBLEM - Free space - all mounts on deployment-eventlogging02 is CRITICAL deployment-prep.deployment-eventlogging02.diskspace._var.byte_percentfree (<30.00%)
[11:11:43] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL - Socket timeout after 10 seconds
[11:28:02] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL 33.33% of data above the critical threshold [0.0]
[11:35:39] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluoride is CRITICAL 30.00% of data above the critical threshold [0.0]
[11:39:55] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 50.00% of data above the critical threshold [0.0]
[11:40:25] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 70.00% of data above the critical threshold [0.0]
[11:48:02] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoid05 is OK Less than 1.00% above the threshold [0.0]
[11:50:24] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoidcache02 is OK Less than 1.00% above the threshold [0.0]
[11:55:35] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluoride is OK Less than 1.00% above the threshold [0.0]
[11:59:52] <wmf-insecte>	 Project browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #112: FAILURE in 2 min 52 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/112/
[11:59:55] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[12:17:33] <grrrit-wm>	 (CR) TheDJ: "Note to self, link that page from the relevant: https://www.mediawiki.org/wiki/Manual:Coding_conventions" [integration/config] - https://gerrit.wikimedia.org/r/209991 (owner: TheDJ)
[12:38:22] <mobrovac>	 greg-g: if the brain trust in Annecy has any input on https://phabricator.wikimedia.org/T99505 , it would be highly appreciated
[12:49:50] <mobrovac>	 also, does anybody know what's the status of deployment-logstash1 ?
[12:50:31] <wikibugs>	 Continuous-Integration-Infrastructure, MediaWiki-extensions-Translate: mediawiki-extensions-hhvm: MessageGroupStatesUpdaterJobTest::testHooks is intermittent failing - https://phabricator.wikimedia.org/T88554#1293385 (Nikerabbit) p:High>Low
[13:01:52] <wikibugs>	 Beta-Cluster, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293401 (mobrovac) NEW
[13:08:27] <shinken-wm>	 PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL - Socket timeout after 10 seconds
[13:08:50] <wikibugs>	 Beta-Cluster, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293422 (mobrovac)
[13:09:16] <wikibugs>	 Beta-Cluster, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293401 (mobrovac)
[13:12:10] <wikibugs>	 Beta-Cluster, Parsoid: Parsoid not binding to any port in Beta Cluster - https://phabricator.wikimedia.org/T99505#1293428 (mobrovac) Heh, turns out the actual problem has nothing to do with Parsoid per se. As in T99506, Parsoid tries to create a gelf logger which connects to `deployment-logstash1`, but i...
[13:12:11] <shinken-wm>	 RECOVERY - Parsoid on deployment-parsoid05 is OK: HTTP OK: HTTP/1.1 200 OK - 1476 bytes in 0.078 second response time
[13:13:14] <mobrovac>	 any take / info on https://phabricator.wikimedia.org/T99521 ?
[13:13:21] <shinken-wm>	 RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 31029 bytes in 0.802 second response time
[13:13:26] <mobrovac>	 parsoid and restbase are failing in beta because of it
[13:22:43] <wikibugs>	 Beta-Cluster, Release-Engineering, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293441 (Aklapper)
[13:35:09] <wikibugs>	 Beta-Cluster, Release-Engineering, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293479 (Joe) We narrowed down the problem to be local to the ldap-backed dns, @Coren is looking into it at the moment.
[13:37:56] <wikibugs>	 Deployment-Systems, Services: Evaluate Ansible as a deployment tool - https://phabricator.wikimedia.org/T93433#1293487 (Joe) @Gwicke   >>! In T93433#1207219, @mmodell wrote: > Although ansible looks really coo l** I think it's going to be a tough sell as long as we are using puppet and salt as our 'offica...
[14:06:46] <wikibugs>	 Beta-Cluster, Release-Engineering, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293546 (coren) The issue may actually lie within dnsmasq after all; the instance is properly listed in the list it should be serving the name of, yet gives a SRVFAI...
[14:13:22] <wikibugs>	 Deployment-Systems, Services: Evaluate Ansible as a deployment tool - https://phabricator.wikimedia.org/T93433#1293559 (GWicke) > A tough one indeed.  We are primarily interested in rolling deployments here, which puppet doesn't support; there shouldn't be any direct competition vs. puppet at this point....
[14:42:28] <shinken-wm>	 PROBLEM - Puppet failure on integration-puppetmaster is CRITICAL 50.00% of data above the critical threshold [0.0]
[14:50:53] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 30.00% of data above the critical threshold [0.0]
[15:07:27] <shinken-wm>	 RECOVERY - Puppet failure on integration-puppetmaster is OK Less than 1.00% above the threshold [0.0]
[15:17:31] * Nemo_bis upgraded to fedora 21, now going through https://phabricator.wikimedia.org/diffusion/MWVA/browse/master/support/README-lxc.md or rather https://github.com/fgrehm/vagrant-lxc/wiki/Usage-on-fedora-hosts#fedora-21
[15:20:54] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[15:25:45] <Nemo_bis>	 Lines 17–47 can hopefully be skipped
[15:26:24] <wikibugs>	 Beta-Cluster, Parsoid: Parsoid not binding to any port in Beta Cluster - https://phabricator.wikimedia.org/T99505#1293752 (Andrew)
[15:26:26] <wikibugs>	 Beta-Cluster, Release-Engineering, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293748 (Andrew) Open>Resolved a:Andrew No idea why this broke, but a reboot fixed it.
[15:28:57] <wmf-insecte>	 Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #537: FAILURE in 56 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/537/
[15:29:43] <shinken-wm>	 PROBLEM - Puppet staleness on deployment-logstash1 is CRITICAL 22.22% of data above the critical threshold [43200.0]
[15:34:43] <shinken-wm>	 RECOVERY - Puppet staleness on deployment-logstash1 is OK Less than 1.00% above the threshold [3600.0]
[16:29:10] <shinken-wm>	 PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL 100.00% of data above the critical threshold [0.0]
[16:36:56] <Nemo_bis>	 no luck...
[16:36:56] <Nemo_bis>	 - Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
[16:44:16] <wikibugs>	 Beta-Cluster, Release-Engineering, Labs: No DNS entry for deployment-logstash1.eqiad.wmflabs ? - https://phabricator.wikimedia.org/T99521#1293890 (coren) As a further note (so that we can recognize the bug if it recurs), dnsmasq /did/ have a valid lease for that IP and had the correct name associated w...
[16:46:00] <shinken-wm>	 PROBLEM - Host integration-slave-trusty-1015 is DOWN: CRITICAL - Host Unreachable (10.68.18.30)
[16:46:12] <shinken-wm>	 PROBLEM - Host deployment-bastion is DOWN: CRITICAL - Host Unreachable (10.68.16.58)
[16:47:18] <shinken-wm>	 PROBLEM - Host deployment-test is DOWN: CRITICAL - Host Unreachable (10.68.16.149)
[16:47:34] <shinken-wm>	 PROBLEM - Host deployment-memc03 is DOWN: PING CRITICAL - Packet loss = 100%
[16:47:44] <shinken-wm>	 PROBLEM - Host deployment-parsoid05 is DOWN: PING CRITICAL - Packet loss = 100%
[16:47:46] <shinken-wm>	 PROBLEM - Host deployment-restbase01 is DOWN: PING CRITICAL - Packet loss = 100%
[16:48:38] <shinken-wm>	 PROBLEM - Host deployment-rsync01 is DOWN: PING CRITICAL - Packet loss = 100%
[16:48:44] <shinken-wm>	 PROBLEM - Host integration-slave-trusty-1013 is DOWN: PING CRITICAL - Packet loss = 100%
[16:49:04] <shinken-wm>	 PROBLEM - Host deployment-urldownloader is DOWN: PING CRITICAL - Packet loss = 100%
[16:49:16] <shinken-wm>	 PROBLEM - Host deployment-elastic08 is DOWN: CRITICAL - Host Unreachable (10.68.17.188)
[16:49:30] <shinken-wm>	 PROBLEM - Host integration-raita is DOWN: CRITICAL - Host Unreachable (10.68.16.53)
[16:49:48] <shinken-wm>	 PROBLEM - Host deployment-cache-text02 is DOWN: CRITICAL - Host Unreachable (10.68.16.16)
[16:50:21] <shinken-wm>	 PROBLEM - Host deployment-pdf01 is DOWN: PING CRITICAL - Packet loss = 100%
[16:50:23] <shinken-wm>	 PROBLEM - Host deployment-salt is DOWN: CRITICAL - Host Unreachable (10.68.16.99)
[16:50:25] <shinken-wm>	 PROBLEM - Host Generic Beta Cluster is DOWN: CRITICAL - Host Unreachable (en.wikipedia.beta.wmflabs.org)
[16:54:13] <shinken-wm>	 PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL 20.00% of data above the critical threshold [0.0]
[16:56:56] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 60.00% of data above the critical threshold [0.0]
[16:57:04] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 50.00% of data above the critical threshold [0.0]
[16:57:40] <shinken-wm>	 PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 22.22% of data above the critical threshold [0.0]
[16:58:10] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 40.00% of data above the critical threshold [0.0]
[16:59:18] <shinken-wm>	 PROBLEM - Puppet failure on deployment-db1 is CRITICAL 30.00% of data above the critical threshold [0.0]
[16:59:27] <shinken-wm>	 PROBLEM - Puppet failure on deployment-stream is CRITICAL 50.00% of data above the critical threshold [0.0]
[17:00:03] <shinken-wm>	 PROBLEM - Puppet failure on deployment-db2 is CRITICAL 33.33% of data above the critical threshold [0.0]
[17:00:31] <shinken-wm>	 PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL 20.00% of data above the critical threshold [0.0]
[17:00:42] <shinken-wm>	 PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL 44.44% of data above the critical threshold [0.0]
[17:01:08] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluorine is CRITICAL 40.00% of data above the critical threshold [0.0]
[17:01:24] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 60.00% of data above the critical threshold [0.0]
[17:01:46] <shinken-wm>	 PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 100.00% of data above the critical threshold [0.0]
[17:02:38] <shinken-wm>	 PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL 40.00% of data above the critical threshold [0.0]
[17:03:02] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 30.00% of data above the critical threshold [0.0]
[17:04:12] <shinken-wm>	 PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL 20.00% of data above the critical threshold [0.0]
[17:04:24] <shinken-wm>	 PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL 60.00% of data above the critical threshold [0.0]
[17:06:28] <shinken-wm>	 PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 40.00% of data above the critical threshold [0.0]
[17:07:39] <shinken-wm>	 PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 11.11% of data above the critical threshold [0.0]
[17:08:11] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 30.00% of data above the critical threshold [0.0]
[17:08:11] <shinken-wm>	 PROBLEM - Puppet failure on deployment-upload is CRITICAL 20.00% of data above the critical threshold [0.0]
[17:08:57] <shinken-wm>	 RECOVERY - Host deployment-elastic08 is UPING OK - Packet loss = 0%, RTA = 0.59 ms
[17:09:03] <shinken-wm>	 RECOVERY - Host deployment-memc03 is UPING OK - Packet loss = 0%, RTA = 0.69 ms
[17:09:15] <shinken-wm>	 RECOVERY - Host deployment-parsoid05 is UPING OK - Packet loss = 0%, RTA = 0.86 ms
[17:09:39] <shinken-wm>	 RECOVERY - Host deployment-rsync01 is UPING OK - Packet loss = 0%, RTA = 0.60 ms
[17:09:49] <shinken-wm>	 RECOVERY - Host deployment-cache-text02 is UPING OK - Packet loss = 0%, RTA = 0.62 ms
[17:09:57] <shinken-wm>	 RECOVERY - Host deployment-bastion is UPING OK - Packet loss = 0%, RTA = 0.63 ms
[17:10:07] <shinken-wm>	 RECOVERY - Host deployment-pdf01 is UPING OK - Packet loss = 0%, RTA = 0.60 ms
[17:10:13] <shinken-wm>	 RECOVERY - Host deployment-urldownloader is UPING OK - Packet loss = 0%, RTA = 0.55 ms
[17:10:17] <shinken-wm>	 PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL 50.00% of data above the critical threshold [0.0]
[17:10:23] <shinken-wm>	 RECOVERY - Host deployment-restbase01 is UPING OK - Packet loss = 0%, RTA = 0.76 ms
[17:10:33] <shinken-wm>	 RECOVERY - Host deployment-test is UPING OK - Packet loss = 0%, RTA = 0.75 ms
[17:11:21] <shinken-wm>	 RECOVERY - Host Generic Beta Cluster is UPING OK - Packet loss = 0%, RTA = 0.45 ms
[17:11:39] <shinken-wm>	 PROBLEM - Puppet failure on deployment-fluoride is CRITICAL 40.00% of data above the critical threshold [0.0]
[17:11:43] <shinken-wm>	 PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL 66.67% of data above the critical threshold [0.0]
[17:11:51] <shinken-wm>	 RECOVERY - Host deployment-salt is UPING OK - Packet loss = 0%, RTA = 0.48 ms
[17:12:00] <greg-g>	 eh?
[17:12:01] <shinken-wm>	 RECOVERY - Host integration-slave-trusty-1013 is UPING OK - Packet loss = 0%, RTA = 0.82 ms
[17:12:02] <shinken-wm>	 RECOVERY - Host integration-slave-trusty-1015 is UPING OK - Packet loss = 0%, RTA = 0.55 ms
[17:13:43] <shinken-wm>	 PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 55.56% of data above the critical threshold [0.0]
[17:14:07] <shinken-wm>	 RECOVERY - Host integration-raita is UPING OK - Packet loss = 0%, RTA = 0.71 ms
[17:14:09] <shinken-wm>	 PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 40.00% of data above the critical threshold [0.0]
[17:14:33] <shinken-wm>	 PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 80.00% of data above the critical threshold [0.0]
[17:15:29] <shinken-wm>	 PROBLEM - Puppet failure on deployment-bastion is CRITICAL 71.43% of data above the critical threshold [0.0]
[17:24:18] <shinken-wm>	 RECOVERY - Puppet failure on deployment-db1 is OK Less than 1.00% above the threshold [0.0]
[17:24:26] <shinken-wm>	 RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0]
[17:25:00] <shinken-wm>	 RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0]
[17:25:30] <shinken-wm>	 RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0]
[17:25:44] <shinken-wm>	 RECOVERY - Puppet failure on deployment-kafka02 is OK Less than 1.00% above the threshold [0.0]
[17:26:08] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluorine is OK Less than 1.00% above the threshold [0.0]
[17:26:22] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0]
[17:27:14] <mobrovac>	 heh, was alone today in the future-of-deployment meeting
[17:27:38] <shinken-wm>	 RECOVERY - Puppet failure on deployment-logstash1 is OK Less than 1.00% above the threshold [0.0]
[17:28:07] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0]
[17:29:13] <shinken-wm>	 RECOVERY - Puppet failure on deployment-elastic06 is OK Less than 1.00% above the threshold [0.0]
[17:31:27] <shinken-wm>	 RECOVERY - Puppet failure on deployment-parsoidcache02 is OK Less than 1.00% above the threshold [0.0]
[17:32:35] <greg-g>	 mobrovac: everyone is out for a walk/doing some shopping for things they forgot. I'm stuck at the hotel in a budget meeting
[17:32:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-redis01 is OK Less than 1.00% above the threshold [0.0]
[17:32:55] <greg-g>	 well, all of my people that is ;)
[17:33:00] <mobrovac>	 greg-g: yeah, no pb, figured as much
[17:33:02] <mobrovac>	 lucky you
[17:33:05] <mobrovac>	 fight fight fight
[17:33:06] <mobrovac>	 :)
[17:33:07] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mathoid is OK Less than 1.00% above the threshold [0.0]
[17:34:33] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0]
[17:34:44] <greg-g>	 mobrovac: :P
[17:36:39] <shinken-wm>	 RECOVERY - Puppet failure on deployment-fluoride is OK Less than 1.00% above the threshold [0.0]
[17:36:41] <shinken-wm>	 RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0]
[17:36:41] <shinken-wm>	 RECOVERY - Puppet failure on deployment-zookeeper01 is OK Less than 1.00% above the threshold [0.0]
[17:36:57] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[17:37:43] <shinken-wm>	 RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0]
[17:38:13] <shinken-wm>	 RECOVERY - Puppet failure on deployment-upload is OK Less than 1.00% above the threshold [0.0]
[17:38:41] <shinken-wm>	 RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0]
[17:39:09] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0]
[17:39:11] <shinken-wm>	 RECOVERY - Puppet failure on deployment-apertium01 is OK Less than 1.00% above the threshold [0.0]
[17:39:25] <shinken-wm>	 RECOVERY - Puppet failure on deployment-videoscaler01 is OK Less than 1.00% above the threshold [0.0]
[17:40:20] <shinken-wm>	 RECOVERY - Puppet failure on deployment-restbase02 is OK Less than 1.00% above the threshold [0.0]
[17:40:32] <shinken-wm>	 RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0]
[17:42:04] <shinken-wm>	 RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0]
[17:43:14] <shinken-wm>	 RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0]
[18:21:33] <shinken-wm>	 PROBLEM - Puppet failure on integration-zuul-packaged is CRITICAL 100.00% of data above the critical threshold [0.0]
[18:22:36] <wikibugs>	 Beta-Cluster, RESTBase, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1294203 (mobrovac)
[18:22:39] <wikibugs>	 Beta-Cluster, Parsoid: Parsoid not binding to any port in Beta Cluster - https://phabricator.wikimedia.org/T99505#1294201 (mobrovac) Open>Resolved a:mobrovac
[18:43:14] <wikibugs>	 Beta-Cluster, RESTBase, VisualEditor: VisualEditor doesn't work on the beta-cluster - https://phabricator.wikimedia.org/T99496#1294339 (mobrovac) Open>Resolved
[18:47:15] <wikibugs>	 Continuous-Integration-Infrastructure, Jenkins: Let Jenkins-mwext-sync clean up own open unmergable patch sets - https://phabricator.wikimedia.org/T99552#1294357 (Umherirrender) NEW
[19:18:44] <wikibugs>	 Beta-Cluster, RESTBase-Cassandra: Cassandra gets the wrong IPs in deployment-prep - https://phabricator.wikimedia.org/T99564#1294541 (mobrovac) NEW
[20:26:21] <shinken-wm>	 RECOVERY - Free space - all mounts on deployment-eventlogging02 is OK All targets OK
[20:41:40] <wikibugs>	 Deployment-Systems, Collaboration-Team, Thanks, I18n: "Thanks" button in language which is not the interface one - https://phabricator.wikimedia.org/T99575#1294860 (Nemo_bis)
[20:56:10] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL - Socket timeout after 10 seconds
[20:56:10] <shinken-wm>	 PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL - Socket timeout after 10 seconds
[20:59:39] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki03 is OK: HTTP OK: HTTP/1.1 200 OK - 46625 bytes in 4.356 second response time
[21:00:53] <shinken-wm>	 RECOVERY - App Server Main HTTP Response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 47738 bytes in 0.767 second response time
[21:07:13] <wikibugs>	 Deployment-Systems, Collaboration-Team, Thanks, I18n: "Thanks" button in language which is not the interface one - https://phabricator.wikimedia.org/T99575#1294906 (Nemo_bis) Open>Resolved a:Nemo_bis Actually the search failed me https://translatewiki.net/w/i.php?title=MediaWiki:Thanks-tha...
[21:17:26] <shinken-wm>	 PROBLEM - Puppet staleness on deployment-eventlogging02 is CRITICAL 100.00% of data above the critical threshold [43200.0]
[21:38:45] <wikibugs>	 Continuous-Integration-Infrastructure, Release-Engineering, Wikimedia-Hackathon-2015, Wikipedia-Android-App, Wikipedia-iOS-App: Hacking: Create end-to-end automated test for Wikipedia native app(s) - https://phabricator.wikimedia.org/T90177#1294965 (BGerstle-WMF) a:BGerstle-WMF
[22:18:23] <wmf-insecte>	 Project browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce build #123: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/123/
[22:47:17] <wikibugs>	 Browser-Tests, Release-Engineering: Net::ReadTimeout shouldn't mark a test as failed - https://phabricator.wikimedia.org/T98968#1295118 (Jdlrobson) This is hitting us a lot today. Examples: https://integration.wikimedia.org/ci/view/Mobile/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrom...
[23:37:48] <wikibugs>	 Browser-Tests, Release-Engineering, Hackathon-Mexico-City-2015, Wikimedia-Hackathon-2015: Create pool of user accounts on beta cluster for browser test builds in Jenkins - https://phabricator.wikimedia.org/T90964#1295145 (hashar) Open>declined a:hashar I don't see a good use case for now....