[06:35:18] RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK [07:26:34] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 40.00% of data above the critical threshold [0.0] [07:27:13] PROBLEM - Puppet failure on deployment-db2 is CRITICAL 44.44% of data above the critical threshold [0.0] [07:27:37] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 50.00% of data above the critical threshold [0.0] [07:36:31] PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 20.00% of data above the critical threshold [0.0] [07:38:21] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 33.33% of data above the critical threshold [0.0] [07:38:33] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:39:28] PROBLEM - Puppet failure on deployment-urldownloader is CRITICAL 50.00% of data above the critical threshold [0.0] [07:39:46] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL 20.00% of data above the critical threshold [0.0] [07:40:12] PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 55.56% of data above the critical threshold [0.0] [07:41:10] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 33.33% of data above the critical threshold [0.0] [07:41:10] PROBLEM - Puppet failure on deployment-test is CRITICAL 22.22% of data above the critical threshold [0.0] [07:41:28] PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:41:40] PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:41:40] PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:41:54] PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 40.00% of data above the critical threshold [0.0] [07:41:54] PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL 30.00% of data above the 
critical threshold [0.0] [07:42:14] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL 66.67% of data above the critical threshold [0.0] [07:42:20] PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL 40.00% of data above the critical threshold [0.0] [07:42:24] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:42:26] PROBLEM - Puppet failure on deployment-stream is CRITICAL 30.00% of data above the critical threshold [0.0] [07:42:26] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 40.00% of data above the critical threshold [0.0] [07:42:38] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:42:38] PROBLEM - Puppet failure on deployment-upload is CRITICAL 20.00% of data above the critical threshold [0.0] [07:42:42] PROBLEM - Puppet failure on deployment-salt is CRITICAL 40.00% of data above the critical threshold [0.0] [07:42:52] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL 40.00% of data above the critical threshold [0.0] [07:43:07] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 55.56% of data above the critical threshold [0.0] [07:43:07] PROBLEM - Puppet failure on deployment-redis02 is CRITICAL 44.44% of data above the critical threshold [0.0] [07:43:31] PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:43:43] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:44:07] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 44.44% of data above the critical threshold [0.0] [07:44:11] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 44.44% of data above the critical threshold [0.0] [07:44:13] PROBLEM - Puppet failure on deployment-db1 is CRITICAL 30.00% of data above the critical threshold [0.0] [07:44:35] PROBLEM - Puppet failure on 
deployment-cxserver03 is CRITICAL 60.00% of data above the critical threshold [0.0] [07:44:41] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL 60.00% of data above the critical threshold [0.0] [07:44:57] PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL 60.00% of data above the critical threshold [0.0] [07:45:09] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL 66.67% of data above the critical threshold [0.0] [07:45:56] PROBLEM - Puppet failure on deployment-fluorine is CRITICAL 60.00% of data above the critical threshold [0.0] [07:45:56] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 60.00% of data above the critical threshold [0.0] [07:46:18] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 55.56% of data above the critical threshold [0.0] [07:52:35] RECOVERY - Puppet failure on deployment-memc02 is OK Less than 1.00% above the threshold [0.0] [07:52:41] RECOVERY - Puppet failure on deployment-upload is OK Less than 1.00% above the threshold [0.0] [07:54:13] RECOVERY - Puppet failure on deployment-db1 is OK Less than 1.00% above the threshold [0.0] [07:55:11] RECOVERY - Puppet failure on deployment-sca02 is OK Less than 1.00% above the threshold [0.0] [07:56:09] RECOVERY - Puppet failure on deployment-test is OK Less than 1.00% above the threshold [0.0] [07:56:29] RECOVERY - Puppet failure on deployment-elastic08 is OK Less than 1.00% above the threshold [0.0] [07:56:35] RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0] [07:57:11] RECOVERY - Puppet failure on deployment-restbase02 is OK Less than 1.00% above the threshold [0.0] [07:57:12] RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0] [07:57:26] RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0] [07:58:06] RECOVERY - Puppet failure on deployment-redis02 is OK Less than 1.00% above the threshold [0.0] [07:58:22] 
RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0] [07:58:30] RECOVERY - Puppet failure on deployment-zotero01 is OK Less than 1.00% above the threshold [0.0] [07:59:30] RECOVERY - Puppet failure on deployment-urldownloader is OK Less than 1.00% above the threshold [0.0] [07:59:36] RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0] [08:01:36] RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0] [08:02:26] RECOVERY - Puppet failure on deployment-mathoid is OK Less than 1.00% above the threshold [0.0] [08:02:36] RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0] [08:02:46] RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0] [08:02:54] zeljkof-conferen: had to reboot sorry [08:03:06] zeljkof-conferen: or are you attending a conference today ? :D [08:04:08] RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0] [08:04:47] RECOVERY - Puppet failure on deployment-eventlogging02 is OK Less than 1.00% above the threshold [0.0] [08:05:55] RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0] [08:06:09] RECOVERY - Puppet failure on deployment-logstash2 is OK Less than 1.00% above the threshold [0.0] [08:06:17] RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0] [08:06:39] RECOVERY - Puppet failure on deployment-zookeeper01 is OK Less than 1.00% above the threshold [0.0] [08:06:41] RECOVERY - Puppet failure on deployment-elastic06 is OK Less than 1.00% above the threshold [0.0] [08:06:53] RECOVERY - Puppet failure on deployment-redis01 is OK Less than 1.00% above the threshold [0.0] [08:06:53] RECOVERY - Puppet failure on deployment-kafka02 is OK Less than 1.00% above the threshold [0.0] [08:07:19] RECOVERY - Puppet failure on 
deployment-elastic07 is OK Less than 1.00% above the threshold [0.0] [08:07:25] RECOVERY - Puppet failure on deployment-pdf01 is OK Less than 1.00% above the threshold [0.0] [08:07:53] RECOVERY - Puppet failure on deployment-apertium01 is OK Less than 1.00% above the threshold [0.0] [08:08:07] RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0] [08:09:07] RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0] [08:09:41] RECOVERY - Puppet failure on deployment-parsoid05 is OK Less than 1.00% above the threshold [0.0] [08:09:55] RECOVERY - Puppet failure on deployment-restbase01 is OK Less than 1.00% above the threshold [0.0] [08:10:11] RECOVERY - Puppet failure on deployment-logstash1 is OK Less than 1.00% above the threshold [0.0] [08:10:55] RECOVERY - Puppet failure on deployment-fluorine is OK Less than 1.00% above the threshold [0.0] [08:13:33] RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0] [08:13:43] RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0] [08:37:41] (03CR) 10Hashar: [C: 032] "Thanks for that. 
Refreshing all jobs :-}" [integration/config] - 10https://gerrit.wikimedia.org/r/221401 (owner: 10Legoktm) [08:40:03] (03Merged) 10jenkins-bot: Take advantage of /usr/local/bin/composer symlink [integration/config] - 10https://gerrit.wikimedia.org/r/221401 (owner: 10Legoktm) [08:44:03] 5Continuous-Integration-Isolation, 6Labs, 10Labs-Infrastructure, 3Labs-Sprint-103, 5Patch-For-Review: Instances without a shared NFS storage suffers from a 3 minutes boot delay - https://phabricator.wikimedia.org/T102544#1409042 (10hashar) p:5Triage>3Normal [08:57:39] (03PS9) 10Paladox: Configure npm for Metrolook and update tests [integration/config] - 10https://gerrit.wikimedia.org/r/221175 [09:14:33] (03CR) 10Hashar: [C: 04-1] "The phpcs test fails, so holding this change till the dev repo is ready :-} (see https://gerrit.wikimedia.org/r/#/c/221178/ )" [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [09:15:20] (03CR) 10Paladox: "Ok do you know what the errors say." [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [09:19:16] (03CR) 10Hashar: "Just run composer test on your machine that should give you the errors doesn't it ?" [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [09:20:31] (03CR) 10Paladox: "How do I run it." [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [09:20:41] (03CR) 10Paladox: "On windows." 
[integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [09:30:10] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 66.67% of data above the critical threshold [0.0] [09:30:58] (03PS5) 10Hashar: Add ParserFunctions to the mediawiki-gate [integration/config] - 10https://gerrit.wikimedia.org/r/191086 (owner: 10EBernhardson) [09:33:43] (03CR) 10Hashar: [C: 032] "I added ParserFunctions to the shared job mediawiki-extensions-*" [integration/config] - 10https://gerrit.wikimedia.org/r/191086 (owner: 10EBernhardson) [09:35:34] 6Release-Engineering, 6Phabricator, 5Release: Next Phabricator upgrade: 2015-07-01 - https://phabricator.wikimedia.org/T104047#1409100 (10Aklapper) We won't specifically upgrade to some redesign branch. We upgrade to the stable/master upstream branch. Which might include a redesign at some point. [09:35:37] (03Merged) 10jenkins-bot: Add ParserFunctions to the mediawiki-gate [integration/config] - 10https://gerrit.wikimedia.org/r/191086 (owner: 10EBernhardson) [09:36:03] hashar: yes, sorry, sent you mail, traveling back from conference, took longer than expected [09:36:11] zeljkof-conferen: no problem :-} [09:37:35] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 20.00% of data above the critical threshold [0.0] [09:38:33] (03CR) 10Hashar: "Tested on https://gerrit.wikimedia.org/r/#/c/69650/ and that works." 
[integration/config] - 10https://gerrit.wikimedia.org/r/191086 (owner: 10EBernhardson) [09:38:37] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 30.00% of data above the critical threshold [0.0] [09:38:37] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL 30.00% of data above the critical threshold [0.0] [09:38:39] PROBLEM - Puppet failure on deployment-upload is CRITICAL 30.00% of data above the critical threshold [0.0] [09:39:35] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL 40.00% of data above the critical threshold [0.0] [09:40:15] PROBLEM - Puppet failure on deployment-db1 is CRITICAL 50.00% of data above the critical threshold [0.0] [09:40:41] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL 40.00% of data above the critical threshold [0.0] [09:41:09] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL 55.56% of data above the critical threshold [0.0] [09:42:09] PROBLEM - Puppet failure on deployment-test is CRITICAL 55.56% of data above the critical threshold [0.0] [09:42:32] PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL 60.00% of data above the critical threshold [0.0] [09:42:44] PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL 60.00% of data above the critical threshold [0.0] [09:42:54] PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 60.00% of data above the critical threshold [0.0] [09:47:20] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 33.33% of data above the critical threshold [0.0] [09:48:56] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL 40.00% of data above the critical threshold [0.0] [09:49:42] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 20.00% of data above the critical threshold [0.0] [09:50:10] RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0] [09:51:21] (03PS10) 10Paladox: Configure npm for Metrolook and update tests [integration/config] - 
10https://gerrit.wikimedia.org/r/221175 [09:54:32] PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 50.00% of data above the critical threshold [0.0] [09:55:04] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 66.67% of data above the critical threshold [0.0] [09:55:29] 10Continuous-Integration-Infrastructure: Investigate installing php5.3 on a trusty instance - https://phabricator.wikimedia.org/T103786#1409134 (10hashar) p:5Triage>3Normal [09:55:33] 10Continuous-Integration-Infrastructure: Investigate installing php5.3 on a trusty instance - https://phabricator.wikimedia.org/T103786#1399172 (10hashar) p:5Normal>3High [09:58:10] (03CR) 10Paladox: "@Hashar hi what do I run to get the warning your seeing." [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [10:10:44] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL 40.00% of data above the critical threshold [0.0] [10:10:56] PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL 30.00% of data above the critical threshold [0.0] [10:11:58] PROBLEM - Puppet failure on deployment-fluorine is CRITICAL 40.00% of data above the critical threshold [0.0] [10:13:42] PROBLEM - Puppet failure on deployment-salt is CRITICAL 50.00% of data above the critical threshold [0.0] [10:14:05] 10Browser-Tests, 5Patch-For-Review: Support MEDIAWIKI_PROXY_URL for browser tests - https://phabricator.wikimedia.org/T71725#1409175 (10hashar) 5Open>3Resolved a:3hashar Seems to be fixed per @dduvall [10:14:06] PROBLEM - Puppet failure on deployment-redis02 is CRITICAL 22.22% of data above the critical threshold [0.0] [10:14:12] PROBLEM - Puppet failure on deployment-db2 is CRITICAL 77.78% of data above the critical threshold [0.0] [10:17:34] PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 30.00% of data above the critical threshold [0.0] [10:18:12] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL 22.22% of data above the critical threshold 
[0.0] [10:18:26] PROBLEM - Puppet failure on deployment-stream is CRITICAL 60.00% of data above the critical threshold [0.0] [10:19:21] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 44.44% of data above the critical threshold [0.0] [10:20:29] PROBLEM - Puppet failure on deployment-urldownloader is CRITICAL 60.00% of data above the critical threshold [0.0] [10:20:35] pfff [10:20:37] PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL 20.00% of data above the critical threshold [0.0] [10:21:13] PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 66.67% of data above the critical threshold [0.0] [10:21:21] !log sees beta cluster puppetmaster is suffering from some random issue [10:21:24] Logged the message, Master [10:23:18] !log puppet master stalled due to: [ldap-yaml-enc.p] . Killing it [10:23:20] Logged the message, Master [10:23:27] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 30.00% of data above the critical threshold [0.0] [10:24:24] !log restarted puppetmater on deployment-salt [10:24:26] Logged the message, Master [10:26:09] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 33.33% of data above the critical threshold [0.0] [10:26:53] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 30.00% of data above the critical threshold [0.0] [10:27:05] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 33.33% of data above the critical threshold [0.0] [10:27:40] PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL 40.00% of data above the critical threshold [0.0] [10:27:54] PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL 50.00% of data above the critical threshold [0.0] [10:28:20] PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL 55.56% of data above the critical threshold [0.0] [10:28:24] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL 40.00% of data above the critical threshold [0.0] [10:28:52] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL 
50.00% of data above the critical threshold [0.0] [10:29:08] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 66.67% of data above the critical threshold [0.0] [10:35:30] RECOVERY - Puppet failure on deployment-urldownloader is OK Less than 1.00% above the threshold [0.0] [10:36:12] RECOVERY - Puppet failure on deployment-sca02 is OK Less than 1.00% above the threshold [0.0] [10:38:12] RECOVERY - Puppet failure on deployment-restbase02 is OK Less than 1.00% above the threshold [0.0] [10:39:20] RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0] [10:39:21] (03CR) 10Paladox: "And when I run composer update it just download dependency's and nothing else." [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [10:39:30] RECOVERY - Puppet failure on deployment-zotero01 is OK Less than 1.00% above the threshold [0.0] [10:40:35] RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0] [10:42:33] RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0] [10:42:55] RECOVERY - Puppet failure on deployment-redis01 is OK Less than 1.00% above the threshold [0.0] [10:42:58] !log manually rebasing integration-puppetmaster git repo [10:43:01] Logged the message, Master [10:43:19] RECOVERY - Puppet failure on deployment-elastic07 is OK Less than 1.00% above the threshold [0.0] [10:43:29] RECOVERY - Puppet failure on deployment-mathoid is OK Less than 1.00% above the threshold [0.0] [10:43:35] RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0] [10:43:43] RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0] [10:44:07] RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0] [10:45:47] RECOVERY - Puppet failure on deployment-eventlogging02 is OK Less than 1.00% above the threshold [0.0] [10:46:11] RECOVERY 
- Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0] [10:46:55] RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0] [10:47:07] RECOVERY - Puppet failure on deployment-logstash2 is OK Less than 1.00% above the threshold [0.0] [10:47:23] RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0] [10:47:40] RECOVERY - Puppet failure on deployment-elastic06 is OK Less than 1.00% above the threshold [0.0] [10:47:40] RECOVERY - Puppet failure on deployment-zookeeper01 is OK Less than 1.00% above the threshold [0.0] [10:47:52] RECOVERY - Puppet failure on deployment-kafka02 is OK Less than 1.00% above the threshold [0.0] [10:48:22] RECOVERY - Puppet failure on deployment-pdf01 is OK Less than 1.00% above the threshold [0.0] [10:48:36] RECOVERY - Puppet failure on deployment-memc02 is OK Less than 1.00% above the threshold [0.0] [10:48:52] RECOVERY - Puppet failure on deployment-apertium01 is OK Less than 1.00% above the threshold [0.0] [10:48:58] RECOVERY - Puppet failure on deployment-videoscaler01 is OK Less than 1.00% above the threshold [0.0] [10:49:03] PROBLEM - Host integration-vmbuilder-trusty is DOWN: CRITICAL - Host Unreachable (10.68.16.59) [10:49:43] RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0] [10:50:05] RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0] [10:50:41] RECOVERY - Puppet failure on deployment-parsoid05 is OK Less than 1.00% above the threshold [0.0] [10:50:49] 10Continuous-Integration-Infrastructure, 6operations, 7Blocked-on-Operations: Update jenkins-debian-glue packages on Jessie to v0.13.0 - https://phabricator.wikimedia.org/T102106#1409305 (10hashar) [10:50:55] RECOVERY - Puppet failure on deployment-restbase01 is OK Less than 1.00% above the threshold [0.0] [10:51:02] 10Continuous-Integration-Infrastructure, 6operations, 
7Blocked-on-Operations, 7Jenkins: Please refresh Jenkins package on apt.wikimedia.org to 1.609.1 - https://phabricator.wikimedia.org/T103343#1409307 (10hashar) [10:51:11] RECOVERY - Puppet failure on deployment-logstash1 is OK Less than 1.00% above the threshold [0.0] [10:51:55] RECOVERY - Puppet failure on deployment-fluorine is OK Less than 1.00% above the threshold [0.0] [10:52:33] RECOVERY - Puppet failure on deployment-elastic08 is OK Less than 1.00% above the threshold [0.0] [10:53:29] (03CR) 10Hashar: "Once updated, you want to invoke the defined script in composer.json i.e.:" [integration/config] - 10https://gerrit.wikimedia.org/r/221175 (owner: 10Paladox) [10:53:42] RECOVERY - Puppet failure on deployment-upload is OK Less than 1.00% above the threshold [0.0] [10:54:06] RECOVERY - Puppet failure on deployment-redis02 is OK Less than 1.00% above the threshold [0.0] [10:54:12] RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0] [10:54:30] RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0] [10:55:16] RECOVERY - Puppet failure on deployment-db1 is OK Less than 1.00% above the threshold [0.0] [10:55:35] (03Abandoned) 10Hashar: Add AntiSpoof to the shared mw job [integration/config] - 10https://gerrit.wikimedia.org/r/187903 (owner: 10Hashar) [10:57:08] RECOVERY - Puppet failure on deployment-test is OK Less than 1.00% above the threshold [0.0] [10:57:32] RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0] [10:58:26] RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0] [11:59:26] (03Abandoned) 10Hashar: (WIP) debian-glue job for Zuul (WIP) [integration/config] - 10https://gerrit.wikimedia.org/r/203347 (owner: 10Hashar) [13:28:09] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 33.33% of data above the critical threshold [0.0] [13:42:09] PROBLEM - Puppet failure on 
deployment-logstash1 is CRITICAL 22.22% of data above the critical threshold [0.0]
[13:43:09] RECOVERY - Puppet failure on deployment-logstash2 is OK Less than 1.00% above the threshold [0.0]
[14:12:11] RECOVERY - Puppet failure on deployment-logstash1 is OK Less than 1.00% above the threshold [0.0]
[14:47:43] hashar: any more thoughts on cxserver/deploy jenkins? Still failing.
[14:48:58] https://phabricator.wikimedia.org/T92369
[14:49:03] hashar: ^
[14:49:19] hashar: It looks like we can't go ahead and update cxserver without it.
[15:23:28] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL 100.00% of data above the critical threshold [0.0]
[15:40:17] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, Patch-For-Review: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1409910 (hashar) grunt-cli does not honor NODE_PATH because the developers don't feel like supporting it : https://github.com/gruntjs/grunt-cli/pull/18
[15:43:10] (PS8) Hashar: WIP: Hack for npm oid jobs [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369)
[15:45:04] Continuous-Integration-Infrastructure, ContentTranslation-Deployments, Patch-For-Review: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1409918 (hashar) I have rebased the hack at https://gerrit.wikimedia.org/r/#/c/189473/7 and redeployed the job. Still have to verify the impact on Parsoid.
[15:45:25] kart_: I reapplied the workaround
[15:45:35] still have to test parsoid jobs. But for now I am gone :/
[16:54:20] (CR) Paladox: "Hi when I do that I get this error" [integration/config] - https://gerrit.wikimedia.org/r/221175 (owner: Paladox)
[17:02:06] (CR) Paladox: "Hi it says something about line code like \r\n should be \n but I can't see it through code on windows." [integration/config] - https://gerrit.wikimedia.org/r/221175 (owner: Paladox)
[17:02:45] Beta-Cluster, Release-Engineering: Cannot login to Beta Cluster - https://phabricator.wikimedia.org/T104212#1410121 (Ryasmeen) NEW
[17:06:21] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup: Cannot login to Beta Cluster - https://phabricator.wikimedia.org/T104212#1410144 (greg) p:Triage>Unbreak!
[17:06:30] (CR) Paladox: "Hi it says that it can be done automatically how can I do that." [integration/config] - https://gerrit.wikimedia.org/r/221175 (owner: Paladox)
[17:06:45] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup: Cannot login to Beta Cluster - https://phabricator.wikimedia.org/T104212#1410121 (greg) Confirmed.
[17:06:47] well crap
[17:08:22] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410152 (greg)
[17:08:45] legoktm: csteipp ^ login failure on beta cluster
[17:08:57] hi
[17:08:58] nothing obvious in central auth log, but.... I just saw it
[17:09:05] git log, that is
[17:09:23] * legoktm looks
[17:10:50] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410161 (csteipp) Did beta switch sessions storage, or lose memcache?
[17:11:30] thcipriani: ^ question from chris
[17:14:42] thcipriani: It looks specifically to be the cache-- CentralAuth assumes we can put stuff in memcache on one wiki, and get it back out on a request from another wiki. If not, we hit this error.
[17:21:58] greg-g: csteipp sorry, in meeting, looking now
[17:24:23] yeah, 'twas bad timing with deploy working group
[17:27:01] I wonder if this is a nutcracker thing?
[17:28:17] Could be. If someone can login to beta and test out wgMemc->get/set from the commandline, that would verify it
[17:28:52] s/verify/narrow down potential issues/
[17:30:19] sure I can do that, I don't remember how to start the interactive shell, looking through my notes about it...
[17:32:01] `mwscript eval.php --wiki=enwiki`
[17:32:14] Might have the parameters backwards
[17:33:02] Would be best to do that from one of the actual apaches, in case it's a security group issue too.
[17:33:34] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410281 (Legoktm) To be clear, just global login is failing. You should...
[17:34:06] thcipriani: what csteipp said looks good. running that should work anywhere on a machine
[17:39:44] hmm, well, running from deployment-mediawiki01 var_dump($wgMemc->set('testval', 10, 300)); returns false, also can't get testval (also returns false)
[17:40:19] Continuous-Integration-Isolation, Labs, Labs-Infrastructure, Labs-Sprint-103, and 2 others: Instances without a shared NFS storage suffers from a 3 minutes boot delay - https://phabricator.wikimedia.org/T102544#1410329 (Andrew)
[17:40:43] So yeah, looks like memcache isn't working... are the memcache servers up? If so, like you said, nutcracker would be suspect.
[17:42:38] all the memcache servers look up from netstat -tlnp on each of the memcache boxes
[17:43:47] thcipriani: Can you run `echo "stats" | netcat 127.0.0.1 11212` from the apache?
[17:44:05] Just to prove nutcracker is up?
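The `echo "stats" | netcat 127.0.0.1 11212` check can also be scripted so that "no answer" and "answered with stats" are told apart explicitly. What follows is only a rough sketch of that idea, not something anyone ran in the channel: the host and port are taken from the netcat command, while the function names `parse_stats` and `probe_stats` are made up for illustration. As comes up a few minutes later in the log, a proxy may simply not forward `stats`, so an empty result narrows things down rather than proving the proxy is dead.

```python
import socket


def parse_stats(payload: str) -> dict:
    """Turn a memcached text-protocol `stats` reply into a dict.

    Each useful line looks like `STAT <name> <value>`; anything else
    (including the closing `END`) is ignored.
    """
    stats = {}
    for line in payload.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def probe_stats(host: str = "127.0.0.1", port: int = 11212,
                timeout: float = 3.0) -> dict:
    """Send `stats` to host:port and return the parsed reply.

    Returns an empty dict when the connection fails, times out, or the
    peer answers with nothing that looks like STAT lines.
    """
    chunks = []
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"stats\r\n")
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # peer closed the connection
                chunks.append(data)
                if b"END\r\n" in b"".join(chunks):
                    break  # complete reply received
    except OSError:
        # Refused, reset or timed out: treated the same as "no answer".
        pass
    return parse_stats(b"".join(chunks).decode("ascii", errors="replace"))
```

Calling `probe_stats()` on one of the apaches and getting `{}` back would match the "not getting anything back" report; running it against a memcache box directly (host and port being whatever that box listens on) gives a point of comparison.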
[17:44:46] not getting anything back on mediawiki01
[17:45:07] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410350 (greg) (investigation on-going in #wikimedia-releng)
[17:47:41] Release-Engineering, MW-1.23-release, Patch-For-Review: QA problems from upgrade from MediaWiki 1.23.3 to MW1.23.4 - https://phabricator.wikimedia.org/T529#1410370 (Qgil)
[17:48:44] or on mediawiki02 or mediawiki03: does this mean I should just restart nutcracker?
[17:50:11] thcipriani: Yeah, that's probably a good thing to try
[17:50:24] Since they're all dead, I'm going to guess there's a bad config preventing startup?
[17:52:49] csteipp: nutcracker seems to have restarted fine, new pid, still not responding to stats, actual memcache servers seem to respond
[17:54:16] nutcracker might not proxy stats commands.. that could be my fault
[17:55:13] But if you're still not able to get/set wgMemc from eval.php, then it would seem like nutcracker isn't proxying at all
[17:58:29] thcipriani: Now I'm headed into a meeting. But sounds like someone from ops would be more helpful to you at this point.
[17:58:48] csteipp: ok, thanks for walking me through some things.
[17:59:17] just in time for the weekly ops meeting
[17:59:21] stupid mondays
[17:59:43] well, wait, wgMemc is letting me set stuff, seemingly, lemme try login
[18:06:18] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410820 (thcipriani) Restarted nutcracker on beta apache instances. I c...
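Since `stats` may not be proxied, the more telling check in the exchange above is the set/get round trip done through `$wgMemc` in eval.php. A minimal sketch of the same round trip in the raw memcached text protocol is given below, assuming the proxy listens on 127.0.0.1:11212 as in the earlier netcat command. The helper names (`build_set`, `parse_get`, `round_trip`) are hypothetical, and the sketch stores the literal bytes `10`, which is not necessarily how MediaWiki encodes the value.

```python
import socket


def build_set(key: str, value: bytes, ttl: int) -> bytes:
    """Build a text-protocol `set`: `set <key> <flags> <exptime> <bytes>`."""
    return b"set %s 0 %d %d\r\n%s\r\n" % (key.encode("ascii"), ttl, len(value), value)


def parse_get(reply: bytes):
    """Extract the value from a `get` reply; None on a miss or odd reply.

    A hit starts with `VALUE <key> <flags> <bytes>` followed by the data;
    a miss is just `END`.
    """
    header, sep, rest = reply.partition(b"\r\n")
    fields = header.split()
    if not sep or len(fields) < 4 or fields[0] != b"VALUE" or not fields[3].isdigit():
        return None
    return rest[:int(fields[3])]


def round_trip(host: str = "127.0.0.1", port: int = 11212, key: str = "testval",
               value: bytes = b"10", ttl: int = 300, timeout: float = 3.0) -> bool:
    """Store a key and read it back; True only if the same value returns."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            reader = sock.makefile("rb")
            sock.sendall(build_set(key, value, ttl))
            if reader.readline().strip() != b"STORED":
                return False
            sock.sendall(b"get " + key.encode("ascii") + b"\r\n")
            reply = reader.readline()
            if reply.startswith(b"VALUE"):
                # Data line; assumes the stored value contains no newline.
                reply += reader.readline()
            return parse_get(reply) == value
    except OSError:
        # Connection refused, reset or timed out.
        return False
```

If `round_trip()` returns `False` through the proxy port but `True` when pointed straight at a memcache server, the proxy layer is the likelier suspect, which is the same reasoning followed in the channel.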
[18:08:39] !log restarted nutcracker on beta cluster salt '*-mediawiki*' cmd.run 'service nutcracker restart' [18:08:42] Logged the message, Master [18:23:04] 10Continuous-Integration-Infrastructure, 5Patch-For-Review, 7Pywikibot-tests: Python requests[security] requires libffi which isnt on the CI workers - https://phabricator.wikimedia.org/T103775#1410871 (10Legoktm) 5Open>3Resolved [18:23:06] 10Beta-Cluster, 10Continuous-Integration-Infrastructure, 10pywikibot-core, 5Patch-For-Review: Run pywikibot test suite regularly on beta cluster as part of MediaWiki/Wikimedia CI - https://phabricator.wikimedia.org/T100903#1410872 (10Legoktm) [18:32:05] so, if a user is on the whitelist, and jenkins V+2s a patch, why does it run all those jobs again when CR+2 and merging? or does that depend on what the configuration of jobs for that repository is? [18:32:09] question based on https://gerrit.wikimedia.org/r/#/c/221676/ [18:36:22] polybuildr: when you upload a patch, we run "test" pipeline stuff, when it's merging, it uses "gate-and-submit". most of the time the two pipelines are the same, but for mw/core for example, we only run the slower tests on gate. Also the for extensions tests depend upon other repos like mw/core, so when you merge, it could be using different versions of those repos [18:37:40] 10Beta-Cluster, 6Release-Engineering, 10MediaWiki-User-login-and-signup, 10MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410954 (10csteipp) I'm able to login, where I was getting the error befor... [18:37:54] legoktm: oh, so it runs the tests again because the state of the repository (duh) might have changed since the original V+2? [18:38:06] yes [18:38:07] legoktm: I mean, that's why many of the test and gate-and-submit jobs are same? [18:38:10] right, okay [18:38:11] also the tests may be different [18:38:26] alright, got it. thanks! 
:D [18:38:36] legoktm: also, could you please review https://gerrit.wikimedia.org/r/#/c/220728/ ? :P [18:39:37] (CR) Legoktm: [C: -1] "Shouldn't this be done in MediaWiki core?" [tools/codesniffer] - https://gerrit.wikimedia.org/r/220728 (https://phabricator.wikimedia.org/T103806) (owner: Polybuildr) [18:41:08] (CR) Polybuildr: "Yeah, ideally it should. So we override default codesniffer config in core using a ruleset.xml or something?" [tools/codesniffer] - https://gerrit.wikimedia.org/r/220728 (https://phabricator.wikimedia.org/T103806) (owner: Polybuildr) [18:42:50] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1410972 (thcipriani) Open>Resolved a:thcipriani it's strange, n... [18:43:16] polybuildr: https://gerrit.wikimedia.org/r/#/c/218388/ maybe something like that? idk [18:45:40] legoktm: oh, that's nice. :D [18:45:42] yes yes [18:45:43] please do that [18:45:45] makes a lot of sense [18:50:12] (CR) Polybuildr: [C: -1] "Agree with legoktm, should be done in core, probably after https://gerrit.wikimedia.org/r/#/c/218388/." [tools/codesniffer] - https://gerrit.wikimedia.org/r/220728 (https://phabricator.wikimedia.org/T103806) (owner: Polybuildr) [18:51:17] (Abandoned) Polybuildr: Ignore languages/messages/Message*.php in line length sniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/220728 (https://phabricator.wikimedia.org/T103806) (owner: Polybuildr) [18:54:14] Beta-Cluster, Traffic: Puppet failing on deployment-prep caches - https://phabricator.wikimedia.org/T104076#1407254 (BBlack) @akosiaris @chasemp - related to recent refactors here? [18:56:27] Beta-Cluster, Traffic: Puppet failing on deployment-prep caches - https://phabricator.wikimedia.org/T104076#1411027 (chasemp) I didn't remove anything there, only added.
So I don't think it would be related to adding the 'conftool' key [18:57:54] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 20.00% of data above the critical threshold [0.0] [19:06:44] Beta-Cluster, Traffic: Puppet failing on deployment-prep caches - https://phabricator.wikimedia.org/T104076#1411137 (akosiaris) That does seem like me, I 'll dig into it and fix it [19:20:23] Deployment-Systems, Labs, wikitech.wikimedia.org, Patch-For-Review: Merge as many configuration hacks in wikitech.php configuration file as possible into InitialiseSettings.php - https://phabricator.wikimedia.org/T75939#1411236 (Krenair) I think we can probably resolve this after the above commit g... [19:27:57] RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0] [19:29:15] Beta-Cluster, Release-Engineering, MediaWiki-User-login-and-signup, MediaWiki-extensions-CentralAuth: Login failing - The provided authentication token is either expired or invalid. - https://phabricator.wikimedia.org/T104212#1411283 (Ryasmeen) Yes, I am not getting that error now and being able t... [19:32:39] Beta-Cluster, Traffic: Puppet failing on deployment-prep caches - https://phabricator.wikimedia.org/T104076#1411296 (thcipriani) FWIW, it looks like these changes: 'bits' => { - 'eqiad' => flatten([$lvs::configuration::lvs_service_ips['production']['bits']['eqiad']['b... [19:55:48] Deployment-Systems, Community-Liaison, Multimedia: New Feature Notification - https://phabricator.wikimedia.org/T77347#1411382 (Quiddity) [20:00:41] Deployment-Systems, Community-Liaison, Multimedia: New Feature Notification - https://phabricator.wikimedia.org/T77347#1411397 (Quiddity) Additional thoughts on this, from https://meta.wikimedia.org/wiki/Community_Engagement_%28Product%29/Process_ideas#Notifications_for_Beta_Features_.2F_More_deliberat...
[20:18:43] (PS1) Ejegg: Disable fundraising crm tests on contrib branches [integration/config] - https://gerrit.wikimedia.org/r/221749 (https://phabricator.wikimedia.org/T103006) [20:20:55] (CR) Paladox: "I get this error now" [integration/config] - https://gerrit.wikimedia.org/r/221175 (owner: Paladox) [20:21:32] (PS2) Ejegg: Disable fundraising CRM tests on contrib branches [integration/config] - https://gerrit.wikimedia.org/r/221749 (https://phabricator.wikimedia.org/T103006) [20:23:10] Hi all! Anyone have time to merge/deploy https://gerrit.wikimedia.org/r/221749? [20:26:43] Continuous-Integration-Infrastructure, Zuul: Zuul-cloner checks out wrong branch - https://phabricator.wikimedia.org/T104243#1411459 (Ejegg) [21:18:26] PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 100.00% of data above the critical threshold [0.0] [21:41:14] (CR) Paladox: "mediawiki codesnifer should update package for the php sniffer to 2.3.3 not 2.3.0 because 2.3.3 disable checking in if statement for code " [integration/config] - https://gerrit.wikimedia.org/r/221175 (owner: Paladox) [21:41:29] (PS2) Awight: Update vendor using composer rather than cloning the deployment repo [integration/config] - https://gerrit.wikimedia.org/r/221310 [21:44:11] Continuous-Integration-Infrastructure, Zuul: Zuul-cloner checks out wrong branch - https://phabricator.wikimedia.org/T104243#1411739 (hashar) Indeed, zuul-cloner attempts to match the branches in all repositories being cloned. The assumption being that branches are matching releases (i.e. you want to test...
[21:44:44] PROBLEM - Puppet failure on deployment-salt is CRITICAL 20.00% of data above the critical threshold [0.0] [21:49:29] PROBLEM - Puppet failure on deployment-stream is CRITICAL 60.00% of data above the critical threshold [0.0] [21:54:41] RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0] [22:01:56] greg-g: can I get a deployment window for tomorrow to test global user merge in production? I need to backport some patches, enable GUM (config setting), do a merge, then disable GUM [22:14:26] RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0] [22:15:14] Continuous-Integration-Infrastructure, Zuul: Zuul-cloner checks out wrong branch - https://phabricator.wikimedia.org/T104243#1411893 (Ejegg) Thanks @hashar, I guess that behavior does make sense. Good to know about the cloner option! In this case, I realize we probably don't want the tests to run on any... [22:16:14] Continuous-Integration-Infrastructure, Zuul: Zuul-cloner checks out wrong branch - https://phabricator.wikimedia.org/T104243#1411898 (Ejegg) Open>declined [22:22:40] twentyafterfour: do you have time to merge/deploy https://gerrit.wikimedia.org/r/221749? I need to disable tests on the branch that holds pristine upstream code in a couple of fundraising's CRM subrepos [22:23:20] (CR) Legoktm: [C: 2] Disable fundraising CRM tests on contrib branches [integration/config] - https://gerrit.wikimedia.org/r/221749 (https://phabricator.wikimedia.org/T103006) (owner: Ejegg) [22:23:29] Thanks legoktm !
[22:24:03] :) [22:25:14] (Merged) jenkins-bot: Disable fundraising CRM tests on contrib branches [integration/config] - https://gerrit.wikimedia.org/r/221749 (https://phabricator.wikimedia.org/T103006) (owner: Ejegg) [22:25:37] !log deploying https://gerrit.wikimedia.org/r/221749 [22:25:40] Logged the message, Master [22:26:21] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<22.22%) [22:26:26] ejegg: deployed now [22:26:35] sweet, now we can upgrade civi and drupal again! [22:26:41] thanks again [23:19:26] !log Moved logstash irc bot from logstash1 to logstash2 [23:19:31] Logged the message, Master [23:19:47] and it worked even :) [23:21:18] RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK [23:23:54] Deployment-Systems, Scrum-of-Scrums, operations, Blocked-on-Operations: Update wikitech wiki with deployment train - https://phabricator.wikimedia.org/T70751#1412155 (Krenair) [23:23:58] Deployment-Systems, Labs, wikitech.wikimedia.org, Patch-For-Review: Merge as many configuration hacks in wikitech.php configuration file as possible into InitialiseSettings.php - https://phabricator.wikimedia.org/T75939#1412153 (Krenair) Open>Resolved Let's call this resolved. Not much point... [23:25:43] Beta-Cluster, Wikimedia-Logstash, Patch-For-Review, User-Bd808-Test: Build jessie based elasticsearch/logstash/kibana (ELK) host for beta testing - https://phabricator.wikimedia.org/T101541#1412173 (bd808) [23:43:41] (PS1) Hoo man: Update Wikidata to the wmf/1.26wmf12 branch [tools/release] - https://gerrit.wikimedia.org/r/221795