[00:04:46] Project beta-scap-eqiad build #39128: STILL FAILING in 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39128/ [00:07:40] RECOVERY - Puppet failure on deployment-cxserver03 is OK: OK: Less than 1.00% above the threshold [0.0] [00:15:24] Yippee, build fixed! [00:15:24] Project beta-scap-eqiad build #39129: FIXED in 1 min 28 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39129/ [00:21:10] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [00:33:53] pywikibot-core, Continuous-Integration: run pep8 and pep257 for pywikibot/core - https://phabricator.wikimedia.org/T87169#990525 (jayvdb) p:Triage>Unbreak! It looks like there will not be a quick resolution with the pep257 project ( https://github.com/GreenSteam/pep257/issues/97) , and I had no luck fix... [00:43:41] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [00:45:07] Project beta-scap-eqiad build #39132: FAILURE in 1 min 14 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39132/ [00:46:15] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [00:54:51] Yippee, build fixed! [00:54:51] Project beta-scap-eqiad build #39133: FIXED in 1 min 0 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39133/ [01:08:42] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [02:57:32] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<22.22%) [03:55:30] Yippee, build fixed! [03:55:30] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #433: FIXED in 13 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/433/ [04:12:20] Yippee, build fixed!
[04:12:20] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #423: FIXED in 1 hr 2 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/423/ [04:12:24] Yippee, build fixed! [04:12:25] Project browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce build #445: FIXED in 22 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-test2.wikipedia.org-linux-chrome-sauce/445/ [04:13:25] Yippee, build fixed! [04:13:26] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #360: FIXED in 1 min 0 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/360/ [04:14:14] Yippee, build fixed! [04:14:15] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #142: FIXED in 48 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/142/ [04:29:10] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce build #253: FAILURE in 35 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce/253/ [04:30:20] Yippee, build fixed! [04:30:20] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #439: FIXED in 16 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/439/ [04:33:40] Yippee, build fixed! [04:33:40] Project browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #396: FIXED in 3 min 20 sec: https://integration.wikimedia.org/ci/job/browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/396/ [04:42:46] Yippee, build fixed! 
[04:42:46] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #461: FIXED in 36 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/461/ [04:43:24] Yippee, build fixed! [04:43:24] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #364: FIXED in 1 min 1 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/364/ [04:43:40] Yippee, build fixed! [04:43:40] Project browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce build #398: FIXED in 1 min 18 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-test2.wikipedia.org-linux-firefox-sauce/398/ [04:44:26] Yippee, build fixed! [04:44:26] Project browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce build #336: FIXED in 45 sec: https://integration.wikimedia.org/ci/job/browsertests-PdfHandler-test2.wikipedia.org-linux-firefox-sauce/336/ [04:45:48] Yippee, build fixed! [04:45:48] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #77: FIXED in 1 min 21 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/77/ [05:28:14] Yippee, build fixed! 
[05:28:14] Project browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce build #426: FIXED in 42 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-test2.m.wikipedia.org-linux-firefox-sauce/426/ [05:38:40] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #424: FAILURE in 6.2 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/424/ [06:14:33] Project beta-scap-eqiad build #39165: FAILURE in 36 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39165/ [06:23:47] so is there an appropriate place, say, on labs, that I could run an irc bouncer? [06:24:40] I was running one on digitalocean but I let that vm die a while ago when I got a reliable connection at home [06:24:46] Project beta-scap-eqiad build #39166: STILL FAILING in 48 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39166/ [06:25:26] Phabricator: Add help link to explain meaning of priority levels - https://phabricator.wikimedia.org/T87411#990654 (FriedhelmW) NEW [06:28:33] Yippee, build fixed! [06:28:33] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #422: FIXED in 3 min 17 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/422/ [06:34:19] Phabricator: Decide on "Needs Volunteer" Priority field value in Phabricator - https://phabricator.wikimedia.org/T78617#990676 (FriedhelmW) Keep it as it is. "Needs Volunteer" is self-explanatory even for non native English. [06:35:25] Yippee, build fixed!
[06:35:25] Project beta-scap-eqiad build #39167: FIXED in 1 min 31 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39167/ [06:37:30] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK [07:06:24] pywikibot-core, Continuous-Integration: Whitelist people with +2 rights - https://phabricator.wikimedia.org/T87413#990688 (jayvdb) NEW [08:02:29] Phabricator: Change "CC" to "Subscribers" - https://phabricator.wikimedia.org/T87421#990782 (FriedhelmW) NEW [08:05:23] Project beta-scap-eqiad build #39176: FAILURE in 1 min 14 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39176/ [08:15:01] Yippee, build fixed! [08:15:02] Project beta-scap-eqiad build #39177: FIXED in 1 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39177/ [08:15:38] Phabricator: Change "CC" to "Subscribers" - https://phabricator.wikimedia.org/T87421#990797 (FriedhelmW) [08:25:05] Project beta-scap-eqiad build #39178: FAILURE in 1 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39178/ [08:32:33] Phabricator: Change "CC" to "Subscribers" - https://phabricator.wikimedia.org/T87421#990802 (FriedhelmW) Google Translator is unable to translate "CC" to German. And it is not a "carbon" copy. [08:35:21] Project beta-scap-eqiad build #39179: STILL FAILING in 1 min 23 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39179/ [08:44:58] Yippee, build fixed!
[08:44:58] Project beta-scap-eqiad build #39180: FIXED in 1 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39180/ [08:50:40] PROBLEM - Puppet failure on deployment-fluoride is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [09:01:59] PROBLEM - Puppet failure on deployment-redis02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [09:11:59] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [09:16:19] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [09:20:44] RECOVERY - Puppet failure on deployment-fluoride is OK: OK: Less than 1.00% above the threshold [0.0] [09:24:59] Project beta-scap-eqiad build #39184: FAILURE in 57 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39184/ [09:30:49] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [09:32:05] RECOVERY - Puppet failure on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0] [09:32:41] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [09:35:42] Yippee, build fixed! 
[09:35:43] Project beta-scap-eqiad build #39185: FIXED in 1 min 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39185/ [09:35:43] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [09:37:02] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [09:40:41] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [09:41:17] RECOVERY - Puppet failure on deployment-apertium01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:43:59] PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [09:44:42] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [09:55:44] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:55:52] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [09:57:43] RECOVERY - Puppet failure on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [10:00:45] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [10:00:45] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [10:03:56] RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0] [10:04:36] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [10:05:08] Project beta-scap-eqiad build #39188: FAILURE in 1 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39188/ [10:05:44] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [10:15:08] Yippee, 
build fixed! [10:15:08] Project beta-scap-eqiad build #39189: FIXED in 1 min 8 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39189/ [10:20:49] RECOVERY - Puppet failure on deployment-cache-bits01 is OK: OK: Less than 1.00% above the threshold [0.0] [10:25:42] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [10:25:48] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [10:26:02] Project beta-scap-eqiad build #39190: FAILURE in 1 min 46 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39190/ [10:27:54] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [10:35:29] Yippee, build fixed! [10:35:30] Project beta-scap-eqiad build #39191: FIXED in 1 min 25 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39191/ [10:55:31] Project beta-scap-eqiad build #39193: FAILURE in 1 min 25 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39193/ [10:55:47] RECOVERY - Puppet failure on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [11:00:42] Phabricator: Create project for MediaWiki-extensions-WikibaseView - https://phabricator.wikimedia.org/T87428#990882 (adrianheine) NEW [11:05:55] Yippee, build fixed! [11:05:55] Project beta-scap-eqiad build #39194: FIXED in 1 min 44 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39194/ [11:12:57] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [11:15:43] Project beta-scap-eqiad build #39195: FAILURE in 1 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39195/ [11:17:22] PROBLEM - Puppet failure on deployment-stream is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [11:25:22] Yippee, build fixed!
[11:25:22] Project beta-scap-eqiad build #39196: FIXED in 1 min 17 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39196/ [11:31:48] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [11:42:25] RECOVERY - Puppet failure on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0] [11:53:01] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [11:56:45] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [11:57:19] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [12:02:42] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [12:08:53] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [12:18:01] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [12:23:56] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [12:27:42] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [12:31:39] PROBLEM - Puppet failure on deployment-mx is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [12:33:57] RECOVERY - Puppet failure on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0] [12:38:45] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [12:40:41] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [12:42:19] RECOVERY - Puppet failure on deployment-apertium01 is OK: OK: Less than 1.00% above the 
threshold [0.0] [12:45:55] PROBLEM - Puppet failure on deployment-mediawiki04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [12:53:56] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [12:56:27] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [12:56:46] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [13:03:11] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [13:03:41] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [13:05:40] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [13:07:08] PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [13:07:13] PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [13:10:53] RECOVERY - Puppet failure on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0] [13:13:41] (PS2) Adrian Lang: Add npm job to wikibase [integration/config] - https://gerrit.wikimedia.org/r/184592 [13:14:25] (CR) Adrian Lang: "grunt config in wikibase is merged, jscs task passes.
Please merge this :)" [integration/config] - https://gerrit.wikimedia.org/r/184592 (owner: Adrian Lang) [13:16:43] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [13:19:43] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [13:21:26] RECOVERY - Puppet failure on deployment-sentry2 is OK: OK: Less than 1.00% above the threshold [0.0] [13:21:42] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [13:21:54] PROBLEM - Puppet failure on deployment-mediawiki04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:23:16] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [13:26:50] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [13:28:43] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [13:32:10] RECOVERY - Puppet failure on deployment-parsoidcache02 is OK: OK: Less than 1.00% above the threshold [0.0] [13:32:14] RECOVERY - Puppet failure on deployment-elastic08 is OK: OK: Less than 1.00% above the threshold [0.0] [13:41:03] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [13:41:43] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [13:41:43] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [13:44:11] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [13:44:41] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [13:48:43] RECOVERY - Puppet failure on
deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [13:49:59] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [13:51:53] RECOVERY - Puppet failure on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0] [13:56:05] Project beta-scap-eqiad build #39211: FAILURE in 2 min 9 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39211/ [13:56:47] RECOVERY - Puppet failure on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [14:00:40] Phabricator: One touch assign button on phabricator - https://phabricator.wikimedia.org/T87436#990997 (01tonythomas) NEW [14:00:42] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:01:06] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [14:04:10] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [14:05:23] Yippee, build fixed!
[14:05:24] Project beta-scap-eqiad build #39212: FIXED in 1 min 30 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39212/ [14:06:44] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:09:23] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0] [14:09:54] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [14:11:38] PROBLEM - Puppet failure on deployment-fluoride is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [14:11:46] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:20:41] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [14:22:03] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:25:41] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [14:25:56] PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [14:31:31] PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [14:35:01] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:36:46] RECOVERY - Puppet failure on deployment-eventlogging02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:36:46] RECOVERY - Puppet failure on deployment-fluoride is OK: OK: Less than 1.00% above the threshold [0.0] [14:39:24] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [14:42:03] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold 
[0.0] [14:45:36] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [14:47:12] Yippee, build fixed! [14:47:12] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #132: FIXED in 30 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/132/ [14:50:56] RECOVERY - Puppet failure on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0] [14:52:42] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [14:54:43] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [14:54:59] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [14:56:32] RECOVERY - Puppet failure on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [15:01:38] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [15:14:44] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [15:17:44] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [15:23:17] Phabricator: One touch assign button on phabricator - https://phabricator.wikimedia.org/T87436#991048 (Qgil) We also have `Action > Reassign / Claim` in the comment area. I don't think upstream is going to consider adding another option in the quite populated task menu, when there are already two ways to clai... [15:26:38] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [15:34:53] Phabricator: Change color of #Patch-For-Review project to something more unique - https://phabricator.wikimedia.org/T87226#991055 (Qgil) James, now I don't know whether you are being serious or ironic.
[16:04:33] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we want to do in 2015 - https://phabricator.wikimedia.org/T86752#991104 (Qgil) p:Triage>Normal [16:04:34] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we did in 2014 - https://phabricator.wikimedia.org/T86750#991105 (Qgil) p:Triage>Normal [16:04:42] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [16:15:08] Project beta-scap-eqiad build #39225: FAILURE in 1 min 2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39225/ [16:22:35] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [16:24:41] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [16:25:28] Yippee, build fixed! [16:25:29] Project beta-scap-eqiad build #39226: FIXED in 1 min 35 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39226/ [16:29:34] Beta-Cluster, Wikidata: m.wikidata.beta.wmflabs.org/ redirects to a host that does not exist - https://phabricator.wikimedia.org/T87440#991113 (JanZerebecki) NEW [16:29:55] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [16:31:00] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:33:05] Beta-Cluster, Wikidata: m.wikidata.beta.wmflabs.org/ redirects to a host that does not exist - https://phabricator.wikimedia.org/T87440#991120 (Krenair) Also, why does http://en.wikidata.beta.wmflabs.org/wiki/Wikidata:Main_Page work?
[16:46:21] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:47:39] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [16:54:54] RECOVERY - Puppet failure on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0] [16:55:59] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [16:58:41] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [17:05:54] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:12:03] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:13:54] Project beta-code-update-eqiad build #41539: FAILURE in 53 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/41539/ [17:17:43] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:19:57] PROBLEM - Puppet failure on deployment-salt is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:23:47] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:30:59] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:36:41] PROBLEM - Puppet failure on deployment-parsoid05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:36:59] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [17:38:03] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:42:43] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] 
[17:44:55] RECOVERY - Puppet failure on deployment-salt is OK: OK: Less than 1.00% above the threshold [0.0] [17:53:42] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:54:41] Ops-Access-Requests, Continuous-Integration: Make sure relevant RelEng people have access to gallium (Chris M, Dan, Mukunda, Zeljko) - https://phabricator.wikimedia.org/T85936#991202 (RobH) a:JohnLewis>RobH I'll handle this later today. Ops is in break out sprints, which include a phabricator group (wh... [18:03:03] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:04:01] Yippee, build fixed! [18:04:01] Project beta-code-update-eqiad build #41544: FIXED in 1 min 0 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/41544/ [18:05:31] Project beta-scap-eqiad build #39231: FAILURE in 1 min 29 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39231/ [18:06:40] RECOVERY - Puppet failure on deployment-parsoid05 is OK: OK: Less than 1.00% above the threshold [0.0] [18:15:08] greg-g: so, fun bug turned up last night https://phabricator.wikimedia.org/T87396. It has been generally mitigated via an abuse filter rule on enwiki, but would like to deploy a tiny patch to fix it https://gerrit.wikimedia.org/r/186351 [18:15:29] Yippee, build fixed!
[18:15:30] Project beta-scap-eqiad build #39232: FIXED in 1 min 31 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39232/ [18:18:43] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:23:36] PROBLEM - Puppet failure on deployment-db2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:26:40] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:26:56] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:28:55] ebernhardson: doit [18:31:20] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:31:28] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:33:27] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:34:03] puppet, behave. [18:39:46] greg-g: Were we (err, you) going to add a third midday SWAT window? 
[18:43:41] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:45:32] Project beta-scap-eqiad build #39235: FAILURE in 1 min 34 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39235/ [18:46:41] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [18:52:00] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:54:43] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #270: FAILURE in 47 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/270/ [18:55:30] Yippee, build fixed! [18:55:31] Project beta-scap-eqiad build #39236: FIXED in 1 min 21 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39236/ [18:55:44] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:56:24] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:57:51] RESTBase, Continuous-Integration, Services, Parsoid: Move Parsoid and RESTBase testing from Travis CI to our Jenkins - https://phabricator.wikimedia.org/T78410#991387 (Jdouglas) Point in favor of setting up our own infrastructure: Travis is currently in a bad state, and all we can do is wait. https://travis-...
[18:58:29] RECOVERY - Puppet failure on deployment-memc02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:59:24] jesus fuckin christ, puppet [18:59:25] * YuviPanda looks [18:59:43] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [19:01:30] RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0] [19:01:42] fucking dnsmasq [19:03:40] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:04:01] bd808: where’s the beta logstash instance? [19:04:14] deployment-logstash01 [19:04:23] crazy right? :) [19:04:48] bd808: no, i mean, from the web... [19:04:59] gah, should’ve said kibana web url [19:05:20] https://logstash-beta.wmflabs.org/#/dashboard/elasticsearch/default [19:05:48] * bd808 needs to make a config patch for exception-json processing [19:07:11] James_F: having trouble finding deployers [19:07:53] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:08:43] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [19:08:43] RECOVERY - Puppet failure on deployment-db2 is OK: OK: Less than 1.00% above the threshold [0.0] [19:09:33] bd808: thanks [19:09:39] greg-g: this is just the DNS server fucking up again [19:11:30] Deployment-Systems: Expose php warnings in mediawiki-config more visibly - https://phabricator.wikimedia.org/T87447#991399 (Nikerabbit) NEW [19:11:37] greg-g: Krenair and a couple of others, at least, would be around. [19:11:55] hi [19:12:01] * James_F takes Krenair's name in vain. :-) [19:12:46] what needs to be done? 
[19:12:59] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:13:28] Krenair: adding SWAT deployers :) [19:13:44] kaldari will be training sam smith [19:14:43] greg-g: Add the slot and they will come? ;-) [19:14:44] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:15:48] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:18:21] PROBLEM - Puppet failure on deployment-stream is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [19:18:34] James_F: :) [19:18:36] greg-g, so... do I need to do something? when? [19:18:49] Krenair: don't think so, are you thinking about GWToolset? [19:19:03] (there was also a Flow issue that came up this morning) [19:19:13] No, James_F said my name [19:19:14] Krenair: you wanna do it Monday morning? [19:19:22] ah, no, nothing :) [19:20:02] isn't that at like 07:00 AM local time? [19:20:37] SF? 8 :) [19:20:45] ok ok, afternoon? :) [19:20:58] we could probably also do a 9 or 10 if need be [19:21:29] 15:00 UTC, subtract 8 hours... [19:24:41] RECOVERY - Puppet failure on deployment-cache-upload02 is OK: OK: Less than 1.00% above the threshold [0.0] [19:25:15] 16:00–17:00 UTC [19:25:16] 08:00–09:00 PST [19:26:13] sigh... why is this listed as 15:00 on the SWAT_deploys page? [19:26:25] oh, old [19:26:32] I should just delete that, too many wiki pages.... [19:26:41] greg-g: Overriding the SWAT? [19:26:44] Oh, huh. [19:26:52] The SWAT is just...OK. [19:27:02] Krenair: because of daylight savings [19:27:03] * marktraceur is going to get more coffee. [19:27:16] Krenair: deploy windows are pinned to SF time, which means the UTC time changes twice a year [19:27:22] oh, so you change the UTC timings so the time would remain constant in SF? 
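The point made in this exchange, that deploy windows are pinned to San Francisco time and so their UTC time shifts twice a year, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `window_start_utc` and the two example dates are assumptions, not anything taken from the SWAT_deploys page.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# San Francisco local time; this zone observes daylight saving time.
SF = ZoneInfo("America/Los_Angeles")


def window_start_utc(day: date, local_hour: int = 8) -> datetime:
    """Return the UTC start of a window pinned to `local_hour` SF time.

    Hypothetical helper for illustration; not part of any deploy tooling.
    """
    local_start = datetime.combine(day, time(hour=local_hour), tzinfo=SF)
    return local_start.astimezone(timezone.utc)


# Example dates are assumptions chosen to fall in PST and PDT respectively.
winter = window_start_utc(date(2015, 1, 26))  # PST, UTC-8
summer = window_start_utc(date(2015, 7, 27))  # PDT, UTC-7

print(winter.strftime("%H:%M UTC"))  # 16:00 UTC
print(summer.strftime("%H:%M UTC"))  # 15:00 UTC
```

The same 08:00 local start maps to 16:00 UTC in winter and 15:00 UTC in summer, which would explain why a stale wiki page could still list 15:00.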
[19:27:26] * greg-g nods [19:27:28] yeah [19:27:30] ugh [19:27:59] Phabricator: Change color of #Patch-For-Review project to something more unique - https://phabricator.wikimedia.org/T87226#991470 (Jdforrester-WMF) >>! In T87226#991055, @Qgil wrote: > James, now I don't know whether you are being serious or ironic. Serious. Gerrit is currently where code review happens, so... [19:28:00] https://wikitech.wikimedia.org/w/index.php?title=SWAT_deploys&diff=141817&oldid=141809 [19:28:04] daylight savings is such a pita [19:28:09] yup [19:28:49] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [19:32:13] greg-g, okay so looking at the schedule for MWDS, I'm not likely to be available for that window [19:32:54] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [19:34:41] greg-g, why is there a 00:00 - 23:00 UTC/16:00-15:00 PST "Morning SWAT" on monday? [19:34:59] I think those times are wrong [19:37:58] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [19:38:46] RECOVERY - Puppet failure on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:43:27] RECOVERY - Puppet failure on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0] [19:44:41] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:47:31] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:49:01] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [19:58:48] RECOVERY - Puppet failure on deployment-eventlogging02 is OK: OK: Less than 1.00% above the threshold [0.0] [20:01:42] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] 
[20:01:46] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [20:01:50] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [0.0] [20:03:54] Krenair: good catch [20:05:23] fixed [20:10:42] Project beta-scap-eqiad build #39242: FAILURE in 26 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39242/ [20:12:29] RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0] [20:14:18] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #424: FAILURE in 2 hr 4 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/424/ [20:19:01] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [20:21:40] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [20:21:46] RECOVERY - Puppet failure on deployment-cache-bits01 is OK: OK: Less than 1.00% above the threshold [0.0] [20:22:42] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [20:24:12] PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [20:26:37] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we did in 2014 - https://phabricator.wikimedia.org/T86750#991703 (hashar) [20:26:44] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0] [20:28:18] Could not resolve hostname deployment-mediawiki02.eqiad.wmflabs: Temporary failure in name resolution [20:28:33] ah, yeah, yuvi knows [20:36:55] I really wish we could make this faster: Finished 
mw-update-l10n (duration: 19m 45s) [20:37:41] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [20:42:38] greg-g: Inorite. [20:42:43] Project beta-scap-eqiad build #39243: STILL FAILING in 30 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39243/ [20:42:53] still? [20:42:59] greg-g: Waiting for Beta Labs sync to confirm that fixed bugs are actually fixed in production(-like) environment sucks. [20:43:03] Build timed out (after 30 minutes). Marking the build as failed. [20:43:06] fucking eh [20:43:22] I want to turn that timeout off [20:47:22] (PS1) Greg Grossmeier: Set beta-scap-$datacenter timeout to 45 minutes [integration/config] - https://gerrit.wikimedia.org/r/186406 [20:47:31] Yippee, build fixed! [20:47:32] Project beta-scap-eqiad build #39244: FIXED in 2 min 5 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/39244/ [20:47:44] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [20:48:38] (CR) Greg Grossmeier: "I have no idea if I did this correctly." [integration/config] - https://gerrit.wikimedia.org/r/186406 (owner: Greg Grossmeier) [20:48:49] ^ at least I'm honest [20:49:03] greg-g: at least you tried :) [20:49:55] when I get sufficiently annoyed... [20:50:47] greg-g: Feels like https://gerrit.wikimedia.org/r/#/c/186001/1/wmf-config/CommonSettings-labs.php – "// FIXME: Is this right?" 
[20:50:51] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #462: FAILURE in 59 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/462/ [20:51:15] James_F: :) :) [20:52:21] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [20:54:16] RECOVERY - Puppet failure on deployment-elastic08 is OK: OK: Less than 1.00% above the threshold [0.0] [20:54:46] PROBLEM - Puppet failure on deployment-eventlogging02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [20:54:46] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [20:54:58] (CR) 20after4: [C: 1] "looks good to me" [integration/config] - https://gerrit.wikimedia.org/r/186406 (owner: Greg Grossmeier) [20:58:39] PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [21:02:42] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [21:11:57] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [21:14:28] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [21:17:22] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:18:40] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:19:45] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [21:19:47] RECOVERY - Puppet failure on deployment-eventlogging02 is OK: OK: Less than 1.00% above the threshold [0.0] [21:26:29] (PS1) Jdlrobson: Add smoke tests for MobileFrontend in jjb 
[integration/config] - https://gerrit.wikimedia.org/r/186451 [21:28:42] RECOVERY - Puppet failure on deployment-cxserver03 is OK: OK: Less than 1.00% above the threshold [0.0] [21:31:09] Phabricator: Searchable "Reference" custom field - https://phabricator.wikimedia.org/T991#991804 (chasemp) seems not worth the headache at this point [21:35:43] PROBLEM - Puppet failure on deployment-videoscaler01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [21:39:03] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [21:39:25] RECOVERY - Puppet failure on deployment-memc02 is OK: OK: Less than 1.00% above the threshold [0.0] [21:41:59] RECOVERY - Puppet failure on deployment-pdf02 is OK: OK: Less than 1.00% above the threshold [0.0] [21:43:39] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0] [21:48:26] (CR) Greg Grossmeier: "This is working as expected without complaints (after Bryan's manual config change)." [integration/config] - https://gerrit.wikimedia.org/r/184502 (https://phabricator.wikimedia.org/T84947) (owner: Greg Grossmeier) [21:48:32] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [21:48:56] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [21:59:09] Yippee, build fixed! 
[21:59:10] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #473: FIXED in 33 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/473/ [22:00:12] (CR) Krinkle: "Creating the job in Jenkins is fine, but please also add it to one or more of your pipelines in zuul-config" [integration/config] - https://gerrit.wikimedia.org/r/184592 (owner: Adrian Lang) [22:00:18] (CR) Krinkle: "(Otherwise it won't run)." [integration/config] - https://gerrit.wikimedia.org/r/184592 (owner: Adrian Lang) [22:00:41] RECOVERY - Puppet failure on deployment-videoscaler01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:04:09] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:08:56] RECOVERY - Puppet failure on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:09:16] PROBLEM - Puppet failure on deployment-redis01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:13:29] RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:19:17] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [22:21:38] puppet failures caused by "temporary failure in name resolution" ... 
[22:23:41] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0] [22:27:58] I can't see anything wrong with dns though [22:29:28] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [22:30:04] twentyafterfour: yuvi just said dnsmasq is fucking up /me shrugs [22:32:03] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#991967 (chasemp) [22:34:18] RECOVERY - Puppet failure on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:41:39] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:41:46] OK I am at the end of my rope on this one [22:41:55] https://phabricator.wikimedia.org/T86770 [22:42:03] We thought it was z-index, I'm not convinced [22:42:15] I solved two of these bugs by adding a sleep 1 statement to the relevant steps [22:42:30] The one I've got now doesn't have any z-index on the elements involved [22:42:43] (cc gi11es) [22:43:01] Is it just the Chrome driver being finicky or something? [22:43:55] * marktraceur looks at zel and chr [22:43:58] Damn it guys [22:44:10] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [22:48:40] RECOVERY - Puppet failure on deployment-logstash1 is OK: OK: Less than 1.00% above the threshold [0.0] [22:49:13] marktraceur: They're in an all-afternoon training session for this stuff. 
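The "sleep 1" workaround mentioned in this exchange is commonly replaced by polling for the condition with a timeout, so a step waits only as long as it needs to. The following is a generic sketch, not the actual browser test code: the log does not say which test framework is in use, and `wait_until` and `element_is_clickable` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for a real browser-driver check: the "element"
# becomes clickable 0.3 seconds after this script starts.
ready_at = time.monotonic() + 0.3


def element_is_clickable() -> bool:
    return time.monotonic() >= ready_at


clicked = wait_until(element_is_clickable, timeout=2.0)
timed_out = not wait_until(lambda: False, timeout=0.2)
print(clicked, timed_out)
```

Compared with a fixed sleep, this returns as soon as the condition holds and fails explicitly when it never does, which may make an intermittent failure easier to tell apart from a real bug.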
[22:49:25] I see that now [22:49:37] marktraceur: :-) [22:50:08] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992004 (RobH) [22:53:07] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992011 (mmodell) I thought this was already done as part of the security extension. Not extensively tested though. [22:54:35] RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0] [22:58:54] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992016 (mmodell) The described behaviour is essentially how it works, however, it's not based on project tags, it's based on the security dropdown. It... [22:59:46] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992017 (mmodell) Ah ha, I see the access request dropdown option is missing. @chasemp: any reason this isn't included in the config? [23:00:15] PROBLEM - Puppet failure on deployment-db1 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [23:00:31] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992024 (chasemp) @mmodell, before today we weren't ready for it, and I wasn't sure if the logic was ported to Security-Extension-2.0? [23:01:20] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992026 (mmodell) @chasemp the logic is ported, just not very well tested. 
I'll do some testing on my local phab and we can go from there. [23:02:00] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992027 (chasemp) >>! In T87467#992026, @mmodell wrote: > @chasemp the logic is ported, just not very well tested. I'll do some testing on my local pha... [23:06:39] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0] [23:10:00] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [23:10:34] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [23:15:04] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [23:25:13] RECOVERY - Puppet failure on deployment-db1 is OK: OK: Less than 1.00% above the threshold [0.0] [23:35:01] RECOVERY - Puppet failure on deployment-upload is OK: OK: Less than 1.00% above the threshold [0.0] [23:35:33] RECOVERY - Puppet failure on deployment-memc03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:37:54] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992259 (mmodell) https://gerrit.wikimedia.org/r/#/c/186533/ [23:39:22] Phabricator: Add cscott to WMF_NDA. - https://phabricator.wikimedia.org/T87479#992265 (cscott) NEW [23:40:12] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992273 (mmodell) @chasemp: See my patch, I fixed a couple of little things (fallout from upstream changes) and this seems to be working well now. 
[23:41:31] Phabricator, operations: have any task put into ops-access-requests automatically generate an ops-access-review task - https://phabricator.wikimedia.org/T87467#992276 (mmodell) One thing you might not want - it currently includes the "CC'd can see this" behaviour... [23:45:07] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [23:45:11] so where is dnsmasq running in the beta cluster? [23:46:17] Warning: Permanently added '10.68.16.1' (ECDSA) to the list of known hosts. [23:46:20] Permission denied (publickey). [23:46:45] 10.68.16.1 is the dns server according to resolv.conf, and I can't log in there [23:47:55] twentyafterfour: eth4-1102.labnet1001.eqiad.wmnet. [23:48:12] I guess it's labs wide [23:48:44] so not something I can even attempt to fix [23:48:50] that sucks [23:49:17] What's the issue? [23:49:29] oh, just read in the other channel [23:49:51] twentyafterfour: I'd phile a ticket and cc Coren, Andrew and Yuvi at least [23:50:20] ok [23:53:15] Yippee, build fixed! [23:53:15] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #418: FIXED in 16 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/418/ [23:57:36] Beta-Cluster, Release-Engineering, operations: Intermittent DNS failures in beta labs regularly trigger a bunch of puppet failures - https://phabricator.wikimedia.org/T87480#992333 (mmodell) [23:57:48] Beta-Cluster, operations: Intermittent DNS failures in beta labs regularly trigger a bunch of puppet failures - https://phabricator.wikimedia.org/T87480#992335 (Reedy) [23:57:48] done ^ [23:58:31] srsly [23:58:37] I can't re-add Release-Engineering after accidentally removing it
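For reference, the step of reading the DNS server out of resolv.conf, as described in the discussion above, can be sketched as follows. This is a minimal illustration: `parse_nameservers` is a hypothetical helper, and the sample file contents are made up for the example (only the address 10.68.16.1 appears in the log).

```python
from typing import List


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses listed on `nameserver` lines, in file order."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # resolv.conf treats lines starting with '#' or ';' as comments.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Illustrative contents only; not copied from a beta cluster host.
SAMPLE = """\
# example resolv.conf
search eqiad.wmflabs
nameserver 10.68.16.1
"""

print(parse_nameservers(SAMPLE))  # ['10.68.16.1']
```

On a real host the text would typically come from reading /etc/resolv.conf; a single `nameserver` entry would be consistent with one failing resolver affecting every instance at once.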