[00:31:13] !log integration Deleted stale config.lock in gallium:/srv/ssd/jenkins-slave/workspace/mediawiki-phpunit/src/.git (see https://integration.wikimedia.org/ci/job/mediawiki-phpunit/1683/console)
[00:31:18] Logged the message, Master
[04:09:44] [nagf] Krinkle pushed 1 new commit to master: https://github.com/wikimedia/nagf/commit/6b821bc9f3210322001da477e60d8e65578e46a8
[04:09:44] nagf/master 6b821bc Timo Tijhof: build: Format test script in composer.json as array
[04:10:31] wikimedia/nagf#20 (master - 6b821bc: Timo Tijhof) The build passed. - http://travis-ci.org/wikimedia/nagf/builds/43908436
[08:16:23] PROBLEM - Puppet staleness on tools-webproxy is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [43200.0]
[08:29:50] Labs-Team: Make labs salt use instance names than ids - https://phabricator.wikimedia.org/T1154#845589 (yuvipanda)
[08:31:22] RECOVERY - Puppet staleness on tools-webproxy is OK: OK: Less than 1.00% above the threshold [3600.0]
[10:48:16] Beta-Cluster, Labs-Team: Setup multimaster salt for large projects using salt-syndic - https://phabricator.wikimedia.org/T78466#845659 (yuvipanda) NEW
[12:23:38] PROBLEM - Puppet failure on tools-login is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:23:53] PROBLEM - Puppet failure on tools-dev is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:23:59] PROBLEM - Puppet failure on tools-exec-08 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:24:07] PROBLEM - Puppet failure on tools-webgrid-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:24:19] PROBLEM - Puppet failure on tools-exec-07 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:24:53] PROBLEM - Puppet failure on tools-exec-12 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:25:03] PROBLEM - Puppet failure on tools-webgrid-03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:25:13] PROBLEM - Puppet failure on tools-exec-06 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:25:19] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:25:59] PROBLEM - Puppet failure on tools-exec-13 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:25:59] PROBLEM - Puppet failure on tools-exec-11 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:26:29] PROBLEM - Puppet failure on tools-mail is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:26:35] PROBLEM - Puppet failure on tools-webgrid-04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:26:47] PROBLEM - Puppet failure on tools-exec-02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:26:49] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:27:21] PROBLEM - Puppet failure on tools-webgrid-tomcat is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:27:42] PROBLEM - Puppet failure on tools-webgrid-05 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:28:05] PROBLEM - Puppet failure on tools-exec-15 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:28:17] PROBLEM - Puppet failure on tools-exec-03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[12:28:36] PROBLEM - Puppet failure on tools-master is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0]
[12:28:40] PROBLEM - Puppet failure on tools-exec-cyberbot is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:28:40] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:28:48] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[12:28:54] PROBLEM - Puppet failure on tools-exec-09 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[12:29:08] PROBLEM - Puppet failure on tools-exec-gift is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:08] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:08] PROBLEM - Puppet failure on tools-exec-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:12] PROBLEM - Puppet failure on tools-redis is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[12:29:18] PROBLEM - Puppet failure on tools-exec-wmt is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0]
[12:29:27] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:29] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[12:29:29] PROBLEM - Puppet failure on tools-exec-04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:34] hmm, sigh
[12:29:35] PROBLEM - Puppet failure on tools-trusty is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[12:29:47] transient storm because puppetmaster failed, nothing to look at, move along...
[12:33:50] RECOVERY - Puppet failure on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:33:52] RECOVERY - Puppet failure on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:34:12] RECOVERY - Puppet failure on tools-redis is OK: OK: Less than 1.00% above the threshold [0.0]
[12:34:30] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:35:58] RECOVERY - Puppet failure on tools-exec-11 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:38:16] RECOVERY - Puppet failure on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:38:32] RECOVERY - Puppet failure on tools-master is OK: OK: Less than 1.00% above the threshold [0.0]
[12:38:42] RECOVERY - Puppet failure on tools-login is OK: OK: Less than 1.00% above the threshold [0.0]
[12:39:16] RECOVERY - Puppet failure on tools-exec-wmt is OK: OK: Less than 1.00% above the threshold [0.0]
[12:39:26] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:40:58] RECOVERY - Puppet failure on tools-exec-13 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:41:30] RECOVERY - Puppet failure on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0]
[12:42:18] RECOVERY - Puppet failure on tools-webgrid-tomcat is OK: OK: Less than 1.00% above the threshold [0.0]
[12:42:42] RECOVERY - Puppet failure on tools-webgrid-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:43:02] RECOVERY - Puppet failure on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:43:40] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:43:56] RECOVERY - Puppet failure on tools-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:05] RECOVERY - Puppet failure on tools-exec-08 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:07] RECOVERY - Puppet failure on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:12] RECOVERY - Puppet failure on tools-webgrid-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:22] RECOVERY - Puppet failure on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:36] RECOVERY - Puppet failure on tools-trusty is OK: OK: Less than 1.00% above the threshold [0.0]
[12:44:54] RECOVERY - Puppet failure on tools-exec-12 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:45:07] RECOVERY - Puppet failure on tools-webgrid-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:46:39] RECOVERY - Puppet failure on tools-webgrid-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:46:47] RECOVERY - Puppet failure on tools-exec-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:46:51] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0]
[12:49:07] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0]
[12:49:07] RECOVERY - Puppet failure on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:49:29] RECOVERY - Puppet failure on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:50:17] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[12:53:41] RECOVERY - Puppet failure on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0]
[13:49:22] test
[16:03:42] !log deployment-prep Starting work on [[phab:T78076]] to renumber apache users in beta
[16:03:45] Logged the message, Master
[16:41:17] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[17:06:17] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[17:10:38] !log deployment-prep Many strange puppet and scap failures in beta that look to be related to DNS failures
[17:10:41] Logged the message, Master
[20:22:20] PROBLEM - Puppet staleness on tools-webproxy is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [43200.0]
[21:49:50] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:14:48] RECOVERY - Puppet failure on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:18:18] PROBLEM - Puppet failure on tools-webgrid-tomcat is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[22:18:22] Wikimedia-Labs-Infrastructure: Internal DNS look-ups fail every once in a while - https://phabricator.wikimedia.org/T72076#846411 (bd808) We've seen frequent errors with the beta-scap-eqiad again in the last week or two that look to be related to DNS failures for internal hosts: ``` 22:05:23 ['/srv/deploymen...
[22:18:36] Wikimedia-Labs-Infrastructure: Internal DNS look-ups fail every once in a while - https://phabricator.wikimedia.org/T72076#846413 (bd808)
[22:43:20] RECOVERY - Puppet failure on tools-webgrid-tomcat is OK: OK: Less than 1.00% above the threshold [0.0]
[22:55:16] PROBLEM - Puppet failure on tools-exec-wmt is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[23:20:17] RECOVERY - Puppet failure on tools-exec-wmt is OK: OK: Less than 1.00% above the threshold [0.0]
[23:51:21] valhallasw`cloud: legoktm preliminary plan for uwsgi support on toollabs: https://etherpad.wikimedia.org/p/toollabs-uwsgi
[23:51:35] will run on trusty from the start, so py3 support baked in
[23:51:43] should also be faster
[23:53:49] so much magic
[23:54:26] so now lighttpd.conf will be merged-in from three different files?
[23:54:28] brrr.
[23:55:05] or will uwsgi follow a different route?
[23:55:17] and thus no lighttpd base url-rewrites, for instance?
[23:56:24] valhallasw`cloud: yeah, uwsgi will have no lighty in it
[23:56:31] hmm, that’s true, not sure if uwsgi handles url-rewrites
[23:57:02] valhallasw`cloud: christ, that shit’s convoluted https://github.com/unbit/uwsgi-docs/blob/master/InternalRouting.rst
[23:58:05] mmm
[23:58:14] so then always everything will pass through uwsgi?
[23:58:17] what about static files?
[23:59:24] uwsgi these days has a nice static file server
[23:59:34] https://github.com/unbit/uwsgi-docs/blob/master/StaticFiles.rst
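
On the url-rewrite question: a minimal sketch, assuming the base-URL rewrite is done inside the Python app as WSGI middleware rather than through uwsgi's internal routing. This is not taken from the toollabs plan; the names `PrefixRewrite` and `app` and the `/mytool` prefix are hypothetical. It only shows that a lighttpd-style "strip the tool prefix" rewrite can live at the WSGI layer, and it does not cover static files.

```python
def app(environ, start_response):
    """Tiny WSGI app (hypothetical) that echoes the path it was asked for."""
    body = ("path=" + environ.get("PATH_INFO", "")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class PrefixRewrite:
    """WSGI middleware: move a base URL prefix from PATH_INFO to SCRIPT_NAME."""

    def __init__(self, wrapped, prefix):
        self.wrapped = wrapped
        self.prefix = prefix.rstrip("/")

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Only rewrite when the request is for the prefix itself or below it.
        if path == self.prefix or path.startswith(self.prefix + "/"):
            environ = dict(environ)  # copy so the caller's dict is untouched
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + self.prefix
            environ["PATH_INFO"] = path[len(self.prefix):] or "/"
        return self.wrapped(environ, start_response)


# Assumption: the tool is served under /mytool; the wrapped app sees /foo
# for a request to /mytool/foo.
application = PrefixRewrite(app, "/mytool")
```

The module-level callable is named `application` because that is the name WSGI servers, uwsgi included, look for by default.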