[01:03:45] PROBLEM - Free space - all mounts on tools-webproxy-02 is CRITICAL: CRITICAL: tools.tools-webproxy-02.diskspace.root.byte_percentfree.value (<25.00%)
[01:13:46] RECOVERY - Free space - all mounts on tools-webproxy-02 is OK: OK: All targets OK
[02:03:56] (03PS1) 10Jforrester: Move/dupe some more repos to -visualeditor for the Edting department [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/195003
[02:07:14] (03PS2) 10Jforrester: Notify mobile and corefeatures of Mantle commits [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/137270 (owner: 10Spage)
[02:08:09] (03CR) 10Jforrester: [C: 032] Notify mobile and corefeatures of Mantle commits [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/137270 (owner: 10Spage)
[02:09:03] (03CR) 10Jforrester: "Needs rebasing." [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/186905 (owner: 10Awight)
[02:47:00] PROBLEM - ToolLabs Home Page on toollabs is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:52:01] RECOVERY - ToolLabs Home Page on toollabs is OK: HTTP OK: HTTP/1.1 200 OK - 747533 bytes in 9.388 second response time
[02:55:44] (03CR) 10Krinkle: [C: 032] Move/dupe some more repos to -visualeditor for the Edting department [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/195003 (owner: 10Jforrester)
[02:55:47] (03Merged) 10jenkins-bot: Move/dupe some more repos to -visualeditor for the Edting department [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/195003 (owner: 10Jforrester)
[02:55:50] (03Merged) 10jenkins-bot: Notify mobile and corefeatures of Mantle commits [labs/tools/grrrit] - 10https://gerrit.wikimedia.org/r/137270 (owner: 10Spage)
[05:14:41] 6Labs: Rename keystone role 'projectadmin' to 'admin' - https://phabricator.wikimedia.org/T91830#1097687 (10scfc) Is that a statement of fact or just the status quo for Horizon? :-) Having users of the Tools project create and delete VMs would change … everything :-).
[06:23:59] RECOVERY - Puppet staleness on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [3600.0]
[06:26:56] 10Tool-Labs: Different versions of python-requests cause Puppet failures on four instances - https://phabricator.wikimedia.org/T91862#1097702 (10scfc) 3NEW a:3scfc
[06:27:25] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[07:17:17] 10Tool-Labs: Move tools-login and tools-dev to trusty - https://phabricator.wikimedia.org/T91863#1097720 (10yuvipanda) 3NEW
[07:32:39] 10Tool-Labs: Move tools-login and tools-dev to trusty - https://phabricator.wikimedia.org/T91863#1097734 (10Legoktm) +1.
[07:35:15] 10Tool-Labs, 10Wikimedia-Mailing-lists: Create a labs-announce-l mailing list - https://phabricator.wikimedia.org/T91864#1097735 (10yuvipanda) 3NEW
[07:59:38] 10Tool-Labs: Move tools-login and tools-dev to trusty - https://phabricator.wikimedia.org/T91863#1097772 (10yuvipanda) Actually, since they are just DNS entries, I can just point them to a different instance, and make backport ones for the older hosts. So people would just get routed to the new address at some p...
[08:00:07] 6Labs: Upgrade labs cluster to Jessie (alternative to T90821) - https://phabricator.wikimedia.org/T91799#1097773 (10Aklapper) [Please associate a project when creating a task. Adding #Labs here]
[09:27:52] RECOVERY - Puppet staleness on tools-static is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:27:52] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [0.0]
[09:28:04] PROBLEM - Puppet failure on tools-exec-04 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[09:30:48] PROBLEM - Puppet failure on tools-exec-09 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:34:12] RECOVERY - Puppet staleness on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:35:16] PROBLEM - Puppet failure on tools-exec-15 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:36:12] PROBLEM - Puppet failure on tools-exec-13 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[09:36:12] RECOVERY - Puppet staleness on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:36:36] RECOVERY - Puppet staleness on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:36:36] RECOVERY - Puppet staleness on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:36:44] PROBLEM - Puppet failure on tools-redis-slave is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[09:36:56] PROBLEM - Puppet failure on tools-exec-08 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[09:37:22] PROBLEM - Puppet failure on tools-exec-12 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:37:36] PROBLEM - Puppet failure on tools-exec-07 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:37:52] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[09:37:53] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0]
[09:38:21] PROBLEM - Puppet failure on tools-webgrid-05 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[09:38:37] PROBLEM - Puppet failure on tools-trusty is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:38:37] RECOVERY - Puppet staleness on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:38:59] RECOVERY - Puppet staleness on tools-exec-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:39:31] RECOVERY - Puppet staleness on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:39:44] PROBLEM - Puppet failure on tools-master is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[09:39:54] RECOVERY - Puppet staleness on tools-exec-08 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:41:12] RECOVERY - Puppet staleness on tools-exec-12 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:41:12] RECOVERY - Puppet staleness on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:41:56] RECOVERY - Puppet staleness on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:42:16] RECOVERY - Puppet staleness on tools-exec-13 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:42:38] PROBLEM - Puppet failure on tools-exec-02 is CRITICAL: CRITICAL: 71.43% of data above the critical threshold [0.0]
[09:42:42] RECOVERY - Puppet staleness on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:43:00] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:43:22] PROBLEM - Puppet failure on tools-exec-cyberbot is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0]
[09:43:32] RECOVERY - Puppet staleness on tools-mail is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:43:36] PROBLEM - Puppet failure on tools-exec-01 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[09:43:42] RECOVERY - Puppet staleness on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:43:47] RECOVERY - Puppet staleness on tools-redis is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:44:05] RECOVERY - Puppet staleness on tools-redis-slave is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:44:38] RECOVERY - Puppet staleness on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:45:56] PROBLEM - Puppet failure on tools-webgrid-07 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[09:46:42] RECOVERY - Puppet staleness on tools-shadow is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:47:11] RECOVERY - Puppet staleness on tools-webgrid-01 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:47:53] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0]
[09:48:01] RECOVERY - Puppet staleness on tools-trusty is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:48:44] RECOVERY - Puppet staleness on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:48:55] RECOVERY - Puppet staleness on tools-webgrid-05 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:50:27] PROBLEM - Puppet failure on tools-exec-03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:50:27] PROBLEM - Puppet failure on tools-exec-06 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:51:41] RECOVERY - Puppet staleness on tools-webgrid-07 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:51:55] RECOVERY - Puppet failure on tools-redis-slave is OK: OK: Less than 1.00% above the threshold [0.0]
[09:52:27] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:52:53] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0]
[09:54:42] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:56:54] PROBLEM - Puppet failure on tools-webgrid-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[09:56:54] PROBLEM - Puppet failure on tools-exec-gift is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:56:58] PROBLEM - Puppet failure on tools-webgrid-tomcat is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[09:57:48] RECOVERY - Puppet staleness on tools-webgrid-generic-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:58:04] 10Tool-Labs: Different versions of python-requests cause Puppet failures on four instances - https://phabricator.wikimedia.org/T91862#1097841 (10scfc) Interestingly, Puppet stalled //previous// to any errors with `labsdebrepo`, i. e. other changes hadn't been applied either. Hmmm.
[09:58:56] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 12.50% of data above the critical threshold [0.0]
[09:59:50] RECOVERY - Puppet failure on tools-master is OK: OK: Less than 1.00% above the threshold [0.0]
[10:00:18] PROBLEM - Puppet failure on tools-webgrid-generic-02 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[10:03:39] RECOVERY - Puppet failure on tools-trusty is OK: OK: Less than 1.00% above the threshold [0.0]
[10:03:53] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[10:06:39] RECOVERY - Puppet staleness on tools-webproxy-02 is OK: OK: Less than 1.00% above the threshold [3600.0]
[10:06:55] RECOVERY - Puppet failure on tools-webgrid-tomcat is OK: OK: Less than 1.00% above the threshold [0.0]
[10:08:07] RECOVERY - Puppet failure on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:08:23] RECOVERY - Puppet failure on tools-webgrid-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:08:55] RECOVERY - Puppet staleness on tools-dev is OK: OK: Less than 1.00% above the threshold [3600.0]
[10:10:17] RECOVERY - Puppet failure on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:14:24] PROBLEM - Puppet failure on tools-webgrid-05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[10:14:34] PROBLEM - Puppet failure on tools-trusty is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[10:16:48] PROBLEM - Puppet failure on tools-dev is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[10:22:03] PROBLEM - Puppet failure on tools-login is CRITICAL: CRITICAL: 77.78% of data above the critical threshold [0.0]
[10:24:41] 10Tool-Labs: Different versions of python-requests cause Puppet failures on four instances - https://phabricator.wikimedia.org/T91862#1097864 (10scfc) 5Open>3Resolved I've downgraded `python-requests` to our version 1.2.3 on all instances. For Trusty (that provides version 2.2.1) it might be interesting to...
[10:27:05] RECOVERY - Puppet failure on tools-exec-08 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:27:39] RECOVERY - Puppet failure on tools-exec-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:27:51] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0]
[10:28:23] RECOVERY - Puppet failure on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0]
[10:28:45] RECOVERY - Puppet failure on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:28:57] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0]
[10:32:53] RECOVERY - Puppet failure on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:35:25] RECOVERY - Puppet failure on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:35:37] RECOVERY - Puppet failure on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:38:10] RECOVERY - Puppet failure on tools-exec-11 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:39:40] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:40:48] RECOVERY - Puppet failure on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:40:51] RECOVERY - Puppet failure on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:41:11] RECOVERY - Puppet failure on tools-exec-13 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:41:55] RECOVERY - Puppet failure on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0]
[10:42:38] RECOVERY - Puppet failure on tools-exec-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:45:01] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:46:01] RECOVERY - Puppet failure on tools-webgrid-04 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:46:51] RECOVERY - Puppet failure on tools-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[10:46:55] RECOVERY - Puppet failure on tools-webgrid-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:47:07] RECOVERY - Puppet failure on tools-login is OK: OK: Less than 1.00% above the threshold [0.0]
[10:53:56] hi guys! just a quick question. I accidentally deleted a script in tool labs that I didn't want to delete. Is there any trash or recovery area?
[10:57:26] YuviPanda: mister?
[11:07:25] RECOVERY - Puppet failure on tools-webgrid-06 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:09:19] RECOVERY - Puppet failure on tools-webgrid-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:10:24] RECOVERY - Puppet failure on tools-webgrid-generic-02 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:10:58] RECOVERY - Puppet failure on tools-webgrid-07 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:13:18] PROBLEM - Puppet failure on tools-webgrid-06 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:15:20] PROBLEM - Puppet failure on tools-webgrid-05 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:21:17] PROBLEM - Puppet failure on tools-webgrid-generic-02 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[11:26:48] PROBLEM - Puppet failure on tools-webgrid-07 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:28:58] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0]
[11:39:57] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:40:50] 10Tool-Labs: Setup a local repo for toollabs that supports separate trusty and precise packages - https://phabricator.wikimedia.org/T76802#1097918 (10scfc) You're right, nothing good. I've just run into a wall where Ubuntu's `python-matplotlib` on Trusty requires `libtcl8.6`. On Precise it doesn't, so our home...
[11:52:44] 10Tool-Labs: Reduce amount of tools-local packages - https://phabricator.wikimedia.org/T91874#1097938 (10yuvipanda) 3NEW
[12:33:31] 10Tool-Labs: Investigate system-level packages - https://phabricator.wikimedia.org/T91877#1097971 (10scfc) 3NEW
[12:41:09] 10Tool-Labs: Get rid of custom nginx packages - https://phabricator.wikimedia.org/T91878#1097977 (10yuvipanda) 3NEW a:3yuvipanda
[12:42:54] 10Tool-Labs: Rename 'misctools' toollabs package to something more appropriate - https://phabricator.wikimedia.org/T91879#1097985 (10yuvipanda) 3NEW
[12:43:04] 10Tool-Labs: Reduce amount of tools-local packages - https://phabricator.wikimedia.org/T91874#1097991 (10yuvipanda)
[12:43:05] 10Tool-Labs: Rename 'misctools' toollabs package to something more appropriate - https://phabricator.wikimedia.org/T91879#1097985 (10yuvipanda)
[13:07:11] 10Tool-Labs, 10Wikimedia-Mailing-lists: Create a labs-announce-l mailing list - https://phabricator.wikimedia.org/T91864#1098002 (10JohnLewis) a:3JohnLewis Sure, I'll do this shortly. Any preference to list admins or should I just add you two + Andrew or? :)
[16:02:08] YuviPanda: re. Labs-announce, are you bothered who the listadmins are (all three) or just want you and coren?
[16:05:32] JohnFLewis: Also Andrew
[16:06:23] Coren: alright, I'll get to that in about 5 minutes :)
[16:31:03] 10Tool-Labs, 10Wikimedia-Mailing-lists: Create a labs-announce-l mailing list - https://phabricator.wikimedia.org/T91864#1098174 (10JohnLewis) 5Open>3Resolved Done. Created as 'labs-announce' and set Andrew, Marc and Yuvi as listadmins. Password sent.
[20:56:19] !ping Betacommand you available for PM?
[20:56:19] !pong
[20:56:33] T13|mobile: yeah
[21:35:00] Coren YuviPanda andrewbogott_afk https://tools.wmflabs.org/imagemapedit/ is not working, please restart the webservice ...
[21:35:41] if i only had the service webserver2 restart right ...
[21:45:14] does anyone know a way that I could create an internal web service that could only be called from one of my tools, but not from the web?
[21:45:43] I used to have my tool call jsub, but it appears that Trusty is disallowing invocation via the user.
[21:46:21] _Only_ from the tool?
[21:46:28] matanya: Gimme a minute.
[21:46:50] well only from internal IPs
[21:47:05] only listen on 127.0.0.1?
[21:47:11] Magog_the_Ogre: you can use htaccess or iptables ?
[21:47:35] matanya: Webservice restarted
[21:47:40] thanks Coren
[21:48:33] Coren: send the bill to the usual place
[21:50:46] all the IPs in my access.log come from a 10.68.0.0/16 range
[21:50:59] including a wget I just ran from the command line from tools-login
[21:51:16] Magog_the_Ogre: I think the easiest way would be to have a continuous job that listens for http requests without registering itself with the proxy.
[21:51:33] Magog_the_Ogre: All the webservices can only see internal IPs, the proxy hides the others.
[21:52:07] (By design)
[21:52:32] coren, I'm sorry, I am good at programming, but I am very bad at the logistics of web servers
[21:52:36] how would I do such a thing?
[21:53:21] I guess I could encrypt my data with a private key accessible in my base directory which no one has read rights to
[21:53:46] Magog_the_Ogre: Or you could just do authentication.
[21:54:11] http://redmine.lighttpd.net/projects/1/wiki/HowToBasicAuth
[21:55:57] FWIW, I hardly even know what an .htaccess file is
[21:56:03] even this is going over my head
[21:56:12] I think I will just use a private key
[21:57:03] Heh.
[22:02:55] PROBLEM - Puppet staleness on tools-exec-15 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[22:17:27] YuviPanda: Do you happen to know if https://tools.wmflabs.org/paste/ is intentionally down?
[22:17:57] RECOVERY - Puppet staleness on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [3600.0]
[22:19:39] multichill: the easy way is: Coren, can you please restart the webservice of https://tools.wmflabs.org/paste/ ?
[22:20:40] Just not sure if it's intentional or just forgot to start it after one of the last crashes
[23:31:17] Hi, /topic
[23:31:26] Oops, sorry
[23:32:13] Hi, eswiki replication DB for Tools seems to have stopped today at 13:26 UTC, is this known?
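A minimal sketch of the approach Coren outlines above at 21:51–21:54 (a continuous job that listens for HTTP requests without registering itself with the web proxy, plus authentication): since the proxy only routes registered webservices, an unregistered listener is unreachable from the public web, and a shared secret stored in a file readable only by the tool account keeps other internal callers out. This is not taken from the log; the file name, header name, and port below are illustrative assumptions.

```python
#!/usr/bin/env python
# Hypothetical sketch: internal-only HTTP service protected by a shared secret.
# The secret file should be chmod 600 and owned by the tool account, so only
# jobs running as that tool can read it and authenticate.
import hmac
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

SECRET_FILE = os.path.expanduser('~/.internal-service-secret')  # illustrative path
PORT = 8123  # any free port; the job is NOT registered with the web proxy

with open(SECRET_FILE) as f:
    SECRET = f.read().strip()

class InternalHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        token = self.headers.get('X-Auth-Token', '')
        # Constant-time comparison to avoid leaking the secret via timing.
        if not hmac.compare_digest(token, SECRET):
            self.send_error(403, 'Forbidden')
            return
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(b'hello from the internal service\n')

if __name__ == '__main__':
    # Bind on all interfaces: reachable from other Labs hosts (e.g. grid jobs),
    # but not from the public web because the proxy never forwards to it.
    HTTPServer(('0.0.0.0', PORT), InternalHandler).serve_forever()
```

The calling tool would read the same secret file and send it in the X-Auth-Token header of each request. The lighttpd basic-auth setup linked at 21:54:11 is the alternative for tools that stay behind the proxy as a normal webservice.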