[06:46:46] PROBLEM - Puppet failure on tools-webgrid-generic-1405 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:49:01] PROBLEM - Puppet failure on tools-exec-gift is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[07:21:42] RECOVERY - Puppet failure on tools-webgrid-generic-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:24:00] RECOVERY - Puppet failure on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0]
[08:58:52] for some reason, https://wikitech.wikimedia.org/wiki/Nova_Resource:Design no longer lists living-style-guide.design , though it has a page https://wikitech.wikimedia.org/wiki/Nova_Resource:Living-style-guide.design.eqiad.wmflabs, weird
[09:34:06] Labs: can't administer sluggish "Living-style-guide.design" through wikitech Nova lost the - https://phabricator.wikimedia.org/T113214#1657568 (Spage) NEW
[09:49:18] Labs: can't administer sluggish "Living-style-guide.design" through wikitech, Nova "lost" the design instance - https://phabricator.wikimedia.org/T113214#1657611 (Spage)
[10:45:11] Hello, is there somebody here who could help me with instances? Any instance I create does not work
[10:45:22] Yuvipanda: Are you here?
[10:46:31] (PS1) Alexandros Kosiaris: Add ticket.wikimedia.org secret [labs/private] - https://gerrit.wikimedia.org/r/239804
[10:47:42] (CR) Alexandros Kosiaris: [C: 2 V: 2] Add ticket.wikimedia.org secret [labs/private] - https://gerrit.wikimedia.org/r/239804 (owner: Alexandros Kosiaris)
[11:34:00] Labs: Creation of instances broken - https://phabricator.wikimedia.org/T113175#1657782 (Luke081515) This happens also with jessie and precise instances, for example here the console output of precise: err: Could not request certificate: Connection refused - connect(2) err: Could not reques...
[11:34:12] (PS1) Alexandros Kosiaris: Fix ticket.wikimedia.org filename [labs/private] - https://gerrit.wikimedia.org/r/239814
[11:34:30] (CR) Alexandros Kosiaris: [C: 2 V: 2] Fix ticket.wikimedia.org filename [labs/private] - https://gerrit.wikimedia.org/r/239814 (owner: Alexandros Kosiaris)
[13:33:12] could someone take a look at and triage https://phabricator.wikimedia.org/T112641 please? (broken filesystem) We're down to 37M free space on the instance. Many thanks
[13:35:31] not that the space is the cause, but I want to free up more space by using the available partition which errored, turned read only and now is not mounting
[13:39:19] Labs, Labs-Sprint-114, labs-sprint-113: Evaluate gridengine's use of NFS and (possibly) move it to a different volume - https://phabricator.wikimedia.org/T111797#1657985 (coren) Open>declined After gathering data twice a day for a couple of days, I am now convinced there is no issue to solve - at...
[14:29:06] Labs, Labs-Infrastructure, Labs-Sprint-114: Ironic on Labs - https://phabricator.wikimedia.org/T110556#1658105 (Andrew) This thread discusses a use case similar to ours: http://lists.openstack.org/pipermail/openstack-dev/2015-September/073530.html It looks like we'd have to set up neutron, at least t...
[14:37:39] Labs: can't administer sluggish "Living-style-guide.design" through wikitech, Nova "lost" the design instance - https://phabricator.wikimedia.org/T113214#1658132 (scfc) On https://wikitech.wikimedia.org/wiki/Nova_Resource:Design, you are not listed as an "Admin" for that project. Only project administrators...
[15:44:59] poor wikibugs
[15:45:10] welcome back wikibugs
[15:47:51] ahh, it was James_F most likely "Mass-removing the Multimedia tag from MediaViewer tasks,..."
[15:48:20] Yup.
[15:48:22] Sorry.
[15:48:28] :) :) no worries
[16:21:58] Tool-Labs-tools-Other, Commons, Community-Tech: [AOI] Create a new DerivativeFX after the Toolserver shutdown - https://phabricator.wikimedia.org/T110409#1659609 (Jdforrester-WMF)
[16:26:58] Labs, Labs-Infrastructure, Patch-For-Review: Upgrade Labs to Openstack Juno - https://phabricator.wikimedia.org/T104587#1659707 (Andrew) Open>Resolved a:Andrew
[16:26:59] Labs, Labs-Infrastructure: upgrade to Openstack Kilo - https://phabricator.wikimedia.org/T104586#1659709 (Andrew)
[16:27:11] Labs, Labs-Infrastructure: Give 'novaobserver' keystone account rights to read everything, everywhere, write or change nothing - https://phabricator.wikimedia.org/T104588#1659712 (Andrew)
[16:27:12] Labs, Labs-Infrastructure: upgrade to Openstack Kilo - https://phabricator.wikimedia.org/T104586#1659710 (Andrew) Open>Resolved a:Andrew
[16:27:41] Labs, Labs-Infrastructure, Labs-sprint-112, Patch-For-Review, labs-sprint-113: Update Labs to OpenStack Kilo - https://phabricator.wikimedia.org/T110045#1659716 (Andrew)
[16:27:42] Labs, Labs-Infrastructure, Labs-sprint-112, Patch-For-Review, labs-sprint-113: Update Horizon/Californium to Kilo - https://phabricator.wikimedia.org/T112201#1659715 (Andrew) Open>Resolved
[16:27:51] Labs, Labs-Infrastructure, Labs-sprint-112, Patch-For-Review, labs-sprint-113: Update Labs to OpenStack Kilo - https://phabricator.wikimedia.org/T110045#1659724 (Andrew) Open>Resolved
[16:28:16] Labs, Labs-Infrastructure: Horizon (kilo) cannot talk properly to keystone - https://phabricator.wikimedia.org/T113093#1659729 (Andrew) Open>Invalid This turns out to be just a perverse Horizon design bug, not a communication issue
[16:29:20] andrewbogott: Can you please look at https://phabricator.wikimedia.org/T113175?
[16:29:44] Luke081515: I’m in a meeting but will look shortly
[16:29:48] Labs, Labs-Infrastructure: Creation of instances broken - https://phabricator.wikimedia.org/T113175#1659736 (Luke081515)
[16:29:49] ok, thanks
[16:40:42] Luke081515|away: I haven’t entirely reproduced your problem yet, but I suspect it has to do with reusing instance names and some caches and races and such. Can you try deleting, and then waiting an hour and then recreating?
[16:40:58] Obviously an hour is too long to wait in general but it should help us diagnose at least
[16:49:18] Labs, Labs-Infrastructure: Creation of instances broken - https://phabricator.wikimedia.org/T113175#1659866 (Andrew) The new instances have the same names as recently-deleted instances, yes? If so, I'd advise deleting the original instances and then waiting several minutes (maybe 10-15) before recreating...
[16:59:58] Labs, Tool-Labs, Labs-Infrastructure, Labs-Sprint-115: Can't delete rule in default security group - https://phabricator.wikimedia.org/T112492#1659911 (Andrew) a:Andrew
[17:02:02] Labs, Tool-Labs, Labs-Sprint-115: Decide on Docker image policies for Tool Labs Kubernetes - https://phabricator.wikimedia.org/T112855#1659926 (yuvipanda)
[17:03:00] Labs, Tool-Labs, Labs-Sprint-114, Labs-Sprint-115: Add support to dynamicproxy for kubernetes based web services - https://phabricator.wikimedia.org/T111916#1659943 (yuvipanda)
[17:05:28] Labs, Labs-Sprint-114, Labs-Sprint-115, Patch-For-Review: Setup an availability checker for all labsdb hosts - https://phabricator.wikimedia.org/T107449#1659947 (Andrew)
[17:05:42] Labs, Tool-Labs, Labs-Sprint-114, Labs-Sprint-115, and 2 others: Setup a tools checker service that can check all internal services for availability - https://phabricator.wikimedia.org/T97748#1659948 (Andrew)
[17:10:15] Labs, Labs-Sprint-114, Labs-Sprint-115: Make a flowchart for locating and halting misbehaving NFS clients - https://phabricator.wikimedia.org/T101744#1659956 (coren)
[17:10:52] Labs, Labs-Q4-Sprint-1, Labs-Q4-Sprint-2, Labs-Q4-Sprint-4, and 2 others: Labs NFSv4/idmapd mess - https://phabricator.wikimedia.org/T87870#1659957 (coren)
[17:14:57] Labs, Tool-Labs, Labs-Sprint-115: Write admission controller disabling mounting of unauthorized volumes - https://phabricator.wikimedia.org/T112718#1659963 (yuvipanda)
[17:15:55] Labs, Tool-Labs, Labs-Sprint-115: Permission issues and/or failure to load Ruby environment on trusty - https://phabricator.wikimedia.org/T106170#1659973 (coren)
[17:17:04] Labs, Tool-Labs, Labs-Sprint-103, Labs-Sprint-115: Labs: Move tools-shadow off the same host as tool-master - https://phabricator.wikimedia.org/T103390#1659975 (coren)
[17:24:02] Labs, Tool-Labs: Phase out precise instances from toollabs - https://phabricator.wikimedia.org/T94790#1660010 (yuvipanda)
[17:24:04] Labs, Tool-Labs: Move tools-master and tools-shadow to trusty - https://phabricator.wikimedia.org/T94791#1660009 (yuvipanda) Resolved>Open
[17:24:14] Labs, Tool-Labs: Move tools-master and tools-shadow to trusty - https://phabricator.wikimedia.org/T94791#1172731 (yuvipanda) Let's close this when it is actually trusty
[17:29:54] !log wikilabels deployed wikilabels:4a1d11d (recovers from database connection losses)
[17:29:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikilabels/SAL, Master
[17:41:10] valhallasw`cloud: one of my venv's is broken, and I'm confused:
[17:41:15] tools.legobot@tools-bastion-01:~$ /data/project/legobot/python/bin/python
[17:41:20] >>> import datetime
[17:41:20] Traceback (most recent call last):
[17:41:20] File "<stdin>", line 1, in <module>
[17:41:21] ImportError: No module named datetime
[17:58:07] legoktm: precise vs trusty?
[17:58:12] created on one used on the other
[18:00:50] hrm
[18:02:02] legoktm: ssh tools-precise-dev and try again
[18:02:33] okay
[18:02:34] that works
[18:02:40] so I just need to set that somewhere
[18:03:14] valhallasw`cloud: what is the jsub default?
[18:03:20] precise
[18:03:23] :(
[18:03:43] hrmm
[18:03:47] my job is running on precise then
[18:03:54] and we can't change that without breaking existing jobs
[18:05:01] ok, doesn't look to be venv related
[18:05:02] thanks
[18:55:14] Labs, Labs-Sprint-108, Labs-Sprint-109, Labs-Sprint-114, labs-sprint-113: Have catchpoint checks for all labs services (Tracking) - https://phabricator.wikimedia.org/T107058#1660430 (Andrew)
[18:55:15] Labs, Labs-Sprint-114, Labs-Sprint-115, Patch-For-Review: Setup an availability checker for all labsdb hosts - https://phabricator.wikimedia.org/T107449#1660428 (Andrew) Open>Resolved Since there's no default wiki db on postgres, I did not make a read-an-existing-record check for labsdb1004. I...
[19:12:59] (PS1) Jean-Frédéric: Update API link from wlm.wikimedia.org to tool labs [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239910 (https://phabricator.wikimedia.org/T113275)
[19:14:18] (CR) Jean-Frédéric: [C: 2] Update API link from wlm.wikimedia.org to tool labs [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239910 (https://phabricator.wikimedia.org/T113275) (owner: Jean-Frédéric)
[19:14:28] Labs, Labs-Infrastructure: glancesync cron is failing - https://phabricator.wikimedia.org/T112719#1660559 (Andrew) Open>Resolved
[19:19:08] (Merged) jenkins-bot: Update API link from wlm.wikimedia.org to tool labs [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239910 (https://phabricator.wikimedia.org/T113275) (owner: Jean-Frédéric)
[19:26:40] andrewbogott: Are you still here? I will delete the instances now
[19:26:54] Luke081515: yep, still here
[19:27:01] Ok, instances deleted
[19:37:12] Luke081515: ok, try recreating one of them now, we’ll see what happens.
[19:37:54] Labs, Labs-Infrastructure: Creation of instances broken - https://phabricator.wikimedia.org/T113175#1660679 (Luke081515) Ok, the instances are deleted now, I will recreate them tomorrow. Thanks for that tip.
[19:40:33] Ok, then I'll try it now ;)
[19:40:46] well, let’s just do one and see if it works
[19:40:59] created one trusty medium instance
[19:44:35] Labs, Labs-Infrastructure: Creation of instances broken - https://phabricator.wikimedia.org/T113175#1660686 (Luke081515) Open>Resolved a:Luke081515 Works now.
[19:44:46] andrewbogott: Many thanks
[19:45:13] Luke081515: great, as a rule I think 10 minutes between delete and recreate should be enough.
[19:45:28] Ok, I will try to remember that rule ;)
[19:56:48] Labs, Continuous-Integration-Infrastructure, Labs-Infrastructure: integration-slave-trusty-1014 and integration-slave-trusty-1017 instances can't boot anymore, ended up corrupted. Need rebuild - https://phabricator.wikimedia.org/T110052#1660705 (hashar)
[20:10:41] (PS1) Jean-Frédéric: Add unit tests for extractWikilink converter [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239981
[20:10:50] (CR) Jean-Frédéric: [C: 2] Add unit tests for extractWikilink converter [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239981 (owner: Jean-Frédéric)
[20:11:04] (Merged) jenkins-bot: Add unit tests for extractWikilink converter [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239981 (owner: Jean-Frédéric)
[20:15:46] (PS1) Jean-Frédéric: Add converter remove_commons_category_prefix [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239984
[20:15:58] (CR) Jean-Frédéric: [C: 2] Add converter remove_commons_category_prefix [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239984 (owner: Jean-Frédéric)
[20:16:11] (Merged) jenkins-bot: Add converter remove_commons_category_prefix [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239984 (owner: Jean-Frédéric)
[20:22:04] (PS1) Jean-Frédéric: Extract commons category during harvesting for Romania [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239986 (https://phabricator.wikimedia.org/T112414)
[20:24:19] (CR) Jean-Frédéric: [C: 2] Extract commons category during harvesting for Romania [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239986 (https://phabricator.wikimedia.org/T112414) (owner: Jean-Frédéric)
[20:24:34] (Merged) jenkins-bot: Extract commons category during harvesting for Romania [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239986 (https://phabricator.wikimedia.org/T112414) (owner: Jean-Frédéric)
[20:47:05] Labs, Labs-Infrastructure, Continuous-Integration-Scaling, WorkType-NewFunctionality: Investigate non blocking fs resizing when instance is booted - https://phabricator.wikimedia.org/T104974#1660905 (greg)
[20:47:20] wikibugs dying was me
[20:47:31] only adding a new project to 19 tasks
[21:00:44] (PS1) Jean-Frédéric: Fix Commons prefix converter lowercase problem [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239988 (https://phabricator.wikimedia.org/T112414)
[21:01:01] (CR) Jean-Frédéric: [C: 2] Fix Commons prefix converter lowercase problem [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239988 (https://phabricator.wikimedia.org/T112414) (owner: Jean-Frédéric)
[21:01:15] (Merged) jenkins-bot: Fix Commons prefix converter lowercase problem [labs/tools/heritage] - https://gerrit.wikimedia.org/r/239988 (https://phabricator.wikimedia.org/T112414) (owner: Jean-Frédéric)
[21:05:36] hi yuvipanda
[21:12:48] valhallasw`cloud: I just noticed toollabs_p's structure...
[21:12:50] sigh
[21:13:06] maintainers is a space separated field...
[21:27:19] Labs, Tool-Labs, Labs-Infrastructure, Labs-Sprint-115: Can't delete rule in default security group - https://phabricator.wikimedia.org/T112492#1661158 (Andrew) upstream bug: https://bugs.launchpad.net/nova/+bug/1498197
[21:56:35] Labs: Cleanup leftover lucid instances on Labs - https://phabricator.wikimedia.org/T113199#1661315 (JohnLewis) Searching via wikitech, I see: labs-vmbuilder-lucid.openstack.eqiad.wmflabs opengrok-web.opengrok.eqiad.wmflabs conventionextension-trial.conventionextension.eqiad.wmflabs nginx-dev1.nginx.eqiad.wmf...
[22:07:20] Labs: Cleanup leftover lucid instances on Labs - https://phabricator.wikimedia.org/T113199#1661351 (yuvipanda) nginx-dev1 can definitely be killed. @andrew I guess labs-vmbuilder-lucid can also be? opengrok-web is Victor Vasiliev's, and conventionextension-trial is Chughakshay16. Need to reach out to these pe...
[22:20:16] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1202 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:21:52] PROBLEM - Puppet failure on tools-exec-1207 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[22:22:30] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1401 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:24:02] PROBLEM - Puppet failure on tools-exec-1209 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[22:24:56] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1204 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:25:00] PROBLEM - Puppet failure on tools-exec-1203 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[22:25:18] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1402 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:25:19] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1201 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[22:26:17] mutante: ^
[22:26:19] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate declaration: Package[git] is already declared in file /etc/puppet/modules/base/manifests/standard-packages.pp:54; cannot redeclare at /etc/puppet/modules/toollabs/manifests/exec_environ.pp:312 on node tools-webgrid-lighttpd-1402.tools.eqiad.wmflabs
[22:26:29] PROBLEM - Puppet failure on tools-exec-1218 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:26:29] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1203 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:26:57] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1207 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[22:27:01] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1407 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:27:17] PROBLEM - Puppet failure on tools-exec-1215 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[22:27:17] PROBLEM - Puppet failure on tools-exec-1219 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:27:18] haha
[22:27:22] I can't push because gerrit is down
[22:27:27] PROBLEM - Puppet failure on tools-exec-1205 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:28:07] PROBLEM - Puppet failure on tools-exec-1404 is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[22:28:13] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1208 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:28:21] PROBLEM - Puppet failure on tools-exec-1405 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]
[22:29:03] arrr, gerrit is back
[22:29:05] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1406 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:29:07] PROBLEM - Puppet failure on tools-exec-1408 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:29:08] but the gerrit issue is fixed
[22:29:27] PROBLEM - Puppet failure on tools-exec-1204 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[22:31:46] yuvipanda: fixing that now
[22:31:46] PROBLEM - Puppet failure on tools-webgrid-generic-1404 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[22:31:50] PROBLEM - Puppet failure on tools-exec-1211 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[22:31:58] mutante: I sent up a patch
[22:31:58] PROBLEM - Puppet failure on tools-exec-1403 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[22:32:08] mutante: https://gerrit.wikimedia.org/r/#/c/240002/
[22:32:14] mutante: can you merge that while I take care of krrrit-wm
[22:32:51] PROBLEM - Puppet failure on tools-webgrid-generic-1403 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:33:20] yuvipanda: on it, yes
[22:34:05] PROBLEM - Puppet failure on tools-exec-1202 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:35:23] PROBLEM - Puppet failure on tools-exec-1402 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[22:35:36] PROBLEM - Puppet failure on tools-exec-1401 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:35:42] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:36:08] PROBLEM - Puppet failure on tools-exec-1206 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:37:10] PROBLEM - Puppet failure on tools-webgrid-lighttpd-1206 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[22:37:12] PROBLEM - Puppet failure on tools-precise-dev is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:37:13] testing on tools-exec-1206
[22:38:16] PROBLEM - Puppet failure on tools-exec-1407 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[22:38:18] fix confirmed
[22:38:28] recoveries should be soon
[22:39:43] Labs: Cleanup leftover lucid instances on Labs - https://phabricator.wikimedia.org/T113199#1661445 (Andrew) As long as we're never building lucid images ever again, then labs-vmbuilder-lucid can be deleted, yeah.
[22:51:05] RECOVERY - Puppet failure on tools-exec-1206 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:53:17] Labs: can't administer sluggish "Living-style-guide.design" through wikitech, Nova "lost" the design instance - https://phabricator.wikimedia.org/T113214#1661463 (Spage) >>! In T113214#1658132, @scfc wrote: > On https://wikitech.wikimedia.org/wiki/Nova_Resource:Design, you are not listed as an "Admin" for tha...
[22:53:18] RECOVERY - Puppet failure on tools-exec-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[22:59:05] RECOVERY - Puppet failure on tools-exec-1209 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:00:13] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1201 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:00:17] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1202 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:01:26] Labs: project members don't see "Living-style-guide.design" instance in Nova_Resource:Design on wikitech - https://phabricator.wikimedia.org/T113214#1661485 (Spage)
[23:01:27] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1203 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:01:52] RECOVERY - Puppet failure on tools-exec-1207 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:02:16] RECOVERY - Puppet failure on tools-exec-1215 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:02:16] RECOVERY - Puppet failure on tools-exec-1219 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:02:30] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:04:05] RECOVERY - Puppet failure on tools-exec-1408 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:04:53] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1204 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:05:01] RECOVERY - Puppet failure on tools-exec-1203 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:05:18] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:06:28] RECOVERY - Puppet failure on tools-exec-1218 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:06:58] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1207 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:07:02] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1407 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:07:23] RECOVERY - Puppet failure on tools-exec-1205 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:07:51] RECOVERY - Puppet failure on tools-webgrid-generic-1403 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:08:07] RECOVERY - Puppet failure on tools-exec-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:08:14] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1208 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:08:22] RECOVERY - Puppet failure on tools-exec-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:09:16] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1406 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:09:24] RECOVERY - Puppet failure on tools-exec-1204 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:10:26] RECOVERY - Puppet failure on tools-exec-1402 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:11:44] RECOVERY - Puppet failure on tools-webgrid-generic-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:11:52] RECOVERY - Puppet failure on tools-exec-1211 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:12:00] RECOVERY - Puppet failure on tools-exec-1403 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:14:08] RECOVERY - Puppet failure on tools-exec-1202 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:15:34] RECOVERY - Puppet failure on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:15:44] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:17:12] RECOVERY - Puppet failure on tools-precise-dev is OK: OK: Less than 1.00% above the threshold [0.0]
[23:17:12] RECOVERY - Puppet failure on tools-webgrid-lighttpd-1206 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:21:14] Labs, Tool-Labs, Labs-Sprint-115: Decide on Docker image policies for Tool Labs Kubernetes - https://phabricator.wikimedia.org/T112855#1661545 (yuvipanda) So, we'll have: ### Public docker repository ### # Read access to everyone (wide internet) # Write access only to toollabs admins # Contains 'base...