[00:12:41] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:38:42] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[06:32:09] PROBLEM - Puppet run on tools-exec-1419 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:43:42] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:07:11] RECOVERY - Puppet run on tools-exec-1419 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:09:41] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:47:03] PROBLEM - Free space - all mounts on tools-worker-1003 is CRITICAL: CRITICAL: tools.tools-worker-1003.diskspace._var_lib_docker.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace._public_dumps.byte_percentfree (No valid datapoints found) tools.tools-worker-1003.diskspace.root.byte_percentfree (<100.00%)
[11:14:42] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:52:21] Labs, Tool-Labs: /etc/cron.daily/logrotate: gzip: stdin: file size changed while zipping - https://phabricator.wikimedia.org/T96007#2902108 (scfc) I'll submit a basic revert of 93f5a6b53be9f9665804a9546bb656b53ba6e2a8; `/etc/cron.daily/logrotate` can then be restored manually with `clush` by `sudo rm -f...
[12:05:42] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[13:15:44] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:36:41] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:17:46] Labs, Tool-Labs, Patch-For-Review: /etc/cron.daily/logrotate: gzip: stdin: file size changed while zipping - https://phabricator.wikimedia.org/T96007#2902355 (scfc) Correction: The command for `clush` will be `sudo apt-get --reinstall -o Dpkg::Options::=--force-confask -o Dpkg::Options::=--force-conf...
[15:41:41] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:07:42] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[17:12:41] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:13:45] Labs, Puppet: role::puppetmaster::standalone has no firewall rule for port 8140 - https://phabricator.wikimedia.org/T154150#2902431 (scfc)
[17:26:27] Labs, Labs-Sprint-103, Patch-For-Review: decouple role::labs::instance puppet runs from the rest of puppet - https://phabricator.wikimedia.org/T103357#2902448 (scfc) Open>declined >>! In T103357#1452187, @Andrew wrote: > I'm no longer sure this is a great idea... it adds a lot of complexity....
[18:03:59] Labs, Tool-Labs: apache::static_site should provide a way to work without SSL or with automatically created self-signed certificates - https://phabricator.wikimedia.org/T153818#2902535 (scfc) I didn't see anything there, but now I found `base::expose_puppet_certs` which seems to be what I have been looki...
[18:07:14] Labs, Tool-Labs: apache::static_site should provide a way to work without SSL or with automatically created self-signed certificates - https://phabricator.wikimedia.org/T153818#2902537 (scfc)
[19:38:42] PROBLEM - Puppet run on tools-services-01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[20:43:42] RECOVERY - Puppet run on tools-services-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:14:48] Labs, Tool-Labs, community-labs-monitoring: Implement a system to monitor tools on tool-labs - https://phabricator.wikimedia.org/T53434#2902794 (Matthewrbowker) Okay, after some examination here's what I'd like to propose. @yuvipanda this is subject to your OK. I currently have a Labs project set u...