[00:03:52] PROBLEM Current Load is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:04:32] PROBLEM Current Users is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:04:32] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 19% free memory [00:04:57] there's one new compute node [00:05:14] PROBLEM Disk Space is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:05:54] PROBLEM Free ram is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:06:04] you know pretty sure we're going to have so many compute nodes that storage is going to be crappy as hell if we lose one [00:06:18] what do you mean? [00:06:20] storage is local [00:06:33] if we lose a compute node we lose all the instances on it too [00:06:54] I thought you had that on gluster too or is that the one you got rid of due to the hatred of images? [00:07:16] I got rid of that over a year ago ;) [00:07:24] PROBLEM Total processes is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:07:41] totally don't keep track of time [00:07:47] heh [00:07:54] PROBLEM dpkg-check is now: CRITICAL on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: Connection refused by host [00:08:05] puppet takes so fucking long to run [00:08:11] It's ruby *shrug* [00:08:25] it takes 403 seconds to run on a new instance [00:08:36] that's absurd [00:08:40] That's redic [00:08:54] since we're already imaging a new instance should be done in like <2min [00:08:55] RECOVERY Current Load is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: OK - load average: 0.71, 0.92, 0.52 [00:09:34] RECOVERY Current Users is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: USERS OK - 0 users currently logged in [00:09:56] that users logged in check is silly [00:10:07] yeah [00:10:07] agreed [00:10:13] RECOVERY Disk Space is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: DISK OK [00:10:26] maybe I could clone a fully puppetized system [00:10:30] and use that cloned image [00:10:40] keys and stuff could be problematic [00:10:46] all the state is stored locally [00:10:47] what keys? [00:10:53] that's over nfs [00:10:57] on gluster [00:11:03] RECOVERY Free ram is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: OK: 900% free memory [00:11:05] puppet key [00:11:07] ok ssl shiz [00:11:08] ah [00:11:08] right [00:11:28] I'd need to wipe some stuff out before cloning
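A rough sketch of the pre-clone cleanup being discussed here — assuming default Debian/Ubuntu paths for a puppet agent of that era; the exact ssldir can differ:

  # run on the instance before snapshotting it as a base image
  service puppet stop
  rm -rf /var/lib/puppet/ssl        # agent cert/keys; a new cert is requested (and must be signed) on first run
  rm -f /etc/ssh/ssh_host_*         # per-host ssh keys
  dpkg-reconfigure openssh-server   # regenerate host keys now, or defer to first boot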
[00:12:07] hmm hashar's pep8 check doesn't agree with me... I should talk to him about just using make test since that works [00:12:23] RECOVERY Total processes is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: PROCS OK: 84 processes [00:12:53] RECOVERY dpkg-check is now: OK on testing-virt9.pmtpa.wmflabs 10.4.1.74 output: All packages OK [00:12:57] New patchset: DamianZaremba; "I couldn't give a flying monkey turd" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45080 [00:13:40] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45080 [00:13:59] something is still causing really slow logins *sigh* [00:15:23] right.. that's that check gone... just leaves fixing puppet freshness which I have no idea how I can do right now hmm... I could re-write the snmp trap to do a ldap lookup I guess [00:15:41] Damianz: is it? [00:15:45] I'm not seeing slow logins [00:15:56] it hangs after banner for a few seconds randomly for me [00:16:10] oh. you mean slightly slow logins [00:16:13] not like yesterday logins [00:16:20] the ldap servers are slightly overloaded [00:16:25] I mean like - it's not instant, this makes me sad [00:17:00] there's some things I can do to speed up lookups som [00:17:01] *some [00:17:10] probably doesn't help we killed caching [00:17:10] some options in nslcd that didn't exist in nssldap [00:17:19] that's only negative caching we killed [00:17:24] true [00:17:32] hmm like /etc/wmflabs-instancename exists, is there one for region the instance is in? [00:18:10] hm [00:18:12] I guess I could pull it from salt but wow shit is that hacky... [00:18:17] heh [00:18:19] or facter [00:18:24] actually... [00:18:32] I can do this straight in puppet... sorta [00:18:37] puppet knows [00:18:47] custom facter to pull the instance name and then replace the snmptrap call if in realm labs [00:18:48] is it $site? [00:18:50] boom [00:19:09] Also... CAN WE JUST USE ONE GOD DAMN FWDN [00:19:13] s/W/Q/ [00:19:21] what do you mean? [00:19:25] get rid of i-xxx? [00:19:41] i-0000030d.pmtpa.wmflabs nagios-main.pmtpa.wmflabs, we should use the latter [00:19:46] indeed [00:19:48] that's in the plans [00:20:01] for now... let's just hack it up [00:20:02] it's not amazingly easy [00:20:26] have to ensure uniqueness in a bunch of places it didn't exist before [00:20:32] have to immediately delete old keys [00:20:53] It's more fun when you have duplicate hostnames [00:20:56] openstack (in this release) doesn't care about unique hostnames [00:21:06] I think andrew fixed that in grizzly [00:21:18] you can do unique names globally or per project [00:21:38] I really kind of wish we went with ...wmflabs [00:21:40] so if we'd just gone with ..wmflabs :D [00:22:13] we still can, but it's going to be painful [00:22:21] nuke all the things [00:22:30] it's just a matter of changing their dns names [00:22:38] but it's probably going to break a lot of people's stuff [00:23:01] another compute node in [00:23:01] well for a start you gotta update host files or you're gonna break acls, clear caches, ensure people aren't retarded etc [00:23:12] host files? [00:23:27] I don't think the hostname is in the file [00:23:32] oh yeah... you not seen really weird behaviour when your hostname doesn't resolve to 127.0.0.1 [00:23:37] it isn't [00:23:47] actually it's not [00:23:56] if I try that on redhat or with pgsql it freaks out [00:24:00] heh [00:24:03] PROBLEM Current Load is now: CRITICAL on testing-virt10.pmtpa.wmflabs 10.4.0.82 output: Connection refused by host [00:24:08] it's a problem with a lot of java services too [00:24:13] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 20% free memory [00:24:25] java hates everyone [00:24:37] $::site yay
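$::site settled it here, but the custom-fact idea for the instance name also works without writing ruby — a sketch, assuming a Facter version with external facts.d support; the script name is illustrative, the file it reads is the one mentioned above:

  # /etc/facter/facts.d/instancename.sh — must be executable
  #!/bin/sh
  echo "instancename=$(cat /etc/wmflabs-instancename)"

Puppet manifests would then see it as $::instancename, alongside $::site.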
[00:24:40] !log testlabs shutting labs-nfs1 instance down [00:24:42] Logged the message, Master [00:24:51] \o/ [00:24:53] PROBLEM Disk Space is now: CRITICAL on testing-virt10.pmtpa.wmflabs 10.4.0.82 output: Connection refused by host [00:24:53] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 20% free memory [00:25:00] hasn't been any network connection on there for a while [00:25:13] I'm going to kill it in a few days [00:25:20] delete it, that is [00:25:34] I think I may have a celebratory drink when I do [00:25:59] get the 15 year old single malt out? [00:26:17] probably the 18yo bourbon [00:26:30] hmmm bourbon... american [00:26:58] I drink bourbon, rye and scotch [00:27:09] I don't have any scotch on my desk right now, though [00:27:41] Never had rye, I found a nice bottle of Caol Ila next to my Laphroaig earlier though [00:27:44] btw, check out the awesome change: https://gerrit.wikimedia.org/r/#/c/44948/2 [00:27:59] Krenair is great :) [00:28:05] * Damianz tickles Krenair [00:28:08] we're going to have proper notifications! [00:28:24] :D [00:28:27] snmptt is the most retarded bit of software ever [00:28:33] Damianz: yes, it is [00:30:44] hmm I could totally write a facter for this or just cat the file... lets cat the file [00:32:14] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 19% free memory [00:33:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [00:35:31] * Damianz waits [00:36:38] well snmptt is borked but https://gerrit.wikimedia.org/r/#/c/45081/ needs to get merged before it will work anyway [00:37:39] Ryan_Lane: Has anyone ever considered writing a custom puppet reporter module to talk to nagios? it would be far easier, more reliable and not need snmp =\ [00:38:17] well, it's checking to see if puppet ran ;) [00:38:26] if puppet doesn't run, it can't report, right? [00:38:29] oh. right [00:38:31] ... [00:38:33] if it doesn't then it times out [00:38:35] :D [00:38:37] ignore me [00:38:41] * Damianz ignores you [00:38:48] o.o [00:38:52] yes, that would be a better solution than an snmp trap [00:38:57] though you could also make it warn/error in real time if it failed a run hard [00:39:09] indeed. that would be much nicer [00:39:17] https://github.com/DamianZaremba/sentry-puppet/blob/master/lib/puppet/reports/sentry.rb < I send mine to Sentry [00:39:33] and yuck ruby is horrid [00:39:34] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 22% free memory [00:39:38] yep [00:39:54] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 27% free memory [00:40:19] * Damianz gives mimi a cookie [00:40:41] thanks Damianz :o [00:40:42] Damianz: did you see the scheduler in salt 0.12.0? http://docs.saltstack.org/en/latest/topics/releases/0.12.0.html [00:40:56] * mimi eats [00:40:58] like a cron for the network :D [00:41:24] nope but that would be really useful... I was thinking of pushing metrics to graphite via it [00:41:51] pure python dsl renderer could be useful too [00:42:14] I'm hoping the xp build for windows gets fixed... 7/08 works awesomely [00:42:33] is the one in this release not ok? [00:42:36] * mimi brings cake [00:42:39] :o [00:42:41] someone want [00:42:45] A few weeks back it wasn't... I have a bug open for it [00:42:50] Does weird cpu things on xp [00:42:54] ah [00:43:20] https://f.cloud.github.com/assets/142120/50392/892b2fae-59a3-11e2-90e9-604d277f5cd8.png < [00:43:29] hahahaha [00:43:57] I really need group based acls so I can jam it into ldap for users then use peer runs and stuff to do monitoring :D [00:46:08] yep [00:46:11] that would be nice [00:46:24] nicer than nrpe/nsclient++ [00:46:29] indeed [00:46:53] I was trying to use a custom c# scheduler with powershell modules for graphing system metrics on windows... it turns out powershell sucks ass compared to python [00:47:23] heh [00:47:27] just use python, then ;) [00:47:57] Means I have to install it a few hundred times... but if I can push out salt then that's my excuse and I can just use modules
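On the pushing-metrics-to-graphite idea above: carbon's plaintext listener just takes "metric value timestamp" lines on TCP 2003, so a data point can be pushed even without salt. A sketch — the graphite hostname is made up, and -q is the GNU/Debian netcat flag (other nc flavours differ):

  echo "labs.bots-1.load.shortterm 0.71 $(date +%s)" | nc -q1 graphite.pmtpa.wmflabs 2003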
[00:48:39] I'm still learning that real time distributed metric collection is hard when you're polling hundreds of thousands of data points :( [00:49:41] multicast udp ;) [00:51:01] Damianz: merged your change through [00:51:47] Mmmm udp, if only it was a bit more clingy over crappy links [00:52:01] Yay... now just to fix snmptt... maybe tomorrow [00:52:31] I wonder if I can get a commit into production puppet with the line 'couldn't give a flying monkey turd in it'.... I think this should be a goal for a refactor [00:52:50] Krenair: are you ready for that change to be merged in? [00:52:56] I reviewed it and it looks fine to me [00:52:58] no [00:53:00] ok [00:53:14] it works with or without echo enabled, which is nice :) [00:53:15] there's still some stuff missing like preference messages [00:53:17] ah [00:53:18] ok [00:53:35] it does? I was about to go and probably break the labsconsole test to check that :) [00:53:39] we need openstack's gerrit change for "work-in-progress" [00:53:42] I just tested it [00:53:46] never give users choices, just force it upon them! [00:54:12] people yell when you use gerrit for wip [00:54:42] I only have 45 revisions of bots up which sent like 2 messages to ops channel for each :D [00:55:44] That maint script is 100% untested [00:55:55] ah. right [00:56:24] will fit right into mediawiki [00:56:26] * Damianz ducks [00:56:35] oh, shouldn't that be in a separate change? [00:57:25] I developed it all together... Kinda regret it now because it's going to be a pain to split up [00:58:28] well, you can remove the file from the change by doing an amended change [00:59:02] I hate that about gerrit [00:59:19] so, I started using github for something recently [00:59:23] and I *hate* it [00:59:24] Much prefer the feature branch, merge to master workflow... compressing to 1 commit loses so much context [01:00:30] for group workflow github sucks [01:00:36] GH wouldn't work very well for the likes of puppet [01:00:45] I can't modify someone else's pull request [01:00:48] For 1/2/3 people managing a project it's awesome [01:00:49] I have to ask them to change it [01:00:56] I could just stick echo "This script isn't ready for use yet.\n"; die(); at the top [01:01:03] You can pull their branch, modify it and submit a PR [01:01:16] die is evil... [01:01:19] Damianz: then I steal their change [01:01:23] which is bullshit [01:01:32] Not really [01:01:43] Commit history would still be ,,,,, [01:01:51] yes [01:02:22] but the github interface would show someone else as adding the pull-request [01:02:35] dealing with rebases is annoying too [01:02:43] with git review, it'll rebase automatically for you [01:02:51] gerrit has a button to rebase dependencies [01:02:52] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 9% free memory [01:02:57] in github you have to do it manually [01:03:04] Hmm I don't really care who does the PR as long as the code is good [01:03:36] the person with the original pull request may care [01:04:01] rebasing automatically can be dodgy...
if you actually edit the same file over multiple branches and it can't merge [01:04:04] people use github as their personal resume now-a-days [01:04:26] gerrit will tell you if it fails the rebase [01:04:34] My resume is on github, a large proportion of the code there is crap/random and not stuff I use professionally :D [01:04:36] and git review will bring you into a mode to fix it [01:04:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [01:05:53] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [01:07:00] Gerrit could be a lot nicer ux wise with the same workflow... like the github/travisci integration is sweet [01:07:24] Ryan_Lane, if I add that to the top of the maint script would it be okay? [01:07:33] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 19% free memory [01:07:43] PROBLEM Total processes is now: WARNING on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS WARNING: 174 processes [01:08:07] Krenair: well, it would really be better to split the file out. how are you pushing the change in right now? [01:08:32] Damianz: travisci? [01:08:48] I commit amend, run git review. Then to test, kill the branch on nova-precise2 and run git fetch, git checkout, etc. again [01:08:59] Krenair: you can create a new branch, add the file in the other branch [01:09:04] then rm the file from the current branch [01:09:13] then do an amended patchset [01:09:52] hosted ci -> https://travis-ci.org/DamianZaremba/labsnagiosbuilder for example [01:10:06] ah, for people who don't use jenkins? :) [01:10:20] mhm [01:10:22] easier to setup [01:10:26] yeah [01:10:27] jenkins is really a bitch [01:10:51] That sounds difficult [01:10:56] Spent like 30min trying to make it build my site, gave up, wrote 10 lines of bash, stuck it in cron and it just works (tm) [01:11:13] I'm just going to delete it for now. It'll remain in the Git history and I'll keep a backup [01:12:09] git stash ftw!
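Ryan's split-the-file-out recipe above, spelled out as a sketch; the branch and file names are illustrative, and gerrit's commit-msg hook gives the new commit its own Change-Id:

  git checkout -b maint-script origin/master     # new branch for the second change
  git checkout my-change -- maintenance/foo.php  # bring the file over from the original branch
  git commit -m "Split out maintenance script"
  git checkout my-change
  git rm maintenance/foo.php
  git commit --amend                             # amended patchset, minus the file
  git review                                     # run once per branch to push both for review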
[01:12:42] RECOVERY Total processes is now: OK on bots-salebot.pmtpa.wmflabs 10.4.0.163 output: PROCS OK: 97 processes [01:16:19] I need to go now. gnight. [01:18:53] Krenair: good night [01:34:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [02:04:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [02:17:31] Ryan_Lane, did you update that puppet config? [02:17:40] which one? [02:17:45] the bots one? [02:17:49] ya [02:17:49] no. I'd imagine that Damianz did [02:17:59] but it isn't applied currently anyway [02:18:06] you need to manually install the packages for now [02:18:26] well sudo is kinda disabled on that server... xD [02:19:18] which packages is it? I'll install [02:19:33] this is on bots-3? [02:19:43] nr2 [02:20:50] openjdk-7-jdk openjdk-7-jre [02:21:30] installing [02:22:43] done [02:22:50] awesome, thanks :) [02:24:00] yw [02:24:02] PROBLEM Free ram is now: CRITICAL on testing-virt11.pmtpa.wmflabs 10.4.0.82 output: Connection refused by host [02:25:33] PROBLEM Total processes is now: CRITICAL on testing-virt11.pmtpa.wmflabs 10.4.0.82 output: Connection refused by host [02:26:16] \o/ all compute nodes in now [02:27:11] hmm Ryan_Lane i gave you the wrong packages i think cause of errors, can you uninstall those and install openjdk-6-jdk openjdk-6-jre [02:28:01] compatibility does not exist in java [02:28:24] done [02:30:05] still same issue, let me see what i had on bots-1 for the old fbot... [02:34:13] PROBLEM Current Load is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: WARNING - load average: 4.10, 5.65, 5.11 [02:34:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [02:39:12] RECOVERY Current Load is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK - load average: 4.34, 5.03, 4.96 [02:42:11] i guess ill just build it on my laptop [02:55:53] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 20% free memory [03:04:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [03:09:02] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 18% free memory [03:36:23] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [03:55:11] hmm. is bastion having issues? [03:55:33] yes [03:55:42] the ldap change I pushed caused issues [03:55:47] kk [03:55:51] fixing now [03:56:57] ok. working again [03:58:09] the changes should ideally make things faster and cause less load on the ldap servers [04:06:23] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [04:28:04] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 31% free memory [04:33:43] PROBLEM Free ram is now: CRITICAL on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: Critical: 5% free memory [04:36:24] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [04:38:44] RECOVERY Free ram is now: OK on wordpressbeta-precise.pmtpa.wmflabs 10.4.0.215 output: OK: 35% free memory [04:38:54] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 22% free memory [04:41:02] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 19% free memory [04:41:22] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [04:54:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 16% free memory [04:56:53] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [05:07:12] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [05:21:24] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 31% free memory [05:22:34] PROBLEM Total processes is now: WARNING on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS WARNING: 152 processes [05:27:32] RECOVERY Total processes is now: OK on dumps-bot2.pmtpa.wmflabs 10.4.0.60 output: PROCS OK: 148 processes [05:37:14] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable
(10.4.0.13) [06:07:42] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [06:07:52] PROBLEM Current Load is now: WARNING on orgcharts-dev.pmtpa.wmflabs 10.4.0.122 output: WARNING - load average: 6.00, 5.79, 5.37 [06:31:02] PROBLEM Total processes is now: WARNING on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS WARNING: 155 processes [06:35:53] RECOVERY Total processes is now: OK on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS OK: 147 processes [06:37:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [06:44:48] petan, around? :) [06:53:53] PROBLEM dpkg-check is now: CRITICAL on deployment-cache-mobile01.pmtpa.wmflabs 10.4.1.82 output: DPKG CRITICAL dpkg reports broken packages [06:58:53] RECOVERY dpkg-check is now: OK on deployment-cache-mobile01.pmtpa.wmflabs 10.4.1.82 output: All packages OK [07:09:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [07:32:02] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 21% free memory [07:39:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [07:51:02] addshore sure [07:51:26] any chance I can have a mysql account e.t.c :) [07:51:28] on bots [07:51:31] sure [07:51:33] on which server [07:51:45] what are my choices? [07:51:54] @labs-resolve bots-sql [07:51:54] I don't know this instance - aren't you are looking for: I-000000af (bots-sql2), I-000000b4 (bots-sql3), I-000000b5 (bots-sql1), [07:52:44] 1 or 3 I think (they have more ram) ;p [07:52:57] ok, but sql2 has more storage :P [07:53:10] I shouldn't need the space :) [07:53:22] doubt i will ever use over 10mb [07:53:22] xD [07:53:23] 3 is better ;) [07:53:33] * addshore would like 3 then :) [07:53:35] petan: is deployment-prep-backup sql instance doing stuff still? [07:53:59] Ryan_Lane not much, but it still holds a lot of db backups [07:54:03] ok [07:54:07] for whole beta [07:54:13] I need to move it from virt6 to another host [07:54:20] should be ok to shut it down, right? [07:54:20] no problem, you can even shut it down [07:54:29] sure [07:54:36] does it just keep backup files? [07:54:51] sometimes I run backup script which clones the beta sql in there [07:55:01] so it also runs mysql server [07:55:04] if so, any reason not to keep the files in gluster? [07:55:04] ah [07:55:16] it actually runs the db too [07:55:16] ok [07:55:17] I don't copy files, I use sql commands for that [07:55:23] * Ryan_Lane nods [07:55:28] makes sense [07:55:32] ok. I'll move that tomorrow [07:55:37] copying files on the fly is something btrfs can do, but not ext3 :P [07:55:38] we're getting low on space on virt6 [07:55:57] like, they are huge and being modified when you copy them [07:56:13] * Ryan_Lane nods [07:56:42] I guess purging nscd cache on every system isn't going to help the load of the LDAP server any [07:56:47] heh [07:56:59] oh well, it should help when they fill back up [07:57:12] stupid nscd has to be fully purged when you change most of its settings [08:00:00] @search phpmy [08:00:00] No results were found, remember, the bot is searching through content of keys and their names [08:00:02] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [08:00:32] 87% and 57% hitrate on bastion1. up from 1% and 0% [08:01:25] addshore did you receive my pm?
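Those hit rates come straight from nscd's own counters; they can be checked on any instance with (root needed):

  nscd -g    # prints per-database statistics, including the passwd/group cache hit rates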
[08:04:42] Damianz: thanks for your fix of the instance names in the nagios notifications \O/ [08:04:52] nom [08:05:00] snmptt is still broken though :( [08:05:26] Damianz: is the Nagios host receiving any traps at all? [08:05:48] port 162 udp [08:06:07] so you could tcpdump and find out if the nagios is at least receiving them [08:06:37] Oh it's receiving them, snmptt is just not calling the script to submit passive results... didn't get time to fix it last night [08:07:41] ahh [08:08:46] Should get chance to sort it today maybe... hmm, just pushed a change to use vars for the instance name in puppet also [08:11:09] merged [08:11:33] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [08:11:42] * Damianz pats Ryan_Lane [08:20:53] PROBLEM Free ram is now: WARNING on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Warning: 19% free memory [08:22:13] I hate java [08:22:17] period [08:39:23] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [08:39:53] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 22% free memory [08:40:54] RECOVERY Free ram is now: OK on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: OK: 21% free memory [08:41:34] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [08:43:34] hashar +1 [08:44:03] I hate oracle, java is just an arm [08:52:24] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 16% free memory [08:53:54] PROBLEM Free ram is now: WARNING on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Warning: 18% free memory [09:04:23] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 15% free memory [09:12:12] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [09:13:52] Damianz: if you are still around, do you know if we have a nagios check for ircecho bot ?
[09:14:36] misc::ircecho apparently does not have any [09:17:54] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [09:18:48] ahh [09:20:52] https://gerrit.wikimedia.org/r/#/c/45097/ :-] [09:29:02] PROBLEM Current Load is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: Connection refused by host [09:29:52] PROBLEM Disk Space is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: Connection refused by host [09:30:43] PROBLEM Free ram is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: Connection refused by host [09:32:13] PROBLEM Total processes is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: Connection refused by host [09:33:03] PROBLEM dpkg-check is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: Connection refused by host [09:34:03] RECOVERY Current Load is now: OK on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: OK - load average: 0.66, 0.90, 0.50 [09:34:53] RECOVERY Disk Space is now: OK on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: DISK OK [09:35:43] RECOVERY Free ram is now: OK on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: OK: 2853% free memory [09:37:13] RECOVERY Total processes is now: OK on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: PROCS OK: 100 processes [09:38:03] RECOVERY dpkg-check is now: OK on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: All packages OK [09:42:13] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [10:03:03] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 21% free memory [10:12:13] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [10:14:33] PROBLEM Total processes is now: WARNING on etherpad-lite.pmtpa.wmflabs 10.4.0.87 output: PROCS WARNING: 151 processes [10:16:02] PROBLEM dpkg-check is now: CRITICAL on wikidata-testclient.pmtpa.wmflabs 10.4.0.23 output: DPKG CRITICAL dpkg reports broken packages [10:36:03] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [10:42:42] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [11:12:42] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [11:42:42] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [12:12:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [12:37:24] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [12:39:24] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 21% free memory [12:41:03] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 22% free memory [12:42:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [13:04:03] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 19% free memory [13:05:43] PROBLEM Free ram is now: WARNING on dumps-bot1.pmtpa.wmflabs 10.4.0.4 output: Warning: 19% free memory [13:10:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 15% free memory [13:12:22] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 12% free 
memory [13:14:02] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [13:36:32] PROBLEM dpkg-check is now: CRITICAL on extrev1.pmtpa.wmflabs 10.4.0.210 output: DPKG CRITICAL dpkg reports broken packages [13:38:53] PROBLEM dpkg-check is now: CRITICAL on grail.pmtpa.wmflabs 10.4.0.239 output: DPKG CRITICAL dpkg reports broken packages [13:44:44] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [14:14:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [14:44:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [15:08:53] PROBLEM Free ram is now: CRITICAL on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: Critical: 5% free memory [15:16:34] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [15:18:54] RECOVERY Free ram is now: OK on bots-nr1.pmtpa.wmflabs 10.4.1.2 output: OK: 133% free memory [15:22:22] PROBLEM Current Load is now: WARNING on etherpad-lite.pmtpa.wmflabs 10.4.0.87 output: WARNING - load average: 6.51, 6.46, 5.58 [15:39:32] PROBLEM Total processes is now: CRITICAL on etherpad-lite.pmtpa.wmflabs 10.4.0.87 output: PROCS CRITICAL: 211 processes [15:41:05] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory [15:47:13] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [16:08:57] Ryan_Lane, I think https://gerrit.wikimedia.org/r/#/c/44948/ is ready for review now, unless you want reboot/build split out into separate changes [16:17:42] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [16:37:22] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [16:39:02] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 21% free memory [16:40:22] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [16:41:02] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 23% free memory [16:47:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [16:58:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 15% free memory [17:05:22] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 12% free memory [17:07:02] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 18% free memory [17:17:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [17:29:02] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory [17:31:53] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 20% free memory [17:39:04] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 23% free memory [17:47:43] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [18:17:44] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [18:19:53] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 18% free memory [18:27:04] PROBLEM Free ram is now: WARNING on 
swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory [18:49:49] bleh hashar isn't around [18:49:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [18:49:54] we need a memo function [19:13:43] PROBLEM Disk Space is now: CRITICAL on kubo.pmtpa.wmflabs 10.4.0.19 output: DISK CRITICAL - free space: /mnt 556 MB (2% inode=95%): [19:19:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [19:34:44] PROBLEM Current Load is now: WARNING on parsoid-roundtrip7-8core.pmtpa.wmflabs 10.4.1.26 output: WARNING - load average: 7.96, 7.59, 5.86 [19:49:52] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [19:56:42] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 9.77, 7.48, 5.81 [20:00:23] andrewbogott: hey, let's discuss https://bugzilla.wikimedia.org/show_bug.cgi?id=39788 [20:01:11] Ryan_Lane: on IRC for eavesdroppers or do you want me to come stand next to you? [20:01:49] on here is good [20:02:09] it would be nice to tackle this bug because having to use a password is annoying and a little insecure [20:02:22] there's only one big issue here [20:02:22] It's a pretty tiny change isn't it? [20:02:32] krb! [20:02:38] we need to modify all sudo policies and need to modify OpenStackManager [20:02:40] ok sledge hammer... nut... [20:02:43] we use "ALL" for users [20:02:43] PROBLEM Current Load is now: WARNING on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: WARNING - load average: 5.18, 6.48, 5.77 [20:02:46] that's bad [20:02:55] it's fine with passwords, but incredibly dangerous without [20:03:05] Damianz: indeed :) [20:03:12] Damianz: and that would still require a password [20:03:14] you can pass ldap groups to sudo policy not too hard [20:03:19] Damianz: yep [20:03:23] well yeah but I could kinit locally and use my ticket on labs [20:03:24] :D [20:03:29] I trust my own pc [20:03:30] Ryan_Lane: When you say 'ALL' you mean that we permit all possible commands? [20:03:41] from all users [20:03:41] so, we need to modify sudo policies from using "ALL" for users to "project-group" [20:03:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 4.86, 5.66, 5.11 [20:03:54] well, whatever the proper format for groups is [20:03:57] simple puppet change *boom* [20:04:03] it's not a puppet change [20:04:05] this is in LDAP [20:04:22] and is per-project [20:04:31] yeah... you suck for using sudo-ldap :D
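For context, the shape of the change being discussed: in sudo-ldap a policy is an LDAP entry, so swapping ALL for the project group plus a no-password option might look something like this — the DN and group name are illustrative, not the real labs layout:

  # was: sudoUser: ALL — which matches system users too
  dn: cn=default,ou=sudoers,cn=bots,ou=projects,dc=wikimedia,dc=org
  sudoUser: %project-bots
  sudoHost: ALL
  sudoCommand: ALL
  # !authenticate is sudo-ldap's equivalent of NOPASSWD
  sudoOption: !authenticate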
[20:04:44] it was the much better solution available [20:05:01] suggest something better that allows user management of policies per-project ;) [20:05:27] I was thinking you could use puppet to update /etc/sudoers for the ldap acls for the group statically, then as soon as they're a member they have access etc [20:05:35] that's what we used to do [20:05:38] When you introduce 'open' projects I'll troll your pain [20:05:42] that assumes instances properly run puppet [20:06:16] not even all of the instances keep connected to salt [20:06:19] which is the alternative [20:06:34] they all have to connect to ldap, or they are broken ;) [20:06:48] it also allows completely instance changes [20:06:51] *instant [20:07:04] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 21% free memory [20:07:10] anyway, we're not discussing changing that right now :) [20:07:21] andrewbogott: basically, the change is three part [20:07:25] Just for my security education… can you back up a bit and explain why enabling it for all users is bad? Is it because it gives permissions to an account that was created outside of ldap? [20:07:31] ah. ok [20:07:40] so, ALL includes system users [20:07:51] with a password it's fine. they don't have passwords [20:07:53] they can't sudo [20:08:21] system users, ok. [20:08:21] with no password, now any web application shell execution vulnerability is also a remote root vulnerability [20:08:24] andrewbogott: lunch? [20:08:27] yep! [20:08:36] ok. we'll bring this back up when back [20:08:53] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 2.65, 4.31, 4.74 [20:14:46] well instant - cache :D [20:15:44] I'm having problems pushing to Gerrit from a labs instance. I have agent forwarding enabled and I'm using ssh -A from bastion yet I'm still receiving "Permission denied (publickey)" [20:16:24] what does `ssh-agent` say [20:16:30] MaxSem: can you check "ssh-add -l"? [20:16:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 7.00, 6.13, 5.38 [20:16:55] saper, locally or on bastion? [20:17:11] MaxSem: labs instances, and then backwards till it shows the key [20:17:53] MaxSem: second question, using screen(1) or something like that? [20:18:14] or script(1)? or anything that changes/removes the environment like "env -" ? [20:18:16] no [20:18:51] ssh to the gerrit port and not port 22 (I know stupid, but happens) [20:19:28] mhm, I see my key in bastion's ssh-add -l [20:19:53] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [20:20:35] try ssh gerrit.wikimedia.org:29418 [20:21:07] Oh it's ssh -p 29418 gerrit.wikimedia.org, isn't it... [20:21:55] MaxSem: I get ssh: connect to host gerrit.wikimedia.org port 22: No route to host [20:22:01] mhm [20:22:13] ah ok [20:22:21] **** Welcome to Gerrit Code Review **** [20:22:21] maxsem@mobile-osm:/var/lib/git/operations/puppet$ ssh-add -l [20:22:21] Could not open a connection to your authentication agent. [20:22:41] saper@bastion1:~$ more ~/.ssh/config [20:22:41] Host gerrit-dev [20:22:41] ForwardAgent yes [20:22:55] MaxSem: that was the only change required in my case [20:22:58] maxsem@mobile-osm:/var/lib/git/operations/puppet$ ssh-agent [20:22:58] mkdtemp: private socket dir: No space left on device [20:23:35] eee [20:24:08] why run ssh-agent on the instance?
saper, ForwardAgent yes is the same as ssh -A which I'm using [20:24:21] MaxSem: but that does not work for some reason [20:24:25] Oh, are you sure you're pushing to USERNAME@gerrit.wm.o? [20:24:47] saper, because MaxSem: labs instances, and then backwards till it shows the key [20:24:48] different usernames - I don't think since auth is the same? [20:25:11] MaxSem: but ssh-agent should only run on your local computer, neither bastion nor labs [20:25:29] So for example I need to do krenair@gerrit.wikimedia.org [20:25:42] Unless I update my .ssh/config [20:25:51] Krenair: and there is a different username to bastion in your case? [20:25:56] saper, ssh-add -l on bastion shows that agent forwarding works [20:26:40] Same username in labs actually. I just found that I can't connect to gerrit from my home computer (username alex) without specifying the remote username [20:27:15] Actually thinking that through, my suggestion is probably nonsense. Ignore me. [20:27:20] MaxSem: only to bastion... next step is from bastion to the instance (ssh -A instance or .ssh/config with ForwardAgent) [20:27:52] Krenair: yeah, you need only ssh krenair@bastion.wmflabs.org, it should be smooth from there [20:30:02] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory [20:30:07] i got several emails in the last days saying that a process of mine was killed but i don't understand why [20:33:37] saper, I'm already doing ssh -A, no effect [20:38:23] RECOVERY Free ram is now: OK on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: OK: 21% free memory [20:38:38] one thing maybe related to this No space left on device problem... [20:39:03] on what box? [20:39:47] MaxSem: SSH_AUTH_SOCK on labs instance should show /tmp/ssh-SOMETHING/agent.something - is it set? does it exist? if /tmp/ is full it might have failed to create /tmp/ssh-SOMETHING directory
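A sketch of the client-side setup this all boils down to — agent forwarding through bastion, or the proxycommand approach mentioned just below, which skips forwarding entirely; host patterns are illustrative:

  # ~/.ssh/config on your own machine — the agent should only ever run there
  Host bastion.wmflabs.org
      ForwardAgent yes
  # or tunnel straight through, with no agent exposure on bastion at all
  # (-W needs OpenSSH 5.4+; older clients can use: ProxyCommand ssh bastion.wmflabs.org nc %h %p)
  Host *.pmtpa.wmflabs
      ProxyCommand ssh -a -W %h:%p bastion.wmflabs.org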
[20:39:53] RECOVERY Free ram is now: OK on swift-be4.pmtpa.wmflabs 10.4.0.127 output: OK: 21% free memory [20:40:12] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 22% free memory [20:40:22] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 21% free memory [20:43:21] meh [20:43:30] maxsem@mobile-osm:~$ env|grep -i ssh [20:43:30] SSH_CLIENT=10.4.0.54 51288 22 [20:43:30] SSH_TTY=/dev/pts/0 [20:43:30] SSH_CONNECTION=10.4.0.54 51288 10.4.0.226 22 [20:51:33] PROBLEM host: labs-nfs1.pmtpa.wmflabs is DOWN address: 10.4.0.13 CRITICAL - Host Unreachable (10.4.0.13) [20:52:14] A better way to handle repos is to push changes into gerrit, then pull them onto your instance [20:52:23] agent forwarding to instances past gerrit is dangerous [20:52:26] err [20:52:29] past bastion [20:52:35] in fact, I've considered disabling it [20:52:53] PROBLEM Free ram is now: WARNING on swift-be4.pmtpa.wmflabs 10.4.0.127 output: Warning: 18% free memory [20:53:04] agent forwarding is dangerous :) [20:53:12] you can also push and pull directly to/from repos using proxycommand [20:53:15] gerrit hurts for that [20:53:15] there you go [20:53:24] I develop locally, and push into gerrit, then pull [20:53:40] sometimes I push directly into the repo, though, if I'm unsure I'm going to push my change in [20:53:41] New patchset: DamianZaremba; "RYAN IS A MURDER YAY" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45230 [20:53:42] but labs is used for dev sometimes [20:53:47] you can set up multiple remotes for that [20:54:26] git remote set-url labs my-instance.pmtpa.wmflabs:/mnt/myrepo [20:54:27] New patchset: DamianZaremba; "RYAN IS A MURDERER YAY" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45230 [20:54:34] o.O [20:54:42] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45230 [20:54:46] boed [20:54:52] s/oe/ore/ [20:54:58] heh [20:55:06] yeah. I killed labs-nfs1 [20:55:11] going to delete it soonish [20:55:17] my list of 'I don't give a shit' instances grows once more [20:55:19] then I'm going to celebrate [20:56:22] PROBLEM Free ram is now: WARNING on bots-sql2.pmtpa.wmflabs 10.4.0.41 output: Warning: 16% free memory [20:56:35] 20....30...40 LOG ME IN [20:56:40] I hate your servers [20:57:13] I swear it's storage mounting that's slow [20:57:19] Damianz: it is [20:57:25] :( [20:57:26] I could change the timeout [20:57:31] Damianz: there was an issue with ldap [20:57:36] I saw [20:57:55] it's way faster right now than it was [20:58:06] If this is fast I don't wanna know slow [20:58:10] :D [20:58:15] which instance is this slow? [20:58:25] nagios... no one logs in regularly so it unmounts [20:58:44] it takes less than 2 seconds to log into bastion [20:58:44] ah [20:58:44] yeah [20:58:44] that's the reason, then [20:59:03] bots is fast... well ~3/4 seconds [20:59:04] (less than 2 seconds to log into bastion, when my latency is 80 ms) [20:59:17] it's going to be slower for you, since you're in europe [20:59:20] the stupid banner causes it to be slower [20:59:23] yep [20:59:29] but the banner is necessary [20:59:37] really [20:59:41] isn't RTFM obvious [21:00:01] the number of access questions I get went down dramatically after adding the banner [21:00:21] sounds like a gap in new users workflow... [21:00:28] also hashar's change makes no sense [21:00:32] which change?
https://gerrit.wikimedia.org/r/#/c/45097/ [21:00:59] check_procs_generic is local not nrpe [21:01:08] as far as I can see in this config anyway [21:01:46] that would need to be nrpe [21:01:48] the reason nrpe_local is so huge is a generic check_procs would require remote command arguments... yay to getting hackorzed [21:01:52] there's no way for the master to check that [21:02:23] I wish nrpe_local could be built from the monitoring_service calls... so adding that to a host can then be collected in its config [21:02:28] don't think you can with puppet though [21:02:43] exported resources? [21:02:45] PROBLEM Current Load is now: CRITICAL on orgcharts-dev.pmtpa.wmflabs 10.4.0.122 output: CRITICAL - load average: 20.93, 20.42, 20.12 [21:02:53] @labs-info i-000000e8 [21:02:53] [Name i-000000e8 doesn't exist but resolves to I-000000e8] I-000000e8 is Nova Instance with name: bots-4, host: virt6, IP: 10.4.0.64 of type: m1.small, with number of CPUs: 1, RAM of this size: 2048M, member of project: bots, size of storage: 30 and with image ID: lucid-server-cloudimg-amd64.img [21:02:59] mhm [21:03:05] I've completely disabled exported resources in labs, though [21:03:15] oh yeah I know, which is why nagios does other stuff [21:03:21] to speed up the puppet server and the instance build process [21:03:24] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 12% free memory [21:03:45] an initial puppet run takes 6 minutes [21:03:48] *6* minutes [21:03:49] slow [21:03:57] yes. it's absurd [21:04:06] probably 4min is ruby eating ram [21:04:12] probably [21:04:32] modules might help though... less code loaded for the state generation [21:04:55] https://www.facebook.com/wikipedia/posts/10151342422383346 [21:05:02] "No idea what this means. Yay." [21:05:17] Damianz: yeah. modules would help with lots of things [21:05:22] ENVS! [21:05:46] ue[ [21:05:48] *yep [21:06:15] Hmm I dunno how you did it... I'd have done the new site as dr, tripped a failover then reversed the replication so you know failover works and stuff isn't inter-dependent [21:10:31] I didn't [21:10:36] asher, mark and peter did [21:11:43] :P [21:12:06] I hate our puppet repo [21:12:17] so many duplicates with just patch changes PUPPET HAS VARS FOR A REASON [21:13:41] Ryan_Lane: So, you were saying: …the change is three part [21:13:55] andrewbogott: ah. right [21:14:03] bleh [21:14:09] how do you review someone else's code -.- [21:14:18] Damianz: in gerrit? [21:14:39] mhm [21:14:52] andrewbogott: 1. modify openstackmanager to use project-group rather than ALL for users [21:15:04] 2. Modify all sudo policies to use project-group rather than ALL [21:15:25] 3. modify all sudo policies to allow nopasswd [21:15:40] I guess also have that as an option in the interface [21:15:45] Damianz: click review? [21:15:48] -.- [21:15:54] Ryan_Lane: per-project, you mean? [21:15:55] I'm not sure what you mean [21:15:57] No... I want to edit their code [21:15:59] andrewbogott: yep [21:16:02] fuck it I'll just make a new change [21:16:03] Damianz: pull their change [21:16:07] I did [21:16:07] Damianz: edit it [21:16:09] I did [21:16:10] amended commit [21:16:12] I did [21:16:15] git review [21:16:17] review says no different [21:16:58] No changes between HEAD and gerrit/production. [21:16:59] Submitting for review would be pointless. [21:16:59] and !
[remote rejected] HEAD -> refs/for/review (branch review not found) [21:17:04] I dislike gerrit today [21:17:09] oh [21:17:15] you're trying to push into a branch [21:17:22] called review [21:17:51] that normally gets around git review being stupid [21:17:56] or maybe I'm just remembering it wrong [21:18:03] probably remembering it wrong [21:18:12] git push origin production HEAD:refs/for/production works [21:18:13] yeah [21:18:14] branch [21:18:16] derp [21:18:18] heh [21:18:19] review still sucks [21:18:24] ! [rejected] production -> production (non-fast-forward) [21:18:24] There are multiple keys, refine your input: !log, $realm, $site, *, :), access, account, account-questions, accountreq, addresses, addshore, afk, alert, amend, ask, b, bang, bastion, beta, blehlogging, blueprint-dns, bot, botrestart, bots, botsdocs, broken, bug, bz, cmds, console, cookies, credentials, cs, damianz, damianz's-reset, db, del, demon, deployment-beta-docs-1, deployment-prep, docs, documentation, domain, epad, etherpad, extension, -f, forwarding, gerrit, gerritsearch, gerrit-wm, ghsh, git, git-branches, git-puppet, gitweb, google, group, hashar, help, hexmode, home, htmllogs, hyperon, info, initial-login, instance, instance-json, instancelist, instanceproject, keys, labs, labsconf, labsconsole, labsconsole.wiki, labs-home-wm, labs-morebots, labs-nagios-wm, labs-project, labswiki, leslie's-reset, link, linux, load, load-all, logs, mac, magic, mail, manage-projects, meh, mobile-cache, monitor, morebots, msys, msys-git, nagios, nagios.wmflabs.org, nagios-fix, newgrp, new-labsuser, new-ldapuser, nova-resource, op_on_duty, openstack-manager, origin/test, os-change, osm-bug, pageant, password, pastebin, pathconflict, petan, ping, pl, pong, port-forwarding, project-access, project-discuss, projects, puppet, puppetmaster::self, puppetmasterself, puppet-variables, putty, pxe, python, q1, queue, quilt, report, requests, resource, revision, rights, rt, Ryan, ryanland, sal, SAL, say, search, security, security-groups, sexytime, single-node-mediawiki, socks-proxy, ssh, sshkey, start, stucked, sudo, sudo-policies, sudo-policy, svn, terminology, test, Thehelpfulone, tunnel, unicorn, whatIwant, whitespace, wiki, wikitech, wikiversity-sandbox, windows, wl, wm-bot, [21:18:24] ARGH [21:18:28] :D [21:18:29] PRs are far simpler [21:18:32] wm-bot: DIAF [21:18:32] Hi Damianz, there is some error, I am a stupid bot and I am not intelligent enough to hold a conversation with you :-) [21:18:34] I disagree [21:18:54] I've been using github for the past week and I fucking hate PRs [21:19:01] I can't modify someone else's PR at all [21:19:13] I can't modify someone else's gerrit change at all [21:19:15] what you're currently trying in gerrit doesn't exist at all in gerrit [21:19:17] err [21:19:22] doesn't exist in github [21:19:29] I do it all the time [21:19:35] I just did it like two hours ago [21:19:48] I cherry-picked in the change [21:19:52] then edited the files [21:19:55] then amended [21:19:58] meh. I ended up exposing the git checkout via http [21:19:58] then did git review [21:20:08] I could cherry pick your git pr into my git pr [21:20:15] MaxSem: why not use proxycommand? [21:20:23] and push or pull from it directly? [21:20:24] also this way [21:20:25] Author: Antoine Musso [21:20:28] is no longer true [21:20:40] yeah that is me [21:20:47] Damianz: it's a multi-owner patchset now
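The working sequence for picking up someone else's change, recapped as a sketch — the patchset number in the ref is illustrative, and gerrit matches the amended commit back to the change via its Change-Id footer:

  git fetch gerrit refs/changes/97/45097/1 && git checkout FETCH_HEAD
  # edit, then:
  git commit --amend                          # keep the Change-Id line intact
  git push gerrit HEAD:refs/for/production    # or let git-review do the push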
[21:21:05] -.- [21:21:07] it's possible to abandon his change [21:21:09] and make your own [21:21:21] hashar: Btw can I have gerrit do make test rather than pep [21:21:31] Ryan_Lane: Oh you mean like on github? :D [21:21:45] Damianz: yes. notice I didn't recommend it [21:22:33] PROBLEM Free ram is now: CRITICAL on bots-4.pmtpa.wmflabs 10.4.0.64 output: Critical: 5% free memory [21:22:40] Damianz: i am not there for the rest of the week, do fill a bug please and I will do that later on :/ [21:22:53] hashar: aww, haz funz [21:23:04] 'Submitting for review would be pointless.' git review has life summed up [21:23:05] Damianz: if you want to change a commit author, git commit --amend --author="John Doe " [21:23:25] andrewbogott: the changes shouldn't be too bad, it's just a little laborious [21:23:26] Yeah... I don't mind, just want gerrit to admit I changed the code and make a new patch set [21:23:42] This way you get the bug reports [21:23:53] Where as in the github module we'd both get them for the code we touched [21:24:03] Ryan_Lane: Yep. Step one (and the interface) I think I know how to do… I'll check in with you before I start actually changing ldap. [21:24:09] cool [21:24:10] thanks [21:24:18] I think this will make a lot of people happy :) [21:25:16] It'll reduce IRC questions by 10%. "What is my sudo password?" [21:25:24] yep [21:25:30] I also hate typing my password [21:25:32] :) [21:25:41] andrewbogott: We could store them in plaintext then be able to answer that question :D [21:25:44] well copy/pasting my password from my database to be more accurate [21:26:26] git push gerrit HEAD:refs/changes/45097 < That was totally obuvios [21:28:24] I dunno [21:28:28] git review just makes this work [21:28:35] or just git push gerrit HEAD:refs/for/production [21:28:40] or sometimes it doesn't [21:28:44] also, as long as the changeid is in the change it should just work [21:28:48] and gerrit will figure out the change based on the Change-Id: field in your commit message [21:28:53] need to update from pip anyway [21:29:01] maybe that's the issue [21:29:02] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory [21:29:16] I don't know. I rarely have troubles with git review [21:29:25] chad hates git review, though [21:29:26] bed time *waves* [21:29:29] so maybe I'm alone there [21:29:31] hashar: see ya [21:29:40] I use git-review myself :-D [21:29:49] and saper contributes to git-review upstream [21:29:52] so you are not alone! [21:30:00] Chad likes java so hating git must be a whole other level of rage [21:30:34] Ryan_Lane: https://gerrit.wikimedia.org/r/45097 make more sense to you? [21:32:39] New patchset: DamianZaremba; "Adding irc echo service - for when 45097 gets merged" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45238 [21:33:23] Damianz: yes [21:33:33] is that going to work? [21:34:48] is what gonna work [21:35:59] the check for ircecho [21:36:00] puppetClass: misc::ircecho < should work for labs, prod I guss so - everything else works [21:36:03] it's a pythin script that's run [21:36:09] *python [21:36:14] I'm not sure how that nagios check works [21:36:32] does it check for any process that has that in the command? [21:37:43] root@i-000000d2:~# ps aux | grep ircecho | grep -v grep [21:37:43] nobody 1713 0.0 0.4 61600 8700 ? 
Sl 2012 59:29 python /usr/ircecho/bin/ircecho --infile=/var/log/logmsg #wikimedia-labs beta-logmsgbot irc.freenode.net [21:37:46] root@i-000000d2:~# /usr/lib/nagios/plugins/check_procs -w 1:3 -c 1:20 -a ircecho [21:37:49] PROCS OK: 1 process with args 'ircecho' [21:37:52] root@i-000000d2:~# /etc/init.d/ircecho stop [21:37:55] root@i-000000d2:~# ps aux | grep ircecho | grep -v grep [21:37:56] ah, so it just does a grep [21:37:57] root@i-000000d2:~# /usr/lib/nagios/plugins/check_procs -w 1:3 -c 1:20 -a ircecho [21:37:57] ok [21:38:00] PROCS CRITICAL: 0 processes with args 'ircecho' [21:38:09] ok, this change is good, then [21:38:25] want a merge? [21:38:41] sure [21:38:58] then I can enable my shiz on labs and prove the theory that it works :D [21:38:58] done [21:39:04] awesome [21:39:40] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45238 [21:44:27] New patchset: DamianZaremba; "Wrong class fix" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45241 [21:45:18] * Damianz looks at chad [21:45:28] he should be in here so I can yell without changing channel [21:45:46] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45241 [21:46:45] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 6.07, 4.56, 4.87 [21:46:45] file written ok... service added... lets see if it checks fine... should do then we're golden [21:48:16] http://nagios.wmflabs.org/cgi-bin/nagios3/extinfo.cgi?type=2&host=deployment-dbdump.pmtpa.wmflabs&service=IRC+Echo+Process < yay [21:48:29] petan hashar fyi ^ [21:51:52] PROBLEM IRC Echo Process is now: CRITICAL on bots-labs.pmtpa.wmflabs 10.4.0.75 output: NRPE: Command check_ircecho not defined [21:51:52] PROBLEM IRC Echo Process is now: WARNING on nagios-main.pmtpa.wmflabs 10.4.0.120 output: PROCS WARNING: 4 processes with args ircecho [21:52:04] yeah, yeah puppet hasn't run yet [21:52:15] I'm impatient [21:53:39] Hello late night folks! :) [21:57:03] late night? it's like 22:00 or well I guess 23 for germany [21:57:19] Silke_WMDE: howdy [21:57:55] :) [21:58:12] * Silke_WMDE had some wine already [21:58:23] PROBLEM Free ram is now: CRITICAL on sube.pmtpa.wmflabs 10.4.0.245 output: Connection refused by host [21:59:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 6.25, 5.79, 5.29 [22:04:52] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 3.48, 4.64, 4.98 [22:06:52] RECOVERY IRC Echo Process is now: OK on bots-labs.pmtpa.wmflabs 10.4.0.75 output: PROCS OK: 3 processes with args ircecho [22:07:43] Damianz: yes 23 [22:16:13] RECOVERY Current Load is now: OK on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 3.37, 4.27, 4.90 [22:23:24] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 10% free memory [22:23:31] What's missing: Speech input for git.
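The ircecho check above is an active NRPE check; the snmptt half that's still broken is the passive side. Per the earlier discussion, traps can be confirmed on the wire with tcpdump, and the trap handler ultimately just has to write a passive result into nagios's external command file. A sketch, with illustrative host/service names and the stock Debian nagios3 path:

  tcpdump -ni any udp port 162    # confirm the traps are even arriving
  # submit a passive service check result (0=OK 1=WARNING 2=CRITICAL)
  echo "[$(date +%s)] PROCESS_SERVICE_CHECK_RESULT;bots-labs.pmtpa.wmflabs;Puppet freshness;0;puppet ran" \
    > /var/lib/nagios3/rw/nagios.cmd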
[21:38:25] want a merge?
[21:38:41] sure
[21:38:58] then I can enable my shiz on labs and prove the theory that it works :D
[21:38:58] done
[21:39:04] awesome
[21:39:40] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45238
[21:44:27] New patchset: DamianZaremba; "Wrong class fix" [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45241
[21:45:18] * Damianz looks at chad
[21:45:28] he should be in here so I can yell without changing channel
[21:45:46] Change merged: DamianZaremba; [labs/nagios-builder] (master) - https://gerrit.wikimedia.org/r/45241
[21:46:45] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 6.07, 4.56, 4.87
[21:46:45] file written ok... service added... let's see if it checks fine... should do, then we're golden
[21:48:16] http://nagios.wmflabs.org/cgi-bin/nagios3/extinfo.cgi?type=2&host=deployment-dbdump.pmtpa.wmflabs&service=IRC+Echo+Process < yay
[21:48:29] petan hashar fyi ^
[21:51:52] PROBLEM IRC Echo Process is now: CRITICAL on bots-labs.pmtpa.wmflabs 10.4.0.75 output: NRPE: Command check_ircecho not defined
[21:51:52] PROBLEM IRC Echo Process is now: WARNING on nagios-main.pmtpa.wmflabs 10.4.0.120 output: PROCS WARNING: 4 processes with args ircecho
[21:52:04] yeah, yeah, puppet hasn't run yet
[21:52:15] I'm impatient
[21:53:39] Hello late night folks! :)
[21:57:03] late night? it's like 22:00, or well I guess 23 for Germany
[21:57:19] Silke_WMDE: howdy
[21:57:55] :)
[21:58:12] * Silke_WMDE had some wine already
[21:58:23] PROBLEM Free ram is now: CRITICAL on sube.pmtpa.wmflabs 10.4.0.245 output: Connection refused by host
[21:59:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 6.25, 5.79, 5.29
[22:04:52] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 3.48, 4.64, 4.98
[22:06:52] RECOVERY IRC Echo Process is now: OK on bots-labs.pmtpa.wmflabs 10.4.0.75 output: PROCS OK: 3 processes with args ircecho
[22:07:43] Damianz: yes 23
[22:16:13] RECOVERY Current Load is now: OK on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 3.37, 4.27, 4.90
[22:23:24] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 10% free memory
[22:23:31] What's missing: Speech input for git.
[22:26:43] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 3.71, 3.97, 4.72
[22:30:12] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 18% free memory
[22:34:43] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 6.15, 5.65, 5.31
[22:37:35] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 32% free memory
[22:38:53] RECOVERY Free ram is now: OK on swift-be2.pmtpa.wmflabs 10.4.0.112 output: OK: 23% free memory
[22:54:44] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 4.31, 4.42, 4.82
[22:56:54] PROBLEM Free ram is now: WARNING on swift-be2.pmtpa.wmflabs 10.4.0.112 output: Warning: 19% free memory
[23:03:42] https://fedoraproject.org/wiki/Features/ReplaceMySQLwithMariaDB interesting
[23:04:38] Ryan_Lane: Dammit, you got me all excited thinking you were enabling feeds, then I remembered Krenair named it echo :(
[23:04:45] hahaha
[23:04:54] well, krenair didn't name it echo
[23:05:00] Umm... Yeah...
[23:05:01] I think brandon harris did
[23:05:11] Uh, Ryan_Lane
[23:05:18] Krenair: you did?
[23:05:19] "Ran out of captcha images" on the labsconsole signup form
[23:05:19] Krenair: ?
[23:05:22] it's easier to play 'the last one that touched the code' :D
[23:05:22] oh?
[23:05:23] no
[23:05:30] I did not name Echo.
[23:05:35] :(
[23:05:41] hm
[23:05:47] did I wipe out the captcha images?
[23:05:50] I love that it gives you a backtrace for out of captcha images
[23:05:51] -.-
[23:06:04] For some reason the signup form URL is the one I get from chrome's autocomplete... It took me to that error page
[23:06:16] I accidentally deleted all images
[23:06:19] and restored from backup
[23:06:30] I did a backup right before update, of course
[23:06:45] ah
[23:06:47] I see
[23:07:15] fixed
[23:07:32] Krenair: thanks for letting me know :)
[23:07:39] yw
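("Ran out of captcha images" is FancyCaptcha's error, from the ConfirmEdit extension; the image pool can also be refilled with the extension's generator script rather than restored from backup. The font, word list, and output paths below are placeholders.)

    # run from the ConfirmEdit extension directory; the key must match $wgCaptchaSecret
    python captcha.py --font=/path/to/font.ttf --wordlist=/usr/share/dict/words \
        --key=SECRETKEY --output=/path/to/captcha/dir --count=1000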
[23:07:53] can we not just turn captchas off? they make for a bad ux, only block a portion of bots and generally suck
[23:09:28] Damianz: that scares me
[23:09:59] more than the likes of 'headshits' appearing and warding off new users?
[23:10:41] I can sorta see why, but they don't get shell so the worst they could do is spam and send junk to gerrit
[23:10:44] I don't think new users to labs are going to be very scared here
[23:10:49] actually that could be an interesting way to kill gerrit
[23:10:56] Damianz: that's not totally true
[23:11:00] it's also entries in ldap
[23:11:12] if we get 20,000 spam accounts, that's problematic
[23:11:23] that shouldn't make a difference... well, it might screw your caches/paging up
[23:11:27] yes
[23:11:33] and it'll make things slower
[23:11:39] and it'll add load on the ldap server
[23:11:44] be like old times :D
[23:11:48] heh
[23:11:49] hmm
[23:12:01] there's a welcome message to new users through echo now! :)
[23:12:06] I think they make for a bad ux and we should be able to do better... and they don't stop some bots... but are sorta useful
[23:12:10] Krenair: is there any way to customize the welcome message?
[23:12:25] so that we can give users a link to the getting started guide?
[23:12:49] Damianz: yes, I'd like to have better captchas, not eliminate them
[23:13:39] https://labsconsole.wikimedia.org/wiki/MediaWiki:Notification-new-user-content ?
[23:13:50] Awesome
[23:13:52] thanks
[23:14:09] The solution to stop spam accounts - make everyone fax their passport :D
[23:14:15] lolololol I bet some places would too
[23:15:10] https://labsconsole.wikimedia.org/wiki/MediaWiki:Notification-new-user-content
[23:15:14] Though on a serious note we should support other methods... like in prod, I believe, people who can't actually read them (say they're blind) can email the helpdesk and have someone do it for them etc... we're kinda lacking there
[23:15:27] Damianz: you don't know how close to that we were
[23:15:51] Damianz: I had to fight to not have the identification policy apply to labs
[23:15:56] probably not, but I like to complain until things are perfect, then find flaws with them
[23:16:10] <^demon> Ryan_Lane: Oh, so I moved gerrit-dev to use project storage. Let's never ever use gluster for production gerrit.
[23:16:13] you mean the same policy you pretty much have to go through to get r00t on prod?
[23:16:14] <^demon> It's way too fucking slow.
[23:16:15] If you had an identification policy would that rule out under-18s?
[23:16:21] I think I saw some of the early emails about that
[23:16:25] ^demon: yes, it's incredibly slow
[23:17:00] Krenair: well, mostly it would mean we'd need to identify every single labs user (which includes all developers) before they could get an account
[23:17:01] ^demon: Well git loves small files, gluster hates small files... match made in heaven
[23:17:03] that's terrible
[23:17:19] Ryan_Lane: Yay to not loving the opensource community
[23:17:30] yeah. that would never fly
[23:17:40] I still think facebook making you sign a CLA is obnoxious
[23:17:43] Yes I know it is and I agree
[23:18:00] But I'm wondering whether it would've prevented me from using labs
[23:18:11] maybe
[23:18:19] well, probably
[23:18:45] I think that policy requires 18+
[23:19:13] Not like the old days when roots were everyone and their mom (as interested in helping out and not a total noob)
[23:21:26] lolollol I love the note on the bottom of the create account form
[23:21:33] that's like smooth, beautiful ux right there
[23:21:34] -.-
[23:21:59] You mean the one that's brokenly rendered around the side of it? :P
[23:22:22] Damianz: :)
[23:22:58] Change on 12mediawiki a page Wikimedia Labs/Interface usability improvement project was modified, changed by Ryan lane link https://www.mediawiki.org/w/index.php?diff=633215 edit summary: [-149] /* Notifications */
[23:29:37] Damianz: patches welcome ;)
[23:30:03] btw, that's mediawiki's rendering
[23:30:16] the entire form is ugly as hell
[23:30:50] mediawiki is ugly as hell
[23:31:05] it makes for a nice documentation system but argh, it's so over-complex and a pita for a site
[23:31:20] I hate to think how much hacking was done to make webplatform as ok looking as it is
[23:31:29] and even then the editor is kinda ugly
[23:31:37] petan: could you add rschen7754 to the bots project?
[23:32:00] (or any other bots admin)
[23:42:06] MaxSem: managed to find out why the ssh auth socket does not get created?
[23:42:26] saper, I went another way
[23:42:32] thanks for your help anyway
[23:43:55] Damianz: heh
[23:44:06] Damianz: well, it's mostly the skin that makes it look nice
[23:44:54] magic
[23:46:10] hmm, I wonder if anyone has tried using something like Sentry to track mediawiki exceptions/db errors across the prod cluster... could be an interesting project
[23:46:54] https://gerrit.wikimedia.org/r/45267 ;)
[23:47:07] just saw
[23:47:19] makes more sense for us, needs some ui changes though
[23:47:41] yes
[23:47:45] not many
[23:47:58] I think OSM is configurable for that
[23:48:07] sed out sysadmin, delete netadmin, change some acls
[23:48:55] actual access-wise we should be able to pull rights from keystone... it's how the pages are laid out really... though I think 1 config page per instance is best
[23:49:39] well, we just need to add an api for OSM in mediawiki
[23:49:44] and make a javascript interface ;)
[23:50:05] ewww javascript
[23:50:15] though api == good, we should have an api
[23:50:17] what, should we use dart? :D
[23:50:33] I want actions via ajax calls
[23:50:39] <^demon> It's not Javascript unless you write it in Java ;-)
[23:50:44] :D
[23:51:23] you know on a normal site we could use bootstrap and have all the events stuff just magically appear
[23:51:31] then I remember this is mediawiki and everything is hell
[23:52:00] I totally mean backbone, not bootstrap also
[23:54:04] hmm, Salt scheduling looks so useful, wondering how long before the yaml file is unreadable
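(A minimal sketch of salt's minion-side schedule in that yaml file, since it comes up again below; the job name and interval are invented, and the returner line assumes the scheduler's returner support.)

    # /etc/salt/minion - run a highstate every hour and ship the result to a returner
    schedule:
      highstate:
        function: state.highstate
        seconds: 3600
        returner: mysql   # any configured returner; 'mysql' is just an example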
[23:55:29] backbone?
[23:55:39] and yeah, salt scheduling looks really useful
[23:56:18] that + a returner could be really interesting
[23:56:20] 'Today's data center migration only required 32 minutes of Wikipedia in read-only mode. Kudos to @Wikimedia ops!' so pretty much a normal enwiki outage
[23:56:31] except it was just read-only
[23:56:32] backbone.js is a javascript framework
[23:56:34] ah
[23:56:59] well, by outage I was thinking master capping out or slaves lagging rather than memcache taking a dump
[23:57:07] none of that happened ;)
[23:57:18] and yeah it could... I wish you could validate the data passed to the returner, hmm, it would be really interesting for auditing
[23:57:20] reads worked. writes were blocked in the interface
[23:57:34] why can't you validate the data?
[23:58:00] I dunno, but I'm assuming you can (as you have the key) fake data back to the returner
[23:58:02] ah, right, because it's the client writing
[23:58:26] well, if you really care, then use a runner
[23:58:45] then use a returner with the runner
[23:58:50] that's silly, though
[23:59:02] For metric collection I think I might use a returner, since I don't care... for auditing licenses etc I kinda need to validate it, since it updates the asset management db
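(A minimal sketch of the custom-returner idea discussed above; the module name and log path are invented. The fixed part is salt's returner interface: a module in _returners/ exposing returner(ret), which receives the job result dict. As noted in the chat, the minion writes this data itself, so it can't be trusted for auditing without extra validation.)

    # _returners/asset_audit.py - hypothetical custom salt returner
    import json

    def returner(ret):
        """Record a job result sent back by a minion.

        ret carries keys like 'id' (minion id), 'jid' (job id),
        'fun' (the function that ran) and 'return' (its output).
        """
        record = {
            'minion': ret['id'],
            'function': ret['fun'],
            'result': ret['return'],
        }
        # append one JSON line per job; a real auditing returner would
        # validate this before touching the asset management db
        with open('/var/log/salt/audit-%s.json' % ret['jid'], 'a') as fh:
            fh.write(json.dumps(record) + '\n')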