[02:39:45] I am on the verge of adding a new nutcracker service to a group of servers. Is nutcracker generally deprecated in favor of mcrouter, or does it depend on the use case? I see nutcracker getting used in a fair number of places but don't want to add a new use case if someone else is on a 'stamp out nutcracker' mission.
[02:46:57] andrewbogott: I don't know for sure, but I believe that on appservers, nutcracker is only used for redis nowadays
[02:47:34] that suggests that the module might not be on the verge of removal :)
[02:48:02] mcrouter has a million features that I don't care about, but that might not be a reason to not use it
[06:21:30] andrewbogott: what is the use case for nutcracker? mcrouter has a lot of options but generally it is not that difficult to configure/maintain. It behaves very differently from nutcracker (a major example is what they do when a shard goes down), so keep it in mind when choosing (we can discuss further if you want). Also, packaging/upgrading nutcracker might not be as widely supported as mcrouter in the
[06:21:36] future (but this is only speculation from my side)
[06:24:18] --
[06:24:45] I was checking icinga alerts and I noticed some logstash100X hosts; the ES cluster seems to be yellow
[06:24:48] "cluster_name" : "production-logstash-eqiad",
[06:24:50] "status" : "yellow",
[06:28:17] yes, there are some shards in unassigned state
[06:32:46] ok, opening a task, but IIRC an alarm should be raised when an ES cluster is not "green"
[06:40:26] https://phabricator.wikimedia.org/T250133
[06:57:22] wearing my Debian maintainer hat, nutcracker is pretty much abandonware at this point
[06:57:35] to the point where I'd really like to remove it from Debian, when we stop using it here :)
[06:58:12] hasn't seen real development for 5 years or so
[06:58:49] there are 56 outstanding pull requests on GitHub, I think that says it all :)
[07:25:37] morning folks. i've just joined the data persistence team. if you want someone to blame, cdanis and mark are the best candidates :)
[07:26:03] morning kormat!
[07:26:19] hello!
[07:27:30] morning!
[07:27:52] I like reasons to blame cdanis, so this is another one!
[07:29:27] When in doubt I personally prefer to blame marostegui
[07:34:01] +1
[07:34:05] kormat: hi!
[08:29:43] kormat: welcome :D
[08:50:15] welcome kormat!
[08:53:24] mark: i have a complaint. you never told me marostegui is a mac user
[08:54:05] i didn't know that either
[08:54:11] i am too, fwiw
[08:54:17] /o\
[08:54:24] you can still run
[08:54:42] country is in lockdown. i think i'm stuck. ;)
[08:57:46] hahahahaha
[08:58:02] kormat: Don't worry, we all know that 2020 is linux's desktop year
[08:59:31] it is postponed to 2021 now because of covid, marostegui
[08:59:37] kormat: join the linux ops here
[09:00:19] if you have a problem with your os, you can rant to your co-workers until it is fixed!
[09:00:20] mark: I thought this would actually help linux desktop, as everyone now has time to debug sound and video issues when coming back after hibernating their laptop
[09:00:40] our former australian DBA was a hardcore linux desktop user... wrote his own tiling window manager too
[09:02:33] multiple ones!
[09:34:54] welcome kormat
[10:16:53] i see jynus is trolling me by inviting me to #wikimedia-sre-private while i don't have access ;)
[10:17:03] I thought that would work
[10:17:06] sorry
[10:17:36] jynus: at least you don't use a mac
[12:15:58] kormat: welcome!
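(Aside on the yellow production-logstash-eqiad cluster discussed above: a minimal sketch of how one might check the status and unassigned shards by hand. The endpoint is hypothetical; the real hosts are the logstash100X machines. Both calls are standard Elasticsearch HTTP APIs.)

```python
#!/usr/bin/env python3
"""Sketch: query Elasticsearch cluster health and list unassigned shards."""
import json
import urllib.request

ES = "http://localhost:9200"  # hypothetical endpoint, not a real production host

# /_cluster/health returns cluster_name, status, unassigned_shards, etc.
with urllib.request.urlopen(f"{ES}/_cluster/health") as resp:
    health = json.load(resp)

# "green" = all shards assigned, "yellow" = some replicas unassigned,
# "red" = some primaries unassigned.
print(health["cluster_name"], health["status"])
print("unassigned shards:", health["unassigned_shards"])

if health["status"] != "green":
    # /_cat/shards lists every shard with its state, e.g. UNASSIGNED
    with urllib.request.urlopen(f"{ES}/_cat/shards?h=index,shard,prirep,state") as resp:
        for line in resp.read().decode().splitlines():
            if "UNASSIGNED" in line:
                print(line)
```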
[12:16:46] moritzm: hi! marostegui said you had some .debs to ease setting up a dev env
[12:18:10] I do! I'll build and upload them shortly, will ping you
[12:18:20] great, thanks :)
[12:30:40] godog: hi :) i'm told you're the person to bother about getting my pager set up with VictorOps
[12:32:45] kormat: hello! yes that is correct, I'll be sending an invite and docs your way
[12:32:53] * marostegui hides
[12:33:11] haha!
[12:38:57] mutante: I know you don't need that, but would you be ok with me showing you the command (also as a reminder for everyone here) so you can check it yourself?
[12:41:55] jynus: if it's bconsole then i already know it, i think
[12:42:12] no, it is a new command
[12:42:16] ok, please do
[12:42:18] a simpler one
[12:43:08] mutante: https://phabricator.wikimedia.org/P10970
[12:43:09] i see we have "roothome" for apt1001. looking good
[12:43:16] that was important because of the GPG keys
[12:43:21] jynus: thanks
[12:43:30] F means full; I, incremental; D, differential
[12:43:49] the icinga check verifies that there is a full before an incremental too
[12:43:57] otherwise it alerts
[12:44:12] however, the main issue is that it only checks what is configured
[12:44:24] for example, on apt* hosts, only 3 datasets are configured:
[12:44:28] roothome
[12:44:34] srv-autoinstall
[12:44:40] and srv-wikimedia
[12:45:03] the #1 error is "it cannot check what it is not told to check" :-D
[12:45:52] also please note the date of the last backup
[12:45:59] we can run a new full one if that helps
[12:47:38] the grafana/prometheus graphs may help visualize it better if one doesn't like CLIs: https://grafana.wikimedia.org/d/413r2vbWk/bacula?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-job=apt1001.wikimedia.org-Monthly-1st-Fri-production-srv-wikimedia&from=1586695618993&to=1586868418993
[12:48:18] let me know if you still want me to check every host
[12:52:53] jynus: thanks. you don't need to check every host, i can do it. And i will run a full backup of the roothome one, using bconsole, after looking at the date and the file mod times
[12:53:24] well.. except if i do "run job" it's the incremental one
[12:53:41] you can modify it with mod, and change the level
[12:53:44] change the level to full
[12:53:45] it can be done in one go
[12:53:47] ack, found it
[12:53:53] with level=full I think
[12:54:43] Job queued. JobId=219474
[12:55:08] ok, cool. i think i'm good. thx
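(Aside on the bacula check jynus walks through above: a sketch of the full-before-incremental rule in isolation, assuming job history is available as (date, level) pairs. The real icinga check reads Bacula's own job records; the dataset and dates below are made up for illustration.)

```python
#!/usr/bin/env python3
"""Sketch: every incremental/differential backup needs a prior full."""
from datetime import date

# Hypothetical job history for one dataset. F = full, I = incremental,
# D = differential, matching the levels named in the chat above.
jobs = [
    (date(2020, 3, 6), "F"),
    (date(2020, 3, 13), "I"),
    (date(2020, 4, 10), "I"),
]

def full_before_incremental(jobs):
    """Return False if an I or D job appears before any F job:
    incrementals have nothing to be applied on top of without a full."""
    seen_full = False
    for _, level in sorted(jobs):
        if level == "F":
            seen_full = True
        elif level in ("I", "D") and not seen_full:
            return False
    return True

print("OK" if full_before_incremental(jobs)
      else "ALERT: incremental without a prior full")
```

Note that, as jynus says, a check like this can only cover datasets it is configured to look at; a dataset that is never backed up raises no alert at all.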
[15:44:23] o/
[15:50:27] elukey: I need a simple n-host memcached cluster (where, today, n=2); all hosts in the same DC. Would you suggest that I try to use the existing mcrouter class for that, or will that be overkill?
[15:50:50] (And, re: nutcracker being pretty much abandoned, thanks paravoid, that answers my question completely)
[15:57:29] andrewbogott: yes, it shouldn't be difficult to set it up! One thing to note though: mcrouter periodically health-checks each configured memcached shard, and if some (configurable) number of timeouts occurs, by default it just returns errors for all traffic towards that shard until the health checks pass again
[15:58:16] nutcracker doesn't do that: it removes the failing shard from the consistent hash and routes traffic towards the other ones
[15:59:02] mcrouter can be instructed to fail over to other shards; we are doing something like that for mw but it is still WIP (https://phabricator.wikimedia.org/T244852)
[15:59:38] we also have good prometheus metrics for mcrouter, see https://grafana.wikimedia.org/d/000000549/mcrouter?orgId=1
[16:24:36] <_joe_> andrewbogott: mcrouter should be adaptable, yes
[16:30:06] cool, will have a patch for y'all to look at in a bit
[16:31:39] <_joe_> andrewbogott: what application would use it?
[16:31:48] (in a meeting, sorry)
[16:31:56] <_joe_> sure, sorry, we can talk tomorrow
[16:55:44] any of our Debian Developers interested in uploading a (likely trivial) backport of yubikey-manager (and probably yubikey-personalization as well)? The version in buster is 2.1.0; a modern version (3.1.1) adds a `cached` touch authentication option, so you can have the hardware prompt for a touch before any auth, but only once every 15 seconds, which is a big usability improvement over "literally every
[16:55:46] time"
[17:13:02] _joe_, elukey: tentative mcrouter patch: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/588752/
[17:16:33] (I have another nutcracker cluster which I will also replace once I'm convinced I know what I'm doing with this change)
[17:17:47] andrewbogott: going afk now but will check tomorrow!
[17:18:00] thanks!
[17:44:23] cdanis: if nobody else shows interest, I can take a look by the end of this week
[17:44:55] arturo: cool, thanks :)
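(Aside on the mcrouter setup discussed above: a sketch of mcrouter's documented JSON config shape for a small two-shard pool like andrewbogott's case. The hostnames are made up, and in production the puppet mcrouter class renders this file rather than anyone writing it by hand.)

```python
#!/usr/bin/env python3
"""Sketch: generate a minimal mcrouter JSON config for one memcached pool."""
import json

config = {
    "pools": {
        "main": {
            # hypothetical shards; mcrouter consistent-hashes keys across them
            "servers": ["host1.example.org:11211", "host2.example.org:11211"],
        }
    },
    # PoolRoute hashes each key to one shard in the pool. With this route,
    # a shard that fails its health checks yields errors for its share of
    # keys (the default behavior elukey describes above); wrapping children
    # in a FailoverRoute is how mcrouter is told to retry elsewhere instead.
    "route": "PoolRoute|main",
}

print(json.dumps(config, indent=2))
```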