[07:25:18] netops, Operations: AS63541's session down reported by cr1-eqsin - https://phabricator.wikimedia.org/T228617 (elukey)
[09:37:41] Traffic, Operations, Goal: ATS Backends: Test live cache_text traffic - https://phabricator.wikimedia.org/T228629 (ema)
[09:37:49] Traffic, Operations, Goal: ATS Backends: Test live cache_text traffic - https://phabricator.wikimedia.org/T228629 (ema) p:Triage→Normal
[09:53:23] Traffic, Operations, Patch-For-Review, discovery-system, services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972 (Joe) >>! In T97972#5335454, @CDanis wrote: > I think we likely want to revisit this. > > * Right now the `guest` user has access to `/ev...
[10:39:43] Traffic, Operations, Patch-For-Review, discovery-system, services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972 (Volans) Resolved→Open >>! In T97972#5352851, @Joe wrote: > IIRC we already have an account specialized for accessing only `mwconf...
[10:39:46] Traffic, Operations, discovery-system, services-tooling: Create a tool to sync static configuration from a repository to the consistent k/v store - https://phabricator.wikimedia.org/T97978 (Volans)
[10:39:51] Traffic, Operations, Patch-For-Review, discovery-system, services-tooling: Figure out a security model for etcd - https://phabricator.wikimedia.org/T97972 (Volans) p:High→Normal
[14:43:25] Traffic, Operations: rack/setup/install lvs101[3-6] - https://phabricator.wikimedia.org/T184293 (BBlack) Open→Resolved These have been in-service for a while now, closing!
[14:52:22] Traffic, DC-Ops, Operations, decommission: Decommission lvs100[123456] - https://phabricator.wikimedia.org/T228671 (BBlack)
[14:54:45] Traffic, DC-Ops, Operations, decommission: Decommission lvs100[123456] - https://phabricator.wikimedia.org/T228671 (BBlack)
[15:08:36] netops, Analytics, Analytics-Kanban, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata)
[15:08:59] netops, Analytics, Analytics-Kanban, Operations, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata) OO, when we reimage these, let's use Buster! :)
[15:23:36] dear traffic team, why is this change in the DNS repo failing with this error message: Name 'termbox-test.staging.svc.eqiad.wmnet.': CNAME not allowed alongside other data
[15:23:48] the change https://gerrit.wikimedia.org/r/c/operations/dns/+/524797
[15:23:56] any other way to do it? maybe discovery records?
[15:42:25] fsero: CNAMEs are singletons. For any given name on the left-hand-side of a DNS record, there can only be a singular CNAME RR (no other record types allowed, and no more than 1 CNAME)
[15:42:48] that's not our sre rule or a gdnsd rule, it's just the basic rules of how the DNS protocol stuff is specified, we can't change it.
[15:43:27] (because they're aliases; they're like the softlinks of the DNS. imagine asking the filesystem to softlink a file to two different places)
[15:44:24] * volans wondering if we should add this check too to the zone_validator
[15:45:38] CNAMEs are skipped as of now
[15:47:05] yeah, depends on the scope of zone_validator really
[15:47:26] it's not an SRE rule, it's more in the realm of basic zonefile checks that any zonefile linter would have for generic purposes.
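The singleton rule described above (a name that has a CNAME can have no other records and no second CNAME) is the kind of check a generic zonefile linter could run. What follows is only a minimal sketch under stated assumptions: `find_cname_conflicts` and the `(name, rtype)` input format are hypothetical names invented for illustration, not the actual zone_validator or gdnsd code, the example names are placeholders, and DNSSEC-related record types are not handled.

```python
from collections import defaultdict


def find_cname_conflicts(records):
    """Return {name: sorted record types} for names that break the CNAME singleton rule.

    `records` is an iterable of (name, rtype) pairs, e.g. ("www.example.org.", "A").
    Hypothetical helper for illustration only.
    """
    types_by_name = defaultdict(list)
    for name, rtype in records:
        # DNS names are case-insensitive, so normalize before grouping.
        types_by_name[name.lower()].append(rtype.upper())

    conflicts = {}
    for name, rtypes in types_by_name.items():
        # A name that has a CNAME must have exactly one record: that CNAME.
        if "CNAME" in rtypes and len(rtypes) > 1:
            conflicts[name] = sorted(rtypes)
    return conflicts


records = [
    ("alias.example.org.", "CNAME"),
    ("alias.example.org.", "A"),       # CNAME alongside other data
    ("rr.example.org.", "CNAME"),
    ("rr.example.org.", "CNAME"),      # attempted "round robin" CNAME
    ("ok.example.org.", "CNAME"),      # a single CNAME is fine
    ("host.example.org.", "A"),
    ("host.example.org.", "AAAA"),     # multiple non-CNAME types are fine
]

conflicts = find_cname_conflicts(records)
for name, rtypes in conflicts.items():
    print(f"{name}: CNAME must be the only record at this name (found {', '.join(rtypes)})")
```

Grouping by owner name first keeps the check independent of record order in the zonefile, so both the "CNAME plus other data" case and the "two CNAMEs" case fall out of the same condition.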
[15:47:36] bblack: thank you so much, i guess besides a layer 8 issue on my end i expected that syntax to allow me to write round robin CNAMEs
[15:47:51] gdnsd is going to gain one of those sometime Soon too, although I'm not sure how Soon.
[15:48:40] [it's on the TODO list to basically get rid of all the "soft" sanity checks in gdnsd's C code, only enforcing what rules are strictly necessary to load the data at all, and ship a separate script that does the soft-rules linting for the generic case]
[15:49:27] it'd be nice if that script (or module) was generic enough that users could add further custom local rules, I donno
[16:08:19] Traffic, Analytics, Operations: Fix geoip updaters for new MaxMind hashed keys by 2019-08-15 - https://phabricator.wikimedia.org/T228533 (Milimetric) @faidon: we don't have any updaters on our end, we just move the databases around and keep backups for historical use. But let us know if you run into...
[16:10:09] Traffic, Operations: Implement GeoDNS smooth repooling in gdnsd - https://phabricator.wikimedia.org/T228678 (BBlack) p:Triage→Normal
[16:25:48] Traffic, Operations: Implement GeoDNS smooth repooling in gdnsd - https://phabricator.wikimedia.org/T228678 (BBlack)
[16:25:52] Traffic, Operations: implement better failure-scenario geoip mapping in gdnsd - https://phabricator.wikimedia.org/T94697 (BBlack)
[16:26:16] bblack: if you have a minute today, open question for you from ema on this patch "@BBlack is this known and OK?" https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/512925/
[16:33:43] Traffic, Operations: Implement GeoDNS smooth repooling in gdnsd - https://phabricator.wikimedia.org/T228678 (BBlack)
[16:36:50] ebernhardson: yeah, this keeps falling off my radar, sorry. I think we need to talk with andrewbogott about this and how it fits in their big picture....
[16:37:22] TL;DR is I gave you some advice based on my small-context understanding of getting this cloudelastic thing up and running, about how to structure it for our existing LVS solution, etc...
[16:38:03] but I think at our offsite in Dublin, we had some meetings between SRE<->WMCS in a bigger context about how all such things should be handled going forward, and the outcome of that was in a completely different direction than the patch you're working on....
[16:38:10] bblack: sure, most ops patches i write i figure i write the wrong thing and someone will tell me why it's wrong :)
[16:38:31] (e.g. avoid using production stuff like LVS and our ranges; and WMCS is going to have some haproxy solution for such things and use their own domain names and address spaces)
[16:38:54] hmm, ok. sounds like 18 months though :)
[16:39:20] but I think at this point we need to talk to andrew and confirm whether all this new stuff applies to this case or not, and sure... what the timeline is for having something available to use and what we should do in the meantime, or whatever.
[16:39:52] ok makes sense, i'll loop him into the ticket and provide some context at least
[16:39:53] it sounded like it was going to be pretty quick, as they have some immediate needs of their own, but I donno
[16:40:01] if so, awesome
[16:48:22] Traffic, Operations, TechCom-RFC, Core Platform Team Backlog (Designing), Services (designing): Make API usage limits easier to understand, implement, and more adaptive to varying request costs / concurrency limiting - https://phabricator.wikimedia.org/T167906 (daniel) @EvanProdromou Are you...
[17:30:08] netops, Operations, ops-codfw: Cable mr1-codfw<->cr1/2-codfw through asw-a-codfw - https://phabricator.wikimedia.org/T228112 (Papaul)
[17:48:26] bblack, ema - https://phabricator.wikimedia.org/T227141 seems to contain a lot of cp1XXX hosts, and afaik it is scheduled as next rack for PDU maintenance..
[17:48:58] (also LVS hosts)
[18:04:24] (the LVS are 1007-9, I think decommed/spare?)
[18:06:26] they should be all old hosts, lovely
[18:15:16] yes, noted in task, thanks!