[07:39:15] netops, Operations: Make eqord its own AS - https://phabricator.wikimedia.org/T259593 (ayounsi) To clarify export/import policies: From eqiad and codfw we export all WMF prefixes to eqord (and no DMZ), and apply the RPKI rules to the prefixes imported from eqord. From ulsfo we export only the POPs prefix...
[09:57:02] Traffic, Operations, conftool, serviceops: confd's watch functionality appears to be partially broken when interacting with etcd 3.x - https://phabricator.wikimedia.org/T260889 (Joe) Adding traffic as their systems are the ones affected.
[09:57:14] <_joe_> ema, vgutierrez please read above
[09:57:28] ema is currently on vacations
[09:57:30] * vgutierrez reading
[10:02:11] _joe_: hmmm pybal would be affected as well?
[10:03:14] <_joe_> no.
[10:03:19] <_joe_> it doesn't use confd
[10:03:25] <_joe_> it uses its own connection
[10:03:40] yup.. directly against etcd
[10:09:54] <_joe_> what do you think of my stopgap solution?
[10:16:30] as long as it is really temporary... )
[10:16:31] :)
[10:29:31] <_joe_> vgutierrez: define temporary... this doesn't seem like a simple bug to fix, and I'm just offering a way to unbreak it
[10:30:01] <_joe_> frankly speaking, I think if having an interval of update of 3seconds is unacceptable, maybe traffic should look into how to fix it
[10:30:25] <_joe_> confd has no new releases in some time, so I don't think an upstream change would improve the situation tbh
[10:32:02] <_joe_> I mean, if you have better alternatives than my proposal, I'm all ears.
[10:41:32] ack, let me discuss that with ema on Monday
[10:41:44] and we will get back to you on the task
[10:44:27] <_joe_> ok, in the meantime I'd be happy to get a +1 on the patch, as right now we risk having dns in an inconsistent state
[10:44:52] <_joe_> lemme rephrase: we *did* have dns in an inconsistent state
[10:45:47] <_joe_> https://gerrit.wikimedia.org/r/c/operations/puppet/+/621484 the patch
[12:21:01] netops, Operations: No Juniper alarms in SNMP for MX204 - https://phabricator.wikimedia.org/T241105 (ayounsi) Open→Stalled p:Medium→Low a:ayounsi→None
[12:42:41] Traffic, Operations: Enable DNSSEC validation in Wikidough - https://phabricator.wikimedia.org/T259816 (jbond) > Given that outages due to misconfigured DNSSEC domains are all too common (see https://ianix.com/pub/dnssec-outages.html for a list) Im not sure i would agree that they are "all to common"....
[13:30:57] Traffic, Operations: Enable DNSSEC validation in Wikidough - https://phabricator.wikimedia.org/T259816 (jbond) >> unless the client set the AD and/or DO bits > > do we know what chrome/FF set on queries? Really not familiar with FF/chrome code but this looks like a no FF: https://searchfox.org/mozil...
[14:26:25] Traffic, Operations: Enable DNSSEC validation in Wikidough - https://phabricator.wikimedia.org/T259816 (ssingh) >>! In T259816#6399814, @jbond wrote: >> Given that outages due to misconfigured DNSSEC domains are all too common (see https://ianix.com/pub/dnssec-outages.html for a list) > Im not sure i wo...
[14:47:34] Traffic, Operations, Patch-For-Review: Enable DNSSEC validation in Wikidough - https://phabricator.wikimedia.org/T259816 (jbond) > So this means that they treat the DO bit to not only return the DNSSEC records but also to validate them? I can check this in the code but I just wanted to confirm if I a...
[15:37:33] <_joe_> I have a dns question, do I need to wait for Brandon to be back?
[15:38:43] Traffic, Operations, Patch-For-Review: Enable DNSSEC validation in Wikidough - https://phabricator.wikimedia.org/T259816 (ssingh) >>! In T259816#6400068, @jbond wrote: >> So this means that they treat the DO bit to not only return the DNSSEC records but also to validate them? I can check this in the...
[15:40:28] _joe_: I can try
[15:41:26] <_joe_> so my problem is: how can I tell if a resolver is returning the right address for mobileapps to a client in eqiad and one in codfw
[15:41:39] <_joe_> one that doesn't involve ssh'ing into both DCs
[15:42:39] <_joe_> and I want to test that for every authdns, actually
[15:43:27] <_joe_> so first part is: how do I query each authdns directly?
[15:44:21] <_joe_> I gather I have to query port 5353, first of all, correct?
[15:46:38] yes, that's right
[15:46:41] <_joe_> ohhh we do support client subnet
[15:46:45] <_joe_> jayme: so
[15:46:56] <_joe_> dig @authdns1001.wikimedia.org -p 5353 +subnet=10.32.0.0/24 mathoid.discovery.wmnet (an eqiad subnet)
[15:46:58] <_joe_> vs
[15:47:07] <_joe_> dig @authdns1001.wikimedia.org -p 5353 +subnet=10.192.0.0/24 mathoid.discovery.wmnet
[15:47:12] <_joe_> a codfw one
[15:47:52] <_joe_> jayme: now how do you override the client subnet in dnspython, I have no idea :P
[15:48:03] <_joe_> cdanis: thanks for rubberducking for me :)
[15:48:08] will figure out :-)
[15:48:35] <_joe_> I was still convinced we didn't support edns client subnet, then I realized we use anycast for the internal resolvers so we MUST be
[15:50:12] https://www.dnspython.org/docs/1.16.0/dns.edns.ECSOption-class.html ?
[19:47:48] Traffic, Maps, Operations, Wiki-Loves-Monuments (2020): maps.wikilovesmonuments.org returns a HTTP 429 error (let it access varnish maps_domains) - https://phabricator.wikimedia.org/T260520 (Dzahn) @Zache You may try again now. Is the 429 gone?
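Editor's note on the 15:41-15:50 client-subnet exchange above: the two `dig ... +subnet=` commands differ only in the EDNS Client Subnet (ECS) option attached to the query. The following is a minimal, library-independent sketch of that option on the wire, using only the Python standard library. It is not the dnspython approach asked about in the log (the linked `ECSOption` class covers that). The function names, the fixed query ID, the 4096-byte advertised UDP size, and the A-record query type are illustrative assumptions; the server name, port 5353, and subnets are taken from the chat. Response parsing is left out.

```python
import ipaddress
import socket
import struct

ECS_OPTION_CODE = 8   # EDNS Client Subnet option code (RFC 7871)
OPT_RR_TYPE = 41      # EDNS0 OPT pseudo-record type


def build_ecs_option(subnet: str) -> bytes:
    """Encode an ECS option for a subnet such as '10.32.0.0/24'."""
    net = ipaddress.ip_network(subnet, strict=False)
    family = 1 if net.version == 4 else 2
    # Only the bytes needed to cover the source prefix are sent.
    addr = net.network_address.packed[: (net.prefixlen + 7) // 8]
    payload = struct.pack("!HBB", family, net.prefixlen, 0) + addr
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(qname: str, subnet: str, query_id: int = 0x1234) -> bytes:
    """Build an A/IN query carrying one OPT record with an ECS option."""
    # Header: id, flags=0 (no recursion desired), 1 question, 1 additional.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 1)
    question = encode_name(qname) + struct.pack("!HH", 1, 1)
    option = build_ecs_option(subnet)
    # OPT RR: root name, type 41, class = UDP size, ttl = 0, rdlength.
    opt = b"\x00" + struct.pack("!HHIH", OPT_RR_TYPE, 4096, 0, len(option))
    return header + question + opt + option


def query_authdns(server: str, qname: str, subnet: str,
                  port: int = 5353, timeout: float = 3.0) -> bytes:
    """Send the query over UDP and return the raw, unparsed response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(qname, subnet), (server, port))
        data, _ = sock.recvfrom(4096)
        return data
```

Calling `query_authdns("authdns1001.wikimedia.org", "mathoid.discovery.wmnet", "10.32.0.0/24")` and again with `"10.192.0.0/24"` would correspond to the two dig invocations, assuming the server is reachable from where the script runs.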
[19:51:27] Traffic, Maps, Operations, Wiki-Loves-Monuments (2020): maps.wikilovesmonuments.org returns a HTTP 429 error (let it access varnish maps_domains) - https://phabricator.wikimedia.org/T260520 (Dzahn) Open→Resolved p:Triage→High a:Dzahn Thanks to @cdanis for deploying my change....
[20:54:59] Traffic, Operations: Don't set cookies for api.wikimedia.org at the caching layer - https://phabricator.wikimedia.org/T260943 (eprodromou)