[01:14:59] Traffic, Infrastructure-Foundations, SRE, Patch-For-Review: Delegate reverse DNS zones for k8s pod IP ranges on authdns servers - https://phabricator.wikimedia.org/T376291#10198467 (ssingh) >>! In T376291#10197890, @cmooney wrote: >>>! In T376291#10197677, @ssingh wrote: >> * It seem the network...
[03:31:57] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 2 others: codfw:frack:servers migration task - https://phabricator.wikimedia.org/T375151#10198601 (Papaul)
[06:16:42] netops, Infrastructure-Foundations, SRE: Upgrade Management routers to 22.4R3-S2 - https://phabricator.wikimedia.org/T369504#10198646 (ayounsi) Let's use the latest recommended, so 23. Thx!
[06:16:52] netops, Infrastructure-Foundations, SRE: Upgrade Management routers to 23.4R2-S2 - https://phabricator.wikimedia.org/T369504#10198647 (ayounsi)
[08:03:52] Traffic, Movement-Insights: Investigating unique devices traffic data - https://phabricator.wikimedia.org/T375562#10198756 (Vgutierrez) {F57585229} this is what we are seeing at the CDN on haproxy metrics for eqsin (Singapore DC) during the last 3 months, each data point in the graph is the number of req...
[10:25:15] netops, Infrastructure-Foundations, SRE: Add link from cloudsw1-e4-eqiad to cloudsw1-f4-eiqad - https://phabricator.wikimedia.org/T372061#10199000 (cmooney) With the gnmi stats in place we see fairly consistent drops on these links from cloudsw1-d5-eqiad: https://grafana-rw.wikimedia.org/d/5p97dAASz...
[10:54:09] FIRING: [8x] LVSHighCPU: The host lvs5005:9100 has at least its CPU 0 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[10:59:09] RESOLVED: [8x] LVSHighCPU: The host lvs5005:9100 has at least its CPU 0 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[11:20:44] netops, DC-Ops, fundraising-tech-ops, Infrastructure-Foundations, and 3 others: codfw:frack:rack/install/configuration new switches - https://phabricator.wikimedia.org/T374587#10199081 (cmooney) >>! In T374587#10160970, @ayounsi wrote: > It would indeed be great to have redundancy for the `fmsw`,...
[12:23:59] Traffic, Infrastructure-Foundations, SRE, Patch-For-Review: Delegate reverse DNS zones for k8s pod IP ranges on authdns servers - https://phabricator.wikimedia.org/T376291#10199200 (cmooney) >>! In T376291#10198467, @ssingh wrote: > You are basing `dns_k8s_reverse_delegation` on `hieradata/common...
[12:38:51] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: CloudVPS: IPv6 in codfw1dev - https://phabricator.wikimedia.org/T245495#10199231 (aborrero)
[12:41:31] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: netbox: create IPv6 entries for Cloud VPS - https://phabricator.wikimedia.org/T374712#10199227 (aborrero) Open→Resolved
[12:41:38] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: Cloud IPv6 subnets - https://phabricator.wikimedia.org/T187929#10199229 (aborrero)
[12:48:21] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: cloudsw: codfw: enable IPv6 - https://phabricator.wikimedia.org/T374713#10199236 (aborrero) Created: * https://netbox.wikimedia.org/ipam/prefixes/1085/ * https://netbox.wikimedia.org/ipam/prefixes/1086/ * https://netbox....
[13:23:33] Traffic, Infrastructure-Foundations, SRE, Patch-For-Review: Delegate reverse DNS zones for k8s pod IP ranges on authdns servers - https://phabricator.wikimedia.org/T376291#10199351 (cmooney) I had a quick stab at creating the data for the `dns_reverse_zones.yaml` file from the dns repo and it's f...
[15:31:17] Traffic, Infrastructure-Foundations, SRE, Patch-For-Review: Delegate reverse DNS zones for k8s pod IP ranges on authdns servers - https://phabricator.wikimedia.org/T376291#10199890 (ssingh) >>! In T376291#10199200, @cmooney wrote: >>>! In T376291#10198467, @ssingh wrote: >> You are basing `dns_k8...
[15:36:56] netops, Infrastructure-Foundations, SRE: Upgrade Management routers to 23.4R2-S2 - https://phabricator.wikimedia.org/T369504#10199906 (Papaul)
[16:11:17] topranks: so.. liberica is now able to refresh communities and/or peers with a simple config reload, no restart needed at all
[16:11:36] nice!!
[16:11:39] good work
[16:12:01] bgp config so far looks like this
[16:12:16] https://www.irccloud.com/pastebin/Xohd9SYA/
[16:13:04] communities is optional, so if you drop it from the config file no communities will be set
[16:13:28] and peers will get a softresetout if old and new don't match
[16:18:30] vgutierrez: looks good
[16:18:57] what are the 'next-hops' for? I guess these are the systems own IP address to use on the bgp routes?
[16:19:06] yes
[16:19:11] same as we do on pybal
[16:19:26] kind of makes sense in this scenario that we specify those rather than they come automatically from the interface IP
[16:19:37] cool
[16:23:35] changing next hops and or ASN would require a restart
[16:23:46] but I think that's OK
[16:30:50] yeah they won't change I think
[16:31:19] new ASN BGP would have to restart, in theory you can just send UPDATEs with new next-hops but again, for any given LB they won't change
[16:32:47] yup.. considering that updating next hops would mean adding/changing IPs on the LB main NIC, requiring a restart doesn't seem crazy
[16:37:33] at least if we look at the bird config: ASN change, never, next-hop (means a different thing there as that's the cr/ToR switch but almost never changed?)
[16:38:49] I think next-hop is the server IP itself, not the cr/tor switch (that is the 'peers')
[16:39:33] ah right in the above, that's probably one of the lvs test instances then
[16:39:56] yeah.. in my paste above is lvs1013 IP
[16:40:19] peer is set to 127.0.0.2 cause I'm running tests locally
[16:40:43] on a production configuration it should be the cr/ToR IPv4
[17:34:58] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200401 (cmooney) To make progress here while we work on automating the sub-zone delegation I have manually delegated the required zones covering the k...
[18:04:46] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200521 (ssingh) Since `gdnsd` is fine and `pdns-recursor` is not, could it be because of this? https://doc.powerdns.com/recursor/settings.html#settin...
[18:07:52] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200529 (ssingh) If we want to confirm the above, we can depool a DNS host, disable Puppet, manually edit `recursor.conf`, restart it, test it and then...
[18:21:19] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200581 (cmooney) @ssingh yes that I think is probably it! Totally makes sense for the recursor to have that rule. People often mess up and put those...
[18:30:06] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200622 (ssingh) Are the kubectls IP in some particular subnet? If so, we can exclude them from above, such as `!10.64.0.0/12` or something -- you get...
[18:44:25] Traffic, Patch-For-Review: Remove RSA certificates and use only ECDSA certificates - https://phabricator.wikimedia.org/T370837#10200685 (BCornwall) Now that some time has passed since the misreporting has been fixed I checked again: ` SELECT tls_auth, COUNT(tls_auth) from "druid"."webrequest_sampled_liv...
[18:44:30] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Reverse DNS for k8s pods IPs - https://phabricator.wikimedia.org/T344171#10200686 (cmooney) @ssingh you were correct!!! awesome <3 We depooled dns1005 and then modified `/etc/powerdns/recursor.conf`, adding a //dont-query//...
[19:02:43] Traffic, Infrastructure-Foundations, SRE: Authdns: automate reverse DNS zone delegation for k8s pod IP ranges - https://phabricator.wikimedia.org/T376291#10200713 (cmooney)
[19:10:18] netops, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: cloudsw: codfw: enable IPv6 - https://phabricator.wikimedia.org/T374713#10200735 (cmooney) >>! In T374713#10199236, @aborrero wrote: > Created: Thanks! I've made some minor edits to them in Netbox btw, just some things...