[10:15:41] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet ip deployed: similar-users - https://phabricator.wikimedia.org/T273275 (jcrespo)
[10:16:07] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet ip deployed: similar-users - https://phabricator.wikimedia.org/T273275 (jcrespo)
[10:21:20] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet ip deployed: similar-users - https://phabricator.wikimedia.org/T273275 (akosiaris) That's probably the netbox equivalent of https://gerrit.wikimedia.org/r/c/operations/dns/+/658976 The back story is that svc IP address haven't yet be...
[10:31:02] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet ip deployed: similar-users - https://phabricator.wikimedia.org/T273275 (hnowlan) Oh, interesting! So the change to the dns repo is still required in this case, just to confirm?
[10:31:37] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet dns entry deployed: similar-users on host decommission - https://phabricator.wikimedia.org/T273275 (jcrespo)
[10:39:11] netops, Anti-Harassment, SRE-tools: Surprising new svc.eqiad.wmnet dns entry deployed: similar-users on host decommission - https://phabricator.wikimedia.org/T273275 (jcrespo) If everything looks good now, maybe this can be converted into a feature-request (lower priority) to "check uncommited netbox...
[11:20:03] netops, Anti-Harassment, SRE, SRE-tools: Surprising new svc.eqiad.wmnet dns entry deployed: similar-users on host decommission - https://phabricator.wikimedia.org/T273275 (akosiaris) >>! In T273275#6786599, @hnowlan wrote: > Oh, interesting! So the change to the dns repo is still required in this...
[12:13:04] netops, Anti-Harassment, SRE, SRE-tools: Surprising new svc.eqiad.wmnet dns entry deployed: similar-users on host decommission - https://phabricator.wikimedia.org/T273275 (akosiaris) >>! In T273275#6786606, @jcrespo wrote: > If everything looks good now, maybe this can be converted into a feature...
[16:55:19] netops, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (CDanis)
[16:55:32] netops, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (CDanis) p:Triage→High
[17:02:43] Traffic, Gerrit, Phabricator, SRE, periodic-update: Phabricator and Gerrit: Improve the way that maintenance downtime is communicated to users. - https://phabricator.wikimedia.org/T180655 (Aklapper) >>! In T180655#6566245, @Dzahn wrote: > I think this is done meanwhile. Both Phabricator and G...
[17:44:49] netops, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (wiki_willy) a:wiki_willy→Cmjohnson No one scheduled to be onsite today, but @Cmjohnson will go in to check it out later this afternoon. Thanks, Willy
[20:18:51] Traffic: Disable broken security update polling for dnsdist - https://phabricator.wikimedia.org/T273322 (ssingh)
[20:19:16] Traffic: Disable broken security update polling for dnsdist - https://phabricator.wikimedia.org/T273322 (ssingh)
[20:19:21] Traffic, SRE, Patch-For-Review: Deploy Wikidough: Experimental DNS-over-HTTPS (DoH) public resolver - https://phabricator.wikimedia.org/T252132 (ssingh)
[20:42:15] Traffic, Patch-For-Review: Disable broken security update polling for dnsdist - https://phabricator.wikimedia.org/T273322 (ssingh) Open→Resolved Change merged and tested; broken security polling is now disabled.
[20:42:17] Traffic, SRE, Patch-For-Review: Deploy Wikidough: Experimental DNS-over-HTTPS (DoH) public resolver - https://phabricator.wikimedia.org/T252132 (ssingh)
[21:21:42] Traffic, netops: cr4-ulsfo<>cr2-eqsin GRE tunnel flapping due to BFD timer expired - https://phabricator.wikimedia.org/T273328 (CDanis)
[21:21:55] Traffic, netops: cr4-ulsfo<>cr2-eqsin GRE tunnel flapping due to BFD timer expired - https://phabricator.wikimedia.org/T273328 (CDanis) p:Triage→High
[21:23:39] Traffic, netops: cr4-ulsfo<>cr2-eqsin GRE tunnel flapping due to BFD timer expired - https://phabricator.wikimedia.org/T273328 (CDanis) The first few cycles of logs from the ulsfo side: `lines=10 Jan 29 20:21:46 cr4-ulsfo bfdd[16019]: BFD Session fe80::827f:f800:43:6b66 (IFL 75) state Up -> Down LD/RD(1...
[21:33:54] netops, SRE, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (Cmjohnson) Open→Resolved The optics on asw2-d2 xe-2/0/40 was bad. I replace both for good measure and the link is back up
[21:45:12] netops, SRE, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (wiki_willy) Thanks @Cmjohnson! >>! In T273301#6788367, @Cmjohnson wrote: > The optics on asw2-d2 xe-2/0/40 was bad. I replace both for good measure and the link is back up
[21:50:04] netops, SRE, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (CDanis) Resolved→Open I think something is still wrong? LibreNMS is showing the port on the asw receiving about 7kbps of errors: https://librenms.wikimedia.org/device/device=149/tab=port/po...
[21:58:18] netops, SRE, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (Cmjohnson) @cdanis trying a different SFP for cr1 now
[22:13:19] netops, SRE, ops-eqiad: cr1-eqiad<>asw2-d-eqiad link down - https://phabricator.wikimedia.org/T273301 (CDanis) Open→Resolved looks good now, thanks!
[23:15:49] Traffic, netops, Data-Services, SRE, cloud-services-team (Kanban): wikireplicas last-minute infra work to discuss / resolve - https://phabricator.wikimedia.org/T273248 (Legoktm) p:Triage→High