[04:03:32] Acme-chief, Traffic, Operations: Memory leak on acme-chief 0.21 - https://phabricator.wikimedia.org/T234131 (Vgutierrez) Open→Resolved a:Vgutierrez
[05:14:02] netops, Operations, ops-codfw: msw-c1 down? - https://phabricator.wikimedia.org/T234411 (faidon) p:Triage→High
[06:31:46] netops, Operations: Telia IC-314534 (eqord/eqdfw 10Gbps wave) down - https://phabricator.wikimedia.org/T234335 (elukey) p:Triage→Normal
[06:32:25] netops, Operations: Telia IC-314534 (eqord/eqdfw 10Gbps wave) down - https://phabricator.wikimedia.org/T234335 (elukey) Link is down again as far as I can see from icinga and: ` elukey@re0.cr2-codfw> show interfaces descriptions |match down` xe-5/2/1 up down Transport: cr2-eqord:xe-0/1/0 (Teli...
[06:35:01] Traffic, Beta-Cluster-Infrastructure, Operations, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (mobrovac)
[06:47:11] netops, Operations: Telia IC-314534 (eqord/eqdfw 10Gbps wave) down - https://phabricator.wikimedia.org/T234335 (elukey) ` elukey@re0.cr2-codfw> show interfaces diagnostics optics xe-5/2/1 Physical interface: xe-5/2/1 Laser bias current : 46.512 mA Laser output power...
[07:54:28] netops, Operations: Telia IC-314534 (eqord/eqdfw 10Gbps wave) down - https://phabricator.wikimedia.org/T234335 (elukey) I missed an email from Telia, they are replacing a faulty card that apparently caused flaps and the impact that we saw. Hopefully we'll see recovery soon.
[08:29:26] netops, Operations, ops-eqiad: asw2-a-eqiad <-> cr2-eqiad fiber issue - https://phabricator.wikimedia.org/T234416 (faidon) p:Triage→High
[08:50:29] Traffic, Beta-Cluster-Infrastructure, DNS, Operations, and 4 others: Ferm's upstream Net::DNS Perl library questionable handling of NOERROR responses without records causing puppet errors when we try to @resolve AAAA in labs - https://phabricator.wikimedia.org/T153468 (MoritzMuehlenhoff) Open...
[10:29:08] Traffic, Beta-Cluster-Infrastructure, Operations, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (ema) p:Triage→Normal
[10:36:42] Traffic, Beta-Cluster-Infrastructure, Operations, Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (Vgutierrez) This is caused by adding the ATS-TLS instance to the text cluster. So you need to provide a valid configuration for the ats-tls profile. See:...
[11:00:15] Traffic, Beta-Cluster-Infrastructure, Operations, Core Platform Team Workboards (Clinic Duty Team), Puppet: Puppet fails on deployment-cache-text05 - https://phabricator.wikimedia.org/T234412 (mobrovac) Open→Resolved a:mobrovac As per @Vgutierrez' instructions, I looked up the ATS...
[15:26:29] netops, Operations, ops-codfw: msw-c1 down? - https://phabricator.wikimedia.org/T234411 (ayounsi) a:Papaul @papaul, can you check the LED status, cables (all properly connected), then power cycle the device?
[15:38:29] netops, Operations, ops-eqiad: asw2-a-eqiad <-> cr2-eqiad fiber issue - https://phabricator.wikimedia.org/T234416 (ayounsi) a:Cmjohnson Related to T203719. @Cmjohnson same as when there are interfaces errors, but here monitor for new: `sfp-7/0/46 link 46 SFP receive power low warning set` in `sh...
[15:53:18] netops, Operations: Telia IC-314534 (eqord/eqdfw 10Gbps wave) down - https://phabricator.wikimedia.org/T234335 (ayounsi) Open→Resolved a:elukey Work completed, everything is up, thanks to you two!
[17:17:04] Traffic, Operations, Phabricator, Release-Engineering-Team-TODO, and 2 others: Prepare Phame to support heavy traffic for a Tech Department blog - https://phabricator.wikimedia.org/T226044 (Jdforrester-WMF)
[18:14:24] netops, Operations: configure BGP route damping on IX sessions - https://phabricator.wikimedia.org/T222424 (ayounsi) Updated change with the above feedback: `lang=diff [edit protocols bgp group IX4] + damping; [edit protocols bgp group IX6] + damping; [edit policy-options policy-statement BGP_IXP_...
[18:23:08] netops, Operations: configure BGP route damping on IX sessions - https://phabricator.wikimedia.org/T222424 (ayounsi) For the record: ` cr4-ulsfo> show bgp neighbor | match "Suppressed due to damping"| except " 0" Suppressed due to damping: 1 Suppressed due to damping:...
[18:35:04] netops, Operations: configure BGP route damping on IX sessions - https://phabricator.wikimedia.org/T222424 (ayounsi) Eqord: ` Suppressed due to damping: 4 Suppressed due to damping: 4 Suppressed due to damping: 1 Suppressed due to damping: 1 ` eqdfw: ` Suppressed due to da...
[21:46:19] Hello traffic folks! :) Is there a staff member around who worked on the Let's Encrypt implementation? The nonprofit behind LE is looking for some help with a testimonial for their annual report.
[21:51:10] varnent: vgutierrez is probably the man but it's still night in his TZ I think
[21:56:16] varnent: yeah vgutierrez has worked on our LE stuff the most, maybe loop in vgutierrez via email so he can respond async and CC me pls :)
[21:57:22] I assume they're not looking for anything too deep, just a short "LE is awesome because X and we did Y with it" quip from an actual technical person here.
[22:04:34] vgutierrez: while I'm bumping your name anyways: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/540469/ -> https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/540470/ (we now have dual commercial vendors again, for now. We should talk about how to puppetize per-dc different cert sets for ATS like we do for legacy nginx, and also the conversation about making LE the third option too in
[22:04:40] https://phabricator.wikimedia.org/T230687 as part of the same convo)
[22:13:22] bblack: I think at most a chat with someone to talk about why we picked LE so they can pull some quotes basically
[23:01:04] I'd be interested in seeing the output of that varnent.
[23:25:28] Krenair: I suspect they will publish it - their goal is to help potential funders understand their impact
[23:25:46] varnent, and they want something official from Foundation staff for that?
[23:26:06] Generally we do not do these with partners as we are very protective of our name being used. However as they are a fellow nonprofit, more interested in helping
[23:27:11] Krenair: It sounds like they have a lot of testimonials from corporate technical folks - but not another nonprofit with shared values around things like privacy and security - so yeah - they would like someone with the official Foundation connection who can speak to why we use them so they can show how they help support like-minded tech nonprofits
[23:27:55] and seeing as how we do use them and presumably do not want them to go away - helping them tell that story seems like a fair request. :)
[23:27:57] fair enough
[23:29:17] Given who their donors and potential donors are - seems reasonable to me. Plenty of tech folks that will give them more if they know it's helping other orgs they also support. Donors are not as persuaded by stories of how you used their money to help a company make more money. :)
[23:30:14] Plus they like us and want to humanize our story as well - which stuff like this can help do. :)
[23:30:31] it sounds nice
[23:43:41] netops, Operations, ops-eqiad: asw2-a-eqiad <-> cr2-eqiad fiber issue - https://phabricator.wikimedia.org/T234416 (ayounsi) Open→Resolved Better! ` ayounsi@asw2-a-eqiad> show interfaces diagnostics optics xe-7/0/46 | match "rx|receive" Receiver signal average optical power : 0.0741...
[23:56:57] netops, Operations, ops-eqiad: asw2-a-eqiad <-> cr2-eqiad fiber issue - https://phabricator.wikimedia.org/T234416 (Cmjohnson) I swapped both optics