[02:58:56] FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[03:03:56] RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[09:12:21] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690 (fgiunchedi) NEW
[09:15:03] netops, Infrastructure-Foundations, SRE: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11357477 (fgiunchedi) >>! In T399180#11310972, @cmooney wrote: >>>! In T399180#11310845, @fgiunchedi wrote: >> @taavi @Andrew @cmooney what do you think of the above? >...
[13:47:49] netops, Infrastructure-Foundations, SRE: lsw1-d6-eqiad reboot failed, stuck in UEFI shell - https://phabricator.wikimedia.org/T409731 (cmooney) NEW p:Triage→High
[13:53:28] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690#11358656 (cmooney) I can take a look at this unless there is another plan?
[14:41:59] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690#11358897 (fgiunchedi) Yes please @cmooney, much appreciated! Note that this is currently not a blocker / not high...
[15:40:26] netops, Infrastructure-Foundations, SRE: Audit and verify all cloudcephosd have their primary interface tagged and access to cloud-storage vlan - https://phabricator.wikimedia.org/T409690#11359233 (LSobanski) p:Triage→Medium
[15:41:06] netops, cloud-services-team, Infrastructure-Foundations, Toolforge: Create new VRF and networks for Toolforge-on-Metal - https://phabricator.wikimedia.org/T409309#11359235 (cmooney) p:Triage→Medium
[15:46:31] netops, Infrastructure-Foundations, SRE: Sporadic RST drops in the ulogd logs - https://phabricator.wikimedia.org/T238823#11359271 (LSobanski) Open→Resolved a:LSobanski Resolving as part of backlog review. There have been changes to the network and Puppet since the creation of this ta...
[16:31:36] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: lsw1-d6-eqiad reboot failed, stuck in UEFI shell - https://phabricator.wikimedia.org/T409731#11359494 (Jclark-ctr)
[16:58:03] cccccbukvgbchktntducevfdjulbeiejfdllrbrnlcbj
[16:58:57] cccccdbcihrfkbbtufukkdcvbrchnrebhlbkrtiuhrut
[16:59:43] yubikeys conspiring against humans
[17:01:20] can you blame them?
[17:23:19] there are 2 options: non-nano case = breaks physically. nano case = stays in laptop and you touch it all day by accident
[23:39:11] FIRING: [2x] PfwCoreBGPDown: Fundraising Firewall core BGP session down between pfw1-codfw and (null) (10.195.0.248) - group VPN - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DPfwCoreBGPDown
[23:51:43] ah HA
[23:53:35] is someone looking at the FR firewall issue already?