[01:15:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:42:57] netops, Infrastructure-Foundations, Traffic: esams/magru: 185.71.138.0/24 (wikidough) prefix not advertized - https://phabricator.wikimedia.org/T420342#11730865 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c4f208ac-d300-44af-a51e-696d5c084e32) set by sukhe@cumin1003 for 1 day...
[02:43:19] netops, Infrastructure-Foundations, Traffic: esams/magru: 185.71.138.0/24 (wikidough) prefix not advertized - https://phabricator.wikimedia.org/T420342#11730866 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=2fd1d9bc-b373-41d2-8079-0dddb67c1c7a) set by sukhe@cumin1003 for 1 day...
[02:54:11] FIRING: [2x] MaxConntrack: Elevated conntrack usage on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[05:15:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:54:11] FIRING: [2x] MaxConntrack: Elevated conntrack usage on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[07:29:30] SRE-tools, Infrastructure-Foundations, serviceops-radar: Add --min-uptime to cookbooks - https://phabricator.wikimedia.org/T419967#11731264 (Ajuanca) `start-datetime` flag of [367592](https://phabricator.wikimedia.org/T367592) acts exactly like the one we're discussing. IMHO, `--not-rebooted-since` i...
[08:10:06] FIRING: [2x] MaxConntrack: Elevated conntrack usage on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[08:14:23] RESOLVED: [2x] MaxConntrack: Elevated conntrack usage on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[09:15:01] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706 (cmooney) NEW p:Triage→High
[09:15:27] netops, Infrastructure-Foundations, SRE: Eqiad C/D refresh: move legacy switch uplinks to Nokias and migrate Vlan GWs - https://phabricator.wikimedia.org/T405562#11731460 (cmooney)
[09:15:28] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706#11731459 (cmooney)
[09:16:43] netops, Infrastructure-Foundations, SRE: Eqiad: move row-wide vlan gateways to Nokia switches - https://phabricator.wikimedia.org/T416872#11731464 (cmooney) Unfortunately we hit another blocker with this so we will have to review the way forward. See T420706.
[09:19:18] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:42:37] netops, Infrastructure-Foundations, SRE: Nokia SR-Linux - wonky routing with IPv6 RAs and EVPN Anycast GW - https://phabricator.wikimedia.org/T420706#11731593 (cmooney) Ticket 05547487 opened with Nokia.
[10:16:09] Hi! i.nflatador recently sent out https://gerrit.wikimedia.org/r/c/operations/puppet/+/1251117 to add a new cfssl profile to the dse-k8s cluster to generate longer-lived certs for our opensearch clusters. Would anyone from IF be able to review? Thank you!
[13:19:54] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:19:54] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:05:10] Mail, Infrastructure-Foundations, Observability-Logging: Allow IT Services to view inbound email logs - https://phabricator.wikimedia.org/T419906#11733675 (jhathaway)
[19:05:48] Mail, Infrastructure-Foundations, Observability-Logging: Allow IT Services to view inbound email logs - https://phabricator.wikimedia.org/T419906#11733676 (jhathaway) added, a few alternatives, after discussing with other folks, happy to hear of others.
[19:15:50] Mail, Infrastructure-Foundations, Observability-Logging: Allow IT Services to view inbound email logs - https://phabricator.wikimedia.org/T419906#11733728 (jhathaway)
[21:19:54] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:19:56] FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
[22:59:56] RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown