[00:19:25] FIRING: SystemdUnitFailed: sync-puppet-volatile.service on puppetmaster2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[00:34:25] RESOLVED: SystemdUnitFailed: sync-puppet-volatile.service on puppetmaster2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[03:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[03:35:25] FIRING: SystemdUnitFailed: sync-puppet-volatile.service on puppetmaster2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:50:25] RESOLVED: SystemdUnitFailed: sync-puppet-volatile.service on puppetmaster2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[07:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[09:19:06] netbox, ChangeProp, cloud-services-team, collaboration-services, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#10882980 (hashar) Open→Resolved a:hashar After chatting with Alexandros, the relicensi...
[10:06:14] netops, Infrastructure-Foundations, SRE: Export additional network device stats in gnmi - https://phabricator.wikimedia.org/T395998 (cmooney) NEW p:Triage→Low
[10:06:59] netops, Infrastructure-Foundations, SRE: Export additional network device stats in gnmi - https://phabricator.wikimedia.org/T395998#10883105 (cmooney)
[10:07:02] netops, Infrastructure-Foundations, Observability-Alerting, Patch-For-Review: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10883106 (cmooney)
[10:11:09] netops, Infrastructure-Foundations, SRE: Homer: stop using the 'section' macro in jinja templates - https://phabricator.wikimedia.org/T395555#10883127 (cmooney) Ok thanks guys. Let me see if I can prep a patch to remove it where we currently are. It would clear up my proposed IBGP patch quite a bit...
[11:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[11:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[11:41:55] netops, Infrastructure-Foundations, Observability-Alerting, Patch-For-Review: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10883377 (ayounsi) Open→Stalled a:ayounsi Going to mark that one as stalled until we can either onboard new device...
[11:43:22] netops, Infrastructure-Foundations, SRE: Export additional network device stats in gnmi - https://phabricator.wikimedia.org/T395998#10883383 (ayounsi) Good idea! in theory not particularly difficult, but we should look at reducing the load (go routines) on the current gNMIc instances first.
[11:57:24] netops, Infrastructure-Foundations: Downgrade pfw1-codfw to Junos 23.4R2-S3 - https://phabricator.wikimedia.org/T393996#10883453 (cmooney) p:Medium→Low >>! In T393996#10861802, @Dwisehaupt wrote: > Just pinging on this. Maintenance week is this week and we are ok for the work to happen when you a...
[12:29:15] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Homer: stop using the 'section' macro in jinja templates - https://phabricator.wikimedia.org/T395555#10883638 (cmooney) Open→Resolved a:cmooney
[12:45:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:50:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:23:09] FIRING: GanetiMemoryPressure: Ganeti: High memory usage (94.37%) on ganeti2020:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[14:33:10] RESOLVED: GanetiMemoryPressure: Ganeti: High memory usage (94.3%) on ganeti2020:9100 - https://wikitech.wikimedia.org/wiki/Ganeti#Memory_pressure - https://grafana.wikimedia.org/d/gd6vep5Iz/ganeti-memory-pressure?orgId=1&var-site=codfw - https://alerts.wikimedia.org/?q=alertname%3DGanetiMemoryPressure
[15:07:11] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, and 2 others: Install and cable Nokia test devices and test servers in codfw - https://phabricator.wikimedia.org/T385217#10884415 (cmooney) >>! In T385217#10879725, @Jhancock.wm wrote: > @cmooney I'm gonna reply to Jorge's email about boxes a...
[15:18:25] FIRING: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[15:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[15:28:25] RESOLVED: SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:02:30] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Install and cable Nokia test devices and test servers in codfw - https://phabricator.wikimedia.org/T385217#10884669 (Jhancock.wm) okay cool. I'm gonna unrack them tomorrow and get them boxed. i replied to Nokia's email asking for pac...
[17:53:16] Mail, Infrastructure Security, Infrastructure-Foundations, SRE: Add DMarcian trial-account address to the dmarc-ruf@wikimedia.org mailing list - https://phabricator.wikimedia.org/T396062 (Jgreen) NEW
[19:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[19:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[19:37:02] Mail, Infrastructure-Foundations, MediaWiki-Email, MediaWiki-extensions-EmailAuth, and 3 others: Could not send confirmation email: Unknown error in PHP's mail() function. - https://phabricator.wikimedia.org/T383047#10885665 (jhathaway) >>! In T383047#10879595, @Tgr wrote: > What would help with...
[20:47:30] Mail, Infrastructure-Foundations, MediaWiki-Email, MediaWiki-extensions-EmailAuth, and 3 others: Could not send confirmation email: Unknown error in PHP's mail() function. - https://phabricator.wikimedia.org/T383047#10885830 (Tgr) >>! In T383047#10885665, @jhathaway wrote: > I tried cutting a pat...
[23:18:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts
[23:23:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting