[01:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:11:04] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[04:11:04] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[04:46:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[05:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:36:00] Mail, Infrastructure-Foundations, SRE, Wikimedia-Mailing-lists: Replace Exim on lists.wikimedia.org with Postfix - https://phabricator.wikimedia.org/T378021#11208956 (ABran-WMF)
[08:46:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[09:06:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[09:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:45:04] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[12:14:00] netbox, Infrastructure-Foundations, Patch-For-Review, Regression: after logging into Netbox, NDAs see an empty dashboard - https://phabricator.wikimedia.org/T404494#11209739 (SLyngshede-WMF) Open→In progress Users can now request the new permission via https://idm.wikimedia.org/permission...
[12:36:58] elukey: sorry to bug you, and thanks for the reviews on my patches this week, they've been very helpful :)
[12:37:06] I have this one which is hopefully quick, and would unblock me:
[12:37:13] https://gerrit.wikimedia.org/r/c/operations/software/netbox-extras/+/1191012
[12:43:13] topranks: I have it opened in my browser, I wanted to do it later but lemme check
[12:43:32] ah no it is another one! reviewing
[12:44:34] yeah this one is hopefully simple... I did some sanity check tests with the regex here so it does work, similar pattern to the one used for the others to allow the ".xxx" at the end
[12:45:04] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[12:46:09] topranks: tested as well, looks good
[12:48:00] thanks!
[13:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:24:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[14:25:31] topranks: I forwarded you Supermicro's reply if you are curious, their analysis seems wrong to me, they said they saw a HEAD request in the cold boot case, which I don't see, perhaps they looked at the wrong pcap file?
[14:27:21] as an aside why is the wireshark filter for http, simply `http`, strange
[14:46:04] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[15:29:57] jhathaway: thanks yeah I'm not quite sure I get their answer
[15:30:03] perhaps just lost in translation
[15:30:35] yeah, I replied asking for clarity, though that may never be obtained ;P
[15:30:38] they seem to reference an error log saying the HTTP HEAD request didn't work, which makes sense, it isn't actually sent
[15:32:51] I really can't work out what pcap they are talking about. The numbers don't correspond to those in the filtered set of files I attached to the task anyway
[15:33:14] as to why wireshark's http filter is 'http' I don't know but to my mind that absolutely makes sense!
[15:33:33] I sent the original pcaps sorry, I was a bit too trigger happy with my reply
[15:33:39] ah ok
[15:34:10] I deleted those so I'm at a loss, but either way they show the same thing just less clear-cut
[15:34:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[15:34:27] but there is no HEAD request in the hostname-cold-boot.pcap file that I see
[15:34:34] nope it's not sent
[15:34:57] clearly the problem, it's supposed to send it, and probably sets up some read buffer for the required size on the back of that
[15:35:07] well we will see what they say, I'm targeting 2027 for the completion of this task
[15:35:29] when it doesn't get sent, but does send the GET, it hasn't initialised itself properly to receive the file
[15:36:01] yep let's see how the years unfold, children grow up, world evolves..... and Supermicro responds
[15:37:00] :)
[15:37:20] netops, Infrastructure-Foundations, SRE: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11210576 (cmooney)
[15:40:43] netops, Infrastructure-Foundations, SRE: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11210598 (cmooney) As discussed in today's meeting I believe all the cloudcephosd hosts have jumbo frames enabled on all their physical interfaces. So there should be n...
[15:46:04] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[16:02:34] why is idp suddenly giving me a permission error when trying to log in to netbox?
[16:03:57] commit 1dcf9c8bf751dac92aa136920e9a94fea6bace1f
[16:04:00] P:idp swap nda for netbox-readonly-access in Netbox-OIDC
[16:04:08] probably because of this
[16:04:54] so you probably now need to be in netbox-readonly-access
[16:05:17] http://phabricator.wikimedia.org/T404494
[16:05:22] it’s probably related
[16:05:42] but taavi shouldn’t be in the nda / ro group anymore I thought
[16:12:58] I'm in ops which gives me full write access in netbox but I guess is not enough to log in on its own?
[16:44:02] netbox, Infrastructure-Foundations, Regression: after logging into Netbox, NDAs see an empty dashboard - https://phabricator.wikimedia.org/T404494#11210986 (Novem_Linguae) * Looks like there's a spot in IDM somewhere where a description can be typed for Netbox-readonly-access. We might want to find t...
[16:51:53] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Tidy up lvs1018 L2 link to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T405499 (cmooney) NEW p:Triage→Medium
[17:13:54] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: Tidy up lvs1018 L2 link to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T405499#11211183 (cmooney) For reference these are the vlans / IPs currently connected: ` lvs1018 - enp94s0f0np0 - vlan1031 - 10.64.130.18/24 - private1-e1-...
[17:22:06] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, and 2 others: Tidy up lvs1018 L2 link to ssw1-e1-eqiad - https://phabricator.wikimedia.org/T405499#11211217 (cmooney)
[17:28:35] taavi: sorry missed that earlier. Being in ops is I think all you should need. It may be some weirdness due to caching or the fact you first logged in on that
[17:28:52] we can ask Simon tomorrow and see if he knows, probably something simple
[17:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:02:35] yeah let's see what comes from it.. you never know!
[21:58:40] FIRING: SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:03:25] FIRING: [2x] SystemdUnitFailed: dnsmasq.service on ganeti7001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed