[00:00:14] Traffic, DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10897773 (BCornwall)
[00:01:33] Traffic, DC-Ops, ops-eqiad, SRE, Patch-For-Review: Q3:test NIC for lvs1017 - https://phabricator.wikimedia.org/T387145#10897774 (BCornwall)
[06:23:51] FIRING: FermMSS: Unexpected MSS value on 10.2.1.27:80 @ ms-fe2015 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=codfw&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[06:28:51] RESOLVED: FermMSS: Unexpected MSS value on 10.2.1.27:80 @ ms-fe2015 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=codfw&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[07:59:20] Traffic, conftool, Patch-For-Review: FY 24/25 WE 4.3.11 Define a policy for maintenance of requestctl rules - https://phabricator.wikimedia.org/T393381#10898293 (Joe) >>! In T393381#10891298, @CDanis wrote: > I think overall the idea of priorities is sound, although I have some questions about the fi...
[12:18:29] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474 (phuedx) NEW
[12:18:44] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899363 (phuedx)
[12:19:01] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899366 (phuedx)
[12:19:09] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899368 (phuedx)
[12:24:43] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899383 (phuedx) @bblack @Vgutierrez: Further to the small amount of detail in the task description, we saw what appears to be a significant rate...
[12:55:56] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899517 (Vgutierrez) could `getXExperimentEnrollments` be executed for requests where the original path isn't `/evt-103e/v2/events`? I'm asking t...
[13:37:05] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899691 (phuedx) >>! In T396474#10899517, @Vgutierrez wrote: > So it's totally possible that requests headed to intake-analytics.wm.o with WMFUni...
[13:42:33] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10899729 (phuedx) p:Triage→High
[14:48:28] o/ I have a change to multi-dc.lua that I'd like to roll out. Is now an okay time, and does it look okay to ye? https://gerrit.wikimedia.org/r/c/operations/puppet/+/1155198
[14:51:30] looks OK, and falls within allowed_methods for fe_vcl_config. and you can merge whenever you want, thanks for checking!
[14:54:23] thanks!
[15:36:19] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10900403 (cmooney) @akosiaris a quick question about this: > meaning that ICMP traffic to e.g. coredns gets d...
[15:37:52] vgutierrez just a heads-up that we're pretty close on the elastic LVS filtering puppet patch, ref https://phabricator.wikimedia.org/T387569
[15:38:38] probably be done by next week and we can start on the IPIP migration stuff as well
[15:38:47] \o/
[15:38:50] 🍻
[15:38:54] thx
[15:39:32] awesome!
[15:40:18] Sorry it's taken so long, but we're definitely excited to go to IPIP and Liberica. /me needs to rewatch the SRE session.
[15:40:36] good things need their time
[15:40:38] don't worry
[15:43:50] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10900449 (akosiaris) >>! In T352956#10900403, @cmooney wrote: > @akosiaris a quick question about this: > >>...
[16:26:27] that's great Brian thanks for working on it!
[16:31:01] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10900719 (cmooney) @akosiaris thanks for confirming. So overall my thinking is: * Path MTU discovery should...
[16:32:20] Traffic, Prod-Kubernetes, serviceops, Kubernetes, Patch-For-Review: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#10900737 (cmooney) >>! In T352956#10900449, @akosiaris wrote: >>>! In T352956#10900403, @cmooney wrote: >> @ak...
[17:15:27] Traffic, DC-Ops, ops-esams, ops-magru, and 2 others: CPU temperature issues in cp hosts - https://phabricator.wikimedia.org/T373993#10900974 (BCornwall) Fun little tidbit: Our power consumption lowered after increasing the fan speeds in magru {F62284645}
[20:55:51] FIRING: FermMSS: Unexpected MSS value on 10.2.1.27:80 @ ms-fe2015 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=codfw&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[21:00:51] RESOLVED: FermMSS: Unexpected MSS value on 10.2.1.27:80 @ ms-fe2015 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=4&var-site=codfw&var-cluster=swift - https://alerts.wikimedia.org/?q=alertname%3DFermMSS
[21:15:28] Traffic, Experimentation Lab: EventGate: Investigate data loss during the SDS 2.4.11 Synthetic A/A Test experiment - https://phabricator.wikimedia.org/T396474#10902049 (dr0ptp4kt) @BBlack noted that the hashed edge unique values are base64url encoded, not plain base64. @tchin created a [[ https://gi...
[22:55:40] FIRING: [6x] VarnishHighThreadCount: Varnish's thread count on cp5018:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[23:00:40] FIRING: [6x] VarnishHighThreadCount: Varnish's thread count on cp5018:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[23:10:40] FIRING: [8x] VarnishHighThreadCount: Varnish's thread count on cp5018:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[23:15:40] FIRING: [8x] VarnishHighThreadCount: Varnish's thread count on cp5018:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[23:30:40] RESOLVED: [2x] VarnishHighThreadCount: Varnish's thread count on cp5020:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount