[07:17:58] netops, Traffic, Infrastructure-Foundations, serviceops: weighted maglev viability for low-traffic services - https://phabricator.wikimedia.org/T368545#9929335 (ayounsi) Strictly on the network side, there is no blocker one way or the other. I think I miss some context, what's the current low-tr...
[07:24:36] Traffic, MW-Interfaces-Team, serviceops: map the /api/ prefix to /w/rest.php - https://phabricator.wikimedia.org/T364400#9929351 (daniel) This is unblocked from our side - which team is going to take care of creating the routing rule? #traffic or #serviceops?
[07:32:28] Acme-chief, Traffic, Infrastructure-Foundations, Puppet-Infrastructure, and 2 others: Revert back to fleet-wide acmechief config once all ACME consumers are on Puppet 7 - https://phabricator.wikimedia.org/T365799#9929377 (MoritzMuehlenhoff)
[09:16:29] netops, Traffic, Infrastructure-Foundations, serviceops: weighted maglev viability for low-traffic services - https://phabricator.wikimedia.org/T368545#9929623 (Vgutierrez) >>! In T368545#9929335, @ayounsi wrote: > I think I miss some context, what's the current low-traffic setup ? Usually servic...
[09:39:03] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9929683 (ABran-WMF)
[10:09:28] Traffic, SRE: Perform katran load tests on lvs1013 - https://phabricator.wikimedia.org/T342618#9929835 (Vgutierrez) Open→Resolved
[10:12:29] Traffic, collaboration-services, Release-Engineering-Team, Patch-For-Review: CI on gitlab for eBPF / networking heavy projects - https://phabricator.wikimedia.org/T353279#9929840 (Vgutierrez) Open→Resolved a:Vgutierrez This has been solved by running the code inside a fully emulat...
[10:12:53] topranks: re https://phabricator.wikimedia.org/T368544#9926889, I'm aware it was an option for high-traffic[12] as well, but given that we don't want MSS that could lead to IP fragmentation being sent to the UAs that hit us over the Internet I'm guessing that MSS clamping would be needed anyways
[10:21:57] vgutierrez: yep actually that makes perfect sense, we need the clamping on realservers for that reason anyway, going 20 bytes lower is no big loss
[10:39:20] Traffic: Replace ping offload servers with eBPF - https://phabricator.wikimedia.org/T367973#9929970 (Vgutierrez) It looks like we get this for free with katran, ICMPv4 support on https://github.com/facebookincubator/katran/blob/d7575aeae1069a0f761c0414f5c30c05b50285fd/katran/lib/bpf/handle_icmp.h#L286-L288 &...
[10:39:54] FIRING: SystemdUnitFailed: benthos@haproxy_cache.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:40:29] fabfur: ^^
[10:40:48] that's on me, reverted and applying puppet right now
[10:41:06] should be resolved quickly
[10:41:27] it looks like bad hieradata :)
[10:41:38] tcp@127.0.0.1:1221 isn't a valid address for benthos
[10:42:08] given puppet is disabled you could fix it forward
[10:42:13] rather than reverting and submitting another one
[10:42:46] I had already reverted before the alert
[10:44:54] RESOLVED: SystemdUnitFailed: benthos@haproxy_cache.service on cp4037:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:47:45] fabfur: ack
[11:12:53] Traffic: Consider preferring TLS_AES_128_GCM_SHA256 over TLS_AES_256_GCM_SHA384 - https://phabricator.wikimedia.org/T365327#9930056 (Ladsgroup) Might not be a big deal but I also saw this in "Real-world cryptography" (2021) by David Wong. Page 67: > It is foreseeable that AES-128 will remain secure for a lon...
[12:20:20] Traffic: Replace ping offload servers with eBPF - https://phabricator.wikimedia.org/T367973#9930220 (cmooney) >>! In T367973#9929970, @Vgutierrez wrote: > It looks like we get this for free with katran, ICMPv4 support on https://github.com/facebookincubator/katran/blob/d7575aeae1069a0f761c0414f5c30c05b50285f...
[13:42:02] elukey: I got an encoding question for you
[13:46:14] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930546 (CDanis) Could be convinced otherwise, but I'm generally in favor of the MSS clamping option -- we know it works and the tradeoff...
[13:46:25] vgutierrez: o/ no idea if I can answer but shoot :)
[13:50:17] elukey: ok... so what's the encoding that our kafka cluster expects for values sent with varnishkafka?
[13:52:53] we have some clients sending data in non utf-8 encodings
[13:53:14] for example some UAs send ø as 0xF8 (iso-8859-1 or windows-1252 encodings for example)
[13:53:34] that could be seen on the Referer header for instance
[13:53:46] how that ends up showing on the kafka cluster?
[13:53:56] cause it's invalid utf-8
[13:55:20] so IIRC we use snappy compression on the client side, so what kafka sees in theory it is the content already compressed. So I'd say that as long as Varnishkafka is happy with the JSON crafted with weird utf-8 encoding, then Kafka will not really complain/validate anything
[13:56:03] do you see 0xF8 on kafkacat while reading records from webrequest?
[13:56:41] not weird.. in this case 0xf8 will make a utf-8 decoder fail
[13:57:21] ah yes sorry, mistyped
[13:58:05] I don't recall exactly the C code but I think the JSON was crafted in a very coarse grained way, namely copy pasting what varnish returned from the shm-log
[14:00:13] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930584 (Joe) I'd go ahead and take a step back: why do we need to switch to IPIP encapsulation for backend services? Is there a compell...
[14:00:25] https://github.com/wikimedia/operations-software-varnish-varnishkafka/blob/master/varnishkafka.c#L1386 should list what we use
[14:01:17] I am wondering if yajl is responsible for this
[14:01:29] but with snappy compression, I am almost sure that kafka doesn't care
[14:03:29] netops, Traffic, Infrastructure-Foundations, serviceops: weighted maglev viability for low-traffic services - https://phabricator.wikimedia.org/T368545#9930604 (Joe) It is pretty clear to me that the only way to have fair load balancing with `maglev` is if we do the consistent hashing using the r...
[14:20:47] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930681 (akosiaris) T352956 is related (possibly a duplicate) and I 've mulling over it for a few months now. I think we need to have a l...
[14:35:22] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930740 (Vgutierrez) >>! In T368544#9930584, @Joe wrote: > I'd go ahead and take a step back: why do we need to switch to IPIP encapsulat...
[14:46:41] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930773 (Joe) >>! In T368544#9930740, @Vgutierrez wrote: >>>! In T368544#9930584, @Joe wrote: >> I'd go ahead and take a step back: why d...
[14:46:57] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9930774 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=66810f76-0e2d-43f3-8c96-bbfe4e6a7aee) se...
[14:52:20] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930800 (BBlack) For more context: eventually our Katran-based Liberica balancer will replace pybal/LVS. The Katran one has to use IPIP,...
[14:57:10] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930822 (Vgutierrez) theoretically speaking we could keep low-traffic on liberica/IPVS (instead of liberica/Katran) to be able to get rid...
[14:57:54] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9930838 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5021.eqsin.wmnet with OS b...
[14:57:58] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9930839 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=2863d158-d71c-4317-a811-4dd3cb8e6e72) se...
[14:58:49] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9930845 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=bd008f08-7b85-4b69-ba4e-5d84a9307d79) se...
[15:17:40] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930949 (cmooney) >>! In T368544#9930584, @Joe wrote: > I'd go ahead and take a step back: why do we need to switch to IPIP encapsulation...
[15:19:53] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9930977 (Joe) >>! In T368544#9930822, @Vgutierrez wrote: > theoretically speaking we could keep low-traffic on liberica/IPVS (instead of...
[15:20:49] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9930989 (cmooney) Upgrade completed, all looking good network-wise.
[15:40:53] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e7-eqiad - https://phabricator.wikimedia.org/T365988#9931141 (Eevans) >>! In T365988#9930989, @cmooney wrote: > Upgrade completed, all looking good network-wise. Than...
[15:41:31] netops, Traffic, Infrastructure-Foundations, serviceops: IPIP encapsulation considerations for low-traffic services - https://phabricator.wikimedia.org/T368544#9931142 (Vgutierrez) >>! In T368544#9930977, @Joe wrote: > oh I agree 100% with this. My doubts were specifically for switching to katran...
[15:45:31] elukey: sorry.. got trapped in some meetings
[15:45:37] elukey: so... https://github.com/lloyd/yajl/blob/5e3a7856e643b4d6410ddc3f84bc2f38174f2872/src/yajl_gen.c#L260
[15:45:46] it looks like generating valid utf-8 is optional for yajl_gen_string()
[15:46:35] bingo, yes I suspected something similar
[15:47:50] checking varnishkafka code it looks like it doesn't perform any kind of config for yajl (what a lovely name)
[15:48:08] https://github.com/wikimedia/operations-software-varnish-varnishkafka/blob/1a8364fb8216e44590c87509bc260a6ed20d68c4/varnishkafka.c#L1396-L1401
[15:48:12] just a _gen_alloc()
[15:48:19] that sets everything to 0
[15:49:29] and the valid UTF-8 flag is 0x08
[15:49:38] so I'm guessing it's disabled by default
[15:49:49] elukey: how can I run kafkacat on a cp host?
[15:50:00] I want to validate this theory :)
[15:50:27] assuming I can do it on the cp host that sends the request via varnishkafka of course
[15:53:53] vgutierrez: do you mean how you can isolate a specific cp node consuming from webrequest?
[15:54:01] yes
[15:54:31] I want to trigger a request with this ø encoded as iso-8859-1 and see what varnishkafka sends
[15:55:03] BTW... I'm enjoying this rabbit hole because invalid utf-8 makes benthos fail :)
[15:55:21] and it kinda makes sense because we use RFC 5424 syslog to send data from haproxy to benthos
[15:55:31] and RFC 5424 says that structured data will be sent as valid UTF-8
[15:55:43] and it isn't valid UTF-8 at all :)
[15:56:37] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9931221 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5021.eqsin.wmnet with OS bulls...
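Annotation on the encoding thread above: a minimal Python sketch of why a lone 0xF8 byte trips a strict UTF-8 decoder while still being a perfectly good "ø" in ISO-8859-1. The Referer values are made up for illustration. The flag value 0x08 and the "everything set to 0 after alloc" behaviour are taken from the chat, not verified here against the yajl or varnishkafka sources.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical Referer values: the same URL with "ø" sent as the single
# byte 0xF8 (ISO-8859-1 / windows-1252) and as proper UTF-8 (0xC3 0xB8).
latin1_referer = b"https://example.org/s\xf8k"
utf8_referer = "https://example.org/s\u00f8k".encode("utf-8")

print(is_valid_utf8(latin1_referer))         # strict UTF-8 decode fails
print(is_valid_utf8(utf8_referer))           # strict UTF-8 decode succeeds
print(latin1_referer.decode("iso-8859-1"))   # the byte is a valid ø in latin-1

# Bitmask reasoning from the chat: the UTF-8 validation flag is quoted as
# 0x08, and the generator's flags are assumed to start at 0 after alloc.
VALIDATE_UTF8_FLAG = 0x08   # value as quoted in the chat
flags_after_alloc = 0       # assumption: nothing is configured afterwards
validation_enabled = bool(flags_after_alloc & VALIDATE_UTF8_FLAG)
print(validation_enabled)
```

If that assumption holds, the generator would pass the 0xF8 byte through unchanged, which is consistent with the theory in the chat that a downstream strict decoder is the first thing to complain.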
[15:56:59] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9931222 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5021.eqsin.wmnet with OS b...
[15:57:17] so as a side quest the question of how varnish handles this came to mind
[16:37:47] vgutierrez: sorry got dragged into a supermicro issue :)
[16:41:04] no problem :)
[16:43:16] on a stat node you can run something like
[16:43:16] kafkacat -C -b kafka-jumbo1015.eqiad.wmnet:9092 -t webrequest_text | jq '. | select (.hostname == "cp3072.esams.wmnet")'
[16:43:34] and you should get only records from the node
[16:43:52] (not sure if it answers the question that you had)
[16:44:24] yeah.. I could filter by URL as well that would be great
[16:44:31] just to not get flooded :)
[16:44:37] do we have documentation on that?
[16:45:24] but that's helpful already, worst case scenario I'll depool the host first O:)
[16:45:29] thx :)
[16:45:40] oh, that's jq already
[16:45:42] perfect
[16:47:53] kafkacat -C -b kafka-jumbo1015.eqiad.wmnet:9092 -t webrequest_text | jq '. | select (.hostname == "cp3072.esams.wmnet") | select (.uri_path != "/w/load.php")'
[16:48:06] this is an example about filtering further
[16:48:13] not a jq expert but it should work fine
[16:50:02] vgutierrez: --^
[17:00:06] Thx :)
[17:01:26] Acme-chief, Traffic, Patch-For-Review: Create automation for registered MarkMonitor DNS and acme-chief/ncredir - https://phabricator.wikimedia.org/T355189#9931520 (BCornwall) In progress→Resolved Marking as resolved since this is vague and technically has been achieved. Any further develo...
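Annotation on the kafkacat/jq filtering above: a rough Python equivalent of the two chained `select` filters, for cases where the same filtering is wanted on saved output rather than in a shell pipeline. The field names `hostname` and `uri_path` come from the jq expressions in the chat. The sample records and the function name `filter_records` are invented for illustration and do not reflect the real webrequest schema beyond those two fields.

```python
import json
from typing import Iterable, Iterator


def filter_records(lines: Iterable[str], hostname: str, skip_uri_path: str) -> Iterator[dict]:
    """Yield JSON records from one host, skipping one noisy URI path."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON instead of aborting.
            continue
        if not isinstance(record, dict):
            continue
        # Same two conditions as the chained jq select() calls.
        if record.get("hostname") == hostname and record.get("uri_path") != skip_uri_path:
            yield record


# Made-up sample records, one JSON object per line as a consumer would print them.
sample_lines = [
    '{"hostname": "cp3072.esams.wmnet", "uri_path": "/w/load.php"}',
    '{"hostname": "cp3072.esams.wmnet", "uri_path": "/wiki/Main_Page"}',
    '{"hostname": "cp5021.eqsin.wmnet", "uri_path": "/wiki/Main_Page"}',
    'not json at all',
]

kept = list(filter_records(sample_lines, "cp3072.esams.wmnet", "/w/load.php"))
for record in kept:
    print(record["hostname"], record["uri_path"])
```

In practice the lines would come from the consumer's standard output (for example by iterating over `sys.stdin`) instead of a hard-coded list.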
[17:02:38] FIRING: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[17:06:19] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9931532 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5021.eqsin.wmnet with OS bulls...
[17:07:38] RESOLVED: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5021 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[17:36:18] Traffic: Harden ncmonitor systemd service - https://phabricator.wikimedia.org/T368638#9931679 (Aklapper)
[18:11:18] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9931845 (BCornwall)
[18:19:22] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9931877 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5022.eqsin.wmnet with OS b...
[18:52:57] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932100 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5022.eqsin.wmnet with OS bulls...
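Annotation on the MSS topic (the LVSRealserverMSS alerts here and the "going 20 bytes lower" remark at 10:21:57): a small arithmetic sketch of why IPIP encapsulation costs 20 bytes of MSS for IPv4. The 1500-byte MTU and the option-free header sizes are assumptions for illustration only. The MSS values actually configured on the realservers are not stated in this log.

```python
IPV4_HEADER_BYTES = 20   # IPv4 header without options
TCP_HEADER_BYTES = 20    # TCP header without options
IPIP_OVERHEAD_BYTES = 20 # extra outer IPv4 header added by IPIP encapsulation


def plain_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unencapsulated IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def clamped_mss(mtu: int) -> int:
    """MSS that still fits after the balancer wraps the packet in IPIP."""
    return plain_mss(mtu) - IPIP_OVERHEAD_BYTES


# Assumed path MTU of 1500 bytes (typical Ethernet, not taken from the log).
mtu = 1500
print(plain_mss(mtu))    # payload budget without encapsulation
print(clamped_mss(mtu))  # payload budget once the outer IPv4 header is added
```

Clamping the advertised MSS to the lower value means a full-size segment plus the outer header still fits in the MTU, which is the fragmentation concern raised earlier in the log.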
[18:53:00] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932102 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5022.eqsin.wmnet with OS b...
[19:37:38] FIRING: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5022 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[19:42:38] RESOLVED: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5022 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[19:44:38] FIRING: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5022 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[19:48:13] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932421 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5022.eqsin.wmnet with OS bulls...
[19:49:38] RESOLVED: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5022 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[20:34:01] FIRING: PurgedHighEventLag: High event process lag with purged on cp5020:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5020 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[20:39:01] FIRING: [15x] PurgedHighEventLag: High event process lag with purged on cp5017:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[20:40:02] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932574 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5023.eqsin.wmnet with OS b...
[20:59:01] FIRING: [17x] PurgedHighEventLag: High event process lag with purged on cp5017:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[21:04:01] RESOLVED: [25x] PurgedHighEventLag: High event process lag with purged on cp5017:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighEventLag
[21:52:38] FIRING: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[21:57:38] RESOLVED: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[21:59:38] FIRING: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[22:02:33] Traffic, DC-Ops, ops-eqsin, SRE, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932790 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5023.eqsin.wmnet with OS bulls...
[22:04:38] RESOLVED: [8x] LVSRealserverMSS: Unexpected MSS value on 103.102.166.224:443 @ cp5023 - https://wikitech.wikimedia.org/wiki/LVS#LVSRealserverMSS_alert - https://grafana.wikimedia.org/d/Y9-MQxNSk/ipip-encapsulated-services?orgId=1&viewPanel=2&var-site=eqsin&var-cluster=cache_text - https://alerts.wikimedia.org/?q=alertname%3DLVSRealserverMSS
[22:04:41] Traffic, DC-Ops, ops-eqsin, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932805 (BCornwall)
[22:44:00] Traffic, DC-Ops, ops-eqsin: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9932922 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5024.eqsin.wmnet with OS bullseye
[23:33:22] Traffic, DC-Ops, ops-eqsin, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9933091 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host cp5024.eqsin.wmnet with OS bullseye execu...
[23:33:38] Traffic, DC-Ops, ops-eqsin, Patch-For-Review: Q4: install PCIe NVMe SSDs into eqsin text cp50(1[789]|2[01234] - https://phabricator.wikimedia.org/T365763#9933092 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host cp5024.eqsin.wmnet with OS bullseye