[08:29:45] hi! when you get a chance I'd like your take on https://gerrit.wikimedia.org/r/c/operations/puppet/+/597765, I've audited in the related task the services we'd stop checking SNI for and they LGTM
[09:07:59] thank you mr vgutierrez !
[09:10:35] np
[09:12:55] Traffic, Operations: check_http and SNI support - https://phabricator.wikimedia.org/T253292 (fgiunchedi) Open→Resolved a:fgiunchedi Change deployed, resolving
[09:47:21] I'm going to roll-restart pybal on codfw low-traffic to pick up a new service
[09:52:11] ack
[09:53:21] aannnd backing out of the change :| proxyfetch doesn't send SNI AFAICT and thus raises ConnectionLost
[09:54:39] ouch
[09:55:10] that's pretty interesting.. considering that ProxyFetch is talking TLSv1.3 to ats-tls
[09:55:53] dt:2020-05-25T09:55:43Z hostname:cp3050.esams.wmnet time_firstbyte:0 ip:2620:0:862:102:10:20:0:17 cache_status:int-front http_status:200 response_size:40 http_method:GET uri_host:varnishcheck.wikimedia.org uri_path:/from/pybal content_type:- referer:- user_agent:Twisted PageGetter accept_language:- x_analytics:https=1 range:- x_cache:cp3050
[09:55:53] int accept:- backend:Varnish tls:vers=TLSv1.3;keyx=X25519;auth=ECDSA;ciph=AES-256-GCM-SHA384;prot=h1;sess=new
[09:56:12] that's proxyfetch traffic on cp3050
[09:56:38] mhh interesting indeed, it is a hunch so far the SNI thing, will investigate after the revert
[09:56:57] godog: can you capture a full TLS handshake?
[09:57:19] using the source ip of the lvs you should only see healthchecks and not real traffic
[09:57:41] vgutierrez: for sure, will capture now on thanos-fe2001
[09:58:56] but I think you're right
[09:59:06] I have here an old pcap of lvs healthchecks
[09:59:13] and I'm not seeing the SNI on the ClientHello
[10:02:57] yeah I suspected as much because envoy closes the connection in sni-strict mode
[10:11:49] Traffic, Operations, Pybal: PyBal ProxyFetch failure when talking to Envoy in SNI-only mode - https://phabricator.wikimedia.org/T253527 (fgiunchedi)
[11:38:59] Traffic, Operations: atskafka: expose rdkafka metrics to prometheus - https://phabricator.wikimedia.org/T253551 (ema)
[11:39:05] Traffic, Operations: atskafka: expose rdkafka metrics to prometheus - https://phabricator.wikimedia.org/T253551 (ema) p:Triage→Medium
[11:52:36] Traffic, Analytics, Operations: Remove ganglia leftovers from ops/puppet - https://phabricator.wikimedia.org/T253555 (ema)
[11:53:58] Traffic, Analytics, Operations: Remove ganglia leftovers from ops/puppet - https://phabricator.wikimedia.org/T253555 (ema) p:Triage→Low
[11:54:51] Traffic, Analytics, Operations: Remove ganglia leftovers from ops/puppet - https://phabricator.wikimedia.org/T253555 (ema)
[12:05:54] Traffic, Analytics, Operations, Patch-For-Review: Remove ganglia leftovers from ops/puppet - https://phabricator.wikimedia.org/T253555 (ema)
[12:17:25] Traffic, Operations, Patch-For-Review: Implement a prometheus exporter for rdkafka in golang - https://phabricator.wikimedia.org/T253197 (ema) Open→Resolved a:ema The package prometheus-rdkafka-exporter is now available in buster-wikimedia, closing.
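Note on the SNI exchange above (09:53 to 10:02): the check described there, looking for the server name in the ClientHello, can be reproduced without a packet capture. What follows is only a minimal sketch using Python's standard `ssl` module; it is not PyBal's Twisted-based ProxyFetch code, and `service.example.org` is a placeholder hostname, not one of the services discussed. It builds a ClientHello in memory with and without `server_hostname` and checks whether the name appears in the bytes that would go on the wire.

```python
import ssl


def client_hello(server_hostname=None):
    """Return the raw bytes of a TLS ClientHello, generated in memory.

    No network connection is made: the handshake is started against a
    pair of memory BIOs and stops as soon as it needs the server's reply.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # The handshake is never completed, so certificate checks are
    # irrelevant here; disabling them also permits server_hostname=None.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    incoming = ssl.MemoryBIO()  # bytes from the (absent) server
    outgoing = ssl.MemoryBIO()  # bytes the client would send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client is
        # now waiting for a ServerHello that will never arrive.
        pass
    return outgoing.read()


if __name__ == "__main__":
    name = b"service.example.org"  # placeholder hostname (assumption)
    with_sni = client_hello(name.decode())
    without_sni = client_hello(None)
    print("SNI present when server_hostname is set:", name in with_sni)
    print("SNI present when server_hostname is omitted:", name in without_sni)
```

A client that never passes a hostname to its TLS layer produces the second kind of ClientHello, which is consistent with the pcap observation above and with a server in SNI-strict mode closing the connection.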
[12:44:35] attempting the thanos-swift pybal setup again shortly, this time should work as expected
[12:45:37] ack
[16:09:22] XioNoX: that zayo circuit 120003 is hours past its end of maintenance window but still down
[16:46:35] cdanis: good thing you have them on quick dial
[16:46:36] :)
[16:48:35] (I'm not working today unless it's urgent, I just connected for the Equinix call that I scheduled a day I forgot was a holiday)
[16:49:19] I emailed them but am again logging off :)
[16:50:44] cdanis: I didn't see your email, was about to look into it, did you CC noc?
[16:50:53] ah, now :)
[16:51:12] thanks!
[18:29:16] Traffic, Continuous-Integration-Infrastructure, Operations: Caching of https://doc.wikimedia.org/cover/mediawiki-libs-IPUtils/IPUtils.php.html is inconsistent - https://phabricator.wikimedia.org/T252131 (hashar) doc.wikimedia.org is mostly static files. In the Apache config there is: ` # Lower ca...
[19:05:45] Traffic, Continuous-Integration-Infrastructure, Operations: Caching of https://doc.wikimedia.org/cover/mediawiki-libs-IPUtils/IPUtils.php.html is inconsistent - https://phabricator.wikimedia.org/T252131 (Krinkle) Did the inconsistency last for over an hour? If not, I think this is expected given muti...