[01:37:14] if you are using the cloud VPS project "traffic" you should mark it on this page to prevent it from being removed
[01:37:17] https://wikitech.wikimedia.org/wiki/News/Cloud_VPS_2019_Purge#traffic
[08:21:42] netops, Operations, Wikimedia-Incident: Improve resiliency of the eqsin transport link - https://phabricator.wikimedia.org/T236878 (ayounsi) p:Triage→Normal
[08:31:33] netops, Operations: cr3-esams crash - https://phabricator.wikimedia.org/T236598 (ayounsi) re1 is unresponsive, even through console. We have 2 options to try to power cycle it: - Have someone onsite unseat/reseat the card (non disruptive) - Power cycle the whole router (disruptive) I'd suggest the 2nd...
[09:21:40] Traffic, Analytics, Analytics-Kanban, Operations, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (Vgutierrez) @JAllemandou it's currently split like this: ` - VCL_Log CP-TLS-Version: TLSv1.2 - VCL_Log CP-TLS-Sess...
[09:38:18] Traffic, Analytics, Analytics-Kanban, Operations, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (JAllemandou) Thanks @Vgutierrez - I think representing those values in a map (or an array) is probably the easiest and most flexibl...
[09:44:04] Traffic, Operations, Patch-For-Review: Move cache text cluster from nginx to ats-tls - https://phabricator.wikimedia.org/T231627 (Vgutierrez)
[09:55:29] Traffic, Operations, Patch-For-Review: Move cache text cluster from nginx to ats-tls - https://phabricator.wikimedia.org/T231627 (Vgutierrez)
[10:43:03] Traffic, Analytics, Analytics-Kanban, Operations, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (elukey) Sure! The JSON format of what we collect from Varnish for webrequest is in `profile::cache::kafka::webrequest`: ` format...
[11:03:58] Traffic, Analytics, Analytics-Kanban, Operations, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (elukey) I set up a test webrequest.conf on cp2001, and confirmed that the solution works! Side note - varnishkafka set with output...
[11:44:07] Traffic, Operations, Patch-For-Review: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin1001.eqiad.wmnet for hosts: ` ['cp5008.eqsin.wmnet'] ` The log can be found in `/var/log/wm...
[12:22:57] Traffic, Operations: Slow loading and connectivity issues on some wikis - https://phabricator.wikimedia.org/T236872 (Aklapper) Resolved→Declined
[13:37:07] Traffic, Operations: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp5008.eqsin.wmnet'] ` Of which those **FAILED**: ` ['cp5008.eqsin.wmnet'] `
[13:46:44] Traffic, Operations: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Script wmf-auto-reimage was launched by ema on cumin1001.eqiad.wmnet for hosts: ` ['cp5008.eqsin.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/2019103...
[14:32:49] Traffic, DC-Ops, Operations, ops-esams: cp3056 hardware issue - https://phabricator.wikimedia.org/T236497 (BBlack) Tried again this morning, but the kernel panics happen too fast to make much progress once the agent starts actually using the NIC (I've only ever had one agent run complete successf...
[14:36:03] Traffic, Operations: Replace Varnish backends with ATS on cache text nodes - https://phabricator.wikimedia.org/T227432 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['cp5008.eqsin.wmnet'] ` and were **ALL** successful.
[14:36:19] \o/
[14:36:29] I knew that turning it off and on again would have fixed everything
[14:39:50] :D
[14:46:38] fix most technical problem with this one weird trick! computers HATE this!!
[14:47:42] You won't believe what happened next.
[14:49:55] lol
[15:08:20] "first you will get mad, then you will be inspired"
[15:33:12] Traffic, Commons, MediaWiki-File-management, Multimedia, and 8 others: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (aaron) >>! In T231086#5601608, @fgiunchedi wrote: > swiftrepl is puppetized now to run an eqiad -> codfw sync once a week on Monday (wi...
[16:14:52] Traffic, Core Platform Team, MediaWiki-extensions-CentralAuth, Operations, and 5 others: Consistent HTTP 503 Error on some urls for some logged-in users (CentralAuth Set-Cookie storm) - https://phabricator.wikimedia.org/T226840 (BBlack) >>! In T226840#5615777, @Ottomata wrote: > It sounds like th...
[17:25:46] Traffic, Operations, Patch-For-Review, Puppet, User-jbond: Serve volatile uri from local site - https://phabricator.wikimedia.org/T235427 (jbond)
[17:26:11] netops, Operations, Puppet, User-jbond: Investigate improvements to how puppet manages interfaces - https://phabricator.wikimedia.org/T234207 (jbond)
[17:26:29] netops, Operations, User-jbond: Investigate the potential benefits of BGPalerter - https://phabricator.wikimedia.org/T230600 (jbond)
[17:32:17] Traffic, Operations, Patch-For-Review, Puppet, User-jbond: Serve volatile uri from local site - https://phabricator.wikimedia.org/T235427 (BBlack) ~15m delays should be ok for the GeoIP stuff, it was already sync'd to various consuming cache and DNS nodes over the ~30 minute splay window of p...
[17:58:10] Traffic, Analytics, Operations, User-jbond: Fix geoip updaters for new MaxMind hashed keys by 2019-08-15 - https://phabricator.wikimedia.org/T228533 (jbond)
[18:03:07] Traffic, netops, Operations, IPv6, and 2 others: Fix IPv6 autoconf issues once and for all, across the fleet. - https://phabricator.wikimedia.org/T102099 (jbond)
[18:04:07] Traffic, Operations, SRE-tools, Goal, and 3 others: Automate generation of Management DNS records from Netbox - https://phabricator.wikimedia.org/T233183 (jbond)
[18:37:19] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (Dzahn)
[18:38:50] Traffic, Operations, serviceops, Patch-For-Review: Applayer services without TLS - https://phabricator.wikimedia.org/T210411 (Dzahn) RT (requesttracker) moved from jessie and public IP (ununpentium) to buster and private IP (moscovium) and https to backend via https://rt.discovery.wmnet
[18:41:03] Traffic, Operations, Core Platform Team Workboards (Clinic Duty Team): Have Varnish set the `X-Request-Id` header for incoming external requests - https://phabricator.wikimedia.org/T221976 (WDoranWMF)
[18:54:01] bblack: thanks for your update to phab ticket re the volatile uri, could you also +1 https://gerrit.wikimedia.org/r/c/operations/puppet/+/542922 ?
[20:17:15] HTTPS, Traffic, Cloud-VPS, Toolforge, cloud-services-team (Kanban): Move tools-static.wmflabs.org behind project-proxy - https://phabricator.wikimedia.org/T236952 (Krenair)
[23:17:29] netops, Operations: cr3-esams crash - https://phabricator.wikimedia.org/T236598 (ayounsi) Open→Resolved Power cycled CB1 (hosting re1) following https://kb.juniper.net/InfoCenter/index?page=content&id=KB14278&cat=JUNOS&actp=LIST and RE1 is now back online in a healthy state.