[06:31:34] Traffic, Operations, SRE-tools, netbox, and 4 others: Automate generation of Management DNS records from Netbox - https://phabricator.wikimedia.org/T233183 (ayounsi)
[09:27:00] Traffic, Operations, Performance-Team (Radar): Consider collecting more timestamp milestones from ATS-TLS - https://phabricator.wikimedia.org/T265869 (ema) p:Triage→Medium
[09:48:46] Traffic, Cloud-Services, Operations, cloud-services-team (Kanban): cloudweb2001-dev: add TLS termination - https://phabricator.wikimedia.org/T263829 (Marostegui)
[09:51:41] netops, Operations, cloud-services-team (Kanban): Remove 185.15.56.0/24 from network::external - https://phabricator.wikimedia.org/T265864 (aborrero) No problems on my side. Probably the smart thing to do is to clearly define the semantics of that data file, so we can safely add/remove stuff from th...
[10:06:55] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (ema) >>! In T264398#6538699, @Gilles wrote: > I've captured 30 minutes of data using varnishlog simultaneously on cp3052 and cp3054, using 4...
[10:07:15] netops, Operations, Patch-For-Review, Security, User-jbond: Review default ferm INPUT policy - https://phabricator.wikimedia.org/T264888 (jbond)
[10:08:33] Traffic, DBA, Operations, Patch-For-Review: dbtree broken (for some users?) - https://phabricator.wikimedia.org/T162976 (Marostegui)
[10:08:57] Traffic, DBA, Operations, Sustainability: dbtree: make wasat a working backend and become active-active - https://phabricator.wikimedia.org/T163141 (Marostegui) Stalled→Declined Closing this as we won't be really working on this anymore, but on deprecating tendril in favour of something e...
[11:17:19] Traffic, Operations: Unclear LVS bandwidth graph in "load balancers" dashboard - https://phabricator.wikimedia.org/T174432 (Marostegui) Open→Resolved Per the last two comments, looks like this is fixed.
[11:41:56] Traffic, Operations, Patch-For-Review: Remove SLAAC IPs from Ganeti hosts - https://phabricator.wikimedia.org/T265904 (jbond) >>! In T265904#6560485, @Volans wrote: > Do you think we could trick facter into reporting the non-SLAAC address as primary? > > ` > $ sudo facter -p interface_primary > priv...
[12:45:42] Traffic, Operations, Patch-For-Review: Remove SLAAC IPs from Ganeti hosts - https://phabricator.wikimedia.org/T265904 (jbond) I had a look at upstreaming this to facter v3 but i didn't see an obvious fix and as facter v4 is moving back to ruby i'm not sure its worth the effort to fix this in facter v3
[12:46:18] netops, Operations: Upgrade Routinator 3000 to 0.8.0 - https://phabricator.wikimedia.org/T266001 (ayounsi) p:Triage→Medium
[13:08:59] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (ema) >>! In T264398#6538699, @Gilles wrote: > ` > SELECT event.responsestart - event.fetchstart FROM event.navigationtiming WHERE year = 2020...
[13:41:39] Traffic, Operations, Patch-For-Review, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (Gilles) >>! In T264398#6564065, @ema wrote: >>>! In T264398#6538699, @Gilles wrote: >> ` >> SELECT event.responsestart...
[14:31:52] Traffic, Operations, observability: prometheus-varnish-exporter@frontend.service: Unit entered failed state - invalid character 'C' - https://phabricator.wikimedia.org/T203191 (ema) Open→Resolved a:ema The following now returns nothing: ` cumin 'A:cp' 'journalctl -u prometheus-varnish-e...
[15:23:31] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (Gilles) Here's the same data collected with commands like the following (using `Process`), over a 30 minute period. ` varnishlog -n frontend...
[16:05:22] Traffic, Operations: Large text objects are randomized to cache backends - https://phabricator.wikimedia.org/T266040 (BBlack)
[16:05:59] Traffic, Operations: Large text objects are randomized to cache backends - https://phabricator.wikimedia.org/T266040 (BBlack) p:Triage→Medium
[16:17:03] Traffic, Cloud-Services, Operations, cloud-services-team (Kanban): cloudweb2001-dev: add TLS termination - https://phabricator.wikimedia.org/T263829 (nskaggs) p:Triage→Medium
[16:17:59] Traffic, Advanced-Search, Discovery-Search, Operations, and 3 others: Strange URL pattern after search https://en.wikipedia.org/w/index.php?sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance ... - https://phabricator.wikimedia.org/T243884 (jcrespo) Open→...
[16:22:23] Traffic, Operations, Patch-For-Review: Large text objects are randomized to cache backends - https://phabricator.wikimedia.org/T266040 (RLazarus)
[16:28:56] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (BBlack) I stumbled on T266040 while looking at something unrelated, but now I'm remembering that earlier in this ticket, there was some menti...
[16:31:58] netops, Operations, cloud-services-team (Kanban): Enable L3 routing on cloudsw nodes - https://phabricator.wikimedia.org/T265288 (aborrero) That's fair. I will try proposing a new date tomorrow.
[16:34:54] netops, Operations, cloud-services-team (Kanban): Enable L3 routing on cloudsw nodes - https://phabricator.wikimedia.org/T265288 (aborrero) New proposed date: 2020-11-03,
[17:08:59] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (Gilles) Seeing that for some reason in my 30 minute test cp3054 was getting significantly more miss and pass requests than cp3052, I've just...
[17:27:51] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (Gilles) Comparing percentages for that 30-minute test, which was only looking at hit-front/hit-local/miss/pass for requests to /wiki/ URLs (a...
[17:37:21] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (Gilles) I've found the dashboard for total objects, and it seems like as many objects are stored now as there were before the Varnish 6 deplo...
[17:44:04] Traffic, netops, Operations: Wikimedia projects not reachable for some Telecom Italia users - https://phabricator.wikimedia.org/T262869 (Nemo_bis) > We'll prepare at least a lightweight incident report in the coming days. Did this happen? I couldn't find it. Sorry if I looked in the wrong places. (...
[20:18:58] Traffic, Operations, Performance-Team (Radar): 8-10% response start regression (Varnish 5.1.3-1wm15 -> 6.0.6-1wm1) - https://phabricator.wikimedia.org/T264398 (CDanis) >>! In T264398#6565366, @Gilles wrote: > I'm not sure how frontend servers are picked to serve requests (hashed by IP? URL?), but thi...
[20:40:47] Traffic, Gerrit, Operations, Phabricator, periodic-update: Phabricator and Gerrit: Improve the way that maintenance downtime is communicated to users. - https://phabricator.wikimedia.org/T180655 (Dzahn) I think this is done meanwhile. Both Phabricator and Gerrit do not show generic 503 error...
[23:16:34] bblack: Would we ever add domain names to ncredir-parking that have a status of: 'registrant: WMNL (or some other affiliate) name servers: WMF'? (yes/no/undefined) :)
[23:17:13] example case is pywikibot.org