[00:01:05] Traffic, Security-Team, WMF-General-or-Unknown, ContentSecurityPolicy, Patch-Needs-Improvement: Add restrictive CSP to upload.wikimedia.org - https://phabricator.wikimedia.org/T117618 (BCornwall)
[08:22:23] o/ hi sukhe
[09:12:39] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) Resolved→Open p:Low→Medium
[09:17:15] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) After deployment in Codfw I noticed an issue which is affecting our EVPN switches. The problem isn't anything to do with EVPN, but more the fact that o...
[09:35:52] Traffic, User-MoritzMuehlenhoff: Investigate Chrony as a replacement for ISC ntpd - https://phabricator.wikimedia.org/T177742 (MoritzMuehlenhoff) >>! In T177742#9299994, @BCornwall wrote: > Since we're using systemd's timesyncd nowadays, so this isn't relevant any more. If I'm wrong, please do re-open....
[09:48:58] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) Routing now looks ok, for instance in esams to the loopbacks of each CR: ` cmooney@asw1-bw27-esams> show route 185.15.59.128/32...
[10:03:27] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) Open→Resolved
[10:17:52] netops, Infrastructure-Foundations, SRE: Do we need to generate aggregates for LVS service IP ranges - https://phabricator.wikimedia.org/T350354 (cmooney) p:Triage→Low
[10:18:17] netops, Infrastructure-Foundations, SRE: Do we need to generate aggregates for LVS service IP ranges? - https://phabricator.wikimedia.org/T350354 (cmooney)
[10:54:05] Traffic, PyBal: pybal should automatically reconnect to etcd - https://phabricator.wikimedia.org/T169765 (jbond) Open→Declined @BCornwall I think this ticket should still be tagged traffic. if traffic don't intend to work on pybal any-more you/then they should decline the ticket instead of leavin...
[10:55:54] netops, Infrastructure-Foundations, SRE, Traffic-Icebox, Patch-For-Review: Create Generalised blocking strategy - https://phabricator.wikimedia.org/T270618 (jbond) > think it would be better if we close this and create smaller tickets with more focused scope. i don't think we need to close t...
[11:01:02] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c5da2d0a-c4af-4f96-b651-e1b326898629) set by cmooney@cumin1001 for 2:00:00 on 34 host(s) and the...
[11:23:09] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) One other observation is that the MED setting does not optimize the outbound path where we are using EVPN. One might hope that a LEAF switch, learning...
[12:02:04] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney) @papaul hoping to tackle these in this order, want to do the asw-a ones first, then the asw-b ones. |Order|ASW...
[12:02:54] hello! I would like to route traffic to the last aqs2 service today if possible https://gerrit.wikimedia.org/r/c/operations/puppet/+/970367
[13:52:55] stevemunene: all rolled out, sorry for the delay
[13:53:24] np, Thanks for helping out sukhe
[14:31:38] Traffic, SRE: Allow purged to specify buffer length - https://phabricator.wikimedia.org/T346874 (Fabfur) Stalled→Resolved
[14:34:26] hi hnowlan, I'll give a look at it shortly
[14:34:54] thanks!
[14:36:51] Traffic, Traffic-Icebox: Refactoring and some other work on purged - https://phabricator.wikimedia.org/T350396 (Fabfur)
[15:05:29] thanks for the review! I am going to roll forward with the usual puppet-stopping/cp2037 dance
[15:06:36] puppet disabled, cp2037 depooled
[15:06:39] 👍
[15:12:07] looks good to me on cp2037
[15:12:59] going to keep going
[15:13:20] ok
[15:25:25] all done, things look fine. I'll propose a CR for cleaning up the gateway script config to route all metrics stuff rather than per-service next week
[15:25:28] thanks for the help!
[15:26:49] Traffic, SRE, Patch-For-Review: HAProxy should use a single backend for Vanish - https://phabricator.wikimedia.org/T349287 (Fabfur) The change has been deployed on cp4037.ulsfo.wmnet as test host
[15:27:18] Traffic, Patch-For-Review: Add custom HAProxy backend only for healthchecks - https://phabricator.wikimedia.org/T348851 (Fabfur) The change has been deployed on cp4037.ulsfo.wmnet as test host, PyBal healthchecks failures will be monitored.
[15:30:00] hnowlan: np, thanks to you!
[15:40:02] fabfur: did you mean to create https://phabricator.wikimedia.org/T350396 directly into the icebox?
[15:42:02] brett: yes, it's a long-term umbrella task, probably I won't start to work on it right now
[15:42:07] is this the correct tag?
[15:44:47] Others might have different ideas of the icebox but it was originally created as a declaration of bankruptcy from the number of abandoned tasks. There isn't any consensus for our usage of phab quite yet. For now, I guess I'd request it go into the "Traffic" tag backlog while I continue slowly cleaning up the icebox :)
[15:45:39] ok tnx
[15:45:45] Thank *you*
[15:45:52] Traffic, Continuous-Integration-Infrastructure, Release-Engineering-Team: Investigate ATS/Varnish serving stall/cached Zuul status json - https://phabricator.wikimedia.org/T341548 (BCornwall)
[15:46:19] Traffic, SRE: Refactoring and some other work on purged - https://phabricator.wikimedia.org/T350396 (Fabfur) p:Triage→Low
[15:53:07] netops, Infrastructure-Foundations, SRE, ops-codfw, Patch-For-Review: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (Papaul) @cmooney the order works for me
[15:55:05] netops, Infrastructure-Foundations, SRE, ops-codfw, Patch-For-Review: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney) Row A Steps Detail: P53131 Row B Steps Detail: P53132
[15:58:25] Traffic, netops, Infrastructure-Foundations, Patch-For-Review: Create Generalised blocking strategy - https://phabricator.wikimedia.org/T270618 (BCornwall)
[15:59:17] Traffic, SRE, GitLab (Project Migration): Move purged repository from Gerrit to GitLab - https://phabricator.wikimedia.org/T346305 (Fabfur) Open→Invalid Not needed (please refer to T347623 for a list of traffic repositories that needs to be migrated to GitLab)
[16:16:57] fabfur: if you have another second free, I have a quick fix for some of the services deployed: https://gerrit.wikimedia.org/r/c/operations/puppet/+/971226
[16:17:01] shouldn't need a slow rollout
[16:17:29] ok
[16:17:41] is this a fix to the previous endpoints, correct?
[16:19:33] yep
[16:21:14] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate atlas-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348159 (Papaul) @cmooney cable is in place connected to lasw1-a2-codfw ge-0/0/46 ID 00756
[16:27:19] hnowlan: seems ok
[16:27:30] fabfur: thanks!
[17:03:35] Traffic: Add custom HAProxy backend only for healthchecks - https://phabricator.wikimedia.org/T348851 (Fabfur) the change has been deployed to all ulsfo cp hosts
[17:05:37] netops, Infrastructure-Foundations, SRE, ops-codfw, Patch-For-Review: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=0a8384b5-aa0d-44df-bf5c-aa9e191ed...
[19:59:04] Traffic: Refactoring and some other work on purged - https://phabricator.wikimedia.org/T350396 (BCornwall)
[19:59:19] Traffic, GitLab (Project Migration): Move purged repository from Gerrit to GitLab - https://phabricator.wikimedia.org/T346305 (BCornwall)
[21:11:41] Acme-chief, Patch-For-Review: acme-chief calls unnecessarily to ACMEChief._push_live_certificates() on daemon start - https://phabricator.wikimedia.org/T218543 (CodeReviewBot) brett merged https://gitlab.wikimedia.org/repos/sre/acme-chief/-/merge_requests/2 Release 0.36-2 for Bookworm
[21:11:50] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (CodeReviewBot) brett merged https://gitlab.wikimedia.org/repos/sre/acme-chief/-/merge_requests/2 Release 0.36-2 for Bookworm
[21:19:50] Traffic, SRE, GitLab (Project Migration), Patch-For-Review: Migrate Traffic repositories from Gerrit to Gitlab - https://phabricator.wikimedia.org/T347623 (CodeReviewBot) brett opened https://gitlab.wikimedia.org/repos/sre/acme-chief/-/merge_requests/5 ci: Automatically build Debian packages
[21:20:32] Traffic, SRE, GitLab (Project Migration), Patch-For-Review: Migrate Traffic repositories from Gerrit to Gitlab - https://phabricator.wikimedia.org/T347623 (CodeReviewBot) brett closed https://gitlab.wikimedia.org/repos/sre/acme-chief/-/merge_requests/5 ci: Automatically build Debian packages