[07:53:27] hello people
[07:54:01] I've restarted the varnish backend on cp4024, it was showing up in icinga for mailbox lag and it was causing fetch failures and 503s - https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?orgId=1&var-datasource=ulsfo%20prometheus%2Fops&var-cache_type=upload&var-server=All
[09:46:07] thanks elukey
[10:56:45] Traffic, Operations, media-storage, User-fgiunchedi: Swift invalid range requests causing 501s - https://phabricator.wikimedia.org/T183902#3905720 (fgiunchedi)
[14:30:25] mark: thanks for reviewing https://gerrit.wikimedia.org/r/#/c/403677, I've addressed your comments (I think!)
[14:37:56] cool
[14:38:01] in the meantime i'm writing test cases for Server
[14:38:09] i am planning to split that into its own module
[14:38:15] but i thought i'd write test cases beforehand
[14:38:39] there's also another change from a while ago awaiting your review (per service MED)
[14:39:09] yep :)
[14:39:56] getting somewhere with testing coverage
[15:02:21] so my plan is to release 1.14.3 with the two latest bugfixes (alerts endpoint throwing 500s and canDepool logic) and deploy it ASAP
[15:02:47] then start working on 1.15.0 (new BGP-related features)
[15:19:30] mmh twisted exploded in a funny way on pybal-test
[15:19:35] https://phabricator.wikimedia.org/P6604
[15:26:06] aha! that's my fault, I've set proxyfetch.timeout = -1 to make checks fail
[15:35:18] hehe
[15:35:25] yeah sounds good to me
[15:46:47] ok the new logic works fine!
[15:46:57] I've tested the following pool:
[15:46:59] { 'host': 'cp4028.ulsfo.wmnet', 'weight': 20, 'enabled': True }
[15:46:59] { 'host': 'cp4029.ulsfo.wmnet', 'weight': 20, 'enabled': True }
[15:46:59] { 'host': 'cp4030.ulsfo.wmnet', 'weight': 20, 'enabled': True }
[15:46:59] { 'host': 'cp4031.ulsfo.wmnet', 'weight': 20, 'enabled': False }
[15:47:02] { 'host': 'cp4032.ulsfo.wmnet', 'weight': 20, 'enabled': False }
[15:47:09] so 2/5 hosts administratively disabled
[15:48:11] then I've dropped outgoing packets to cp4028:http with iptables, and pybal did not depool cp4028 (because too many down, threshold=.5)
[17:20:12] Traffic, Operations, Pybal, Patch-For-Review: pybal's "can-depool" logic only takes downServers into account - https://phabricator.wikimedia.org/T184715#3906596 (ema) Open>Resolved a:ema
[17:20:21] Traffic, Operations, Pybal, Patch-For-Review: Alert instrumentation returning 500 errors - https://phabricator.wikimedia.org/T184721#3906598 (ema) Open>Resolved a:ema
[17:28:25] <_joe_> we just had a huge spike of 5xxs, you might want to take a look
[17:42:03] netops, DC-Ops, Operations, ops-eqiad, procurement: eqiad: networking audit for support contract renewal - https://phabricator.wikimedia.org/T176338#3906721 (RobH) Open>Resolved they are in racktables and now being tracked in the spares rack in eqiad.
[19:21:50] looks like pybal unit test coverage will be at 80% once my gerrit changes get merged
[20:37:43] Traffic, Operations, TemplateStyles, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#2861029 (Isarra) So when's this happening? Wheeeeeen?
[20:40:36] Traffic, Operations, TemplateStyles, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#3907310 (dr0ptp4kt) Hi @Isarra , just wanted to note that @Deskana is taking on product owner duties on this and is working with @Tgr a...
[20:56:43] Traffic, Analytics-Cluster, Analytics-Kanban, Operations, and 2 others: TLS security review of the Kafka stack - https://phabricator.wikimedia.org/T182993#3907372 (Ottomata)
[20:56:54] Traffic, Analytics-Cluster, Analytics-Kanban, Operations, and 2 others: TLS security review of the Kafka stack - https://phabricator.wikimedia.org/T182993#3840663 (Ottomata) a:Ottomata
[20:59:43] Traffic, Operations, TemplateStyles, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#3907388 (Raymond) >>! In T133410#3907310, @dr0ptp4kt wrote: > Hi @Isarra , just wanted to note that @Deskana is taking on product owner...
[21:03:25] Traffic, Operations, TemplateStyles, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#3907402 (Isarra) If we want to propose specific projects for this, should we just do the usual discussion on-wiki to see if there's con...
[22:19:09] netops, Operations, ops-esams: replace msw1-esams - https://phabricator.wikimedia.org/T185151#3907680 (ayounsi)
[22:34:42] Traffic, Operations, TemplateStyles, Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#2861046 (Iniquity) @Isarra we are waiting for T180817 this task.
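
Note on the 15:46-15:48 pool test: the behaviour described there (2/5 hosts administratively disabled, threshold=.5, cp4028 not depooled) can be illustrated with a rough sketch. This is not pybal's actual code. The `Server` class, the `can_depool` function, and the rule "the fraction of hosts still serving after the depool must stay at or above the threshold" are all assumptions made up for illustration; the real canDepool implementation may count hosts differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical stand-in for a pooled host (not pybal's Server class)."""
    host: str
    weight: int
    enabled: bool      # administratively enabled
    up: bool = True    # healthy according to monitoring


def can_depool(servers: List[Server], candidate: str, threshold: float) -> bool:
    """Assumed rule: allow the depool only if enough hosts keep serving.

    A host counts as serving when it is both enabled and up, so
    administratively disabled hosts reduce the headroom just like
    down hosts do.
    """
    total = len(servers)
    if total == 0:
        return False
    serving_after = sum(
        1 for s in servers
        if s.enabled and s.up and s.host != candidate
    )
    return serving_after / total >= threshold


# The pool from the log: three enabled hosts, two administratively disabled.
pool = [
    Server("cp4028.ulsfo.wmnet", 20, True),
    Server("cp4029.ulsfo.wmnet", 20, True),
    Server("cp4030.ulsfo.wmnet", 20, True),
    Server("cp4031.ulsfo.wmnet", 20, False),
    Server("cp4032.ulsfo.wmnet", 20, False),
]

# Depooling cp4028 would leave 2 of 5 hosts serving, below the 0.5 threshold.
print(can_depool(pool, "cp4028.ulsfo.wmnet", 0.5))
```

Under these assumptions the check refuses to depool cp4028, matching what was observed on pybal-test. A check that ignored the `enabled` flag would see 4 of 5 hosts left and allow the depool, which is the kind of gap T184715 describes.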