[05:38:20] netops, Operations, ops-eqiad: (Need By: Sept 30) upgrade msw1-eqiad from EX4200 to EX4300 - https://phabricator.wikimedia.org/T225121 (Papaul) @Cmjohnson I put together a "How to" at the link below on how to upgrade the switch. Please let me know if you have any questions. https://wikitech.wikimedi...
[05:38:22] Traffic, Operations: Provide prometheus metrics for the ncredir service - https://phabricator.wikimedia.org/T228382 (Vgutierrez) Open→Resolved
[05:38:27] HTTPS, Traffic, Operations, Goal, Patch-For-Review: Create a secure redirect service for large count of non-canonical / junk domains - https://phabricator.wikimedia.org/T133548 (Vgutierrez)
[05:39:58] Domains, Traffic, Wikimedia-Apache-configuration, Operations: en-wp.org certificate error - https://phabricator.wikimedia.org/T190244 (Vgutierrez) Open→Resolved a:Vgutierrez This has been solved by the deploy of the ncredir service (T133548).
[07:07:41] Traffic, Operations, Discovery-Search (Current work): Can't reach cloudelastic.wikimedia.org via IPv6 - https://phabricator.wikimedia.org/T229861 (Mathew.onipe) Open→Resolved
[07:24:06] Traffic, Elasticsearch, Operations, Discovery-Search (Current work): Icinga check defined from LVS configuration for cloudelastic are borked - https://phabricator.wikimedia.org/T229621 (Mathew.onipe)
[07:25:39] vgutierrez: good morning!
[07:25:52] morning onimisionipe
[07:26:08] can you take a look at this https://phabricator.wikimedia.org/T229621?
[07:26:12] I left a comment
[07:28:16] I don't get your " I don't think this will work as the host param will not be unique and icinga does not seem to handle that well."
[07:29:05] how's this case different from the ncredir.wikimedia.org HTTP and HTTPS checks?
[07:29:16] we have two checks with the same host and different ports
[07:29:52] https://usercontent.irccloud-cdn.com/file/mhqW7WQ0/image.png
[07:31:21] It's quite different.
The cloudelastic case is all HTTPS.
[07:31:48] not HTTP to HTTPS or HTTP and HTTPS
[07:32:11] so several https endpoints
[07:32:32] and also from your image, icinga seems to be applying a hack by renaming the hostname.. I mean the _ipv6 suffix hostname
[07:32:33] yea
[07:32:44] maybe we can do that?
[07:33:06] I'd say that you should use the same approach that has been used in lvs::monitor_service_http_https
[07:33:29] what icinga is not able to handle apparently is 1 host with two IPs (IPv4 and IPv6)
[07:33:42] so that's why the suffix _ipv6 is added to check the IPv6 endpoints
[07:33:58] but several ports on the same IPv4/IPv6 seems fine :)
[07:34:49] ok
[07:36:05] I will try that
[07:36:08] thanks!
[10:40:15] Traffic, Discovery, Operations, WMDE-Analytics-Engineering, and 3 others: Allow access to wdqs.svc.eqiad.wmnet on port 8888 - https://phabricator.wikimedia.org/T176875 (elukey) Changed the following: (Cc: @ayounsi ) ` elukey@re0.cr2-eqiad# show | compare [edit firewall family inet filter analyti...
[10:43:40] Traffic, Discovery, Operations, WMDE-Analytics-Engineering, and 3 others: Allow access to wdqs.svc.eqiad.wmnet on port 8888 - https://phabricator.wikimedia.org/T176875 (elukey) Adding @WMDE-leszek and @Ladsgroup since afaics they were/are working on this :)
[11:06:10] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ema) p:Triage→Normal a:ema
[11:12:59] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ema)
[11:13:46] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (mark) Approved for access.
[11:22:32] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ema)
[11:31:28] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ssingh)
[11:31:51] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ema)
[11:59:22] Traffic, Discovery, Operations, WMDE-Analytics-Engineering, and 3 others: Allow access to wdqs.svc.eqiad.wmnet on port 8888 - https://phabricator.wikimedia.org/T176875 (Gehel) At the moment, we have a [[ https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/wdqs/gui.pp#L24...
[12:20:44] vgutierrez: hello again. About our discussion above. I still think the cloudelastic use case deviates a bit. lvs::monitor_service_http_https assumes there's an http port and creates an http check according to its implementation which will not be valid here as there's no http endpoint for cloudelastic. and also (except I get it wrong), lvs::monitor_service_http_https does not include an option for port except it's added
[12:20:44] via the uri option. And even if that is added, the http check lvs::monitor_service_http_https will create will still be wrong for cloudelastic.
[12:20:56] I know.. a lot of cloudelastic trouble. sorry
[12:26:37] yeah... you cannot exactly use that
[12:26:48] because of the issue you just described
[12:55:00] (but you can use the checks that are in the icinga parts of the lvs config today, just move them over to the right place?
[12:55:03] )
[12:55:04] I'd assume
[12:55:13] I dug through the existing checks and made-up that one for the right arguments
[12:55:43] check_command: check_https_lvs_on_port!cloudelastic.wikimedia.org!9243!/
[13:02:30] bblack: I don't really get you. can you show an example?
[13:05:12] the above is the icinga check I last defined in cloudelastic's entries in hieradata/common/lvs/configuration.yaml (when I was trying to debug/fix all related things)
[13:05:37] there actually wasn't a great existing check_command that did a simple HTTPS URI check on a custom port for this scenario, so I created that check_command definition for it.
[13:05:56] it takes as arguments the hostname, the port, and a URI Path to check (via HTTPS protocol)
[13:07:38] ok
[13:08:28] so wherever you'd put your monitoring::service {} definitions for this, you can use that check_command
[13:09:55] then we can put it in lvs::monitor_services
[13:10:17] like I said before. before I started confusing myself
[13:12:43] yeah
[13:12:58] in theory it can go there, and it will work fine, just using the different check_command shown
[13:13:47] my only slight qualm about that, is that lvs::monitor_services seems to be exclusively filled with "check_wmf_service" definitions of internal services, I guess to keep things cleanly separated like a filing cabinet.
[13:14:29] but you could simply make another manifest just like it that's named differently, like lvs::monitor_https_ports or something.
[13:14:51] either way it's just bikeshedding about organization, but someone might get annoyed at polluting the purity of monitor_services.
[13:15:25] (also monitor_services has contact-group definitions that are Services-specific, with the stuff at the top about 'admins,team-services')
[13:18:01] the stuff at the top part can be easily over-ridden
[13:18:49] but yea.. A new manifest might be the way to go
[13:21:05] Also lvs::monitor_services seems to need some refactoring.
A lot of duplication everywhere
[13:23:14] I guess there's always a reason for that
[13:23:24] sure, but that's relatively minor in the grand scheme of all things wrong with LVS service definitions and related monitoring (the duplication)
[13:24:45] I also think naming it like lvs::monitor_cloudelastic might be a way to go as we are introducing this because of cloudelastic and being explicit about it can reduce some confusion (my thoughts)
[13:25:37] yeah that's probably fine for now. I doubt we'll have very many other services quite like it (that are on public IPs and LVS, but with 6 different non-standard HTTPS ports, etc)
[13:26:52] Yeah.. cloudelastic brings quite the trouble
[13:27:08] Thanks!!!
[13:27:21] np!
[14:13:58] Traffic, Operations: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (Jdforrester-WMF)
[14:16:01] Traffic, Operations, Readers-Web-Backlog: [Bug] iPadOS 13 shows the desktop version of Safari with a broken layout - https://phabricator.wikimedia.org/T229875 (Jdlrobson) a:Jdlrobson→phuedx We shouldn't add ` Traffic, Operations, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (ssingh)
[22:26:24] Traffic, Operations, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (Dzahn)
[23:16:59] Traffic, Operations, SRE-Access-Requests, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (Dzahn)
[23:37:28] Traffic, Operations, SRE-Access-Requests, Patch-For-Review: SRE Onboarding for Sukhbir Singh - https://phabricator.wikimedia.org/T229860 (Dzahn)
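
Note on the check_command quoted at 12:55:43: the string follows the usual Nagios/Icinga convention of separating the command name from its arguments with `!`, and per the 13:05:56 message the three arguments are the hostname, the port, and the URI path. A minimal Python sketch of how such a string is put together and taken apart, assuming no argument itself contains `!`. The helper names `build_https_port_check` and `parse_check_command` are hypothetical and do not come from the puppet repository; only the command name, the hostname, and port 9243 appear in the log.

```python
from typing import List, NamedTuple


class CheckCommand(NamedTuple):
    """A check_command split into its command name and positional arguments."""
    name: str
    args: List[str]


def build_https_port_check(host: str, port: int, path: str = "/") -> str:
    """Join the command name and its three arguments with '!' (hypothetical helper)."""
    return "!".join(["check_https_lvs_on_port", host, str(port), path])


def parse_check_command(raw: str) -> CheckCommand:
    """Split on '!'; the first field is the command name, the rest are arguments.

    Assumption: no argument contains a literal '!', so no escaping is handled.
    """
    name, *args = raw.split("!")
    return CheckCommand(name, args)


if __name__ == "__main__":
    cmd = build_https_port_check("cloudelastic.wikimedia.org", 9243)
    print(cmd)
    print(parse_check_command(cmd))
```

The same builder could be called once per port to cover the several HTTPS endpoints mentioned in the discussion; the other port numbers are not given in the log, so none are listed here.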