[07:57:19] Traffic, Analytics, SRE: Add Traffic's notion of "from public cloud" to Analytics webrequest data - https://phabricator.wikimedia.org/T279380 (JAllemandou) +1 on the approach (updating the task description for details)
[07:58:22] Traffic, Analytics, SRE: Add Traffic's notion of "from public cloud" to Analytics webrequest data - https://phabricator.wikimedia.org/T279380 (JAllemandou)
[08:02:02] netops, SRE: Unable to load en.wikipedia.org from 84.19.61.192/26 - https://phabricator.wikimedia.org/T279503 (Peachey88)
[08:04:18] Traffic, netops, SRE: Unable to load en.wikipedia.org from 84.19.61.192/26 - https://phabricator.wikimedia.org/T279503 (Peachey88) For information about reporting network connectivity issues, have a look at https://wikitech-static.wikimedia.org/wiki/Reporting_a_connectivity_issue
[08:20:48] Traffic, netops, SRE: Unable to load en.wikipedia.org from 84.19.61.192/26 - https://phabricator.wikimedia.org/T279503 (A189605) Please see the below information from the link you provided. Output from http://test-ipv6.com/helpdesk/: ` Your Internet help desk may ask you for the information below....
[08:48:33] netops, Analytics, SRE: Audit analytics firewall filters - https://phabricator.wikimedia.org/T279429 (ayounsi) >>! In T279429#6976000, @ayounsi wrote: > There is also a term permitting UDP fragments, I added a "count" to know if/why we're using it. Looks like we're not. I'll remove it as well.
[08:56:11] Traffic, netops, SRE: Unable to load en.wikipedia.org from 84.19.61.192/26 - https://phabricator.wikimedia.org/T279503 (ayounsi) Investigation started a while ago on a noc@ thread. As additional data point, our Netflow (captured at our edge) show that we're getting your initial TCP SYN and replying...
[08:59:21] Traffic, SRE, serviceops: Feedback for new service IP flowchart - https://phabricator.wikimedia.org/T279296 (akosiaris) Hi, thanks for this!
This is interesting, some feedback from my side: * What is the intended audience? If it's an SRE or at least someone who knows what LVS is, it's pretty fine,...
[09:03:25] netops, Analytics, SRE: Audit analytics firewall filters - https://phabricator.wikimedia.org/T279429 (elukey) @razzi this is a good task to get started with the firewall rules of our VLAN :)
[09:11:12] Traffic, netops, SRE: Unable to load en.wikipedia.org from 84.19.61.192/26 - https://phabricator.wikimedia.org/T279503 (A189605) >>! In T279503#6979248, @ayounsi wrote: > Investigation started a while ago on a noc@ thread. > > As additional data point, our Netflow (captured at our edge) show that we...
[10:17:09] Traffic, SRE, serviceops: Feedback for new service IP flowchart - https://phabricator.wikimedia.org/T279296 (ayounsi) Audience is indeed SREs, to be used to help them know what the (high level and preferred) options are when deploying a new service. `Service-deployment-requests` is nice, I'd imagine...
[11:37:45] Traffic, SRE, serviceops: Feedback for new service IP flowchart - https://phabricator.wikimedia.org/T279296 (akosiaris) >>! In T279296#6979590, @ayounsi wrote: > Audience is indeed SREs, to be used to help them know what the (high level and preferred) options are when deploying a new service. OK, go...
[12:46:32] Traffic, SRE: Add exp cache admission policy parameters to hiera - https://phabricator.wikimedia.org/T279533 (ema)
[12:46:41] Traffic, SRE: Add exp cache admission policy parameters to hiera - https://phabricator.wikimedia.org/T279533 (ema) p:Triage→Medium
[12:50:26] bblack: around?
[12:52:03] effie: yeah what's up?
[12:52:41] I am curious about why, when I switch a service from production to lvs_setup
[12:52:48] it breaks DNS
[12:53:41] I was following the docs on removing an lvs service, and I didn't realise that it mattered when I would remove the discovery record from dns
[12:54:08] and I broke things with "gdnsd checkconf on dns4002 is CRITICAL: CRITICAL: gdnsd -S checkconf failure"
[12:54:21] I am not sure I understand how though
[12:54:25] ok
[12:54:51] well, I don't know all the specifics, but in general there's a dependency between what's defined in discovery DNS on the puppet side, and what's defined on the DNS side
[12:55:35] I think the comments are outdated (about specific pathnames), which doesn't help
[12:56:00] which commits did you push where in what order?
[12:57:21] hmmm ok I think I found: you're talking about https://gerrit.wikimedia.org/r/c/operations/puppet/+/676883/3/hieradata/common/service.yaml
[12:57:43] at least - was there another recent change before it?
[13:00:42] effie: anyways, the way things logically plug together I think is this:
[13:01:12] when you switch back from "production" to "lvs_setup", it's no longer exporting a definition of the parsoid service to the discover-dns mechanism on the DNS side of things
[13:01:36] the DNS repo itself will keep passing CI checks because it has a set of mock entries in utils/mock_etc/
[13:02:43] I see
[13:03:00] but yeah, the DNS zonefile now references a dns discovery config that doesn't exist, because the puppet-side change removed it
[13:03:41] I'm trying to think of a not-confusing way to advise about how to think about these
[13:04:09] I guess in the same sense that the service state has a sequence that builds up towards production: service_setup -> lvs_setup -> monitoring -> production
[13:04:30] there's really a 5th state there which is "-> has a DNS record tied to discovery-dns"
[13:04:58] you can't create this DNS record:
[13:04:59] templates/wmnet:parsoid 300/10 IN DYNA geoip!disc-parsoid
[13:05:10] until the service side is in the "production" state
[13:05:37] netops, SRE: Higher latency on Lumen eqiad/esams link - https://phabricator.wikimedia.org/T277654 (ayounsi) a:ayounsi→wiki_willy
[13:05:48] and to keep CI sane on the DNS side of things, there's also a mock version of that config in the DNS repo itself, which must also stay in sync:
[13:05:58] utils/mock_etc/discovery-geo-resources:disc-parsoid => { map => mock, dcmap => { mock => 192.0.2.1 } }
[13:07:12] so if you're intending to move a puppet service from "production" to "lvs_setup" or lower: I think the first step has to be to remove that line from ops/dns templates/wmnet, as well as from the DNS mock config, I think, and roll that out via authdns-update
[13:07:28] and then push the puppet side
[13:10:14] Traffic, netops, SRE: TATA SKY Broadband (AS134674) issues with connecting to upload.wikimedia.org - https://phabricator.wikimedia.org/T275234 (ayounsi) @ssingh did we get results from those test data?
[13:12:21] effie: actually there's wikitech docs on this, which is awesome
[13:12:23] https://wikitech.wikimedia.org/wiki/LVS#Remove_a_load_balanced_service
[13:17:08] bblack: I am following the doc, but I didn't realise that changing the state to lvs_setup, would break dns
[13:19:08] I will add an explanation there, thank you!
[13:23:12] well, there's a step 1 already there which says to remove the DNS parts first
[13:23:15] that's the hook
[13:23:34] I'm pushing up a change on the ops/dns side to update the comments in mock_etc as well
[13:27:55] (not that I think you would've stumbled on them naturally in this case anyways, but at least they can not be outdated for anyone that does!)
[14:16:43] I updated wikitech to stress that order matters :p
[14:16:48] I am restarting pybal on lvs2010, lvs1016
[14:20:01] restarting pybal on lvs2009, lvs1015
[15:07:48] effie: thanks for the docs updates!
[15:08:27] helping the next soul who will read this :)
[15:34:23] netops, SRE, ops-codfw: Multiple host down alerts from rack C2 - https://phabricator.wikimedia.org/T279457 (Papaul) David Valverde 12:39 AM (9 hours ago) to me, support Good night Papaul Hope you’re doing well I would like to inform that this RMA was already processed by the Logistics Te...
[15:35:35] netops, SRE, ops-codfw: Multiple host down alerts from rack C2 - https://phabricator.wikimedia.org/T279457 (Papaul) Good morning Papaul Hope you’re doing well Thank you for your response, let me answer the additional question that you have today. What do we do with the license we currently...
[16:09:00] Traffic, DNS, SRE, Abstract Wikipedia team (Phase δ): Establish wikifunctions.org - https://phabricator.wikimedia.org/T275904 (DVrandecic) p:Medium→Low
[16:40:31] Traffic, DNS, SRE, Abstract Wikipedia team (Phase δ): Establish wikifunctions.org - https://phabricator.wikimedia.org/T275904 (DVrandecic) a:Jdforrester-WMF
[17:26:30] Traffic, DNS, SRE, Abstract Wikipedia team (Phase δ): Establish wikifunctions.org - https://phabricator.wikimedia.org/T275904 (Jdforrester-WMF)
[17:39:15] Traffic, DNS, SRE, Abstract Wikipedia team (Phase δ): Establish wikifunctions.org - https://phabricator.wikimedia.org/T275904 (Jdforrester-WMF)
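Note on the 12:50-13:28 discussion: the ordering constraint described there (a discovery DNS record may only exist while the service is in the "production" state, so the DNS record has to be removed before the service is demoted) can be sketched as a small check. This is only an illustration under stated assumptions: the function names, the in-memory dict of service states, and the rule that resource "disc-X" maps to service "X" are hypothetical and are not taken from the puppet or ops/dns repositories. Only the state names and the example record come from the chat.

```python
from enum import IntEnum


class ServiceState(IntEnum):
    # Order quoted in the chat: each state builds towards production.
    SERVICE_SETUP = 1
    LVS_SETUP = 2
    MONITORING = 3
    PRODUCTION = 4


def discovery_resources(zone_lines):
    """Yield resource names (e.g. 'disc-parsoid') used by DYNA geoip records."""
    for line in zone_lines:
        fields = line.split()
        # Assumed record shape: <name> <ttl> IN DYNA geoip!<resource>
        if (
            len(fields) >= 5
            and fields[2] == "IN"
            and fields[3] == "DYNA"
            and fields[4].startswith("geoip!")
        ):
            yield fields[4].split("!", 1)[1]


def dangling_records(zone_lines, states):
    """Return resources whose service is unknown or below PRODUCTION.

    `states` is a hypothetical mapping of service name -> ServiceState.
    """
    dangling = []
    for resource in discovery_resources(zone_lines):
        # Assumption: resource 'disc-<service>' belongs to '<service>'.
        prefix = "disc-"
        service = resource[len(prefix):] if resource.startswith(prefix) else resource
        if states.get(service) != ServiceState.PRODUCTION:
            dangling.append(resource)
    return dangling


if __name__ == "__main__":
    zone = ["parsoid 300/10 IN DYNA geoip!disc-parsoid"]
    # Demoting the service while the record still exists leaves it dangling.
    print(dangling_records(zone, {"parsoid": ServiceState.LVS_SETUP}))
```

Run against a zone that still contains the parsoid record and a state map where parsoid is back at lvs_setup, the check reports disc-parsoid as dangling, which mirrors the situation behind the checkconf failure. Removing the record first and demoting the service afterwards leaves nothing to report.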