[13:08:32] Hello fine folks, I'm getting blackholed on gerrit (ssh and https) with timeouts from my German (o2 DSL) IP, unreachable. mtr is giving zero packet loss so I assume it's at a higher level than IP. Routing my traffic via the WMDE VPN doesn't help, but I can load gerrit on my personal phone (also o2 but 4G). Could I get some netops assistance please?
[13:10:38] ranyardm-wmde: have you already looked at https://wikitech.wikimedia.org/wiki/Reporting_a_connectivity_issue that has a bunch of useful commands to gather more info and how to open a task for it (you might need to adapt them to gerrit)
[13:11:59] ranyardm-wmde: also responded to you on Slack, but yeah, follow the link volans mentioned and file a task so we can look into it
[13:12:01] yup. I'm following the "or reach out on the #wikimedia-tech channel on the libera.chat IRC network"
[13:12:08] confirming that this is just with Gerrit, right? not the wikis?
[13:12:19] yes, just gerrit
[13:12:47] interesting. so then this has something to do with gerrit's IP
[13:14:09] yes, ICMP to gerrit is fine though, just timing out on the web or ssh ports
[13:15:14] (dislike webchats, switched to an actual irc channel)
[13:33:30] yay for having to switch connection to provide details and IRC really loves that :-)
[14:11:17] if anyone with logstash access feels like going down a rabbit hole to look for production-error logs, https://phabricator.wikimedia.org/T416112#11812816 has a request-id that might (hopefully!) have some useful logs attached to it :)
[14:18:17] ranyardm-wmde: thanks for filing the task. can I please ask you to update it for the traceroute to say enwiki or dewiki? just for comparison against the Gerrit one
[14:22:00] sukhe: done
[14:23:54] ranyardm-wmde: oh interesting hmm.
[14:23:59] curl -v https://gerrit.wikimedia.org/
[14:24:07] for you, this resolves to 208.80.154.151
[14:24:14] that should not be the case at all, that's the old gerrit IP
[14:24:17] yes
[14:24:35] I see
[14:24:37] you should be hitting gerrit-lb, so 185.15.58.225, or gerrit-lb.drmrs
[14:25:12] do you perhaps have the address hardcoded somewhere? in /etc/hosts or something?
[14:25:38] it fails to connect because there is nothing on that IP, so it's not a network issue
[14:26:22] that's from the router, I'll flush the cache there.
[14:26:50] yeah that should fix it
[14:27:29] your traceroute to enwiki is 185.15.59.224, which is correct. so for gerrit, you should see 185.15.58.225
[14:28:02] rather correction, text-lb and gerrit-lb esams since you are connecting from DE
[14:28:38] so 185.15.59.224 for wikis and 185.15.59.225 for gerrit
[14:29:37] interestingly, flushing isn't making a difference.
[14:30:44] Now I know that I can override it in hosts but I guess my ISP is caching the old IP
[14:31:30] ranyardm-wmde: since your curl lookup is also failing, we can rule out browser-level caching or DoH or anything of that sort
[14:31:42] the TTL for this is fairly short though, it's 180s
[14:32:30] strange... I'm in a meeting but I'll diag some more in a bit for you.
[14:33:09] ranyardm-wmde: a simple test is to override the DNS lookup, so let's test this:
[14:33:22] curl --resolve gerrit.wikimedia.org:443:$(dig @8.8.8.8 gerrit.wikimedia.org A +short) https://gerrit.wikimedia.org/
[14:33:32] force a DNS lookup through an external resolver and see if it works
[14:34:06] 1.1.1.1 is giving the right answer so yeah...
[14:34:58] yep
[15:29:50] ranyardm-wmde: glad to hear it's resolved!
[15:44:20] Yeah, the missing piece of information: It's always DNS. >.<
[15:45:26] :>
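
The check walked through above (compare what the local resolver returns against the addresses named in the channel, then pin a known-good address with `curl --resolve`) could be sketched roughly as follows. This is only an illustration, not a tool anyone in the log used: the function names `classify_answer`, `curl_resolve_arg` and `local_ipv4_answers` are hypothetical, the addresses are copied from the messages above and may well change, and only IPv4 is handled.

```python
import ipaddress
import socket

# Addresses quoted in the log above; a snapshot from the conversation, not a stable contract.
STALE_IPS = {"208.80.154.151"}                     # called "the old gerrit IP" in the log
EXPECTED_IPS = {"185.15.58.225", "185.15.59.225"}  # gerrit-lb addresses mentioned in the log


def classify_answer(ip, expected=EXPECTED_IPS, stale=STALE_IPS):
    """Label one resolver answer as 'stale', 'expected' or 'unknown'."""
    ipaddress.IPv4Address(ip)  # raises ValueError if ip is not a valid IPv4 address
    if ip in stale:
        return "stale"
    if ip in expected:
        return "expected"
    return "unknown"


def curl_resolve_arg(host, port, ip):
    """Build the HOST:PORT:ADDR value that curl's --resolve option takes."""
    ipaddress.IPv4Address(ip)  # validate before building the argument
    return f"{host}:{port}:{ip}"


def local_ipv4_answers(host, port=443):
    """Return the IPv4 addresses the system resolver gives for host (needs network)."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `classify_answer` on each address from `local_ipv4_answers("gerrit.wikimedia.org")` would flag a cached old record as "stale", and `curl_resolve_arg("gerrit.wikimedia.org", 443, "185.15.59.225")` gives the value to pass to `curl --resolve` when bypassing the local resolver for a single request.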