[09:07:36] topranks: are you available today to check a few vxlan things?
[09:08:06] arturo: I got a few minutes right now - and sorry I haven't forgotten, been a little busy since we spoke
[09:08:18] that's ok, no problem :-)
[09:10:52] so I have 2 VMs with apparently fully working network connectivity, in two different hypervisors
[09:11:43] my ssh isn't working properly for some reason
[09:11:49] it's set to "ProxyJump bastion.wmcloud.org:22"
[09:11:53] is that valid for codfw?
[09:12:10] it is not
[09:12:28] you may have an easier time jumping via the console
[09:12:45] unless you want to create a user account on codfw1dev, etc... longer procedure
[09:12:50] is there an equivalent of that for codfw? or is it "more complex"
[09:12:54] ah ok yeah I get you
[09:12:57] I'll do that another time
[09:13:07] I need to console from the correct libvirts?
[09:13:09] so ssh to cloudvirt2005-dev and cloudvirt2006-dev
[09:13:13] ok
[09:13:49] on cloudvirt2006-dev
[09:13:52] try, as root
[09:13:56] virsh console i-00039418
[09:14:28] cool yeah that worked :)
[09:14:35] on cloudvirt2005-dev, similarly, `sudo virsh console i-0003941b`
[09:14:36] what's the console on cloudvirt2005-dev?
[09:14:40] thanks!
[09:21:36] topranks: seeing anything interesting?
[09:23:30] it is not immediately obvious to me how to see vxlan traffic inside ovs
[09:25:18] I'm looking at it on the wire
[09:25:19] https://phabricator.wikimedia.org/P69112
[09:25:58] sudo tshark -x -V -i vlan2151 "udp port 4789"
[09:26:05] looks good!
[09:26:58] is this in a cloudvirt?
[09:28:36] yes, it is, I see
[09:28:46] on the cloudvirt yeah
[09:28:55] MTU is as expected, but path MTU discovery is working ok it seems
[09:29:09] ok
[09:32:32] we have MTU restrictions as expected, but things seem to be set up well
[09:32:33] https://phabricator.wikimedia.org/P69113
[09:32:47] The VM ethernet interfaces are getting created with 1450 byte MTU
[09:33:14] which... is actually one way to potentially address the mtu issue
[09:33:28] The VM will never send a packet larger than that
[09:33:44] and it will set the MSS during the TCP handshake with anything to what fits in that
[09:33:56] I see
[09:33:58] which means *most* things work. basically all TCP and all outgoing packets get through
[09:34:20] I'm quite pleased with that - you didn't set that 1450 manually?
[09:34:35] I did not, I guess that's openstack being smart
[09:34:41] cool yeah that's good
[09:34:55] means it's working it out from the interface MTU / what it knows it can send to the other VXLAN hosts
[09:35:00] quite safe
[09:35:48] hopefully that means our mtu issue is more one of "it would be slightly more performant if we had 1500 mtu" than "basic connectivity is broken because we can't send big packets"
[09:35:52] https://docs.openstack.org/neutron/latest/admin/config-mtu.html
[09:36:00] ^^^ this has some explanations
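(A minimal sketch of the MTU arithmetic discussed above, runnable from inside one of the test VMs; the interface name ens3 and the peer VM address 172.16.129.21 are placeholders, not taken from the pastes.)

    # Assumed interface/peer names; sketch only, not from the chat pastes
    ip link show ens3                        # expect "mtu 1450", as set by Neutron
    # VXLAN over IPv4 adds 50 bytes: 14 inner Ethernet + 8 VXLAN + 8 UDP + 20 outer IP
    echo $((1500 - 14 - 8 - 8 - 20))         # -> 1450, the inner MTU the VM gets
    # Largest unfragmented ICMP payload is MTU - 28 (20 IP header + 8 ICMP header)
    ping -M do -c 3 -s $((1450 - 28)) 172.16.129.21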
[09:36:31] cool... if dynamically working out "what fits" is working - as it appears to be - hopefully we don't need to override the setting
[09:36:53] I guess it's up to us if we want to enable jumbo on the physical net
[09:38:17] yeah, but at least - as things stand - it doesn't look like we have an issue
[09:38:28] I agree
[09:38:44] jumbos on the cloud-private should be fine
[09:38:51] the thorniest part is how to do the conversion
[09:39:05] and whether, say, some of the services (control plane, dns or whatever) running on it will have issues during the transition
[09:39:13] i.e. if one host has already had it changed, but another has not
[09:39:31] I see
[09:40:05] should have thought of it when we were creating cloud-private and tried to start off with jumbos
[09:40:20] but I think looking at these VMs it is not going to cause issues for now
[09:40:30] there may be some edge cases, but inter-vm is working, internet out is working
[09:40:32] looks great!
[09:40:43] ok
[09:41:04] thanks for the check!
[09:41:18] would you like to write a few words on T374020 for posterity about this?
[09:41:18] T374020: openstack: instrument VXLAN-based flat network - https://phabricator.wikimedia.org/T374020
[09:42:15] for "posterity" - or for myself when I'm back on Tuesday :P
[09:42:26] heh :-P
[09:48:39] in your earlier paste
[09:48:40] Leaving a max ping data size of 1422 (1450 - (10 UDP + 8 ICMP))
[09:48:58] I believe this max size would be 1432 instead of 1422
[09:49:13] just noticed the typo
[09:51:04] thanks again for looking at this, really appreciated topranks <3
[09:51:17] UDP header is 20 bytes not 10
[09:51:20] that's the typo sorry
[09:51:36] I'll edit the paste as I linked it in the task
[09:52:16] UDP header is not 20 bytes.... it's IP header + ICMP
[09:52:21] there is no UDP in a ping lol
[09:52:51] ok
[09:52:57] fixed properly now
[09:53:07] cool
[09:53:08] max ping size is always 28 bytes less than mtu on linux anyway
[09:53:15] cos ping "-s" sets the payload size
[09:53:29] and you got IP + ICMP header on top of that
[09:53:46] this changes from OS to OS because the world likes to confuse people :)
[09:53:53] heh
[09:54:20] I'm off now until Tuesday but we can catch up next week
[09:54:34] we have our BOF meeting, I will send a mail - going to try and re-inject some energy into those
[09:54:35] excellent, I'll see how we can proceed further with the migration
[09:54:42] ok
[09:54:49] looking forward to them
[09:55:02] we should look at v6 but... this really seems to be working well, it's great!
[09:55:22] I have been tempted a few times to enable v6 on this new vxlan subnet
[09:55:33] but that will most likely distract me from getting rid of the vlan
[09:56:02] but also, if you think this is the right moment to do it... let me know :-P
[09:56:29] we should do it while the new setup is still a POC
[09:56:31] maybe it's the right moment, because when we migrate VMs off the vlan into the vxlan, that is maybe the perfect moment to inject v6 on them
[09:56:37] rather than put real workloads on it and then try
[09:56:52] yeah I think we should do it before moving anything onto it if we can
[09:56:57] ok
[09:57:08] do you have the ticket with the v6 addressing plan at hand?
[09:57:11] _but_ it needs some thought - as we will need to do the upstream routing on switches / cloudgw / neutron etc
[09:57:24] sure
[09:57:58] think it's this one?
[09:57:58] https://phabricator.wikimedia.org/T187929
[09:58:04] probably just the last few comments
[09:58:35] yeah that one
[10:01:43] I think we can go with Arzhel's suggestion - should be ok and if there is agreement let's do it
[10:01:55] which would mean a private range of 2a02:ec80:a080:100::/56 for codfw
[10:02:18] what would be private in this case?
[10:02:41] cloud-private subnet?
[10:02:41] sorry yeah ignore me
[10:02:44] nah
[10:02:49] this should use public space
[10:03:19] so 2a02:ec80:a080::/56 would be the public allocation
[10:03:30] gives you 256 /64s
[10:03:57] https://www.irccloud.com/pastebin/xQrsU8QF/
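(As an illustration of that allocation: a /56 contains 2^(64-56) = 256 /64s. A minimal sketch to enumerate them; the 2a02:ec80:a080::/56 base is the one quoted above and is later corrected on the task, so treat it purely as a placeholder.)

    # Placeholder base prefix (later corrected on the task); sketch only
    # Enumerate the 256 /64s available inside a /56, one candidate per subnet
    for i in $(seq 0 255); do
      printf '2a02:ec80:a080:%x::/64\n' "$i"
    done | head -n 4     # 2a02:ec80:a080:0::/64, :1::, :2::, :3::...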
[10:03:59] would that /56 be global? or per-DC?
[10:04:09] per-DC
[10:04:25] we want to announce the wider aggregates (/48 or /40 etc) to the internet separately from each site
[10:04:33] ok
[10:04:46] otherwise traffic is going half way around the world when it doesn't need to
[10:04:59] the above would be for codfw
[10:05:10] so, let's say we then allocate a /64 per deployment?
[10:06:07] I think to begin with we can allocate a /64 for every subnet we have
[10:07:57] ok
[10:08:00] so like where you have 172.16.129.0/24 you could also have 2a02:ec80:a080:0000::/64
[10:08:08] ok
[10:14:10] I added a note on the task there - ignore the above, I was using the wrong base prefix
[10:14:17] logic is the same though
[10:14:25] ok
[10:18:32] I think I still don't know what public/private networks means in this context
[10:18:41] networks that will/won't be part of the BGP announcements?
[14:09:13] * arturo afk for a bit
[16:48:02] * dhinus offline
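(A rough sketch of the per-subnet mapping described above, "where you have 172.16.129.0/24 you could also have 2a02:ec80:a080:0000::/64"; both the base prefix, which was corrected on the task, and the IPv4 subnet list are assumptions, purely for illustration.)

    # Illustrative only: assumed base prefix and subnet list, not the final plan
    base='2a02:ec80:a080'
    i=0
    for v4 in 172.16.128.0/24 172.16.129.0/24; do
      printf '%-18s -> %s:%x::/64\n' "$v4" "$base" "$i"
      i=$((i + 1))
    done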