[06:36:48] greetings
[10:13:51] volans: I'll take it up here rather than spamming -sre
[10:14:11] I'm wondering is there a reason the cinder volumes cannot be backed up via the cloud-private networking?
[10:14:29] cinder volumes are backed up cross-dc
[10:14:33] using the http proxy means traffic goes over the limited bw links to the CRs and back
[10:14:44] ah ok well that will happen anyway
[10:14:47] yep :D
[10:14:48] ok cool
[10:15:06] but the local stuff stays within the cloud switches right? it should perform much better that way
[10:15:33] mostly, the cloudbackup1* hosts have a leg in cloud-private and hence go without proxy via that
[10:15:48] ok gotcha, and that's why that was working
[10:15:50] verified yesterday: https://phabricator.wikimedia.org/T428867#12008873
[10:16:19] and yes that's why they didn't break
[10:16:21] BUT
[10:16:34] if they had broken, we would have been notified by the systemd monitoring in eqiad
[10:16:35] we did discuss connecting the "cloud-private" networks on each site via tunneling between the cloudgw
[10:16:45] but arturo advised it wasn't needed at the time
[10:16:59] tbh it wouldn't bring any performance improvements so I don't think it's worth it for this
[10:17:44] unless that would allow to do some dedicated QoS and you're interested to do that
[10:18:02] in general they have to travel in the same pipe anyway
[10:20:32] I don't really see a huge benefit
[10:21:17] as the connection is to a public IP the proxy works fine, we were previously considering this if connectivity was needed to services purely exposed on private 172.x IPs
[10:22:04] ack, do you want me to answer the comment in the doc or will you self-answer yourself? :D
[10:22:19] I'll answer and resolve :)
[10:22:20] thanks!
[10:22:28] <3
[10:22:35] topranks: aiui it's just the openstack metadata going through the proxy, the actual data transfer is from ceph directly which uses the cloud-hosts* addresses
[10:23:34] ah yes, that too, we set the proxy only for the openstack APIs to get the mapping of volumes
[11:15:59] ah ok thanks for the confirmation guys, indeed yes it was the fact this was "disk backups" that caught my attention
[11:17:07] that's a strange situation though, the hosts in eqiad talk to the API in codfw, and then send the data to hosts in eqiad?
[11:17:55] on another note I created some tasks for switch upgrades:
[11:18:02] https://phabricator.wikimedia.org/T429013
[11:18:23] computer said the tag I added for you guys was wrong - I'm not sure what's best for this, "Cloud VPS" ?
[11:22:21] topranks: cloud vps and tools-infrastructure-team
[11:22:47] taavi: thanks, I'll try to remember for next time <3
[11:23:27] topranks: I'm not sure I follow, cloudbackup2* hosts in codfw talk to openstack API in eqiad and then pull data from ceph cluster in eqiad
[11:23:54] cloudbackup1* hosts in eqiad talk to openstack api in eqiad and then pull data from ceph cluster in eqiad, all via cloud-private vlan
[11:24:46] volans: I may have mis-read, they pull directly using the 10.x cloud-hosts address?
[11:24:57] traffic flow wise that's all fine sorry
[11:25:06] I agree it is slightly confusing though :)
[11:25:57] yes pulling via 10.x
[11:25:58] tcp ESTAB 0 0 10.192.48.34:53154 10.64.149.4:6828 users:(("backy2 [Backing",pid=362381,fd=140))
[12:29:46] any thoughts/suggestions on my last 2 comments on https://phabricator.wikimedia.org/T428867#12014112 ?
[12:55:21] +1 to deleting the snapshots, but make sure they're all backy2 snapshots (some volumes might have cinder snapshots too)
[12:56:46] and deleting snapshots can sometimes fail, but most should hopefully delete cleanly
[12:58:32] unprotected and with name containing _cloudbackup should be enough?
[13:03:57] because we have two options, snap purge (Delete all unprotected snapshots.) or I grep the above conditions and then use snap remove
[13:25:33] I'm not sure if snapshots created in horizon get the "protected" flag immediately, or only when a volume is derived from them. to be safe I think matching the "_cloudbackup" string in the name is the way to go.
[13:26:44] I've run https://etherpad.wikimedia.org/p/volans-tmp2 (just printing the commands)
[13:28:06] to be executed with: while read line; do $line; sleep 10; done < filename or similar
[13:32:42] dhinus: ok for you?
[13:33:27] I also excluded any pre-2026 snapshots, no matter the origin, at least for this. I know there are some snapshot leaks
[13:33:31] in some cases
[13:56:38] lgtm
[13:57:00] how big is the list?
[13:57:41] 147
[13:58:42] reasonable
[13:59:15] ok using more like 60s sleep and starting that on a tmux
[15:30:42] slyngs: is there a delay between when a user creates an account at https://idm.wikimedia.org/signup/ and when they show up in ldap? I'm talking to a user who got an email confirmation from idm but now can't log in. Not sure if that's a typo or some kind of expected delay.
[15:33:09] slyngs: nevermind, I think user was confusing wiki account with developer account
[15:58:17] I finally wrote actual documentation about sending good posts to cloud-announce: https://wikitech.wikimedia.org/wiki/User:Taavi/cloud-announce
[16:01:58] +2
[16:15:54] nice
[16:38:04] thank you taavi. Who are the moderators for that list? (i am not able to tell from lists.wikimedia.org)
[16:54:35] bliviero: the moderator + owner lists are mostly engineers in the tools platform + infrastructure teams
[17:17:01] I'm going to meet up with some other WMF folks so will be afk while I'm in transit
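Note on the snapshot cleanup discussed between 12:55 and 13:59: the selection rule agreed there (unprotected, name containing "_cloudbackup", nothing from before 2026) could be sketched roughly as below. This is not the script that was actually run. The record fields (image, name, protected, created), the pool name, the sample snapshot names, and the use of "rbd snap rm" as the removal command are all assumptions made for illustration. Like the etherpad run, it only builds the command strings and does not execute anything.

```python
from datetime import datetime, timezone

# Snapshots created before this date are left alone, whatever their origin.
CUTOFF = datetime(2026, 1, 1, tzinfo=timezone.utc)

# Substring identifying backup-created snapshots, as agreed in the chat.
BACKUP_MARKER = "_cloudbackup"


def select_snapshots(snapshots):
    """Keep only unprotected, backup-named snapshots created on or after CUTOFF."""
    selected = []
    for snap in snapshots:
        if snap["protected"]:
            continue  # protected snapshots may have volumes derived from them
        if BACKUP_MARKER not in snap["name"]:
            continue  # probably a user-created (cinder/horizon) snapshot
        if snap["created"] < CUTOFF:
            continue  # pre-2026 snapshots are excluded from this cleanup
        selected.append(snap)
    return selected


def removal_commands(pool, snapshots):
    """Build (but do not run) one removal command per selected snapshot."""
    return [
        f"rbd snap rm {pool}/{snap['image']}@{snap['name']}"
        for snap in select_snapshots(snapshots)
    ]


# Hypothetical sample records; real names and fields will differ.
SAMPLE = [
    {"image": "volume-a", "name": "2026-02-01_cloudbackup", "protected": False,
     "created": datetime(2026, 2, 1, tzinfo=timezone.utc)},
    {"image": "volume-a", "name": "2025-11-01_cloudbackup", "protected": False,
     "created": datetime(2025, 11, 1, tzinfo=timezone.utc)},
    {"image": "volume-b", "name": "user-snapshot", "protected": False,
     "created": datetime(2026, 2, 1, tzinfo=timezone.utc)},
    {"image": "volume-c", "name": "2026-03-01_cloudbackup", "protected": True,
     "created": datetime(2026, 3, 1, tzinfo=timezone.utc)},
]

if __name__ == "__main__":
    for command in removal_commands("example-pool", SAMPLE):
        print(command)
```

The printed lines can then be reviewed and fed one at a time to a throttled loop such as the "while read line" one quoted at 13:28.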