[13:46:00] godog: my plan for this morning: back up ms-fe1005's swiftrepl.py, replace it with the one currently in git master, make a version of repl_all.sh that only does replicates and not deletes, start that in a root screen session
[13:46:55] cdanis: sounds good to me, +1
[13:47:17] if we can't get to the bottom of the MW issues and need to keep this running, I'll systemd-ify it
[13:48:37] <_joe_> cdanis, godog given how badly swift's own replication is doing for the docker registry
[13:48:57] <_joe_> should we try to use swiftrepl or an adapted version to replicate those as well?
[13:50:00] heh so overlapping/confusing terminology here, docker registry is using container synchronization as opposed to swift intra-cluster replication
[13:50:17] <_joe_> oh right sorry
[13:50:32] but also I'm not aware of swift working badly for docker registry, what's specifically not working ?
[13:50:44] <_joe_> the replication lag can surge to hours
[13:51:06] <_joe_> so when you upload an image to codfw, eqiad's swift has it hours later
[13:51:29] <_joe_> there were various tickets, I thought fabian talked with you about those
[13:52:37] I remember some of those conversations yeah, my understanding was that it was okay now though
[13:53:10] next time there's a reoccurrence and/or somebody reports problems please file a task, I can't seem to find open ones about synchronization lagging
[14:04:11] hello hello, going to start the eqiad's router routing engines replacement in ~1h, please let me know if you know of any easy to depool services "just in case"
[14:12:43] nevermind, this is postponed
[14:15:07] running for just a few minutes and already synced 125 files
[14:16:20] cdanis: of thumbs or originals ?
[14:16:25] originals
[14:16:34] 154 now
[14:19:33] oof
[14:21:33] godog: in theory we need another swiftrepl running somewhere with opposite src/dst settings, right?
[14:23:16] cdanis: afair mw should prefer eqiad over codfw, so in theory not but I'm not so sure anymore, so a dry-run with swapped src/dst is certainly in order
[14:23:39] brb
[14:32:02] <_joe_> cdanis: that's tricky, in theory none of this should happen, so also in theory that shouldn't be needed
[14:32:08] <_joe_> but who knows at this point?
[14:32:15] <_joe_> a dry run is certainly a good idea
[14:33:14] adding dry run is a good idea and would be nontrivial code changes to the script :)
[14:34:19] I think it would be safe enough to run the opposite direction in the same mode of copying nonexistent files, but not doing any deletes
[14:35:28] *nod* yeah seems safe
[17:05:01] it's gotten stuck due to some logic error, I'll investigate later
[17:05:14] replicated about 2270 files though
[17:05:20] and finished all of commons
[17:17:25] cdanis: sounds good, thanks for the update
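
The "copy nonexistent files, no deletes" mode discussed above, along with the dry run that was suggested, could look roughly like the sketch below. This is not swiftrepl's actual code: plain dicts stand in for the object listings of the two clusters, and the names `plan_copy_missing` and `replicate_missing` are hypothetical, made up for illustration.

```python
from __future__ import annotations

from typing import Iterable


def plan_copy_missing(src_names: Iterable[str], dst_names: Iterable[str]) -> list[str]:
    """Return the object names present in src but absent from dst, sorted."""
    return sorted(set(src_names) - set(dst_names))


def replicate_missing(
    src: dict[str, bytes], dst: dict[str, bytes], dry_run: bool = True
) -> list[str]:
    """Copy objects that dst lacks; never delete or overwrite anything in dst.

    The dicts are an assumption standing in for container listings.
    With dry_run=True the plan is returned but nothing is written.
    """
    missing = plan_copy_missing(src, dst)
    if not dry_run:
        for name in missing:
            dst[name] = src[name]
    return missing


if __name__ == "__main__":
    eqiad = {"a.jpg": b"1", "b.jpg": b"2"}
    codfw = {"b.jpg": b"2", "c.jpg": b"3"}
    # Dry run first in each direction, then the real copy.
    print("would copy eqiad -> codfw:", replicate_missing(eqiad, codfw))
    print("would copy codfw -> eqiad:", replicate_missing(codfw, eqiad))
    replicate_missing(eqiad, codfw, dry_run=False)
    replicate_missing(codfw, eqiad, dry_run=False)
```

Because neither direction deletes or overwrites, running the same function with src and dst swapped only ever adds objects, which is why the opposite-direction run was judged safe in the discussion.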