[06:03:33] Traffic, Operations, Readers-Web-Backlog: [Bug] iPadOS 13 shows the desktop version of Safari with a broken layout - https://phabricator.wikimedia.org/T229875 (AndyRussG) >>! In T229875#5432239, @dr0ptp4kt wrote: > @ovasileva okay to redirect Safari desktop to mdot? > > CC @DStrine @mepps @ejegg @BB...
[06:31:09] Domains, Traffic, Operations, WMF-Legal, Patch-For-Review: Move wikimedia.ee under WM-EE - https://phabricator.wikimedia.org/T204056 (tramm) We have configured the name servers with Elkdata, I suppose the update can be finalized now.
[08:07:39] Traffic, Operations: Allow blocking requests from specific networks on the edge - https://phabricator.wikimedia.org/T231063 (ema)
[08:07:47] Traffic, Operations: Allow blocking requests from specific networks on the edge - https://phabricator.wikimedia.org/T231063 (ema) p:Triage→Normal
[08:17:18] Traffic, Operations, Patch-For-Review: Allow blocking requests from specific networks on the edge - https://phabricator.wikimedia.org/T231063 (Joe) I think it's good to have a first, simple implementation, like the one above, but I think going further we would need a "block" object in puppet (or els...
[08:24:12] Traffic, Operations, Patch-For-Review: Allow blocking requests from specific networks on the edge - https://phabricator.wikimedia.org/T231063 (ema) >>! In T231063#5433306, @Joe wrote: > I think it's good to have a first, simple implementation, like the one above, but I think going further we would n...
[11:15:57] Traffic, netops, Operations, IPv6, Patch-For-Review: Fix IPv6 autoconf issues once and for all, across the fleet. - https://phabricator.wikimedia.org/T102099 (jcrespo) Hi, I am bit disconnected about the planning of deployment of this- Once all hosts (or all hosts that are planned above being...
[14:23:51] Traffic, Operations, ops-eqsin: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (elukey)
[14:24:04] this seems interesting --^
[14:24:16] didn't dig into it but added the traffic tag :)
[14:29:24] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (CDanis) Looks like we are indeed serving 404s for this object from codfw and all points west: `✔️ cdanis@cdanis ~ 🕥☕ for DC in esams eqiad codfw ulsfo eqsin ; do echo -ne "$...
[14:30:58] ema: bblack: do we cache 404s for upload.wm.o?
[14:32:30] godog: I don't suppose you're still about? not totally sure how to map from an upload URL to a backing Swift object
[14:37:17] cdanis: eehhh the logic is in our custom middleware in ./modules/swift/files/SwiftMedia/wmf/rewrite.py
[14:37:36] cdanis: basically a mapping from "language + project" to the corresponding swift container
[14:42:33] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (CDanis) Indeed, the object exists on eqiad, but never made it to codfw swift: `✔️ cdanis@ms-fe1005.eqiad.wmnet ~ 🕥☕ swift stat wikipedia-commons-local-public.8c '8/8c/Vier_F...
[14:42:39] godog: ty, [[Media storage]] helped me
[14:42:49] (and I added a cross-link from the swift how to do stuff page)
[14:43:22] however I don't know that the stuff on [[Media storage]] about replication is particularly accurate?
[14:43:28] how does cross-DC replication work?
[14:44:05] ah! thanks, the links are helpful
[14:44:20] yeah that section is from many moons ago
[14:44:34] ATM mediawiki writes to both datacenters "synchronously"
[14:45:04] hm
[14:45:26] well clearly it fails at least some of the time ;)
[14:45:35] we also used to run a script to eventually (hah!)
reconcile the state but afair that wasn't necessary anymore since mediawiki's filebackend was "synchronous"
[14:45:38] indeed
[14:45:49] cdanis: we do cache 404s, yes!
[14:46:22] cdanis: see eg `curl -v https://upload.wikimedia.org/banana 2>&1 | grep -i x-cache:`
[14:46:47] ema: ah yeah, as soon as I asked, I looked at x-cache: output
[14:47:21] godog: ema: do you think it makes sense in the case of a 404 on swift reads in one DC, to retry on the other?
[14:47:37] I can imagine arguments either way, but it would help here
[14:47:51] so codfw swift does not have the image in this case, eqiad does?
[14:48:32] indeed
[14:48:41] when we configured swift as active/active my understanding was that eqiad and swift had the same stuff
[14:48:51] eqiad and codfw I mean
[14:48:53] yeah, see just above -- they are supposed to, and almost always do
[14:48:56] MW writes to both
[14:49:23] but it looks like it failed in this case, and whatever eventually-consistent background process that catches up on write failures has not yet had sufficient 'eventually'
[14:49:35] oh, I see!
[14:50:58] indeed, in this case it would have helped I think yeah, also I'll be looking at why codfw is missing that object on mon
[14:52:19] where do you look? logstash?
[14:52:31] FTR I think swiftrepl is the keyword for the software that was keeping things in sync
[14:52:44] yeah that'd be my first place cdanis
[14:53:00] and indeed swiftrepl is the thing we used to reconcile
[14:53:09] no hits in puppet for swiftrepl ;)
[14:53:55] there's a task about that™
[14:54:52] ahahaha you are not joking
[14:56:15] hahaha yeah I wish I were tho
[14:58:41] The run of repl_all.sh of swiftrepl is not puppetized and is running on a SCREEN on a single host (ms-fe1005 as of now).
[15:00:21] I don't see a screen or a tmux running on any of the swift FE hosts
[15:00:39] did we decide to turn it off when we went active/active and just trust in the synchronous replication working correctly?
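The idea floated above (on a 404 from swift in one DC, retry the read against the other DC) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual upload LB or swift proxy logic: `make_backend`, `read_with_fallback`, and the object path are hypothetical names, and the in-memory dicts stand in for the real codfw and eqiad swift clusters.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A response is (HTTP status, body or None). Hypothetical shape for this sketch.
Response = Tuple[int, Optional[bytes]]
Fetch = Callable[[str], Response]


def make_backend(objects: Dict[str, bytes]) -> Fetch:
    """Build a fake backend from a dict; stands in for one DC's swift cluster."""
    def fetch(path: str) -> Response:
        if path in objects:
            return 200, objects[path]
        return 404, None
    return fetch


def read_with_fallback(
    path: str, backends: List[Tuple[str, Fetch]]
) -> Tuple[str, int, Optional[bytes]]:
    """Try each backend in order and stop at the first non-404 answer.

    If every backend answers 404, the last backend's 404 is returned.
    """
    if not backends:
        raise ValueError("at least one backend is required")
    result: Tuple[str, int, Optional[bytes]] = ("", 404, None)
    for name, fetch in backends:
        status, body = fetch(path)
        result = (name, status, body)
        if status != 404:
            break
    return result


if __name__ == "__main__":
    # Assumed scenario mirroring the discussion: codfw lacks the object, eqiad has it.
    path = "8/8c/Example.jpg"  # hypothetical object name
    backends = [
        ("codfw", make_backend({})),
        ("eqiad", make_backend({path: b"image bytes"})),
    ]
    print(read_with_fallback(path, backends))
```

Note that, as confirmed above, 404s are cached at the edge, so a fallback like this would only help requests that reach the backend after the cached 404 expires or is purged.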
[15:01:01] is there any icinga monitoring that it is running?
[15:01:48] IIRC yes we did turn it off in favour of mw sync replication
[15:01:58] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (aaron) Isn't there a swiftrepl background process to fix this?
[15:02:22] there are icinga checks if the number of thumbnails drifts between eqiad and codfw iirc
[15:03:13] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (CDanis) >>! In T231086#5433993, @aaron wrote: > Isn't there a swiftrepl background process to fix this? @fgiunchedi tells me that we turned off swiftrepl once the work in {T...
[15:03:30] hah aaron isn't on this channel, thanks cdanis for the update on task
[15:03:50] is there any reason why we couldn't run swiftrepl still?
[15:03:59] (does it have a dry run mode?)
[15:05:20] not iirc, but should be simple enough to add heh
[15:05:48] ok I'll poke around some more later today
[15:07:35] thanks! I did some work on swiftrepl years ago but never put that version in actual production
[15:07:52] the details in the task about swiftrepl not being puppetized are still true, i.e. it should work on ms-fe1005
[15:10:06] eheh
[15:10:35] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (fgiunchedi) >>! In T231086#5433994, @CDanis wrote: >>>! In T231086#5433993, @aaron wrote: >> Isn't there a swiftrepl background process to fix this? > > @fgiunchedi tells me...
[15:11:21] brb
[15:44:01] Traffic, Operations, media-storage: Picture from commons not found from Singapure - https://phabricator.wikimedia.org/T231086 (aaron) Still, a file was only uploaded, and no other operations done...I'm not sure why the DB would commit if the file store failed in one of the FileBackendMultiwrite backe...
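For the "dry run mode" question above, a reconcile pass with a dry-run switch could look something like the sketch below. This is not swiftrepl's actual code or interface: `reconcile`, its parameters, and the object names are hypothetical, and plain Python lists stand in for container listings from the two clusters.

```python
from typing import Callable, Iterable, List


def reconcile(
    source_names: Iterable[str],
    dest_names: Iterable[str],
    copy: Callable[[str], None],
    dry_run: bool = True,
) -> List[str]:
    """Find objects present in the source listing but absent from the destination.

    With dry_run=True (the default) nothing is copied; the planned copies are
    only printed. With dry_run=False, copy(name) is called for each missing object.
    """
    missing = sorted(set(source_names) - set(dest_names))
    for name in missing:
        if dry_run:
            print(f"would copy {name}")
        else:
            copy(name)
    return missing


if __name__ == "__main__":
    # Hypothetical listings: eqiad has one object that codfw lacks.
    eqiad = ["8/8c/Example.jpg", "a/ab/Other.png"]
    codfw = ["a/ab/Other.png"]
    reconcile(eqiad, codfw, copy=lambda name: None)  # dry run by default
```

Defaulting to dry run is a deliberate choice in this sketch: a tool that has been switched off for a while is safer to bring back if it reports drift before it writes anything.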
[15:45:59] Traffic, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (Ammarpad)
[15:50:50] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (Krinkle)
[15:51:48] Traffic, Operations, Readers-Web-Backlog: [Bug] iPadOS 13 shows the desktop version of Safari with a broken layout - https://phabricator.wikimedia.org/T229875 (ovasileva) > In the longer term, we do need to think about responsive design for banners, and how CentralNotice and Advancement banners (like...
[16:04:40] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (Krinkle) The file was uploaded July 24, for which the logs will be deleted over the next 24 hours. Actual upload was logged at "14...
[16:25:43] netops, Operations, cloud-services-team: Review switches ACL to connect from tools-bastion to dbproxy1019 - https://phabricator.wikimedia.org/T230980 (ayounsi) a:ayounsi That's the change that need to be pushed to cr1/2-eqiad: `lang=diff [edit firewall family inet filter labs-instance-in4 term la...
[17:03:29] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (CDanis) I did some digging in the swift logs around the time the file was uploaded; there's no record of swift in codfw ever recei...
[17:09:51] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (CDanis) BTW for posterity, here's how I looked for logs: on one of the syslog centralservers (e.g. wezen): `ls /srv/syslog/archiv...
[18:04:44] Domains, Traffic, Operations, WMF-Legal, Patch-For-Review: Move wikimedia.ee under WM-EE - https://phabricator.wikimedia.org/T204056 (Slaporte) Great. We are updating with the registrar now.
[18:15:58] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: upload LB: retry 404s cross-cluster - https://phabricator.wikimedia.org/T231108 (CDanis)
[19:43:22] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (Der_Keks) A stupid question: Are there any techniques implemented to synchronize replications or to report faulty replications? Th...
[19:44:08] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (CDanis) @Der_Keks yes, that is the purpose of the aforementioned `swiftrepl` daemon.
[19:45:12] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: Picture from Commons not found from Singapore - https://phabricator.wikimedia.org/T231086 (Der_Keks) Ah okay and that's obviously disabled I understand.
[20:57:57] Traffic, Commons, MediaWiki-File-management, Operations, media-storage: upload LB: retry swift 404s cross-cluster - https://phabricator.wikimedia.org/T231108 (CDanis)
[21:09:05] Domains, Traffic, Operations, WMF-Legal, Patch-For-Review: Move wikimedia.ee under WM-EE - https://phabricator.wikimedia.org/T204056 (Slaporte) Open→Resolved The nameserver is updated and appears to be working!
[21:31:37] Domains, Traffic, Operations, WMF-Legal, Patch-For-Review: Move wikimedia.ee under WM-EE - https://phabricator.wikimedia.org/T204056 (Quiddity) Thanks all, and happy 17th birthday to Estonian Wikipedia!
[21:52:11] Traffic, Operations, User-DannyS712: 503: backend fetch failed - https://phabricator.wikimedia.org/T231121 (DannyS712)