[05:05:55] 07Varnish, 10MobileFrontend, 13Patch-For-Review, 03Reading Web Sprint 70 L: Stop default redirecting Samsung Smart TVs to mobile web - https://phabricator.wikimedia.org/T127021#2197939 (10phuedx) [06:21:07] 10Traffic, 10Analytics, 10DNS, 06Operations: Create data.wikimedia.org - https://phabricator.wikimedia.org/T132407#2198010 (10Peachey88) [06:21:17] 10Traffic, 10Analytics, 10DNS, 06Operations: Create data.wikimedia.org - https://phabricator.wikimedia.org/T132407#2197243 (10Peachey88) Will it be a wiki? microsite? [08:35:39] 07Varnish, 10MobileFrontend, 13Patch-For-Review, 03Reading Web Sprint 70 L: Stop default redirecting Samsung Smart TVs to mobile web - https://phabricator.wikimedia.org/T127021#2198126 (10phuedx) 05stalled>03Open [09:32:20] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198273 (10ema) [09:32:30] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198285 (10ema) p:05Triage>03Normal [10:15:09] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198317 (10ema) Another crash, this time on cp4017: Apr 12 07:30:32 cp4017 varnishstatsd[18631]: Traceback (most recent ca... [10:37:11] 10Traffic, 06Analytics-Kanban, 06Operations, 13Patch-For-Review: varnishkafka logrotate cronspam - https://phabricator.wikimedia.org/T129344#2198329 (10elukey) ``` root@carbon:~# reprepro ls varnishkafka varnishkafka | 1.0.2-1 | precise-wikimedia | amd64, source varnishkafka | 1.0.6-1 | trusty-wikimedia |... [10:56:44] ema: how does cache_route get controlled/set for VTC? 
[11:00:25] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198388 (10BBlack) It looks like various crashes like these have been happening for a while, and puppet runs are what normall... [11:33:52] 07HTTPS, 10Traffic, 06Operations: HTTPS Plans (tracking / high-level info) - https://phabricator.wikimedia.org/T104681#2198426 (10BBlack) [11:38:01] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2198434 (10BBlack) [11:38:04] 10Traffic, 06Operations, 10fundraising-tech-ops, 13Patch-For-Review: Decide what to do with *.donate.wikimedia.org subdomain + TLS - https://phabricator.wikimedia.org/T102827#2198433 (10BBlack) [11:42:50] <_joe_> bblack/ema: I would like to have a change prepared for the "Deploy Varnish to switch backend to appserver.svc.codfw.wmnet/api.svc.codfw.wmnet" step in the mediawiki switchover procedure (https://wikitech.wikimedia.org/wiki/Switch_Datacenter#MediaWiki-related) [11:42:58] <_joe_> can you take care of that? [11:43:14] 07HTTPS, 10Traffic, 06Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#2198443 (10BBlack) [11:43:16] 07HTTPS, 10Traffic, 06Operations: Preload HSTS for select hostnames within wikimedia.org - https://phabricator.wikimedia.org/T111967#2198441 (10BBlack) 05Open>03declined With the impending removal of *.donate, we'll actually finally be able to HSTS wikimedia.org itself at the DNS level. 
[11:43:24] <_joe_> I created a puppet branch called "switchover" where I am submitting all the changes related to the switchover [11:43:47] 07HTTPS, 10Traffic, 06Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#1411365 (10BBlack) [11:43:50] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2198444 (10BBlack) [11:44:16] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-General-or-Unknown: Check all wikis for inclusions of http resources on https - https://phabricator.wikimedia.org/T36670#2198447 (10BBlack) [11:44:18] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#1142785 (10BBlack) [11:47:13] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2198449 (10BBlack) Note: this is already the case with two exceptions: 1. We're not sending HSTS (or even forcing HTTPS) on all of cache_misc's service hostnames yet 2. We're not sending HSTS for the *... [11:47:29] 07HTTPS, 10Traffic, 06Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#2198451 (10BBlack) [11:47:31] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2198452 (10BBlack) [11:50:40] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2198458 (10BBlack) [12:20:31] _joe_: you know the prepped commit will probably end up with merge conflicts from others right? [12:20:46] since the routes for all the various varnish->app are nearby each other in the same file heh [12:25:25] anyways, I made one at https://gerrit.wikimedia.org/r/#/c/282910/ and put it in your wiki page [12:38:03] 07HTTPS, 10Traffic, 06Operations: Preload HSTS for select hostnames within wikimedia.org - https://phabricator.wikimedia.org/T111967#2198576 (10Chmarkine) >>! 
In T111967#2198441, @BBlack wrote: > With the impending removal of *.donate, we'll actually finally be able to HSTS wikimedia.org itself at the DNS le... [12:41:01] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198613 (10elukey) The errors seems to be related to two tags: ``` 110 elif tag == 'BackendXID': 111 # Associate... [12:47:47] 07HTTPS, 10Traffic, 06Operations: Preload HSTS for select hostnames within wikimedia.org - https://phabricator.wikimedia.org/T111967#2198642 (10BBlack) Right: "at the DNS level" means it's fixed in terms of not having bad DNS names with no matching certs. We still have services that don't support HTTPS, or... [12:55:40] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198669 (10elukey) ``` /usr/local/lib/python2.7/dist-packages/varnishlog.py 100 def _vsl_handler(priv, tag_id, fd, lengt... [13:27:30] 10Traffic, 06Commons, 06Operations, 10media-storage, and 2 others: Deleted files sometimes remain visible to non-privileged users if permanently linked - https://phabricator.wikimedia.org/T109331#2198788 (10matmarex) [13:28:28] 10Traffic, 06Commons, 06Operations, 10media-storage, and 2 others: Deleted files sometimes remain visible to non-privileged users if permanently linked - https://phabricator.wikimedia.org/T109331#2198793 (10matmarex) I made the task public, with this many duplicates filed there is really no reason not to.... 
[13:33:17] bblack: cache_route for VTC has no special handling [13:34:45] well the test is only going to work on cache_route == direct [13:34:53] (which is the only case we do_stream = false) [13:36:00] right, so in the specific case of 09-chunked-response-add-cl.vtc /etc/varnish/misc-backend.inc.vcl has the do_stream=false part on my test instance [13:36:03] <_joe_> bblack: meh, you're right [13:36:12] Hello traffic people, if you want we can update the varnishkafka package to its new version [13:38:44] ema: I have a spam-load of HTTPS-related changes (mostly misc, but also affects common VCL) coming through [13:39:03] oh I guess you know, since you set up auto-review :) [13:39:24] bblack: yes I've noticed! :) [13:42:00] bblack: oh now I see what you mean. The test makes sense on eqiad but not on codfw for example [13:42:09] yeah [13:42:37] re: HTTPS - basically dzahn's shortly going to delete *.donate.wm.o from DNS, which was a major non-technical blocker (as in, nothing we can do about it) for HSTS on wikimedia.org [13:43:19] I (we?) kinda stopped caring too hard about fixing HTTPS issues on misc services (those on cache_misc, and the independent ones that are on no cache cluster at all) because we knew the donate issue was blocking wm.o HSTS anyways... [13:43:41] but now that that's gone, I want to try another push at cleaning up cache_misc to be all-HTTPS, and then go after the other services [13:44:04] they're probably all trivial, modulo someone griping about some internal tool here or there which can't do HSTS or follow redirects and hits those services... [13:44:13] 10Traffic, 10Analytics, 10DNS, 06Operations: Create data.wikimedia.org - https://phabricator.wikimedia.org/T132407#2198830 (10MZMcBride) >>! In T132407#2198012, @Peachey88 wrote: > Will it be a wiki? microsite? redirect to a page on meta? It sounds like this is more of a request for a server to host/run t... 
[13:48:21] 10Traffic, 10DNS, 10Fundraising-Backlog, 06Operations, and 2 others: Updating DNS records for Major Gifts subdomain (benefactors.wikimedia.org) - https://phabricator.wikimedia.org/T130937#2151449 (10BBlack) Before I merge the patch above: is all of the outbound benefactors email stopped for the duration of... [13:53:56] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2198859 (10ema) >>! In T132430#2198388, @BBlack wrote: > 1. Fix the crashes Yep. > 2. systemd should restart it unless it's... [13:57:19] 10Traffic, 06Discovery, 06Operations, 10Wikidata, and 2 others: Empty result on a tree query - https://phabricator.wikimedia.org/T127014#2198865 (10BBlack) [13:57:22] bblack: also I've noticed that varnishxcps *does* produce some stats on maps machines so it is not text-only as we thought [13:57:24] 10Traffic, 07Varnish, 06Operations, 13Patch-For-Review: cache_misc's misc_fetch_large_objects has issues - https://phabricator.wikimedia.org/T128813#2198862 (10BBlack) 05Open>03Resolved a:03BBlack This should be resolved now, modulo perhaps some cache entries that need to fall out in the near future... [13:58:01] ema: I thought I stopped it and disabled it for maps? [13:59:29] bblack: right, but I've "ported" it (renamed one tag) and run it by hand on the maps machines. It does produce some stats so we should probably merge the ported version https://gerrit.wikimedia.org/r/#/c/282887/ [14:00:12] are you saying we should add it back to other clusters? 
[14:00:39] it shouldn't be producing output on others, as there should be no requests for load.php on them (well, legitimate ones that matter anyways - I guess someone can always go get a 404 for it if they want) [14:00:57] oh no, that's not the one about load.php [14:01:06] oh sorry [14:01:08] it's about X-Client-Connection [14:01:09] varnishrls [14:01:11] ok [14:01:25] yes, xcps should produce stats everywhere [14:01:43] we just decided not to care that we missed it in the initial maps conversion, because maps total traffic is a small contribution to our net xcps stats [14:03:20] exactly but it caught my eye today because of the icinga warnings on maps and I've noticed it's really easy to port to v4 [14:04:29] ok [14:08:21] 10Traffic, 10DNS, 10Fundraising-Backlog, 06Operations, and 2 others: Updating DNS records for Major Gifts subdomain (benefactors.wikimedia.org) - https://phabricator.wikimedia.org/T130937#2198881 (10CCogdill_WMF) Yes @BBlack, you're good to go. We aren't pushing any events right now so the system is dormant. [14:11:02] bblack: re: reusing 751 for HTTPS redirects, the only difference between the 755 and 751 handling code is that with 751 we also set CL to 0 [14:11:17] 10Traffic, 10DNS, 10Fundraising-Backlog, 06Operations, and 2 others: Updating DNS records for Major Gifts subdomain (benefactors.wikimedia.org) - https://phabricator.wikimedia.org/T130937#2198884 (10BBlack) 05Open>03Resolved [14:11:25] I guess it's totally fine but I've just noticed the difference now [14:11:28] ema: yeah the 751 version is more-correct [14:11:32] awesome [14:12:33] 10Traffic, 10Analytics, 10DNS, 06Operations: Create data.wikimedia.org - https://phabricator.wikimedia.org/T132407#2197243 (10Milimetric) There's some debate about this. We haven't used data.wikimedia.org in the past because of the possible confusion with wikidata. The wmflabs domain seems less "producti...
[14:18:03] elukey: +1 for upgrading varnishkafka on maps machines [14:18:34] heh I screwed up the TLS redirect in the common code, forgot varnish3 clause for 751 [14:22:08] bblack: oh wow good catch [14:23:21] it's what I get for rushing through things :P [14:23:41] but sometimes there's just a backlog of junk that needs fixing up that doesn't fit neatly into the existing tickets and things need to Get Done [14:24:01] someday when our world is a better place, I like to think we'd have higher standards for reviewing and pushing work though! :) [14:24:36] some sunny day! [14:26:16] <_joe_> bblack: when the world is a better place and 10% of the silicon valley GDP is converted into wikipedia donations? [14:26:48] yes [14:27:12] they all use our data anyways. it should be considered a tithe on the profit-seekers :P [14:27:34] (amazon uses it for kindle context lookups, google uses it in search results) [14:28:38] facebook uses it in the internet.org thing, which in spite of the marketing is still about them capturing marketing eyeballs [14:32:07] bblack: uh ok I guess 05-gerrit.w.o-pass.vtc needs to be changed as well :) [14:32:12] let me do that [14:33:01] 07HTTPS, 10Traffic, 06Operations: Preload HSTS for select hostnames within wikimedia.org - https://phabricator.wikimedia.org/T111967#1621104 (10Dzahn) >>! In T111967#2198576, @Chmarkine wrote: > * https://status.wikimedia.org/ -- cert mismatch T34796 but stalled > * https://mirrors.wikimedia.org/ -- canno...
[14:33:37] 07HTTPS, 10Traffic, 06Operations: enable https for mirrors.wikimedia.org - https://phabricator.wikimedia.org/T132450#2198960 (10Reedy) [14:47:52] bleh, the HSTS preload-checker doesn't like 404s [14:48:09] I guess we could synth a 200 OK in varnish [14:51:38] 10Traffic, 06Operations: HSTS preload for wmfusercontent.org - https://phabricator.wikimedia.org/T132452#2198983 (10BBlack) [14:52:02] 10Traffic, 06Operations: HSTS preload for wmfusercontent.org - https://phabricator.wikimedia.org/T132452#2198997 (10BBlack) [14:52:05] 07HTTPS, 10Traffic, 06Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#2198996 (10BBlack) [14:52:39] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2198998 (10Dzahn) Hi @CCogdill_WMF i'm back and you said Tuesday or Friday are good days for your s... [14:53:04] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2198999 (10BBlack) I should've added: 3. We're also not convered for redirect/HSTS on all the non-cache_misc direct services [14:53:34] Hi, I'm back. I'm seeing another issue that I believe is related to Varnish caching but not sure how to fix properly. Special:Random will show the same single or few pages repeatedly unless a cookie is set, e.g. from visiting the login page. I tried adding a fix to vcl_hit() that worked in the test environment but in production it reduced the cache hit ratio from 87% to about 50%. [14:54:00] The addition to the end of vcl_hit() was: if (obj.ttl > 0s) { return (pass); } [14:54:33] That doesn't feel right but it did seem to work as expected other than the performance degradation. [14:55:02] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) 
from DNS - https://phabricator.wikimedia.org/T130414#2199003 (10CCogdill_WMF) Sure @Dzahn, I can make today work. [14:55:38] justinl: MediaWiki should be handling that. on our wikis, requests to MW for Special:Random emit this header: [14:55:41] Cache-control: private, must-revalidate, max-age=0 [14:55:44] which tells varnish not to cache it... [14:56:06] does yours emit that? [14:56:25] Hmm, let me check... [14:57:17] Yes it does. [14:57:41] does your VCL have some hack that's forcing cacheability for /wiki/ ? [14:58:06] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishstatsd crashes with ValueError in vsl_callback without being restarted by systemd - https://phabricator.wikimedia.org/T132430#2199012 (10ema) varnishstatsd has now crashed on cp4009, cp4017 and cp4018. For some reason it only seems to be crashing in u... [14:58:11] I've lost the paste link you gave before [14:58:37] Let me get the link... [14:59:05] http://pastebin.com/nfs0yUKd [14:59:43] Actually that's a bit dated, missing the change to the req.http.Cookie header, let me update it... [15:00:32] ok updated lines 48 and 53 to be current [15:04:09] ok so [15:04:14] in vcl_deliver you have this: [15:04:16] if (beresp.ttl < 48h) { set beresp.ttl = 48h;} [15:04:38] right, which is from the mediawiki varnish page [15:04:42] which is going to cache uncacheable objects [15:05:12] 10Traffic, 06Commons, 06Operations, 10media-storage, and 2 others: Deleted files sometimes remain visible to non-privileged users if permanently linked - https://phabricator.wikimedia.org/T109331#2199048 (10csteipp) a:05csteipp>03None [15:05:25] ok, so a beresp.ttl of 0s will get cached when it shouldn't? 
[15:06:10] right [15:06:29] probably what you want, earlier than that in your vcl_deliver, is something like: [15:07:11] if (beresp.ttl <= 0s) { set beresp.ttl = 601s; return (hit_for_pass); } [15:07:22] the current version of the mediawiki varnish page follows up the 48h block with: if (!beresp.ttl > 0s) { return (hit_for_pass); } [15:07:29] ok, so close :) [15:07:41] well a couple problems with the mediawiki page: [15:07:50] interesting. ok, let me try that in my test environment. [15:07:58] ok? i'm listening :) [15:08:05] 1) if that's after the 48h block, then clearly it's not doing anything, since the 48h block raises all <= 0s TTLs to 48h [15:08:43] 2) That leaves the hit_for_pass object at 0s, which isn't ideal either. my version makes the hit-for-pass persist for 10 minutes. [15:09:15] on tuning the 10 minutes value, the tradeoff is this: [15:10:26] 1. By making it longer, you make things more efficient. Every time the hit-for-pass expires, the first few requests that come in (if they're concurrent) will have to stall out on coalescing with each other (expecting a possible cache hit). But by making that only happen every 10 minutes, you minimize the impact and keep most of the requests as parallel passes. [15:10:50] 2. But if you make it too long, changes to underlying MediaWiki (version bumps or config changes that change the cacheability of pages) will take a long time to take effect. [15:12:26] Ok, I understand. Thanks for the elaboration. Let me try it in test to ensure it behaves as expected and, if so, I can try it quickly in live and revert if it acts unexpectedly. [15:12:53] it will probably affect more than just Special:Random, I'd assume there are other special MW outputs which were meant to be uncacheable as well [15:15:40] wait, i don't have a vcl_deliver() function defined. is that what you meant? [15:16:08] sorry, vcl_fetch [15:16:35] ok.
:) [15:16:59] also, notably-related: the default built-in varnish VCL (which runs at the end of your custom VCL functions, unless you return (foo) in all cases to avoid it), does: [15:17:02] # if (beresp.ttl <= 0s || [15:17:04] # beresp.http.Set-Cookie || [15:17:07] # beresp.http.Vary == "*") { [15:17:09] # /* [15:17:12] # * Mark as "Hit-For-Pass" for the next 2 minutes [15:17:14] # */ [15:17:17] # set beresp.ttl = 120 s; [15:17:19] which is similar in nature and accomplishes the same thing [15:17:21] # return (hit_for_pass); [15:17:24] # } [15:17:26] # return (deliver); [15:17:37] but the earlier "force all sub-48h TTLs to 48h, even if they're zero" is what's preventing that from taking effect in this case [15:18:33] note it also takes care of set-cookie efficiently too [15:20:19] 10Wikimedia-Apache-configuration, 06Operations, 13Patch-For-Review: Redirect for Wikimedia v NSA - https://phabricator.wikimedia.org/T97341#2199095 (10Varnent) 05Open>03Resolved Thank you @faidon for your quick work on this!!! I appreciate the notes about the ineffectiveness of this particular setup, an... [15:21:18] ok, i added the 601s block before the 48h block to my test env and it appears to work properly. so now I can try it in live and see if the behavior is the same (which i'd expect) and see what it does to the hit ratio, which I am now calculating using incoming vs. backend requests as you recommended last week [15:24:15] ok functionally it looks to be working in live as expected, now i just need to watch my graphs for a bit to see if/how things change [15:25:10] 07HTTPS, 10Traffic, 06Operations: let all services on misc-web enforce http->https redirects - https://phabricator.wikimedia.org/T103919#1402411 (10BBlack) I've done a bunch of cleanup on misc-web today, including: 1. removing the dead service entries (download, gerrit, rt) 2. inverting the existing TLS-red...
[15:25:26] awesome [15:27:32] 07HTTPS, 10Traffic, 06Operations: let all services on misc-web enforce http->https redirects - https://phabricator.wikimedia.org/T103919#2199127 (10BBlack) I should add: once we can kill the last entry in that last of HTTPS-exceptions, we can drop that whole block and simply set cache_misc's https_redirects... [15:32:11] ok it's going to take a little while for the cache to fill up sufficiently to show the general long-term performance expectations, but it's already about 75% after just a few minutes, so it's looking good so far. i'll keep an eye on it but i'm thinking this should do it. thank you so much! [15:32:13] 10Traffic, 06Discovery, 06Operations, 10Wikidata, 10Wikidata-Query-Service: Move wdqs to an LVS service - https://phabricator.wikimedia.org/T132457#2199146 (10BBlack) [15:32:15] 10Traffic, 10Analytics, 10DNS, 06Operations: Create data.wikimedia.org - https://phabricator.wikimedia.org/T132407#2199157 (10Nuria) >It sounds like this is more of a request for a server to host/run tools than a request for just a DNS A record. I'm curious why Wikimedia Labs is insufficient. Being a prod... [15:34:00] 10Traffic, 06Operations, 10Wikimedia-Logstash: Move logstash to an LVS service - https://phabricator.wikimedia.org/T132458#2199162 (10BBlack) [15:34:27] justinl: np! 
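[editor's note] The fix discussed above (15:04 to 15:21) can be sketched roughly as follows. This is a minimal illustration, not the actual config from the pastebin: it assumes Varnish 3 syntax (where the hook is vcl_fetch, as corrected at 15:16:08), keeps the 48h floor from the pasted VCL, and uses the 601s value suggested in the chat. Everything else in the real vcl_fetch is omitted.

```vcl
# Sketch only, Varnish 3 syntax. Assumes the rest of the pasted config
# is unchanged; only the ordering of these two checks matters here.
sub vcl_fetch {
    # The backend said "do not cache" (for example Special:Random sends
    # max-age=0, which leaves beresp.ttl at 0s). Record that decision as a
    # hit-for-pass object for about 10 minutes, so that later requests
    # are passed in parallel instead of queueing up for a possible hit.
    if (beresp.ttl <= 0s) {
        set beresp.ttl = 601s;
        return (hit_for_pass);
    }

    # Only after the check above: raise short but positive TTLs to 48h.
    if (beresp.ttl < 48h) {
        set beresp.ttl = 48h;
    }

    # No explicit return here, so the built-in vcl_fetch quoted at
    # 15:17 still runs and handles Set-Cookie and Vary: * responses.
}
```

The order is the point: with the 48h block first, a 0s TTL is raised to 48h before the hit-for-pass check can ever see it. On Varnish 4 and later the equivalent hook is vcl_backend_response and hit-for-pass is expressed by setting beresp.uncacheable, so this fragment would need adapting there.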
[15:35:44] bblack: i should hire your professional consulting services :) [15:36:00] heh [15:41:46] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for config-master.wikimedia.org - https://phabricator.wikimedia.org/T132459#2199185 (10BBlack) [15:42:54] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for git.wikimedia.org - https://phabricator.wikimedia.org/T132460#2199202 (10BBlack) [15:43:39] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for graphite.wikimedia.org - https://phabricator.wikimedia.org/T132461#2199215 (10BBlack) [15:43:50] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for parsoid-tests.wikimedia.org - https://phabricator.wikimedia.org/T132462#2199229 (10BBlack) [15:44:02] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2199245 (10BBlack) [15:44:15] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for transparency.wikimedia.org - https://phabricator.wikimedia.org/T132464#2199258 (10BBlack) [15:44:30] 07HTTPS, 10Traffic, 06Operations: HTTPS redirects for stats.wikimedia.org - https://phabricator.wikimedia.org/T132465#2199271 (10BBlack) [15:47:54] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199292 (10Dzahn) @CCogdill_WMF @bblack great, we can merge it now then it looks [15:57:35] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199348 (10Dzahn) @CCogdill_WMF We have merged the config change on the DNS servers. Changes should... 
[15:58:04] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 07Blocked-on-Fundraising-Tech: links.email.donate.wikimedia.org should offer HTTPS - https://phabricator.wikimedia.org/T74514#2199353 (10BBlack) [15:58:07] 10Traffic, 06Operations, 10fundraising-tech-ops, 13Patch-For-Review: Decide what to do with *.donate.wikimedia.org subdomain + TLS - https://phabricator.wikimedia.org/T102827#2199352 (10BBlack) [15:58:11] 07HTTPS, 10Traffic, 06Operations, 10Wikimedia-Fundraising, 13Patch-For-Review: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199351 (10BBlack) 05Open>03Resolved [15:58:18] 10Traffic, 06Community-Advocacy, 06Operations, 13Patch-For-Review: Fix/decom multiple-subdomain wikis in wikimedia.org - https://phabricator.wikimedia.org/T102826#2199358 (10BBlack) [15:58:20] 10Traffic, 06Operations: Clean up DNS/redirects for TLS - https://phabricator.wikimedia.org/T102824#2199359 (10BBlack) [15:58:22] 10Traffic, 06Operations, 10fundraising-tech-ops, 13Patch-For-Review: Decide what to do with *.donate.wikimedia.org subdomain + TLS - https://phabricator.wikimedia.org/T102827#1374642 (10BBlack) 05Open>03Resolved a:03BBlack They're gone! [16:04:53] bblack: there seems to be a VCL syntax error [16:04:58] 'text-common.inc.vcl' Line 71 Pos 78 [16:05:33] yup :/ [16:06:24] 07Varnish, 10MobileFrontend, 13Patch-For-Review, 03Reading Web Sprint 70 L: Stop default redirecting Samsung Smart TVs to mobile web - https://phabricator.wikimedia.org/T127021#2199386 (10zeljkofilipin) [16:08:15] bblack: perhaps we can use a "long string" instead? [16:08:19] https://www.varnish-cache.org/trac/wiki/VCLSyntaxStrings [16:09:53] yeah [16:10:34] ulsfo is depooled now, right? [16:11:33] esams is experiencing some minor packet loss [16:11:38] https://smokeping.wikimedia.org/smokeping.cgi?target=esams.Core.cr1-esams [16:12:38] paravoid: yes [16:14:06] paravoid: due to load from ulsfo you think? 
it seems longer than that, that it's had issues [16:14:09] 10Traffic, 06Commons, 06Operations, 10media-storage, and 2 others: Deleted files sometimes remain visible to non-privileged users if permanently linked - https://phabricator.wikimedia.org/T109331#2199455 (10NahidSultan) another one: https://upload.wikimedia.org/wikipedia/commons/6/60/Sajid-kiptus.webm htt... [16:14:51] no, I was just wondering if it gets worse whether we can depool esams too :) [16:15:00] bblack: I'd merge https://gerrit.wikimedia.org/r/#/c/282887/ (varnishxcps). Any objections? [16:15:28] paravoid: we should be able to :) [16:15:53] well last time we drained esams, our eqiad transits got saturated, so I'm not that sure :) [16:16:27] well that's regardless of ulsfo [16:16:38] in cache load terms, we should be able to [16:17:07] if we can't drain esams due to transit, yeah that sucks [16:18:05] esams doesn't gain any notable traffic rate from ulsfo depool, but eqiad/codfw do [16:18:11] 10Traffic, 06Analytics-Kanban, 06Operations, 13Patch-For-Review: varnishkafka logrotate cronspam - https://phabricator.wikimedia.org/T129344#2199461 (10Milimetric) p:05Triage>03Normal [16:18:24] mostly, codfw [16:18:50] so I guess that still shouldn't affect the eqiad transit picture much [16:19:23] but right now I think as a general rule, esams traffic all falls back to eqiad [16:19:43] if transit in eqiad can't take that, maybe we can by splitting some of it to codfw, which would be a general update to the geodns failover mapping [16:20:16] I know in past discussions we've always predicated that mapping on "any site should be able to handle the global edge load, so we just map for latency" [16:20:28] but if transit saturates in eqiad with esams down, that's not really true anymore [16:21:01] (and thus we're back to wanting some complex load-capacity-weighted auto-mapping system) [16:21:29] and/or a second site in europe [16:28:09] 07HTTPS, 10Traffic, 06Operations: HTTPS Plans (tracking / high-level info) -
https://phabricator.wikimedia.org/T104681#2199484 (10BBlack) [16:29:24] 07HTTPS, 10Traffic, 06Operations, 07Easy: WMF-Last-Access cookies doesn't set Secure flag - https://phabricator.wikimedia.org/T105451#2199492 (10BBlack) [16:29:26] 07HTTPS, 10Traffic, 07Varnish, 06Operations, 13Patch-For-Review: Mark cookies from varnish as secure - https://phabricator.wikimedia.org/T119576#2199493 (10BBlack) [16:42:27] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2199597 (10He7d3r) [16:50:40] <_joe_> how long do we cache 301 redirects? [16:50:44] <_joe_> bblack/ema ? [16:52:36] as long as the source tells us to, if it's from the app, I think [16:52:43] _joe_: ^ [16:52:57] what service/cluster/etc? [16:54:07] https://github.com/wikimedia/operations-puppet/blob/production/templates/varnish/text-common.inc.vcl.erb#L120 [16:54:11] does '60 s' make sense? [16:55:25] ema: that whole block makes no sense. it's been carried forward as "magical mobile VCL" from older and older versions of VCL with no idea how that really works [16:55:35] most likely, the whole feature is mis-designed :) [16:55:53] awesome :) [16:57:21] probably it should get forced to some realistic and efficient value like the default 3 days or whatever [16:57:45] if they're really relying on it only being 60 seconds, it's probably a broken feature that breaks randomly when multiple people flip the toggle switch within a minute [16:58:15] I really don't know, we'd have to start out at "what are these query arguments for wiki-feature-wise, and how is that supposed to work?"
[16:58:24] probably in MobileFrontend extension [16:59:51] I'm surprised it's not a syntax error [17:06:01] root@cp3034:~# journalctl _SYSTEMD_UNIT=varnishmedia.service --since today | wc -l [17:06:04] 20509 [17:06:06] mmmh [17:06:08] not cool [17:06:59] it seems to like to repeat 'Flush' all day long [17:13:10] 10Traffic, 06Commons, 06Operations, 10media-storage, and 2 others: Deleted files sometimes remain visible to non-privileged users if permanently linked - https://phabricator.wikimedia.org/T109331#2199673 (10matmarex) @bblack pointed out that if the upload.wikimedia.org URL is still accessible after deletin... [17:22:06] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishmedia: repeated calls to flush_stats() - https://phabricator.wikimedia.org/T132474#2199721 (10ema) [17:22:48] 10Traffic, 07Varnish, 10Analytics, 06Operations: varnishmedia: repeated calls to flush_stats() - https://phabricator.wikimedia.org/T132474#2199749 (10ema) p:05Triage>03Normal [17:24:11] heh [17:24:30] well maybe the flush is right, but it just needs to shut up about it [17:24:53] more than 600 times in one second? [17:25:11] I have no idea, maybe "flush" is supposed to be what it does after each logline [17:25:47] in any case, I'll be mostly-out most of the rest of the day [17:26:05] will check by a little later though [17:26:13] bblack: alright! See you tomorrow :) [17:26:20] cya [17:41:59] 10Traffic, 06Operations, 06Performance-Team, 13Patch-For-Review: Support HTTP/2 - https://phabricator.wikimedia.org/T96848#2199791 (10ori) Coloring in some additional details. I noticed a regression in first paint time on desktop over the past three months and found a correlated slump in the percent of cli... 
[17:59:46] HTTPS, Traffic, Operations, Release-Engineering-Team, Gitblit-Deprecate: HTTPS redirects for git.wikimedia.org - https://phabricator.wikimedia.org/T132460#2199866 (Dzahn)
[18:00:19] HTTPS, Traffic, Operations, Parsoid: HTTPS redirects for parsoid-tests.wikimedia.org - https://phabricator.wikimedia.org/T132462#2199867 (Dzahn)
[18:00:56] HTTPS, Traffic, Datasets-General-or-Unknown, Operations: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2199873 (Dzahn)
[18:01:53] HTTPS, Traffic, Analytics-Cluster, Operations: HTTPS redirects for stats.wikimedia.org - https://phabricator.wikimedia.org/T132465#2199877 (Dzahn)
[18:01:56] Traffic, Analytics, Operations: cronspam from cpXXXX hosts related to varnishkafka non existent processes - https://phabricator.wikimedia.org/T132346#2199879 (ema) p:Triage>Normal
[18:03:02] HTTPS, Traffic, Operations, Pybal: HTTPS redirects for config-master.wikimedia.org - https://phabricator.wikimedia.org/T132459#2199185 (Dzahn)
[18:04:00] Traffic, Operations, Wikimedia-Logstash: Move logstash to an LVS service - https://phabricator.wikimedia.org/T132458#2199897 (ema) p:Triage>Normal
[18:04:12] Traffic, Discovery, Operations, Wikidata, Wikidata-Query-Service: Move wdqs to an LVS service - https://phabricator.wikimedia.org/T132457#2199898 (ema) p:Triage>Normal
[18:14:55] HTTPS, Traffic, Fundraising-Backlog, Operations, and 2 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199926 (DStrine)
[18:15:06] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2135390 (DStrine)
[18:18:27] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199931 (CCogdill_WMF) p:High>Unbreak! Hi @Dzahn, I think a record was deleted that we didn't discus...
[18:20:17] Traffic, Operations, fundraising-tech-ops, Patch-For-Review: Decide what to do with *.donate.wikimedia.org subdomain + TLS - https://phabricator.wikimedia.org/T102827#2199938 (Dzahn)
[18:20:19] HTTPS, Traffic, Operations, Wikimedia-Fundraising, Blocked-on-Fundraising-Tech: links.email.donate.wikimedia.org should offer HTTPS - https://phabricator.wikimedia.org/T74514#2199939 (Dzahn)
[18:20:23] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199936 (Dzahn) Resolved>Open yes, checking
[18:26:35] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199957 (Dzahn) >>! In T130414#2199931, @CCogdill_WMF wrote: > Hi @Dzahn, I think a record was deleted that...
[18:26:58] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199962 (CCogdill_WMF) Whew, thanks so much for the quick fix!
[18:29:46] HTTPS, Traffic, Operations, Wikimedia-Fundraising, Blocked-on-Fundraising-Tech: links.email.donate.wikimedia.org should offer HTTPS - https://phabricator.wikimedia.org/T74514#2199972 (Dzahn)
[18:29:48] Traffic, Operations, fundraising-tech-ops, Patch-For-Review: Decide what to do with *.donate.wikimedia.org subdomain + TLS - https://phabricator.wikimedia.org/T102827#2199971 (Dzahn)
[18:29:52] HTTPS, Traffic, Fundraising-Backlog, Operations, and 3 others: delete links.email.donate.wikimedia.org (and all other email.donate.*?) from DNS - https://phabricator.wikimedia.org/T130414#2199970 (Dzahn) Open>Resolved
[18:33:51] Traffic, Operations, Performance-Team, Patch-For-Review: Support HTTP/2 - https://phabricator.wikimedia.org/T96848#2199998 (BBlack) Making the above analysis harder for anyone else looking: that's not a percentage of client connections using SPDY, it's a percentage of client **requests** using SP...
[18:51:35] Traffic, MediaWiki-Parser, Operations, Parsing-Team, and 4 others: Banners fail to show up occassionally on Russian Wikivoyage - https://phabricator.wikimedia.org/T121135#2200112 (Jdlrobson) Thanks to @tgr I've confirmed that the issue is that ParserOutput::getTOCEnabled is false for these pages:...
[19:10:27] Traffic, MediaWiki-Parser, Operations, Parsing-Team, and 4 others: Banners fail to show up occassionally on Russian Wikivoyage - https://phabricator.wikimedia.org/T121135#2200262 (Jdlrobson) This is on the SWAt calendar for Wednesday 9:00PST https://wikitech.wikimedia.org/wiki/Deployments#Wednesd...
[20:59:52] Traffic, Operations, Parsoid, RESTBase, and 3 others: Support following MediaWiki redirects when retrieving HTML revisions - https://phabricator.wikimedia.org/T118548#2200984 (Pchelolo)
[21:05:48] HTTPS, Traffic, Datasets-General-or-Unknown, Operations: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2200993 (ArielGlenn) dumps.wm.o already redirects and so does downloads.wm.o. What's left?
[21:15:54] Traffic, DNS, Mail, Operations, and 2 others: phabricator.wikimedia.org has no SPF record - https://phabricator.wikimedia.org/T116806#2201055 (BBlack) I'm confused - I think the last message above indicates we *do* have phab sending emails to mailing lists, which means we should use `?all`, but t...
[21:17:07] Traffic, DNS, Mail, Operations, and 2 others: phabricator.wikimedia.org has no SPF record - https://phabricator.wikimedia.org/T116806#2201057 (greg) Right. Based on (at least?) those 3 accounts -> mailing lists, I guess we should use `?all`.
[21:18:13] HTTPS, Traffic, Datasets-General-or-Unknown, Operations: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201063 (BBlack) This isn't for dumps or downloads, this is for `http://datasets.wikimedia.org`
[21:36:24] HTTPS, Traffic, Datasets-General-or-Unknown, Operations: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201127 (ArielGlenn) How does that not redirect already? I thought/hoped that the 'default' parameter in the nginx conf on datasets would take care of t...
[21:38:34] HTTPS, Traffic, Datasets-General-or-Unknown, Operations, Patch-For-Review: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201136 (Dzahn) >>! In T132463#2201127, @ArielGlenn wrote: > How does that not redirect already? I thought/hoped that the 'default...
[21:39:46] HTTPS, Traffic, Datasets-General-or-Unknown, Operations, Patch-For-Review: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201140 (Dzahn) Somewhat counterintuitively the config for this is not in module dataset but instead in modules/statistics/files/...
[21:41:12] Traffic, Operations, Parsoid, RESTBase, and 3 others: Support following MediaWiki redirects when retrieving HTML revisions - https://phabricator.wikimedia.org/T118548#2201143 (BBlack) I hate it, but I don't see any better way to solve the problem at present. To reword to be sure I have the inten...
[21:45:49] HTTPS, Traffic, Datasets-General-or-Unknown, Operations, Patch-For-Review: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201182 (ArielGlenn) That just can't be true. And yet it is. 2d1fa5938f22c94324435c75acc4f496c1cacaa3 @Ottomata set it up. Some...
[21:47:42] HTTPS, Traffic, Datasets-General-or-Unknown, Operations, Patch-For-Review: HTTPS redirects for datasets.wikimedia.org - https://phabricator.wikimedia.org/T132463#2201190 (Dzahn) Hah, and that says "# NOTE: This class has nothing to do with the # datasets site hosted at 'datasets.wikimedia.org...
[21:56:01] HTTPS, Traffic, Operations, Graphite: HTTPS redirects for graphite.wikimedia.org - https://phabricator.wikimedia.org/T132461#2201268 (BBlack)
[22:16:41] Traffic, Operations, Parsoid, RESTBase, and 3 others: Support following MediaWiki redirects when retrieving HTML revisions - https://phabricator.wikimedia.org/T118548#2201380 (GWicke) @BBlack: Yes, that captures the idea very well. I'm also not too fond of the VCL part of this, but couldn't come...
[22:21:23] HTTPS, Traffic, Operations: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination - https://phabricator.wikimedia.org/T132521#2201391 (BBlack)
[22:21:57] HTTPS, Traffic, Operations: Enforce HTTPS+HSTS on remaining one-off sites in wikimedia.org that don't use standard cache cluster termination - https://phabricator.wikimedia.org/T132521#2201406 (BBlack)
[22:21:59] HTTPS, Traffic, Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2201405 (BBlack)
[22:23:04] HTTPS, Traffic, Operations: let all services on misc-web enforce http->https redirects - https://phabricator.wikimedia.org/T103919#2201410 (BBlack)
[22:23:06] HTTPS, Traffic, Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#1146258 (BBlack)
[22:26:36] bblack: do you have some time to talk about what we (well, I) need to do for w.wiki?
[22:27:56] everything should be done on the MW side, so now I think it's just apache stuff?
[22:31:09] legoktm: from the traffic side of things, it's already configured to hit the mw* servers
[22:31:20] so yeah, I guess if there's anything left in the middle, it's the mw* apache configurations
[22:33:39] ok
[22:33:45] I'll work on that today/tomorrow then
[23:18:05] Traffic, DNS, Mail, Operations, and 2 others: phabricator.wikimedia.org has no SPF record - https://phabricator.wikimedia.org/T116806#2201603 (scfc) Sending mails to mailing lists shouldn't matter as those rewrite the envelope header. The problem lies with forwarders that don't do that, for exam...