[07:16:24] 10Traffic, 10Operations: Change "CP" cookie from subdomain to project level - https://phabricator.wikimedia.org/T180407#3761787 (10Krinkle) >>! In T180407#3758143, @Nemo_bis wrote: > I assume the same would apply to the "UseDC" and "UseCDNCache" cookies? And maybe also the "cpPosTime"? `CP` cookie is a long-l... [07:28:24] 10Traffic, 10Operations: Change "CP" cookie from subdomain to project level - https://phabricator.wikimedia.org/T180407#3761794 (10Nemo_bis) Ok, makes sense. Explains why I didn't notice them before checking in the developer console. :-) I don't know whether creating those cookies causes issues (e.g. because o... [08:02:38] 10Traffic, 10Operations: Change "CP" cookie from subdomain to project level - https://phabricator.wikimedia.org/T180407#3756812 (10BBlack) Does RL make use of the CP cookie information to use different module-loading strategies for H/1 vs H/2? I remember that being the intent in creating it, but I'm not sure... [11:36:40] I've created two new branches to deal with future 4.1 varnish packages: debian-wmf-4.1, upstream-4.1 [11:36:43] https://gerrit.wikimedia.org/r/#/admin/projects/operations/debs/varnish4,branches [11:36:51] which we hopefully are not gonna need much! [11:37:57] moritzm: sanity check if you've got a sec? https://gerrit.wikimedia.org/r/#/c/391538/ https://gerrit.wikimedia.org/r/#/c/391541/ [11:38:12] having a look [11:38:16] thanks [11:41:09] +1d both [11:41:39] nice, merging [13:41:02] I love that patch, it's so Varnish :) [13:41:39] "Oh of course there's a bug here, anyone can see we were missing an 'if (l > ll) l = ll;' in our thinking here at glance!" 
[13:42:42] but hey at least they never annoyed anyone who's still using a real DEC VT100 with varnish source code line-wrapping [13:44:04] "o = 0;" is cute too [13:45:00] i like `int *ptr;` [13:45:31] `int *int_ptr;` would have been more explicit but hey [13:46:11] I'm one of those C heretics who hates attaching the * to the variable name btw :) [13:46:35] I always prefer "char* foo" to "char *foo" mentally [13:47:19] there are two cases where my way breaks down: declaring several pointers on one line like "int *foo, *bar, *baz" [13:47:57] and of course crazy pointer indirections with other attributes, when you're trying to write something like "char ** const * const **foo" or whatever [13:48:24] but IMHO those are the minority-case, and probably better avoiding them with better style/design if they bug you [13:48:55] but to me, the pointer-ness of something is part of the type and belongs attached to the type-name in the common case :P [13:49:43] that makes sense, I never thought about it :) [13:50:20] I prefer char* foo rather than char *foo too [13:50:57] elukey: and do you prefer l or ll? :) [13:52:04] ema: two letters variable? There is the risk of it being too descriptive and confuse the reader :D [13:52:33] I honestly never understood varnish's coding style [13:53:19] it's job security through obfuscation. "This is technically open source for brownie points, but nobody will ever understand it so we can effectively treat it like closed source for all practical business model purposes" [13:53:35] lol [13:58:43] wow I hadn't seen this before, found in some HN comments: https://lite.cnn.io/en [13:58:57] good to know some major media companies still have lightweight newsfeeds [13:59:26] I was about to say "awesome, links-friendly!" but nope [14:08:54] so cp4024 is back but we still don't fully trust it right? 
[14:11:45] yeah I was gonna pool it today if it looked sane and healthy still [14:12:07] feel free, but it's a fresh install, should double-check icinga status and that it's not rebooted without us noticing since yesterday [14:15:39] yeah I've upgraded varnish there and the uptime is 18h now so I guess it hasn't rebooted yet! [14:19:33] repooled [15:20:10] 10Traffic, 10Wikimedia-Apache-configuration, 10Discovery, 10Operations, and 3 others: m.wikipedia.org and zero.wikipedia.org should redirect how/where - https://phabricator.wikimedia.org/T69015#3763177 (10Mholloway) [15:29:25] 10Traffic, 10Operations, 10monitoring, 10Patch-For-Review, 10Prometheus-metrics-monitoring: authdns prometheus metrics are not available anymore - https://phabricator.wikimedia.org/T180256#3763215 (10fgiunchedi) 05Open>03Resolved a:03fgiunchedi We're back! And as a nice side effect no longer relyin... [15:46:20] moritzm: ok to merge https://gerrit.wikimedia.org/r/#/c/389964/? I can take care of the pybal restarts (lvs[12]00[36]) [15:48:47] sure! please go ahead [16:06:36] moritzm: https://grafana.wikimedia.org/dashboard/db/pybal?orgId=1&from=now-1h&to=now&panelId=1&fullscreen&var-datasource=eqiad%20prometheus%2Fops&var-server=lvs1003&var-service=apaches_80 [16:10:05] nice, thanks for merging [16:24:40] 10Traffic, 10Wikimedia-Apache-configuration, 10Discovery, 10Operations, and 3 others: m.wikipedia.org and zero.wikipedia.org should redirect how/where - https://phabricator.wikimedia.org/T69015#3763392 (10Mholloway) It looks like the new redirect behavior was introduced in January (https://gerrit.wikimedia... [17:10:46] bblack: so the hfp thing for obj.ttl < 0 in wm_common_backend_response [17:11:36] was it introduced because of some specific issue? Would it make sense to try getting rid of it and seeing if things go south? 
[17:28:56] so, I'm trying to recall [17:29:06] I think what led to that was something roughly like this: [17:31:13] Since roughly time immemorial in our VCL, we had some code in there that basically said: if (ttl <= 0) { hfp; } [17:31:21] The idea being that these must be uncacheables, so let's make sure they're HFP uncacheables so they don't stall. [17:32:06] at some point someone (probably me) figured out that this was probably creating HFPs for situations where it was a bad idea, and two such counter-cases came to mind, which that current block tries to work around [17:33:28] 1) Random 5xx's. E.g. /wiki/Foo is very popular, but it gets one random 5xx on a miss-fetch due to random backend server fail. We don't want to create HFP in this case and cause it to continue to pass for a long time - we instead want the 5xx to go through and the next fetch to attempt caching from the backend again... [17:34:59] 2) Edge-cases to do with object expiration. e.g. we fetch an object from an application or another backend varnish, and it's technically a cacheable object, but happens to have 0 seconds of TTL remaining (or even slightly-negative thanks to grace-mode). We don't want these to create persistent HFP either and again block what should be future legitimate caching of this URL. [17:38:24] re-evaluating those two concerns (well, and the original concern about using HFP usefully on truly-uncacheable outputs that might be popular, to avoid stalling) in light of HFM is probably warranted. Really even without HFM, I think that the present logic might be flawed, or might fail to take into account other cases. I don't know, and I haven't tried to dig into it deeply lately [17:52:34] ok! [17:52:54] stopping text/upload upgrades for today (dinnertime!) [17:54:39] cya! 
[17:56:13] thinking on the above a little further: [17:57:11] our original if(ttl<=0){hfp} was probably exactly the kind of common case of hfp-problems that is popular in various Varnish users' VCLs, and that the change to HFM behavior in 5.0 was attempting to address [17:57:40] our current complicated block is also an attempt to address those problems differently, in a world without HFM as an option. [17:58:10] the correct answer may be to replace the complicated block with: if (ttl <= 0) { hfm; } [17:59:47] I do wonder a little if that will make it harder to tease apart miss/pass statistics, though. Will things that a sane human evaluation would call a pass sort of case end up counting as misses a lot? [18:00:57] maybe not. maybe just like hfm can (/often does) convert into a hit shortly afterwards, hfm can also upconvert to a true hfp? Maybe we should look for this case by paying attention to Cache-Control rather than the TTL. [18:01:48] e.g. if(CC header says truly-uncacheable and !5xx) { hfp; } else if (ttl<=0) { hfm; } [18:02:43] well, more robust: [18:03:27] e.g. if(CC header !exists or fails to indicate this is really cacheable, and !5xx) { hfp; } else if (ttl<=0) { hfm; } [18:03:50] where "fails to indicate" means based on the keyword parameters, not the max-age/s-maxage [18:04:22] I donno, it's all very tricky [19:20:52] 10netops, 10Datasets-General-or-Unknown, 10Operations: dumps.wikimedia.org seems to have poor throughput towards some destinations - https://phabricator.wikimedia.org/T120425#1853689 (10ayounsi) @Nemo_bis I see that the last comment is from more than a year ago, is that issue still happening? 
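Editor's sketch of the "more robust" if/else floated at 18:03 in the discussion above, written as plain Python rather than VCL so the branching can be checked in isolation. Everything named here is hypothetical: the function names, the "default" label for ordinary handling, and in particular the choice of `private`, `no-cache` and `no-store` as the keywords that mark a response as not really cacheable (the log only says "keyword parameters, not the max-age/s-maxage"). It is an illustration of the idea, not the actual wm_common_backend_response logic.

```python
from typing import Optional

# Assumption: these keyword directives mean "not really cacheable".
# The chat does not list them; max-age / s-maxage are deliberately ignored.
UNCACHEABLE_DIRECTIVES = {"private", "no-cache", "no-store"}


def cc_indicates_cacheable(cache_control: Optional[str]) -> bool:
    """True only if a Cache-Control header exists and has no uncacheable keyword.

    Simplified parsing: splits on commas and drops any "=value" part, so
    quoted values that themselves contain commas are not handled.
    """
    if not cache_control:
        return False  # missing or empty header fails to indicate cacheability
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not (directives & UNCACHEABLE_DIRECTIVES)


def classify(status: int, ttl: float, cache_control: Optional[str]) -> str:
    """Return "hfp", "hfm" or "default" for one backend response."""
    is_5xx = 500 <= status <= 599
    # Truly-uncacheable output that is not a transient error: hit-for-pass.
    if not cc_indicates_cacheable(cache_control) and not is_5xx:
        return "hfp"
    # Cacheable in principle (or a 5xx) but no TTL left: hit-for-miss,
    # so a later fetch can still turn into a normal cached object.
    if ttl <= 0:
        return "hfm"
    return "default"
```

For example, `classify(503, 0, None)` yields "hfm" instead of a long-lived hit-for-pass, which covers counter-case 1 (random 5xx), and `classify(200, -1, "public, s-maxage=0")` yields "hfm", which covers counter-case 2 (cacheable object fetched with no TTL remaining).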
[20:47:48] 10Traffic, 10netops, 10Operations: Number of nlwiki (biography) articles getting consistently ~70 hits per day for the past months - https://phabricator.wikimedia.org/T180621#3764660 (10MarcoAurelio) [20:48:30] 10Traffic, 10Operations: Number of nlwiki (biography) articles getting consistently ~70 hits per day for the past months - https://phabricator.wikimedia.org/T180621#3763869 (10MarcoAurelio) [22:24:56] ema: * The default stevedore is `-sfile` or `-spersistent` and the synthetic [22:24:59] > object is given a TTL larger than the `shortlived` parameter [22:25:02] > (default: 10 seconds.) [22:25:14] yes, that answers a lot of questions I think! [22:25:29] (I suspect we probably have some cases like that...) [22:56:22] 10Traffic, 10Operations, 10ops-ulsfo: cp4024 kernel errors - https://phabricator.wikimedia.org/T174891#3765029 (10BBlack) 05Open>03Resolved Closing for now, assuming no new problems surface. Thanks @RobH :)