[01:02:12] 10Traffic, 06Operations: Increase request limits for GETs to /api/rest_v1/ - https://phabricator.wikimedia.org/T118365#2378110 (10GWicke) Quick status update: We have since introduced per-entrypoint limits in the REST API. Initially, this is targeted at [uncacheable transforms](https://en.wikipedia.org/api/res... [03:42:15] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2378215 (10Antigng_) Labs replicas can't do that job, as revision tables are removed on such databases. Dumps are not updated such often. [06:35:36] 10Traffic, 06Operations, 06Wikipedia-iOS-App-Product-Backlog: Wikipedia app hits loads.php on bits.wikimedia.org - https://phabricator.wikimedia.org/T132969#2378436 (10JMinor) p:05Normal>03Triage [07:41:02] 07HTTPS, 10Traffic: xx.wikipedia.com returns a certificate error - https://phabricator.wikimedia.org/T137779#2378666 (10cookies52) [07:41:45] 07HTTPS, 10Traffic, 06Operations: xx.wikipedia.com returns a certificate error - https://phabricator.wikimedia.org/T137779#2378679 (10cookies52) [07:45:03] 07HTTPS, 10Traffic, 10domains, 06Operations: xx.wikipedia.com returns a certificate error - https://phabricator.wikimedia.org/T137779#2378687 (10Peachey88) [12:21:01] 07HTTPS, 10Traffic, 10domains, 06Operations: xx.wikipedia.com returns a certificate error - https://phabricator.wikimedia.org/T137779#2379369 (10BBlack) [12:21:03] 07HTTPS, 10Traffic, 06Operations, 13Patch-For-Review: Secure redirect service for large count of non-canonical / junk domains - https://phabricator.wikimedia.org/T133548#2379371 (10BBlack) [12:27:46] ori: btw, the TLS dynamic record sizing config went out ~ 2016-06-13 20:15 . I don't think it made any significant difference in navtiming-related graphs, but it's hard to isolate small effects from noise. any insight into whether it did? [12:28:33] ori: there are other related experiments we could do too (e.g. 
bump the minimum from 1300 to 4K), but if the net effect is just as unreadable there's no real basis for tuning. [12:29:33] ori: (what we expect from the first dynamic record config is: no negative impact on anything, but possible positive impact on loading large amounts of data (large pages, lots of images, etc)) [15:13:36] 10Traffic, 06Operations, 13Patch-For-Review: Support websockets in cache_misc - https://phabricator.wikimedia.org/T134870#2379830 (10BBlack) Seems to basically work. Ideally we'd limit the websocket VCL capabilities based on req.http.Host as well, but that will have to come later with further refactoring of... [15:13:43] 10Traffic, 06Operations, 13Patch-For-Review: Support websockets in cache_misc - https://phabricator.wikimedia.org/T134870#2379832 (10BBlack) 05Open>03Resolved a:03BBlack [15:13:47] 10Traffic, 06Operations, 10Phabricator: Phabricator needs to expose notification daemon (websocket) - https://phabricator.wikimedia.org/T112765#2379835 (10BBlack) [15:20:14] 10Traffic, 06Operations, 13Patch-For-Review: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379847 (10BBlack) This seems to be working now. It's fully-configured on cache_misc other than switching the DNS resolution for stream.wm.o to cache_misc, and can... [15:21:40] 10Traffic, 06Operations, 10Wikimedia-Stream: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379849 (10BBlack) [15:29:30] 10Traffic, 06Operations, 10Wikimedia-Stream: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379875 (10BBlack) Looking into the Websockets RFC ( https://tools.ietf.org/html/rfc6455 ), it says in section 4.1: ``` Once the client's opening handshake has been... 
[15:34:15] bblack: I found out with ema one use case in which the last timestamp is [15:34:18] Timestamp PipeSess: 1465915543.388921 4571.799760 4571.799658 [15:34:30] that should be an Upgrade req probably? [15:34:57] elukey: that's new as of this morning, it's the websockets pipe stuff I'm working on [15:35:23] ouch [15:35:57] pipe requests are "different". as long as it doesn't actively break anything, I'd ignore them for now [15:36:14] we have zero known working cases of legitimate traffic for them so far heh [15:36:28] (and they probably would've been broken in varnish3, too, for logging) [15:36:30] yep yep we'll remove them from the req count since vk will not find the end timestamp [15:36:41] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2379894 (10Antigng_) If you don't give me a good reason why cp1008.wikimedia.org:3128 / index.php?action=raw shouldn't be used, I will start some of my jobs that don... [15:38:35] the last one would be trying to figure out why some mw-release.tar.gz file has no End Timestamp [15:40:20] something like releases.wikimedia.org /mediawiki/1.26/mediawiki-1.26.3.tar.gz [15:44:54] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2379907 (10Joe) @Antigng_ just to understand, what is your bot doing? If dumps are not refreshed fast enough for you, maybe you should make your bot follow one of th... [15:50:23] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2379915 (10Antigng_) Most of my tasks don't generate such " unacceptable amount of traffic". [15:56:12] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2376097 (10BBlack) >>! 
In T137707#2376258, @Antigng_ wrote: > My bot was using /w/index.php?action=raw to fetch the content of each page/redirect at zhwiki, then it... [15:58:07] 10Traffic, 06Operations, 10Wikimedia-Stream: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379935 (10BBlack) Another datapoint, in nginx logs on rcs1001, most successful operations seem to be non-SSL: ``` root@rcs1001:/var/log/nginx# grep -v rcstream_stat... [16:03:34] 10Traffic, 06Operations, 10Wikimedia-Stream: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379955 (10BBlack) Obviously, if we can't fix the existing non-SSL clients in a timely fashion (or can't assume they can handle redirects), our other option is to pu... [16:09:57] 10Traffic, 06Operations, 10Wikimedia-Stream: Move stream.wikimedia.org (rcstream) behind cache_misc - https://phabricator.wikimedia.org/T134871#2379993 (10BBlack) Tried the sample JS client code too, from https://wikitech.wikimedia.org/wiki/RCStream#JavaScript . Same basic results. It works fine if I prepe... [16:10:41] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2379995 (10Antigng_) Also, there doesn't exist a clear request rate limit for mediawiki api, as the rest api does. If you want to set one, you should document it. [16:11:11] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2379997 (10jcrespo) For the API part, I would like to add that API infrastructure (application servers and databases) is specifically prepared to be separated from n... 
[16:22:53] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2380046 (10Antigng_) I don't think api.php?action=query&prop=revisions&rvprop=content can be the same performant as index.php?action=raw, and the latter is the easie... [16:26:40] 10Traffic, 06Operations, 13Patch-For-Review: cronspam from cpXXXX hosts due to update-ocsp-all and zero_fetch - https://phabricator.wikimedia.org/T132835#2380107 (10ema) 05Open>03Resolved a:03ema Both update-ocsp-all and zerofetch are now logging to syslog instead of cronspamming. [16:32:37] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2380159 (10jcrespo) > I would appreciate it if there was a way to perform api.php?action=raw Please file a separate bug report for that. [16:33:43] also the missing timestamp is now for stream.wikimedia.org /socket.io/1/websocket/482388852765, that is WIP from what I can see [16:34:00] definitely websockets are not handled well by vk [16:35:15] so the way that websockets are special is they use vcl_pipe [16:35:32] (but this is new, so it's not something that you were seeing e.g. yesterday. we never had vcl_pipe before) [16:36:01] vcl_pipe pretty much does a little bit of initial setup and then opens a raw TCP connection through the varnishes to the applayer and stops processing further request flow on it. [16:36:34] when I observe those in varnishlog, the log entry doesn't even appear until after the connection gets closed, which could be a very long time later. 
[16:37:17] in raw mode, the timestamps I get on it in varnishlog are: [16:37:18] - Timestamp Start: 1465914258.423045 0.000000 0.000000 [16:37:18] - Timestamp Req: 1465914258.423045 0.000000 0.000000 [16:37:27] - Timestamp Pipe: 1465914258.423126 0.000082 0.000082 [16:37:28] - Timestamp PipeSess: 1465914260.301709 1.878665 1.878583 [16:37:46] (I think in this case, I only had the session open for 1.87s) [16:38:33] elukey: ^ [16:39:39] even if we do put in some kind of support for them, we'd probably want to include some kind of pipe-mode flag in webrequest for them, so that hadoop can ignore them for "normal" stats or something. [16:39:56] otherwise 1h+ response times will massively skew other stats on normal requests [16:42:06] maybe could just skip them with filtering. currently we only filter PURGE: [16:42:09] $varnish_opts = { 'q' => 'ReqMethod ~ "^(?!PURGE$)"' } [16:42:42] - VCL_return pipe [16:42:42] - VCL_call HASH [16:42:42] - VCL_return lookup [16:42:42] - Link bereq 1824038 pipe [16:42:42] - Timestamp Pipe: 1465922428.670734 0.000223 0.000223 [16:42:44] - Timestamp PipeSess: 1465922435.814945 7.144434 7.144211 [16:42:47] - PipeAcct 335 472 82 17136 [16:42:50] ^ something in that unique trailer should be filter-able [16:44:14] bblack: yeah we were trying to figure out the vk issue, which apparently happens consistently with etherpad, noticed the vcl_pipe part and thought that was it [16:44:54] yeah but etherpad shouldn't be used vcl_pipe, I don't think [16:45:06] we can drop spurious logs from the hadoop consistency checks but I'd prefer to double check the root cause first, just to be sure :) [16:45:16] unless it's been trying to do websockets and falling back to non-websockets, and my adding websockets support to cache_misc made it suddenly start working? 
[16:45:32] I think it is trying to do websockets, yes [16:45:44] heh, maybe this will solve the random-disconnect bugs then [16:46:50] elukey is looking for the relevant varnishlog :) [16:46:55] * << Request >> 632585614 [16:46:55] - Begin req 635601883 rxreq [16:46:55] - Timestamp Start: 1465910971.589160 0.000000 0.000000 [16:46:55] - Timestamp Req: 1465910971.589160 0.000000 0.000000 [16:46:55] - ReqMethod GET [16:46:58] - ReqURL /socket.io/?EIO=3&transport=websocket&sid=iVvuoFEeWJTbsp2GAB4s [16:47:01] - Timestamp Pipe: 1465910971.589262 0.000102 0.000102 [16:47:03] - Timestamp PipeSess: 1465915543.388921 4571.799760 4571.799658 [16:47:06] ** << BeReq >> 632585615 [16:47:07] wow yeah I just tested it myself [16:47:09] -- Begin bereq 632585614 pipe [16:47:11] -- Timestamp Bereq: 1465910971.589256 0.000000 0.000000 [16:47:12] etherpad does do websockets [16:47:26] so, I accidentally upgraded etherpad connectivity [16:47:40] and nobody complained so it probably works? :) [16:47:47] I guess so! [16:47:58] :D [16:48:14] good to remember, I was going to later restrict websockets to just the services that are intended heh [16:48:51] so in this case in the log the "PipeSess" timestamp was the last one, not Resp [16:49:05] yeah [16:49:06] and it makes sense why vk was logging "-" (def value) [16:49:22] but the mediawiki-release gz files are still a mistery [16:49:42] but still, I'd say either we filter out all pipereqs in the -q filter and never vk-process them, or we should be logging some new field to differentiate them for downstream stats [16:50:08] this is a good point, I forgot about the vk filter! [16:51:08] what's the mediawiki-release gz case? 
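Editorial sketch of the point made at 16:48-16:49: each varnishlog Timestamp record carries an event label, the absolute time, the time since the start of the work unit, and the time since the previous timestamp. A piped request ends on `PipeSess` rather than `Resp`, which would explain a consumer that waits for `Resp` logging its default "-". The helper names below (`parse_timestamps`, `final_event`, `is_pipe_transaction`) are hypothetical and are not part of varnishkafka; the sample records are copied from the log lines quoted in this channel.

```python
import re

# Matches records such as
# "- Timestamp PipeSess: 1465915543.388921 4571.799760 4571.799658"
TIMESTAMP_RE = re.compile(
    r"Timestamp\s+(?P<label>\w+):\s+(?P<abs>[\d.]+)\s+"
    r"(?P<since_start>[\d.]+)\s+(?P<since_last>[\d.]+)"
)


def parse_timestamps(lines):
    """Return (label, absolute, since_start, since_last) for each Timestamp record."""
    stamps = []
    for line in lines:
        m = TIMESTAMP_RE.search(line)
        if m:
            stamps.append((
                m["label"],
                float(m["abs"]),
                float(m["since_start"]),
                float(m["since_last"]),
            ))
    return stamps


def final_event(lines):
    """Label of the last Timestamp record, or None if there is none."""
    stamps = parse_timestamps(lines)
    return stamps[-1][0] if stamps else None


def is_pipe_transaction(lines):
    """True if any Timestamp label starts with 'Pipe' (Pipe, PipeSess)."""
    return any(label.startswith("Pipe") for label, *_ in parse_timestamps(lines))


# Records quoted above for the etherpad websocket request.
ETHERPAD_PIPE = [
    "- Timestamp Start: 1465910971.589160 0.000000 0.000000",
    "- Timestamp Req: 1465910971.589160 0.000000 0.000000",
    "- Timestamp Pipe: 1465910971.589262 0.000102 0.000102",
    "- Timestamp PipeSess: 1465915543.388921 4571.799760 4571.799658",
]

if __name__ == "__main__":
    print(final_event(ETHERPAD_PIPE), is_pipe_transaction(ETHERPAD_PIPE))
```

A consumer could use something like `is_pipe_transaction` either to skip such transactions or to set a pipe-mode flag for downstream stats, the two options raised at 16:49.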
[16:52:20] so this is an example: [16:52:21] cp3008.esams.wmnet 1302420 - 0.0 - PASS 206 0 GET releases.wikimedia.org /mediawiki/1.26/mediawiki-1.26.3.tar.gz application/x-gzip https://releases.wikimedia.org/mediawiki/1.26/ 41.74.214.106, 10.20.0.108 Mozilla/5.0 (Windows NT 6.2; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0 fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3 - bytes=1488504 [16:52:27] -25178779 - misc 2016 6 14 8 [16:53:07] the third field is a "-" [16:53:16] that is the timestamp [16:55:06] most of the return codes are 206 or 200 [16:55:12] fields are [16:55:38] (grabbing them) [16:55:59] (and just noticed that I didn't strip the IP, sorry for that, my bad ) [16:56:00] when I use wget, the logs look ok. I got a Resp timestamp whether I aborted the transfer early or let it finish... [16:56:26] yeah ema and I tried all combination of things [16:56:33] 206 sounds like Range requests.... [16:57:06] (which cache_misc doesn't have explicit support for like cache_upload does, and is one of many random things we haven't deeply investigated in our transition from hacked-v3 -> v4) [16:57:17] (some of our hacks were about range+streaming+gzip stuff...) [16:57:35] also 200s logged without timestamp [16:58:05] I have the feeling this might somehow be related to the empty-body bug [16:58:19] if you want to check some data [16:58:28] on stat1004 I have a Hive/Hadoop script [16:58:34] /home/elukey/show_missing_dash.sql [16:59:00] sudo -u hdfs beeline -f show_missing_dash.sql [16:59:22] to change date/time it is necessary to open the .sql file and change the variable values [16:59:25] ok so, I think this is less-complicated than that [16:59:29] and more-confusing [16:59:44] I just did a legit range-resume with wget (cancelled the transfer early, then tried again with "-c") [16:59:52] it succeeds and works fine functionally, which is awesome. [17:00:32] so this is what the relevant varnishlog looked like for that, ignoring dumb parts and my IPs and all.. 
[17:00:35] - ReqMethod GET [17:00:37] - ReqURL /mediawiki/1.26/mediawiki-1.26.3.tar.gz [17:00:40] - ReqProtocol HTTP/1.1 [17:00:42] - ReqHeader Connection: close [17:00:44] - ReqHeader Range: bytes=1639185- [17:01:11] (it's a cache miss, goes into fetch, blah blah) [17:01:12] - RespProtocol HTTP/1.1 [17:01:12] - RespStatus 200 [17:01:12] - RespReason OK [17:01:58] (bunch of normal header outputs) [17:01:59] - VCL_call DELIVER [17:02:20] (and then at the bottom) [17:02:21] - VCL_return deliver [17:02:21] - Timestamp Process: 1465923502.506981 0.083314 0.000040 [17:02:21] - RespHeader Accept-Ranges: bytes [17:02:21] - RespHeader Content-Range: bytes 1639185-25178779/25178780 [17:02:24] - RespProtocol HTTP/1.1 [17:02:27] - RespStatus 206 [17:02:29] - RespReason Partial Content [17:02:32] - RespReason Partial Content [17:02:34] - RespUnset Content-Length: 25178780 [17:02:37] - RespHeader Content-Length: 23539595 [17:02:39] - Debug "RES_MODE 2" [17:02:42] - RespHeader Connection: close [17:02:44] - Timestamp Resp: 1465923505.200778 2.777110 2.693797 [17:02:47] - ReqAcct 426 0 426 796 23539595 23540391 [17:02:49] it re-sets protocol/status [17:02:58] I think this has more to do with the whole general problem of getting confused by trace-like rather than log-like duplication of entries in one request stream.... [17:03:01] afk a few [17:03:11] me too :) [17:03:31] (will check later) [17:05:40] 10Traffic, 06Operations, 10Phabricator, 10hardware-requests: codfw: (1) phabricator host (backup node) - https://phabricator.wikimedia.org/T131775#2380267 (10mark) >>! In T131775#2219177, @RobH wrote: > No need to ping Papaul, he doesn't have any involvement in #hardware-requests. (Its primarily myself, a... [17:06:38] or it could be that the Process: timestamp is confusing things? [17:06:55] but really I think it's all the duplication of basic things seen earlier, re-written. 
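Editorial sketch: the rewritten Content-Length in the 206 response above is consistent with the Content-Range header. For `bytes first-last/total` the body length is `last - first + 1`. The function name `partial_length` is hypothetical; the numbers are the ones from the varnishlog output quoted above.

```python
import re

# "bytes 1639185-25178779/25178780" -> first byte, last byte, full object size
CONTENT_RANGE_RE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


def partial_length(content_range):
    """Body length implied by a single-range Content-Range header value."""
    m = CONTENT_RANGE_RE.fullmatch(content_range.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    first, last, total = (int(g) for g in m.groups())
    if not first <= last < total:
        raise ValueError(f"inconsistent Content-Range: {content_range!r}")
    # Both ends of the range are inclusive, hence the +1.
    return last - first + 1


if __name__ == "__main__":
    # Values from the log: the full object is 25178780 bytes, wget resumed at 1639185.
    print(partial_length("bytes 1639185-25178779/25178780"))
```

For the request in the log this gives 23539595, the value of the second Content-Length header, so the range handling itself looks correct and only the duplicated log records are confusing.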
[17:07:31] there *is* a Resp timestamp at the end, so it should work on that basic level [17:09:50] 10Traffic, 10DBA, 06Labs, 06Operations, 10Tool-Labs: Antigng-bot improper non-api http requests - https://phabricator.wikimedia.org/T137707#2380275 (10jcrespo) BTW, the API is definitely faster, one just need to use it efficiently: ``` $ time curl 'https://en.wikipedia.org/w/api.php?action=query&prop=r... [17:11:27] I don't know what to do about the tracing problem tbh. I thought about it last time, too. [17:11:41] currently it "works" because a later RespHeader overwrites an earlier one for the same header in vk [17:12:16] I don't see how you'd simply and sanely de-dupe the input log stream, though? [17:12:22] (in the general case) [17:13:11] you'd need different rules for different kinds of log entries. RespUnset should clear an existing header you already added to the logged set. Timestamps can use keys too I guess. RespProtocol and RespStatus I guess just overwrite without a sub-key [17:13:31] Debug isn't meant to be de-duplicated, in the case you were actually following Debug [17:14:01] you'd think varnishlog and the underlying APIs would have some kind of "mode" that switches from tracing to logging: don't show intermediate things, only the final state. [17:16:25] apart from Debug, is there a log record type that is concatenative? [17:16:45] I really don't know. there's a ton of new log record types, depending on how you query it. [17:17:01] we can invent our own rules as above for headers, protocol, etc... [17:17:36] this caused pain for varnishlog.py-based xcache logging too, because it would get tripped by multiple iterations of "RespHeader X-Cache: ...." during header construction in VCL. [17:18:10] we ended up "solving" that by constructing the header as X-Cache-Int across the layers for complex things, and then copying it to X-Cache as a single final step on frontend output. VCL deduplicating the logging manually, basically. 
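Editorial sketch of the per-record-type rules proposed at 17:13: RespHeader overwrites per header name, RespUnset clears a header, Timestamp is keyed by its label, RespProtocol/RespStatus/RespReason simply overwrite, and Debug is kept as a list. This is only an illustration of those rules in Python, not how varnishkafka is implemented; `final_state` and the record tuples are hypothetical, with values taken from the range-resume log above (the first Content-Length header is implied by the later RespUnset).

```python
def final_state(records):
    """Collapse trace-like (tag, value) records into the final logged state."""
    state = {"headers": {}, "timestamps": {}, "debug": []}
    for tag, value in records:
        if tag == "RespHeader":
            # Last one wins, keyed by case-insensitive header name.
            name, _, val = value.partition(":")
            state["headers"][name.strip().lower()] = val.strip()
        elif tag == "RespUnset":
            # Clears a header that was added earlier in the same transaction.
            name = value.partition(":")[0]
            state["headers"].pop(name.strip().lower(), None)
        elif tag == "Timestamp":
            # Keyed by event label (Start, Process, Resp, ...).
            label, _, rest = value.partition(":")
            state["timestamps"][label.strip()] = tuple(float(x) for x in rest.split())
        elif tag in ("RespProtocol", "RespStatus", "RespReason"):
            # No sub-key: a later record replaces the earlier one.
            state[tag] = value
        elif tag == "Debug":
            # Concatenative: never de-duplicated.
            state["debug"].append(value)
    return state


RANGE_RESUME = [
    ("RespProtocol", "HTTP/1.1"),
    ("RespStatus", "200"),
    ("RespReason", "OK"),
    ("RespHeader", "Content-Length: 25178780"),  # implied by the RespUnset below
    ("Timestamp", "Process: 1465923502.506981 0.083314 0.000040"),
    ("RespHeader", "Accept-Ranges: bytes"),
    ("RespHeader", "Content-Range: bytes 1639185-25178779/25178780"),
    ("RespProtocol", "HTTP/1.1"),
    ("RespStatus", "206"),
    ("RespReason", "Partial Content"),
    ("RespUnset", "Content-Length: 25178780"),
    ("RespHeader", "Content-Length: 23539595"),
    ("Debug", '"RES_MODE 2"'),
    ("RespHeader", "Connection: close"),
    ("Timestamp", "Resp: 1465923505.200778 2.777110 2.693797"),
]

if __name__ == "__main__":
    state = final_state(RANGE_RESUME)
    print(state["RespStatus"], state["headers"]["content-length"])
```

Whether "last one wins" is always the right rule is exactly the open question in the discussion; the sketch only covers the trivial cases.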
[17:30:43] is the last one always authoritative? [17:43:47] even in the header case I'm not 100% sure [17:44:28] in the trivial and obvious cases, it should be [17:45:05] so, old varnishkafka + varnish3 had a different but related problem: it would only log request headers the client sent, not request headers created/modified by vcl_recv [17:45:27] and on the response headers side, it would show us what was sent to the client (the final form) [17:45:55] in varnish4, the raw log-stream just logs everything. it logs the client request headers, and then if VCL modifies a given header 5 times and then unsets and re-sets it, those are all logged. [17:46:28] on the response headers side, it's mostly a deduplication problem (last one wins, and note RespUnset can clear) [17:47:16] on the request headers side, I'm not even sure what the right behavior is. it's possible for us to modify request headers in VCL long after they've been used (to check cache, and to fetch from backend), just to do temp things with them. [17:47:43] is it legit to use the modified request header in webrequest logs, if the modification happened after all usage of request headers for real purposes was done? [17:49:03] for headers we can clearly see what happens, but there are questions about how we should interpret it [17:49:14] I really don't know about non-header de-duplication in the general case. [18:31:29] 10Traffic, 06Operations, 10Phabricator, 10hardware-requests: codfw: (1) phabricator host (backup node) - https://phabricator.wikimedia.org/T131775#2380633 (10RobH) a:05mark>03RobH [18:44:21] 10Traffic, 06Operations, 10Phabricator, 10hardware-requests: codfw: (1) phabricator host (backup node) - https://phabricator.wikimedia.org/T131775#2178301 (10RobH) 05Open>03Resolved WMF6405 is allocated for this use. T137838 has been created for the setup/deployment. [19:41:36] bblack: I'm getting ready to merge https://gerrit.wikimedia.org/r/#/c/294068/ and send traffic to new maps servers.
I think I know what I'm doing, but I'd appreciate knowing that someone who understands LVS better than I do is available. Just in case... [19:42:34] bblack: would you be that someone? [20:00:45] checking [20:02:30] yeah merge away [20:02:55] gehel: after you puppet-merge, the conftool stuff runs as part of that and creates the new etcd entries [20:03:27] Oh, so no need to run puppet on lvs2002/2005? [20:22:29] 10Traffic, 06Operations, 10Phabricator: Phabricator needs to expose notification daemon (websocket) - https://phabricator.wikimedia.org/T112765#2380942 (10greg) (All blockers are resolved) [20:24:19] 10Traffic, 06Operations, 10Phabricator: Phabricator needs to expose notification daemon (websocket) - https://phabricator.wikimedia.org/T112765#2380949 (10BBlack) Is the node.js notification service already running on iridium? Do we need some matching config in public DNS + private phab so that it knows its... [20:29:19] 10Traffic, 06Operations, 10Phabricator: Phabricator needs to expose notification daemon (websocket) - https://phabricator.wikimedia.org/T112765#2380963 (10BBlack) (or, reading the docs, do we want to map phab.wm.o/ws/ to :22280? either way, it doesn't seem configured at all on the iridium side yet) [20:37:09] 10Traffic, 06Operations, 10Phabricator: Phabricator needs to expose notification daemon (websocket) - https://phabricator.wikimedia.org/T112765#2380993 (10mmodell) @bblack: not set up on iridium because I wasn't entirely clear when/if it would become possible. | do we want to map phab.wm.o/ws/ to :22280 Ye...
[20:41:23] so, all kinds of impending fixups are coming together into some kind of nexus of change in VCL-land [20:42:42] https://phabricator.wikimedia.org/T134404 [20:42:50] https://phabricator.wikimedia.org/T110717 [20:42:58] https://phabricator.wikimedia.org/T112765 [20:43:23] (that's: active:active backends, standard declarative config for backends, phab needs websockets [20:43:37] and also there's a non-phab thing going on: fix up X-Cache in general) [20:44:20] part of the active:active work is preventing inter-cache traffic loops. was expecting to parse X-Cache for this, until I realized we have to stop the loop on the request-side, and X-Cache is all on the response side presently. [20:45:10] we'll need to set up X-Cache (or something similar/related) as a request header that concatenates to detect request-routing loops, while fixing up other aspects of X-Cache [20:46:03] part of the flexible/declarative app-backend stuff is already implemented in cache_misc VCL, just not for the others. expanding that code to cover the others' cases and moving it to common VCL is what needs to happen there, and the other cases include the active:active backend config and the above [20:46:35] and the phabricator websockets thing is a case that even cache_misc's declarative backend stuff doesn't yet cover (having hostfoo -> backend1, but hostfoo/bar/ -> backend2) [20:47:00] upload and maps are trivial to expand for [20:47:25] text is complex again: it has things like "if (req.url ... ) { if (this-header) { backend1 } else { backend2 } }" [20:47:55] plus moving some of the backend-specific code blocks up if we can, e.g.
misspass mangling of bereq for the restbase case [22:04:21] 07HTTPS, 10Traffic, 06Operations: Preload HSTS - https://phabricator.wikimedia.org/T104244#2381200 (10BBlack) [22:04:25] 07HTTPS, 10Traffic, 06Operations: Enable HSTS on Wikimedia sites - https://phabricator.wikimedia.org/T40516#2381195 (10BBlack) 05Open>03Resolved a:03BBlack This is done for all the reasonable cases we have direct control of. The external-ish ones are tracked in task T132521 and on wikitech at https://... [22:11:47] 10Traffic, 06Operations: Remove referrer check from varnish for maps cluster - https://phabricator.wikimedia.org/T137848#2381204 (10Yurik) [22:12:07] 10Traffic, 06Discovery, 10Kartotherian, 06Maps, and 2 others: Remove referrer check from varnish for maps cluster - https://phabricator.wikimedia.org/T137848#2381217 (10Yurik) [22:14:09] 10Traffic, 06Discovery, 10Kartotherian, 06Maps, and 3 others: Remove referrer check from varnish for maps cluster - https://phabricator.wikimedia.org/T137848#2381224 (10BBlack) [22:15:35] 10Traffic, 06Discovery, 06Maps, 06Operations, 13Patch-For-Review: Send traffic to new maps200? servers - https://phabricator.wikimedia.org/T137620#2381228 (10MaxSem) 05Open>03Resolved a:03MaxSem Was done by @Gehel. [22:39:47] bblack: I would expand the vk's filter query to: ReqMethod ~ "^(?!PURGE$)" or Timestamp !~ "Pipe" to remove pipe-related logs that are causing noise [22:39:54] (or something similar) [22:43:55] better: s/or/and/ :) [22:45:55] (afk again1) [22:47:03] elukey: seems sane for now [22:47:22] elukey: although I don't know if that will do what you think it will do, stated exactly as above [22:49:17] yeah the or should be a and to make everything work, just tried with varnishlog on cp3008 and looks good [22:49:28] but I'd need a bit more testing [22:51:29] mmmmm you're right, the expr seems wrong.. will check tomorrow with a bit more coffee :) [22:51:47] but you got the idea! If I get something working I'll send a code review! 
[22:54:22] elukey: yeah the problem is that, since there are multiple Timestamps in every request, at least one of them does not match "Pipe".... [22:54:49] you have to find something truly-unique to match on, which can be hard :) [22:55:32] or maybe it's possible to do it with their boolean query logic? as in: ReqMethod ~ "^(?!PURGE$)" and not Timestamp ~ "Pipe" [22:56:12] I would expect "not foo ~ bar" to mean "none of the several foo matches bar", whereas "foo !~ bar" might mean "if any one of the several foo fails to match bar" [22:58:18] I was going with the assumption that Timestamp !~ "Pipe" was "none of the timestamps matches Pipe", but indeed it seems wrong now ) [22:58:21] :) [23:00:33] ah from https://www.varnish-cache.org/docs/trunk/reference/vsl-query.html there is something interesting [23:00:49] Timestamp:Process[2] > 0.8 [23:01:14] so maybe not Timestamp:Pipe could be feasible? [23:01:17] will check :)
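Editorial sketch of the reading proposed at 22:56, as a plain Python model rather than a statement of how Varnish actually evaluates VSL queries: it assumes a record test is true when any record with that tag satisfies it. Under that assumption, `Timestamp !~ "Pipe"` is true for a piped request (its Start timestamp fails to match), while `not Timestamp ~ "Pipe"` is false for it. All function names and the sample transactions are hypothetical.

```python
import re


def any_match(records, tag, pattern):
    """Model of `tag ~ pattern`: true if ANY record with this tag matches."""
    return any(t == tag and re.search(pattern, v) is not None for t, v in records)


def any_non_match(records, tag, pattern):
    """Model of `tag !~ pattern`: true if ANY record with this tag fails to match."""
    return any(t == tag and re.search(pattern, v) is None for t, v in records)


NOT_PURGE = r"^(?!PURGE$)"


def keep_with_bang_tilde(records):
    # ReqMethod ~ "^(?!PURGE$)" and Timestamp !~ "Pipe"
    return any_match(records, "ReqMethod", NOT_PURGE) and any_non_match(
        records, "Timestamp", "Pipe"
    )


def keep_with_not(records):
    # ReqMethod ~ "^(?!PURGE$)" and not Timestamp ~ "Pipe"
    return any_match(records, "ReqMethod", NOT_PURGE) and not any_match(
        records, "Timestamp", "Pipe"
    )


PIPE_TXN = [
    ("ReqMethod", "GET"),
    ("Timestamp", "Start: 1465910971.589160 0.000000 0.000000"),
    ("Timestamp", "Pipe: 1465910971.589262 0.000102 0.000102"),
    ("Timestamp", "PipeSess: 1465915543.388921 4571.799760 4571.799658"),
]
NORMAL_TXN = [
    ("ReqMethod", "GET"),
    ("Timestamp", "Start: 1465923502.423667 0.000000 0.000000"),
    ("Timestamp", "Resp: 1465923505.200778 2.777110 2.693797"),
]
PURGE_TXN = [
    ("ReqMethod", "PURGE"),
    ("Timestamp", "Start: 1465923502.423667 0.000000 0.000000"),
    ("Timestamp", "Resp: 1465923502.423900 0.000233 0.000233"),
]

if __name__ == "__main__":
    for name, txn in [("pipe", PIPE_TXN), ("normal", NORMAL_TXN), ("purge", PURGE_TXN)]:
        print(name, keep_with_bang_tilde(txn), keep_with_not(txn))
```

If the model holds, only the `not ... ~` form drops pipe transactions while still keeping normal requests and dropping PURGE; this would need confirming against varnishlog on a real host, as planned above.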