[00:02:48] 10netops, 10Operations, 10ops-codfw, 10ops-eqiad: Audit switch ports/descriptions/enable - https://phabricator.wikimedia.org/T189519#4102901 (10ayounsi) Slightly related. I disabled all the interfaces on the access switches that are down and don't have a description by adding them in `interfaces interface-... [01:20:45] 10netops, 10Operations: Enabling graceful-switchover causes core dumps on cr1-codfw - https://phabricator.wikimedia.org/T191371#4102950 (10ayounsi) [02:27:01] 10netops, 10Operations: Config discrepencies on network devices - https://phabricator.wikimedia.org/T189588#4103007 (10ayounsi) [02:27:10] 10netops, 10Operations: Config discrepencies on network devices - https://phabricator.wikimedia.org/T189588#4046514 (10ayounsi) 05Open>03Resolved [08:09:18] vgutierrez: prometheus-varnish-caching looks good now! <3 [08:21:22] 10Traffic, 10Operations, 10Patch-For-Review: Post Varnish 5 migration cleanup - https://phabricator.wikimedia.org/T188545#4103454 (10ema) 05Open>03Resolved a:03ema With https://gerrit.wikimedia.org/r/416652 being merged, this is done. [08:22:51] 10Traffic, 10Operations: varnish 5.1.3 frontend child restarted - https://phabricator.wikimedia.org/T185968#4103463 (10ema) 05Open>03Resolved a:03ema I haven't seen this happening anymore in the past 3 months. Closing for now but feel free to reopen should the issue come back. [09:59:01] moritzm: all of cache@eqiad rebooted into new kernel, only the unpuppetised systems are left now [10:06:34] ok, thanks! [10:27:59] bblack: any reason for using the codfw debug appserver? https://gerrit.wikimedia.org/r/#/c/423866/ [10:57:31] 10Traffic, 10Commons, 10MediaWiki-File-management, 10Multimedia, and 3 others: thumb.php should not set CC:no-cache on renderer 404 responses? - https://phabricator.wikimedia.org/T150022#4103977 (10ema) [12:43:50] ema: any reason to change it either? IIRC those are just proxies, and something else configured by MW-peeps decides which appservers the proxies hit. 
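For reference, the interface-disabling change mentioned at [00:02:48] is truncated in the log; it would plausibly use a Junos `interface-range` under the `interfaces` hierarchy. The range name and port below are made up for illustration:

```
# hypothetical Junos set-commands; range name and member port are invented
set interfaces interface-range disabled member ge-0/0/10
set interfaces interface-range disabled disable
```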
[12:47:57] vgutierrez: before you kill the xcache daemon, can we get the new graph working better to confirm? [12:48:04] sure [12:48:12] I won't to kill it right now :) [12:48:18] s/won't/want/g [12:48:21] arg [12:48:29] I need more coffee [12:48:43] https://grafana.wikimedia.org/dashboard/db/prometheus-varnish-caching still shows the odd smoothness in the bottom graphs (but seems fixed in the top two?) [12:50:24] 10Traffic, 10DNS, 10Upstream: Difficulties reaching IPv6-enabled sites (bugzilla, lists) in Opera, Chrome on some systems - https://phabricator.wikimedia.org/T19140#4104257 (10TheDJ) [12:52:26] hmm let me try something on those... [12:53:41] in the meantime, regarding predictable network interface names, are we ok with having at the same time both schemas (ethX / enX) on puppet while the migration is ongoing? [12:54:30] for lvs you mean? [12:54:45] indeed [12:55:01] I'm not honestly sure how that's going to work out, but clearly we'll need to support both in general while migrating the first machine(s) [12:55:33] we've hardcoded interface names on site.pp and hieradata/common/lvs/interfaces.yaml [12:55:36] really, we'd need to get a better handle on what the new interface names mean and how we're intended to derive them I guess [12:56:29] they may for all practical purposes end up less-consistent than before, due to some machine-to-machine variances in how PCI busses are hooked up or whatever, and so yeah we might have to define the set on a per-machine level instead of per-site [12:56:40] I don't know [12:57:07] bblack: just for uniformity with everything else pointing to eqiad :) I got confused earlier today by x-cache: cp2019 pass, cp1066 pass, cp3030 pass, cp3043 pass [12:57:07] hopefully it doesn't end up too confusing, I guess! [12:57:19] ema: ok [12:57:39] yup.. 
if machines are not identical (PCI-E slots used for ethernet interfaces) we could get some differences [12:57:58] vgutierrez: for the normal case, we're intended/expected to use the interface_primary fact instead of hardcoding eth0 [12:59:02] vgutierrez: but I don't know if the schema change here brings any such benefit to LVS, unless we had some way to auto-derive interface names based on the row they're hooked up to via lldpd or something, but that sounds super-complicated. [12:59:19] we already have lldp facts in place [12:59:37] (I've been reading a lot of puppet this morning on the train) [12:59:44] but yes, it looks like over-complicated [12:59:52] oh? [13:00:11] what facts? I'm curious [13:00:18] one sec :) [13:00:19] lldp facts can disappear easily [13:00:34] paravoid: :( [13:00:46] e.g. if there is a switch misconfiguration, or a bug like we had in the past (and may still have in some row I think) [13:00:49] or if lldpd dies etc. [13:00:53] right [13:01:22] we use the lldp facts to establish dependencies for monitoring, and for that purpose it's ok [13:01:32] but I wouldn't use it for anything more serious, like the actual config of LVS :) [13:01:32] bblack: https://github.com/wikimedia/puppet/blob/production/modules/base/lib/facter/lldp.rb [13:01:51] so, hardcode it is, and probably plan in the factoring out of it to expect the potential that not all LVSes in a site may share the same naming schema in terms of which iface name is "primary" in their home row, or connected to various other rows (we'll have to probe that via manual lldpcli when setting up new ones' puppet defs before puppetizing them) [13:02:27] you could figure out which interface is which by looking at IP addresses perhaps, although that may be a chicken-and-egg [13:02:36] it is, we're talking new installs [13:02:58] for "regular" systems, we provision IP addresses outside of (and before) puppet [13:03:31] to recap quickly: there's one primary interface for the home row, then 3 other 
interfaces connected to other rows. at least the primary has a default vlan and works like "normal" for the base install and will/should end up interface_primary [13:03:31] DNS -> DHCP -> debian-installer writing that out as static to /e/n/interfaces [13:04:05] and then in theory after base install is done, we double-check lldpcli results and decide how to set up the site.pp interface stuff for the multiple rows/vlans in puppet, then start the initial puppetization of the machine. [13:04:34] in practice, consistency of ethN naming + consistency of how dcops cables things = it ends up that it matches up well on a whole-site level (not much per-lvs variance) [13:04:48] but in the new scheme, probably fewer guarantees there on consistent naming in practice. [13:07:53] e.g. if you look at codfw (simpler in many unimportant ways): they all have a home row of either A or B. All eth0 goes to home row, all eth1 goes to opposite-of-home-row (A or B), all eth2 goes to C, all eth3 goes to D. [13:08:23] for future ones (e.g. lvs1016) we're aiming to do 4x LVSes as one lvs per row, so it's changing slightly anyways. [13:09:44] I think at some point we floated the idea of connecting the 1G interface for the "regular" networking, and then 4x10G for each of the rows [13:10:05] maybe it was part of the "ipvs in a separate network namespace" conversation? [13:10:24] even if the cabling works out similarly, there may be bus-level diffs we don't look at presently, where the eth3 in "all eth3 goes to D" becomes ensp1f0 on two machines and ensp0f1 on two others or whatever, depending how it works out [13:11:30] we could also decide to persistently label those of course, some scheme where we configure the machine to relabel the interfaces as ethrow[ABCD] after initial install or something crazy. 
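The "relabel the interfaces as ethrow[ABCD]" idea maps naturally onto systemd .link files, which rename an interface by matching a stable attribute such as its MAC address. Everything below — file name, MAC, target name — is invented for illustration:

```
# /etc/systemd/network/10-ethrowD.link (hypothetical; MAC is made up)
[Match]
MACAddress=aa:bb:cc:dd:ee:03

[Link]
Name=ethrowD
```

udev applies the rename at device detection time, so the label would survive the onboard-vs-PCI-path naming variance discussed above; the cost is maintaining one .link file per interface per machine.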
[13:12:07] a quick run of lspci across lvs* shows inconsistencies in lvs[1001-1006] VS lvs[1010-1012] and lvs[4001-4004].ulsfo.wmnet VS lvs[4005-4007].ulsfo.wmnet [13:12:13] paravoid: I'm not sure if they all have 1G. I guess they should, but we've disabled it completely in the past (at the bios level) [13:12:45] lvs4001-4 are basically dead, they should've been decommed hardware already [13:12:53] so those are not a problem [13:12:54] lvs1010-12 are also not really LVSes anymore [13:13:02] (should be dead/decom/spare) [13:13:22] lvs1001-6 are the current live eqiad set, to be replaced by lvs1013-16 [13:13:49] (there will definitely be some diffs in that transition, as lvs1001-6 use completely different ethernet and are now ancient, but that's ok) [13:31:39] it looks like our BIOS doesn't report slot paths [13:32:14] so we will end up either with onboard device names enoX or enpXXXXX [13:33:07] which host are you looking at to try first? [13:33:33] (also, we should probably disable pybal BGP for whichever host initially while we sort it out) [13:34:00] I was thinking about lvs5003 [13:34:20] well, it doesn't have the core problem of course :) [13:34:31] but it might be a simpler case to start at the edge sites, sure [13:34:52] and then maybe start core sites with the soon-to-be-installed lvs1016 case [13:34:55] https://phabricator.wikimedia.org/P6940 [13:35:11] that's how eth0 should like on stretch on every lvs instance [13:35:20] *look like [13:35:31] (italian coffee affects my brain differently) [13:35:31] ok [13:36:08] we could maybe factor things differently ahead of this and try to get the ethX naming out of site.pp first and all into hieradata [13:36:58] the biggest barrier there is the txqueuelen diffs, but that could be managed as well [13:37:20] (and excluded for lvs1001-6 just by virtue of their not being bnx2x, like other tweaks) [13:39:32] ignoring the txqueuelen thing, I'd think we could do it now by moving lvs::interface_tweaks over to 
modules/profile/manifests/lvs.pp, and have the interface-naming keys come from the set defined for this host in hieradata/common/lvs/interfaces.yaml ? [13:40:11] (would need some map-operator or whatever is the current replacement for inline-erb, to sift through the structure and grab just the ethernet interface names for a given host throughout) [13:41:26] lacking that, we'd need to split up some per-hostname conditionals in site.pp I guess [13:42:46] the interfaces::vlan_data hieradata seems to only be referenced by the lvs profile manifest [13:43:25] which passes it off to the lvs::tagged_interface profile [13:43:38] so we can refactor the data to include txqueuelen if we fixup the refs there [13:45:43] that and tagged_subnets [13:45:59] I wonder why tagged_subnets is broken up into per-dc hieradata and the main interfaces.yaml is in common/ ? [13:46:13] (probably no particularly-good reason) [13:55:06] I've just updated https://phabricator.wikimedia.org/P6940 considering ONBOARD names (that take priority over PATH names) [14:02:09] 10Traffic, 10Operations, 10Performance-Team (Radar): Support brotli compression - https://phabricator.wikimedia.org/T137979#4104636 (10ema) To get a more up-to-date idea about the percentage of requests we get with AE:br, I've analyzed 30s of GET traffic on cp3033 and was surprised to find zero requests wit... [14:08:13] ema: is it possible your logging of AE there is after varnish normalization? ^ [14:08:32] varnish at some stage normalizes any AE containing "gzip" to just "gzip" [14:08:45] (but would only bother to do so for hits/misses, not pass/POST/etc) [14:10:34] oh that could be [14:14:11] oh yes [14:14:24] - ReqHeader Accept-Encoding: gzip, deflate, br [14:14:27] - ReqUnset Accept-Encoding: gzip, deflate, br [14:14:30] thanks! 
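The hieradata refactor sketched around 13:39-13:43 — per-host interface names plus txqueuelen, out of site.pp — might look roughly like the fragment below. The structure, key names, and txqueuelen value are assumptions for illustration, not the real production layout:

```yaml
# hieradata/common/lvs/interfaces.yaml (hypothetical shape)
lvs::interfaces:
  lvs2001:
    eth0: { row: 'A', txqueuelen: 10000 }  # home row / interface_primary
    eth1: { row: 'B', txqueuelen: 10000 }
    eth2: { row: 'C', txqueuelen: 10000 }
    eth3: { row: 'D', txqueuelen: 10000 }
```

With something like this in place, the lvs profile could iterate the per-host hash (e.g. with Puppet's `each`) instead of inline-erb, and per-machine naming variance becomes a data problem rather than a site.pp conditional.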
[14:25:46] that has always bugged me, the disconnect between varnishlog's query/output capabilities and how the raw shmlog actually references a header multiple times as it changes [14:28:39] vgutierrez: note that on the modern bnx2x machines in core DCs (e.g. lvs1010, lvs2001), eth1-3 don't have ONBOARD names at all in that check [14:30:52] e.g. for lvs2001 eth0-3 PATH names are enp3s0f0, enp3s0f1, enp4s0f0, enp4s0f1, but only eth0+eth1 also have ONBOARD names eno1 and eno2 [14:31:10] I wonder which is used as the primary label? [14:31:45] do we end up with split naming as eno1, eno2, enp4s0f0, enp4s0f1 I guess? [14:32:52] I guess every scenario will be different. likely for lvs2* it's actually accurate and we purchased them with 2x10G onboard + a 2x10G card [14:33:14] whereas some newer systems have the normal 1G onboard (which we disable), plus 2x 2x10G cards [14:42:44] in theory once we get used to it, that naming is useful for coordinating with dcops though [14:43:10] we can say "the first port on the card in pci slot 4" or something, or "the second onboard port" [14:43:29] instead of "whatever eth3 means, please move that to switch port X" [15:14:17] omg, awk I love you so much [15:14:26] varnishlog -c -n frontend -g raw -q 'ReqHeader:Accept-Encoding' -I ReqHeader:Accept-Encoding | awk '!seen[$1] { print ; seen[$1]=1 }' [15:28:45] bblack: yeah, plus no reason to disable the 1Gs anymore [15:29:00] we did that because of assumptions about eth0 being the "primary" one and stuff like that [15:34:08] well maybe for consistency, as I think at least the lvs2* machines have no separate onboard 1G [15:34:24] (might get confusing to puppetize that we have some with a separate interface_primary and some with it shared) [15:35:45] reposting from -perf since it might be interesting here too: https://dcreager.net/nel/intro/ [15:38:18] yeah, interesting :) [15:44:06] 10Traffic, 10Operations, 10media-storage, 10Patch-For-Review, 10User-fgiunchedi: Swift invalid range requests 
causing 501s - https://phabricator.wikimedia.org/T183902#4105174 (10fgiunchedi) 05Open>03Resolved This is now deployed, 501s did indeed disappear now, `rewrite.py` is using `swob` from swift... [15:44:45] unrelated but today I deployed ^, I'm mentioning it because next week I'm out, in case you see some change in behaviour on cache upload [15:45:22] thanks! [15:47:56] mmh I must still be doing something wrong, only ~1% of requests with AE:br does not sound reasonable, does it [15:48:26] timeout --foreground 10 varnishlog -c -n frontend -g raw -q 'ReqHeader:Accept-Encoding' -I ReqHeader:Accept-Encoding | awk '!seen[$1] { seen[$1]=1 ; $1=$2=$3=""; print }' | awk '/br/ { br++ } END { print br; print NR; print br * 100 / NR }' [15:48:32] 418 [15:48:35] 39800 [15:48:42] 1.05025 [15:53:06] what you really want is just the very first mention of AE in the shmlog entry [15:53:26] right [15:53:42] which is what I'm getting there (modulo mistakes) [15:53:46] I don't know that awk can do that once the request boundaries are gone from the stream, that easily [15:54:18] oh $1 is txn [15:54:20] I see [15:54:26] yup [15:55:14] hmmm [15:55:34] seen[] must get rather large over time in that setup [15:56:10] ~40k in those 10 seconds there [15:56:45] 40k entries in the associative array, that is [15:57:43] the trickiest part is that even the count of AEs isn't consistent, because it doesn't rewrite it when it's already just "gzip" [15:58:30] ? [15:58:50] even if it's not getting rewritten it should be logged as ReqHeader:AE right? 
[15:58:51] (of solving this generally with any simpler pipeline) [15:58:57] it is, what I mean is: [15:59:04] * << Request >> 503252626 [15:59:05] - ReqHeader Accept-Encoding: gzip, deflate, br [15:59:05] - ReqHeader Accept-Encoding: gzip [15:59:05] * << Request >> 503293088 [15:59:05] - ReqHeader Accept-Encoding: gzip [15:59:31] depending on whether the original value was just "gzip", you get either 2 log outputs or 1, which makes otherwise-simpler ideas for solving this fail [16:00:24] since "br" will only appear once per request, and would almost-always (bulk of GETs anyways) get transformed... [16:00:35] you can get an approximate answer with a simpler sanity-check [16:02:14] root@cp2001:~# timeout --foreground 10 varnishlog -c -n frontend -g raw -q 'ReqHeader:Accept-Encoding' -I ReqHeader:Accept-Encoding | awk '/br/ { br++ } END { print br; print NR - br - br; print br / (NR - br); }' [16:02:18] 311 [16:02:20] 5348 [16:02:23] 0.0549567 [16:03:21] that works on the assumption (which is probably mostly-valid for bulk cases) that all br entries are followed by a followup gzip line, and originally-gzip-only don't, but then fails at some things because there are transformed originals like "AE: gzip,deflate" [16:03:33] but seems to return similar-ish numbers [16:04:32] what skips blank lines in your last pasted cmd? [16:04:53] oh nevermind, it doesn't emit blanks, hmmm [16:04:57] right [16:05:01] so something surely is off [16:05:09] $ timeout --foreground 10 varnishncsa -n frontend -q 'ReqMethod ne "PURGE" and ReqHeader:Accept-Encoding ~ "br"' | wc -l [16:05:12] 29372 [16:05:17] $ timeout --foreground 10 varnishncsa -n frontend -q 'ReqMethod ne "PURGE"' | wc -l [16:05:20] 42064 [16:05:40] no -I on that [16:05:44] it's counting many lines-per-req [16:05:49] yeah, it's varnishncsa [16:05:57] oh, I should learn to read :) [16:06:01] hmmmm [16:06:31] so just use varnishncsa? 
seems simpler [16:06:37] :) [16:06:57] can still awk-split if you can get varnishncsa to log the original input AE [16:07:09] varnishncsa does not do that [16:07:15] (asked on #varnish) [16:08:06] ok yeah, how about we just use varnishncsa like this to get a rough idea and call it a day [16:08:35] just run a pair of them in parallel :) [16:08:46] my OCD isn't pleased [16:08:59] but OTOH it's beer o'clock, so yeah [16:09:32] beer has through the ages been a great motivator of half-assed human achievement :) [16:17:54] 10Traffic, 10Operations, 10Performance-Team (Radar): Support brotli compression - https://phabricator.wikimedia.org/T137979#4105329 (10ema) >>! In T137979#4104636, @ema wrote: > during that timeframe we only received AE:br requests for methods other than GET (OPTIONS, POST). That was due to the fact that va... [16:19:04] see ya! [16:27:22] 10Traffic, 10Analytics, 10New-Readers, 10Operations, and 2 others: Opera mini IP addresses reassigned - https://phabricator.wikimedia.org/T187014#4105369 (10atgo) Partnerships has been looking for a contact at Opera. We reached out to someone yesterday who is OOO until next week. Will keep you updated. [18:50:45] 10Traffic, 10Operations, 10ops-codfw: cp2006 memory replacement - https://phabricator.wikimedia.org/T191223#4106021 (10Papaul) Your Service Request SR#: 963059814 Contact Us | Support Library | Download Center | SupportAssist | Community Forums Dear Papaul Tshibamba, Current Status: This e-mail serves as... [18:55:49] 10Traffic, 10Operations, 10ops-codfw: cp2011 memory replacement - https://phabricator.wikimedia.org/T191226#4106039 (10Papaul) Your Service Request SR#: 963061588 Contact Us | Support Library | Download Center | SupportAssist | Community Forums Dear PAPAUL TSHIBAMBA, Current Status: This e-mail serves as... 
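Stepping back to the Accept-Encoding analysis earlier in the log: the `!seen[$1]` awk filter can be demoed offline. The fake input below mimics `varnishlog -g raw` output, where $1 is the transaction id and Varnish may log Accept-Encoding twice per request (the client's original value, then the normalized "gzip"); the transaction ids and header values are sample data, not real traffic:

```shell
# Keep only the first Accept-Encoding line per transaction id ($1),
# i.e. the client's original header before Varnish normalizes it.
printf '%s\n' \
  '503252626 ReqHeader Accept-Encoding: gzip, deflate, br' \
  '503252626 ReqHeader Accept-Encoding: gzip' \
  '503293088 ReqHeader Accept-Encoding: gzip' \
| awk '!seen[$1] { seen[$1]=1; print }'
# prints the first and third lines only
```

In the real pipeline, `printf` is replaced by `varnishlog -c -n frontend -g raw -q 'ReqHeader:Accept-Encoding' -I ReqHeader:Accept-Encoding`; as noted in the log, seen[] grows unbounded, so this is only suitable for short timeout-bounded samples.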
[18:56:09] 10Traffic, 10Operations, 10ops-codfw: cp2017 memory replacement - https://phabricator.wikimedia.org/T191227#4106049 (10Papaul) Your Service Request SR#: 963052179 Contact Us | Support Library | Download Center | SupportAssist | Community Forums Dear Papaul Tshibamba, Current Status: This e-mail serves as... [18:57:15] 10Traffic, 10Operations, 10ops-codfw: cp2022 memory replacement - https://phabricator.wikimedia.org/T191229#4106059 (10Papaul) Your Service Request SR#: 963052308 Contact Us | Support Library | Download Center | SupportAssist | Community Forums Dear Papaul Tshibamba, Current Status: This e-mail serves as... [21:26:51] 10Traffic, 10netops, 10Operations, 10Patch-For-Review: Offload pings to dedicated server - https://phabricator.wikimedia.org/T190090#4106852 (10ayounsi) [22:03:00] 10Traffic, 10netops, 10Operations, 10Patch-For-Review: Offload pings to dedicated server - https://phabricator.wikimedia.org/T190090#4106942 (10ayounsi) [22:21:17] 10Traffic, 10netops, 10Operations, 10Patch-For-Review: Offload pings to dedicated server - https://phabricator.wikimedia.org/T190090#4106972 (10ayounsi) About kernel tuning, here are the variables we can adjust as necessary, with their default. ``` 50 -- /proc/sys/net/ipv4/icmp_msgs_burst 1000 -- /proc/sys...
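The paste at the end is cut off, but the knobs shown are the kernel's ICMP rate limits. A sysctl.d fragment for the ping-offload host might look like the sketch below; only the 50 default for icmp_msgs_burst comes from the log, the raised values and file name are illustrative assumptions:

```
# /etc/sysctl.d/99-ping-offload.conf (hypothetical; values illustrative)
# kernel default: net.ipv4.icmp_msgs_burst = 50
net.ipv4.icmp_msgs_burst = 500
net.ipv4.icmp_msgs_per_sec = 5000
```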