[02:39:11] Traffic, Android-app-feature-Compilations, Operations, Wikipedia-Android-App-Backlog, Reading-Infrastructure-Team-Backlog (Kanban): Determine where to host zim files for the Android app - https://phabricator.wikimedia.org/T170843#3476960 (Krinkle)
[07:48:44] at SoS reading web pinged about https://phabricator.wikimedia.org/T154227 yesterday FYI
[08:13:54] godog: thanks!
[08:14:55] bblack: any chance to get rid of that madness from the CDN and move it to the applayer (where arguably it belongs)? :)
[08:15:58] I mean this BTW: https://github.com/wikimedia/puppet/blob/c88edf93ff096a7e72794e6ec57f83057b3a9f53/modules/varnish/templates/text-frontend.inc.vcl.erb#L21
[08:20:07] bblack: oh and I wasn't trolling you yesterday re: ipresolve in site.pp. Terrible idea though, I agree!
[08:43:05] netops, Operations, monitoring: Grafana dashboards for librenms graphite data - https://phabricator.wikimedia.org/T171823#3477254 (fgiunchedi)
[10:18:24] volans: re: pleonastic else, I've always preferred avoiding those, till somebody told me it's better to be explicit (part of python's mantra) :)
[10:19:04] I'm not sure I see it being implicit without :D but yeah pure personal style
[12:57:53] ema: the mobile redirect is actually one of those things that can't live in the applayer, because "Vary" (and even the Vary extension options I've seen) isn't smart enough.
[12:59:17] there's one URL (e.g. en.wp/wiki/Foo), and it needs to have two outputs (a 302 or a 200 + page content that's cacheable) depending on a complex set of input conditions: whether the UA string falls into binA or binB, and further whether a no-redirect cookie is present.
[13:01:11] but on a similar subject, what should be at the applayer for this is support for the .m. domain in general (as in, MW should support the .m. subdomains and emit mobile-style page content for them based on the Host header)
[13:01:45] instead of what we do today, which is have varnish strip the .m. from the name and send an "X-Subdomain: M" header to the app to get the mobile variant of the content, because it doesn't know to parse the Host header for itself, but does know about that custom header...
[13:04:45] ema: when you get a chance, can you double-check https://gerrit.wikimedia.org/r/#/c/367927/1/manifests/site.pp for stupid errors (e.g. wrong IPs in the wrong places vs the comments?)
[13:07:26] bblack: double-checked, +1
[13:10:45] so the problem there would be the cacheability of 302s, to put it differently? If the complex redirect logic is moved to the applayer, and requests from mobile devices to non-mobile URLs result in redirects to the mobile version, then there's no need to Vary when it comes to caching 200 responses as the URLs would be different (en.m.wp/wiki/Foo vs. en.wp/wiki/Foo)
[13:18:43] The content URLs are different, yes, so it seems like a simple case. But when you think about it in terms of "what output goes with which URL, for all traffic", only the en.m.wp outputs are simplistic for caching
[13:19:25] the en.wp output has two possibilities: 302 or 200, depending on (UA+cookies sent by browser). Serving a cached 302 to a desktop client or a cached 200+content to a mobile client are both wrong outputs.
[13:21:07] the only way to fix it so the mobile-detection logic could be in the applayer, would be to have a way for the applayer to concisely communicate the variance via response headers like Vary.
[13:21:53] if we could ask all clients to just reliably set a header "AmIMobile=yes|no", then the applayer sending us "Vary: AmIMobile" would be sufficient
[13:22:26] right
[13:22:31] but instead we need "Vary: if(ua ~ /this/ || ua ~ /that/ && ua !~ /other/ && cookie !~ /whatever/) { a } else { b}"
[13:22:44] which doesn't work :)
[13:23:16] wouldn't it be great if all mobile browsers would emit such a header? :)
[13:26:11] bblack: ok to upgrade lvs[1009-1012] to the latest pybal?
[13:26:16] yeah it would be nice if there was a standard mechanism for this
[13:26:18] ema: yes
[13:26:54] there was a database of UAs and a vmod to query it
[13:27:05] I think the concern at the time was complexity and speed
[13:27:09] e.g. the browser sending "GiveMeMobileView: 1", and then let browser prefs send that by default for mobile UAs, and remember to stop sending it if the user hits a button for "desktop view" in the browser UA (for that site)
[13:28:05] but even if we got such a sane standard today, it would be 5+ years before we have to stop caring about UAs older than the standard :P
[13:28:34] the sane standard exists, it's called CSS Media Queries :)
[13:28:47] separate mobile websites are so 5-years ago
[13:29:02] we could technically offload a small portion of the decision-making by inventing our own header just for Varnish<->MW too
[13:29:33] we could do: if(ua ~ /this/ || ua ~ /that/ && ua !~ /other/ && cookie !~ /whatever/) { set req.http.MobileView = "1" }
[13:29:54] and have MW emit "Vary: MobileView" on all pages that can vary (e.g. for 302 to mobile in our current scheme) and do the 302 themselves
[13:30:05] but it's not much complexity savings over what we have today
[13:30:39] (and replaces today's edge cases with new different edge cases)
[13:31:15] it would eliminate the part that looks at req.url ~ /index.php/ and such (replaced with MW emitting Vary on appropriate outputs)
[13:31:20] hm
[13:31:35] it could allow MW to serve mobile pages at the same URL too though, wouldn't it? :)
[13:31:42] yes, if they decided to do so :)
[13:32:09] why do 302 in the first place
[13:32:22] indeed!
[13:32:31] that slows down pageviews coming from mobile google too, extra round-trip
[13:33:05] we'd still need some variant of the logic for quite a while. We'd want to 301 .m. -> normal URL in the wake of the change for a while.
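A minimal sketch of the "invent our own header" idea from the 13:29 messages, written in Python rather than VCL so it can be run standalone. The edge collapses UA + cookie into a single MobileView request header, the origin answers with "Vary: MobileView", and the cache keys on URL plus that one header. The regexes, the cookie name no_mobile_redirect, and the Cache/origin helpers are all hypothetical stand-ins, not the real VCL or MW behavior.

```python
import re

# Hypothetical patterns: the real UA regexes live in the VCL and are not reproduced here.
MOBILE_UA = re.compile(r"Mobile|Android|iPhone", re.I)
NO_REDIRECT_COOKIE = re.compile(r"(^|;\s*)no_mobile_redirect=1")  # made-up cookie name


def normalize(headers):
    """Edge step: collapse UA + cookie into one header the origin can Vary on."""
    ua = headers.get("User-Agent", "")
    cookie = headers.get("Cookie", "")
    mobile = bool(MOBILE_UA.search(ua)) and not NO_REDIRECT_COOKIE.search(cookie)
    out = dict(headers)
    out["MobileView"] = "1" if mobile else "0"
    return out


def origin(url, headers):
    """Stand-in for the applayer: 302 for mobile view, 200 otherwise."""
    if headers["MobileView"] == "1":
        location = url.replace("en.wp", "en.m.wp", 1)
        return {"status": 302, "location": location, "vary": ["MobileView"]}
    return {"status": 200, "body": "desktop page", "vary": ["MobileView"]}


class Cache:
    """Toy cache keyed on URL plus the values of the headers named in Vary."""

    def __init__(self):
        self.vary = {}    # url -> header names the origin asked us to vary on
        self.store = {}   # (url, header values) -> cached response
        self.origin_hits = 0

    def fetch(self, url, headers):
        headers = normalize(headers)
        names = self.vary.get(url)
        if names is not None:
            key = (url, tuple(headers.get(n, "") for n in names))
            if key in self.store:
                return self.store[key]
        resp = origin(url, headers)
        self.origin_hits += 1
        self.vary[url] = resp["vary"]
        key = (url, tuple(headers.get(n, "") for n in resp["vary"]))
        self.store[key] = resp
        return resp


cache = Cache()
phone = cache.fetch("en.wp/wiki/Foo", {"User-Agent": "Mozilla/5.0 (iPhone)"})
desktop = cache.fetch("en.wp/wiki/Foo", {"User-Agent": "Mozilla/5.0 (X11; Linux)"})
other_phone = cache.fetch("en.wp/wiki/Foo", {"User-Agent": "Something Android"})
opted_out = cache.fetch(
    "en.wp/wiki/Foo",
    {"User-Agent": "Mozilla/5.0 (iPhone)", "Cookie": "no_mobile_redirect=1"},
)
```

Under these assumptions the same URL holds exactly two cache objects, and the second phone and the opted-out phone are served from cache without the origin ever seeing a raw User-Agent. The complicated condition still has to live at the edge, which matches the "not much complexity savings" caveat at 13:30.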
[13:33:15] (which MW could also do for itself if it were Host-header-aware)
[13:33:20] a while = many years
[13:33:31] we still support secure.wikimedia.org redirects after all :/
[13:33:41] yup
[13:33:53] this is why it's so important to make decisions about your URI space carefully :P
[13:33:56] so, sounds like something we should ping Reading about
[13:34:26] the whole topic starting from the top of "hey, you should read the host header and detect .m. yourself to start with?)
[13:34:29] err "
[13:34:50] or just ignore that part and leave .m. as a legacy varnish concern that MW never has to really support
[13:35:10] (might be simpler on their end in the net)
[13:38:05] you should mail Reading about it :)
[13:38:07] maybe Adam?
[13:38:21] yeah I guess so
[13:38:36] it seems so out there in wishful-thinking-land, as with 100 other such issues. but it's worth raising anyways.
[13:53:55] Traffic, Operations, Community-Liaisons (Jul-Sep 2017), User-Johan: Communicate dropping IE8-on-XP support (a security change) to affected editors and other community members - https://phabricator.wikimedia.org/T163251#3478043 (BBlack) @Johan Yeah I've been OoO and catching up slowly too. We als...
[13:54:15] mmh so our ipvsadm version (1.26) is too old to show the 'ops' flag in ipvsadm -L
[13:54:44] or, rather, it shows it only on persistent services because of a bug fixed in 1ea1f41f40ad183176dba0ace8b873443d6aa82d
[13:54:51] "ipvsadm: Show 'ops' flag regardless of service persistence"
[13:55:13] ipvsadm -S does the right thing though, showing -o
[13:57:34] this is the fix, included in ipvsadm > 1.26: https://git.kernel.org/pub/scm/utils/kernel/ipvsadm/ipvsadm.git/commit/?id=1ea1f41f40ad183176dba0ace8b873443d6aa82d
[13:59:14] bblack, moritzm: I think we did discuss once if we should backport ipvsadm, and maybe we should? d
[14:00:48] yeah I thought we had an old ticket about that, but can't find it
[14:00:55] I don't remember that we discussed it, but we should: it's tied closely to the kernel after all
[14:01:10] it's mentioned in https://phabricator.wikimedia.org/T86650#1841136 back in 2015
[14:01:28] I think we have a ticket for backporting iproute
[14:01:36] ah yeah, maybe that's what I was thinking of
[14:01:48] https://phabricator.wikimedia.org/T138591
[14:01:53] we could also write a decent cli client ourselves, it's just netlink calls after all :)
[14:02:03] maybe one that doesn't say memory allocation error all the time
[14:02:22] now now, we don't want to go making all this advanced routing stuff too easy for lowly end-users :P
[14:02:33] hehe :)
[14:02:58] if we make a backport, I'd say let's upload it to jessie-backports as well (after checking with the maintainer)
[14:03:06] yeah
[14:03:06] yup
[14:06:14] back on the whole ugly NSS+DNS related topic. The more I look at it, I think probably our most-viable option for improving that part of the situation is to write a new open source NSS module with all the flexibility bells and whistles, and use https://github.com/c-ares/c-ares as the DNS protocol implementation behind the logic.
[14:06:23] c-ares has been around quite a while and seems decent quality, and is still active
[14:06:39] I don't know when or who will have time for that, but it's a doable plan.
[14:08:29] https://github.com/c-ares/c-ares for the commit recency and so-on (and +1 for modern development practices for a C lib, coverage testing, etc)
[14:09:04] long history: https://c-ares.haxx.se/download/ , and 49 contributors on the modern github codebase
[14:10:11] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (ema)
[14:10:27] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478137 (ema) p:Triage>Normal
[14:20:46] bblack: I'd merge https://gerrit.wikimedia.org/r/#/c/368162/ so that whenever we decide to upgrade the other LVSs the proper ops config is there already
[14:22:11] ema: does pybal ignore unknown options there?
[14:22:33] bblack: yes, confirmed on pybal-test2001
[14:22:38] awesome, then yeah
[14:23:00] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (faidon) stretch has 1.28, so perhaps it's just simpler to upgrade the LVS systems to stretch, which we'll need to do anyway at some point? We're already running the stretch kernel, and they don't have mu...
[14:24:24] Traffic, Operations: Backport iproute2 4.x from debian testing -> our jessie - https://phabricator.wikimedia.org/T138591#2404751 (faidon) This has been open for a while :) What new things that our kernels can do do we need and on which systems? Are these a priority now or can they wait until we upgrade...
[14:25:38] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (BBlack) Yeah, that's not a bad idea. Perhaps we should morph this into a stretch-for-LVS ticket, and start with the always-almost-ready-to-use lvs1007-12? :)
[14:26:34] Traffic, Operations, Wikidata, wikiba.se, Wikidata-Sprint-2016-11-08: [Task] move wikiba.se webhosting to wikimedia misc-cluster - https://phabricator.wikimedia.org/T99531#3478229 (BBlack)
[14:28:05] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478231 (ema) +1 on upgrading to stretch. However, we are probably gonna end up in a similar situation on stretch whenever upgrading to newer kernels, so perhaps it might still make sense to keep this ticket open...
[14:36:33] ok lvs1010 is now running with an ops service (see ipvsadm -S | grep ops)
[14:37:09] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478121 (MoritzMuehlenhoff) +1 on the stretch upgrade, but I don't think it's very useful to keep the ticket open for future kernel updates, it'll only bitrot and who knows if there's even a new ipvsadm release...
[16:44:15] Traffic, Operations, Pybal: Backport ipvsadm - https://phabricator.wikimedia.org/T171850#3478782 (ema) >>! In T171850#3478261, @MoritzMuehlenhoff wrote: > +1 on the stretch upgrade, but I don't think it's very useful to keep the ticket open for future kernel updates, it'll only bitrot and who knows...
[17:43:53] Traffic, DBA, MediaWiki-extensions-WikibaseClient, Operations, and 6 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3479046 (daniel)
[19:05:27] !log added new misc::cache director "releases" for releases* servers, releases moving away from bromine (T164030)
[19:05:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:05:37] T164030: setup releases1001.eqiad.wmnet (was: setup mwreleases1001) - https://phabricator.wikimedia.org/T164030
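Regarding the ipvsadm messages above (13:54 to 13:55 and 14:36): since the old ipvsadm -L hides the ops flag on non-persistent services but ipvsadm -S reportedly shows -o, one way to check is to scan the saved rules for it. The following is a rough sketch only; it assumes the saved rules are one ipvsadm command line per row, and the sample rules and addresses are invented for illustration, not taken from any LVS host.

```python
import shlex


def services_with_ops(saved_rules):
    """Return (protocol flag, address) for each add-service rule carrying -o/--ops."""
    found = []
    for line in saved_rules.splitlines():
        tokens = shlex.split(line)
        # Only virtual-service definitions matter here, not real-server rows (-a).
        if not tokens or tokens[0] not in ("-A", "--add-service"):
            continue
        if "-o" not in tokens and "--ops" not in tokens:
            continue
        for flag in ("-t", "-u", "-f"):  # tcp, udp, fwmark service selectors
            if flag in tokens:
                idx = tokens.index(flag)
                if idx + 1 < len(tokens):
                    found.append((flag, tokens[idx + 1]))
                break
    return found


# Invented sample in the assumed rule format; addresses are documentation ranges.
SAMPLE = """\
-A -t 192.0.2.10:80 -s wrr
-A -u 192.0.2.20:53 -s wrr -o
-a -u 192.0.2.20:53 -r 10.0.0.5:53 -g -w 10
"""

ops_services = services_with_ops(SAMPLE)
```

On a real host the input would come from the actual ipvsadm -S output rather than SAMPLE, and the exact text the tool prints should be checked against the installed version before relying on this.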