[06:34:37] good morning
[06:35:03] mc1036 runs memcached 1.5.x and it looks very good up to now
[06:35:04] https://grafana.wikimedia.org/d/000000316/memcache?orgId=1&from=now-24h&to=now&var-datasource=eqiad%20prometheus%2Fops&var-cluster=memcached&var-instance=mc1036
[06:35:23] - get hit ratio is around 0.94-0.95, as it was before the upgrade
[06:35:54] - there are now more items than before the upgrade (31M vs 30M) but with way less memory allocated for slabs
[06:36:26] - memcached's memory ceiling is 110G (vs 90G) so a lot more space to go
[06:36:38] - there are zero evictions right now
[06:36:47] effie: --^ \o/
[06:40:54] something not great is mc1022 constantly with tx bw almost double the other shards
[06:40:57] https://grafana.wikimedia.org/d/000000316/memcache?viewPanel=58&orgId=1&from=now-24h&to=now&var-datasource=eqiad%20prometheus%2Fops&var-cluster=memcached
[06:58:47] it seems a SQLBlob key, I am trying to get the related revision from the analytics mysql replicas
[07:25:21] aaand we have a winner https://en.wikipedia.org/w/index.php?oldid=983013703
[07:25:36] timing matches with the increase in bw usage
[07:26:29] so I suppose that this will keep going for a while :(
[08:08:42] I am not home but I take it it wouldn't hurt to firewall 1022 for a bit
[08:09:11] if this is causing tkos
[08:09:49] nono it is not causing tkos, just a doubled tx bw usage
[08:18:01] <_joe_> Module:Citation/CS1 is a thorn in our side
[08:18:23] <_joe_> that's the one key that I'd like to see in a local memcached
[08:32:49] yeah that would be really great
[08:37:43] <_joe_> effie: did we fix the beta memcached mess?
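For reference, the get hit ratio quoted at 06:35:23 is conventionally hits divided by total gets. A minimal sketch, assuming the inputs are the `get_hits` and `get_misses` counters from memcached's `stats` output; the function name and the sample numbers are illustrative and not taken from mc1036:

```python
def get_hit_ratio(get_hits: int, get_misses: int) -> float:
    """Return the fraction of GET requests that were served from cache."""
    total = get_hits + get_misses
    if total == 0:
        # No GETs observed yet; avoid dividing by zero.
        return 0.0
    return get_hits / total


# Illustrative counters only (assumption, not real mc1036 data).
print(round(get_hit_ratio(get_hits=940, get_misses=60), 2))
```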
[08:50:56] headsup: starting in 10-15 mins Grafana will be switched to CAS/grafana-rw
[08:55:33] * volans starts an ab test against grafana :-P
[09:09:32] moritzm: Riccardo as live a/b test is surely good but be prepared for the bug reports :D
[09:10:08] lol, I meant the apache benchmark one :-P
[09:10:30] _joe_ I didn't do anything regarding beta
[09:10:50] (still out)
[09:11:03] <_joe_> ok, let's talk next week :)
[09:11:36] sure sure
[09:11:56] <_joe_> basically I would like us to install memcacheds in beta with buster and redis 5
[09:12:07] <_joe_> right now it's an unholy mess
[09:12:16] beta?
[09:12:22] <_joe_> deployment-prep
[09:12:22] * kormat ducks
[09:12:46] <_joe_> kormat: the memcache/redis config is like the epitome of all that's wrong there
[10:29:27] dashboard editing via sso should work now, you'll show up as logged in at https://grafana-rw.wikimedia.org/
[10:29:50] still a couple of touches to go but is working
[10:36:11] nice!
[10:37:13] yeah! bummer for the separate vhost but it is what it is
[10:38:27] my vote would have been for grafanaaaaaaaaa.wikimedia.org, but I think it's too late
[10:39:01] :-)
[10:40:30] haha reminds of Archer calling Lana, e.g. https://www.youtube.com/watch?v=5UPapMCcwps
[10:44:16] <_joe_> godog: this also means I won't be able to "explore" from a link to non -rw grafana?
[10:44:22] <_joe_> that's quite inconvenient
[10:44:44] <_joe_> I use that quite a lot during incident response
[10:45:40] to explore you can change the host to -rw yeah then it'll work
[10:46:11] not as convenient as a single vhost indeed
[10:47:25] we can share '-rw' links no problem though, for folks not logged in with sso it'll redirect to grafana.w.o
[12:40:28] sorry if I missed the answer to this, can we get the login link to redirect you to -rw ?
[12:42:35] maybe/probably, we're currently still fine-tuning the setup
[14:23:21] cdanis: yep, https://gerrit.wikimedia.org/r/c/operations/puppet/+/639533
[14:24:55] godog: sweet! yeah if login redirects to -rw and -rw links redirect to the main domain when logged out, that's about as seamless as it can be
[14:26:19] cdanis: indeed! yeah the dashboard links from -rw to grafana.w.o are working afaict
[14:27:07] okay one last thing that would be cool but seems not necessary: if /login could 302 to the referer rewritten to be the -rw domain, instead of just the front page
[14:27:13] no idea how hard that is in apache config though
[14:27:27] but it would mean that logging in lands you on the same dashboard you were on
[14:29:00] interesting, yeah I think for that mod_alias isn't enough but mod_rewrite would be
[14:29:29] i'll get the sacrificial 🐐
[14:30:12] hold my rules, I'm going in
[14:30:38] I got that tingly feeling in the back of my neck that means someone's talking about confusing apache config, so I'm here to plug https://wikitech.wikimedia.org/wiki/httpbb
[14:31:20] the word "confusing" is redundant
[14:32:21] there are two kinds of apache config users: those who *know* they're confused, and those who are surprised when it doesn't work
[14:32:28] httpbb can't do anything for the latter
[14:32:34] lol
[14:33:52] an apache doc maintainer once offered me the chance to, uh, rewrite the mod_rewrite docs. i thought about it for a split second and then declined
[14:33:54] rzl: I didn't know you added a --generate-apache-config option to httpbb, nice!
[14:34:17] volans: ooh that's a good idea, it's easy enough... as long as your tests are complete
[14:34:33] maybe a better word than "complete" is "exhaustive"
[14:34:49] but certainly I can generate *some* config that passes all your tests
[14:35:01] I thought you were going for "Turing complete" there
[14:36:32] the httpbb config language has so far avoided becoming turing complete
[14:36:37] narrowly avoided, on a couple of occasions
[14:37:12] it is an easy pit to fall into alright
[14:37:19] but I figure we already have erb for those with... proclivities
[16:50:03] time for the daily good news/bad news maps update: eqiad is nearly finished syncing and at that point the cluster will be serving up to date data and at full capacity (new nodes coming soon)
[16:50:37] maps2002 is very sick and I am going to nuke and recreate the cassandra instance which will hopefully fix it up but things might be wackier than they have been
[16:51:14] thanks hnowlan! I hope it goes easier than expected :)
[18:35:41] The KaiOS app has "Firefox/48.0 KAIOS/2.5" in the user agent. Is it possible to see the traffic generated by the current users (1.2M downloads in India so far)
[18:36:08] wkandek: webrequests or pageviews? :)
[18:37:06] hm interesting, not finding it so far in the pageviews data (which does a bunch of UA parsing)
[18:38:42] I assume we want WikipediaApp/ KAIOS
[18:38:44] wkandek: https://w.wiki/kHo
[18:42:42] cdanis: ok, so it is visible...
[18:45:44] peak traffic is 44*128 requests/5 minutes =~ 19 rps
[18:47:24] I had a quick chat with one of the devs and currently the app does not do any periodic updates (different from the iOS app).
[18:49:01] cdanis: what is the 128 in that calculation? Sample size?
[18:49:13] yes
[18:49:17] yep!
[18:49:50] cool taht goes into my TIL bucket for today.
[18:49:55] that...
[18:50:50] I was curious about the TLS implications earlier too. It seems KaiOS is fine on that stuff (it only has TLSv1.2, but all the samples we get support ECDHE, ECDSA, and AEAD ciphers, even if we zoom back when CBC was supported)
[18:51:11] so they're not impacted by recent or future deprecations
[18:52:30] wkandek: and to expand on the above, it's because chris used the webrequest_sampled_128 data source
[18:54:58] volans: ah, that makes sense.
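The arithmetic behind the 18:45:44 estimate can be spelled out as a small sketch: each sampled row stands for roughly 128 real requests (1-in-128 sampling, as confirmed at 18:49), so the sampled count is scaled up and divided by the window length. The function and parameter names are hypothetical, chosen only for illustration:

```python
def estimated_rps(sampled_count: int, sample_rate: int, window_seconds: int) -> float:
    """Scale a sampled request count back up and convert it to requests per second."""
    return sampled_count * sample_rate / window_seconds


# Values from the chat: 44 sampled requests, 1-in-128 sampling, 5-minute bucket.
peak = estimated_rps(sampled_count=44, sample_rate=128, window_seconds=5 * 60)
print(f"{peak:.1f} rps")  # 44 * 128 = 5632 requests over 300 s, i.e. roughly 19 rps
```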
[18:55:38] from the top-left menu you can select different ones
[18:56:13] the sampled one is near real time so we often use that one
[18:57:28] or I'm misremembering
[18:57:44] webrequest_128 is hourly
[18:57:47] not quite real time
[18:57:57] usually lagged by 2 hours (the hour has to finish, and then, takes almost an hour to process)
[18:58:19] for near-real-time stuff you need ... well it used to be weblog1001 but now it's on the centrallog hosts in the same /srv/webrequest directory
[18:58:30] lots of tail -n99999999 | jq ... stuff
[19:00:55] yea I was misled because still thinking to be utc+2 while I'm +1 now :D
[19:01:01] sorry for the confusion
[19:01:26] you don't keep this in your wm? https://i.imgur.com/R4dX8nE.png
[19:04:01] nope, being so close it's so easy to count -1/-2 that I never felt the need :)
[19:04:16] but I've done it when in SF for longer periods ;)
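As a rough Python analogue of the "tail | jq" approach mentioned at 18:58:30, the sketch below counts JSON log lines whose user agent contains a given substring. It assumes one JSON object per line and a field named `user_agent`; both are assumptions for illustration, not details confirmed in the chat, and the sample lines are made up:

```python
import json
from typing import Iterable


def count_matching_user_agents(
    lines: Iterable[str], needle: str, field: str = "user_agent"
) -> int:
    """Count JSON-lines records whose `field` value contains `needle`."""
    matches = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip truncated or malformed lines, e.g. a partial line at the tail.
            continue
        if not isinstance(record, dict):
            continue
        if needle in str(record.get(field, "")):
            matches += 1
    return matches


# Made-up sample lines; the field name is an assumption.
sample = [
    '{"user_agent": "Mozilla/5.0 (Mobile; rv:48.0) Gecko/48.0 Firefox/48.0 KAIOS/2.5"}',
    '{"user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/81.0"}',
    "not json at all",
]
print(count_matching_user_agents(sample, "KAIOS"))
```

In practice the lines would come from a file handle or `sys.stdin` rather than a list.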