[00:35:41] (CR) Nuria: "Some errors in console that might come from prior change that yours truly merged when it was not yet time to do so "Cannot read property '" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/434557 (https://phabricator.wikimedia.org/T194430) (owner: Milimetric)
[00:43:45] (CR) Nuria: "Just checked master and no errors so they do come from this changeset. Looking." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/434557 (https://phabricator.wikimedia.org/T194430) (owner: Milimetric)
[00:51:50] Analytics, Analytics-Kanban: Problems with external referrals? - https://phabricator.wikimedia.org/T195880#4259705 (JKatzWMF)
[01:00:52] (CR) Nuria: [C: 2] "Cannot repro errors, must have been local issue." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/434557 (https://phabricator.wikimedia.org/T194430) (owner: Milimetric)
[01:05:14] Analytics, Analytics-Kanban: Problems with external referrals? - https://phabricator.wikimedia.org/T195880#4259719 (Nuria) Talked to @JKatzWMF and edited premise of ticket to better describe issue. No changes have been made to referrers as of late, so issues must be pre-existing; will take a look.
[01:05:16] (Merged) jenkins-bot: Adjust date formatting in the hover box [analytics/wikistats2] - https://gerrit.wikimedia.org/r/434557 (https://phabricator.wikimedia.org/T194430) (owner: Milimetric)
[05:38:38] (PS2) Sahil505: [WIP] Added CSS custom properties using postcss [analytics/wikistats2] - https://gerrit.wikimedia.org/r/437387 (https://phabricator.wikimedia.org/T190915)
[06:07:12] (PS3) Sahil505: [WIP] Added CSS custom properties using postcss [analytics/wikistats2] - https://gerrit.wikimedia.org/r/437387 (https://phabricator.wikimedia.org/T190915)
[06:08:42] (CR) Sahil505: "I see the merge conflict, working on this." [analytics/wikistats2] - https://gerrit.wikimedia.org/r/437387 (https://phabricator.wikimedia.org/T190915) (owner: Sahil505)
[06:17:54] (PS4) Sahil505: [WIP] Added CSS custom properties using postcss [analytics/wikistats2] - https://gerrit.wikimedia.org/r/437387 (https://phabricator.wikimedia.org/T190915)
[06:26:54] Quarry: Allow /query/new to accept sql parameter - https://phabricator.wikimedia.org/T196525#4259904 (MusikAnimal)
[06:40:13] Quarry: Allow /query/new to accept sql parameter - https://phabricator.wikimedia.org/T196525#4259904 (zhuyifei1999) Hmm. Shall I allow both GET and POST? Btw: shall I (somehow) promote `sql-optimizer` in Quarry?
[08:09:08] heya teammm
[08:39:42] o/
[09:05:17] Analytics, Analytics-Dashiki, Performance-Team: Setup Dashiki dashboard for performance survey responses - https://phabricator.wikimedia.org/T196528#4260176 (Gilles) p:Triage>Normal
[09:41:34] Analytics, Performance-Team: Setup dashboard for performance survey responses - https://phabricator.wikimedia.org/T196528#4260324 (Gilles)
[11:01:30] * elukey lunch!
[12:34:13] Analytics, EventBus, MediaWiki-JobQueue, Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4260696 (mobrovac)
[12:34:21] Analytics, EventBus, MediaWiki-JobQueue, MediaWiki-extensions-ORES, and 4 others: ORESFetchScoreJob fails quite a lot - https://phabricator.wikimedia.org/T196076#4260693 (mobrovac) Open>Resolved Thank you @Ladsgroup !
[12:35:08] Analytics, EventBus, MediaWiki-JobQueue, Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4123123 (mobrovac)
[12:35:12] Analytics, EventBus, MediaWiki-JobQueue, MW-1.32-release-notes (WMF-deploy-2018-06-05 (1.32.0-wmf.7)), Services (done): Make JobExecutor debug-log to mwlog - https://phabricator.wikimedia.org/T195858#4260698 (mobrovac) Open>Resolved
[13:46:58] Analytics, Analytics-Cluster, Services (doing): Move EventStreams to main Kafka clusters - https://phabricator.wikimedia.org/T185225#4260932 (Ottomata)
[13:47:19] Analytics, Analytics-Cluster, Services (doing): Support connection/rate limiting in EventStreams - https://phabricator.wikimedia.org/T196553#4260935 (Ottomata) p:Triage>Normal
[13:48:18] Analytics, Analytics-Cluster, Services (doing): Move EventStreams to main Kafka clusters - https://phabricator.wikimedia.org/T185225#3910019 (Ottomata) A low connection rate limit by IP would be useful too, as it would keep someone from quickly opening and closing connections, and/or opening too m...
[14:42:39] Analytics, Analytics-Kanban, Operations, Patch-For-Review, User-Elukey: Import some Analytics git puppet submodules to operations/puppet - https://phabricator.wikimedia.org/T188377#4261137 (elukey)
[14:54:53] mforns: fdans: Wondering if you might know whether the 'https=1' portion is still useful in the Analytics response header?
[14:55:14] Krinkle, mmmhhhh
[14:55:16] not sure
[14:55:35] Analytics, Analytics-Cluster, Services (doing): Move EventStreams to main Kafka clusters - https://phabricator.wikimedia.org/T185225#4261178 (Ottomata) Oh, 20/sec is PER varnish host. Hm.
[14:56:13] Krinkle, now everything is https, right? But not sure that the code defaults parsing to https, or is still reading that header...
[14:56:42] Yeah,
[14:57:33] Even if there are theoretical http responses somewhere in the cluster (probably just redirects from clients without HSTS support), it might not be worth tracking anymore, or at least not in a way that requires adding data to all responses from Varnish :)
[14:59:29] Krinkle, yes, I can check the webrequest refinement code, and also talk with the team today in standup
[14:59:43] will create a task for that
[15:00:09] Thanks! Feel free to tag #Performance-Team (Radar) when you do :)
[15:01:16] k :] thank you!
[15:01:52] done
[15:01:54] https://phabricator.wikimedia.org/T196558
[15:01:59] Analytics, Analytics-Kanban, Patch-For-Review: Deploy Turnilo (possible pivot replacement) - https://phabricator.wikimedia.org/T194427#4198604 (Tbayer) Since Turnilo has now officially replaced Pivot, it would be great to update the documentation on Wikitech (the main page at https://wikitech.wikimed...
[15:02:28] Analytics, Performance-Team (Radar): Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics - https://phabricator.wikimedia.org/T196558#4261190 (mforns)
[15:03:25] Analytics, Performance-Team (Radar): Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics - https://phabricator.wikimedia.org/T196558#4261202 (mforns)
[15:03:57] Analytics, Analytics-Kanban: Varnishkafka does not play well with varnish 5.2 - https://phabricator.wikimedia.org/T177647#4261204 (elukey) First part of the test done today, basically adapting our integration testing suite: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka#Testing_a_cod...
[15:14:32] (CR) Mforns: Allow partial whitelisting of map fields (13 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/437269 (https://phabricator.wikimedia.org/T193176) (owner: Mforns)
[15:39:37] Analytics, Product-Analytics, Reading-analysis: Assess impact of ua-parser update on core metrics - https://phabricator.wikimedia.org/T193578#4261392 (Tbayer) Thanks for the explanation, @fdans ! It seems like the best option for now regarding T189307 is to convert that kind of pageview_hourly query...
[15:53:12] Analytics, Performance-Team (Radar): Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics - https://phabricator.wikimedia.org/T196558#4261190 (Tbayer) See the investigation results at T188807 and the recently updated documentation at https://wikitech.wikime...
[15:54:18] Krinkle, mforns: there was actually considerable interest in this data just recently, so we should not remove it. I followed up on the task.
[15:55:35] Analytics, Analytics-Cluster, Services (doing): Move EventStreams to main Kafka clusters - https://phabricator.wikimedia.org/T185225#4261478 (Ottomata) Let's move this discussion to {T196553}.
[15:57:08] does anyone know why the mailx command stopped working recently on stat1004? (recently as in, it was still working for me on may 21)
[15:57:12] tbayer@stat1004:~$ mailx
[15:57:12] -bash: mailx: command not found
[15:58:13] oh! i might... who pinged me about that....
[15:59:18] mforns: Could you elaborate on what it means to "default to https" for all web requests?
[15:59:27] oh command not found
[15:59:32] As in, changing behaviour in Varnish, or about analytics processing?
[15:59:46] when did we reinstall stat1004 to stretch..
[15:59:51] Aside from creating a queryable field in Hadoop for webrequests, what does the https=1 field do?
[16:00:05] ma 22
[16:00:06] may 22
[16:00:06] :)
[16:00:23] Analytics, Analytics-Kanban, Patch-For-Review: Deploy Turnilo (possible pivot replacement) - https://phabricator.wikimedia.org/T194427#4261508 (Nuria) Added turnilo page: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo
[16:01:19] ping mforns
[16:01:51] ottomata: ok could we reinstall it? it's very useful in a script to alert one about a finished long-running query
[16:01:52] HaeB: info about the heirloom-mailx package:
[16:01:58] Description-en: feature-rich BSD mail(1) -- transitional package
[16:01:58] This dummy package is provided to provide a smooth upgrade path from
[16:01:58] heirloom-mailx to s-nail. It only contains symlinks to the s-nail
[16:01:58] binary and manpage.
[16:02:00] ping mforns standdupppp
[16:02:33] hm that is installed thou
[16:02:40] ottomata: so mail(1) should work with the same syntax? (i vaguely recall looking at the same question years ago ;)
[16:02:59] this is new to me too HaeB, I'm just googling and looking
[16:03:11] (I've not really used these commands)
[16:03:45] oh right, mail isn't available either
[16:03:48] tbayer@stat1004:~$ mail
[16:03:48] -bash: mail: command not found
[16:03:56] HaeB: ./usr/bin/heirloom-mailx
[16:04:07] try
[16:04:11] heirloom-mailx
[16:04:12] or
[16:04:14] s-nail
[16:04:21] (heirloom-mailx symlinks to s-nail)
[16:06:19] ottomata: yes, that worked
[16:06:53] would still be nice to have the alias so i don't have to change my scripts... ;)
[16:06:57] but thanks
[16:07:56] HaeB: if you run your scripts with your username a simple bash alias will work fine
[16:08:32] ya HaeB in your .bash_profile you could put
[16:08:40] alias mailx=s-nail
[16:10:03] elukey: understood, but i am pretty certain others use it in the same way (i originally started using mailx for this based on a tip by adam, and have since passed it on as a tip to others myself)
[16:10:41] i'll change it in my bash profile though now
[16:30:04] Quarry: Allow /query/new to accept sql parameter - https://phabricator.wikimedia.org/T196525#4261598 (MusikAnimal) I figure you wouldn't want to allow POST (and actually create the Quarry), but if that's acceptable, it'd be even better because I can bring down the number of clicks to just one, assuming the A...
[16:30:26] Analytics, Analytics-Cluster, Services (doing): Support connection/rate limiting in EventStreams - https://phabricator.wikimedia.org/T196553#4261600 (Pchelolo) > Or, perhaps we can reuse https://github.com/wikimedia/limitation with the same interval counter stuff we use in statsd? To me this sounds...
[16:34:34] Analytics, Research: [Open question] Improve bot identification at scale - https://phabricator.wikimedia.org/T138207#4261605 (Tbayer) Is this going to be carried forward into the [[https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019 | 2018-19 annual plan]]? Improvements in this area w...
[17:24:33] nuria_: Aye, I was not claiming that 100% of traffic is https.
[17:24:51] nuria_: I'm merely wondering what the utility is of having it sent to the client and tracked as part of the analytics breakdown.
[17:24:54] Krinkle: yaya, just explaining that distinguishing the two is still of interest
[17:25:32] Krinkle: we use it to actually compute the volume of connections attempted over http
[17:25:55] Does varnishlog provide the protocol used in another way that doesn't require sending a cookie to the client?
[17:26:52] Hey nuria_ - I thought it was mine
[17:27:18] Krinkle: mmm.. i see what your concern is
[17:27:42] nuria_: I can go :)
[17:27:45] Krinkle: not that i know of
[17:28:15] Things like host name, user-agent, and protocol etc. are usually available without needing to copy it back to the response.
[17:28:21] Krinkle: but let me see, is this https sent for real
[17:28:35] Krinkle: as a cookie?
[17:29:15] nuria_: Sorry, you're right. It's a response header, not a cookie.
[17:29:22] "x-analytics: https=1"
[17:29:38] From curl -I 'https://www.wikimedia.org' as example
[17:30:23] Krinkle: right, i just cannot keep track, it is an x-analytics response header but not a cookie
[17:31:14] Yeah, so what I'm wondering is, where/how is it consumed from the response header (presumably varnishkafka > webrequest raw), and can that message already transport information without needing it to be a response header.
[17:32:42] The current vcl implementation for https=1 is based (only) on 'X-Forwarded-Proto' being set. Perhaps that field can also be made accessible via varnishkafka?
[17:33:13] Anyhow, like I said, given that this is used, it is not about removal anymore, but about optimisation, and I'm less worried about that. There are other things I can look at first that can (maybe) be removed more easily.
[17:33:22] Feel free to ignore, or look at later if it seems possible.
[17:42:02] ya, i think it is a response header by default but no, it does not need to be, actually the x-analytics header has no use as a response header that i can think of, right? Unless some
[17:42:50] client code is using it, which seems unlikely
[17:44:09] we should be able to remove it from our responses to the users, right?
[17:44:46] vgutierrez, Krinkle: i think so, as it is not used
[17:45:35] nuria_: vgutierrez Yeah, I'm pretty sure no client is using it, and if they are, we shouldn't care. But removing it from the response is (I think) not easy. But I could be wrong.
[17:45:39] Krinkle: unless mediawiki requests (the ones varnish does not serve, cache bypass) expect that header to be there
[17:45:59] I always thought that the reason we use the header is because that is an easy way to export data from VCL to read from varnishkafka
[17:46:13] But if there is another way to do that, then yes, rm_header++ :)
[17:46:46] ema was tracking some efforts regarding this on https://phabricator.wikimedia.org/T194814
[17:46:51] Krinkle: even if it is sent to varnishkafka, ahem (not sure, vgutierrez might know)
[17:46:59] it does not have to make it all the way to our users
[17:48:08] I know very little about VCL, so ignore this, but... from an outside perspective I would expect that calling unset() in VCL means it doesn't get sent to users *and* not to varnishlog. The header exists only as a hack to send information to varnishlog.
[17:49:06] right, we do something like that for TLS information
[17:49:14] instead of using http headers, we set some VCL Log entries
[17:49:21] let me show it to you on the VCL
[17:49:26] Yeah, that information is very difficult/impossible to extract afterwards, so that has to happen within VCL.
[17:49:31] Oh, cool.
[17:50:12] https://github.com/wikimedia/puppet/blob/production/modules/varnish/templates/vcl/wikimedia-frontend.vcl.erb#L109-L117
[17:50:15] right there
[17:50:29] so, AFAIK, we could do something like that for X-Analytics
[17:50:38] For the case of x-analytics: https=1, I do not know if it was more complex in the past, but the current implementation is just based on X-Forwarded-Proto being set, which (I think) is already in the varnish log, so as-is, it should be possible to update the analytics consumer of varnishkafka to do its https=1 thing based on that directly.
[17:50:54] vgutierrez: Oh, that looks very useful. I did not know that existed.
[17:51:25] of course ema or bblack can provide more insights, maybe there is some con I'm missing
[17:51:43] but I think it's a viable option
[17:52:38] Krinkle: still you would need to add it to the x-analytics field
[17:53:13] Krinkle: cause all "out of band info" gets added there, we do not want to consume the https info by itself, if that makes sense
[17:54:47] Analytics, Analytics-Kanban, Performance-Team (Radar): Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics - https://phabricator.wikimedia.org/T196558#4261915 (Krinkle) declined>Open I'll re-open this for a slightly different purpose, which is...
[17:55:01] Analytics, Analytics-Kanban, Performance-Team (Radar): Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics - https://phabricator.wikimedia.org/T196558#4261921 (Krinkle)
[17:55:28] Analytics, Analytics-Kanban, Performance-Team (Radar): Evaluate alternate means to send X-Analytics information from Varnish to Hadoop. - https://phabricator.wikimedia.org/T196558#4261190 (Krinkle)
[17:55:56] nuria_: Hm.. I'm not sure I understand "we do not want to consume the https info by itself".
[17:56:17] I assume there exists code somewhere that, when populating the webrequests table, takes X-Analytics, makes a map, and sets it there.
[17:56:23] That code currently reads it from a header.
[17:56:31] Which is part of varnish log/kafka by default.
[17:57:08] That code could instead make use of a slightly different key (e.g. resp.somethingMagic.X-Analytics instead of resp.header.X-Analytics)
[17:59:21] Krinkle: mmmm... we do not consume (via varnishkafka) any one thing that varnish has (let me dig up that config) so i am not sure that the info from the X-Forwarded-Proto header is available at the time info gets sent to varnishkafka....
[18:01:54] varnishkafka only has access to whatever is in the varnish shared log
[18:02:10] hmmm
[18:02:14] varnish.arg.q = ReqMethod ne "PURGE" and not Timestamp:Pipe and not ReqHeader:Upgrade ~ "[wW]ebsocket" and not HttpGarbage
[18:02:19] that's webrequest query
[18:02:33] you should be getting X-Forwarded-Proto info
[18:02:39] which iirc gets entered after the response finishes
[18:02:55] so any header (e.g. X-Forwarded-Proto) that is set is accessible by varnishkafka
[18:02:55] ottomata: Right, that's why we can't use headers unless we also send it to the user, which is fair I suppose.
[18:03:24] X-Forwarded-Proto is set by nginx actually and carried by varnish
[18:03:43] and TIL it is also possible for VCL to send data to the shared memory log only.
[18:03:46] (explicitly)
[18:04:00] Krinkle: yup, basically what I showed you before
[18:04:32] vgutierrez: Could you add that to the task as well? It sounds like an interesting option to explore.
[18:04:51] Krinkle: reporting X-Analytics via VCL_Log?
[18:05:04] Yeah
[18:05:30] I'll add a comment and I'll discuss it tomorrow morning with ema
[18:05:35] thx!
[18:11:53] vgutierrez: And a VCL_Log field like the ones you showed, could those be extracted similarly to a header in the varnishkafka format?
[18:12:00] E.g. https://github.com/wikimedia/varnishkafka/blob/0c8d2dbaf56f717a25579dfa4db7d54a3d419eb5/varnishkafka.conf.example#L106 which currently uses a header
[18:12:05] (this is an example, couldn't find the live one)
[18:12:37] ah, got it.
[18:12:37] https://github.com/wikimedia/puppet/blob/9468a3970b08d3aa4beca6d91e00cd07abac4e7e/modules/profile/manifests/cache/kafka/webrequest.pp#L129
[18:13:24] Krinkle: yup
[18:13:36] awesome
[20:27:41] Analytics, EventBus, MediaWiki-Categories, MediaWiki-JobQueue, and 2 others: {{PAGESINCATEGORY}} returns incorrect value on en-wiki Category:Candidates for speedy deletion - https://phabricator.wikimedia.org/T195397#4262397 (mobrovac)
[20:52:43] Analytics, Analytics-Cluster, Services (doing): Support connection/rate limiting in EventStreams - https://phabricator.wikimedia.org/T196553#4262457 (Nuria) >To me this sounds like the simplest solution of all since we don't need any new external dependencies and the algorithm seems to be the easiest...
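
A minimal sketch of the VCL_Log idea discussed above (17:49-18:05): instead of copying X-Analytics into the client response, the frontend VCL could write it straight to Varnish's shared memory log with std.log(), which is what varnishkafka reads from. The subroutine placement and the https=1 derivation from X-Forwarded-Proto below are assumptions for illustration only, not the production wikimedia-frontend.vcl logic.

    # Illustrative sketch only: emit X-Analytics data as a VCL_Log record
    # instead of a client-visible response header. Assumes X-Forwarded-Proto
    # is set by the TLS terminator (nginx), as described in the log above.
    import std;

    sub vcl_deliver {
        if (req.http.X-Forwarded-Proto == "https") {
            # Written to the shared memory log only; never sent to the client.
            std.log("X-Analytics: https=1");
        }
        # Note: no `set resp.http.X-Analytics = ...`, so nothing extra goes
        # out in the HTTP response.
    }

For this to work end to end, the varnishkafka format string that currently reads the X-Analytics response header (see the varnishkafka.conf.example and webrequest.pp links above) would have to be pointed at the VCL_Log record instead. vgutierrez's "yup" at 18:13:24 suggests VCL_Log fields can be extracted much like headers, but the exact format-string syntax is not shown here and would need to be confirmed as part of T196558 and the follow-up with ema.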
[22:00:29] Analytics, Analytics-Wikistats: Wikistats 2.0 Remaining reports. - https://phabricator.wikimedia.org/T186121#4262616 (Nuria)
[22:12:51] Analytics, Recommendation-API, Research: PySpark 2 cannot find numpy - https://phabricator.wikimedia.org/T196592#4262641 (bmansurov)
[22:13:29] Analytics, Recommendation-API, Research: PySpark 2 cannot find numpy - https://phabricator.wikimedia.org/T196592#4262641 (bmansurov)
[23:15:55] Analytics, Analytics-Kanban: Wikistats. Bug on title "wikistats 2" is not shown - https://phabricator.wikimedia.org/T194224#4262718 (Nuria)
[23:17:20] Analytics, Analytics-Kanban: Wikistats. Bug on title "wikimedia statistics" is not shown - https://phabricator.wikimedia.org/T194224#4192686 (Nuria)
[23:17:30] (PS1) Nuria: "Wikimedia Statistics" was not shown on title for "all projects" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/437878 (https://phabricator.wikimedia.org/T194224)