[11:50:52] (CR) Nikerabbit: Measure articles published using CX2 (1 comment) [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/442860 (https://phabricator.wikimedia.org/T196435) (owner: Amire80)
[13:48:22] Analytics, Analytics-Kanban, Operations: Move internal sites hosted on thorium to ganeti instance(s) - https://phabricator.wikimedia.org/T202011 (Ottomata) p:Triage>Normal
[14:11:37] milimetric: ahhh where the heck do we rsync pageviews over to dumps???
[14:11:39] trying to find...
[14:11:47] i have an idea why readme file might disappear...
[14:16:33] YES
[14:16:35] finally
[14:16:36] found it
[14:16:41] the rsync job uses --delete
[14:16:49] so every time the stuff syncs, it deletes this file
[14:41:29] Analytics, Analytics-Kanban, Operations: Move internal sites hosted on thorium to ganeti instance(s) - https://phabricator.wikimedia.org/T202011 (Ottomata)
[14:42:02] Analytics, Operations, vm-requests: eqiad: (3) VM %request for internal analytics web sites - https://phabricator.wikimedia.org/T202013 (Ottomata)
[14:47:42] Analytics, Analytics-Kanban, Datasets-General-or-Unknown, Documentation, Patch-For-Review: Missing documentation for pageviews dataset - https://phabricator.wikimedia.org/T201653 (Ottomata) The reason the readme.html file(s) kept disappearing is that the rsync job that fetches the new datafil...
[15:03:22] a-team: standup?
[15:51:37] Analytics, Analytics-EventLogging, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), Patch-For-Review, Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (ori) @Krinkle OK, that makes sense to me. A...
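A minimal sketch of the rsync finding from 14:16 above, not the actual job definition. rsync's `--delete` removes anything on the receiving side that is missing from the source, but files matching an `--exclude` pattern are also protected from deletion (unless `--delete-excluded` is passed). The helper name, the source module and the destination path below are hypothetical; only the `readme.html` file name and the `--delete` flag come from the log.

```python
from typing import List


def build_rsync_command(source: str, dest: str, protected: List[str]) -> List[str]:
    """Build an rsync argv that mirrors source to dest but keeps protected files."""
    cmd = ["rsync", "-a", "--delete"]
    for name in protected:
        # An --exclude pattern also shields matching files on the receiving
        # side from --delete, as long as --delete-excluded is not used.
        cmd.append(f"--exclude={name}")
    cmd += [source, dest]
    return cmd


# Hypothetical source and destination; the real job's paths are not in the log.
command = build_rsync_command(
    "stats.example.org::pageviews/", "/srv/dumps/pageviews/", ["readme.html"]
)
print(" ".join(command))
```

Building the argv as a list keeps the exclude patterns out of shell quoting trouble if the command is later passed to `subprocess.run`.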
[15:51:43] Analytics, MediaWiki-Vagrant, Services (watching): Vagrant's /var/log/daemon.log filling up with kafka errors - https://phabricator.wikimedia.org/T187102 (Ottomata) @DLynch, do you have another Kafka dependent puppet role enabled? E.g. eventbus or eventlogging?
[15:53:19] Analytics, MediaWiki-Vagrant, Services (watching): Vagrant's /var/log/daemon.log filling up with kafka errors - https://phabricator.wikimedia.org/T187102 (Ottomata) It looks like Kafka is failing because your Zookeeper service is not up and running. I can't reproduce this since both Zookeeper and K...
[15:58:30] Analytics, MediaWiki-Vagrant, Services (watching): Vagrant's /var/log/daemon.log filling up with kafka errors - https://phabricator.wikimedia.org/T187102 (DLynch) I have eventbus explicitly enabled, but I think I remember that being something I did on a suggestion from someone helping me debug this i...
[15:59:13] Analytics, MediaWiki-Vagrant, Services (watching): Vagrant's /var/log/daemon.log filling up with kafka errors - https://phabricator.wikimedia.org/T187102 (Reedy)
[16:07:37] Analytics, MediaWiki-Vagrant, Services (watching): Vagrant's /var/log/daemon.log filling up with kafka errors - https://phabricator.wikimedia.org/T187102 (Ottomata) Really hard to tell what is going on here. The easiest thing to do would be to create a new MW Vagrant instance from scratch and start...
[16:11:30] Analytics: Functionality to share & view SWAP notebooks - https://phabricator.wikimedia.org/T156934 (Neil_P._Quinn_WMF) >>! In T156934#4049358, @Neil_P._Quinn_WMF wrote: > My current workaround for this is to track my projects in Git (which is a good practice generally) and then push them to GitHub, which ha...
[16:21:32] Analytics, Analytics-EventLogging, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), Patch-For-Review, Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Nuria) The lighter EL module is something w...
[16:26:38] milimetric: moved our 1 on 1 to be in 30 mins, is that ok?
[16:26:48] fdans: let's sync up today, will set up meeting
[16:27:04] sounds good!
[16:27:38] fdans: thanks! invite sent
[16:50:35] nuria_: sorry I was late getting out of the doctor's office, that time works though
[16:55:53] Analytics, Analytics-EventLogging, MediaWiki-extensions-WikimediaEvents, Page-Issue-Warnings, and 6 others: Provide standard/reproducible way to access a PageToken - https://phabricator.wikimedia.org/T201124 (Niedzielski) > @Ottomata, it's not clear if we should make these changes now to EventLog...
[17:00:45] Analytics, Analytics-EventLogging, MediaWiki-extensions-WikimediaEvents, Page-Issue-Warnings, and 6 others: Provide standard/reproducible way to access a PageToken - https://phabricator.wikimedia.org/T201124 (Ottomata) Yeah that makes sense. Likely the logic you guys come up with here will be us...
[17:01:50] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform: Scalable Event Intake - https://phabricator.wikimedia.org/T201068 (Niedzielski) Client-side validation feels like shipping test / debug code to production. This only prevents programmer...
[17:06:26] Analytics: Data governance for topics - https://phabricator.wikimedia.org/T200440 (Ottomata)
[17:06:30] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform: Schema Registry - https://phabricator.wikimedia.org/T201063 (Ottomata)
[17:15:25] Analytics, Analytics-Kanban, EventBus, ORES, and 3 others: Fix "score_schema" -- invalid JSON Schema - https://phabricator.wikimedia.org/T197828 (Ottomata) We'll be re-doing the schema as part of T197000, so I'll merge this with that ticket.
[17:15:31] Analytics, Analytics-Kanban, EventBus, ORES, and 3 others: Fix "score_schema" -- invalid JSON Schema - https://phabricator.wikimedia.org/T197828 (Ottomata)
[17:15:34] Analytics, Analytics-Kanban, EventBus, ORES, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Ottomata)
[17:16:26] Analytics, EventBus, MediaWiki-JobQueue, Discovery-Search (Current work), Services (designing): Huge messages in eqiad.mediawiki.job.cirrusSearchElasticaWrite (and other?) topics - https://phabricator.wikimedia.org/T196032 (Ottomata) p:Normal>Low
[17:17:09] Analytics, Analytics-Kanban, Patch-For-Review: pyspark2 job killed by YARN for exceeding memory limits - https://phabricator.wikimedia.org/T201519 (Ottomata)
[17:31:14] Analytics: Upgrade librdkafka on eventlog1002 - https://phabricator.wikimedia.org/T200769 (Ottomata) a:elukey>Ottomata
[17:34:07] Analytics, Patch-For-Review: Upgrade librdkafka on eventlog1002 - https://phabricator.wikimedia.org/T200769 (Ottomata) Currently running librdkafka1 0.11.5-1~bpo9+1 in deployment-prep. Will leave it for a day or two and then upgrade in prod.
[17:34:23] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade librdkafka on eventlog1002 - https://phabricator.wikimedia.org/T200769 (Ottomata)
[17:35:33] Analytics, Analytics-EventLogging, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), Patch-For-Review, Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Krinkle) >>! In T187207#4505118, @ori wrote...
[17:35:48] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): [WIP] RFC: Modern Event Platform: Scalable Event Intake Service - https://phabricator.wikimedia.org/T201963 (Ottomata)
[18:02:34] nuria_: holaaa
[18:02:45] fdans: omw
[18:32:29] Analytics, Analytics-Kanban, Patch-For-Review: Turn off old geowiki jobs - https://phabricator.wikimedia.org/T190059 (Nuria) Let's do a super safe check that new data is available on the monthly schedule as it should be (geoeditors) and let's remove the data from 1006 and note so in wikitech pages
[18:48:05] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): [WIP] RFC: Modern Event Platform: Scalable Event Intake Service - https://phabricator.wikimedia.org/T201963 (Milimetric)
[18:57:25] Analytics, Analytics-EventLogging, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), Patch-For-Review, Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (ori) >>! In T187207#4505280, @Krinkle wrote...
[19:07:32] Analytics, Analytics-EventLogging, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), Patch-For-Review, Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Ottomata) +1
[19:51:57] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), and 2 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Milimetric)
[19:58:12] chelsyx: yt?
[20:00:31] nuria_: yes
[20:00:48] chelsyx: sorry i have not gotten to your CR earlier
[20:00:51] chelsyx: question
[20:01:05] chelsyx: for that query you are not concerned about "pageviews" right?
[20:01:20] chelsyx: even a request that is not initiated by user (like a preload) counts
[20:02:21] nuria_: no. we just want to count unique devices. but i'm wondering why we wanted pageviews originally for https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/mobile_apps_uniques
[20:02:39] the linked page didn't mention anything about pageview,
[20:03:08] chelsyx: so unique devices that access the site no matter what type of request then.. can't the query be simplified to look just for urls with an appinstall id?
[20:04:04] nuria_: yes. I think `access_method = 'mobile app'` should be sufficient
[20:04:19] chelsyx: then.. let's simplify it right?
[20:04:31] nuria_: but this way, the definition of unique devices would be different from https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/mobile_apps_uniques
[20:05:12] do we want to change the mobile app uniques too?
[20:05:34] chelsyx: we can do that too, that does not have to be part of this change
[20:06:18] nuria_: Ok. Sounds good. I will create a new ticket for that change
[20:06:21] chelsyx: but we should move away from using code that does not have a rationale for existing and from what you are saying the code can be a lot simpler
[20:06:37] chelsyx: sounds good
[20:06:42] nuria_: yep, agree
[20:07:11] chelsyx: ok, i will let you submit the new queries, the other part is the "when" the job gets run
[20:07:22] chelsyx: normally oozie jobs are not scheduled like crons
[20:08:11] chelsyx: let me dig on how we schedule similar jobs but i think there are a couple things we also need to change
[20:08:58] nuria_: ok. Thanks!
[20:09:05] chelsyx: likewise
[20:13:42] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): [WIP] RFC: Modern Event Platform: Scalable Event Intake Service - https://phabricator.wikimedia.org/T201963 (Ottomata)
[20:17:56] chelsyx: and - so you are triple sure - you want to count as active users that might not have opened the app at all but the app might have updated in the background? (like fetching reading lists)
[20:18:52] chelsyx: do those "preload" requests come with any header so we could filter them out?
[20:22:23] nuria_: ok. Let me double check with other analysts to see if they want to count users without pageviews. I will ping them in the ticket
[20:22:45] nuria_: regarding your second question, I'm not sure. I have to check with the ios engineer
[20:24:08] chelsyx: ok, do we already use headers for that purpose in the app to distinguish "intentional" versus "unintentional" requests? it is the vanilla way to distinguish requests w/o looking at the path
[20:25:04] chelsyx: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L244
[20:25:37] chelsyx: requests that sent a "preview" header are not counted as pageviews (see link above)
[20:27:04] nuria_: Hmmm... I didn't know about that
[20:27:28] chelsyx: it was created to exclude "unintentional" requests
[20:29:32] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), and 2 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Milimetric) Ok, starting work on this. Basic plan: * figu...
[20:33:24] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform: Scalable Event Intake - https://phabricator.wikimedia.org/T201068 (Nuria) >Client-side validation feels like shipping test / debug code to production. I disagree, it is a nice to have ca...
[21:01:11] nuria_: I'm checking with the ios engineer, they haven't got back to me yet. But I think I need to discuss the definition of app unique devices with other analysts. In my opinion, if a user opens the app and reads the feed, even though they don't have a pageview and no "intentional" request, they should be counted as a unique device
[21:01:57] chelsyx: totally, ideally reading the feed will trigger an event that you could use
[21:09:31] Analytics, ORES, Scoring-platform-team, Services (designing): ORES hooks - https://phabricator.wikimedia.org/T201869 (Ladsgroup) Name of mediawiki hooks should have been prefixed by the name of the extension that introduces it so dependencies would be more clear but beside that, it seems like...
[21:31:04] Analytics, Discovery-Analysis, Product-Analytics, Reading-analysis, Patch-For-Review: Productionize per-country daily & monthly active app user stats - https://phabricator.wikimedia.org/T186828 (chelsyx) @mpopov and @Tbayer , I have a question for you two: What is the proper definition of "mo...
[21:54:25] @nuria_ I just checked with the ios engineer, they said the ios app never includes "preview" in the x-analytics header. The values we have in the header are: User-Agent, Accept-Encoding, X-WMF-UUID (if user opts in to event logging), Accept-Language, Accept, Content-Type
[22:03:23] chelsyx: right, cause it is the android app the one that does it
[22:04:03] chelsyx: now, if we want to count uniques excluding non intentional requests we can do just by adding that header to those
[22:09:23] nuria_: I see. Do you think it's necessary for iOS to add that? How do we filter iOS pageviews now?
[22:53:00] Analytics, Analytics-EventLogging, MediaWiki-extensions-WikimediaEvents, Page-Issue-Warnings, and 6 others: Provide standard/reproducible way to access a PageToken - https://phabricator.wikimedia.org/T201124 (Krinkle) @Tbayer Is it important for the consumption of the Popups schema and other even...
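A rough sketch of the counting rule discussed between 20:01 and 22:04: count unique app devices from any request with `access_method = 'mobile app'`, optionally skipping requests flagged as previews. This is not the refinery query or the PageviewDefinition logic. It assumes the X-Analytics value is a semicolon-separated list of `key=value` pairs and that a `preview` key marks an unintentional request; the record field names (`access_method`, `x_analytics`, `device_id`) and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Set


def parse_x_analytics(header: str) -> Dict[str, str]:
    """Parse 'k1=v1;k2=v2' into a dict; a key without '=' maps to ''."""
    pairs = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        pairs[key] = value
    return pairs


def unique_app_devices(requests: Iterable[dict], skip_previews: bool = True) -> Set[str]:
    """Collect distinct device ids seen on mobile app requests."""
    devices = set()
    for req in requests:
        if req.get("access_method") != "mobile app":
            continue
        # Assumption: a 'preview' key in X-Analytics flags an unintentional request.
        if skip_previews and "preview" in parse_x_analytics(req.get("x_analytics", "")):
            continue
        if req.get("device_id"):
            devices.add(req["device_id"])
    return devices


# Hypothetical sample rows, for illustration only.
sample = [
    {"access_method": "mobile app", "x_analytics": "wmfuuid=a1", "device_id": "a1"},
    {"access_method": "mobile app", "x_analytics": "wmfuuid=b2;preview=1", "device_id": "b2"},
    {"access_method": "desktop", "x_analytics": "", "device_id": "c3"},
    {"access_method": "mobile app", "x_analytics": "wmfuuid=a1", "device_id": "a1"},
]
print(len(unique_app_devices(sample)))
print(len(unique_app_devices(sample, skip_previews=False)))
```

With `skip_previews=False` the function matches the simpler "any app request counts" definition; with the default it only helps for clients that actually send the flag, which per the log the iOS app does not.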