[00:49:57] (CR) Smalyshev: [C: 1] [WIP] Add tagging as part of webrequest refine process [analytics/refinery] - https://gerrit.wikimedia.org/r/367940 (https://phabricator.wikimedia.org/T171760) (owner: Nuria)
[07:57:30] Analytics, Android-app-feature-Feeds, Pageviews-API, RESTBase-API, and 2 others: Why top views data of different sources is not the same? - https://phabricator.wikimedia.org/T172379#3499989 (Liuxinyu970226) @Volker_E I'm afraid there's bugs that result RESTbase datas not synced well in #pageviews...
[08:30:15] Analytics-EventLogging, Analytics-Kanban, Community-Tech, DBA, User-Elukey: Drop CookieBlock* tables from EventLogging DB - https://phabricator.wikimedia.org/T171883#3500047 (Marostegui) Anything pending here?
[08:31:16] Analytics-EventLogging, Analytics-Kanban, Community-Tech, DBA, User-Elukey: Drop CookieBlock* tables from EventLogging DB - https://phabricator.wikimedia.org/T171883#3500060 (elukey) Open>Resolved a:elukey
[09:48:50] Analytics, Analytics-Cluster, User-Elukey: Produce webrequests from varnishkafka to Kafka with Kafka message timestamp set to configurable content field - https://phabricator.wikimedia.org/T166833#3500352 (elukey)
[09:49:02] Analytics, Analytics-Cluster, User-Elukey: Produce webrequests from varnishkafka to Kafka with Kafka message timestamp set to configurable content field - https://phabricator.wikimedia.org/T166833#3309125 (elukey) a:elukey
[09:50:19] Analytics-Cluster, Analytics-Kanban, Operations, Traffic, User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3500371 (elukey) >>! In T121561#3323871, @Ottomata wrote: > We should do some work to understand how ACLs work and what ACLs f...
[09:53:04] Analytics-Cluster, Analytics-Kanban, Operations, Traffic, User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3500389 (elukey) > Note that this plan doesn't yet consider encryption of traffic between Kafka and Zookeeper. Should we? We'...
[10:00:50] Analytics, Discovery: Look into encrypting logs sent between mediawiki app servers and kafka - https://phabricator.wikimedia.org/T126494#2015899 (elukey) @EBernhardson time has passed and now this task is a child of https://phabricator.wikimedia.org/T152015, so we probably need to discuss next steps for...
[10:03:01] Analytics-Cluster, Analytics-Kanban, Operations, Traffic, User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3500417 (elukey) @Ottomata should we keep this task open given that we already have https://phabricator.wikimedia.org/T166167 ?
[10:32:55] Analytics, Analytics-EventLogging: Alarm on errors on /var/log/upstart/eventlogging* files - https://phabricator.wikimedia.org/T170620#3437262 (elukey) Can we use logster for this task? (reading logs in cron and reporting metrics like we do for Varnishkafka)
[10:35:27] Analytics-Kanban, Patch-For-Review: Adding MAILTO to crontab for camus job - https://phabricator.wikimedia.org/T169248#3500476 (elukey) p:Triage>Normal
[10:36:07] Analytics-Kanban, Patch-For-Review: Adding MAILTO to crontab for camus job - https://phabricator.wikimedia.org/T169248#3391644 (elukey) Updated the code review: since we have `>> ${log_file} 2>&1` (stdout/err directed to a log file) the MAILTO field is not enough to get alarms..
[10:46:18] * elukey lunch!
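On the 10:36 camus comment: cron only mails MAILTO when a job writes something to stdout/stderr, so a job that sends everything through `>> ${log_file} 2>&1` leaves cron nothing to send. A minimal sketch of one possible workaround, assuming a hypothetical wrapper called `run_and_alarm` (not the code under review in T169248): keep the log redirect, but print a one-line notice on a non-zero exit so cron has output to mail.

```shell
#!/bin/sh
# Hypothetical wrapper, NOT taken from the actual patch.
# Usage: run_and_alarm LOG_FILE COMMAND [ARGS...]
run_and_alarm() {
    log_file="$1"
    shift
    # Same behaviour as before: all job output is appended to the log file.
    if "$@" >> "$log_file" 2>&1; then
        return 0
    else
        status=$?
        # The only output cron ever sees, so this is what MAILTO would receive.
        echo "FAILED (exit $status): $* (see $log_file)" >&2
        return "$status"
    fi
}
```

A successful run stays silent (no mail); a failed run emits a single line pointing at the log, and the job's exit status is passed through.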
[11:23:51] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3500541 (elukey)
[11:24:42] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3467218 (elukey) @Joe what are the available options? Shall we migrate to use deploy-service for eventlogging or is there a way t...
[11:57:06] Analytics, User-Elukey: Alarm on HDFS related script failures - https://phabricator.wikimedia.org/T168390#3500645 (elukey)
[11:57:17] Analytics-Kanban, Patch-For-Review, User-Elukey: Adding MAILTO to crontab for camus job - https://phabricator.wikimedia.org/T169248#3500646 (elukey)
[11:59:16] Analytics, Android-app-feature-Feeds, Pageviews-API, RESTBase-API, and 2 others: Why top views data of different sources is not the same? - https://phabricator.wikimedia.org/T172379#3500647 (Liuxinyu970226) >>! In T172379#3499989, @Liuxinyu970226 wrote: > I'm afraid there's bugs that result RESTb...
[12:30:38] Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3500700 (phuedx) @elukey: After a //brief// conversation with @ovasileva, I'd say that we can safely archive the following: * MobileWebUIClickTracking_10742159_15423246 * Mob...
[12:33:32] Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3500703 (elukey) @phuedx awesome! Do you mean HDFS with "archive"? (Just to be sure)
[12:45:44] Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3500724 (phuedx) >>! In T172322#3500703, @elukey wrote: > @phuedx awesome! Do you mean HDFS with "archive"? (Just to be sure) Yes. This should definitely be done for the Mobi...
[12:59:05] Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3500756 (elukey) It should be feasible since we are doing a similar thing in https://phabricator.wikimedia.org/T170720. Moving MobileWebSectionUsage_ and MobileWebUIClickTrack...
[12:59:43] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3500758 (elukey) We are discussing to move ~1TB of data to HDFS in https://phabricator.wikimedia.org/T172322, another great news :)
[13:02:19] (CR) Mforns: [V: 2 C: 2] "LGTM free to merge on my side" [analytics/refinery] - https://gerrit.wikimedia.org/r/367940 (https://phabricator.wikimedia.org/T171760) (owner: Nuria)
[13:22:16] Analytics, Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#3500852 (Ottomata) > Looking at the actual deployment, looks like it's using a version from October 2016. I guess it needs to be updated? Yaaa...
[13:26:21] Analytics-Cluster, Analytics-Kanban, Operations, Traffic, User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3500872 (Ottomata) Let's keep it open and use this task to track actually enabling TLS / ACLs for different clients.
[13:38:14] Quarry: Quarry cannot store results with identical column names - https://phabricator.wikimedia.org/T170464#3500901 (zhuyifei1999) @Huji Your query https://quarry.wmflabs.org/query/20697 is affected by this bug.
[13:38:27] Quarry: Quarry cannot store results with identical column names - https://phabricator.wikimedia.org/T170464#3500905 (zhuyifei1999) p:Low>Triage
[13:51:42] Quarry: Quarry cannot store results with identical column names - https://phabricator.wikimedia.org/T170464#3500951 (Huji) Noted, thanks!
[14:03:06] Big data analysis with people who analyze big data stuffs meeting time guys
[14:03:21] * elukey coffee!
[14:03:42] milimetric, ottomata: I don't have anything for the agenda. I'm the only one in the call now.
[14:03:52] ah
[14:03:54] sorry
[14:03:58] No worries :)
[14:04:08] omw, you wanted to talk about a thing last time and I missed it
[14:04:18] ah!
[14:04:19] coming!
[14:04:21] Oh! Yeah I'm definitely interested in re-haching that :)
[14:04:25] *hashing
[14:04:29] one-way hashing
[14:04:32] ;)
[14:21:16] Analytics, Pageviews-API: Endpoint for average view rate in Pageview API - https://phabricator.wikimedia.org/T162933#3501111 (Tbayer) Regarding views from redirects, see also T121912 and note that if/when T53736 is implemented, it may imply [[https://phabricator.wikimedia.org/T53736#3381135 | major chang...
[14:51:34] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501241 (elukey)
[14:52:01] Analytics, User-Elukey: Alarm on HDFS related script failures - https://phabricator.wikimedia.org/T168390#3501260 (elukey)
[14:52:03] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501263 (elukey)
[14:52:22] Analytics-Kanban, Patch-For-Review, User-Elukey: Adding MAILTO to crontab for camus job - https://phabricator.wikimedia.org/T169248#3501265 (elukey)
[14:52:24] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501241 (elukey)
[14:52:37] Analytics, Analytics-Cluster, User-Elukey: Monitor hdfs-balancer - https://phabricator.wikimedia.org/T163907#3501271 (elukey)
[14:52:39] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501241 (elukey)
[14:53:36] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501241 (elukey)
[15:02:14] Analytics-Cluster, Analytics-Kanban, User-Elukey: Make refinery drop data scripts email analytics-alerts if they fail - https://phabricator.wikimedia.org/T168415#3501341 (elukey)
[15:02:16] Analytics, User-Elukey: Refactor analytics cronjobs to alarm on failure - https://phabricator.wikimedia.org/T172532#3501345 (elukey)
[15:05:12] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3501364 (Ottomata) Hey yall, I just bike shed revision-score event schema with @halfak for a while, we had som...
[15:06:49] Analytics-Kanban, Patch-For-Review: Monthly Mediawiki Sqoop job failed - https://phabricator.wikimedia.org/T172426#3498331 (Milimetric) p:Triage>Normal
[15:08:03] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3501376 (elukey) So the following class is responsible to add the necessary credentials to the `deploy-service`: ``` # === Class...
[15:11:38] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3501418 (Ottomata) I am not strongly opposed to using deploy-service. But, this means that when we switch eventlogging analytics...
[15:14:00] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3501423 (Ottomata) For example, we'd like to restrict kafka producers and consumers to specific topics by adding ACLs for TLS pri...
[16:07:31] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3467218 (thcipriani) >>! In T171506#3501376, @elukey wrote: > So the following class is responsible to add the necessary credenti...
[16:42:15] * elukey off! byyyeee o/
[17:11:43] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3467218 (mobrovac) >>! In T171506#3501418, @Ottomata wrote: > But, this means that when we switch eventlogging analytics use to d...
[17:17:50] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3501729 (Nikerabbit)
[17:23:31] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3501748 (mobrovac) Hm, ok, so we are back at discussing work-arounds for not being able to use `oneOf` and fri...
[17:25:03] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3501750 (Pchelolo) > Hm, ok, so we are back at discussing work-arounds for not being able to use oneOf and fri...
[17:30:26] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3501754 (Ottomata) This goes beyond what we can validate in jsonschema. If a field can have multiple types, w...
[17:37:15] Analytics, EventBus, Scap, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3501792 (Ottomata) AH! Interesting! If so, then totally fine to deploy with deploy-service, and run with user eventlogging. Us...
[17:41:30] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3501806 (He7d3r)
[17:41:33] (PS35) Ottomata: JsonRefine: refine arbitrary JSON datasets into Parquet backed hive tables [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[20:18:08] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3502152 (Ottomata) Talked with Marko and Petr, we agreed to do Option A. @Halfak, let me know when you have t...
[20:18:09] running a simple bash script to get instances of graph tag from page props table...intentional. oh that was fast
[20:22:02] eh, running again on s6-analytics-slave. gotta select database() and save myself the headache
[20:22:22] cat dbs.txt | xargs -I % mysql --defaults-extra-file=/etc/mysql/conf.d/research-client.cnf -h s6-analytics-slave.eqiad.wmnet --database=% --skip-column-names -s -e "select database(),count(*) from page_props where pp_propname = 'graph_specs'"
[20:23:03] ^ if i had bearloga R-fu or felt ging python today that would have been ok, but sometimes it's just more fun to use trusty xargs
[20:24:16] heh :)
[20:26:37] Analytics, ChangeProp, EventBus, MediaWiki-JobQueue, and 3 others: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088#3502225 (mobrovac)
[20:37:57] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3502287 (Halfak) OK! {T172566}
[20:38:08] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3502289 (Halfak)
[20:41:10] hey bearloga do you know a way in cirrussearch to to query all namespaces on es.wikipedia.org? i know on en.wikipedia.org i can just start the search with all: and It Just Works.
[20:41:20] i'm talking about the Special:Search page, of course
[20:44:52] dr0ptp4kt: the only way I can think of is if you go into the Advanced tab on the Special:Search page (https://es.wikipedia.org/w/index.php?title=Especial:Buscar&profile=advanced&search=&fulltext=1) and check all (using the todos button)
[20:48:55] bearloga: thanks!
[20:58:54] Analytics-Kanban, User-Elukey: Calculate how much Popups events EL databases can host - https://phabricator.wikimedia.org/T172322#3502305 (Tbayer) >>! In T172322#3500700, @phuedx wrote: > @elukey: After a //brief// conversation with @ovasileva, I'd say that we can safely archive the following: > > * Mob...
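For the 20:22 one-liner: a dry-run sketch of the same `xargs -I %` pattern, with `echo` in front of `mysql` so it prints the per-database command instead of running anything. The database names and the temporary file are made up for illustration (the real dbs.txt is not shown in the log), and the connection flags are left out. Selecting `database()` alongside the count is what labels each output row with its wiki.

```shell
#!/bin/sh
# Hypothetical sample list; stands in for the dbs.txt mentioned in the log.
dbs_file="$(mktemp)"
printf '%s\n' eswiki frwiki > "$dbs_file"

# Dry run: xargs substitutes each input line for % and invokes echo once per
# line, so this prints the commands that would be run rather than running them.
preview="$(xargs -I % echo mysql --database=% --skip-column-names -s -e \
    'select database(), count(*) from page_props' < "$dbs_file")"
printf '%s\n' "$preview"

rm -f "$dbs_file"
```

Dropping the `echo` (and adding back the host and credentials options) turns the preview into the real loop; reading the file with `< dbs.txt` also avoids the extra `cat`.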
[21:02:51] Analytics, Android-app-feature-Feeds, Mobile-Content-Service, Pageviews-API, and 4 others: Why top views data of different sources is not the same? - https://phabricator.wikimedia.org/T172379#3502325 (mobrovac) AFAIK, MCS manipulates the data received from the Pageviews API, so perhaps that's why...
[22:41:15] Analytics-Kanban, RESTBase-API, WMF-Legal, Patch-For-Review, Services (watching): License for pageview data - https://phabricator.wikimedia.org/T170602#3502627 (ZhouZ) HI all, if there are no further objections, are we all ready to move forward with the new proposed language?
[23:26:15] Analytics: Weird performance of sqoop job - https://phabricator.wikimedia.org/T172579#3502684 (Milimetric)
[23:53:32] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (2/4) - Wiki selector - https://phabricator.wikimedia.org/T170936#3502727 (fdans)
[23:54:46] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (2/4) - Wiki selector - https://phabricator.wikimedia.org/T170936#3448065 (fdans)