[00:57:44] VisualEditor, WikiEditor, VisualEditor-Performance, Analytics, MediaWiki-Core-Team: Apply Schema:Edit instrumentation to WikiEditor - https://phabricator.wikimedia.org/T88027#1045363 (Krenair) TODO: * I should figure out what we can do about action.abort.type * Deal with the action.saveFailure.type schema iss...
[02:03:45] VisualEditor, VisualEditor-Performance, Analytics-Engineering: Report on the central tendency for length of pages which are edited for VisualEditor performance benchmarking - https://phabricator.wikimedia.org/T89788#1045518 (Jdforrester-WMF) NEW a:Milimetric
[06:26:45] Analytics, Datasets-Webstatscollector: www.f / foundationwiki (wikimediafoundation.org) pageviews low, underreporting? - https://phabricator.wikimedia.org/T51266#1045802 (Tbayer)
[06:27:48] Analytics, Datasets-Webstatscollector: www.f / foundationwiki (wikimediafoundation.org) pageviews low, underreporting? - https://phabricator.wikimedia.org/T51266#562202 (Tbayer) Some more evidence that something is wrong here: The rise in the bottom diagram [[https://stats.wikimedia.org/wikispecial/EN/Summar...
[08:05:54] Analytics-Cluster, Analytics-Kanban: {epic} WMF has UC report per project per month & day {bear} - https://phabricator.wikimedia.org/T88647#1045910 (Eloquence)
[08:07:27] Analytics-Wikimetrics, Analytics-Kanban: EPIC: Productionizing Wikimetrics {dove} - https://phabricator.wikimedia.org/T76726#1045914 (Eloquence)
[10:02:21] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Get rid of sync AJAX calls in MediaViewer - https://phabricator.wikimedia.org/T89533#1046016 (Gilles) a:Tgr This task ought to be renamed.
[10:14:36] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Get rid of sync AJAX calls in MediaViewer - https://phabricator.wikimedia.org/T89533#1046033 (Gilles)
[10:17:14] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Set up varnish 204 beacon endpoint for virtual media views and use it in Media Viewer - https://phabricator.wikimedia.org/T89088#1046041 (Gilles)
[10:22:20] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Get rid of sync AJAX calls in MediaViewer - https://phabricator.wikimedia.org/T89533#1046059 (Gilles)
[10:23:28] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Set up varnish 204 beacon endpoint for virtual media views and use it in Media Viewer - https://phabricator.wikimedia.org/T89088#1046065 (Gilles)
[15:11:15] (CR) Milimetric: Add SQL script to create indexes in EL Edit tables (1 comment) [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/190404 (https://phabricator.wikimedia.org/T89256) (owner: Mforns)
[15:13:24] Analytics-Wikimetrics, Analytics-Kanban: Story: WikimetricsUser reads user names in a JSON report [8 pts] - https://phabricator.wikimedia.org/T74747#1046327 (ggellerman) @Frances - please ping Analytics Eng on IRC
[15:14:35] Analytics-Wikimetrics, Analytics-Kanban: Story: WikimetricsUser reads user names in a JSON report [8 pts] - https://phabricator.wikimedia.org/T74747#1046328 (ggellerman) That is, please ping Analytics Eng on IRC for help getting Vagrant to work
[15:31:49] (PS8) Milimetric: Add timeseries graph of key metrics [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/190113
[15:34:40] VisualEditor, Analytics-Engineering, VisualEditor-Performance: Report on the central tendency for length of pages which are edited for VisualEditor performance benchmarking - https://phabricator.wikimedia.org/T89788#1046434 (Jdforrester-WMF)
[15:38:01] Analytics-EventLogging, Analytics-Engineering, Analytics-Kanban: capacity planning for Mobile team WikiGrok test 4 in first half of March 2015 - https://phabricator.wikimedia.org/T89827#1046447 (ggellerman) NEW
[15:50:30] milimetric: back in business at a coffee shop
[15:50:54] (PS3) Mforns: Add SQL script to create indexes in EL Edit tables [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/190404 (https://phabricator.wikimedia.org/T89256)
[15:50:54] cool
[15:51:26] (CR) Milimetric: [C: 2 V: 2] Add SQL script to create indexes in EL Edit tables [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/190404 (https://phabricator.wikimedia.org/T89256) (owner: Mforns)
[16:56:12] Analytics-Cluster, Analytics-Kanban: Estimate roughly of how many users might not have javascript capable/enable browsers, use CSS to crosscheck. - https://phabricator.wikimedia.org/T89847#1046881 (Nuria) NEW
[17:08:46] VisualEditor, Analytics-Engineering, VisualEditor-Performance, Analytics-Kanban: Report on the central tendency for length of pages which are edited for VisualEditor performance benchmarking - https://phabricator.wikimedia.org/T89788#1046935 (ggellerman)
[17:10:13] Analytics-EventLogging, Analytics-Kanban: Tune Sampling rate of eventlogging navigation timing events - https://phabricator.wikimedia.org/T89848#1046942 (Nuria) NEW
[17:18:35] Analytics-EventLogging, Analytics-Engineering, Analytics-Kanban: capacity planning for Mobile team WikiGrok test 4 in first half of March 2015 - https://phabricator.wikimedia.org/T89827#1046969 (ggellerman) from Nuria: "Note that there is not much work to do in this regard. We already talked with Kaldari in...
[17:18:47] Analytics-Cluster, Analytics-Kanban: Estimate roughly of how many users might not have javascript capable/enable browsers, use CSS to crosscheck. - https://phabricator.wikimedia.org/T89847#1046971 (kevinator) p:Triage>Normal
[18:36:00] (PS3) QChris: Add media file consumption reports [analytics/refinery] - https://gerrit.wikimedia.org/r/191118
[18:36:54] (CR) QChris: Add media file consumption reports [analytics/refinery] - https://gerrit.wikimedia.org/r/191118 (owner: QChris)
[18:38:46] ottomata: It seems the mediacounts discussion has settled. Do you think I could get you to do some reviews and a refinery/source release?
[18:38:51] Relevant changes are:
[18:38:58] https://gerrit.wikimedia.org/r/#/c/191098/
[18:39:08] (needs a refinery/source deploy after merge)
[18:39:10] and
[18:39:15] https://gerrit.wikimedia.org/r/#/c/191118/
[18:45:35] (CR) Mforns: [C: 1] "I have added one comment. But I think no changes are needed. If so, ping me and I'll merge :]" (1 comment) [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/190113 (owner: Milimetric)
[18:52:08] qchris: yes, will try to get to it this week, i am on ops clinic duty this week too
[18:52:14] awesome, btw! :)
[18:52:28] k. thanks ottomata.
[19:06:57] (CR) Milimetric: [C: 2 V: 2] Add timeseries graph of key metrics (1 comment) [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/190113 (owner: Milimetric)
[19:10:33] Analytics-Cluster, Analytics-Engineering: Refine page_id, page_name, and namespace using x_analytics fields and page tables - https://phabricator.wikimedia.org/T89396#1047374 (JAllemandou) Do we really want to provide dedicated fields (schema modification, data redundancy), or do we want provide an easy to us...
[19:13:27] Analytics-Kanban, Analytics-EventLogging: capacity planning for Mobile team WikiGrok test 4 in first half of March 2015 - https://phabricator.wikimedia.org/T89827#1047390 (kevinator)
[19:13:53] Analytics-Kanban, Analytics-EventLogging: capacity planning for Mobile team WikiGrok test 4 in first half of March 2015 - https://phabricator.wikimedia.org/T89827#1047394 (kevinator) Open>declined a:kevinator Analytics Eng got the heads up, nothing more is needed.
[19:14:23] Analytics-Kanban, Analytics-EventLogging: capacity planning for Mobile team WikiGrok test 4 in first half of March 2015 - https://phabricator.wikimedia.org/T89827#1047398 (kevinator) declined>Resolved
[19:31:47] wikimedia/mediawiki-extensions-EventLogging#357 (wmf/1.25wmf18 - a962ad0 : Mukunda Modell): The build passed.
[19:31:47] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/a962ad0ea8d7
[19:31:47] Build details : http://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/51274743
[19:49:42] milimetric, how about avoiding multiple instances of the scheduler using a pidfile?
[19:50:13] * milimetric looks up what a pidfile is :)
[19:50:39] a file with the process id that is being executed
[19:51:01] oh i didn't mean that was the only problem though
[19:51:08] aha
[19:51:14] as you can see from that library there are several problems that they solve already
[19:51:22] and I'm a big big fan of not re-solving problems
[19:51:33] sure
[19:52:14] but if that library's no good, and you think it would take too long to find a good one, then we can write it from scratch, that's fine
[19:52:25] but generate.py is written from scratch and... well.... :D
[19:52:35] Analytics-Cluster, Analytics-Engineering: Refine page_id, page_name, and namespace using x_analytics fields and page tables - https://phabricator.wikimedia.org/T89396#1047534 (Ottomata) p:Triage>Low
[19:52:50] Analytics-Cluster, Analytics-Engineering: Investigate getting redirect_page_id as an x_analytics field using the X analytics extension. - https://phabricator.wikimedia.org/T89397#1047535 (Ottomata) p:Triage>Low
[19:53:05] Analytics-General-or-Unknown: Kafka broker analytics1021 not receiving messages every now and then - https://phabricator.wikimedia.org/T71667#1047536 (Ottomata) p:Triage>Low
[19:53:14] milimetric, I was thinking: I suppose the scheduler will have to check every time that all the given reports are scheduled in APS, and are doing well
[19:53:17] Analytics-Cluster, Analytics-Engineering: Automate sqooping of page table into Hive - https://phabricator.wikimedia.org/T89394#1047538 (Ottomata) p:Triage>Normal
[19:54:00] Analytics-Cluster, Analytics-Engineering: Researchers have page_id in X-Analytics field of webrequest logs - https://phabricator.wikimedia.org/T77416#1047544 (Ottomata)
[19:54:01] Analytics-Cluster, Analytics-Engineering: Mediawiki Eng adds fields to X-Analytics header - https://phabricator.wikimedia.org/T77389#1047543 (Ottomata) Open>Resolved
[19:54:09] Analytics-Cluster, Analytics-Engineering: Researchers have page_id in X-Analytics field of webrequest logs - https://phabricator.wikimedia.org/T77416#1047545 (Ottomata) Open>Resolved
[19:55:24] milimetric, as I understand, APS, when started, will grab from the db all reports that were scheduled at some point, and run them if necessary
[19:56:32] ottomata: you beat me to it, I was about to close T77389 right now
[19:58:40] Analytics-Cluster, Analytics-Engineering: MediaWiki Eng adds fields to X-Analytics header - https://phabricator.wikimedia.org/T77389#1047597 (Legoktm)
[20:00:31] mforns: yes, if we used APS, I guess the scheduler would read a configuration file of some sort, translate it into jobs, and schedule those. It would have to "diff" between that config file and the database
[20:01:13] :)
[20:05:07] Multimedia-Sprint-2015-02-18, Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Get rid of artifical click delay in MediaViewer - https://phabricator.wikimedia.org/T89533#1047695 (Tgr)
[20:05:36] milimetric, aha
[20:40:32] hmm leila this is the first person we've added to the analytics-users group
[20:40:44] and i just realized, that this will get her access
[20:40:53] but, not to stat1002!
[20:40:54] hm.
[20:41:30] hm.
[20:41:42] should I: install a hadoop client on stat1003?
[20:41:47] I see, ottomata. it's good to have a more limited access setting. if you think it's too complicated at this point, let me know. We have all the signatures for full access.
[20:41:47] that would be the proper thing to do.
[20:41:52] oh!
[20:41:53] we do?
[20:41:54] hm
[20:42:07] I just preferred more limited access unless it's needed
[20:42:09] i'm not sure what to do.
[20:42:11] yeah.
[20:42:13] unless otherwise is needed
[20:42:31] hadoop client on stat1003 means more people have access to hadoop, even if they don't have any permissions...but that seems less safe
[20:42:55] i'd rather just include the analytics-users group to stat1002 and not think about it, but that would mean they have full permissions basically to see private data on stat1001 (e.g. sampled logs)
[20:43:01] can we talk about this for 5 min batcave in 10 min or so? (I'm in showcase now)
[20:43:04] sorry, on stat1002.
[20:43:05] *
[20:43:08] sure
[20:43:16] thanks! :-)
[20:50:52] ottomata: related to giving people access to stat1002, I thought I had logged an enhancement request many months ago…
[20:51:54] … are there tools that can log requests people are running? for auditing purposes?
[20:52:36] ?
[20:53:57] Is it possible to leave an audit trail when running map/reduce jobs on the cluster?
[20:54:15] similar to audit trails for databases
[20:54:59] ottomata: I'm in batcave
[20:55:48] um, kevinator sounds hard, i mean, the resourcemanager keeps track of that stuff until the next time it is restarted
[20:55:58] we could probably somehow back it up or something, i dunno. actually, wait, yes.
[20:56:09] job logs are moved into hdfs after job completes
[20:56:12] so those are there forever
[20:56:14] i think
[21:00:22] Analytics-Cluster: Hive User can specify webrequest date range in query more easily - https://phabricator.wikimedia.org/T76531#1048003 (kevinator) @milimetric wrote this JS you can bookmark ``` javascript:function where(e,t){t||(t=new Date),e=new Date(e);for(var a=[];t>=e;){var r=[];r.push("YEAR="+e.getUTCFu...
[21:06:23] Analytics-Cluster, operations: Clean up permissions for privatedata files on stat1002 - they should be group readable by statistics-privatedata-users - https://phabricator.wikimedia.org/T89887#1048013 (Ottomata) NEW a:Ottomata
[21:07:37] milimetric, do you have 10 minutes for batcave?
[21:07:42] yes
[21:07:43] coming
[21:07:48] ok :]
[21:38:52] MediaWiki-Vagrant, Analytics: role::hadoop will not provision on Ubuntu 14.04 (MediaWiki-Vagrant default) - https://phabricator.wikimedia.org/T70302#1048148 (Ottomata) stalled>Open That should do it! I'll get someone else to test that this works soon, Hopefully @jallemandou :) -Ao
[21:40:34] ottomata: I just updated the phabricator task. talked to Toby. he approved access to everything
[21:41:09] I'm going to grab lunch ottomata. should be back in 20 min if you have questions.
[21:45:02] k
[22:14:55] Analytics-Kanban, Analytics-EventLogging: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1048262 (kevinator)
[23:25:30] (PS4) QChris: Add media file consumption reports [analytics/refinery] - https://gerrit.wikimedia.org/r/191118