[13:25:11] Analytics-Tech-community-metrics: Deployment of IRC panel - https://phabricator.wikimedia.org/T138004#2923324 (Aklapper) "Last messages" on https://wikimedia.biterg.io/app/kibana#/dashboard/IRC ends on 2016-12-21. As the top says "Some bugs were identified in the process and are being reviewed" can bug repo...
[13:25:32] Analytics-Tech-community-metrics: Handling multiple affiliations (at once; like work vs spare time) in tech community metrics - https://phabricator.wikimedia.org/T95238#2923325 (Aklapper) p:Low>Lowest
[13:25:52] Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#2923328 (Aklapper) p:Low>Lowest
[13:26:32] Analytics-Tech-community-metrics: GrimoireLib sometimes displays different names for same user ID; link does not display (existing) contributor data - https://phabricator.wikimedia.org/T140299#2923330 (Aklapper) p:Low>Lowest
[13:27:16] Analytics-Tech-community-metrics: Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#2923333 (Aklapper) p:Low>Normal
[13:27:18] Analytics-Tech-community-metrics: top-contributors.html displays comma as "Location" when a person has more than one affiliation - https://phabricator.wikimedia.org/T123926#2923335 (Aklapper) p:Low>Lowest
[13:28:38] Analytics-Tech-community-metrics: korma: Mismatch between numbers for code merges per organization - https://phabricator.wikimedia.org/T129910#2923339 (Aklapper) p:Low>Lowest
[13:29:15] Analytics-Tech-community-metrics: Ratio of performed code reviews vs. patches authored, for each Gerrit/Differential user - https://phabricator.wikimedia.org/T147948#2923341 (Aklapper) p:Low>Lowest
[13:29:33] Analytics-Tech-community-metrics, Developer-Relations: Measuring Time To First Code Change (TTFCC) - https://phabricator.wikimedia.org/T137201#2923346 (Aklapper) p:Low>Lowest
[16:12:48] bearloga: doesn't hive work for you?
[16:14:05] bearloga: beeline was a bit better when it comes to character encoding but other than that I am not aware of more benefits
[16:14:10] nuria: yup! Hive works for me but for some reason beeline doesn't want to :\
[16:14:53] Oh. I was hoping it would be better about escaping characters in output.
[16:15:07] Nevermind, I guess?
[16:15:41] bearloga: i do not think so, i still use hive cause i cannot see (but I might have not run into it) any benefits of running beeline and it requires more set up
[16:38:23] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#2923641 (Nuria)
[16:38:26] Analytics, Analytics-Dashiki, Wikimedia-Site-requests: Need a Dashiki namespace so we can protect configs {crow} - https://phabricator.wikimedia.org/T112268#2923643 (Nuria)
[16:45:08] Analytics, Analytics-EventLogging, Patch-For-Review: eventlogging user agent data should be parsed so spiders can be easily identified {flea} - https://phabricator.wikimedia.org/T121550#2923667 (Nuria) Update: we will be replacing the data held by the user agent column with the parsed version . Reso...
[16:45:28] Analytics-EventLogging, Analytics-Kanban: Add user_agent_map field to EventCapsule - https://phabricator.wikimedia.org/T153207#2923670 (Nuria)
[16:45:31] Analytics, Analytics-EventLogging, Patch-For-Review: eventlogging user agent data should be parsed so spiders can be easily identified {flea} - https://phabricator.wikimedia.org/T121550#2923672 (Nuria)
[16:49:59] Analytics: THEME: Analyst uses an operationalized Saiku - https://phabricator.wikimedia.org/T75246#2923695 (Nuria) Open>Resolved a:Nuria This is been replaced by pivot
[17:05:51] bearloga: https://github.com/wikimedia/operations-puppet/blob/e959321aa620b77403cc9379db2e86080323c6e8/modules/role/templates/analytics_cluster/hive/beeline_wrapper.py.erb is the script
[17:07:46] i am able to connect to it though
[17:07:49] https://www.irccloud.com/pastebin/nTC5RdaU/
[17:08:03] On the box it should be in /usr/local/bin/beeline
[17:19:56] Hey nuria, anything I can help you with?
[17:20:19] joal: i was going to get started today in FINALLY loading druid with maps data , ayayayay.
[17:20:27] Yay !
[17:20:34] nuria: I can possibly help with that :)
[17:21:15] joal: but besides teh chnage syou recommended for changeset : https://gerrit.wikimedia.org/r/#/c/327845/
[17:21:23] *the chnages you re commended
[17:22:22] joal: I do not think there is much to do besides just running the oozie workflow, is there?
[17:23:36] nuria: do you have an idea of the size of the data you'll import?
[17:23:46] joal: yes, let me look at table
[17:26:10] (CR) Joal: [C: -1] "Little typo in HQL, plus, I don't understand how the HQL and druid loading query will be run. I think you need a coordinator+workflow patt" (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: Nuria)
[17:27:05] nuria: commented on the patch: I don't understand the tiles_table.hql file, and I think you miss the oozie job
[17:27:22] a-team: we have experienced an outage on our caching layer, so I am pretty sure that the Oozie alarms are due to it
[17:27:33] elukey: k
[17:27:48] elukey: I'll relaunch the failed job
[17:27:53] nuria: Ok?
[17:27:55] joal: ah sorry, will correct hql. oozie is missing yes
[17:29:56] elukey: outage still ongoing or not?
[17:30:03] joal: the tiles_table is teh one we will be redaing from:
[17:30:08] *reading from
[17:30:10] joal:
[17:30:14] https://www.irccloud.com/pastebin/O63P1G64/
[17:31:04] nuria: ah ok - those table creation scripts are usually in /hive/... instead of /oozie/.. (just to be picky ;)
[17:31:33] joal: ah, ok, this is just one poc so code will be scrapped
[17:31:42] sure nuria just saying
[17:32:18] joal: we are currently recovering afaik
[17:32:19] :)
[17:32:23] joal: ok, thank you, will move file aside and add coordinator stuff
[17:32:30] madhuvishy: thank you!
[17:32:39] nuria: cool :)
[17:32:56] nuria: let me know if you need me for something
[17:33:24] elukey: Should I wait another half-hour, or should I run failed job now?
[17:39:19] joal: maybe let's wait half-hour!
[17:39:34] elukey: I was assuming that when asking, but preferred to ask :
[17:40:29] (PS6) Nuria: [WIP] POC of loading tile data into pivot [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832)
[17:41:37] joal: updated, please take a look: https://gerrit.wikimedia.org/r/327845
[17:41:52] joal: I think you also recommended "longSum" instead of count here: https://gerrit.wikimedia.org/r/327845
[17:42:03] joal: sorry, here: https://gerrit.wikimedia.org/r/#/c/327845/5..6/oozie/maps/druid/load_map_tiles.template.json
[17:42:57] nuria: correct (longSum) - Looks ok at first sight, maybe only segment granularity depending on data size
[17:43:35] joal: ok, will add oozie code in a sec
[17:46:34] nuria: just checked data size - About 15M (in sequence file) per day
[17:46:56] joal: there should be two months of it
[17:47:06] nuria: This is kinda small for druid
[17:47:12] nuria: only one month (11)
[17:47:25] joal: ah great, mikhail updated it then
[17:47:50] So my point is: Better to have monthly segments (not daily)
[17:49:01] (CR) Joal: "Another round of comnments (data size oriented this time)" (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: Nuria)
[17:50:06] joal: does that mean we are just loading 1 month at a time rather than 1 day at a time?
[17:50:28] nuria: it means we need to reload the beginning of the month every day
[17:50:30] joal: so oozie jobs should only mention month/year?
[17:50:52] joal: me no understand
[17:51:01] nuria: We can load either monthly (better from a re-work perspective), or load daily, and reload the beginning of the month
[17:51:08] nuria: batcave ?
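Editor's note on the exchange above: the "load daily, and reload the beginning of the month" option can be sketched as follows. This is a minimal illustration, not the contents of load_map_tiles.template.json from the patch. The metric name `tile_requests` and the `queryGranularity` value are assumptions; `granularitySpec`, `segmentGranularity`, `intervals`, `metricsSpec` and the `longSum` aggregator are standard Druid ingestion-spec fields. With monthly segments, each daily run has to re-index everything from the first of the month through the current day, and `longSum` adds up the pre-aggregated counts stored in the source rows, where `count` would only count the rows.

```python
from datetime import date, timedelta


def month_to_date_interval(day: date) -> str:
    """ISO-8601 interval from the first of day's month up to the end of day.

    The end bound is exclusive, so it is the day after `day`.
    """
    start = day.replace(day=1)
    end = day + timedelta(days=1)
    return f"{start.isoformat()}/{end.isoformat()}"


def spec_fragment(day: date) -> dict:
    """Fragment of a Druid batch ingestion spec for a daily reload.

    Assumption: the source rows carry a pre-aggregated counter column,
    here hypothetically named 'tile_requests'.
    """
    return {
        "granularitySpec": {
            "type": "uniform",
            # One segment per month, since ~15M per day is small for Druid.
            "segmentGranularity": "month",
            "queryGranularity": "day",  # assumed, not stated in the log
            # Re-index the whole month so far on every daily run.
            "intervals": [month_to_date_interval(day)],
        },
        "metricsSpec": [
            # longSum sums the stored counts; count would count rows.
            {"type": "longSum", "name": "tile_requests", "fieldName": "tile_requests"}
        ],
    }


if __name__ == "__main__":
    print(spec_fragment(date(2016, 11, 15)))
```

Loading once per month instead would use the same fragment with a fixed whole-month interval, which avoids the repeated re-indexing at the cost of fresher data.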
[17:51:16] joal: k
[18:19:02] (PS7) Nuria: [WIP] POC of loading tile data into pivot [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832)
[18:19:59] joal: please take a look https://gerrit.wikimedia.org/r/327845, i tried to remove the "mark directory as done" and "send email in error", also hardcoded name of source table
[18:58:41] joal: anything against me restarting the oozie job?
[19:04:53] !log started 0063446-161121120201437-oozie-oozi-C to re-run upload-2017-1-6-17
[19:04:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:19:05] Analytics, EventBus, Patch-For-Review, Services (doing), WMF-deploy-2017-01-03_(1.29.0-wmf.7): 400 errors in EventBus - https://phabricator.wikimedia.org/T153030#2924156 (Pchelolo) Open>Resolved Since the change was deployed and the errors don't show up in the logs any more I'm resolv...
[20:14:09] elukey: wow, thanks mate
[20:14:31] elukey: I talked a bit with nuria, then got hooked at dinner, and forgot
[20:14:36] nuria: reading
[20:26:09] (CR) Joal: [C: -1] "Still some glitches. getting there!" (7 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: Nuria)
[20:26:13] nuria: --^
[20:27:33] also nuria: 3 druid-tiles-coord currently running in oozie - I assume those are not expected
[20:28:40] elukey: you forgot hour 16 from what I can see
[20:30:08] elukey: another glitch: the refinery-jar version used in your config file is not up to date - current webrequest-load jobs use 0.0.38, the one you launched used 0.0.31
[20:35:36] !log Launched 0063574-161121120201437-oozie-oozi-C to cover for upload-2017-01-06-[16-17]
[20:35:36] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:37:37] Gone for now a-team, will check stuff tomorrow
[20:42:27] Analytics, Research-and-Data: Hash IPs on webrequest table - https://phabricator.wikimedia.org/T150545#2924368 (Jsalsman) How is the value of being able to re-run metrics compared to the risk of disclosing readers personally identifying information when determining what to store and what to refrain from...
[20:50:05] Analytics, Research-and-Data: Hash IPs on webrequest table - https://phabricator.wikimedia.org/T150545#2924383 (Nuria) > If there is a more appropriate venue for these questions, please let me know which one it is.) analytics e-mail list, this ticket is about hasing IPs to avoid cut & paste errors.
[20:52:33] Analytics, Research-and-Data: Hash IPs on webrequest table - https://phabricator.wikimedia.org/T150545#2924387 (Jsalsman) I'm not sure what a cut & paste error is in this context. The original task description said, "research doesn't really need raw IPs on webrequest table"
[20:57:27] Analytics, Research-and-Data: Hash IPs on webrequest table - https://phabricator.wikimedia.org/T150545#2924411 (Nuria) @Jsalsman: my last post on this regard on this thread: our current data retention abides to privacy policy which went extensive discussion. As we have listed before we require buffer tim...
[21:00:32] Analytics, Research-and-Data: Hash IPs on webrequest table - https://phabricator.wikimedia.org/T150545#2924419 (Jsalsman) > Please do not hijack this phab ticket. In November, comments on this issue were directed here *from* the analytics list.
[21:06:35] (CR) Nuria: [WIP] POC of loading tile data into pivot (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: Nuria)
[21:07:13] (PS8) Nuria: [WIP] POC of loading tile data into pivot [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832)
[22:21:15] (PS9) Nuria: [WIP] POC of loading tile data into pivot [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832)
[22:58:20] Analytics-Kanban, Discovery, Discovery-Analysis (Current work), Interactive-Sprint, Patch-For-Review: Add Maps tile usage counts as a Data Cube in Pivot - https://phabricator.wikimedia.org/T151832#2924827 (Nuria) Update: latest patchset creates correctly the json to load into druid via oozie...
[22:59:00] (PS10) Nuria: [WIP] POC of loading tile data into pivot [analytics/refinery] - https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832)