[11:19:54] Analytics-Kanban, Analytics-Wikimetrics: Troubleshoot Wikimetrics RAE reports - https://phabricator.wikimedia.org/T93217#1140279 (mforns) In fact, the whole query with the problematic cohort does work for me now, it takes some time (40mins) but it finishes and returns reasonable results. Also, I could not... [14:07:00] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Pageviews not loading in Vital Signs - https://phabricator.wikimedia.org/T90742#1140509 (mforns) Open>Resolved [14:33:54] good mroonniingng [14:44:44] morning! :D [14:46:13] Analytics: Vowpal Wabbit on stat1002 - https://phabricator.wikimedia.org/T93537#1140575 (Ottomata) This doesn't look easy; I'd likely have to build a custom debian package to install this. EEEEEEE [14:48:22] Analytics-Tech-community-metrics, Possible-Tech-Projects, ECT-March-2015, Epic, and 2 others: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1140578 (AntrikshAgarwal) Hello, I want to work on this project. I am well versed with py... [14:51:02] Analytics-Tech-community-metrics, Possible-Tech-Projects, ECT-March-2015, Epic, and 2 others: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1140584 (NiharikaKohli) @AntrikshAgarwal you stand a chance as good as anyone else. But si... [14:58:36] kevinator: I changed a few words on a few slides, as I was saying in standup. I should've probably consulted you but I'm free all day if you want to talk about it. They were nothing big either way [15:06:31] nuria: the last comment here https://phabricator.wikimedia.org/T93242 seems to lean away from doing sampling, did you talk to them more afterwards? 
[15:07:24] milimetric: ok, I'll make sure I read them carefully before presenting to make sure there are no surprises :-) [15:07:32] milimetric: no, i had not seen that, let me quantify [15:07:42] milimetric: I'll ping you later to day [15:07:46] milimetric: will update ticket in 10 mins [15:09:00] nuria: yeah, i'm seeing about 6.5 million events in the last 24 hours, in the table [15:09:30] so that's like 75 per second [15:09:30] milimetric: ok, then feel free to update ticket, that seems to me too much data [15:09:42] ya, 20 per sec sounds bette (3 times less) [15:09:53] k, i'll update [15:46:15] Analytics, Analytics-Cluster: Log the X-Cache header in the webrequest logs - https://phabricator.wikimedia.org/T91749#1140700 (Ottomata) I just what happens when adding columns to Hive tables backed by JSON Serde. It is weird: Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be ca... [16:03:33] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Reliable scheduler computes Visual Editor metrics [21 pts] {lion} - https://phabricator.wikimedia.org/T89251#1141349 (mforns) It seems that the reportupdater (aka scheduler) is working properly. However, the data is not good enough yet. And as... [16:32:18] Analytics-Tech-community-metrics, ECT-March-2015: Instructions to update user data in korma - https://phabricator.wikimedia.org/T88277#1141472 (Dicortazar) We have migrated the affiliations information from Wikimedia to the new SortingHat database schema. However, we're missing nationalities information... [16:41:31] ottomata: yt? [16:44:29] hiya yup [16:45:08] nuria: hiya [16:45:47] ottomata: do you know why this page: https://wikitech.wikimedia.org/wiki/Incident_documentation/20150206-EventLogging [16:46:58] ottomata: does not show here: https://wikitech.wikimedia.org/wiki/Incident_documentation [16:48:15] ah, you are def asking the wrong person [16:48:21] i have no idea how mediawiki works :/ [16:50:14] mediawiki works?! 
[16:50:20] has anyone told the users? [17:05:02] Analytics-EventLogging, Analytics-Kanban: Backfill client side data data for 2015-03-10 - https://phabricator.wikimedia.org/T93602#1141602 (Nuria) NEW [17:07:28] Ironholds, can I test your patch in vagrant? [17:07:41] mforns, sure! I mean, you can also build it and test it live ;) [17:07:58] live where? stat1002? [17:09:01] well, the hive cluster; build it into a JAR, then ADD JAR and import the UDF as you would for any custom UDF [17:09:42] Ironholds, oh of course [17:09:55] thanks! [18:07:59] (PS1) Ottomata: Add a simple hacky script to print out ADD PARTITION statements for a webrequest table [analytics/refinery] - https://gerrit.wikimedia.org/r/198759 [18:09:02] (CR) Ottomata: "I don't want to try to make this beautiful right now. I want to use it for a one off column addition as described in https://phabricator." [analytics/refinery] - https://gerrit.wikimedia.org/r/198759 (owner: Ottomata) [18:22:21] (CR) Joal: [C: 1 V: 1] "Sounds ok, but couldn't we use the trick we did with the parquet format (playing ith metastore) ?" [analytics/refinery] - https://gerrit.wikimedia.org/r/198759 (owner: Ottomata) [18:24:10] ottomata: yt ? [18:24:52] yup hiya! [18:25:50] joal|sleepy: how ya doin? [18:26:01] Doing great :) [18:26:20] Baby and wifey are very well, I am an happy dad :) [18:26:29] daww :) [18:26:36] huhu [18:26:55] Wiyld you have some time to review and merge / deploy the ua-parser stuff ? [18:27:12] Like that we move forward with the oozie stuff as well ! [18:27:39] oh yes sure, lemme see [18:28:28] looks good. I can merge this. would totally be cool for nuria to merge things like this too [18:28:33] (PS4) Ottomata: Move UAParser wrapper to refinery-core and update refinery-hive accordingly. 
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/195952 (owner: Joal) [18:28:52] ottomata: ok, will do so in the future [18:28:53] Right, I'll ask her next time ;) [18:29:05] Hi nuria [18:29:09] (CR) Ottomata: [C: 2 V: 2] Move UAParser wrapper to refinery-core and update refinery-hive accordingly. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/195952 (owner: Joal) [18:29:13] Not too tired ? [18:29:19] joal|sleepy: how is that sleep deprivation test? [18:29:35] Weeeeeeellllllll, so far, so good :) [18:30:00] joal|sleepy: great, I will merge your changes going forward, np. [18:30:01] My brain is half working, but as long as I concentrate on only one thing at a time ;)j [18:30:04] joal|sleepy: are there any upcoming changes to refinery-source that we might want to include in a release? [18:30:28] ottomata: I don't have any so far [18:30:57] I'll need to learn how to merge/deploy the source as well, at some point :) [18:31:04] is this change needed to do the ua parser in refined task? [18:31:13] ottomata: yes [18:31:19] ottomata: for niceness that is [18:31:23] yes. Ugh. i need to make that better. the main problem is archiva. i need to make archiva work with ldap [18:31:25] that would be ideal. [18:31:29] it also needs to be behind https soon [18:31:40] which reminds me, lemme check up on that ticket [18:32:37] joal|sleepy: you know how to deploy i think, just not release [18:32:49] releasing is mostly easy, but there is something that is broken for me each time I do it right now, that requires manual fixing [18:32:54] not fully sure why, haven' thad time to figure it out [18:33:06] something with maven and .pom shas being mixed up [18:33:09] in archiva server [18:33:30] i got some tech debt around archiva server... :/ [18:35:43] ah, Ironholds, i missed this in that review [18:35:48] why did we rename Pageview to PageviewDefinition? 
[18:36:15] nuria suggested it, and nuria is pretty much always right :D [18:37:22] GRRRRR, :) ok, fine with me, I am trying to let go a bit of refinery :p. [18:37:28] you should put that into the changelog too then [18:37:34] since that technically changes the refinery-core API [18:42:59] ottomata: wait .... wat api? [18:43:04] *what [18:45:57] ahha [18:46:11] i mean, it is a interface breaking change, it shoudl just be noted in the changelog [18:46:16] you can no longer do [18:46:21] Pageview.isPageview() [18:48:17] Analytics-Kanban, Analytics-Visualization: New host/lab environment for Visual Editor visualizations in labs that can report usage metrics [13 pts] {lion} - https://phabricator.wikimedia.org/T89255#1142310 (Nuria) >So we need a proper address in labs. The easiest would probably be to configure limn1 to se... [18:49:50] oh, gotcha [18:49:55] yeah, I should make sure I've updated the docs [18:50:00] is this going to break the oozie jobs? ;) [18:51:53] Thx ottomata :) [18:53:16] Ironholds: no, because you changed the UDF too, right? [18:53:31] afaik there isn't anything else that depends on it [18:53:59] ottomata, oh, point [18:54:08] oh, the ETL stuff with is_pageview is just a UDF thing? cool! [18:54:51] yup, it just calls out to the UDF and saves the result [18:55:07] but, if we moved refinement to say, spark or something, instead of hive [18:55:14] ottomata: shall I bump to 0.0.9 ? [18:55:15] we would be using refinery-core directly, not the UDF [18:55:24] joal|sleepy: that happens automatically when doing mvn:release [18:55:30] mvn release:prepare [18:55:30] i think [18:55:32] ja [18:55:48] joal|sleepy: https://github.com/wikimedia/analytics-refinery-source#releases-and-deployment [18:55:51] ottomata, gotcha [18:55:54] mmmh, was thinking of refinery projevt ;) [18:56:03] oozie jar version :) [18:56:14] ottomata: ah, ok,yes correct. Changes that method but that was a static class, it had no interface thus it could not be called other than directly. 
We can create interfcaes and have spark/udfs depend on those now [18:56:24] refinery, not refinery-source? [18:56:27] joal|sleepy: ? [18:57:06] nuria: i mean 'interface' in the generic sense, not the java specific senes, but ja [18:57:30] ottomata: In order to use the modified ua-parser code, I need to bump the jar version in refinery project [18:57:42] I'll push a code review, you'll see :) [18:57:44] (PS3) Ottomata: Update app Pageview detection and expose it [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198489 (owner: OliverKeyes) [18:57:48] ah [18:57:53] joal|sleepy: ja, but i need to release the jar first :) [18:58:00] yup, for sure :) [18:58:08] (CR) Ottomata: [C: 2 V: 2] Update app Pageview detection and expose it [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198489 (owner: OliverKeyes) [18:58:19] ja, joal|sleepy, once i release, i'll let you handle the rest of the deploy [18:58:59] hmm, I'll do that once you and nuria give me a formal go with code-review :0 [18:59:48] joal|sleepy: in which one? ... i do not think i have any pending coming from you, lemme check again [18:59:59] Ironholds: fyi i'm going to go ahead and add that to the changelog now, since I want to tag and release [19:00:02] not yet, waiting for the jar to be released [19:00:16] ottomata, you've seen the patch in, right? [19:00:28] Analytics-EventLogging, Analytics-Kanban: Backfill client side data data for 2015-03-20 - https://phabricator.wikimedia.org/T93602#1142366 (Nuria) [19:00:32] oh, you +2d! [19:00:33] yay! [19:00:45] nuria: --^ [19:00:53] yes! 
[19:00:57] :) [19:01:53] (PS1) Ottomata: Update changelog for 0.0.9 with info about Pageview class name changes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198784 [19:02:29] (PS2) Ottomata: Update changelog for 0.0.9 with info about Pageview class name changes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198784 [19:02:42] (CR) Ottomata: [C: 2 V: 2] Update changelog for 0.0.9 with info about Pageview class name changes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198784 (owner: Ottomata) [19:02:47] yay! [19:03:01] "I'm not an engineer, I just wrote most of our UDFs, fnar fnar" [19:03:05] * Ironholds practices in front of the mirror [19:03:11] oh, we need the UA parser stuff in the changelog too [19:03:33] UA parser stuff? [19:04:21] https://gerrit.wikimedia.org/r/#/c/195952/ [19:04:33] Ironholds: ua parsed values as a map [19:04:36] :) [19:04:43] ahh [19:04:45] cool! [19:04:52] joal|sleepy: fyi, we try to keep release changes documented in changelog.md, so include a summary of changes in there for future patches :) [19:05:11] Right, didn't know that [19:05:17] (PS1) Ottomata: Update changelog.md with UAParser wrapper change [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198788 [19:05:20] cool, ja, np [19:05:49] (CR) Ottomata: [C: 2 V: 2] Update changelog.md with UAParser wrapper change [analytics/refinery/source] - https://gerrit.wikimedia.org/r/198788 (owner: Ottomata) [19:06:23] (PS1) Joal: Add ua_map and x_analytics_map fields to refine table. [analytics/refinery] - https://gerrit.wikimedia.org/r/198789 [19:06:41] OOOO exciting! [19:07:33] oh awesomse, str_to_map [19:07:38] didn't know that existed [19:07:41] :) [19:07:49] usefull trick [19:09:11] hm, joal|sleepy, i would have just replaced the x_analytics string in the refined table with the map [19:09:16] whatcha think? [19:09:59] do you think we need to keep the string around in the refined table? 
[19:09:59] ottomata: Clearly feasible, but we break the table format then (parquet only supports addition ..) [19:10:05] HM> [19:10:06] hm. [19:10:12] true. that would be funky. [19:10:14] :-/ [19:10:15] HMMMM [19:11:18] I think for the moment we keep the thing as is, and we concentrate on having a "sanitized" dataset, on whivh we'll remove the necesseray info [19:11:25] Whadyou think ? [19:12:23] ? [19:12:36] oh, a separate dataset than this refined one? [19:12:41] a 3rd webrequest table? [19:12:45] (CR) Nuria: "Need to check on whether we want to only have ua_map on refined tables or we want both user_agent and ua_map." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/198789 (owner: Joal) [19:13:59] hm, ha, nuria, joal and I are talking about that same thing, but for x_analytics now [19:14:08] Maybe not a third one, but trying to only update the format removing the necessary fields to have a sanitized dataset once [19:14:16] ottomata: ah ok, ya. [19:14:19] aye, like, remove it later? [19:14:41] joal|sleepy: also, same here, did not know str_to_map existed, VERY HANDY! [19:14:46] for now let's just start using this, and later we can do the thing were we deprecated the old format by phasing it out (letting it be deleted) and maintaining two tables for a little while? [19:16:21] ottomatta, joal|sleepy : I do not think we event need to mainatain two tables, updating old oozie jobs should be enough, right? (maybe I am missing something big time) [19:16:38] nuria: we can't change the format of the old parquet data [19:16:50] so, we can't just replace the fields with their map versions, even though that would be cleaner [19:16:55] ottomata: ahhhhh [19:17:02] (although, i'm not sure we'd want to for user agent, but we can discuss that later) [19:17:18] the migration of adding new fields is trickky enough! 
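The `str_to_map` trick discussed above turns a delimited string into a key/value map. Below is a minimal Python sketch of that idea, not the code from the patch under review. The sample header value and the `;` / `=` delimiters are assumptions for illustration; the log does not show which delimiters the patch passes. Hive's own function may also differ in details, such as how it treats a pair with no value.

```python
def str_to_map(text, pair_delim=",", kv_delim=":"):
    """Rough Python analogue of Hive's str_to_map(text, delimiter1, delimiter2).

    Delimiters are treated as literal strings in this sketch.
    """
    result = {}
    if not text:
        return result
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(kv_delim)
        # In this sketch, a pair without the key/value delimiter maps to None.
        result[key] = value if sep else None
    return result


# Hypothetical X-Analytics value; the ';' and '=' delimiters are assumed.
x_analytics = "mf-m=b;zero=250-99"
x_analytics_map = str_to_map(x_analytics, ";", "=")
print(x_analytics_map)
```

Keeping the original string column next to the new map column, as proposed in the discussion, means existing Parquet data stays readable because the change is purely additive.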
[19:17:38] ottomata: ok, i see, not super easy [19:17:46] so, i thikn joal|sleepy is suggesting to add the new fields now, leaving the old data in place [19:17:56] if we decide to later, we can clean up and remove things we don't want [19:18:09] ottomata: k, sounds fine [19:18:10] we would likely do that by maintaining a legacy table for a couple of months, until the data is deleted anyway [19:18:40] ottomata, nuria : utlimate goal is providing a sanitized dataset, correct ? [19:18:45] joal|sleepy: fyi, i am actively working on https://phabricator.wikimedia.org/T91749 [19:18:49] so that will bea field addition as well [19:18:53] we should do all 3 at the same time [19:19:04] joal|sleepy: nuria, i think so? [19:19:37] it is not clear if the 'sanitized' dataset will be different or the same as the 'refined' dataset [19:19:40] i am not sure. [19:19:40] ottomata: I can wait as long as Kevin can wait ;) [19:19:45] haha, ok. [19:19:53] ottomata, joal|sleepy : right, goal is sanitized dataset from raw tables that can be "harvested" for longer retention [19:19:59] joal|sleepy: i thikn i will have this done by tomorrow [19:20:18] that is, x_cache field in the raw table [19:20:51] maybe by wed. [19:20:51] ottomata, nuria : so, about sanitization, it will involve removing some data from refined webrequest, so let's ensure we have it right before breaking format compatibility [19:20:53] lots of meetins tomorrow [19:20:59] yes, agree [19:21:06] np ottomata, will wait :) [19:21:21] we can do that when it is time, so ja, let's add these as the *_map fields as you have them for now [19:21:22] Send me an email when ready, I'll update the code [19:21:24] joal|sleepy: ya, totally, backwards compatibility 1st that helps also to define what we want to have [19:21:47] joal|sleepy: can you add comments to the create_webrequest_table.hql file documentiing this intention? 
[19:21:52] sounds good, everybody on the same page :) [19:22:08] Yes ottomata, will do [19:22:09] that we want to replace the string fields with the maps, and only keep this data once [19:22:10] danke [19:22:41] joal|sleepy: 0.0.9 release is up in archiva. [19:22:53] thx mate [19:22:58] do you have git-fat set up? [19:23:11] https://github.com/wikimedia/analytics-refinery#setting-up-the-refinery-repository [19:23:39] yup, I have [19:23:56] but not yet updating :( [19:24:07] https://wikitech.wikimedia.org/wiki/Archiva#Setting_up_git-fat_for_your_project [19:24:12] oh ok [19:24:13] cool.. [19:24:26] ok, you should be able to grab the 0.0.9 jars from archiva [19:24:29] and if git-fat is set up [19:24:31] just do [19:24:39] git add [19:24:41] and commit [19:24:46] fingers crossed and that will work [19:25:05] it will only work, if the sha that git-fat commits is the same as the sha in archiva [19:25:14] hmmmm [19:25:24] that is the pieces that i seem to ahve trouble with sometimes, i think there is something wrong with the sha in archiva sometimes, i am not sure [19:25:34] never got a jar from archiva yet [19:25:55] will try that tomorrow (baby time ... got to go) [19:26:03] Thx ottomata and nuria ! [19:26:12] wikimetrics question: how does it stand re: i18n? [19:26:19] will be back tomorrow (or maybe later tonight) [19:26:22] ok cool, laters! [19:29:12] fhocutt, I suppose you ask for: https://phabricator.wikimedia.org/T60634 [19:29:27] ah, I see! ok then. [19:29:31] thanks, mforns. [19:30:16] fhocutt: I don't know exaclty, it seems a task created long ago [19:31:00] it does! But you don't know of any work that's been done since then? 
[19:31:01] fhocutt: we have no plans in short term to add internacionalization [19:31:25] thanks, nuria, good to know [19:31:58] fhocutt: or long term i think [19:32:40] it could be useful for program leaders in the long term, especially since all grantees are required to provide global metrics [19:33:09] ottomata, qq: do you know if, in vagrant, role analytics depends on other roles? [19:35:17] it depends on the mysql one [19:35:49] fhocutt: any communication via e-mail with those offices and the developers to report problems, ask questions, must be in english so for such a small user base i do not think effort of adding 118n is worth it. [19:35:50] i mean, it includes tons of other roles, mforns [19:35:53] ::hadoop, etc. [19:36:24] fhocutt: *i18n, sorry [19:36:34] ok, thanks [19:36:42] ottomata, ok, but if I enable it by itself, should it work? [19:38:57] yes it should, follow the instructions in the comments of the top of that role file [19:39:32] ottomata, ok thanks! you said mysql role, I can not find any role with that name, is mysql its name? [19:39:47] uhh, i guess it doesn'thave a role, it is include by mediawiki ummmm [19:39:49] lemme see [19:40:01] ah just mysql module [19:40:02] mforns: [19:40:46] ottomata, ok [19:41:11] mforns: https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/role/manifests/oozie.pp [19:41:13] but ja [19:41:15] that's all [19:41:45] ottomata, ok [19:44:38] Analytics, Scrum-of-Scrums, Wikipedia-App-Android-App, Wikipedia-App-iOS-App, and 3 others: Avoid cache fragmenting URLs for Share a Fact shares - https://phabricator.wikimedia.org/T90606#1063059 (dr0ptp4kt) https://gerrit.wikimedia.org/r/#/c/198805/ under review. [19:51:07] !log added x_cache field to raw webrequest table, (no data yet) [19:52:20] Analytics, Analytics-Cluster: Log the X-Cache header in the webrequest logs - https://phabricator.wikimedia.org/T91749#1142575 (Ottomata) Cool, just did steps 1-3! Will do the varnishkafka change next. 
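The helper script mentioned in this log prints ADD PARTITION statements for a webrequest table after a column addition. The following is a hedged sketch of how such a generator could look, not Ottomata's actual script. The table name, the partition columns (`webrequest_source`, `year`, `month`, `day`, `hour`), and the date range are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def add_partition_statements(table, source, start, end):
    """Yield one ALTER TABLE ... ADD PARTITION statement per hour in [start, end)."""
    current = start
    while current < end:
        # Partition column names here are assumed, not taken from the log.
        yield (
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ("
            f"webrequest_source='{source}', year={current.year}, "
            f"month={current.month}, day={current.day}, hour={current.hour});"
        )
        current += timedelta(hours=1)


# Hypothetical table, source, and three-hour window.
statements = list(
    add_partition_statements(
        "webrequest", "text", datetime(2015, 3, 23, 0), datetime(2015, 3, 23, 3)
    )
)
for statement in statements:
    print(statement)
```

Printing the statements rather than executing them allows a review pass before anything touches the metastore, which suits a one-off migration.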
[19:52:47] (PS2) Ottomata: Add a simple hacky script to print out ADD PARTITION statements for a webrequest table [analytics/refinery] - https://gerrit.wikimedia.org/r/198759 [19:53:42] (PS1) Ottomata: Add x_cache field to create_webrequest_raw statement [analytics/refinery] - https://gerrit.wikimedia.org/r/198807 [19:53:57] (CR) Ottomata: [C: 2 V: 2] Add x_cache field to create_webrequest_raw statement [analytics/refinery] - https://gerrit.wikimedia.org/r/198807 (owner: Ottomata) [20:00:05] Quarry: Make "Home" navlink go to profile for logged-in users. - https://phabricator.wikimedia.org/T85175#1142621 (yuvipanda) a:yuvipanda>None [20:01:23] Analytics-Wikimetrics: some non-Latin characters do not show up in uploaded usernames and result in invalid usernames - https://phabricator.wikimedia.org/T93646#1142629 (Fhocutt) NEW [20:02:06] Analytics-Wikimetrics: some non-Latin characters do not show up in uploaded usernames and result in invalid usernames - https://phabricator.wikimedia.org/T93646#1142646 (Fhocutt) [20:23:21] Analytics, Analytics-Cluster, Patch-For-Review: Log the X-Cache header in the webrequest logs - https://phabricator.wikimedia.org/T91749#1142734 (Ottomata) varnishkafka format change deployed. I'll wait a day (or a few hours) to verify that x_cache data is present in new raw webrequest records. [20:24:06] Analytics, Analytics-Cluster, Patch-For-Review: Log the X-Cache header in the webrequest logs - https://phabricator.wikimedia.org/T91749#1095420 (Ottomata) Joseph, once that looks good, we can add this field to the refined table and then change the oozie jobs to use all 3 new fields we are adding. [20:28:17] Analytics-Tech-community-metrics, ECT-March-2015: Instructions to update user data in korma - https://phabricator.wikimedia.org/T88277#1142760 (Qgil) Now we have a JSON file hosted in a private project in Bitbucket containing identities and affiliations, to which I got access. I did a test to fix jforres... 
[20:36:51] Analytics-Wikimetrics: Description of metrics includes link to on-wiki metrics documentation - https://phabricator.wikimedia.org/T93659#1142792 (Fhocutt) NEW [20:40:32] Analytics, Scrum-of-Scrums, Wikipedia-App-Android-App, Wikipedia-App-iOS-App, and 3 others: Avoid cache fragmenting URLs for Share a Fact shares - https://phabricator.wikimedia.org/T90606#1142810 (dr0ptp4kt) Reserved codes now listed at https://www.mediawiki.org/wiki/Provenance [21:47:25] hey ottomata. Bob and I have a question for you. Are you looking into this? https://phabricator.wikimedia.org/T93537 or it's too complicated? [21:48:26] lzia, uhh, i think i will not get to it anytime soon [21:48:40] i mean, convince me? mahout is installed, ya know? [21:49:25] Analytics, Scrum-of-Scrums, Wikipedia-App-Android-App, Wikipedia-App-iOS-App, and 3 others: Avoid cache fragmenting URLs for Share a Fact shares - https://phabricator.wikimedia.org/T90606#1143136 (dr0ptp4kt) The iOS code update was https://gerrit.wikimedia.org/r/#/c/196243/, by the way. [21:49:58] ottomata: so one thing I'm not sure about is why we need a custom debian package. Bob can run it locally on his machine. [21:50:10] rules i guess :/ [21:50:16] leila, you could: [21:50:31] hm, does it compile to a static binary? [21:50:43] maybe you could just compile it on an ubuntu trusty instance, and copy the binary over and try it [21:51:03] i can't just 'install' things [21:51:08] on production boxes [21:51:22] leila: why not mahout? [21:51:32] let's say we have to learn it? [21:51:32] is vowpal rabbit specifically needed? [21:51:40] http://www.quora.com/What-are-the-main-differences-between-Apache-Mahout-Vowpal-Wabbit-in-term-of-prediction-capabilities-implementation [21:51:42] and it will slow us down right now. [21:51:59] mahout is useable in hadoop as is [21:52:00] reading [21:53:49] ottomata, Bob will look into Mahout. If that doesn't work, we will handle it on our end. 
[21:53:59] ok, sorry i can't help more leila :( [21:54:15] no worries. You would if you could. we know that. :-) [22:00:46] nuria: yt? [22:04:24] or milimetric? [22:05:32] hi ottomata [22:06:27] do you have a sense of how many events in eventlogging are spam? [22:06:42] not just bad events from WMF clients with good intentions [22:06:49] but, people hammering event.gif for fun [22:06:53] ? [22:10:26] ottomata: should be very low [22:10:38] because we usually have an explanation for almost every event that doesn't validate [22:10:43] at least, historically it's low [22:10:46] ok [22:10:47] cool [22:10:50] thanks [22:11:06] yeah i'm tailing the processor upstart logs now, and all of the events at least look well intentioned [22:12:06] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143232 (kevinator) BTW the reports Oliver generated are here: http://datasets.wikimedia.org/aggregate-datasets/apps/ The new automated report will append... [22:28:53] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143358 (kevinator) @deskana we're assuming you don't need this data backfilled. We couldn't anyway, the cluster only has 60 days of rolling data and none... [22:36:45] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143381 (Deskana) >>! In T86535#1143232, @kevinator wrote: > BTW the reports Oliver generated are here: > http://datasets.wikimedia.org/aggregate-datasets/a... [22:37:27] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143386 (Deskana) >>! 
In T86535#1143232, @kevinator wrote: > BTW the reports Oliver generated are here: > http://datasets.wikimedia.org/aggregate-datasets/a... [22:44:18] milimetric, are there any research slaves that flowdb is replicated to? I'm trying to investigate https://phabricator.wikimedia.org/T93492 . [22:45:58] superm401, the X1 cluster should have it [22:46:02] x1-analytics-slave.eqiad.wmnet [22:46:40] Ironholds, is that just a DB machine, or something I can ssh into (at least if I have/had rights)? [22:47:02] The former; ssh into stat1002 or stat1003, and then mysql x1-...etc. [22:48:44] Thanks [22:48:54] np :) [22:52:50] Analytics-Wikistats: Wikistats total article count for SV and ID are too high. - https://phabricator.wikimedia.org/T93683#1143413 (ezachte) NEW a:ezachte [22:59:33] Analytics-Wikimetrics: Plain language definitions of Wikimetrics metrics - https://phabricator.wikimedia.org/T93685#1143438 (Capt_Swing) NEW [23:23:44] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143522 (Nuria) >And does this mean that the uniques counting will also start to be appended to these files? The current setup is very suboptimal, I have to... [23:26:25] Analytics-Wikimetrics: Plain language definitions of Wikimetrics metrics - https://phabricator.wikimedia.org/T93685#1143523 (mcruzWMF) Thanks for creating this task, J-Mo! Before thinking about format, channels, I would like to know if there is any application for this content: are we hoping to use a more p... 
[23:26:36] Analytics, Mobile-Web: Update main menu schema to include collections for limn graphs - https://phabricator.wikimedia.org/T93690#1143524 (Jdlrobson) NEW [23:27:08] (PS1) Jdlrobson: Update limn graphs (untested) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/199162 (https://phabricator.wikimedia.org/T93690) [23:28:51] Analytics-Wikimetrics: Plain language definitions of Wikimetrics metrics - https://phabricator.wikimedia.org/T93685#1143536 (Capt_Swing) @mcruzWMF I was thinking that these definitions would be useful if we made them available through a "learn more" link on the individual metric definition pages on Wikimetri... [23:29:01] Analytics-Wikimetrics: Plain language definitions of Wikimetrics metrics - https://phabricator.wikimedia.org/T93685#1143537 (Capt_Swing) p:Triage>Normal [23:38:19] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143571 (Deskana) >>! In T86535#1143522, @Nuria wrote: > Old uniques data should be deleted as we know in some cases is 20% incorrect. Please do not do th... [23:39:39] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143572 (Nuria) >Please do not do that. The mobile apps team relies on this data for its quarterly review. Got it. Please be aware of the precision of one d... [23:40:31] Analytics-Cluster, Analytics-Kanban: Mobile PMs has reports on session-related metrics from Wikipedia Apps - https://phabricator.wikimedia.org/T86535#1143585 (Deskana) >>! In T86535#1143572, @Nuria wrote: >>Please do not do that. The mobile apps team relies on this data for its quarterly review. > Got it.... 
[23:57:49] Analytics-Wikimetrics: Plain language definitions of Wikimetrics metrics - https://phabricator.wikimedia.org/T93685#1143664 (JAnstee_WMF) While it is need of some new metric updates, let's not forget about: https://meta.wikimedia.org/wiki/Grants:Evaluation/Learning_modules/1Wikimetrics_Training_Overview [23:59:10] Analytics-Wikistats, HTTPS: Fix the mixed content issue on Wikimedia Statistics - https://phabricator.wikimedia.org/T93702#1143666 (Chmarkine) NEW