[00:16:13] aharoni : I have a reply from LanguageTool and wikimedia-analytics. So should I compile all the stats in a report and mail it to you?
[00:16:31] i am not done with the installation script yet but pretty close.
[00:17:18] Also, earlier i sent you a link to integrate languagetool in a website, but i gave you the wrong link.
[00:17:50] the required files are in a different branch of my repo.
[01:51:49] Analytics-EventLogging, Analytics-Kanban, WMF-deploy-2015-03-18_(1.25wmf22), WMF-deploy-2015-03-25_(1.25wmf23), WMF-deploy-2015-04-01_(1.25wmf24): Edit Schema module loaded by EL client side is not being updated - https://phabricator.wikimedia.org/T94059#1155835 (Jdforrester-WMF)
[04:21:54] (PS1) KartikMistry: Add 'kn' and 'uk' languages [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/200101
[10:08:47] Analytics-Tech-community-metrics, ECT-March-2015: Migrate Korma functionality into upstream - https://phabricator.wikimedia.org/T92952#1156222 (Qgil) p:Normal>High
[10:09:10] Analytics-Tech-community-metrics, ECT-March-2015: Migrate Korma identitites database to SortingHat - https://phabricator.wikimedia.org/T92953#1156224 (Qgil) p:Normal>High
[10:11:01] Analytics-Tech-community-metrics, ECT-April-2015: Provide list of oldest open Gerrit changesets without code review - https://phabricator.wikimedia.org/T94035#1156227 (Nemo_bis) I thought Quim proposed to list the changesets which didn't get any comment at all. CR = 0 is something else: comments without C...
[10:56:41] Analytics-Tech-community-metrics, ECT-April-2015: Provide list of oldest open Gerrit changesets without code review - https://phabricator.wikimedia.org/T94035#1156278 (Qgil) Just a question, if a changeset is on -1 but then a new upload is made, then it is considered 0, right? If so, we should list only t...
[11:20:17] (PS2) Mforns: [WIP] Add Apps session metrics job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/199935 (https://phabricator.wikimedia.org/T86535)
[11:32:16] (CR) KartikMistry: [C: 2] Add 'kn' and 'uk' languages [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/200101 (owner: KartikMistry)
[11:32:21] (Merged) jenkins-bot: Add 'kn' and 'uk' languages [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/200101 (owner: KartikMistry)
[11:38:20] kart_: are you going to change the files for that patch ^ ?
[11:44:25] kart_: I got nervous so I changed them
[11:44:51] milimetric: thanks!
[11:44:57] milimetric: if you've done it :)
[11:45:05] yeah, I just did
[11:45:17] milimetric: was about to ping about it. Do you've time next week on teaching me 'how-to'
[11:45:27] milimetric: or is it documented somewhere.
[11:45:42] i documented it in the tickets you guys opened last time when it didn't work
[11:45:45] but here, it's easy
[11:45:57] cd /a/limn-public-data/language/datafiles
[11:46:21] vim daily*tsv
[11:46:43] change the header the way that you changed the select statement (same order, make sure not to expand tabs into spaces)
[11:46:47] save, done
[11:47:19] you can put that wherever it makes sense in your docs, the README for that project might be a good place
[11:47:25] or like a comment in those SQL files
[11:52:00] Analytics: Referrer data for en:Glitter for shareafact test - https://phabricator.wikimedia.org/T93270#1156459 (Dereckson) If a task is handled by X, please associate the project X, or if really this would be incoherent the team instead. That helps to avoid lost requests and allows Analytics team to track it.
[11:57:32] kart_: I answered above, forgot to ping ^
[12:01:48] milimetric: cool.
[12:01:57] milimetric: me too. distracted a bit :)
[13:05:12] Analytics-Tech-community-metrics, ECT-April-2015, ECT-March-2015: Key performance indicator: Gerrit review queue - https://phabricator.wikimedia.org/T39463#1156606 (Qgil)
[14:16:07] Analytics-EventLogging, Analytics-Kanban: Upgrade box for EventLogging (vanadium) - https://phabricator.wikimedia.org/T90363#1156789 (ggellerman) a:Ottomata
[14:32:42] mforns: when do you have time to talk over the reports?
[14:32:54] milimetric, now?
[14:33:34] sure, batcave
[14:34:07] ok
[14:42:40] oh milimetric sorry
[14:42:45] thought you were talking to marcel
[14:43:00] check out /tmp/00000.smoosh
[14:43:00] np
[14:43:02] i extracted it there
[14:43:03] yeah, binary
[14:52:38] ottomata, milimetric: please take a look at new patch
[14:52:41] https://gerrit.wikimedia.org/r/#/c/196009/
[14:52:48] for last-access
[15:01:55] nuria: looks good to me, but i haven't been following that recently
[15:02:57] ottomata: that's ok, as long as there are no obvious mistakes, i tested locally but outside puppet so i have to be moving the code back and forth
[15:03:03] ottomata: thank you
[15:03:49] (PS1) Milimetric: [WIP] adding wikitext editor data to queries [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/200159
[15:22:36] nuria: reviewed
[15:27:21] milimetric: is it good idea to add steps you explained for limn dashboard in README of limn-language-data
[15:27:40] milimetric: I'll do that now if you're okay :)
[15:28:10] kart_: it's your repo man, go for it :)
[15:32:04] mforns / ottomata: meeting https://plus.google.com/hangouts/_/wikimedia.org/analytics?authuser=0
[15:32:09] (ignore the authuser)
[15:32:32] unnnnnghhhhhh i guess
[15:33:42] Analytics-EventLogging, Analytics-Kanban, WMF-deploy-2015-03-18_(1.25wmf22), WMF-deploy-2015-03-25_(1.25wmf23), WMF-deploy-2015-04-01_(1.25wmf24): Edit Schema module loaded by EL client side is not being updated - https://phabricator.wikimedia.org/T94059#1156981 (Krenair) Are the events going int...
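For a README, the header edit described at 11:45-11:46 (open the `daily*tsv` files, change the header to match the new select statement, keep real tabs) could also be scripted. The following is only a sketch under assumptions: the function names are hypothetical, the column list passed in is whatever the updated SQL select produces, and it rewrites nothing except the first line of each file.

```python
from pathlib import Path


def rewrite_tsv_header(path, columns):
    """Replace the first line of a TSV file with tab-joined column names."""
    path = Path(path)
    lines = path.read_text(encoding="utf-8").split("\n")
    # Join with real tab characters; spaces here would break the TSV.
    lines[0] = "\t".join(columns)
    path.write_text("\n".join(lines), encoding="utf-8")


def rewrite_all(directory, pattern, columns):
    """Apply the same header to every file matching pattern.

    Returns the names of the files that were rewritten.
    """
    touched = []
    for tsv in sorted(Path(directory).glob(pattern)):
        rewrite_tsv_header(tsv, columns)
        touched.append(tsv.name)
    return touched
```

A call might look like `rewrite_all("/a/limn-public-data/language/datafiles", "daily*tsv", columns)`, where `columns` lists the fields in the same order as the select statement. Existing data rows are left as they are, so rows written before a column was added will still be shorter than the header.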
[15:48:38] (PS1) KartikMistry: Added README with new language addition how-to [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/200165
[16:10:45] Analytics: Referrer data for en:Glitter for shareafact test - https://phabricator.wikimedia.org/T93270#1157104 (leila) @Dereckson, this is not an Analytics task, it's a Research-and-Data task but Research-and-Data has not moved to phabricator, yet. I'd like to keep track of my tasks here. We can tag it as An...
[17:09:25] milimetric: i'm not so sure how useful this mysql binlog thing is
[17:09:31] here is an insert message
[17:09:32] https://gist.github.com/ottomata/bdf99fefff50c156b016
[17:09:42] this corresponded to
[17:09:56] INSERT INTO otto.t1 (f1, f2) VALUES (1, a)
[17:10:02] INSERT INTO otto.t1 (f1, f2) VALUES (1, 'a')
[17:10:29] it seems possible to somehow encode writes to specific tables into more specific avro schemas
[17:10:51] buuuut, there isn't much info on how to do so, and i think the 'schema registry' this thing integrates with is not the same one that confluent is pushing now
[17:16:27] ottomata: let me eat lunch, but that json looks as delicious as my burget
[17:16:30] *burger
[17:17:22] I'm only worried about whether or not this thing's reliable. If it is, we can figure out some way to transform it, even if doing it via Avro first is not ideal
[17:17:27] Analytics-Kanban, Analytics-Wikimetrics, Patch-For-Review: Get a measure of daily usage of wikimetrics by userbase - https://phabricator.wikimedia.org/T94193#1157336 (Nuria)
[17:18:46] mforns: how have you been running your spark code? in 1002?
[17:31:27] nuria: afaik, he has been doing
[17:31:28] spark-shell
[17:31:30] on stat1002
[17:31:43] spark-shell --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar
[17:32:01] nuria, sorry I was on 1x1 with Toby
[17:33:11] nuria, just execute: spark-shell --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar
[17:34:21] nuria, however, right now it won't work as is, because it has the form of a job, an object.
[17:35:00] nuria, so probably the last changeset won't work, but you can test it with andrew's gist: https://gist.github.com/ottomata/2025b974b1a65c747bab
[17:35:10] oh
[17:35:19] you could do that with spark-submit instead then
[17:35:27] mforns: link to your change again please? i lost it
[17:35:32] or add me as reviewer
[17:35:42] ottomata, https://gerrit.wikimedia.org/r/#/c/199935/
[17:35:55] danke
[17:36:33] ottomata, so can I pass the job code to spark-submit?
[17:38:04] mforns: if you are still testing with that single small file
[17:38:05] then
[17:38:16] mforns: ok, i am going to try to understand your quantile e-mail and review some scala that i have not used at all in couple years
[17:38:31] spark-submit --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar
[17:38:32] i think
[17:38:38] oh
[17:38:38] no
[17:38:39] more
[17:39:16] ottomata, ok, let me know if I got something wrong regarding quantiles, or you have any questions on my comments
[17:39:22] spark-submit --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar your-jar.jar --class AppSessionMetrics
[17:39:23] i think
[17:39:34] aha
[17:39:47] mforns: re quantiles, you know more than I do.
[17:40:09] i was just looking for a quantiles solution, fabian pointed me to algebird qtrees, and i figured out how to make it work with spark
[17:40:11] that's all
[17:40:36] xD you did more in 30mins that I in 1 day :], hehehe
[18:14:35] nuria, yt?
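One detail worth noting about the `spark-submit` line suggested at 17:39:22 ("i think"): `spark-submit` expects its options before the application jar, and everything after the jar is handed to the application as arguments, so `--class` placed after `your-jar.jar` would not be read as an option. The sketch below builds the argument list in that order. The helper name is hypothetical, `your-jar.jar` and `AppSessionMetrics` are the placeholders from the chat, and a class inside a package would presumably need its fully qualified name.

```python
import shlex


def spark_submit_command(app_jar, main_class, extra_jars, app_args=()):
    """Build a spark-submit argv with all options placed before the app jar."""
    argv = ["spark-submit", "--class", main_class]
    if extra_jars:
        # --jars takes a single comma-separated value.
        argv += ["--jars", ",".join(extra_jars)]
    # The application jar comes after the options; anything after it
    # is passed through to the application's main method.
    argv.append(app_jar)
    argv.extend(app_args)
    return argv


cmd = spark_submit_command(
    app_jar="your-jar.jar",  # placeholder name from the chat
    main_class="AppSessionMetrics",  # class name as given in the chat
    extra_jars=[
        "/home/otto/algebird-core_2.10-0.9.0.jar",
        "/home/mforns/refinery-core-0.0.9.jar",
    ],
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Returning a list rather than a string means the same value can be passed straight to `subprocess.run` without shell quoting concerns.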
[18:39:53] milimetric, I think I did not understand the needed changes with the tsv's headers... can we talk about it again?
[18:40:14] yes, to the batcave!
[18:41:05] mforns: ^
[18:41:12] ok!
[18:54:56] mforns: I think these queries might not be performant enough for us to run 5000 of them every hour
[18:55:02] we might need to do it daily
[18:55:08] aha
[18:55:29] that kind of makes sense from a data correctness point of view as well
[18:55:41] for the weirder graphs like sunbursts
[18:56:03] milimetric, aha, yes to avoid cutting sessions
[18:56:03] so I was wrong initially to say daily resolution -> hourly frequency, sorry about that
[18:56:33] oh, sorry what I said made no sense
[18:56:39] ok
[18:56:49] well, the query makes sure the sessions stay intact. But displaying just the morning sessions in the same shape as the all-day sessions would be more misleading than displaying the latest point on a timeseries graph
[18:57:01] milimetric, so we need to change also the frequency of the reportupdater, right?
[18:57:07] because if you look at the timeseries graph you see that what it's doing is probably still calculating the day's totals and it's "in progress"
[18:57:21] but when you look at the sunburst, the only indicator you have is the "number of sessions" hover text
[18:57:25] so it'd be more hidden
[18:57:29] aha
[18:57:47] mforns: yeah, so I think it's great that you broke out frequency and resolution separately
[18:57:55] so in our case, all our reports should have both set to daily
[18:58:20] and that's mostly for performance. Then, later when we have a better analytics store where we run these queries, maybe we can think about running it more frequently
[18:58:25] ok, I'll recheck if in fact frequency and granularity are totally independent
[18:58:41] and enable the granularity parameter in the config
[19:34:43] milimetric: figured out how to use rcstream yet?
[19:34:49] trying to do this in python
[19:34:50] https://wikitech.wikimedia.org/wiki/RCStream
[19:34:51] getting 404
[19:34:58] http://stream.wikimedia.org/rc
[19:35:02] no, I'm struggling with the wikitext queries, some of the data's weird
[19:35:38] makes sense that you'd get 404 if you just go to it
[19:35:50] I can try it quickly on my druid prototype page, 'cause that has socket io already loaded
[19:37:12] ottomata: they don't have CORS set up so I'm not sure how this would work except if you're on a wikimedia domain
[19:37:35] oh, but server side that shouldn't matter, one sec
[19:38:29] oh
[19:38:36] hm, yeah, i'm just trying to run it from labs
[19:38:46] in python
[19:39:05] In [10]: socketIO = socketIO_client.SocketIO('stream.wikimedia.org', 80)
[19:39:05] WARNING:root:stream.wikimedia.org:80/socket.io [waiting for connection] unexpected status code (404)
[19:40:17] seems up http://stream.wikimedia.org/rcstream_status
[19:41:38] oh I'm sure it's up, we just don't know how to connect to it
[19:41:47] hm, node doesn't seem to get anything from it either
[19:41:55] debugging now
[19:42:02] from server I see: 127.0.0.1 - - [2015-03-27 19:41:50] "GET /socket.io/?EIO=3&transport=polling&t=1427485310856-0 HTTP/1.1" 404 96 0.000157
[19:42:12] (that's the only useful thing I know about rcstream :)
[19:42:42] thanks chasemp, do you know who we should ask?
[19:42:58] ori I would guess
[19:43:04] aye
[19:54:15] ottomata: it looks like it's actually a 503 if you use either stream.wikimedia.org or https://stream.wikimedia.org
[19:54:31] so the 404 I think is just because http is not supported? maybe?
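The server-side request line quoted at 19:42:02 carries a hint about what the client was attempting. The sketch below only pulls that request apart with the standard library; the function name is hypothetical. As far as I know, an `EIO` query parameter is what newer, Engine.IO-based Socket.IO clients send, while the older 0.9-style protocol handshakes under a `/socket.io/1/` path, so a client/server protocol mismatch is one possible reading of the 404. That interpretation is an assumption and is not confirmed anywhere in the chat up to this point.

```python
from urllib.parse import parse_qs, urlsplit


def describe_handshake(request_line):
    """Extract the path and Engine.IO parameters from an access-log request line."""
    method, target, _http_version = request_line.split(" ")
    parts = urlsplit(target)
    query = parse_qs(parts.query)
    return {
        "method": method,
        "path": parts.path,
        # None when the parameter is absent from the query string.
        "eio": query.get("EIO", [None])[0],
        "transport": query.get("transport", [None])[0],
    }


# Request line copied from the log message at 19:42:02.
info = describe_handshake(
    "GET /socket.io/?EIO=3&transport=polling&t=1427485310856-0 HTTP/1.1"
)
print(info)
```

Running the same function over a request line from a client that does connect would show whether the two differ in path or in the presence of `EIO`.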
[19:55:19] certainly doesn't seem good if nobody's connected: http://stream.wikimedia.org/rcstream_status
[19:55:19] milimetric: mutante pointed me to https://phabricator.wikimedia.org/T91393
[19:55:30] i installed socketIO_client 0.5.5 and it is better
[19:55:32] still not working, but not 404
[19:56:17] Analytics-EventLogging, Analytics-Kanban, WMF-deploy-2015-03-18_(1.25wmf22), WMF-deploy-2015-03-25_(1.25wmf23), WMF-deploy-2015-04-01_(1.25wmf24): Edit Schema module loaded by EL client side is not being updated - https://phabricator.wikimedia.org/T94059#1157843 (Milimetric) Yes! Thanks very muc...
[19:56:52] not working in that I don't see anything yet...
[19:56:55] it looks like it is working
[19:57:08] i have a socket printing every event so it's easier to see what's going on
[19:57:11] i'll try downgrading too
[19:58:26] i'm trying to do that, how are you doing that?
[19:59:21] ottomata: batcave?
[19:59:25] ja
[20:01:09] WARNING:socketIO_client:stream.wikimedia.org:80/socket.io/1: [packet error] unhandled namespace path ()
[20:20:09] hey ottomata
[20:20:32] ottomata!!!!
[20:20:40] (he's in the batcave with me and can't hear me)
[21:16:54] (PS1) Mforns: [WIP] Add support for wiki explosion and others. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/200239
[21:18:49] (CR) Mforns: "First patch changes csv format to tsv." [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/200239 (owner: Mforns)
[21:20:53] Analytics-Wikimetrics: Wikimetrics backup has no monitoring - https://phabricator.wikimedia.org/T71397#1158262 (Dzahn) Would you be interested in having this added to Operations Icinga?
[21:28:22] have a nice weekend, y'all!
[22:46:33] milimetric: Need anything more on https://gerrit.wikimedia.org/r/#/c/195895/ ?
[22:46:41] (CR/whatever)
[22:47:21] James_F: I had put that on hold to get done with the dashboards, now I'm focused on analyzing the new wikitext data
[22:47:30] milimetric: Sure, no worries.
[22:47:36] (going well so far, pretty straightforward translation from the ve metrics)
[22:47:44] milimetric: Just didn't want you to be blocked on anything.
[22:47:48] Cool.
[22:48:08] thanks I'm not blocked, but I'm still not sure if the user ids and user class properties were being captured correctly
[22:48:24] last I looked there were still a ton of anonymous folks as recorded in the events
[22:48:52] but now it's better because we can compare with wikitext too and maybe better track down the problem.
[22:49:08] I'm at a conference until next Thursday, but after that we'll show you guys the dashboards hopefully
[22:49:10] and we can take it from there
[23:00:32] Analytics-EventLogging, Analytics-Kanban: Cron collects Visual Editor deployments [8 pts] {lion} - https://phabricator.wikimedia.org/T89253#1158711 (Milimetric)
[23:02:47] Analytics-EventLogging, Analytics-Kanban, VisualEditor: Wikitext events need to be sampled {lion} - https://phabricator.wikimedia.org/T93201#1158753 (kevinator)
[23:08:25] (CR) Nuria: Make non-Latin characters display in json reports - WIP (1 comment) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/199814 (https://phabricator.wikimedia.org/T93023) (owner: Fhocutt)
[23:42:54] Analytics-Tech-community-metrics, ECT-March-2015: Tech metrics should talk about "Affiliation" instead of organizations or companies - https://phabricator.wikimedia.org/T62091#1158915 (Aklapper) I gave this a basic shot in https://github.com/Bitergia/mediawiki-dashboard/pull/56 Might not be perfect but a...