[01:16:02] Analytics, MediaWiki-API, User-bd808: Run ETL for wmf_raw.ActionApi into wmf.action_* aggregate tables - https://phabricator.wikimedia.org/T137321#2679147 (bd808) @Tgr do you have the time and energy to take this task over and finally finish our team goal from Q3 2015/16?
[01:58:09] Analytics-EventLogging: Various EventLogging schemas losing events since around September 8/9 - https://phabricator.wikimedia.org/T146840#2679196 (Tbayer) More evidence that Chrome (possibly also other browsers) was affected across several schemas while Firefox was not: In the Edit schema, the ratio of event...
[08:11:08] Good morning early-ottomata: )
[08:11:27] ottomata: I guess it's last morning for a few weeks when I can do that ;)
[08:11:44] hehe! well i'll be in england for hte next week...but not online :)
[08:11:59] :)
[13:54:19] ottomata: here?
[13:54:28] ottomata: http://blog.cloudera.com/blog/2016/09/apache-spark-2-0-beta-now-available-for-cdh/
[13:54:44] * joal will probably ask for a CDH upgrade soon !
[13:57:27] HI!
[13:57:28] OHHHH
[13:57:29] awesome!
[13:57:37] :D
[13:57:38] did you see the streaming 'sql' stuff in 2.0?
[13:57:52] btw, i am running flink in yuvi's kubernetes in google cloud! :o
[13:57:55] ottomata: I have not no, but from what you say, I think I have an idea :D
[13:57:59] and spark too, madhu is trying to hook up jupyter to it
[13:58:16] ottomata: That's awesome !
[13:58:48] ottomata: I'm actually more and more convinced of Spark + flink is good :)
[14:00:09] spark + flink?
[14:00:25] spark for batch like - flink for stream only
[14:00:30] hm, ja maybe!
[14:01:07] meaning, spark can do streaming, but does it batch way, and flink can do batch, but does it stream way ... See where I come from ?
[14:01:14] aye
[14:01:20] i hadoop i would just support both anyway
[14:01:25] yup
[14:01:29] if we get a streaming cluster up, probably would choose one (flink)
[14:01:33] so ja
[14:03:11] The other thing I thought about is data proximity: Streaming should probably be kafka to kafka mostly, with some acceses to HDFS for small things, but wouldn't be a translator from kafka to HDFS
[14:04:28] aye right
[14:04:33] for kafka to hdfs we use camus or whatever
[14:04:35] but not flink
[14:04:37] agreed
[14:04:58] flink wouldn't be collocated with HDFS, so not really a good guy for heavy write/read
[14:05:23] aye
[14:06:02] But the thing could be: convert webrequests to pageview in flink, write them to kafka, and camus everything once in a while
[14:10:43] :D
[14:10:44] yeah
[14:11:38] ottomata: If we go in that direction, we probably will be needing to strengthen our kafka cluster :)
[14:12:44] heh, maybe! I betcha analytics cluster has a lot of headroom right now, but yeah, maybe!
[14:13:27] ottomata: Let's assume we convert webrequest-raw to webrequest-augmented ... Like double the whole stuff
[14:15:48] just looking at a single broker
[14:16:08] by eyeball looks like we are using around 30% of space average
[14:16:17] maybe 35
[14:16:39] ottomata: so doubling-ish would put quite a strain on the thing
[14:17:17] at about 15% cpu usage currently
[14:17:43] ottomata: hm
[14:19:26] could add a couple servers and sounds like it'd be ok
[14:19:45] maybe we can get other ops to agree to support it full time around the clock and then we can make it tier 1?
[14:20:00] * milimetric ducks
[14:20:37] ottomata, milimetric: we could also revisit server-config, getting more storage-RAM oriented machines, and recycle old ones (when needed)
[14:21:03] milimetric: I finally managed to have a successfull run for enwiki
[14:21:11] * joal joal dances in his small office
[14:23:38] omg!
[14:23:42] you're amazing, joal
[14:27:48] milimetric: now I want need to solve functional things: the stuff is not actually doing exectly what I want :)
[14:28:24] ok, you'll have to tell us what the performance issue was
[14:29:21] milimetric: I'm not even completely sure I can really explain !
[14:29:29] joal: milimetric aye, that might actually be possible
[14:29:38] joal, \o/ \o/
[14:29:50] i think faidon kinda thinks we should have just one giant kafka cluster for stuff anway
[14:29:53] thanks mforns !
[14:30:01] That's a good end of week win :)
[14:30:13] joal: enwiki mw history?!
[14:30:19] yes ottomata
[14:30:25] amaazing
[14:30:27] how long did it take?
[14:30:38] not yet ready, but means the apprach makes sense
[14:32:13] mforns, milimetric: Checking differences with dump data generated on altiscale, (count revisions by page id) - Sounds correct
[14:32:26] awesome
[14:33:22] nice
[14:40:27] ok mforns milimetric: back to making it correct now that I have a proof it should work :)
[14:40:56] sweet, let us know if you need rubber ducks
[14:41:01] mforns, milimetric: Thanks again for your support in making persistent :)
[14:41:37] joal, yea RDing rocks
[15:06:16] Analytics-Cluster, Continuous-Integration-Config: Jenkins tests for analytics/refinery? - https://phabricator.wikimedia.org/T147072#2680133 (hashar)
[15:07:19] Analytics-Cluster, Continuous-Integration-Config: Jenkins tests for analytics/refinery? - https://phabricator.wikimedia.org/T147072#2680039 (hashar) Which repository and Gerrit change? The `analytics/refinery/source.git` repository has a maven job and it should be running: ``` - name: analytics/refine...
[15:07:38] Analytics-Cluster, Continuous-Integration-Config: Jenkins tests for analytics/refinery? - https://phabricator.wikimedia.org/T147072#2680039 (hashar) p:Triage>Normal
[15:09:33] Analytics: Kill limn1 - https://phabricator.wikimedia.org/T146308#2680144 (leila) @schana can you check T146308#2677422 and see if, as far as you know, we care about recommend/ being killed?
[15:15:06] Analytics-Cluster, Continuous-Integration-Config: Jenkins tests for analytics/refinery? - https://phabricator.wikimedia.org/T147072#2680159 (MarcoAurelio) @hashar Please refer to https://gerrit.wikimedia.org/r/#/c/312810/ Thanks.
[15:19:45] Analytics: Kill limn1 - https://phabricator.wikimedia.org/T146308#2680174 (schana) @leila I was unaware that the dashboard even existed. I have no objections to it being killed.
[15:21:25] Analytics: Kill limn1 - https://phabricator.wikimedia.org/T146308#2680187 (leila) @schana, thanks. @Milimetric please feel free to kill recommend/. Thanks for checking with us. :)
[15:34:13] Analytics-Cluster, Continuous-Integration-Config: Jenkins tests for analytics/refinery? - https://phabricator.wikimedia.org/T147072#2680202 (hashar) Ah that is for analytics/refinery.git Looks like it is related to T130123 by @madhuvishy c025b416611a32e1bb02d77e1cb4d6355a9eb6cd: > Add job that allows...
[16:10:18] nuria_: Just triple checked your answer on zero in pageviews: for my example day, of zero-flagged pageviews: 76% mobile-web, 15% desktop, 9% mobile app
[16:10:47] joal: on meeting, can talk in abit
[16:10:52] sure
[16:37:44] joal: back
[16:38:01] heya
[16:38:29] joal: I looked at : https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L164
[16:38:53] off to lunch
[16:39:32] joal: which matched "zero" and wap! to mobile-web
[16:40:32] nuria_: i counted number of pageviews having zero_carrier non null in pageviews hourly grouped by access_method
[16:40:49] zero_carrier comes from x_analytics header,
[16:41:19] joal: ya, i imagined that , i am not sure if those are truly "zero" (as in no charge) pageviews
[16:41:37] joal: but they might be, teh zero domains for sure are
[16:41:39] *the
[16:41:56] nuria_: I wonder what is correct zero: zero in host, or zero in cookie :)
[16:42:03] joal: host
[16:42:20] joal:sporry
[16:42:40] joal: host is how domains are "whitelisted for no charge" i think
[16:42:55] nuria_: I actually have no idea, so I trust you :)
[16:43:10] joal: but it could be that some carriers do not need to whitelist and use our regular domains?
[16:43:22] nuria_: ???
[16:43:41] nuria_: I guess it's a good question for people in the zero team :)
[16:44:02] joal: not sure it exists anymore
[16:44:30] hm
[16:46:07] joal: so reading docs it could be both
[16:46:21] joal: so answer is "mostly mobile"
[16:46:28] joal: it can also be apps
[16:46:30] nuria_: makes sense :)
[16:46:37] joal: will clarify
[16:49:09] thanks nuria_ :)
[16:49:18] joal: done
[17:01:08] a-team: cable battery of laptop started .. ahem.... *smoking* so need to drive to apple store to buy new one
[17:01:18] nossa sinhora
[17:01:44] happened to my wife once too
[17:02:07] definitely take care of that
[17:02:50] and in europe they cost like 99 eur...
[18:11:10] ok, back
[18:18:40] (PS3) Nuria: [WIP] Service Worker to cache locally AQS data [analytics/dashiki] - https://gerrit.wikimedia.org/r/302755 (https://phabricator.wikimedia.org/T138647)
[18:38:33] milimetric, I'm done with T-Z
[18:38:48] should I continue with my original other half?
[18:39:25] Um... take the S, there should be a ton of those
[18:39:34] milimetric, OK :]
[18:41:15] Analytics-Kanban: Some recent ExternalLinksChange data lost - https://phabricator.wikimedia.org/T146815#2680558 (Nuria) ping @Samwalton9 I tried to edit this page in beta: https://en.wikipedia.beta.wmflabs.org/wiki/UserMergedsfxub adding a http://www.google.com link and no event got sent that I can see so...
[18:41:30] Analytics: Some recent ExternalLinksChange data lost - https://phabricator.wikimedia.org/T146815#2680559 (Nuria)
[18:41:57] Analytics: Some recent ExternalLinksChange data lost - https://phabricator.wikimedia.org/T146815#2671929 (Nuria) a:Nuria>None
[19:30:31] Analytics: Some recent ExternalLinksChange data lost - https://phabricator.wikimedia.org/T146815#2680709 (Samwalton9) @Nuria This is currently only running on test wiki. No changes had been made to the Schema since I made the first lot of test edits. There's some changes being made to it this week (by @Legok...
[21:56:25] done for the night yall, see ya
[22:14:15] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 4 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#2681241 (bd808) a:bd808>None Unlicking this soggy old cookie.
[22:16:10] Analytics, MediaWiki-API, User-bd808: Run ETL for wmf_raw.ActionApi into wmf.action_* aggregate tables - https://phabricator.wikimedia.org/T137321#2681243 (bd808) a:bd808>Tgr I think I have successfully ~~conned~~ convinced @Tgr to take this on when he gets some time.