[00:01:08] Analytics: MobileWikiAppDailyStats should not count Googlebot - https://phabricator.wikimedia.org/T117631#1784254 (Nuria) > The explanation seems to be that Google periodically runs our app in an automated fashion, in order to evaluate the similarity of how the content is presented in the app versus mobile w... [00:11:30] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, and 2 others: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1784279 (awight) I think we're prepared to make this change now. The sample rate is parsed out of the filenames... [00:45:33] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1784397 (JanZerebecki) An additional graphite as an additional grafana backend should work. It seems one can create a graph with 2 Metric Queries where... [02:23:52] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1784606 (GWicke) @faidon: Until very recently (last days), there wasn't actually any REST proxy with schema validation in the EventLogging repository. @ottomata now has [a patch... [03:41:55] hmm, perhaps i'm just doing something wrong. but i can only seem to do simple queries in beeline that dont require map/reduce. If it needs map/reduce i get errors about not having permissions as the anonymous user. I can run the same query in the `hive` cli though [03:42:08] using: beeline -u jdbc:hive2://analytics1027.eqiad.wmnet:10000 --outputFormat=vertical [06:46:18] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, operations, vm-requests, iOS-5-app-production: Request one server to suport piwik analytics - https://phabricator.wikimedia.org/T116312#1784783 (Joe) In my experience handling out 3 million events/day to a piwik installation means sounding th... [09:35:19] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1784976 (Addshore) [09:37:38] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, operations, vm-requests, iOS-5-app-production: Request one server to suport piwik analytics - https://phabricator.wikimedia.org/T116312#1784983 (mark) This is clearly a system for analytics. Will it be implemented, maintained and supported by... [09:56:18] Hi mobrovac [09:56:28] mobrovac: want to go for some hive ? [09:57:58] hi joal [09:58:32] joal: i'm currently indisposed, could we do it in an hour, hour and a half? [09:59:16] hm not easy for me at that time - 3pm CET would work for you ? [09:59:26] mobrovac: --^ [09:59:42] sure joal! [10:00:04] Just sent an invite not to forget :) [10:00:35] mobrovac: See you then :) [10:00:59] hehe gr8 joal! [10:41:54] ebernhardson: About beeline: -n [11:01:22] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1785165 (fgiunchedi) TBH I'm not sure why you'd need a separate graphite instance in analytics cluster for this, can you elaborate? [11:47:05] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1785236 (Addshore) Here are some bits of one of the discussions I had in IRC. > 3:20 PM <_joe_> addshore: whatever we want to keep "forever", should... [13:04:01] hey dcausse ! 
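[Editor's note] ebernhardson's beeline problem above (map/reduce queries failing as the "anonymous" user) lines up with joal's later "-n" hint, and further down the log ebernhardson confirms "beeline with -n did the trick". A minimal sketch of the fixed invocation, reusing the connection string from the log; passing the shell user via -n is the only addition:

    # Pass an explicit username so HiveServer2 doesn't submit MapReduce jobs as "anonymous"
    beeline -u jdbc:hive2://analytics1027.eqiad.wmnet:10000 \
            -n "$USER" \
            --outputFormat=vertical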
[13:04:04] Almost in time :) [13:09:07] joal: hi! [13:09:28] I'm in the meeting in the invite :) [13:09:35] dcausse: --^ [14:01:19] joal: ping? :P [14:01:29] pong mobrovac :) [14:01:38] oh la la [14:01:43] k so [14:01:44] huhu [14:02:28] to recap, i need the first check box from https://phabricator.wikimedia.org/T117429 [14:02:44] so, the list of IP/UA that hit a specific RB URI [14:03:09] ok mobrovac [14:03:14] mobrovac: you used to hive ? [14:03:29] nope [14:03:43] :) [14:03:50] first, do you have access ? [14:04:15] damn [14:04:17] probably not [14:04:20] :( [14:04:28] try it: stat1002 -- hive [14:04:35] * mobrovac checking [14:05:05] joal: access denied [14:05:10] mwarf [14:05:18] my thought exactly :) [14:05:18] So the first thing is to ask for acces [14:05:44] will create a ticket [14:05:58] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive [14:06:37] in the meantime, would it be possible for you to get me that data? [14:06:42] sure :) [14:06:53] the access req is not going to come through before monday evening [14:06:57] cool [14:06:59] So, IP, raw user agent, number of hits [14:07:03] right ? [14:07:07] yup [14:07:14] the uris are: [14:07:58] {domain}/api/rest_v1/page/mobile-html/{title} [14:08:02] mobrovac: I assume we will only have to parse text webrequests [14:08:08] {domain}/api/rest_v1/page/mobile-html-sections/{title} [14:08:14] {domain}/api/rest_v1/page/mobile-html-sections-lead/{title} [14:08:24] {domain}/api/rest_v1/page/mobile-html-sections-remaining/{title} [14:08:40] joal: you mean varnish text web reqs? [14:08:49] yessir [14:08:59] text and mobile or text only ? [14:09:14] ah good question [14:09:21] mobile shouldn't have any /me thinks [14:09:42] that's also my guess: the restbase stat we compute for you is using text only [14:09:45] joal: does it make a diff to you if both are included? that would probably be wise to od [14:09:50] yup [14:09:55] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1785430 (fgiunchedi) >>! In T117732#1785236, @Addshore wrote: > @fgiunchedi Here are some bits of one of the discussions I had in IRC. > > >> 3:20 PM... [14:09:59] mobrovac: more data to scan [14:10:13] joal: kk, let's go with text only [14:10:28] oh yeah right, rb isn't probably used at all now in mobile varnish [14:10:30] ok cool mobrovac [14:10:51] I don't know about varnish, only about our side of it :) [14:10:55] That's why I ask :) [14:11:08] About time range: how long and when ? [14:11:12] mobrovac: --^ [14:11:35] last 7 days is enough joal [14:11:45] mobrovac: that's a lot already :) [14:12:03] joal: oh i see, let's start with 3 days then [14:12:18] mon, tue, wed [14:12:30] 1 day of refined text data (to scan), is about 500Gb :) [14:13:06] 2015-11-0[234] [14:14:13] mamma mia [14:14:33] mobrovac: also about the uris: {domain} would be the uri host, and /api/rest... the uri path, correct ? [14:14:46] mobrovac: That's we do that in parallel :) [14:15:06] correct joal even though i expect that only en.wp.org will contain data [14:15:16] mobrovac: should, indeed :) [14:15:34] Are you interested to know (group by) domain [14:15:35] ? [14:15:54] And {title} ? [14:16:02] joal: no need, let's KISS :) [14:16:06] or title I am pretty sure is a no, but domain ? [14:16:19] no no, jsut the aggregate is fine for now [14:16:22] KISS ? 
My english anagrams are bad :( [14:16:33] keep it simple stupid :) [14:16:48] * joal really feels stupid :) [14:16:53] naah [14:17:12] I mean, domain is easy to grab if you want [14:17:18] you tell me :) [14:17:45] basically, we want to rename the URIs so i jsut need to know if i need to contact anybody [14:17:57] ok mobrovac, makes sense [14:18:32] (PS1) DCausse: [WIP] Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 [14:19:19] mobrovac: testing a request on one hour [14:20:26] cool thnx joal [14:21:17] mobrovac: works, no result on that hour (nov 2nd, hour 1 [14:21:24] Launching on 3 days [14:22:21] A bit more crunching to go, mobrovac :) [14:23:35] :) [14:36:16] mobrovac: https://gist.github.com/jobar/c265a3582e28c3cb5405 [14:36:35] mobrovac: I added the domain finally :) [14:36:53] already done joal? [14:36:54] wow [14:36:55] thnx! [14:37:09] joal: should delete that [14:37:13] gist [14:37:31] mobrovac: tell me when you have it [14:38:01] joal: that's a cumulative for all of the uris i gave you? [14:38:07] right mobrovac [14:38:12] kk [14:38:17] lemme c/p that so you can delete it [14:38:21] yup [14:38:45] also mobrovac, just as info : Total MapReduce CPU Time Spent: 1 days 11 hours 32 minutes 8 seconds 70 msec [14:39:01] wow :) [14:39:02] heheh [14:39:11] joal: k, got it, you can delete the gist [14:39:24] (PS2) DCausse: [WIP] Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 [14:39:29] thx mobrovac [14:39:35] done hashar [14:39:42] :-} [14:48:50] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1785475 (Ottomata) > Until very recently (last days), there wasn't actually an EventBus-like REST proxy with schema validation in the EventLogging repository. Not quite true, th... [14:51:04] hey milimetric, yt? [14:52:01] hey, yes [14:52:57] ottomata: ^ [14:54:03] got a sec to walk though el stuff? [14:54:45] milimetric: ^? [14:55:02] uh... [14:55:05] sorry not right now [14:55:11] after standup before staff? [14:55:34] haha, in those 5 mins? we can do after staf [14:55:40] backlog grooming [14:55:45] and then i'm hitting the road [14:55:47] well later today is fine [14:55:49] oh [14:55:50] s'ok [14:55:52] welp, tomorrow then! [14:56:01] wanted to walk though some stuff before I wrote a bunch of tests, but no biggy [14:56:04] tomorrow i'm back to normal [14:56:09] if it can wait that'll be cool [14:56:12] ya will be fine [14:56:35] joal: we can deploy anytime but I'm not around for the next couple of hours [14:56:53] (the code was merged and the gerrit mirror synced) [14:57:17] I'll be back in 1-2 hours [14:59:37] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, and 2 others: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1785481 (Jgreen) > Maybe we can doctor the last old files and the first new files by hand, so that they splice n... [15:03:03] dcausse: I have some more time now if you want :) [15:03:13] joal: great! [15:03:17] same hangout? [15:03:32] sure [15:18:49] Analytics-Tech-community-metrics, DevRel-November-2015: Key performance indicator: Top contributors - https://phabricator.wikimedia.org/T64221#1785541 (Aklapper) >>! In T64221#1781654, @Qgil wrote: > These two empty sections can be removed. https://github.com/Bitergia/mediawiki-dashboard/pull/70 (togethe... 
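[Editor's note] The gist joal shares above was deleted, so the exact query isn't preserved in the log. Below is a hedged reconstruction of what it likely looked like, based only on what the two agree on: text webrequests, 2015-11-02 through 04, the four mobile-html URI prefixes, grouped by domain/IP/user agent. The column and partition names (uri_host, uri_path, ip, user_agent, webrequest_source, year/month/day) are assumed from the refined wmf.webrequest table and may not match the original query exactly:

    # Hypothetical reconstruction -- run from stat1002; field names assumed from wmf.webrequest
    hive -e "
      SELECT uri_host AS domain, ip, user_agent, COUNT(*) AS hits
      FROM wmf.webrequest
      WHERE webrequest_source = 'text'
        AND year = 2015 AND month = 11 AND day IN (2, 3, 4)
        AND uri_path RLIKE '^/api/rest_v1/page/mobile-html(-sections(-lead|-remaining)?)?/'
      GROUP BY uri_host, ip, user_agent;
    "

The "Total MapReduce CPU Time Spent: 1 days 11 hours 32 minutes" figure joal quotes is a reminder that the day-range restriction in the WHERE clause is what keeps a scan like this affordable.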
[15:28:31] Analytics-Tech-community-metrics, DevRel-November-2015: Many profiles on profile.html do not display identity's name though data is available - https://phabricator.wikimedia.org/T117871#1785578 (Aklapper) NEW [15:41:58] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, and 2 others: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1785592 (Pcoombe) @awight Sounds like it would be safest to just take campaigns down if it's only for a short wi... [15:44:46] (PS3) DCausse: [WIP] Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 [15:51:09] (CR) Nuria: "Just a note to make sure this is tested on cluster before it is merged, please let us know if you need help with that. https://wikitech.wi" [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (owner: DCausse) [15:53:52] nuria, ottomata : there is weirdness in timestamps for Cirrus imported logs [15:54:11] oh? [15:54:40] Camus job runs every hour at minnute 15 [15:55:03] And load 1 file per partition for every hour at that precise time [15:55:06] Sounds weird to me [15:55:39] I would expect camus load files in current hour (for the first 15 minutes), then wait for the next run to complete the hourb [15:55:44] ottomata: --^ [15:56:32] joal: ja, sounds like timestamps are not being read out of content properly [15:56:38] and it is defaulting to current time [15:56:43] makes sense :) [15:56:50] That's what I guessed as well [15:59:51] (PS4) DCausse: Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 [16:00:21] (CR) Nuria: [C: -1] Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (owner: DCausse) [16:01:02] joal: reading log sorry i was CR the cirrus code [16:01:33] np nuria, I helped dcausse to have it right, so it should be ok :) [16:01:45] joal: but schema is duplicated [16:01:57] yes m'dame [16:02:06] joal: it should not be, right? [16:02:13] joal: otherwise schema is in three places [16:02:27] joal: php, avro bindings and oozie job [16:02:27] I don't know about avro/schema stuff, and I have no idea as to how to handle / work that issue [16:03:06] joal: i see, i think that the schema here: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-camus/src/main/avro/CirrusSearchRequestSet.avsc [16:03:07] Analytics-Kanban: Investigate / Correct timestamp not being properly read from Camus for CirrusSearchRequestSet logs. - https://phabricator.wikimedia.org/T117873#1785617 (JAllemandou) NEW [16:03:26] joal: just needs to be deployed to an accesible path (and also be part of the jar) [16:03:38] nuria: I can remove the schema but we need it to create the table, and refinery-source is not deployed so I don't know how to have access to this file on stat servers... [16:03:47] right nuria -- I wonder if hive can read schema as part of a jar ... 
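[Editor's note] On joal's closing question ("can hive read schema as part of a jar"): the AvroSerDe page nuria links lets a table take its columns from an external .avsc via the avro.schema.url table property, and dcausse confirms further down (at 16:44) that a jar:file:...!... URL of this shape does work. A minimal sketch of such a DDL; the database, LOCATION, partition layout and jar path are placeholders for illustration, not the actual refinery DDL:

    # Sketch: Avro-backed Hive table whose columns come from an external .avsc
    # (LOCATION, partition keys and the jar path below are assumptions, not real refinery values)
    hive --database dcausse -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS CirrusSearchRequestSet
      PARTITIONED BY (year INT, month INT, day INT, hour INT)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
      STORED AS
        INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
        OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
      LOCATION '/tmp/dcausse/CirrusSearchRequestSet'
      TBLPROPERTIES ('avro.schema.url'='jar:file:/path/to/refinery-camus.jar!/CirrusSearchRequestSet.avsc');
    "

Because the serde reads the column list from the schema URL, no column definitions are repeated in the DDL, which is the point of avoiding the duplication discussed above.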
[16:03:52] I don't know [16:04:23] dcausse: in order to test your code you can always deploy schema to /tmp/dcausse on hadoop, that allows you to do testing [16:04:25] dcausse: created a task about the camus timestamp issue - https://phabricator.wikimedia.org/T117873#1785617 [16:04:37] joal: thanks [16:04:39] dcausse: you need to test your code that way anyways [16:04:48] nuria: ok [16:05:13] dcausse: that way you can create table in your own database when testing (as when testing as your user you cannot create tables on main wmf db) [16:05:17] dcausse: makes sense? [16:05:29] nuria: yes, thanks [16:06:24] ? i thought the latest avro schema got deployed with refinery [16:06:27] and camus would just use that [16:06:32] now maybe joal and i can figure out how to deploy the current schema to a known path on hive and make it accesible there, let me 1st serach as joal said whether you can reference it as part of a jar [16:06:34] did we deploy the change with the timestamp field? [16:06:37] hive? [16:06:40] ottomata: i did [16:06:40] why does hive need it? [16:06:47] to create table, see: [16:07:02] https://gerrit.wikimedia.org/r/#/c/251238/3/hive/mediawiki/cirrus-searchrequest-set/create_CirrusSearchRequestSet_table.hql ottomata [16:07:55] ottomata: makes sense? the table creation statement uses it [16:08:48] HMMMMMMMMMMM [16:08:52] yes ok [16:09:09] hm, but nuria that is only needed at create table time, right? [16:09:09] joal, ottomata also see this: https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-SpecifyingtheAvroschemaforatable [16:09:16] and that create table statement is run manually [16:09:27] it recomends against specifying avro schema explicitily [16:09:52] without replication [16:10:04] wihtout replication? [16:10:04] as it is an url we should be able to use a path like jar:file:/path/to/ext.jar!/src/main/avro/schema.avsc maybe? [16:10:28] yeah,i would thikn so too dcausse, btw, your schema shoudl be in one of the already depoyred refinery jars [16:10:48] ok I'll try [16:11:21] not sure which one, nuria? refinery-camus? [16:11:32] ottomata: yes, it is here: [16:11:38] https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-camus/src/main/avro/CirrusSearchRequestSet.avsc [16:11:58] hmm, the class is in the .jar [16:12:02] maybe not the avsc [16:12:02] dcausse: ah , nice, if that works our issues are solved cc joal [16:12:03] hm [16:12:19] you could get the schema from the class if you were coding java..not sure about hive [16:12:52] i think is worth trying the " jar:file:/path/to/ext.jar!/src/main/avro/schema.avsc " [16:14:10] right, but the avsc is not in the jar [16:14:26] i think we'd have to tell maven to include it somehow when packaging [16:14:31] ottomata: correct, just unpacked the thing and double checked - no avsc [16:14:42] ottomata: ahhh, it should be then [16:15:15] ottomata: i thought it was, likely we are just packing java files (and no test) into jar [16:15:23] right [16:18:17] maybe the cause for this timestamp issue? [16:18:28] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Feed Wikistats traffic reports with aggregated hive data {lama} [21 pts] - https://phabricator.wikimedia.org/T114379#1785714 (ezachte) @Milimetric do these new counts for Wikipedia main site make sense? Changed totals are from May 2015 onwards.... 
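[Editor's note] nuria's testing suggestion above (stage the schema under /tmp/dcausse on HDFS and create the table in your own database) could look roughly like this. The HDFS path, database name and .hql file name come from the conversation and the Gerrit change; the rest is an assumption:

    # Hedged sketch of the testing workflow nuria describes above
    hdfs dfs -mkdir -p /tmp/dcausse
    hdfs dfs -put -f CirrusSearchRequestSet.avsc /tmp/dcausse/
    # point the DDL at the uploaded copy instead of a jar, creating the table in a personal db
    hive --database dcausse -f create_CirrusSearchRequestSet_table.hql \
         -d avro_schema=hdfs:///tmp/dcausse/CirrusSearchRequestSet.avsc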
[16:20:32] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1785720 (faidon) More importantly, I don't understand why this is something Andrew has to do (and "soon") and not the services team "or else". Why is it a given that the Service... [16:25:17] dcausse: probably not timestamp-bug related, but still some changes to go for to prevent schema duplication [16:27:58] Analytics, Analytics-Kanban, Discovery, EventBus, and 8 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1785745 (Nuria) > As mentioned, we might want to use a single node process exposing parsoid, restbase & eventbus for small (third party) installs, but might as well use the new... [16:29:07] dcausse: can you manually add the avro schema to the jar and try on 1002 whether we can specify the path as you suggested? [16:30:11] nuria: sure [16:30:13] dcausse: or rather let me build a jar and add the file, i will put it on 1002 and you can use it [16:31:09] I wonder how KafkaTopicSchemaRegistry can work without the schema, it should throw Schema not found... [16:31:24] dcausse: cause it doesn't need schema [16:31:28] dcausse: it needs the bindings [16:31:35] dcausse: makes sense? [16:31:48] not sure :/ [16:31:52] dcausse: the java code has the generated versions of the schema :) [16:31:55] dcausse: let me explain [16:32:19] dcausse: if you build refinery/source/refinery-camus [16:32:40] dcausse: with mvn compile [16:33:07] ok got it: avro-maven-plugin [16:33:27] https://www.irccloud.com/pastebin/ctFabVPL/ [16:33:33] on jar, those are teh bindings [16:33:53] nuria: makes sense, thanks! [16:34:57] dcausse: which are build with -as you said- with avro plugin [16:35:07] https://www.irccloud.com/pastebin/vwHbrNkl/ [16:35:22] argh, sorry [16:35:41] https://www.irccloud.com/pastebin/Nwkkzst3/ [16:35:41] np :) [16:36:09] thanks! [16:37:31] dcausse: I think adding schema there should not be too hard but 1st we should test whether hive actually can function with jar, let me know if you see a problem with that [16:37:57] dcausse, nuria: I think one way to have the schema with the jar is to set the avro folder as a resource one in maven [16:37:57] I've built a jar with schema, I'll test [16:38:09] yes it's what I've done [16:41:28] dcausse: ok, let us know when you test whether hive can function with that, thanks for the prompt response [16:42:29] dcausse: testing oozie jobs is not the fastest thing though, but not much we can do there [16:44:22] nuria: seems to work: hive -f create_CirrusSearchRequestSet_table.hql -d avro_schema='jar:file:/home/dcausse/refinery-camus-0.0.23-SNAPSHOT.jar!/CirrusSearchRequestSet.avsc' --database dcausse [16:44:33] dcausse: YES! 
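[Editor's note] The next step discussed just below ("try with data", "I'll have to add a partition then") would mean registering at least one Camus-imported hour against the freshly created table. A hedged sketch; the partition keys and especially the HDFS location are assumptions about how the raw CirrusSearchRequestSet data is laid out, not the real paths:

    # Hypothetical: register one imported hour so a SELECT exercises the Avro schema for real
    hive --database dcausse -e "
      ALTER TABLE CirrusSearchRequestSet ADD IF NOT EXISTS
      PARTITION (year=2015, month=11, day=4, hour=14)
      LOCATION '/path/to/camus/output/CirrusSearchRequestSet/2015/11/04/14';
      SELECT COUNT(*) FROM CirrusSearchRequestSet
      WHERE year=2015 AND month=11 AND day=4 AND hour=14;
    "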
[16:44:44] niceeee [16:45:15] dcausse: try with data, it's possible that hive doesn't try to get to the schema before having to :) [16:45:18] nuria: it works, I ran show create table and it looks good :) [16:45:29] ok I'll have to add a partition then [16:45:39] dcausse: boy if all our oozie troubles where so easy to fix [16:45:48] :) [16:45:54] t's'ok dcausse, if you have the fileds and all with the show, that's great :) [16:46:42] now, adding the schema as a resource works, but it's not beautifull --> schema lands at jar root [16:48:45] hmm, I see in the schema that it's "ts" for timestamp but in java we call getRecord().get("timestamp") [16:52:16] (PS1) DCausse: AvroBinaryMessageDecoder: timestamp should be ts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 [16:52:43] (PS2) DCausse: AvroBinaryMessageDecoder: timestamp should be ts [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 [16:53:40] (CR) Joal: "Maybe instead of changing the "timestamp" possible value, add a "ts" test ?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (owner: DCausse) [16:54:15] question ottomata and nuria also about the cirrus integration with hive [16:54:30] wmf_raw database is good for me, is it for you ? [16:54:55] joal: yes [16:55:05] Finally, _SUCCESS flag is used when we use _PARTITIONED for webrequest [16:55:18] I think it's ok because there is no validation step in there process [16:55:25] But I prefer to have your validation :) [16:56:05] test [16:57:49] joal: I am not sure i understand, you mean making with "_SUCCESS" a partition when we have imported it? [17:00:30] a-team, brt, (GOTTA PEE!) [17:00:42] ottomata: now that is archived FOR EVER [17:02:20] everybody pees [17:03:46] Analytics-Tech-community-metrics, Phabricator, DevRel-November-2015: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#1785800 (Aklapper) Thanks for the quick answer @chasemp! >>! In T107254#1782299, @chasemp wrote: > I believe the comment that acco... [17:36:02] (PS3) DCausse: AvroBinaryMessageDecoder: added "ts" as a possible record for timestamp [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (https://phabricator.wikimedia.org/T117873) [17:44:24] (PS5) DCausse: Add initial oozie job for CirrusSearchRequestSet [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (https://phabricator.wikimedia.org/T117575) [17:56:08] Analytics-Kanban: Add avro schema to refinery-camus jar [3] - https://phabricator.wikimedia.org/T117885#1785987 (Nuria) NEW [17:56:41] ottomata: ansible-playbook --check -i production -e target=aqs roles/restbase/deploy.yml [17:56:47] (PS1) Nuria: Include avro schema in refinery-camus jar [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251289 (https://phabricator.wikimedia.org/T117885) [17:56:51] (that's just the check) [17:57:05] but it's all set, you can deploy anytime [17:57:14] and tell joal when you're done so he can work his magic [17:57:30] dcausse: FYI: https://gerrit.wikimedia.org/r/#/c/251289/ [17:59:41] (CR) DCausse: [C: 1] "Thanks!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251289 (https://phabricator.wikimedia.org/T117885) (owner: Nuria) [18:04:28] Analytics-Kanban, Patch-For-Review: Investigate / Correct timestamp not being properly read from Camus for CirrusSearchRequestSet logs. 
- https://phabricator.wikimedia.org/T117873#1786011 (JAllemandou) a:dcausse [18:09:41] Analytics-Kanban: Pageview API Press release {slug} - https://phabricator.wikimedia.org/T117225#1786017 (JAllemandou) p:Triage>High [18:09:47] Analytics-Kanban: Pageview API documentation for end users {slug} - https://phabricator.wikimedia.org/T117226#1786019 (JAllemandou) p:Triage>High [18:10:08] Analytics-Backlog: Add page_id to pageview_hourly when present in webrequest x_analytics header - https://phabricator.wikimedia.org/T116023#1786023 (JAllemandou) a:JAllemandou>None [18:11:15] Analytics-Backlog, MediaWiki-API: Add Application errors for Mediawiki API to x-analytics - https://phabricator.wikimedia.org/T116658#1786031 (JAllemandou) [18:12:14] Analytics-Backlog, Analytics-Cluster: Procure hardware for future druid cluster - https://phabricator.wikimedia.org/T116293#1786032 (JAllemandou) [18:18:08] Analytics-Backlog: Add help page in wikitech on what the analytics team can do for you similar to release engineering page - https://phabricator.wikimedia.org/T116188#1786057 (JAllemandou) [18:18:43] Analytics-Backlog: Projections of cost and scaling for pageview API. {hawk} [8 pts] - https://phabricator.wikimedia.org/T116097#1786060 (JAllemandou) [18:20:36] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1786064 (JAllemandou) [18:21:09] Analytics-Backlog, Wikimedia-Developer-Summit-2016: Developer summit session: Pageview API overview - https://phabricator.wikimedia.org/T112956#1786077 (JAllemandou) p:Normal>High [18:21:43] Analytics-Backlog, Analytics-EventLogging: Eventlogging monitoring of consumers (process nanny) - https://phabricator.wikimedia.org/T115495#1786091 (JAllemandou) [18:22:51] (CR) DCausse: "Thanks Nuria & Joal, oozie --validate is happy. I'll continue to test tomorrow and try to run the job on my db." [analytics/refinery] - https://gerrit.wikimedia.org/r/251238 (https://phabricator.wikimedia.org/T117575) (owner: DCausse) [18:23:21] milimetric: check worked [18:23:23] should I deploy? 
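[Editor's note] For context on the aqs deployment being discussed: the ansible-playbook command milimetric pastes is a dry run (--check only reports what would change, which is why he adds "that's just the check"). Presumably the real deploy is the same invocation without that flag, run from the same checkout:

    # Dry run first (what was pasted), then the actual deploy -- assuming the only
    # difference is dropping --check
    ansible-playbook --check -i production -e target=aqs roles/restbase/deploy.yml
    ansible-playbook         -i production -e target=aqs roles/restbase/deploy.yml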
[18:23:33] thx ottomata, yes [18:23:48] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Send email out to community notifying of change - https://phabricator.wikimedia.org/T115922#1786136 (JAllemandou) [18:23:49] Analytics-Kanban, Analytics-Wikistats: Publish new pageview dataset with clear documentation {lama} [8 pts] - https://phabricator.wikimedia.org/T115344#1786137 (JAllemandou) [18:24:00] !log deploying aqs [18:24:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [18:24:40] joal: thanks, beeline with -n did the trick [18:25:05] Analytics-Backlog: Traffic Breakdown Report - Browser Major Minor Version {lama} - https://phabricator.wikimedia.org/T115590#1786143 (JAllemandou) [18:28:01] Analytics-Kanban: Improve loading Analytics Query Service with data {slug} [subtasked] - https://phabricator.wikimedia.org/T115351#1786175 (JAllemandou) [18:28:44] Analytics-Kanban: Spike: Understand how Wikistats Traffic reports are computed {lama} [8 pts] - https://phabricator.wikimedia.org/T114669#1786180 (JAllemandou) [18:29:28] Analytics-Kanban: Spike: Understand how Wikistats Traffic reports are computed {lama} [8 pts] - https://phabricator.wikimedia.org/T114669#1786186 (JAllemandou) a:Milimetric [18:30:20] Analytics-Kanban: Communicate the WikimediaBot convention {hawk} [5 pts] - https://phabricator.wikimedia.org/T108599#1786196 (JAllemandou) p:Normal>High [18:34:57] sorry team, with the meetings, I forgot to login... [18:43:58] Analytics-Backlog, WMDE-Analytics-Engineering, Graphite: Create a Graphite instance in the Analytics cluster - https://phabricator.wikimedia.org/T117732#1786288 (JanZerebecki) Yes, we should not use graphite as a primary storage for metrics that we can not recreate. Using graphite as a cache only that i... [18:49:06] Analytics-Tech-community-metrics, Phabricator, DevRel-November-2015: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#1786304 (chasemp) > Two items about complexity: > * Is importing all status and resolution changes a ticket has ever seen (and not... [18:59:11] you're welcome ebernhar1son :) [19:00:19] nuria: quick question about the avro schema resource in pom [19:00:56] Thanks milimetric and ottomata :) [19:01:19] eh? [19:01:27] aqs deployment: ) [19:02:03] milimetric: hm dpeloyement failed [19:02:03] sorry [19:02:10] arf ... [19:02:13] failed: [aqs1001.eqiad.wmnet] => {"elapsed": 180, "failed": true} [19:02:13] msg: Timeout when waiting for aqs1001.eqiad.wmnet:7232 [19:02:13] FATAL: all hosts have already failed -- aborting [19:02:18] PLAY RECAP ******************************************************************** [19:02:18] to retry, use: --limit @/Users/otto/deploy.retry [19:02:18] aqs1001.eqiad.wmnet : ok=2 changed=2 unreachable=0 failed=1 [19:02:35] its running now though [19:02:39] maybe it just took too long? [19:02:42] to restart? [19:03:24] ottomata: I don't see the changes in cassandra, so I think it has really failed [19:04:08] ok [19:08:50] joal: yes [19:09:17] When putting the resource in the pom as you did, it adds the content of the folder at the root of the jar [19:09:25] joal: right [19:09:27] is-it what we expect? 
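[Editor's note] One way to answer joal's question above, whether declaring the avro folder as a Maven resource really puts the schema at the jar root, is to list the built artifact, the same check ottomata did earlier by unpacking it. The jar name matches the snapshot dcausse built; the expected entry name is what joal reports ("schema lands at jar root"):

    # Confirm the .avsc was packaged, and where it sits inside the jar
    jar tf refinery-camus-0.0.23-SNAPSHOT.jar | grep -i avsc
    # expected, if packaged at the root: CirrusSearchRequestSet.avsc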
[19:09:33] ok, then sounds good :) [19:09:52] I wondered if it was not messy, but after all, it's packaged :) [19:09:56] nuria: --^ [19:09:59] joal: I think as long as hive can find it we are good [19:10:04] ok great [19:10:11] I'll merge that [19:10:14] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, operations, vm-requests, iOS-5-app-production: Request one server to suport piwik analytics - https://phabricator.wikimedia.org/T116312#1786348 (Milimetric) @Joe, @mark, there was more context to this issue in other tickets, but I'm happy to... [19:10:19] joal: k [19:10:36] I'll +1 dcausse code as well, please feel free to merge if you think it;s good (the java one) [19:10:53] joal / ottomata :( I have to go now, the hotel is kicking me out [19:10:54] this is important cause currently camus run are incorrect [19:10:59] sorry to leave you with that shitty situation [19:11:01] Ok, np milimetric [19:11:14] I just hope restbase works (even if not deploayed [19:11:42] it looks like it answers but I don't have time to look deeper [19:11:43] (CR) Joal: [C: 1] "Nuria for approval and merge :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (https://phabricator.wikimedia.org/T117873) (owner: DCausse) [19:11:51] could you two kick the tires and make sure everything's ok? [19:12:00] ok, let's leave it as is and see tomorrow [19:12:07] Safe trip mili|gone ! [19:16:10] ottomata: queries look good, so it's not down [19:16:31] ottomata: do you know how to look for deployment - restore ? [19:16:56] restore? [19:17:00] (CR) Joal: [C: 2 V: 2] "Cool !" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251289 (https://phabricator.wikimedia.org/T117885) (owner: Nuria) [19:17:38] hm, so, do we continue / try again, or do we say that it works and wait for mili|gone tomorrow ottomata ? [19:18:12] joal, so far am I just a proxy command runner for this :/ [19:18:16] might be best to wait [19:18:24] ok, then since it works, let's wait agreed [19:18:32] thanks ottomata :) [19:18:50] mannnnn mforns .... I want to try your UI :)))) [19:19:05] joal, hehehe [19:19:09] working on it [19:19:14] You ROCK ! [19:19:16] :D [19:19:37] xD mmm I think today, end of day, I'll put it in labs [19:20:02] I'll see that tomorrow [19:20:07] ok [19:29:23] (PS1) Joal: Improve sorted-json job [analytics/wikihadoop] - https://gerrit.wikimedia.org/r/251311 (https://phabricator.wikimedia.org/T114359) [19:38:06] (CR) Nuria: AvroBinaryMessageDecoder: added "ts" as a possible record for timestamp (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (https://phabricator.wikimedia.org/T117873) (owner: DCausse) [19:43:51] Analytics-Tech-community-metrics, DevRel-November-2015: Key performance indicator: Top contributors - https://phabricator.wikimedia.org/T64221#1786443 (MarkAHershberger) I like what I'm seeing (though my vanity is hurt by not finding myself on the list). When I looked just now the krinkle was listed with... 
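[Editor's note] milimetric asked for the tires to be kicked after the partially failed deploy, and joal reports "queries look good, so it's not down". A very small sanity check consistent with that, using the host and port from the timeout in the deploy output; the path a healthy AQS answers on isn't in the log, so this only confirms something is listening and responding over HTTP:

    # Minimal liveness check against the host/port from the failed deploy output
    curl -sv --max-time 5 http://aqs1001.eqiad.wmnet:7232/ -o /dev/null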
[19:48:39] (CR) DCausse: AvroBinaryMessageDecoder: added "ts" as a possible record for timestamp (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (https://phabricator.wikimedia.org/T117873) (owner: DCausse) [20:01:32] (CR) Nuria: AvroBinaryMessageDecoder: added "ts" as a possible record for timestamp (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/251267 (https://phabricator.wikimedia.org/T117873) (owner: DCausse) [20:06:34] Analytics-Kanban: Understand the Perl code for Traffic Breakdown Report - Browser Major Minor Version report {lama} - https://phabricator.wikimedia.org/T117245#1786553 (Nuria) [21:27:02] Analytics-Backlog: Move App session data to 7 day counts - https://phabricator.wikimedia.org/T117637#1786804 (JKatzWMF) @Nuria thanks for your response. Can you explain to me why the following holds true? > until we test we will not know if doing it every 7 days renders meaningful data Is this around the... [21:32:50] Analytics-Backlog: Provide weekly app session metrics separately for Android and iOS - https://phabricator.wikimedia.org/T117615#1786815 (Tbayer) [21:33:23] Analytics-Backlog: Provide weekly app session metrics separately for Android and iOS - https://phabricator.wikimedia.org/T117615#1779777 (Tbayer) >>! In T117615#1781919, @Nuria wrote: >> absolutely agree, but that's hypothetical as we don't have platform-specific data for these months since May. Or are you sa... [21:51:33] Analytics-Backlog: Move App session data to 7 day counts - https://phabricator.wikimedia.org/T117637#1786884 (Nuria) @JKantzWMF: >Is this around the % of sessions that take place during the start and end of the period in question? Right, the number of users that have a session within the period. A session is... [22:42:04] Analytics-Backlog: Move App session data to 7 day counts - https://phabricator.wikimedia.org/T117637#1787060 (Tbayer) FWIW, the dataset has contained some sessions that are longer than a week, although losing these outliers to truncation effects would probably increase rather than decrease data quality ;) (1... [23:09:08] Analytics-Backlog, Research-and-Data: Historical analysis of edit productivity for English Wikipedia - https://phabricator.wikimedia.org/T99172#1787111 (Halfak) And... I've made a mistake. I accidentally re-processed the old diff data. I need to re-start the process to work with the new diff data :( Ki... [23:20:20] Analytics-Backlog, Research-and-Data: Historical analysis of edit productivity for English Wikipedia - https://phabricator.wikimedia.org/T99172#1787141 (Halfak) Also, I should note that the altiscale engineers identified rare, periodic memory usage spikes that happen during the [Diffs]-->[Token stats] proc... [23:21:30] Analytics-Backlog, Research-and-Data: Historical analysis of edit productivity for English Wikipedia - https://phabricator.wikimedia.org/T99172#1787146 (Halfak) [23:24:49] Analytics-Backlog, Research consulting, Research-and-Data: Update official Wikimedia press kit with accurate numbers - https://phabricator.wikimedia.org/T117221#1787160 (ggellerman) (typing for @DarTar): This is assigned to @ezachte Moving it to In Progress column. Please keep us up to date on the o...