[03:27:21] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Migrate from bower to npm and clean up folder hierarchy - https://phabricator.wikimedia.org/T147884#3208695 (Krinkle)
[03:27:24] Analytics, Analytics-Dashiki: Switch to fetch away from jquery - https://phabricator.wikimedia.org/T148053#3208694 (Krinkle)
[08:02:28] morning team! When I saw all the oozie emails I thought that an1003 was exploding
[08:02:31] but they are all spma
[08:02:33] *spam
[08:02:35] niceeeeeeeeeeeeeeeee
[08:06:46] everything looks good
[08:06:51] * elukey back to holiday mode
[10:32:24] Hi a-team
[10:32:46] I'm sorry I have not been able to work yet - dentist has been hard on me
[10:33:16] I'll teach this afternoon and will be present for the wikistats2 backend meeting, but will put a day off for the rest of the day
[10:33:52] !log restart failed mediacounts-archive-coord : Workflow mediacounts-archive-wf-2017-04-24
[10:33:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:24:43] Analytics, Mobile-Content-Service, Reading-Infrastructure-Team-Backlog, Wikipedia-iOS-App-Backlog: As an end-user I shouldn't see non-articles in the list of trending articles - https://phabricator.wikimedia.org/T124082#3209460 (NHarateh_WMF)
[12:42:30] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 4 others: Is User-Agent data PII when associated with Action API requests? - https://phabricator.wikimedia.org/T154912#3209633 (NHarateh_WMF)
[12:43:24] Analytics, MediaWiki-API, Reading-Infrastructure-Team, Reading-Infrastructure-Team-Backlog: Load API request count and latency data from Hadoop to a dashboard - https://phabricator.wikimedia.org/T108414#3209660 (NHarateh_WMF)
[14:04:29] I'll hdfs -put that new whitelist so it stops bugging us
[14:07:30] k, done
[14:07:45] apologies to luca, enjoy your vacation dude
[14:34:12] Analytics-Kanban, Analytics-Wikistats: Visual prototype for community feedback for Wikistats 2.0 iteration 1. - https://phabricator.wikimedia.org/T157827#3209986 (ezachte) "show daily unique devices instead of monthly unique devices on dashboard" For the record: I don't remember saying this. In genera...
[14:44:16] (PS3) Milimetric: Add --generate-jar and --jar-file options [analytics/refinery] - https://gerrit.wikimedia.org/r/349723 (https://phabricator.wikimedia.org/T143119)
[14:44:22] (CR) Milimetric: Add --generate-jar and --jar-file options (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/349723 (https://phabricator.wikimedia.org/T143119) (owner: Milimetric)
[14:45:08] (PS1) Mforns: [WIP] Add monthly sanitized job for banner activity [analytics/refinery] - https://gerrit.wikimedia.org/r/350219 (https://phabricator.wikimedia.org/T157582)
[14:55:04] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210017 (Halfak)
[14:55:10] o/ hey folks
[14:55:19] If anyone wants to talk about https://phabricator.wikimedia.org/T163788, I'd be happy to.
[14:55:42] See the wikibugs post above mine "Implement clickstream & navigation vectors as a regular job"
[14:56:28] I'm curious about any casual estimates of "oh! that'll be super easy" or "that'll be complicated" or maybe even "we already have that" which would be great :))
[14:57:36] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210035 (Halfak) @Shilad is requesting this as something he'd use to keep [WikiBrain](https://shilad.github.io/wikibrain/) up to date and for other research.
[14:59:38] Analytics: Investigate whether we could calculate "hourly unique devices" - https://phabricator.wikimedia.org/T163789#3210041 (Nuria)
[15:00:44] ottomata, fdans , elukey: standduppp
[15:01:34] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210059 (Shilad) I am happy to help with engineering on this if we can find a way to make that work. I've set up navigation-based word2vec pipelines in similar environments (PySpark, Oozie, etc....
[15:01:53] \o/
[15:03:47] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210017 (Halfak) If we want to go that route, I volunteer to help @Shilad (a long time collaborator to Wikimedia technology advancement) do the NDA dance.
[15:11:11] Analytics, Research-and-Data-Backlog: Improve bot identification at scale - https://phabricator.wikimedia.org/T138207#3210106 (Nuria)
[15:14:33] Analytics-Dashiki, Analytics-Kanban: Change default timeline for browser reports to be recent (not 2015) - https://phabricator.wikimedia.org/T160796#3210126 (Nuria)
[15:21:39] halfak: that would not be super easy, doable, but data munching for larger wikis might require tricks on splitting jobs in parallel that are not super obvious, I would do a prototype for a small wiki first (simplewiki?) , we prefer scala/spark for this types of jobs
[15:21:47] *these types
[15:22:59] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210017 (Nuria) @Shilad: data munching for larger wikis might require tricks on splitting jobs in parallel that are not super obvious, I would do a prototype for a small wiki first (simplewiki?...
[15:32:25] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210207 (Halfak) Aha! {T158972}
[15:33:16] Analytics: productionize ClickStream dataset - https://phabricator.wikimedia.org/T158972#3053334 (Halfak)
[15:34:14] Analytics: productionize ClickStream dataset - https://phabricator.wikimedia.org/T158972#3053334 (Halfak)
[15:36:04] Analytics: productionize ClickStream dataset - https://phabricator.wikimedia.org/T158972#3210239 (Halfak) There's some discussion in {T163788} that I merged with this task. In summary: @Shilad offered to contribute time and code. He has experience with PySpark and Oozie. @Halfak offered to work with @Sh...
[15:36:34] Analytics: productionize ClickStream dataset - https://phabricator.wikimedia.org/T158972#3210241 (Halfak)
[15:41:26] Analytics, Research-and-Data: Improve bot identification at scale - https://phabricator.wikimedia.org/T138207#3210256 (leila)
[15:53:20] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210349 (Nuria) >Presumably, @ellery already had jobs to compute and clean up clickstream data working on our infrastructure. Code is here: https://github.com/ewulczyn/wiki-clickstream and more...
[15:55:52] Analytics: Implement clickstream & navigation vectors as a regular job - https://phabricator.wikimedia.org/T163788#3210351 (Halfak) @Nuria, I've merged this task into one that ya'll already have prioritized. Please continue the conversation in {T158972}
[16:00:42] mforns: let me know when you deploy the annotations stuff so i can talk to neilpquinn about it
[16:16:21] Analytics: add a more friendly message to ladp authentication box for [pivot - https://phabricator.wikimedia.org/T163797#3210400 (Nuria)
[16:19:40] milimetric: did you hdfs put the new value to the pageview whitelist?
[16:21:28] nuria: yes
[16:36:38] (PS1) Tjones: Support Wiki Abbreviation for Czech (cs vs cz) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/350247
[16:58:23] nuria, sure, I'm on it, I'll ping you when finished
[16:59:07] Thanks a lot milimetric -- My turn to fail ops week ;)
[16:59:48] team, meeting in batcave or wikistats-backe ?
[17:01:04] im in wikistats-backend
[17:02:02] I am here wikistats-backe cc joal milimetric mforns fdans
[17:02:14] nuria: we're all in batcave
[17:02:20] joal: sorry, omw
[17:24:26] Analytics, ChangeProp, EventBus, Services (later), User-mobrovac: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088#3210696 (Pchelolo)
[17:24:28] Analytics, ChangeProp, EventBus, Services (done): Make EventBus service support wildcards in schema definitions - https://phabricator.wikimedia.org/T157091#3210694 (Pchelolo) Open>Resolved Deployed to production, resolving.
[17:56:33] mforns, fdans: invite sent !
[18:06:56] joal, thx
[18:26:42] hey I'm back, nuria lemme know if you want to work on that doc
[18:27:03] milimetric: let me do 1st pass, just sent meeting notes
[18:28:31] k
[18:37:56] milimetric: done with updates: https://docs.google.com/document/d/10cTkWcxOE89kx_HejlAbRyiRjlhXL13Cii0hfPOki4c/edit#
[18:38:38] milimetric: I am going to start creating tasks to start implementing FE for wikistats and pageviews and Unique Devices detail pages
[18:39:12] Analytics, Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#3211043 (Nuria)
[18:40:23] ok, nuria, read it, looks good
[18:40:32] (PS1) Mforns: Deploy browsers and reportcard [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/350259 (https://phabricator.wikimedia.org/T162482)
[18:41:04] I'll work on that feature request on dashiki for the rest of today then...
[18:41:33] (CR) Mforns: [V: 2 C: 2] Deploy browsers and reportcard [analytics/analytics.wikimedia.org] - https://gerrit.wikimedia.org/r/350259 (https://phabricator.wikimedia.org/T162482) (owner: Mforns)
[18:42:30] nuria, once analytics.wikimedia.org has been merged, do we need to trigger the code pull or puppet will do it automatically?
[18:42:51] mforns: puppet will do it once it runs, which i think it is every 15 mins
[18:43:53] k, already merged it, and also nuria: should we create annotation pages for all graphs of the browser-dashboard and the reportcard?
[18:44:35] nuria, I'll ping you when I see the code has been pulled
[18:45:03] mforns: teh reportcard should show pageview annotations as it shows pageview metric
[18:45:11] mforns: same for unique devices if any
[18:45:16] k
[18:45:37] mforns: Browser data is not a 'metric" so i do not think we need to do anything for now
[18:46:35] Analytics-Kanban: Initial FE code for Wikistats 2.0. Dashboard skeleton - https://phabricator.wikimedia.org/T163814#3211082 (Nuria)
[18:51:11] k
[18:57:04] Analytics: Investigate whether we could calculate "hourly unique devices" - https://phabricator.wikimedia.org/T163789#3210041 (Tbayer) What would be the purpose of this metric?
[19:05:13] milimetric, nuria: there seems to be something wrong with the reportcard's on-wiki config https://meta.wikimedia.org/wiki/Config:Dashiki:ReportCard
[19:05:38] to me it returns: [WP@dgwrAEEYAABKxG6MAAABD] 2017-04-25 19:03:31: Fatal exception of type MWUnknownContentModelException
[19:07:34] the annotations-for-tabs-layout code has been deployed, but I can not edit the config to activate it
[19:07:54] mforns: yep, welcome to the mess
[19:08:06] so this mostly happened while you were gone and I forget if we caught you up with it
[19:08:18] ah ok
[19:08:27] the configuration of the dashiki extension had some bugs, we tried to fix them, and then we hit the deployment freeze
[19:08:39] so it's in this intermediate state where it still has bugs but the config is still broken.
[19:08:44] milimetric: i though last thing we were waiting for that was a deploy after the code freeze of the switchover?
[19:08:48] yep
[19:09:06] and if that doesn't fix it we have to like get into the db and start hacking shit up, it's gonna be nuts
[19:09:31] ok
[19:10:47] so, I'll wait until that deploy then to activate annotations, in the meantime I'll do the docs nuria
[19:11:03] BTW I also deployed the changes to flow-reportcard.wmflabs.org
[19:11:24] milimetric: we should be able to recreate config for the reportcard correct?
[19:11:50] the thing is, the api is able to get it. maybe if we need it, we can edit it from the api?
[19:12:14] milimetric, mforns : i think dan edited it from scratch last time
[19:12:25] don't think so but maybe. We could also recreate it, sure, but then it's just more mess to clean up
[19:12:46] milimetric, why doesn't it happen with all other config pages?
[19:13:28] milimetric: I think in this case something else is broken as other configs are not 'editable'
[19:13:38] the short answer is because it has a corrupted page.content_model, the long answer starts with "I have no idea and neither does anyone else but..."
[19:13:58] but the serving through UI is not broken
[19:14:28] yeah, question is: is this an emergency or can it wait until after the freeze?
[19:14:36] milimetric: do we know when the deployment freeze is over?
[19:14:52] I thought end of this week
[19:18:13] milimetric: given that the api can retrieve the page but meta cannot print it it doesn't seem that the db is at fault but rather the json extension?
[19:20:03] Analytics-Kanban: Implement pageviews and unique devices detail pages - https://phabricator.wikimedia.org/T163817#3211246 (Nuria)
[19:22:34] Analytics-Kanban: Implement pageviews and unique devices detail pages in Wikistats UI - https://phabricator.wikimedia.org/T163817#3211277 (Nuria)
[19:23:01] Analytics-Kanban: Initial FE code for Wikistats 2.0. Dashboard skeleton - https://phabricator.wikimedia.org/T163814#3211278 (Nuria)
[19:25:56] Analytics-Kanban: Initial FE code for Wikistats 2.0. Dashboard skeleton - https://phabricator.wikimedia.org/T163814#3211294 (Nuria)
[19:26:18] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: annotations should show on tab layout - https://phabricator.wikimedia.org/T162482#3211300 (mforns) Code has been deployed, but config can not be edited because of deployment freeze. Will work on documentation in the meantime and leave this task i...
[19:39:33] (PS15) Ottomata: [WIP] Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[19:51:45] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: annotations should show on tab layout - https://phabricator.wikimedia.org/T162482#3211428 (Nuria) Config is broken due to Dashiki extension being broken, fixes are being deployed after freeze.