[01:26:19] Analytics, Analytics-Kanban: Backfill event logging data after 02/05 outage - https://phabricator.wikimedia.org/T88692#1102539 (Nuria) Open>Resolved
[01:26:36] Analytics-Engineering, Analytics-Kanban: Backfilling EL events from 20150206 to 20150210 - https://phabricator.wikimedia.org/T89269#1102540 (Nuria) Open>Resolved
[01:34:15] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1102542 (Tfinc) Checking in to see what's moved this into paused. thanks.
[01:51:46] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1102550 (Nuria) I think this is a mistake, moving into WIP. Please see my last comments, we can use EL to see how people use search. I...
[01:52:46] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1102551 (Nuria) Note that last action item is not on analytics team but rather the team that owns sercrh.
[09:14:47] Analytics, Language-Engineering, MediaWiki-extensions-ContentTranslation, LE-Sprint-84: Newly added language ky and pa doesn't reflect in Language Limn Graph - https://phabricator.wikimedia.org/T92236#1103677 (KartikMistry) NEW
[09:16:14] Analytics, Language-Engineering, MediaWiki-extensions-ContentTranslation, LE-Sprint-84: language-reportcard.wmflabs.org has issues - https://phabricator.wikimedia.org/T92237#1103686 (KartikMistry) NEW
[10:41:56] mobile_apps uniques daily jobs caught up --> ongoing update now on
[10:42:02] Starting monthly job
[11:52:33] Analytics-General-or-Unknown, CA-team, Community-Liaison, Wikimedia-Extension-setup: enable Piwik on ru.wikimedia.org - https://phabricator.wikimedia.org/T91963#1104079 (Aklapper) So theoretical steps would be here: 1.
Receive feedback whether extension could/would be installed 2. Set up Piwik inst...
[11:53:35] Analytics-General-or-Unknown, CA-team, Community-Liaison, Wikimedia-Extension-setup: enable Piwik on ru.wikimedia.org - https://phabricator.wikimedia.org/T91963#1104089 (Rubin16) >>! In T91963#1104079, @Aklapper wrote: > So theoretical steps would be here: > 1. Receive feedback whether extension co...
[13:07:41] Analytics, Language-Engineering, MediaWiki-extensions-ContentTranslation, LE-Sprint-84: language-reportcard.wmflabs.org has issues - https://phabricator.wikimedia.org/T92237#1104238 (Nikerabbit) New languages were added [[https://gerrit.wikimedia.org/r/#/c/194521/2/language/content_translation_beta...
[13:49:37] Analytics-Cluster, Patch-For-Review: Better way to access Hadoop related web GUIs - https://phabricator.wikimedia.org/T83601#1104300 (Ottomata) Open>Resolved I'm going to close this ticket for now. I wish the proxying that resourcemanager does could be smarter.
[13:52:38] Analytics, Wikimedia-Fundraising: Provide performant query access to banner show/hide numbers - https://phabricator.wikimedia.org/T90649#1104307 (Ottomata) We could make special varnishkafka instances that logged only certain requests to a topic. Not a bad idea. Do you know which of the cache clusters...
[14:16:23] Analytics-EventLogging, Popups: Large number of popup events not validating - https://phabricator.wikimedia.org/T91272#1104355 (Nuria) Ping .....
[14:18:17] Analytics-Cluster, Analytics-Kanban: Refactor MobileApps uniques HQL to use external table to format data [8 pts] - https://phabricator.wikimedia.org/T90730#1104363 (JAllemandou) Open>Resolved
[14:21:03] Analytics-Cluster, Analytics-Engineering: Refine webrequest x_analytics field into a map in the refined table. - https://phabricator.wikimedia.org/T89396#1104370 (Ottomata)
[14:22:59] Analytics-Cluster, Analytics-Kanban: Refine webrequest x_analytics field into a map in the refined table.
- https://phabricator.wikimedia.org/T89396#1104376 (kevinator)
[14:23:44] Analytics-Cluster, Analytics-Kanban: Refine webrequest x_analytics field into a map in the refined table. - https://phabricator.wikimedia.org/T89396#1104379 (kevinator) a:JAllemandou
[14:25:41] Analytics-Cluster, Analytics-Kanban: Add processed user agent to refined tables - https://phabricator.wikimedia.org/T91793#1104390 (kevinator) p:Triage>Normal a:JAllemandou
[14:26:30] Analytics-EventLogging, Analytics-Kanban, operations: Eventlogging JS client should warn users when serialized event is more than "N" chars long and not sent the event - https://phabricator.wikimedia.org/T91918#1104401 (mforns) a:mforns>None
[14:27:10] Analytics-Cluster, Analytics-Engineering: Automate sqooping of page table into Hive - https://phabricator.wikimedia.org/T89394#1104405 (Ottomata)
[14:27:56] Analytics, Analytics-Kanban, Wikimedia-Fundraising: Provide performant query access to banner show/hide numbers - https://phabricator.wikimedia.org/T90649#1104410 (Nuria)
[14:53:59] Analytics, Language-Engineering, MediaWiki-extensions-UniversalLanguageSelector, Mobile-Apps, and 5 others: there should be a comparison of clicks count on interlanguage on different platforms - https://phabricator.wikimedia.org/T78351#1104526 (Aklapper)
[15:08:14] joal, milimetric: interesting article: http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
[15:23:04] Thx Nuria for the communication about mobile uniques
[15:25:08] ottomata: Shall I come to the op meeting with you ?
[15:25:29] sure!
[15:25:34] k
[15:25:45] I hope to be more useful here than on the vcl meeting :(
[15:33:21] joal: yw, np
[15:47:56] Analytics, MediaWiki-Core-Team, VisualEditor, VisualEditor-Performance, and 3 others: Apply Schema:Edit instrumentation to WikiEditor - https://phabricator.wikimedia.org/T88027#1104728 (Jdforrester-WMF) Open>Resolved
[15:59:00] Analytics-Cluster: Alert on stuck Hadoop jobs - https://phabricator.wikimedia.org/T92283#1104756 (Ottomata) NEW
[16:27:40] Mobile_apps monthly jobs caught up as well
[16:28:00] Data is now up to date and should continue to flow in
[16:54:49] halfak: yt ?
[16:54:57] yeah. What's up?
[16:55:00] joal, ^
[16:55:04] :)
[16:55:50] halfak: In our xml dumps, is ordering constant (page info before revisions for instance) or not ?
[16:56:01] Yes
[16:56:08] YEAH :)
[16:56:13] Page info *always* comes first
[16:56:21] Ok cool
[16:56:26] :)
[16:56:33] In revs, rev info always come before text ?
[16:57:20] halfak: Simpler: revision (id?,timestamp,contributor,minor?,comment,text)
[16:57:45] should we assume this order be respected ?
[16:58:19] That's a good question. I'm not sure if we get to assume that.
[16:58:25] Actually... wait... yes we can.
[16:58:27] huhu
[16:58:33] really ?
[16:58:38] I remember a bug being filed about a field appearing after text.
[16:58:41] * halfak digs
[16:59:32] Woops. Meeting!
[16:59:35] Back in 30
[16:59:39] np
[17:19:53] joal, I think we can guarantee that <text> comes after *most* revision metadata. http://www.mediawiki.org/xml/export-0.10.xsd
[17:20:04] ok
[17:20:12] It looks like the schema specifies a sequence. <sha1> comes after text.
[17:20:34] This is sad because sha1 is most useful without text.
[17:21:49] ok thx :)
[20:03:42] G'd night team
[20:05:37] good night joal|night :]
[20:18:34] i need a brain bounce
[20:18:36] ne1?
[20:18:37] hmm
[20:18:46] nuria: ?
[20:18:54] ottomata: yessir
[20:19:14] batcave!
[20:19:22] k
[21:29:40] mforns: hey marcel, you there?
[21:29:47] hey kevinator
[21:29:57] batcave?
[21:30:00] what's up
[21:30:01] sure
[21:30:16] milimetric: batcave?
[21:30:32] brt
[21:31:51] nuria: we're talkin' about the scheduler
[21:31:53] in the batcave
[21:51:54] mforns: I'll make it my top priority to merge your code and help to deploy, but you're still going to submit the generate.py integration patch, yes?
[21:52:10] milimetric, yes, please wait until tomorrow
[21:52:18] np, let me know when I should look
[21:52:59] milimetric, also, I'd like to help you review the code, because it's quite a bunch of stuff, so if you want, I can give you an overview
[21:53:09] and this may be quicker also, to get a feedback
[21:54:31] I reviewed it for the general idea
[21:54:39] and I'm not going to be too thorough on the second review
[21:54:46] basically, if it works, I'm happy
[21:58:30] milimetric, ok then, I'll let you know tomorrow :]
[22:15:40] mforns: there are I think some SQL problems too, but the main problem appears to be the logstash module not being available on stat1003
[22:15:58] so I was going to revert the last commit (the logging one)
[22:16:00] milimetric, no I think that is on my side
[22:16:09] ooh
[22:16:19] hm...
but the same appears in the limn-mobile-data log
[22:19:54] ok, I self-merged that
[22:19:59] ok
[22:20:17] overnight it should re-run at least some of the jobs and then it'll save me going through each SQL to see which one's broken and which one's good
[22:22:11] wow, my fault, didn't check the puppet code for logstash in the logging task
[22:22:39] or requirements.txt
[22:25:30] requirements.txt wouldn't make too much difference, because there's no pip install or anything
[22:25:34] but yeah we have to fix the puppet code
[22:27:41] (PS1) Ottomata: Adapt bin/refinery-drop-webrequest-partitions to work with refined table [analytics/refinery] - https://gerrit.wikimedia.org/r/195672 (https://phabricator.wikimedia.org/T89257)
[22:27:45] (PS2) Ottomata: Adapt bin/refinery-drop-webrequest-partitions to work with refined table [analytics/refinery] - https://gerrit.wikimedia.org/r/195672 (https://phabricator.wikimedia.org/T89257)
[22:31:16] milimetric: i am back but you guys are probably done
[23:17:22] nuria: yeah, we talked quickly about the scheduler
[23:17:48] we did a back-of-the-napkin estimate of how much effort we'd have to spend on maintaining generate.py if we ran the VE stuff on it
[23:17:54] we came up with about 4 full working days
[23:18:11] so we said, if we can showcase the scheduler next week then we'll call it done and use it from now on
[23:18:32] if not, then we can use generate.py and accept its flaws for now
[23:19:22] kevinator: I think it's cool and all if you guys like it, but maybe consider putting the myriad of single file cleanup tasks in one ticket? like https://phabricator.wikimedia.org/T92340
[23:21:01] chasemp: there are 20ish subtasks, and when we had them all in one ticket… we’d always get stuck trying to boil the ocean.
[23:23:31] fair enough, I had visions of 100 tickets in and/or single file removal tickets into the horizon :)
[23:31:05] milimetric: ok, sounds like right tradeoff
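
Note on the 16:55-17:21 exchange about XML dump ordering: a minimal sketch of how one might check the element order while streaming a dump, assuming (as the chat concludes) that page info precedes revisions and that sha1 follows text within a revision. The sample document, the helper names local_name and revision_field_orders, and the trimmed field list are all hypothetical; real dumps carry an XML namespace and more fields than shown here.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample; not taken from a real dump.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example</title>
    <id>1</id>
    <revision>
      <id>10</id>
      <timestamp>2015-03-10T16:55:50Z</timestamp>
      <contributor><username>Someone</username></contributor>
      <comment>first edit</comment>
      <text>Hello</text>
      <sha1>abc</sha1>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def revision_field_orders(stream):
    """Yield (page_title, [child tag names in document order]) per revision."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            # Page info closes before any revision does, so this is set first.
            title = elem.text
        elif name == "revision":
            yield title, [local_name(child.tag) for child in elem]
            elem.clear()  # release the revision text once inspected


orders = list(revision_field_orders(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
for title, fields in orders:
    # The title was already known when the revision was reached.
    assert title is not None
    # In this sample, sha1 follows text, matching the reading of the schema in the chat.
    assert fields.index("text") < fields.index("sha1")
```

Because the parse is streaming, a consumer that only wants metadata still has to read past the text element before it reaches sha1, which is the inconvenience raised at 17:20:34.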