[04:21:50] (PS1) BearND: Add isAppPreview to pageview definition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234937
[04:23:55] (PS2) BearND: Add isAppPreview to pageview definition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234937
[04:30:27] (Abandoned) BearND: Add isAppPreview to pageview definition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234937 (owner: BearND)
[04:30:36] (PS1) BearND: Add isAppPreview to pageview definition [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234938
[04:39:38] (CR) BearND: "Could use some additions from iOS guys; in this or in a separate patch." (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234938 (owner: BearND)
[05:56:02] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015, Patch-For-Review: "Age of unreviewed changesets by affiliation" shows negative number of changesets - https://phabricator.wikimedia.org/T72600#1588421 (Acs) >>! In T72600#1567113, @Aklapper wrote: > @Dicortazar: Any news here whe...
[09:31:53] Analytics-Tech-community-metrics: " Age of open changesets by Affiliation" layover partially displayed off-screen - https://phabricator.wikimedia.org/T110873#1588692 (Aklapper) NEW
[09:31:55] Analytics-Tech-community-metrics: Clicking "Age of open changesets by Affiliation" explanation link goes to top of page - https://phabricator.wikimedia.org/T110874#1588698 (Aklapper) NEW
[09:31:58] Analytics-Tech-community-metrics: "Age of open changesets by Affiliation" has some "NaN" values - https://phabricator.wikimedia.org/T110875#1588704 (Aklapper) NEW
[09:33:10] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015, Patch-For-Review: "Age of unreviewed changesets by affiliation" shows negative number of changesets - https://phabricator.wikimedia.org/T72600#1588711 (Aklapper) Open>Resolved >>! In T72600#1588421, @Acs wrote: > Fixed. The...
[10:48:51] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1588866 (Aklapper) We won't be able to answer {T103292} until {T110678} is solved. Currently {T104845}/{T103984} are higher on the list than...
[12:33:51] (PS1) Joal: Update pageview_hourly oozie job for backfill [analytics/refinery] - https://gerrit.wikimedia.org/r/234983 (https://phabricator.wikimedia.org/T110614)
[12:38:37] (PS1) Joal: Update changelog for v0.0.17 release. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234984
[12:39:18] (CR) Joal: [C: 2 V: 2] "Self merging the changelog in order to deploy early." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/234984 (owner: Joal)
[12:42:54] !log Deploying refinery-[core|hive]-0.0.17.jar in archive
[13:22:08] ottomata: Morning !
[13:22:49] ottomata: have a look at that: https://www.youtube.com/watch?v=_PQbVH_aO5E
[13:23:11] I think we could try that for internal stats usage
[13:25:02] oh cool, like iPython notebook but with scala!
[13:25:17] joal, did you know that ellery has hooked iPython Notebook up to spark, and uses that sometimes?
[13:25:28] --> Can be done in python as well, uses spark, and is shareable !
[13:25:54] ottomata: didn't know that
[13:26:12] I know ellery runs spark, didn't know he uses IPython notebook
[13:28:42] (PS1) Joal: Bump refinery core and hive version to 0.0.17 [analytics/refinery] - https://gerrit.wikimedia.org/r/234992
[13:37:36] (CR) Ottomata: [C: 2 V: 2] Update pageview_hourly oozie job for backfill [analytics/refinery] - https://gerrit.wikimedia.org/r/234983 (https://phabricator.wikimedia.org/T110614) (owner: Joal)
[13:37:55] joal: don't forget changelog!
[13:38:07] ottomata: self merged ;)
[13:38:34] oh ok!
[13:38:41] (CR) Ottomata: [C: 2 V: 2] Bump refinery core and hive version to 0.0.17 [analytics/refinery] - https://gerrit.wikimedia.org/r/234992 (owner: Joal)
[13:38:50] Thanks ottomata :)
[13:40:17] ottomata: Quick remark on the kafka dashboard
[13:40:42] yes?
[13:41:00] ottomata: disk write | read and network out | in are mirrors :)
[13:41:14] yeah was trying it out
[13:41:22] it's kinda weird for disk now that there are no reads :)
[13:41:31] not sure if i like it better or not
[13:41:46] :)
[13:42:18] i take it you don't like it so much? :)
[13:42:28] I like it actually
[13:43:10] But I think I would have gone for reversed
[13:43:19] like network In | out
[13:43:34] to have out as minus values and in as positive ones
[13:43:39] But it's just conventions
[13:44:00] I think it's very readable for global monitoring
[13:44:12] For specific debugging, we'll probably add more detailed ones
[13:48:26] aye i'm for that
[13:54:17] joal: going to do the election to bring an13 back into the fray
[13:54:33] ottomata: ok sounds good
[14:02:27] ottomata: charts really awesome :)
[14:37:11] Analytics-Backlog: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#1589297 (mforns)
[14:37:12] Analytics-Kanban: Enforce policy for each schema: Sanitize {tick} [8 pts] - https://phabricator.wikimedia.org/T104877#1589296 (mforns)
[14:37:50] Analytics-Kanban: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#1532299 (mforns)
[14:38:09] Analytics-Kanban: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#1589303 (mforns) a:mforns
[14:56:48] ottomata: interesting how network out is unbala
[14:57:05] unbalanced per host when some leader is missing
[14:57:50] hm, yeah, maybe secondary replica leader choice is not as balanced as first?
[14:58:00] I guess you are right :)
[15:27:28] Analytics-Kanban, Research-and-Data, Patch-For-Review: Backfill pageview data correcting space in title bug {hawk} [5 pts] - https://phabricator.wikimedia.org/T110614#1589441 (JAllemandou) p:Triage>High
[15:35:54] Analytics-Dashiki, Analytics-Kanban, Browser-Support-Firefox: vital-signs doesn't display pageviews graph in Firefox 41, 42 {crow} [3 pts] - https://phabricator.wikimedia.org/T109693#1589475 (Milimetric) it seems there was a labs outage that took out everything. Vital Signs is up again.
[15:39:21] (CR) Milimetric: [C: 2 V: 2] Reorder getData params from datasets api [analytics/dashiki] - https://gerrit.wikimedia.org/r/234797 (https://phabricator.wikimedia.org/T107504) (owner: Mforns)
[15:39:29] a-team: Something to look at: https://zeppelin.incubator.apache.org/
[15:51:21] !log Starting to deploy refinery
[16:08:12] joal: I like where Zeppelin is trying to go, we should definitely stand it up and try it
[16:10:27] milimetric: I like it as well :)
[16:10:40] After api though ;)
[16:21:44] Analytics-EventLogging: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#1589603 (Ottomata) NEW a:Krinkle
[16:34:11] Ironholds: For you to know, arbcom is no more in pageviews
[16:34:21] I am currently updating doc
[16:36:00] joal, awesome, thank you :)
[16:36:52] Ironholds: You're welcome
[16:58:00] Analytics, Discovery, MediaWiki-General-or-Unknown, Services, and 5 others: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#1589674 (GWicke) Etherpad notes: https://etherpad.wikimedia.org/p/scalable_events_system
[17:41:09] Analytics-Tech-community-metrics, Engineering-Community, ECT-August-2015: Automated generation of (Gerrit) repositories for Korma - https://phabricator.wikimedia.org/T104845#1589841 (Dicortazar) And finally, Automator is now able to fetch and push changes to the git repository. So, as commented, this i...
[17:43:30] Analytics-Tech-community-metrics, ECT-August-2015: Exclude third-party / pulled upstream code repositories from metrics - https://phabricator.wikimedia.org/T103984#1589846 (Dicortazar) @Qgil, thank you for the documentation :). I'll close this task and the blocking one as we finished the development. The...
[17:43:50] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1589848 (Dicortazar)
[17:43:52] Analytics-Tech-community-metrics, Engineering-Community, ECT-August-2015: Automated generation of (Gerrit) repositories for Korma - https://phabricator.wikimedia.org/T104845#1589849 (Dicortazar)
[17:43:55] Analytics-Tech-community-metrics, Engineering-Community, ECT-August-2015, ECT-September-2015: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1589850 (Dicortazar)
[17:43:57] Analytics-Tech-community-metrics, ECT-August-2015: Exclude third-party / pulled upstream code repositories from metrics - https://phabricator.wikimedia.org/T103984#1589847 (Dicortazar) Open>Resolved
[17:44:38] Analytics-Tech-community-metrics, Engineering-Community, ECT-August-2015: Automated generation of (Gerrit) repositories for Korma - https://phabricator.wikimedia.org/T104845#1589854 (Dicortazar) Open>Resolved
[17:44:40] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1498048 (Dicortazar)
[17:47:20] Analytics-Tech-community-metrics, Engineering-Community, ECT-August-2015: Automated generation of (Gerrit) repositories for Korma - https://phabricator.wikimedia.org/T104845#1589863 (Dicortazar) As an example of this behaviour, the last commit by Automator with this respect is available at [1]. There y...
[18:31:46] madhuvishy: hiiii
[18:31:56] hey
[18:32:29] haha, maybe i don't have anything to say, just sayin hi and I can work on this again
[18:32:54] Hey! So I'm finally getting 'round to make use of the access to hive I was granted last month.
[18:33:08] Krinkle: hi!
[18:33:14] Running into a fair number of warnings, errors and deprecation notices. It seems they've not been reported in phabricator yet, but just checking if it's a known issue
[18:33:45] $ hive
[18:33:46] log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender.
[18:33:53] Logging initialized using configuration in file:/etc/hive/conf.analytics-hadoop/hive-log4j.properties
[18:33:53] WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
[18:34:14] yeah, the beeline thing is a thing. if you are new, maybe just start using that instead of hive cli!
[18:34:17] i think it is avail
[18:34:31] although, it may not be configured with good defaults, not sure
[18:34:32] I also get a few random things like 'WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext' in the middle of SELECT query results
[18:34:33] i haven't used it much
[18:34:48] that's weird
[18:34:57] hive (wmf)> SELECT uri_host, is_pageview, user_agent_map["browser_family"] ua_browser, user_agent_map["browser_major"] ua_browser_v FROM webrequest WHERE year=2015 AND month=8 AND day=30 AND hour=20 LIMIT 200;
[18:35:24] I don't know if there's a better way to do what I'm doing. Just prodding around to get an idea of the data
[18:35:39] looks fine to me i think.
dunno about that warning though
[18:35:42] This query yields a dozen of these warnings:
[18:35:43] Aug 31, 2015 6:30:47 PM WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
[18:35:43] Aug 31, 2015 6:30:47 PM INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 16501 records.
[18:35:43] Aug 31, 2015 6:30:47 PM INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
[18:35:45] Aug 31, 2015 6:30:47 PM INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 40 ms. row count = 16501
[18:36:26] Overall though, I'm happy to see user_agent_map. Very useful.
[18:37:10] Krinkle: i have never seen those warnings before, or even those info messages. strange.
[18:37:14] joal: have you seen those? ^^
[18:37:24] Krinkle: btw, did you see this?
[18:37:24] https://phabricator.wikimedia.org/T110903
[18:38:39] Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#1590166 (Krinkle) a:Krinkle>None
[18:38:46] hehe
[18:38:52] Krinkle says NO
[18:39:05] Hehe, I didn't see it yet.
[18:39:18] I use assignee very sparingly as what I'm actively working on
[18:39:21] aye :)
[18:39:24] I might, or maybe someone else will
[18:39:25] thx
[18:39:27] maybe i should do that
[18:39:33] i have way too many tasks assigned to me, heh
[18:39:34] ottomata: can you double check that input url?
[18:39:37] Looks funny to me
[18:39:44] which part?
:)
[18:39:47] * Krinkle knows 0.01 about kafka
[18:39:52] oh, it is
[18:39:55] every part to be honest
[18:39:59] there ia typs
[18:40:01] typo
[18:40:03] double proto
[18:40:35] Analytics-EventLogging, Performance-Team: Make webperf eventlogging consumers use eventlogging on Kafka - https://phabricator.wikimedia.org/T110903#1590185 (Ottomata)
[18:40:35] comma separated hostnames, and no slash before the query?
[18:40:45] it's the path
[18:40:47] that is an eventlogging uri
[18:40:55] ok
[18:40:58] we just used the path as the parameter to the kafka consumer
[18:41:03] since eventlogging is configured via uris
[18:41:09] it'd be nicer as the host, but no commas allowed
[18:41:28] hence
[18:41:29] kafka:///
[18:41:31] I don't follow
[18:41:32] w 3 slashes
[18:41:33] no hostname
[18:41:37] Oh, righ
[18:41:38] t
[18:42:04] it seems odd to hardcode all that in the webperf consuming application though
[18:42:09] Krinkle: https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/handlers.py#L97
[18:42:10] Doesn't feel right or scalable/maintainable.
[18:42:14] naw, you should get it from puppet
[18:42:14] really
[18:42:15] like
[18:42:34] is there no loadbalancer or consumer proxy?
[18:42:49] or canonical round robin?
[18:42:52] those URIs are just for bootstrapping
[18:42:56] the first request
[18:43:10] on startup kafka client asks for the cluster topology
[18:43:15] I'll have to revive my kafka context, I forgot it all.
[18:43:28] But not right now. Monday rush still kicking in.
[18:43:53] aye :)
[18:49:46] Quite cool to see the query dispatch. I didn't realise at first since my query was so simple. But now that it covers more rows I get a different experience. It visibly enters hadoop and starts showing progress of the map reduce.
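Editor's note on the kafka:/// URI form discussed at 18:40-18:43: the sketch below shows how a URI with an empty host and a comma-separated broker list in the path can be split apart with Python's standard library. It is a minimal illustration under stated assumptions, not the EventLogging handler linked above; the function name `parse_kafka_uri`, the hostnames, the port, and the `topic` parameter are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def parse_kafka_uri(uri):
    """Split a kafka:/// style URI into (brokers, options).

    Hypothetical helper for illustration only; not EventLogging's own code.
    """
    parts = urlsplit(uri)
    if parts.scheme != "kafka":
        raise ValueError("expected a kafka:// URI, got %r" % uri)
    # With three slashes the host (netloc) is empty, so the
    # comma-separated broker list ends up in the path component.
    brokers = [b for b in parts.path.lstrip("/").split(",") if b]
    # Keep the last value if a query parameter is repeated.
    options = {k: v[-1] for k, v in parse_qs(parts.query).items()}
    return brokers, options


# Hostnames and topic are made up for this example.
brokers, options = parse_kafka_uri(
    "kafka:///broker1.example.org:9092,broker2.example.org:9092?topic=example_topic"
)
print(brokers)
print(options)
```

As noted in the chat, the brokers listed this way are only used for bootstrapping; a Kafka client asks the cluster for its topology on startup.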
[18:50:17] aye
[18:58:28] Analytics, Discovery, MediaWiki-General-or-Unknown, Services, and 5 others: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#1590270 (GWicke)
[19:00:21] ottomata: sorry was walking to office. batcave now?
[19:01:12] sure, hm, lemme poke at this a bit, unless you got some news
[19:01:29] i'm going to try to reproduce in a smaller environment in labs, and then see if i can trace the offset request stuff in the pykafka code
[19:01:32] no, i'm trying to read the code and see what it's trying to do
[19:01:35] k
[19:01:35] cool
[19:01:40] aah cool
[19:01:44] i'd like to watch
[19:01:47] ok
[19:03:33] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1590295 (Aklapper) > What about one graph with these three lines instead? [[ https://www.mediawiki.org/wiki/Community_metrics#Active_Gerrit...
[19:13:10] ottomata: I don't know those warnings
[19:13:20] I'll have a look tomorrow morning
[19:23:38] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Mysterious repository breakdown(s)/sorting order - https://phabricator.wikimedia.org/T103474#1590366 (Aklapper)
[19:24:08] Analytics-Tech-community-metrics, ECT-August-2015, ECT-September-2015: Labeling some bots (active in Git/Gerrit) as bots - https://phabricator.wikimedia.org/T110545#1590368 (Aklapper)
[19:26:32] Analytics-Tech-community-metrics, ECT-September-2015: "Tickets" (defunct Bugzilla) vs "Maniphest" sections on korma are confusing - https://phabricator.wikimedia.org/T106037#1590386 (Aklapper) >>! In T106037#1567056, @Aklapper wrote: > @Dicortazar: What's the "right" approach to get this fixed? I need fe...
[19:42:54] Bye a-team !
[19:42:59] see you tomorrow :)
[19:43:02] bye joal!
[19:43:56] bye!
[20:17:10] madhuvishy: https://github.com/Parsely/pykafka/issues/241
[20:17:49] ottomata: cool
[20:26:02] Analytics, Discovery, MediaWiki-General-or-Unknown, Services, and 5 others: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#933968 (Ottomata)
[20:26:24] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 6 others: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#933968 (Ottomata)
[20:28:01] Analytics-Tech-community-metrics, Engineering-Community, ECT-September-2015: Automated generation of (Git) repositories for Korma - https://phabricator.wikimedia.org/T110678#1590569 (Qgil) Just checking: Git stats will not be affected by the blacklist at https://github.com/Bitergia/mediawiki-repositori...
[20:35:02] Analytics-Tech-community-metrics: Checking code review metrics for a specific repository may take dozens of clicks - https://phabricator.wikimedia.org/T110520#1590589 (Aklapper) p:Low>Normal
[20:49:29] Analytics-Tech-community-metrics: Checking code review metrics for a specific repository may take dozens of clicks - https://phabricator.wikimedia.org/T110520#1590595 (Aklapper) @Dicortazar: Any idea how complicated would this be? Would love to make it easy for developer teams to find these three stats for t...
[21:25:12] Analytics-Tech-community-metrics, ECT-September-2015: Provide open changeset snapshot data on Sep 22 and Sep 24 (for Gerrit Cleanup Day) - https://phabricator.wikimedia.org/T110947#1590709 (Aklapper) NEW
[21:53:36] https://stats.wikimedia.org/EN/TablesArticlesGt1500Bytes.htm shows % of "Articles over 2 Kb". Except this shows 0% for most wikis since Feb 2014. Known issue?
[23:29:56] Analytics-Kanban: Archive obsolete schema pages {tick} [5 pts] - https://phabricator.wikimedia.org/T110247#1591175 (madhuvishy) I also looked through the list of schemas here - https://meta.wikimedia.org/wiki/Category:Schemas_lacking_a_purge_schedule, and marked schemas that are active, but haven't been edite...