[08:50:11] Analytics-Wikistats, and 1 other: stats.wikimedia.org is visually unattractive - https://phabricator.wikimedia.org/T28353#1040927 (Qgil) a:ezachte>None
[14:13:31] Analytics: Report Edits for 2014 Oct-Dec - https://phabricator.wikimedia.org/T89284#1041453 (ezachte) So we estimate a final count of 9,433,673 edits for all Wikipedias in December. {F41676}
[14:14:04] Analytics, Analytics-Kanban: Report metrics for Quarterly Report 2014 Oct-Dec - https://phabricator.wikimedia.org/T89024#1041455 (ezachte)
[14:14:05] Analytics: Report Edits for 2014 Oct-Dec - https://phabricator.wikimedia.org/T89284#1041454 (ezachte) Open>Resolved
[14:16:05] Analytics: Report New editors per month in 2014 Oct-Dec - https://phabricator.wikimedia.org/T89277#1041463 (ezachte) I don't think we can apply this estimation technique used for other wikistats metrics to new editors. This metric is much more influenced by dump-over-dump adjustments, not in the range of tent...
[14:26:43] Analytics-Cluster, Analytics-Engineering: Make gecoded data and chosen client_ip available as fields in refined webrequest data - https://phabricator.wikimedia.org/T89401#1041510 (Ottomata) Not sure, but this may be helpful once we upgrade (hopefully today): https://issues.apache.org/jira/browse/HIVE-6456
[15:01:04] Hey ottomata, remember us talking about Mesos / Yarn ? Here is some interesting reading : http://radar.oreilly.com/2015/02/a-tale-of-two-clusters-mesos-and-yarn.html
[15:01:21] Have a good upgrade ;)
[15:01:29] hm, cool! will read soon
[15:01:29] :)
[15:48:16] Hey ottomata, me again :)
[15:48:47] Do you have any tool to automatise creation of test env for hadoop, or should I go for making one ?
[15:50:15] hm, welp, yes, in labs, yes. in vagrant yes, but I need to revisit that to make it work properly
[15:50:18] that is a todo for post migration
[15:51:55] I can imagine
[15:51:56] qchris had a really good page
[15:52:13] wikitech ?
[15:53:00] i can't find it!
[15:53:00] hm
[15:55:16] FOUND IT!
[15:55:16] https://wikitech.wikimedia.org/wiki/User:QChris/TestClusterSetup
[15:55:32] joal: ^
[15:58:09] Nice :)
[16:01:49] If you have the same kind of things for vagrant, I am up too !
[16:16:54] it won't work today, i will have to get that up and running very soon
[16:17:01] they upgraded vagrant to trusty
[16:17:08] and i had to get production in trusty working with cdh 5.3
[16:19:35] np ;)
[16:31:18] Hey ottomata, do you have explanations on why the namenode of the 'test cluster' as defined by qchris is using an external namenode ?
[16:35:21] this step?
[16:35:22] • Set hadoop_cluster_name to analytics-hadoop.
[16:35:29] which step?
[16:37:27] step: Set hadoop namenodes
[16:37:55] i don't understand the question
[16:38:01] 'external namenode'?
[16:38:03] --> tells to set the namenode to qchris-master.eqiad.wmnet
[16:38:20] ah
[16:38:22] vs .wmflabs
[16:38:27] ?
[16:38:29] that seems to be a mistake
[16:38:31] I would have done qchris-master.eqiad.wmflabs indeed
[16:38:33] yes
[16:39:04] I don't think so, since in configuration of worker1, it tells explicitly: no typo ...
[16:39:20] weirdo ...
[16:45:04] Analytics, Multimedia: Set up varnish 204 beacon endpoint for virtual media views - https://phabricator.wikimedia.org/T89088#1041708 (Gilles)
[16:47:06] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Set up varnish 204 beacon endpoint for virtual media views and use it in Media Viewer - https://phabricator.wikimedia.org/T89088#1041714 (Gilles)
[16:48:00] yeah joal, that is weird
[16:48:04] it should def be .eqiad.wmflabs
[16:50:47] Yup
[16:51:38] I am currently setting things up, and as you say "17:48:04 < ottomata> it should def be .eqiad.wmflabs"
[16:53:43] cool
[16:53:55] this may be a problem!
[16:53:56] https://issues.apache.org/jira/browse/HIVE-6367
[16:54:08] the underlying code to do hive parquet integration has changed
[16:54:10] in cdh 5.3.1
[16:54:14] apparently the new code doesn't support this
[16:54:17] in hive 0.13.0
[16:54:19] https://issues.apache.org/jira/browse/HIVE-6384
[16:54:44] can't read the wmf.webrequest table!
[16:54:44] Caused by: java.lang.NoSuchFieldError: DECIMAL
[16:55:01] aouch
[16:58:35] hm... reading the definition of the wmf.webrequest table, I can't find a decimal field ...
[16:59:32] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1041754 (Acs) Ok, so no changes for Bugzilla repositories. For source code, add all projects from gerrit.wikimedia.org (cvsanaly2 analysis and gerrit analysi...
[16:59:46] there is a double, which i thought might be it.
[17:00:00] but, i'm trying to use parquet without that field in a test table right now, still getting that problem
[17:04:46] bucket ?
[17:06:24] bucket?
[17:07:16] table is bucketized
[17:08:21] yes
[17:08:31] doh, joal, actually, hm, i think it is working, i just did something really dumb.
[17:08:36] it seems like it shouldn't work, based on those JIRAs
[17:08:36] but
[17:08:48] i didn't have a step to upgrade the parquet packages everywhere
[17:08:53] so 5.0.1s were still out there
[17:09:00] i just installed 5.3.1s everywhere, now it seems to work
[17:09:46] :)
[17:09:55] Better this way !
[17:13:15] yes and I was afraid of some compression codec problem too, but now this seems to have solved that
[17:13:26] i was getting warnings about how parquet was looking up the compression format, conflicting with the default from hadoop configs
[17:13:28] but now it is cool
[17:13:29] phew!
[17:13:34] close one.
[17:14:05] hm
[17:14:37] hmmmm
[17:14:38] maybe.
[17:14:41] triple checking.
[17:17:34] hmm, yeah, hm. i think the new parquet format does not respect hadoop setting
[17:17:40] you have to set it manually with a parquet setting
[17:17:51] SET parquet.compression=SNAPPY;
[17:19:28] mouarf ... if not perfect, at least feasible ;)
[17:23:36] yeah, i'm going to set it in mapred-site.xml next to the other compression settings
[17:23:40] by default
[17:27:15] Analytics: Wikimania submissions - https://phabricator.wikimedia.org/T89486#1041856 (leila)
[17:47:33] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1041895 (Acs) Quim, checking the complete list of repositories, you have 1203 repositories. So, do you want to manage the full 1203 repositories in the dash...
[18:26:58] Interesting (and almost complete) summary : http://radar.oreilly.com/2015/02/processing-frameworks-for-hadoop.html
[18:28:15] joal, i've been (almost by myself) using this pinboard to collect links
[18:28:15] https://pinboard.in/u:wmfa
[18:29:46] but do post them here too, i might not notice them there
[18:29:52] i mainly put them there so I don't forget about them
[18:35:45] ottomata: is the cluster usable today again or is it going to be down?
[18:36:49] it should be back up shortly, i'm keeping it down for a bit, i have one last step - oozie, but the db backup is taking a while
[18:37:02] i could probably bring it officially back on (it sorta is), but i'd rather have everything 100% before I say it's ok
[18:37:51] ottomata: sounds good
[19:11:04] woo, nuria, joal, should be back up and good.
[19:11:12] camus just launched so it should be catching up data now
[19:33:38] ottomata: ok
[20:27:42] hmm, some issues with oozie load and refine jobs, and probably others, think i got it though...
[20:40:51] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1042241 (Qgil) Is this a problem? Many of these are mostly inactive.
[20:59:13] joal: how's labs goin?
[21:16:36] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1042276 (Acs) >>! In T86154#1042241, @Qgil wrote: > Is this a problem? Many of these are mostly inactive. The number is not a big problem ... more time to g...
[21:30:12] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1042298 (Qgil) As discussed in {T86630}, the minimum requirement is to have this data updated on a monthly basis. It's fine if generating the stats takes a wh...
[21:36:23] Hey ottomata
[21:36:35] Labs fun !
[21:37:15] I'll have few questions tomorrow, but globally I like very much your scripting !
[21:37:42] ha, you mean the puppet stuff?
[21:38:03] puppet, python for hive, oozie
[21:38:09] oh ha, yeah
[21:38:12] You have clean stuff ;)
[21:38:14] thanks
[21:38:32] You're very welcome, it's a pleasure to read code when written like that
[21:39:15] So, as said, a few questions tomorrow in your morning if you have time
[21:39:19] for sure
[21:39:29] man this hive thing is annoying, one little problem after another. I am close!
[21:39:39] but ja, tomorrow should be goooood
[21:39:43] :)
[21:40:01] I don't have access yet, approval needed on Operations
[21:41:13] Yeah, tomorrow, some discussions about frameworks (the last link pinned), and maybe some more work on automation (spark on the way ?)
[21:41:28] And obviously, the ticket I have to work on :)
[21:42:53] Wish you good afternoon, and a hive fast success !
[21:43:03] Talk tomorrow :)
[21:44:01] oh yeah, enough time has passed, i can get you all that access now
[21:44:04] laters!
[21:57:24] Analytics, Analytics-Kanban: Report metrics for Quarterly Report 2014 Oct-Dec - https://phabricator.wikimedia.org/T89024#1042355 (Tbayer)
[21:57:25] Analytics: Report New editors per month in 2014 Oct-Dec - https://phabricator.wikimedia.org/T89277#1042353 (Tbayer) Open>Resolved Per ErikZ's argument, recorded in the email thread over the weekend, I have marked this as "TBD" in the [[https://commons.wikimedia.org/w/index.php?title=File%3AWikimedia_Fou...
[22:06:19] Analytics: Report New articles for 2014 Oct-Dec - https://phabricator.wikimedia.org/T89283#1042359 (Tbayer) Open>Resolved Per discussion, I used Daniel's stats as record by Emausbot for the [[ https://commons.wikimedia.org/w/index.php?title=File%3AWikimedia_Foundation_Quarterly_Report%2C_FY_2014-15_Q2_(...
[22:06:20] Analytics, Analytics-Kanban: Report metrics for Quarterly Report 2014 Oct-Dec - https://phabricator.wikimedia.org/T89024#1042361 (Tbayer)
[22:10:02] Analytics: Report visitors (comScore) in 2014 Oct-Dec - https://phabricator.wikimedia.org/T89281#1042363 (Tbayer) Open>Resolved
[22:10:03] Analytics, Analytics-Kanban: Report metrics for Quarterly Report 2014 Oct-Dec - https://phabricator.wikimedia.org/T89024#1042364 (Tbayer)
[22:11:52] Analytics, Analytics-Kanban: Report metrics for Quarterly Report 2014 Oct-Dec - https://phabricator.wikimedia.org/T89024#1025696 (Tbayer)
[22:11:53] Analytics-Wikistats: Provide total active editors for December 2014 - https://phabricator.wikimedia.org/T88403#1042373 (Tbayer) Open>Resolved Thanks again, ErikZ! I used this (marked as "est.") in the [[https://commons.wikimedia.org/w/index.php?title=File%3AWikimedia_Foundation_Quarterly_Report%2C_FY_20...
[23:43:38] Possible-Tech-Projects, Analytics-General-or-Unknown: Pageviews for Wikiprojects and Task Forces in Languages other than English - https://phabricator.wikimedia.org/T56184#1042601 (Qgil) @kevinator and others, do you think this task has the volume of work and complexity suitable for GSoC / Outreachy? (I have...
[23:54:12] ottomata: About the .eqiad.wmnet vs .eqiad.wmflabs . I wanted to avoid that people just naively copy/paste the values, as that would make them abuse my qchris labs test cluster.
[23:54:32] Mhmmm.
[23:54:47] How to do that in a less confusing way?
[23:55:16] I mean ... I could use the eqiad.wmflabs and say somewhere in the page to replace the qchris in the names with their usernames.
[23:55:24] But I guess no one would read that.
[23:55:38] Possible-Tech-Projects, Analytics-General-or-Unknown: Pageviews for Wikiprojects and Task Forces in Languages other than English - https://phabricator.wikimedia.org/T56184#1042628 (Doc_James) I would still love to see this project up and running. I think it has potential not only to increase participation but...
[23:55:42] Hence, I mangled the domain name, so that everybody stumbles across it.
[23:55:55] Mhmm.
[23:56:00] How to do that nicer?
[23:58:55] use fake names altogether
[23:59:00] that end in .eqiad.wmflabs
[23:59:12] my-worker1.eqiad.wmflabs
[23:59:23] .eqiad.wmflabs
[23:59:24] or something?
[23:59:25] Mhmm. k
[23:59:39] Fake the first level, not the last.
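
The naming suggestion at the end of the log (fake the first label, keep the real .eqiad.wmflabs suffix) could be sketched roughly as below. This is only an illustration: `placeholder_host`, the `my` prefix default and the role names are hypothetical and are not part of the wikitech page or any Wikimedia tooling.

```python
# Hypothetical helper for documentation examples; not taken from any real tooling.
LABS_DOMAIN = "eqiad.wmflabs"  # domain suffix that the chat agrees is the correct one


def placeholder_host(role: str, prefix: str = "my", domain: str = LABS_DOMAIN) -> str:
    """Build an example hostname such as 'my-worker1.eqiad.wmflabs'.

    Only the first label is fake; the domain suffix stays real, so a reader
    who copies the value still has to substitute their own instance prefix.
    """
    if not role or "." in role or not prefix or "." in prefix:
        raise ValueError("prefix and role must be non-empty single DNS labels")
    return f"{prefix}-{role}.{domain}"


if __name__ == "__main__":
    # Print example names for a small test cluster (role names are assumptions).
    for role in ("master", "worker1"):
        print(placeholder_host(role))
```

Someone following the setup page would then pass their own prefix, for example `placeholder_host("master", prefix="joal")`, instead of pointing at another person's test cluster.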