[00:24:11] status: overwhelmed by /query windows [00:24:16] (PS4) Nuria: Fetch Pageview Data from pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) [00:25:27] (CR) Nuria: [C: -1] "Voting -1 cause this still needs tests but adding dan to CR to verify that we want to use native promises, this means that UI will not wor" [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [00:27:26] (PS5) Nuria: Fetch Pageview Data from Pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) [00:37:41] hmm, something about the cdh upgrade changed how spark jobs in oozie find SPARK_HOME...fun next thing to figure out :) [01:10:34] (CR) Nuria: "Opened pull request to add bower.json. We can maintain our own clone if it doesn't work." [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [01:37:23] anyone here who is an admin of hue.wikimedia.org or so? [01:37:35] a user reports all of a sudden they cant login anymore [01:37:57] all i know is it used to be based on the "wmf" LDAP group but that i checked and is still normal [01:38:32] i'm asking because it's called analytics_cluster::hue [01:39:46] Analytics, Dumps-Generation: Provide a way to check if a dump has been generated - https://phabricator.wikimedia.org/T126808#2023999 (ArielGlenn) A new dump for any project, or for a specific one, or... ? [01:40:11] mutante: it requires membership in the analytics-privatedata-users group in admin.yaml [01:40:23] or admin/data/data.yaml rather [01:41:54] ori: oh, thanks, let me check that [01:43:02] ori: sure it's not "statistics-privatedata-users"? [01:43:14] he is in that but not the other [01:43:20] strange is he claims there was a recent change [01:43:28] checks more git log [01:43:47] https://phabricator.wikimedia.org/T113069 [01:46:27] will find out via the ticket , added otto, bbl [01:52:17] mutante: analytics-privatedata-users creates a hadoop uer [01:52:19] user [01:52:48] statistics-privatedata-users gives access to stat1002 but does not create a hadoop user [04:41:41] madhuvishy: thank you, the interesting part would be "why did it change recently for the user" [04:42:09] madhuvishy: i think he just meant the web login on hue.wm.org [05:07:22] mutante: yeah that's just LDAP creds. Otto has to sync the creds sometimes and we did a full hadoop and related ecosystem upgrade today, so something might have happened [05:10:29] madhuvishy: that would make sense about the sync, i already confirmed the LDAP group membership is there, ok, so i already assigned it to otto who synced it last time , i'm sure it will work out via ticket. thanks again [05:11:27] mutante: cool. Np :) [07:33:02] Analytics, Pageviews-API: Pageviews API not updated on 2/18/2016 at 8;34 utc - https://phabricator.wikimedia.org/T127414#2058783 (Alexdruk) not updated on Feb 24 at 7:32 UTC example: https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/mobile-web/2016/02/23 [07:47:11] Analytics-Tech-community-metrics, DevRel-February-2016: Make GrimoireLib display *one* consistent name for one user, plus the *current* affiliation of a user - https://phabricator.wikimedia.org/T118169#2058788 (Lcanasdiaz) Panel http://korma.wmflabs.org/browser/top-contributors.html is **not** working... [08:23:47] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2058819 (Romaine) Please include in the domain `.wikimedia.org` the projects `nl` and `be` as well in the pageview API. [11:20:00] hi a-team :] [11:23:35] hello mforns ! [11:23:58] joal: 2016-02-23T22/1H looks bad in the last report.. is it when we made the switch? [11:38:07] * elukey is afk for lunch [12:05:47] Hi mforns, hi elukey [12:05:58] sorry guys, late night, so late morning for me :) [12:07:22] elukey: about wrong line in report, here is the full story [12:07:39] Yesterday ottomata upgraded the cluster [12:08:02] As decided, he also changed oozie and hive servers from an1027 to an1015 [12:08:59] The thing we forgot was that oozie jobs were using hive-site.xml, that would change from moving to an1015 [12:09:32] So pause/resume for oozie jobs was not enough, we had to kill/restart them (but obviously, we realised that at the resume mopment) [12:10:31] I killed/restarted all the jobs, and while monitoring them, realised that something was wrong with data loading [12:11:16] I checked camus - fine - the issue was with the camus-checker --> with the upgrade, some script we used to generate classpath for it was not available anymore [12:13:37] ottomata found a new way to define checker's classpath that's even better because now it outputs its logs correctly (Thanks again ottomata :) [12:13:46] And ingestion starated back [12:14:18] But, being a half day late, some load jobs had been waiting too long - and failed because of timeout. [12:19:45] I restarted them this morning when I checked the cluster status [12:22:20] ahhhhhh [12:22:29] thanks a lot for the looong update, really appreciated :) [12:22:45] really sorry that I wasn't here for the upgrade yesterday :( [12:23:16] elukey: not to worry, the upgrade went fine, it's the rest ;) [12:23:55] elukey: we had to tweak mysql conf a bit for oozie to be more responsive, restart the jobs, and find the classpath solution [12:23:58] Busy night :) [12:24:18] I can imagine :( [12:26:59] joal, whenever you have 10 minutes I have a couple of questions about https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/PageViewDumps (basically how to do the first tests without causing data loss or weird things :) [12:27:15] elukey: when you want :) [12:27:22] batcave now ? [12:27:29] sure! [12:27:37] or more precisely, in two minutes (sorry) [12:29:00] Joining :) [12:49:07] Analytics-Tech-community-metrics, DevRel-February-2016: Mailing lists recently added to korma do not have "Top senders" data created (JSON file is 404) - https://phabricator.wikimedia.org/T123929#2059347 (Lcanasdiaz) This is a bug in code. Some weeks/months ago we improved the way we query SQL in order to... [12:56:53] Analytics-Tech-community-metrics, DevRel-February-2016: Make GrimoireLib display *one* consistent name for one user, plus the *current* affiliation of a user - https://phabricator.wikimedia.org/T118169#2059393 (Aklapper) >>! In T118169#2058788, @Lcanasdiaz wrote: > Panel http://korma.wmflabs.org/browser/t... [13:05:50] Analytics-Tech-community-metrics: top-contributors.html empty due to 404s for several JSON files - https://phabricator.wikimedia.org/T126971#2059422 (Lcanasdiaz) a:Lcanasdiaz [13:59:50] Analytics-EventLogging, Analytics-Kanban: EventLogging needs to be ready for codfw failover - https://phabricator.wikimedia.org/T127209#2035752 (faidon) UDP traffic can get from codfw to eqiad in general — the two DCs are interconnected (although keep in mind that the fibers may be wiretapped and thus no... [14:00:00] Analytics-EventLogging, Analytics-Kanban, codfw-rollout, codfw-rollout-Jan-Mar-2016: EventLogging needs to be ready for codfw failover - https://phabricator.wikimedia.org/T127209#2059632 (faidon) [14:14:13] Analytics-Tech-community-metrics, DevRel-February-2016: Mailing lists recently added to korma do not have "Top senders" data created (JSON file is 404) - https://phabricator.wikimedia.org/T123929#2059691 (Lcanasdiaz) Fixed. I've tested it with the URLs provided above. [14:15:42] Analytics-Tech-community-metrics, DevRel-February-2016: Key performance indicator: Top contributors: Find good Ranking algorithm fix bugs on page - https://phabricator.wikimedia.org/T64221#2059695 (Lcanasdiaz) [14:15:44] Analytics-Tech-community-metrics, DevRel-February-2016: Mailing lists recently added to korma do not have "Top senders" data created (JSON file is 404) - https://phabricator.wikimedia.org/T123929#2059694 (Lcanasdiaz) Open>Resolved [14:28:05] joal: second iteration - https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/PageViewDumps [14:28:46] The plan would be to re-use refinery's stuff as much as possible, creating only a new workflows.xml and coordinator.properties. The dumps will go to a tmp dir too [14:28:52] (whenever you have time) [14:33:19] I am sure that something is wrong in my doc, but I am starting to understand how the refinery works.. thanks! [14:35:58] elukey: some bugs :) [14:38:00] ah yes for sure, I wouldn't have expected anything else :) [14:44:50] elukey: finishing some code, will review with you in minutes [14:46:54] joal: whenever you have time, I am not in a hurry :) [14:48:49] (PS1) Joal: Update CamusPartitionChecker (errors and history) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/272977 (https://phabricator.wikimedia.org/T127909) [14:51:56] elukey: batcave? [14:55:16] sure! [14:56:17] GOODMORNININGNGGNG [14:56:32] morning [14:56:32] ottomata, hiii [14:57:07] Heya ottomata ! [14:57:09] hey ottomata, do you have 10 mins to help me with puppet today? [14:57:40] hm, now I'm starting to wonder if some of ottomata's repeated letters are just DNA strands and he communicates on a higher meta plane [14:58:21] :] [14:58:48] maybe NGNGGNG codes for a protein that helps the morning after long nights of fighting with the cluster [14:59:15] Analytics-Tech-community-metrics, DevRel-February-2016: Many profiles on profile.html do not display identity's name though data is available - https://phabricator.wikimedia.org/T117871#2059895 (Aklapper) >>! In T117871#1955255, @Aklapper wrote: > Ah. Going to http://korma.wmflabs.org/browser/mls.html , u... [15:00:11] maybe he wrote Berrrrrliinnnnnnnn in a way that we would be unconsciously tricked to accept his proposal [15:00:29] mforns: definitely! [15:00:45] Analytics-Tech-community-metrics, DevRel-February-2016: Many profiles on profile.html do not display identity's name though data is available - https://phabricator.wikimedia.org/T117871#2059898 (Aklapper) ...and e.g. http://korma.wmflabs.org/browser/people.html?id=07f18356dbd6e1ed014a8cbc68161a3dcfb0202b... [15:00:52] :] [15:00:57] man, letters are free [15:01:01] i'm gonna use em all [15:02:51] mforns: whatcha doin? [15:03:11] Analytics-Tech-community-metrics: top-contributors.html empty due to 404s for several JSON files - https://phabricator.wikimedia.org/T126971#2059905 (Aklapper) Regarding 404s for JSON files, T117871#2059898 is another example (which I should have probably added here as a comment instead). [15:03:12] ottomata, I'm trying to puppetize reportupdater into stat1002 [15:03:52] trying to reuse limn/data/generate, in a way that does not install legacy stuff into 1002 [15:04:17] I have a couple questions, if you have time, we can batcave maybe? [15:06:33] ja mforns gimme 10ish mins [15:06:47] ottomata, sure whenever is better for you [15:11:09] (CR) Ottomata: [C: 1] "NICE!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/272977 (https://phabricator.wikimedia.org/T127909) (owner: Joal) [15:11:15] (CR) Ottomata: [C: 2] Update CamusPartitionChecker (errors and history) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/272977 (https://phabricator.wikimedia.org/T127909) (owner: Joal) [15:18:01] (Merged) jenkins-bot: Update CamusPartitionChecker (errors and history) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/272977 (https://phabricator.wikimedia.org/T127909) (owner: Joal) [15:18:01] (Merged) jenkins-bot: Update CamusPartitionChecker (errors and history) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/272977 (https://phabricator.wikimedia.org/T127909) (owner: Joal) [15:19:24] Thanks ottomata :) [15:22:25] thank YOU joal that thing is such a handy tool [15:30:30] Analytics-EventLogging, Analytics-Kanban, codfw-rollout, codfw-rollout-Jan-Mar-2016: EventLogging needs to be ready for codfw failover - https://phabricator.wikimedia.org/T127209#2060009 (Ottomata) Traffic is not multicast, it is direct from app servers -> eventlog1001. Hitting the beacon/event.gif... [15:44:21] mforns: bc? [15:44:28] ottomata, sure omw [15:46:29] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060052 (Nuria) @Romaine: the PageviewAPi doesn't return pageviews for all domains, as @JAllemandou said there is a criteria behind what is consider a pageview. @Ironholds, @DarTar We do no... [15:48:21] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060058 (Ironholds) Because they're not "production" wikis; the goal was to avoid counting meta-pageviews (chapter traffic, say) towards our metrics. [15:48:37] mforns: ? [15:49:03] thanks nuria :) [15:54:03] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2054400 (Milimetric) We have gotten this request from pretty much everyone that's been excluded by that policy, though, @Nuria and @Ironholds. It might make sense to not report on those as part... [15:58:19] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060106 (Nuria) @milimetric: I think we can do that, it requires several modifications we need to plan for : Let's keep those wikis with is_pageview= false (so as @Ironholds said it is clear wha... [15:58:26] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060107 (Nuria) p:Triage>Normal [15:59:35] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060111 (Milimetric) Yeah, that's what I meant, Nuria, because the split might make sense on our back-end but not the front-end (pageview API) [16:01:15] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060122 (Ironholds) Sounds good to me! [16:01:36] nuria, evidently milimetric has taken over my title of "lightning-fast message responder" :D [16:01:49] Ironholds: now, don't be jealous [16:01:59] I'm not, I'm just glad someone has inherited! [16:02:14] It's like highlander but instead of MOODY LIGHTING AND STORMS it's 30s-spaced emails [16:02:14] you're a little jealous :P [16:02:32] milimetric, I was gonna be jealous, then I looked at my new hourly rate and business cards, and now I am not jealous again :D [16:02:45] awww snap! :) [16:02:46] Doing the for-profit thing for a year is gonna be all new kinds of interesting! [16:03:17] I wish you all the best man, but just as a guideline, it took me about 6 years of for-profit to choke on that smog [16:03:39] milimetric, like, leave? [16:03:46] I'm very well off for the experience and $$$ I got [16:03:49] because I have a place to be in Autumn 2017 [16:03:59] I'm not in it for the long-haul unless something goes tremendously awry [16:04:06] but yeah, I left as fast as physics allowed [16:04:22] hahah [16:04:33] I'm confused. It took you 6 years but you left as soon as you could? [16:04:42] elukey: send you a new event, try to join: https://plus.google.com/hangouts/_/wikimedia.org/nuria-luca [16:04:48] well, it was all part of this plan I had in college [16:04:57] make money, get experience, start a business [16:05:05] except I did it a little backwards [16:05:45] Ironholds: change your nick to mr_richy_rich so people know who they are talking to [16:05:56] lose all money I made on taking ladies on dates, get experience, start a business, fail, become completely disillusioned with capitalism overall, find out there's such a thing as non-profits (idea which never even occurred to me), leave as fast as physics allowed [16:06:15] milimetric, oh my god you're right I can go on DATES now [16:06:23] lol, exactly [16:06:25] maybe I can buy BOOKS [16:06:31] ...oh god my savings plan is screwed [16:06:46] (I am saving up for a PhD. It's gonna be fun on a bun!) [16:14:34] joal: when you say "less10_hosts_countries" in your spreadsheet, do you mean like "select count(*) from last_access_uniques_daily group by uri_host, country having count(*) < 10"? [16:15:00] milimetric: I had a bet on you noticing that :) [16:17:09] milimetric: I have not done it the way you do, but it's similar yes [16:31:13] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade to CDH 5.5 {hawk} - https://phabricator.wikimedia.org/T119646#2060250 (Ottomata) a:Ottomata [16:31:39] madhuvishy: staddduppp [16:38:12] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060265 (Romaine) >>! In T127804#2060090, @Milimetric wrote: > We have gotten this request from pretty much everyone that's been excluded by that policy, though, @Nuria and @Ironholds. It might... [16:38:53] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2060267 (Milimetric) Cool, we'll use this ticket then to prioritize that work. [16:39:04] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org {melc} - https://phabricator.wikimedia.org/T127804#2060270 (Milimetric) [16:52:03] ottomata: do we have that setup ? http://stackoverflow.com/questions/17298784/deleting-jobs-from-oozies-web-ui [16:53:34] OOoO looking [16:54:18] hmm, joal there are sane defaults [16:54:25] ok [16:54:37] So that means there is 40G of jobs for only 30 days ? [16:54:52] oozie.service.PurgeService.older.than [16:54:52] 30 [16:54:54] hmm [16:55:11] let me see! [16:55:36] Thing I'm not sure of is the purge service: I had understood that it needed to be started or something like that [16:56:09] nope, earliest job is created_time: 2014-07-24 15:11:05 [16:56:46] That's long ago :) [16:56:57] "However, actions associated with long-running coordinators do not purge until the coordinators complete." [16:57:16] No way we have 40G of more-than-a-month long jobs [16:57:33] Ooooooh ! Coordinator completes ! [16:57:36] Man .... [16:57:45] hm [16:57:55] but still, we've restarted coordsmore often than that [16:58:09] depends which I guess, but most of them yes [16:59:38] ah ha [16:59:49] oozie.service.PurgeService.purge.old.coord.action [16:59:51] https://oozie.apache.org/docs/4.1.0/oozie-default.xml [16:59:53] default false [17:00:08] Riiiight [17:00:10] joal: making task :) [17:00:13] good find! [17:00:15] awesome :) [17:02:04] Analytics, Analytics-Cluster: Enable purging of old jobs and coordinators in oozie - https://phabricator.wikimedia.org/T127988#2060457 (Ottomata) [17:02:11] elukey: heyayyy [17:02:46] ottomata: o/ [17:03:01] we have the next meeting now right [17:03:02] ?? [17:03:15] I went away a sec and everybody was gone :D [17:03:21] hehe [17:03:31] eh? we are still in batcave [17:03:33] no? [17:03:51] mmmm in the invite I see another meeting room [17:03:57] milimetric: ^ [17:04:07] batcave or? [17:04:11] batcave [17:04:12] batcave [17:07:20] mforns: you might have missed ^, we're in the batcave [17:07:37] milimetric, coming! [17:15:13] Analytics, Analytics-Cluster: Use MySQL as Hue data backend store - https://phabricator.wikimedia.org/T127990#2060494 (Ottomata) [17:17:19] Analytics, Analytics-Cluster: Create regular backups of Analytics MySQL Meta instance - https://phabricator.wikimedia.org/T127991#2060511 (Ottomata) [17:29:30] Analytics: Create table in hive with a continent lookup for countries - https://phabricator.wikimedia.org/T127995#2060605 (Nuria) [17:29:37] Analytics: Create table in hive with a continent lookup for countries - https://phabricator.wikimedia.org/T127995#2060617 (Nuria) p:Triage>Normal [17:34:06] Analytics, Analytics-Cluster: Create regular backups of Analytics MySQL Meta instance - https://phabricator.wikimedia.org/T127991#2060635 (jcrespo) For data availability (not service availability/hardware issues), we perform regular backups from dbstore1001 into the backups server (although that depends... [17:42:04] ottomata: https://gerrit.wikimedia.org/r/#/c/273008/ [17:42:19] I need to write a better commit msg but I feel pretty good about it [17:51:55] ori cool! what happens in a non varnish setting, like mw vagrant? [17:52:37] also, why no recvFrom or timestamp anymore? [17:56:22] ottomata: soooo varnish-kafka.. I am reviewing the patch but I'd also like to test it in beta.. I guess I'd need to create a new deb with the patch right? [17:56:57] plus, where do we have a beta replica of varnish-kafka that I can trust? I was trying to spin up one but if we have something working I could use it [17:57:00] :) [17:58:20] no, you can just try it somewhere you don't have to make a new patch [17:58:32] yes [17:58:37] there are several in beta [17:58:41] deployment-cache-* [17:58:53] i just turned varnishkafka on those yesterday [17:58:58] but it should be the same as prod [17:59:05] you can just compile varnishkafka [17:59:11] and run a separate instance on one of those boxes if you like [17:59:22] ottomata: recvFrom and timestamp are part of the varnishlog output, which is why they're also not explicitly part of the client-side event payload [17:59:35] ah! right. [18:00:07] oh, and ori you are saying the the req will go back to the same varnish that is recvFrom anyway [18:00:08] ja? [18:01:07] sorry elukey you don't have to make a new *deb [18:01:11] just to try it [18:01:33] no? mmmmmm [18:02:37] elukey: especially since you don't need new librdkafka, and vk should be dynamically linked (I think), you might be able to compile and just copy binary to one of the deployment-cache-* nodes [18:02:41] deployment-cache-text04 maybe [18:02:52] and then run vk with the same or similar args [18:03:37] ah ok so the brutal way :D [18:03:38] or maybe you can statically compile it with deps like librdkafka...OOOR if you really have to, you can just sudo apt-get install whatever build tools you need on one of those cache boxes in deployment-prep [18:03:40] no one will mind :) [18:03:46] ja you are just testing [18:04:06] when you want to install on a prod box for testing, that's when we can start talking building a deb [18:04:11] ok I'll look into deployment-caches then [18:04:12] although, you can test a binary from your home dir too [18:04:17] like, on one of the misc boxes [18:04:29] won't hurt to compile and copy binary and just run it in your shell in prod for testing [18:04:35] (as long as you don't pollute topics with duplicates) [18:04:44] (you can produce to a test topic) [18:05:11] I'll try to study a test scenario and then I'll ask you to review, I am still not confident to avoid a mess :P [18:06:06] heh k [18:08:28] the other question that I have for everybody is about our dear oozie [18:08:59] I tried to run my workflow to generate a page view dump for may 2015 in a tmp folder, running manually oozie with a lot of -D etc.. [18:09:20] starting from the "transform" step of the regular page view workflow [18:09:23] buuut [18:09:25] ActionInputCheck:: File:hdfs://analytics-hadoop/wmf/data/wmf/webrequest/webrequest_source=text/year=2015/month=5/day=1/hour=1/_SUCCESS, Exists? :false [18:09:53] my job gets stuck with this, because there is nothing in those dirs of course [18:10:29] and I don't remember who is responsible of that check [18:10:40] you guys already told me but I forgot :P [18:17:43] elukey: you can't get webrequest data from May 2015 [18:17:52] it doesn't exist [18:18:01] you can get pageview_hourly data [18:18:23] which is what I assume you'd need to generate dumps [18:18:40] so i'm not sure why it's looking at webrequest [18:19:22] a-team, I'm off for tonight ! [18:19:28] madhuvishy: yep I got that, I was trying to start from transform_pageview_to_legacy_format.hql but oozie starts looking for webrequests data [18:19:32] right [18:19:35] joal: o/ thanks for today! [18:19:35] * madhuvishy looks [18:19:42] Except if there is anything I should help on :) [18:19:47] np elukey :) [18:20:16] elukey: sounds like you might need to edit the workflow.xml file [18:20:19] or maybe the coordinator [18:20:21] dpending on which you are doing [18:20:36] ottomata: yep I am doing it with https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/PageViewDumps [18:20:38] i betcha the whole pageview flow starts with webrequest, and then does pageview tables, then does dump files [18:20:46] yep yep [18:20:53] I am starting from transform_pageview_to_legacy_format.hql [18:21:14] something like [18:21:15] oozie job -config coordinator.properties -Duser=elukey -Dstart_time=2015-05-01T00:00Z -Dend_time=2015-05-01T01:00Z -Dworkflow_file=hdfs://analytics-hadoop/tmp/elukey/pageviewdumps/workflow.xml -Darchive_directory=hdfs://analytics-hadoop/tmp/elukey/pageviewdumps/archive -run [18:21:41] coordinator.properties is not the one in the wikipage, just the same one as pageview hourly [18:22:10] hm [18:22:50] elukey: can you paste your workflow? [18:22:52] hmm, but you are just running a workflow [18:22:58] why do you use coordinator.properties? [18:23:26] ottomata: i guess it can be called anything [18:23:30] it can but [18:23:31] he has [18:23:33] oozie.coord.application.path = ${coordinator_file} [18:23:35] whci is [18:23:39] right [18:23:41] coordinator_file = ${oozie_directory}/pageview/hourly/coordinator.xml [18:23:44] aah [18:23:45] yes [18:23:49] but it wont matter [18:23:51] but i think [18:23:54] sure it will [18:23:57] you have to tell it [18:24:02] to run as a workflow [18:24:06] because that coordinator depends on webrequest [18:24:10] right [18:24:11] you need to tell it the app path is a workflow [18:24:12] ja [18:24:41] elukey: instead of [18:24:53] oozie.coord.application.path [18:24:54] use [18:24:58] oozie.wf.application.path [18:25:03] and point that at your ${workflow_file} [18:25:33] be back shortly, getting some lunch! [18:25:42] that's not the same as overriding -Dworkflow_file ? [18:25:44] :O [18:26:07] elukey: no.. [18:26:07] good lunch, I'll start swearing.. ehm enjoying oozie tomorrow morning :) [18:26:37] madhuvishy: ah.. [18:26:39] the property makes it decide to run as a coordinator [18:28:35] https://oozie.apache.org/docs/3.3.2/DG_CommandLineTool.html#Submitting_a_Workflow_Coordinator_or_Bundle_Job [18:29:21] workflow_file is not an oozie property - it's something we define. [18:31:21] madhuvishy: ahhhhh ok! Really intuitive.. I'll read the docs! I guess I thought it was too easy :P [18:31:40] elukey: dont read the docs until you need to :P [18:32:07] :D [18:32:22] all right logging off, have a good day a-team! [18:32:25] (or evening) [18:32:31] good night :) [18:34:12] night elukey ! [19:02:25] nuria: I can still review your pageview API change, right? I'm thinking don't need to wait for the bower change since that won't chance much [19:26:07] madhuvishy: having an issue i can't seem to figure out..after the cdh upgrade our pyspark oozie jobs are now failing to find SPARK_HOME. previously i added `--conf spark.yarn.appMasterEnv.SPARK_HOME=/bogus` and things continued working. That does seem to work in in the new cdh version though. [19:26:25] (PS1) Ottomata: Fix package path to JsonSerDe in create_webrequest_raw_table.hql [analytics/refinery] - https://gerrit.wikimedia.org/r/273027 [19:26:33] looking at the source code, it seems in this new version the function that bails (findPySparkArchives) is run before setupLaunchEnv which applies those variables. Any ideas how to fix? [19:27:46] i meant to say above that (appMasterEnv) doesn't seem to work anymore [19:28:53] but anyways, is there perhaps something i don't know about that can be used to apply system environment earlier in the process? Or perhaps i need to make some puppet adjustments so SPARK_HOME exists in the oozie environment ? [19:28:55] ebernhardson: i have no idea. ottomata ^ [19:29:16] ebernhardson: one min will look [19:29:20] ottomata: thanks [19:29:51] (PS10) Bearloga: Functions for categorizing queries. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/254461 (https://phabricator.wikimedia.org/T118218) [19:30:12] (PS1) Yuvipanda: Add ability to 'fork' a Query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/273030 [19:30:25] ebernhardson: does pyspark work for you? [19:30:51] ottomata: yes, but that also worked in the old version without setting SPARK_HOME [19:31:06] hmm, i just got Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream when ust doing [19:31:08] pyspark [19:31:26] ottomata: yes, i get that error too but thats unrelated to my problem (but i would probably run into it next) :P [19:32:14] HMMM [19:32:16] * ebernhardson is still tempted to use scala instead...but my teammate wasn't so interested and pyspark was easy enough [19:32:22] i think there may be a classpath workaround ahck that I puppetized for older version [19:32:25] that might now be needed [19:32:29] ebernhardson: scala is fun! [19:32:46] the anon function stuff in scala feels much smoother [19:32:55] lambdas in python everywhere are a little annoying [19:33:03] yea, i have the pagerank stuff writen in scala and it was relateively easy [19:33:21] although i havn't bothered figuring out what a proper build/deploy process would look like...but couldn't be that hard [19:34:14] AHHH yes [19:34:31] yeah, spark-env.sh is messed up [19:34:33] (CR) jenkins-bot: [V: -1] Add ability to 'fork' a Query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/273030 (owner: Yuvipanda) [19:34:33] because I puppetized it [19:34:39] and new version is changed a bunch [19:35:12] that makes sense, if you are otherwise occupied i can work on that today [19:35:56] naw, am on it now [19:40:22] ok, thanks [19:43:45] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to researchers - https://phabricator.wikimedia.org/T127808#2061176 (Dzahn) [19:50:39] ebernhardson: how do you submit spark python job? [19:50:42] spark-submit, ja? [19:57:13] ottomata: well, spark-submit only for testing [19:57:18] ah oozie, ja [19:57:23] ottomata: oozie jobs have their own xml element [19:58:04] ottomata: fwiw, they fire through org.apache.spark.deploy.yarn.Client.main on oozie [19:58:37] which is run from org.apache.spark.deploy.SparkSubmit [19:58:53] so similar perhaps? but not running the CLI script [20:00:08] aye ok [20:00:16] i'm not sure about this SPARK_HOME issue, but there are dev env var issues [20:00:27] so, i'm going to fix those, and see if your problem goes away [20:06:13] ottomata: ok thanks. My guess, after looking at the spark-submit script is SPARK_HOME gets set explicitly before starting up java. Perhaps it would be enough to also add that to oozie's environment (but i don't know enough about oozie to know how/where) [20:11:30] milimetric: yes please [20:11:36] milimetric: sorry i missed your ping [20:12:00] milimetric: main issue is that with the pageview api client things will ot work on IE due to use of native promises [20:15:17] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2061285 (jcrespo) It should have finished now, after deleti... [20:31:46] (CR) Ottomata: [C: 2 V: 2] Fix package path to JsonSerDe in create_webrequest_raw_table.hql [analytics/refinery] - https://gerrit.wikimedia.org/r/273027 (owner: Ottomata) [20:38:14] ebernhardson: i merged a change that I think will help [20:38:31] ottomata: ok lemme kick off a test oozie coordinator again and see what happens [20:38:36] well [20:38:41] might have to wait for puppet to run across the clsuter [20:38:46] hmmm [20:38:47] or [20:38:48] mabye not [20:38:48] actually [20:38:49] hm [20:38:50] oh right :) I'll go grab lunch then try after [20:39:00] no [20:39:10] only on oozie server i think [20:39:13] running puppet one sec [20:41:10] ok ebernhardson, maybe try now [20:45:54] ottomata: same failure [20:46:06] in: yarn logs -applicationId application_1456242175556_4827 [20:46:21] hm [20:46:26] ebernhardson: proper SPARK_HOME should be /usr/lib/spark [20:46:36] ottomata: right, but there is no way to set it [20:47:26] ottomata: in previous versions of spark the conf key spark.yarn.appMasterEnv.SPARK_HOME could be set, but in the newer version SPARK_HOME is accessed before that variable is evaluated [20:47:28] ll=ooking more [20:55:02] hmm [20:57:26] ebernhardson: can you try [20:58:00] -conf spark.home=/var/lib/spark [20:58:01] ? [20:58:38] ottomata: thats in there already (i added it this morning while testing) [20:58:47] oh [20:58:51] and that doesn't work? [20:59:07] correct [21:00:23] ottomata: the exception comes from a call to sys.env("SPARK_HOME"), it doesn't seem to even try and evaluate spark.home. Grepping the source code it seems that is only referenced later in the pipeline from SparkContext and SparkConf, not during the initialization phase [21:00:39] sys.env is a python error? [21:02:01] no [21:02:03] ottomata: scala [21:02:06] ah, sys.env.get [21:02:07] ja? [21:03:08] ottomata: https://github.com/apache/spark/blob/v1.5.0/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L936 [21:05:51] ebernhardson: weird that it does sys.env.get right before that [21:17:39] ebernhardson: the app logs I see don't have spark.yarn.appMasterEnv set, but you've tried that and it doesn't matter, right? [21:19:36]