[02:39:01] Analytics-Kanban, Patch-For-Review: Caching on pageview API should be for 1 day - https://phabricator.wikimedia.org/T127214#2053972 (Pchelolo) Created a PR for frontend #restbase (https://github.com/wikimedia/restbase/pull/523) to pass through any headers sent by AQS without modifications. [07:22:42] leila, yt? [07:29:18] Ironholds: yup [07:42:50] Ironholds: I think I'm going to catch some sleep. See you later. [08:15:37] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2054400 (Romaine) [08:57:32] Analytics-Tech-community-metrics, Phabricator: Closed tickets in Bugzilla migrated without closing event? - https://phabricator.wikimedia.org/T107254#2054463 (Aklapper) p:Low>Lowest [09:02:19] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2054400 (Sjoerddebruin) AFAIK all projects are included in the pageviews API. This seems to be something with the tool instead, so https://github.com/MusikAnimal/pageviews/issues would be a bette... [09:21:49] Analytics, Pageviews-API: Enable pageviews on nl/be.wikimedia.org - https://phabricator.wikimedia.org/T127804#2054400 (JAllemandou) Hi, Not all projects are included in the pageview API. The definition of what is called a "pageview" can be found [[ https://meta.wikimedia.org/wiki/Research:Page_view | her... [09:26:37] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2054534 (jcrespo) 205 million rows purged. [09:47:48] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054571 (Nikerabbit) [09:49:40] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054590 (Arrbee) This request is approved. Thanks. [11:53:19] Analytics-Kanban: Make last access data public {mole} - https://phabricator.wikimedia.org/T126767#2022824 (JAllemandou) a:JAllemandou [12:59:25] hi a-team :] [12:59:32] morning mforns :) [12:59:40] hey joal! [13:00:16] Actually mforns, good afternoon :) [13:00:58] hehehe joal, yes it seems morning to me too :] [13:01:56] mforns: thanks for your review on pageview sanitization ! [13:02:12] joal, np [13:02:16] mforns: other topic: stat1002 is very overloaded :) [13:02:16] :] [13:02:20] aha [13:02:33] should we do something? [13:02:35] mforns: just so that you know [13:02:52] mforns: mostly due to ezachte scripts, and I don't want to kill tham [13:02:53] ok joal I'm not doing anything on it right now [13:18:52] (PS1) Mforns: Add all reportupdater files to this dedicated repo [analytics/reportupdater] - https://gerrit.wikimedia.org/r/272712 (https://phabricator.wikimedia.org/T127327) [13:26:18] mforns: question for you [13:26:22] yes? [13:26:35] mforns: your oozie code on browser/os has been merged, right ? [13:26:40] yes [13:26:50] mforns: we need a deploy and start, right ? [13:27:03] yes, I wanted to ask you about backfilling [13:27:26] and what to do with the old files [13:27:39] hmhm ... 
[13:27:53] Backfilling sounds ok (if we accept that it'll take time) [13:28:04] about the old files, I am not opinionated [13:28:30] I don't think those old files have much users [13:28:48] and nuria says it would be confusing to keep them both [13:28:59] so, I think we should remove them, once we have backfilled [13:29:03] mforns: then deletion is the way I guess :) [13:29:03] if we do [13:29:13] ok cool [13:29:24] do you think we can backfill until the reports start? [13:29:37] they start at... [13:30:11] 2015-08-30 [13:30:38] the thing is... the reports are weekly, but the table is daily, so more backfilling to do [13:31:04] it's ~6 months [13:31:42] iirc the job runs in a couple minutes for one day, like 6 minutes [13:31:51] mforns: I think we can go for daily backfilling from 2015-06 (which is the date where user_agent_map gets added to the table) [13:32:32] mforns: As long as we don't put too much parallelisation, backfilling will just be an ongoing thing until it has caught up [13:32:58] so, 9 months, 270 days * 6 minutes = 1620 minutes = 27 hours of computation [13:33:26] aha makes sense, doesn't seem too long [13:34:13] joal, so do you want me to deploy? [13:34:29] mforns: sure ! [13:34:35] mforns: do you have hdfs rights ? [13:34:39] not sure [13:34:46] there's a request [13:34:53] yeah, I remember that [13:34:55] but I think it's not done yet [13:35:09] ok, so you do the tin deploy, I do the hdfs one :) [13:35:09] * mforns looks [13:36:06] yes! https://phabricator.wikimedia.org/T126752 [13:36:25] awesome :) [13:36:27] I can do both, but I have to look at the docs first [13:36:30] I'll watch you do then :) [13:36:34] batcave? [13:38:48] sure mforns, joining [13:50:06] ottomata: Hi sir, are you here ? [13:53:35] mforns: https://gist.github.com/jobar/9126574693ece814a2ac [14:04:24] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2055302 (ema) p:Triage>Normal [14:05:14] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054571 (ema) [14:09:17] joal: good morning! [14:09:22] eating some breakfast then ready to get started [14:09:23] hey ottomata [14:09:39] in meeting with halfak, so no rush is good ;) [14:09:46] k [14:09:55] also ottomata : stat1002 is overloaded :( [14:10:03] Don't know if you can do anything [14:13:45] looking [14:14:47] looks like ezachte is doing some crunchin :) [14:15:03] joal: is it causing a problem? [14:15:14] ottomata: can't deploy, and almost can't use hive [14:15:25] ottomata: would be good to nice those things :) [14:15:39] But ezahte not on IRC [14:16:13] hm [14:16:14] yeah [14:16:35] will email and ask him to nice [14:18:08] ok, I could have done that byt didn't really know what channel to use :( [14:19:29] i'm just personal emailing him, CCing you [14:19:38] thx ottomata :) [14:34:42] joal: getting ready, i'm going to just suspend all oozie jobs I can find [14:34:50] ok ottomata [14:34:57] let me know if I can help [14:35:36] Actually ottomata, I think you only need to pause the laod ones --> everything will naturaly wait after those [14:35:49] also ottomata, you should stop camus :) [14:35:58] yes [14:36:07] there are also now discovery ones! [14:36:14] True ! 
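For reference, the 270-day × ~6-minute backfill estimate above works out like this — a minimal sketch, where the per-day runtime and the concurrency are assumptions taken from the conversation, not measured values:

```python
from datetime import date

# Figures quoted in the conversation above (assumptions, not measurements):
minutes_per_day = 6        # one daily run of the browser/os job
concurrency = 2            # keep parallelisation low so regular jobs aren't starved
start = date(2015, 6, 1)   # when user_agent_map was added to the table
end = date(2016, 2, 23)    # day of this conversation

days = (end - start).days
total_h = days * minutes_per_day / 60
print("%d days to backfill: ~%.0f h sequential, ~%.0f h at concurrency %d"
      % (days, total_h, total_h / concurrency, concurrency))
```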
[14:36:19] forgot about those [14:36:27] yeah, all ours are webrequest based [14:36:35] so pausing the bundles (which I just did) [14:36:37] should be enough [14:36:41] i don't have to search for coordinators [14:36:47] ottomata: correct [14:37:02] Then we should wait for the dependent jobs to have finished [14:37:08] ok, i paused all bundles, and 3 analytics-search coordinators [14:37:10] ja cool [14:37:39] i'm not going to stop camus until we are ready to shut things down [14:37:47] which will probably be in 30 mins or so [14:37:55] ok ottomata [14:43:20] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054571 (ema) Hi, it looks like @Nikerabbit is already a member of the statistics-users group. Perhaps something is wrong with the S... [14:48:36] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054571 (Krenair) statistics-users provides stat1003 access, not stat1002. It also does not provide access to the password needed to a... [14:58:36] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T127808#2054571 (Ottomata) https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups [15:08:05] ottomata: everything smooth ? [15:08:06] joal: ja, am just waiting for last things to finish [15:08:06] i killed discovery jobs [15:08:06] commenting out camus now [15:08:06] ok [15:09:31] joal: as soon as running jobs finish, I will start with process outlined here [15:09:31] https://etherpad.wikimedia.org/p/analytics-cdh5.5 [15:09:38] will let you know each step as I go [15:10:11] ok ottomata, let me know if there are any specifics you want me to monitor [15:10:21] k [15:13:26] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to researches - https://phabricator.wikimedia.org/T127808#2055392 (Nikerabbit) [15:14:15] Analytics, ContentTranslation-Analytics, Operations, Ops-Access-Requests: access for nikerabbit to researches - https://phabricator.wikimedia.org/T127808#2054571 (Nikerabbit) Ottomata summarised this well: > Nikerabbit you want to be in the researchers group, and you will use it to access a file o... [15:20:07] joal: cool, finally! no jobs [15:20:09] proceeding [15:20:24] !log shutting down analytics (hadoop) cluster for CDH 5.5 upgrade [15:21:08] who [15:21:12] oops:) [15:24:25] ok, stopping analytics1027 services [15:26:33] oops, I was just trying to run a query :) [15:29:46] stopping all hadoop services [15:41:59] just upgraded cdh on journalnode hosts, moving to master and standby [15:43:46] Analytics-Wikistats, Operations, Regression: [Regression] stats.wikipedia.org redirect no longer works ("Domain not served here") - https://phabricator.wikimedia.org/T126281#2055478 (Krinkle) Fair enough, but I'm making the case we don't need statistics. This worked and was used and linked to. There's... [15:47:41] upgrading workers [15:48:01] Analytics-Wikistats, Operations, Regression: [Regression] stats.wikipedia.org redirect no longer works ("Domain not served here") - https://phabricator.wikimedia.org/T126281#2010000 (Krenair) I don't care about this redirect enough to upload the patch, but I imagine this is because stats.wikipedia.org...
[15:57:14] hadoop back up, testing mapreduce job [16:03:20] upgrading packages on analytics1027 [16:08:36] upgrading oozie db and sharelib [16:11:40] starting analytics1027 mysql migration to analytics1015 [16:23:38] installing hive and oozie servers on analytics1015 [16:25:14] ottomata: o [16:25:16] o/ [16:25:25] hiyaa [16:25:29] how is the migration going?? [16:25:50] perfectly so far! [16:26:05] i'm doing the most risky (but still not very) part now, moving hive and oozie to analytics1015 [16:26:10] i already did the db move [16:26:15] promoted slave to master, etc. [16:26:32] will reboot analytics1027 soon [16:28:18] super! [16:30:24] ah, just found something out, spark-core now depends on flume-ng, which is not in our apt... [16:30:26] gotta add it [16:30:46] a-team: standddupppp [16:31:18] eee! [17:00:50] mforns: you got a sec? [17:00:56] milimetric, sure [17:01:02] to the batcave! [17:01:05] o/ [17:02:11] !log rebooting analytics1027 for kernel upgrade [17:02:27] Hey ottomata, which statsd lib would you recommend me using for python ? [17:02:50] joal: i don't really have an opinion, i guess whatever we have in apt [17:02:56] or is in debian [17:03:03] hm [17:03:31] looks like python-statsd 3.0.1 in trusty at least [17:03:39] right [17:03:44] just 'statsd' in python [17:03:45] i thikn [17:04:01] python-statsd seem to be widely use, let's go for that :) [17:04:04] Thqanks [17:04:39] k [17:05:49] hm, thinkg about that again: statsd is for time-regular info at second level - Here we plan to have daily data [17:05:52] ottomata: --^ [17:06:00] joal: sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) sock.sendto(b'foo.bar:123|ms', ('statsd.eqiad.wmnet', 8125)) [17:06:23] just saved yourself a library dependency [17:06:33] Thanks ori ! [17:08:33] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2055866 (Neil_P._Quinn_WMF) Thank you, @jcrespo! One questi... [17:09:47] ottomata, ori: Given that we want to plot daily data, I can [17:10:18] can't think of something else than sending data to graphite directly, with a timestamp (same as what we did for restbase hits) [17:10:24] any objections ? [17:11:47] joal: what is this for? [17:12:13] global hadoop metrics: how many jobs, mappers / reducers, how many users [17:12:18] ah [17:12:21] ja graphite sounds fine [17:12:32] Very few data (I think daily is ok) [17:12:37] ok cool [17:13:06] joal: everythign is kinda back up! [17:13:08] looking good [17:13:12] about to unsuspend oozie jobs [17:13:17] Looking with you ! [17:14:12] man, 1030 didn't upgrade via salt [17:14:18] ? [17:14:31] Well caught ! [17:14:55] ottomata: looks like I can't log in hue [17:15:39] joal: send to statsd, not graphite [17:15:56] if you send to graphite you have the following issues to deal with: [17:16:04] ori: how to handle not regular data ? [17:16:05] mforns: before you push that for review [17:16:10] milimetric, yes [17:16:14] if you don't send a metric in the minimum aggregation period, you have a hole in your data [17:16:16] ori: I want to send daily points [17:16:23] how come you put the text color (which works great btw) in the view model instead of the bindings? 
[17:16:37] I figured you could just use d.color in the bindings [17:16:45] if you send more than one metric in the smallest aggregation period, the lattermost wins -- they are not averaged or summed [17:16:50] milimetric, AHA [17:16:53] *aha [17:16:59] ori: the reason to use graphite for this stuff was that statsd doesn't let you send timestamps of when the data is being recorded for [17:17:26] * mforns looks [17:17:30] yeah, it's tricky :/ [17:17:54] ori: this comes up in our case more because we might rerun jobs later etc [17:18:07] i have resorted to writing a thing that writes directly to whisper files in the past [17:18:10] ori: we are not trying to have complete datasets in the way statsd does, we are trying to get very sparse data (daily points) and we graph them using tricks in grafana [17:18:13] oh! mforns never mind, I don't think that's possible... [17:18:19] i've had grand plans of setting up influxdb on tungsten but haven't gotten around to it; it is supposedly better at this [17:18:26] aah [17:18:28] milimetric, I don't know [17:18:37] 'cause the text is created as a sibling of the polygon [17:18:37] why no? [17:18:46] ah [17:18:50] (if you have time / interest for trying it out I can pass that machine over to you) [17:19:09] ori: sending to graphite directly seems to be mostly okay though - although you don't get the nice sum, count graphs etc [17:19:11] also we were considering opentsdb at one point too no? [17:19:33] mforns: yeah, don't worry about it for now, that looks like it needs some refactoring to all happen under d3's control, but maybe some other time :) [17:20:36] milimetric, ok, then we can refactor both hierarchy and sunburst [17:20:38] ori: I have no idea what opentsdb is - sure I'd love to try playing with influxdb - may be next week or so [17:20:54] yes, they both do the same thing [17:21:05] ori: IIRC, invetigation has been made by addshore [17:21:18] ok [17:21:32] opentsdb was interesting to look at, but we only glanced [17:22:06] ori: , joal, we have talked about a project to use some fancy time series db / graph tool for analytics data, maybe in analytics cluster [17:22:08] but have put it off [17:22:23] as it is not a high prioirty, and is mostly wanted for internal project usage graphs [17:22:27] would be cool though! [17:22:28] biggest diff between influx and openTSDB (to my opinion) is that the latter runs on the cluster (good for big data, bad for possible downtimes) [17:22:42] for sure ottomata :) [17:23:02] great! no more cdh 5.4 packages anywhere i can see [17:23:05] ottomata: oozie is responsive (while doing nothing, it's good ;) [17:23:07] godog is also a big fan of prometheus [17:23:12] cool [17:23:14] hive looks good too i think [17:23:16] iirc all of these are configurable as backends for grafana [17:23:24] yup ottomata, checked it and it looked ok [17:23:32] joal: ok with you if I unsuspend oozie jobs? [17:23:43] ottomata: hue thing first ? [17:23:48] Can't log in :( [17:23:50] hue thign? [17:23:53] no? i logged in [17:23:58] shell username + ldap pw? [17:24:11] AH! [17:24:18] hue users all unsynced?!?! [17:24:26] bah! [17:24:26] ha [17:24:38] brb [17:25:23] ottomata: camus already running ;) [17:28:36] *reads up* [17:30:18] ja i wanted to run puppet so i uncommented crons [17:30:19] milimetric, there is another thing I'd like to change... [17:30:35] when you zoom across a couple levels, the chart shrinks... 
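Since statsd accepts no explicit timestamp, the "send to Graphite directly" option discussed above comes down to writing Graphite's plaintext protocol (`metric value timestamp`) to its line receiver. A minimal sketch — the hostname and metric path are placeholders, and the standard plaintext port 2003 is assumed:

```python
import socket
import time

def send_to_graphite(metric, value, timestamp=None,
                     host='graphite.example.org', port=2003):
    """Send a single datapoint using Graphite's plaintext protocol.

    Unlike statsd, the plaintext protocol carries an explicit timestamp,
    so a daily (or backfilled) point lands on the day it describes rather
    than the moment it was submitted.
    """
    if timestamp is None:
        timestamp = time.time()
    line = "{0} {1} {2}\n".format(metric, value, int(timestamp))
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode('utf-8'))

# One daily point for a hypothetical Hadoop-usage metric:
# send_to_graphite('analytics.hadoop.jobs_per_day', 1234, timestamp=1456185600)
```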
[17:30:46] ottomata: still no chance with hue :( [17:30:47] it should stay the same size [17:30:49] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2055964 (jcrespo) I will manually make sure to resync it- a... [17:31:34] joal: ja han gon [17:31:38] checking a few things [17:31:43] ottomata: sure [17:32:00] * milimetric looks for this shrinky chart thing [17:32:31] mforns: oh yeah, I noticed that but I liked it [17:32:44] it's cool, gives you an idea that something's happening in case you maybe missed the zoom [17:32:57] milimetric, I see... [17:33:05] like your cat jumps in your lap and ON PURPOSE turns off your monitor [17:33:06] ok :] [17:33:24] xD [17:33:36] * milimetric ::grumbles about the cat:: [17:34:31] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2055990 (jcrespo) Regarding replication lag, I shared a scr... [17:36:17] joal: i'm going to spend a few minutes seeing if i can quickly make the hue ldap integration cooler [17:36:23] can you help me with something? [17:36:38] can you prepare a puppet patch that increases heap sizes of hive and oozie servers? [17:37:09] should be in hiera [17:37:10] files like [17:37:27] hieradata/eqiad/cdh/{hive,oozie}/....yaml [17:37:29] heapsize: [17:37:31] etc. [17:37:38] i can do, but i'm looking at hue real quick [17:37:41] no worries if not [17:38:45] (PS11) Mforns: Add hierarchy visualization [analytics/dashiki] - https://gerrit.wikimedia.org/r/269713 (https://phabricator.wikimedia.org/T124296) [17:38:51] joal: Let's ask analytics e-mail list (public) feedback on how to release the unique devices dataset [17:38:51] milimetric, ^ [17:39:19] joal: that way we have input as to what people would like to see, will try to craft an e-mail [17:39:24] (CR) Milimetric: [C: 2 V: 2] "nice and clean, love it" [analytics/dashiki] - https://gerrit.wikimedia.org/r/269713 (https://phabricator.wikimedia.org/T124296) (owner: Mforns) [17:39:34] :] [17:41:56] ottomata: I can ! [17:42:00] Was away for a minute [17:43:00] ok nuria [17:48:03] ottomata: hive and oozie puppet code is in submodule ? [17:49:46] found it ottomata [17:51:20] yes but you only need to set it in hiera [17:51:33] (IF it works...) [17:51:41] joal: , try hue now. [17:51:42] oof [17:51:50] i had to manually resync each user in analytics-privatedata-users :/ [17:51:59] gonna make a task to fix this ! [17:52:00] ottomata: Worked ! [17:52:07] Thanks mate [17:52:32] ottomata: about HEAP_SIZE, I don't get how the thing get's passed to a template or something :( [17:52:48] hieradata defined values that are reused in puppet conf, right? 
[17:54:10] Analytics, Analytics-Cluster: Improve Hue user management - https://phabricator.wikimedia.org/T127850#2056054 (Ottomata) [17:54:11] hehehe [17:54:21] yeah hiera is wayyyy high up [17:54:35] ok, joal i'm through with hiera, i can do it, unless you really want to :) [17:54:40] sorry [17:54:42] through with hue* [17:54:51] ottomata: I'd like to learn :) [17:54:57] ok [17:55:07] hiera is all about scope [17:55:09] so uhhh [17:55:12] in this case in production [17:55:16] cdh module parameters [17:55:22] are being filled in from hiera in the 'eqiad' scope [17:55:23] so [17:55:28] makes sense [17:55:35] hieradata/eqiad/cdh/** [17:55:43] in files named after the classes [17:55:53] with hiera keys after the parameters [17:56:12] you can see i'm already setting heapsize via hiera for hive server and metastore [17:56:27] because heapsize is a parameter on the cdh::hive::{metastore,server} classes [17:56:43] it is also a parameter on the cdh::oozie::server class [17:56:49] so, you should be able to add to the server.yaml file too [17:56:55] sorry [17:56:58] oh [17:57:05] there is no oozie server file yet [17:57:07] gotta make one [17:57:15] cdh/oozie/server.yaml [17:57:16] and add [17:57:17] heapsize: [17:57:21] not sure what we should set them at hmmm [17:57:29] about hive first: what put me in is the cdh::hive file --> didn't find it [17:57:51] cdh/hive/server.yaml [17:57:54] cdh/hive/metastore.yaml [17:57:56] cdh folder is empty for me ! [17:58:01] in hieradata? [17:58:06] hieradata/eqiad [17:58:11] hieradata/eqiad/cdh [17:58:24] no, in modules [17:58:29] naw, not in modules [17:58:34] cdh is a submodule [17:58:37] git submodule update --init [17:59:02] Riiiight, that was the thing I was after ! [17:59:25] Thanks, now it all should fall in place: find the parameter name, put it in hiera file with correct value :) [18:00:11] yup [18:00:36] a-team: staff on batcave right? [18:00:40] nup [18:01:34] a-team: let's do batcave [18:01:34] a-team: let's all go to batcave [18:01:50] milimetric: maybe you can cahnge meeting video link? [18:01:52] *change [18:02:06] joal, milimetric , ottomata : holaaaa [18:02:16] ottomata: heapsize parameter is defined in hive/server.pp - Shouldn't we use a specific file ? [18:02:44] ? [18:02:57] heapsize is set via hiera in hieradata/eqiad/cdh/hive/server.yaml [18:03:08] joal: ^ [18:03:13] yessir, makes sense [18:03:25] got it now ! [18:03:33] pffff, was looking at hive.yaml [18:03:36] :( [18:03:40] sorry [18:03:44] * joal is learning [18:04:13] ottomata: shall we bump that to 4096 for hive and same for oozie ottomata ? [18:04:25] looking at recommendations [18:04:31] http://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hive_configure.html [18:05:25] lets go with 6 and 10G for hive server and metastore [18:05:33] ottomata: let's go for large, and make it 6144? [18:06:39] joal: oozie server 2G? 4G? [18:07:03] hm ... Checking current value [18:08:06] Analytics-EventLogging, Analytics-Kanban: EventLogging needs to be ready for codfw failover - https://phabricator.wikimedia.org/T127209#2056154 (Nuria) We need to verify that udp traffic can get from dallas to eqiad OR migrate eventlogging server to use kafka (as equiad will be up while this exercise is t... [18:09:05] ottomata: currently, catalina mem is set to 1024 in oozie-env [18:09:24] ja [18:10:01] Not really more users, just possibly better response times --> 2048 ?
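Putting the hiera walkthrough above together, the change being sketched is three tiny YAML files. The values below are just the figures floated in the conversation (the final patch may differ):

```yaml
# hieradata/eqiad/cdh/hive/server.yaml
heapsize: 6144      # MB, i.e. the "6G" discussed for hive-server2

# hieradata/eqiad/cdh/hive/metastore.yaml
heapsize: 10240     # MB, the "10G" discussed for the metastore

# hieradata/eqiad/cdh/oozie/server.yaml  -- file does not exist yet
heapsize: 2048      # MB, up from the 1024 currently set in oozie-env
```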
[18:11:01] Analytics-EventLogging, Analytics-Kanban: EventLogging needs to be ready for codfw failover - https://phabricator.wikimedia.org/T127209#2056158 (Nuria) Can server side produce via http to varnishkafka [18:11:24] k [18:22:21] (CR) Bearloga: "Sure, I'll get those performance numbers updated. As for insource thing, there's two because one checks if the query is insource: in gener" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/254461 (https://phabricator.wikimedia.org/T118218) (owner: Bearloga) [18:34:48] Analytics-Kanban, DBA, Editing-Analysis, Patch-For-Review, WMF-deploy-2016-02-09_(1.27.0-wmf.13): Edit schema needs purging, table is too big for queries to run (500G before conversion) {oryx} - https://phabricator.wikimedia.org/T124676#2056335 (Neil_P._Quinn_WMF) @jcrespo, ah, okay, I misunders... [18:37:23] joal: cool, it worked! [18:37:43] great ottomata [18:38:28] you should have access now too [18:38:38] uhh in maybe 1 min [18:38:44] ok joal awsome [18:38:52] ready to unpause oozie stuff? [18:39:09] ottomata: sure, LET'S DO IT ! [18:39:37] ottomata: you do it, or shall I ? [18:39:41] i will [18:39:51] ok waiting for the flow to reappear in hue :) [18:40:22] man, ellery must be happy, he has a full cluster for himself currently :) [18:40:32] haha [18:40:51] hmmm [18:41:14] hang on, an27 is funky, hmmm [18:41:23] with oozie/hive settings, puppet shoulda fixed, hmm [18:42:33] ottomata: no hive/oozie processes on an1027, only jmxtrans [18:42:38] i just stopped them [18:42:51] ottomata: and by the way, no jmxtrans on an1015 [18:42:57] ok ottomata makes sense :) [18:43:06] oh [18:43:07] hm [18:43:14] hmmm, yeah i think we don't have it configured for hive or oozie [18:43:20] ok cool [18:43:28] i wonder what's inthere! [18:43:45] hm, yeha, no jmxtrans configs on anallytics1027 either [18:43:48] must be crusty [18:43:50] uninstalling [18:43:55] ottomata: unpause load first and monitor? [18:44:41] hmmm [18:44:43] is hue working for you? [18:45:31] Weird ... Was off, then back on [18:45:35] yeah [18:45:42] i restarted it, maybe it just took a bit [18:45:50] weird though [18:46:20] list bundles not working? [18:46:32] working for me, but no way to tick [18:47:05] need to change it CLI [18:47:35] hmmm [18:47:40] i can tick [18:47:41] ottomata: I let you do it ? [18:47:43] ok [18:47:45] weird :( [18:47:46] hmmm [18:47:59] super user rights I guess [18:48:21] just made you one [18:48:23] better? [18:48:45] No tick for bundles - corrd ok [18:48:47] starting load [18:48:53] ok [18:49:39] ottomata: new buttons for me :) [18:51:19] ottomata: load job have started ok [18:51:37] ja looking good [18:51:45] ok to unsuspend refine? [18:51:59] ottomata: give me aminute, waiting for hive jobs to start on cluster [18:52:05] k [18:52:16] oh ja, just the oozie launchers [18:52:23] ottomata: yup, not cool :( [18:52:44] ottomata: hive address in oozie config ? [18:54:21] oh [18:54:22] hm [18:54:29] is that hardcoded in oozie jobs? [18:54:45] ottomata: was thinking of that, we pass hive-site.xml to oozie jobs !!! [18:54:58] OH, but [18:54:59] So let's kill every hive related job and restart them ... [18:55:00] yeahhh [18:55:02] it gets it from hdfs [18:55:03] on deploy [18:55:08] so hdfs has the stale one [18:55:09] yessir [18:55:18] running a deploy to hdfs... [18:55:21] Let's deploy, stop, and restart [18:55:34] ottomata: I kill load bundle [18:55:37] k [18:55:38] thanks [18:55:43] kill? 
[18:55:48] ohhh [18:55:49] yeah [18:55:50] we have to restart [18:55:51] :/ [18:55:56] ottomata: yes, kill to renew hive [18:55:56] well [18:55:56] hm [18:55:57] no we don't [18:56:03] ? [18:56:07] i don't think that oozie saves the hive-site [18:56:08] does it? [18:56:15] i would think the wf just looks it up when it is created [18:56:23] I don't know if it reads it at each execution or not ... [18:56:25] the path will be the same [18:56:30] Let's try not killing :) [18:56:32] yeah [18:56:43] actually, no the path would have changed ! [18:57:09] Cause hive-site.xml gets copied onto hdfs by deploy [18:57:13] ottomata: --^ [18:57:25] hmmm do the oozie configs refer to the versioned path in hdfs? [18:57:30] probably not, it probably uses the current dir? [18:57:30] So restart needed, or manual update of various hive-site.xml [18:57:49] nope, oozie is usually based on refinery folder, using versioned one [18:57:51] ah [18:57:51] yes [18:58:00] because we start them setting oozie_directory dynamically to latest [18:58:00] ja [18:58:03] rats ok [18:58:05] ok [18:58:08] so killing load [18:58:16] k [18:59:29] joal: hive-site in hdfs looks good now [18:59:41] hue not answering :( [18:59:45] looks ok to me [18:59:47] going for cli [18:59:53] k [19:00:12] bundle id is https://hue.wikimedia.org/oozie/list_oozie_bundle/0001792-160202151345641-oozie-oozi-B [19:00:14] oops [19:00:18] well, you see it :p [19:00:31] joal: brb, gonna heat up some lunch [19:04:29] ottomata: oozie still very unresponsive :( [19:04:47] OOOOOHHH ! [19:04:51] Maybe it's stat1002 ? [19:05:57] mwarf, it's not [19:11:59] man ... [19:12:12] Will go for dinner and back after [19:12:41] load job has been stopped and restarted, we need to do the same for refine and all other hive-based jobs (most) [19:12:56] I'll be back in 1/2h or so [19:13:01] back [19:13:15] hmm ok [19:13:31] was just saying that I was going to leave [19:13:42] ok, go ahead joal [19:13:42] ottomata: Don't know why, but oozie still VERY unresponsive :( [19:13:44] yeah hm [19:14:19] Will be back soon ottomata [19:14:41] ok [19:17:58] hmm, joal i think you might be right about mysql [19:18:02] or [19:18:06] something needs optimized with oozie dbs [19:18:13] long running queries in mysql for oozie [19:18:55] hmmm i betcha ORDER BY t0.created_time is not good [19:19:49] whooa [19:19:50] -rw-rw---- 1 mysql mysql 38G Feb 23 19:18 WF_JOBS.ibd [19:19:51] heh [19:19:52] yeah [19:19:55] i betcha we could prune that [19:19:57] lots of old jobs [19:20:10] I'm going to pause load again, stop oozie, and do some maintenance [19:26:09] didn't pause load cause oozie was too unresponsive, just stopped oozie server [19:26:14] am backing up the oozie database [19:26:27] asking jynus for advice, but i think i'm going to try to add an index on the WF_JOBS.created_time field [19:38:33] (CR) Milimetric: "all looks good, waiting to finalize all settings though." (2 comments) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/272635 (https://phabricator.wikimedia.org/T127326) (owner: Mforns) [19:41:18] ottomata: back ! [19:41:50] ottomata: anything I can help with ?
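The fix ottomata has in mind is a single DDL statement against the Oozie database — roughly along these lines (the index name here is invented, and on a ~38G table the ALTER itself takes a while, hence the backup first):

```sql
-- Oozie's job listings do ORDER BY created_time; with no index on that column
-- every listing is a full scan plus filesort over the ~38G WF_JOBS table.
ALTER TABLE WF_JOBS ADD INDEX i_wf_jobs_created_time (created_time);
```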
[19:45:14] waiting for mysqldump to finish [19:45:19] wanted to make another backup before i did something [19:49:26] joal: probably the best thing to do would be to prune old oozie somehow [19:49:29] buuut [19:49:31] that sounds hard [19:49:39] probably lots of foreign keys floating around [19:51:30] ottomata: I am double checking mysql configs [19:52:01] ottomata: sounds bizarre that an1027 was managing better than 1015 [19:53:37] ottomata: found [19:55:02] ottomata: my.cnf has values for key_buffer, thread_stack, thread_cache_size, query_cache_limit, query_cache_size in an1027 not set in 1015 [19:55:26] joal: aye, the my.cnf on 1015 was recommended by jynus [19:55:33] but, looking [19:55:33] hm ... [19:55:36] also, we know that an27 was slow [19:55:42] and i think this is one of the main reasons why [19:55:52] but ja, could be a combo [19:56:03] def a 38G table and a sort by non index field is not going to do well [19:56:15] for sure :) [19:56:27] the ones on an27 were just default mysql-server installed [19:56:44] no values on an1015 [19:56:52] are default values bigger / smaller? [19:57:05] hm ... Indexing is probably the first thing to do :) [19:57:15] key_buffer is for MyISAM [19:57:49] hmm but [19:57:54] query_cache_size default is 0? [19:57:56] that can't be good [19:59:07] the thread defaults seem sane I think [19:59:19] joal: lets increase query_cache_size to what it was [20:01:56] wow, joal [20:01:57] ja [20:02:02] the innodb buffer pool size is way smaller [20:02:07] increasing that back to what it was [20:02:19] oy wait [20:02:23] looking at wrong config [20:02:45] am going to make it bigger though [20:03:25] kevinator: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-hive/src/main/java/org/wikimedia/analytics/refinery/hive/HostNormalizerUDF.java [20:03:59] kevinator: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-hive/src/main/java/org/wikimedia/analytics/refinery/hive/HostNormalizerUDF.java [20:07:10] ok, going to add index [20:07:15] k ottomata [20:07:49] kevinator: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-hive/src/main/java/org/wikimedia/analytics/refinery/hive/GetPageviewInfoUDF.java - I'd do get_pageview_info(uri_host, "", "")["project"] [20:08:11] heh, joal this might take a while... [20:08:21] too bad we didn't think to do this before the downtime [20:08:25] could have done it on this slave [20:08:54] ottomata: We probably expected it wouldn't have been needed with better machine ... [20:09:03] aye [20:09:27] ottomata: currently buffer pool size is 1G ... Is that enough ? [20:09:48] Oh, 8 instances :) [20:09:58] ok [20:10:23] Oh, actually no: total size is divided among instances [20:10:28] We could make it bigger [20:13:05] ottomata: https://dev.mysql.com/doc/refman/5.5/en/glossary.html#glos_buffer_pool_instance [20:18:10] interesting joal, perhaos [20:18:12] perhaps ja [20:18:38] the buffer pool was 128M on an27 [20:18:59] oh! [20:19:00] index done [20:19:01] cool [20:19:08] Given we have 48G global, we use 10 + 6 for hive, and 2 for oozie, we probably can afford 8G for mysql [20:19:16] ottomata: Great about index ! [20:19:23] hmm, default is 1 instance? [20:19:23] hmmm [20:19:24] Weird about an1027 [20:19:34] Don't know ottomata ... Currently: 8 [20:20:03] where do you see that? [20:20:22] oh [20:20:23] hm [20:20:26] SHOW VARIABLES LIKE 'innodb_buffer_pool%'; [20:20:30] i must be reading some out of date mysql doc [20:20:32] ok [20:21:28] sure joal lets try 8!
[20:21:31] 8G [20:21:40] seems like a lot though [20:21:45] analytics-store is at 4G [20:21:45] True ! [20:21:54] hm, how many instances? [20:22:10] maybe less instances [20:22:13] yeah, um [20:22:18] lets put it at 4G and say 4 instances? [20:22:26] Sure ! [20:24:27] ok ottomata, let's restart oozie server when conf setup ? [20:24:32] ok [20:24:37] And see [20:24:37] yup, just got it in place [20:24:39] mysql restart [20:24:44] starting oozie [20:25:46] ottomata: load job in weird mode [20:25:52] ottomata: I kill and relaunch load [20:26:22] ottomata: hue for oozie is now responsive :) [20:26:30] k hang on [20:26:35] ottomata: done [20:26:48] hive metastore seems funky [20:26:52] aouch [20:27:00] maybe that then ... [20:27:25] hmmmm think its ok, checking [20:27:50] actually i think its fine [20:28:19] whoa, lots of hive jobs now joal :) [20:28:23] :D [20:28:27] HAPPY ! [20:28:40] and oozie / hue is really more responsive [20:28:45] ottomata: good job ! [20:28:52] ja whoa it is [20:29:05] ok, puppetizing this change to make it permanent [20:29:10] awesome [20:29:17] ottomata: killing / restarting refine [20:29:35] k [20:32:58] killing restarting legacy TSVs [20:33:01] * madhuvishy likes new hue too [20:33:06] :) [20:34:43] * milimetric will like hue when it stops 500-ing [20:36:21] milimetric: it's not 500-ing now :) [20:36:29] but i have 10 seconds worth of trust [20:36:39] :D [20:36:41] :) [20:37:16] i betcha this will be way betterrRRRrrr [20:37:31] websites should have those things they have in workplaces, like "50 days since last accident"... "10 seconds since last 500" [20:47:15] killed / restarted pagecounts-allsite load and archive [20:47:17] nice [20:47:24] joal: i'm going to restart discovery's jobs [20:47:37] k ottomata [20:47:46] following my list :) [20:48:31] thanks joal :) [20:48:34] let me know if you want me to do any [20:50:53] Analytics-Wikistats: Problems with Erik Zachte's Wikipedia Statistics - https://phabricator.wikimedia.org/T127359#2057074 (Samat) Dear Erik, 2. I have found these links here: https://stats.wikimedia.org/index.html#fragment-14 (section Raw Data and Scripts), but they are on the https://meta.wikimedia.org/wik... [20:52:39] joal: we should do an oozie job name audit [20:52:54] the least we could do is indicate which jobs are the top level [20:52:56] somehow [20:53:01] bundle vs coordinator, etc. [20:54:33] ottomata: That would be good [20:54:47] I use the diagram when doing this kind of restart [20:56:49] joal: i'd say the cluster is back and ready, eh? [20:56:53] mind if I send an email? [20:57:13] ottomata: please do, but there still is some catch up to do data-wise [20:57:43] aye for sure, will mention [20:59:22] hmmMMmm [20:59:23] hey! [20:59:40] is last_access_uniques job creating 'tmp' tables in the default hive database? [20:59:47] https://hue.wikimedia.org/metastore/table/default/tmp_last_access_uniques_2016_2_2 [20:59:48] etc. [21:00:11] ottomata: let me double check [21:00:40] ottomata: not the one that is productionized [21:00:45] maybe previous ones [21:00:49] aye, that's the latest one I see [21:00:49] madhuvishy: --^ [21:01:05] madhuvishy: any idea? [21:01:29] joal: i thought you killed the old jobs [21:01:36] I did not ! [21:01:41] joal: aah [21:01:41] I will :) [21:01:48] sure, i can do that too [21:02:42] madhuvishy: will you drop those tables too?
[21:03:13] ottomata: okay - let me look where they are coming from [21:05:02] ottomata: ya those can go - i'll drop them [21:06:33] ottomata: I cheated on cassandra loading jobs: launched the bundle from march 1st (beginning of month), and used other coordinators to backfill end of feb [21:06:54] that's not a hive job though, ja? [21:06:59] shouldn't need restarting? [21:08:13] ottomata: there is some hive in it, yeah [21:08:18] data prep is done in hive [21:08:35] ah ok [21:19:24] madhuvishy: I kill your last_access jobs --> confirmed ? [21:19:30] joal: yeah [21:19:49] madhuvishy: done [21:23:21] joal: awesome [21:27:52] (PS3) Nuria: [WIP] Dashiki gets pageview data from pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) [21:35:29] mforns: About browser data, shall I kill existing job and restart using yours ? [21:35:47] joal, I can do that, before I need to finish the deployment [21:35:51] no? [21:36:01] mforns: We had to deploy with ottomata, so code is ready [21:36:07] oh I see [21:36:15] And I'm in the middle of restarting every job, so nevermind, I'm on it :) [21:36:22] joal, ok [21:36:26] thanks! [21:36:39] np :) [21:36:56] mforns: I keep old data, but job is killed [21:37:00] awesome [21:37:12] joal: i did not do a full deploy [21:37:21] i just did the hdfs copy [21:37:35] ottomata: only hdfs deploy failed with mforns - You did the missing part :) [21:37:38] ottomata, the other part we already did this morning [21:37:39] ah ok! [21:37:41] nice [21:38:00] thx [21:44:55] ottomata, sorry you had to selfmerge the EL thing, I looked at it and the code seemed ok to me, but I didn't know what that was and forgot to ask you, my bad [21:45:15] Analytics, Analytics-Kanban, Operations, Patch-For-Review: Increase HADOOP_HEAPSIZE (-Xmx) for hive-server2 - https://phabricator.wikimedia.org/T76343#2057411 (Ottomata) [21:46:23] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Set up Webrequest -> kafka flow in beta. - https://phabricator.wikimedia.org/T127369#2057416 (Ottomata) [21:47:37] np! [21:47:47] its mostly for mw vagrant, where pip + virtualenv is used [21:50:05] Analytics oozie jobs restarted - Hopefully everything correctly - Half a day to catch up [21:50:18] ottomata: You have restarted discovery jobs ? [21:51:22] ottomata: three last jobs of the list are popularity_score, and transfer_to_es[codfw|eqiad] [21:56:40] joal: i unsuspended them, ja [21:56:54] didn't restart them [21:56:55] are they hive? [21:56:58] ottomata, I see [21:57:14] ottomata: they are not, but still seem to have failed [21:57:21] looking [21:57:22] We should probably let ebernhardson know [21:57:44] oh, i did let them know, maybe they killed them? [21:58:20] asking in #discovery [22:00:02] ok joal, ebernhardson is taking care of it [22:00:05] dunno what happened there [22:00:15] ottomata: k ! Thanks ebernhardson :) [22:02:49] last-minute request: does anyone have the time to compute for me the daily mobile pageviews for the period Feb 13 to Feb 22? [22:04:52] ori - doing [22:04:58] joal: <3 [22:05:16] ori, per hour ? [22:05:22] oh, daily sorry [22:05:30] joal: hourly granularity would be even better [22:05:46] *huge* thanks [22:06:02] ori: all projects? [22:06:07] yep
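The gists linked below hold the actual numbers joal produced; a query of roughly this shape would yield them, assuming the standard wmf.pageview_hourly schema (table and column names are assumptions here, and filtering to agent_type = 'user' is a judgment call, not something stated in the request):

```sql
-- Hourly mobile pageviews (web + apps), all projects, 2016-02-13 .. 2016-02-22.
SELECT
  year, month, day, hour,
  SUM(view_count) AS views
FROM wmf.pageview_hourly
WHERE year = 2016
  AND month = 2
  AND day BETWEEN 13 AND 22
  AND access_method IN ('mobile web', 'mobile app')
  AND agent_type = 'user'          -- assumption: exclude spider traffic
GROUP BY year, month, day, hour
ORDER BY year, month, day, hour;
```

For the follow-up request (mobile web only, daily), drop 'mobile app' from the access_method filter and remove hour from the SELECT/GROUP BY.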
[22:11:00] np ori, hive helps ;) [22:11:23] joal: can you run another slightly modified version of that query? only mobile web (no mobile apps), and daily aggregation? [22:11:30] sure [22:12:50] ottomata: hue is more responsive for giving workflow related data now - but page load times have gone up [22:14:41] dunno - may be it'll get better [22:15:27] ori: if you have access to hue (ldap creds) https://hue.wikimedia.org/beeswax/#query makes it really easy to run these queries - if you want to run them yourself anytime [22:15:45] ori: madhuvishy is very right :) [22:15:48] ori: https://gist.github.com/jobar/5b66f4405cc21b0a73b0 [22:16:01] madhuvishy: page load times? [22:16:37] ottomata: ya like when you request the query editor or something it takes longer to load than before. it may be that the resources were not cached at first too [22:16:45] hm [22:16:55] * ottomata never really used the query editor :D [22:17:32] ottomata: camus checker broken :( [22:17:39] ! [22:17:39] oh! [22:17:42] not just in labs [22:17:48] i was noticing this in labs, am trying to setup camus there [22:17:53] ottomata: an1027 [22:18:09] ottomata: thereforce load job don't start :( [22:18:24] Iinnnteresting! [22:18:26] good catch joal [22:18:41] i guess new version of spark doesn't have that class? [22:18:46] class not found stuff :( [22:18:50] probably [22:18:51] or uh [22:18:55] -bash: /usr/lib/spark/bin/compute-classpath.sh: No such file or directory [22:19:16] yes, looks like that [22:19:29] das a shame :/ [22:19:55] too MAAAAAAANY dependencies [22:20:23] HMmMMmm [22:20:41] ok we need a fix! i'm looking around in spark shell scripts to see if something else will do, or how new spark finds class path [22:21:14] joal: maybe just add [22:21:35] /usr/lib/spark/lib/spark-assembly.jar to camus checker CP? [22:21:36] i will try... [22:22:14] ottomata: feasible, but IIRC, we need the full spark classpath for scala [22:22:34] ok, i see scala in that jar [22:22:41] ottomata: https://phabricator.wikimedia.org/T115970 :) [22:22:52] ottomata: could do ! [22:23:07] erg, no [22:23:11] because now no hadoop! [22:23:12] HMmm [22:24:14] ottomata: spark-conf not accessible for me [22:24:20] how about spark-end.sh ? [22:24:47] end? [22:25:31] thanks joal, madhuvishy! [22:25:45] ooh hue got a makeover? 
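A rough sketch of the direction being discussed for the broken checker — jar locations and the class name below are placeholders, not the actual refinery patch that followed:

```bash
# Build a classpath for the Camus partition checker without the
# compute-classpath.sh script that CDH 5.5 removed:
#   - `hadoop classpath` supplies the HDFS/MapReduce classes,
#   - spark-assembly.jar supplies Spark plus its bundled Scala runtime.
CP="$(hadoop classpath):/usr/lib/spark/lib/spark-assembly.jar"

# Hypothetical invocation -- the real entry point ships in the refinery jars:
java -cp "${CP}:/path/to/refinery-camus.jar" \
     org.wikimedia.analytics.refinery.job.CamusPartitionChecker --help
```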
[22:26:02] ottomata: $SPARK_DIST_CLASSPATH [22:26:09] after sourcing spark-env.sh [22:27:24] OOO [22:27:52] ottomata: ott ooo[B[B[B... [22:27:58] [B[B[B...Booo[B[B[B... [22:28:04] [B[B[B...[6~[6~[A[B[B[B... [22:28:08] hahaha [22:28:10] [B[B[B... [22:28:12] uhhh [22:28:15] wooow [22:28:16] something bad has happened to joal [22:28:20] :D [22:28:25] joal i think you sourced that env too hard [22:28:35] * joal didn't know he could do that ! [22:28:53] Sorry for the mess team ! [22:28:57] haha [22:29:29] ottomata: about classpath: we could reuse the classpath defined with the command [22:29:59] hmm, dunno why spark conf is not readable [22:30:06] oh, hive-site is in there? [22:30:09] symlink [22:30:11] root-owned [22:30:12] that should be ok though [22:30:13] uhhh [22:30:31] but puppet must have done this? hm. [22:31:30] hm [22:31:49] dunno why it was 700 though [22:31:50] fixed [22:31:53] ok [22:31:53] um [22:31:54] yeah [22:32:00] lemme try running checker with that [22:32:35] IIRC we decided to use the spark script to prevent having a very long CP, but if that works with that, I'd go for it :) [22:33:31] grrr, no good still [22:34:01] also ottomata, since checker failed for some time, manual fix will be needed (_IMPORTED file to be added on folders when checker failed) [22:34:15] batcave ottomata ? [22:34:18] aye [22:34:21] kj [22:37:24] hey a-team see you tomorrow! :] [22:37:29] bye mforns ` [22:38:07] byye [22:40:34] ottomata: if (arguments['--run']): [22:40:37] oops [22:40:44] ottomata: https://gist.github.com/jobar/3885b09de427ac640408 [22:42:29] joal: ottomata dropped all the tmp_last_access tables [22:44:18] (PS1) Ottomata: Fix camus checker classpath in CDH 5.5 [analytics/refinery] - https://gerrit.wikimedia.org/r/272901 (https://phabricator.wikimedia.org/T119646) [22:44:26] ottomata: https://gist.github.com/jobar/ecdcfcee1667ffb04984 [22:46:13] joal: https://gerrit.wikimedia.org/r/#/c/272901/ [22:46:13] ottomata: 2016-02-23-14-00-08 [22:46:14] 2016-02-23-14-20-08 [22:46:14] 2016-02-23-14-40-07 [22:46:14] 2016-02-23-15-00-17 [22:47:52] (CR) Joal: [C: 2 V: 2] "LGTM !" [analytics/refinery] - https://gerrit.wikimedia.org/r/272901 (https://phabricator.wikimedia.org/T119646) (owner: Ottomata) [22:48:17] ottomata: mediawiki:2016-02-23-12-15-09 [22:48:17] 2016-02-23-13-15-08 [22:48:17] 2016-02-23-14-15-07 [22:49:04] eventlogging: 2016-02-23-13-05-08 [22:49:04] 2016-02-23-14-05-08 [22:49:05] 2016-02-23-15-05-10 [22:55:30] madhuvishy: I don't think I have access to Hue -- is it using LDAP credentials? [22:56:54] ori: yes - do you have Hive access?
(analytics-privatedata-users group) [23:01:41] (yes ldap, but requires manual account sync, but yes, you need to be in that grou ptoo) [23:26:35] (PS1) Ottomata: Fix classpath for camus again, need more things! [analytics/refinery] - https://gerrit.wikimedia.org/r/272909 [23:27:42] ottomata: can you add me? [23:27:48] I had access before [23:29:45] (CR) Joal: [C: 2 V: 2] "Merging !" [analytics/refinery] - https://gerrit.wikimedia.org/r/272909 (owner: Ottomata) [23:30:03] ottomata: https://gist.github.com/jobar/ecdcfcee1667ffb04984 [23:31:50] * yurik pokes milimetric [23:34:25] ori MHMMmmm ExcUZE me that is an access request and will require 3 day wait time [23:34:44] does that count for people with root? [23:34:48] hehehe [23:35:02] I know that you're kidding, but I don't mind waiting [23:35:16] I can run queries from the shell on stat1002 [23:35:21] wait [23:35:25] hive queries? [23:35:36] yes [23:35:40] you won't be able to access the webrequest data unless you are in that group thoguh [23:35:58] the only pressing question I had, joa.l already answered [23:36:09] ori i gotcha you are root, can't see why that would apply [23:36:18] wfm [23:36:23] i'm feeling bold! [23:36:38] Analytics-Kanban: Corrext camus partition checker to not fail globally on one topic error - https://phabricator.wikimedia.org/T127909#2058132 (JAllemandou) [23:38:12] * ebernhardson updates a dozen references to analytics1027. [23:39:36] heh, eh? [23:39:41] where you got references to that? [23:39:48] ebernhardson: ^ [23:41:06] ottomata: https://wikitech.wikimedia.org/wiki/Discovery/Analytics#Re-run_a_failed_workflow and anything else that gives an oozie command, not sure why i used the explicit version must have copied them from somewhere [23:41:50] ebernhardson: [23:41:52] ottomata: yup: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Oozie#How_to_deploy_Oozie_production_jobs [23:41:54] you shouldn't need to use it [23:41:55] oh [23:41:56] sudo [23:41:57] that's why [23:42:04] ebernhardson: echo $OOZIE_URL [23:42:05] as your user [23:42:14] sudo is just loozing then env var [23:42:15] so you can do [23:42:29] sudo -u analytics-search oozie job -oozie $OOZIE_URL [23:42:36] to just explicitly fill it in from your own env [23:42:42] haha [23:42:45] i said 'loozing' [23:42:47] so many Zs! [23:42:57] heh :) Yup that seems to work, i'll update these docs [23:50:03] ok, laters a-team!