[00:18:26] Analytics, Analytics-Kanban, WMF-Product-Strategy: Backfill pageview data for March 2015 from sampled logs before transition to UDF-based reports as of April - https://phabricator.wikimedia.org/T96169#1210993 (kevinator) @nuria the scope of this task is only to parse the sampled logs for the month of... [00:23:42] Analytics-Cluster, Analytics-Kanban: Compute pageviews aggregates daily and monthly from April {crow} - https://phabricator.wikimedia.org/T96067#1211007 (kevinator) [00:38:23] bd808: I'm trying to provision vagrant after enabling the wikimetrics role and it fails with this [00:38:26] https://www.irccloud.com/pastebin/cNbykGlZ [01:30:45] bd808, madhuvishy : seems that it is cloning wikimetrics at "/vagrant/srv/wikimetrics" rather than "/vagrant/wikimetrics" where we would expect [01:46:52] Analytics-EventLogging, Analytics-Kanban, Collaboration-Team, Echo, Patch-For-Review: Echo events not validating on EL - https://phabricator.wikimedia.org/T95169#1211258 (Mattflaschen) Open>Resolved [02:12:21] bd808, nuria: aah. [02:13:09] madhuvishy: hmm... [02:13:20] we moved the code locations around recently [02:13:44] this looks like a file ownership issues from puppet colliding with your file sharing [02:13:51] I think I know how it needs to be fixed [02:14:04] could you file a bug and assign to me [02:14:18] bd808: sure [02:18:18] bd808: filing bug now.. does this have to do with this - http://serverfault.com/questions/487862/vagrant-os-x-host-nfs-share-permissions-error-failed-to-set-owner-to-1000? [02:19:12] madhuvishy: yeah. we have a trick to get away without needing no-root-squash [02:19:28] oh cool [02:26:07] bd808: https://phabricator.wikimedia.org/T96221 First phabricator task I've created :D [02:28:54] madhuvishy: you did it just right. 
thanks [10:02:41] Analytics-Tech-community-metrics, Phabricator, Wikimedia-Hackathon-2015, ECT-April-2015: Maniphest backend for Metrics Grimoire - https://phabricator.wikimedia.org/T96238#1211835 (Qgil) NEW a:Qgil [10:03:03] Analytics-Tech-community-metrics, ECT-April-2015: Maniphest backend for Metrics Grimoire - https://phabricator.wikimedia.org/T96238#1211835 (Qgil) [10:14:20] Analytics-Tech-community-metrics, ECT-April-2015: Ensure that most basic Community Metrics are in place and how they are presented - https://phabricator.wikimedia.org/T94578#1211887 (Qgil) [10:49:40] Analytics-Tech-community-metrics, ECT-April-2015: Provide list of oldest open Gerrit changesets without code review - https://phabricator.wikimedia.org/T94035#1211908 (Dicortazar) Hi, I'd like to confirm some examples: For instance, according to the Korma database and checking the Gerrit site, the two... [13:08:21] Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic, Google-Summer-of-Code-2015, Outreachy-Round-10: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1212036 (Sarvesh.onlyme) Hello, I am having a problem regarding compl... [13:19:59] Analytics-Cluster, Analytics-Kanban: Compute pageviews aggregates daily and monthly from April {crow} - https://phabricator.wikimedia.org/T96067#1212057 (JAllemandou) Kevin: here is an example of data (flat file) we would get as a result :https://hue.wikimedia.org/filebrowser/view//user/joal/pageviews_agg... [13:28:32] Ironholds: Heya [14:26:17] holaaa [14:26:25] Hi nuria [14:26:29] Howdy ? [14:28:44] moorrrniin [14:30:08] mforns: we have tasking right? [14:30:18] kevinator: tasking? [14:30:19] nuria, yes I'm in the batcave [14:30:32] yes [14:30:35] ah ok, it was empty just now [14:30:42] Weird ... 
[14:30:43] joseph and I must be in an alternate universe [14:30:52] We are in the batcave with kevinator [14:31:56] nuria, kevinator, joal I think I am in an alternate batcave :] [14:33:07] heey! where are you?! :o [14:38:01] hangouts being silly again? [14:38:30] I've heard of it doing such things [14:45:32] joal i don't like that your job hasn't been started yet [14:45:36] your CONCAT(year ...) [14:45:41] Yup, me neither [14:45:51] oozie stuff, he ? [14:46:14] ? [14:46:28] too many oozie jobs ? [14:46:43] i do have a few extra things running right now, as there are still refined partitions that didn't run [14:46:50] and those are dependencies for some other jobs [14:47:02] I didn't know there was missing partitions [14:47:24] joal, run [14:47:25] refinery-dump-status-webrequest-partitions --datasets webrequest [14:47:40] (I add /srv/deployment/analytics/refinery/bin to my PATH) [14:48:25] Nice ! [14:50:16] I still don't get it why I don't get a share on the default queue :( [14:50:23] yeah me neither [14:50:36] maybe your job is large? [14:50:43] and it knows? [14:50:44] i'm not sure [14:51:19] no, one hour of data [14:54:27] ottomata: question for ya [14:54:30] ottomata: yt? [14:54:34] yup [14:54:36] hiya [14:54:44] ottomata: if we want to test impala [14:55:00] ottomata: should we install it in labs 1st , no puppet, all cruft [14:55:25] ottomata: or should we go for puppetization and install in cluster right away? 
[14:57:25] you can do in labs if you want, my plan was: [14:57:27] test in vagrant [14:57:30] see how it works with yarn [14:57:34] then if all is ok [14:57:39] puppetize, try in labs [14:57:41] then if that is good [14:57:42] install in prod [14:57:46] there are a lot of moving parts though [14:57:50] and i'm worried about resource allocation ow [14:57:56] we already are feeling the pinch [14:58:02] yeah [14:58:03] and impala kinda grabs things for itself [14:58:05] agreed [14:58:31] not true I think about resources: it asks yarn [14:58:36] ottomata: --^ [14:58:38] yes, it asks yarn [14:58:48] but, you were saying about llama and caching [14:58:49] right? [14:58:53] i think it tries to hold onto resources [14:58:57] but we already are just in resources with what we have ... [14:59:29] ottomata: Normally it releases resource in what I recall [14:59:35] hmm, ok [14:59:36] well, we will see :) [14:59:47] joal, I think we need a special queue for oozie launcher. am reading about DRF [14:59:50] https://www.cs.berkeley.edu/~alig/papers/drf.pdf [14:59:50] you know it? 
[14:59:57] nope [15:00:00] sounds interesting [15:00:16] I do agree with dedicated queue [15:01:18] ja, dno't know if DRF is appropriate yet, but neither fifo nor fair really make that much sense...not sure [15:01:43] was thinking maybe we can specify somehow what type or how much resources or share a launcher will need [15:01:45] cause it isn't much [15:03:40] agreed [15:27:57] ottomata: My job started [15:28:39] I think it's because at one moment, the fair share for the default queue went below it's overuse ratio, and started new jobs [15:29:01] yes [15:29:05] makes sense [15:29:11] which is a reason why we should move the oozie jobs out of there too [15:29:12] i think [15:29:23] i dunno, i'm kinda guessing here :/ [15:29:23] Correct [15:29:46] Particularly the other way around: my job shouldn,t block oozie ;) [15:30:59] ottomata: new standup starting [15:32:02] EEK [15:32:31] oof internet being crappy [15:32:36] trying [15:43:18] Analytics-Kanban, Analytics-Visualization: Build Multi-tennant Dashiki (host different layouts) - https://phabricator.wikimedia.org/T88372#1212430 (Milimetric) a:Milimetric [15:51:30] Analytics-Engineering, Analytics-Wikimetrics: "Validate Again" functionality is broken - https://phabricator.wikimedia.org/T78339#1212477 (Milimetric) a:madhuvishy [15:54:02] Analytics-Engineering, Analytics-Kanban, Analytics-Wikimetrics: "Validate Again" functionality is broken - https://phabricator.wikimedia.org/T78339#842830 (Milimetric) [16:01:12] Analytics-Tech-community-metrics, ECT-April-2015: Provide list of oldest open Gerrit changesets without code review - https://phabricator.wikimedia.org/T94035#1212559 (Qgil) Yes, I think we should keep the rules simple and list these as well. Even if these two changesets don't represent the case of a Ger... [16:12:24] Analytics-Tech-community-metrics, ECT-April-2015: Ensure that most basic Community Metrics are in place and how they are presented - https://phabricator.wikimedia.org/T94578#1212614 (Qgil) >>! 
In T94165#1212592, @Ironholds wrote: > Fair. I have no idea what the resourcing for this will be, with Erik's dep... [16:19:12] Analytics-Tech-community-metrics, ECT-April-2015: Ensure that most basic Community Metrics are in place and how they are presented - https://phabricator.wikimedia.org/T94578#1212633 (Ironholds) I'd be happy to help, but the 23rd is straight after I (probably? It's...not entirely clear) shift to working fo... [16:27:53] Analytics-Tech-community-metrics, ECT-April-2015: Ensure that most basic Community Metrics are in place and how they are presented - https://phabricator.wikimedia.org/T94578#1212665 (Qgil) [16:35:03] ottomata2: any news on emails from eventlogging ? [16:39:20] news [16:39:20] ? [16:39:36] I don't receive any email, so I wonder :0 [16:39:38] haven't gotten any since monday [16:40:36] I guess it's in code review, right ? [16:42:28] thx tnegrin [16:42:28] :) thank you [16:42:53] ottomata: got an error from maven on PKIX certificate [16:42:56] first time ... [16:42:59] Any idea ? [16:43:18] joal: naw it is deployed, there haven't been any emails [16:43:26] maven on PKIX cert? [16:43:29] yup [16:43:32] we switched maven to https recently [16:43:34] SSL cert issue [16:43:39] you compiling on your local machine? [16:43:40] RIIIIGHT [16:43:43] Yup [16:43:46] I do sometimes :) [16:44:04] So I should add the cert to my java certl list [16:44:07] mouarf [16:44:53] hm. [16:44:55] i didn' thave any trouble [16:45:23] i just changed my configs to use https rather than http url, but i tested it and it worked even if i didn't do that [16:45:26] since it redirects to https [16:45:51] when you say configs, whAt do you mean ? [16:46:23] settings.xml [16:46:28] ~/.m2/settings.xml [16:46:32] right [16:46:46] hm, we should change that in pom.xml too [16:47:48] for the moment, doesn't work as is for me [16:47:48] on it... 
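[editor's note] The settings.xml fix ottomata describes above can be sketched as a mirror entry pointing at the HTTPS archiva URL. The mirror id and the catch-all `<mirrorOf>` scope are illustrative guesses, and the file is written locally here rather than straight into ~/.m2:

```shell
# Sketch of the ~/.m2/settings.xml change discussed above: point Maven's
# mirror at the HTTPS archiva URL so artifact downloads don't rely on the
# http -> https redirect. Mirror id and <mirrorOf> scope are guesses.
cat > settings.xml <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>wmf-archiva</id>
      <mirrorOf>*</mirrorOf>
      <url>https://archiva.wikimedia.org/repository/mirrored/</url>
    </mirror>
  </mirrors>
</settings>
EOF
# merge into ~/.m2/settings.xml, or use it directly:  mvn -s settings.xml package
grep -c 'https://' settings.xml   # sanity check: the mirror URL is https
```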
[16:47:48] Will do some research [16:47:48] hm [16:47:48] At least I know why now ;) [16:47:52] thx [17:25:20] joal: https://gerrit.wikimedia.org/r/#/c/204548/ [17:33:56] ottomata, hi, could you take a quick look at this - does reducing the number of reducing cause issues? I noticed that in the log reducer percentage kept droping to 0% which is very weird [17:33:56] https://gerrit.wikimedia.org/r/#/c/204153/1/scripts/countrycounts.hql,cm [17:34:18] *reducers [17:35:07] yurik: your one reducer is being preempted :( [17:35:07] https://yarn.wikimedia.org/proxy/application_1424966181866_88626/mapreduce/attempts/job_1424966181866_88626/r/KILLED [17:35:43] yurik: we reduced the number of jobs that could preempt things yesterday [17:36:03] * yurik is googling "preempting reducers" [17:36:04] however, i am currently trying to get some refine jobs to catch up, and I mvoed a few of them into the essential queue about 30 mins ago [17:36:15] essential queue may do aggressive preempting [17:36:40] so, if you are seeing this behavior in your current jobs (these reducers were all preempted in the last 30 mins), this might be why [17:37:07] would it help if i kept the old code? [17:37:27] ottomata, with DISTRIBUTE BY printf('%d-%02d-%02d', ${year}, ${month}, ${day}); [17:37:40] instead of ET mapred.reduce.tasks=1; [17:37:44] *SET [17:38:13] yurik: maybe, but also maybe not, i dont' know. you would have more reducers, and maybe each one would run faster, and therefore any single preemption wouldn't slow you down as much [17:38:20] right now its setting you back by 10-20 mins each time it happens [17:38:36] Analytics, MediaWiki-API-Team, MediaWiki-Authentication-and-authorization: Create dashboard to track key authentication metrics before, during and after AuthManager rollout - https://phabricator.wikimedia.org/T91701#1212974 (Tgr) * Successful logins via Special:UserLogin - `LoginAuthenticateAudit` hoo... 
[17:39:23] ottomata, in the logs i see that not a millisecond was added to either of two jobs for the past half an hour or so [17:39:46] 91% and 24% [17:39:54] ok, makes sense then. i'd say just wait, i'm still trying to get the mess from yesterday cleaned up [17:40:01] joal recommended that you use one reducer, right? [17:40:08] yes [17:40:16] ok, i didn't follow that, but he probably had good reason to :) [17:40:38] well... if it causes much bigger delays... i dunno ) [17:40:48] the idea was to optimize, not the other way around ))) [17:40:57] yurik: let me explain [17:41:19] Using distribute by sends data to reducers based on the distribution key [17:41:34] yurik: it will cause bigger delays if you get preempted, which the cluster is more likely to do right now, because we are fixing things, and your job is lower priority! [17:41:35] oh, so in this case everything gets sent to the same one [17:41:42] yurik: you get it [17:41:58] So yeah, reducers would have finished, but maybe not THE one ;) [17:42:03] yurik: --^ [17:42:44] yurik: makes sense ? [17:42:52] joal, would it help if i moved everything into a subquery, and wrapped it with a "select * from (subquery) distribute by $date" [17:43:05] this way it would use multiple reducers [17:43:08] why do you need to distribute by a key? [17:43:13] can't you just tell it to use more reducers manually? [17:43:16] to have just one file [17:43:29] the result is usually 30 files, most of them empty [17:43:35] I mean, you can't win every side :) [17:43:37] and since i do all that parsing by hand afterall [17:43:43] yurik: how big is the data? [17:43:51] result - 5mb per day [17:43:53] why not just cat the files into one when you are done? [17:44:33] i could, was trying to keep it clean and easy to browse.
if causes a lot of slowdowns, i will simply remove all the distribute by and max reducers [17:44:54] It's for you mainly [17:45:14] right, was just trying to keep the text files more browsable [17:45:33] ottomata, btw, i can't do cat on /mnt - i could load them all though [17:45:43] Having more reducers = less time spent for the job = less chances to get preempted all day long and never finsih [17:45:43] sure you can, just don't write to mnt [17:45:44] or [17:45:44] hdfs dfs -cat path/to/dir/* > onefile [17:45:46] or [17:45:58] hdfs dfs -cat path/to/dir/* | hdfs dfs -put - path/to/onefile [17:46:31] yurik: i'd recommend not using /mnt for anything except for browsing though [17:46:36] is it safe to cat things? if they were all generated by "group by", they wouldn't need to be added further, right? [17:46:37] or (if using compression) hdfs dfs -text /path/to/dir/*.snappy > res_file [17:46:38] its more a convenience than a reliable thing [17:46:59] nah, interesting approaches, but i don't want to touch /mnt [17:47:07] none of those use /mnt [17:47:08] so yeah, i will simply read files from it [17:47:15] hdfs dfs -put [17:47:20] that puts into hdfs [17:47:23] not /mnt/hdfs [17:47:40] doesn't /mnt/hdfs reflect what's in hdfs? 
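[editor's note] The concatenation commands ottomata lists all follow one pattern: stream every part file and write a single output. A local stand-in, with plain files in place of HDFS part files (the file names are made up):

```shell
# Local stand-in for the "hdfs dfs -cat path/to/dir/* > onefile" pattern
# above: many small reducer part files (mostly empty, as yurik observed)
# collapse into one browsable file. File names here are invented.
mkdir -p parts
printf 'us\t100\n' > parts/part-00000
printf 'fr\t42\n'  > parts/part-00001
: > parts/part-00002                 # an empty part file is harmless to cat
cat parts/part-* > counts.tsv        # cluster version: hdfs dfs -cat .../part-* > counts.tsv
wc -l counts.tsv
```

This is also why yurik's "is it safe to cat" worry is unfounded: a GROUP BY sends each key to exactly one reducer, so concatenating part files can never produce a key that needs further merging.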
[17:47:54] yes....but uses a user mounted filesystem thingy that is unreliable [17:48:02] if you use hdfs commands [17:48:04] it doesn't use that [17:48:11] hdfs commands talk to hdfs directly [17:48:33] that's why we mount /mnt/hdfs readonly [17:48:35] i don't trust it :) [17:49:09] It's funny: ottomata doesn't trust /mnt/hdfs, yuri doesn't trust hdfs dfs ;) [17:49:48] yurik: As ottomata says, Hive is usually good at guessing the number of resources you need [17:49:57] hehe, i would be ok with using direct python funcs to call hdfs, but absent of that, i would much rather just read /mnt/hdfs from python as files [17:50:00] So you should trust it, then aggregate the result files [17:50:00] makes things simple [17:50:09] gotcha, will do [17:50:15] thanks for all the explanation! [17:50:24] yuri, i would like to install this...maybe one dayyyy [17:50:25] https://github.com/spotify/snakebite [17:50:36] it is faster and kinda nicer than the default hdfs cli [17:50:41] lovely! yes please :D [17:50:56] ottomata, can we have hql queries as part of it too? [17:51:06] haha [17:51:14] um, i mean, hive has a jdbc connector [17:51:46] is it easy to use from python? [17:51:52] dunno, never tried it [17:52:12] * yurik can't wait for full scale MIT licensed .NET deployment on wiki servers [17:52:21] i want to use proper LINQ :) [18:04:13] meh, one job is stalling, another - 1 second of reduction per minute :) [18:04:15] sigh :) [18:11:31] milimetric: after I've provisioned wikimetrics in vagrant, how do i see if it's up and running? [18:14:37] milimetric: sorry just read README. Figured :) [18:23:02] madhuvishy: btw, we weren't expecting you to just figure these tasks out without any help, I'm happy to hangout and talk about any of them [18:23:21] like, point out pieces of wikimetrics, EL, etc. [18:24:44] milimetric: :) yeah, nuria helped with EL yesterday.
Just looking at Wikimetrics now, will poke with questions in a bit [18:24:51] k [18:38:58] milimetric, do you have 15 minutes today to talk about EL problems? [18:39:10] mforns: definitely, anytime [18:39:42] milimetric, for me it could be now, let me know your preference :] [18:40:23] to the batcave! [18:40:35] xD [18:40:42] joal: if you're interested as well ^ we're batcaving on EL [18:40:51] now ? [18:40:55] milimetric: --^ [18:41:10] joal: yes [18:52:21] o/ joal [18:52:42] hey halfak, in a meeting, will ping you when ready [18:52:46] kk [18:53:06] I'm heading to a meeting soon too. Don't sweat it. :) [19:02:15] nuria: is there documentation on how to use mount to load local folders on mw-vagrant? i think i'm a little confused [19:03:48] Analytics-Tech-community-metrics, ECT-April-2015: Ensure that most basic Community Metrics are in place and how they are presented - https://phabricator.wikimedia.org/T94578#1213391 (Qgil) @Ironhold, then you could dump your thoughts here before moving to the new team, and we will continue from there. :) [19:15:57] halfak: ready whenever [19:16:15] I'm in a meeting, but I'll be bad. [19:16:22] huhub [19:16:45] So, I am considering having my stream processor make external requests (via HTTP) in a mapper. [19:16:50] Good idea / Bad idea ? [19:17:01] hmm [19:17:17] Depends on the back of server answering :) [19:17:31] Basiccaly you parallelize http calls [19:17:40] Can be called DDOS in some countries ;) [19:17:47] Sort of. Most of the work is finding something that *needs* an http call. [19:17:58] :D [19:18:00] So it's going to spend most of the time filtering and a very small amount of time requesting data. [19:18:45] If you need one or two calls per message, and you have 15 messages per second, that's already some good pressure [19:19:06] For non-acknowledge bots, nettiquette is one call per second [19:19:19] for instance [19:20:34] ottomata: Do you think I have an archiva account ? 
[19:21:56] nope, you don't [19:22:05] i can get you the deploy pw though [19:22:12] i want to make that thing work with ldap [19:22:21] there was some bug when i installed it originally, and i never got it to work [19:22:24] i think probably they fixed it now [19:22:59] hmm [19:23:09] still not working with https for me :( [19:23:12] milimetric: Can't login to metric.wmflabs.com - it says it's locked or sth [19:23:14] weird [19:23:20] .org sorry [19:23:21] so ok, joal, let's figure this out [19:23:33] changed the url in the pom [19:24:41] madhuvishy: you mean: https://metrics.wmflabs.org/ ? [19:24:42] i just deleted one of my dependecies [19:24:48] and it redownloaded [19:24:48] Downloading: https://archiva.wikimedia.org/repository/mirrored/org/apache/hadoop/hadoop-client/2.3.0-cdh5.0.2/hadoop-client-2.3.0-cdh5.0.2.pom [19:24:49] or you're trying to log into the instance in labs? [19:24:51] jus tfine [19:25:00] what have I got different? [19:25:03] milimetric: yeah that [19:25:10] Have no idea :( [19:25:17] The correct certificates seem [19:25:21] milimetric: it worked now :/ [19:25:23] do you have a ~/.m2/settings.xml [19:25:23] ? [19:25:28] nope [19:25:30] hm [19:25:32] madhuvishy: you had "metric" and it's "metrics" [19:25:39] ok i will remove mine, but i don't htink thats it... [19:25:49] milimetric: nooooo. i typed it wrong :D [19:26:01] oh :) ok, then blame it on the gremlins [19:26:09] still no problems! [19:26:17] seems that I need to import the certificates ottomata [19:26:24] weird [19:26:49] milimetric: :D okay so i want to see where this validate again functionality is supposed to be, but don't have any cohorts to see it? [19:27:29] javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target [19:28:03] joal, you are os x, right? or liniux? 
[19:28:08] linux [19:28:40] joal I'm worries about http requests per second [19:28:54] yeah halfak I can hear that [19:28:56] I'll only need to make a request 1 per 200k records [19:29:17] depends on the rates of your records [19:29:23] But seems low :) [19:29:24] joal, googline [19:29:26] googling [19:29:28] http://commandlinefanatic.com/cgi-bin/showarticle.cgi?article=art032 [19:29:31] yeah, so do I [19:29:34] I guess I could do this in two passes -- one to filter out the problematic records and another to process just them. [19:29:35] ottomata: --^ [19:29:45] hmm [19:30:00] But I guess my concern is more general about best practices for requesting external resources in a hadoop job [19:30:28] Might be worse to do it in a separate executor : more requests at once [19:30:41] halfak: You can do it, no problem, just be concious ;) [19:33:32] madhuvishy: got your questions answered? [19:33:46] nuria: the mount one, no. [19:34:03] madhuvishy: there are no instructions cause there is nothing to do [19:34:28] madhuvishy: by default your /vagrant folder is mounted on the vm [19:34:35] so anything under it is visible [19:35:35] aaah. so if i enable wikimetrics role in vagrant, and want to develop in local. I pull the project into local /vagrant/srv/wikimetrics and it'll be synced? [19:35:52] rather /vagrant/wikimetrics [19:36:01] nuria: i think that's changed now [19:36:07] the "role" should put your checkout there if bd808 changes worked [19:36:39] madhuvishy: confirm with bd808 what should be right location [19:36:59] nuria: yeah, that change worked fine. I see wikimetrics in /vagrant/srv/wikimetrics [19:37:21] madhuvishy: remove role , remove depot, run vagrant provision [19:37:38] what is depot? 
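[editor's note] halfak's plan (filter almost everything locally, make a rare throttled HTTP call) can be sketched as a stream filter. Everything here is a stand-in: `needs_lookup`, the `NEEDS_HTTP` marker, and using `echo` in place of the real HTTP client:

```shell
# Sketch of the mapper pattern halfak describes: scan the stream, and
# only for the rare record that needs it, make one throttled external
# call (joal's netiquette: ~1 call per second). All names are stand-ins.
needs_lookup() {
  case $1 in *NEEDS_HTTP*) return 0 ;; *) return 1 ;; esac
}
process_stream() {            # $1 = command to run per matching record
  fetch=$1; requests=0
  while IFS= read -r record; do
    if needs_lookup "$record"; then
      [ "$requests" -gt 0 ] && sleep 1   # throttle between external calls
      "$fetch" "$record"                 # a real job would curl an API here
      requests=$((requests + 1))
    fi
  done
  echo "external requests: $requests"
}
printf 'rec1\nrec2 NEEDS_HTTP\nrec3\n' | process_stream echo > demo.out
cat demo.out
```

With one matching record per 200k, as halfak estimates, the throttle almost never fires; it only protects the backing server when matches cluster.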
[19:37:48] madhuvishy: add role again and let's see if it works, that way we are using his changes from scratch [19:37:57] madhuvishy: ay sorry "code repo" [19:38:05] joal: [19:38:07] something like [19:38:12] keytool -printcert -rfc -sslserver archiva.wikimedia.org > /tmp/archiva.wikimedia.org.pem && keytool -importcert -file /tmp/archiva.wikimedia.org.pem [19:38:12] ? [19:38:18] not sure what keystore path shoudl be though [19:38:23] i found one on my mac at [19:38:30] /Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home/jre/lib/security/cacerts [19:38:33] ottomata: so we are ok sending jobs to cluster now right? [19:38:38] nuria: ja go ahead [19:38:44] i'm still cleaning stuff up but you should be good [19:38:44] yeah, tried that [19:38:47] will try again [19:38:48] hm [19:38:51] ottomata: --^ [19:38:58] i dunno, imean, i don't know that i will help, i can't reproduce! [19:41:39] ottomata: yeah .. I know [19:41:44] Thanks anyway :) [19:48:32] ottomata: got it to work, didn't add the cert to the right file [19:48:37] Thx again ! [19:49:45] ahhh, cool ok [19:49:46] phew [19:49:49] glad its working [19:49:56] So do I :) [19:50:04] a pain though, but it works ;) [19:54:52] (PS1) Ottomata: Build against cdh5.3.1 packages [analytics/refinery/source] - https://gerrit.wikimedia.org/r/204614 (https://phabricator.wikimedia.org/T93952) [19:54:56] joal: ^ [19:56:34] (CR) Joal: [C: 2] "I'd love to put those as variables :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/204614 (https://phabricator.wikimedia.org/T93952) (owner: Ottomata) [19:57:08] joal: do you know how to do that? [19:57:12] i feel like i've tried that before [19:57:22] also, the only variable you could add is cdh_version [19:57:24] oh yeah, works great [19:57:31] When I don't have cert issues ;) [19:57:33] the actual package version will change with each cdh release [19:57:54] (CR) Ottomata: [C: 2 V: 2] "Ok! If you can do it, then yay!" 
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/204614 (https://phabricator.wikimedia.org/T93952) (owner: Ottomata) [19:59:40] awesoooome, things coming back! [19:59:47] finally got all refined partitions back in place [19:59:56] which immediately launched a bunch of jobs :) [20:00:07] Thansk amilion for that ottomata [20:00:22] yup, so, i haven't restarted anything but the refine and load jobs today [20:00:29] so the other ones are still using the default queue for oozie:launcher [20:00:39] joal and mforns: here are all the gaps I found in Navigation timing: [20:00:43] its kind of annoying to try and restart them while they are running if i want them to just go, so [20:00:44] https://www.irccloud.com/pastebin/XvBHpChL [20:00:57] i'm going to wait, and eithe rrestart them before I leave for the day, or restart them tomorrow [20:01:03] joal: one thing i'm finding really annoying about this setup [20:01:10] is that there are producitno jobs that don't have bundles [20:01:25] hmmm The aggreagation ones ? [20:01:29] but, listing coordinators does not indicate which ones are top level coordinators, meaning that they must be restarted and maintained as coordiantors [20:01:30] ottomata: --^ [20:01:31] milimetric, awesome! [20:01:33] mforns / joal: notice that they happen a lot more in the last month [20:01:44] milimetric, yes... [20:02:00] the number of minutes and exact start times are rough, they could be off by 20 minutes or so [20:02:04] milimetric, and they seem to stop yesterday.. [20:02:08] (because I'm rounding to the nearest 10 minutes) [20:02:12] milimetric, sure [20:02:21] well, or it just hasn't happened yet today :) [20:02:47] Has anything happened around beginning of april ? 
[20:02:47] my script was actually showing some missing today but it's at the end of the day so I don't know if it's replag [20:02:57] yeah, that's when it looks to start going bad [20:02:58] milimetric, did you know that Edit events saw a drop yesterday at 00 [20:03:12] mforns: that makes sense, they are sampling now [20:03:13] joal: [20:03:13] mediacounts/archive [20:03:13] mediacounts/load [20:03:13] mobile_apps/uniques/daily [20:03:13] mobile_apps/uniques/monthly [20:03:14] pagecounts-all-sites/load [20:03:22] milimetric, since then, EL traffic has been much lower [20:03:26] ottomata: when you say no bundle for prod, it's only aggregation jobs, right ? [20:03:34] mforns: yeah, that's great, I should go thank them [20:03:37] Right ottomata [20:03:37] milimetric, oh interesting [20:03:53] We could probably bundle that ;) [20:04:03] joal: i wonder if we should make bundles for these anyway, even if they only have one coordinator? [20:04:06] not sure. [20:04:09] it would make it easier to manage [20:04:22] but, then, more XML :( [20:04:24] milimetric, it may very well be that the problem ceases now (we'll need to fix it anyway, but..) 
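[editor's note] Since joal and ottomata leave the "bundle everything" question open: wrapping a lone coordinator in a bundle is only a few lines of the extra XML ottomata dreads. A sketch, with a placeholder app name, coordinator name, and app-path property rather than the real refinery jobs:

```xml
<!-- Sketch: a bundle wrapping a single coordinator, so the job shows up
     in Hue's bundle list as a top-level production app. Names and the
     app-path property are placeholders, not the real refinery config. -->
<bundle-app name="mediacounts-load-bundle" xmlns="uri:oozie:bundle:0.2">
  <coordinator name="mediacounts-load-coord">
    <app-path>${coordinator_file}</app-path>
  </coordinator>
</bundle-app>
```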
[20:04:28] let's call them aggregation and bundle if you wish [20:04:54] milimetric, but at least we don't loose the data [20:04:56] hmmmm, :) if it stops, we jump up and down and drink some champagne [20:05:00] it is quite nice to go here: [20:05:00] http://localhost:8888/oozie/list_oozie_bundles/ [20:05:03] oops [20:05:04] yeah, we should backfill [20:05:07] milimetric, xD [20:05:09] http://hue.wikimedia.org/oozie/list_oozie_bundles/ [20:05:11] ack [20:05:13] http://hue.wikimedia.org/oozie/list_oozie_bundles/ [20:05:22] https://hue.wikimedia.org/oozie/list_oozie_bundles/ [20:05:24] iuh, [20:05:25] mforns, milimetric make sure to look at sal logs as there were many outages for which there will be no data [20:05:27] *uhm [20:05:31] and see all of the production apps that need managed [20:05:41] sure others could submit bundles, but it is still better than looking in coordinators [20:06:19] iunno [20:06:26] mforns, milimetric like "disk full" events or times when deploy bad code and nothing got written right [20:06:35] nuria, milimetric, I looked at it yesterday and I did not find any incidents happening at the same time as the gaps I found (would have to look to that newly found gaps) [20:07:05] hmm, iiinterseting, joal, logs of unassigned jobs in the oozie queue :/ [20:07:10] mfornsm, milimetric also a service deploy restart (the way we do it) causes at least a couple minutes w/o data [20:07:10] guess that isn't working as is [20:07:29] mforns, milimetric : probably more if i do it as i just go super slow [20:07:52] mforns, milimetric so that is a given with the way we deploy [20:08:04] nuria, aha [20:08:20] ottomata: oozie needs strong preemption :) [20:08:38] Small weight (small number of jobs), but always run them [20:08:43] I think [20:08:43] mforns / joal: here's how I got the numbers: https://gist.github.com/milimetric/b3f3d34d8d6a77f28463 [20:09:20] joal, i wonder if I could just up minResouces? 
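[editor's note] The gist itself isn't reproduced in the log, but the check milimetric describes (round event times down to 10-minute buckets, flag buckets with no events) has a small local analogue; the input here, minutes since midnight, is fabricated, not EventLogging data:

```shell
# Toy analogue of the 10-minute gap check: bucket each event's minute
# down to a multiple of 10, then report buckets with no events at all.
# The event minutes below are invented.
printf '%s\n' 0 7 12 41 47 |
awk '{ seen[int($1 / 10) * 10] = 1; if ($1 > max) max = $1 }
     END { for (b = 0; b <= max; b += 10)
             if (!(b in seen)) print "gap at minute", b }'
```

By construction, anything shorter than a full bucket is invisible, which matches the conversation's decision to treat only chunks well over 10 minutes as real gaps.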
[20:09:21] mforns, milimetric also, note that navigationTiming extension has changed a bunch., perhaps a better table to look at is serversideaccountcreation [20:09:26] • minResources: minimum resources the queue is entitled to, in the form "X mb, Y vcores". For the single-resource fairness policy, the vcores value is ignored. If a queue's minimum share is not satisfied, it will be offered available resources before any other queue under the same parent. Under the single-resource fairness policy, a queue is considered unsatisfied if its memory usage is below its minimum memory share. Under dominant r [20:09:33] as that one hasn't changed since the beginning of time [20:09:37] nuria: the chunks that this script found missing were all pretty big [20:09:40] gonna just try that manually [20:09:43] it rarely found just 10 minutes of missing data [20:09:59] Maybe, but I don't see that as a game changer [20:10:02] but it doesn't look sub-10 minutes so it's fairly safe [20:10:18] milimetric: that's really cool :) [20:10:22] milimetric: did you look in a a table that is NOT that one, cause note , if events do not validate -at all - for a while you will find yourself with gaps [20:10:37] yeah, since it is being starved in its own queue :/ [20:10:37] hm [20:10:48] milimetric: so checking a table that is very stable (like Serversideaccountcreation) is a good check & balance [20:10:55] nuria: the reason I picked that one is because I know it's fairly consistent and small so the query would be quick. It's easy to try any other table in the query above [20:11:14] milimetric: it is not consistent, they changed it recently quite a bit [20:11:14] i'll check SSAC now [20:11:16] milimetric: i had to re-do the alarms [20:11:20] nuria: but they changed the version right? 
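[editor's note] The minResources knob ottomata quotes lives in the fair scheduler's allocation file; a dedicated launcher queue with a guaranteed floor might look like the fragment below. The queue name and sizes are guesses, not the cluster's real allocation:

```xml
<!-- Sketch of a guaranteed-floor queue for oozie launchers in
     fair-scheduler.xml. Queue name and sizes are guesses. Per the doc
     quoted above, a queue below its minResources is offered resources
     before its siblings, so tiny launchers would never be starved. -->
<allocations>
  <queue name="oozie-launcher">
    <minResources>4096 mb, 4 vcores</minResources>
    <weight>0.5</weight>
  </queue>
</allocations>
```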
[20:11:52] milimetric: the schema /the flow of events and what the extension itself captures [20:12:08] ha, joal, one of these launchers says: [20:12:09] milimetric: so that is why i query that table, serversideaccountcreation and whatever other one [20:12:13] TotalMemoryNeeded [20:12:13] 3072 [20:12:13] TotalVCoresNeeded [20:12:13] 1 [20:12:17] that is a running launcher [20:12:30] i guess that is in MB, since it is running in a jvm [20:12:33] seezh [20:12:54] I guress s yes [20:13:00] milimetric: not looking at less than 10 sounds good, that makes total sense [20:13:04] wonder if we can ask for less... [20:13:11] Think to double check : default mapper jvm allocation [20:13:23] ottomata: yes, we can configure that for sure [20:13:28] yeahhh [20:13:28]    SET mapreduce.{map|reduce}.memory.mb=; [20:13:34] with oozie.launcher in front [20:14:22] My guess is that 512m should be enough :) [20:14:28] if not 256 ;) [20:14:31] nuria: so SSAC is almost the same, it showed a few different intervals but roughly the same [20:14:40] milimetric: ok, GOOD [20:14:46] ja 256 should be fine [20:14:48] i think [20:14:53] NT has 159 10 minute chunks and SSAC has 161, on quick look mostly the same timespans [20:14:54] agreed [20:14:55] milimetric: that way we know is a "system" problem [20:15:10] And I still think that preemption is strongly recommended [20:15:13] oh yeah, that was just meant to give an idea, not to be authoritative [20:15:18] milimetric, mforns not related to schema or extension deployment (which we also have plenty of, specially in mobile apps) [20:15:29] yes [20:16:23] nuria, the gaps that milimetric found for that table, do match with the ones I found, that were checked on Edit, NavigationTiming and another one which I do not remember [20:17:00] mforns: I'll paste you the SSAC gaps, they are actually a little bit different now that I look at it closer [20:17:05] nuria, milimetric, so I'm pretty sure that if not all, the majority of the gaps are across all tables 
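[editor's note] The `oozie.launcher.`-prefixed override being discussed would appear as a plain Hadoop property in a workflow's configuration (or, per the later review question, its `<global>` section). A sketch; 256 is only the value floated in the conversation, and whether it survives YARN's minimum container size is the open question:

```xml
<!-- Sketch of the launcher memory cap discussed above, as a property a
     workflow (or its <global> section) could carry. 256 MB is the value
     floated in the conversation, not a verified setting. -->
<property>
  <name>oozie.launcher.mapreduce.map.memory.mb</name>
  <value>256</value>
</property>
```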
and are not related to restarts [20:17:05] especially early this year ( the later ones match ) [20:17:15] milimetric, ok [20:17:51] oh [20:18:17] mforns: ok, GOOD, from 03/22 (leaving feb aside) what was the day we spawned the box again? [20:18:59] the db box [20:19:00] https://www.irccloud.com/pastebin/kbEZA9i4 [20:19:28] mforns / nuria / joal: that's ServerSideAccountCreation: ^^ [20:19:50] same rough pattern, some differences [20:20:44] milimetric, mforns : ya , everything after 03/19 looks related [20:20:54] nuria, Apr 3rd [20:21:52] ok [20:22:34] man... what a problem this one is.... [20:25:01] nuria, I'm trying to tcpdump the outgoing packets of the consumer to see the event timestamps [20:25:19] mforns: we swapped db box on february 26/27 [20:25:52] nuria, oh, I looked for something related to eventlog1001 in SAL and found Apr 3rd, sorry for that. [20:26:41] (PS1) Ottomata: Set memory that oozie:launcher map task takes to 256MB [analytics/refinery] - https://gerrit.wikimedia.org/r/204621 [20:26:51] joal: ^ i haven't tried that yet [20:26:56] i will tomorrow [20:28:01] mforns, milimetric and for those gaps of data, the data is IN the logs (easy to check for serversideaccountcreation as server logs are smaller) [20:28:35] yea, true. we should be able to fairly easily backfill as this doesn't even need validation [20:28:44] milimetric, nuria, aha [20:31:19] POLO time laters! [20:31:26] mforns, milimetric ok, 1st thing would be to check logs and see those events are there for given intervals [20:33:22] nuria, I've already checked that for some (not all) intervals, it seems all events are there, validated by the processor [20:33:38] mforns: k, good [20:33:54] nuria, you think we should check all intervals? [20:33:54] mforns: this is... SO.... puzzling....!!! [20:34:01] mforns: nah [20:35:26] (CR) Joal: "For configuration facility, values should be passed as parameters instead of hard-coded. 
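(The backfill idea milimetric describes — events for the gap intervals are still in the server logs and have already been validated, so they only need to be replayed — can be sketched like this. The function and data shapes are hypothetical; the real backfill would read the sampled server log files:)

```python
from datetime import datetime

def events_to_backfill(gaps, log_events):
    """Keep only log events whose timestamp falls inside a known gap.

    gaps:       list of (start, end) datetime pairs found by the gap scan
    log_events: list of (timestamp, raw_line) pairs parsed from server logs
    """
    return [(ts, line) for ts, line in log_events
            if any(start < ts < end for start, end in gaps)]

gaps = [(datetime(2015, 3, 22, 10, 0), datetime(2015, 3, 22, 10, 30))]
log = [(datetime(2015, 3, 22, 9, 59), 'before gap'),
       (datetime(2015, 3, 22, 10, 15), 'inside gap'),
       (datetime(2015, 3, 22, 10, 45), 'after gap')]
print(events_to_backfill(gaps, log))
```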
Why not have them in the coordinators or bundle" [analytics/refinery] - https://gerrit.wikimedia.org/r/204621 (owner: Ottomata) [20:37:05] madhuvishy: did you provision with role? is the code checked out? [20:37:16] nuria: yes [20:37:29] madhuvishy: in what location? [20:37:51] it's at /vagrant/srv/wikimetrics [20:39:44] madhuvishy: and what do logs at /var/log/upstart say? [20:41:16] (CR) Nuria: "Can't this be a "global" oozie property?" [analytics/refinery] - https://gerrit.wikimedia.org/r/204621 (owner: Ottomata) [20:41:46] https://www.irccloud.com/pastebin/5KF7yLDD [20:42:50] nuria: this is in wikimetrics-web.log [20:43:18] madhuvishy: did you look at wikimetrics README? [20:43:35] the other two logs say the same thing too. yeah i did.. [20:43:35] madhuvishy: seems that we need to build the app right? [20:44:09] madhuvishy: as really at wmf we do not have an env that "knows" how to install python [20:45:27] PHP should be enough for everyone, right? [20:46:29] nuria: right. does setup.py build it? [20:46:40] YuviPanda: if i do not write another line of php in my life i will be happy [20:46:54] madhuvishy: i think >python setup.py install [20:47:03] nuria: :D Me too. I think the world might be a better place if nobody had to, but oh well :) [20:47:03] nuria: i did that too [20:47:04] madhuvishy: (we will need to update README after all this) [20:47:27] madhuvishy: and what was the output , cause packages should be built then [20:47:38] nuria: all good [20:47:45] madhuvishy: and pip install? [20:48:02] nuria: all reqts satisfied [20:48:34] so you have: "/usr/local/bin/wikimetrics" but it doesn't find the wikimetrics egg, is that so? [20:50:05] joal: the x_analytics_map['uuid'] should be populated now right? or does it have issues like the timestamp?
[20:50:34] should definitely be populated [20:50:38] nuria: --^ [20:50:52] nuria: I used the map for another request, but not uuid per se [20:51:54] madhuvishy: so you have: "/usr/local/bin/wikimetrics" but it doesn't find the wikimetrics egg, is that so? [20:52:08] nuria: we're in the batcave trying to debug, wanna join? [20:53:54] milimetric: i think you got it then, will go back to mobile sessions [20:55:49] joal: and this query looks ok , right? " [20:55:50] select x_analytics_map['uuid'] from webrequest where year=2015 and month=04 and day=12 and hour=01 and x_analytics_map['uuid'] is not NULL limit 100; [20:56:41] nuria: had to provision again! [20:57:26] madhuvishy: ok, and now you can access http://localhost:5000? [20:57:33] nuria: Yeah [20:57:45] madhuvishy: ok, ta-tachannnnnn [20:58:04] madhuvishy: take a look at wikimetrics README, we should update it with any pertinent new info [20:58:18] madhuvishy: https://github.com/wikimedia/analytics-wikimetrics/blob/master/README.md [21:00:22] nuria: yes, query looks good [21:00:26] Having issues ? [21:07:45] joal: maybe there are too few values, then, that's fine no worries [21:08:44] hmmm [21:13:14] joal: will try to run it for the couple weeks of data we have for april [21:13:26] ok [21:13:36] Looking as well : seems not to have data [21:13:55] select x_analytics_map, x_analytics from wmf.webrequest where webrequest_source = 'mobile' and year=2015 and month=04 and day=12 and hour=19 and lower(x_analytics) LIKE ' %uuid%' limit 100; [21:14:12] Double checking for sure [21:14:16] With [21:14:26] the timestamp stuff, I'm afraid now ;) [21:14:48] nuria: --^ [21:16:47] joal: ya... [21:16:53] joal: seeing same thing [21:17:00] weird, huh ? [21:18:26] joal: but no worries, it is not the map [21:18:36] Well, worries still [21:20:19] joal: for mobile team yeah, will run query for all april [21:20:29] k [21:22:39] hey milimetric: got a sec? [21:24:38] hi joal. 
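(The `x_analytics_map` column queried above is built by splitting the raw X-Analytics header, which is a semicolon-separated list of key=value pairs. A rough Python equivalent of that parsing — the production version is a Hive UDF in refinery-source, so this is only a sketch with made-up sample values:)

```python
def parse_x_analytics(header):
    """Split an X-Analytics header like 'uuid=abc;https=1'
    into a dict; fragments without '=' are skipped."""
    result = {}
    for part in header.split(';'):
        if '=' in part:
            key, _, value = part.partition('=')
            result[key.strip()] = value.strip()
    return result

print(parse_x_analytics('uuid=0123abc;mf-m=a;https=1'))
```

(A query returning no rows for `x_analytics_map['uuid']`, as joal and nuria see, then means the `uuid=` pair was simply absent from the header for those requests, not that the map itself is broken.)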
I'm wondering if the cluster is in good enough shape that we can start running jobs on it [21:25:07] Hey leila: Cluster is good now, you can go and run things :) [21:25:18] great. thank you! :-) [21:25:24] joal, ^ [21:25:31] no problem [21:26:15] nuria: time for me to go to bed [21:26:28] Good luck with uuids ;) [21:26:36] See y'all tomorrow ! [21:26:38] joal: ciao [21:30:39] nuria, milimetric, using tcpdump I discovered that the problem is actually in eventlogging side not db [21:31:22] the inserts that I caught outgoing now with destination m4-master date from 1:30h in the past [21:31:30] ahajammmmm [21:31:34] dying to KNOW [21:32:23] mforns: let me guess SQLA has some buffer with asynchronous mechanism ;) [21:32:35] nuria, milimetric, so, given that the consumer logs do not reflect this lag, I guess the problem may be in sqlalchemy [21:32:45] joal|night, that's what I was going to say [21:32:51] :D [21:32:58] but that's still a theory, going to research on that [21:33:17] Yeah, I guess so [21:33:27] Good luck, and bravo for the finding ! [21:33:42] * joal|night bows to mforns [21:33:54] xD, good night! [21:35:17] mforns: uy, uy this is deep in the bowels of the code now, man, [21:36:06] nuria, googling [21:37:28] mforns: BTW, simpledateformat is not thread safe in java so a new instance needs to be created per usage [21:37:46] https://www.irccloud.com/pastebin/fU012W9w [21:38:12] nuria, oh! [21:40:47] mforns: just an FYI, for CR [21:41:11] nuria, ok [21:47:51] Analytics, Analytics-Kanban, WMF-Product-Strategy: Backfill pageview data for March 2015 from sampled logs before transition to UDF-based reports as of April - https://phabricator.wikimedia.org/T96169#1209816 (DarTar) Assigning this to @ironholds, as discussed during our 1:1 (thanks, O!) 
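(The key measurement behind mforns's tcpdump finding is the difference between an event's own capture timestamp, visible in the INSERT payload on the wire, and the wall-clock time the packet is observed. A trivial sketch of that lag arithmetic, with made-up times matching the ~1:30h figure from the chat:)

```python
from datetime import datetime

def consumer_lag(event_ts, observed_at):
    """Lag between when an event was captured (its payload
    timestamp) and when its INSERT was seen leaving the consumer."""
    return observed_at - event_ts

# An event stamped 20:00 whose INSERT only appears at 21:30
# is running an hour and a half behind.
lag = consumer_lag(datetime(2015, 4, 13, 20, 0),
                   datetime(2015, 4, 13, 21, 30))
print(lag)
```

(That the consumer's own logs did not show this lag is why suspicion falls on buffering below the logging layer, i.e. in the SQLAlchemy insert path.)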
[21:48:04] Analytics, Analytics-Kanban, WMF-Product-Strategy: Backfill pageview data for March 2015 from sampled logs before transition to UDF-based reports as of April - https://phabricator.wikimedia.org/T96169#1213879 (DarTar) a:Ironholds [22:03:44] (PS9) Nuria: [WIP] Add Apps session metrics job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/199935 (https://phabricator.wikimedia.org/T86535) (owner: Mforns) [22:04:50] too bad that joal|night is asleep cause he'd like to know that lowering parallelization of the session job lowered its run time to 6 minutes for 1 day of data cc mforns [22:05:23] nuria, :O [22:06:17] mforns: ya, i know, super-good! we could run a day in a couple hours , wow, that wasn't me i just did the testing [22:06:55] mforns: and removed some unnecessary additional computations , but basically it's the code you wrote 1st [22:06:56] nuria, you mean run a *month* in a couple hours, right? [22:07:07] mforns: sorry a MONTH yes [22:07:34] nuria, to lower parallelization means fewer partitions? [22:08:17] mforns: yes, [22:08:20] ok [22:08:32] mforns: so you know how i was saying the apps dataset is tiny [22:08:37] aha [22:08:49] nuria, makes sense [22:08:52] mforns: but mobile dataset is big, so we filter, filter [22:09:43] mforns: and once we have the data we lower the number of partitions 1 order of magnitude, I will do another test but it sure seemed to make a difference [22:09:54] from ~20 mins to 6 mins [22:09:59] seems too big to be a fluke [22:10:05] mforns: but i will retry [22:10:12] nuria, ok fine [22:46:56] does anyone know if webrequest.agent_type will identify apps ? [22:51:12] halfak: can we chat in the batcave? [22:51:59] kevinator, I'm in a meeting now. 
Done in 10 minutes [22:52:13] ok it can wait [22:56:12] nuria, I think I found it [22:57:23] milimetric, nuria, I think it is actually a problem in the consumer's sql_writer method [22:57:48] the 'events' list is getting too big eventually [22:58:06] halfak: any final comment on the QR report for revscoring, LMK (I can make minor changes until tonight) [23:00:01] Looks good to me. [23:00:04] Well... [23:00:14] We have made substantial progress towards the revision coder. [23:00:22] But that's not demo-able quite yet. [23:00:37] Do you think we should note the adoption of OOjs UI? [23:00:58] DarTar, ^ [23:01:07] na – to be clear, we’re not stating that the project was meant to end by Q3 (the timeline is the one captured on the IEG page) [23:01:17] but we may want to update the ETA column [23:01:51] up to you, if you want to make any changes to the stand-alone deck that I sent leave them in clear so I can see them and port them [23:01:52] The prototype revscoring service was online a couple months ago, so that's good. [23:02:01] yup [23:02:09] No worries. I think this looks great. [23:02:14] cool [23:04:18] o/ kevinator just hopping into the batcave now [23:04:40] one sec [23:04:42] brt [23:05:06] Analytics-Cluster, Analytics-Kanban: {epic} Analyst runs query to get aggregated pageview counts {crow} - https://phabricator.wikimedia.org/T96314#1214143 (kevinator) NEW [23:06:38] mforns: aham [23:07:04] mforns: despite it being emptied every 1.5 sec of 400 items? [23:07:12] nuria, yes [23:08:02] and how is it happening? cause influx is about 300 per sec (maybe it went higher) [23:08:15] nuria, it seems the events list is being added more elements than popped [23:08:36] nuria, don't know [23:09:04] nuria, the thing is, the problem went past the logs in an unhappy way.. 
[23:09:18] mforns: ah yes, that is the important part [23:09:23] nuria, I'd say we should change the logs [23:09:34] mforns: cause code will have bugs but we need issues to be logged there [23:09:44] mforns: totally, we had no logs whatsoever before [23:09:53] mforns: what should we change [23:10:24] nuria, we should log the last event timestamp on batchEvents, not events [23:10:33] and maybe log the size of 'events' [23:11:43] nuria, and also.. I suppose that the program crashed by being killed by the system for high memory consumption [23:11:44] right? [23:11:47] mforns: please do add that [23:11:51] mforns: sounds good [23:12:04] mforns: high memory? no i doubt that [23:12:17] mforns: who would kill it? [23:12:22] nuria, then how do events get lost? [23:12:35] mforns: well, that is the 1 million dollar question [23:12:40] nuria, the system [23:13:07] when you have a program that is starving the OS, the OS kills it, no? [23:13:31] because 2 hours of events is like 1GB [23:13:46] mforns: but how are those not being batched in 400 items [23:14:00] it could be growing (kind of like a memory leak if you will) [23:14:54] nuria, 'events' keeps getting big until 400, at that point it flushes to 'batchedEvents' [23:15:21] mforns: yes and 400 events are inserted in <1.5 secs [23:15:31] nuria, while the child processes that, 'events' continues to get bigger [23:16:00] mforns: yes, but influx is not unlimited, we have been running at 400 events per sec [23:16:26] mforns: let's log size of queue [23:16:31] nuria, 'events' may get to 400 before the child finishes the processing of the first 400 [23:17:37] nuria, at that time the parent flushes them into 'batchedEvents' once more, and sets ready=true [23:17:49] nuria, but the child is still working on the first 400 [23:18:04] mforns: yes but that is true if the influx is bigger than what it takes to store 400 [23:18:24] mforns: if you look at logs you will see how long it takes to insert 400 [23:18:26] nuria, yes, it 
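(The race mforns describes — the parent refilling 'events' to 400 faster than the child can insert a batch — comes down to simple rate arithmetic. A toy model, using the figures from the chat (300-400 events/sec inflow, batches of 400, ~2s per insert); this is not the consumer's actual code, just the back-of-envelope check:)

```python
def backlog_after(seconds, inflow_per_sec, batch_size, insert_secs):
    """Toy model of the consumer: events arrive at a constant rate;
    the child inserts one batch of `batch_size` every `insert_secs`.
    Returns how many events are still waiting after `seconds`."""
    arrived = inflow_per_sec * seconds
    inserted = (seconds // insert_secs) * batch_size
    return max(0, arrived - inserted)

# Filling 400 events at 300/s takes ~1.33s. If each insert takes 2s,
# the child drains only 200 events/s and the backlog keeps growing;
# at 1s per insert it keeps up.
print(backlog_after(60, 300, 400, 2))
print(backlog_after(60, 300, 400, 1))
```

(This matches the conclusion nuria reaches below: once a 400-event insert takes over ~1.3s at that inflow, the queue can only grow, which is why the insert time itself looks like the db-side problem.)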
seems that that's what is happening [23:18:34] is about 1.2 secs [23:18:46] nuria, mmm [23:18:48] mforns: but we do not run at 400 events/sec at all times [23:19:54] nuria, this is a histogram of run times: http://pastebin.com/v5MiJUfn [23:20:27] nuria, I think most batch inserts take around 2 secs now [23:20:38] mforns: man too bad graphite data is not there, but now we are running at 300 per sec [23:20:41] nuria, and some of them take quite a bit longer [23:21:31] mforns: but then that is a db problem , cause 400 events are 10 schemas at most and that should be real fast [23:21:42] nuria, aha [23:21:57] mforns: they are entered per schema [23:22:13] so 10 inserts? [23:22:33] and -need to check- but we didn't use to have more than 10 schemas with inflow of more than 1 per sec [23:22:41] mforns: say 20 at most [23:22:56] mforns: we can check it [23:22:59] ok [23:23:41] nuria, well to be completely sure of that, I will push a change with the logs on event size and correct last event timestamp [23:23:52] we can deploy it tomorrow [23:24:11] mforns: i think your description is fine [23:24:37] mforns: sounds like a likely cause (if you see in tcpdump events of 1 hour ago) [23:24:45] yes [23:25:07] mforns: but it is a db issue that 400 events take >2 secs [23:25:18] mforns: as when i tested this size initially it was ~1 [23:25:30] nuria, yes you're right [23:27:00] nuria, maybe increasing the batchSize, it would be lighter for the db? [23:28:07] mforns: i reduced it for that reason, when i 1st deployed i had it at 1000 [23:28:16] nuria, oh... [23:28:22] mforns: mannnn.... [23:28:51] ehem hadoop.. [23:29:00] kevinator: when you’re done talking to aaron, I have a quick question for you regarding AnEng reqs in this fiscal [23:29:19] I’m sitting behind the paper wall couch [23:32:48] mforns: you can try changing it but see: https://gerrit.wikimedia.org/r/#/c/197070/3/server/eventlogging/handlers.py [23:33:01] DarTar: where are you? 
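("They are entered per schema" means a 400-event batch collapses into one bulk insert per distinct schema, ~10-20 statements at most. A toy stand-in for that grouping, using sqlite in place of the real m4-master MySQL and ignoring the schema-revision suffix the real table names carry:)

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Hypothetical mixed-schema batch: (schema_name, json_payload) pairs.
events = [('NavigationTiming', '{"a":1}'),
          ('ServerSideAccountCreation', '{"b":2}'),
          ('NavigationTiming', '{"a":3}')]

db = sqlite3.connect(':memory:')
insert_count = 0
# Sort then group so each schema yields exactly one executemany() call.
for schema, group in groupby(sorted(events, key=itemgetter(0)),
                             key=itemgetter(0)):
    rows = [(payload,) for _, payload in group]
    db.execute('CREATE TABLE IF NOT EXISTS "%s" (event TEXT)' % schema)
    db.executemany('INSERT INTO "%s" (event) VALUES (?)' % schema, rows)
    insert_count += 1

print(insert_count)  # 2 distinct schemas -> 2 bulk inserts
```

(With so few statements per batch, >2s per 400-event batch does point at the database side rather than at the batching itself, which is nuria's argument here.)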
[23:33:18] kevinator right next to you, paper wall [23:33:22] mforns: https://wikitech.wikimedia.org/wiki/EventLogging#Benchmarking_DB_inserts [23:33:44] nuria, no no, I trust you, I was saying we should switch EL to hadoop [23:33:55] mforns: right right [23:34:15] mforns: my comment was towards "increasing" batch size, sounds like 1) inserts take too long [23:34:30] nuria, aha [23:34:30] 2) decreasing batch size might help [23:34:40] ok [23:34:48] nuria, we can try that [23:34:59] mforns: but it is getting late for ya, we can talk about this tomorrow [23:35:10] nuria, yes ok [23:35:25] thanks for the help! [23:35:33] good night!