[07:56:45] Analytics, Operations, Ops-Access-Requests: Add analytics team members to group aqs-admins to be able to deploy pageview APi - https://phabricator.wikimedia.org/T142101#2556320 (akosiaris) > We won't gain any extra sudo permissions, but this group will be used to grant access to the deploymenet ssh k...
[12:53:18] o/ joal & milimetric
[12:53:38] I have stuff to talk about re. live systems, but I'm crunching to get a presentation ready for Wednesday
[12:55:19] So I'd like to reschedule.
[12:55:30] Would you be interested in talking later this week?
[12:56:19] BTW, I want to talk about maintaining state on top of events. It's not what you think. I think that, in order for an "event feed" to be useful to an end user, I'd like to be able to annotate events with post-hoc generated information.
[12:56:42] e.g. a revision that was reverted. Or tags that were applied by an asynchronous job (see abuse filter). etc.
[12:56:51] Been discussing this with the collab team.
[12:57:22] We'll certainly be hacking something together in the meantime, but I thought y'all would have ideas for how a robust solution would look.
[12:57:40] I was thinking of something like RESTBase where new data can be added to an item.
[12:58:02] But we'd want to make these annotations queryable
[12:58:08] * halfak runs back to presentation land
[12:58:25] hey halfak
[12:58:33] joseph's on vacation
[12:59:11] and yes I chatted with collab about that as well, and have thoughts on it
[12:59:18] :)
[12:59:34] and also plans on how to store "is_reverted" and "is_new_editor" and stuff like that
[12:59:42] (future-looking properties)
[13:00:21] if you want joseph around, we have to wait a couple of weeks, if you just want to catch up the two of us, I can talk after your presentation
[13:00:32] milimetric, I was thinking that we'd want a nice way to associate moderation options too. E.g. for meta-moderation.
[13:00:45] Cool. I'll put something on our calendars for Thursday or Friday
[13:00:49] k
[13:02:56] {{done}}
[13:02:59] <3
[13:07:01] {{wish-luck|presentation}}
[13:07:08] :D!
[13:20:00] morning
[13:20:10] hey morning
[13:20:30] wanna look at this oozie thing with me?
[13:20:45] ja let's do that first sure!
[13:21:04] k, is cave easy where you are?
[13:21:12] ja
[13:21:14] i think so
[13:21:19] k, joining
[14:19:21] (PS15) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[15:00:55] mforns: standduppp
[15:06:52] Analytics-Kanban, EventBus, Services, User-mobrovac: Improve schema update process on EventBus production instance - https://phabricator.wikimedia.org/T140870#2557274 (Nuria) Open>Resolved
[15:10:02] Analytics-Kanban, Datasets-Webstatscollector, RESTBase-Cassandra, Patch-For-Review: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#2557280 (Nuria)
[15:10:03] Analytics-Kanban: Replace RAID0 arrays with RAID10 on aqs100[456] - https://phabricator.wikimedia.org/T142075#2557279 (Nuria) Open>Resolved
[16:07:58] Analytics: Pre-generate mysql ORM code for sqoop - https://phabricator.wikimedia.org/T143119#2557510 (Milimetric)
[16:14:09] mforns: second test is more visible in CPU: https://ganglia.wikimedia.org/latest/?r=2hr&cs=&ce=&m=bytes_out&c=Analytics+Query+Service+eqiad&h=aqs1004.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=small&metric_group=NOGROUPS
[16:18:21] nuria_, aha I can see that
[16:29:49] mforns: ok, things break after 300 TPS
[16:30:15] mforns: but cpu is not even at 10%
[16:30:22] nuria_, aha
[16:30:56] 300 requests per second?
[16:36:55] nuria_, 300 rps seems a bit low no?
[16:37:05] mforns: per machine?
[16:37:20] mforns: no, it is 1 order of magnitude > than what we do now
[16:37:26] nuria_, I don't think this is per machine
[16:37:49] cassandra nodes communicate between them, so even if hitting 1004 only, we are testing all the cluster, no?
[16:38:09] but through 1 http endpoint
[16:38:50] nuria_, it is 1 order of magnitude greater than the test we did before Luca magically improved the performance of the api
[16:38:56] mforns: makes sense? 1 node can serve 300 requests by itself at the http level
[16:39:02] AUTO magically
[16:39:10] hehe
[16:39:30] mforns: on teh old cluster you mean right?
[16:39:33] *the
[16:39:35] yes
[16:40:01] mforns: makes sense
[16:40:06] mforns: the loading for april: oozie job -info 0069866-160630131625562-oozie-oozi-C
[16:40:13] cool!
[16:40:29] mforns: don't tell oozie april has 31 days cause it no work
[16:41:05] hehe, oozie is smart
[16:55:40] back fyi
[16:55:59] nuria_: mforns, ok if I start the next aqs loading job?
[16:56:18] ottomata, nuria_ already started it
[16:56:21] oh
[16:56:31] oh there it is
[16:56:35] i didn't see it, guess my thing didn't refresh
[16:56:41] ok nuria_, i'm not going to watch it then
[16:56:44] you are on it :)
[16:56:53] yes, we're looking
[16:57:10] cool
[17:21:09] ok milimetric i got it to work, but i had to set user.name=yarn
[17:21:12] which is not really ideal :/
[17:21:48] yeah, seems like a bug
[17:22:04] you'd think everyone would run into it.
[17:22:22] so your other libjars way had no output, right?
[17:22:27] right
[17:22:29] i was wrong about that
[17:22:32] i had the same import tools error
[17:22:42] actually, lemme run it with user yarn and wo libjars real quick
[17:23:34] so the user yarn property is passed in the workflow or the coordinator properties?
[17:24:18] properties, yeah, but if not set
[17:24:24] it defaults to the user submitting the job
[17:25:17] I wouldn't have expected sqoop to need different rights depending on how it's launched, so weird
[17:25:32] what's the downside of setting it to yarn always?
[17:25:34] so, it kinda makes sense, since the subprocess that the shell action will run is running as the yarn user
[17:25:37] its the shell action thing
[17:25:38] not sqoop
[17:26:01] i'm not exactly sure at what point its failing though
[17:26:06] I see
[17:26:12] i think there is a special code check to make sure you have executable access to the .staging dir
[17:26:33] this was the error: http://stackoverflow.com/questions/32897524/sqoop-export-oozie-workflow-fails-with-file-not-found-works-when-ran-from-the-c
[17:27:19] milimetric: FYI works without any special libjars stuff
[17:27:28] I think it's more than access, it looks like a mis-coordination
[17:27:49] like it executes /user/yarn/.staging but it then looks for a different /user/another/.staging
[17:28:16] ok, cool, so I guess for now I'll just put in the user.name property and make a ticket to look into it more or file a bug upstream
[17:28:20] hmmm, except it is explicitly failing when looking for executable permis on the staging dir of /user/yarn/.staging
[17:28:30] ja, go with that for now milimetric, but gimme a few, i'm going to try some things
[17:28:44] k, np
[17:47:24] milimetric: it is quite strange that it doesn't work.
[17:47:27] sqoop gets launched as yarn.
[17:47:40] by yarn user*
[17:47:57] it looks like it copies the sqoop libjars into /user/yarn/.staging/job_XXX/
[17:48:04] if it is running as yarn
[17:48:09] why doesn't it have perms to access those things?
[17:48:52] I know, it's weird. It seems like either both us and this stackoverflow guy have made the same configuration mistake or there should be a bug reported here
[17:54:45] ottomata: I don't understand where to set user.name = yarn...
[17:55:33] milimetric: in coordinator.properties
[17:55:33] or
[17:55:34] just do
[17:55:35] sudo -u yarn
[17:55:37] instead of hdfs
[17:55:41] oh
[17:55:41] oh, you probably can't sudo -u yarn
[17:55:45] yeah put it in coordinator.properties
[17:55:56] btw, milimetric just checking, you know you don't need a workflow.properties file, right?
[17:56:07] and it'll just get picked up automatically or do I have to define properties and pass them on
[17:56:17] no its picked up automatically
[17:56:17] or
[17:56:21] milimetric: you could do
[17:56:21] ok
[17:56:28] -Duser.name=yarn
[17:56:29] on the cli
[17:56:33] when submitting job
[17:56:44] yeah, I know I don't need workflow.properties, but Joseph told me to always put that there in case people want to test just the wf (which I never do)
[17:56:48] ok cool
[17:56:53] oh nice, ok i like it too
[17:56:59] i think christian wanted the opposite
[17:57:05] not to have the workflow.properties if it wasn't used by prod
[17:57:10] but i kinda like having it
[17:57:13] I should probably put it in coordinator props because it always needs to run that way or it won't work
[17:57:20] nice to be able to more easily see how to configure an individual workflow
[17:57:24] yes, coordinator props
[17:57:32] is best i think too
[17:57:33] right
[17:57:40] i guess also in workflow.props for completeness if you have it
[17:58:20] yep
[18:08:30] wikimedia/mediawiki-extensions-EventLogging#584 (wmf/1.28.0-wmf.15 - 6e4d11a : Mukunda Modell): The build has errored.
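Editor's sketch of the user.name lookup discussed above (17:24 to 17:57). The chat only says that user.name can be set in coordinator.properties or with -Duser.name=yarn on the CLI, and that it otherwise defaults to the user submitting the job. The precedence used below (-D over the properties file, the properties file over the shell user) is an assumption, and the helper names `parse_properties` and `effective_user_name` are hypothetical. This is not Oozie's own code.

```python
import getpass


def parse_properties(text):
    """Parse a minimal key=value properties file, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def effective_user_name(cli_defines, properties, shell_user=None):
    """Return the user.name a submission would carry in this sketch.

    Assumed precedence (not confirmed in the chat): -D on the CLI first,
    then the properties file, then the shell user submitting the job.
    """
    if "user.name" in cli_defines:
        return cli_defines["user.name"]
    if "user.name" in properties:
        return properties["user.name"]
    # Fall back to whoever is running the command, as described in the chat.
    return shell_user if shell_user is not None else getpass.getuser()


if __name__ == "__main__":
    coordinator_props = parse_properties("# temporary hack\nuser.name = yarn\n")
    print(effective_user_name({}, coordinator_props, shell_user="hdfs"))
    print(effective_user_name({}, {}, shell_user="hdfs"))
```

Under these assumptions, putting the property in coordinator.properties makes every scheduled run use it, while -D stays available for one-off test submissions.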
[18:08:30] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/6e4d11af88bf
[18:08:30] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/152756748
[18:09:11] milimetric: this is the same as our issue: https://issues.apache.org/jira/browse/OOZIE-1955
[18:09:50] interesting, reading
[18:11:26] yeah, I saw other people doing that ssh workaround
[18:11:29] but it's kind of insane
[18:11:40] I prefer the run-as-yarn way unless that's bad for other reasons
[18:11:41] yeah, or we could allow sudo -u yarn, but that's not cool
[18:11:53] its not ideal, as it will leave files owned by yarn, which is not what we do
[18:11:54] anywhere else
[18:11:55] well, this needs to run in prod on a schedule
[18:12:02] i'm still looking for workarounds
[18:22:50] i don't understand why the staging dir created by the execed shell process is being accessed by the user running the oozie job.
[18:30:19] it sounds like an unavoidable bug in the shell action, according to that task
[18:31:46] *that jira issue I mean
[18:34:38] its not really a bug, welll, i mean
[18:34:53] if we knew why it needed to access the shell's staging dir
[18:34:56] that's what doesn't make sense
[18:35:02] the parent job shouldn't know anything about it.
[18:35:03] hmmmm
[18:35:21] maybe its an oozie thing, oozie is trying to do something with the staging dir of the job that gets created for the shell action
[18:35:51] (PS16) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[18:36:26] ^ that patch works, and I've added notes about the hack in the properties files
[18:36:49] I have to define output-values and datasets for these two oozie jobs and then I'll look a little more to see if I find a solution
[18:38:24] nuria_: you can move our 1/1 to after standup tomorrow, that seems to fit, right?
[18:38:34] milimetric: ah, yes i forgot
[18:47:48] milimetric: !
[18:47:49] i got it
[18:47:59] :)
[18:48:04] in your shell action
[18:48:05] set
[18:48:16] HADOOP_USER_NAME=hdfs
[18:48:21] or whatever user you are using
[18:48:23] if yourself
[18:48:25] milimetric
[18:48:28] that worked for me
[18:48:39] try it as yourself
[18:48:45] set it to milimetric
[18:48:54] and then launch the oozie job as yourself
[18:48:56] without user.name set
[18:49:45] so maybe =${user.name} for prod and -Duser.name=milimetric for testing?
[18:50:35] ottomata: ^
[18:50:49] because user.name would default to whatever user's running, in this case hdfs?
[18:52:30] (CR) Mforns: [WIP] Refactor Mediawiki History scala code (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[18:52:33] no... wait, that wouldn't work.
[18:52:36] milimetric: i would hope so, i got an error for some reason when i tried that, but its probably because i didn't set it
[18:52:47] i think if you have user.name set, either in properties or by -D, that will work
[18:52:50] lemme try that
[18:52:55] because user.name sets more than just HADOOP_USER_NAME, right?
[18:53:19] (PS22) Mforns: [WIP] Refactor Mediawiki History scala code [analytics/refinery/source] - https://gerrit.wikimedia.org/r/301837 (https://phabricator.wikimedia.org/T141548) (owner: Joal)
[18:53:20] yes, but that's ok
[18:53:20] so
[18:53:25] but then I have to still set user.name to yarn, right?
[18:53:28] no
[18:53:41] you want HADOOP_USER_NAME to be the same as user.name
[18:53:49] ok
[18:53:52] the default user.name is whatever shell user you launch with
[18:53:53] but
[18:54:00] i think when the xml file is evaluated and vars are filled in
[18:54:11] ${user.name} is not available unless you specify it
[18:54:15] in properties somehow
[18:54:17] hang on, verifying that ^
[18:54:21] so then maybe I just set HADOOP_USER_NAME=${user.name} and don't set user.name in properties or -D, I just let the system do its thing and set it as me when I'm testing and hdfs when in prod
[18:54:21] oh
[18:54:52] yeah gm
[18:54:53] hm
[18:54:54] Error Message : variable [user] cannot be resolved
[18:54:59] dunno why i can't do ${user.name}
[18:55:34] Analytics-Kanban: Page History: Add unit tests to PageHistoryDataExtractors and PageHistoryBuilder - https://phabricator.wikimedia.org/T142724#2558196 (mforns) a:JAllemandou>mforns
[18:56:41] maybe it needs to be passed in from the coordinator somehow
[18:56:49] with like coord:getUser(), I'll check that out
[18:56:51] hm.
[18:57:14] aha, https://oozie.apache.org/docs/3.1.3-incubating/CoordinatorFunctionalSpec.html#a6.7.5._coord:user_EL_Function_since_Oozie_2.3
[18:57:17] so maybe passing that in as a property to the workflow will work, I'll try
[19:01:55] (PS17) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[19:02:10] milimetric: i think
[19:02:22] HADOOP_USER_NAME=${wf:user()}
[19:02:24] will do it
[19:03:42] ok, just about to try it so I'll try that too
[19:04:18] yup, that works!
[19:04:22] ja add that
[19:04:26] then you can run as any user
[19:04:31] woohooo
[19:04:45] i was able to run the same workflow as hdfs and as myself
[19:06:23] milimetric: tell me that works for you! :D
[19:06:55] trying again, i put the env-var in the wrong spot, needs to be after the arguments and before the files apparently
[19:09:58] looking good ottomata, I was able to launch it as myself
[19:10:00] you da man :)
[19:13:59] (PS18) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[19:14:17] ok, that last version should work with just basic normal oozie testing config
[19:14:43] now I do output-values and datasets and then I hope I break up with oozie and hope I don't run into her for another few weeks
[19:16:20] yeehaw!
[19:16:27] ok awesome!
[19:17:40] :) thanks very much
[19:23:55] cool, running home, be back shortly
[19:54:41] (CR) Chad: [C: 2 V: 2] "It's a terrible tool, but hey, .gitreview files are harmless ;-)" [analytics/refinery] - https://gerrit.wikimedia.org/r/303108 (owner: Dereckson)
[19:57:37] milimetric: one question
[19:57:46] milimetric: if you are there
[19:58:45] hi nuria_ what's up
[19:58:58] milimetric: i was looking into scrolling and daterangepicker
[19:59:05] yes
[19:59:15] milimetric: i can hide the picker when you scroll but actually
[19:59:20] milimetric: what i think makes sense
[19:59:32] milimetric: is that the position of picker is not "fixed"
[19:59:39] agreed, yep
[19:59:45] milimetric: ok
[19:59:55] that way it'd stick to where it is and if you scroll back you get to it
[20:00:01] milimetric: so i will not add the scrolling fix.
[20:00:06] ok
[20:00:14] I can do the CSS to it later, if you'd like
[20:00:26] maybe after this oozie nightmare is over :)
[20:00:59] milimetric: ok, yes, cause i do not think i know how to pin it, will try for a bit.
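Editor's sketch of the shell-action fix settled on earlier in this log (19:02 to 19:09): an env-var of HADOOP_USER_NAME=${wf:user()}, placed after the arguments and before the files. The Python below only builds and checks that XML fragment. The script name, argument and file entry are hypothetical placeholders, not taken from the refinery patch, and a real action would also need the job-tracker, name-node and other elements that are left out here.

```python
import xml.etree.ElementTree as ET

# Child order as reported in the chat: env-var goes after the arguments
# and before the files. Other shell-action elements are omitted.
CHILD_ORDER = ["exec", "argument", "env-var", "file"]


def build_shell_action(exec_name, arguments, files):
    """Build a partial <shell> element carrying the HADOOP_USER_NAME env-var."""
    shell = ET.Element("shell", {"xmlns": "uri:oozie:shell-action:0.1"})
    ET.SubElement(shell, "exec").text = exec_name
    for arg in arguments:
        ET.SubElement(shell, "argument").text = arg
    # Run the child Hadoop calls as whoever submitted the workflow.
    ET.SubElement(shell, "env-var").text = "HADOOP_USER_NAME=${wf:user()}"
    for path in files:
        ET.SubElement(shell, "file").text = path
    return shell


def children_in_order(shell):
    """Return True if the children follow CHILD_ORDER."""
    ranks = [CHILD_ORDER.index(child.tag) for child in shell]
    return ranks == sorted(ranks)


if __name__ == "__main__":
    # "run_sqoop.sh" and "--verbose" are hypothetical placeholders.
    action = build_shell_action("run_sqoop.sh", ["--verbose"], ["run_sqoop.sh#run_sqoop.sh"])
    assert children_in_order(action)
    print(ET.tostring(action, encoding="unicode"))
```

Because the value comes from wf:user() rather than a hard-coded name, the same workflow can be submitted as hdfs on the schedule or as an individual user for testing, which is what the chat reports working.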
[20:02:15] (PS5) Nuria: Bug fixes on datepicker [analytics/dashiki] - https://gerrit.wikimedia.org/r/303693 (https://phabricator.wikimedia.org/T141165)
[20:02:59] ok nuria_ if you're doing it now and you can't get it I can jump in the batcave, maybe I could use a break from oozie
[20:03:32] milimetric: sure, i just pushed the last patch with a small fix, i can join batcave
[20:03:52] milimetric: i am at the library again
[20:03:52] k, I'll grab the patch and join you
[20:04:16] np, we'll be quiet :)
[20:04:50] milimetric: you need to build it to see how it will interact with real data: gulp --layout tabs --config SimpleRequestBreakdowns
[20:05:13] ok, will do
[20:06:45] i wonder what takes so long to build
[20:08:40] leila: coming?
[20:08:48] I'm in somewhere bmansurov_.
[20:08:52] was looking for you
[20:08:58] hmm
[20:08:58] I think there is a Hangout glitch
[20:09:07] give me a min to refresh things on my end, bmansurov_
[20:09:10] ok
[20:09:55] it says "requesting to join ..." bmansurov_
[20:10:00] not sure if you see any request
[20:10:15] let me just call you directly from Hangout, outside of this calendar event.
[20:10:16] leila: let's try https://plus.google.com/hangouts/_/wikimedia.org/tracyisland
[20:10:37] leila: sure you can call me too
[20:10:40] same line, bmansurov_.
[20:21:40] Analytics-EventLogging, Analytics-Kanban, EventBus, Patch-For-Review: Change or upgrade eventlogging kafka client used for producing - https://phabricator.wikimedia.org/T141285#2558522 (Nuria) Open>Resolved
[20:21:55] Analytics-Kanban, Patch-For-Review: Change or upgrade eventlogging kafka client used for consumption - https://phabricator.wikimedia.org/T133779#2558523 (Nuria) Open>Resolved
[20:22:06] Analytics-Kanban: EventBus Maintenace: Fork child processes before adding writers - https://phabricator.wikimedia.org/T141470#2558524 (Nuria) Open>Resolved
[20:22:08] Analytics-EventLogging, Analytics-Kanban, EventBus, Patch-For-Review: Change or upgrade eventlogging kafka client used for producing - https://phabricator.wikimedia.org/T141285#2493093 (Nuria)
[20:25:54] Analytics-Kanban: Stop generating pagecounts-raw and pagecounts-all-sites - https://phabricator.wikimedia.org/T130656#2558550 (Nuria) Open>Resolved
[20:28:19] Analytics-Kanban, EventBus, Patch-For-Review: Upgrade kafka main clusters to 0.9 - https://phabricator.wikimedia.org/T138265#2558573 (Nuria) Open>Resolved
[20:34:13] Analytics-Kanban: Remove outdated docs regarding dashboard info - https://phabricator.wikimedia.org/T137883#2382422 (Nuria) @Neil_P._Quinn_WMF : looks like editing dashboards also need updates on these docs
[20:35:05] (PS6) Milimetric: Bug fixes on datepicker [analytics/dashiki] - https://gerrit.wikimedia.org/r/303693 (https://phabricator.wikimedia.org/T141165) (owner: Nuria)
[20:36:57] milimetric, are you adding other changes to nuria_'s patch?
[20:37:07] or are those part of your review?
[20:37:07] just that little style thing
[20:37:22] are you going to merge that? I was reviewing it
[20:37:22] I haven't reviewed yet, no, I can try tomorrow
[20:37:29] ok ok
[20:37:40] oh no, I was just patching the layout of the datepicker and some annoying padding issues
[20:37:48] can I continue reviewing? and merge if goof?
[20:37:51] mforns: please review , milimetric did css for menus
[20:37:55] not goof, good
[20:37:56] it's two very tiny changes, feel free to review the rest
[20:38:00] ok
[20:38:05] mforns: do not merge yet cause i need to throughly text ff
[20:38:15] I see
[20:38:16] mforns: do not merge yet cause i need to throughly *test*
[20:38:30] and yes, you can merge, the only two files I changed in my patch are the layouts/tabs/styles.css and the datepicker-binding.js
[20:38:41] ok, I will just review and +2 no merge
[20:38:47] (of course not if nuria doesn't +2)
[20:39:02] but I mean I'm not going to block the merge, I haven't reviewed yet
[20:39:04] I'll review and leave the merge to nuria_
[20:39:11] ok
[20:39:49] mforns: ok, will test in ff to make sure it all checks out
[20:39:52] mforns: thank you
[20:39:58] :]
[20:41:30] Analytics-Kanban, Editing-Analysis, Documentation: Remove outdated docs regarding dashboard info - https://phabricator.wikimedia.org/T137883#2558673 (Neil_P._Quinn_WMF) @Nuria, yes, I've worked on these docs before, but they're such a mess that it's hard to know where to start. In addition, I ran int...
[20:43:48] Analytics-Kanban, Editing-Analysis, Documentation: Remove outdated docs regarding dashboard info - https://phabricator.wikimedia.org/T137883#2558686 (Nuria) @Neil_P._Quinn_WMF Understood. I have removed outdated info and linked to analytics.wikimedia.org when it pertained.
[21:32:59] (CR) Mforns: [C: -1] Bug fixes on datepicker (3 comments) [analytics/dashiki] - https://gerrit.wikimedia.org/r/303693 (https://phabricator.wikimedia.org/T141165) (owner: Nuria)
[21:40:45] bye team! cya tomorrow
[21:50:00] byeee
[21:53:06] (PS19) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)