[02:19:33] Analytics, Pageviews-API: Provide weekly top pageviews stats - https://phabricator.wikimedia.org/T133575#2546680 (MusikAnimal) @Nuria any chance we could triage this or give an estimate as to feasibility/likelihood of actually happening? //The Signpost// is now relying on [[ https://tools.wmflabs.org/top...
[07:52:02] joal: o/ I have no idea about the latency :P
[07:52:13] Hi elukey !
[07:52:26] I would wait some days to see if it stays the same.. I observed bumps like these in the past
[07:52:46] but I'll try to check the JVM differences
[08:09:29] elukey: I wonder if it wouldn't come from a non-restart after the user change?
[08:13:27] elukey: Do we spend some time now on loading?
[08:15:04] joal: I wondered about the non-restart after the user change but I don't see any good reason...
[08:15:18] joal: can you give me ~30 mins? I need to finish some things
[08:15:20] elukey: I don't really know
[08:15:24] elukey: sure
[08:15:43] elukey: "I don't really know" relates to the restart, not the 30 mins ;)
[08:15:53] take your time
[08:22:25] sure :)
[08:41:41] joal: I checked the cassandra logs right after the restart and I didn't see anything shiny that explains what happened, plus the diff between old/new jdk seems to be security related
[08:41:57] hm ... weird
[08:41:57] but the time of the latency drop matches my restarts perfectly
[08:42:01] I know :)
[08:44:22] it might be something related to the auth cache, not sure, let's see how it goes during the next days.. even p99 looks really good
[08:45:15] elukey: I think it's related to the new user more than the cache
[08:45:39] like the new user kicked in at that moment, not when you merged
[08:47:54] Analytics-EventLogging, Schema-change: Add index on event_type to MultimediaViewerDuration tables.
- https://phabricator.wikimedia.org/T70397#2547083 (jcrespo)
[08:48:55] joal: no I don't think so, since restbase/aqs started to use it straight away, and the user was already there because I used it manually
[08:49:23] elukey: hmmm
[08:50:58] I said auth cache because now it might be used more for the aqs user, not sure how evolved it is.. we might assume that it is a standard LRU cache but it might be something more rudimentary :D
[08:55:18] elukey: I can't imagine it's not round-trip related or something like that
[08:55:21] elukey: weirdo
[08:58:00] joal: this time is weird but good :D
[08:58:08] elukey: do you mind proof-reading my prose ?
[08:59:11] sure, have you already sent something to me?
[08:59:35] elukey: https://etherpad.wikimedia.org/p/backfilling_aqs
[08:59:51] WOW!
[09:03:00] 10+
[09:03:10] precise and full of examples
[09:03:51] elukey: the new openjdk releases also include conservative bugfixes along with the security fixes
[09:04:43] moritzm: I tried to review them but didn't find anything super specific.. could they be responsible for the latency drop that we have observed?
[09:06:15] joal: just added two lines to separate background reading from the procedure
[09:06:58] joal: the other thing that you could add is the months that need to be loaded
[09:07:00] in a table
[09:07:09] so people can tick the ones done
[09:09:15] I don't know, but I somewhat doubt that. The Java test suite is really big and they're very conservative about not breaking things, all user-visible changes are usually guarded by config options which need to be explicitly enabled
[09:09:24] but maybe let's downgrade one system to validate?
[09:13:55] moritzm: we did a user change the day before (namely forcing restbase to use a "regular" user to contact cassandra rather than the 'cassandra' admin one), and I suspect it might be related to the auth cache being refreshed..
Either way, the latency dropped heavily so we can leave the system as it is
[09:14:04] :)
[09:14:27] ok :-)
[09:16:59] joal: I'd need to restart the druid java daemons and cassandra on aqs100[456] after the compactions
[09:17:26] but I am not sure if you are doing something with druid these days
[09:17:50] * elukey suspects that Joseph forks himself on demand
[09:19:06] elukey: good call on adding the month list :)
[09:19:31] elukey: no druid activity currently, you can go ahead :)
[09:20:34] * elukey observes that Joseph didn't mention anything about *not* being able to fork
[09:20:51] elukey: you learn many things when having a child :)
[09:20:57] ahahahahah
[09:20:59] forking is kinda part of the process
[09:21:03] :D
[09:21:06] well played, +1
[09:23:01] elukey: I'll send the link to the etherpad on the internal list and update the phab ticket as well
[09:24:00] super joal, thanks a lot for this work
[09:24:24] elukey: np, mostly needed to hand off securely while on vacation !
[09:25:52] yep!
[09:42:43] Druid cluster restarted
[09:42:58] all jvm daemons up and running
[09:44:49] awesome elukey :)
[09:45:02] a-team, taking a break, will be back later
[09:49:00] mobrovac: aloha! I'd need to restart the zookeeper cluster for jvm upgrades
[09:59:26] let me know if you are ok with it :)
[11:05:41] hi team!
[11:06:51] Hi mforns :)
[11:06:54] hellooo
[11:15:52] elukey: :(
[11:16:05] * mobrovac doesn't like zk restarts
[11:16:52] I don't either but we'd need to upgrade the openjdk :)
[11:17:08] one host at a time shouldn't be that heavy
[11:20:06] elukey: any chance we can postpone this for early next week?
[11:20:10] it's friday after all
[11:20:20] and the last restart attempt didn't go that well
[11:22:50] mobrovac: sure, but if you remember correctly the last one was a puppet change, not a regular restart
[11:24:00] yup, i know
[11:24:15] but, enough stuff went wrong that i fear zk restarts on a friday
[11:24:16] :P
[11:24:40] it's too nice a day here in lisbon to be stuck with zk
[11:27:19] ahahaha okok
[11:27:21] got it
[11:28:15] mobrovac: enjoy some vinho verde for me ;)
[11:28:41] will do joal!
[11:32:35] mobrovac: I am going to be in Lisbon next week (and then I'll travel to Porto)
[11:32:50] oh really?
[11:32:58] we have to meet then!
[11:33:06] sure!
[11:33:14] I'll be there on the 17th
[11:33:53] are you going to stay here for some time?
[11:35:23] only a week in total, between Lisbon and Porto
[11:44:44] joal: I think I asked you this before, sorry, how do I make an oozie coordinator without inputs or datasets?
[11:44:51] cool elukey
[11:44:52] these python things I'm making are just cron jobs
[11:44:54] vacations?
[11:45:16] milimetric: Just remove the input related tags :)
[11:45:24] ah! :) k
[11:46:02] milimetric: I think there should however be a check on the existence of the data needed in the workflow (maybe)
[11:46:12] milimetric: seems reasonable?
[11:46:59] joal well in the site matrix case, I guess we could check the internet connection?
[11:47:12] and in the other case I guess we could check the db connection?
[11:47:55] mobrovac: yep yep vacations!
[11:48:02] milimetric: I'm actually very wrong: There's no data dependency on the datasets you generate :)
[11:48:16] milimetric: dependencies are for the scala jobs
[11:48:30] right, ok
[11:48:45] I wouldn't say that's *very* wrong
[11:48:46] milimetric: The thing is though, since we need those datasets as inputs, it would be great to define them as datasets
[11:49:12] milimetric: dependency graphs are in my head currently (subgraphs, maven, oozie...)
[11:49:15] :D
[11:49:17] yes, that I can do, so I'd define them as output-events, right?
[11:49:25] milimetric: exactly !
[11:49:27] k
[11:49:37] milimetric: you need to write a dataset.xml file :(
[11:49:45] milimetric: I can help / do it if you wish
[11:49:55] oh it's ok, I think I did that before
[11:50:04] I saw the examples and I copy-pasted one already
[11:50:09] just gotta fumble through it :)
[11:50:11] ok sounds good :)
[11:50:17] That's great :)
[12:00:19] milimetric: question: in pageDataExtractors, shouldn't we fail if old or new titles are empty (null fails already)
[12:00:22] ?
[12:15:56] joal: so even 50X went away completely in aqs
[12:16:11] ajajajaj
[12:16:13] elukey: so far, so good :)
[12:16:24] the former should have been a "hahahaah"
[12:16:31] meaning that I have no idea
[12:16:37] really good then
[12:19:03] cool
[12:19:21] a-team, need to be AFK for a while, will be back soon
[12:19:30] joal, ok, cya
[12:26:59] one thing that I noticed in AQS is https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system?panelId=7&fullscreen
[12:27:12] so disk throughput on aqs100[123] went down
[12:27:24] that might mean fewer reads from disk
[12:30:56] so the good news is that at the moment AQS is not throwing 50X anymore
[12:31:17] at least, we haven't been sending them for the past 24 hours
[13:01:16] joal: I'm not sure we should fail, probably just discard the event?
[13:03:00] hm, elukey don't we have python on the hadoop nodes?
[13:03:06] I get /usr/bin/env: python : No such file or directory
[13:05:09] elukey@analytics1045:~$ which python
[13:05:09] /usr/bin/python
[13:05:43] and /usr/bin/env python works
[13:05:49] but maybe the PATH is different?
[13:06:26] milimetric: is it a script run by a specific user, you, etc..?
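For context on the dataset.xml discussion above: an oozie dataset definition is a small XML fragment that a coordinator can then reference in its output-events. This is an illustrative sketch only; the dataset name, frequency, dates, and paths are assumptions, not the actual refinery definitions.

```xml
<!-- Illustrative dataset.xml sketch; the name, frequency, and
     uri-template below are assumptions, not the real refinery config. -->
<datasets>
  <dataset name="mw_site_matrix"
           frequency="${coord:days(1)}"
           initial-instance="2016-08-01T00:00Z"
           timezone="Universal">
    <uri-template>${data_directory}/site_matrix/${YEAR}-${MONTH}-${DAY}</uri-template>
    <done-flag>_SUCCESS</done-flag>
  </dataset>
</datasets>
```

A coordinator would then declare this dataset in its `<output-events>` via a `<data-out>` element, which is the "define them as output-events" step mentioned in the exchange above.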
[13:13:18] elukey: oozie
[13:13:28] https://hue.wikimedia.org/jobbrowser/jobs/job_1468526822215_87125/single_logs
[13:13:40] std_err there had a problem finding python
[13:15:04] hm, someone says it might be due to a line ending problem...
[13:16:18] but searching for \r doesn't yield anything
[13:18:37] milimetric: I am super ignorant but when you create a map/reduce job there should be the possibility to pass env variables and such, right?
[13:19:53] uh... I have a very shallow understanding of this. I'm not really sure how oozie dispatches its work to map reduce
[13:20:10] it uses Yarn
[13:20:28] or at least this is my understanding
[13:20:43] that in turn handles the whole thing, creating execution containers etc.
[13:23:10] milimetric: is it an oozie shell action? Or something different? Because the major point is that whatever executor runs on the Hadoop node needs to know where to look for python
[13:23:30] elukey: yeah, oozie shell
[13:23:54] https://gerrit.wikimedia.org/r/#/c/303339/5/oozie/mediawiki/refresh_site_matrix/load-site-matrix.py
[13:27:59] hiiii joal can do refinery stuff whenever
[13:29:47] sorry elukey that link's to the actual python script, this is the oozie one: https://gerrit.wikimedia.org/r/#/c/303339/5/oozie/mediawiki/refresh_site_matrix/workflow.xml
[13:30:11] (I removed the empty elements and now I get that python error, I'm googling around to see how others do it)
[13:33:00] milimetric: so one quick way to check if this is the problem is to replace /usr/bin/env python with /usr/bin/python
[13:33:13] k
[13:35:27] hm, different file not found now: https://hue.wikimedia.org/jobbrowser/jobs/job_1468526822215_87309/single_logs
[13:40:19] milimetric: why not specify /usr/bin/python as the
[13:40:22] and the script as an arg?
[13:42:21] I read the docs on this but it still doesn't make sense
[13:42:28] what's load-site-matrix.py#load-site-matrix.py ?
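A hypothetical repro of the line-ending theory discussed above: a script saved with Windows line endings makes the kernel hand `python\r` to `/usr/bin/env`, which then reports `/usr/bin/env: python\r: No such file or directory` (the carriage return is often invisible in the error). Grepping the file for a literal `\r` can miss it depending on the tool, so checking the raw bytes of the shebang line is more reliable. The helper name here is made up for illustration.

```python
# Check whether a script's shebang line has a trailing carriage return,
# which breaks "#!/usr/bin/env python" with a confusing "No such file or
# directory" error at launch time.
import os
import tempfile

def shebang_has_cr(path):
    """True if the first line of the script ends in CRLF."""
    with open(path, 'rb') as f:
        return f.readline().endswith(b'\r\n')

with tempfile.TemporaryDirectory() as d:
    unix_script = os.path.join(d, 'unix.py')
    dos_script = os.path.join(d, 'dos.py')
    with open(unix_script, 'wb') as f:
        f.write(b'#!/usr/bin/env python\nprint("ok")\n')
    with open(dos_script, 'wb') as f:
        # Same script, but saved with Windows (CRLF) line endings:
        f.write(b'#!/usr/bin/env python\r\nprint("ok")\r\n')
    print(shebang_has_cr(unix_script))  # False
    print(shebang_has_cr(dos_script))   # True
```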
[13:43:16] i'm pretty sure that is saying to put load-site-matrix.py into hdfs named as load-site-matrix.py
[13:43:35] so that should still be there with /usr/bin/python as the exec, right?
[13:44:01] https://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a3.2.2.1_Adding_Files_and_Archives_for_the_Job
[13:44:03] i guess its a symlink
[13:44:06] yes
[13:44:10] you need the file you want to execute
[13:45:56] hey ottomata :)
[13:46:11] hiiii
[13:46:26] ottomata: o/
[13:46:40] do you have 10 minutes for a quick hangout today?
[13:46:47] maybe let's say 20
[13:46:47] yes!
[13:46:50] now is good
[13:47:02] nice! thanks! batcave?
[13:47:27] Shall we deploy that thing ottomata ?
[13:47:31] yup
[13:47:37] oh man elukey should I help joal first?
[13:47:41] it'll take a few mins
[13:48:23] joal: let's do it, elukey let's hang out in a bit after we are done
[13:48:29] ottomata: I think the synchro should happen for the machine running camus (the rest should be fine)
[13:48:40] sure!
[13:48:41] ja
[13:49:02] I'll let you stop puppet, stop the camus cron, we wait for camus to finish, then proceed ?
[13:49:08] was about to say that too!
[13:49:27] milimetric: Just read your comment: makes a lot of sense :) I'll also add a counter for discarded events
[13:49:43] sounds good ottomata :)
[13:49:51] joal: thanks, cool
[13:50:58] joal: i don't see any running camus
[13:51:00] and puppet and cron are stopped
[13:51:03] so go ahead with deploy
[13:51:19] ottomata: Doing !
[13:51:30] !log Deploy refinery from tin
[13:51:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[13:51:35] eqi deployment
[13:51:37] oops :)
[13:52:09] elukey: it looks like we don't have docopt installed, should I change it to argparse or do you all want docopt on the nodes?
[13:53:46] milimetric: if it is not a huge change I'd go for argparse, which is a bit more portable
[13:54:03] * joal likes docopt, but went with argparse on the cluster because of the same issue :(
[13:54:23] joal: yeah, i looked for examples and all I found were the docopt scripts in /bin
[13:54:27] If a-team agrees, I'd rather install docopt :)
[13:54:30] *refinery/bin
[13:54:34] right
[13:54:41] I think we should either
[13:54:42] 1. install docopt
[13:54:49] 2. refactor the bin scripts to argparse
[13:54:52] hm, we could easily install docopt everywhere
[13:54:57] ottomata: Refinery deployed on stat100[24] and analytics1027
[13:54:57] I don't mind either way
[13:54:58] i'm surprised its not
[13:55:06] ok joal
[13:55:12] milimetric: its easy, can be done with puppet in a sec
[13:55:14] will do shortly
[13:55:30] awesome, thanks for docopt ottomata !
[13:55:38] ottomata / elukey: if we do install docopt, can we get a later version than 0.6.1? there's a weird bug I ran into where you can't specify multi-line usage unless you start each line with an optional param
[13:55:39] both are ok with me
[13:55:46] joal: will see what i can do
[13:55:48] sorry
[13:55:48] ottomata: I think you can go ahead and merge/deploy the camus puppet thing
[13:55:50] milimetric: i mean
[13:55:50] (because if the line starts with a dash, it's treated as an option)
[13:55:50] ok
[13:55:52] joal:
[13:56:16] joal: we don't need to restart any running oozie jobs, right?
[13:56:26] correct ottomata
[13:56:28] the next time we restart them they'll pick up the new unshaded core .jar
[13:56:35] and if there are problems we deal with them then?
[13:56:42] ottomata: actually, they won't pick it up ;)
[13:56:49] unless we change the refinery version?
[13:56:56] refinery_jar_version
[13:56:57] ?
[13:57:10] ottomata: correct, plus, only refinery-core is unshaded
[13:57:15] no job uses refinery-core
[13:57:38] this is why it works easily
[13:57:41] aye right, riiiight, the other jars are shaded to include it
[13:57:43] aye
[13:57:43] aye
[13:57:45] cool
[13:58:25] joal: puppet has run on an27 with updated cron
[13:58:35] watching camus logs waiting for next run
[13:58:41] ottomata: great :)
[13:59:03] ottomata: no merge message on ops channel ... Have you merged the camus thing?
[13:59:07] ottomata: going to send the code review for docopt
[13:59:39] '
[14:00:13] mmmm do we also need python3?
[14:00:17] it seems installed
[14:00:34] joal: ja
[14:00:52] https://gerrit.wikimedia.org/r/#/c/304195/
[14:01:15] you guys running python3 or 2?
[14:02:24] ottomata: I believe 2, but no reason not to also install the python3 version, no?
[14:02:25] milimetric: the latest docopt is 0.6.2
[14:02:42] sweet, thx
[14:03:06] that has your fix?
[14:03:20] milimetric: i have to build a new .deb for it if we want it, trusty has 0.6.1
[14:03:29] oh
[14:03:31] https://gerrit.wikimedia.org/r/#/c/304472/1/modules/role/manifests/analytics_cluster/hadoop/worker.pp ?
[14:03:36] nice elukey :)
[14:03:38] uh... ottomata 0.6.1 is fine if it's easier
[14:03:52] I worked around the bug anyway, so long as everyone else is ok working around it
[14:03:57] what it means is that this works:
[14:03:59] Usage:
[14:04:30] blah.py --something ARG --else ARG
[14:04:30] [--optional ARG] --another ARG
[14:04:34] but this doesn't:
[14:04:53] Usage:
[14:04:53] blah.py --something ARG --else ARG
[14:04:53] --another ARG
[14:05:20] so you kind of need to invent as many optional args as you have lines :)
[14:05:25] ah interesting, haha
[14:05:45] milimetric: i usually just do
[14:05:49] Usage: camus [options]
[14:05:52] and then list options in
[14:05:53] Options:
[14:05:55] ...
[14:06:10] ottomata: I like it like that
[14:06:11] right, properties file is cool
[14:06:22] ottomata: camus partition checker working !
[14:06:26] huh, eh?
[14:06:27] Thanks for the deploy :)
[14:06:30] the other way is if you want the script itself to enforce usage
[14:06:33] milimetric: no i mean
[14:06:35] ottomata: the python thing :)
[14:06:54] https://github.com/wikimedia/analytics-refinery/blob/master/bin/camus#L18-L31
[14:07:08] i mean i don't specify the usage all on one line
[14:07:15] docopt still enforces it
[14:07:43] ottomata: docopt won't make some options required and some optional though
[14:07:54] if you spell each one of them out, you get to control what usages are legal
[14:07:55] ohhh, i guess its a little weird to have 'required options' :p
[14:08:11] not unusual at all
[14:08:13] lots of scripts do it
[14:08:32] yeah, it's just standard linuxy script behavior
[14:08:51] yeah, but required args are also pretty standard, i guess it depends on how many you got
[14:08:53] you don't do things like
[14:09:00] cp --source-file f1 --dest-file f2
[14:09:00] you do
[14:09:02] cp f1 f2
[14:09:05] !log Deploy refinery on hadoop
[14:09:07] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[14:09:24] right, and docopt lets you specify those as well, so you can make them required that way
[14:09:33] and leave all the options as --option1, etc.
[14:09:52] then you can do Usage: some.py positional1 positional2 [options] and that's fine
[14:10:09] yeah, if you had only one or two required args, i would say do that, but if you have a lot, it could get really confusing without naming them as option flags
[14:10:25] yeah, in this one script I have a lot
[14:10:28] aye ok
[14:10:33] addshore: Hi !
[14:10:38] hey!
[14:10:45] addshore: Just deployed refinery, will start your job
[14:10:53] okay! can I give you a start date?
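The "required options" debate above can be sketched with argparse, which is what the cluster scripts fell back to. The script and option names here are illustrative, not the real sqoop wrapper's interface.

```python
# Sketch of the pattern discussed above: positional args are required by
# default (cp-style), and argparse also supports required *named* options,
# which is exactly what docopt's usage-line syntax makes awkward to spell
# across multiple lines.
import argparse

def build_parser():
    p = argparse.ArgumentParser(prog='sqoop-wrapper')
    # Positional argument, required by default, like `cp f1 f2`:
    p.add_argument('wiki_list')
    # Named options can be marked required explicitly:
    p.add_argument('--jdbc-host', required=True)
    p.add_argument('--output-dir', required=True)
    # Genuinely optional flag:
    p.add_argument('--verbose', action='store_true')
    return p

args = build_parser().parse_args(
    ['wikis.list', '--jdbc-host', 'db1047', '--output-dir', '/tmp/out'])
print(args.jdbc_host, args.output_dir, args.verbose)  # db1047 /tmp/out False
```

Omitting a required named option makes argparse print a usage message and exit with status 2, so the script enforces its own usage without inventing placeholder optional arguments per line.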
[14:10:54] ottomata: this is it before I moved it to the oozie folder: https://gerrit.wikimedia.org/r/#/c/303339/5/bin/sqoop-mediawiki-dbs
[14:10:59] (just so you have a concrete example)
[14:11:10] haha, nice workaround
[14:11:24] addshore: you can, but if it's more than 2 months ago, there's no data
[14:11:30] thx :)
[14:11:50] 28th of July please joal :)
[14:12:02] addshore: Will do !
[14:13:06] elukey: batcave now?
[14:13:56] ottomata: sure
[14:21:42] addshore: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0064836-160630131625562-oozie-oozi-C/
[14:26:54] joal: I'd like to create the aqsloader user in aqs100[456] with the same pass as the admin one. AFAIU the only change required would be in your .properties, right?
[14:27:04] I mean, for the backfilling
[14:27:58] I didn't realize that loading took so much time :/
[14:30:08] elukey: works for me
[14:31:59] elukey: now is a good time
[14:32:21] joal: yep I am preparing the script :)
[14:32:23] hmmmmm but this means we'll have to install sqoop and the mysql researcher password file everywhere
[14:32:25] elukey: almost finished compacting the 2nd month, I'll start a new loading job later on this evening, can be done with the new user
[14:32:28] I could use a quick brain bounce about that
[14:32:57] milimetric: batcave?
[14:33:02] omw
[14:40:21] !log created the 'aqsloader' user on aqs100[456] cassandra instances following https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/AQS_Tasks
[14:40:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[14:40:25] joal --^
[14:41:27] * joal rubs his hands :)
[14:42:36] joal: I'd also need to restart the cassandra instances for the JVM upgrades
[14:42:52] elukey: will let you know when ready
[14:43:30] super
[14:45:30] elukey: unfortunately I think compaction will end during the night (from what I see)
[14:53:01] joal: I can restart the jvms tomorrow morning easily
[14:55:31] elukey: can be that, or monday after another month (if finished) :(
[14:55:53] I'll be on vacation on Monday, so tomorrow might be better
[14:56:11] are you going to kick off another job during the weekend or on Monday?
[14:56:11] k elukey, I'm not happy though
[14:56:17] no?
[14:56:21] elukey: I'll do it tomorrow morning
[14:57:49] joal: why are you not happy??
[14:58:04] cause it makes you work tomorrow morning
[14:58:21] ahhhhh
[14:58:28] I thought there was something serious
[14:58:37] don't worry, I am happy to do it :)
[14:58:37] :)
[14:58:44] it will take me 10 minutes
[14:59:50] urandom: o/ another thing that you might want to see - https://grafana.wikimedia.org/dashboard/db/aqs-elukey
[15:00:21] I restarted cassandra on the 11th around 13:00 UTC, which is when the latency dropped heavily
[15:00:54] we are not sure if this is due to the new jvm or if it is somehow related to the user switch that we did the day before (cassandra -> aqs for restbase)
[15:01:00] a-team: will be 2 minutes late, important phone call
[15:01:02] (maybe because the auth cache was cleared)
[15:01:15] elukey: interesting
[15:03:09] 50x went away too
[15:03:19] so we are super happy but it is still a bit strange :)
[15:06:40] elukey: not sure about the timing here, it sounds like you might be saying that the drop happened *after* the user change was actually applied, is that the case?
[15:07:50] i'd be more inclined to believe it was that if only because we didn't see any change in latency after the jvm upgrade, but if the two did not coincide...
[15:13:55] urandom: from the https://wikitech.wikimedia.org/wiki/Server_Admin_Log I switched the aqs user (restarting aqs on each node) on the 9th and restarted cassandra (on each node) on the 11th for the jvm upgrade
[15:14:19] and on the 11th, after the restart, the magic happened
[15:15:23] I assumed that switching *restbase/aqs* to use the aqs user was enough
[15:17:27] elukey: yeah
[15:21:51] urandom: I thought that maybe the auth cache (not sure how evolved it is) might have been refreshed only after the restart, giving us the final performance improvement. But there are a lot of changes in the Debian changelog for the JVM upgrade, so that might be a more plausible explanation
[15:22:15] elukey: yay?
[15:22:17] :)
[15:22:44] ??? :D ??
[15:23:31] mysteries like this are always disconcerting, but i guess better that it dropped than spiked :)
[15:24:47] ah yes for sure!
[15:25:35] I just wanted to get your opinion, I am really happy :)
[15:36:22] * milimetric getting lunch
[15:54:42] ottomata: if you have a bit of time, could you install siege and ab on aqs1004/5/6?
[15:55:11] siege?!
[15:55:23] do you need ab on all of them?
[15:55:27] probably just one, right?
[15:55:48] nuria_: these are in the analytics cluster
[15:55:51] do you need it installed on those machines?
[15:55:56] you can probably use ab from stat1002 or something
[15:56:28] huh, i didn't know about siege
[15:56:31] ottomata: the tests hit localhost on each aqs machine, so no, they have to be run locally
[15:56:39] hm, ok
[15:56:50] ok nuria_ we can install on all 3, let's remember to uninstall though
[15:57:05] ottomata: yes, as part of putting the cluster in service
[15:58:00] done.
[15:58:05] a-team logging off! o/
[15:58:13] elukey: ciao ciao
[16:00:23] nuria_: urls are defined with localhost
[16:00:39] joal: both siege and ab are installed, want to give it a try?
[16:00:51] joal: if you prefer Monday that works too
[16:01:04] nuria_: monday will be better, need to log off soon
[16:01:09] joal: k
[16:08:20] lunchin
[16:39:05] back
[16:39:20] holy moly it's hot out.
It feels like 43
[16:39:24] (with the humidity)
[16:48:33] (PS6) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[16:57:13] logging off a-team, see you on monday
[16:57:20] have a nice weekend joal
[17:40:40] (PS7) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[17:58:03] (PS8) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[18:08:53] Analytics, Pageviews-API: Provide weekly top pageviews stats - https://phabricator.wikimedia.org/T133575#2548587 (Nuria) @MusikAnimal : I think we will not be working on this item this quarter or next as we are focusing on edit data. Note that while we have several sources of pageview data the analytics...
[18:10:55] ottomata: you around?
[18:11:12] wanted to see about sqoop, that password file, and ask another oozie question
[18:11:50] sqoop should be installed everywhere
[18:11:55] ja am around!
[18:12:31] ok, so oozie is throwing: "E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'delete'. One of '{"uri:oozie:shell-action:0.1":mkdir}' is expected."
[18:12:49] link to xml
[18:12:49] when I tried to pass ...
[18:12:49] ?
[18:12:56] yes, one sec
[18:13:16] (PS9) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[18:13:32] line 40 https://gerrit.wikimedia.org/r/#/c/303339/9/oozie/mediawiki/refresh_site_matrix/workflow.xml
[18:16:29] I'll try an action instead of a , maybe something's weird about that schema
[18:16:30] milimetric: not sure but here's a shot
[18:16:33] try
[18:16:47]
[18:16:49] instead of 0.1
[18:17:07] I tried 0.3 and got the same, but I saw examples that have inside with 0.1
[18:17:13] I'll try 0.2 as well
[18:17:17] ah ok
[18:17:19] and I'll try the fs action too, maybe it works there
[18:17:20] yeah
[18:17:26] ok if you tried other versions then i don't think its that
[18:17:30] i'd just use whichever is the latest (and we have)
[18:17:59] meanwhile, you wanna see how you prefer to symlink /etc/mysql/conf.d/research-client.cnf ?
[18:18:29] btw, how would I know what version we have?
[18:19:51] uhhh, good q,
[18:19:51] hm
[18:19:52] aptitude show oozie
[18:19:56] Version: 4.1.0+cdh5.5.2+233-1.cdh5.5.2.p0.10~trusty-cdh5.5.2
[18:20:04] changed version in doc link to 4.1.0
[18:20:04] https://oozie.apache.org/docs/4.1.0/DG_ShellActionExtension.html#AE.A_Appendix_A_Shell_XML-Schema
[18:20:14] they only have up to 0.2 listed
[18:22:22] hm looking at conf file stuff...
[18:23:08] hm, milimetric i think we might not have enough control over permissions to do the symlink put thing
[18:23:31] also the hdfs user (that we usually do the hdfs refinery deploy with) doesn't have access to that mysql.conf file
[18:23:35] because it is not in the right group
[18:23:44] might be better to puppetize putting it in hdfs
[18:23:47] which will be a little bit hacky...
[18:23:48] hm.
[18:23:53] thinking about it, brb
[18:34:49] sorry, contractors, reading up
[18:35:40] ok, lemme know if I can help.
If it's easier, all I need is just the password in a file by itself
[18:35:49] that's easier for me anyway, and it's in the private repo, right?
[18:35:53] hmm, that might be easier actually, hmmmm
[18:36:24] still will be hacky though :/
[18:36:32] you pull it into a template and put that to hdfs?
[18:36:40] the putting to hdfs is hacky, right?
[18:45:17] ottomata: ok, so it was ok with me doing a delete in the action
[18:45:29] weird but I don't care, moving on :)
[18:47:46] (PS10) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[18:47:55] ha, ok :/
[18:48:13] milimetric: yeah, puppetizing stuff in hdfs is hacky
[18:48:22] sorry, puppetizing putting stuff into hdfs
[18:48:24] but ja, did this:
[18:48:29] https://gerrit.wikimedia.org/r/#/c/304494/1/modules/role/manifests/analytics_cluster/mysql_password.pp
[18:49:25] I see, yeah, executing the hdfs dfs -put is not the prettiest thing
[18:49:42] but it's all contained in that single place, so it seems maintainable to me
[18:50:03] like if someone's looking for usage of that password variable they'll find this and won't need to trace spaghetti puppet
[18:50:15] hopefully
[18:50:21] i had considered making a new define in the cdh module
[18:50:24] something like i have for directories
[18:50:30] that would abstract creation of files in a puppet-like way
[18:50:38] so you could do content => ...
[18:50:39] buuuuut
[18:50:45] ok, thx very much, I'll set it to /user/hdfs/mysql-analytics-research-client-pw.txt by default and copy it into my own dir for testing
[18:50:46] don't want to go that far right now :/
[18:50:52] k cool
[18:51:03] yeah, if you want to clean it up make a task and we'll prioritize it
[18:58:52] (PS11) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[19:04:50] milimetric: done, it's at /user/hdfs/mysql-analytics-research-client-pw.txt now
[19:04:59] oh sweet
[19:05:03] thx!
[19:18:06] (PS12) Milimetric: [WIP] Oozify sqoop import of mediawiki tables [analytics/refinery] - https://gerrit.wikimedia.org/r/303339 (https://phabricator.wikimedia.org/T141476)
[19:23:11] ottomata: I'm a little confused about making files available to oozie and I can't find examples
[19:23:30] when I ran this workflow, it didn't find the file on line 100: https://gerrit.wikimedia.org/r/#/c/303339/12/oozie/mediawiki/import_history/workflow.xml
[19:23:33] and so I added the thing on line 108
[19:23:51] and it still gives me: FileNotFoundError: [Errno 2] No such file or directory: 'wiki_grouped_db_test.list'
[19:24:22] brb
[19:29:20] Analytics-Kanban: Compile a request data set for caching research and tuning - https://phabricator.wikimedia.org/T128132#2548768 (Nuria) I will try to get a larger dataset (hopefully one week) excluding any response that is not a 200. That should reduce data a bit and provide you with what seems a cleaner d...
[19:30:31] ok back
[19:41:08] Analytics-EventLogging, Schema-change: Add index on event_type to MultimediaViewerDuration tables. - https://phabricator.wikimedia.org/T70397#2548789 (Tgr) Open>declined The dashboard has been broken for a long time, MediaViewer is not actively developed and has in general gotten stranded on the...
[19:42:23] milimetric: sorry, with you in a few mins
[19:42:29] brain is busy on something...
[19:46:06] ok milimetric with you
[19:51:19] Hmmm, milimetric not so sure
[19:51:22] i have some things to try though
[19:51:25] maybe, try using lib/ dir
[19:51:26] http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a7_Workflow_Application_Deployment
[19:52:03] http://stackoverflow.com/questions/12720610/how-to-specify-multiple-jar-files-in-oozie
[19:52:24] hmm, dunno though
[19:52:35] it seems like oozie will make those lib files available automatically
[19:54:58] dunno though
[19:57:30] Hmmm!
[19:57:32] https://hue.wikimedia.org/oozie/list_oozie_workflow_action/0065120-160630131625562-oozie-oozi-W%40import_history/
[19:57:36] oh that is a later one?
[19:57:37] milimetric: ?
[19:57:41] variable [appPath] cannot be resolved
[19:58:14] do you have a hue link to a workflow that failed for you?
[20:00:12] oh hm, milimetric i also see that https://hue.wikimedia.org/jobbrowser/jobs/job_1468526822215_87822/tasks/task_1468526822215_87822_m_000000/attempts/attempt_1468526822215_87822_m_000000_0/logs
[20:00:26] that No such file or directory is coming from your python script
[20:00:38] maybe you can debug by printing out cwd and the contents of the current directory
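A minimal version of the debugging suggestion above for the failing oozie shell action: before opening anything, log where the launcher actually runs the script and which files were localized next to it. The list filename is the one from the failing job; the helper name is made up for illustration.

```python
# Preamble for an oozie shell action script: dump the launcher's working
# directory and its contents to stderr (which ends up in the yarn task
# logs), then check for the expected input file before using it.
import os
import sys

def dump_launcher_context(expected_file):
    """Print cwd and its contents; return whether expected_file is there."""
    print('cwd:', os.getcwd(), file=sys.stderr)
    print('contents:', sorted(os.listdir('.')), file=sys.stderr)
    return os.path.exists(expected_file)

# Name taken from the FileNotFoundError in the job above:
if not dump_launcher_context('wiki_grouped_db_test.list'):
    print('wiki_grouped_db_test.list not found in', os.getcwd(),
          file=sys.stderr)
```

This turns the bare `FileNotFoundError` into a log line that shows whether the `<file>` element localized the file at all, and under which name.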
[20:32:37] it didn't really though [20:32:55] oozie thinks it does, but that is some problem with the python script and oozie disagreeing [20:32:59] yeah, so I'd do like lib/myfile#myfile and then access it the same way from python [20:33:16] maybe, maybe if you put it in lib you wont' need the stuff [20:33:29] if so, then you could access it as lib/myfile in python [20:33:30] right [20:33:32] really not certaino though [20:33:33] I'll try [20:33:37] I'll let you know [20:33:39] k [20:36:37] ottomata: can you put docopt on stat1002 as well? [20:38:18] Analytics-Kanban, EventBus, Wikimedia-Stream: Public Event Streams - https://phabricator.wikimedia.org/T130651#2549276 (Ottomata) @Krinkle I've got a WIP version of Kasocki going over here: https://github.com/ottomata/kasocki It still needs work, tests, etc, but I feel good about letting people l... [20:38:32] wha its not there?! [20:38:51] its there milimetric [20:39:12] oh! [20:39:18] I'm in python3 on this script :( [20:39:22] oh [20:39:24] sorry [20:39:28] I needed concurrent.futures [20:39:33] hm k lemme see real quick [20:40:23] (PS2) Nuria: [WIP] Bug fixes on datepicker [analytics/dashiki] - https://gerrit.wikimedia.org/r/303693 (https://phabricator.wikimedia.org/T141165) [20:44:17] (PS3) Nuria: [WIP] Bug fixes on datepicker [analytics/dashiki] - https://gerrit.wikimedia.org/r/303693 (https://phabricator.wikimedia.org/T141165) [20:51:07] milimetric: installed via puppet. [20:51:14] thx! [20:51:15] milimetric: i'm signing off compy, not sure if i'll be back on for the eve [20:51:21] but i might be! [20:51:25] laters! good luck [20:52:43] thanks! [21:03:43] bearloga: is the instrumentation behind http://discovery.wmflabs.org/metrics/#survival documented somewhere? (e.g a link to the code on github would be great already, or to the schema page in case it's based on eventlogging...) [21:07:01] HaeB: yup! 
the code that writes out that dataset (https://datasets.wikimedia.org/aggregate-datasets/search/sample_page_visit_ld.tsv) is lines 59-86: https://github.com/wikimedia/wikimedia-discovery-golden/blob/master/search/desktop.R#L59 and it uses this schema: https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2
[21:08:15] bearloga: awesome, thanks!
[21:09:50] HaeB: you're welcome! let me know if you have any additional questions. there's a lot to parse there and I could have done a better job documenting what the pieces do
[21:10:18] "median lethal dose", huh ;)
[21:13:31] survival analysis has its roots in drug trials and I was too lazy to come up with a new term for describing "the point at which half the sample 'dies'"
[21:13:57] understood ;)
[21:20:47] LD50?
[21:22:59] bye a-team! see you next week, have nice days!
[21:23:43] bearloga: ok, that code is indeed a bit complex, but i think i understand the basic logic. so i guess there is also a piece of javascript that generates the checkin events (poison doses)? where would one find that?
[21:23:44] nite!
[21:31:17] MaxSem: yup, LD50
[21:31:51] HaeB: i'll try to find it
[21:33:11] bearloga: thanks! only if it's not too much effort
[21:55:26] Analytics, Pageviews-API: Provide weekly top pageviews stats - https://phabricator.wikimedia.org/T133575#2236232 (Tbayer) @MusikAnimal The Signpost actually switched away from a weekly publication schedule recently. Is the traffic report going to stick to the weekly format for the foreseeable future? Al...
[22:11:26] HaeB: https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/modules/ext.wikimediaEvents.searchSatisfaction.js#L216
[22:12:12] HaeB: ^ that's Erik B's work
[22:13:09] bearloga: i see. thanks again!
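The LD50 analogy discussed above can be made concrete with a toy sketch: sessions emit periodic check-in events, a session "dies" at its last observed check-in, and the LD50 is the check-in time by which half of the sample has died. This is a hypothetical reimplementation of the idea, not the code behind the dashboard.

```python
# Toy "median lethal dose" for session survival: the median of the
# per-session last-seen check-in times.
def ld50(last_checkin_times):
    """Return the time by which half of the sessions have 'died'."""
    ordered = sorted(last_checkin_times)
    if not ordered:
        raise ValueError('no sessions observed')
    # Median element: half the sample 'dies' at or before this point.
    return ordered[len(ordered) // 2]

# Five toy sessions whose last check-ins (in seconds) were:
print(ld50([10, 20, 30, 40, 420]))  # 30
```

Note how the one long-lived session (420 s) does not drag the statistic up, which is why a median, rather than a mean, is the natural summary for heavily skewed dwell times.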