[01:49:55] (PS1) Nuria: [WIP] Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 [04:50:40] operations, Wikimedia-Site-requests, Analytics-EventLogging: wikitech.wikimedia.org error "$wgEventLoggingBaseUri is not set." - https://phabricator.wikimedia.org/T84965#935588 (Krinkle) NEW [08:36:16] Multimedia, MediaWiki-extensions-MultimediaViewer, Analytics: Create MediaViewer image varnish hit/miss ratio dashboard - https://phabricator.wikimedia.org/T78205#935804 (Gilles) The theory doesn't seem to hold true, the vast majority of varnish misses with a very small "Age" value have a very old Last-Modifi... [09:43:04] MediaWiki-extensions-MultimediaViewer, Analytics, Multimedia: Investigate if pre-rendering images is having an impact on performance - https://phabricator.wikimedia.org/T76035#935895 (Gilles) Now that I have more data I can confirm that only varnish hits are affected by the phenomenon where images uploaded re... [09:57:54] MediaWiki-extensions-MultimediaViewer, Analytics, Multimedia: Create MediaViewer image varnish hit/miss ratio dashboard - https://phabricator.wikimedia.org/T78205#935925 (Gilles) What's interesting in those findings, though, is that 99.34% of varnish misses are swift pulls, regardless of upload time. Which wo... [11:08:50] (PS1) OliverKeyes: [WIP] UDF for identifying if a request meets the legacy pageview definition. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/181049 [11:28:17] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936099 (QChris) NEW a:Ottomata [11:50:48] operations, Analytics: Move stat1001, stat1002 and stat1003 into Analytics VLAN - https://phabricator.wikimedia.org/T76346#936139 (akosiaris) [11:50:50] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936137 (akosiaris) Open>Resolved I updated the relevant rule on cr{1,2}-eqiad and verified that now stat1003 can connect to db1031 (x1-analytics-slave). Marking as resolved [12:02:05] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936160 (QChris) Whoa! That was quick. Thanks :-) [14:54:34] Analytics-Refinery: Import Mediawiki XML dumps into HDFS - https://phabricator.wikimedia.org/T76347#936346 (Ottomata) [14:59:43] (CR) Mforns: [C: 1] "LGTM" [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [15:04:17] MediaWiki-User-blocking, Analytics: Make a SpecialPage to show stats on blocked IP (ranges) that attempt to edit - https://phabricator.wikimedia.org/T78840#936359 (Aklapper) p:Triage>Volunteer? [15:57:52] (PS2) Milimetric: Add external link icon to metric name [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 [16:00:38] (CR) Milimetric: "Agreed with Nuria's comment too. The icon can't really be shifted without super hacky css, because it's a font. But the underline was ug" [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [16:17:28] (CR) Ottomata: "Cool, will look over this. Qchris, Ironholds, this is kinda why I wanted the class in the other change to be called 'Webrequest'. I see " [analytics/refinery/source] - https://gerrit.wikimedia.org/r/181049 (owner: OliverKeyes) [16:18:45] ottomata, I don't get the "wanted the class in the other change..." 
bit [16:19:41] I think there should be one refinery-core class named Webrequest [16:19:44] that has methods [16:19:49] isPageview [16:19:49] isLegacyPageview [16:20:14] hey Andrew - have you played with the Hortonworks Sandbox? [16:20:15] http://hortonworks.com/products/hortonworks-sandbox/ [16:28:01] ah cool, no i have not! [16:28:18] cool in virtual box! [16:30:35] ottomata, that makes sense, but I worry about that class getting overloaded [16:32:04] don't make me make a Webrequest wrapper class! I will do it! :) [16:32:24] or maybe i won't [16:35:21] ottomata: you said teh cluster takes about 135K requests per sec, right? [16:35:52] yeah, on average, that was just a guess [16:36:01] btw, i'm watching a linked in kafka presentation [16:36:07] they peak at 3.25 million requests / second [16:36:09] :o [16:43:14] ottomata: do we have a graph of the intake of events? [16:43:44] http://grafana.wikimedia.org/#/dashboard/db/kafka [16:44:06] (the varnishkafka one at the bottom is not complete, that is WIP) [16:46:51] ottomata: is there an analytics server that has enough spare capacity to run a 1gb redis instance and a small python webapp? [16:47:08] permanently or just for gun? [16:47:13] fun* [16:47:22] does webapp need to be public? [16:48:04] nope [16:48:11] well, ideally behind misc-varnish with ldap auth [16:48:22] and permanently, i think [16:48:56] k, probably stat1001 will be fine [16:49:09] its beef webserver thingee, stats.wikimedia.org lives there [16:49:30] -/+ buffers/cache: 1125 31037 [16:50:57] yay [16:52:32] ori, whatcha doin? [16:53:25] i want to experiment with an event consumer for blog stats that updates counters as opposed to doing repeated table scans [16:54:12] i drove tilman insane by having a massive writer's block on the hhvm blog post so i'm looking to make it up to him somehow :P [16:54:20] ha :) [16:54:30] curious, what's up with statsv? are we using it? [16:54:50] yes! ve is using it to report metrics, and a consumer is piping them into graphite [16:54:58] the api is still taking shape [16:55:02] oh, really? what's your consumer? [16:55:28] this is the js piece: https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/modules/ext.wikimediaEvents.statsd.js [16:55:30] roan wrote it [16:56:16] and that's the python consumer: https://github.com/wikimedia/analytics-statsv/blob/master/statsv.py [16:56:35] very rudimentary, config values hard-coded, all the bad things [16:56:40] don't judge me! [16:56:48] * ori hides. [16:57:31] ah cool! where is that consumer running? [16:57:41] hafnium [16:58:31] the js piece is the producer? [16:59:12] yes and no; it's what is generating the request to the url endpoint, but it is also a consumer itself, since we have a pub/sub api in javascript land [16:59:32] so the metric reporting is done in ve, but using mw.track(), which is agnostic about how the metrics are actually reported [17:00:10] the mw.trackSubscribe( 'timing.', function ( topic, time ) { ... } ) registers the statsv code as a handler for client-side timing events [17:00:27] hm, ok [17:00:40] this allows developers to instrument their code without tightly coupling it to wikimedia-specific infrastructure [17:00:44] aey [17:01:14] it's all a bit rough brick-a-brack right now but it'll acquire some polish soon, i promise [17:01:20] haha, cool [17:01:24] this looks awsome [17:01:46] the kafka stuff is awesome! i am a convert [17:02:04] woohoo! [17:02:09] this is from statsv? 
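Backing up to ottomata's 16:19 proposal — a single refinery-core class named Webrequest carrying one method per pageview definition — the shape he is describing would be roughly as below, rendered in Python purely for illustration (the real code is Java, in analytics/refinery/source):

```python
class Webrequest:
    """Illustrative sketch only: models one row of the webrequest log,
    with one method per pageview definition, as proposed at 16:19."""

    def __init__(self, uri_host, uri_path, uri_query,
                 content_type, user_agent):
        self.uri_host = uri_host
        self.uri_path = uri_path
        self.uri_query = uri_query
        self.content_type = content_type
        self.user_agent = user_agent

    def is_pageview(self):
        # New (generalised-filters) definition would be implemented here.
        raise NotImplementedError

    def is_legacy_pageview(self):
        # Legacy (webstatscollector-era) definition would go here.
        raise NotImplementedError
```

The thread comes back to this design around 22:34 and lands on separate per-definition classes (Pageview, PageviewLegacy) instead.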
[17:02:10] http://grafana.wikimedia.org/#/dashboard/db/vetem [17:02:29] yep [17:03:39] apart from general cleanliness one of the big TODOs is to have the python backend quarantine new metrics until they are reported by sufficiently many distinct IPs [17:04:00] as a way of making it harder to abuse by just generating random garbage metric names and flooding graphite that way [17:04:53] someone determined could still spam from multiple IPs, but it requires substantially more effort, and doesn't really give the attacker anything [17:05:30] aye, cool [17:05:55] someone could garbage up already existing metrics though, ja? [17:06:20] ori this is so coooool, there is so much potential here [17:06:32] yeah, we probably want to throttle clients too [17:06:42] could varnish do that? [17:07:11] possibly, but i think it's better to just blackhole it in python [17:07:22] well, except, it'd be nice to keep someone from DoSing kafka [17:07:33] or, at least trying [17:07:49] hmmm [17:08:30] ori, congrats :) [17:08:39] thanks! [17:08:42] does this mean you have to grow a beard now and start opening conversations with [17:08:49] "when I was writing COBOL, not that you were even ALIVE then..." [17:08:50] ? [17:09:17] i hope not :) [17:09:20] actually, turns out my landlord used to write COBOL. That was a fun discussion [17:09:23] rob didn't mention that [17:09:34] "[lots of cobol things]" "...I know that COBOL is 1-indexed! Go me!" [17:12:59] ori, ottomata I think the backend will get dos'ed before kafka [17:13:22] let's test the theory! [17:13:39] I think you can just do the math [17:13:47] everyone go to https://en.wikipedia.org/wiki/Barack_Obama?action=purge and stick a brick on your F5 key. [17:13:50] tnegrin: probably true [17:14:49] yeah probably true [17:14:49] when we built these things at yahoo, the hardest problem was when the message bus filled up and we couldn't deque fast enough [17:15:06] can we flush kafka? [17:15:27] no, its got a 7 day buffer. I think we could make the buffer messages size based instead of time based [17:15:40] so, only keep 15B messages, or whatever [17:15:52] and drop at the head? [17:15:56] yes [17:16:05] but ja, this service is only sitting on 4 bits varnishes [17:16:09] https://www.varnish-cache.org/vmod/throttle [17:16:32] what's the backend? python batching and feeding graphite? [17:16:47] not even sure the 4 varnishes would be able to DoS Kafka if they maxed sending [17:17:04] tnegrin: python consuming from kafka, parsing message, and feeding statsd [17:17:10] yeah, i linked to it above (https://github.com/wikimedia/analytics-statsv/blob/master/statsv.py). calling it a "backend" is getting a bit fancy, it's half a page of basic python [17:17:10] which feeds graphite [17:17:16] ori: interesting, just read Roan's code [17:17:20] call it a consumer [17:17:25] milimetric: hey! [17:17:26] do you see this replacing event logging? [17:17:37] it's basically a race between the kafka disks and the stated disks [17:17:53] milimetric: i don't think so; it's a distinct use-case [17:18:08] handling more throughput, etc. [17:18:15] what's the statsd backend? [17:18:28] statsd is like a monitoring middleman [17:18:39] can aggregate and do some statistics on metrics before sending them to monitoring services [17:18:42] like grahpite/ganglia, etc. [17:18:43] where does the data ultimately get stored? [17:18:50] graphite [17:18:58] milimetric: did HaeB approach you about a dashboard for the blog? [17:19:04] so log files on flash/stats? [17:19:07] yes a while back [17:19:12] ? 
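The statsv pipeline described above — varnishkafka producing to Kafka, statsv.py consuming, parsing each message, and feeding statsd, which feeds graphite — reduces to a very small loop. A sketch of that shape, assuming the kafka-python client; the topic, broker, statsd endpoint, and message format here are illustrative, not statsv's actual configuration:

```python
import socket

from kafka import KafkaConsumer  # assumes the kafka-python package

STATSD = ('statsd.example.wmnet', 8125)  # illustrative statsd endpoint

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
consumer = KafkaConsumer('statsv', bootstrap_servers='kafka.example:9092')

for message in consumer:
    # Assume each Kafka message carries metric=value pairs from the
    # beacon's query string, e.g. 'timing.ve.load=1200ms&ve.save=1c'.
    for pair in message.value.decode('utf-8').split('&'):
        name, _, value = pair.partition('=')
        if not (name and value):
            continue
        # Translate to statsd's wire format: <metric>:<value>|<type>
        if value.endswith('ms'):
            packet = '%s:%s|ms' % (name, value[:-2])
        else:
            packet = '%s:%s|c' % (name, value.rstrip('c'))
        sock.sendto(packet.encode('utf-8'), STATSD)
```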
[17:19:20] ottomata: statsd backend? [17:19:22] ori: but kevin never scheduled it [17:19:22] milimetric: want to pair up on it and surprise him? [17:19:32] :) sure [17:19:45] happy to help - right now we're on "Christmas break" in terms of scheduled stuff [17:19:54] so I'm playing with Storm [17:20:01] tnegrin: not sure what you are asking [17:20:19] how does graphite store the data that gets send via statsd? [17:20:24] btw - for experimentation with Hadoop clusters, the Hortonworks Sandbox is great: http://hortonworks.com/products/hortonworks-sandbox/ [17:20:26] ah yes, on disk somehow [17:20:28] dunno [17:20:37] ori: so how can I help? where are you at [17:20:48] yeah -- so that's the race [17:20:52] milimetric: give me two minutes and i'll throw up a gist of the python bit i'm hacking on [17:23:21] hey qchris, when you get a sec, look over the IsPageviewUDF again? is there anything else you think we should do? [17:24:15] sure. [17:24:46] Analytics: Create Geocode Hive UDF for Refinery - https://phabricator.wikimedia.org/T76349#936529 (Ottomata) [17:24:49] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936530 (Ottomata) [17:28:43] milimetric: https://gist.github.com/atdt/85166c6813c86497a2ed [17:28:54] looking [17:30:51] basically if there's a page view on blog.wikimedia.org/some-post, the following counters get incremented: lifetime total views of the post, hourly views for the post, weekly, monthly, yearly. apart from the lifetime totals, the other keys are rotated by having an expiration time set based on a retention policy [17:31:34] qchris may be interested in this too [17:31:49] I am reading along :-) [17:31:53] It's great. [17:32:27] ori: ok [17:33:14] so you want to graph those things, once they're in redis? [17:33:38] yeah [17:33:57] so what's going to generate the "events" list [17:34:12] oh doh, heh, that should read [17:34:26] for meta in iter(sock.recv_json, None): [17:34:33] gotcha [17:34:56] apart from graphs there should be simple tabular presentation of the data, i.e. for a given post you should be able to see the list of top referrers + count for each [17:35:00] i saw you and andrew talking about some redis instance he's setting up? [17:35:02] is that for this/ [17:35:18] i aint setting it up! ori has powers! [17:35:24] heheh [17:35:28] but yeah, that is what i was asking about [17:35:36] sorry i'm jumping in the middle - i'm not sure what all the details are [17:36:09] nobody is, this is the product of some half-brained late-night hacking, haven't thought it through yet [17:36:11] ok, so these things are coming on the normal EL stream from vanadium, going into a *new* redis that's already set up, and *should* end up in a graph somewhere that... should be private right? [17:36:30] yeah, probably restricted to wmf ldap [17:36:39] it's not already set up [17:37:19] ok, ottomata do we have anything we can serve web content on and can restrict to wmf ldap? [17:37:21] if we had redis 2.8 we could use hyperloglog for uniques [17:37:44] milimetric: it's easy to do; there's an apache config snippet we reuse [17:37:47] let me dig up an example [17:38:17] stat1001 is ripe for the choosing,. ori has powers, go ori! [17:38:33] ok, just wondering more from an ops "are we allowed to do this" point of view [17:38:43] https://github.com/wikimedia/operations-puppet/blob/production/templates/graphite/apache-auth-ldap.erb [17:39:30] sure, why not? 
graphite and logstash are conceptually similar (they're web interfaces to log data) and they're behind ldap [17:40:22] ori: well we're talking about writing some new server (albeit simple) and reading data from redis then exposing it in a new UI, and deploying to prod cluster without security revie [17:40:30] nooooo [17:40:35] we'll ask, of course [17:40:42] i'm just saying, i don't expect any trouble [17:40:55] i'm not suggesting we bypass review [17:41:07] you wanted to do it today though? [17:41:41] we don't have to [17:41:59] but otherwise we wouldn't surprise anyone :) [17:42:13] it's the season for surprises, so I'm down [17:42:31] you think you could whip up a simple web ui today? that'd be pretty impressive [17:42:42] that wouldn't be a problem [17:42:51] but i don't think i could navigate all the policy nuances today [17:43:07] i can do that [17:43:09] ottomata: what can I use on stat1001? python / node / ? [17:43:18] python plz! [17:43:28] k, i guess flask then? [17:43:55] sounds fine [17:44:11] i actually don't think i can deploy this today since i have meetings until 4pm [17:44:31] but if you have the code ready i can do it whenever there's a spare ops person to review and babysit [17:44:35] bbiab [17:44:41] ori: ok i'll see if I can whip something up and I'll put it in my github for now [17:46:09] did someone say redis? [17:46:11] * YuviPanda reads backlog [17:46:23] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936574 (Ottomata) Ok, here is what we probably want: # A class in `refinery-core` containing geocoding methods. Something like: ## `getCountryCode(ipAddress)` - returns two letter country code. ## `getGeocodedData(ipAddress)` - r... [17:46:34] lunchtime, back laters! [17:48:45] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936577 (Ironholds) That makes sense. Are we going to use the v1 or v2 maxmind API? [18:06:14] milimetric: how about 'abacist' for a name? [18:06:22] http://www.merriam-webster.com/dictionary/abacist [18:11:33] (PS1) Ori.livneh: Add .gitreview [analytics/abacist] - https://gerrit.wikimedia.org/r/181105 [18:13:35] (CR) Ori.livneh: [C: 2 V: 2] Add .gitreview [analytics/abacist] - https://gerrit.wikimedia.org/r/181105 (owner: Ori.livneh) [18:13:45] (PS1) Ori.livneh: Initial commit. [analytics/abacist] - https://gerrit.wikimedia.org/r/181106 [18:18:14] indeed it looks like no data is been sent to logstash: [18:18:18] https://www.irccloud.com/pastebin/NZULkGUS [18:25:55] ori: ok :) [18:26:02] ori: do you want this flask server in the same repo? [18:26:22] milimetric: makes sense to me [18:30:35] nuria__: have you pinged bd808? [18:30:55] ori: is that gage? [18:30:59] no, bryan davis [18:31:03] do you know him? [18:31:18] ori: no, for the logsstash stuff? [18:31:22] yep [18:31:23] *logstash [18:31:39] orI: does he work in ops? [18:31:49] mediawiki-core [18:31:54] see #wikimedia-operations [18:54:16] (CR) Nuria: [C: 2] "Agreed it's better. Merging." [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [18:54:26] (CR) Nuria: [V: 2] "Agreed it's better. Merging." 
[analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [18:59:22] (CR) QChris: [C: -1] [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (19 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [19:01:28] ori: i got some basics down, just trying to understand the data now [19:01:49] so we'll have 25 hours worth of hourly counts, 31 days worth of daily, etc. [19:02:01] yep [19:02:07] k, good [19:02:19] will ottomata be back today? [19:02:43] if you see him ask him to check out https://gerrit.wikimedia.org/r/#/c/181110/ [19:03:18] qchris, yt? [19:03:23] nuria__: yup. [19:03:35] But about to head to dinner. [19:03:43] So better be quick *nom nom* [19:03:53] ok, nevermind then, will ask otto when he's back [19:03:57] will do ori - puppet yay :) [19:04:01] see you monday qchris [19:04:04] nuria__: k [19:04:17] nuria__: I'll be back after lunch. [19:04:23] s/lunch/dinner/ [19:04:24] qchris: sorry about the review delay [19:04:28] will try to get to your patches today [19:04:39] enjoy lunch! [19:04:56] ori: No worries :-) duesentrieb and krinkle are starting do discuss objectsvs. arrays. I don't want to get in the way there anyways. [19:05:02] heheh [19:06:26] Wikimedia-Logstash, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936719 (bd808) [19:07:30] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936720 (Gage) [19:12:13] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936734 (Gage) [19:13:54] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936707 (Gage) [19:18:22] (CR) Jdlrobson: [C: 2] Add timestamp to query results. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180996 (owner: Bmansurov) [19:18:29] (Merged) jenkins-bot: Add timestamp to query results. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180996 (owner: Bmansurov) [19:19:58] milimetric: hi, in order to deploy limn-mobile-data, should I run this command from my local machine? fab mobile_reportcard deploy.only_data [19:22:23] bmansurov: yup! assuming you’ve access to the analytics wikimedia labs project [19:22:40] YuviPanda: ok thanks [19:23:48] YuviPanda: I'm getting this error: Fatal error: Couldn't find any fabfiles! [19:24:06] YuviPanda: I was inside root directory when I ran that command [19:24:23] bmansurov: root difrectory of what? [19:24:32] YuviPanda: limn-mobile-data git repo [19:24:32] bmansurov: of the limn-mobile-data clone? [19:24:36] yes [19:28:11] Analytics-Engineering: Provide a ua-parser service (ua-parser.wikimedia.org) using the One True Wikimedia UA-Parser™ - https://phabricator.wikimedia.org/T1336#936762 (Ironholds) [19:34:16] ottomata: hola [19:34:52] ottomata: following the breadcrumbs from hadoop into oozie i found thsi error on my hadoop job: [19:35:02] https://www.irccloud.com/pastebin/m4eYKE4g [19:35:28] ottomata: which i do not get as i can run hive queries just fine... 
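On the abacist side, the counter scheme from ori's gist plus the retention milimetric quotes at 19:01 (roughly 25 hourly buckets, 31 daily, and so on) comes down to INCR plus a TTL per time bucket, so stale buckets expire on their own. A minimal sketch, with key names and retention values invented for illustration:

```python
import time

import redis

r = redis.StrictRedis()

# strftime bucket format -> how long to keep that bucket (illustrative)
RETENTION = (
    ('%Y-%m-%dT%H', 25 * 3600),    # hourly counts, kept ~25 hours
    ('%Y-%m-%d', 31 * 86400),      # daily counts, kept ~31 days
    ('%Y-%m', 13 * 31 * 86400),    # monthly counts, kept ~13 months
)

def count_view(post_slug, now=None):
    """Record one view of blog.wikimedia.org/<post_slug>."""
    now = now if now is not None else time.time()
    r.incr('views:%s:all' % post_slug)  # lifetime total: never expires
    for fmt, ttl in RETENTION:
        bucket = time.strftime(fmt, time.gmtime(now))
        key = 'views:%s:%s' % (post_slug, bucket)
        pipe = r.pipeline()
        pipe.incr(key)
        pipe.expire(key, ttl)  # rotation: old buckets simply age out
        pipe.execute()
```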
[19:35:53] ha yeah, ellery had that problem too [19:35:57] i'm not entirely sure why, but [19:36:20] you can change hive.exec.scratchdir [19:36:21] to [19:36:34] hive.exec.scratchdir=/tmp/hive-nuria [19:36:38] one sec. [19:37:20] yeah [19:37:21] so [19:37:26] you can do that in the hive script [19:37:27] via set [19:37:31] but, ellery had problems with that [19:37:36] ori, k i got some fake data and the site working, now the hard part: how to show this in a graph [19:37:37] e.g. [19:37:38] you could do [19:37:39] https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration [19:38:00] however, qchris recommends doing it in the oozie configs [19:38:11] see his oozie_demo.tar.gz workflow.xml file that he sent to the internal list [19:38:17] it has [19:38:17] <configuration> [19:38:18] <property> [19:38:18] <name>hive.exec.scratchdir</name> [19:38:18] <value>/tmp/hive-${user}</value> [19:38:18] </property> [19:38:18] </configuration> [19:38:23] which makes sense to me [19:38:38] as part of the hive action [19:38:41] ottomata: ok, that is easy enough [19:39:01] ottomata: let me try and i will report in a bit [19:43:34] nuria__: fyi, you can get syntax highlighting in mediawiki with <source> tags [19:43:38] <source lang="..."> to specify language [19:43:42] ottomata: oohhhhh [19:43:46] ottomata: https://gerrit.wikimedia.org/r/#/c/181110/ :D [19:57:12] milimetric: maybe just expose a JSON API for now, and then do the graphing / layout in JS? [19:57:46] ori: I was gonna skip the fetching of data from json 'cause i thought the graphing was the hard part [19:57:47] but yea, doing it in js [19:57:55] *fetching from redis i mean [19:58:21] yeah the graphing is the hard part [19:59:07] it'd be nice to have a single beautiful graph to display all this but it kind of mixes too many concepts I think [19:59:41] ori, just curious, why are you doing this with redis + python? why not just => statsd and be done with it? [20:00:32] ottomata: i have to run meet with tilman and apologize for being such a horrible procrastinator with the blog post, but i can explain after. there are reasons :) [20:01:18] ok [20:06:50] Hey guys, I'm trying to deploy limn-mobile-data and fabric is asking me for login password for 'vagrant'. Can anyone tell me what it is? [20:07:04] Full message: [limn1.eqiad.wmflabs] Login password for 'vagrant': [20:07:12] just recite "I am the system administrator, my voice is my password" into your microphone [20:08:35] I did but wasn't allowed. Must be my accent. [20:13:37] bmansurov: something's gone wrong somewhere [20:13:39] are you on your local machine? [20:13:51] milimetric: yes, apparently I don't have access to limn1.eqiad.wmflabs [20:13:51] Ironholds: I forget, did we buy the maxmind 2 db files? [20:13:56] do we have them on our systems? [20:13:59] yup [20:14:00] same directory [20:14:02] can you ssh limn1.eqiad.wmflabs? [20:14:13] already built a library that integrates with the C API :D [20:14:16] milimetric: no, Connection closed by UNKNOWN [20:14:18] is your labs username bmansurov? [20:14:22] milimetric: yes [20:15:03] bmansurov: try now [20:15:41] milimetric: still the same error [20:16:04] milimetric: ok, it's working now, thanks [20:16:05] bmansurov: do you have eqiad.wmflabs set up to go through the bastion in your ssh config? [20:16:10] ok, good [20:16:11] milimetric: yes [20:16:23] bmansurov: limn-deploy now, give it a shot again [20:16:38] "fab mobile_reportcard deploy.only_data" right?
[20:17:03] milimetric: yes, but i'm still asked for Login password for 'vagrant' [20:17:15] try fab --user bmansurov [20:17:21] or however you pass that parameter [20:17:58] milimetric: looks like it's working. thanks [20:18:19] bmansurov: cool - that user thing is something I have setup in my ssh config [20:18:21] want me to paste it? [20:18:42] milimetric: that'd be great [20:18:52] https://www.irccloud.com/pastebin/BEy6u6Gq [20:19:09] thanks [20:19:19] np [20:23:43] milimetric: 2 of the graphs in the other tab are still missing, is it safe to run 'mobile_reportcard deploy'? [20:24:07] bmansurov: mobile_reportcard deploy does nothing to get the data [20:24:25] the whole limn-deploy system is just to deploy the dashboarding files, any graphs, datasources, etc. [20:24:35] the datafiles and the sql run on a totally different server in prod [20:24:50] milimetric: I changed some sql statements and would like to see them deployed [20:24:51] the fact that the graphs show up there at all means that the dashboard has been deployed and is working fine [20:25:11] bmansurov: the sql is deployed automatically by puppet on stat1003 [20:25:40] running deploy instead of deploy.only_data with limn-deploy will actually deploy the latest version of LIMN, so it's not at all what you want and it affects other instances on that box [20:25:54] i see [20:26:11] milimetric: when I run those sql statements manually in stat1003, I see some results, but for some reason graphs are still missing [20:26:33] yes, bmansurov the generate.py that runs on stat1003 must be broken somehow or those queries are doing something unexpected [20:26:58] rtnpro has a great patch in gerrit that should help with troubleshooting that: https://gerrit.wikimedia.org/r/#/c/180828/ [20:27:09] milimetric: ok thanks [20:27:10] bmansurov: if you could review that, that would be useful to everyone I think [20:27:21] ok [20:30:11] ottomata: docs updated as setting hive's scratch dir fixed issues [20:40:36] ottomata: is hive.exec.scratchdir needed when jobs run on cluster on oozies scheduler? (rather than via cli) [20:41:43] in 1:1, with you shortly... [20:57:02] hey kevinator -- I have a conflict for today's meeting [21:05:52] (CR) Bmansurov: Integrated logging (3 comments) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180828 (owner: Rtnpro) [21:09:14] nuria__: i think so, i think it wouldn't matter [21:09:19] i'm not really sure why it is needed [21:13:26] ottomata: sounds like this is teh issue: https://github.com/bloomberg/chef-bach/issues/26 [21:22:28] aye ja [21:22:36] so setting it like qchris suggests is looks right [21:25:08] wait, hangon [21:25:29] "hive-site.xml is missing hive.exec.scratchdir parameter. When a query is executed, using ODBC/JDBC that connects to a hiveserver2 process, temporary results are stored in directory specified by hive.exec.scratchdir parameter. " [21:25:40] huh. I had no idea JDBC wrote things to file :/ [21:27:45] Ironholds: it's probably hive rather [21:28:23] hrm [21:28:40] something to bear in mind when we get the hive server its own machine HINTHINTHINT [21:28:41] Ironholds: the query for apps now runs on oozie , still need to parametize it some more but a daily run takes about 10 hours (no sampling) [21:28:45] yay! 
[21:31:33] Ironholds: i am testing some sampling to see how does that work but volume of apps requests is so low that i do not have high hopes for sampling [21:31:49] yeah :/ [21:31:51] a blessing and a curse [21:32:57] ottomata: any graphs for messages per second per partition? [21:34:13] Ironholds: that directory is in hdfs [21:34:22] aha [21:34:58] Ironholds: you've got app view logic in this pageview patch [21:35:10] would a udf that filtered those speed up nuria__'s work? [21:35:22] * Ironholds thinks [21:35:33] I dunno. We could find out? [21:35:43] pretty easily [21:36:03] the basic question is whether identifying a request as being from an app in Java takes more time than the equivalent structure in hive [21:36:31] ottomata: i do not think so [21:37:09] ottomata: the (main) problem is the volume of data of the query , whether teh text parsing happens in hive [21:37:30] or udf i doubt it will make much difference [21:37:39] cc: Ironholds [21:37:44] * Ironholds nods glumly [21:37:46] believe so [21:38:01] I wish we had something consistent to filter on in addition to Y/M/D/H/varnish [21:38:06] *partition on [21:38:23] ...huh. wait. [21:38:23] we DO. [21:38:25] ottomata? [21:38:31] how much of a pain is changing the partitions scheme? [21:38:57] because, if we were to additionally filter by MIME TYPE....almost every query would involve fewer mappers. Certainly every pageview-related query. [21:39:10] Ironholds: i do not think we need diff partitions, we need ETL [21:39:28] ahh, the utopian and mythical ETL ;p [21:40:20] Ironholds:for example, in this case we could have a query every hour that idenfies apps requests and stores them somewhere [21:40:28] to be retrieved on a daily basis [21:40:29] yerp [21:40:52] Ironholds: but that will be another iteration, i believe. [21:44:00] yup, agree with nuria__'s sentiment [21:44:07] nuria__: why do you have so many underscores in your name? [21:44:45] ottomata: my irc client keeps tagging them along ... i am like "whatever" [21:45:09] ottomata: when i get to 10 i will take action [21:45:38] hha [21:45:40] can you just type [21:45:44] /nick nuria [21:45:59] :) [21:46:51] ottomata: OHHHHHHHH [21:48:24] Ironholds: do you think that the udf shoudl lowercase the uri_host [21:48:25] ? [21:48:36] the pageview udf? [21:48:50] re. the comment here on line 49 https://gerrit.wikimedia.org/r/#/c/180023/15/refinery-hive/src/main/java/org/wikimedia/analytics/refinery/hive/IsPageviewUDF.java [21:52:36] ottomata, I don't know. I haven't encountered any upper-case URLs in the hive logs but they're certainly in the sampled ones [21:53:44] well, if you think they should be normalized like that, i.e. en.wikipedia.org should be counted the same as En.wikipedia.org, then we should probably lowercase them, eh? [21:53:50] do you think it will hurt? [21:54:54] nope! [21:55:01] I just don't know if they're present in caps! [21:55:29] the old sampled logs come from a different source and I'm not sure if christian's filtering already lowercases when it splits the URLs or not [21:55:43] * Ironholds is totally ignorant of this domain, remember. [21:55:49] ok, I'm going to go ahead and add it, just in case then. [21:56:17] Analytics-EventLogging: Get researchers access to right servers so that they can look at logs & quickly see that things are working as expected (events that fail validation) - https://phabricator.wikimedia.org/T85027#937142 (ggellerman) NEW [21:56:18] hmm, or maybe not [21:56:22] Ironholds: What is "christian's filtering"? 
[21:56:53] whatever happens between varnish getting a request and it appearing in hadoop [21:57:07] I mean, the URL gets split in three at some stage. Does the host get lower-cased or anything? [21:57:07] haha [21:57:13] no filtering happens [21:57:18] that is whatever varnish sets [21:57:27] varnish gives us the fields as separate [21:57:30] Analytics-EventLogging: Pipe events that fail validation - https://phabricator.wikimedia.org/T85028#937149 (ggellerman) NEW [21:57:34] the udp2log stuff (that gnerates sampled logs) joins them into one [21:57:45] yup. [21:58:48] (PS16) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 [21:59:00] yet unresolved things on that patch… [21:59:09] hhu! [21:59:11] *huh [21:59:22] yet another reason to hate udp2log [21:59:49] - Class name (oh boy!) [21:59:49] - ¿testing? [21:59:49] - app Regexes and method questions [21:59:55] Analytics-EventLogging: Pipe events that fail validation - https://phabricator.wikimedia.org/T85028#937160 (ggellerman) [22:00:03] ? [22:00:16] Analytics-EventLogging: Get researchers access to right servers so that they can look at logs & quickly see that things are working as expected (events that fail validation) - https://phabricator.wikimedia.org/T85027#937161 (ggellerman) [22:04:19] Ironholds: [22:04:29] let's do that last one first: [22:04:29] https://gerrit.wikimedia.org/r/#/c/180023/15/refinery-core/src/main/java/org/wikimedia/analytics/refinery/Pageview.java [22:04:47] see comments on line 62 and line111 [22:04:54] mainly 111 [22:05:42] can we change that too [22:06:11] (contentTypesSet.contains()... && !apiHit || isAppPageRequest()) [22:07:26] ottomata: That whole condition is rather ... hard to read. [22:07:43] So much decoration (like repeated finds/matchers) [22:07:48] that part starting atline 109? [22:07:52] And somewhat no clear structure/order. [22:07:55] oh i fixed that qchris [22:07:58] see latest patch [22:08:01] Oh. [22:08:07] AWESOME! [22:08:16] ottomata, totally! [22:08:18] go nuts [22:08:29] ok, will do what qchris says then, wasn't sure if the logic matched up [22:08:42] Ironholds: can you satisfy qchris' desire for many more tests? [22:09:07] also, Ironholds, just to make sure the method name is right [22:09:08] Meh. Don't call me out. I have no say there. [22:09:11] isAppPageRequest [22:09:18] you know me, I love testing [22:09:28] qchris, you realise I write unit tests for fun, right? [22:09:41] when I run out of work to do, I go find an R package that the OpenGov or OpenScience team are maintaining and write unit tests for it. [22:09:48] that's my recreational activity [22:09:50] I'd probably do this even if you DIDN'T want it ;p [22:09:54] we are checking content type, user agent, and that query has 'sections=0' [22:10:03] those things qualify a request as an AppPageRequest? [22:10:10] * Ironholds thinks [22:10:15] yes [22:10:32] sections=0 and application/json when coming from the app exclusively corresponds to "requests for the lede of an article, in readable format" [22:10:45] oh and api.php in path [22:10:57] ok, cool [22:11:02] the app actually makes either 1 or 2 requests for a page, depending; sections=0 and then sections=[sequence of sections] [22:11:09] so we have to dedupe, but the second one is only made if there IS >1 section [22:11:12] so, sections=0 it is. [22:13:36] Ironholds: do you want to only check for string contains api.php? 
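Pulling Ironholds' qualification together — api.php in the path, sections=0 in the query, an app content type and user agent — the check reads roughly as below in Python. The real isAppPageRequest is Java, in the refinery Pageview class, and the user-agent marker here is an assumption, not copied from the patch; the contains-versus-starts-with question ottomata just raised is settled a little further down.

```python
def is_app_page_request(uri_path, uri_query, content_type, user_agent):
    """Rough, illustrative Python rendering of isAppPageRequest."""
    return (
        'api.php' in uri_path                    # contains(), not startsWith()
        and 'sections=0' in uri_query            # request for the article lede
        and content_type.startswith('application/json')
        and 'WikipediaApp' in user_agent         # assumed app UA marker
    )
```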
[22:13:43] or path starts with /api.php? [22:13:56] starts with makes more sense, actually [22:14:00] oh i guess it is /w/api.php? [22:14:03] is it always that? [22:14:06] * Ironholds thinks [22:14:08] should be, yes. [22:14:13] ok [22:14:22] will it always == that exactly? [22:14:50] I have no indication that it varies, from my work [22:14:57] but if it does we'll find out when we start hammering on the def :) [22:15:11] great. [22:15:12] I just wasn't sure if there was a performance difference between string.substrin(0,10).equals("api.php?") and just contains(), is the only reason it's implemented that way. [22:15:15] * qchris coughs. [22:15:21] qchris, ? [22:15:37] Ironholds: It's never an issue to check something if your not sure. [22:15:53] ;-) [22:16:11] Just run a query that has "api.php" in uri_path, and is not "/w/api.php". [22:16:17] And see if you get results. [22:16:19] then let me rephrase as "I don't know for certain, and I do not currently have the time to check because I am also writing a session reconstruction methodology this quarter, and it's December 19" ;p [22:16:33] But, point well made; I should check when I have the space to do so. [22:16:47] i will check [22:23:17] according to my sources there are api paths with //w/api.php and /w//api.php, so we will stick with stringContains :p [22:23:51] (PS1) Milimetric: Add basic flask server with highcharts [analytics/abacist] - https://gerrit.wikimedia.org/r/181179 [22:25:01] ori: I got a basic graph working ^ [22:25:21] not very pretty but it allowed me to play with highcharts which was interesting [22:25:35] you should be able to spin it up as I mention in the README [22:25:47] I may have to go soon but I'll leave IRC in case you drop me a line [22:30:11] (PS17) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 [22:31:59] (CR) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (14 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:32:16] * milimetric is gone but ghosting in case people need to contact him [22:32:40] qchris: so, ok, i think I responded to all your stuff and fixed [22:32:45] except for tests (Ironholds is on it! :?) [22:32:49] and class name! [22:32:57] * qchris looks at it again. [22:33:00] let us now commence another rousing naming debate [22:33:02] before I leave for the weekend [22:33:06] k [22:33:11] (I have some orange goop to prepare) [22:33:21] * qchris looks up "goop" [22:33:23] haha [22:33:32] (CR) OliverKeyes: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:33:37] Hu? [22:33:40] hahaha [22:33:43] hehe [22:33:50] (I'm making a filling for a ravioli dinner party) [22:34:01] Mhmmm ravioli :-) [22:34:05] ok soOOOoooo [22:34:09] I want to call this Webrequest! [22:34:16] if you had MORE pageviews [22:34:23] defintions* [22:34:29] then your methods would just be named appropariately [22:34:38] and you could call that method on any webrequest (that would have the same data) [22:34:49] conceptually all webrequests have the same data. 
[22:34:57] some logic might consider them one of oliver's pageview [22:35:03] another one of oprah's pageviews [22:35:10] do we really want to make a class for every definition> [22:35:11] ? [22:35:38] * qchris checks Kraken again. IIRC it had more than one definitions in a file, and it was hard to read. [22:36:02] hm>> i suppose if we needed to add lots of regexes for each def. [22:36:02] Analytics-EventLogging: Create a new read-only permission group on vanadium for people to be able to access the original log and set the appropriate restrictions to make sure these users don’t perform computationally intensive operations - https://phabricator.wikimedia.org/T85027#937217 (ggellerman) [22:36:06] hmMMm [22:36:17] IF so. TheN! [22:36:20] ottomata, well, plus [22:36:25] it's not just more regexes, it's more methods. [22:36:34] So we call it webrequests, right? It's for webrequests-related functions [22:36:38] I would make the UDF use a Webrequest class that calls out to these other classes [22:36:45] and we rename all the regexes and strings so it's readable, fine [22:36:56] legacyExcludedUriPaths or whatever [22:37:08] ..and then we add an x_analytics parser [22:37:14] ..and some wrapper methods for that to make it human-usable [22:37:31] Analytics-EventLogging: investigate generating a log of events failing validation and rsyncing it more frequently than the complete log - https://phabricator.wikimedia.org/T85028#937219 (ggellerman) [22:37:45] and then maybe we add a urldecoder wrapper so people don't have to reflect weirdly... [22:37:49] ok ok, I guess i'm ok with more single purpose classes, but I like the idea of modeling webrequest....AND i'm not sure what we could call this [22:37:56] certainly not 'NewPageviewDef' [22:38:18] * qchris cannot find the relevant part in kraken repo. I guess I was wrong then. [22:38:42] qchris: https://github.com/wikimedia/kraken/tree/master/kraken-generic/src/main/java/org/wikimedia/analytics/kraken/pageview [22:38:43] ? [22:38:49] ottomata, you want me to come up with a project name for it? :D [22:38:52] Analytics-EventLogging: design a system to assist software developers and researchers to perform automated data unit testing before pushing to production - https://phabricator.wikimedia.org/T85032#937226 (ggellerman) NEW [22:38:54] haha, no [22:39:24] Ironholds: is this not the canonical pageview def we are trying to use? [22:39:44] ottomata: about the kraken file .... No. I remembered a part where one could have a parameter that allowed to specify against which pageview definition to check. [22:39:52] ottomata, yep [22:40:02] I was gonna call it HeraclesViews :( [22:40:09] (he slew the kraken, in greek mythology) [22:40:14] ah [22:40:48] can we just call this Pageview then? and other pageviews will be named accordingly? [22:40:56] Pageview, PageviewLegacy [22:40:56] etc. [22:40:56] makes sense [22:40:58] ? [22:41:07] Ironholds: Give us more facts about the definition, like who is the primary customer? what is the main use case? [22:41:08] and then in three years ReplacementOttomata will be all [22:41:13] Maybe that can dictate a name? [22:41:34] let's just go for "Pageview"? I think the customer is meant to be everyone and the main use case is everything (no pressure, right? ;p) [22:41:39] haha [22:41:47] I would also accept OghmaPageview [22:41:47] if the definition changes, so will the refinery version [22:42:06] Pageview will be versioned. [22:42:07] yep [22:42:13] but not all descriptions of grahps, papers and reports will automatically change. 
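The convention being settled on here — graphs and reports cite a definition version, with changes logged on the meta document — suggests the class carries its version explicitly; a hypothetical illustration:

```python
class Pageview:
    # Bumped on every definition change, so a report can state
    # "compiled using Refinery Pageview Definition 0.0.3".
    DEFINITION_VERSION = '0.0.3'

    @staticmethod
    def is_pageview(uri_host, uri_path, content_type, http_status):
        raise NotImplementedError  # definition logic lives here
```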
[22:42:17] in that case, maybe we should call it PolymorphicPageviews [22:42:37] nope, but those can refer to a version, if they like [22:42:42] yep [22:42:47] the plan is to use the meta document to log changes [22:42:52] "this graph was compiled using Refinery Pageview Definition 0.0.3" [22:42:56] "current version is foo, stable as of this commit" [22:43:16] "[this commit] and [diff] incremented version to 1.2.1 to solve [problem]" [22:43:56] Mhmm. [22:44:04] Analytics-EventLogging, Analytics-Engineering: EL office hours - https://phabricator.wikimedia.org/T76796#937236 (kevinator) [22:44:32] It seems you both want "Pageview". [22:44:46] I'm also happy with BiblePageview [22:44:50] Analytics-EventLogging: Create a new read-only permission group on vanadium for people to be able to access the original log and set the appropriate restrictions to make sure these users don’t perform computationally intensive operations - https://phabricator.wikimedia.org/T85027#937237 (DarTar) thanks @ggell... [22:44:55] only that implies we won't document the changes and will insist it never altered. [22:44:57] soo...maybe not. [22:45:10] Does "Pageview" not sound too generic. Like "The one and only gapeview definition". [22:45:14] but it does mean I get to respond to "are the numbers reliable this time?" with "this time they're BIBLICAL" [22:45:19] s/gape/page/ [22:45:39] hah, it is the canonical pageview definitiion [22:45:44] yes, but [22:45:47] so, calling it just Pageview sounds rigiht [22:45:54] what if different customers have different needs in the future? [22:45:56] I thought at this point they are an experiment? [22:45:58] like, I get what qchris is saying. [22:46:14] Analytics-Visualization, Analytics-Engineering: [Volunteer] Improve Generate.py [13 pts for the Analytics Eng team] - https://phabricator.wikimedia.org/T76407#937238 (kevinator) [22:46:20] if they have needs for a different type of definition, they will get a new class [22:46:31] at this point the current version of the def and implementation are an experimentation. But the idea is that the experiment will evolve into a canonical version, which will be used everywhere and itself evolve as infrastructure changes [22:46:41] maybe we should just call it GrandfathersAxe on that basis. [22:46:42] Analytics-Visualization: Improve logging - https://phabricator.wikimedia.org/T84892#937239 (kevinator) [22:46:51] "it's the same as it was last year! All the code has changed, but..." [22:47:07] Ok. I have the feeling I only add to bikeshedding. Hence, I am fine with "Pageview". [22:47:11] heh [22:47:21] Well, but.. [22:47:21] I am doing totally fine on bikeshedding by myself. I could bikeshed in an empty room ;p [22:47:28] (Just kidding. I don't have an opinion.) [22:47:54] ori, as a philosopher I would hope you'd at least endorse my awesome Plutarch reference [22:48:13] I never read Plutarch! (Don't tell anyone.) [22:48:40] you're not missing out on much [22:48:55] personally I really like Sider and the perdurantarist approach to that problem [22:49:12] but then Perdurantism is my general life philosophy, so. [22:49:47] uhhhh, ok Pageview it is! [22:49:51] k [22:50:22] aw but I wanted to discuss whether objects were best treated as existing in four dimensions or not :( [22:50:44] (CR) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:50:57] Ok! So! 
Ironholds, we are getting mighty close! [22:50:59] fill up those tests! [22:51:04] yessir! [22:51:08] it is now time for orange goop prep [22:51:09] I just need to finish the session docs [22:51:12] enjoy your orange goop [22:51:31] qchris: thank you for your bikeshedding, have a good weekend all! [22:51:57] ottomata: Have a nice ravioli party! [22:52:14] (CR) QChris: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:53:18] take care! [22:53:31] (PS1) Bmansurov: Timebox before summing. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/181204 [23:32:48] (CR) QChris: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata)
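The tests ottomata is asking Ironholds to fill in are JUnit tests in refinery-source; the shape is table-driven cases against the definition, sketched here in Python with hypothetical example requests, reusing the illustrative predicate from earlier in this log:

```python
import unittest

def is_app_page_request(uri_path, uri_query, content_type, user_agent):
    # Illustrative predicate from earlier in this log, repeated so the
    # test file is self-contained.
    return ('api.php' in uri_path and 'sections=0' in uri_query
            and content_type.startswith('application/json')
            and 'WikipediaApp' in user_agent)

class TestAppPageRequest(unittest.TestCase):
    def test_app_lede_request_counts(self):
        self.assertTrue(is_app_page_request(
            '/w/api.php', 'action=mobileview&sections=0',
            'application/json', 'WikipediaApp/4.0'))

    def test_doubled_slash_paths_still_count(self):
        # //w/api.php and /w//api.php occur in the logs (per 22:23),
        # which is why the path check is contains(), not startsWith().
        self.assertTrue(is_app_page_request(
            '//w/api.php', 'sections=0', 'application/json',
            'WikipediaApp/4.0'))

    def test_non_app_user_agent_does_not_count(self):
        self.assertFalse(is_app_page_request(
            '/w/api.php', 'sections=0', 'application/json', 'Mozilla/5.0'))

if __name__ == '__main__':
    unittest.main()
```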