[01:49:55] (PS1) Nuria: [WIP] Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 [04:50:40] operations, Wikimedia-Site-requests, Analytics-EventLogging: wikitech.wikimedia.org error "$wgEventLoggingBaseUri is not set." - https://phabricator.wikimedia.org/T84965#935588 (Krinkle) NEW [08:36:16] Multimedia, MediaWiki-extensions-MultimediaViewer, Analytics: Create MediaViewer image varnish hit/miss ratio dashboard - https://phabricator.wikimedia.org/T78205#935804 (Gilles) The theory doesn't seem to hold true, the vast majority of varnish misses with a very small "Age" value have a very old Last-Modifi... [09:43:04] MediaWiki-extensions-MultimediaViewer, Analytics, Multimedia: Investigate if pre-rendering images is having an impact on performance - https://phabricator.wikimedia.org/T76035#935895 (Gilles) Now that I have more data I can confirm that only varnish hits are affected by the phenomenon where images uploaded re... [09:57:54] MediaWiki-extensions-MultimediaViewer, Analytics, Multimedia: Create MediaViewer image varnish hit/miss ratio dashboard - https://phabricator.wikimedia.org/T78205#935925 (Gilles) What's interesting in those findings, though, is that 99.34% of varnish misses are swift pulls, regardless of upload time. Which wo... [11:08:50] (PS1) OliverKeyes: [WIP] UDF for identifying if a request meets the legacy pageview definition. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/181049 [11:28:17] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936099 (QChris) NEW a:Ottomata [11:50:48] operations, Analytics: Move stat1001, stat1002 and stat1003 into Analytics VLAN - https://phabricator.wikimedia.org/T76346#936139 (akosiaris) [11:50:50] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936137 (akosiaris) Open>Resolved I updated the relevant rule on cr{1,2}-eqiad and verified that now stat1003 can connect to db1031 (x1-analytics-slave). Marking as resolved [12:02:05] operations, Analytics: Allow stat1003 to connect to x1-analytics-slave.eqiad.wmnet - https://phabricator.wikimedia.org/T84990#936160 (QChris) Whoa! That was quick. Thanks :-) [14:54:34] Analytics-Refinery: Import Mediawiki XML dumps into HDFS - https://phabricator.wikimedia.org/T76347#936346 (Ottomata) [14:59:43] (CR) Mforns: [C: 1] "LGTM" [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [15:04:17] MediaWiki-User-blocking, Analytics: Make a SpecialPage to show stats on blocked IP (ranges) that attempt to edit - https://phabricator.wikimedia.org/T78840#936359 (Aklapper) p:Triage>Volunteer? [15:57:52] (PS2) Milimetric: Add external link icon to metric name [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 [16:00:38] (CR) Milimetric: "Agreed with Nuria's comment too. The icon can't really be shifted without super hacky css, because it's a font. But the underline was ug" [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [16:17:28] (CR) Ottomata: "Cool, will look over this. Qchris, Ironholds, this is kinda why I wanted the class in the other change to be called 'Webrequest'. I see " [analytics/refinery/source] - https://gerrit.wikimedia.org/r/181049 (owner: OliverKeyes) [16:18:45] ottomata, I don't get the "wanted the class in the other change..." 
bit [16:19:41] I think there should be one refinery-core class named Webrequest [16:19:44] that has methods [16:19:49] isPageview [16:19:49] isLegacyPageview [16:20:14] hey Andrew - have you played with the Hortonworks Sandbox? [16:20:15] http://hortonworks.com/products/hortonworks-sandbox/ [16:28:01] ah cool, no i have not! [16:28:18] cool in virtual box! [16:30:35] ottomata, that makes sense, but I worry about that class getting overloaded [16:32:04] don't make me make a Webrequest wrapper class! I will do it! :) [16:32:24] or maybe i won't [16:35:21] ottomata: you said teh cluster takes about 135K requests per sec, right? [16:35:52] yeah, on average, that was just a guess [16:36:01] btw, i'm watching a linked in kafka presentation [16:36:07] they peak at 3.25 million requests / second [16:36:09] :o [16:43:14] ottomata: do we have a graph of the intake of events? [16:43:44] http://grafana.wikimedia.org/#/dashboard/db/kafka [16:44:06] (the varnishkafka one at the bottom is not complete, that is WIP) [16:46:51] ottomata: is there an analytics server that has enough spare capacity to run a 1gb redis instance and a small python webapp? [16:47:08] permanently or just for gun? [16:47:13] fun* [16:47:22] does webapp need to be public? [16:48:04] nope [16:48:11] well, ideally behind misc-varnish with ldap auth [16:48:22] and permanently, i think [16:48:56] k, probably stat1001 will be fine [16:49:09] its beef webserver thingee, stats.wikimedia.org lives there [16:49:30] -/+ buffers/cache: 1125 31037 [16:50:57] yay [16:52:32] ori, whatcha doin? [16:53:25] i want to experiment with an event consumer for blog stats that updates counters as opposed to doing repeated table scans [16:54:12] i drove tilman insane by having a massive writer's block on the hhvm blog post so i'm looking to make it up to him somehow :P [16:54:20] ha :) [16:54:30] curious, what's up with statsv? are we using it? [16:54:50] yes! ve is using it to report metrics, and a consumer is piping them into graphite [16:54:58] the api is still taking shape [16:55:02] oh, really? what's your consumer? [16:55:28] this is the js piece: https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/modules/ext.wikimediaEvents.statsd.js [16:55:30] roan wrote it [16:56:16] and that's the python consumer: https://github.com/wikimedia/analytics-statsv/blob/master/statsv.py [16:56:35] very rudimentary, config values hard-coded, all the bad things [16:56:40] don't judge me! [16:56:48] * ori hides. [16:57:31] ah cool! where is that consumer running? [16:57:41] hafnium [16:58:31] the js piece is the producer? [16:59:12] yes and no; it's what is generating the request to the url endpoint, but it is also a consumer itself, since we have a pub/sub api in javascript land [16:59:32] so the metric reporting is done in ve, but using mw.track(), which is agnostic about how the metrics are actually reported [17:00:10] the mw.trackSubscribe( 'timing.', function ( topic, time ) { ... } ) registers the statsv code as a handler for client-side timing events [17:00:27] hm, ok [17:00:40] this allows developers to instrument their code without tightly coupling it to wikimedia-specific infrastructure [17:00:44] aey [17:01:14] it's all a bit rough brick-a-brack right now but it'll acquire some polish soon, i promise [17:01:20] haha, cool [17:01:24] this looks awsome [17:01:46] the kafka stuff is awesome! i am a convert [17:02:04] woohoo! [17:02:09] this is from statsv? 
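Backing up to ottomata's 16:19 proposal — a single refinery-core class named Webrequest carrying one method per pageview definition — the shape he is describing would be roughly as below, rendered in Python purely for illustration (the real code is Java, in analytics/refinery/source):

```python
class Webrequest:
    """Illustrative sketch only: models one row of the webrequest log,
    with one method per pageview definition, as proposed at 16:19."""

    def __init__(self, uri_host, uri_path, uri_query,
                 content_type, user_agent):
        self.uri_host = uri_host
        self.uri_path = uri_path
        self.uri_query = uri_query
        self.content_type = content_type
        self.user_agent = user_agent

    def is_pageview(self):
        # New (generalised-filters) definition would be implemented here.
        raise NotImplementedError

    def is_legacy_pageview(self):
        # Legacy (webstatscollector-era) definition would go here.
        raise NotImplementedError
```

The thread comes back to this design around 22:34 and lands on separate per-definition classes (Pageview, PageviewLegacy) instead.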
[17:02:10] http://grafana.wikimedia.org/#/dashboard/db/vetem [17:02:29] yep [17:03:39] apart from general cleanliness one of the big TODOs is to have the python backend quarantine new metrics until they are reported by sufficiently many distinct IPs [17:04:00] as a way of making it harder to abuse by just generating random garbage metric names and flooding graphite that way [17:04:53] someone determined could still spam from multiple IPs, but it requires substantially more effort, and doesn't really give the attacker anything [17:05:30] aye, cool [17:05:55] someone could garbage up already existing metrics though, ja? [17:06:20] ori this is so coooool, there is so much potential here [17:06:32] yeah, we probably want to throttle clients too [17:06:42] could varnish do that? [17:07:11] possibly, but i think it's better to just blackhole it in python [17:07:22] well, except, it'd be nice to keep someone from DoSing kafka [17:07:33] or, at least trying [17:07:49] hmmm [17:08:30] ori, congrats :) [17:08:39] thanks! [17:08:42] does this mean you have to grow a beard now and start opening conversations with [17:08:49] "when I was writing COBOL, not that you were even ALIVE then..." [17:08:50] ? [17:09:17] i hope not :) [17:09:20] actually, turns out my landlord used to write COBOL. That was a fun discussion [17:09:23] rob didn't mention that [17:09:34] "[lots of cobol things]" "...I know that COBOL is 1-indexed! Go me!" [17:12:59] ori, ottomata I think the backend will get dos'ed before kafka [17:13:22] let's test the theory! [17:13:39] I think you can just do the math [17:13:47] everyone go to https://en.wikipedia.org/wiki/Barack_Obama?action=purge and stick a brick on your F5 key. [17:13:50] tnegrin: probably true [17:14:49] yeah probably true [17:14:49] when we built these things at yahoo, the hardest problem was when the message bus filled up and we couldn't deque fast enough [17:15:06] can we flush kafka? [17:15:27] no, its got a 7 day buffer. I think we could make the buffer messages size based instead of time based [17:15:40] so, only keep 15B messages, or whatever [17:15:52] and drop at the head? [17:15:56] yes [17:16:05] but ja, this service is only sitting on 4 bits varnishes [17:16:09] https://www.varnish-cache.org/vmod/throttle [17:16:32] what's the backend? python batching and feeding graphite? [17:16:47] not even sure the 4 varnishes would be able to DoS Kafka if they maxed sending [17:17:04] tnegrin: python consuming from kafka, parsing message, and feeding statsd [17:17:10] yeah, i linked to it above (https://github.com/wikimedia/analytics-statsv/blob/master/statsv.py). calling it a "backend" is getting a bit fancy, it's half a page of basic python [17:17:10] which feeds graphite [17:17:16] ori: interesting, just read Roan's code [17:17:20] call it a consumer [17:17:25] milimetric: hey! [17:17:26] do you see this replacing event logging? [17:17:37] it's basically a race between the kafka disks and the stated disks [17:17:53] milimetric: i don't think so; it's a distinct use-case [17:18:08] handling more throughput, etc. [17:18:15] what's the statsd backend? [17:18:28] statsd is like a monitoring middleman [17:18:39] can aggregate and do some statistics on metrics before sending them to monitoring services [17:18:42] like grahpite/ganglia, etc. [17:18:43] where does the data ultimately get stored? [17:18:50] graphite [17:18:58] milimetric: did HaeB approach you about a dashboard for the blog? [17:19:04] so log files on flash/stats? [17:19:07] yes a while back [17:19:12] ? 
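The statsv pipeline described above — varnishkafka producing to Kafka, statsv.py consuming, parsing each message, and feeding statsd, which feeds graphite — reduces to a very small loop. A sketch of that shape, assuming the kafka-python client; the topic, broker, statsd endpoint, and message format here are illustrative, not statsv's actual configuration:

```python
import socket

from kafka import KafkaConsumer  # assumes the kafka-python package

STATSD = ('statsd.example.wmnet', 8125)  # illustrative statsd endpoint

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
consumer = KafkaConsumer('statsv', bootstrap_servers='kafka.example:9092')

for message in consumer:
    # Assume each Kafka message carries metric=value pairs from the
    # beacon's query string, e.g. 'timing.ve.load=1200ms&ve.save=1c'.
    for pair in message.value.decode('utf-8').split('&'):
        name, _, value = pair.partition('=')
        if not (name and value):
            continue
        # Translate to statsd's wire format: <metric>:<value>|<type>
        if value.endswith('ms'):
            packet = '%s:%s|ms' % (name, value[:-2])
        else:
            packet = '%s:%s|c' % (name, value.rstrip('c'))
        sock.sendto(packet.encode('utf-8'), STATSD)
```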
[17:19:20] ottomata: statsd backend? [17:19:22] ori: but kevin never scheduled it [17:19:22] milimetric: want to pair up on it and surprise him? [17:19:32] :) sure [17:19:45] happy to help - right now we're on "Christmas break" in terms of scheduled stuff [17:19:54] so I'm playing with Storm [17:20:01] tnegrin: not sure what you are asking [17:20:19] how does graphite store the data that gets send via statsd? [17:20:24] btw - for experimentation with Hadoop clusters, the Hortonworks Sandbox is great: http://hortonworks.com/products/hortonworks-sandbox/ [17:20:26] ah yes, on disk somehow [17:20:28] dunno [17:20:37] ori: so how can I help? where are you at [17:20:48] yeah -- so that's the race [17:20:52] milimetric: give me two minutes and i'll throw up a gist of the python bit i'm hacking on [17:23:21] hey qchris, when you get a sec, look over the IsPageviewUDF again? is there anything else you think we should do? [17:24:15] sure. [17:24:46] Analytics: Create Geocode Hive UDF for Refinery - https://phabricator.wikimedia.org/T76349#936529 (Ottomata) [17:24:49] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936530 (Ottomata) [17:28:43] milimetric: https://gist.github.com/atdt/85166c6813c86497a2ed [17:28:54] looking [17:30:51] basically if there's a page view on blog.wikimedia.org/some-post, the following counters get incremented: lifetime total views of the post, hourly views for the post, weekly, monthly, yearly. apart from the lifetime totals, the other keys are rotated by having an expiration time set based on a retention policy [17:31:34] qchris may be interested in this too [17:31:49] I am reading along :-) [17:31:53] It's great. [17:32:27] ori: ok [17:33:14] so you want to graph those things, once they're in redis? [17:33:38] yeah [17:33:57] so what's going to generate the "events" list [17:34:12] oh doh, heh, that should read [17:34:26] for meta in iter(sock.recv_json, None): [17:34:33] gotcha [17:34:56] apart from graphs there should be simple tabular presentation of the data, i.e. for a given post you should be able to see the list of top referrers + count for each [17:35:00] i saw you and andrew talking about some redis instance he's setting up? [17:35:02] is that for this/ [17:35:18] i aint setting it up! ori has powers! [17:35:24] heheh [17:35:28] but yeah, that is what i was asking about [17:35:36] sorry i'm jumping in the middle - i'm not sure what all the details are [17:36:09] nobody is, this is the product of some half-brained late-night hacking, haven't thought it through yet [17:36:11] ok, so these things are coming on the normal EL stream from vanadium, going into a *new* redis that's already set up, and *should* end up in a graph somewhere that... should be private right? [17:36:30] yeah, probably restricted to wmf ldap [17:36:39] it's not already set up [17:37:19] ok, ottomata do we have anything we can serve web content on and can restrict to wmf ldap? [17:37:21] if we had redis 2.8 we could use hyperloglog for uniques [17:37:44] milimetric: it's easy to do; there's an apache config snippet we reuse [17:37:47] let me dig up an example [17:38:17] stat1001 is ripe for the choosing,. ori has powers, go ori! [17:38:33] ok, just wondering more from an ops "are we allowed to do this" point of view [17:38:43] https://github.com/wikimedia/operations-puppet/blob/production/templates/graphite/apache-auth-ldap.erb [17:39:30] sure, why not? 
graphite and logstash are conceptually similar (they're web interfaces to log data) and they're behind ldap [17:40:22] ori: well we're talking about writing some new server (albeit simple) and reading data from redis then exposing it in a new UI, and deploying to prod cluster without security revie [17:40:30] nooooo [17:40:35] we'll ask, of course [17:40:42] i'm just saying, i don't expect any trouble [17:40:55] i'm not suggesting we bypass review [17:41:07] you wanted to do it today though? [17:41:41] we don't have to [17:41:59] but otherwise we wouldn't surprise anyone :) [17:42:13] it's the season for surprises, so I'm down [17:42:31] you think you could whip up a simple web ui today? that'd be pretty impressive [17:42:42] that wouldn't be a problem [17:42:51] but i don't think i could navigate all the policy nuances today [17:43:07] i can do that [17:43:09] ottomata: what can I use on stat1001? python / node / ? [17:43:18] python plz! [17:43:28] k, i guess flask then? [17:43:55] sounds fine [17:44:11] i actually don't think i can deploy this today since i have meetings until 4pm [17:44:31] but if you have the code ready i can do it whenever there's a spare ops person to review and babysit [17:44:35] bbiab [17:44:41] ori: ok i'll see if I can whip something up and I'll put it in my github for now [17:46:09] did someone say redis? [17:46:11] * YuviPanda reads backlog [17:46:23] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936574 (Ottomata) Ok, here is what we probably want: # A class in `refinery-core` containing geocoding methods. Something like: ## `getCountryCode(ipAddress)` - returns two letter country code. ## `getGeocodedData(ipAddress)` - r... [17:46:34] lunchtime, back laters! [17:48:45] Analytics-Refinery: Geo-coding UDF - https://phabricator.wikimedia.org/T77683#936577 (Ironholds) That makes sense. Are we going to use the v1 or v2 maxmind API? [18:06:14] milimetric: how about 'abacist' for a name? [18:06:22] http://www.merriam-webster.com/dictionary/abacist [18:11:33] (PS1) Ori.livneh: Add .gitreview [analytics/abacist] - https://gerrit.wikimedia.org/r/181105 [18:13:35] (CR) Ori.livneh: [C: 2 V: 2] Add .gitreview [analytics/abacist] - https://gerrit.wikimedia.org/r/181105 (owner: Ori.livneh) [18:13:45] (PS1) Ori.livneh: Initial commit. [analytics/abacist] - https://gerrit.wikimedia.org/r/181106 [18:18:14] indeed it looks like no data is been sent to logstash: [18:18:18] https://www.irccloud.com/pastebin/NZULkGUS [18:25:55] ori: ok :) [18:26:02] ori: do you want this flask server in the same repo? [18:26:22] milimetric: makes sense to me [18:30:35] nuria__: have you pinged bd808? [18:30:55] ori: is that gage? [18:30:59] no, bryan davis [18:31:03] do you know him? [18:31:18] ori: no, for the logsstash stuff? [18:31:22] yep [18:31:23] *logstash [18:31:39] orI: does he work in ops? [18:31:49] mediawiki-core [18:31:54] see #wikimedia-operations [18:54:16] (CR) Nuria: [C: 2] "Agreed it's better. Merging." [analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [18:54:26] (CR) Nuria: [V: 2] "Agreed it's better. Merging." 
[analytics/dashiki] - https://gerrit.wikimedia.org/r/180882 (owner: Milimetric) [18:59:22] (CR) QChris: [C: -1] [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (19 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [19:01:28] ori: i got some basics down, just trying to understand the data now [19:01:49] so we'll have 25 hours worth of hourly counts, 31 days worth of daily, etc. [19:02:01] yep [19:02:07] k, good [19:02:19] will ottomata be back today? [19:02:43] if you see him ask him to check out https://gerrit.wikimedia.org/r/#/c/181110/ [19:03:18] qchris, yt? [19:03:23] nuria__: yup. [19:03:35] But about to head to dinner. [19:03:43] So better be quick *nom nom* [19:03:53] ok, nevermind then, will ask otto when he's back [19:03:57] will do ori - puppet yay :) [19:04:01] see you monday qchris [19:04:04] nuria__: k [19:04:17] nuria__: I'll be back after lunch. [19:04:23] s/lunch/dinner/ [19:04:24] qchris: sorry about the review delay [19:04:28] will try to get to your patches today [19:04:39] enjoy lunch! [19:04:56] ori: No worries :-) duesentrieb and krinkle are starting do discuss objectsvs. arrays. I don't want to get in the way there anyways. [19:05:02] heheh [19:06:26] Wikimedia-Logstash, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936719 (bd808) [19:07:30] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936720 (Gage) [19:12:13] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936734 (Gage) [19:13:54] Wikimedia-Logstash, operations, Analytics-Engineering: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#936707 (Gage) [19:18:22] (CR) Jdlrobson: [C: 2] Add timestamp to query results. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180996 (owner: Bmansurov) [19:18:29] (Merged) jenkins-bot: Add timestamp to query results. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180996 (owner: Bmansurov) [19:19:58] milimetric: hi, in order to deploy limn-mobile-data, should I run this command from my local machine? fab mobile_reportcard deploy.only_data [19:22:23] bmansurov: yup! assuming you’ve access to the analytics wikimedia labs project [19:22:40] YuviPanda: ok thanks [19:23:48] YuviPanda: I'm getting this error: Fatal error: Couldn't find any fabfiles! [19:24:06] YuviPanda: I was inside root directory when I ran that command [19:24:23] bmansurov: root difrectory of what? [19:24:32] YuviPanda: limn-mobile-data git repo [19:24:32] bmansurov: of the limn-mobile-data clone? [19:24:36] yes [19:28:11] Analytics-Engineering: Provide a ua-parser service (ua-parser.wikimedia.org) using the One True Wikimedia UA-Parser™ - https://phabricator.wikimedia.org/T1336#936762 (Ironholds) [19:34:16] ottomata: hola [19:34:52] ottomata: following the breadcrumbs from hadoop into oozie i found thsi error on my hadoop job: [19:35:02] https://www.irccloud.com/pastebin/m4eYKE4g [19:35:28] ottomata: which i do not get as i can run hive queries just fine... 
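On the abacist side, the counter scheme from ori's gist plus the retention milimetric quotes at 19:01 (roughly 25 hourly buckets, 31 daily, and so on) comes down to INCR plus a TTL per time bucket, so stale buckets expire on their own. A minimal sketch, with key names and retention values invented for illustration:

```python
import time

import redis

r = redis.StrictRedis()

# strftime bucket format -> how long to keep that bucket (illustrative)
RETENTION = (
    ('%Y-%m-%dT%H', 25 * 3600),    # hourly counts, kept ~25 hours
    ('%Y-%m-%d', 31 * 86400),      # daily counts, kept ~31 days
    ('%Y-%m', 13 * 31 * 86400),    # monthly counts, kept ~13 months
)

def count_view(post_slug, now=None):
    """Record one view of blog.wikimedia.org/<post_slug>."""
    now = now if now is not None else time.time()
    r.incr('views:%s:all' % post_slug)  # lifetime total: never expires
    for fmt, ttl in RETENTION:
        bucket = time.strftime(fmt, time.gmtime(now))
        key = 'views:%s:%s' % (post_slug, bucket)
        pipe = r.pipeline()
        pipe.incr(key)
        pipe.expire(key, ttl)  # rotation: old buckets simply age out
        pipe.execute()
```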
[19:35:53] ha yeah, ellery had that problem too [19:35:57] i'm not entirely sure why, but [19:36:20] you can change hive.exec.scratchdir [19:36:21] to [19:36:34] hive.exec.scratchdir=/tmp/hive-nuria [19:36:38] one sec. [19:37:20] yeah [19:37:21] so [19:37:26] you can do that in the hive script [19:37:27] via set [19:37:31] but, ellery had problems with that [19:37:36] ori, k i got some fake data and the site working, now the hard part: how to show this in a graph [19:37:37] e.g. [19:37:38] you could do [19:37:39] https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration [19:38:00] however, qchris recommends doing it in the oozie configs [19:38:11] see his oozie_demo.tar.gz workflow.xml file that he sent to the internal list [19:38:17] it has [19:38:17] <configuration> [19:38:18] <property> [19:38:18] <name>hive.exec.scratchdir</name> [19:38:18] <value>/tmp/hive-${user}</value> [19:38:18] </property> [19:38:18] </configuration> [19:38:23] which makes sense to me [19:38:38] as part of the hive action [19:38:41] ottomata: ok, that is easy enough [19:39:01] ottomata: let me try and i will report in a bit [19:43:34] nuria__: fyi, you can get syntax highlighting in mediawiki with <source> tags [19:43:38] <source lang="..."> to specify language [19:43:42] ottomata: oohhhhh [19:43:46] ottomata: https://gerrit.wikimedia.org/r/#/c/181110/ :D [19:57:12] milimetric: maybe just expose a JSON API for now, and then do the graphing / layout in JS? [19:57:46] ori: I was gonna skip the fetching of data from json 'cause i thought the graphing was the hard part [19:57:47] but yea, doing it in js [19:57:55] *fetching from redis i mean [19:58:21] yeah the graphing is the hard part [19:59:07] it'd be nice to have a single beautiful graph to display all this but it kind of mixes too many concepts I think [19:59:41] ori, just curious, why are you doing this with redis + python? why not just => statsd and be done with it? [20:00:32] ottomata: i have to run meet with tilman and apologize for being such a horrible procrastinator with the blog post, but i can explain after. there are reasons :) [20:01:18] ok [20:06:50] Hey guys, I'm trying to deploy limn-mobile-data and fabric is asking me for login password for 'vagrant'. Can anyone tell me what it is? [20:07:04] Full message: [limn1.eqiad.wmflabs] Login password for 'vagrant': [20:07:12] just recite "I am the system administrator, my voice is my password" into your microphone [20:08:35] I did but wasn't allowed. Must be my accent. [20:13:37] bmansurov: something's gone wrong somewhere [20:13:39] are you on your local machine? [20:13:51] milimetric: yes, apparently I don't have access to limn1.eqiad.wmflabs [20:13:51] Ironholds: I forget, did we buy the maxmind 2 db files? [20:13:56] do we have them on our systems? [20:13:59] yup [20:14:00] same directory [20:14:02] can you ssh limn1.eqiad.wmflabs? [20:14:13] already built a library that integrates with the C API :D [20:14:16] milimetric: no, Connection closed by UNKNOWN [20:14:18] is your labs username bmansurov? [20:14:22] milimetric: yes [20:15:03] bmansurov: try now [20:15:41] milimetric: still the same error [20:16:04] milimetric: ok, it's working now, thanks [20:16:05] bmansurov: do you have eqiad.wmflabs set up to go through the bastion in your ssh config? [20:16:10] ok, good [20:16:11] milimetric: yes [20:16:23] bmansurov: limn-deploy now, give it a shot again [20:16:38] "fab mobile_reportcard deploy.only_data" right?
[20:17:03] milimetric: yes, but i'm still asked for Login password for 'vagrant' [20:17:15] try fab --user bmansurov [20:17:21] or however you pass that parameter [20:17:58] milimetric: looks like it's working. thanks [20:18:19] bmansurov: cool - that user thing is something I have setup in my ssh config [20:18:21] want me to paste it? [20:18:42] milimetric: that'd be great [20:18:52] https://www.irccloud.com/pastebin/BEy6u6Gq [20:19:09] thanks [20:19:19] np [20:23:43] milimetric: 2 of the graphs in the other tab are still missing, is it safe to run 'mobile_reportcard deploy'? [20:24:07] bmansurov: mobile_reportcard deploy does nothing to get the data [20:24:25] the whole limn-deploy system is just to deploy the dashboarding files, any graphs, datasources, etc. [20:24:35] the datafiles and the sql run on a totally different server in prod [20:24:50] milimetric: I changed some sql statements and would like to see them deployed [20:24:51] the fact that the graphs show up there at all means that the dashboard has been deployed and is working fine [20:25:11] bmansurov: the sql is deployed automatically by puppet on stat1003 [20:25:40] running deploy instead of deploy.only_data with limn-deploy will actually deploy the latest version of LIMN, so it's not at all what you want and it affects other instances on that box [20:25:54] i see [20:26:11] milimetric: when I run those sql statements manually in stat1003, I see some results, but for some reason graphs are still missing [20:26:33] yes, bmansurov the generate.py that runs on stat1003 must be broken somehow or those queries are doing something unexpected [20:26:58] rtnpro has a great patch in gerrit that should help with troubleshooting that: https://gerrit.wikimedia.org/r/#/c/180828/ [20:27:09] milimetric: ok thanks [20:27:10] bmansurov: if you could review that, that would be useful to everyone I think [20:27:21] ok [20:30:11] ottomata: docs updated as setting hive's scratch dir fixed issues [20:40:36] ottomata: is hive.exec.scratchdir needed when jobs run on cluster on oozies scheduler? (rather than via cli) [20:41:43] in 1:1, with you shortly... [20:57:02] hey kevinator -- I have a conflict for today's meeting [21:05:52] (CR) Bmansurov: Integrated logging (3 comments) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180828 (owner: Rtnpro) [21:09:14] nuria__: i think so, i think it wouldn't matter [21:09:19] i'm not really sure why it is needed [21:13:26] ottomata: sounds like this is teh issue: https://github.com/bloomberg/chef-bach/issues/26 [21:22:28] aye ja [21:22:36] so setting it like qchris suggests is looks right [21:25:08] wait, hangon [21:25:29] "hive-site.xml is missing hive.exec.scratchdir parameter. When a query is executed, using ODBC/JDBC that connects to a hiveserver2 process, temporary results are stored in directory specified by hive.exec.scratchdir parameter. " [21:25:40] huh. I had no idea JDBC wrote things to file :/ [21:27:45] Ironholds: it's probably hive rather [21:28:23] hrm [21:28:40] something to bear in mind when we get the hive server its own machine HINTHINTHINT [21:28:41] Ironholds: the query for apps now runs on oozie , still need to parametize it some more but a daily run takes about 10 hours (no sampling) [21:28:45] yay! 
[21:31:33] Ironholds: i am testing some sampling to see how does that work but volume of apps requests is so low that i do not have high hopes for sampling [21:31:49] yeah :/ [21:31:51] a blessing and a curse [21:32:57] ottomata: any graphs for messages per second per partition? [21:34:13] Ironholds: that directory is in hdfs [21:34:22] aha [21:34:58] Ironholds: you've got app view logic in this pageview patch [21:35:10] would a udf that filtered those speed up nuria__'s work? [21:35:22] * Ironholds thinks [21:35:33] I dunno. We could find out? [21:35:43] pretty easily [21:36:03] the basic question is whether identifying a request as being from an app in Java takes more time than the equivalent structure in hive [21:36:31] ottomata: i do not think so [21:37:09] ottomata: the (main) problem is the volume of data of the query , whether teh text parsing happens in hive [21:37:30] or udf i doubt it will make much difference [21:37:39] cc: Ironholds [21:37:44] * Ironholds nods glumly [21:37:46] believe so [21:38:01] I wish we had something consistent to filter on in addition to Y/M/D/H/varnish [21:38:06] *partition on [21:38:23] ...huh. wait. [21:38:23] we DO. [21:38:25] ottomata? [21:38:31] how much of a pain is changing the partitions scheme? [21:38:57] because, if we were to additionally filter by MIME TYPE....almost every query would involve fewer mappers. Certainly every pageview-related query. [21:39:10] Ironholds: i do not think we need diff partitions, we need ETL [21:39:28] ahh, the utopian and mythical ETL ;p [21:40:20] Ironholds:for example, in this case we could have a query every hour that idenfies apps requests and stores them somewhere [21:40:28] to be retrieved on a daily basis [21:40:29] yerp [21:40:52] Ironholds: but that will be another iteration, i believe. [21:44:00] yup, agree with nuria__'s sentiment [21:44:07] nuria__: why do you have so many underscores in your name? [21:44:45] ottomata: my irc client keeps tagging them along ... i am like "whatever" [21:45:09] ottomata: when i get to 10 i will take action [21:45:38] hha [21:45:40] can you just type [21:45:44] /nick nuria [21:45:59] :) [21:46:51] ottomata: OHHHHHHHH [21:48:24] Ironholds: do you think that the udf shoudl lowercase the uri_host [21:48:25] ? [21:48:36] the pageview udf? [21:48:50] re. the comment here on line 49 https://gerrit.wikimedia.org/r/#/c/180023/15/refinery-hive/src/main/java/org/wikimedia/analytics/refinery/hive/IsPageviewUDF.java [21:52:36] ottomata, I don't know. I haven't encountered any upper-case URLs in the hive logs but they're certainly in the sampled ones [21:53:44] well, if you think they should be normalized like that, i.e. en.wikipedia.org should be counted the same as En.wikipedia.org, then we should probably lowercase them, eh? [21:53:50] do you think it will hurt? [21:54:54] nope! [21:55:01] I just don't know if they're present in caps! [21:55:29] the old sampled logs come from a different source and I'm not sure if christian's filtering already lowercases when it splits the URLs or not [21:55:43] * Ironholds is totally ignorant of this domain, remember. [21:55:49] ok, I'm going to go ahead and add it, just in case then. [21:56:17] Analytics-EventLogging: Get researchers access to right servers so that they can look at logs & quickly see that things are working as expected (events that fail validation) - https://phabricator.wikimedia.org/T85027#937142 (ggellerman) NEW [21:56:18] hmm, or maybe not [21:56:22] Ironholds: What is "christian's filtering"? 
[21:56:53] whatever happens between varnish getting a request and it appearing in hadoop [21:57:07] I mean, the URL gets split in three at some stage. Does the host get lower-cased or anything? [21:57:07] haha [21:57:13] no filtering happens [21:57:18] that is whatever varnish sets [21:57:27] varnish gives us the fields as separate [21:57:30] Analytics-EventLogging: Pipe events that fail validation - https://phabricator.wikimedia.org/T85028#937149 (ggellerman) NEW [21:57:34] the udp2log stuff (that gnerates sampled logs) joins them into one [21:57:45] yup. [21:58:48] (PS16) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 [21:59:00] yet unresolved things on that patch… [21:59:09] hhu! [21:59:11] *huh [21:59:22] yet another reason to hate udp2log [21:59:49] - Class name (oh boy!) [21:59:49] - ¿testing? [21:59:49] - app Regexes and method questions [21:59:55] Analytics-EventLogging: Pipe events that fail validation - https://phabricator.wikimedia.org/T85028#937160 (ggellerman) [22:00:03] ? [22:00:16] Analytics-EventLogging: Get researchers access to right servers so that they can look at logs & quickly see that things are working as expected (events that fail validation) - https://phabricator.wikimedia.org/T85027#937161 (ggellerman) [22:04:19] Ironholds: [22:04:29] let's do that last one first: [22:04:29] https://gerrit.wikimedia.org/r/#/c/180023/15/refinery-core/src/main/java/org/wikimedia/analytics/refinery/Pageview.java [22:04:47] see comments on line 62 and line111 [22:04:54] mainly 111 [22:05:42] can we change that too [22:06:11] (contentTypesSet.contains()... && !apiHit || isAppPageRequest()) [22:07:26] ottomata: That whole condition is rather ... hard to read. [22:07:43] So much decoration (like repeated finds/matchers) [22:07:48] that part starting atline 109? [22:07:52] And somewhat no clear structure/order. [22:07:55] oh i fixed that qchris [22:07:58] see latest patch [22:08:01] Oh. [22:08:07] AWESOME! [22:08:16] ottomata, totally! [22:08:18] go nuts [22:08:29] ok, will do what qchris says then, wasn't sure if the logic matched up [22:08:42] Ironholds: can you satisfy qchris' desire for many more tests? [22:09:07] also, Ironholds, just to make sure the method name is right [22:09:08] Meh. Don't call me out. I have no say there. [22:09:11] isAppPageRequest [22:09:18] you know me, I love testing [22:09:28] qchris, you realise I write unit tests for fun, right? [22:09:41] when I run out of work to do, I go find an R package that the OpenGov or OpenScience team are maintaining and write unit tests for it. [22:09:48] that's my recreational activity [22:09:50] I'd probably do this even if you DIDN'T want it ;p [22:09:54] we are checking content type, user agent, and that query has 'sections=0' [22:10:03] those things qualify a request as an AppPageRequest? [22:10:10] * Ironholds thinks [22:10:15] yes [22:10:32] sections=0 and application/json when coming from the app exclusively corresponds to "requests for the lede of an article, in readable format" [22:10:45] oh and api.php in path [22:10:57] ok, cool [22:11:02] the app actually makes either 1 or 2 requests for a page, depending; sections=0 and then sections=[sequence of sections] [22:11:09] so we have to dedupe, but the second one is only made if there IS >1 section [22:11:12] so, sections=0 it is. [22:13:36] Ironholds: do you want to only check for string contains api.php? 
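Pulling Ironholds' qualification together — api.php in the path, sections=0 in the query, an app content type and user agent — the check reads roughly as below in Python. The real isAppPageRequest is Java, in the refinery Pageview class, and the user-agent marker here is an assumption, not copied from the patch; the contains-versus-starts-with question ottomata just raised is settled a little further down.

```python
def is_app_page_request(uri_path, uri_query, content_type, user_agent):
    """Rough, illustrative Python rendering of isAppPageRequest."""
    return (
        'api.php' in uri_path                    # contains(), not startsWith()
        and 'sections=0' in uri_query            # request for the article lede
        and content_type.startswith('application/json')
        and 'WikipediaApp' in user_agent         # assumed app UA marker
    )
```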
[22:13:43] or path starts with /api.php? [22:13:56] starts with makes more sense, actually [22:14:00] oh i guess it is /w/api.php? [22:14:03] is it always that? [22:14:06] * Ironholds thinks [22:14:08] should be, yes. [22:14:13] ok [22:14:22] will it always == that exactly? [22:14:50] I have no indication that it varies, from my work [22:14:57] but if it does we'll find out when we start hammering on the def :) [22:15:11] great. [22:15:12] I just wasn't sure if there was a performance difference between string.substrin(0,10).equals("api.php?") and just contains(), is the only reason it's implemented that way. [22:15:15] * qchris coughs. [22:15:21] qchris, ? [22:15:37] Ironholds: It's never an issue to check something if your not sure. [22:15:53] ;-) [22:16:11] Just run a query that has "api.php" in uri_path, and is not "/w/api.php". [22:16:17] And see if you get results. [22:16:19] then let me rephrase as "I don't know for certain, and I do not currently have the time to check because I am also writing a session reconstruction methodology this quarter, and it's December 19" ;p [22:16:33] But, point well made; I should check when I have the space to do so. [22:16:47] i will check [22:23:17] according to my sources there are api paths with //w/api.php and /w//api.php, so we will stick with stringContains :p [22:23:51] (PS1) Milimetric: Add basic flask server with highcharts [analytics/abacist] - https://gerrit.wikimedia.org/r/181179 [22:25:01] ori: I got a basic graph working ^ [22:25:21] not very pretty but it allowed me to play with highcharts which was interesting [22:25:35] you should be able to spin it up as I mention in the README [22:25:47] I may have to go soon but I'll leave IRC in case you drop me a line [22:30:11] (PS17) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 [22:31:59] (CR) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (14 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:32:16] * milimetric is gone but ghosting in case people need to contact him [22:32:40] qchris: so, ok, i think I responded to all your stuff and fixed [22:32:45] except for tests (Ironholds is on it! :?) [22:32:49] and class name! [22:32:57] * qchris looks at it again. [22:33:00] let us now commence another rousing naming debate [22:33:02] before I leave for the weekend [22:33:06] k [22:33:11] (I have some orange goop to prepare) [22:33:21] * qchris looks up "goop" [22:33:23] haha [22:33:32] (CR) OliverKeyes: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:33:37] Hu? [22:33:40] hahaha [22:33:43] hehe [22:33:50] (I'm making a filling for a ravioli dinner party) [22:34:01] Mhmmm ravioli :-) [22:34:05] ok soOOOoooo [22:34:09] I want to call this Webrequest! [22:34:16] if you had MORE pageviews [22:34:23] defintions* [22:34:29] then your methods would just be named appropariately [22:34:38] and you could call that method on any webrequest (that would have the same data) [22:34:49] conceptually all webrequests have the same data. 
[22:34:57] some logic might consider them one of oliver's pageview [22:35:03] another one of oprah's pageviews [22:35:10] do we really want to make a class for every definition> [22:35:11] ? [22:35:38] * qchris checks Kraken again. IIRC it had more than one definitions in a file, and it was hard to read. [22:36:02] hm>> i suppose if we needed to add lots of regexes for each def. [22:36:02] Analytics-EventLogging: Create a new read-only permission group on vanadium for people to be able to access the original log and set the appropriate restrictions to make sure these users don’t perform computationally intensive operations - https://phabricator.wikimedia.org/T85027#937217 (ggellerman) [22:36:06] hmMMm [22:36:17] IF so. TheN! [22:36:20] ottomata, well, plus [22:36:25] it's not just more regexes, it's more methods. [22:36:34] So we call it webrequests, right? It's for webrequests-related functions [22:36:38] I would make the UDF use a Webrequest class that calls out to these other classes [22:36:45] and we rename all the regexes and strings so it's readable, fine [22:36:56] legacyExcludedUriPaths or whatever [22:37:08] ..and then we add an x_analytics parser [22:37:14] ..and some wrapper methods for that to make it human-usable [22:37:31] Analytics-EventLogging: investigate generating a log of events failing validation and rsyncing it more frequently than the complete log - https://phabricator.wikimedia.org/T85028#937219 (ggellerman) [22:37:45] and then maybe we add a urldecoder wrapper so people don't have to reflect weirdly... [22:37:49] ok ok, I guess i'm ok with more single purpose classes, but I like the idea of modeling webrequest....AND i'm not sure what we could call this [22:37:56] certainly not 'NewPageviewDef' [22:38:18] * qchris cannot find the relevant part in kraken repo. I guess I was wrong then. [22:38:42] qchris: https://github.com/wikimedia/kraken/tree/master/kraken-generic/src/main/java/org/wikimedia/analytics/kraken/pageview [22:38:43] ? [22:38:49] ottomata, you want me to come up with a project name for it? :D [22:38:52] Analytics-EventLogging: design a system to assist software developers and researchers to perform automated data unit testing before pushing to production - https://phabricator.wikimedia.org/T85032#937226 (ggellerman) NEW [22:38:54] haha, no [22:39:24] Ironholds: is this not the canonical pageview def we are trying to use? [22:39:44] ottomata: about the kraken file .... No. I remembered a part where one could have a parameter that allowed to specify against which pageview definition to check. [22:39:52] ottomata, yep [22:40:02] I was gonna call it HeraclesViews :( [22:40:09] (he slew the kraken, in greek mythology) [22:40:14] ah [22:40:48] can we just call this Pageview then? and other pageviews will be named accordingly? [22:40:56] Pageview, PageviewLegacy [22:40:56] etc. [22:40:56] makes sense [22:40:58] ? [22:41:07] Ironholds: Give us more facts about the definition, like who is the primary customer? what is the main use case? [22:41:08] and then in three years ReplacementOttomata will be all [22:41:13] Maybe that can dictate a name? [22:41:34] let's just go for "Pageview"? I think the customer is meant to be everyone and the main use case is everything (no pressure, right? ;p) [22:41:39] haha [22:41:47] I would also accept OghmaPageview [22:41:47] if the definition changes, so will the refinery version [22:42:06] Pageview will be versioned. [22:42:07] yep [22:42:13] but not all descriptions of grahps, papers and reports will automatically change. 
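The convention being settled on here — graphs and reports cite a definition version, with changes logged on the meta document — suggests the class carries its version explicitly; a hypothetical illustration:

```python
class Pageview:
    # Bumped on every definition change, so a report can state
    # "compiled using Refinery Pageview Definition 0.0.3".
    DEFINITION_VERSION = '0.0.3'

    @staticmethod
    def is_pageview(uri_host, uri_path, content_type, http_status):
        raise NotImplementedError  # definition logic lives here
```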
[22:42:17] in that case, maybe we should call it PolymorphicPageviews [22:42:37] nope, but those can refer to a version, if they like [22:42:42] yep [22:42:47] the plan is to use the meta document to log changes [22:42:52] "this graph was compiled using Refinery Pageview Definition 0.0.3" [22:42:56] "current version is foo, stable as of this commit" [22:43:16] "[this commit] and [diff] incremented version to 1.2.1 to solve [problem]" [22:43:56] Mhmm. [22:44:04] Analytics-EventLogging, Analytics-Engineering: EL office hours - https://phabricator.wikimedia.org/T76796#937236 (kevinator) [22:44:32] It seems you both want "Pageview". [22:44:46] I'm also happy with BiblePageview [22:44:50] Analytics-EventLogging: Create a new read-only permission group on vanadium for people to be able to access the original log and set the appropriate restrictions to make sure these users don’t perform computationally intensive operations - https://phabricator.wikimedia.org/T85027#937237 (DarTar) thanks @ggell... [22:44:55] only that implies we won't document the changes and will insist it never altered. [22:44:57] soo...maybe not. [22:45:10] Does "Pageview" not sound too generic. Like "The one and only gapeview definition". [22:45:14] but it does mean I get to respond to "are the numbers reliable this time?" with "this time they're BIBLICAL" [22:45:19] s/gape/page/ [22:45:39] hah, it is the canonical pageview definitiion [22:45:44] yes, but [22:45:47] so, calling it just Pageview sounds rigiht [22:45:54] what if different customers have different needs in the future? [22:45:56] I thought at this point they are an experiment? [22:45:58] like, I get what qchris is saying. [22:46:14] Analytics-Visualization, Analytics-Engineering: [Volunteer] Improve Generate.py [13 pts for the Analytics Eng team] - https://phabricator.wikimedia.org/T76407#937238 (kevinator) [22:46:20] if they have needs for a different type of definition, they will get a new class [22:46:31] at this point the current version of the def and implementation are an experimentation. But the idea is that the experiment will evolve into a canonical version, which will be used everywhere and itself evolve as infrastructure changes [22:46:41] maybe we should just call it GrandfathersAxe on that basis. [22:46:42] Analytics-Visualization: Improve logging - https://phabricator.wikimedia.org/T84892#937239 (kevinator) [22:46:51] "it's the same as it was last year! All the code has changed, but..." [22:47:07] Ok. I have the feeling I only add to bikeshedding. Hence, I am fine with "Pageview". [22:47:11] heh [22:47:21] Well, but.. [22:47:21] I am doing totally fine on bikeshedding by myself. I could bikeshed in an empty room ;p [22:47:28] (Just kidding. I don't have an opinion.) [22:47:54] ori, as a philosopher I would hope you'd at least endorse my awesome Plutarch reference [22:48:13] I never read Plutarch! (Don't tell anyone.) [22:48:40] you're not missing out on much [22:48:55] personally I really like Sider and the perdurantarist approach to that problem [22:49:12] but then Perdurantism is my general life philosophy, so. [22:49:47] uhhhh, ok Pageview it is! [22:49:51] k [22:50:22] aw but I wanted to discuss whether objects were best treated as existing in four dimensions or not :( [22:50:44] (CR) Ottomata: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:50:57] Ok! So! 
Ironholds, we are getting mighty close! [22:50:59] fill up those tests! [22:51:04] yessir! [22:51:08] it is now time for orange goop prep [22:51:09] I just need to finish the session docs [22:51:12] enjoy your orange goop [22:51:31] qchris: thank you for your bikeshedding, have a good weekend all! [22:51:57] ottomata: Have a nice ravioli party! [22:52:14] (CR) QChris: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata) [22:53:18] take care! [22:53:31] (PS1) Bmansurov: Timebox before summing. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/181204 [23:32:48] (CR) QChris: [WIP] UDF for classifying pageviews according to https://meta.wikimedia.org/wiki/Research:Page_view/Generalised_filters (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/180023 (owner: Ottomata)
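The tests ottomata is asking Ironholds to fill in are JUnit tests in refinery-source; the shape is table-driven cases against the definition, sketched here in Python with hypothetical example requests, reusing the illustrative predicate from earlier in this log:

```python
import unittest

def is_app_page_request(uri_path, uri_query, content_type, user_agent):
    # Illustrative predicate from earlier in this log, repeated so the
    # test file is self-contained.
    return ('api.php' in uri_path and 'sections=0' in uri_query
            and content_type.startswith('application/json')
            and 'WikipediaApp' in user_agent)

class TestAppPageRequest(unittest.TestCase):
    def test_app_lede_request_counts(self):
        self.assertTrue(is_app_page_request(
            '/w/api.php', 'action=mobileview&sections=0',
            'application/json', 'WikipediaApp/4.0'))

    def test_doubled_slash_paths_still_count(self):
        # //w/api.php and /w//api.php occur in the logs (per 22:23),
        # which is why the path check is contains(), not startsWith().
        self.assertTrue(is_app_page_request(
            '//w/api.php', 'sections=0', 'application/json',
            'WikipediaApp/4.0'))

    def test_non_app_user_agent_does_not_count(self):
        self.assertFalse(is_app_page_request(
            '/w/api.php', 'sections=0', 'application/json', 'Mozilla/5.0'))

if __name__ == '__main__':
    unittest.main()
```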