[01:11:32] anyone knows if I can get hold of toby?
[01:21:52] dennyvrandecic: try e-mail ?
[01:22:07] average: will do
[01:22:13] ok
[01:22:31] thx
[01:39:58] hi average, someone told me you were looking for me yesterday
[02:07:06] yes jgonera
[02:07:08] jgonera: hi
[02:07:54] do you know of any mobile stats on device families ?
[02:08:39] not for wikipedia, but in general. it would be interesting to compare what we get with some other ones
[02:12:50] device families as in iPhone/Android/whatever?
[02:14:00] so my current understanding of "Device family" is actually vendor
[02:14:05] http://www.ibtimes.com/samsung-apple-or-nokia-whats-most-popular-brand-mobile-phones-your-country-map-1471036
[02:14:08] ^^
[02:14:12] http://s1.ibtimes.com/sites/www.ibtimes.com/files/styles/v2_article_large/public/2013/11/14/mobile-vendors-01_0.png
[02:14:16] something like this ^^
[02:14:48] uhm
[02:15:08] scratch that, uhm.. let me just show you what we have right now
[02:15:20] please enter this link https://office.wikimedia.org/w/index.php?title=Analytics/Internal/BrowserDeviceDetection
[02:17:18] have to look again at something
[02:17:28] jgonera: so iPhone is a device family yes
[02:17:42] jgonera: Android wouldn't be a device family, since it's an OS right ?
[02:20:01] average, I see
[02:20:15] I don't know any global stats like that
[02:20:48] when I was parsing UAs from logs myself I was mainly interested in the browser because that's what mostly affects page rendering
[02:21:26] device families are an interesting piece of information but often not the one that would drive our decisions as to if something should be implemented one way or the other
[02:22:15] we have it as a requirement to provide you with device data too
[02:22:53] average, did Kenan say that? or someone from a different team?
[02:23:15] Kenan said that in a meeting with us yes
[02:23:45] then he must have a good reason for it I suppose. I just look at it from more of an engineering point of view ;)
[02:24:10] I see
[02:24:18] I will double check with him though because I'm curious how he wants to use that information
[02:24:40] well, we've changed requirements twice, so I'd rather the requirements stay the same
[02:24:46] since we're close to wrapping it up
[02:25:20] you can check with him of course
[02:27:04] jgonera: but Kenan, or someone else from your team, do they cross-check mobile reports against something to see if they're right/wrong ? we need to make a technical decision because we have 2 libraries that can do this, and we have to figure out which of them is a best fit for the situation
[02:27:40] average, which mobile reports?
[02:28:20] jgonera: the ones we in Analytics are working to produce ( card 1227 in Mingle, reports on Browser/Device/OS )
[02:31:04] average, I can't see anything about device families at https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/1227
[02:32:43] I should update the card
[02:33:39] the scope for us is currently : 1) Device family 2) OS name 3) OS major version 4) OS minor version 5) Browser name 6) Browser major version
[02:33:56] jgonera: this is what the reports will contain ^^
[02:34:31] that was established in the meeting with Kenan, there were 5 persons present ( I should have pinged you to join in as well, let me know if I should do this next time )
[02:35:01] average, I'd appreciate it next time
[02:35:12] this is not in the order of priority?
[02:36:01] I am thinking to split it in 3 reports. One in which you seem to be interested(from what I understand), with the following columns Browser name, Browser major version, Percentage, Count
[02:36:18] Another one with OS name, OS minor version, OS major version, Percentage, Count
[02:36:30] And another one with Device family, Percentage, Count
[02:36:32] regarding 6) there is one exception: we need minor version number for Android Browser (which is the same as OS minor version)
[02:36:41] that's how they are right now (there are SQL Hive queries written for this)
[02:37:23] that looks good, but as I said, the report with browsers needs to distinguish between Android 2.2 and 2.3, 4.0 and 4.1
[02:37:44] jgonera: our colleague Nuria has collected requirements from you. She has told us that you do not require Browser minor version(we initially wanted to provide you that)
[02:38:06] average, I can forward the e-mail conversation we had
[02:38:13] please go ahead and do so
[02:39:24] maybe I didn't explain myself: we don't need to collect browser minor versions for all browsers, but in the presentation layer we have to be sure to bind Android Browser minor version to Android OS minor version
[02:39:31] what's your e-mail?
[02:39:40] stefan.petrea@gmail.com
[02:40:59] I see. In the presentation layer. We haven't reached the presentation layer yet. We're at the Hive/Hadoop/libraries&bindings&UDFs level now(in short, the low-level)
[02:41:24] average, I see
[02:41:42] average, forwarded the email to you, gotta go, we can talk tomorrow if you have more questions
[02:41:58] sure, thanks !
[02:42:13] talk to you tomorrow
[10:41:52] (CR) Nuria: "Ok. I see this is being merged already." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/114086 (owner: Milimetric)
[11:37:53] (PS13) Stefan.petrea: [DO NOT SUBMIT] [WIP] kraken-hive UDFs [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[11:38:15] (PS14) Stefan.petrea: [READY FOR REVIEW] kraken-hive UDFs [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[11:41:57] (PS15) Stefan.petrea: [READY FOR REVIEW] kraken-hive UDFs [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[14:11:03] (CR) QChris: "It seems this change contains both the ua-parser part and the dclass" (1 comment) [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[14:24:45] (CR) Stefan.petrea: "From my point of view, we can postpone this" [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[15:57:58] Hey milimetric, got a sec to talk about Zurich stuff?
[15:58:39] hey halfak: I'm in a long meeting
[15:58:55] but you can ping me and I can respond later
[15:58:56] No worries. Maybe later. :)
[15:59:01] or we can talk in a couple of hours
[15:59:29] * halfak is looking for someone who is awake that can help him figure out what he may have missed over the last few days.
[15:59:39] I'll go poke around and come back if I fail
[15:59:41] Thanks :)
[17:09:19] (PS5) Ottomata: Using kt_log (syslog) in addition to stderr to report configuration failures [analytics/kafkatee] - https://gerrit.wikimedia.org/r/114078
[17:09:48] (CR) Ottomata: [C: 2 V: 2] Using kt_log (syslog) in addition to stderr to report configuration failures [analytics/kafkatee] - https://gerrit.wikimedia.org/r/114078 (owner: Ottomata)
[17:28:24] halfak: done with my meeting
[17:28:25] what's up
[17:28:50] Hey milimetric. I was going to ask you about our plans for the Zurich hackathon.
[17:28:55] yes
[17:28:55] Do you know if we plan to get there early?
[17:28:58] yes
[17:29:01] May 5 - 9
[17:29:18] and plans for what exactly we're doing in that time period are open
[17:29:31] I have five options that I'm working on building into a survey and sending to everyone
[17:29:47] I just have to catch up with DarTar, he was looking into northern Italy
[17:30:04] Gotcha. I hope that whoever is processing registration & survey stuff knows that we're planning to do something different.
[17:30:26] Or rather, travel from somewhere else.
[17:32:44] yep, that'd be me :)
[17:33:04] and I'll make sure the travel department knows and all that
[17:33:41] Excellent. :) That's all I needed. Thanks.
[17:33:48] lzia: see above
[17:34:04] yep yep. reading it.
[17:34:27] so, I guess one question lzia and halfak: is it more troublesome visa-wise for you to go first to, say Italy, then to Switzerland?
[17:34:29] or is that ok
[17:34:44] I guess halfak is fine either way
[17:34:45] :D
[17:35:37] well, right, I guess this is better left for the survey
[17:35:38] Hmm.. IANAL, but I wouldn't expect difficulties.
[17:35:45] I'm a US citizen.
[17:35:49] k
[17:35:58] In my case, the best option is to do it in the U.S. ;-) I will need a Schengen visa for Switzerland and I guess once I have that, it doesn't matter how we roam in the Schengen region
[17:36:07] as soon as I talk to DarTar, I'll get back to you all with the options
[17:36:22] kk ty
[17:36:26] k. great! thanks!
[17:36:54] http://en.wikipedia.org/wiki/Category:Visa_requirements_by_nationality
[17:37:38] Speaking as our token european citizen on the research team: hahahahahahahaha
[17:37:47] well, other than dartar, I guess.
[17:37:52] but he doesn't count, he's gone native.
[17:38:09] Ironholds, we'll absorb you yet.
[17:38:29] halfak: I'm dating an American and fanatical about the Seahawks; I think you may be using the wrong tense ;p
[17:38:50] Thanks average. The map is quite interesting in my case. We should definitely consider it the next time we plan for a trip. :D
[17:38:55] I don't even know what sport the Seahawks play.
[17:39:16] * Ironholds blinks
[17:39:18] ty average
[17:39:22] they won the Superbowl, like, two weeks ago! :p
[17:40:45] I used to keep up with the news :S
[17:41:14] milimetric, lzia: Swiss is not part of Schengen, you will need a separate visa
[17:43:53] oh boy :)
[17:43:57] too complicated, /me lunch
[17:47:54] drdee, er, what?
[17:47:57] yes it is.
[17:48:14] It's not part of the EU, but it's part of the EFTA and a signatory to Schengen
[17:49:48] has been since 2008.
[17:49:57] hah and that's when i left europe
[17:50:19] ahh!
[17:50:30] yeah, they finally got around to maybe letting people not carrying gold bullion into their country
[17:50:32] took them long enough
[17:51:28] ok, good to know
[17:52:31] (they still massively prefer you if you bring them gold bullion, though. Because: Swiss.)
[18:40:40] (PS13) Milimetric: [WIP] Changes tu support wikimetrics in vagrant. [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/109676 (owner: Nuria)
[18:42:57] headed to cafe, back soon
[19:07:14] hey milimetric
[19:08:34] I may not make it this afternoon with you and rachel, but I want to talk quickly about my visa issues with IT/CH, ping me if you have a moment to chat
[19:27:14] DarTar, any objection to me writing up the MaxMind tutorial on wikitech rather than office-wiki?
[19:28:36] why not, as long as we add pointers and we expand on the office wiki with stuff that is specific to internal research
[19:29:15] sure; I can't think of anything that would be out of place there, tbh. I mean, it's where analytics has been documenting a lot of the hive stuff, ferinstance
[19:35:42] Ironholds: yes, and that nobody read :-/
[19:35:49] DarTar, I read it!
[19:35:59] I have it bookmarked so I can refer to it when I forget the necessary JAR.
[19:37:55] my concern is that I want to have a single page with all private data access instructions, if having this on wikitech and public by default makes things easier I'm for it
[19:38:18] public by default is fine
[19:38:31] I mean, it doesn't really tell you anything sensitive.
[19:38:41] "here's the directory the API lives in on stat2, knock yourself out".
[19:38:47] getting into stat2 is the barrier ;p
[19:39:20] exactly, I'd suggest checking with ottomata/ops first though if they are ok with this
[19:39:58] okie-dokes
[19:40:02] ottomata, ping ;p
[19:40:07] ponggngngng
[19:40:17] you okay with me documenting MaxMind and how to use it on wikitech?
[19:40:23] sure
[19:40:31] the only possibly-sensitive things are "the directory the API lives in" and "an example output, using the office IP"
[19:40:38] I kind of feel like they already know where we live and such.
[19:40:56] and if they can get into stat2 without permission, obfuscating the directory probably won't help ;p
[19:40:57] cool!
[19:41:29] Ironholds: btw
[19:41:29] http://www.mediawiki.org/wiki/MaxMind_Evaluation
[19:41:47] what API do you mean?
[19:42:09] hi DarTar
[19:42:11] chat?
[19:42:22] ooh, cool
[19:42:27] yes
[19:42:35] ottomata, oh, literally just /a/..../geoip -l [ip]
[19:42:38] (sorry, API is the wrong word)
[19:43:54] Ironholds: you wanna do geoip in R ?
[19:44:05] /a ?
[19:44:18] ottomata, or usr/bin/ or whatever. I don't have it to hand.
[19:44:26] average, I've been doing that for, I think six months ;p.
[19:44:34] [@stat1002:~] 1 $ which geoiplookup
[19:44:35] /usr/bin/geoiplookup
[19:44:35] ?
[19:44:37] that?
[19:44:38] Dario had some questions about it last week, and so I'm writing up a guide to using the MaxMind db.
[19:44:40] indeed
[19:45:05] Ironholds: this might be irrelevant, but just checking
[19:45:09] you know about udp-filter, right?
[19:45:22] know of it, but that's about it
[19:45:31] that is, I've heard the name ;p. tell me more!
[19:45:33] if you are working with the udp2log logs
[19:45:36] and want to geocode them
[19:45:40] you can pipe them through udp-filter
[19:45:42] check out
[19:45:43] aha
[19:45:45] man udp-filter
[19:45:46] on stat1002
[19:46:00] you can also do custom matching and filtering based on certain criteria
[19:46:04] that's hella-cool.
[19:46:12] and anonymize after geocoding, if you like
[19:46:15] yeah
[19:46:17] drdee wrote it :)
[19:46:22] at the moment I'm mostly writing it up for ad-hoc things
[19:46:43] so, "how many dutch users do we have on enwiki with more than five edits in the last 30 days, go go go"
[19:47:04] ah k, using the dbs
[19:47:05] aye cool
[19:47:06] da
[19:56:50] qchris_meeting: hiiii are you really in a meeting?
[19:57:03] ottomata: Sorry :-)
[19:57:05] back.
[19:57:22] hihi
[19:57:27] Wanted to finish up some email and grab some food afterwards.
[19:57:33] Hi.
[19:57:34] ok so i have a mobile log generated by kafkatee on stat1002 right now
[19:57:46] Coolio.
[19:57:50] what do we currently do with the mobile.tsv logs from udp2log?
[19:57:55] can I run the same code on the stuff from kafkatee?
[19:58:28] The scripts we run are mostly in firefighting mode ... so I'd have to adapt and run by hand.
[19:58:34] Where are those files?
[19:59:27] Nothing kafka-like in /a or /a/squid on stat1002.
[20:02:04] /a/otto
[20:02:19] kafkatee-mobile-100.4023622-lines.log.tsv
[20:02:20] mobile-sampled-100.tsv.log-20140220
[20:02:22] they aren't for the exact time period
[20:02:24] hmmm
[20:02:28] they are just the same number of lines
[20:02:36] would it be easier if they were for the same time period?
[20:02:50] don't we generate data for the mobile team with mobile udp2log files?
[20:02:57] qchris: ^?
[20:03:07] We are ... but only some parts that they never looked at IIRC.
[20:03:19] So it's mostly wikipedia zero.
[20:03:38] I'll just grab that file and pipe it through.
[20:04:51] ottomata: Is it ok if I do that until tomorrow?
[20:05:31] yeah no probs, would be much appreciated
[20:05:42] Ok. Will do :-)
[20:06:38] I copied the files into my tmp, so you can remove/change/... them from /a/otto.
[20:06:57] k danke
[20:07:15] would it be helpful if I got the same time period as a full day out of the kafkatee logs?
[20:07:22] instead of just a recent sampling
[20:07:27] those files both have the same number of lines
[20:07:34] but different time periods for sure
[20:09:38] (CR) Ottomata: "How do we make this decision?" [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[20:10:57] ahhh kraken
[20:11:11] we should get a Jenkins job for it :-D
[20:14:51] hashar: Totally! :-D
[20:15:01] hashar: But I fear that has to wait a bit ... :-(
[20:15:41] (CR) Stefan.petrea: "I am putting together a pros/cons list and sending it out via e-mail." [analytics/kraken] - https://gerrit.wikimedia.org/r/96738 (owner: QChris)
[20:15:45] ottomata: I am still seeing "nan"s ... weren't they removed somewhere in varnishkafka?
[20:16:17] !!!!!!
[20:16:19] NANs
[20:16:28] qchris i thought you were going to look at it tomorrow!
[20:16:30] it is late for you!
[20:16:39] I am not looking.
[20:16:44] I have my eyes closed.
[20:16:57] qchris: no hurry I guess
[20:17:04] qchris: though if you use maven that should be very simple :-]
[20:17:29] hashar: Yes, maven it is.
[20:17:39] poke me about it one day :]
[20:17:42] qchris, don't we have a jenkins job?
[20:17:46] hashar?
[20:17:48] hashar: Ok :-)
[20:18:11] ottomata: there is no job for analytics/kraken apparently, at least not triggered by Zuu
[20:18:12] l
[20:18:22] We have I job that we can trigger.
[20:18:28] hm
[20:18:29] s/I/a/
[20:18:38] ah yeah http://integration.wikimedia.org/ci/job/Kraken/
[20:18:53] last build from Oct 2013 !
[20:18:56] But we want it to guard against evil people that push directly and pass code review :-)
[20:19:05] i think we fixed that, no?
[20:19:10] direct push should be forbidden really
[20:19:32] ah maybe not
[20:19:55] hashar: you really like zuul don't you ? :)
[20:20:22] average: I love the workflow of running tests before folks have to review a change. that is a huge timesaver
[20:20:27] average: + python \O/
[20:20:38] :D
[20:21:00] we should have an analytics CI sprint one day
[20:21:12] but I guess you are all very busy
[20:22:08] hashar: I'm still a bit confused as to what zuul brings to the table in addition to what gerrit/jenkins offer
[20:22:17] We're always busy :-D
[20:22:33] afaik tests were running after patchset submission before zuul was deployed
[20:22:41] average: Gerrit is very dumb, it just emits events (aka someone commented, some other voted +1, a patch got merged)
[20:23:12] average: Zuul listens for those events, and depending on a configuration file would trigger a bunch of jobs in Jenkins, wait for the results and then report back in Gerrit (and eventually merge for you)
[20:23:34] average: or to say otherwise: all the CI intelligence is in Zuul :-]
[20:23:43] hashar: and that would not happen in the absence of zuul ?
[20:23:49] nop
[20:24:12] ok
[20:24:27] average: there is a plugin for Jenkins to connect it to Gerrit. But it was missing a bunch of features 1 year and a half ago. Plus it is written in Java, a language I don't know at all
[20:24:46] average: whereas Zuul already matched our needs and is written in python which I can hack. I have a few commits upstreamed :-]
[20:25:29] I see. Well, now I guess I understand what zuul does a bit better
[20:25:38] thanks
[20:27:13] yeah I am missing a "big picture" wiki page
[20:43:12] (PS1) Hashar: Jenkins job validation (DO NOT SUBMIT) [analytics/kraken] - https://gerrit.wikimedia.org/r/114531
[20:43:42] (PS2) Hashar: Jenkins job validation (DO NOT SUBMIT) [analytics/kraken] - https://gerrit.wikimedia.org/r/114531
[20:44:38] and here is maven downloading the whole internet https://integration.wikimedia.org/ci/job/analytics-kraken/1/console
[20:46:49] surprisingly it timed out
[20:46:56] unexpected(!)
[20:49:01] yeah it ran on a host that does not have direct access to internet :D
[20:49:13] need to configure a proxy :(
[20:51:28] is there a -D setting for maven to set up a web proxy ? :D
[20:51:57] or I should go with settings.xml :( https://maven.apache.org/guides/mini/guide-proxies.html
[20:52:48] that works
[21:06:33] Unable to locate the Javac Compiler in: :(
[21:07:40] user@user-K56CM:/tmp/reports$ apt-file -x search 'javac$'
[21:07:42] bash-completion: /usr/share/bash-completion/completions/javac
[21:07:44] gcc-snapshot: /usr/lib/jvm/java-1.5.0-gcj-4.8-snap-amd64/bin/javac
[21:07:47] gcj-4.6-jdk: /usr/lib/jvm/java-1.5.0-gcj-4.6/bin/javac
[21:07:50] gcj-4.7-jdk: /usr/lib/jvm/java-1.5.0-gcj-4.7-amd64/bin/javac
[21:07:54] openjdk-6-dbg: /usr/lib/debug/usr/lib/jvm/java-6-openjdk-amd64/bin/javac
[21:07:59] openjdk-6-jdk: /usr/lib/jvm/java-6-openjdk-amd64/bin/javac
[21:08:02] openjdk-7-dbg: /usr/lib/debug/usr/lib/jvm/java-7-openjdk-amd64/bin/javac
[21:08:04] yeah dpkg -S bin/javac
[21:08:04]
[21:08:05] openjdk-7-jdk: /usr/lib/jvm/java-7-openjdk-amd64/bin/javac
[21:08:06] hashar: ^^ :)
[21:08:08] so I need to get openjdk-6-jdk :(
[21:08:15] and openjdk-7-jdk as well
[21:08:23] why both ?
[21:08:44] maybe just the latter ?
[21:08:59] here what I have got on a slave:
[21:08:59] http://paste.debian.net/83207/
[21:09:05] looking
[21:09:06] aka openjdk*-jre
[21:10:17] I see
[21:13:20] maybe installing the jdk would install the jre as well
[21:14:02] yes
[21:16:40] yeah something is happening http://integration.wikimedia.org/ci/job/analytics-kraken/3/console
[21:21:25] looks very nice
[21:21:59] I need to add checkstyle
[21:32:02] stupid jenkins
[21:32:06] freestyle projects have the concept of publishers, which are run post build to inspect logs / results
[21:32:15] maven projects have publishers too
[21:32:31] AND reporters which is more or less the same thing albeit configured with different xml
[21:32:31] grr
[21:33:25] DarTar, https://wikitech.wikimedia.org/wiki/Analytics/Geolocation - going to throw it at the mailing list, but, initial thoughts?
[21:34:03] (PS3) Hashar: Jenkins job validation (DO NOT SUBMIT) [analytics/kraken] - https://gerrit.wikimedia.org/r/114531
[21:36:55] ottomata1: average: so you now have a very basic Jenkins job for analytics/kraken.git though it just runs maven without any test :(
[21:38:28] it can't do mvn test after building hashar?
[21:39:39] milimetric: I took the config from http://integration.wikimedia.org/ci/job/Kraken/
[21:39:52] the conf is now detailed in https://gerrit.wikimedia.org/r/#/c/114622/1/analytics.yaml,unified
[21:40:04] aka maven -Dmaven.test.skip=true clean package
[21:40:33] have you tried adding "test" to the goals?
[21:40:40] oh i see
[21:40:49] should I drop the maven.test.skip=true ?
[21:41:11] i don't know what i'm talking about, maven is foreign, but yeah, I think theoretically "maven test" should be all you need
[21:41:22] that should clean, build, and run the tests
[21:41:31] and package could be a separate job if that one succeeds
[21:41:41] but again - I have little understanding of this
[21:42:05] I will try with just 'package'
[21:43:07] I know it's late for you, if you want to play with it and think I can help, I'll get up earlier tomorrow morning
[21:45:12] http://integration.wikimedia.org/ci/job/analytics-kraken/6/
[21:45:12] :D
[21:45:28] tnegrin: hi, can we invite jgonera to our analytics-internal list ?
[21:45:55] who that?
[21:46:04] tnegrin: Juliusz Gonera from the Mobile team
[21:46:14] jgonera@wikimedia.org
[21:46:15] I know that guy! sure
[21:46:22] ok, can you add him please ?
[21:46:27] I do not have admin privileges
[21:46:34] commented on https://gerrit.wikimedia.org/r/#/c/114622/
[21:46:59] milimetric: I am not there tomorrow :-] I guess I will fill in a bug about the failing tests
[21:47:23] ok hashar, cool
[21:47:24] thanks!
[21:47:27] thanks tnegrin
[21:48:05] heh, internal list is funny
[21:48:13] we really should make that thing public and be done with it
[21:48:24] I know -- just what I was going to say
[21:48:25] then have analytics-internal-internal so I can have a place where I email if I'm sick
[21:48:26] agh there is already a bug at https://bugzilla.wikimedia.org/show_bug.cgi?id=54046
[21:48:48] one day -- we will unite the two kingdoms
[21:52:06] I have posted the result on the bug report :-]
[21:54:47] I am done for tonight. Have fun folks!
[21:56:38] thanks hashar, have a nice night
[21:56:53] ok - jgonera -- done
[22:19:20] tnegrin: do we have a meeting in 10 minutes? I just received an invite.
[22:19:36] oops
[22:19:53] we don't have a meeting actually -- I'll cancel it
[22:19:54] okay. we don't. :-)
[22:20:01] but if we did, it would be in a good videoconf room ;)
[22:20:11] I just moved staff to a better room
[22:20:17] haha!
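
For reference, the three-report split discussed around 02:36 in this log (browser, OS, and device family, each with Percentage and Count) can be sketched roughly as below. This is only an illustration, not the Hive queries mentioned in the channel: the record field names, the helper names, and the assumption that the stock Android browser is reported under the name "Android" are all hypothetical.

```python
from collections import Counter


def build_report(records, key_fn):
    """Group records by key_fn and return (key, percentage, count) rows,
    ordered from most to least frequent."""
    counts = Counter(key_fn(r) for r in records)
    total = sum(counts.values())
    return [
        (key, round(100.0 * count / total, 2), count)
        for key, count in counts.most_common()
    ]


def browser_key(record):
    # Assumption: the stock Android browser shows up as browser_name "Android".
    # Its version is bound to the OS major.minor version, so that
    # Android 2.2 and 2.3 (or 4.0 and 4.1) stay distinguishable.
    if record["browser_name"] == "Android":
        return (record["browser_name"],
                "{}.{}".format(record["os_major"], record["os_minor"]))
    # All other browsers are reported by major version only.
    return (record["browser_name"], str(record["browser_major"]))


def os_key(record):
    return (record["os_name"], record["os_major"], record["os_minor"])


def device_key(record):
    return (record["device_family"],)
```

Calling `build_report(records, browser_key)`, `build_report(records, os_key)` and `build_report(records, device_key)` on the same list of parsed user-agent records would give the three tables; the Android special case lives only in `browser_key`, which loosely matches the point made in the log that the binding belongs to the presentation layer rather than to data collection.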