[00:02:20] uhhhhHH [00:02:21] yes? [00:02:22] dunno [00:02:24] yes? [00:02:25] Ok. [00:05:02] (PS1) Ottomata: Modify pom.xml to build with WMF archiva and CDH 5.0.2 [analytics/wikihadoop] (wmf) - https://gerrit.wikimedia.org/r/165940 [00:05:40] qchris: gimme a quick review ^ :) [00:05:51] * qchris looks [00:07:09] repo looks good. But why the version changes around log4j and junit? [00:07:54] to match what we had in archiva :) [00:08:28] oh i didn't mean to leave those comments in there [00:08:40] or, qchris, do you think I should just add those to archiva and leave the original versions? [00:09:18] Not sure ... how will we use wikihadoop. [00:09:26] From inside the cluster or outside? [00:09:37] (No clue what wikihadoop really is) [00:10:19] inside, i will probably add it to artifacts in refinery [00:10:35] Oh. [00:10:35] qchris, it is a Custom InputFormat for mediawiki xml dumps [00:10:43] Yikes! [00:10:44] :-D [00:11:01] It compiles ... so I just CR+2 to not block you. [00:11:03] drdee and aaron were involved with this 2 years ago [00:11:07] ha, ok :) [00:11:17] thanks [00:11:31] (CR) QChris: [C: 2 V: 2] "Compiles for me and passes tests." [analytics/wikihadoop] (wmf) - https://gerrit.wikimedia.org/r/165940 (owner: Ottomata) [00:17:54] (PS1) Ottomata: Add wikihadoop-0.2-wmf1.jar and symlink (via git-fat and archiva) [analytics/refinery] - https://gerrit.wikimedia.org/r/165942 [00:18:55] (CR) Ottomata: [C: 2 V: 2] Add wikihadoop-0.2-wmf1.jar and symlink (via git-fat and archiva) [analytics/refinery] - https://gerrit.wikimedia.org/r/165942 (owner: Ottomata) [00:32:24] halfak, ottomata: https://github.com/diegoceccarelli/json-wikipedia [00:32:31] he’s actually a friend of mine [00:35:04] cool!
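[Editor's note] Wikihadoop, as described above, is a custom Hadoop InputFormat that splits MediaWiki XML dumps so each mapper receives whole `<page>...</page>` records. A minimal Python sketch of that record-splitting idea follows; it is an illustration of the concept only, not wikihadoop's actual Java implementation, and the function name and sample dump are made up for the example.

```python
# Sketch of what a MediaWiki-dump record reader does: scan the XML stream
# and yield one <page>...</page> block per record -- the unit a custom
# Hadoop InputFormat (like wikihadoop's) would hand to each mapper.
# Hypothetical illustration; wikihadoop itself is written in Java.

def iter_pages(lines):
    """Yield each <page>...</page> block from an iterable of dump lines."""
    buf = None
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("<page>"):
            buf = [line]
        elif buf is not None:
            buf.append(line)
            if stripped.startswith("</page>"):
                yield "".join(buf)
                buf = None

dump = """<mediawiki>
  <page>
    <title>Foo</title>
  </page>
  <page>
    <title>Bar</title>
  </page>
</mediawiki>
"""

pages = list(iter_pages(dump.splitlines(keepends=True)))
print(len(pages))  # 2
```

The real InputFormat has to handle split boundaries falling mid-page inside a multi-gigabyte compressed dump, which is the hard part this sketch skips.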
[11:08:59] Analytics / Refinery: Make webrequest partition validation handle races between time and sequence numbers - https://bugzilla.wikimedia.org/69615#c9 (christian) Happened again for: 2014-10-09T17:xx:xx/2014-10-09T18:xx:xx (on upload) [16:47:59] (PS3) Ottomata: Use tsv format when outputting webrequest faulty hosts files [analytics/refinery] - https://gerrit.wikimedia.org/r/150963 [17:20:12] (CR) Ottomata: [C: 2 V: 2] "Gage and I are about to restart Oozie jobs, I want this to go in." [analytics/refinery] - https://gerrit.wikimedia.org/r/150963 (owner: Ottomata) [17:31:25] hey all. are there problems with wikimetrics right now, or is it working normally? [17:32:47] LiAnna says she's getting 504 gateway timeout errors when trying to download a report. [17:36:49] ragesoss: let me look [17:36:56] thanks! [17:43:16] ragesoss: you know that reports are not stored "for ever" right? [17:43:31] nuria: yeah. this was something just run. [17:44:44] ragesoss: a gateway timeout suggests availability on labs having issues rather than a problem with wikimetrics, let me try to repro [17:45:05] ragesoss: i am in sf currently so i might not run into the same issue [17:45:29] nuria: this was from someone else in SF having the problem. [17:45:39] ragesoss: cannot repro, just ran a report: https://metrics.wmflabs.org/static/public/1504481.json [17:45:43] (I'm trying to figure out exactly what they were running, as I didn't have the problem with a test) [17:45:58] ragesoss: ok, let us know when you know [17:46:16] just an FYI -- Robla and Mark B asked me to forward the [Uu]defined bug to the ops list [17:46:28] more eyes/shallow bugs and all [17:50:28] nuria: it sounds like it's not related to the query, as they ran almost the same one as me, and wikimetrics reported the successful run, it was just the csv download that was and is not working for them. [17:50:39] but working fine for me with basically the same query.
[17:50:55] ragesoss: same cohort? [17:51:13] ragesoss: or perhaps theirs is a bigger one, we have some problems with size [17:51:28] nuria: we uploaded the same csv file (2095 names, 2090 validated for me) [17:52:22] nuria: actually, nevermind, they were running per-user and I just ran aggregate. [17:52:26] I'll test again. [17:52:31] could be the size issue. [17:54:51] ragesoss: we have an open bug for very large cohorts and results, so a workaround can be to split the cohort. If size is bigger than 40,000 you might [17:54:55] run into issues [17:55:05] nuria: size is only ~2000 [17:55:28] ragesoss: then i doubt size is the issue [17:55:52] nuria: looks like I'm going to hit the same issue, now that I ran the query with individual results. [17:56:19] waiting for the csv, but it's taking many tens of seconds. [18:02:35] ragesoss: can you run the report again making it public so we can get results as json? [18:03:05] nuria: yes: https://metrics.wmflabs.org/static/public/1504496.json [18:03:28] ragesoss: ok, no problems with json but csv not so great right? [18:03:36] nuria: right. [18:03:56] ok, would you please file a bug with details so we can repro? [18:04:14] nuria: I also just ran into this when I first tried to access json for a small run: { [18:04:14] "message": "no task exists with id: null", [18:04:14] "isError": true [18:04:15] } [18:04:22] I tried again a minute later and it worked. [18:04:38] ragesoss: and suggest the json workaround to the user running the other report? [18:05:10] ragesoss: will check logs now to see but if you can file a bug with details it will be great, are you familiar with bugzilla? [18:05:14] nuria: the problem with the json workaround is that the json results lack usernames. [18:05:50] the reason we're running this report is to check on student editors in our program who have been very active, so we can see if they are doing well and help them out if not.
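[Editor's note] nuria's suggested workaround above is to split a cohort that exceeds roughly 40,000 names into smaller uploads. A hypothetical helper sketching that split; the function name and chunk size are illustrative, not part of Wikimetrics.

```python
# Hypothetical helper for the workaround nuria suggests above: split a
# large cohort (a list of usernames) into sub-cohorts below a size limit,
# each of which can be uploaded to Wikimetrics separately.

def split_cohort(usernames, max_size=40000):
    """Return a list of sub-cohorts, each at most max_size names long."""
    if max_size < 1:
        raise ValueError("max_size must be positive")
    return [usernames[i:i + max_size]
            for i in range(0, len(usernames), max_size)]

cohort = ["User%d" % i for i in range(2095)]  # cohort size from the log
chunks = split_cohort(cohort, max_size=1000)
print([len(c) for c in chunks])  # [1000, 1000, 95]
```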
[18:06:20] ragesoss: right, and the json only has user-ids right? [18:06:22] if we have to do db queries to find out the usernames for each id we are interested in, it will be super tedious. [18:06:27] nuria: right. [18:06:34] and yes, I'll file a bug today. [18:06:47] ragesoss: right, so not such a good workaround [18:15:28] Analytics / Wikimetrics: Story: WikimetricsUser downloads large CSV quickly - https://bugzilla.wikimedia.org/71255#c2 (Sage Ross) Created attachment 16742 --> https://bugzilla.wikimedia.org/attachment.cgi?id=16742&action=edit Cohort of student editors from fall 2014 on enwiki A cohort of 2095 userna... [18:15:58] Analytics / Wikimetrics: Story: WikimetricsUser downloads large CSV - https://bugzilla.wikimedia.org/71255#c3 (Sage Ross) In many cases it will not download ever. After many minutes, the user gets a 504 gateway error. For example: https://metrics.wmflabs.org/reports/result/c04cf328-f198-4a12-81ec-9cb4... [18:16:13] Analytics / Wikimetrics: Story: WikimetricsUser downloads large CSV - https://bugzilla.wikimedia.org/71255 (Sage Ross) s:enhanc>major [18:16:28] nuria: I added the cohort as an attachment to that bug ^ [18:16:49] ragesoss: is it ok to make the data public? [18:16:57] nuria: yes. [18:17:05] ragesoss: ok, excellent [18:17:09] thanks for the check nuria! [18:21:14] Analytics / Wikimetrics: Story: WikimetricsUser downloads large CSV - https://bugzilla.wikimedia.org/71255 (nuria) p:Normal>High [18:25:15] Analytics / Wikimetrics: Story: WikimetricsUser downloads large CSV - https://bugzilla.wikimedia.org/71255 (nuria) a:nuria [18:27:05] milimetric: also this [18:27:05] http://thinkaurelius.github.io/titan/ [18:27:09] re. neo4j [18:27:13] https://groups.google.com/forum/#!msg/aureliusgraphs/vkQkzjN8fo0/9YYgqI4TA0QJ [19:23:56] I'm trying to create an RT account for Marcel, can someone provide his email address? [19:32:41] 47 members, all lurkers? [19:43:49] andrewbogott, mforns@wikimedia.org?
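[Editor's note] The JSON workaround discussed above loses usernames: results are keyed by user id, and looking each id up in the database would be, as ragesoss says, super tedious. A hedged sketch of stitching names back in client-side from a separate id-to-name map (e.g. one kept from the original cohort upload). The field names are illustrative, not Wikimetrics' actual JSON schema; the two usernames are taken from the log.

```python
# Hypothetical post-processing for the JSON workaround: join per-user
# results (keyed by user id) with a local id -> name mapping so the
# output is readable without per-id database queries.

def attach_usernames(results_by_id, id_to_name):
    """Return rows of (username, metrics), falling back to the raw id."""
    rows = []
    for user_id, metrics in sorted(results_by_id.items()):
        name = id_to_name.get(user_id, "id:%s" % user_id)
        rows.append((name, metrics))
    return rows

results = {"1001": {"edits": 42}, "1002": {"edits": 7}}   # invented metrics
names = {"1001": "Dd-556", "1002": "Finding Zeno"}        # names from the log
rows = attach_usernames(results, names)
print(rows[0])  # ('Dd-556', {'edits': 42})
```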
[19:43:54] it should be in your email account's autocomplete ;p [19:45:41] Ironholds: you have a very long running hive job, just curious, what is it? [19:45:46] and, do you know it is still running? [19:46:03] ottomata, it shouldn't be? what's the query? [19:46:07] Ironholds: thanks [19:46:15] I mean, does the weird dashboard...thing. say so? [19:46:21] (more usefully; can you tell when it was launched?) [19:49:34] ironholds, it was showing up in the hadoop gui an hour or so ago, but via the cli [19:49:40] yarn application -status application_1409078537822_24347 [19:49:42] on stat1002 [19:49:48] SELECT COUNT(DISTINCT(uri_path)) [19:49:48] FROM ...200(Stage-1) [19:49:54] huh! [19:49:55] * Ironholds thinks [19:49:56] OH [19:49:59] wait. no. hrm. [19:50:05] it has been running since sept. something [19:50:09] all mappers have finished [19:50:13] single reducer is trucking along : [19:50:14] :) [19:50:16] ....okay then no that's not what I care about ;p [19:50:30] if it's been running for more than, mmmn, call it 2 days, from me, you can safely kill it. [19:50:43] ok [19:50:46] it's something I tried and failed to kill, or something I got a broken pipe with and couldn't retrieve the application ID, or...etc. [19:50:50] hmm, ok [19:51:11] done [19:51:11] you'd think "kill the task if you stop being able to send things to the place you're sending the results to" would be a feature of hive [19:51:19] but that probably doesn't make sense for reasons I'm unaware of [19:51:39] well, i think usually people send results to hdfs rather than attached terminal [19:51:42] if it's a SELECT statement that's writing out to the cli or file? and you can't talk to [place]? yeah, why keep it?
[19:51:44] gotcha [19:51:46] hadoop has no idea if your terminal is still attached [19:52:21] and, i dunno how hive does it, i betcha hive abstracts it so that if you are expecting the results on stdout, it will write to some HDFS tmp dir and then cat it for you [19:57:04] andrewbogott: i think we were all eating lunch! [19:57:19] mforns@wikimedia.org is correct [19:57:28] gotcha [20:21:46] milimetric: I'm having trouble installing wikimetrics. When I run `vagrant provision` I get the following error message: "Error: Invalid parameter archive_tablename at /tmp/vagrant-puppet-5/modules-0/role/manifests/wikimetrics.pp:63 on node mediawiki-vagrant.dev" How can I fix it? [20:22:47] yes, bmansurov, that's a problem with the puppet definition [20:22:55] we updated prod but not vagrant [20:23:05] it's updated now, no? [20:23:06] git pull? [20:23:06] mforns can help you, he just went through the same thing [20:23:09] git submodule update --init? [20:23:13] no, that's the role in vagrant [20:23:17] i don't think that's updated yet [20:23:38] i think mforns submitted a patch [20:23:41] ottomata: I did git submodule update --init, but didn't help [20:23:48] mforns: help? [20:24:04] (we are in a little meeting for the next few mins) [20:24:26] ok, when you get a chance ;) [21:05:30] Analytics / Wikimetrics: Valid username is ruled invalid by Wikimetrics - https://bugzilla.wikimedia.org/71923 (Sage Ross) NEW p:Unprio s:normal a:None There's a username in an enwiki cohort I pasted in to Wikimetrics that seems to be a normal one, but Wikimetrics checks it and indicates it...
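[Editor's note] ottomata's rule of thumb above is to kill any of Ironholds' jobs running longer than about two days. YARN reports application start times as epoch milliseconds (visible via `yarn application -status`, as used in the log), so the age check itself is simple arithmetic. A hedged sketch of just that decision; the actual kill would be `yarn application -kill <app-id>` on the CLI, and the function name and threshold here are illustrative.

```python
# Sketch of the "kill if older than 2 days" rule from the conversation
# above. YARN start times are epoch milliseconds; this only makes the
# decision -- the kill itself is a separate `yarn application -kill` call.
import time

TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000

def should_kill(start_time_ms, now_ms=None, max_age_ms=TWO_DAYS_MS):
    """True if the application has been running longer than max_age_ms."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - start_time_ms) > max_age_ms

# A job started 3 days ago should be flagged; one started an hour ago not.
now = 1412899200000  # 2014-10-10T00:00:00Z, roughly the log's date
print(should_kill(now - 3 * 24 * 3600 * 1000, now_ms=now))  # True
print(should_kill(now - 3600 * 1000, now_ms=now))           # False
```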
[21:09:58] Analytics / Wikimetrics: Valid username is ruled invalid by Wikimetrics - https://bugzilla.wikimedia.org/71923#c1 (Sage Ross) Other usernames that I can't validate and don't know why: Dd-556 Finding Zeno [21:14:59] Analytics / Wikimetrics: Valid username is ruled invalid by Wikimetrics - https://bugzilla.wikimedia.org/71923#c2 (Sage Ross) Also: Hansenr11 [21:15:49] (PS1) Ottomata: Add UAParserUDF from kraken [analytics/refinery/source] - https://gerrit.wikimedia.org/r/166142 [21:16:28] Analytics / Wikimetrics: Valid username is ruled invalid by Wikimetrics - https://bugzilla.wikimedia.org/71923#c3 (Sage Ross) These usernames are invalid whether pasted in or uploaded via CSV. [21:16:34] (CR) Ottomata: "This will conflict with https://gerrit.wikimedia.org/r/#/c/164264/, but I will fix up and rebase that patch on top of this one." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/166142 (owner: Ottomata) [22:01:13] Analytics / Wikimetrics: Valid username is ruled invalid by Wikimetrics - https://bugzilla.wikimedia.org/71923#c4 (nuria) The labs db data has issues around September 20th, part of the data for this user is missing. This is by no means a common event as we have not seen this happening before. We will w... [22:18:33] milimetric: how can I make wikimetrics-web service pick up code changes? [22:18:51] bmansurov: you're in vagrant right? [22:18:57] yes [22:19:08] I don't think wikimetrics-web is what's serving the site there [22:19:12] I think it's coming out of apache [22:19:16] so try restarting apache2 [22:19:52] milimetric: is it also using mod_wsgi? I'd like to see my changes immediately without restarting apache. [22:20:07] bmansurov: it's using mod_wsgi, yes [22:20:24] bmansurov: I develop wikimetrics locally, without vagrant [22:20:25] milimetric: do you use flask's built-in server for development?
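[Editor's note] The vagrant setup above serves Wikimetrics through Apache/mod_wsgi (so code changes need an apache2 restart), while milimetric develops against Flask's built-in auto-reloading server. Both setups share the WSGI contract: a module-level `application(environ, start_response)` callable that mod_wsgi imports and that any local server can run directly. A stdlib-only sketch of that contract follows; a real wikimetrics `.wsgi` file would import the Flask app instead of defining this toy one.

```python
# Both setups discussed above speak WSGI: Apache/mod_wsgi imports a
# module-level `application` callable, and for development the same
# callable can be served locally (Flask's dev server adds auto-reload).
# Toy app for illustration; not the wikimetrics application.

def application(environ, start_response):
    """Minimal WSGI app: echo the request path."""
    body = ("path: %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Call the app directly, the way a WSGI server (or a test) would.
# For a real local server: wsgiref.simple_server.make_server("", 8000, application)
captured = {}
def start_response(status, headers):
    captured["status"] = status
result = b"".join(application({"PATH_INFO": "/reports"}, start_response))
print(captured["status"], result)  # 200 OK b'path: /reports\n'
```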
[22:20:30] milimetric: ok cool [22:20:41] for me, it makes things much easier [22:20:56] but others find it frustrating to install everything [22:20:56] milimetric: you've also set up the database locally? [22:20:59] yes [22:21:02] there's scripts/install [22:21:02] milimetric: i see [22:21:11] and scripts/[a bunch of database creation files] [22:21:22] cool [22:23:32] but bmansurov, if you want to work locally and avoid vagrant headaches, I can stop by in a few minutes and help you get set up that way [22:24:14] that'd be awesome, milimetric [22:52:50] I'm a little unclear from the documentation I saw... can data be gotten from wikimetrics via API? I tried some of the examples from UserMetrics/Guide but they didn't work for me. [22:52:54] eg, https://metrics.wmflabs.org/cohorts/Kmenger/bytes_added?project=mediawikiwiki&start=20130101000000&end=20130401000000&is_user=True [22:53:27] And what about creating cohorts via API? [23:14:38] (PS1) Mforns: Render invalid cohort members as html [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/166150 [23:14:57] (CR) jenkins-bot: [V: -1] Render invalid cohort members as html [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/166150 (owner: Mforns) [23:36:05] (PS2) Mforns: Render invalid cohort members as html [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/166150 [23:58:04] (PS1) Bmansurov: Add a timerange validator [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/166157 (https://bugzilla.wikimedia.org/70714) [23:58:10] (CR) jenkins-bot: [V: -1] Add a timerange validator [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/166157 (https://bugzilla.wikimedia.org/70714) (owner: Bmansurov)
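[Editor's note] Earlier in the log, ragesoss hit a transient error payload, {"message": "no task exists with id: null", "isError": true}, that succeeded on a retry a minute later; this section asks about fetching report data via the API. A hedged sketch of a client-side guard: parse the report JSON and raise on the error shape so a caller can retry. The error shape is copied from the log; the function name and the success payload are illustrative, not Wikimetrics' documented API.

```python
# Hypothetical client-side guard for report JSON fetched from Wikimetrics:
# raise on the {"isError": true, ...} payload seen in the log so callers
# can retry, otherwise return the decoded report.
import json

class ReportError(Exception):
    pass

def parse_report(raw_json):
    """Return the decoded report, raising ReportError on the error payload."""
    data = json.loads(raw_json)
    if isinstance(data, dict) and data.get("isError"):
        raise ReportError(data.get("message", "unknown error"))
    return data

# The exact payload from the log:
err = '{"message": "no task exists with id: null", "isError": true}'
try:
    parse_report(err)
except ReportError as e:
    print("retry later:", e)  # retry later: no task exists with id: null

ok = '{"result": {"1001": {"edits": 42}}}'  # invented success payload
print(parse_report(ok)["result"]["1001"]["edits"])  # 42
```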