[00:32:55] average: can you please update http://etherpad.wikimedia.org/publish-new-mobile-report with all the things that *must* happen so we can finally wrap up #60?
[00:33:17] nothing nice to have, nothing extra investigation; the bare minimum to get the report published
[00:39:51] drdee: updated card
[00:42:11] sorry, I mean I've updated the etherpad
[02:22:43] New patchset: JGonera; "Add optional config override parameter" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58246
[02:27:10] New patchset: JGonera; "Add Rendering time graph (in new Performance tab)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[13:23:19] moooooorning fella's!
[13:26:18] ping average
[13:29:29] morning :)
[13:37:08] heeeyaaa!
[13:37:14] wanna hangout ?
[13:44:31] coming!
[13:45:39] morning all!
[14:12:43] hey ottomata, how much effort would #579 be?
[14:13:26] could you work this morning on 518 and 579?
[14:14:07] what about the vagrant thing i thought that was more pressing
[14:14:19] 579 shouldn't be bad
[14:14:23] hour maybe 2
[14:15:02] class of service :)
[14:15:19] 1) 518
[14:15:27] 2) 595
[14:15:30] 3) 579
[14:16:05] 4) 497
[14:16:08] ungh, gimme titles!
[14:16:11] 5) 131
[14:16:12] what are deees numbrers!
[14:16:13] ?
[14:16:22] the MINGLE numbers!
[14:16:37] you've got to get a bit more mingly winlgy
[14:16:53] 518 - SSL for User Metrics
[14:17:05] 595 - ClassCastException in CDH 4.2
[14:17:24] 579 - Dan's username on cluster
[14:17:29] 497 - puppetmaster
[14:17:37] 131 - kafka puppetization
[14:17:48] and then 570 - vagrant user metrics :)
[14:19:22] ping average
[14:19:36] haha, ok, i thought 570 was super high priority, that's what I was working on
[14:19:37] uhhh
[14:19:39] ok
[14:19:54] 497 is done unless there are more bugs
[14:21:05] k
[14:22:08] the RT you linked to in 518 is the stats.wikimedia.org RT
[14:22:35] oh whoops
[14:22:41] let me hunt for the right link
[14:22:58] fixed.
[14:23:00] i just hunted too
[14:23:44] https://gerrit.wikimedia.org/r/#/c/58254/
[14:23:48] thx
[14:55:28] ottomata, drdee: can that IP anonymizer accidentally generate IP addresses that are considered internal?
[14:55:32] by the UDFs I mean
[14:56:04] using udp-filter /libanon you mean
[15:04:39] ag internet!
[15:04:43] milimetric: yes
[15:04:51] it's a random hash converted into ip format
[15:10:12] pong drdee
[15:10:14] just woke up
[15:25:33] milimetric: yes, it can
[15:25:48] we talked about this
[15:26:06] i am actually pretty sick still
[15:26:07] right, sorry, I must've blanked out while you were talking about it
[15:26:10] but i'm going to see what i can do
[15:26:21] right. I put a note and deferred it for Storm to handle basically
[15:26:28] I don't see any decent solutions until then
[15:27:32] i think somebody needs to think more carefully about my math and check it
[15:27:44] just to be sure
[15:27:52] but yeah, otherwise, i don't see a solution either
[15:29:38] whatchall talking bout?
[16:05:23] ottomata, can you send me the apache log files from stat1001 for user metrics api?
[16:06:07] access logs
[16:06:08] ?
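A minimal sketch of the anonymizer question raised at 14:55 and answered at 15:04 and 15:25, assuming the anonymizer simply formats 32 bits of a hash as an IPv4 address. The names hashed_ip, is_internal, internal_fraction and the salt are hypothetical and are not udp-filter or libanon APIs; "internal" is assumed here to mean the RFC 1918 private ranges only.

```python
import hashlib
import ipaddress

# Assumption: "internal" means the three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def hashed_ip(original, salt=b"hypothetical-salt"):
    """Map an address string to a pseudo-random IPv4 address (illustrative only)."""
    digest = hashlib.sha256(salt + original.encode("ascii")).digest()
    # Take the first 4 bytes of the digest and read them as a 32-bit address.
    return ipaddress.IPv4Address(int.from_bytes(digest[:4], "big"))


def is_internal(addr):
    """True if the address falls inside one of the RFC 1918 ranges."""
    addr = ipaddress.ip_address(str(addr))
    return any(addr in net for net in RFC1918)


def internal_fraction():
    """Share of the whole IPv4 space covered by the RFC 1918 ranges."""
    return sum(net.num_addresses for net in RFC1918) / 2 ** 32
```

Under these assumptions nothing stops a hashed address from landing in a private range; with uniformly distributed output, internal_fraction() puts the chance at roughly 0.4% per address. Filtering with is_internal and re-hashing would be one option, at the cost of a slightly non-uniform mapping.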
[16:11:40] this is the first set of numbers that actually makes good sense to me:
[16:11:40] (BlackBerry PlayBook, 110000)
[16:11:40] (iOS, 7025000)
[16:11:40] (Android, 5630000)
[16:11:40] (Windows 8, 52000)
[16:11:40] (Firefox OS, 1000)
[16:11:54] This is off sampled data from 4-15
[16:12:23] (it makes sense based on my biased understanding of how many people use these different devices)
[16:12:53] yes this looks reasonable!
[16:13:10] I would like to point out: the Windows 8 and Firefox numbers are not reliable
[16:13:15] based on erosen's analysis
[16:13:39] the rest are fine, but the BlackBerry is borderline
[16:36:14] drdee: I can be in the hangout if you wanna talk
[16:46:54] milimetric; why do you say that the firefox and windows 8 numbers are not reliable
[16:49:13] milimetric quick poke
[16:57:01] in the hangout
[17:01:03] New patchset: Diederik; "Make sure that a lock is always released." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59842
[17:21:45] New review: Milimetric; "(2 comments)" [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59842
[17:22:35] aargh milimetric!
[17:22:40] er?
[17:22:42] :)
[17:22:48] what I do wrong?
[17:22:58] :D
[17:23:23] hehe
[17:23:46] oh shit...
[17:23:54] uh... crap
[17:24:05] hangout please!
[17:24:25] drdee, erosen ^^
[17:24:41] joining
[17:25:58] New patchset: Diederik; "Make sure that a lock is always released." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59842
[17:30:48] New patchset: Stefan.petrea; "started documenting wikistats. Starting small, with WikiReportsLocalizations.pm module and the method GetLanguageNamesFromSVN. Will continue to do this to increase understanding of the code." [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/59847
[17:36:45] erosen: http://gp.wmflabs.org/graphs/create takes a long time to become responsive because of all the datasources. I think the next iteration is to make that search/filter awesome
[17:36:59] sounds good
[17:41:59] Change merged: Milimetric; [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59842
[18:06:01] drdee, yay, ssl works!
[18:06:05] https://metrics.wikimedia.org/
[18:06:12] NICE
[18:06:56] awesome ottomata
[18:07:07] milimetric can now finally sleep again
[18:07:12] he's very grateful
[18:07:30] (and so am i) :)
[18:07:30] :)
[18:07:30] thank you!
[18:08:03] also, Jeff_Green says we can 'kill locke with a hammer'
[18:08:04] :)
[18:08:08] his FR stuff is now all on gadolinium
[18:08:38] great news as well!
[18:08:43] does RobH know?
[18:09:43] updating ticket
[18:09:53] does RobH have a huge need to power down this machine?
[18:10:01] it might be useful for random udp2log tests
[18:10:12] true
[18:10:16] ok, let's keep it
[18:10:19] (for now)
[18:10:26] i'm updating the ticket
[18:12:36] New patchset: Rfaulk; "merge. changes in revert_rate.py" [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59858
[18:14:25] ottomata: you flying down to Tampa to bash in locke, or do you think you can convince Jeff a drone strike would be more appropriate?
[18:14:58] i want drones! and have ottomata wear a HUD display
[18:20:21] i'm in.
[18:21:39] New patchset: Stefan.petrea; "Added links for New mobile pageviews report" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/59864
[18:22:34] drdee: ^^ not finished yet, still have to test
[18:22:36] drdee: in progress
[18:22:40] ok
[18:30:35] New patchset: Rfaulk; "mod. except clause too broad - pass on empty queue only." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59867
[18:37:22] New review: Diederik; "Ok." [analytics/E3Analysis] (master); V: 2 C: 2; - https://gerrit.wikimedia.org/r/59867
[18:37:23] Change merged: Diederik; [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59867
[18:47:08] hey DarTar!
[18:47:15] howdy
[18:47:52] 5 hours after a root canal :D
[18:47:56] so about to head off to sleep in a while
[18:48:12] DarTar: i responded on the bug. Is there any obvious negative from making them all consistent to be binary?
[18:48:32] aarg, sorry to hear that
[18:49:14] I think it's the implicit expectation from analysts not necessarily joining with MW databases that the default is utf8
[18:49:23] I forwarded to wmfresearch for feedback
[18:49:32] as I am not representative of the whole analyst team :)
[18:50:31] DarTar: hmm, ok.
[18:50:42] DarTar: but isn't this an implementation detail that should mostly not affect anyone?
[18:51:00] i always love the phrase 'should'
[18:51:06] heh, right.
[18:51:18] ha ha
[18:51:18] other than break everyone's scripts, since they're expecting utf8 but get back binary.
[18:51:22] that's why I am asking indeed
[18:52:05] yuvi, can't you join the data inside your python script (pull using different connections) and then merge
[18:52:09] ?
[18:52:17] yes it's ugly
[18:52:30] drdee: that's what i'm doing
[18:52:58] drdee: so, to produce one CSV
[18:53:04] is it working?
[18:53:11] drdee: I'm joining from log to commons for 2/3 of the data
[18:53:16] drdee: and commons to commons for 1/3 of the data
[18:53:32] (mobile web doesn't log filenames to eventlogging but uses a revtag)
[18:53:47] drdee: oh it's working. it's been running for limn-mobile-data for a week or two now
[18:54:07] then i would be hesitant to change it; *something* will break
[18:54:13] drdee: https://gerrit.wikimedia.org/r/gitweb?p=analytics/limn-mobile-data.git;a=blob;f=mobile/uploaders-experience.py;h=07cbb00485c8385007c596613cffaae219b4c21f;hb=HEAD
[18:55:04] drdee: well, I bet I'm not the first one to be bit by this
[18:55:07] and probably won't be the last
[18:56:25] plus if we change it (after plenty of warning, etc?), the scripts that depended on it being utf8 will break cleanly and can be fixed cleanly too
[18:56:50] besides, I don't think the production clones are going to move off binary anytime soon
[18:57:15] the latter is a safe assumption
[18:57:16] so we *should* change it unless there is a specific reason not to, no?
[18:57:34] but see my comments about mw vs other native utf8 data sources
[18:57:40] brb, need coffee fi
[18:57:41] x
[18:58:29] DarTar: sure, but we should be internally consistent inside the mysql sources, no?
[18:58:37] DarTar: imagine if one set of logs was utf8 and another wasn't
[18:58:55] yes all logs should be utf8 :)
[18:59:15] yup, and in an ideal world I suppose all dbs should also be utf8 :)
[18:59:20] so another reason for tables in log to have utf8 as a default set is that they exist in other native utf8 stores
[18:59:31] hmm?
[18:59:33] where?
[18:59:33] Change merged: Milimetric; [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59858
[18:59:34] like mongo
[18:59:47] Mongo, IIRC, handles most of the encoding stuff for you
[18:59:59] and I don't know if you're going to be using Mongo *and* mysql together at the same point.
[19:00:49] it might well be, we're playing with that at the moment, but my point is that thinking of consistency only wrt MW might be limiting
[19:00:53] plus it is easier to say 'all our mysql slaves are binary', than to say 'some are x, some are Y'
[19:01:02] no, I'm saying we need to be consistent within mysql
[19:01:03] not mw
[19:01:16] slaves = mw dbs
[19:01:27] I'm referring to the other data crunching db like staging or prod
[19:01:58] well, if it is possible to be consistent within mysql by changing everything to utf8, that'll be grand
[19:02:26] DarTar: i'm just saying that a consistent encoding from one data source is much better than an inconsistent set of encodings.
[19:02:29] agreed, but not sure how realistic
[19:02:45] changing production slaves to utf8? completely not realistic at all :)
[19:02:52] changing log to binary? much more realistic :)
[19:03:15] sticking to utf8 for everything other than the slaves? also very realistic
[19:03:45] let's wait for feedback from the other folks, I am probably biased
[19:03:55] yes, I don't think anything I say is going to convince you at all :)
[19:07:02] not true!
[19:07:37] I need to hear about more use cases I may not be aware of from the rest of the team
[19:07:59] gotta go, good luck with your tooth, get some rest!
[19:08:02] hmm, let me put up a patchset that's dependent on this bug to work :)
[19:08:03] thanks DarTar
[19:11:37] New patchset: Yuvipanda; "Generate data for Web too" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/59880
[19:12:31] New review: Yuvipanda; "-2 for now, this one depends on https://bugzilla.wikimedia.org/show_bug.cgi?id=47368 being fixed." [analytics/limn-mobile-data] (master) C: -2; - https://gerrit.wikimedia.org/r/59880
[19:28:24] New patchset: Stefan.petrea; "Added links for New mobile pageviews report" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/59864
[19:39:42] New patchset: JGonera; "Add optional config override parameter" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58246
[19:39:49] New patchset: Rfaulk; "add. 'pages created' metric." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59882
[19:40:12] New patchset: Stefan.petrea; "Added links for New mobile pageviews report" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/59864
[19:40:37] New patchset: JGonera; "Add Rendering time graph (in new Performance tab)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[19:42:21] New review: Yuvipanda; "LGTM, needs testing." [analytics/limn-mobile-data] (master) C: 1; - https://gerrit.wikimedia.org/r/58246
[19:42:58] drdee, i'm trying to reproduce the classcastexception error
[19:43:03] i just ran a pig job and it was fine
[19:43:23] no need to reproduce we ran into it many times
[19:43:27] if you want to reproduce
[19:43:31] set reducers to 1
[19:43:39] and make sure that you have a large input dataset
[19:43:56] New review: Yuvipanda; "Actually looks *very* good to me. If you can test it please do a V+2 and self merge :)" [analytics/limn-mobile-data] (master) C: 2; - https://gerrit.wikimedia.org/r/58246
[19:44:48] New patchset: Rfaulk; "add. 'pages created' metric." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59882
[19:44:57] well, i'd like to be able to confirm that we fixed it
[19:45:20] drdee: could you merge a patchset please ?
[19:45:33] drdee: https://gerrit.wikimedia.org/r/59864
[19:45:36] drdee: last patchset here
[19:45:39] drdee: it's for card #60
[19:45:52] drdee: or should I send an e-mail and we can have Erik review it
[19:45:59] New patchset: Rfaulk; "add. 'pages created' metric." [analytics/E3Analysis] (master) - https://gerrit.wikimedia.org/r/59882
[19:46:12] ottomata: well, i'd like to be able to confirm that we fixed it;
[19:46:15] okay that makes sense :)
[19:46:26] same trick should do the work but then it should succeed
[19:47:13] New review: Erosen; "I just left things at +1 so that ryan can actually test that it works." [analytics/E3Analysis] (master); V: 1 C: 1; - https://gerrit.wikimedia.org/r/59882
[19:47:48] ok, i'm trying with 1 reducer on a full day of mobile logs
[19:48:11] is that a big enough dataset?
[19:48:29] yes that's big enough i believe
[19:48:41] hmm it worked fine
[19:48:44] drdee: oh, you wrote 2 days ago that dClass has now native java support. I think we have that too, but yeah, I saw that Rezan wrote the JNI himself now. Would have been cool if we did a pull request before he wrote the code himself
[19:48:47] i'm using my webrequest_loss script
[19:48:51] maybe I have to use something else
[19:49:07] drdee: he wrote basically the same code as us for the jn
[19:49:09] *jni
[19:49:16] yes i saw that
[19:49:20] it's a shame
[19:49:31] drdee: what pig job should I use?
[19:49:58] i believe on an10:/home/diederik there is a job called histogram.pig
[19:50:04] that one failed IIRC
[19:50:55] drdee: should I e-mail Erik for the code review ?
[19:51:00] nope, no histogram
[19:51:17] average: just add erik to all the patch sets that you want him to review
[19:51:20] ottomata: 1 sec
[19:51:23] ok
[19:55:04] ottomata; sent email
[19:55:19] try that one but best test is to have david run his script
[19:58:57] drdee, should i set parallel to 1?
[19:58:59] yours is set at 10
[19:59:12] yes but set it to 1 to see if the issue has been resolved
[19:59:24] the issue is triggered when the reducer needs to start writing to disk
[19:59:33] so with 1 it will happen much faster than with 10
[19:59:46] k
[20:02:04] how can I delete a mingle ticket
[20:02:10] *card
[20:02:19] there is a big fat Delete button once you click edit
[20:02:23] ja cool, can reproduce, danke
[20:02:28] ok sorry
[20:02:30] but maybe you don't have permission to do that
[20:02:37] which card?
[20:03:06] 583
[20:03:31] k
[20:05:14] drdee: I can't delete it (no button), please delete it for me
[20:05:24] ok
[20:08:10] New patchset: JGonera; "Add Rendering time graph (in new Performance tab)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[20:09:21] New review: Ori.livneh; "(2 comments)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[20:11:36] New review: Ori.livneh; "(1 comment)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[20:15:04] New review: JGonera; "(1 comment)" [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[20:15:33] Change merged: Ori.livneh; [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58305
[20:17:28] !restarting hadoop to load jar from https://issues.cloudera.org/browse/DISTRO-461
[20:17:36] !log restarting hadoop to load jar from https://issues.cloudera.org/browse/DISTRO-461
[20:17:38] Logged the message, Master
[20:22:54] Change merged: JGonera; [analytics/limn-mobile-data] (master) - https://gerrit.wikimedia.org/r/58246
[20:41:01] !log resourcemanager has not come back online after last restart, working on it...
[20:41:02] Logged the message, Master
[20:41:39] it's not giving me any logs when it starts!
[20:41:43] so no idea why it won't come back online!
[20:51:20] ottomata: i don't know anything about resourcemanager, but if you want another pair of eyes to look at things, let me know
[20:54:21] i think i got it, disk space issue, some log files still not rotated properly, fixing
[20:54:52] yeah looking better
[20:54:58] milimetric: try your thing now, lemme know if you need help
[20:55:58] cool, yeah, it's definitely broken on my end somehow
[20:56:08] basically, I don't understand how to sync kraken to "the cloud"
[20:56:57] hangout?
[20:59:13] the cloud?
[20:59:15] sure
[20:59:18] :)
[20:59:19] i gotta run soon, lets hangout real quick
[21:01:45] iKraken!
[23:16:18] erosen, drdee, milimetric: the reason why I couldn't find the card is because it's already queued for dev: https://mingle.corp.wikimedia.org/projects/analytics/cards/430
[23:16:30] hehe
[23:16:35] it seemed familiar
[23:17:05] so that's the dependency for all the cards related to job review / parameter manipulation
[23:17:09] we're good
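
For the encoding thread earlier in the log (18:48 to 19:03), a minimal sketch of the "join inside your python script and then merge" approach when one source returns utf8 text and the other returns binary. The helper names, the (filename, event) and (filename, uploader) row shapes, and the sample rows are all hypothetical; this is not the code in uploaders-experience.py.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes-like values to str; pass everything else through unchanged."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def merge_on_filename(log_rows, commons_rows):
    """Join (filename, event) rows with (filename, uploader) rows on filename.

    Keys from both sides are normalized to str first, so a utf8 column
    and a binary column holding the same name compare equal.
    """
    uploaders = {to_text(name): to_text(user) for name, user in commons_rows}
    return [
        (to_text(name), event, uploaders[to_text(name)])
        for name, event in log_rows
        if to_text(name) in uploaders
    ]


# Hypothetical sample rows: the log side yields str, the commons side yields bytes.
log_rows = [("Caf\u00e9.jpg", "upload-attempt")]
commons_rows = [("Caf\u00e9.jpg".encode("utf-8"), b"ExampleUser")]
merged = merge_on_filename(log_rows, commons_rows)
```

Without the to_text normalization the str key and the bytes key never compare equal in Python 3, so the join would quietly return no rows, which is the kind of breakage the thread is worried about.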