[14:20:23] morning drdee!
[14:32:51] mooooorning ottomata!!
[14:32:56] BAM BAM BAM
[14:35:34] morning!
[14:35:45] so, i'm doing the coalesce stuff on the zero carrier country
[14:35:54] just wanted to make sure i will do what you wanted:
[14:35:55] you want:
[14:36:12] - stop jobs
[14:36:12] - remove old data
[14:36:12] - resubmit jobs, starting march 1
[14:36:13] ?
[14:50:18] yes but i need to:
[14:50:35] 1) update the pig scripts (trivial fix, just push code)
[14:50:42] 2) copy new jars to HDFS
[14:51:04] so once i have done that then yes you can take those steps
[14:51:23] so an20 is now behaving?
[14:52:11] i haven't checked it yet, but leslie says yes
[14:52:12] yes
[14:56:30] kool
[15:08:05] hey drdee
[15:08:10] yooooo
[15:08:18] can we make this run on the webrequest-mobile dataset too?
[15:08:25] the geocoded and anonymized one?
[15:08:39] that means the country pig script will have to change, since it doesn't have to geocode anymore
[15:09:01] oh carrier too
[15:09:03] it uses GEO()
[15:09:10] yup
[15:09:12] hm
[15:09:28] i suppose we don't have to do this now...
[15:09:28] hm
[15:09:37] since we won't have to backfill when we change this later
[15:09:39] so it's not a huge deal
[15:09:46] i just want to avoid having to backfill every time
[15:09:51] sure
[15:10:54] also, i was talking to yuir
[15:10:56] yuri
[15:11:03] and he said hourly data might be more useful than daily to him
[15:11:07] what do you think?
[15:11:12] should we switch this back to hourly?
[15:13:50] drdee ^
[15:14:04] no, we should coordinate that with amit and yuri
[15:15:01] ok, we should do that before we backfill then
[15:15:04] should I email them and ask?
[15:15:24] yes please email yuri, evan, amit and myself, thanks!
[15:16:35] maybe you can also mention https://mingle.corp.wikimedia.org/projects/analytics/cards/767 in your email
[16:39:31] hey erosen
[16:39:36] heyo!
[16:39:40] I got nose.run() to run
[16:39:45] i thought you might appear shortly after diederik
[16:39:50] but i can't get it to detect tests in /tests
[16:39:52] yea, I messed around with that last night
[16:39:56] yup
[16:40:05] I thought nose.run(module='tests') would do it
[16:40:07] i'll come down to 3
[16:40:07] but no
[16:40:16] i think it needs to find that module
[16:40:20] it did
[16:40:27] because it runs the /tests/__init__.py
[16:40:32] interesting...
[16:40:37] maybe it needs to find the setup.cfg?
[16:40:50] mmm... yeah, maybe
[16:40:51] i'll come down to 3 now
[16:40:56] can't figure out how to pass that from the docs
[16:41:03] but maybe just moving it...
[16:57:22] hi tnegrin
[16:57:24] morning
[16:58:23] welcome tnegrin!
[17:02:07] average_drifter: scrum?
[17:06:28] hi
[17:06:30] can't make it today
[17:06:35] to scrum
[17:06:41] I'm looking at Faidon's review for dclass
[17:06:47] working to fix the things there
[17:07:05] also finishing up what we talked about yesterday regarding 738
[17:14:33] drdee https://mingle.corp.wikimedia.org/projects/analytics/cards/745
[17:47:12] drdee: oh meeting cancelled?
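For reference on the nose.run() discovery issue discussed at 16:39-16:41 above, here is a minimal sketch of driving nose programmatically so it finds tests under a ./tests package. The layout and arguments are assumptions for illustration, not the actual project setup being debugged:

```python
# A minimal sketch, assuming a ./tests package sits next to this script.
# nose.run(module='tests') imports the package (hence tests/__init__.py
# runs) but can still skip discovery of the test files themselves;
# pointing nose at the directory via argv behaves like `nosetests tests/`.
import nose

if __name__ == '__main__':
    # setup.cfg is typically only picked up from the current working
    # directory, so this should be run from the project root.
    nose.run(argv=['nosetests', 'tests'])
```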
[17:47:25] yup, tnegrin was double booked
[17:47:30] hm ok
[18:34:52] tnegrin: https://mingle.corp.wikimedia.org/projects/analytics/cards/291 is the relevant ticket for hosting the stats.grok.se application
[18:53:03] New patchset: Erik Zachte; "New script to collect WLM stats, breakdown of uploads/contributors by country" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69517
[18:54:33] Change merged: Erik Zachte; [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69517
[18:57:28] New patchset: Erik Zachte; "show for last 7 days how many dumps were processed each day" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69520
[19:00:48] Change merged: Erik Zachte; [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69520
[19:00:56] ottomata: can you have a look at https://mingle.corp.wikimedia.org/projects/analytics/cards/768 and add some more details?
[19:00:57] New patchset: Erik Zachte; "fix for script failure \xFF's in input, seen as end of file marker" [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69522
[19:03:32] Change merged: Erik Zachte; [analytics/wikistats] (master) - https://gerrit.wikimedia.org/r/69522
[20:18:44] drdee: http://infolab.stanford.edu/~usriv/papers/pig-latin.pdf
[20:18:53] go there and search "an equivalent"
[20:33:25] drdee
[20:33:27] awesoooome
[20:33:27] 13/06/19 20:33:07 INFO balancer.Balancer: Need to move 8.18 TB to make the cluster balanced.
[20:33:27] 13/06/19 20:33:07 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.64.36.113:50010 to 10.64.36.120:50010
[20:33:27] 13/06/19 20:33:07 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.64.36.115:50010 to 10.64.36.119:50010
[20:33:27] 13/06/19 20:33:07 INFO balancer.Balancer: Will move 20 GB in this iteration
[20:33:46] very cool indeed
[20:36:28] "We just completed the deployment of Kafka 0.8 (latest code in 0.8 branch) to production at LinkedIn yesterday. No major issues were found." junrao @ linkedin
[20:36:41] woowoo
[20:39:32] awesome!
[20:39:45] Snaps, maybe you can mention this in the #ops channel as well
[20:40:33] done
[20:46:15] ty!
[20:49:29] They're all about kafka's predecessor, tftp! (give or take 30 years) ;)
[20:49:41] aight
[20:52:54] ha
[21:06:57] ottomata, ping
[21:07:40] wanted to talk to you about the collection frequency email
[21:07:48] (for zero)
[21:11:37] please talk to me first yurik :)
[21:12:07] drdee, i would love to talk to you too! =)
[21:12:22] (that's why i joined the channel :)))
[21:12:23] i am on the 3rd floor actually
[21:12:32] oh wait you are EST
[21:12:34] nvm
[21:12:43] i'm in the hummus place
[21:12:56] in new york :)
[21:13:01] piiing
[21:13:19] yeah! so that fancy 1 second stuff is not possible right now
[21:13:38] drdee, ottomata, i sent an email earlier about my thoughts on graphing -- i think this is what zero needs for both our own and external use
[21:13:39] we could get you batch-generated hourly data right now
[21:13:51] but ultimately, yeah! that would be amazing
[21:14:40] ottomata, do you know of any OSS graphing solution that handles different levels of data aggregation with time?
[21:14:48] something similar to the google finance graph?
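As an aside on the "different levels of data aggregation with time" question just asked, here is a toy sketch of the general idea (which RRDtool's round-robin archives implement): keep the same series at several step sizes so zoomed-out views read coarser data. All names and step sizes here are illustrative, not any particular tool's API:

```python
# A toy sketch of multi-resolution time-series aggregation.
# Assumes integer-second timestamps; not tied to RRDtool or limn.
def downsample(points, step):
    """Average (timestamp, value) points into buckets of `step` seconds."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - ts % step, []).append(value)
    return sorted((ts, sum(vs) / len(vs)) for ts, vs in buckets.items())

raw = [(i, float(i % 60)) for i in range(3600)]  # one hour of 1s samples
# Pre-aggregated archives at 1-minute, 5-minute, and 1-hour resolution:
archives = {step: downsample(raw, step) for step in (60, 300, 3600)}
```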
[21:15:04] not pretty like that, but milimetric probably knows more about the data vis world than me
[21:15:30] pretty sure that's how RRDtool works
[21:15:41] ottomata: did you delete the duplicated log dies and replace them with the deduplicated log files?
[21:15:45] no
[21:15:46] not yet
[21:15:48] dies -> files
[21:15:49] limn adjusts the x and y scales automatically yurik, but I think most vis packages do
[21:15:49] oh
[21:15:50] sorry
[21:15:51] yes
[21:15:52] duplicated
[21:15:52] yes
[21:15:53] i did
[21:16:03] i was thinking you were talking about the zero stuff we are going to do soon
[21:16:07] yes,
[21:16:09] ok
[21:16:22] so i have bad news
[21:16:27] poop
[21:16:32] i think there are still duplicates.....
[21:16:47] gr
[21:28:39] milimetric, thanks, does it allow things like scrolling and realtime updates?
[21:29:12] drdee, what do you think of our plans in general (from email)?
[21:29:24] yurik: before jumping into a solution can we talk a bit more about use cases
[21:29:50] for example, is handling of different levels of data aggregation with time a must-have feature?
[21:30:00] we have a dashboard tool limn that can make charts
[21:30:32] yup still duplicates
[21:30:36] sigh sigh sigh
[21:30:38] sorry guys
[21:31:56] drdee, sure - i wrote some use cases in that email at the end. Would love to discuss them. Multilevel is highly useful for issue monitoring, and technically doesn't even have to be together with historical, but it seems to have a lot of overlapping functionality requirements (like how to group and visualize data) with the historical one
[21:32:38] i would like to capture your request as a separate story, and not mix it with amit's request
[21:32:50] they seem to be too different
[21:33:51] i'm not sure i understand what exactly is required by amit's req
[21:35:58] drdee, i looked through the email, it doesn't seem to be there
[21:50:00] drdee! I just ran deduplicate on 5-23, saved the output in my home dir, and then ran your dup checker script on it
[21:50:05] 0 duplicates.
[21:50:22] great!
[21:54:47] but
[21:54:48] i mean
[21:54:50] sure great
[21:54:58] but why are there duplicates in the logs!?
[21:55:06] i should have deduplicated them already!
[21:55:10] dunno
[21:55:16] you ran that script :)
[21:55:23] i know!
[21:55:40] grrrr, ok, i'm going to try again on the webrequest-mobile script
[21:55:42] logs*
[21:55:46] and then will check them
[22:01:32] k
[22:17:34] ottomata: Faidon asks why we need shlibs
[22:18:02] haha
[22:18:21] ottomata: should we address this in #wikimedia-operations?
[22:18:31] wait, where is he asking that?
[22:18:39] ottomata: in the comments, I'll link you
[22:18:45] ottomata: https://gerrit.wikimedia.org/r/#/c/68711/2/debian/libdclass0.shlibs
[22:18:50] ottomata: https://gerrit.wikimedia.org/r/#/c/68711/2/debian/libdclass-jni.shlibs
[22:18:53] oh i see it
[22:22:29] New review: Ottomata; "(1 comment)" [analytics/dclass] (debian) - https://gerrit.wikimedia.org/r/68711
[23:50:48] ottomata still around?
[23:52:31] sorta not really
[23:52:33] making dinnaaah
[23:52:34] whats up?
[23:52:38] i got new jars
[23:52:39] !
[23:52:49] shall i put them in /libs/kraken-0.0.2/ ?
[23:53:00] then you can relaunch the zero job
[23:53:10] (zero_country and zero_carrier)
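For context on the duplicate checks discussed around 21:50-21:55 above, a minimal sketch of a whole-line duplicate counter. This assumes duplicate records are byte-identical lines; it is not the actual deduplicate or dup-checker script referenced in the log:

```python
# A minimal sketch, assuming duplicates are byte-identical log lines.
# Not the real deduplicate/dup-checker tooling mentioned above.
import sys
from collections import Counter

def count_duplicates(path):
    """Return how many extra copies of already-seen lines the file holds."""
    counts = Counter()
    with open(path, 'rb') as f:
        for line in f:
            counts[line] += 1
    # n - 1 extra copies per distinct line; unique lines contribute 0.
    return sum(n - 1 for n in counts.values())

if __name__ == '__main__':
    print(count_duplicates(sys.argv[1]))
```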