[09:46:58] (CR) Nuria: [C: 2] Remove useless scripts [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/141076 (owner: Milimetric) [09:49:08] (CR) Nuria: [C: 2] "> This script will allow us to slowly roll it out and we shouldn't need to roll back except if we get in trouble." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/141077 (https://bugzilla.wikimedia.org/65946) (owner: Milimetric) [09:55:42] (CR) Nuria: Fix wiki cohort display (1 comment) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142514 (owner: Milimetric) [10:12:43] (CR) Nuria: [C: 2] Ensure wiki cohorts work [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/140830 (https://bugzilla.wikimedia.org/66290) (owner: Milimetric) [10:12:57] (Merged) jenkins-bot: Ensure wiki cohorts work [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/140830 (https://bugzilla.wikimedia.org/66290) (owner: Milimetric) [10:25:09] (PS2) Milimetric: Remove redundant outdated secondary log backup [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/141689 [10:25:14] (PS2) Milimetric: Add pretty symlink for WikimetricsBot [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) [10:25:33] (PS2) Milimetric: Fix wiki cohort display [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142514 [10:28:51] (CR) Milimetric: Fix wiki cohort display (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142514 (owner: Milimetric) [10:29:00] (PS2) Milimetric: Remove limit on recurrent, add throttling [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142007 (https://bugzilla.wikimedia.org/66841) [10:29:40] hey nuria, sorry I had forgotten to add you to like 3 of the patches I pushed over the last few days [10:29:52] * milimetric thinks adding people to patches manually is silly [10:29:57] np at all [10:30:19] i just had not looked at that code for couple of days. 
[10:30:32] so I'm gonna go bike into the coworking space today and I won't be available for about an hour, but until then I'm around [10:30:39] but wait.. is super early in the east coast [10:30:48] i hadn't even noticed [10:30:59] i couldn't sleep, it's getting hot here :( [10:31:01] I hate heat [10:31:53] so i guess i'll grab one of the hadoop tasks? [10:32:21] I'll be terrible at it :) can nobody else like qchris or you work on that instead? :D [10:32:37] * qchris reads backscroll. [10:33:28] * qchris is just finishing testing the backup. [10:33:48] Afterwards I'll pick up whatever is on the scrum board. [10:34:13] milimetric: I do not care much, whether I do Hadoop stuff or other work. [10:34:26] Whatever you prefer. [10:34:40] qchris: I saw you made the duplicate user_id bug s=none [10:34:48] but that it was in the current sprint [10:35:04] so I removed it from the sprint (because it seems nobody ever set it to the current sprint) [10:35:12] During sprint planning we said that it was there by accident, and we wanted it removed. [10:35:17] ah, good [10:35:21] i found out how to remove it [10:35:26] i logged in as admin and managed the stories [10:35:27] Really? How? [10:35:31] Ah. Ok. [10:35:35] i can share my admin credential privately if you want them [10:35:39] Nono. [10:35:46] I think you even already did at some point. [10:35:57] k [10:36:15] as far as hadoop, in this case it might be better if one of you does it [10:36:25] Ah ... the bug is gone from the board. Thanks milimetric. [10:37:01] I already did oozie stuff, so others could benefit from the experience [10:37:10] and the other one Andrew said was almost done or something? [10:37:16] I can grab one of those tasks soonish. [10:37:29] ok, I'll stop bothering you :) [10:37:33] yup, the script is already merged in the repos. [10:37:46] It just needs to get run automatically. 
[10:40:50] (PS1) Milimetric: Update for July meeting [analytics/reportcard/data] - https://gerrit.wikimedia.org/r/143265 [10:41:28] (CR) Milimetric: [C: 2 V: 2] Update for July meeting [analytics/reportcard/data] - https://gerrit.wikimedia.org/r/143265 (owner: Milimetric) [10:45:34] (CR) Nuria: "Grabbing patch to work on it today." (1 comment) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142514 (owner: Milimetric) [10:49:05] milimetric: I can certainly work on the hadoop tasks [10:49:43] I just need to talk to andrew about the process alarms i set as i cannot test them due to permissions, also would like to finish wikimetrics patch [10:49:59] but can grab a hadoop task after that [10:51:03] qchris: sorry about the flip-flop on the backup task, I know we agreed yesterday but looking at it now, it seems scary [10:51:22] milimetric: about to chime in on gerrit :-) [11:07:00] milimetric, there is still work to do on the "cron user runs reports" [11:07:16] as we need to start creating reports for newly registered users for small wikis [11:07:55] qchris, the backup runs with a cron on root user right? [11:08:33] so there is nothing additional to do there, correct? [11:09:03] nuria: Wikimetrics backup runs as root through cron. Right. I hate it. [11:10:05] ok.. ahem.. leaving hate aside.. we said we were leaving it that way or we changed that idea? [11:11:03] I thought we agreed to leave it like that for now. [11:11:09] Didn't we? [11:11:23] yes [11:11:27] i thought so too [11:12:38] then .. what is pending on the task? [11:14:41] Like getting it into production, cleaning up the mess around backup someone on our team left behind on staging. 
[11:16:57] nuria: agreed about "cron user runs reports", that's not done, but I am hopeful that we can finish it this sprint [11:17:03] at least most of the coding should be done [11:17:43] ah, i was the one that left a bunch of backed up stuff on staging but i call that testing, not garbage [11:18:41] also in dev too , as i tested quite a bit restoring dev from staging backup [11:18:58] nuria: Please do not twist my words. I didn't call it "garbage". [11:19:26] And what you call "testing" would overwrite production backup :-) [11:19:35] Every hour again and again. [11:19:48] not in env variables are correct in staging [11:20:08] nuria: Thanks for the hint. [11:20:47] sorry, "not if env variables are correct in staging" [11:28:17] brb commuting [11:30:51] Enjoy the ride, milimetric! [13:05:10] (CR) Gilles: [C: 2] Use 4 spaces to indent, per Python coding standards [analytics/multimedia] - https://gerrit.wikimedia.org/r/143180 (owner: Gergő Tisza) [13:05:16] (Merged) jenkins-bot: Use 4 spaces to indent, per Python coding standards [analytics/multimedia] - https://gerrit.wikimedia.org/r/143180 (owner: Gergő Tisza) [13:26:23] Is there a faster, more reliable way to pull page view data for large numbers of articles than just sending a whole lot of requests to stats.grok.se json urls? [13:28:39] not yet ragesoss, this is why we're working on the hadoop cluster [13:29:34] milimetric: thanks. Do you know how glamtools/treeviews does it? Just multiple simultaneous requests? [13:30:08] ragesoss: I believe they have a separate filter that takes out only the data they need to analyze and puts it somewhere for them [13:30:11] but I don't know the specifics [13:30:32] if you're interested, I can ask the people who do [13:30:48] but it probably doesn't apply to your case; speaking of that, what are you trying to analyze? [13:30:58] okay, thanks. 
I tried installing Magnus's code to my own tool account, but it wouldn't pull any pageviews (although the category stuff worked) [13:31:29] milimetric: my general use case is getting pageview data for the articles worked on by Education Program students. [13:32:06] qchris: what's the difference between PS2 and PS1 here: https://gerrit.wikimedia.org/r/#/c/143268/2 [13:32:33] at the moment, I'm just using a python script to get data for a list of articles that I pulled via SQL query. [13:32:56] to get a quick-and-dirty baseline for how much reach the work of these students has. [13:33:18] milimetric: the wikimetrics submodule's commit hash of the parent. [13:33:23] ragesoss: yeah, I don't know of any better way right now. But that's *definitely* the kind of questions we're aiming at answering with our projects [13:33:34] But in the medium term, I expect to be hiring someone to build an automated solution. [13:33:36] 1. the growth team is adding campaign-based cohorts [13:33:42] (to wikimetrics) [13:33:56] 2. we're standing up the hadoop cluster so we can make a proper queryable source for pageview data [13:34:10] what's the timeline on the hadoop pageviews? [13:34:11] 3. I think we'll put those things together and let you access pageview stats by cohort [13:34:42] months? a year? two years? [13:34:57] ragesoss: I think that's a better question for tnegrin and kevinator [13:35:01] :) [13:35:18] but my personal hope is that it's something we get done in 2014 [13:35:51] cool. that's a good piece of info, just have a ballpark of what might be possible. [13:35:59] unlike the past, this is something we are actively prioritizing though and it's no longer on the back burner [13:37:30] nuria, I just checked on hafnium [13:37:32] otto@hafnium:~$ sudo /usr/lib/nagios/plugins/check_eventlogging_jobs [13:37:32] OK: All defined EventLogging jobs are runnning. [13:37:41] mind if I stop the eventlogging jobs and run it? [13:37:43] see what happens? [13:37:44] just to be sure? 
[14:00:05] qchris: ottomata: standup [14:00:52] thanks [15:25:50] milimetric: ottomata & myself are done testing the EL stuff, back to CR-ing wikimetrics [15:25:58] sweet [15:26:23] I'm doing some reading atm, but working on oozie [15:33:05] eventlogging alert [15:34:08] * milimetric is now an email bot [15:35:36] haha, yeah i know [15:35:36] i'm checking it [15:36:22] good news that it triggered! i think there's some puppet weirdness that i caused there [15:36:31] but that is good because it actually isn't running right now (nuria :) ) [15:38:36] (CR) Nuria: Add pretty symlink for WikimetricsBot (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) (owner: Milimetric) [15:38:53] all right! [15:39:24] I actually had another bug about this: https://bugzilla.wikimedia.org/show_bug.cgi?id=67309 [15:39:33] i will add the patchset by hand and close it [15:43:34] (CR) Nuria: [C: 2] "Nice. We certainly do not need this." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/141689 (owner: Milimetric) [15:44:43] (Merged) jenkins-bot: Remove redundant outdated secondary log backup [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/141689 (owner: Milimetric) [15:46:35] (CR) Milimetric: Add pretty symlink for WikimetricsBot (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) (owner: Milimetric) [15:55:49] (CR) Nuria: Add pretty symlink for WikimetricsBot (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) (owner: Milimetric) [15:59:36] (CR) Milimetric: Add pretty symlink for WikimetricsBot (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) (owner: Milimetric) [16:07:18] (PS1) Milimetric: Migrate oozie folder from Kraken minus archive [analytics/refinery] - https://gerrit.wikimedia.org/r/143336 
(https://bugzilla.wikimedia.org/67128) [16:08:04] does anyone else get excited when you get like a really cool bugzilla or gerrit number? [16:08:11] 143336, score! [16:15:15] (CR) Ottomata: [C: 2 V: 2] "This patch was originally reviewed at:" [analytics/refinery] - https://gerrit.wikimedia.org/r/143336 (https://bugzilla.wikimedia.org/67128) (owner: Milimetric) [16:33:29] (CR) Nuria: Add pretty symlink for WikimetricsBot (2 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/143040 (https://bugzilla.wikimedia.org/66087) (owner: Milimetric) [16:39:58] milimetric, qchris need to get groceries, will be back online in 1hr 30 min [16:40:24] Happy shopping. [18:10:08] i am back ... [18:14:27] i was in a meeting, i'm back too :) [18:14:38] leila! much easier to tab-complete [18:14:39] :) [18:15:00] totally! :-) [18:19:24] after ottomata's help with testing alarms [18:19:38] i think the item with EL and graphite counts is done [18:20:06] cool [18:32:20] I started the channel-craw I had also pretty much run out of patience. Best. Learning Experience. ever [18:36:27] Since I managed to screw up joining the correct channel, would Think the HSenWHo peiopen Although it did get me interested... What sort of stuff do you use analplatform I was woken up by the PA at every station. Scared the hell out of me when I only got home relatively late in the evening) :p [18:43:07] hey milimetric [18:43:14] hi [18:43:34] milimetric: I’m getting “Wikimetrics is experiencing problems” message while trying to create a bytes added report [18:44:27] (PS1) Milimetric: Update July pageview data [analytics/reportcard/data] - https://gerrit.wikimedia.org/r/143370 [18:44:40] uhoh [18:44:42] checking [18:45:16] kevinator: ? [18:45:19] how / what / where? 
[18:45:26] i'm not seeing problems, just ran a report [18:45:36] on production… after clicking “Run Report" [18:45:50] (CR) Milimetric: [C: 2 V: 2] Update July pageview data [analytics/reportcard/data] - https://gerrit.wikimedia.org/r/143370 (owner: Milimetric) [18:47:05] kevinator: I did that and I got my report, so can you be specific and list the steps / tell me what cohort you're using? [18:47:19] do you want to screenshare? [18:47:25] batcave? [18:51:40] milimetric: it must be my cohort… I can run the same report against other cohorts [18:53:03] milimetric: aha, my cohort has 0 valid users in it [18:55:09] :) [18:55:17] kevinator: that's one of the bugs we've got a patch in for [18:55:23] so it doesn't show that cohort on the report screen [18:55:40] awesome… thanks [18:55:45] kevinator: sorry missed your ping about batcave, I can't hear my pings very well here at the co-working spot [18:55:56] no worries [18:56:11] It made me troubleshoot myself :-) [19:00:05] ack, ori [19:00:09] confine with blocks only works in facter 2 [19:00:19] i think... [19:00:50] at least, not working in vagrant... [19:00:50] hm [19:01:13] oops, meant to put that in ops [19:22:14] ottomata: I tried man, I don't think it's physically possible for me to do this task [19:22:19] I'll end up stabbing myself in the face [19:22:44] I remembered that like a year ago I said I would never attempt this again, and I think I was totally right then [19:22:54] haha [19:22:55] cmooon [19:22:58] just talk with meeee! [19:23:04] it is physically possible [19:23:06] what's up? [19:23:14] we can do it together, ya know? [19:23:17] "physically possible" ... jaja [19:23:18] :) [19:23:40] i donno man [19:23:44] i am happy to grab it tomorrow if you are near the end of your day [19:23:46] all I want to do is write a workflow generator [19:24:12] now i have no clue as to how oozie works but hey ... [19:24:18] because no human being should ever have to manually type in all this boiler plate to basically... 
schedule a job [19:24:26] http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/ [19:24:48] like, there are three separate files that basically say "every hour do this" [19:24:58] and they're LONG files [19:25:28] with no syntax, grammar, or any way to know what you're doing is right or wrong [19:26:04] Browsing through the stats and services available through wikimedia.org is fascinating. I never knew you guys published data so openly regarding, well, everything really. Also I like how the domains presented feel completely alien to me because of the different type of industry I work in. [19:26:13] there is a pretty specific xml grammar [19:26:14] Coolest find of the week for me :) [19:26:19] if you wanna really get into it :) [19:26:23] :) yay Entalyan [19:26:38] and, milimetric, the way the webrequest add stuff is structured [19:26:44] you should be able to just copy/paste and change stuff [19:26:55] i've been doing that for a couple of hours [19:27:13] i basically just deleted everything i did in complete confusion [19:27:37] haha, maybe we should do it together? [19:27:57] i don't think it's very useful, honestly, i do appreciate it though :) [19:28:09] not useful because I really plan on never doing this [19:28:32] and if I have to do it, the way i would do it would be to write a wizard that generated the files we need [19:28:48] although, hue seems to have that already [19:29:03] so I'd try that first, but do we have hue running ? [19:30:49] brb, gonna take a walk :) [19:34:06] we don't really have hue running yet, but CDH5 hue seems a little nicer [19:34:10] hopefully will have it more official then [19:34:14] milimetric: when you get back, let's batcave [20:15:52] ottomata: batcave? [20:18:24] I'm not sure the amount of software in use/development in-house is genius or madness... Impressive regardless. 
Going to pick that EventLogger extension apart a bit tomorrow to compare the solutions used in subsystems to systems we currently use. [20:19:03] But I've been staring at unfamiliar code/wikis for the past 2 hours now, so I'm going to give it a rest. Good night! [21:33:52] (PS1) Milimetric: Ignore temporary vim files [analytics/refinery] - https://gerrit.wikimedia.org/r/143485 [21:36:29] (PS1) Milimetric: [WIP] Oozify sequence_stats hive script [analytics/refinery] - https://gerrit.wikimedia.org/r/143486 (https://bugzilla.wikimedia.org/67128) [22:04:07] (PS3) Nuria: Fix wiki cohort display for report cohorts [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/142514 (owner: Milimetric) [22:41:47] (PS1) Gergő Tisza: [WIP] Track opt-out ratio [analytics/multimedia] - https://gerrit.wikimedia.org/r/143501