[13:23:34] qchris: if you don't know the answer to this by heart, please don't look it up: [13:23:53] is 240 million pageviews *roughly* normal for a day in enwiki? [13:24:09] I don't know by heart. [13:24:15] sorry - 344 million total (desktop + zero + mobile) [13:24:18] ok, thanks! [13:24:32] But there was some discussion about increased counts on the internal list. [13:24:51] Meh. That is something different. Sorry. [13:33:31] i saw that, right [13:34:14] another question: I'm not sure what to do about the shelling out of tail to get the last line [13:34:30] in python it's terribly inefficient (if you see my reply to your comment) [13:34:43] so how bad is the shelling? [13:46:41] milimetric: If I can create a file in that directory, it allows privilege escalation to the user running the command and execute an arbitrary command. [13:46:51] I would not worry about being inefficient. [13:47:08] Once that becomes a bottleneck, we can speed it up. [13:47:14] ok, cool, i'll change it [13:47:38] awesome! [14:09:56] Analytics / Dashiki: Story: Vital Signs User selects the Daily or Monthly Pageviews metrics - https://bugzilla.wikimedia.org/72740 (Dan Andreescu) a:Dan Andreescu [14:11:08] (PS5) Milimetric: Transform projectcounts hourly files [analytics/refinery] - https://gerrit.wikimedia.org/r/169974 (https://bugzilla.wikimedia.org/72740) [14:11:45] k qchris ^ I ran that script on stat1002 with all the situations I planned for and it produced clean output [14:11:55] I couldn't write too many tests without mocking the filesystem and getting crazy [14:12:21] but manually testing it made me feel comfortable. Let me know if you think more is needed [14:13:02] Cool. [14:13:12] Will have a look. 
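A rough sketch of one way to get a file's last line in pure Python without shelling out to tail and without reading the whole file. This is not the code from the change under review; the function name `read_last_line` and the `chunk_size` parameter are made up for illustration, and UTF-8 content is assumed.

```python
import os


def read_last_line(path, chunk_size=4096):
    """Return the last non-empty line of a file, reading backwards in chunks.

    Hypothetical helper: assumes UTF-8 text and newline-terminated lines.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        buffer = b""
        while position > 0:
            # Step backwards by at most one chunk and prepend it to the buffer.
            step = min(chunk_size, position)
            position -= step
            f.seek(position)
            buffer = f.read(step) + buffer
            # Ignore trailing newlines, then look for the newline that
            # precedes the final line.
            stripped = buffer.rstrip(b"\r\n")
            newline_at = stripped.rfind(b"\n")
            if newline_at != -1:
                return stripped[newline_at + 1:].decode("utf-8")
        # Reached the start of the file: it holds at most one line.
        return buffer.rstrip(b"\r\n").decode("utf-8")
```

For small hourly files, simply reading all lines and taking the last one would also be fine, in line with the "don't worry about being inefficient" advice above; the chunked version only matters once files get large.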
[14:14:12] (CR) Milimetric: [C: 2] Add default tags and default project to cohorts view [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170963 (https://bugzilla.wikimedia.org/72746) (owner: Mforns) [14:14:20] (Merged) jenkins-bot: Add default tags and default project to cohorts view [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170963 (https://bugzilla.wikimedia.org/72746) (owner: Mforns) [14:20:22] (CR) Milimetric: [C: 2] Do not store reports that are not going to be used [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170703 (https://bugzilla.wikimedia.org/72635) (owner: Mforns) [14:20:28] (PS4) Milimetric: Do not store reports that are not going to be used [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170703 (https://bugzilla.wikimedia.org/72635) (owner: Mforns) [14:21:12] mforns: I rebased and merged your report store fix [14:21:15] looked good [14:21:32] ok, thnks! [14:33:04] (CR) Milimetric: Avoid exception accessing unknown project database (3 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170152 (https://bugzilla.wikimedia.org/72582) (owner: Mforns) [14:42:58] Analytics / Refinery: In Hive, change webrequest's time_firstbyte from float to double - https://bugzilla.wikimedia.org/73018 (christian) NEW p:Unprio s:normal a:None time_firstbyte is currently a single-precision float. Varnish exposes 9 decimal digits. So we should switch time_firstbyte... 
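To illustrate the precision concern in that bug: a single-precision float holds only about 7 significant decimal digits, so a value with 9 decimal digits does not survive a round trip through 32-bit storage. A small sketch, using a made-up sample value rather than a real time_firstbyte reading:

```python
import struct


def as_float32(value):
    """Round-trip a Python float (a 64-bit double) through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Hypothetical value with 9 decimal digits, not taken from real logs.
time_firstbyte = 0.123456789
stored = as_float32(time_firstbyte)

print(repr(time_firstbyte))      # the double keeps all 9 digits
print(repr(stored))              # the float32 version drifts in the later digits
print(stored == time_firstbyte)  # False: precision was lost
```

Values that are exactly representable in binary, such as 0.5, round-trip unchanged; the loss only shows up in the trailing digits.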
[14:43:11] (PS1) QChris: Change webrequest's time_firstbyte from float to double [analytics/refinery] - https://gerrit.wikimedia.org/r/171264 (https://bugzilla.wikimedia.org/73018) [14:44:41] (CR) Ottomata: [C: 2 V: 2] Change webrequest's time_firstbyte from float to double [analytics/refinery] - https://gerrit.wikimedia.org/r/171264 (https://bugzilla.wikimedia.org/73018) (owner: QChris) [14:51:12] (PS4) Mforns: Avoid exception accessing unknown project database [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170152 (https://bugzilla.wikimedia.org/72582) [14:53:42] Analytics / Refinery: Add Range header to varnishkafka - https://bugzilla.wikimedia.org/73021 (christian) NEW p:Unprio s:normal a:None For requests to partial content (like skipping in videos), the Range header helps to find the request for the beginning of a resource. This is valuable whe... [14:54:25] (PS1) QChris: Add Range header to webrequest table [analytics/refinery] - https://gerrit.wikimedia.org/r/171267 [14:54:50] (PS2) QChris: Add Range header to webrequest table [analytics/refinery] - https://gerrit.wikimedia.org/r/171267 (https://bugzilla.wikimedia.org/73021) [14:55:56] Analytics / Refinery: In Hive, change webrequest's time_firstbyte from float to double - https://bugzilla.wikimedia.org/73018#c3 (Dan Andreescu) float is the devil!!! sorry, couldn't resist, they cost me three days of pain once in a Berlekamp error corrector. [15:42:55] halfak: would now() - max(timestamp) help you? I think Kevin's fine with what I sent [15:44:49] milimetric: I think it would help as long as it returns time in days, not milliseconds [15:45:08] I think that timestamp comparison in MySQL does useful things. [15:45:10] * halfak checks [15:46:21] Bah. Never mind. [15:47:18] halfak / kevinator: ok, I'll run now() - max time and send output in days [15:47:35] Cool. 
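A sketch of the "now() - max(timestamp), in days" calculation, done on the client side in Python. It assumes the newest timestamp comes back as a 14-digit YYYYMMDDHHMMSS string in UTC; that format and the name `age_in_days` are assumptions for illustration, not something confirmed in this log.

```python
from datetime import datetime, timezone


def age_in_days(newest_timestamp, now=None):
    """Days elapsed since the newest record in a table.

    Assumes newest_timestamp looks like '20141105154718' (UTC).
    """
    newest = datetime.strptime(newest_timestamp, "%Y%m%d%H%M%S")
    newest = newest.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # total_seconds() avoids unit confusion; 86400 seconds per day.
    return (now - newest).total_seconds() / 86400


# Example with a fixed "now" so the result is reproducible.
fixed_now = datetime(2014, 11, 5, 12, 0, 0, tzinfo=timezone.utc)
print(age_in_days("20141101000000", now=fixed_now))  # 4.5
```

Converting to real datetimes before subtracting sidesteps the question of what unit a raw subtraction of timestamp values would be in.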
[15:47:36] * milimetric grumbles about how people can't do lightning speed math in their head like he does [15:47:41] :P [15:47:55] :) [15:48:30] can we also get the date of the newest record in days :-) I suspect some tables could be dropped entirely [15:48:48] if they aren’t being used [15:51:40] I’m headed to the office and will be back online in an hourish [15:55:30] (PS5) Nuria: Avoid exception accessing unknown project database [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170152 (https://bugzilla.wikimedia.org/72582) (owner: Mforns) [15:55:48] (CR) Nuria: [C: 2] Avoid exception accessing unknown project database [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170152 (https://bugzilla.wikimedia.org/72582) (owner: Mforns) [15:55:53] (Merged) jenkins-bot: Avoid exception accessing unknown project database [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/170152 (https://bugzilla.wikimedia.org/72582) (owner: Mforns) [16:07:52] mforns: so that got merged but what about my first comment here: https://gerrit.wikimedia.org/r/#/c/170152/3..5/wikimetrics/models/validate_cohort.py,cm [16:08:55] oh! mforns I see there's a raise e there ... didn't see that before [16:08:57] so that's fine [16:09:32] but mforns: add a comment like - "Dan you're being dumb, there's a raise e right there" [16:11:55] :] [16:12:30] the git diff pushed the raise e line way below [16:13:43] milimetric ^ [16:14:52] oh i see that now, but if you don't reply i don't know if you saw it or if I was being dumb [16:14:57] Analytics / Dashiki: Dashiki needs to have a friendlier mobile view. - https://bugzilla.wikimedia.org/73030 (nuria) NEW p:Unprio s:normal a:None Dashiki is the platform we are using to display data for Vital Signs Editor Data. See: https://metrics.wmflabs.org/static/public/dash/ While UI... [16:16:23] Oh, I replied... 
[16:16:32] but wait [16:17:43] I think I've been writing comments that no one has read, for some days now [16:18:10] I replied to your comment, but now I see the status of my comment is draft [16:18:13] :[ [16:20:24] milimetric: what should I do for you to see my 'drafts'? [16:21:31] oh! mforns we all do this :) [16:21:48] you have to submit comments [16:21:57] when they're drafts only you can see them [16:22:10] ooook [16:22:30] but, there's no submit button in the diff view [16:22:45] hold on lemme switch and i'll tell you what buttons you need :) [16:23:19] mforns: so you click "Review" [16:23:23] and then "Publish Comments" [16:23:38] ok [16:24:14] ok, sorry for that [16:25:42] btw, how do you usually handle CR comments: first push the new version, and then go to the previous revision and reply to comments? [16:25:53] or put a new comment in the new revision? [16:25:55] Analytics / Dashiki: Dashiki needs to have a friendlier mobile view. - https://bugzilla.wikimedia.org/73030#c1 (nuria) In mobile we should just be showing the graph plus buttons to move across metrics, the project selector should be hidden as at this time project selection requires larger real estate... [16:28:12] mforns: I usually have the review up while I'm making changes, and I reply to each comment and leave them as drafts. Once I'm done, I push the new patchset and I don't refresh my review page, I just publish the draft comments. It's good that way because the comments stay in context to the original review and the new patchset is clean. When I review I usually have the patchset with the comments on the left and the new one on the right so I can make sure all the comments were taken care of. [16:29:00] ok, makes sense [16:34:07] milimetric: is it OK if I go have lunch now? Or do you want to chat about dashiki first? [16:46:11] milimetric: I'll be right back [16:46:48] mforns_lunch: oh no! 
sorry to make you wait [16:46:53] in general - don't wait for me :) [16:50:20] milimetric: no problem, see you in a bit [17:33:01] milimetric: I'm back, so whenever you want, you can ping me [17:33:19] k mforns i have a bunch of meetings but i'll try to ping you soon [17:33:33] milimetric: fine [17:48:19] mforns: merged your change that was pending, sorry about the wait. [17:48:35] nuria__: no problem [17:49:24] thanks [18:13:10] Analytics / Dashiki: Dashiki needs to have a friendlier mobile view. - https://bugzilla.wikimedia.org/73030 (Andre Klapper) [19:06:19] Analytics / Wikimetrics: Misleading search result displayed when filtering cohorts - https://bugzilla.wikimedia.org/73040 (Kevin Leduc) NEW p:Unprio s:normal a:None Created attachment 17043 --> https://bugzilla.wikimedia.org/attachment.cgi?id=17043&action=edit cohort that should not be sh... [19:11:27] Analytics / EventLogging: Story: Identify and direct the purging of Event logging raw logs older than 90 days in stat1002 - https://bugzilla.wikimedia.org/72642#c4 (nuria) >Will deleting from there affect what logs we have stored in the DB? No it does not. >Is this an intermediate log storage place,... [21:48:04] milimetric, mforns : going to doc ap, will be back by your dinner time, do send me anything should be looking into [21:48:29] ok, good luck! [21:56:20] feel better nuria [21:56:30] i'm working with flow folks on their dashboards now [21:56:44] mforns: how long are you around for? 
[22:12:20] (PS1) Milimetric: Initial skeleton of directories [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/171454 [22:14:41] (CR) Milimetric: [C: 2 V: 2] Initial skeleton of directories [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/171454 (owner: Milimetric) [22:15:20] milimetric: ops, sorry I lost your msg [22:15:40] I'll be here for another 45 minutes [22:19:52] ok mforns let's catch up tomorrow morning on my steps to start to set up the flow team with dashboards [22:20:05] if you remember remind me :) [22:20:13] ok, fine! [22:40:27] milimetric, mforns : thanks in advance. https://www.mediawiki.org/wiki/Talk:Flow/Analytics sounds like you're nearly done already [22:41:23] spagewmf: looks are deceiving - the data part will take the most time. [22:54:52] (PS2) Ottomata: Add MapReduce job to convert Mediawiki XML Export Dumps into Revision Avro records [analytics/refinery/source] - https://gerrit.wikimedia.org/r/171056 [22:57:10] (PS3) Ottomata: Add MapReduce job to convert Mediawiki XML Export Dumps into Revision Avro records [analytics/refinery/source] - https://gerrit.wikimedia.org/r/171056 [22:58:50] hey YuviPanda: I heard you were officially ops today!! Congrats man [22:58:57] milimetric: :D thanks! [22:59:13] it was a few days ago, but mark was supposed to send the announcement today :) [22:59:49] i heard in SoS [23:01:05] :) [23:01:06] aaah [23:01:09] see you tomorrow folks [23:01:11] nice :) [23:01:27] * YuviPanda goes to slepe [23:01:29] night folks [23:08:55] ohhhhhboy [23:09:12] halfak: I am attempting to convert a full enwiki xmldump from xml into avro [23:09:17] ohhhhboy [23:09:19] we will see! [23:09:28] avro? 
[23:09:33] ja, it is a serialization format [23:09:46] like json, but binary [23:10:20] it will let me do many things, but first of all let me easily map a hive table on top of it [23:10:41] just experimenting at the moment :) [23:10:53] brb meeting [23:10:56] this conversion will not include the prev revision diffs, but i want to talk to you about that [23:11:13] ...maybe we can do something where the diff from the prev revision is computed during the conversion and saved [23:11:21] that way we don't have to duplicate each revision [23:11:25] anyyyway, ok [23:13:48] anyone (milimetric?) know the background to the flowdb replication request in bug 73047? [23:15:08] springle: yes [23:15:19] they're going to start instrumenting with event logging [23:15:29] why would we be replicating to stat1003? [23:15:29] and they want dashboards about how people use flow [23:16:03] sorry - I should've clarified that [23:16:03] they just need it somewhere that stat1003 can access it [23:16:04] like analytics-store is fine [23:16:04] ah ok [23:16:07] phew :) [23:16:10] i'll comment on the bug [23:16:12] :D [23:16:46] springle: people think that stat1003 is the database, because they log in there to access databases; it is a clarification that often needs to be made :) [23:17:01] why can't stat1003 access x1-analytics-slave? or must it be join-able with eventlogging? [23:17:11] it can [23:17:13] ottomata: :) [23:17:13] afaik [23:17:21] springle: yeah, joining is good here more than likely [23:17:49] can we clarify that too in the bug, please? [23:17:53] sure thing. [23:18:04] tnx [23:18:29] springle: while you are here: https://rt.wikimedia.org/Ticket/Display.html?id=8474 [23:18:30] :) [23:18:47] just needs password changed, just pick a date/time, and I will tell the list when it is going to happen [23:19:34] how much warning do they need? [23:22:43] not much, a day or two is fine [23:22:45] we've already warned [23:25:06] ottomata: tomorrow 2014-11-06 23:00:00 UTC? 
[23:25:49] cool, danke [23:25:54] make it so [23:26:02] right