[09:15:31] Analytics-Tech-community-metrics: Remove the filter for key Wikimedia software projects in korma.wmflabs.org - https://phabricator.wikimedia.org/T86154#1024253 (Qgil) >>! In T86154#1020446, @Dicortazar wrote: > Once we remove the filter of key Wikimedia projects, there are some open questions: > * projects to... [13:23:58] (CR) Ananthrk: "Converted IpUtil methods to instance methods to force ClientIpUDF to explicitly instantiate IpUtil. This required ClientIpUDF to be writte" (10 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187651 (owner: Ananthrk) [13:25:04] (PS4) Ananthrk: [WiP] Add new UDF to determine client IP given source IP and XFF header [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187651 [14:38:45] (CR) QChris: [C: -1] "> This required ClientIpUDF to be written" (10 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187651 (owner: Ananthrk) [15:13:37] Analytics-Kanban: Ad-hoc cron to get data for Language team - https://phabricator.wikimedia.org/T88987#1024863 (Milimetric) NEW a:Milimetric [15:15:17] Analytics-Kanban: Ad-hoc cron to get data for Language team - https://phabricator.wikimedia.org/T88987#1024877 (Milimetric) Open>Resolved cron running under my user (milimetric) on stat1003, generating these files: http://datasets.wikimedia.org//limn-public-data/language/datafiles/daily_content_transla... [15:16:39] Analytics-Kanban: Ad-hoc cron to get data for Language team [3 pts] - https://phabricator.wikimedia.org/T88987#1024880 (Milimetric) [16:11:42] nuria: good morning, i can update EL on hafnium for you now. [16:12:39] jgage: ok, sounds great. Thank you [16:13:23] whew this script has a lot of output [16:13:45] jgage: let me know if you need help as i know what is needed is just that i do not have permits [16:15:37] ok. how long do we expect this script to take? 
it's outputting hundreds of lines like: [16:15:40] {"wiki": "eswiki", "uuid": "c0605b0e252f52d5a4ee21cb59e7f093", "webHost": "es.wikipedia.org", "timestamp": 1423498466, "recvFrom": "cp3019.esams.wikimedia.org", "seqId": 37291356, "userAgent": "Mozilla/5.0 (Windows NT 6.0; rv:35.0) Gecko/20100101 Firefox/35.0\n", "clientIp": "c48cd5a74aef2a3c793d3fd422bbd96758aa982e", "schema": "MediaViewer", "event": {"action": "right-click-image", "samplingFactor": 1}, "revision": 10867062} [16:16:00] i ran it with the --no-update option per the instructions [16:19:36] hm ok the last line of the script after 'eventloggingctl start' is 'zsub eventlogging.eqiad.wmnet:8600' so i guess i'm just seeing confirmation of successful logging [16:19:44] and now i suppose i can just hit ctrl-c [16:20:20] yep looks like it's running fine [16:20:47] oh hey the ctrl-c part is even documented :D [16:31:28] nuria: ok i've done the git deploy from tin and rerun the script on hafnium, do you have a way to confirm whether it looks as expected? [16:31:38] jgage: looking now [16:33:04] jgage: I think tin did not grab the latest change? we are 1 behind i believe: this one is missing: https://gerrit.wikimedia.org/r/#/c/188950/2 [16:34:12] jgage: but i am not sure how branches are cut on mediawiki so actually that might be the last change available now [16:34:41] yeah the documentation didn't give details on what to check out so i grabbed the latest branch, wmf/1.25wmf16 [16:35:59] after grepping root's bash history to discover /srv/deployment/eventlogging [16:37:10] jgage: Could we deploy latest from master or is that not possible? [16:37:19] doing that now [16:41:21] nuria: ok deployed from master, check please? [16:41:51] jgage: checking [16:42:51] jgage: code checkout looks good, process started, can you tail these two logs that require sudo?
[16:43:01] https://www.irccloud.com/pastebin/lC8ZChb0 [16:44:00] sure, one sec [16:45:27] nuria: both logs have recent timestamps [16:45:35] 2015-02-09 16:41:08,179 Driving tcp://vanadium.eqiad.wmnet:8600?socket_id=graphite -> statsd://statsd.eqiad.wmnet:8125.. [16:45:41] jgage: ok, i think we are good then THANK YOU! [16:45:44] 2015-02-09 16:45:11,106 ve.behavior.saveDialogClose.mwTarget:7377|ms [16:45:52] yay! you're welcome :) [17:26:04] i'm looking into those icinga alarms that just went off [17:26:14] but assuming maybe it has something to do with what you two were talking about above ^ [17:28:47] milimetric: ok, we just deployed code to hafnium but probably alarms have to do with some new event not validating. I am checking navigation timing alarms in the past week [17:29:04] graph looks ok [17:29:43] well... it dips a little bit around where you guys were playing with it [17:29:47] but that makes sense if it was restarted [17:35:44] (CR) Ananthrk: [WiP] Add new UDF to determine client IP given source IP and XFF header (10 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187651 (owner: Ananthrk) [17:36:12] (PS5) Ananthrk: [WiP] Add new UDF to determine client IP given source IP and XFF header [analytics/refinery/source] - https://gerrit.wikimedia.org/r/187651 [17:36:30] mforns: is this the task you are working on? https://phabricator.wikimedia.org/T76407 [17:44:58] milimetric: we definitely have a problem with all those navigation timing alarms being raised over the last 3 days [17:45:40] oh yea? what's in the logs? [17:50:38] kevinator, not at the moment, I'm working on https://phabricator.wikimedia.org/T84892, it's in the kanban-In Progress [17:50:58] (and also in https://phabricator.wikimedia.org/T88583) [17:51:12] got it, thanks [17:51:18] np [17:52:34] the title is really general. What logging needs to be improved?
[18:00:02] kevinator, I'll change it [18:00:23] ty [18:03:34] Analytics-Visualization, Analytics-Kanban: Configure limn-mobile-data logging to output into console, file and/or logstash - https://phabricator.wikimedia.org/T84892#1025379 (mforns) [19:00:53] (PS5) Mforns: Integrated logging [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/180828 (https://phabricator.wikimedia.org/T84892) (owner: Rtnpro) [19:10:11] (PS1) Mforns: Adapt loading and automatic verification scripts [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/189532 (https://phabricator.wikimedia.org/T88583) [19:10:27] Analytics-Kanban, Analytics-EventLogging: Estimate maximum throughput of Schema:Search - https://phabricator.wikimedia.org/T89019#1025607 (kevinator) NEW [19:25:05] nuria / mforns: I'm sorry I used these blindly, apparently "map" and "filter" suck really bad: http://jsperf.com/2-loops-1-operation-each-or-1-loop-2-operations [19:27:25] milimetric: for our use case? so "loop over it, filtering and processing at the same time" is better right? [19:28:02] well, there's not a huge difference between two loops or one loop [19:28:18] the surprising HUGE difference is from 2 loops to filter + map [19:28:26] filter + map is 10 times slower [19:28:34] Analytics-Wikistats: Discrepancies in historical total active editor numbers - https://phabricator.wikimedia.org/T87738#1025669 (DarTar) @ezachte: can you look into this? Neither Aaron nor myself can generate/audit the TAE data using the legacy definition. 
[19:28:39] milimetric: ah , yes between "1 loop " and "2 loops manual" [19:28:54] 1 loop and 2 loops manual are about the same, yeah [19:29:00] 2 loops: filter and map is the bad one [19:29:32] milimetric: now, whether that matters (you can perceive it) will depend on our app, in this case seems like an easy change to make [19:29:38] oddly, it seems Chrome optimizes the 2 loops one and makes it faster than a single loop, that's completely crazy :) [19:29:57] oh yeah, i'm changing it. In our case when we're dealing with millions of records, it's noticeable for sure [19:30:51] dashiki should benefit from this too, it's just shocking it's that much slower. I would've thought like 10-20% [19:32:56] Analytics-Kanban, Analytics: Total pageviews (new definition) for Oct-Dec 2014 - https://phabricator.wikimedia.org/T88844#1025673 (kevinator) Note: this data is not citable. The data is available in Pentaho, but it is not a public or reliable server to point people to. [19:36:42] Analytics-Kanban, Analytics: Total pageviews (new definition) for Oct-Dec 2014 - https://phabricator.wikimedia.org/T88844#1025675 (kevinator) @Tbayer: the new definition will show discrepancies with the old data. We should stick to the legacy data until we have stable citable source for the new data. This a... [19:42:09] milimetric, nuria, sorry I was having a snack [19:42:25] what?! during work?! [19:42:25] :P [19:42:58] i was just saying that it turns out filter and map are really bad performance wise [19:43:05] milimetric, I suppose that's because of the time that the vm takes in stacking the function context and creating the scope etc... [19:43:15] compared to an if statement... [19:43:22] but it turns out the gain is only from 15 milliseconds to 10 milliseconds in this particular case [19:43:25] so I'll leave it for now [19:43:29] but I'm very skeptical now [19:43:34] yep, agreed [19:43:46] i thought it'd be slower, but the perf test above shows that it's 10x slower! 
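A minimal sketch of the three variants being compared above (native filter + map, two manual loops, one manual loop). The `records` data, the `isValid` predicate and the `transform` function are hypothetical stand-ins and are not taken from the jsperf test. All three variants return the same array; they differ only in the number of passes, callback invocations and intermediate arrays. Relative timings depend on the JavaScript engine and its version, so the 10x figure should be re-measured rather than assumed.

```javascript
// Hypothetical data and helpers, for illustration only.
const records = Array.from({ length: 1000 }, (_, i) => ({ id: i, value: i % 7 }));
const isValid = (r) => r.value !== 0;
const transform = (r) => r.value * 2;

// Variant 1: native filter + map.
// Two passes, one callback invocation per element per pass,
// and an intermediate array between the passes.
function filterThenMap(items) {
  return items.filter(isValid).map(transform);
}

// Variant 2: two manual loops.
// Still two passes and an intermediate array, but no callbacks.
function twoLoops(items) {
  const kept = [];
  for (let i = 0; i < items.length; i++) {
    if (items[i].value !== 0) kept.push(items[i]);
  }
  const out = [];
  for (let i = 0; i < kept.length; i++) {
    out.push(kept[i].value * 2);
  }
  return out;
}

// Variant 3: one manual loop that filters and transforms in the same pass.
function oneLoop(items) {
  const out = [];
  for (let i = 0; i < items.length; i++) {
    const r = items[i];
    if (r.value !== 0) out.push(r.value * 2);
  }
  return out;
}
```

Calling `filterThenMap(records)`, `twoLoops(records)` and `oneLoop(records)` gives identical results, so swapping one for another is a pure performance decision that can be checked with a benchmark on the real data size.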
[19:43:51] aha [19:43:55] i was thinking more like 10-20 percent [19:44:19] but it makes sense, because an if is like a very basic operation [19:44:44] but a function call is like probably 10 times more complex [19:46:58] Analytics: Report Quarterly Metrics Scorecard - https://phabricator.wikimedia.org/T89024#1025696 (kevinator) NEW [19:48:51] milimetric, and I think (maybe it justifies the other difference in chrome) that V8 optimizes the common code paths, so if a loop contains an if statement inside, it's more difficult for V8 to predict the next operation [19:50:32] although both solutions (1 loop, and 2 loops: manual) have an if inside the loop... well, I don't know [19:52:09] i added another test to test your hypothesis: http://jsperf.com/2-loops-1-operation-each-or-1-loop-2-operations [19:52:15] i think that's half the truth [19:52:29] the other half must be all the safety checks and various things happening in the native map/filter [19:55:07] milimetric, I see, awesome [19:57:35] Analytics: Report Quarterly Metrics Scorecard - https://phabricator.wikimedia.org/T89024#1025719 (kevinator) [19:58:54] Analytics: Report Quarterly Metrics Scorecard - https://phabricator.wikimedia.org/T89024#1025696 (kevinator) [20:00:25] Analytics: Number of new registrations for Oct-Dec 2014 - https://phabricator.wikimedia.org/T88846#1025731 (kevinator) [20:00:26] Analytics-Wikistats: Provide total active editors for December 2014 - https://phabricator.wikimedia.org/T88403#1010737 (kevinator) [20:00:27] Analytics: Report Quarterly Metrics Scorecard - https://phabricator.wikimedia.org/T89024#1025729 (kevinator) [20:00:28] Analytics-Wikistats: Discrepancies in historical total active editor numbers - https://phabricator.wikimedia.org/T87738#1025732 (kevinator) [20:00:30] Analytics-Kanban, Analytics: Total pageviews (new definition) for Oct-Dec 2014 - https://phabricator.wikimedia.org/T88844#1025733 (kevinator) [20:01:13] Analytics: Report Quarterly Metrics Scorecard Oct-Dec 2014 -
https://phabricator.wikimedia.org/T89024#1025734 (kevinator) [20:02:55] kevinator, milimetric, nuria, is it ok for you if I grab this task now: https://phabricator.wikimedia.org/T88812 ? [20:03:10] works for me [20:03:40] mforns, kevinator , milimetric : but I think kevinator needs to fill in the user requirements 1st , right? [20:04:40] kevinator, milimetric , mforns : cause i think we should nail down use cases 1st and 2nd talk about implementation as implementation will depend [20:05:08] nuria, ok, then I better take another [20:09:24] nuria, what about this one: https://phabricator.wikimedia.org/T87660, or do you have suggestions? [20:12:45] mforns: sounds fine, backwards compatibility is kind of hard, as the time needs to be preserved for already existing tables, but changed on new tables, I am not sure whether that is doable. So, please take a look at code and think whether it'd be possible to do those changes in a backwards compatible way. [20:13:10] ok nuria [20:15:59] Analytics-Kanban, Analytics-EventLogging: Sanity check changes to timestamp fields and remove autoincrement id from tables & deploy to Prod [8 pts] - https://phabricator.wikimedia.org/T88297#1025764 (Nuria) [20:16:02] Analytics-EventLogging, Analytics-Kanban: Remove autoincrement id from tables [5 pts] - https://phabricator.wikimedia.org/T87661#1025763 (Nuria) Open>Resolved [20:16:31] Analytics-Kanban, Analytics-EventLogging: Change timestamp fields to reduce DB storage size [8 pts] - https://phabricator.wikimedia.org/T87660#1025766 (mforns) a:mforns [20:16:37] Analytics-EventLogging, Analytics-Kanban: Drop clientValidated and isTruncated fields from event capsule - https://phabricator.wikimedia.org/T88595#1025767 (Nuria) Open>Resolved [20:18:01] (PS2) Milimetric: Add funnel-gathering sql and prototype html [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/188601 [20:21:21] (PS3) Milimetric: Add funnel-gathering sql and prototype html [analytics/limn-edit-data] - 
https://gerrit.wikimedia.org/r/188601 [20:21:29] ok Marcel, I decided to ignore the sql problem as Nuria suggested and I fixed the build hierarchy problem you found. You can merge now, if you're happy with it [20:21:45] and I'll deploy the static HTML, though it shouldn't change anything [20:22:18] milimetric, ok [20:28:49] Analytics-Kanban, Analytics: Total pageviews (new definition) for Oct-Dec 2014 - https://phabricator.wikimedia.org/T88844#1025859 (Tbayer) Thanks Kevin - I'm of course aware that the new definition yields different numbers than the old definition, that was the reason in the first place why @Eloquence strongly... [20:36:48] (CR) Mforns: "See comment on edit/funnel-prototype.html" (1 comment) [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/188601 (owner: Milimetric) [20:39:17] (CR) Milimetric: Add funnel-gathering sql and prototype html (1 comment) [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/188601 (owner: Milimetric) [20:49:58] (CR) Mforns: [C: 2 V: 2] "LGTM" [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/188601 (owner: Milimetric) [20:53:10] Analytics-Visualization, Analytics-Kanban: Build reactive filters on wiki and date - https://phabricator.wikimedia.org/T88371#1025952 (Milimetric) Open>Resolved [20:53:22] Analytics-Visualization, Analytics-Kanban: Build low level visualization of the paths through the application (starburst) - https://phabricator.wikimedia.org/T88369#1025957 (Milimetric) Open>Resolved [21:01:48] kevinator: is ottomata around today? Ellery and I had some issues with the public dataset rsync over the weekend that we were hoping someone from Dev could help address [21:02:39] question is: is rsync configured in the same way on stat1002 as it is on stat1003? [21:03:37] i.e. copy any file on stat:1002/a/public-datasets/ to stat1 every 30 mins? 
[21:08:20] DarTar: no, ottomata is back from vacation tomorrow [21:17:58] kevinator: ok, anyone else who could quickly confirm if this assumption is correct? [21:18:21] if not, we’ll postpone the publication [21:18:45] nuria: are you available to look at an rsync issue? ^^ [21:20:10] kevinator, DarTar: i doubt stat1002 and stat1003 share much configuration, give me 5 minutes and i can look to see if that is the case [21:20:23] nuria: fantastic, thx [21:21:04] we need to publish a large, static file currently on stat1002 to datasets.wikimedia.org [21:21:30] it’s a static file, not a recurring report [21:24:09] also, ellery accidentally created an empty file via rsync but we don’t have permissions to remove it on stat1, if anyone can go ahead and trash it http://datasets.wikimedia.org/public-datasets/clickstream.tsv [21:25:01] DarTar: I do not have permits for that, i think you need ops [21:25:09] nuria: got it [21:26:47] DarTar: i cannot even ssh to that machine, let me look at crons [21:27:01] DarTar: for rsync [21:30:34] DarTar: as far as i can see cron to rsync public datasets is only defined on 1003 [21:31:16] nuria: ok, that would explain it, so maybe the directory on stat1002 is an old test [22:55:02] Analytics-Kanban, Analytics-Cluster: Monitor cluster running out of HEAP space with Icinga - https://phabricator.wikimedia.org/T88640#1026388 (kevinator) a:Ottomata [22:55:36] Analytics-Kanban, Analytics-Cluster: Monitor cluster running out of HEAP space with Icinga - https://phabricator.wikimedia.org/T88640#1026393 (Tnegrin) Namenodes only (that's where the problem is really serious) [23:01:40] ggellerman_: hi - my IRC client has been refusing to let me talk to you :( [23:05:00] ouch! 1. I asked James to get to your email re schema 2. Dario wants to talk to you before Weds Sos [23:07:40] milimetric: it was mostly to brief you on the “commons PV” anomaly [23:07:47] if you have any question [23:08:05] did you see Oliver’s postmortem on analytics-l [23:08:07] ?