[00:31:47] Analytics-General-or-Unknown: datasets.wikimedia.org SSL error - https://phabricator.wikimedia.org/T74805#802896 (Mattflaschen)
[00:41:31] nuria__: ok, danny horn typed a few exclamation marks when he saw the dashboards :)
[00:41:46] that is something...
[00:41:52] at least they were not tears!
[00:41:56] hahaha
[00:46:27] dannyh> ebernhardson milimetric please tell Nuria that I love it and it gives me flashbacks to the heady days of Aug 30 - Sept 6 when we were popular
[00:46:49] oooh can i see?
[00:48:39] ori: http://flow-reportcard.wmflabs.org/
[00:49:39] oh, cool
[00:51:59] thats just db queries, also putting together some event logging stuff for UI
[01:49:59] (PS1) CSteipp: Assert lengths aren't negative [analytics/kafkatee] - https://gerrit.wikimedia.org/r/177152
[10:38:12] Analytics-EventLogging: EventLogging: Add helper for logging link clicks - https://phabricator.wikimedia.org/T54287#806366 (Prtksxna)
[11:24:32] YuviPanda, Hi
[11:24:39] YuviPanda, what's up?
[12:50:27] Analytics-Tech-community-metrics: Automate creating charts from Bugzilla Weekly Report - https://phabricator.wikimedia.org/T51744#806883 (zeljkofilipin)
[13:36:46] Analytics-Tech-community-metrics: Key performance indicator: Bugzilla response time - https://phabricator.wikimedia.org/T63561#807103 (Nemo_bis)
[13:40:28] Analytics-Tech-community-metrics, Phabricator, Phabricator.org: Metrics for key Wikimedia projects software in Maniphest - https://phabricator.wikimedia.org/T28#807124 (Nemo_bis)
[13:44:06] Analytics-Tech-community-metrics: Bugzilla ticket with recent comments listed under "Longest time without comment" on bugzilla_response_time.html - https://phabricator.wikimedia.org/T66373#807132 (Nemo_bis) Open>declined a:Nemo_bis > This bug is related to issues that shift from a component to anothe...
[13:44:22] (CR) Gilles: [C: 2] Update schema versions [analytics/multimedia] - https://gerrit.wikimedia.org/r/176944 (owner: Gilles)
[13:44:31] (Merged) jenkins-bot: Update schema versions [analytics/multimedia] - https://gerrit.wikimedia.org/r/176944 (owner: Gilles)
[13:48:45] Engineering-Community, Analytics-Tech-community-metrics, Phabricator: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#807146 (Nemo_bis)
[13:49:39] Engineering-Community, Analytics-Tech-community-metrics: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#807147 (Nemo_bis)
[13:49:54] rtnpro: heya!
[14:00:11] Engineering-Community, Analytics-Tech-community-metrics, Phabricator: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#807158 (Aklapper)
[14:44:20] (PS1) QChris: Drop Oozie bundle for Icinga monitoring of webrequest datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/177217
[14:46:25] (PS1) QChris: Ignore dia backup files for diagrams [analytics/refinery] - https://gerrit.wikimedia.org/r/177220
[14:52:31] (PS1) QChris: Ignore traffic from cache-local SSL terminators [analytics/refinery] - https://gerrit.wikimedia.org/r/177224
[14:55:57] Engineering-Community, Analytics-Tech-community-metrics, Phabricator: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#807261 (Aklapper) Note to myself: Add another query for "shell requests closed in last month" for Guillaume as the UI does not offer that and add G...
[15:21:45] YuviPanda: holaaa, have time for one question?
[15:21:49] nuria__: sure
[15:22:11] YuviPanda: we are thinking of moving wikimetrics to production
[15:22:18] oooh, nice
[15:22:30] YuviPanda: so here is when we ask for your wisdom
[15:23:07] and ask you to let us know (by looking at the packages) wether that would be a lot of work or medium work or nah
[15:23:14] *whether
[15:23:40] YuviPanda: let me paste the pkg list
[15:23:59] nuria__: sure!
[15:24:13] https://www.irccloud.com/pastebin/JxS3ywx0
[15:24:50] nuria__: yeah, all should be fairly easy to do, I think.
[15:24:56] nuria__: but I should ask, why production suddenly?
[15:25:00] nuria__: will it still hit labsdb?
[15:26:44] YuviPanda: labs db hasn't been able to run queries (specially for enwiki) for a while now
[15:26:56] YuviPanda: we have several metrics that we just cannot compute there
[15:26:58] oh, the replication issues?
[15:27:02] or is it just too big?
[15:27:04] YuviPanda: and perf
[15:27:12] right
[15:27:21] YuviPanda: both perf & replication (correctness)
[15:27:21] anyway, I'll be happy to help with packaging / puppetmunging to bring it to prod quality.
[15:29:03] YuviPanda: ok, we need to look at it in detail and will let you know, isn't redis going to be a problem?
[15:29:28] nuria__: why is redis going to be a problem?
[15:29:48] YuviPanda: i have no idea if debianizing that is easy or not
[15:30:22] nuria__: I also debianized a bunch of these for quarry, which is also running a celery with redis setup :)
[15:30:57] YuviPanda: all right then, we need to look at quarry's setup
[15:31:05] YuviPanda: what about security review?
[15:31:20] nuria__: yeaaah, that needs to go through chris.
[15:31:33] nuria__: however, debianizing / puppetmunging needn't wait on that, since it'll work the same in labs too.
[15:32:02] One of these days, I'll get another nick. There are too many chris around and the pings get annoying :-)
[15:32:15] nuria__: so I think way to go is to figure out which packages actually need to be handbuilt - with trusty I suspect not a lot, and then see if any need security review
[15:32:20] Mhmm... maybe ... even another name :-)
[15:32:32] qchris: hahaha :) sorry, for some reason you don't register as 'chris' in my head at all, just as 'queue-chris'
[15:32:38] YuviPanda: excellent!, we will task it tomorrow (after having looked at quarry's packages) and let you know
[15:32:52] nuria__: cool :) do tell me if you want help, etc.
[15:33:08] YuviPanda: will do, I am sure we could use your help
[15:33:30] Maybe I'll just do what ^d is doing :-)
[15:48:39] milimetric, Hi
[15:49:01] milimetric, I am trying to follow the README at https://github.com/wikimedia/analytics-limn-mobile-data
[15:49:03] hi rtnpro - i'm in an interview atm, but talk in 45 min or so?
[15:49:24] milimetric, ok
[16:36:08] hi rtnpro! sorry for the delay
[16:36:12] yes - the readme
[16:36:26] milimetric, np
[16:37:19] milimetric, I am trying to follow the steps for "Testing using local data"
[16:37:56] milimetric, when I run, ./scripts/ssh, I get "ssh: connect to host stat1003.wikimedia.org port 22: Network is unreachable"
[16:37:57] rtnpro: sorry that doc is out of date too :(
[16:38:14] yeah, you would need access to stat1003 which you probably don't have, right?
[16:38:21] it's in our production cluster
[16:38:27] but that doesn't matter
[16:38:47] milimetric, so, how do I get started?
[16:39:00] milimetric, do you have some test db dump?
[16:39:53] well, so the sql that generate.py runs doesn't actually matter
[16:40:07] it's the responsibility of each team that's using this to make sure that SQL works
[16:40:41] for your purpose, you can mock the "execute_sql" functionality in some unit tests and make it return values or throw errors or whatever you need
[16:41:11] milimetric, ok
[16:41:18] because what needs to get cleaned up is just the glue that reads the config.yaml file
[16:41:29] so - to get started maybe it's easier to think of it as a new project
[16:41:35] your input is a file like config.yaml
[16:41:54] and you have two outputs:
[16:42:14] 1. a readable useful log (ebernhardson had some good ideas for integrating with logstash)
[16:42:35] 2. datafiles that get generated by running SQL
[16:43:02] while it's true that you can't test it "for real", having some unit tests to make sure that glue works would be a *lot* better than where we are today
[16:43:39] milimetric, I see
[16:44:25] milimetric, I am going through the code, I will ping you if I have any query :)
[16:44:59] rtnpro: we all know this is not a small task, and really appreciate your interest in it. It was my next priority so I can help with it as much as you need
[16:45:20] milimetric, :)
[16:45:36] milimetric, I will not be shy to ask for help
[16:45:43] milimetric: heard you're going to move wikimetrics to prod :)
[16:45:53] YuviPanda: now *how* in the world
[16:46:08] milimetric: read scrollback :-)
[16:46:08] * milimetric runs to check his house for cameras
[16:46:18] burgerwednesday: you spoilt it.
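The mocking approach milimetric describes above (stub out "execute_sql", make it return values or throw errors, and test only the glue that reads the config) might look roughly like the sketch below. Everything except the name execute_sql is an assumption for illustration: the run_reports function, the logger name, and the shape of the parsed config are hypothetical and are not taken from generate.py.

```python
import logging
from unittest import mock

log = logging.getLogger("generate")  # hypothetical logger name


def run_reports(config, execute_sql):
    """Run each report's SQL through execute_sql; log and skip failures.

    Returns {report name: rows} for the reports that succeeded.
    """
    results = {}
    for name, sql in config["reports"].items():
        try:
            rows = execute_sql(sql)
        except Exception:
            log.exception("report %s failed", name)
            continue
        log.info("report %s: %d rows", name, len(rows))
        results[name] = rows
    return results


# Hypothetical stand-in for the parsed contents of a config.yaml file.
config = {"reports": {"daily": "SELECT 1", "broken": "SELECT oops"}}


def fake_execute_sql(sql):
    """Test double: raise for one query, return canned rows otherwise."""
    if "oops" in sql:
        raise ValueError("bad SQL")
    return [(1,)]


# The mock records every call, so the test can check the glue without a database.
fake = mock.Mock(side_effect=fake_execute_sql)
results = run_reports(config, fake)
```

Passing execute_sql in as an argument is one design choice among several; patching a module-level function with mock.patch would serve the same purpose if the real code calls it directly.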
[16:46:22] for the logstash stuff, as long as you use python logging(https://docs.python.org/2/library/logging.html) sending it on to logstash and our kibana instance for log discoverability is then just tacking on a handler library
[16:46:31] rtnpro: ^ logstash stuff
[16:47:04] milimetric, acknowledged
[16:47:41] ebernhardson, +1
[16:48:56] !log starting upgrade of analytics1027 to trusty, hive and oozie are offline for a bit
[16:49:17] YuviPanda: ok, read the backlog (yea yea i read slow)
[16:49:23] milimetric: :)
[16:49:29] yes, indeed. performance + correctness
[17:14:54] Analytics-Tech-community-metrics: Key performance indicator: Bugzilla response time - https://phabricator.wikimedia.org/T63561#807442 (Jdforrester-WMF) >>! In T63561#807103, @Nemo_bis wrote: >>>! In T63561#802548, @Jdforrester-WMF wrote: >> Should this be changed into a Phabricator-focussed metric? > > Yes,...
[17:15:19] Analytics: Move stat1002 and stat1003 into Analytics VLAN - https://phabricator.wikimedia.org/T76346#807446 (Ottomata)
[17:58:36] Analytics: Upgrade Analytics Cluster to Trusty, and then to CDH 5.2 - https://phabricator.wikimedia.org/T1200#807619 (Ottomata) analytics1027 done. Hue didn't work outright because it ships with its own virtual env at /usr/lib/hue/build/env. I had to symlink /usr/lib/hue/build/env/bin/python2.7 -> /usr/lib/...
[18:00:36] ottomata, do you remember what the thread about the hardware asks was called? I can't find it
[18:23:08] oh for stat1002! ah i haven't responded about that, i meant to talk to toby about that in our 1:1 last week and forgot
[18:23:14] for new stat boxes*
[18:26:41] ottomata, gotcha :). What's the discussion topic?
[18:26:53] I have my 1:1 with him today and I'm happy to split five minutes out for us to talk it through if there are issues.
[19:08:38] Ironholds: found it:
[19:08:38] hardware requests for Analytics
[19:08:43] ta
[19:08:51] My 1:1 has been moved anyhoo
[19:15:39] hi ottomata. do we have a documentation for webrequest table columns
[19:16:25] hm, you can do show create table webrequest;
[19:16:25] or
[19:16:29] describe webrequest;
[19:16:30] or
[19:16:50] yeah, did describe.
[19:16:52] thanks!
[19:17:18] ah, naw, that's it, cool!
[19:24:56] wikimedia/mediawiki-extensions-EventLogging#287 (wmf/1.25wmf11 - 4f76f2a : Reedy): The build passed.
[19:24:56] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/4f76f2afb65d
[19:24:56] Build details : http://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/42898775
[19:54:50] ottomata, when I'm searching for "human" from the searchbox in enwiki, will there be a log for every search recommendation I get as I type human in hadoop?
[19:55:23] or there will be one log and that's for the search query I finally press enter on, or choose from the drop down list?
[19:56:10] hm
[19:56:39] hmm
[19:56:41] i'm not sure
[19:57:05] that's good. at least it wasn't an obvious question. ;p
[19:57:07] i think your search queries go to pybal and then are proxied to CirrusSearch somewhere, probably not to varnishes
[19:57:20] i really don't know
[19:57:23] you should ask ManyBubbles
[19:57:29] in #ops
[19:57:33] I will and will let you know
[20:13:54] Ironholds: qq -- do you know what the definition of a country is in maxmind? I'm trying to come up with some basic stats and there seem to be 236 countries in the cube.
[20:14:41] leila: Neither. You will get a request (in the wmf_raw.webrequests table) for each time the user pauses long enough while he is typing the search term (or hits enter)
[20:14:42] tnegrin, anything with an ISO 3166-1 alpha-2 code.
[20:14:47] hmn. They also have some special codes.
[20:14:58] I can't recall off the top of my head if I filter those or not, in the C-based versions. Will check.
[20:15:09] leila: So for example, If I want to do a search for "Berlin" and I type "Be" quickly, then pause.
[20:15:23] leila: then type "rli" quickly then pause.
[20:15:29] leila: then type n and hit enter.
[20:15:39] You'll get "action=opensearch" requests for:
[20:15:42] * Be
[20:15:44] * Berli
[20:15:46] * Berlin
[20:16:18] (But the wmf_raw.webrequest table will only show the terms you search for, not the recommendations that got sent back)
[20:16:40] I see. now I understand qchris_away. thanks!
[20:16:49] do you know what long enough is?
[20:17:26] No. I'd have to look that up. Manybubbles (and I guess also ^d) would know.
[20:17:37] looks like the answer is "no".
[20:17:40] grr. Will add that filter.
[20:20:36] tnegrin: The definition of the used country codes is in
[20:20:38] http://dev.maxmind.com/geoip/legacy/codes/iso3166/
[20:20:53] qchris_away: rethinking it. with your explanation, I think we're good. we have more information, and it's easy to figure out which one is the more complete one
[20:21:08] tnegrin: Note the special ones like A1, A2, O1, EU and so on.
[20:21:21] leila: Cool.
[20:22:34] qchris_away, yeah, I'm excluding those.
[20:22:41] qchris_away: thanks -- this explains it
[20:22:42] but it looks like I historically didn't. Oop.
[20:22:56] A quick vec.begin(),vec.end() find.
[20:23:01] I'm using the 0.3 version of the cube
[20:24:06] Ironholds: Excluding or not is of course your decision, but they do carry information. They have a FAQ somewhere about it.
[20:24:31] There it is. http://dev.maxmind.com/faq/what-are-the-eu-europe-and-ap-asia-pacific-entries/
[20:24:35] yeah, but not geolocateable information
[20:24:50] well, not country-level-geolocateable.
[20:24:54] * Ironholds nods.
[20:25:49] the Vatican has its own code. one day I want to use this data to write a blog post called "Filthy Habits: Internet Activity in the Catholic Church".
[20:26:01] and get fired for both the privacy violation and the pun.
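The filter discussed above (dropping MaxMind's special codes before counting countries) could be sketched as follows. This is an illustration only and not Ironholds' C-based implementation; the set holds just the codes named in the conversation and the linked FAQ (A1, A2, O1, EU, AP) and may not be exhaustive, and the function and variable names are hypothetical.

```python
# Special MaxMind legacy codes mentioned in the discussion. They carry
# information but do not identify a single country. Possibly incomplete.
SPECIAL_CODES = {"A1", "A2", "O1", "EU", "AP"}


def country_level_only(codes):
    """Keep only codes that are not in the special-code set, preserving order."""
    return [code for code in codes if code not in SPECIAL_CODES]


# Made-up sample; VA (Vatican) is an ordinary ISO 3166-1 alpha-2 code and is kept.
sample = ["US", "A1", "VA", "EU", "DE", "O1"]
filtered = country_level_only(sample)
```

As qchris notes, whether to exclude these entries is a judgment call, so keeping the set in one named constant makes that decision easy to revisit.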
[20:26:09] Analytics-EventLogging: Engineer reads documentation on Wikitech to set up a dashboard from EL data [3 pts] - https://phabricator.wikimedia.org/T76364#808376 (kevinator) Got feedback from Toby: we need to include steps from the very beginning: I want to use event logging and "what's a schema?".
[20:27:25] halfak, you just missed an epic pun.
[20:27:40] That's exactly the reason why geocoding is ... meh. Instead of starting a rant, I guess I'll go eat something :-D
[20:27:46] haha
[20:28:18] Bummer.
[20:28:39] *PUN ALERT* "the Vatican has its own ISO code. one day I want to use this data to write a blog post called "Filthy Habits: Internet Activity in the Catholic Church"." *PUN ALERT*
[20:28:48] Just got done squashing. Now trying to convince GroupLensers that the effects of performance changes on MediaWiki is interesting.
[20:30:53] halfak: for the hhvm experiment?
[20:30:54] * halfak doesn't see the pun.
[20:30:57] feels shame
[20:31:08] nuria__, more generally, but I'll be talking about the HHVM experiment.
[20:31:21] halfak: boy, that is super interesting
[20:31:27] halfak, filthy habits. habits.
[20:31:30] do we have a report on that?
[20:31:38] OK I get it.
[20:32:09] Ori and I have run two experiments. We're deciding whether we think we have learned enough or if we should try something else.
[20:32:22] I'm hoping to think better about what else to try over the next hour.
[20:32:29] 5 min to presentation start :)
[20:36:27] aha
[21:14:15] (CR) Ottomata: [C: 2] Ignore traffic from cache-local SSL terminators [analytics/webstatscollector] - https://gerrit.wikimedia.org/r/177101 (owner: QChris)
[21:15:34] (PS1) Florianschmidtwelzow: Add data to UI-Daily [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/177318
[21:16:39] (CR) Milimetric: [C: 2] Add data to UI-Daily [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/177318 (owner: Florianschmidtwelzow)
[21:16:46] (Merged) jenkins-bot: Add data to UI-Daily [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/177318 (owner: Florianschmidtwelzow)
[21:34:37] Wikimedia-General-or-Unknown, Analytics: Sudden drop in number of articles on nowiki on Nov29 (by 34k articles) - https://phabricator.wikimedia.org/T76356#808566 (greg) Definitely #Analytics
[21:34:41] Wikimedia-General-or-Unknown, Analytics: Sudden drop in number of articles on nowiki on Nov29 (by 34k articles) - https://phabricator.wikimedia.org/T76356#808568 (greg)
[21:38:30] Analytics-EventLogging: Engineer reads documentation on Wikitech to set up a dashboard from EL data [3 pts] - https://phabricator.wikimedia.org/T76364#808570 (ggellerman) Doc link: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards#Support
[21:40:40] mforns: sorry it took so long!
[21:40:45] but! it looks great
[21:40:49] just rebase and I'll merge it
[21:40:55] ok, thanks :]
[21:41:25] note: as part of the rebase, you would see a change to the DailyPageviews metric configuration (it has a breakdown now)
[21:41:30] I added that to the wiki page
[21:41:44] aha
[21:41:49] btw - I think it's so kickass that this is wiki-driven now
[21:42:09] yes it's nice!
[21:42:31] one thing though... I could not give the page the jsonSchema type
[21:43:26] milimetric, the two config pages work, but when you browse them, they do not display nicely like Schema:EventCapsule does
[21:43:48] but it seems I have no permits to set that page type
[21:43:58] mforns: no, that's not a page type
[21:44:06] it's a custom namespace
[21:44:09] oh...
[21:44:19] so when EventLogging is installed, it registers "Schema" as a custom namespace
[21:44:27] with a JsonContentTypeHandler
[21:44:49] then when someone makes a page, if it starts with Schema: it gets put in that namespace
[21:44:57] in our case, the namespace Dashiki is not registered
[21:45:02] ha
[21:45:11] but that's out of scope for this
[21:45:21] well, should I change the way I did it?
[21:45:23] it works ok as is, and we can develop an extension at some point in the future if we want
[21:45:27] nono, it's fine
[21:45:30] ok
[21:46:59] btw milimetric, I need to go out for 40 minutes, is it ok if I rebase after that?
[21:47:17] mforns: no problem at all. I'll be in one more meeting and then I'm gone
[21:47:27] feel free to self merge after you rebase - my +2 will be there
[21:47:37] ok fine! see you tomorrow then
[21:48:05] mforns: I missed the link for the config page. Can you paste it again?
[21:48:14] I want to see a config example :-)
[21:48:18] of course
[21:48:32] https://meta.wikimedia.org/wiki/Dashiki:CategorizedMetrics
[21:48:41] https://meta.wikimedia.org/wiki/Dashiki:DefaultDashboard
[21:49:13] oooooooh… nice :_)
[21:49:15] :-)
[21:49:41] ok!
[21:50:07] kevinator, I need to be out for 40 mins
[21:50:15] be right back
[21:50:16] ok ttyl
[22:01:17] Analytics-EventLogging: Engineer reads documentation on Wikitech to set up a dashboard from EL data [3 pts] - https://phabricator.wikimedia.org/T76364#808612 (Milimetric) @kevinator and @tnegrin: steps from the very beginning seem out of scope. This task talks about dashboarding once you have EL data. Shoul...
[22:25:02] Analytics-Engineering: Create new team project: "Analytics-Engineering" - https://phabricator.wikimedia.org/T75776#808645 (Qgil) Open>Resolved Turns out @kevinator had created #Analytics-Engineering already. Resolving. PS: I have set the View policy to Public. Please remember this details when creating...
[22:30:45] kevinator, I left kinda fast, did you have any question on the config pages?
[22:35:19] off for the night
[22:35:26] <3
[22:36:26] take care
[22:40:41] bye!
[22:40:46] milimetric: bye!
[22:45:12] mforns: well, I didn’t have more questions...
[22:45:19] fine
[22:45:32] mforns: now I’m tempted to edit one of these pages and see what happens
[22:45:41] hehe
[22:46:11] well, the Dashiki version that reads that is not in production yet
[22:46:34] but if it was, it would stop working as expected
[22:48:19] ahh, yes, forgot that the code wasn’t in production yet
[22:49:13] I won’t bother then testing an edit if I can’t see the impact
[22:51:10] Analytics, Mobile-Web: Cannot permalink easily to a single graph - https://phabricator.wikimedia.org/T76670 (Jdlrobson) NEW p:Triage
[22:53:01] Analytics-EventLogging, Mobile-Web: MobileWebClickTracking table is huge and thus querying too slow - https://phabricator.wikimedia.org/T76671#808707 (Jdlrobson)
[22:53:02] :]
[22:57:32] Engineering-Community, Analytics-Tech-community-metrics, Phabricator: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#808728 (Aklapper) > Volume of tasks in the "shell" project resolved/fixed in last month, requested by Guillaume Hmm, no, I can't https://phabricat...
[23:03:08] Engineering-Community, Analytics-Tech-community-metrics, Phabricator: Monthly report of total / active Phabricator users - https://phabricator.wikimedia.org/T1003#808744 (mmodell)
[23:14:47] (PS3) Mforns: Read configuration from mediawiki pages [analytics/dashiki] - https://gerrit.wikimedia.org/r/177005
[23:33:25] ggellerman____: Hi grace
[23:34:54] ggellerman____ on IRC! Yay!
[23:35:10] So, the Laws of Data Analysis has a new addition
[23:35:20] it is "never trust any field in a request which the end user can modify"
[23:35:32] do not trust referers. Do not trust x_forwarded_for. Do not trust user agent.
[23:35:39] Trust the IP address, the URL, and that's it.
[23:38:57] Analytics-EventLogging: Engineer reads documentation on Wikitech to set up a dashboard from EL data [3 pts] - https://phabricator.wikimedia.org/T76364#808826 (kevinator) I'll create a separate task for steps from the very beginning. I spoke to the PM of the Collaboration team and some of the schemas they are...
[23:39:47] ottomata, do we keep the definition of columns in webrequest somewhere? if we don't, I'd like to start documenting it
[23:40:05] for example, where should I look if I want to know what's sequence
[23:41:38] i would like for comments to be on those columns
[23:41:43] but there is a bug in hive that is keeping them from sticking
[23:42:21] ah! I see the bug. https://issues.apache.org/jira/browse/HIVE-4703
[23:42:51] leila: i think it is documented somewhere
[23:42:53] qchris knows
[23:42:55] i can't find it!
[23:43:01] i have to run
[23:43:02] ttyl!
[23:43:07] k, ttyl! :-)
[23:50:42] Wikimedia-General-or-Unknown, Analytics: Sudden drop in number of articles on nowiki on Nov29 (by 34k articles) - https://phabricator.wikimedia.org/T76356#808842 (jeblad) This isn't a minor glitch, it is a title count for articles comparable to a //one year production// on nowiki. The loss in edits are about...
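The "never trust any field in a request which the end user can modify" rule stated earlier can be expressed as a small allow-list over a request record. This is a minimal sketch under stated assumptions: the dictionary keys (ip, uri, referer, user_agent, x_forwarded_for) are hypothetical illustrations and are not confirmed column names of the webrequest table.

```python
# Fields the rule says to trust: the IP address and the URL.
# The key names are assumptions made for this example.
TRUSTED_FIELDS = ("ip", "uri")


def trusted_view(request):
    """Return a copy of the request holding only the trusted fields."""
    return {key: request[key] for key in TRUSTED_FIELDS if key in request}


# Made-up request record; the last three fields are client-controlled.
request = {
    "ip": "192.0.2.1",
    "uri": "/wiki/Berlin",
    "referer": "http://example.org/",
    "user_agent": "ExampleBrowser/1.0",
    "x_forwarded_for": "198.51.100.7",
}
safe = trusted_view(request)
```

An allow-list is used rather than a deny-list so that any field added later is treated as untrusted until someone decides otherwise.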
[23:52:37] (CR) Bmansurov: Add support for timezones while creating reports (1 comment) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/175169 (https://bugzilla.wikimedia.org/72116) (owner: Bmansurov)
[23:55:40] Wikimedia-General-or-Unknown, Analytics: Sudden drop in number of articles on nowiki on Nov29 (by 34k articles) - https://phabricator.wikimedia.org/T76356#808853 (Ironholds)