[00:11:55] Analytics-Wikimetrics, Analytics-Kanban: Wikimetrics authentication through Google accounts is broken - https://phabricator.wikimedia.org/T90009#1051865 (ggellerman) p:Triage>High
[00:49:32] Ironholds: vital signs as in the dashboard and UI stuff is in the dashiki repo: https://github.com/wikimedia/analytics-dashiki/
[00:49:44] the part that computes the metrics is in wikimetrics, yea
[00:50:07] i've gotta run, talk tomorrow if you want
[00:57:03] milimetric, awesome! ta
[02:11:11] DarTar: In 30 minutes i do not think there is time to cover ref-tags and vanity-urls
[02:11:31] nuria: fair, let’s focus on ref-tags
[02:11:48] DarTar: ok, i was going to try to convince you further ..... but
[02:11:53] it was EASY
[02:11:55] that’s the top priority, the vanity url use case is not yet fully articulated
[02:12:19] DarTar: ok
[02:12:28] also, I don’t think vanity url is the exact description of what I am referring to, but we can take this offline, to a separate discussion :)
[02:13:10] DarTar: "magic parmeters" that do special things to ui are normally handled via vanity urls
[02:13:29] DarTar: plenty of exmpales: "display a spacial promotion" , "display an add for last app"
[02:13:37] "land on beta version of ui"
[02:13:42] yep
[02:13:51] but the difference is that the URL is fixed
[02:14:02] DarTar: semi-fixed, not static
[02:14:11] think a catalog that is ever changing
[02:17:29] DarTar: Like http://blah.store.com/android-phone--
[02:20:33] nuria: yes, the part that I don’t quite get is how this would work on arbitrary pages
[02:21:12] because there is a rewrite engine and a link generator
[02:22:21] DarTar: for example, std practice is to say something like $link->generateForCatalog("nexus4", "344.5")
[02:22:33] I guess I’d need to understand how that would work, I gotta run now but I’d love to continue
[02:22:47] DarTar: sure
[02:22:53] thx for chiming in
[02:23:19] I added to the meeting the folks in mobile who were directly affected by the proposed change
[02:23:38] Max and Adam should be able to cover the initial platform/ops implications
[03:19:50] Datasets-General-or-Unknown, Analytics, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1052225 (Hydriz) This issue seems to be temporary, I can't reproduce it.
[03:39:27] Multimedia, §Multimedia-Sprint-2015-02-18, MediaWiki-extensions-Sentry, Analytics-EventLogging: Log EventLogging schema validation errors in Sentry - https://phabricator.wikimedia.org/T90083#1052234 (Tgr) NEW
[03:39:44] Multimedia, §Multimedia-Sprint-2015-02-18, MediaWiki-extensions-Sentry, Analytics-EventLogging: Log EventLogging schema validation errors in Sentry - https://phabricator.wikimedia.org/T90083#1052242 (Tgr) a:Tgr
[04:31:22] Datasets-General-or-Unknown, Analytics, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1052293 (Bawolff) Well its 4:30 am utc right now, and only the 2:00 file is up. So that's still 2 hours behind (which is much better, but still not as...
[07:51:35] Analytics-Wikistats: Discrepancies in historical total active editor numbers - https://phabricator.wikimedia.org/T87738#1052449 (Nemo_bis) Interestingly, there is almost no difference in the 1+ and 3+ levels. > also happened long after this bug fix Code or dataset changes may take longer to take effect, if...
[09:25:37] (CR) QChris: Add media file consumption reports (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/191118 (owner: QChris)
[10:04:12] I guess Joseph / joal is not addicted to IRC yet :-]
[10:04:14] bonjour!
[10:04:29] qchris: thank you a ton for the mw/core test fix
[10:05:44] hashar: yw :-D
[10:21:24] Hey hashar
[10:21:35] <-- Antoine Musso
[10:21:39] I am based in France
[10:21:51] I prefer trying not to be addicted that's true :)
[10:21:51] in Nantes specially
[10:21:55] are you based in Britany as well ?
[10:21:56] Nice
[10:22:10] I am in Britany, close to Brest
[10:22:19] roughly the other side of the world so
[10:22:24] Brest is sooooo far
[10:22:32] huhu
[10:22:47] Closest thing west side is New York :D
[10:22:55] :-D
[10:24:59] What are you working hashar ?
[10:25:18] joal: I am managing the continuous integration infrastructure
[10:25:21] ie Jenkins
[10:25:26] right
[10:25:34] Great
[10:25:53] usually working 9:30am - 6pm
[10:26:03] and showing up randomly during our evening to catch up with SF
[10:26:12] I don't know yet if hadoop jobs are using continuous integration, but if not I think we should try to move in that direction :)
[10:26:23] best point of contact is #continuous-integration in Phabricator or irc :D
[10:26:33] ok
[10:26:51] I have no clue what hadoop is
[10:27:01] if you have some tests, sure we can get them in Jenkins
[10:27:02] My usual time is 11:00am - 09:00pm, trying to get more overlap with SF
[10:27:09] Cool
[10:27:50] I'll discuss with the team see if they think it would be good, but I like the idea of having continuous integration for our cluster :)
[10:30:07] joal: we could reuse labs instance you maintain
[10:30:13] and plug them in our Jenkins to execute whatever you want
[10:30:17] the CI layer is quite thin
[10:31:15] ok, I'll need a little bit more of understanding here to ensure I provide easily integrated stuff, but that's no issue :)
[10:31:22] I''l ping back when needed !
[10:33:22] I am always idling in this channel.
[10:33:55] sounds good
[10:34:25] And if I manage to get to CI a little, you'll get to know alittle bit more about hadoop as well
[13:52:21] morning
[13:55:29] milimetric: good morning :]
[13:55:39] heya hashar
[13:56:13] hashar: have you met joal?
[13:56:27] you are fellow countrymen
[13:57:50] Analytics-Wikimetrics: Story: AnalyticsEng uses connection pooling on database URL - https://phabricator.wikimedia.org/T73140#1053160 (Milimetric) That would be nice, but it's not true. The connection string still includes the db name in it, so we are still hitting 852 different connection strings. That's w...
[13:57:58] milimetric: we exchanged a few words earlier today
[13:58:12] is a few hundred kilometers away from my place
[13:58:18] but I guess we can meet eventually :D
[13:59:58] yeah, as cold as it is here today I think a few hundred meters is too far :) but hopefully a few of us will be at the Lyon hackathon
[14:03:22] if we find a buddy :-]
[14:05:16] milimetric: what happened to your extensions that shows nice graphics / histograms out of raw json data?
[14:05:42] oh, it's chugging along nicely, yurik's doing great work there
[14:05:57] I just helped out last night with the graphoid service he's building - which will render the graphs server-side
[14:06:00] hehe
[14:06:20] that way we can have images to load at startup since the JS is a bit heavy (I think over 400K)
[14:06:26] i' now also doing some upstream work - found some vega issuess
[14:08:11] graphoid ....
[14:08:22] oids are an epidemic!
[14:08:28] so that is https://www.mediawiki.org/wiki/Extension:Graph right ?
[14:08:34] yep
[14:09:02] oh and is it already deployed?
[14:09:07] i just wish we havea better overall platform - logging, services, deployment, monitoring, debuging, etc
[14:09:13] not the graphoid
[14:09:15] just the graph
[14:09:26] on mediawiki.org and a few other
[14:09:31] no backend support
[14:09:59] so my browser fetch the json data + the heavy javascript and render it client side right ?
[14:10:11] meanwhile you are building a backend system to generate them as png/svg ?
[14:16:29] Analytics-Tech-community-metrics, Wikimedia-Git-or-Gerrit: Basic metrics about contributors exercising +2/-2 permissions in Gerrit - https://phabricator.wikimedia.org/T59038#1053197 (Nemo_bis) > I have also removed the Translation bot. Why did you need to remove l10n-bot? It's not needed, if you exclude self...
[14:41:45] hashar, correct
[14:41:48] milimetric, https://github.com/nyurik/vega/compare/trifacta:master...master
[14:42:01] any thoughts?
[14:42:23] milimetric, https://github.com/nyurik/vega/compare/trifacta:master...master?diff=split&name=master
[14:42:25] better link
[14:58:41] Possible-Tech-Projects, Analytics-Tech-community-metrics: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1053344 (NiharikaKohli) @Acs, this seems to have been a popular choice for GSoC/Outreachy. Are you still interested in mentoring for it?
[15:01:17] milimetric, https://gerrit.wikimedia.org/r/#/c/191883/ will use those changes
[15:03:38] (CR) Ottomata: [C: 2 V: 2] Encode commas in PercentEncoder [analytics/refinery/source] - https://gerrit.wikimedia.org/r/191098 (owner: QChris)
[15:08:22] Analytics-Kanban: Review and clean up JS on new graphoid service - https://phabricator.wikimedia.org/T90147#1053384 (Milimetric) NEW a:Milimetric
[15:08:47] Analytics-Kanban: Review and clean up JS on new graphoid service - https://phabricator.wikimedia.org/T90147#1053393 (Milimetric) Open>Resolved code merged by Yuri
[15:12:41] Analytics-Wikimetrics, Analytics-Kanban: Wikimetrics authentication through Google accounts is broken - https://phabricator.wikimedia.org/T90009#1053399 (Milimetric) a:Milimetric
[15:19:15] milimetric, and lastly - https://gerrit.wikimedia.org/r/#/c/191887/
[15:19:22] vagrant role to install it
[15:26:50] o/ milimetric
[15:26:59] hi halfak :)
[15:27:10] I just saw your email about certs and httplib2
[15:27:19] I'm considering helping out with a project that uses httplib2
[15:27:31] What made you guys choose httplib2 for wikimetrics?
[15:29:30] halfak: googling quickly and not thinking about it
[15:29:45] and as always, assuming things are "normal" and make sense
[15:29:56] Gotcha.
[15:30:30] Any chance you are using OAuth to sign an authorized request with httplib2? I'd love to see an example if so.
[15:31:18] OAuth is definitely doing that behind the scenes, but we don't do it ourselves
[15:31:37] the only reason the httplib was even an issue in the first place was that mediawiki used to require a custom httplib
[15:32:02] Hmmm. You should still need to provide a consumer key/secret as well as a authorized key/secret.
[15:32:04] but your library let us forget about that problem, as it takes care of installing the proper httplib
[15:32:09] I'm curious how you do that with httplib2.
[15:38:33] Analytics-Tech-community-metrics, Wikimedia-Git-or-Gerrit: Basic metrics about contributors exercising +2/-2 permissions in Gerrit - https://phabricator.wikimedia.org/T59038#1053501 (Dicortazar) Oh well, it's not a big deal. I just removed what seemed to me bots. In the following iteration I'll keep that inf...
[15:39:23] milimetric, it looks like httplib2 is unused in the current version.
[15:39:24] https://gist.github.com/halfak/c9e9b1805f34ee72e58e
[15:39:40] Which might be why I can't find a use of it.
[15:41:05] Ahh! It looks like you are using requests! http://git.wikimedia.org/blob/analytics%2Fwikimetrics.git/4223704ec3d31babf3aee9f76e3a56bc87966caa/wikimetrics%2Fcontrollers%2Fauthentication.py
[15:41:18] More specifically: http://git.wikimedia.org/blob/analytics%2Fwikimetrics.git/4223704ec3d31babf3aee9f76e3a56bc87966caa/wikimetrics%2Fcontrollers%2Fauthentication.py#L197
[15:41:41] That must be oauth 2.0
[15:42:09] OK. n/m
[15:42:44] Possible-Tech-Projects, Analytics-Tech-community-metrics: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1053569 (Dicortazar) This sounds such as a thing to do definitively. Let me push this a bit :).
[15:42:44] :) sorry halfak I've no understanding of any of that stuff
[15:43:21] Gotcha. Thanks anyway. :)
[15:44:54] Possible-Tech-Projects, Analytics-Tech-community-metrics: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1053585 (NiharikaKohli) @Dicortazar, willing to mentor then? ;)
[16:08:54] Possible-Tech-Projects, Analytics-Tech-community-metrics: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1053705 (Dicortazar) Trying to look for someone!
[16:09:18] yurik: that vagrant puppet change looks good to me, but I am terrible at puppet so I won't comment
[16:19:57] (CR) Ottomata: "Other than these comments, LGTM :)" (6 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/191118 (owner: QChris)
[16:30:18] milimetric, thanks, any comments about other changes? I already commited them (playing in labs), but any comments are welcome
[16:40:42] yurik: I should focus on this VE analysis for now, graphoid works great and your changes all make sense, but I only looked quickly
[16:41:01] thx )
[16:41:02] feel free to ping me if folks raise issues when you're trying to deploy it to prod
[16:41:07] and then I'll help polish it up
[16:41:14] problem is deploying it )
[16:41:21] vega uses some native stuff
[16:41:33] and npm install is not allowed
[16:41:47] oh, but parsoid gets around that and has been doing it forever
[16:41:53] so you can copy their strategy
[16:42:06] back when I asked they had a repo with frozen dependencies, and they would just git deploy that
[16:42:12] but now it might be more sophisticated
[16:43:12] ottomata: the dumps on http://dumps.wikimedia.org/other/pagecounts-raw/ come from teh cluster right?
[16:43:14] *the
[16:45:43] Datasets-General-or-Unknown, Analytics, Analytics-Cluster, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1054042 (Nuria)
[16:46:19] ye
[16:46:20] s
[16:46:30] Datasets-General-or-Unknown, Analytics-Kanban, Analytics, Analytics-Cluster, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1045030 (Nuria)
[16:46:45] Datasets-General-or-Unknown, Analytics-Kanban, Analytics, Analytics-Cluster, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1045030 (Nuria) Added otto, I believe the pagecounts on that directory should be updated from hadoop as of recent.
[16:49:05] hey nuria, can you test something for me?
[16:51:00] Datasets-General-or-Unknown, Analytics-Kanban, Analytics, Analytics-Cluster, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1054063 (Ottomata) Hi, yes, we did a cluster upgrade on Monday, which caused a few jobs to lag behind. Everything...
[16:52:03] ottomata: yessir.
[16:53:15] ottomata: ready to test whatever it is
[16:53:39] Datasets-General-or-Unknown, Analytics-Kanban, Analytics, Analytics-Cluster, Analytics-General-or-Unknown: pagecounts stats are behind by about 16 hours - https://phabricator.wikimedia.org/T89771#1054067 (Ottomata) Ah yes, 2 hours behind. That will be the case going forward. The backend that is used to gene...
[16:53:45] nuri
[16:53:45] a
[16:53:51] ok, edit your /etc/hosts file and add
[16:54:00] 208.80.154.241 hue.wikimedia.org
[16:54:07] and then go to hue.wikimedia.org :)
[16:54:21] i'm trying to see if I've got this right, but i thikn my browser has cached some redirect loops, not sure.
[16:54:26] i'm seeing inconsistent behavior
[16:54:33] ottomata: boy that would be SO NICE
[16:56:18] ottomata: it no work though, let me ping the ip
[16:56:24] ?
[16:56:34] MediaWiki-General-or-Unknown, operations, Services, Analytics, Wikidata, wikidata-query-service: Reliable publish / subscribe event bus - https://phabricator.wikimedia.org/T84923#1054081 (bd808)
[16:57:11] ottomata: wait ...
[16:58:19] ottomata: i does work, YES!
[16:58:51] ottomata: do i sign in with ldap credentials?
[16:58:53] yes
[16:59:00] keep navigating, tell me if you get any redirect loops
[16:59:15] ottomata: so, 503 after signnin
[16:59:59] ottomata: let me try again
[17:02:29] ottomata: tried again, ui works , have not trid running queries
[17:02:36] ok cooool
[17:02:40] ok i will puppetize this then
[17:03:21] ottomata: nice job there.
[17:05:14] Analytics-Engineering, Analytics-Kanban: Email engineering re: x-analytics deployed to all wikis - https://phabricator.wikimedia.org/T89749#1054094 (kevinator)
[17:05:34] Analytics-Cluster, Analytics-Kanban: Email engineering re: x-analytics deployed to all wikis - https://phabricator.wikimedia.org/T89749#1044388 (kevinator)
[17:39:41] milimetric: on your last e-mail regarding batching
[17:39:49] yt?
[17:53:39] Analytics-Cluster: Force Hue https redirects. - https://phabricator.wikimedia.org/T85834#1054275 (Aklapper)
[17:56:36] nuria: just grabbed lunch, back now
[17:56:54] i want to test the batch insert from insert_multi too, see how it performs
[17:57:03] should be identical to the __table__.insert
[17:57:18] but weirder things have happened.
[17:57:25] ottomata: analytics cluster ganglia data stopped a few days ago, was that deliberate?
[17:57:38] nope
[17:57:44] didn't know bout it
[17:57:58] HMmm das not good
[17:58:20] how was the monster truck rally?
[17:59:01] it is tomorrow!
[17:59:03] i'm just in va now
[17:59:18] gage changes this recently
[17:59:19] https://gerrit.wikimedia.org/r/#/c/191078/
[17:59:26] that is the only recent relevant i change I know of
[17:59:37] that and the hadoop upgrade hm
[18:09:41] milimetric: ok, ia m going to deploy this change to vanadium to time our batching code
[18:10:01] milimetric: https://gerrit.wikimedia.org/r/#/c/191913/
[18:10:46] milimetric: it will time the code as is (it's on top of our currently deployed code) and we can tell sean about batch perf
[18:11:58] ok, i'm trying to mimic the code locally
[18:16:56] nuria: ok, the insert_multi works as fast as expected, blazing fast.
[18:17:06] so hundreds of thousands of records per second
[18:17:24] milimetric: and that should be identical to what we have right?
[18:17:27] but insert_sequential is performing as you would expect, about 100 per second
[18:18:03] yeah, I tried code equivalent to both insert_multi and insert_sequential, on my local machine
[18:18:09] milimetric: ok, let me find our numbers for real
[18:18:23] milimetric: ta-taaa-chann, will have it done within the hour
[18:18:26] if we are correctly using insert_multi, then it should never have a problem inserting tons of events
[18:18:46] the question is, maybe the group_by produces too many separate insert_multi calls
[18:19:16] cool, the real numbers will be great
[18:19:28] because they should be somewhere between 100 and 100.000 per second
[18:19:45] and where they fall on that depends entirely on how many or few groups come out of the group_by, I think
[18:25:19] milimetric: ok, let me test the code i will deploy on beta labs before deploying to vanadium
[18:36:45] Analytics: Substantial amount of ip addresses in 1:1000 sampled squid logs does not resolve into geo data, from Nov 2013 onwards - https://phabricator.wikimedia.org/T90235#1054447 (Krenair) Sounds Analytics-y to me.
[18:37:51] Analytics: Daily aggregation of page view dumps stalled - https://phabricator.wikimedia.org/T90230#1054456 (Krenair)
[18:45:26] Analytics: Upgrade daily/monthly aggregations of pageview dumps to new data files - https://phabricator.wikimedia.org/T90203#1054527 (Krenair)
[18:51:44] Analytics-Kanban, Analytics-Wikimetrics: Wikimetrics authentication through Google accounts is broken - https://phabricator.wikimedia.org/T90009#1054582 (Milimetric) Open>Resolved When we changed our dependencies as part of the mw-oauth refactor, the permissions to cacerts.txt got changed to 600....
[19:07:27] ottomata: how do i log to SAL log
[19:08:00] on #wikimedia-operations, !log
[19:08:24] same here
[19:10:56] ottomata, ori: k, thanks
[19:12:41] !log EventLogging re-start
[19:14:16] ori, milimetric : so , some numbers in vanadium.
[19:14:42] ori, milimetric : batched inserts of 300 take about 1.2 secs
[19:15:12] ori, milimetric : whereus ~500 goes closer to 2 secs
[20:04:26] i replied to your email nuria, that is still very far from what we could be getting if we buffered a little better
[20:04:41] so there's definitely performance to be found / milked here. My point from this morning stands though - why go to all the trouble?
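(Editor's note) The batching numbers quoted above can be put side by side with the 100 to 100,000 records per second range mentioned earlier. The following is only a minimal sketch under stated assumptions: `group_events`, `rows_per_second`, and the sample events are hypothetical names made up for illustration, not the EventLogging code, and the timings are the approximate figures from the chat. It shows that one multi-row insert would be issued per group, so the number of groups coming out of the grouping step determines how many insert calls a batch costs.

```python
from itertools import groupby
from operator import itemgetter


def group_events(events, key="schema"):
    """Group events by `key`; each group would become one multi-row insert.

    Hypothetical helper: the real grouping key used by EventLogging is not
    shown in the log, so "schema" is an assumption.
    """
    ordered = sorted(events, key=itemgetter(key))  # groupby needs sorted input
    return {k: list(g) for k, g in groupby(ordered, key=itemgetter(key))}


def rows_per_second(rows, seconds):
    """Throughput implied by inserting `rows` rows in `seconds` seconds."""
    return rows / seconds


# Hypothetical batch: 300 events spread evenly over 3 schemas.
events = [{"schema": f"Schema{i % 3}", "id": i} for i in range(300)]
groups = group_events(events)
insert_calls = len(groups)  # one multi-row insert per group

# Approximate timings reported in the chat for vanadium.
rate_300 = rows_per_second(300, 1.2)
rate_500 = rows_per_second(500, 2.0)

print(f"{insert_calls} insert calls for {len(events)} events")
print(f"about {rate_300:.0f} rows/s at batch size 300")
print(f"about {rate_500:.0f} rows/s at batch size 500")
```

Both reported timings work out to roughly 250 rows per second, which sits near the bottom of the expected range. That is consistent with the remark that there is still performance to be found.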
[20:10:11] Analytics-EventLogging, Analytics-Kanban, Mobile-Web: Follow up with mobile team on instrumentation sampling rate (%50) - https://phabricator.wikimedia.org/T88363#1055105 (Milimetric)
[20:25:12] YuviPanda, poke
[20:40:22] (CR) OliverKeyes: (WIP) project class/variant extraction UDF (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/188588 (owner: OliverKeyes)
[22:38:05] (CR) Nuria: (WIP) project class/variant extraction UDF (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/188588 (owner: OliverKeyes)
[23:38:26] Analytics-Kanban: Look into theoretical sqlalchemy performance to help EL troubleshooting - https://phabricator.wikimedia.org/T90302#1055728 (Milimetric) NEW a:Milimetric
[23:38:36] Analytics-Kanban: Look into theoretical sqlalchemy performance to help EL troubleshooting - https://phabricator.wikimedia.org/T90302#1055737 (Milimetric) Open>Resolved
[23:56:07] milimetric, http://zero.wmflabs.org/wiki/Main_Page
[23:56:10] (PS1) Milimetric: [WIP] Analyze failure rates by type [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/192029
[23:56:56] yurik1: AWESOME!
[23:57:23] that's so cool
[23:57:31] me like ))
[23:57:57] Eloquence wanted to see this too i think - http://zero.wmflabs.org/wiki/Main_Page