[00:09:23] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3168737 (Tnegrin) Folks -- this is for a high priority, community visible project ([[ https...
[00:58:29] Analytics-Cluster, Analytics-Kanban: Provision new Kafka cluster(s) with security features - https://phabricator.wikimedia.org/T152015#3353673 (faidon)
[01:19:35] (CR) Ottomata: [C: 1] "You are a magical man coding in a magical language." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147) (owner: Joal)
[11:05:32] (CR) Joal: "Thanks for the review @mforns. Most comments applied." (8 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/358916 (https://phabricator.wikimedia.org/T161150) (owner: Joal)
[11:06:18] (PS8) Joal: Use native timestamps in mediawiki history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/358916 (https://phabricator.wikimedia.org/T161150)
[11:43:35] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3354370 (Albertinisg) @Aklapper Github db dump file updated to...
[11:53:29] * fdans lunch!
[12:11:17] (PS7) Joal: Add new fields in mediawiki_history job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147)
[12:11:28] Taking a break a-team - Later
[12:38:06] milimetric: do you have a minute?
I have a vue question
[12:38:38] yes, one moment, gotta get some clothes on
[12:39:29] k omw fdans
[13:45:46] Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3354812 (Aklapper) > Next week I hope to start synchronizing th...
[14:42:21] Analytics-Kanban: Modify EL purging script to not use limit/offset - https://phabricator.wikimedia.org/T168071#3355041 (mforns)
[14:52:35] hey fdans, in the test setup, how do I get access to jasmine? Like to do jasmine.Ajax.install()?
[14:52:42] SMalyshev: I am the one working on the tagging so it is slower than anything else but we are pretty ready to merge a couple of changesets
[14:56:29] Analytics-Kanban, Patch-For-Review: Modify EL purging script to not use limit/offset - https://phabricator.wikimedia.org/T168071#3355100 (mforns) **Trial 1)** The initial idea was to quickly alter the `_get_old_uuids` method to get not only the list of uuids, but also the max timestamp of the events corr...
[14:57:05] milimetric: you may have to import jasmine
[14:57:22] I did import 'jasmine' and it says it can't find the module
[14:57:37] I looked it up and of course the internet thinks this is the easiest most obvious thing in the world so it doesn't tell me what to do
[14:57:50] this is where I'd delete all the code and start over :)
[14:57:58] confirmed: 13M user_ids have cumulative count coherent with number of edits :)
[14:58:04] * milimetric haaaaaates configuration
[14:58:06] This looks like a success
[14:58:21] joal: !!!!
[14:58:25] milimetric: what about doing npm install --save-dev jasmine?
[14:58:41] like user.edit_count is similar to what you're getting from your algorithm you mean?
[14:59:27] ... that worked fdans ... but how... jasmine was already installed
[14:59:32] no
[14:59:35] the jasmine runner was
[14:59:39] oh!
[14:59:47] but you probs need the library to use the spies and so on
[14:59:50] nope, I mean I have no inside-problems where max(cumul_edit_count) != count(revision)
[15:00:19] tests in js are rocket science
[15:00:25] joal: gotcha
[15:00:27] awesome
[15:00:37] milimetric: works for users - But not for pages
[15:00:47] still awesome
[15:00:53] :)
[15:01:31] ping milimetric
[15:01:48] fdans: 'cannot find module fs!!'
[15:01:50] what?!
[15:02:00] sad!
[15:02:15] that's a nodejs module right?
[15:02:17] weeeeird
[15:02:23] yeah... looks like the browser's looking for it
[15:08:39] (PS8) Joal: Add new fields in mediawiki_history job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147)
[15:12:29] Analytics-Kanban: Implement purging settings for Schema:ReadingDepth - https://phabricator.wikimedia.org/T167439#3355137 (mforns)
[15:12:52] Analytics-Kanban, Page-Previews, Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3355152 (mforns)
[15:13:54] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3355158 (mforns)
[15:14:11] Analytics-Kanban: Modify EventLogging so that all table fields are nullable - https://phabricator.wikimedia.org/T167161#3355159 (mforns)
[15:17:46] Analytics-Kanban, Patch-For-Review, User-Elukey: Improve purging for analytics-slave data on Eventlogging - https://phabricator.wikimedia.org/T156933#3355172 (mforns) @elukey @Marostegui @jcrespo Please see performance notes ^
[15:17:58] Analytics-Kanban, Page-Previews, Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3355175 (Tbayer) Looks good, thanks!
[15:22:53] SMalyshev: FYI, halfak is working on a project to quantify usage of wikidata entities in wikipedia pages; the data source for that work is different from webrequest, but just so you know, that work is also wip.
[15:23:01] Analytics-Kanban, Patch-For-Review, User-Elukey: Improve purging for analytics-slave data on Eventlogging - https://phabricator.wikimedia.org/T156933#3355183 (jcrespo) I have not much to add, both Riccardo and me warned about the performance penalty of large offsets, and that a different strategy ma...
[15:56:49] fdans: I'm having all kinds of problems with imports and still the jasmine problem, but whatever, I'll cobble something together after lunch and push
[15:57:11] awesome, not sure what's up with jasmine
[15:57:36] if you're still around we can work on it, otherwise we'll chat Monday
[16:08:48] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3355253 (Nuria)
[16:14:24] (PS9) Joal: Add new fields in mediawiki_history job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147)
[16:15:06] (PS9) Joal: Use native timestamps in mediawiki history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/358916 (https://phabricator.wikimedia.org/T161150)
[16:15:28] (PS10) Joal: Add new fields in mediawiki_history job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359019 (https://phabricator.wikimedia.org/T161147)
[16:22:01] (PS1) Joal: Improve resiliency of Banner streaming job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359461
[16:36:41] (CR) Nuria: Improve resiliency of Banner streaming job (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/359461 (owner: Joal)
[16:38:53] (Abandoned) Nuria: Add atjwiki [analytics/refinery] - https://gerrit.wikimedia.org/r/359062 (https://phabricator.wikimedia.org/T167720) (owner: Reedy)
[16:41:38] a-team, if you haven't seen the email to wmf-all, and received an email named Phishing Awareness, it's a phishing email itself!
[16:42:12] (CR) Nuria: "@joal: I seem to remember that we had an issue on pageview data where running jobs for daily and hourly on the same segments we run into i" [analytics/refinery] - https://gerrit.wikimedia.org/r/355598 (https://phabricator.wikimedia.org/T166967) (owner: Joal)
[16:43:10] joal: just commented on webrequest jobs, i am not clear why we need an hourly and a daily, sorry if you already explained this
[16:58:21] mforns: wow, that's artistic :)
[16:59:20] yea, I clicked on it :/, luckily did not fall into it and enter credentials, already notified aeryn and folks, seems there's no problem in my case
[16:59:48] I felt so stupid...
[16:59:59] still feel
[17:10:26] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3355390 (Nuria) @tnegrin: The data coming from these patches comes from mediawiki events, there is no h...
[17:19:03] nuria_: hourly and daily are for segment optimisation
[17:19:34] joal: didn't we have a problem in which those two did not work (the override of segments)?
[17:19:44] I don't recall any
[17:20:12] nuria_: Problem might be when you want to rerun, but that's a different story
[17:20:22] joal: on the pageview jobs, after we added the work to quantify the Turkey blockage that worked every hour?
[17:20:51] nuria_: we had an issue with streaming, not batch
[17:21:22] joal: ah, ok, and why don't we want to have just one hourly job? what is the downside?
[17:21:47] nuria_: segment optimisation, both in storage and performance
[17:23:33] nuria_: We could have only one, the difference is not huge
[17:23:59] Dropping for dinner lads, later
[17:42:03] nuria_: thanks! that's different indeed but good to know too
[17:51:50] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3355542 (kaldari) >Yes, we can help with this, do you want to put a changeset together we can coderevie...
[18:59:34] nuria_: ebernhardson mentioned you were asking about removing special handling in our search satisfaction event logging code for users without send beacon feature? looking at daily pageviews on pivot, it seems IE (which does not support this feature) is the leading browser among desktop users in Japan and Iran (by a mile) and the 3rd most popular browser in US. unfortunately it does not seem like we can comfortably exclude
[18:59:35] non-send-beacon-having users
[19:06:02] bearloga: jp and iran I have not looked at in detail, but overall on desktop IE >= 10 is not close to 3rd; if you look at numbers overall for IE you are lumping ie6, ie7, ie8, ie9 together
[19:06:22] bearloga: and none of those receives js at all, which means that you cannot possibly expect to see events
[19:06:51] bearloga: let me see if even ie10 receives javascript, not sure
[19:07:09] bearloga: these numbers might vary per country, give me a sec
[19:07:15] bearloga: so i can look at that
[19:07:20] nuria_: have not considered that scenario! thanks for looking into that.
[19:07:35] bearloga: ya, let's be super sure, cause this comes up often
[19:09:51] bearloga: i am slow looking at this, found it: https://www.mediawiki.org/wiki/Compatibility
[19:10:36] bearloga: so only ie10 and up are receiving javascript
[19:11:29] bearloga: but I think the most used version of ie is 11, followed by (relic) ie7
[19:12:11] (PS5) Bearloga: [cirrus] Distinguish morelike vs fulltext api search requests [analytics/refinery/source] - https://gerrit.wikimedia.org/r/345863 (owner: DCausse)
[19:13:36] bearloga: so, a better tradeoff criterion might be to look at ie10+ usage and whether that is enough in your case to keep supporting non-send beacon
[19:13:46] nuria_: we'd also be cutting out Safari users according to https://caniuse.com/#feat=beacon
[19:22:08] bearloga: ah, i forgot about these, i thought safari had moved onto this as the discussion about it is kind of old; that matters for mobile quite a bit (40%) but not for desktop so much, so up to you since i think your code is desktop only. The basic question is: can i gather enough data to make statistically valid conclusions just supporting sendbeacon enabled browsers? Note that the sendbeacon support is only
[19:22:08] going to matter for analytics that are sent on page transitions so it is not the bulk of your analytics
[19:23:46] bearloga: can elaborate further, that came out a little concise.
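Editor's note on the sendBeacon exchange above: a minimal sketch of the "special handling" being weighed, namely using `sendBeacon` where the browser has it and falling back (or dropping the event) where it does not. Everything named here is an assumption for illustration: `BeaconCapable`, `sendEvent`, the `Transport` labels and the `/beacon/event` URL are hypothetical and are not taken from the search satisfaction code discussed in the log.

```typescript
// Hypothetical sketch, not the actual event logging code.
// BeaconCapable models only the part of `navigator` this example needs.
interface BeaconCapable {
  sendBeacon?: (url: string, data?: string) => boolean;
}

// Which path an event took; useful for counting how much data a
// sendBeacon-only policy would lose.
type Transport = "beacon" | "fallback" | "dropped";

function sendEvent(
  nav: BeaconCapable,
  url: string,
  payload: Record<string, unknown>,
  fallback?: (url: string, body: string) => void,
): Transport {
  const body = JSON.stringify(payload);

  // Feature-detect instead of sniffing the user agent.
  if (typeof nav.sendBeacon === "function") {
    // sendBeacon returns false if the browser could not queue the request.
    if (nav.sendBeacon(url, body)) {
      return "beacon";
    }
  }

  // Browsers without sendBeacon end up here; the caller decides whether
  // supporting them is worth a fallback.
  if (fallback) {
    fallback(url, body);
    return "fallback";
  }
  return "dropped";
}
```

In a browser, `navigator` could be passed as `nav`. Removing the special handling corresponds to calling `sendEvent` without a `fallback`, in which case events from browsers lacking `sendBeacon` are counted as `"dropped"`.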
[19:33:29] (PS16) Nuria: UDF to tag requests [analytics/refinery/source] - https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021)
[19:53:07] (PS17) Nuria: UDF to tag requests [analytics/refinery/source] - https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021)
[19:54:33] (CR) Nuria: UDF to tag requests (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/353287 (https://phabricator.wikimedia.org/T164021) (owner: Nuria)
[20:00:51] fdans: ok, pushed into feature/aqs-api
[20:00:59] lookin decent, but we should sync up next
[23:09:41] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3356350 (Tbayer) Looks good, thanks!