[00:10:54] quiddity: and for the parts I don't want to be translated (for example the following section), shall I just manually remove what the bot has put on the page?
[00:11:03] quiddity: thank you!
[00:13:19] leila: I think it would just replace that text (i.e. overwrite your removal) as soon as the page was next updated and marked for translation. I.e. it wouldn't be reliable.
[00:13:56] Hmm, I'm trying to think of transclusion options, but I'm not sure how that might/could work.. :/
[00:15:45] leila, ah, no, the problem is more that it simply doesn't offer any method to delete the untranslated text!
[00:16:02] * leila reads
[00:16:45] i.e. there is no "edit" function for that /ar subpage
[00:16:53] quiddity: ok. no worries. at the moment it looks like we are encouraging the translation of the whole page, which is not the intention.
[00:16:55] yeah. :D
[00:17:07] subbu has promised to fix this. ;)
[01:17:59] I think I'm done for the day. moved all translations of the taxonomy to https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Taxonomy_of_Wikipedia_use_cases Tomorrow I will be finalizing this page, and making significant progress on the Prevalence card at https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour
[01:18:20] srrodlund: ^ (for whenever you look at this page)
[01:18:32] quiddity: thanks a lot for your help earlier. have a good night.
[04:16:01] Quarry, Patch-For-Review: Do the big Quarry migration - https://phabricator.wikimedia.org/T202588 (zhuyifei1999) @Framawiki I think we can proceed with the migration (the test instance at https://quarry-dev.wmflabs.org/ shares NFS with the production instance so I'm too afraid to do any real testing ther...
[15:24:29] Hi nuria! I have 2 quick follow-up questions from yesterday!
[15:24:36] miriam: yes
[15:25:34] nuria: 1) is the sampling rate set at schema level? Or is it possible to have 50% sampling for pageload events, and 100% for all other events?
[15:26:31] nuria: 2) is the session_token set at schema level? Or is it easy to map sessions between 2 schemas, say one capturing pageloads, and the other capturing the interaction with citations?
[15:26:43] nuria: sorry if the questions sound silly :)
[15:27:34] miriam: 1) sampling is normally set at schema level. you could tweak it in your code to be whatever you want it to be, but at that point (for ease of analysis) it seems like you would almost want 2 different schemas, as the data gathered at pageload and the data gathered when other events happen are different
[15:28:15] nuria: yes, sounds reasonable - hence my second question :)
[15:28:18] 2) session is per browser per wiki: once you open it.wikipedia you are given a sessionId, and that is yours until you close your browser (not tab)
[15:29:09] miriam: you also have access to a pagetoken that is constant for the pageview (not the session) and that can be used, for example, for AB testing whose unit of diversion is the pageview
[15:29:46] miriam: so while sessionId in EL is only retained for 90 days, it will be available for you to join data from 2 schemas
[15:30:03] miriam: one that captures event1 and the other event2
[15:30:06] miriam: makes sense?
[15:30:53] nuria: ok, yes, makes a lot of sense! I wanted to make sure the schema type was not used to generate the session_token.
[15:31:22] miriam: session_token is used by mediawiki for many other things, EL just consumes it, it does not generate it
[15:31:37] miriam: so yes, it is available (for 90 days) to interrelate events
[15:32:21] Thanks a lot :) So, if possible, we will probably go for 2 different schemas then, reducing the sampling of pageload events to 30-50% (800-1000 events/sec), and having another schema capturing all interactions with citations (150 events/second)
[15:32:58] nuria: since you are here, is there any way to link the session_token to the data in the webrequest table?
[15:33:28] miriam: on meeting, can talk in a bit
[15:33:39] nuria: sure, thanks!
[16:00:16] miriam: no, there si no way to link those two
[16:00:20] *there is
[16:00:27] *there is no way
[16:01:02] nuria: ok, thanks, just double checking that we actually need the page load events.
[16:01:17] nuria: thanks a lot, I am learning a lot :)
[17:01:24] nuria: re miriam's question above, is the question of linking or not a technical question or a question of principle?
[17:10:04] leila: it is not technically possible at this time, nor will we consider making it so at any time, because we have already discarded the idea of using tokens on the web
[17:10:39] nuria: ok
[18:00:19] For those of you interested in Jupyter notebooks: ACM's upcoming webinar is on this topic. It's going to take place tomorrow at 12:00 EST. The talk will be archived at https://learning.acm.org/webinars and the link to it is, I think: https://event.on24.com/wcc/r/1818551/31C075B8B757778F790379D1DF363E0A
[19:34:38] srrodlund: I just pinged you at https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Taxonomy_of_Wikipedia_use_cases . I did my part and we're almost done with it once you put in your sign-off.
[19:34:50] * leila wonders which card to pick up next.
[20:26:03] Quarry, Patch-For-Review: Do the big Quarry migration - https://phabricator.wikimedia.org/T202588 (zhuyifei1999) Scheduled for 7pm UTC next Wednesday
[21:03:36] guillom: I'm joining remotely. :D
[21:03:42] srrodlund: ^
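
Note on the 15:25-15:32 exchange: the plan discussed there (two schemas with different sampling rates, joined afterwards on the shared session token) could be sketched roughly as below. This is only an illustration under stated assumptions, not how EventLogging itself samples or stores events. The schema names, the event dictionaries, the field name `session_token`, and both helper functions are hypothetical. Sampling by a hash of the session token is also an assumption, chosen here so that a session is either fully kept or fully dropped, which keeps the later join meaningful.

```python
import hashlib
from collections import defaultdict

# Hypothetical per-schema sampling rates, loosely following the plan in the chat:
# roughly half of pageload events, all citation interaction events.
SAMPLING_RATES = {"PageLoad": 0.5, "CitationInteraction": 1.0}


def in_sample(session_token: str, rate: float) -> bool:
    """Deterministically keep about a fraction `rate` of sessions.

    The token is hashed and the first 8 bytes are compared, as an integer,
    against rate * 2**64, so the same token always gets the same answer.
    """
    digest = hashlib.sha256(session_token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < rate * 2**64


def join_by_session(pageloads, interactions):
    """Group events from two schemas under the session token they share."""
    joined = defaultdict(lambda: {"pageloads": [], "interactions": []})
    for event in pageloads:
        joined[event["session_token"]]["pageloads"].append(event)
    for event in interactions:
        joined[event["session_token"]]["interactions"].append(event)
    return dict(joined)


if __name__ == "__main__":
    # Made-up events purely for illustration.
    pageloads = [
        {"session_token": "abc", "page": "A"},
        {"session_token": "xyz", "page": "B"},
    ]
    interactions = [{"session_token": "abc", "action": "citation_click"}]

    kept = [e for e in pageloads
            if in_sample(e["session_token"], SAMPLING_RATES["PageLoad"])]
    print(join_by_session(kept, interactions))
```

Because sessionId is retained for only 90 days according to the discussion, a join like this would have to run within that window. The pagetoken mentioned at 15:29:09 could be used as the grouping key instead if the unit of analysis is the pageview rather than the session.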