[05:15:37] Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284548 (zhuyifei1999) >>! In T197258#4283799, @Framawiki wrote: > Mmm, is it some kind of monitoring tool ? benchmark test ? :) @zhuyifei1999 I don't know of one.
[05:26:10] Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284549 (zhuyifei1999) > 20240 14.29% GET HTTP/1.1 /query/new This should flood the new query list. Maybe we can check the list and see if there's something unhuman there?
[05:35:38] Quarry: Ask python scripts to use custom user agents - https://phabricator.wikimedia.org/T197258#4284551 (zhuyifei1999) ``` MariaDB [quarry]> SELECT -> COUNT(DISTINCT query.id) AS numempty, -> user_id, -> (SELECT username FROM user WHERE user.id = user_id) AS username -> FROM query...
[11:47:06] Quarry, DBA, Data-Services: Cannot reliably get the EXPLAIN for a query on analytics wiki replica cluster - https://phabricator.wikimedia.org/T195836#4291244 (jcrespo) p:Triage>Low So the workaround for now is to make sure one is connected to the same server by doing: ``` SELECT @@GLOBAL.hos...
[15:13:04] * leila wonders what's going on in phabricator
[15:13:43] sgoel: good morning. sooo, do you have time around 13:00 PT for a chat with Ashton?
[15:14:08] dsaez: hellooo. whenever is a good time for you, and I appreciate it's already late, let's have a chat about readers?
[15:14:32] leila: yes!
[15:14:42] sgoel: thanks. expect an invitation.
[15:20:44] leila, now?
[15:26:15] yup
[15:51:53] leila, btw: there was vandalism on phabricator -- vandals must be really bored these days :) https://phabricator.wikimedia.org/T197444
[15:52:50] miriam_: right. Aklapper was on it, as always. :D
[15:58:39] bmansurov: can you remind me about what comes through event capsule? On top of what's in https://meta.wikimedia.org/wiki/Schema:CitationUsage do we also get IP (or hashed IP) and UA, or not?
[15:59:19] leila: o/ https://github.com/wikimedia/eventlogging/blob/master/eventlogging/capsule.py#L31
[15:59:30] * leila checks
[15:59:38] IP and UA are included
[16:00:11] bmansurov: thanks. makes sense.
[16:25:47] bmansurov, leila: did we define how long will the data collection last once we jump to 100% of sample?
[16:27:55] miriam: you can put it tentatively for one week to have a full cycle of usage capturing potentially daily differences.
[16:28:09] o/ leila
[16:28:17] working on slides for the vision stuff.
[16:28:18] leila: great, i thought so!!
[16:28:24] halfak: o/
[16:28:28] What's the tech that is behind the recommender API?
[16:28:37] Is it a relational DB? A big memory-mapped file?
[16:29:01] bmansurov: given that you're doing the latest updates, you know the most to answer halfak's above.
[16:29:16] Oh yeah. :) Should have pinged bmansurov :D
[16:29:32] halfak: I've been sinking in the past days. I had yesterday to work on it, and the consulate stuff happened. :(
[16:29:51] I'll try to catch up on Saturday morning, and if I don't, then I'll just support your case.
[16:31:47] leila, do you think you'll still have time next week?
[16:32:37] Monday afternoon is a good option, halfak, I can block it.
[16:32:43] And then Tuesday afternoon.
[16:32:46] Cool. Me too then.
[16:32:55] Arg! Showcase!
[16:32:55] halfak: not sure if you work on Tuesday. I will.
[16:33:01] halfak: yup. :D
[16:33:15] Yeah. In fact I think Tuesday will work better.
[16:33:18] halfak: my Monday is one of the maddest I've seen so far.
[16:33:23] halfak: ok.
[16:33:34] halfak: I'll be remote on Tuesday, but I will be available.
[16:33:47] lol Tuesday is a holiday
[16:33:50] Of course it is.
[16:33:53] I know. :D
[16:33:55] I'm gonna be in the office either way
[16:34:05] One of these days I'll use up all my comp days.
[16:34:10] halfak: I moved mine to Friday. I really can't take off in the middle of the week. :(
[16:34:50] Yeah. that's a hard day to take. Oh well :)
[16:35:15] o/ halfak: currently a big ndarray I think
[16:35:28] it's compressed on the first load and saved to the filesystem
[16:35:36] halfak: basically half of our team is not off, and all external entities will continue sending email to you anyway. so you will have a nightmarish Wednesday to catch up and pay back. :D
[16:35:47] Right!
[16:35:50] then upon the start of the service, it gets uncompressed and converted to a numpy object
[16:36:00] I started some work towards moving this file to a DB
[16:36:07] priorities have changed
[16:36:19] bmansurov, aha! So it's a huge in-memory array.
[16:36:28] halfak, yes about 2-3GB
[16:36:39] Gotcha. So eventually it'll be a DB? Thinking relational?
[16:36:50] halfak, that's what i was thinking
[16:36:55] Gotcha.
[16:36:58] Thank you :)
[16:37:00] not sure maybe there's a better way
[16:37:02] np
[16:37:14] * halfak just needs a high level view so he can wave his hand in the right direction :)
[16:37:23] ;)
[18:30:07] Quarry: Add nofollow attribute in some links to prevent bots from following unnecessary ones - https://phabricator.wikimedia.org/T197488#4293811 (Framawiki)
[18:31:59] Quarry, Cloud-Services: GoogleDocs bot has download 125 000 csv exports in the last month - https://phabricator.wikimedia.org/T197256#4293823 (Framawiki) I've found this page from google help center that describe their import function It can match our problem. https://support.google.com/docs/answer/30933...
[21:31:18] bmansurov: great to see T190774#4293991 . :)
[21:31:19] T190774: Improve the prioritization algorithm used in recommendation API - https://phabricator.wikimedia.org/T190774
[21:32:59] leila: it was a long time coming
[21:33:09] hopefully the rest will go smoother
[21:33:27] the first one is always the hardest.
[21:34:15] the excitement of setting it up and seeing how results improve will take you through the rest. I must say that I shouldn't have told you to start from that feature. That gives the model the biggest boost and may set the bar too high. ;)
[21:34:59] hah, I figured if we stopped adding features, then we'd at least have something that works OK
[21:35:24] bmansurov: yup. that's one way to look at it.
[21:35:33] :P
[23:27:55] all. I'm going to gradually start slowing down and sign off, maybe. ;) have a good weekend for those of you the weekend starts nowish.
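
Note on the 16:35 exchange about the recommender API's storage: a minimal sketch of the load path as described in the log (compress on the first load, save to the filesystem, then uncompress into a numpy object when the service starts). The file names, the tab-separated raw format, and the `load_scores` helper are assumptions made for illustration only; the log does not say how the recommendation API actually stores or reads its data.

```python
import os

import numpy as np


def load_scores(raw_path, cache_path):
    """Return the score matrix as an in-memory ndarray.

    Hypothetical helper: `raw_path` is assumed to be a tab-separated
    text file and `cache_path` a file name ending in ".npz".
    """
    if not os.path.exists(cache_path):
        # First load: parse the raw file once and keep a compressed copy.
        data = np.loadtxt(raw_path, delimiter="\t", ndmin=2)
        np.savez_compressed(cache_path, data=data)
    # Service start: uncompress the cached copy into a numpy array.
    with np.load(cache_path) as archive:
        return archive["data"]
```

After the first call, later starts read only the compressed cache, so the raw file does not need to be parsed again. The whole array still has to fit in memory, which matches the "about 2-3GB" in-memory array mentioned in the log.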