[08:09:49] awight hey! you around? [08:12:12] whenever you see this, can you see https://github.com/saurabhbatra96/wmf-samplecodes#classifier-comparisions [08:12:33] Gradient Boost and Random Forest are performing wonderfully! [08:36:12] Scoring-platform-team, revscoring, artificial-intelligence: Take advantage of word2vec signal in all models - https://phabricator.wikimedia.org/T197007#4278170 (Aklapper) [09:35:06] Scoring-platform-team, Fundraising-Backlog: Machine Learning for Fraud Detection - https://phabricator.wikimedia.org/T190523#4278392 (awight) [10:35:27] Scoring-platform-team, ORES: Take advantage of word2vec signal in all models - https://phabricator.wikimedia.org/T197007#4278607 (awight) @Aklapper thanks! [10:37:53] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, Patch-For-Review, artificial-intelligence: Train/test article quality model for euwiki - https://phabricator.wikimedia.org/T171119#4278623 (awight) Ping #global-collaboration, I thought you might want to know that this exis... 
[11:01:59] Scoring-platform-team: [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096#4278718 (awight) [11:02:12] Scoring-platform-team: [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096#4278729 (awight) [11:02:14] Scoring-platform-team, ORES: Consider storing all files in datasets - https://phabricator.wikimedia.org/T192617#4278728 (awight) [11:03:15] Scoring-platform-team: Migrate models to LFS - https://phabricator.wikimedia.org/T197097#4278732 (awight) [11:06:13] Scoring-platform-team, Huggle, JADE: Use JADE as a repository for ORES counterexamples - https://phabricator.wikimedia.org/T197098#4278749 (awight) [11:17:52] Scoring-platform-team, JADE, Operations, User-Joe: Scalability concerns creating a page per revision - https://phabricator.wikimedia.org/T196547#4278789 (awight) Adding onto @Halfak's comments, I agree that social convention seems to be the best way to protect against runaway JADE usage. Specifi... [11:18:58] Amir1: ^ it would be great if you could weigh in with your concerns, if you still think we shouldn't deploy? [11:19:24] Aaron has me convinced that we have to rely on social conventions. [11:19:45] awight, got a minute? [11:20:13] saurabhbatra: hey yeah! [11:20:30] so I'm writing the cross-validation code right now [11:20:56] was wondering how do I get an average of the PR curves? [11:21:08] ooh interesting, I'm not sure it's in there [11:21:25] yup, could not find it in the sklearn docs [11:21:57] Here's an answer, https://stackoverflow.com/questions/26587759/plotting-precision-recall-curve-when-using-cross-validation-in-scikit-learn [11:22:57] yeah I did have a look at this [11:22:58] Might also be a nice format to plot all of the P-R curves, to give an idea of the error magnitudes [11:23:40] so I just combine the data from all the iterations and plot the curve using it, right? [11:24:00] I guess so... 
not sure what this can be used for, though [11:24:15] If you want the mean, you might as well plot the P-R curve of the final model, right? [11:24:46] Scoring-platform-team, JADE, Operations, User-Joe: Scalability concerns creating a page per revision - https://phabricator.wikimedia.org/T196547#4278810 (awight) @Ladsgroup it would be great if you could weigh in with your concerns, if you still think we shouldn't deploy? [11:27:50] awight -> awight_mob, going mobile for an hour [11:29:45] I'll have a look see and get back to you [12:04:15] Hi Amir1 and awight_mob:. We discussed moving today's meeting at the sync [12:04:23] But I forgot to actually move it. [12:12:32] halfak_: to when actually? [12:16:29] halfak_: It now collides with the Technical Advice IRC meeting in which I'm the host :( [12:17:51] Damn [12:18:18] I don't see that in your calendar [12:19:07] Amir1: are you sure it overlaps? [12:19:42] halfak_: It's in WMDE account [12:19:50] did you pull up both? [12:22:47] Arg. I guess not. [12:23:16] We could meet an hour later. That might be hard for awight. [12:23:47] :( [12:29:27] Amir1, I need to leave now for my Doc's appt. Can you coordinate with awight to figure out what will work for him? [12:29:57] Either meet without me, or maybe push the meeting back one more hour if awight can make it. [12:30:00] Either is OK. [12:30:03] Sorry for the trouble [12:32:42] Sure! [12:53:31] saurabhbatra: back again. Any progress towards graphing question? [12:53:48] yup, it's done [12:54:25] just clubbed the data for all the folds together and used it to plot the final curve [12:55:35] none of the graphs actually changed so I guess we can safely say we're not overfitting [12:56:36] going to start work on logistic regression next, let's see what that gives us! [13:04:02] Fantastic! [13:04:44] Ooh what is saurabhbatra working on? 
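Editor's sketch of the pooling approach described at 12:54 ("clubbed the data for all the folds together"). The log does not show the actual code, so everything here is an assumption for illustration: the function name `pooled_pr_curve`, the fold layout of (labels, scores) pairs, and binary 0/1 labels. It is plain Python rather than the sklearn call that was presumably used; the idea is only that held-out predictions from every fold are concatenated and a single precision-recall curve is computed over the pooled set.

```python
from typing import List, Sequence, Tuple

# One fold = (true 0/1 labels, predicted scores) for that fold's held-out rows.
# This layout is hypothetical; adapt it to however the folds are stored.
Fold = Tuple[Sequence[int], Sequence[float]]


def pooled_pr_curve(folds: Sequence[Fold]) -> List[Tuple[float, float, float]]:
    """Pool held-out predictions from all folds and return one P-R curve.

    Each point is (threshold, precision, recall), where a row counts as
    predicted positive when its score is >= threshold.
    """
    # Concatenate (score, label) pairs across every fold.
    pairs = [(s, y) for labels, scores in folds for y, s in zip(labels, scores)]
    total_pos = sum(y for _, y in pairs)
    if total_pos == 0:
        raise ValueError("need at least one positive label to compute recall")

    # Walk thresholds from the highest score down, accumulating counts.
    pairs.sort(key=lambda p: p[0], reverse=True)
    curve = []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every row that shares this score so ties form one point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        curve.append((threshold, tp / (tp + fp), tp / total_pos))
    return curve


# Two tiny made-up folds, purely for demonstration.
example_folds = [([1, 0], [0.9, 0.8]), ([1, 0], [0.7, 0.1])]
for threshold, precision, recall in pooled_pr_curve(example_folds):
    print(f"{threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

Pooling gives one curve over all held-out rows; it is not the same as averaging per-fold curves, and it hides fold-to-fold spread, which is why plotting the individual fold curves (suggested at 11:22:58) may still be worth doing alongside it.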
[13:04:59] * halfak_ types from doctor's office [13:05:33] Hey halfak_ [13:05:41] Long time no chat [13:05:56] Hey Vermont [13:06:14] I might go afk at any moment, fyi [13:06:31] oh, k [13:06:59] Well, ORES doesn’t seem to be enabled on simplewiki [13:07:06] Or I just can’t find out how to enable it for me :/ [13:07:34] Wasn't it enabled before? Did something change? [13:08:51] I don’t think it was ever enabled tbh [13:08:53] I just didn’t notice :/ [13:09:58] It would be very useful for RC... [13:10:08] halfak_: saurabhbatra has been doing some great work on https://phabricator.wikimedia.org/T190523 , to bring donation fraud detection in-house. [13:11:04] Funny story, the WMF almost got knocked into a punitive rate schedule for credit card processing a few years ago, after crossing some quite low fraud threshold during our busiest month. [13:12:09] Come to think of it, the anomaly detection work might be relevant to the edit quality modeling, we should investigate that. [13:12:42] Vermont: I can take a look, thank you for letting us know! [13:12:59] Thanks. [13:13:21] I used to have a script in my userspace that Halfak told me to put in there, and that worked for ORES (using the English model) [13:13:56] The task says it's done, https://phabricator.wikimedia.org/T182012 -- I'll try it on the live site in case it's just our settings. [13:14:05] Alright thanks. [13:14:45] Vermont: I see the filters and they work for me. [13:15:04] Vermont: the ORES preferences are actually sort of tricky, you might be hitting an edge case here [13:15:21] considering that you're in the rare position of having tried things before they were deployed... [13:15:26] How would I enable them? [13:15:38] lessee... [13:16:03] Vermont: my settings -> new changes [13:16:17] Do you have "Hide the improved version of Recent Changes" checked? It should not be. [13:17:09] Actually, I had the one to highlight in RC off... [13:17:14] So, it worked. 
[13:17:19] interesting [13:17:22] Thanks for your help! Sorry for being an idiot :/ [13:17:25] Not at all! [13:17:29] It's working now? That's great! [13:17:34] Yep :D [13:18:00] great news, and thanks for bringing the question here. I'm sure other people are going to run into this... [13:22:51] Vermont: I'm trying to write a FAQ entry, but not entirely sure which checkbox you found was interfering. What changes did you make? [13:23:02] There's a chance that simply opening and saving preferences might have fixed it. [13:33:41] oh [13:33:51] Let me check [13:34:30] https://usercontent.irccloud-cdn.com/file/Xm2T8zj6/IMG_0179.PNG [13:34:41] “Highlight likely problems...” [13:34:59] I had it unchecked. [13:36:29] Strange--would you mind unchecking again and seeing if RC still works? [13:36:41] AFAIK, it shouldn't be necessary to have that setting [13:36:46] those are all deprecated [13:40:19] awight: hey [13:40:44] holla! [13:41:27] awight: buenos dias, is it okay if we push the staff meeting back by one hour? [13:41:52] Amir1: until when? 7pm CEST? 
[13:42:06] yup [13:42:07] err I guess that would be 6pm [13:42:16] hehe now I'm confused [13:42:43] in Berlin time, from 18:00 to 19:00 [13:42:44] :D [13:42:55] (I'm lazy I know) [13:43:08] hardly lazy [13:43:22] ah yeah I think I may be able to make that time work [13:49:33] Scoring-platform-team (Current), JADE, Operations, TechCom, and 2 others: Deploy JADE extension to production - https://phabricator.wikimedia.org/T183381#4279219 (awight) [13:49:50] Scoring-platform-team (Current), JADE, MW-1.32-release-notes (WMF-deploy-2018-06-12 (1.32.0-wmf.8)), Patch-For-Review: Implement multi-judgment and endorsement schema for Extension:JADE - https://phabricator.wikimedia.org/T194219#4279220 (awight) Open>Resolved [13:49:52] Scoring-platform-team (Current), JADE, Operations, TechCom, and 2 others: Deploy JADE extension to production - https://phabricator.wikimedia.org/T183381#3851603 (awight) [13:54:37] Great! [13:58:52] Hey! I'm back [13:59:08] Did the meeting happen or did we decide to reschedule? [13:59:19] o/ awight Amir1 [14:02:20] ack, headed into meeting [14:11:16] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, Patch-For-Review, artificial-intelligence: Train/test article quality model for euwiki - https://phabricator.wikimedia.org/T171119#4279325 (Theklan) >>! In T171119#4277181, @Ragesoss wrote: > Thanks for the ping. @Theklan,... [14:37:11] Scoring-platform-team, Collaboration-Team-Triage (Collab-Team-This-Quarter): Enable srwiki edit quality filters in RecentChanges - https://phabricator.wikimedia.org/T197012#4279410 (JTannerWMF) a:Catrope [14:37:26] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, Patch-For-Review, artificial-intelligence: Train/test article quality model for euwiki - https://phabricator.wikimedia.org/T171119#4279413 (awight) >>>! In T171119#4278623, @awight wrote: >> Ping #global-collaboration, I th... 
[14:37:50] awight: sorry for the late response. When I uncheck that, it is not highlighted in RC. [14:38:22] Scoring-platform-team, Collaboration-Team-Triage (Collab-Team-This-Quarter): Enable bswiki edit quality features - https://phabricator.wikimedia.org/T197010#4279420 (JTannerWMF) a:Catrope [14:38:31] Vermont: ooh how strange. Okay, thanks for the report! [14:39:41] Amir1: I can't really tell if techcom will discuss the JADE questions today... Meanwhile, would you mind putting your thoughts into the task? [14:41:19] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, Patch-For-Review, artificial-intelligence: Train/test article quality model for euwiki - https://phabricator.wikimedia.org/T171119#4279436 (Theklan) Ok, so my guess is that is not possible yet to bulk-extract this informati... [14:46:12] Scoring-platform-team (Current), Wikilabels, articlequality-modeling, Patch-For-Review, artificial-intelligence: Train/test article quality model for euwiki - https://phabricator.wikimedia.org/T171119#4279464 (awight) >>! In T171119#4279436, @Theklan wrote: > Ok, so my guess is that is not po... [14:53:06] o/ gtg, hopefully back in time for our meeting [15:57:58] Scoring-platform-team (Current), ORES, WMF-Design, WikimediaUI Style Guide, and 2 others: Give a new look to the home page - https://phabricator.wikimedia.org/T196580#4279883 (Volker_E) [16:57:39] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280214 (awight) [16:57:55] Amir1: halfak: ^ if you don't mind taking a look [16:58:59] "JADE meta-judgements"? [16:59:20] judgments about judgments. [16:59:23] lemme take that out. 
[17:00:25] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280229 (awight) [17:00:26] does that work? [17:01:19] It's sort of an eccentric point to even make, I think I'm blinded by the programmerly neatness of infinite uroboros [17:02:13] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280234 (Halfak) [17:02:14] took it out entirely [17:02:15] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280237 (awight) [17:02:18] uh, oh [17:02:23] Woops. I just saved my edit over yours :( [17:02:24] Sorry [17:02:32] I think I won actually [17:02:42] Arg [17:02:45] I'll get it. [17:02:46] I hate this software [17:02:58] ok I'm hands-off [17:03:35] Is degraded performance of the page table the primary concern here? [17:03:39] Or is it storage [17:03:42] I think it's storage [17:03:47] I think it's performance but don't know [17:03:49] just guessing [17:04:11] The indexes shouldn't struggle to manage trillions of rows [17:04:17] btrees are log scaled [17:04:35] So you'd need to have a billion billion billion before it would be one cycle slower. [17:04:50] Or maybe just a billion billion [17:04:53] But you get what I mean [17:04:55] In practice, I see mysql eat dirt around 1M rows [17:04:59] here's a thing: https://stackoverflow.com/questions/1276/how-big-can-a-mysql-database-get-before-performance-starts-to-degrade [17:05:14] anyway, yeah we can leave out the consequence part and just explain the immediate impact [17:05:42] "The physical database size doesn't matter. The number of records don't matter." [17:06:01] "The most important scalability factor is RAM. 
If the indexes of your tables fit into memory"... [17:06:04] some shit [17:06:08] we don't have to care in this case [17:06:18] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280245 (Halfak) [17:06:31] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4260809 (Halfak) [17:06:47] Oh yes. But revision is WAY bigger on indexes [17:07:19] I think that, in the worst case scenario, we have mostly 1-2 revisions per page. [17:08:06] Maybe I'm missing something about why page is so scary. I'd like to see that explained. [17:08:39] Me too, but I'm just working from the POV that I should care about proportions. [17:08:53] If we add 5% to a table, meh. If we add 300%, something is going to burst. [17:09:06] Not well founded in facts. [17:09:24] Amir1: ^ do you have insight into what's wrong with the page table? [17:11:20] Anyway, we're talking about doubling the revision table in those worst-case numbers, so that's clearly not okay. [17:12:46] halfak: Thanks for the task clarifications, +1 on all. I'm going to edit the description a bit more, if you're out of it? [17:14:45] Scoring-platform-team (Current), JADE, Operations, User-Joe: Extension:JADE scalability concerns due to creating a page per revision - https://phabricator.wikimedia.org/T196547#4280307 (awight) [17:14:49] Boldly risked another edit conflict. [17:15:59] also Daniel mentioned that we can have jade pages on edits that happen in jade pages, it can easily turn into a recursive match [17:16:47] Amir1: we do want to support that, but in moderation. The problem is with infinite recursion, nothing else, right? [17:17:27] yeah, I don't think this will be a big deal though. 
We can leave it to the community [17:17:29] I just took a description of that issue out of the task, I think it's just a distracting subset of the main problem. [17:17:32] +1 [17:17:45] I was totally distracted myself, cos recursion is fun. [17:18:11] I stopped myself from writing another paragraph about how the previous paragraph was an example of debilitating meta-judgment :p [17:21:48] aww, a quote I love has been debunked. https://quoteinvestigator.com/2013/04/23/good-idea/ [17:28:23] awight, sorry I didn't say so but I'm done editing [17:28:56] Amir1, I don't think it's recursive in a scary way. [17:29:06] Auto-recursion is scary. Human-directed recursion is not [17:29:31] Example, let me tell you a story. I will do a depth-first search of what I think is interesting ;) [17:31:17] Running out to lunch [17:31:17] o/ [18:01:47] afk for dinner, will be back soon [18:01:49] sushiiiiiii [19:52:23] halfak: Probably fine to deploy T196448 now? [19:52:24] T196448: Automatically redirect users to the correct category in the UploadWizard (Wikimedia Commons) - https://phabricator.wikimedia.org/T196448 [19:52:36] * T196468 [19:52:36] T196468: Catch a specific new badword in English - https://phabricator.wikimedia.org/T196468 [19:52:53] * awight butters up deployment keyboard [20:06:18] +1 awight [20:06:20] sorry was on call [20:15:25] Scoring-platform-team, ORES, Release-Engineering-Team, Performance: Try to increase ORES deployment parallelism - https://phabricator.wikimedia.org/T197180#4281166 (awight) [20:16:05] Scoring-platform-team: [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096#4281179 (awight) [20:16:08] Scoring-platform-team, ORES, Release-Engineering-Team, Performance: Try to increase ORES deployment parallelism - https://phabricator.wikimedia.org/T197180#4281178 (awight) [20:16:51] Scoring-platform-team, ORES: Consider committing all non-private datasets to the repos - 
https://phabricator.wikimedia.org/T192617#4281193 (awight) [21:12:14] I'm going AFK for a bit. I'll be back in 2 hours for a call with the googles. [21:12:49] google is calling you? [21:27:25] I think that was literal, yeah [21:35:29] halAFK: Quite a nasty deployment artifact. [21:37:18] I'm counting worker processes... [21:42:10] they are there. I wish I had an explanation for the drop in requests processed. [21:46:49] Oh, I see--precaching responses have completely stopped on codfw. [21:49:13] Aaand, we're back to normal. I don't know what that was. [21:50:36] awight how long are you in the netherlands for? :) [21:50:44] and lol @ smell heh [21:51:30] paladox: a few more weeks--are you based nearby / visiting? [21:51:42] awight i am across the pond [21:51:44] the uk! [21:52:02] well only across from netherlands, i am near london. [21:52:11] ah that's what I thought. I got to visit Cardiff recently, it was surprisingly beautiful [21:52:20] heh [21:52:25] i went to lands end [21:52:30] I was expecting something more like a Dickens scene [21:52:38] awight: you should come visit Spain sometime [21:52:40] but also cut my self running on rocks (on the beach) [21:52:43] heh [21:53:53] Hauskatze: :D I also got to visit Catalonia recently, I loved it... [21:53:56] awight did you see the white cliffs of dover? [21:54:24] noo just downtown [21:54:33] awight downtown? [21:54:38] which city? [21:54:41] or town [21:54:44] exeter? 
[21:54:51] Barcelona is too noisy and crowded, as is Madrid [21:54:52] plymouth, southampton [21:54:54] dover [21:55:16] big city problems I guess [21:55:25] heh [21:55:34] i like cambridge [21:55:38] really nice city center [21:55:42] shops i mean [21:55:45] john lewis is huge [21:55:45] analytica :P [21:55:52] lol [21:56:04] milton keynes has a nice shopping center [21:56:11] but haven't been there in yearrrrrs [21:56:15] i iz in ur facebook stealin' ur data [21:56:19] yet i only live 30 mins from it [21:56:21] lol [21:56:28] Hauskatze nothing much to steal from me [21:56:31] * Hauskatze has never been to the UK, got to visit sometime [21:56:42] neither from me, I have no facebook [21:56:48] except my profile picture (which is like when i was 12-13 (i am a lot older now)) [21:56:53] no twitter, no instagram, no nothing [21:57:06] but I have a Phabricator profile lol [21:57:09] i have only fb [21:57:36] oh and it does have my current location [21:57:40] but no mobile numbers [21:57:47] no way am i giving fb my phone number [21:57:57] Scoring-platform-team (Current), ORES, Release-Engineering-Team: Document: ORES deployment caused some sort of downtime - https://phabricator.wikimedia.org/T197191#4281401 (awight) [21:58:24] Hauskatze you should visit the uk sometime! [21:58:28] i am going into london [21:58:31] in august [21:59:11] Yeah well, I'd like to [21:59:17] :) [21:59:19] lol /me adjusts tin foil [21:59:32] lol [21:59:45] awight tin foil? [21:59:52] I wanted to visit NYC as well [22:00:16] buy an apartment in the Upper East Side of Manhattan, etc. [22:00:30] i like living in the uk :) [22:00:45] but preferably near a big city but currently live in a large town [22:00:49] that's gone bankrupt [22:00:58] I don't dislike living where I am t.b.h. [22:01:12] Hauskatze spain's capital? [22:01:17] Nope :) [22:01:48] Madrid is good if you don't live in the center [22:01:52] Hauskatze barcelona? [22:02:06] traffic is horrible, noise... 
and expensive [22:02:09] london is a huge place [22:02:15] but did you know the city is small [22:02:16] paladox: neither [22:02:21] but london is large [22:02:30] the city of london is just one square Km [22:02:38] heh [22:02:40] Scoring-platform-team (Current), editquality-modeling, revscoring, Patch-For-Review, artificial-intelligence: Catch a specific new badword in English - https://phabricator.wikimedia.org/T196468#4281416 (awight) Open>Resolved [22:02:41] yeh [22:02:48] but has their own police dept [22:03:00] the greater London has 'the met' [22:03:00] all counties have their own police dept [22:03:22] Scoring-platform-team, ORES: Consider committing all non-private datasets to the repos - https://phabricator.wikimedia.org/T192617#4281419 (awight) [22:03:22] like mine does! [22:03:23] oh i see what you mean now [22:03:45] the met have the authority where i live [22:03:51] as we are not a federal state [22:03:57] In Scotland I think they reunified everything in Police Scotland [22:04:16] scottish yard [22:04:22] In Spain we have National Police for urban areas and Guardia Civil for rural areas [22:04:24] i have no idea why they have that in london [22:04:52] + some cities have municipal police forces for traffic and municipal ordinances enforcement [22:05:06] heh [22:05:18] and some autonomous communities have their own police forces, but only 5 out of 17 do have them [22:05:22] we have [22:05:22] http://www.northants.police.uk [22:05:33] y'know, it costs money [22:05:58] heh yeh [22:06:02] * awight stumbles away and hides from all polices and policies [22:06:09] i think the government funds ours [22:06:45] I read on Wikipedia that the Met has authority through the whole UK on certain cases but couldn't find on which basis. 
[22:07:02] in any case, it's late so /me out [22:07:05] Hauskatze yeh they [22:07:06] do [22:07:16] same goes for police northampton have the authority in london [22:07:23] yet they don't actually police that city [22:07:31] it's because we are not a federal state [23:29:15] Scoring-platform-team (Current): [Epic] Use LFS for large ORES files - https://phabricator.wikimedia.org/T197096#4281554 (awight) [23:29:31] Scoring-platform-team (Current), ORES: Take advantage of word2vec signal in all models - https://phabricator.wikimedia.org/T197007#4281555 (awight) [23:30:24] Scoring-platform-team (Current), ORES: Take advantage of word2vec signal in all models - https://phabricator.wikimedia.org/T197007#4275813 (awight)