[02:26:25] Amir1, lol. Good catch!
[02:26:35] * halfak wishes our home page were simply a wiki
[02:31:18] Ahh crud. I want to update revscoring dependency
[02:40:36] :D
[02:44:56] Amir1, https://gerrit.wikimedia.org/r/#/c/280150/
[02:44:59] :DDDD
[02:45:42] halfak: {{done}}
[02:45:46] :)
[02:48:37] Thanks!
[02:53:48] o/ sabya
[02:53:58] o/ halfak
[02:54:14] Have you had any luck with the HashingVectorizer?
[02:54:39] studying. going through basics.
[02:54:44] Gotcha :)
[02:55:13] Once this deploy finishes, I'll have something cool to show you :)
[02:56:53] so, what i understood was, we need to pass the text to the HV function to get a feature matrix. we pass this feature matrix to train the predicting algorithms. and after training, when we want to predict, we again convert the text using HV and pass it to prediction function. Is that correct?
[02:57:41] halfak:^
[02:57:43] Sounds right to me
[02:59:11] i would like to start writing a prototype.
[02:59:33] any recommendation on which predicting algo should I start with?
[02:59:44] XGBoost
[02:59:54] It's designed for the massive feature dimension space involved.
[03:00:36] https://xgboost.readthedocs.org/en/latest/
[03:00:46] It looks like a real nice library.
[03:01:13] It provides a nice scikit-learn style interface if we want to keep it consistent with our other models.
[03:01:27] great. do we have any similar prototypes we have written in the past, that I could use to build upon?
[03:02:27] https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/sklearn.py
[03:02:57] sabya, nothing clean that you could just pick up and work with, but I can take that as a to-do for tomorrow.
[03:03:06] It's bedtime here :/
[03:03:21] OK. Deploy is done.
[03:03:31] So, check this out...
[03:03:33] sure. excited to see what you are deploying :)
[03:03:48] Here's a score for an edit. https://ores.wmflabs.org/v2/scores/enwiki/damaging/642345234/
[03:03:54] ORES predicts that the edit is damaging.
[03:04:12] With this URL, we can check out the features it used in prediction: https://ores.wmflabs.org/v2/scores/enwiki/damaging/642345234/?features
[03:04:18] We can modify the features too.
[03:04:35] So, let's modify 'feature.is_anon' and set it to 'false'
[03:04:43] https://ores.wmflabs.org/v2/scores/enwiki/damaging/642345234/?features&feature.is_anon=false
[03:04:58] If you look at the feature list, you'll note that "feature.is_anon": false
[03:05:07] And the score has changed too.
[03:05:17] yes.
[03:05:24] ORES doesn't think the edit is damaging anymore
[03:05:48] And here, we have a demo-able demonstration of how strongly biased the damage detection model is against anonymous editors!
[03:06:16] OK. Time for bed! Have a good one.
[03:06:26] Amir1, ^
[03:06:32] Check out the URLs.
[03:06:33] that's a great tool to experiment with features! good night!
[03:06:38] o/
[14:40:56] \o_
[14:46:38] o/
[14:58:45] hello halfak, bearloga
[14:58:56] o/ guillom :)
[14:59:21] o/ guillom halfak! :D
[15:00:22] o/ bearloga :)
[16:20:24] halfak: I've received messages from sean! \o/
[16:34:33] yuvipanda, re. aws?
[16:35:45] yes
[16:41:02] :)
[19:31:23] yuvipanda, trying to PAWS, but getting 503 "Proxy target missing"
[19:32:37] halfak: restarted the proxy, let's see. give it about 30s and try
[19:33:05] * halfak waits
[19:33:56] yuvipanda, still, no luck.
[19:34:44] Apparently daisy is getting 504 when trying to access paws.
[19:34:57] I'm still getting 503: Proxy Target Missing
[19:35:05] grr
[19:35:42] halfak: deleting everything now :)
[19:36:00] OK? Um. Everything?
[19:36:04] :P
[19:36:16] halfak: I mean, restarting everything
[19:36:22] halfak: found it!
[19:36:53] \o/
[19:41:28] yuvipanda, still 500: Internal Server Error. Keep waiting?
[19:41:49] halfak: yup. *just* pushed a fix
[19:42:28] halfak: wfm now
[19:42:47] 502 Bad Gateway
[19:43:08] halfak: on which URL?
[19:43:30] https://paws.wmflabs.org/paws/hub/user/EpochFail
[19:43:33] halfak: yours is just launching, I see it.
[19:43:37] kk
[19:44:27] says it is starting up now :)
[19:44:32] yeah
[19:44:52] halfak: it got scheduled on a new node we just added yesterday, and until we get our private repo setup first pulls on new nodes are gonna be slow
[19:45:55] halfak: now?
[19:46:16] Got the "start my server" button again
[19:46:30] Then "paws.wmflabs.org redirected you too many times."
[19:46:42] halfak: can you go to paws.wmflabs.org/hub/logout?
[19:46:46] err
[19:46:53] paws.wmflabs.org/paws/hub/logout
[19:48:29] Logged out and back in
[19:48:39] Now back to "paws.wmflabs.org redirected you too many times."
[19:48:57] hmmm
[19:49:14] o/ kaizen
[19:49:17] halfak: did it work for daisy? I see she's logged in
[19:49:20] hai
[19:49:29] halfak: I killed your instance, let's see if that works (in about 30s)
[19:49:34] kk
[19:49:51] yuvipanda yes mine seems to be working
[19:50:01] halfak: try now?
[19:50:31] It works!
[19:50:35] Thanks yuvipanda <3
[19:50:39] halfak: cool!
[19:50:56] halfak: I think proxy flakiness is going to be my no 1 priority soon
[19:52:35] * yuvipanda goes afk now
[19:55:24] J-Mo
[19:55:31] Looking at recent Teahouse experiment
[19:55:49] It looks like we bucketed about 7k users.
[19:55:52] Is that right?
[19:56:17] hey halfak. yes, that sounds right-ish (though I don't know the numbers off the top of my head)
[19:57:08] OK. just confirming
[20:11:12] J-Mo, do you know where I can find the card that had all of our follow-up questions for the teahouse analysis?
[20:11:43] Nevermind. Got it. https://phabricator.wikimedia.org/T116339
[20:12:17] this should be up to date too, Halfak: https://meta.wikimedia.org/wiki/Research:Teahouse_long_term_new_editor_retention
[20:18:24] Thanks J-Mo
[20:18:41] Fun story, we get statistical significance for all the retention periods when combining data between the experiments.
[20:20:48] \o/
[20:20:52] !!!!!
[20:53:44] Also, it looks like the experiment has no significant effect on the outcome measure, so we aren't crazy to combine them together.
[20:53:47] :)
[22:03:51] * halfak searches for that damn phab task
[22:11:59] J-Mo, any luck with the task
[22:12:02] I'm still failing!
[22:13:12] Was it this? https://phabricator.wikimedia.org/T108523
[22:13:26] Nooo.
[22:13:44] lemme look
[22:15:20] no luck here, either, halfak
[22:15:44] I did find this task that I created recently, though. I guess you'd say it's a "sub task" of the lost Research discoverability epic… https://phabricator.wikimedia.org/T130332
[22:16:18] +1 should have these grouped together.
[22:16:52] want me to (re)create the epic task, halfak? or do you want to do the honors?
[22:16:58] https://phabricator.wikimedia.org/T131204
[22:17:01] Done
[22:17:04] lul
[22:18:20] hey are y'all in R&D using [epic] in task names?
[22:18:30] Yeah.
[22:18:47] let's make this an epic. Then we can put stuff under it
[22:19:26] Working on stuffing
[22:19:30] cool
[22:22:27] Search isn't working suddenly. Will stuff later.
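[editor's note] The workflow sabya describes at 02:56:53 (hash the text into a feature matrix, train on that matrix, then hash new text the same way before predicting) can be sketched roughly as below. This is only an illustration and not code from the channel: the texts and labels are made up, and scikit-learn's LogisticRegression stands in for the classifier so the sketch needs only scikit-learn. Since XGBoost offers a scikit-learn style interface, as noted at 03:01:13, xgboost.XGBClassifier could presumably be swapped in where the model is created.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up toy data for illustration only: 1 = damaging, 0 = not damaging.
train_texts = [
    "this article is stupid lol",
    "added a citation for the population figure",
    "u suck hahaha",
    "fixed a typo in the second paragraph",
]
train_labels = [1, 0, 1, 0]

# HashingVectorizer is stateless: it needs no fitting, so the same object
# (or any object with the same settings) maps text to the same columns.
vectorizer = HashingVectorizer(n_features=2**12, alternate_sign=False)

# Step 1: text -> sparse feature matrix.
X_train = vectorizer.transform(train_texts)

# Step 2: train a classifier on the feature matrix.
# LogisticRegression is a stand-in here, not the model discussed in the log.
model = LogisticRegression()
model.fit(X_train, train_labels)

# Step 3: at prediction time, hash the new text the same way, then predict.
new_texts = ["corrected the spelling of a name", "lol this is dumb"]
X_new = vectorizer.transform(new_texts)
predictions = model.predict(X_new)

print(X_train.shape, predictions)
```

The point to notice is that the vectorizer settings (here n_features) have to be identical at training and prediction time, otherwise the columns of the two matrices would not line up.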
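[editor's note] The ORES URLs halfak pastes between 03:03:48 and 03:04:43 follow a visible pattern: a score path, an optional "features" flag, and optional "feature.<name>=<value>" overrides. A small helper that rebuilds those three URLs might look like the sketch below. The function name score_url and its parameters are hypothetical, the pattern is inferred only from the URLs in the log, and the code just builds strings without making any request, so it says nothing about whether the endpoint still exists.

```python
from urllib.parse import urlencode

# Base path copied from the URLs in the log; it may no longer be live.
ORES_BASE = "https://ores.wmflabs.org/v2/scores"


def score_url(wiki, model, rev_id, show_features=False, overrides=None):
    """Build a score URL in the shape seen in the log (hypothetical helper).

    overrides maps feature names such as "feature.is_anon" to string values.
    """
    url = f"{ORES_BASE}/{wiki}/{model}/{rev_id}/"
    query_parts = []
    # The log's override URL also carries the bare "features" flag,
    # so include it whenever overrides are given.
    if show_features or overrides:
        query_parts.append("features")
    if overrides:
        query_parts.append(urlencode(overrides))
    if query_parts:
        url += "?" + "&".join(query_parts)
    return url


plain = score_url("enwiki", "damaging", 642345234)
with_features = score_url("enwiki", "damaging", 642345234, show_features=True)
as_registered = score_url(
    "enwiki", "damaging", 642345234, overrides={"feature.is_anon": "false"}
)
print(plain)
print(with_features)
print(as_registered)
```

Comparing the scores returned for the second and third URLs is the experiment described at 03:05:48: the same edit, scored once as anonymous and once as if made by a registered user.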