[01:29:32] YuviPanda, Oh! [01:29:54] Does it automatically just know to go to the 'revscoring'/'ores' project? [01:30:18] Also, I think we're OK for now, but I'm going to need a hand figuring out why the celery workers keep crashing. [01:30:36] * halfak runs off to have a Friday night. [01:31:13] If you get back and see this, talk to me about what you want out of mwdb so that I can get to work on it. :) [02:18:22] just finished the Tufekci article and is now sad. [02:18:48] is going for a walk in the Berkeley hills to ponder the future. [05:32:39] https://blog.wikimedia.org/2015/08/31/wikipedia-accounts-blocked-paid-advocacy/ [05:46:26] http://thinkprogress.org/culture/2015/09/04/3697846/wikipedia-extortion-scandal/ [05:51:49] aetilley, o/ [05:52:38] Tufekci makes me sad too, but it's primarily the cursory treatment she gives "algorithms" [13:28:34] halfak: hey, I just added some new features, such as "number_changed_claims" etc. [13:28:49] Also I fixed some upstream bugs in pywikibase [14:21:21] o/ Amir1 [14:21:26] Cool! [14:22:07] Amir1, does pywikibot do anything to make database querying easier? [14:22:45] Hey halfak, since it's hard for me to participate in the hack session, I did my own hack session. Already fixed lots of stuff [14:22:57] halfak: I don't think so [14:23:05] \o/ [14:23:08] Gotcha. [14:23:18] I'm working out what mwdb should do. [14:23:36] I don't think we can do a lot [14:23:50] I've been considering an ORM [14:24:21] for ORES? [14:24:38] Nope, the 'mwdb' package. [14:27:11] I'm not an ORM type of guy, but I did find the ORM-like methods convenient for doing common query patterns like revert detection. [14:31:16] I understand [14:32:03] Everyone seems to love SQLAlchemy too. :) [14:32:04] halfak, I'm around. We can IRC or call but I'll have to speak softly. [14:32:14] IRC is better for me too. [14:35:10] Ok. In summary, I was moved by the Tufekci article. I found the Sandvig article to have a low signal/noise ratio. [14:35:31] Interesting. [14:35:32] The former reinforced my belief that we're doing important stuff! [14:35:38] Heh! [14:36:15] Great. :) So, now to figure out what we *do* next :) [14:36:17] We better not mess up. [14:36:45] One thing I'd really like Tufekci to do is address the long history of modeling work concerned with overfitting issues and feedback. [14:36:56] Because I'd like to take her framing to those methods. [14:37:15] Right now, I don't have a good window into anything except basic model fitness evaluation. [14:39:05] I think that Tufekci is making the common mistake of looking at (e.g.) Google's marketing materials and inferring the engineers' motivations and concerns from it. [14:39:35] I don't think we're the only AI engineers who worry deeply about the subjective judgements we support. [14:39:47] But I'd like to do more than worry. I'd like to do *better*. [14:42:21] Of course. Well, it was only one paper and there's only so much you can do in so many pages. I didn't take away the idea that she was so much assigning blame as pointing out what is possible. [14:43:48] My initial reaction to both articles was "Open Algorithms, Open Training Sets" [14:44:11] But implementing this might prove hard. [14:44:39] Well.. we certainly have an open process. That, I agree, is the first useful thing we can do. [14:44:48] Most people don't have the ability/time/resources to read through an algorithm in detail to see if it's working "ethically" [14:45:32] or to look at a large training set to evaluate its fairness. [14:45:52] Well...
I'm not sure that is true. [14:46:21] As we gather training sets, I think we, as engineers, should be aware of what they will be good and bad at teaching an ML algorithm. [14:47:50] When constructing the algorithm, yes. I mean after the fact. [14:48:07] Oh sure! I'm not going to look at the internals of the SVC. [14:48:29] But we don't really need to look at the internals. [14:48:32] We could demand that all WM editors publish their bots publicly, but most of this is never going to be looked at in detail. [14:48:47] Maybe I should back up. [14:48:55] I kept being reminded of Netflix [14:49:27] Netflix is kind enough to give me a hint at why it recommends the things that it does. [14:49:43] It's not a complete description of the algorithm involved, but it's a start. [14:50:15] Oh! That's a different algorithm. [14:50:16] :) [14:50:22] It's not the one that made the recommendation. [14:50:25] ("Recommended because you watched 30 Rock...") [14:50:42] That's the algorithm that comes up with plausible explanations. [14:50:47] Not even joking [14:50:51] halfak's right. [14:51:01] thinking of it as a justification, not an explanation. [14:51:02] Shilad's an old pro in this stuff. [14:51:06] :) [14:51:14] o/ morning dude [14:51:27] you too! [14:51:34] i'm on vacation. can you tell? [14:51:44] I'm not sure what we're disagreeing about right now. I'm just giving an example of an attempt at algorithmic transparency. [14:52:03] * YuviPanda agrees violently with everyone [14:52:10] aetilley, No, I'm with you. I just wish that was transparency in the way we'd like. [14:52:48] It looks like it is. And it might still be useful in many ways, but it doesn't really explain the reasoning behind the recommendation. [14:52:50] Another example is how YouTube (used to) give(s) you the option to customize your ads. [14:53:04] I didn't know about that. [14:53:52] Well, "customize" is a strong word. You could click "Don't show me ads like this." [14:54:18] Oh! Yeah. That's some fun input data. [14:54:29] Shilad, re. vacation, time to do the fun programming! [14:54:44] But like Netflix, there's a tradeoff between usability and total transparency. [14:54:50] So, I have a proposed method for looking for bias in our algorithms. [14:54:57] Ok [14:55:32] We operate using the scientific method and a robust fitness metric like AUC. [14:55:52] So, I have a hypothesis. E.g. the fawiki revert model is unfairly treating anons. [14:55:58] (Real hypothesis) [14:56:12] So, I want to take a sample of anon edits and generate the AUC for just those edits. [14:56:43] If it deviates from the AUC for all edits, we know that our algorithm is stupid when it tries to score anon edits. [14:57:26] Using this strategy, we don't need to know how the algorithm works in order to evaluate it. [14:59:19] o/ YuviPanda [14:59:28] Thoughts on https://etherpad.wikimedia.org/p/mwdb? [14:59:43] That's assuming you want to treat any deviation from the global AUC when restricting attention to anon (substitute "property P") edits as "unfairness". [14:59:46] Am in an abandoned airport and trying to dodge rain [14:59:52] I'll check on and off [14:59:54] I'm not sure I have a strong intuition wrt that. [15:00:02] * YuviPanda clicks link [15:00:41] aetilley, not exactly. I'm sure there will be some cases where low AUC is fine. E.g. highly experienced editors. [15:00:55] When they add curse words to an article and delete a lot of content, it's probably fine. [15:01:04] So it's really hard to make predictions there.
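A minimal sketch of the subgroup-AUC check proposed above, assuming a scored test set with true labels, model probabilities, and the is_anon feature is already in hand; the file and column names here are hypothetical, not actual revscoring output:

```python
# Sketch of the proposed bias check: compare the global AUC against the AUC
# computed only over edits by anonymous editors. File/column names are
# hypothetical placeholders.
import pandas as pd
from sklearn.metrics import roc_auc_score

scored = pd.read_csv("fawiki_reverted_test_scores.csv")
# expected columns: reverted (true label), reverted_proba (model score), is_anon

global_auc = roc_auc_score(scored["reverted"], scored["reverted_proba"])

anon = scored[scored["is_anon"]]
anon_auc = roc_auc_score(anon["reverted"], anon["reverted_proba"])

print("Global AUC:    {0:.3f}".format(global_auc))
print("Anon-only AUC: {0:.3f}".format(anon_auc))
print("Deviation:     {0:.3f}".format(global_auc - anon_auc))
# A large drop on the anon subset suggests the model scores anon edits poorly,
# without needing to inspect the internals of the classifier.
```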
[15:01:05] halfak: have you looked at mwclient? [15:01:15] My ideal would be something like mwclient [15:01:21] Except one that actually works [15:01:30] And can easily switch between db and api [15:01:34] Probably not feasible [15:02:09] YuviPanda, yeah. I originally tried the interchangeable API and DB thing in mediawiki-utilities. [15:02:15] It works to an extent and then explodes. [15:02:27] Yeah [15:02:28] There are some *really basic things* that the API can't do. [15:02:35] Heh [15:02:43] e.g. get me the revisions that happened on 2015-01-01. [15:03:00] Gotta limit by page. [15:03:06] Or user [15:03:30] Right [15:03:35] What is :mod mwtypes [15:03:45] Is it classes for things like revisions? [15:04:07] Oh! https://github.com/mediawiki-utilities/python-mwtypes [15:04:10] Yeah. [15:04:18] With some nice convenient bits to it. [15:04:24] E.g. trivial JSON serialization. [15:04:28] Hah [15:04:29] Nice [15:04:33] I want to standardize a JSON format for future dumps. [15:04:38] I don't think mwdb should depend on it [15:05:00] A 'static reflection' model sounds interesting [15:05:16] Yeah. I don't think we'd need to write much code at all for that. [15:05:20] Aka a codegen that generates models based off a db once and then we ship that [15:05:32] Rather than dynamic db reflection [15:05:33] I dunno if it actually does that. [15:05:36] Yeah [15:05:51] I'd rather generate explicitly and cache. [15:05:54] Yes [15:06:00] But then again, what if someone runs into a weird configuration [15:06:03] Or an extension table. [15:06:20] So another question is: are we gonna support just the config Wikimedia does [15:06:26] Or is it gonna be generic [15:06:38] Yeah. That's a good question. I want to target wikimedia/labs first. [15:06:49] Yes I agree. [15:06:54] But I'd like the wikia/wikihow people to be able to use it easily. [15:07:21] I think we can't guarantee / design for that easily unless they are also actually using and testing it [15:07:24] I think also [15:07:33] We are unique in offering db access to the public [15:07:39] Wikia doesn't and most wikis don't [15:07:51] Yeah. This is a very good point. [15:08:01] But FWIW, I won't be using this on labs primarily. [15:08:03] So I feel less icky about marrying this to labsdb [15:08:09] Than mwapi [15:08:10] I'll be using it on the analytics slaves. [15:08:11] Well [15:08:13] Wmd-db [15:08:15] Err [15:08:18] Wmf-db [15:08:26] Yeah... That could work. [15:08:28] * YuviPanda invades a random country on finding wmds [15:08:36] wmfdb [15:08:58] G.W. Panda [15:09:16] So, aetilley, comments on the proposed methodology? [15:09:18] So things like userindex are prolly more important than compatibility with mw 1.21 [15:10:10] It seems like I could get a dataset of (1) the test set with predictions and (2) the human-labeled edits with features and predictions. [15:10:28] With these two things we could do some breakdowns by editor segment, content area, etc. [15:10:40] (I seem to have run into a medieval battle reenactment) [15:10:42] Is 1.21 old? [15:10:55] YuviPanda, wat? IRL? [15:11:10] If so, pics plz. [15:11:22] halfak, thinking. [15:11:26] kk [15:11:49] Yes [15:13:16] https://usercontent.irccloud-cdn.com/file/4Mww4Shr/IMG_20150905_171245015.jpg [15:14:09] You can't really see it there unless you zoom in [15:14:23] The rain picked up; I had to run back to the tree I was under [15:14:43] LARP! [15:14:54] Are you still in Rome? [15:15:12] or roam? [15:15:20] Larp?
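Going back to the mwdb discussion above (ORM-like query patterns, generating models from the database rather than hand-writing them), here is a rough sketch of what SQLAlchemy reflection could look like against a MediaWiki database. The connection string is a placeholder, and whether mwdb would use automap, explicit code generation, or plain SQLAlchemy Core is exactly the open question in the conversation:

```python
# Sketch only: reflect MediaWiki tables from a live database into ORM classes.
# The DSN is a placeholder; mwdb might instead generate these classes once
# ("static reflection") and ship the generated code.
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

engine = create_engine("mysql+pymysql://user:pass@analytics-store/enwiki")
Base = automap_base()
Base.prepare(engine, reflect=True)  # builds classes for revision, page, user, ...

Revision = Base.classes.revision
Page = Base.classes.page

session = Session(engine)
# The kind of common query pattern mentioned above: recent revisions to a page.
recent = (session.query(Revision)
                 .join(Page, Page.page_id == Revision.rev_page)
                 .filter(Page.page_namespace == 0,
                         Page.page_title == b"Machine_learning")
                 .order_by(Revision.rev_timestamp.desc())
                 .limit(10))
for rev in recent:
    print(rev.rev_id, rev.rev_timestamp)
```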
[15:15:25] Is that what this is called? [15:15:26] Live Action Role Playing [15:15:29] Aaah [15:15:30] Yes [15:15:36] But there is a referee [15:15:38] And points [15:15:52] I'm in Berlin [15:15:55] halfak, I think I will have stronger opinions about this if we could spend a week or so on a test case. [15:16:23] Sure! I think that the fawiki issue is a good place to start given recent interest in this. [15:16:37] And luckily, is_anon is one of our features so we don't need external data. [15:16:49] * halfak is trying to finish off the enwiki labels campaign. [15:16:56] I think the fawiki one might be done. [15:16:58] So your hypothesis is what? [15:17:07] About the is_anon feature. [15:17:27] The fawiki revert predictor doesn't work as well when the editor is anonymous. [15:18:14] per its AUC [15:18:51] Sure, and some qualitative assessment as well. [15:18:59] Assuming we see a bad AUC. [15:19:16] If we do, I'd like to drop is_anon from the feature set and then check the global AUC. [15:19:45] Ok. Now when you say "revert predictor" [15:20:33] Which function exactly is that? [15:20:48] Hmm... not sure what you are asking. [15:20:57] "reverted" is one of the models we support. [15:21:06] that was my question [15:21:07] It predicts whether an edit will need to be reverted or not. [15:22:15] one sec [15:25:48] Which one is fawiki? [15:26:08] Persian (fa = Farsi) [15:26:10] I'm sorry if these are obvious questions to you. I don't quite speak the language yet. [15:26:12] Ok [15:26:14] Na. [15:26:21] fawiki is the most confusing [15:26:27] or maybe zhwiki for Mandarin [15:26:35] :D [15:26:43] * halfak has lots of lookup tables in his head. [15:31:15] So is IRC supposed to be an exercise in humility? Where I get to ask all my noob questions in front of everyone and not just halfak? [15:31:49] I remember reading about a model called "reverted" but I don't remember where. [15:31:59] aetilley, can PM if you want to. No worries. [15:32:18] There's going to be a lot of this at the beginning. [15:32:21] If you're comfortable, then I think it is good to discuss these things in chat. I'm sure most people here don't know the details of ORES either. [15:32:55] aetilley, honestly, I don't think anyone knows ORES/revscoring from end to end except me. [15:33:00] I certainly don't. If you could point me to any relevant documents I would be grateful, but my online searches about ORES have not been very productive. [15:33:03] Which is a problem [15:33:26] Yeah. Here are the docs I've got: https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service [15:33:56] That one! [15:34:23] yes I saw this. It is buried deep in the 1000000000 bookmarks I have acquired in the past 3 months. [15:35:36] Oh, you finally decided to go with "machine learning as a service" [15:37:59] * YuviPanda doesn't either [15:38:33] finally? I've always thought that was a fine way to describe ORES. [15:38:48] AI is a bit more broad than I like [15:39:16] oh, I seem to recall an IRC discussion a couple weeks ago about how to tag the revscoring project. [15:39:19] Ignore me. [15:39:55] (or rather the more general project that encompasses it) [15:39:56] Hmm... [15:40:04] nevermind [15:40:08] -._o_.- [15:42:21] Oh, and yes, chat is great. I was kind of joking, being self-conscious. I've been in a grad student cave for the past half decade and am just now getting used to corresponding with humans. [15:42:23] FWIW, I wrote that page at Wikimania in July [15:45:58] Hey folks...
If you have a second, I'd be curious to hear your feedback on http://shilad.github.io/wikibrain/tutorial/web-api.html [15:47:15] There's one performance issue I'm working out, but you should be able to play around with it. [15:47:52] Shilad, I'm confused about which endpoints are up [15:48:09] Nevermind. This works: http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=coltrane%7Cblues [15:48:40] http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=ores|revscoring [15:48:41] :D [15:49:42] Shilad, I get different similarity scores when I use underscores and spaces. [15:49:44] e.g. http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=machine_learning|artificial_intelligence [15:49:54] ^ 11% [15:49:56] http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=machine_learning|artificial%20intelligence [15:50:00] ^ 45% [15:50:01] Yeah. Is that unexpected for "phrases?" [15:50:06] Not sure. [15:50:30] Maybe I should make it clear that phrases are normal text, not wiki text. [15:51:02] Na. I'm not sure why a phrase with a space would be more similar to one with an underscore than both with an underscore. [15:51:13] Or both with a space. [15:51:38] Either way. It seems like we could use this for statement differencing. [15:51:58] What is statement differencing? [15:52:06] (and cool!) [15:52:40] Shilad, when we score an edit, it's good to know how different the meaning of a sentence is after a change. [15:52:43] Might be good for Wikidata too. [15:53:09] Meh. Looks like maybe no. [15:53:16] http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=Barack%20Obama%20is%20a%20President|Barack%20Obama%20is%20the%20President = 79% [15:53:31] http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=Barack%20Obama%20is%20a%20President|Barack%20Obama%20is%20a%20Terrorist = 79% [15:54:23] http://como.macalester.edu/wikibrain/similarity?lang=simple&phrases=President|Terrorist Hmm. Not terribly similar there [15:54:23] Yeah. That's tricky. Mostly an unsolved problem in NLP. Works about 75% of the time with state-of-the-art methods. [15:54:29] Gotcha [15:54:36] 75% is good if we can get it. [15:55:04] Every 5 requests I get an "SQL Dao Failed." error [15:55:12] Really! That's interesting. [15:55:17] https://gist.github.com/halfak/d0507b587653a4278081 [15:55:19] Investigating... [15:55:21] ^ Error [15:56:27] Thanks. That definitely deserves some debugging. [15:56:57] I think something's closing idle db connections on me. [15:57:09] :) [15:57:18] * halfak wonders if harej is around [15:57:29] He'd definitely like this stuff. [15:57:48] I've got to get back to ORES stuff, but I'll be back to pushing on this later. [15:57:50] Thanks Shilad :) [15:57:59] Cool to see this stuff online. [15:58:00] Wiki brain eh? [15:58:08] I'm excited to have an endpoint in labs for SPEED [15:58:09] Yeah! [15:58:16] it appears there's now an apey eye? [15:58:51] Indeed. [16:00:11] Thanks for poking around. I'll send it off to some of the lists for feedback. [16:00:55] +1. I'm excited to see wiki-research-l and wiki-tech looking at this. [16:01:10] I recommend wiki-research-l first and wiki-tech for a second round later. [16:01:21] And WP:VPT later too [16:01:50] Shilad, so what exactly does this do? [16:02:49] harej, WikiBrain is a Java platform for the development of Wikipedia-based algorithms. [16:03:13] It includes reference implementations of many state-of-the-art AI, NLP, and GIS algorithms.
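A quick sketch of hitting the WikiBrain similarity endpoint discussed above, based only on the example URLs in the conversation; the structure of the JSON response isn't specified here, so the sketch just returns it raw, and it normalizes underscores to spaces since the two give different scores above:

```python
# Sketch: query the WikiBrain web API similarity endpoint shown above.
# Only the URL and parameters come from the discussion; the shape of the
# JSON response is not documented here, so we simply return it.
import requests

WIKIBRAIN_SIMILARITY = "http://como.macalester.edu/wikibrain/similarity"

def similarity(phrase_a, phrase_b, lang="simple"):
    # Phrases are plain text, not wiki text; underscores and spaces score
    # differently, so normalize to spaces first.
    phrases = "{0}|{1}".format(phrase_a.replace("_", " "),
                               phrase_b.replace("_", " "))
    response = requests.get(WIKIBRAIN_SIMILARITY,
                            params={"lang": lang, "phrases": phrases})
    response.raise_for_status()
    return response.json()

print(similarity("machine learning", "artificial intelligence"))
```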
[16:03:57] The goal of this web API is to expose pieces of WikiBrain that researchers and tools developers will find most useful. [16:06:12] halfak, so currently is all interaction with ORES done through the REST interface? [16:06:25] Or the URL thing. [16:06:29] that is [16:06:38] Yes. URL thing == REST interface [16:06:41] Yes [16:06:48] So gui etc? [16:06:51] no gui etc [16:07:01] ok [16:07:10] Not yet. We have a feature request, but I think it is relatively low priority [16:07:18] Unless someone else wants to pick it up. [16:08:11] If someone wanted to learn how to use this interface (and perhaps more generally the structure of wikimedia URLs) where would you send them? [16:08:23] I mean we've talked about this in the past [16:08:29] but is there a spec somewhere? [16:09:15] https://meta.wikimedia.org/wiki/Objective_Revision_Evaluation_Service [16:10:39] aetilley, might be misunderstanding your question [16:11:01] Are you saying there's nothing more to the syntax than is contained in this example given on this page? [16:11:07] Example query: /scores/enwiki/?models=reverted|wp10&revids=34854345 [16:11:39] Yes. If you click on that you get http://ores.wmflabs.org/scores/enwiki/?models=reverted%7Cwp10&revids=34854345 [16:12:34] I'm aware of that. Ok, I guess that answers my question. [16:12:49] Are you seeing the "Machine readable paths"? [16:13:27] I see some JSON. Or some sort of hashmap. [16:14:01] Oh you mean the URL? [16:14:02] Well, it gives you the URL paths and if you click on them, you get example queries. [16:15:06] Right, I know that much. [16:15:18] I just wasn't sure if there was more. [16:15:29] What are you looking for? What "more"? [16:17:17] Ok, so if I am looking through a page's history and want to consult ORES about a specific revision, I need to enter a link like that? [16:17:51] Nevermind about "more". [16:17:52] aetilley, well, you'd presumably have ScoredRevisions installed and it would be scored automatically. [16:18:10] See https://github.com/he7d3r/mw-gadget-ScoredRevisions [16:18:22] It's one of the tools that consumes scores from ORES [16:18:31] We also have Huggle now. [16:19:12] https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service#Tools_that_use_ORES [16:19:44] https://tools.wmflabs.org/raun/?language=pt&project=wikipedia&userlang=en [16:19:50] Look for the flame icons [16:19:53] Those are ORES scores [16:23:37] Ok, these are new to me. [16:23:55] but ok [16:25:23] ScoredRevisions anyway. [16:25:33] ORES is intended to be infrastructure that other tools will use. [16:25:49] Gotcha. [16:26:00] ORES should still have a basic UI for running tests and exploring false positives. [16:29:56] ok [16:39:47] Like Special:ApiSandbox [16:40:01] Yeah [16:41:50] I had never seen this raun thing either. [16:42:10] This is every single WMF update? [16:43:04] Basically [16:43:10] I think it is targeted at the Wikipedias [16:43:23] But it seems that it'll support other wikis too [16:48:16] Raun? [16:50:01] https://tools.wmflabs.org/raun/ [16:50:10] YuviPanda^ [16:50:34] Aaah [16:50:37] Nice [16:50:59] Most things appear not to have ORES scores [16:51:09] But that stands to reason. [16:51:13] YuviPanda, I have a weird thingie. When I pass an exception through celery, it forgets its class name and uses its parent class. [16:51:22] Ever heard of something like this? [16:51:31] Ugh. [16:51:32] No [16:51:41] aetilley, supported wikis: http://ores.wmflabs.org/scores/ [16:51:49] YuviPanda, yeah.. Very weird problem. [16:54:02] WTF are you doing, celery?
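For anyone following along, here is a minimal sketch of the ORES query pattern being discussed, built directly from the example URL above (/scores/enwiki/?models=reverted|wp10&revids=...); the score() helper and its names are illustrative, not part of ORES itself:

```python
# Sketch: fetch scores from the ORES REST interface, following the example
# query shown above. The score() helper is illustrative, not an ORES client.
import requests

ORES_BASE = "http://ores.wmflabs.org/scores"

def score(wiki, rev_ids, models=("reverted",)):
    url = "{0}/{1}/".format(ORES_BASE, wiki)
    params = {"models": "|".join(models),
              "revids": "|".join(str(r) for r in rev_ids)}
    response = requests.get(url, params=params)
    response.raise_for_status()
    return response.json()

# e.g. the reverted and wp10 models for one enwiki revision
print(score("enwiki", [34854345], models=("reverted", "wp10")))
```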
[16:55:33] WTF! It loses *exactly* one level of inheritance. [16:55:39] * halfak punches at the air [16:55:57] Ask on SO? [16:56:00] "DependencyError: Could not locate revision. It may have been deleted." [16:56:12] You can't even have that message unless you are a RevisionNotFound error! [16:56:13] ARG! [16:56:55] Maybe [16:57:04] It has the right type [16:57:10] It doesn't [16:57:15] But wrong type name? [16:57:20] Yeah. [16:57:28] isinstance(error, RevisionNotFound) [16:57:31] false [16:57:47] Uhh [16:57:48] Ok [16:58:02] * YuviPanda is randomly flinging stuff at walls [16:58:09] * halfak too [16:58:52] Celery's logs show that it thinks the parent class error was thrown with the child class's message. [16:59:09] If I use the "Timeout scorer" that runs the same code locally, it all works out. [16:59:36] * halfak upgrades celery [16:59:51] Didn't help [17:00:29] * YuviPanda is still walking around Berlin in the rain [17:00:53] Hanging out with WMDE people? [17:01:34] halfak: for a bit ya [17:02:25] Today is my random walkabout and stuff stuff into mouth day [17:02:36] * aetilley wants to move to Kreuzberg [17:03:23] Well... the error message still looks good. "MissingResource: Could not locate revision. It may have been deleted." [17:03:28] I might just leave it be. [17:04:03] It's a really silly thing though. [17:08:18] YuviPanda, do they keep copies of XML dumps on labs? [17:08:43] Yes [17:08:53] /public [17:09:41] * halfak adds RevisionNotFound to the set of expected exceptions. [17:10:17] Nope. Still doesn't work. [17:14:33] "With an exception instance, iterate over its super classes (by mro) and find the first super exception that is pickleable." [17:14:39] https://github.com/celery/celery/blob/master/celery/utils/serialization.py#L42 [17:14:41] Aha! [17:14:49] So it doesn't think that my exceptions are pickleable. [17:15:02] lol [17:30:38] Aha! My exceptions are not pickleable. WHY! [17:35:58] Yay! [17:36:01] That got it. [17:36:09] Why are exceptions so complicated to pickle? [17:36:25] Whose idea was it to set a default __setstate__ function to confuse pickle?!? [17:37:37] They owe you a nickel. [17:42:37] * halfak pulls down updated models for a final test. [18:12:15] YuviPanda, would you like to review my updates to ORES or should I merge them? [18:12:40] I figure you're the most likely to be able to make good notes. [18:12:52] But I don't want to wait a bunch so that I can get this in staging. [18:13:06] No one else has really looked at ORES. [18:40:43] Amir1, https://github.com/wiki-ai/revscoring/issues/182 if you have some time. [18:41:27] sure, it will take some minutes [18:44:53] Sure. Thanks :) [18:54:46] Amir1, would you be willing to take a look at this: https://github.com/wiki-ai/ores/pull/82 [18:54:58] (when you're done) [18:55:47] It's just cleanup for ORES based on the language context --> language feature-set switch. [18:56:05] I'm hoping to get a new set of models up on staging today. [18:57:23] We're going to need to boost the # of celery workers too [19:01:50] halfak: Can I remove multi-token ones? [19:02:07] We support multi-token now :) [19:03:48] awesome [19:06:23] It looks like I can set the concurrency in ORES or the celery worker daemon. I'll file a bug with notes. [19:07:01] Looks like we should quadruple our workers [19:07:05] 4/CPU seems about right [19:07:25] On my i5, assuming hyperthreading produces full-power new cores. [19:12:52] halfak: I think you should feel free to self-merge [19:12:59] OK.
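Since the eventual fix ("my exceptions are not pickleable... That got it.") is only alluded to above, here is a rough sketch of the kind of change that usually makes a custom exception survive Celery's pickle round-trip: pass all constructor arguments up to Exception so the default exception pickling can rebuild the right subclass. The class names mirror the ones mentioned in the log, but the code is illustrative rather than the actual revscoring fix:

```python
import pickle

# Exceptions commonly become unpickleable when __init__ takes arguments that
# never make it into self.args, because default exception pickling re-calls
# the class with self.args. Celery then falls back to the nearest pickleable
# superclass, which matches the "loses exactly one level of inheritance"
# behaviour described above.
class DependencyError(Exception):
    def __init__(self, message, cause=None):
        # Forward *all* constructor args so self.args matches the signature.
        super(DependencyError, self).__init__(message, cause)
        self.message = message
        self.cause = cause

class RevisionNotFound(DependencyError):
    pass

error = RevisionNotFound("Could not locate revision. It may have been deleted.")
restored = pickle.loads(pickle.dumps(error))
assert isinstance(restored, RevisionNotFound)  # subclass survives the round-trip
print(type(restored).__name__, restored.args[0])
```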
[19:12:59] halfak: and doubly so in the config repo [19:13:08] halfak: even mediawiki-config, self-merge is ok :) [19:13:09] Will do. [19:13:21] BTW, I think we'll need to boost our workers before we deploy this in prod [19:13:30] Any pref for *where* we do that? [19:14:05] ores config will work. We can do it in our ores_celery.py. I suspect we can do it in puppet too .. maybe? [19:14:39] halfak: what do you mean by 'boost' our workers? [19:14:51] Sorry. Multiply by 4 [19:14:57] As in boost their numbers [19:15:47] The boosting will continue until morale improves [19:17:16] halfak: you mean 4 more instances? [19:17:20] as in [19:17:23] 4 more VMs? [19:17:24] why? [19:17:28] we aren't under any load... [19:17:32] * YuviPanda is confused [19:17:44] Nope. # of workers per core. [19:18:07] Increase concurrency [19:18:30] We're doing substantially more IO in the tasks now. [19:18:42] oh I see [19:18:45] Especially for the precached style access method. [19:18:49] I think that value is in puppet [19:19:20] no [19:19:21] not in puppet [19:19:22] hmm [19:19:35] Long term, do we want to configure this kind of thing in puppet or our config repo? [19:19:44] it's in your config repo [19:19:47] already [19:19:50] hmm [19:19:58] Yeah. [19:19:58] I think this should be in the config repo [19:20:04] kk. [19:20:28] * halfak closes all the tabs of puppet stuff he had open [19:20:55] So, staging is going to get the same config. [19:21:06] yeah [19:21:11] We're going to have a lot of workers on that one machine. [19:21:29] 64 workers [19:21:36] 4 * 4 cores * 4 machines [19:21:37] so [19:21:39] my preference [19:21:42] would not be to just jack it up [19:21:47] we should deploy and monitor [19:21:53] and then jack it up slowly [19:21:55] oh. I ran tests on my local machine. [19:22:06] YuviPanda, we need to jack it up now or we'll lose capacity [19:22:08] Substantially [19:22:13] what were you measuring? [19:22:14] That IO takes a long time. [19:22:14] latency? [19:22:21] (at the final endpoint, that is) [19:22:26] Capacity for precached [19:22:33] On my local machine [19:22:36] We lost a lot [19:22:37] so it doesn't keep up with pre-cached even? [19:22:42] Took me a little while to figure out why [19:22:46] Nope [19:22:50] Just barely doesn't [19:23:03] We used to do all that IO in like 100+ threads in uwsgi [19:23:03] so the problem with having too many processes is that it is possible that we can OOM the machine and wake up the kernel OOM killer [19:23:12] Yup [19:23:12] which basically goes and starts randomly killing processes [19:23:17] We're not memory-intensive though [19:23:40] My laptop had no trouble even when I was maxing the CPU [19:23:42] 8GB [19:23:52] so, to recap - we're now doing the IO (network IO) too on the workers [19:23:52] Heh. I am right now and I can't even tell. [19:23:58] Yes [19:23:59] that's where the new IO is, right? [19:24:02] Yes [19:24:03] ok [19:24:05] We just moved it. [19:24:21] Because I wanted the task to be real sooner in the request handling flow. [19:24:45] That's how I handled the duplicate request problem. [19:24:48] right [19:24:51] I remember we talked about this [19:25:03] would you like me to be around when you jack it up? [19:25:20] So, I need about 900k / worker [19:25:37] Hmm. Yes. [19:25:56] I want to get on staging today. [19:26:04] Not in *that* much of a rush to get deployed. [19:26:14] I can be around for another couple of hours now [19:26:22] I really just wanted to make sure the code base didn't get too far ahead of the actual deployed code.
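A sketch of what raising worker concurrency in the config repo might look like; the CELERYD_CONCURRENCY setting comes up just below, but the file layout, the broker URL, and the exact numbers here are illustrative rather than the actual ores-wikimedia-config contents:

```python
# ores_celery.py (illustrative): raise per-node Celery concurrency so the
# network IO now done inside tasks doesn't starve throughput.
# 4 workers per core on a 4-core node -> 16 worker processes per node,
# or 64 across a 4-machine cluster.
from celery import Celery

app = Celery("ores")
app.conf.update(
    CELERYD_CONCURRENCY=16,         # per worker master process, i.e. per node
    CELERYD_PREFETCH_MULTIPLIER=1,  # assumption: don't let workers hoard tasks
    BROKER_URL="redis://localhost:6379/0",  # placeholder broker
)

if __name__ == "__main__":
    app.worker_main()
```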
[19:26:25] am also working on Monday / Tuesday. [19:26:27] right right [19:26:54] \o/ Amir1 [19:27:05] :) [19:27:24] OK. YuviPanda, I'm going to get everything ready for staging and increase the number of workers to 64. [19:27:36] I'll submit that as a pull [19:27:42] halfak: ok! !log in -labs? [19:27:53] How does it know what project I am logging to? [19:29:33] halfak: !log [19:29:55] kk [19:30:19] OK. I have a problem. [19:30:27] I can't find a good way to confirm how CELERYD_CONCURRENCY works [19:30:43] Does it set the number of workers per node or the number of workers for the whole cluster? [19:30:50] Probably node, right? [19:30:52] deploy to staging, do 'ps auxf' and see how it goes? [19:30:53] per node [19:31:14] Staging only has one machine. [19:31:17] it is 'per worker master process' afaict, and we run one 'master process' per node [19:31:28] OK [19:31:38] 4 cores per node, right? [19:31:51] yes [19:32:51] Alright! Started up 16 on my local. Seems to work right. [19:32:59] * halfak loves having a config strategy pay off :D [19:35:18] halfak: btw http://www.whatcanidoforwikimedia.org/ happened [19:35:37] oooooh [19:36:16] I think that's pretty cool. [19:36:24] Is it ready for the internet hug of death? [19:36:39] don't know! [19:36:42] I spotted it in phab [19:36:47] https://phabricator.wikimedia.org/T91633 [19:36:49] I don't think it is [19:37:23] Because it would be great to put a little bit of cache in front of it and then post it on the aggregators. [19:37:33] yeah [19:37:36] halfak: I asked that question [19:38:55] Sorry I didn't notice that PR just sitting on ores-wikimedia-config [20:08:06] Hmm... Looks like uwsgi wasn't restarting appropriately. [20:08:56] * halfak hammers ores-staging with precached [20:09:25] 90% of the time, we're below 50% CPU [20:10:16] Response times are 0.6-2.6 secs [20:11:20] ORES has no idea about this edit: http://ores-staging.wmflabs.org/scores/fawiki/reverted/2024234/ [20:11:36] Nice to see that it sided with "false" [20:11:54] http://ores-staging.wmflabs.org/scores/enwiki/reverted/4567894834432/ [20:12:00] ^ Much better error messages. [20:12:08] That URL used to score a blank revision before. [20:13:13] YuviPanda, I think we're ready for a deploy. [20:13:19] Do you want to wait until Monday? [20:13:23] I'm OK with that FWIW [20:14:09] lol. I'm using ~25MB of memory [20:14:17] halfak: to staging or to prod? [20:14:19] halfak: to prod? [20:14:22] Yeah. [20:15:08] To prod [20:15:12] Staging is looking solid [20:15:28] I did have to kick the uwsgi server myself though. [20:20:10] halfak: ok. can do now or Monday, whichever you prefer. [20:20:29] note that I'm two beers in and it's 10:30 PM, but I don't mind - but not at my best [20:21:11] Na. Let's do Monday. I wouldn't mind running an analysis on the wp10 model before we deploy anyway. [20:21:29] Thanks for being available. Have a good Berlin :) [20:24:47] halfak: ok :) [20:24:58] * halfak got outside for a bit [20:24:59] o/ [20:46:57] hey all [20:48:39] * YuviPanda welcomes jenelizabeth [21:51:31] o/ jenelizabeth [21:51:42] Sorry. Was doing yard chores. [21:55:39] yay :D [21:56:27] do any of you understand the work of Numenta? CLA, HTM, yada ya? [21:57:22] Hmm... Not really. Is Numenta a company? [21:58:20] yeah [21:58:30] they use a predictive function for inference [21:58:35] Oh yeah. Is this meant to reflect cortical columns? [21:58:41] I'm the proud new owner of a Labs instance. Say hello to librarybase-reston-01!
It's a specialty ANN [21:59:11] yeah, cortical learning algorithms [21:59:23] I saw one of these guys give a talk when I was working at Google. Sounds like fun stuff. [21:59:28] hello librarybase-reston-01 [21:59:46] harej, \o/ [21:59:51] VM master race [21:59:55] ;) [22:00:32] So I guess I am familiar with the state of the art of this ANN strategy as of ~2.5 years ago. [22:00:41] I've never stood one up. [22:00:47] Or used one to my knowledge. [22:01:00] But I think their capabilities are fascinating. :) [22:03:16] I'm having trouble understanding activation functions :( [22:04:09] Regretfully, I'm not going to be any help there. [22:04:30] do any of you use, or have you used, psychedelics? [22:05:12] Not that I plan to police the topic or anything, but that one might be a bit off topic and a little out of place in a publicly logged channel. ;) [22:05:36] halfak, if this channel is publicly logged, don't you need to indicate that in the topic? [22:06:04] Huh. [22:06:06] I thought it was [22:06:10] I didn't set it up :/ [22:06:30] Maybe we aren't logged [22:06:49] wm-bot4 help [22:07:05] jenelizabeth, I haven't, but why do you ask? [22:07:13] * halfak reads https://meta.wikimedia.org/wiki/Wm-bot [22:08:00] * harej is limited to conventional caffeine and alcohol abuse due to employment [22:08:16] curious [22:08:28] http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-ai/ [22:08:29] LSD has apparently helped with solving complex problems [22:09:04] if done right a psychedelic could probably resolve a lot of my personal problems but I am *not* about to answer "yes" on those "have you used drugs in the past two years" forms [22:09:12] * harej is a government contractor [22:10:16] It's chilly up in Valhalla [22:10:38] is that a town in Minnesota [22:10:49] It's where opgods go to change topics. [22:10:58] not in channels with -t :> [22:13:10] * halfak reads ChanServ's notes.