[00:05:38] haha well I was struck by the efficiency of it, yes. Probably a person who hasn't been through Ellis Island [00:06:16] (or it would be "van") [00:06:47] (or dropped entirely) [00:08:10] Darn, I didn't use that example. https://github.com/adamwight/ores-lime/blob/master/Explain%20draft%20topic.ipynb [00:08:41] "John" is funny in that one [00:13:34] What I think is interesting is that our AI is the most simplistic possible algorithm, it doesn't have any semantic "understanding" of what's happening, only that people with this word in their name are often on the TV because they're talking about history, and people with "Guido" in their name are always in the Mob. [00:14:36] It's the exact thing we teach kids to not do, making assumptions about individuals based on aggregate and skewed information. [00:16:02] If "van Rossum" were looked up in Wikipedia, a high probability assigned that it's Guido van Rossum based on a pythonic context... [00:16:20] It's a wonder that this embeddings thing words at all. And kind of scary. [00:16:23] *works [00:17:39] (sorry for delays, I'm reading fun papers for CSCW: https://algorithmsworkshop.wixsite.com/mysite/position-papers) [13:36:32] o/ [14:05:50] 10Scoring-platform-team, 10Gadgets, 10Code-Health, 10artificial-intelligence: Detect/flag potentially malicious gadget/javascript edits - https://phabricator.wikimedia.org/T208140 (10Halfak) Good notes, @Bawolff My proposal: > But what is the review system? Special:RecentChanges filtered to JS pages >... [14:08:06] 10Scoring-platform-team, 10Wikilabels, 10articlequality-modeling, 10artificial-intelligence: Build article quality model for Galician Wikipedia - https://phabricator.wikimedia.org/T201146 (10Halfak) +1 for what @Theklan said. I would encourage you to consider the quality of the text when evaluating too.... [14:31:10] o/ saurabhbatra [14:31:31] Good morning (or other timezone appropriate greeting) [14:31:40] hello! :-) [14:32:11] halfak: good time to talk? 
[14:33:27] Yup! I have 30 minutes before my next meeting. [14:34:09] so Adam and I went through the dataset yesterday [14:34:18] https://figshare.com/articles/Known_Undisclosed_Paid_Editors_English_Wikipedia_/6176927 [14:34:47] it seems that getting COI edits for users is going to be a tough task [14:35:19] on account of dummy edits (correcting grammar etc) and the possibility that even a big edit might not be paid promotion [14:36:01] i see that you've also pointed out as much in that link: "User makes just over 10 minor edits. Is quiet for a few days while waiting for autoconfirm (user right) to kick in (takes 4 days). Then creates a promotional article in one big edit followed by the account going silent." [14:36:09] saurabhbatra, yes. I think that the "edit" is just the wrong level of granularity [14:36:49] then there is the case of sockpuppets that make no edit but just advocate on behalf of users already implicated [14:38:32] ex: https://en.wikipedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=Bengaloorugirl&namespace=&tagfilter=&start=&end= [14:39:22] i do have an idea in mind that might work, although we'll have to figure out how to do real-time detection with it [14:40:28] i propose that we have 3 classifiers [14:40:51] one that deals with time series data of the user's patterns of contributions [14:42:01] one that takes significant edits of a person (+500 being the threshold let's say) and does text classification on that (promotional writing vs factual writing classifier) [14:42:41] and one on top that uses values that these provide and other heuristics that we can think of [14:43:19] why make it hierarchical? [14:43:27] brb in 10 mins. [14:43:38] Please keep working out your thoughts and I'll read the scrollback [14:43:43] okay! 
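The three-classifier proposal above can be sketched roughly as follows. All function names are hypothetical stand-ins: the base "models" here are toy heuristics in place of the time-series and text classifiers being discussed, and the meta-model is a hand-weighted combination rather than a trained gradient-boosted tree.

```python
# Sketch of the proposed two-level ensemble (all names hypothetical).
# Base model 1: time-series behavior; base model 2: promotional-text
# classification; meta-model: combines both scores plus heuristics.

def time_series_score(edit_timestamps):
    """Toy stand-in: flag accounts whose edits cluster in one short burst."""
    if len(edit_timestamps) < 2:
        return 0.0
    span = max(edit_timestamps) - min(edit_timestamps)
    return 1.0 if span < 86400 else 0.2  # all edits within one day looks suspicious

def text_score(edit_text):
    """Toy stand-in for a promotional-vs-factual text classifier."""
    promo_words = {"leading", "renowned", "award-winning"}
    hits = sum(w in promo_words for w in edit_text.lower().split())
    return min(1.0, hits / 3)

def meta_score(ts_score, tx_score, account_age_days):
    """Hypothetical meta-model: weighted blend of base scores plus an
    account-age heuristic, in place of a trained random forest."""
    new_account = 1.0 if account_age_days < 30 else 0.0
    return 0.4 * ts_score + 0.4 * tx_score + 0.2 * new_account
```

In practice the meta-model would be fit on labeled data (e.g. gradient boosting, as suggested below) rather than hand-weighted.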
[14:47:20] so an example, let us consider the case of Annakoppad: https://en.wikipedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=Annakoppad&namespace=&tagfilter=&start=&end= [14:49:50] A is a sockpuppet who follows the textbook sockpuppet defn.; makes some small edits to mask bigger edits which are almost all paid ones [14:52:49] however there are other pseudonyms that he/she adopts. for ex Bengaloorugirl: https://en.wikipedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=Bengaloorugirl&namespace=&tagfilter=&start=&end= [14:53:14] which have no major edits but advocate on behalf of the guilty person [14:54:08] so there could be multiple such classes of sockpuppets/sockpuppeteers [14:55:33] some classes may have enough edits for us to detect, some suspicious time series data etc [14:56:12] Right. I think that if we can model these behaviors, we can probably do a good job of supporting the people who are patrolling for this type of behavior. [14:56:15] using a decision tree based model like grad. boost or random forest usually works well in such situations [14:57:07] hence the ensemble/hierarchical model [14:58:30] I don't understand the hierarchical strategy though. Why not just include the features of the lower-level models in our higher-level model? [14:59:00] just because of how different the features will be [14:59:52] for time series data, we probably want to have a model with RNN/LSTMs or a hidden markov model [15:00:38] for text classification, fasttext or CNN or RNN/LSTM [15:00:43] Hmm. I figure we can get some high fitness with basic statistics of the time series. [15:00:54] Why not try the easiest strategies first? [15:01:11] i.e. KISS until we know that we need the complexity. [15:01:32] hmm, that is a good point [15:02:09] but as long as you agree this is the data we should be working with [15:02:33] I think so. I was also thinking we might want different models to track different behaviors. 
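The "KISS" suggestion above — basic statistics of the time series instead of an RNN/LSTM — might look like this hypothetical feature extractor over edit timestamps (epoch seconds); the feature names are made up for illustration.

```python
# Hypothetical KISS feature set: summary statistics of inter-edit gaps,
# instead of modeling the raw sequence with an RNN/LSTM or HMM.
from statistics import mean, median

def basic_timeseries_features(timestamps):
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]  # seconds between edits
    if not gaps:
        return {"n_edits": len(ts), "mean_gap": 0, "median_gap": 0, "min_gap": 0}
    return {
        "n_edits": len(ts),
        "mean_gap": mean(gaps),
        "median_gap": median(gaps),
        "min_gap": min(gaps),
    }
```

Features like these can feed straight into a decision-tree ensemble, which keeps the first iteration simple and debuggable.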
[15:02:49] Alternatively, we could have a 1 vs. rest model for all behaviors. [15:04:06] how do we use the edits though? [15:04:12] or do you want to skip using them at first? [15:04:59] ? [15:05:05] I'm confused. [15:06:29] ermm [15:07:00] so if i understood you correctly [15:08:17] you want to have heuristics that look for trends in the categorical data and make them into categorical features? [15:08:27] in the time series data* [15:08:45] time series data being edit history metadata [15:14:56] Right, so you could include features about the edits as well. [15:15:49] as in "contains these suspicious words" etc. [15:16:37] or word embeddings directly? [15:16:55] either [15:16:58] both [15:17:38] so that changes our granularity from user to the pair (user, edit) [15:18:50] also there is this problem of aggregation of multiple edits [15:19:15] any sockpuppet may have N no. of COI edits [15:19:26] where N can also be 0 [15:21:32] ahh, although if we have (user,edit) granularity aggregation would just be aggregation of scores [15:22:53] sorry if this is confusing, i'm thinking as i'm typing :-) [15:26:39] 10ORES, 10Scoring-platform-team, 10Edit-Review-Improvements-RC-Page, 10Growth-Team: Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis - https://phabricator.wikimedia.org/T164331 (10kostajh) @awight @Ladsgroup @Halfak, just coming back to this now, and wondering about... [15:26:49] saurabhbatra, any prediction will involve aggregation of edits. [15:28:36] harej, I'm setting up for the management meeting. Any progress for me to report in the last week? [15:28:52] e.g. re. JADE focus group? Setting up search use cases? Etc? [15:29:06] halfak: I'm exploring six wikis for JADE early testing [15:29:29] arwiki, fawiki, fiwiki, cawiki, and maybe ptwiki and ruwiki [15:29:52] got a phab link for that work? 
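The "(user, edit) granularity, then aggregation would just be aggregation of scores" idea above can be sketched as below. The top-k-mean rule is only one hypothetical choice; it handles the "N can also be 0" case and lets a single high-scoring COI edit dominate.

```python
# Hypothetical aggregation from per-edit COI scores to a user-level score.
def aggregate_user_score(edit_scores, top_k=3):
    """Mean of the top-k edit scores: one blatant COI edit still stands
    out, while a user with no scored edits (N = 0) gets 0.0."""
    if not edit_scores:
        return 0.0
    top = sorted(edit_scores, reverse=True)[:top_k]
    return sum(top) / len(top)
```

A plain mean, max, or a learned aggregator would be alternatives worth comparing once labeled data exists.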
[15:30:35] In principle https://phabricator.wikimedia.org/T199520 but I haven't posted anything to that task (yet) [15:32:03] Please add. :) [15:32:22] Like, links to discussion. Who you are working with on each wiki, etc. [15:33:30] Thanks! [15:33:35] * halfak looks for awight [15:34:00] I think I'm going to start adding an agenda item to the sync meeting regarding stuff to send to tech-mgmt. [15:36:17] halfak: is there an API I can use to get user edit history data? [15:43:21] * halfak digs [15:43:45] awight, any progress from last week that I can bring to tech management? [15:44:03] Also, good morning ^_^ [15:53:06] oh hey! [15:53:15] * awight rifles through paperwork [15:53:53] Did u mention that we're both invited to (the margins of) a conference, to refine and popularize ORES & JADE? [15:54:20] Thanks! [15:54:50] Your success in getting concrete action items about revision storage is huge. [15:55:07] TechCom rolling towards our final approval for JADE is significant. [15:55:23] As for me, we made JADE content searchable on-wiki [15:55:54] We're in the process of wooing a talented GSoC vet :D [15:56:39] With hoo's help, I was able to land a bevy of E:JADE patches, actually catching us up with my latest dev work. [15:57:41] & our extension is ranked #3 for test coverage, https://doc.wikimedia.org/cover-extensions/JADE/ [15:59:51] Amir1 patched a CVE in ORES and wikilabels [16:00:31] got the task id for searchable JADE? [16:00:34] halfak: Did you mean just my work, or should I grab the rest of the team's accomplishments? [16:00:37] yes one moment please [16:00:43] https://phabricator.wikimedia.org/T206352 [16:01:23] Usually we report things when they are some version of done or notably blocked. [16:01:42] Should we save it for a later report? [16:02:12] halfak: Search? It's actually done enough for the initial release, it just turns out the task's scope was too broad. 
[16:02:34] We have basic searchability, it's the advanced stuff that needs more refinement of use cases. [16:02:52] I could split it out if we really need that for reporting [16:07:23] I don't see a task for the CVE patch [16:08:17] You must be limited to management access :p [16:08:39] halfak: Sorry, I confused CVE activity from another project. Ours is https://phabricator.wikimedia.org/T208258 [16:09:02] ^ we're not patched yet [16:09:20] gotcha [16:10:56] awight: o/ [16:11:10] saurabhbatra: Evening :) [16:11:43] so i had a chat with halfak about what we were discussing yesterday [16:12:58] i can summarize but i think you'd get a better sense from the logs [16:13:00] https://wm-bot.wmflabs.org/browser/index.php?start=10%2F30%2F2018&end=10%2F30%2F2018&display=%23wikimedia-ai [16:16:57] (reading) [16:20:33] Thanks for flagging. [16:21:38] All I would add is to repeat some specific points from yesterday which still seem relevant: An editor's relationship to the subject matter is probably best studied at article-level granularity, i.e. they have a COI/contract for one article, but another is harmless. [16:21:59] ah, yes about that [16:22:07] This page seems invaluable for linking editors to specific COI articles, https://en.wikipedia.org/wiki/Wikipedia:Conflict_of_interest/Noticeboard [16:22:36] so i only propose flagging really long edits as COI [16:22:59] hand-picking edits for a 1000 accounts seems unfeasible [16:23:51] That's why I'm thinking (editor, article) is a good place to manually label. I think we can manually cross edits using the COI noticeboard. [16:24:18] Maybe filtering by the usernames provided in the data sample, so we're extra certain that the COIs were resolved "guilty" [16:24:35] ah sorry, s/cross edits/cross editors with articles/ [16:24:56] And then we get the long-term editor engagement with each article programmatically. [16:25:39] saurabhbatra: BTW I just want to say, it's great to have you pushing on these questions! 
We've been eager to take this project on for... a year now? [16:26:08] no probs, it seems like a fun mind-warping problem :-) [16:26:36] hahah just wait until the editors go into stealth mode to avoid our algorithm [16:27:33] well, we can't stop them but at least we can make it tougher for them! [16:27:35] I assume that this group has lots of shadow money, more resources than our team... hopefully they don't figure out how to organize. [16:28:10] :) exactly, just like with the antivandalism, a 90% automated reduction will leave the humans with 10x more time to do their monkey-brain magic. [16:28:45] i'm still skeptical about manually labelling edits :\ [16:28:56] there should be a better way [16:30:00] 10ORES, 10Scoring-platform-team (Current): ORES workers using dramatically higher CPU, increasing linearly with time - https://phabricator.wikimedia.org/T206654 (10awight) 05Open>03Resolved a:03awight /me gives a standing ovation This is fixed! [16:30:18] saurabhbatra: That's a step I was thinking we could avoid, too--I'm only suggesting manually labeling (editor, COI article title) [16:30:39] That's a survivable amount of labor, IMO [16:30:54] & it's mostly done thanks to the COI noticeboard and our figshare sample. [16:32:34] that's still a 1000 noticeboard searches [16:33:36] harej: I'd like your opinion on https://phabricator.wikimedia.org/T206037, when you have time. I have a proposal for how to *not* change the model names, by introducing a "model family" concept. I think we IRC'd but posting your current thoughts to the task would be helpful... [16:33:37] for a 1000 users [16:33:55] saurabhbatra: It's a pain, but compare with our other manual labeling campaigns: [16:34:10] https://labels.wmflabs.org/ui/ [16:34:30] Here's a 20k sample, for instance, https://labels.wmflabs.org/stats/enwiki/ [16:34:49] darn :O [16:34:51] This seems like a domain where high data quality is totally worth the extra effort to manually label [16:35:34] I know... 
everything about computers turns out to actually be a human in a machine suit. [16:36:02] LOL :-) [16:36:25] 10ORES, 10Scoring-platform-team, 10Growth-Team, 10MediaWiki-extensions-PageCuration, and 2 others: Display ORES draftquality model prediction in PageCuration feed - https://phabricator.wikimedia.org/T157130 (10kostajh) 05Open>03Resolved a:03kostajh This was done in T196178 [16:37:02] I wish I could use the original name for that concept, but it began its coinage in racist mire and now is a trademark. A perennial bad idea! [16:37:33] from what i've noticed edit length and paid promotion have a nice correlation [16:38:32] halfak: I was pinged for the CVE thing, is everything okay? [16:38:46] excellent! probably, add time since account registration and we're halfway done :) [16:38:46] yeah. Just gathering tasks for tech management. [16:39:10] Amir1: my fault, I should have am.ir'd you but wanted to say "hi" o/ [16:40:21] halfak: is tech mgmt. done? If not I have the redis task tracker to add (and the response time thingy) :D [16:40:37] too late, yeah :( [16:40:39] awight: it's fine. I was afk for errands otherwise I'm online for wikidata stuff [16:40:44] I think I might have reported that for last week, Amir1 [16:40:59] halfak: oh okay [16:41:07] So awesome. It's like we switched to AMD cpus or something. [16:41:42] Amir1: Anything I can review? [16:42:02] awight: upgrade to celery4 PR in ores :D [16:42:09] excellent [16:42:27] awight, the only thing holding me back was a read through the testing. [16:42:32] Specifically the overload recovery. [16:43:06] Amir1: {done} [16:43:24] halfak: nice--yes there was a response about that, looks good. 
[16:49:03] awight: i'm still looking at the data but at 100 a day we can have the labelled dataset in 10 days [16:50:25] * halfak --> Lunch [16:51:32] only problem being exceptions like these - https://en.wikipedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=Polannegi&namespace=&tagfilter=&start=&end= [16:52:07] 10MediaWiki-extensions-ORES, 10Scoring-platform-team (Current), 10Patch-For-Review, 10User-Ladsgroup: Implement JS ORES client in mw-ORES extension - https://phabricator.wikimedia.org/T201691 (10awight) I'd like to see the pooling done transparently, maybe defaulting to 1 worker but keeping the promise sig... [16:59:30] saurabhbatra: What's the problem? The fact that everything is deleted? [17:00:10] awight: that there are sockpuppets flagged without having edited any articles [17:00:45] It's possible that they did edit articles, but everything is suppressed--I'm not sure. [17:00:52] now that i'm reading individual cases it seems they were blocked based on IPs [17:00:56] on that note, you might want to get the NDA process started again, let me know if I can do anything. [17:01:14] Ooh well you're right, that would be an empty set for our purposes, then. [17:01:56] "The accounts above were Confirmed to each other and Technically indistinguishable from 1-555-confide, though Unrelated to the original case" [17:02:06] https://en.wikipedia.org/wiki/Wikipedia:Sockpuppet_investigations/1-555-confide/Archive [17:02:09] yah we don't care about socks IMO [17:02:45] so we can nitpick from the data points and only pick up on users with COIs? [17:03:27] Yes I think that's a good first approach [17:04:23] then manual data collection might not be that tedious [17:05:12] ;-) we can tell ourselves that, at least [17:05:19] :-) [17:05:48] One thing... since we'll need to repeat this work for every language we support, it's worth some time to think about a generalized labeling interface. 
[17:06:07] Ideally, this is something we can plug into Wiki Labels [17:06:54] We process usernames, maybe filter for having any main content-space edits above a certain size, then present labelers with the articles they've worked on. [17:07:45] 10ORES, 10Scoring-platform-team (Current), 10Growth-Team, 10MediaWiki-extensions-PageCuration, and 2 others: Merge articlequality and itemquality - https://phabricator.wikimedia.org/T206037 (10awight) @Harej I'd love feedback on ^ the "model family" concept above. [17:08:03] awight: yes, agreed [17:08:27] https://github.com/wikimedia/wikilabels [17:09:19] saurabhbatra: As a volunteer, please be especially generous to yourself about only doing things that are interesting BTW. If the labeling interface seems tedious for example, just outline your thoughts and someone else might be able to step in for implementation. [17:10:49] awight: understood. although I think this is a great idea [17:11:19] 10JADE, 10Scoring-platform-team (Current): Add endorsement.timestamp field - https://phabricator.wikimedia.org/T208334 (10awight) [17:11:21] :D [17:12:30] So far, the labeling framework has proven to be surprisingly flexible, IMO. It's currently being used for individual edits, edit sessions, and playing with citations. [17:12:50] awight: I will get back to you on model family concept after my doctor’s appointment [17:13:20] i think it'd be a nice fit for us [17:13:22] harej: ty, sorry for the double-ping, I just wanted to put my ask somewhere durable. [17:14:20] saurabhbatra: e.g. try "request workset" for these varied campaigns, https://labels.wmflabs.org/ui/enwiki/ [17:15:09] so first task, get a user's edit history metadata and edit diffs for edits of reasonable length for all users in our flagged list [17:15:15] does that sound good? 
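The "first task" above — pulling a user's edit-history metadata — maps onto the MediaWiki Action API's `list=usercontribs` module (the URL given below in the log). A minimal sketch, leaving the HTTP request itself to the caller (e.g. urllib or requests) and only showing query construction and the +500-byte significance filter; the function names are hypothetical.

```python
# Sketch: build a usercontribs query and filter for "significant" edits.
# ucprop=sizediff asks the API for each edit's byte delta.
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def usercontribs_url(user, limit=500):
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "uclimit": limit,
        "ucdir": "newer",
        "ucprop": "ids|title|timestamp|sizediff",
        "format": "json",
    }
    return API + "?" + urlencode(params)

def significant_revisions(usercontribs, min_bytes=500):
    """Keep edits whose byte delta crosses the +500 threshold proposed
    earlier; returns (revid, title) pairs for labeling."""
    return [(c["revid"], c["title"])
            for c in usercontribs
            if c.get("sizediff", 0) >= min_bytes]
```

In practice the revscoring/editquality extractors mentioned below already wrap much of this, including continuation handling.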
[17:16:13] 10Scoring-platform-team, 10Research, 10Wikilabels: WikiLabels "Labeling Unsourced Statements II" interface is broken - https://phabricator.wikimedia.org/T208337 (10awight) [17:16:14] awight: yeah i did, it's just the tool for the job [17:16:58] saurabhbatra: Yes I think that's a great start. Please look in the revscoring and editquality repos though, it's tempting to write this stuff from scratch but some helpful data pipeline tools already exists. [17:17:02] *exist [17:17:13] & we will want to automate this pipeline [17:17:43] Maybe a "step 0" to document is how you propose we gather this dataset for future language campaigns... [17:17:49] * awight grammars wrongly [17:18:04] i think this API should work for us [17:18:05] https://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=Annakoppad&uclimit=500&ucdir=newer [17:18:49] AFAICT we just need the revision id for wikilabels [17:19:23] Tools to be aware of, [17:19:24] https://github.com/wikimedia/editquality/blob/master/templates/Makefile.j2 [17:19:37] https://github.com/wikimedia/editquality/blob/master/config/wikis/enwiki.yaml [17:20:03] https://github.com/wikimedia/revscoring/blob/master/revscoring/extractors/api/revision_oriented.py [17:20:59] awight: thanks! I'll go through them [17:21:21] saurabhbatra: We probably want more than the revision ID, but thinking about it this might be tricky. Maybe the longest edit session (group of edits) they made in a given article? [17:21:41] That's a weird problem. Ideally, it would be all changes made to the article, but there could be intervening edits by other users. [17:22:05] https://github.com/wikimedia/revscoring/blob/master/revscoring/utilities/fetch_text.py [17:22:11] https://github.com/wikimedia/revscoring/blob/master/revscoring/utilities/extract.py [17:23:02] saurabhbatra: ^ there's actually a project kicking off to improve our README's, led by srrodlund. 
I'd like us to document how to use utility scripts in particular, so stay tuned :-) [17:23:57] Yes! It IS happening! [17:24:00] awight: i think a single rev id might work [17:24:30] if we have our model score on (user, rev) [17:24:43] and aggregate those to (user, article) [17:24:57] or just aggregate the lot to user [17:25:55] hmm, but it won't get a bunch of small edits to the same article [17:30:50] srrodlund: Are there pages where I can drop notes like this, yet? [17:32:48] awight: No, but you can file a Phab ticket and use the documentation tag. [17:33:13] saurabhbatra: I think the filtering might look like, * user somehow flagged for COI (could be manual, or appearing in the noticeboard), * take all articles where the user has changed more than >N bytes, spread across any number of edits, [17:33:47] then it gets tricky, maybe * present each of the user's edit sessions on these potential COI articles in wikilabels and ask "Does this look like COI?" [17:34:22] Or actually, assign it to my user workboard in Phab so I see it [17:35:14] alright, this is a reasonable way of doing things [17:35:14] I'm super swamped through the end of Nov, but the READMES project is one of my priorities (I am working on it -- but updates might be slow) [17:36:32] i'll get on it! [17:38:08] 10ORES, 10Scoring-platform-team, 10revscoring, 10Documentation, and 2 others: Improve revscoring READMEs - https://phabricator.wikimedia.org/T208338 (10awight) [17:38:31] srrodlund: no rush, just encouraging us to list the shortcomings is helpful. [17:39:18] saurabhbatra: Don't feel like this is blocking, but maybe document the sampling and extraction plan as you implement... [17:39:46] awight: we both know it's going to be "after" i implement ;-) [17:40:12] as all documentation ever is... [17:40:41] i'll try though :-) [17:41:51] Truth is stranger than fiction, I'd certainly hope that things happen "backwards" like this! 
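The filtering step proposed above — "take all articles where the user has changed more than >N bytes, spread across any number of edits" — can be sketched as a small aggregation over contribution metadata; the function name and field layout are hypothetical.

```python
# Hypothetical filter for the labeling pipeline: sum positive byte
# deltas per article across all of a user's edits, and keep articles
# where the total crosses the threshold even if no single edit did
# (catching the many-small-edits pattern discussed earlier).
from collections import defaultdict

def candidate_coi_articles(contribs, min_total_bytes=500):
    totals = defaultdict(int)
    for c in contribs:
        totals[c["title"]] += max(0, c.get("sizediff", 0))
    return sorted(t for t, total in totals.items() if total >= min_total_bytes)
```

Each surviving (user, article) pair could then be presented in Wiki Labels with the question "Does this look like COI?".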
[17:41:54] d'oh [17:47:31] 10ORES, 10Scoring-platform-team, 10Multi-Content-Revisions, 10Epic: MCR support in ORES - https://phabricator.wikimedia.org/T195779 (10awight) [17:47:34] 10ORES, 10Scoring-platform-team (Current): ORES feature extraction triggers new MCR-related deprecation warning - https://phabricator.wikimedia.org/T201332 (10awight) 05Open>03Resolved p:05Triage>03Low [18:59:20] halfak: I might be 5 min late! [18:59:33] awight, no worries. Ping me when you're ready. [19:05:59] awight: I guess my first question is, is there other software that works this way? [19:06:14] With model autodetection and such? [19:07:30] harej: oops, sorry I'm in a meeting for 1hr. I think there are precedents but let me grab some. [19:26:16] 10JADE, 10Scoring-platform-team (Current), 10Advanced-Search, 10Discovery-Search, and 4 others: Extract judgment data for end-user search indexing - https://phabricator.wikimedia.org/T206352 (10Harej) I don't want us to overthink CirrusSearch integration since I am not sure how much value there is for it.... [20:05:34] 10JADE, 10Scoring-platform-team: Document group polarization as a risk to JADE - https://phabricator.wikimedia.org/T208349 (10awight) [20:05:38] harej: ^ thing I've been meaning to look at [20:09:24] 10JADE, 10Scoring-platform-team (Current), 10Advanced-Search, 10Discovery-Search, and 4 others: Extract judgment data for end-user search indexing - https://phabricator.wikimedia.org/T206352 (10awight) @Harej It would be good to list the use cases still, for example, will editors want a workflow like, "bro... [20:16:02] harej: I think that capability negotiation is very common generally, but don't have examples from AI. For example, modem and ethernet link negotiation, Javascript polyfill and graceful degradation... 
[20:17:34] My vision here is that we can have one client coupled to a specific API version and model endpoint, and another which allows users to experiment with a range of say contentquality models which rely on different algorithms. [20:17:43] This might be extreme... [20:19:22] In the "small picture", it gives us a way to leave articlequality and itemquality with their own identities, and clients will hardcode to one or the other but it will be clearer to the client programmers that model name should be parameterized, and the family information suggests commonalities between the models. [20:33:28] okay, i'm back from one meeting, and i have another in 33 minutes [20:36:09] * harej re-acclimates himself to Scoring Platform mode [20:36:51] That group polarization task is a great point. I'm going to want to write out some scenarios where different types of interactions happen. [20:42:00] Anyways awight, not sure I have very useful feedback regarding your model family implementation idea, other than that it sounds complicated. Do you think it would be hard for end-users to implement? [20:44:37] It's also kind of an abstract thing to me, since I haven't written an ORES client application in years [20:45:06] And uhhhhhhhh as I remember, I picked the API endpoint I wanted to work with, it had a certain name, so I went with the name. [20:48:33] One deliverable I think you might like would be a diagram that describes and relates the different forms of interaction that are possible with JADE. Would that be helpful for you? [20:49:36] Also, would you like me to comment on anything else that I've forgotten to comment on? [20:54:35] awight: do you have a minute to check this? https://github.com/wikimedia/wikilabels/pull/250 [20:54:36] :D [20:59:01] wikimedia/wikilabels#429 (new_flask - f9e43e7 : Amir Sarabadani): The build passed. https://travis-ci.org/wikimedia/wikilabels/builds/448547113 [21:08:31] harej: Cool thanks for the feedback. 
IMO the proposed change actually won't affect any "correct" clients... [21:08:36] wikimedia/ores#1075 (new_flask - 91421c9 : Amir Sarabadani): The build passed. https://travis-ci.org/wikimedia/ores/builds/448552767 [21:09:14] The theory is that there are three ways a client can behave, I suppose. [21:09:22] 1) "old" way, ask for a specific model e.g. /v3/enwiki/damaging/... [21:09:53] 2) "discover", ask for *all* models and match model_family or other capabilities. [21:10:17] 3) deprecated, ask for all default-visible models, e.g. /v3/enwiki/123456 [21:10:42] 10Scoring-platform-team (Current), 10editquality-modeling, 10revscoring, 10artificial-intelligence: Create a newcomerquality meta-model for revscoring - https://phabricator.wikimedia.org/T205926 (10notconfusing) @halfak and I reconvened and found that redoing the dependency-management of revscoring for thi... [21:10:55] ^ the last one should be deprecated because it causes future migration headaches, and is vague. Why all models? What's the use case? [21:10:58] harej: ^ [21:24:10] (03Abandoned) 10Umherirrender: Adjusting the punctuation. [extensions/ORES] - 10https://gerrit.wikimedia.org/r/365544 (owner: 10Felipe L. Ewald) [21:48:28] 10Scoring-platform-team (Current), 10editquality-modeling, 10revscoring, 10artificial-intelligence: Create a newcomerquality meta-model for revscoring - https://phabricator.wikimedia.org/T205926 (10notconfusing) 05Open>03Resolved p:05Triage>03Normal [21:50:40] awight: So basically this enables the ability for clients to either be simple ("get me this one model I am aware of") or sophisticated ("I will dynamically choose the model based on my own business logic") [21:51:05] exactly [21:51:18] and I'm thinking "all" has always been sort of a mistake to expose. [21:51:43] But I do have a transition strategy for that, where some models are marked as "default visible" which causes them to appear in the "all" list. 
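Client strategy 2 above ("discover": fetch all models, match on model_family) can be sketched client-side like this. Note the `model_family` key is the *proposed* concept from T206037, not a field the live ORES API exposes, and the function name is made up.

```python
# Hypothetical "discover" client step: given ORES model metadata keyed
# by model name, pick a model from the requested family rather than
# hardcoding a name like "articlequality" or "itemquality".
def pick_model(models, family, prefer=None):
    candidates = {name for name, meta in models.items()
                  if meta.get("model_family") == family}
    if not candidates:
        return None
    if prefer in candidates:
        return prefer
    return sorted(candidates)[0]  # deterministic fallback choice
```

A simple client (strategy 1) would skip this entirely and hardcode one model name; this sketch is what the parameterized, family-aware client might do instead.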
[21:52:11] BTW that also gives us a really smooth migration if we do decide to consolidate into "contentquality", for example. [21:52:41] Remind me again. Marking as "resolved" and moving a task to "done", not the same things, right? [21:52:42] Because we can provide "wp10" and "articlequality" as aliases, but they are hidden by default in the "all" list. [21:52:57] That seems fine to me [21:53:08] notconfusing: right, but both will get the point across :) [21:53:41] notconfusing: We use "done" something like a sprint-level status, and I think of "resolved" as an extra step, which means "nothing more to verify" [21:54:35] I see [21:55:12] Please make sure all of the done stuff goes in the done column though. [21:55:16] That helps with my reporting :) [21:55:18] A workflow that I think is common is someone will work on a task, they move the task to Done when they are done, and then during the next scrum the team decides if the task is truly-for-real done (and there aren't outstanding QA concerns or anything) [21:55:32] If I have to dig for your task, I won't tell the fancy higher-ups how awesome you are. [21:55:42] (If I forget, which is likely) [21:55:59] +1 for harej [21:56:16] I'll often go through the list and resolve things/ask if I can resolve things. [21:57:38] 10Scoring-platform-team (Current), 10Wikilabels, 10artificial-intelligence: Implement a modeling self-check process - https://phabricator.wikimedia.org/T198144 (10notconfusing) [22:00:55] 10Scoring-platform-team (Current): Qualitative Analysis of Session-Edit mismatches. - https://phabricator.wikimedia.org/T208362 (10notconfusing) [22:09:23] Got my reviews done quickly. If you don't want me to copy-paste my reviews from the last journal/conference, don't submit the same manuscript (and get me as a reviewer again) [22:09:45] It's hard to read through a paper to make sure that it is the exact same manuscript [22:10:37] halfak: run pdf2text on both files and use a diff tool? 
[22:10:51] That's a really good idea. [22:11:12] Oh well. Already burned some eye time doing it manually (optically?) [22:11:16] OK I'm out of here for the day. [22:11:21] Have a good one folks! o/ [22:11:24] o/ [22:11:28] o/ [22:12:00] 10Scoring-platform-team (Current): Evaluate Newcomer Model - https://phabricator.wikimedia.org/T208364 (10notconfusing) [22:12:32] 10Scoring-platform-team (Current): Evaluate Newcomer Model - https://phabricator.wikimedia.org/T208364 (10notconfusing) Metrics (Maximum recall with minimum precision at 95%) [22:14:37] 10Scoring-platform-team (Current): Incoporate newcomerquality model into a python package - https://phabricator.wikimedia.org/T208365 (10notconfusing) [22:17:06] (03CR) 10jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/JADE] - 10https://gerrit.wikimedia.org/r/470693 (owner: 10L10n-bot) [22:28:24] 10Scoring-platform-team (Current): Understand TeaHouse desires of Newcomer-Quality predictions - https://phabricator.wikimedia.org/T208367 (10notconfusing) [22:41:35] (03PS1) 10Awight: Add endorsement.timestamp field [extensions/JADE] - 10https://gerrit.wikimedia.org/r/470724 (https://phabricator.wikimedia.org/T208334) [22:42:14] 10JADE, 10Scoring-platform-team (Current), 10Patch-For-Review: Add endorsement.timestamp field - https://phabricator.wikimedia.org/T208334 (10awight) @Halfak Let us know whether you agree that endorsement.timestamp should be required. [22:50:42] switching into paper-reading mode, offline [23:46:53] o/
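The "run pdf2text on both files and use a diff tool" idea above can be sketched with stdlib difflib. Extracting the text (e.g. with the `pdftotext` CLI) is assumed to have happened already; this only compares the two extracted texts, ignoring whitespace-only differences that PDF layout typically introduces.

```python
# Sketch of the pdftotext-then-diff manuscript check. Input is the
# already-extracted text of each PDF; extraction itself is out of scope.
import difflib

def manuscripts_identical(text_a, text_b):
    """Return (identical, diff_lines), stripping blank lines and
    leading/trailing whitespace so layout noise doesn't count."""
    a = [line.strip() for line in text_a.splitlines() if line.strip()]
    b = [line.strip() for line in text_b.splitlines() if line.strip()]
    diff = list(difflib.unified_diff(a, b, lineterm=""))
    return (not diff, diff)
```

For resubmission detection, even a fuzzy ratio (difflib.SequenceMatcher) over the two texts would likely beat reading both manuscripts optically.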