[09:50:06] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" for every revision I try to open - https://phabricator.wikimedia.org/T209926 (Daimona)
[09:53:31] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" for every revision I try to open - https://phabricator.wikimedia.org/T209926 (Daimona)
[10:05:23] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" for every revision I try to open - https://phabricator.wikimedia.org/T209926 (Daimona) Using a sample API request, I noticed that: - https://it.wikipedia.org/w/api.php?action=compare&format=json&fromrev=68597719&torev=6859...
[13:50:31] ORES, Scoring-platform-team, Edit-Review-Improvements-RC-Page, Growth-Team: Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis - https://phabricator.wikimedia.org/T164331 (Trizek-WMF) The filters are indeed not automatically added to the UI. I guess that the...
[14:39:26] o/
[14:39:31] Was responding to ORES questions.
[14:40:22] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Respond to questions about ORES article quality model - https://phabricator.wikimedia.org/T209951 (Halfak)
[14:40:30] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Respond to questions about ORES article quality model - https://phabricator.wikimedia.org/T209951 (Halfak) a:Halfak
[14:41:23] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Respond to questions about ORES article quality model - https://phabricator.wikimedia.org/T209951 (Halfak) Response: https://en.wikipedia.org/w/index.php?title=User_talk:Rosiestep&diff=869810090&oldid=869734679
[14:41:24] ^ Taskified
[14:41:29] brb
[15:06:25] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" for every revision I try to open - https://phabricator.wikimedia.org/T209926 (Halfak) It looks like this revision was part of a deleted page and all of the revisions that weren't deleted have already been labeled. I've sub...
[15:10:37] Scoring-platform-team, Wikilabels: Allow privileged users to label deleted revision in Wikilabels - https://phabricator.wikimedia.org/T209960 (Halfak)
[15:12:23] Scoring-platform-team, Wikilabels: Allow privileged users to label deleted revision in Wikilabels - https://phabricator.wikimedia.org/T209960 (Halfak)
[15:31:55] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" and that's confusing and sometimes wrong - https://phabricator.wikimedia.org/T209926 (Halfak)
[15:33:33] Woops. Been back. Forgot to say
[15:35:40] Scoring-platform-team, Wikilabels: Wikilabels campaign shows "revision not found" and that's confusing and sometimes wrong - https://phabricator.wikimedia.org/T209926 (Halfak) I've edited the task title to reflect the confusing aspect of this situation.
Rather than wave this off and say "that's how it i...
[15:36:42] Scoring-platform-team, Wikilabels, UX-Debt: Wikilabels campaign shows "revision not found" and that's confusing and sometimes wrong - https://phabricator.wikimedia.org/T209926 (Halfak)
[15:37:01] Scoring-platform-team, Wikilabels, UX-Debt: Wikilabels campaign shows "revision not found" and that's confusing and sometimes wrong - https://phabricator.wikimedia.org/T209926 (Daimona) @Halfak Yeah, I lately got to the same conclusion that this happens both because deleted revisions aren't handled p...
[16:35:17] Amir1 and harej! Anything I should report to tech management beyond what we have in the sync etherpad? https://etherpad.wikimedia.org/p/scoring_platform
[16:36:07] Right now we are reporting that the (probably final) JADE RFC is happening, that we're looking towards putting scores in Hadoop, and that we're starting to realize that we're lacking dedicated design resources for JADE development.
[16:42:29] That sounds good
[16:46:51] halfak: two things maybe: Fixed the scap restart issue, working on moving to make redis HA
[16:46:59] by using redis-sentinel
[16:47:10] Can you add that to the list with relevant task links?
[16:47:12] Thank you :D
[16:49:14] halfak: done
[18:02:14] Just finished morning meetings. Taking lunch
[18:03:22] o/
[18:37:02] Hopefully everyone is filling out the developer survey...
[18:54:00] sheesh, the wikilabels db was rebooted on a different schedule than I'd written down...
[18:54:03] kicking
[18:55:32] we're back
[18:55:44] ooh another problem > 500:Unable to parse response
[18:55:55] weird. > Could not load workset list: {"code":500,"status":"error","message":"Unable to parse response"}
[18:58:14] hard rebooting
[19:00:09] now it works. so the service script is non-obvious
[19:03:12] There it is. service uwsgi-wikilabels-web status
[19:03:21] Hey dude. How can I help?
[19:04:10] Looks like we are online
[19:04:50] o/ awight
[19:04:56] :)
[19:05:20] I got medieval on it and rebooted the box
[19:05:44] Dropped a note here for next time, though: https://wikitech.wikimedia.org/wiki/Wikilabels#Restarting_the_service
[19:06:04] I suppose that works :)
[19:07:39] Not how it should be done
[19:08:52] Argh. I just verified that I was given correct warning about the DB reboot schedule, yet somehow I put a calendar event down for two hours later.
[19:11:57] I want more ORES in the UI... would be great if I could sort pages I've contributed to by article quality prediction
[19:12:09] just installed importScript("User:EpochFail/ArticleQuality-loader.js")
[19:12:16] moar
[19:12:20] haha :)
[19:12:42] I agree though. We should get article quality into elastic search.
[19:12:48] I think that might be a good start.
[19:31:24] halfak: harej: docs meeting, if u want
[20:02:52] harej: hey. Just a small note I couldn't get in during hangouts, that the re-review case is key cos that's when collaboration happens.
[20:03:15] That's a good point
[20:03:25] But it does seem like "meta-reviewers" might not be a separate persona, just an activity that happens among existing personas.
[20:06:52] I'll update the etherpad to denote it as a separate category of activity, just so we have our bases covered.
[20:08:43] My head has been foggy this week, sorry if I seem unclear or unfocused.
[20:08:57] Not at all!
[20:09:02] sorry to hear it though
[20:10:59] "Meta-reviewers: On larger wikis, a certain class of users are interested in making sure that users who are reviewing content are upholding a certain standard and doing reviewing "correctly." For example, an administrator on English Wikipedia recently called for some patrollers to lose their privileges due to mistakes in patrolling."
[20:11:02] Is this a good description?
[20:13:45] Also, I'm still drafting my email to the design friends. Should I send it this week or should I wait until next week?
[20:13:47] To send it, that is
[20:14:11] halfak: ^
[20:14:23] It sounds right to me, although I can't speak to the "large wiki" qualifier
[20:14:51] I'll leave it out.
[20:17:21] harej, +1
[20:17:29] +1 to what?
[20:17:30] Not sure about the large wiki qualifier either
[20:17:31] FWIW, I tend to go ahead and send emails before a holiday, if they're of the "daydream about this thing" variety, but not if they're "do this very hard work"
[20:17:41] But +1 to your summary otherwise.
[20:17:42] It's basically a long status update
[20:18:58] Sounds good for pre-vacay IMO
[20:19:10] "here's some stuff to not worry about" ;-)
[20:20:34] So, I get like 3 or 4 emails per day about ORES or wiki research generally. I spend a substantial amount of my time reading, considering, and responding to these messages. I've been thinking about documenting these email exchanges onwiki. I've just started asking for approval from my correspondents. Was thinking about where to put this stuff. E.g. a lot of questions are about querying ORES or using revscoring models. Where would you like
[20:20:34] to see those appear? Maybe I should post them on mw:Talk:ORES.
[20:21:02] I like it.
[20:21:16] I'm not sure about publishing verbatim, though
[20:21:18] This is part of my effort to make my invisible work more visible :)
[20:21:26] Yeah. Only with approval for sure.
[20:21:28] seems easier to summarize into a FAQ
[20:21:42] also easier to read
[20:21:51] A lot of times, I'm filling in gaps in documentation. But fixing the documentation would be a much larger effort.
[20:28:40] yup. If only documentation would grow on trees!
[20:29:52] we could use a small army of Sarah R.'s
[20:34:31] yes, deployed with 1,000 ukeleles and a metric ton of notebooks
[20:39:44] * Amir1 after reading the last three messages here without reading further: https://giphy.com/gifs/mrw-coffee-wants-tvGOBZKNEX0ac
[20:40:32] Remember to stay away from heroin!
[20:48:02] halfak: you had tech managers meeting today, yeah?
[20:48:08] Yes.
[20:48:17] How did your pitch for design resources go?
[20:48:37] We didn't get to it because there was too much discussion of annual planning and the future of the WM Hackathon.
[20:49:07] I'll say more about that in our staff meeting -- nothing too concerning.
[20:49:39] But I'll bring the design needs revelation to the next tech management & my next 1:1 with Victoria.
[20:51:04] Oh boy, it's about to be annual planning season again.
[20:55:02] That was a short summer recess
[21:04:33] Heh. At least our job is relatively easy. It seems to me that we're pretty well aligned on what we need for our team to continue to work effectively.
[21:05:19] It seems we are aligned on what our most important work is too. I really appreciate some of the issues that awight has raised in the past -- some of the limitations of our small team. This is good fodder for annual planning efforts.
[21:05:35] We'll see though. I want to make sure that we lay out a bit of that framing in Berlin :)
[21:05:49] And then maybe Amir1 can take us long boarding at the airport.
[21:06:00] pick up a few grant checks too
[21:06:26] * awight spits cigar end over the balcony
[21:08:33] In this time of the year? Is fighting whitewalkers included in our job description?
[21:40:50] harej: srrodlund: halfak: I moved the technical cruft out of the main article, https://www.mediawiki.org/wiki/Extension:JADE
[21:41:36] "Usage" is the only section left :D
[21:41:57] Amir1, fair point. We'll need to bring some light gloves or we might get a little cold :P
[21:43:01] DOOM
[21:45:30] I was thinking it would be nice for us to visit some of the christmas markets. I heard from some Germans that they'd be open after Thanksgiving.
[21:45:38] Amir1, ^ you have any experience with them?
[21:46:03] I wasn't in them but people in WMDE love it
[21:46:10] they serve hot wine
[21:46:32] I passed some last year
[21:46:47] * awight rubs imaginary gloves together
[22:03:35] harej: https://www.mediawiki.org/wiki/Extension:JADE#Bots,_abuse,_and_quality_standards
[22:04:04] I changed the focus from "don't flood our servers" to "don't upload garbage", it feels like a better argument point
[22:04:51] I have to multitask for the rest of the day unfortunately, will take the time off.
[22:05:52] Random thought I just had: do we want to surface the ORES score in JADE in any way? What I am thinking is that if people are tempted to just copy/paste ORES scores into JADE, we can take that tempting task away from them by already showing ORES.
[22:06:25] harej, I'd like to ask people that. I don't know if we will know how to design it ahead of time.
[22:06:36] if we're prepared for it, we can react quickly to emergent behaviors.
[22:07:46] Amir1, sounds like a good idea then. Could you get some recommendations and/or maybe a group of WMDE folk who would like to guide us?
[22:08:13] Also some beer and döner :3
[22:08:21] Sure, I know Fisch loved it
[22:08:23] I'd love to have Lydia and others who would be interested in meeting with us also join us for a social event :)
[22:08:29] Oh great!
[22:08:43] Speaking of which, Lydia_WMDE are you working at this unreasonable hour?
[22:08:49] harej: I know one good donor place (Donor Gemuse in Mehringdam)
[22:11:34] JADE, Scoring-platform-team: APIs to calculate and judgment page title - https://phabricator.wikimedia.org/T210014 (awight)
[22:12:34] It seems to be a best practice to show the AI score when collecting feedback.
[22:12:52] awight, I thought that was the opposite.
[22:13:01] Since the score can affect judgment.
[22:13:06] There are some refs here, https://www.mediawiki.org/wiki/JADE/Background#Providing_explanations_to_the_end-user
[22:13:19] * increases transparency into the ML, building trust
[22:13:38] But biases the judgment.
[22:13:39] * the users are often here explicitly to criticize our scores, so will look it up anyway
[22:13:40] Hmm.
[22:13:46] Fair point.
[22:14:03] But still when providing the judgment, they'll likely be in some tool that has an ORES prediction.
[22:14:09] I think this was the strongest argument I came across, http://www.rosenthalphd.com/papers/Rosenthal_IUI10.pdf
[22:14:11] I think we're talking about the discussion angle.
[22:14:32] So, post judgment. During a disagreement of some sort.
[22:14:33] +1 they've likely seen the ORES already
[22:15:01] I'm talking to Rosie who uses article quality scores to update WikiProject assessments. She says she discards the ORES data and uses her own judgment 25% of the time.
[22:15:24] Oh that's really interesting. I've been talking to her about that too.
[22:16:38] btw > Studies have shown that when a human and robot share a common frame of reference in the environment, they can communicate more effectively (e.g., [Torrance, 1994; Topp et al., 2006; Steel, 2003]
[22:16:41] So the big question, I think, is whether the status of the prediction at the time of the judgment is important.
[22:16:47] hence our scale.
[22:16:58] here's another ref, from the same group http://www.cs.cmu.edu/~mmv/papers/09ijcaiw-inter-stephanie.pdf
[22:17:01] awight, +1 to that. This is especially important for asking people to make judgments in a familiar context.
[22:17:34] If the prediction changes, is that a problem? If it was once a false-positive and now it is a true-negative, is that OK?
[22:17:38] On the one hand I don't want people just copying and pasting ORES scores into JADE; on the other hand, there is something to be said for a human backing up a robot.
[22:17:50] "Robot tested, human approved."
[22:18:20] halfak: +1 we need to snapshot the ORES score at the time of judgment
[22:18:29] awight, I'm not sure that is clear.
[22:18:48] true, maybe not necessary
[22:18:50] I can see why we want it in theory for a specific use-case. But I don't know if it is a user-need.
[22:19:00] I'd really like to store ORES scores historically.
[22:19:08] Maybe we can do that with hadoop
[22:19:08] +1
[22:19:15] yeah I'm starting to think that, too
[22:19:20] So we can look for old false-positives.
[22:19:30] maybe date gathered is useful after all
[22:19:35] But maybe users *do* want a nice ORES snapshot sometimes/all the time.
[22:20:01] It feels prudent to collect it... but even nicer if we have a decoupled archive
[22:20:10] What's more, maybe we should ask the tool to tell us what was shown to the user when the judgment is submitted.
[22:20:15] e.g. hadoop is doing just that, and we can look up by date
[22:20:18] Because some tools show predictions differently.
[22:20:31] I think that's captured by endorsement.origin
[22:20:33] E.g. huggle mixes ORES predictions with internal heuristics.
[22:20:45] Right. But that doesn't tell us how the heuristics lined up.
[22:21:08] Could have been a true-negative in ORES-land but Huggle pushed it over the edge with heuristics.
[22:21:09] people can navigate various ways, too. Seems like a bigger clicktracking-style problem
[22:21:46] Right. I'm struggling to figure out what a standard format for this would be.
[22:22:08] usually a list of URLs?
[22:22:44] E.g. if Huggle gives the user a visual score (e.g. Score: 1000, Color: #FF0000) and RecentChanges gives an ordinal (Severity: Very-likely-problematic)
[22:23:04] haha nice, A/B testing color choices already
[22:23:44] awight: https://www.mediawiki.org/w/index.php?title=Extension:JADE&type=revision&diff=2974846&oldid=2974787&diffmode=source
[22:24:05] In some cases, damaging.true: 0.5 is a "positive" and in other cases anything above damaging.true: 0.1 is a "positive".
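Editor's sketch of the threshold point made at [22:24:05] and the snapshot idea from [22:18:20]: the same damaging.true probability can count as a "positive" in one tool and a "negative" in another, so a snapshot would probably need to record the tool's cutoff alongside the raw score. Everything here is an assumption for illustration only: the ScoreSnapshot record, its field names, the tool names and the revision ID are hypothetical and are not part of JADE, ORES, Huggle or RecentChanges. The comparison uses >= for simplicity, although the chat says "above" for the 0.1 case.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ScoreSnapshot:
    """Hypothetical record of what a tool showed when a judgment was submitted."""
    rev_id: int
    damaging_true: float  # probability the model assigned to damaging=true
    tool: str             # hypothetical name of the tool the judgment came from
    threshold: float      # cutoff that tool used to flag an edit
    captured_at: str      # ISO 8601 timestamp of the snapshot

    @property
    def shown_as_positive(self) -> bool:
        # Assumption: a score at or above the tool's cutoff was presented as a "positive".
        return self.damaging_true >= self.threshold


def take_snapshot(rev_id, damaging_true, tool, threshold, now=None):
    """Build a snapshot, stamping it with the current UTC time unless one is given."""
    now = now or datetime.now(timezone.utc)
    return ScoreSnapshot(rev_id, damaging_true, tool, threshold, now.isoformat())


# Same revision and same score, two hypothetical tools with different cutoffs.
strict = take_snapshot(12345, 0.3, "tool_a", threshold=0.5)
lenient = take_snapshot(12345, 0.3, "tool_b", threshold=0.1)
print(strict.shown_as_positive, lenient.shown_as_positive)
```

Storing the cutoff with the score is one possible answer to the "standard format" question raised at [22:21:46]; it would not capture tool-internal heuristics such as the Huggle case mentioned at [22:21:08].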
[22:24:46] harej: +1 thanks! I'd like to start s/JADE/Jade/ though, to be less shouty
[22:24:59] It's fine to come back and do that on the whole page consistently.
[22:25:07] My engineer brain says no, but my design brain says yes to "Jade"
[22:25:20] give in to left brain
[22:25:28] "It's an acronym!" "It doesn't have to be"
[22:25:36] Wait. Wouldn't that be giving in to right-brain?
[22:25:46] +1 I had to check that
[22:25:46] Right brain = design brain. Left brain = German brain.
[22:26:03] stupid crossed spinal thing
[22:26:18] Corpus Mixup'm
[22:26:37] now awaying for real
[22:26:48] Check this out: https://www.mediawiki.org/wiki/Topic:Uow6mh6ihy3tfpcx
[22:26:53] Invisible work made visible!
[22:30:01] It's exhausting to read like this, still thinking the FAQ is a good home: * I hear you're working on a paid editor model? * (I don't know what to do with <2>) * What machine learning features do you calculate?
[22:30:17] I'm heading out too. I still haven't gotten to writing my conference reports. I'll hopefully get that done tomorrow.
[22:30:35] Oh definitely exhausting to read like this. But when we move it into the FAQ, we can do some copy-pasting now :D
[22:30:38] sorry I'm suggesting... more work
[22:30:51] cool, nice to see it though!
[22:31:08] Now to just get approval from like 50 more people :)
[22:31:21] This is just easier than trying to get them to post their questions on a wiki page.
[22:35:54] OK actually running away now. Have a good evening (or I think morning for Amir1 -- ha!)
[22:35:55] o/
[22:36:07] :D
[22:36:10] Have fun
[22:43:33] harej: great letter!
[22:43:43] Thanks! It took a while
[22:46:35] I'll try to write some change requests for the UI prototype, and see if I can't get started implementing myself.