[05:09:27] harej: When you say "Forbidden Four", was that referring to the biggest wikis? Unclear cos I don't understand whether this will help directly.
[05:09:30] It does unblock my schema concerns, though.
[05:09:55] yes, the ones with revision tables either above or uncomfortably close to 100 GB
[05:10:25] I think the slots thing won't affect table size, if anything it's a pinch heavier.
[05:12:07] Only because there might be more slot_content metadata for the two slots, and any gains from the total amount of data stored don't matter cos that's already scalable.
[05:16:25] The API makes sense too, we provide a revision query and generator which can return all rv_slots diff judgment, get_diff_judgments(rev_id) -> (revision metadata, wikitext content, json content)
[08:36:16] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/JADE] - https://gerrit.wikimedia.org/r/459940 (owner: L10n-bot)
[08:39:19] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/459947 (owner: L10n-bot)
[09:56:46] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel) >>! In T200297#4575766, @awight wrote: > I'm making some changes to the proposal, which I hope em...
[11:49:55] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (daniel)
[14:00:01] Woah. This transition towards wikitext and unstructuredness leaves me really uneasy.
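A rough sketch of the accessor mentioned at 05:16:25, get_diff_judgments(rev_id) -> (revision metadata, wikitext content, json content). Only the function name and the shape of the return value come from the log; the in-memory store, the slot names "main" and "judgment", the page title, and the field names are assumptions for illustration, not the JADE extension's actual API.

```python
import json
from typing import NamedTuple, Optional


class DiffJudgment(NamedTuple):
    revision: dict  # revision metadata (hypothetical fields)
    wikitext: str   # content of the freeform wikitext slot
    data: dict      # parsed content of the JSON slot


# Hypothetical in-memory stand-in for revision and slot storage.
FAKE_STORE = {
    123: {
        "revision": {"rev_id": 123, "page": "Judgment:Diff/123"},
        "slots": {
            "main": "This edit adds a spam link.",
            "judgment": '{"damaging": true}',
        },
    }
}


def get_diff_judgments(rev_id: int) -> Optional[DiffJudgment]:
    """Return (revision metadata, wikitext content, json content), or None."""
    row = FAKE_STORE.get(rev_id)
    if row is None:
        return None
    slots = row["slots"]
    return DiffJudgment(
        revision=row["revision"],
        wikitext=slots["main"],
        data=json.loads(slots["judgment"]),
    )
```

Returning the parsed JSON next to the raw wikitext keeps the two slots separate for the caller, which is the split the proposal relies on.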
[14:02:06] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @chiborg & @amir1 - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:09:59] Scoring-platform-team (Current), DBA, JADE, Operations, User-Joe: Write our anticipated "phase two" schemas and submit for review - https://phabricator.wikimedia.org/T202596 (Halfak) With the wikitext slot, we won't know which note relates to which judgement. This is like having one big "not...
[14:14:14] I wonder if ORES could score newly created usernames heh. I've tried it on my own with a couple different frameworks and utterly failed.
[14:15:59] Seems like it would be possible. One difficulty would be delimiting name-parts. A lot of names are uncapitalized concatenations. Like "foohats"
[14:16:14] Hard to see "foo" and "hats" without a native-speaking human.
[14:16:17] halfak: that's a lot of the trouble I've run into
[14:16:32] I broke things up by camelcase, and spaces / symbols / numbers
[14:16:36] but it's not really enough
[14:30:44] SQL, yeah. I could see that. What kind of username problem-classes were you looking at?
[14:30:54] (sorry been deep diving on a schema)
[14:30:58] o/ harej
[14:31:12] halfak: sneaky impersonation type stuff was what I was messing around with
[14:31:24] Oh! I see.
[14:34:32] Scoring-platform-team (Current), DBA, JADE, Operations, User-Joe: Write our anticipated "phase two" schemas and submit for review - https://phabricator.wikimedia.org/T202596 (Halfak) For clarity, here's a rough version of the endorsements proposal that I'd originally put together about a year...
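For reference, the username splitting described at 14:16:32 (camelcase, plus spaces / symbols / numbers) might look roughly like the sketch below. The function name and regex are illustrative assumptions, not SQL's actual code, and only ASCII letters are handled. It also shows the limitation raised at 14:15:59: an uncapitalized concatenation such as "foohats" comes back as a single part.

```python
import re
from typing import List

# An uppercase run not followed by a lowercase letter (an acronym), or an
# optionally capitalized lowercase word. Digits, spaces and symbols match
# neither alternative, so they act as separators.
_PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_username(name: str) -> List[str]:
    """Split a username into lowercase parts on case changes and non-letters."""
    return [m.group(0).lower() for m in _PART.finditer(name)]


print(split_username("FooHats_99bar"))  # camelcase, symbol and digits all split
print(split_username("foohats"))        # no boundary to find: stays one part
```

Getting "foo" and "hats" out of "foohats" would need a dictionary or a learned segmenter, which is beyond what this kind of rule-based split can do.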
[14:51:28] PROBLEM - ORES worker production on ores.wikimedia.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 6426 bytes in 5.262 second response time
[14:52:05] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: @chiborg & @amir1 - all questions welcome, more infos: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:52:29] RECOVERY - ORES worker production on ores.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 860 bytes in 0.311 second response time
[14:53:02] Looks like precaching just took a dump.
[14:53:11] https://grafana.wikimedia.org/dashboard/db/ores?refresh=1m&panelId=4&fullscreen&orgId=1
[15:05:57] Just found out via -services that this was probably due to the wiki-switch-over
[15:06:11] The wikis were read-only for a moment so no changes to precache!
[15:07:22] Scoring-platform-team, Bad-Words-Detection-System, revscoring, artificial-intelligence: Add language support for galician - https://phabricator.wikimedia.org/T201142 (Halfak)
[15:07:55] A visualization of the switchover.
[15:08:44] yep about 7 minutes
[15:09:18] On the one hand I'm terrified of the idea that no one can edit for seven minutes, but actually that's not much time at all.
[15:09:45] We interrupted about 4200 edits.
[15:09:54] And you can see that editing didn't pick up to normal levels right away.
[15:10:18] In fact, it's still at about 50%
[15:14:46] Amir1, are you doing SoS today?
[15:15:09] halfak: no, It's my off day today. Doing volunteer stuff
[15:15:22] (I said it in the staff meeting already)
[15:30:51] Oh. I have that on the calendar for tomorrow.
[15:30:57] Is Adam doing the SoS then?
[15:31:04] Amir1, ^
[15:31:19] halfak> Woah. This transition towards wikitext and unstructuredness leaves me really uneasy.
[15:31:32] o/ awight
[15:31:36] oh dear
[15:31:38] Are you doing SoS today?
[15:31:41] yep I could do it
[15:31:45] That's vacation, today is overtime compensation
[15:31:48] good timing ;-)
[15:33:12] I don't see anything to report
[15:35:00] Looks like maybe it wasn't covered on Monday. I've added a "to report" section to the Agenda.
[15:43:09] harej, it looks like I'm supposed to get you in contact with Theklan
[15:43:32] The best way I have is via Phabricator. It seems he responds quickly there.
[15:45:59] My favorite twist involving MCR is T204112. A good old fashioned surprise!
[15:46:00] T204112: Support slots other than the main slot in EditPage - https://phabricator.wikimedia.org/T204112
[15:47:24] awight, what's surprising here?
[15:48:03] The... entire task is a surprise. I was under the impression that at least the editing worked already, and we were only blocking on AbuseFilter etc.
[15:48:15] oh. ha.
[15:48:38] yeah. again I keep going back to tgr and others saying, "MCR maybe some day, but for now, don't count on it."
[15:48:41] I've been reading EditPage lately for other reasons, and half of the work is refactored into a class PageUpdater which is capable of writing to slots.
[15:53:07] Scoring-platform-team, Wikilabels, articlequality-modeling, artificial-intelligence: Build article quality model for Galician Wikipedia - https://phabricator.wikimedia.org/T201146 (Halfak) I can get the GA and FA articles for model-training with this query: https://quarry.wmflabs.org/query/29680...
[16:01:22] I'm in the hangout but no one is there
[16:01:26] ...Spoke too soon
[16:02:55] urgh
[16:39:57] Scoring-platform-team, articlequality-modeling, artificial-intelligence: Update monthly article quality datasets - https://phabricator.wikimedia.org/T203468 (Halfak) a:Halfak
[16:40:20] Scoring-platform-team (Current), articlequality-modeling, artificial-intelligence: Update monthly article quality datasets - https://phabricator.wikimedia.org/T203468 (Halfak)
[16:42:12] Scoring-platform-team, articlequality-modeling, artificial-intelligence: Update monthly article quality datasets - https://phabricator.wikimedia.org/T203468 (Halfak)
[16:48:12] Scoring-platform-team (Current), ORES: Test poolcounter support for ores in beta cluster - https://phabricator.wikimedia.org/T201825 (Halfak) Seems like this happened. Is it done?
[16:48:17] Scoring-platform-team, Wikilabels, articlequality-modeling, artificial-intelligence: Build article quality model for Galician Wikipedia - https://phabricator.wikimedia.org/T201146 (Theklan) We don't have a translation for the b and c grades, but we have rules for FA, good and stub quality. I thin...
[16:49:29] Scoring-platform-team (Current), Documentation: Workshop proposal for CSCW (JADE, ORES, etc.) - https://phabricator.wikimedia.org/T204134 (Halfak)
[16:49:57] Scoring-platform-team (Current), JADE: JADE literature review - https://phabricator.wikimedia.org/T201887 (Halfak) @awight said he had a bunch more stuff to paste in.
[16:50:13] Scoring-platform-team, ORES, Upstream: Make ORES dependency solving upstreamable - https://phabricator.wikimedia.org/T201657 (Halfak)
[17:00:58] Scoring-platform-team (Current), DBA, JADE, Operations, User-Joe: Write our anticipated "phase two" schemas and submit for review - https://phabricator.wikimedia.org/T202596 (awight) >>! In T202596#4577106, @Halfak wrote: > With the wikitext slot, we won't know which note relates to which jud...
[17:33:51] relocating, back in 10-15min
[17:55:27] (back)
[18:03:34] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) >>! In T200297#4576584, @daniel wrote: > Note that it's blocked on {T204112}. That's not particul...
[18:11:10] Scoring-platform-team (Current), DBA, JADE, Operations, TechCom-RFC: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297 (awight) Another change which I'll document here is that I'm dropping the use cases for "write-only" workf...
[18:12:54] halfak: Do you feel like going into your reservations about yesterday's content schema proposal, or prefer to do that in our meeting?
[18:21:35] halfak: Also, I'd like to use our meeting for just the constructive stuff and I'm pretty concerned given the agenda so far that I'm going to be put in a defensive position about the premise of design iteration per se. I'd like to get that aired ahead of time.
[18:25:25] What I see happening is that we normally have a project cycle that looks like * design * development * deployment * feedback, then it loops back into design.
[18:26:07] In this case, we've been blocked from deploying for months so I'm left doing these iterations while starved for feedback.
[18:26:32] And the loop is simply * design, * development.
[18:27:19] It's been fairly productive IMO because I've been able to catch a lot of things that would be much harder to correct after deployment.
[18:27:54] Hey, so yeah, the point of the meeting is to break free from the re-design loop. I think there needs to be discussion about the issues that have been "caught" and the use-cases that are being side-lined.
[18:28:31] I want us on the same page about the key use cases and what makes for "better" from an objective point of view.
[18:28:38] The only ways to break the loop are either to pause the project or to deploy it, IMO.
[18:29:06] Unless I'm missing something?
[18:29:08] I think pausing the content schema design is a good idea. There's other work to do.
[18:29:31] There's no sense in iterating something where we don't have feedback. Imagining feedback doesn't get us very far.
[18:29:36] What you're suggesting ^ is just having some group feedback into the design cycle, which is great.
[18:31:56] And I'm not sure what you imagine with pausing the content schema design, we're currently blocked by it. I need another few days to unblock us, then we can pause.
[18:32:43] What is blocking us?
[18:34:13] We're blocked by TechCom keeping our RFC open and asking for a more detailed pilot implementation. In order to do that, I need to be able to write the correct secondary schema and API. To do that, we need to finalize an initial content schema.
[18:34:46] The previous iteration of the schema is almost good enough, but had a few glitches which we've been discussing.
[18:35:06] Right. Finalizing the initial content schema is what I'd like to do.
[18:35:25] Also, we've dropped a major use case I've been designing for, which left me free to make what I consider to be big improvements to the content schema.
[18:35:36] What use-case is that?
[18:35:42] write-only judgments
[18:36:22] It was the main target for our initial pilot, and I decided I don't like the idea at all, it encourages multiple, unranked judgments.
[18:36:41] Right. I don't remember that from the initial set of use-cases. It seems like that one was added after I backed off.
[18:37:00] We've all been saying that I'm finally taking the schema back to something very close to your suggestion a year ago, yes.
[18:39:24] Also for the record, we've all had reservations about the designs so far, and I think I've finally reached something that does what we intended originally. I don't see any holes in it so far, so looking forward to trial by fire ;-)
[18:39:46] to mix metaphors with a rusty knife.
[18:40:08] awight, it shouldn't be a trial. It should be a discussion about what we're doing. I think you've gotten far ahead of us and we're trying to reel you back in.
[18:40:23] Or maybe drag ourselves forward.
[18:41:06] Much appreciated! I'd love to have us hammer on my current proposal, but mentioned here because I * want to have the correct materials ready for review and * very much want to have process discussions separately from the content discussion.
[18:41:21] IMO the latter is a recipe for disaster, at least in my experience.
[18:42:08] I want to be open and responsive about the content schema questions, and don't want to have to defend the process to date in that same convo.
[18:42:13] Any discussion of why you want to move in the direction of freeform anything would be helpful. It seems that is the most unexpected proposal I have seen.
[18:42:28] inorite!
[18:43:21] Also, for what it is worth, this meeting isn't about the past. It's about setting a good direction for the future.
[18:43:41] So, the main slot has two functions: it integrates all freeform content into a single narrative, which has been an underlying thread of all the judgment.notes concerns to date. It also creates a generative, blank slate for the editors to get ahead of us with their needs.
[18:46:37] Editors already have blank slates.
[18:46:45] They are called wiki pages.
[18:47:02] exactly :-)
[18:47:04] I don't understand your concern about judgement.notes.
[18:47:30] One concern is the editquality thing we discovered. notes don't correspond well to single judgment schemas.
[18:47:42] Notes do correspond well to all judgments on an entity
[18:48:03] I think you had a good counterexample, draftquality maybe?
[18:48:47] That was a case where conclusions about damaging and goodfaith aren't an exact match for conclusions about draftquality, although "spam" is an interesting case.
[18:50:19] awight, whole entity notes don't make sense across edittype and editquality.
[18:50:38] draftquality is a revision-level entity
[18:51:00] ah hmm how does it overlap with articlequality, then?
[18:51:22] So, I'm thinking that the wikitext slot is going to be a pure narrative with no structural requirements.
[18:51:56] "This edit is fishy, it promotes some product which the editor may have CoI in. We've agreed that it's damaging and should be reverted."
[18:52:14] The JSON will have "damaging: true", and some endorsements.
[18:53:52] Editors are of course free to add multiple paragraphs in their narrative to discuss different topics.
[18:54:14] I really like that the wikitext is freed of structure, and the structured data is freed of wikitext...
[18:55:35] awight, I don't imagine anyone will be writing summaries to in-depth.
[18:55:42] *that in-depth
[18:56:41] For a simple case, that's fine, the justification or notes can even be lacking.
[18:56:48] See the examples of notes people leave in false-positive reports. https://it.wikipedia.org/wiki/Progetto:Patrolling/ORES
[18:57:18] For the simplest notes, I imagine a proposition like "This edit is damaging because it adds a spam link."
[18:57:33] For discussions about logged actions, we might want the blank slate.
[18:58:08] yeah those are a great example of the simplest wiki page, and also demonstrate how we want to support full wikitext including templates.
[18:59:28] I disagree on the use of template. I don't see how this is necessary for what we see here.
[18:59:39] But if it is easy, then sure.
[18:59:46] Why don't we support templates in edit comments?
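To make the two-slot split from 18:51:22 to 18:54:14 concrete, here is a minimal sketch of what one judgment page could carry: a freeform wikitext narrative in one slot and structured JSON ("damaging: true" plus endorsements) in the other. The slot names, the "endorsements" and "user" keys, and the helper itself are assumptions made for illustration; the log does not fix the real schema.

```python
import json
from typing import Dict, List


def build_judgment_slots(narrative: str, damaging: bool,
                         endorsers: List[str]) -> Dict[str, str]:
    """Build the two slot bodies for one judgment page (hypothetical layout)."""
    data = {
        "damaging": damaging,
        # One entry per endorsing user; the key names are assumed.
        "endorsements": [{"user": name} for name in endorsers],
    }
    return {
        "main": narrative,  # wikitext slot: pure narrative, no structure
        "judgment": json.dumps(data, sort_keys=True),  # JSON slot: no wikitext
    }


slots = build_judgment_slots(
    "This edit is fishy, it promotes some product which the editor may "
    "have CoI in. We've agreed that it's damaging and should be reverted.",
    damaging=True,
    endorsers=["ExampleUser"],
)
print(slots["judgment"])
```

Nothing in the JSON slot points back into the narrative, which is the trade-off debated above: the structured data stays clean, but a note cannot be tied to one specific judgment.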
[19:00:42] Good question, probably because they can expand into a million bytes.
[19:01:04] also, parser functions will take a dump in contexts other than a wiki page.
[19:05:35] relocating again
[19:31:27] (CR) Catrope: [C: -1] Expose basic ORES API configs to frontend (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/459549 (https://phabricator.wikimedia.org/T201691) (owner: Ladsgroup)
[19:32:59] Amir1: Re https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ORES/+/459549 , some context you might not have: ResourceLoader lets you define custom modules with programmatically generated output, and some extensions use that for exporting config data
[19:33:24] Not everything does that, and we use the global config blob way too much, but I'd like us to start reducing our reliance on it
[19:33:44] Krinkle (not in this channel) might have thoughts on that too, so I also CCed him on that change
[19:34:23] RoanKattouw: oh thanks
[19:34:29] Will check
[19:35:22] This might escalate into a higher-level/longer-term conversation though, if that happens I'll merge your patch and we can fix it later
[19:35:45] Cool. Let's see
[19:36:13] RoanKattouw: I also have this one for you in case you missed it https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/459688/
[19:36:29] Thank you for merging the rest
[19:37:01] Will try to enable it again in beta cluster tomorrow
[19:42:36] (CR) Krinkle: Expose basic ORES API configs to frontend (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/459549 (https://phabricator.wikimedia.org/T201691) (owner: Ladsgroup)
[19:44:46] harej: Not sure this is helpful, but I'm copying the schema proposal into a sheet for our meeting. https://docs.google.com/spreadsheets/d/1crkl1UuxFGBC4GH2yP9w9wZFRReTd6P1P3HI7SBAT1g/edit#gid=0
[19:47:09] Amir1: Just reviewed that one, happy to merge it but I'd first like your thoughts re my comment on caching
[19:47:38] If your reaction is "oh yeah duh I'll do that right now", then let's do that before I merge, if you disagree I'll probably still merge it
[19:49:36] RoanKattouw: let's do it
[20:00:56] new patch is up, let's see if jenkins is happy
[20:34:48] Scoring-platform-team, Wikilabels, articlequality-modeling, artificial-intelligence: Build article quality model for Galician Wikipedia - https://phabricator.wikimedia.org/T201146 (Halfak) How would you say "Assess article quality" in Galician? I'll use that to name the "labeling campaign" in Wi...
[20:40:22] Scoring-platform-team (Current), DBA, JADE, Operations, User-Joe: Write our anticipated "phase two" schemas and submit for review - https://phabricator.wikimedia.org/T202596 (Halfak) > The main thing driving us to that conclusion was that a "notes" field should be shared between damaging and...
[20:47:17] Just met with Danny Horn. I figured out how to incorporate his feedback but still make our point.
[20:47:29] I'll make those edits tomorrow and then we can kick the JADE blog post back to comms.
[20:47:53] What was his issue with the post?
[20:52:26] (CR) Sbisson: [C: 2] Rename wp10 to articlequality [extensions/ORES] - https://gerrit.wikimedia.org/r/456129 (https://phabricator.wikimedia.org/T203080) (owner: Ladsgroup)
[20:53:08] Will explain later. Gotta change locations. Talk to you in 30
[20:59:12] (Merged) jenkins-bot: Rename wp10 to articlequality [extensions/ORES] - https://gerrit.wikimedia.org/r/456129 (https://phabricator.wikimedia.org/T203080) (owner: Ladsgroup)
[21:03:07] (CR) jenkins-bot: Rename wp10 to articlequality [extensions/ORES] - https://gerrit.wikimedia.org/r/456129 (https://phabricator.wikimedia.org/T203080) (owner: Ladsgroup)
[22:26:30] https://docs.google.com/spreadsheets/d/1IxXtym0ELTANl40zzK8zix_HQ-uVPgP-uNOb_8gVJNY/edit#gid=0
[22:26:55] harej: Schema proposal ^
[22:27:39] (CR) Krinkle: [C: 2] Minor cleanup (1 comment) [extensions/JADE] - https://gerrit.wikimedia.org/r/458931 (owner: Reedy)
[22:27:42] unlocked.
[22:33:19] (Merged) jenkins-bot: Minor cleanup [extensions/JADE] - https://gerrit.wikimedia.org/r/458931 (owner: Reedy)
[22:37:15] (CR) jenkins-bot: Minor cleanup [extensions/JADE] - https://gerrit.wikimedia.org/r/458931 (owner: Reedy)
[22:43:45] harej: halAFK: ^ Interesting fact, I tried to make the columns in my spreadsheet an example of the human narrative schema I'm proposing.
[22:46:17] Now I see halAFK's point that the schema can usefully delineate between the notes for each judgment.data value...
[22:47:52] but still think that consensus opinion on why a diff is of ambiguous faith is a perfectly appropriate section heading for the human-readable synopsis.
[22:48:37] That leads me to the conclusion that these headings should be parallel and stored in a consistent format, therefore fully freeform sections.
[22:56:01] halAFK: Writing it down, I see that a *lot* has happened in this latest proposed change.