[04:52:10] Scoring-platform-team, MediaWiki-Patrolling, MediaWiki-extensions-ORES, MediaWiki-extensions-Patroller, and 2 others: Enable RC patrolling on trwiki - https://phabricator.wikimedia.org/T140475#3954804 (Liuxinyu970226) [11:20:49] Scoring-platform-team, ORES, WMSE-Development-Support-2018, editquality-modeling, and 3 others: Enable ORES filters for svwiki - https://phabricator.wikimedia.org/T174560#3955198 (Sebastian_Berlin-WMSE) What's the status on this? As I understand it, ORES is ready to be activated. Is the next step... [11:36:48] Scoring-platform-team, ORES, editquality-modeling, Collaboration-Team-Triage (Collab-Team-This-Quarter), and 3 others: Enable ORES filters for svwiki - https://phabricator.wikimedia.org/T174560#3955248 (Sebastian_Berlin-WMSE) [15:06:58] awight, joining us for the staff meeting? [15:07:13] Spaced out but currently waiting for the link [15:08:03] (PS2) Sbisson: build: Update linters [extensions/ORES] - https://gerrit.wikimedia.org/r/408207 (owner: Esanders) [15:34:30] (CR) Sbisson: [C: 2] build: Update linters [extensions/ORES] - https://gerrit.wikimedia.org/r/408207 (owner: Esanders) [15:37:18] (Merged) jenkins-bot: build: Update linters [extensions/ORES] - https://gerrit.wikimedia.org/r/408207 (owner: Esanders) [15:39:54] (CR) jenkins-bot: build: Update linters [extensions/ORES] - https://gerrit.wikimedia.org/r/408207 (owner: Esanders) [15:53:33] Amir1: I can’t remember if you saw my question: do you know anything about an “ores” user for this IRC channel, which seems to be password protected? [15:53:51] I’ve had no luck setting up the GitHub IRC relay for the jade repo. [15:54:15] halfak: Something fun, https://github.com/wiki-ai/jade/pull/9 [15:54:36] ^ ooooh [15:59:42] I can cover that last line, too... [16:04:57] got it.
[16:20:38] brb forgot to eat breakfast [16:27:48] Scoring-platform-team, ORES, Operations, Patch-For-Review: rack/setup/install ores2001-2009 - https://phabricator.wikimedia.org/T165170#3955995 (akosiaris) [16:28:09] Scoring-platform-team, ORES, Operations, Patch-For-Review: rack/setup/install ores2001-2009 - https://phabricator.wikimedia.org/T165170#3258942 (akosiaris) Open>Resolved This is finally done, resolving [16:33:18] halfak: I saw some notes about JSON web tokens. Is that still the suggestion for how we proceed? [16:42:09] awight, I haven't confirmed with security folks, but I think it's a great option. [16:42:23] cool, thanks for the confirmation [16:42:39] awight, BTW, I asked Amir1 to write up his thoughts on going full MW for JADE. [16:42:51] might not get that until he's back from vacation. [16:43:23] I should… stop then [16:43:23] I think that we should do a comparison of full-MW vs. MW/Event-Server and make a call about what we can do based on our capacities. [16:43:47] awight, I don't think it's a waste to continue this work because it's part of the analysis. [16:44:03] But yeah, I'd prioritize the hard bits that we're not sure how to do yet. [16:44:39] Amir1 and I chatted yesterday and we agreed on: https://etherpad.wikimedia.org/p/JADE_extension [16:44:48] E.g. how will we emit events? How will we handle idempotence? [16:44:49] One agreement isn’t reflected, lemme add it [16:45:00] good call. [16:45:54] If "We can rely on MW's built-in JSON editor.", then we'll need to re-write our notion of a "judgement" [16:46:07] Ah one thing we’re both worried about is how we’re potentially adding a ~ recursive patroller load [16:46:08] e.g. someone can change multiple judgements in a single action. [16:46:22] There’s only one judgment per MW page [16:46:35] recursive patroller load is good IMO. For too long, we have not had meta-moderation. [16:46:42] Or am I misunderstanding. [16:46:55] you got it right.
[16:47:00] "damaging" and "goodfaith" are separate judgements? [16:47:10] As "edit type" is different from "damaging". [16:47:22] interesting, good point [16:47:31] Or maybe "edit type", "damaging" and "goodfaith" are all part of a judgement-group. [16:47:38] entity-judgement [16:48:02] That’s how I had been thinking about it, yeah. Not sure we ever got to 100% certainty on that though [16:49:10] About the meta-moderation thing, it’s probably fine as long as judgments are being added manually, but what about the bots that will surely join the game? [16:49:34] That’s potentially 2 or more judgments per wiki revision... [16:49:52] tripling patroller load, in that scenario [16:52:31] awight, seems like a regular wiki-problem to me. [16:52:40] Those bots would need to go through a BAG equivalent. [16:52:45] Hmm, our schema here is 1:1 between judgment and score_schema, but here it’s a link table allowing multiple schemas per judgment, https://github.com/wiki-ai/jade/blob/master/schema.sql#L107 [16:52:52] And based on their activities, they can be mass reverted if necessary. [16:52:56] halfak: Very nice. Offloaded! [16:53:15] I’ll note in the etherpad [16:53:22] awight, wikibase has entry points for updating a specific value that is part of a larger item. E.g. "Set the value for P21 to Q243". [16:53:29] We could have something similar. [16:53:46] "Set the value for 'damaging' to False" [16:55:29] Probably not necessary, even a full entity-judgment structure containing all possible schemas would be small enough to export to a review tool, and send back modified... [16:56:37] The only scoring schema that looks expensive is e.g. draft topic, but for that we’ll certainly want to see the set of assigned topics before modifying, eh? [16:57:29] Well… it’s not too hard to imagine wanting to assign drafttopic += ProjectName… [16:57:39] still seems like an optimization for later, though. [16:59:23] Oh.
I think we should stick to 1 judgment per page to keep the rev_id <-> judgment mapping 1:1. [16:59:52] I’m still inclined to say that judgments contain multiple schema_scores. [17:00:25] Can you think of any reason we would want to have a judgment 1:1 with schemas? [17:01:19] It’ll be very common to say “{damaging: true, goodfaith: false}”, which feels like it should be one create_judgment… [17:07:56] Scoring-platform-team, ORES: Switch ORES to dedicated cluster - https://phabricator.wikimedia.org/T168073#3956112 (Halfak) [17:07:58] Scoring-platform-team (Current), ORES, Patch-For-Review: Make sure ORES is compatible with stretch - https://phabricator.wikimedia.org/T182799#3956108 (Halfak) Open>Resolved ORES is currently deployed in cloud VPS on Stretch machines. This is not an issue of ORES compatibility but rather a m... [17:12:15] wiki-ai/jade#1 (travis - e36ea14 : Adam Wight): The build failed. https://travis-ci.org/wiki-ai/jade/builds/339081303 [17:14:10] awight, we should plan for future complicated judgement schemas [17:15:11] awight, when it comes to what we store in MediaWiki, I'm less concerned with what is in a page. [17:15:20] And more concerned with how people query and curate it. [17:15:31] In the case of wikibase, people curate and query at the statement level [17:15:40] For the most part, anyway. [17:16:00] The update_judgment_details call seems like an alias for update_judgment, I think we would support it with a new event type. [17:16:42] As for how the data is stored… I’m not seeing issues with either choice, 1:1 or 1:many, judgment:schemas [17:17:20] A 1:1 design would just mean slightly more complicated inserts and updates [17:17:45] Two create_judgment calls to add multiple schema_scores [17:17:59] awight, right. And I think that would be fine. [17:18:09] Especially because most bots and humans already do that in Wikidata. [17:18:32] E.g. you can change multiple aliases, labels, and descriptions before hitting save.
But the UI will make multiple edits in quick succession. [17:18:36] halfak or awight: can you quickly +1 https://gerrit.wikimedia.org/r/#/c/409079/ ? [17:18:44] Even though there is still an option in the API to make one big edit. [17:18:49] Even if we decide to make a Jade: wiki page include multiple score_schemas from different judgments, when you edit the raw JSON, we emit multiple update_judgment events for each section. [17:18:55] * halfak clicks on tgr's change [17:19:23] https://meta.wikimedia.org/w/index.php?title=Schema:ServerSideAccountCreation&diff=17719237&oldid=17706338 is the relevant schema change [17:19:59] tgr, why is this necessary at all? [17:20:04] Just reference the old revision ID [17:20:19] https://meta.wikimedia.org/w/index.php?title=Schema:ServerSideAccountCreation&oldid=5487345 [17:20:22] the schema has been updated since [17:20:34] Oh I see. You want to keep one change but not another? [17:20:41] yeah [17:20:42] OK [17:21:11] lgtm [17:21:26] thx [17:21:36] You don’t need +2? [17:22:29] I just +2'd [17:22:34] uh yeah I do :) [17:22:43] lol [17:22:51] got used to SWATting config changes only [17:23:42] Careful of the PTSD [17:23:59] halfak: Anything else you’d like to see on the jade#centralauth PR? [17:25:02] awight, nothing that came to mind in the scan. It doesn't do everything but it does most things :) [17:25:09] Looks like the CI PR will be ready to go after a rebase. [17:36:22] wiki-ai/jade#6 (travis - d24db76 : Adam Wight): The build was fixed. https://travis-ci.org/wiki-ai/jade/builds/339091497 [17:37:40] halfak: free? [17:38:04] in meeting [17:58:06] halfak: Just ran into a fun detail. There’s nothing like a Kafka replica in Cloud VPS. [17:58:23] I suppose we don’t need any of the MW integration for the MVP [17:58:44] But if we do want to test that part of it, the simplest course is probably to create our box within deployment-prep. [18:00:27] Oh yes. Right. Damn it. 
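Editor's sketch of the 17:18:49 idea, where editing the raw JSON of a judgment page fans out into one event per changed schema. This is only an illustration in Python under stated assumptions: the function name `events_for_edit`, the flat `{schema: value}` page layout, and the `delete_judgment` event type are hypothetical; only `create_judgment` and `update_judgment` are named in the discussion.

```python
_MISSING = object()  # sentinel, so a stored value of None is not read as "absent"


def events_for_edit(old_page, new_page):
    """Return one event per schema whose value differs between two page revisions.

    Both arguments are hypothetical flat dicts such as {"damaging": True}.
    """
    events = []
    for schema in sorted(set(old_page) | set(new_page)):
        before = old_page.get(schema, _MISSING)
        after = new_page.get(schema, _MISSING)
        if before == after:
            continue  # untouched schema: no event
        if before is _MISSING:
            events.append({"type": "create_judgment", "schema": schema, "value": after})
        elif after is _MISSING:
            # "delete_judgment" is an assumed name, not one used in the chat
            events.append({"type": "delete_judgment", "schema": schema})
        else:
            events.append({"type": "update_judgment", "schema": schema, "value": after})
    return events


old_page = {"damaging": True, "goodfaith": False}
new_page = {"damaging": False, "goodfaith": False, "edit_type": "copyedit"}
events = events_for_edit(old_page, new_page)
```

With this shape a single save still produces a separate, individually revertable event for each schema, which is close to how the Wikidata UI is described above as making several small edits in quick succession.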
[18:01:28] Just as well… [18:01:36] Makes us look more serious :) [18:05:53] On the other hand, we’re welcome to run our box without events, to stick with the original MVP plan. [18:06:18] My only reservation is that the event part is a huge unknown and… sort of a neat challenge. [18:07:28] Agreed. I think that is our biggest unknown right now :| [18:08:36] yeah. An API that reads and writes to a local database is a very small question mark… [18:08:45] How about… I focus on the internal events. [18:09:18] e.g. the sync-read vs async-write thing we were chatting about. [18:09:31] ^ say more about what. [18:09:40] I don't remember those terms [18:09:47] ah, there’s a bit about it here, https://etherpad.wikimedia.org/p/JADE_API_changes [18:10:24] Oh yeah. I still don't buy into the async write pattern. [18:10:29] Shall we talk more about that? [18:10:29] What I mean is that, my current idea is for GET APIs to make synchronous DB calls, but a PUT call will shoot off an event [18:10:32] sure [18:10:51] What's the advantage of shooting off the event on a PUT/POST? [18:11:44] There are two arguments for it: * Our most truthy source of truth is supposed to be an event stream, so that’s the outcome we need to guarantee in a write-API call, and * I haven’t found any other way around the distributed transaction. [18:12:18] With a guaranteed event, we have “eventual consistency” of PostgreSQL and MediaWiki [18:12:32] I agree on point 1 but I don't see why it is more truthy if it is async. [18:12:48] Oh yeah... distributed transation. [18:13:01] Creating the event is synchronous in this scenario, so that’s the result the API returns: “yes, we made it happen, here’s your event receipt” [18:13:07] *transaction. It's weird to have a source of "truth" that can contain things that are invalid. [18:13:19] It would never be invalid, though [18:13:23] You mean, if pgsql fails? [18:13:29] But I thought we discussed that.
[18:13:31] then we have a db failure and the event is still good [18:13:33] E.g. deleting something that doesn't exist. [18:13:46] Oh, yeah I thought we were chalking that up to idempotency [18:13:53] But that's not relevant. [18:14:03] Because it's not the same action as deleting something that did exist [18:14:11] Otherwise, there’s no way to cleanly deal with race conditions. [18:14:16] Even if we got duplicate deletions of the thing that did exist. [18:14:27] Sync deals with race conditions. [18:14:30] That’s true, but suppression vs. never-existed isn’t something the client is supposed to know about. [18:14:46] awight, well, a client (event consumer) will know [18:15:32] they’re sworn to secrecy ;-). but I see what you mean. We can certainly do a sync get on the db to sort-of-check validity [18:18:27] I like that idea. My drawing needs more tweaks to reflect it. [18:18:46] Oh actually, it works as-is. [18:18:56] awight, in this case, the check on user-rights is really funny. [18:19:04] why? [18:19:16] We do sync calls to the db and user services [18:19:22] If we're going to have JADE actions show up in MW then it would be a problem if, by the time they get to MW, the user no longer has the rights. [18:19:52] atomic * blast ;-) [18:20:20] That’s one thing that an MW-only platform would solve. But super edgy. [18:22:56] Right. This is one of the reasons I've been pushing for an assessment of MW-only. [18:23:12] I think we're gonna get bitten if we don't do MW-only or MW-primary. [18:23:13] :\ [18:23:38] Either way MW-only/primary is gross and cuts deep for some of the cool things we want to do. [18:24:05] The user rights edge case can only happen once per (user, wiki)… [18:24:18] Here’s current Kafka performance, https://grafana.wikimedia.org/dashboard/db/eventbus?from=now-6h&to=now-5m&refresh=1m&orgId=1 [18:24:23] It does show a big lag if the consumers are lazy...
[18:24:45] events go into Kafka in mean 10ms [18:25:07] Most backlogs are around 1s [18:25:36] For user rights change, we’re not doing any processing, I’m imagining we’re going to hear about the change <<1s [18:26:54] awight, what if we have an issue that blocks the MW-consumer for a couple of hours? [18:27:03] hehe then we have our pants down. [18:27:13] heh. [18:27:34] Oh so for patrollers, the MW-only/primary use case would be OK [18:27:46] For ORES' fast-cache, it would be the same [18:28:30] In the first race condition you’re talking about, the create_judgment gets passed to MW and a user was blocked 1s previously. When our extension API receives the event, it tries to edit as that user and is blocked. So we respond with a “suppress_judgment” event perhaps, or something more specific. [18:28:41] It doesn’t seem that anything is corrupted. [18:29:15] We can guarantee that we’ve responded to the event, so any third-party consumers will see both the create and its suppression. [18:30:11] It’ll look similar if a user spams judgments, then is blocked 1day later and rogue edits suppressed. [18:36:06] Wondering how we should get consensus on the events… [18:37:09] I think I should prototype that shit. [18:37:39] Hopefully we’ll be able to see design flaws better at that point? Shouldn’t take tooo long. [18:42:19] btb [18:42:22] *r [18:45:23] One thing I don’t like about event-first is having multiple IDs for a judgment. UUID, PGSQL_ID, and MW_REV_ID [18:53:06] awight, that "last event invalid event" honestly sounds like a mess [18:53:28] we definitely wouldn't want to follow up with "suppress" because there was no one performing a suppression. [18:54:14] I'm not sure what you mean by "event-first" [18:54:38] just that the API returns after sending an event [18:54:56] Ahh yeah. [18:55:06] So in that case, there will be an event with just a UUID. [18:55:19] And later we'd need to figure out how to match that with a rev_id [18:56:06] Yup. 
The link is pushed into postgres by a sync call to the Extension:JADE API, which returns the rev_id [18:56:27] awight, how would a consumer get that? [18:56:31] :( [18:57:13] They would either make calls like get_judgment(uuid=UUID), or would need to follow the link with get_judgment_id(uuid=UUID) -> ID [18:57:35] In any case, a feisty consumer could make those calls before we know the ID [18:59:22] yeah... ewww. [18:59:25] Hmm [18:59:29] I'm gonna break for lunch [18:59:32] back in ~ an hour [18:59:49] o/ [19:09:07] I have a way out of this. [19:09:38] relocating. [19:50:38] o/ awight [19:50:45] hey [19:50:52] What's the way out? [19:50:55] Trying to piece it together, https://docs.google.com/spreadsheets/d/1DKkEhyE-uh5iZNd72ihhzupgsID0rR-s9EDPhemLkwg/edit#gid=0 [19:51:13] * halfak requests access [19:51:15] My thought is that we can pull the sequence ID up front, [19:51:23] stupid defaults. [19:51:45] fixed. [19:52:04] The sequence ID, sure, but not the rev_id [19:52:06] Pulling the sequence ID is pretty much the same as using a UUID, but prettier. [19:52:20] Nothing we can do about the rev_id. That’s distributed transaction hell. [19:52:33] If you’re talking about making sync calls to the pgsql DB and to MW? [19:53:08] So you’re thinking, client want to create a judgment, get the rev_id back, and immediately do something with that? [19:53:17] I don’t see much use for the rev_id, off-hand. [19:53:50] If someone wants to suppress something, it's rev_id-based. [19:54:13] suppression can only be done in MW [19:54:19] And MW only knows about the rev_id [19:54:25] And the original author wouldn’t want to suppress, right? [19:54:41] So… an event listener might want to suppress, yes [19:54:44] Sure. Might want to. [19:54:47] And the rev_id won’t be in the event. [19:56:05] Sure. [19:57:16] I guess it would take two calls: get_judgment(NON_MW_ID) [19:57:17] then do a MW thing. 
[19:57:46] There would be some non-zero amount of time in which the get_judgment call would return a structure lacking rev_id [19:59:37] awight, other than the freedom of event design, I'm starting to lose the thread on why we're valuing the external service. I feel like we still haven't really addressed what more we can do with it. [19:59:55] I think that originally, we hoped to have loose integrations in MW and have the freedom to do whatever in JADE. [20:00:25] It turns out that we needed very tight integrations in MW and that constrains what JADE can do so much that I'm losing grasp of what the advantages of 3rd party JADE are. [20:01:02] I can recap it. [20:01:10] But these edge cases really don’t seem like deal-breakers. [20:01:41] I guess I'm losing motivation to deal with edge cases (and worried about the unknown-unknowns) [20:03:24] * halfak goes to meeting :( [20:03:26] The external service can * return results quickly, and integrate with ORES results. [20:03:39] It can * be used in environments other than MediaWiki [20:05:18] * we can develop much more quickly, * use libraries that do things, and * change the schema without weeks of pain [20:06:52] As a general programming thing, we want to decouple from MediaWiki except where absolutely necessary. Our mw-ext-JADE concept does that, it’s very lightweight, and could be replaced by any other external service. [20:07:14] If we write this in MW, nobody else will ever use our service. [20:11:44] The event coupling to MW is unfortunate, and only necessary in multi-master scenarios, so third-party installations of JADE can probably just omit that whole thing. [20:12:29] But the fact that we’re providing it is cool for potential major users, who would have their own suppression workflow. [20:12:39] Have heart!!
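Editor's sketch of the event-first write path debated above: the write call returns as soon as the event exists, reads are synchronous lookups in a projection, and the MediaWiki rev_id arrives later as a second event. Everything here is an in-memory stand-in written in Python for illustration. The class `EventFirstJade`, the `judgment_linked` event type, and the field names `target_rev_id` and `mw_rev_id` are assumptions, not JADE's API; the list and dict merely stand in for Kafka and PostgreSQL.

```python
import uuid


class EventFirstJade:
    """Toy event-first service: writes append events, reads hit a projection."""

    def __init__(self):
        self.event_log = []    # stand-in for the Kafka stream (source of truth)
        self.read_store = {}   # stand-in for the PostgreSQL read model
        self.applied = set()   # ids of events already projected (idempotence)

    def _emit(self, event_type, judgment_id, **payload):
        event = {"event_id": str(uuid.uuid4()), "type": event_type,
                 "judgment_id": judgment_id, **payload}
        self.event_log.append(event)

    def create_judgment(self, target_rev_id, scores):
        """Write path: emit the event synchronously, return its receipt."""
        judgment_id = str(uuid.uuid4())
        self._emit("create_judgment", judgment_id,
                   target_rev_id=target_rev_id, scores=scores)
        return judgment_id

    def record_mw_revision(self, judgment_id, mw_rev_id):
        """Called once MediaWiki has saved the page and reported its rev_id."""
        self._emit("judgment_linked", judgment_id, mw_rev_id=mw_rev_id)

    def project(self):
        """Consumer: apply unseen events to the read store; safe to re-run."""
        for event in self.event_log:
            if event["event_id"] in self.applied:
                continue
            if event["type"] == "create_judgment":
                self.read_store[event["judgment_id"]] = {
                    "target_rev_id": event["target_rev_id"],
                    "scores": event["scores"],
                    "mw_rev_id": None,  # unknown until MediaWiki answers
                }
            elif event["type"] == "judgment_linked":
                self.read_store[event["judgment_id"]]["mw_rev_id"] = event["mw_rev_id"]
            self.applied.add(event["event_id"])

    def get_judgment(self, judgment_id):
        """Read path: synchronous lookup in the projection."""
        return self.read_store.get(judgment_id)


jade = EventFirstJade()
jid = jade.create_judgment(target_rev_id=123, scores={"damaging": True})
jade.project()
early_rev = jade.get_judgment(jid)["mw_rev_id"]  # the window with no rev_id
jade.record_mw_revision(jid, mw_rev_id=456)
jade.project()
jade.project()  # replaying the log changes nothing
late = jade.get_judgment(jid)
```

The `early_rev` lookup is the "feisty consumer" case from 18:57:35: the judgment is visible but cannot yet be suppressed through MediaWiki, because the rev_id has not been linked.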
[20:49:51] All stuff we’ve talked about, but this is a good essay about distributed transactions and how to avoid them, http://www.grahamlea.com/2016/08/distributed-transactions-microservices-icebergs/ [20:54:50] It does seem that message passing is known to be an alternative to distributed transactions. You can trigger all consequences based on one event, or use a “saga” of events emitted after each consequence. [20:56:44] This is a common issue with microservices, exactly because we aren’t running on a monolithic data store. [20:57:37] https://en.wikipedia.org/wiki/CAP_theorem makes me feel normal as well. [20:59:50] Our discussion can be interpreted as agreement that we need partition tolerance, then debate between consistency and availability. [21:00:17] The event-driven design would be “availability”, we can see that we created a judgment, but the information might not be 100% complete for a short amount of time. [21:04:56] * halfak reads backscroll [21:06:38] Here’s a name for the pattern I’m suggesting… http://microservices.io/patterns/data/cqrs.html [21:06:40] OK I think I buy that argument. I especially like the bit about schema changes. [21:06:56] I wonder how Wikidata handles that. I know that they struggle with schema changes too. [21:07:46] In a big way… [21:07:52] We could ask Amir1 about lexemes :) [21:08:10] I wonder if we are staffed to bring a novel event-based thing to production level and if we should be worried about 3rd party use for our next iteration. [21:08:18] Also, we might have 3rd party use via MW. [21:08:31] good points [21:08:36] MW is, of course, available for 3rd party use. [21:08:53] Thinking about something fun now: Maybe the multi-master nastiness can be abstracted into a discrete component [21:09:19] in that case, it could be a tool people use to build microservices that interact with MW [21:11:49] Now you're talking.
[21:12:02] Martin Fowler calls CQRS risky, which gives me goosebumps: https://martinfowler.com/bliki/CQRS.html [21:12:05] Still not sure I want to do it, but I would love to discuss better ways to integrate things with MW. [21:12:27] "Make it look like a wiki page" is limited. [21:12:52] What I’m ready to give up on is the direct link with rev_id. [21:13:19] which is super hard if we are going to look like a wiki page. [21:13:27] The use case is, “I just received an event telling me about a judgment, and I’d like to suppress it” [21:13:57] I'd rather have advice that looked like "Implement these actions: suppress, revert, ..., etc. and tie them to MW with this class: FooContentExtensionReviewThing" [21:14:13] +1 [21:14:30] But our specific audience is going to be doing their patrolling through MediaWiki APIs, I think. [21:14:42] At least at first, I think so. [21:14:44] oh, that’s compatible with what you’re saying. [21:14:56] MW should know how to talk to JADE and how to listen to JADE and make it look like JADE lives in MW> [21:15:39] Hmm I guess I’m not clear on what you’re saying, but we’re definitely looking at the same problem. [21:16:33] If all curation needs to look like pages/revisions, we should have a nice clean way to adapt an arbitrary system to page/revision curation styles via MW. [21:16:38] e.g. Flow is suffering for this [21:16:41] Even though it lives in MW. [21:16:44] +1 [21:17:46] Now I’m back on board, we could have e.g. a PHP library that we pull into ext-JADE, which abstracts away the synchronization, and uses arbitrary backends. [21:18:38] ext-JADE might implement that using both Kafka and some MW DB operations to clean up our tracking tables. [21:19:30] I’d love if the Kafka events were also generalized, so anything external mirroring a subset of MW would react appropriately. [21:19:40] Regardless of namespace or data contained. [21:20:29] MW is *almost* set up to do this, fwiw. 
These are built-in events: mediawiki.revision-visibility-change [21:20:30] mediawiki.page-delete [21:20:30] resource_change [21:20:42] Right on. [21:20:49] OK I have to get back to budgeting stuff :| [21:21:21] lol thanks for joining me on the cutting-edge roller coaster ;-) [21:26:23] Phew! The Fowler essay is scathingly realistic. [22:46:13] OK finished a full first draft of next year's budget! [22:46:23] Good enough. I'm hitting the road. Have a good one, folks.
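Editor's sketch of the "generalized events" wish from 21:19:30, where an external mirror reacts to the built-in MediaWiki events named at 21:20 regardless of namespace. It is written in Python as an illustration only. The topic names come from the chat; the payload fields (`rev_id`, `page_id`, `visible`), the mirror layout, and the function names are simplified assumptions and do not reproduce the real event schemas.

```python
def on_visibility_change(mirror, event):
    """Hide or unhide every mirrored record tied to the affected revision."""
    for record in mirror.values():
        if record["mw_rev_id"] == event["rev_id"]:
            record["suppressed"] = not event["visible"]


def on_page_delete(mirror, event):
    """Mark every mirrored record that lived on the deleted page."""
    for record in mirror.values():
        if record["page_id"] == event["page_id"]:
            record["deleted"] = True


# Topic names as quoted in the log; the handlers are hypothetical.
HANDLERS = {
    "mediawiki.revision-visibility-change": on_visibility_change,
    "mediawiki.page-delete": on_page_delete,
}


def handle(mirror, topic, event):
    """Dispatch one event; return False for topics this mirror ignores."""
    handler = HANDLERS.get(topic)
    if handler is None:
        return False
    handler(mirror, event)
    return True


mirror = {
    "j1": {"mw_rev_id": 456, "page_id": 9, "suppressed": False, "deleted": False},
    "j2": {"mw_rev_id": 457, "page_id": 10, "suppressed": False, "deleted": False},
}
handled = handle(mirror, "mediawiki.revision-visibility-change",
                 {"rev_id": 456, "visible": False})
ignored = handle(mirror, "resource_change", {})
handle(mirror, "mediawiki.page-delete", {"page_id": 10})
```

Both handlers set state rather than toggle it, so a redelivered event leaves the mirror unchanged, which is one simple way to get the idempotence raised at 16:44:48.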