[01:09:01] Pchelolo: Is there a page (or can you nutshell it for me) explaining how the new Kafka/ChangeProp pipeline does de-duplication? Do we somehow index the Kafka data or create a variant stream?
[02:49:48] Krinkle: oh sorry, I was away.. there's a bit about it here https://wikitech.wikimedia.org/wiki/Kafka_Job_Queue and it's implemented here https://github.com/wikimedia/change-propagation/blob/master/sys/deduplicator.js#L24
[02:50:18] the data's stored in Redis; it's mostly maps of event IDs/SHA-1 hashes to execution times
[03:01:06] Pchelolo: aha, in Redis, interesting.
[03:02:06] yeah, so from MW's perspective the job queue is a per-wiki concept; all nominal operations that affect behaviour need to be scoped to the wiki. If there were to be logic across wikis, it would likely be implemented by MW choosing a shared wiki to deduplicate via, or through a dummy ID of some kind.
[03:03:44] I don't know if it's worth auditing after the dedupe issue to see if anything else might accidentally share some kind of state.
[03:04:05] if you're referring to the MassMessage problem - I'm really surprised this oversight hasn't manifested before
[03:04:08] it would most likely be a source of bugs
[03:04:19] yeah, it's silent.
[03:04:38] MassMessage is one of the few jobs where humans verify the impact. We generally trust the system and don't look back.
[03:04:53] almost every major issue with the JQ in the last year was found through MassMessage :D
[03:06:24] in Kafka the wiki domain is a required part of the message, and it's passed on via the Host header unconditionally. Deduplication was the only state that was separately maintained elsewhere, so I think we should be fine now
[03:06:39] Yeah, it's an easy thing to miss, but it might also be worth recognising in some way within the code, to eliminate the possibility more strongly through some kind of closure state so that nothing is able to easily bypass it by accident.
[03:06:49] ah okay.
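The scheme described above (Redis maps of event IDs/SHA-1 hashes to execution times, scoped per wiki) can be sketched roughly as follows. This is an illustrative Python analogue, not the production deduplicator.js: the class, its method names, and the plain dict standing in for Redis are all invented for the example.

```python
import hashlib
import time

class Deduplicator:
    """Rough sketch of per-wiki job deduplication.

    The real implementation (change-propagation's sys/deduplicator.js)
    keeps these maps in Redis; a plain dict stands in here.
    """

    def __init__(self):
        self._store = {}  # stand-in for Redis: key -> last execution time

    def _key(self, wiki, job_type, params):
        # Scope the key to the wiki so identical jobs on different wikis
        # never collide (the oversight behind the MassMessage problem
        # mentioned above).
        payload = repr(sorted(params.items())).encode()
        return (wiki, job_type, hashlib.sha1(payload).hexdigest())

    def is_duplicate(self, wiki, job_type, params, now=None):
        """Return True if an identical job was already seen; record it otherwise."""
        now = now if now is not None else time.time()
        key = self._key(wiki, job_type, params)
        if key in self._store:
            return True
        self._store[key] = now
        return False
```

Note that the wiki is part of the key itself, so cross-wiki state sharing is impossible by construction, which is the "closure state" idea raised at 03:06:39.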
[03:06:52] good :0
[03:06:54] :) *
[03:07:58] Pchelolo: the FIFO queues are sharded by job type only, right? No more by wiki. Or by DB section as well, I think, was talked about as an idea - not sure if that happened?
[03:08:55] Krinkle: we have a topic per job type. It's partitioned according to DB shard for refreshLinks and htmlCacheUpdate because those were causing problems for the databases
[03:09:26] ah, ok.
[03:09:37] we can partition more job types; it's now just a matter of a simple config change, but there was never a need for that
[03:09:55] yeah, if it's low traffic it's not worth it, I suppose
[03:12:31] those recursive jobs are special because they are enqueued in large per-wiki batches, causing the load to switch from shard to shard in crazy patterns. Even high-traffic jobs that are not posted in large bulks are not a problem
[16:12:35] duesen_: Want me to push out the Beta Cluster SCHEMA_COMPAT_NEW change?
[16:15:20] James_F: yes! Though I wonder who should get a heads-up about that.
[16:15:33] I'm fuzzy on who "owns" the beta wikis
[16:15:35] Well, it's almost exclusively a CPT concern.
[16:15:45] No-one. Which is a long-running issue. :-(
[16:16:00] Yeah, but who would notice problems, and know what to blame them on?
[16:16:09] *sigh*
[16:16:39] I'll flag it to the Quality & Test team, who are often the first to spot things in Beta Cluster.
[16:16:43] Does anyone look at the logs?
[16:17:35] Ok, great, thank you. I had Zeljko on the patch, but he removed himself
[16:18:19] It's unlikely to break anything in the browser, really. The potentially problematic code is all in maintenance scripts.
[16:18:41] Yeah.
[16:19:06] Primary effect: rev_text_id will be 0 for new revisions
[16:19:17] * James_F nods.
[16:19:42] Deployed.
[16:19:46] \o/
[16:19:50] thanks!
[16:21:38] Thank you for keeping things moving. :-)
[16:22:16] Next up, testwiki, presumably in a week or so if nothing breaks?
[16:23:58] James_F: yes. Ideally, we'd wait until the patches for Translate land.
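The routing described at 03:08:55 (one topic per job type, with only the heavy recursive job types additionally partitioned by DB shard) can be sketched as below. The topic naming, the wiki-to-shard map, and the shard-to-partition map are all invented for illustration; the real assignment lives in change-propagation's configuration.

```python
# Job types partitioned by DB shard in the discussion above; everything
# else goes to a single partition of its per-type topic.
DB_SHARD_PARTITIONED = {"refreshLinks", "htmlCacheUpdate"}

# Hypothetical wiki -> DB section mapping and partition assignment.
SHARD_OF_WIKI = {"enwiki": "s1", "commonswiki": "s4", "dewiki": "s5"}
SHARD_TO_PARTITION = {"s1": 0, "s4": 1, "s5": 2}

def topic_and_partition(job_type, wiki):
    """Pick a Kafka topic and partition for a job.

    Splitting the two recursive job types by DB shard means one wiki's
    large per-wiki batch only backs up its own shard's partition,
    instead of bouncing load from shard to shard.
    """
    topic = f"mediawiki.job.{job_type}"  # topic naming is illustrative
    if job_type in DB_SHARD_PARTITIONED:
        return topic, SHARD_TO_PARTITION[SHARD_OF_WIKI[wiki]]
    return topic, 0  # unpartitioned topics use a single partition
```

As noted at 03:09:37, extending `DB_SHARD_PARTITIONED` is just a config change; it has only been done where bulk enqueues caused database problems.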
Though we won't be running the translation export scripts anyway.
[16:24:08] Right.
[19:03:01] Krinkle: I think this is the only outstanding question: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/534933/7/includes/OutputPage.php#2483
[19:11:47] replied
[19:17:40] duesen_: Thanks!
[20:29:33] Krinkle: can you move the patch out of WIP mode? I apparently can't do that, even after uploading a new version.
[20:33:22] duesen_: done.
[20:33:39] thanks
[20:33:45] that was among the rights mass-revoked when we were tightening Gerrit security earlier this week.
[20:33:47] this year*
[20:34:00] might be worth bringing up in a task to reconsider this particular one
[20:34:04] at least for fellow +2'ers
[20:34:47] *making* something WIP could be problematic, because it hides it from dashboards...
[20:37:12] Only bad dashboards. ;-)
[21:11:25] There's a new WIP right in Gerrit, which we can use later.
[21:24:31] new WIP right?
[21:30:51] hauskater: https://github.com/GerritCodeReview/gerrit/commit/6def400a3024d50ee78753ca2738a5e7e589fa8a
[21:31:47] paladox: ah, to allow others in an ACL to [un]WIP stuff
[21:31:51] sounds good
[21:31:52] yup
[22:30:33] is anyone around to quickly review some patches related to logging? I'm not sure if I know what I'm doing. https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/540234/1/includes/ApiVisualEditor.php https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/540235/1/wmf-config/InitialiseSettings.php
[22:30:37] is that ^ all I have to do to see the results on logstash.wikimedia.org?
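The two patches at 22:30:33 follow the usual shape of MediaWiki logging: the extension writes to a named log channel, and the wmf-config change registers that channel so its output is shipped to Logstash. A minimal Python analogue of that channel-based routing (the channel name "VisualEditor" is taken from the patch; the handler and record list are invented stand-ins for the real Logstash pipeline):

```python
import logging

# Stand-in sink for what would be the Logstash transport in production.
records = []

class ListHandler(logging.Handler):
    """Collects (channel, level, message) tuples instead of shipping them."""
    def emit(self, record):
        records.append((record.name, record.levelname, record.getMessage()))

# Registering a channel at a given level plays the role of the
# per-channel configuration added in InitialiseSettings.php.
channel = logging.getLogger("VisualEditor")
channel.setLevel(logging.DEBUG)
channel.addHandler(ListHandler())

# The extension-side call: log to the named channel.
channel.debug("api request failed")
```

The general answer to the 22:30:37 question is "both halves are needed": logging calls alone produce nothing visible unless the channel is also routed to a destination in configuration.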