[01:54:30] bd808: uhh, kind of not really, is it causing problems?
[01:57:34] legoktm: nope. I was just doing some cleanup of older vms and noticed it was still up.
[02:00:25] we can probably kill it in a week or two once it finally rolls out to production
[02:02:02] oh. I thought the prod one was live now
[02:02:10] but yeah no rush
[13:01:47] SMalyshev: I wouldn't say I'm all that familiar with them, but I may be able to help.
[16:36:49] anomie: thanks! check out https://phabricator.wikimedia.org/T138998 - mainly my question is why Revision/WikiPage is ignoring the content model that is set in the page table
[16:37:40] SMalyshev: Sanity check: $wgContentHandlerUseDB is true?
[16:39:17] let me see... it's from vagrant, so whatever vagrant is setting it to, but I'll check
[16:41:14] anomie: yes, it's true
[16:43:45] SMalyshev: I'm not sure whether page_content_model is even really its own thing, or if it's just a denormalization of the rev_content_model of the revision pointed to by page_latest...
[16:44:19] anomie: well, I'm looking into the db and in the revision table, the content model column is always null
[16:44:34] but in the page table it has content models
[16:44:46] the only problem is it is somehow ignored
[16:45:57] The only thing I see in a quick grep that sets page_content_model is WikiPage, which sets it to $revision->getContentModel() in updateRevisionOn().
[16:47:03] I think rev_content_model is left null when it's the (current) default content model for the title, to take less room in the DB.
[16:48:13] (See T104033 on that topic, BTW)
[16:48:14] T104033: Populate rev_content_model and rev_content_format when saving - https://phabricator.wikimedia.org/T104033
[16:48:37] anomie: so what does the page_content_model in the page table mean?
[16:48:38] (And T105652 apparently)
[16:48:38] T105652: RfC: Content model storage - https://phabricator.wikimedia.org/T105652
[16:48:58] and why is rev_content_model null for pages that are not in the default content model
[16:48:59] ?
[16:49:59] SMalyshev: I think page_content_model is just denormalizing rev_content_model (or the default for the page), so the revision row doesn't have to be fetched all over the place. rev_content_model might be null if the default content model for the page changed since the revision was created.
[16:50:51] anomie: that's what I assumed but I don't see where page_content_model is being used... I see that Revision loads it from revision table, and if it's null it just uses namespace default. Am I missing something?
[16:52:23] SMalyshev: page_content_model seems to mainly be used for $title->getContentModel().
[16:53:44] anomie: ok, but that doesn't seem to be used by Revision... so when loading content from Revision, it seems to use wrong content model
[16:54:21] check out Revision::getContentModel - it doesn't use $title->getContentModel().
[16:55:29] it instead takes default content model for a title, which goes by namespace
[16:55:29] SMalyshev: And it shouldn't be. Consider this: Default model for [[Foo]] is wikitext. A bunch of revisions are created using that default content model, so they store null in rev_content_model. Then someone makes a new revision for [[Foo]] that is application/fubar; we don't want all those old revisions to suddenly be interpreted as application/fubar. Unfortunately, as you noted, that means we can't change the default content model for a page
[16:55:30] unless we first populate all those null rev_content_models.
[16:56:04] anomie: but default content model for namespace and content model for specific page is not the same thing?
[16:56:33] or page_content_model is not doing anything?
[16:56:49] I'm just not seeing how content model can get from page.page_content_model into Revision
[16:56:56] and indeed on my install it does not
[16:57:02] default content model for a namespace, default content model for a title, and content model for a specific revision are things. Content model for a page is not a thing.
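Editorial sketch of the lookup described above: a revision with a null rev_content_model resolves to whatever the title's default is at read time, which is why changing the default silently reinterprets old revisions. This is a toy model in Python, not MediaWiki's PHP code. `RevisionRow`, `revision_content_model`, and the `defaults` mapping are hypothetical names introduced for illustration, and the namespace-keyed dictionary only stands in for ContentHandler::getDefaultModelFor().

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RevisionRow:
    """Hypothetical stand-in for one row of the revision table."""
    rev_id: int
    namespace: int
    rev_content_model: Optional[str]  # None models a NULL column


def revision_content_model(rev: RevisionRow, defaults: Dict[int, str]) -> str:
    """Resolve a revision's model the way the chat describes it.

    A stored value wins; NULL falls back to the current default for the
    title (simplified here to a per-namespace default, an assumption).
    page_content_model is deliberately not consulted.
    """
    if rev.rev_content_model is not None:
        return rev.rev_content_model
    return defaults.get(rev.namespace, "wikitext")


# A revision saved while the default was wikitext stores NULL.
old_revision = RevisionRow(rev_id=1, namespace=0, rev_content_model=None)

before_change = revision_content_model(old_revision, {0: "wikitext"})
# The default changes later; the same row is now read as a different model.
after_change = revision_content_model(old_revision, {0: "application/fubar"})
```

With the default switched, the unchanged row resolves to the new model, which matches the breakage discussed in the log.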
[16:57:20] so why page table has page_content_model?
[16:57:33] which is apparently populated by the correct content model...
[16:57:38] unlike revision table
[16:58:05] https://en.wikipedia.org/wiki/Denormalization
[16:58:18] right, I know what denormalization is :)
[16:58:42] I just don't see how it works in this particular case - how is data from page.page_content_model supposed to get into the Revision object?
[16:58:47] It doesn't.
[16:58:57] so what purpose is this data serving then?
[16:59:26] Denormalization, so the Title object can know the content model of the top revision without having to load the revision row.
[16:59:29] and how can one have a page in a non-default-for-namespace content model?
[16:59:51] but Title object is not used to load content. Revision is
[16:59:53] A page can't. But the top revision can.
[17:00:11] but revision table has nothing in the content model column, only page table has
[17:00:24] I'm just trying to figure out what's going on there...
[17:00:48] and how come, if revision table is supposed to be used for this, the actual data is in the page table
[17:01:12] The revision table *should* have a content model if the model isn't the default for the title (as returned by ContentHandler::getDefaultModelFor()). Most likely what happened is that the revision for Main_Page was created when the default was wikitext, then you enabled the wikidata role which changed the default.
[17:02:11] But when the wikidata role changed the default, it didn't update all the now-wrong rows with rev_content_model = null.
[17:02:54] ah, ok, that is possible and would explain it
[17:02:58] Which is what T104033 is referring to when it says "however this prevents changing of the default"
[17:02:58] T104033: Populate rev_content_model and rev_content_format when saving - https://phabricator.wikimedia.org/T104033
[17:03:27] so the normal situation would be that if I created a different-model page the content model would be in the revision table?
[17:05:19] Short answer: yes. Correct answer: If you create a different-model revision for the page, the content model would be in the revision table. It would also be in the page table until someone made a new revision with a different content model.
[17:05:31] I wonder if there shouldn't be some kind of maintenance script for this - if page's content model is not default and != last revision's content model, fix the last rev table
[17:06:12] anomie: ok, now I think I understand where the breakage happened. Thanks!
[17:07:02] Probably wouldn't hurt. maintenance/fixDefaultJsonContentPages.php looks like one for one special case.
[17:08:00] BTW, it looks like actually doing anything about this breakage is more or less blocked on "we'd rather do T107595 instead"
[17:08:00] T107595: [RFC] Multi-Content Revisions - https://phabricator.wikimedia.org/T107595
[17:12:27] anomie: well, T107595 is a big fat task, and this one is pretty small breakage (though thoroughly annoying)
[17:12:27] T107595: [RFC] Multi-Content Revisions - https://phabricator.wikimedia.org/T107595
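Editorial sketch of the check proposed at 17:05:31: for each page whose page_content_model is not the current default, compare it with the stored model of the revision that page_latest points to, and report the revisions that would need their rev_content_model filled in. This is a Python illustration over in-memory dictionaries, not an existing MediaWiki maintenance script. `find_repairs` and the data layout are hypothetical, the per-namespace default again stands in for ContentHandler::getDefaultModelFor(), and only the column names come from the discussion.

```python
from typing import Dict, List, Optional


def find_repairs(
    pages: List[dict],
    revisions: Dict[int, dict],
    defaults: Dict[int, str],
) -> Dict[int, str]:
    """Return {rev_id: model} for latest revisions that disagree with the page row.

    Assumption: the denormalized page_content_model is trusted as the
    intended model of the latest revision.
    """
    repairs: Dict[int, str] = {}
    for page in pages:
        default = defaults.get(page["page_namespace"], "wikitext")
        page_model: Optional[str] = page["page_content_model"]
        # Pages already on the current default need no stored value.
        if page_model is None or page_model == default:
            continue
        latest = revisions[page["page_latest"]]
        if latest["rev_content_model"] != page_model:
            repairs[latest["rev_id"]] = page_model
    return repairs


# Main_Page was saved as wikitext (stored as NULL); the default then changed.
pages = [
    {"page_id": 1, "page_namespace": 0, "page_latest": 10,
     "page_content_model": "wikitext"},
    {"page_id": 2, "page_namespace": 0, "page_latest": 11,
     "page_content_model": "application/fubar"},
]
revisions = {
    10: {"rev_id": 10, "rev_content_model": None},
    11: {"rev_id": 11, "rev_content_model": None},
}
repairs = find_repairs(pages, revisions, {0: "application/fubar"})
```

Only revision 10 is reported here, because its page row still records the old model while the revision row relies on a default that has since changed. The sketch covers only the latest revision of each page; older revisions with null values would need the broader population discussed in T104033.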