[11:33:55] AaronSchulz: quick question - is there something like CONN_NO_REUSE or CONN_NEW or something?
[11:34:13] I need a reliable way to get two separate connections to the same database
[11:35:03] I see that CONN_TRX_AUTO is used (abused?) for this in a setSpeculativeRevIdCallback() callback. Is that safe? sane? reliable?
[11:36:30] what for?
[11:40:21] AaronSchulz: generating dumps. WikiExporter uses a connection in streaming mode. With MCR code, separate queries are needed to retrieve content. Blob URLs cannot be resolved in a JOIN.
[11:40:44] But I can't make "inner" queries while the "outer" query is running in streaming mode
[11:41:01] An alternative would be to just use batching instead of streaming, I guess
[11:41:10] but that needs structural changes to the code which I'd like to avoid
[11:48:59] AaronSchulz: any suggestions?
[11:49:15] so you want a second connection for blob metadata (text_id/name) lookup? Seems like that can use CONN_TRX_AUTO. That will never return the main connection(s) that use DBO_TRX. Is this just for read queries?
[11:49:35] Yes, read only.
[11:49:51] But abusing this flag for "give me a different connection" seems... icky?
[11:49:59] Would be nice to have a better solution for that, right?
[12:08:08] maybe, I recall not doing that to avoid a proliferation of too many connections. Ideally, there'd always be some reuse/pool logic if possible.
[12:09:20] AaronSchulz: otoh, we have comments like "Use a fresh connection in order to see the latest data"...
[12:09:52] i guess that doesn't really mean "fresh connection". It means "a connection with no active transaction context".
[12:13:40] AaronSchulz: now I'm again hitting the issue that the fake tables for unit tests only exist on the main connection. *sigh*
[12:39:37] DanielK_WMDE: I suppose tests could make CONN_TRX_AUTOCOMMIT a no-op if we assume only one connection is at play (no concurrency) and consistent reads are used. I suppose another way would be forcing real tables in some cases... but that sounds more complex.
[12:54:04] AaronSchulz: but then the tests for the dumper code will fail again, because of the nested query issue
[12:55:12] the entire fake-db mess in MediaWikiTestCase needs to die. We should just have a UnitTestLBFactory that does the right thing
[12:55:22] That would make things so much easier
[12:58:13] Looks like I tried to do that at some point: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/270555/49/tests/phpunit/mocks/CloakingLBFactory.php
[13:00:03] also, the "streaming" logic is fairly hacky and only makes sense with reference-counted connections (strict, not just by DB name selected like now)
[13:01:04] AaronSchulz: how would you fix https://gerrit.wikimedia.org/r/c/mediawiki/core/+/443608 ?
[13:01:48] as far as I can see, either the fake DB logic needs to change, or the streaming logic has to go away
[13:01:54] comments would be welcome
[13:02:09] i'm also not sure how well that handles interrupts/failures. I usually prefer batching (and paging iterators can be made over it like FileBackend does)... but that would indeed be a lot of rewriting.
[13:04:13] AaronSchulz: this was supposed to be a quick fix. and it looked like one, until I ran into the "Commands out of sync" error
[13:04:32] DanielK_WMDE: making new LBs is usually the way to force new connections, so that's fine... are you sure that nothing will try to use the main DB (e.g. the dependency injection is complete)?
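For context, here is a minimal sketch of the "second connection via CONN_TRX_AUTOCOMMIT" approach discussed above. It assumes the MediaWiki 1.31-era Rdbms/MediaWikiServices API and is an illustrative reading of the ILoadBalancer interface, not code quoted from the patches under discussion:

```php
<?php
// Sketch only: assumes a fully initialized MediaWikiServices container
// (web request or maintenance script context) and the 1.31-era
// Wikimedia\Rdbms classes.
use MediaWiki\MediaWikiServices;
use Wikimedia\Rdbms\ILoadBalancer;

$lb = MediaWikiServices::getInstance()->getDBLoadBalancer();

// Main replica handle: in web requests this typically has DBO_TRX set,
// so it shares the request-wide transaction round (and, in the dump
// scenario above, may be busy with an unbuffered/streaming result set).
$dbr = $lb->getConnection( DB_REPLICA );

// Second handle for the "inner" blob metadata lookups: CONN_TRX_AUTOCOMMIT
// asks for a connection without DBO_TRX, so the load balancer will not
// hand back the main, transaction-bound connection. Read-only use, as
// agreed above.
$dbrBlobs = $lb->getConnection(
	DB_REPLICA,
	[],     // query groups
	false,  // local wiki domain
	ILoadBalancer::CONN_TRX_AUTOCOMMIT
);
```

Note that the flag only promises "a handle without DBO_TRX", not "a brand-new socket"; it happens to yield a separate connection when the main one runs with DBO_TRX, which is exactly the side-effect-as-contract being called "icky" above.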
[13:04:55] e.g., hooks that trigger some code that does wfGetDB() or some such
[13:06:26] AaronSchulz: There is no way to be sure of that in the presence of hooks.
[13:06:38] I guess as long as everything that streams gets its own LB+connection like that code itself does
[13:06:42] forcing new connections is fine for everything except unit tests.
[13:06:50] nothing should *ever* stream that doesn't do that
[13:07:10] * AaronSchulz looks at the export/dump code
[13:07:20] the code as written should work, but I can't see how to make it work in unit tests
[13:08:35] heh, PopulateImageSha1 shells out to 'mysql' to stream
[13:08:41] I guess that works too ;)
[13:09:18] not in tests of course, that is
[13:19:49] DanielK_WMDE: so, why not just disable STREAM during unit tests for the exporter?
[13:20:40] maybe allow it if --use-normal-tables is set
[13:21:33] though allowing it in that case still has the second connection logic problem (the core code should use a new LB in that case)
[13:23:56] SpecialExport already uses a new LB, though not the API, huh
[13:41:39] AaronSchulz: because if streaming was disabled during tests, I would never have caught the problem. My original change would have been merged, and we'd have hit an error in production
[13:41:57] Actually testing streaming mode is important
[13:42:36] Let's not do the Volkswagen thing :)
[13:45:39] tgr: do you think you can review (and perhaps even merge) the SCHEMA_COMPAT_XXX change today? https://gerrit.wikimedia.org/r/c/mediawiki/core/+/442153
[15:05:25] anomie: could you have a look at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/426009 soon?
[15:52:52] thanks brad!
[15:52:58] i just pushed another, hopefully simple one: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/443646
[16:10:36] *sigh* nothing is ever simple
[16:59:26] anomie: thanks for the heads up regarding ExternalStore. I wonder why it does caching at all. We already do caching one level higher, this seems redundant...
[17:03:46] DanielK_WMDE: I think it's largely for the case of multi-revision compressed blobs; two adjacent revisions are probably in the same blob.
[17:05:16] right. but we don't use that. perhaps that cache should be in the code that handles the multi-revision stuff instead.
[17:06:56] anomie: by the way, cross-wiki access for the revision store needs ActorMigration and CommentStore to be cross-wiki capable. This may be tricky if different wikis have different migration stages set.
[17:07:06] we'd at least need to detect when this is not the case, and fail
[17:07:42] We do use that. Anything with an address beginning with DB://rc1/ uses it, and maybe some on cluster3, cluster4, and cluster5. We've also been considering running the recompression scripts again, on a regular basis.
[17:08:50] Good luck with that.
[17:09:17] it's a blocker, so it will need to happen
[17:09:21] this is why global state is evil
[17:09:27] DI enables cross-wiki operations
[17:09:46] anyway... that multi-revision compression stuff relies on PHP serialization, no?
[17:10:51] we probably don't want more of that in the database.
[17:11:39] anyway. more tomorrow. i'm off for today.
[17:13:25] Actually, you can probably just ignore it. CommentStore and ActorMigration already don't do any database connecting themselves, they just use whatever handle is passed in. As for the configuration, it's designed such that each step is compatible with both the next and previous steps, so cross-wiki access doesn't break. There's a table of compatible settings at T181930#3810812.
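For readers following along, here is a rough sketch of the "streaming code gets its own LB + connection" pattern referenced above (SpecialExport is said to do something along these lines). The construction and dump calls are illustrative assumptions based on the 1.31-era WikiExporter and LBFactory APIs, not code quoted from core:

```php
<?php
// Sketch only: assumes the 1.31-era WikiExporter (with the BUFFER/STREAM
// constants) and LBFactory::newMainLB().
use MediaWiki\MediaWikiServices;

$lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();

// newMainLB() returns a LoadBalancer that is not shared with the rest of
// the request, so the unbuffered "outer" query cannot collide with queries
// made through the main connection (the MySQL "Commands out of sync"
// error mentioned above).
$lb = $lbFactory->newMainLB();
$db = $lb->getConnection( DB_REPLICA );

// Export in streaming (unbuffered) mode; the default sink writes the XML
// to stdout. Any "inner" lookups must go through a different handle.
$exporter = new WikiExporter( $db, WikiExporter::FULL, WikiExporter::STREAM );
$exporter->openStream();
$exporter->allPages();
$exporter->closeStream();

// Close the extra connections once the dump is done.
$lb->closeAll();
```

The unit-test problem discussed above is precisely that this pattern bypasses the main connection, which is the only one that sees MediaWikiTestCase's cloned fake tables.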
[17:13:25] T181930: Comment schema migration and foreign db repos not working well - https://phabricator.wikimedia.org/T181930
[17:14:20] "DI enables cross-wiki operations" is bogus, DI has nothing to do with it. Cross-wiki operations exist because you coded it to expect cross-wiki operations.
[17:15:10] DanielK_WMDE: Re "that multi-revision compression stuff relies on PHP serialization, no?", I see you're unaware of T181555 and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/397632.
[17:15:10] T181555: Remove use of PHP serialization in revision storage - https://phabricator.wikimedia.org/T181555
[22:20:11] Hmm. `sudo -u www-data php tests/phpunit/phpunit.php --wiki wiki --configuration tests/phpunit/suite.xml --list-groups` is entertainingly totally broken. Should it work?