[04:39:09] TimStarling: is there no way to bind poolcounterd to localhost? I don't see anything...
[04:39:20] no
[04:39:31] (from memory)
[04:40:07] https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/poolcounter/+/master/main.c#73
[04:40:28] yeah, no options there
[04:42:42] any reason to not include it? / would you be opposed to having it default to binding to localhost (with some cli option for listening to everything)?
[04:43:08] I think it would be good to have
[04:43:22] also maybe a different incantation is needed to listen on IPv6
[04:51:17] ok, for IPv6 you would have to make two sockets
[04:55:48] I'll uh, just deal with changing how IPv4 binds for now :)
[04:57:23] * TimStarling is reading RFC 3493 for some reason
[05:00:47] actually you can do both IPv4 and IPv6 using a single IPv6 socket, IPv4 addresses will be mapped to IPv6 addresses when it is used this way
[05:01:07] there is a socket option IPV6_V6ONLY to disable this
[06:10:04] TimStarling: https://gerrit.wikimedia.org/r/c/mediawiki/services/poolcounter/+/459677 I've never used getopt before
[06:11:11] abort()?
[06:11:42] I copied that from the manual
[06:11:58] https://www.gnu.org/software/libc/manual/html_node/Example-of-Getopt.html#Example-of-Getopt
[06:14:12] right, I checked the linux manpage which just prints an error in that case
[06:14:20] I think they are trying to say it is unreachable
[06:15:33] abort() raises SIGABRT and will cause a core dump if it is configured, so it's mostly a debugging function for unexpected fatals
[06:17:33] https://manpages.debian.org/stretch/manpages-dev/getopt.3.en.html#getopt() hm, ok
[06:19:17] you should check for errors from inet_addr(), it returns INADDR_NONE if the input was invalid
[06:20:04] I check for errors in main() with inet_aton
[06:20:33] ok, but then you pass the string through to listensocket?
[06:21:13] why not pass the in_addr if you have it already?
[06:21:38] oops.
I was originally validating with inet_addr, but then I read the man page which said not to, and to use inet_aton, but I never updated what I was passing in
[06:43:01] TimStarling: updated the patch
[06:43:13] gonna sleep now though, good night :)
[13:21:37] anomie: could you have a look at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/459620 and the two patches now based on this?
[13:22:02] They were nearly ready to be merged yesterday already, perhaps we can get them in today.
[13:22:39] DanielK_WMDE: Think we're ready to go for https://gerrit.wikimedia.org/r/c/mediawiki/core/+/443831/? I think so.
[13:31:46] anomie: yes, i also think that can go in
[13:32:54] once that is merged, I'll write to the DBAs about enabling read-new on test and other wikis
[13:34:11] There shouldn't be a problem with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/454534. We'll probably want to watch that for a little bit before we try to go live on other wikis.
[13:35:11] I'll just ping marostegui about testwiki on IRC when I'm ready to do that one.
[13:36:02] ok. since read-new has been on beta for more than a month, i see no reason to wait much longer with that.
[13:36:08] could happen today or tomorrow
[14:30:48] anomie: the "main" slot discussion on isReadyForInsertion is getting philosophical. you say correctly that whether content is correct isn't part of the contract. I say whether all required slots are filled is not part of the contract either, and should be left to SlotRoleRegistry. What is part of the contract is the requirement of having at least one slot.
[14:31:27] the requirement to have a main slot is a limitation imposed by the need for b/c in RevisionStore. i see no reason to bake it into RevisionRecord.
[14:32:41] anomie: so far, RevisionRecord and friends do not know about a main slot, because they don't need to. i think it would be bad to change this now
[14:32:42] It's baked in a dozen other places in the code though.
[14:33:09] yes.
usually as a default, for b/c
[14:33:48] spreading this over more code seems like something to avoid, not something to promote
[14:33:58] For example, DifferenceEngine throws an exception if it's missing, so I had to make ApiComparePages specifically check for that too.
[14:36:17] yes, a lot of legacy code needs a main slot
[14:36:22] new code should ideally not need a main slot
[14:36:50] RevisionStore currently makes sure we don't save stuff without a main slot.
[14:37:00] i'm rather happy about RevisionRecord not needing to know about that
[14:37:13] what would we gain by adding that check?
[14:39:07] I think we've had this disagreement before, over whether it makes sense to have a revision without a main slot in the future when it's not necessarily required for BC. Some of that comes back to whether long-term we want to keep the "concatenate all slots' HTML together" rendering versus something more nuanced.
[14:39:56] We'd gain the fact that the method that purports to say whether then slot has all necessary fields filled in actually tests for whether the currently necessary 'main' slot is filled in.
[14:40:29] err... s/then slot/the revision/, no idea how I mistyped that
[14:43:09] anomie: to me it's about not sprinkling knowledge about the code base. At least trying to keep these things localized as much as possible.
[14:43:38] I think that battle is already lost with respect to the main slot...
[14:43:40] Even if it doesn't make sense for a *wiki* to have revisions without a main slot, this doesn't mean the revision class itself has to know or care about that
[14:44:24] yea, the main slot is used in an annoying number of places. why make that situation worse if there is no need?
[14:45:37] in my mind, mentioning the main slot in RevisionRecord introduces a circular dependency with respect to the abstract model we use.
[14:46:02] RevisionRecord then knows about something that RevisionStore defines, and RevisionStore depends on RevisionRecord.
[14:46:25] Anything that uses RevisionStore, directly or indirectly, is ok to use "main". Anything used *by* RevisionStore should avoid this.
[14:53:32] anomie: thanks for unblocking. for the record, I do see your point, we disagree on what is more valuable here, isolation of code or enforcing constraints. I'd be happy to continue the discussion on a separate patch that adds a check for the main slot.
[14:54:34] I'm hoping we can be done sprinting on SDC/MCR stuff soon so I can catch up on the backlog I still have after my vacation last month ;)
[14:55:40] anomie: with the view stuff merged and default settings merged as well, we are really close to done, I think!
[14:56:01] I'm working on ContentHandler::getSecondaryDataUpdates. I'll probably have a first version up tonight.
[14:56:38] (an ugly one, I may make a follow up with better architecture right away)
[14:56:51] Other than that, the only thing left is enabling read-new on the live cluster
[14:58:30] DanielK_WMDE: Remind me, is there more to ContentHandler::getSecondaryDataUpdates than just changing it to take a role parameter, and maybe a RenderedRevision rather than a single ParserOutput?
[15:00:01] LinksUpdate stuff I suppose.
[15:00:40] anomie: we are introducing it, not changing it. The old method is on Content, not ContentHandler. Providing b/c for extensions that override that makes this ugly. Well, one part. The other part is that we don't really have a very good place for this logic to live right now.
[15:01:03] anomie: the WIP is here, comments welcome: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/418134
[15:39:11] DanielK_WMDE: fyi https://phabricator.wikimedia.org/T204065
[15:39:39] *looks to see what was recently merged*
[15:40:02] addshore: read-new per default was just merged
[15:40:19] Introduce RevisionRecord::isReadyForInsertion, or [MCR] Set MCR migration stage to write-both/read-new.
… :P
[15:40:21] addshore: this looks like another instance of that CI issue we were hitting two weeks ago
[15:40:43] the wikibase tests that run against core patches passed
[15:41:38] addshore: Wikibase tests often don't use MediaWikiTestCase. If they use MediaWikiServices, they need to make sure to reset all services!
[15:41:50] that'll be the problem I imagine.
[15:42:33] might be able to find the problem by basing a change on your patch that stops writing / logs writing to the db while things are not meant to?
[15:42:35] * addshore goes to find it
[15:42:43] addshore: this will be the culprit https://gerrit.wikimedia.org/r/c/mediawiki/core/+/443831
[15:42:58] no, the issue is not bad db access
[15:43:14] the problem is the name cache persisting between tests when the db is reset
[15:44:14] addshore: let me guess, we don't run Lexeme tests against core patches
[15:44:25] I'm pretty sure we do
[15:44:26] *checks*
[15:45:26] I don't see WikibaseLexeme in https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php70-docker/5956/consoleFull >.>
[15:45:39] hmhm.
[15:45:54] that's why this was not caught in advance
[15:46:21] hmmm, I think lexeme should be loaded in those when wikibase is also loaded.....
[15:46:34] uh... why do I see failures for core parser tests in your example consoles?
[15:47:03] setting to UBN, if the lexeme tests now fail, all other extension tests that load Wikibase will also fail
[15:47:46] addshore: we can revert the core patch. re-open the ticket if you do, though.
[15:48:01] can you reproduce locally?
[15:48:30] not tested locally
[15:49:03] personally I'm happy to leave it until the morning, we are all done for the day, but I'm pretty sure 90% of the extensions the WMF will work on in the next 10 hours will now have their CI broken :P
[15:49:11] addshore: are all failures in the parsertest?
[15:49:18] *checks*
[15:49:26] at a glance, it does seem that way
[15:49:34] which is strange, since that test passes in core
[15:49:58] something is probably run before the parsertest, and leaks state into it.
[15:50:04] DanielK_WMDE: yes I think so
[15:50:44] so, a guess: parser tests don't use MediaWikiTestCase, so they don't reset services, so they get a NameTableStore with bad cached info.
[15:51:32] hmmmmm
[15:51:49] the parser tests are run by their own runner thing though? so could just reset at the start of that?
[15:53:24] there is already a bunch of stuff in ParserTestRunner::staticSetup
[15:54:00] ParserTestFileSuite::setUp I guess?
[15:54:54] essentially just copy the logic of service resetting performed by MediaWikiTestCase at the start of the parser test suite?
[15:55:03] or is that evil and probably hiding the actual issue
[15:55:06] aaaand ping timeout :P
[15:58:41] addshore: sry, laptop crashed hard :/
[15:58:49] hehe
[15:59:01] DanielK_WMDE: https://usercontent.irccloud-cdn.com/file/X7Nk817f/image.png
[16:00:15] addshore: parser tests have always been their own special thing
[16:00:33] i'm looking at how it initializes now
[16:06:34] "Poorly", I'd guess.
[16:06:44] addshore: i think I know what I have to do, but phpstorm is still re-indexing everything
[16:06:50] seems to have messed up something in the crash
[16:06:56] James_F: :P
[16:06:58] :D
[16:07:13] DanielK_WMDE: I'll assign the ticket to you then ;)
[16:08:20] hm yay.
[16:08:50] ;)
[16:13:36] addshore: I think I roughly know what needs doing, but I don't have a good way to test it. Also, I think I caught a cold, and now my mind is getting foggy...
[16:14:32] addshore: in any case, it seems that something in Lexeme that isn't a MediaWikiTestCase reads from the database (or writes to it) through a NameTableStore
[16:15:00] addshore: could you run Lexeme tests with --exclude-group Database and patch NameTableStore to fail?
[16:15:17] that should expose the Lexeme side of this
[16:16:14] i can certainly give it a shot :)
[16:18:45] alcoholic
[16:20:14] hmm, well, something failed, which i guess is good
[16:29:27] if I’m not mistaken, the only test that doesn’t get excluded by --exclude-group Database and still ends up calling NameTableStore::acquireId() is ExternalLexemeSerializerTest
[16:29:58] but earlier I also saw LexemeChangeOpDeserializerTest popping up, so I’m probably doing something wrong
[16:30:11] (both of them are subclasses of MediaWikiTestCase and not in @group Database)
[16:31:41] Wikibase\Lexeme\Tests\DataModel\Serialization\ExternalLexemeSerializerTest
[16:31:47] yup
[16:32:10] That's the one that just came up in my hacky test
[16:33:10] * addshore makes a patch
[16:33:33] note that this may not actually fix the issue.
[16:34:00] @group Database causes it to hit the fake database instead of the real one. different model ids would be assigned.
[16:34:09] it may still leak them to the parser tests somehow
[16:34:19] not sure how, MediaWikiTestCase should always reset all services
[16:34:28] well, i got 3 tests so far that insert things in name table store without using group database
[16:34:42] 4
[16:35:04] as far as I can tell, that happens somewhere in MediaWikiTestCase’s setup
[16:35:06] [7da68f10cc5e1c9b35da3b93] [no req] Daniel from line 407 of /var/www/mediawiki/tests/phpunit/MediaWikiTestCase.php: Wikibase\Lexeme\Tests\MediaWiki\Diff\LexemeDiffVisualizerTest
[16:35:08] another one....
[16:35:08] before the test starts
[16:35:14] yes, it is in addCoreDBData
[16:36:30] hmmmm, for example, !self::$dbSetup, so if the db isn't set up, addCoreDBData will always be called, adding data or something..
[16:36:39] ok, but these are false alarms.
if addCoreDBData runs, it's doing the fake database thing
[16:36:51] @group Database is not the only way to trigger that
[16:37:01] i should probably look and see what is being inserted and make sure it is "wikibase-item" or just not "wikitext"
[16:37:15] but @group Database should be set anyway, for the sake of picking the right tests on the command line and in CI
[16:38:08] should be set… where?
[16:38:12] DanielK_WMDE: some of the tests that were using MediaWikiTestCase but not inserting things to the db are things like formatter tests, they really shouldn't be in group database
[16:38:13] ExternalLexemeSerializerTest doesn’t do anything with the database as far as I can tell
[16:38:39] I think it extends MediaWikiTestCase to have access to $this->setMwGlobals()
[16:39:12] addshore: then they really shouldn't hit the database
[16:39:37] DanielK_WMDE: well, MediaWikiTestCase is what is hitting the db in those cases, adding the core db data apparently
[16:39:41] addCoreDBData is only triggered if the fake db handling is somehow triggered
[16:40:10] addshore: MediaWikiTestCase::needsDB
[16:41:14] or if !self::$dbSetup
[16:41:40] self::$dbSetup = false; in teardownTestDB
[16:42:30] but that is only called at the very end of all tests
[16:45:33] addshore: yea so before the start of the first test, $dbSetup is false, and the fake db gets initialized, and addCoreDBData runs. It then runs again if any tables needed for page/revision storage are mentioned in tablesUsed.
[16:45:36] So, ExternalLexemeSerializerTest is the first of the lexeme tests without the DB group that uses MediaWikiTestCase, as a result of that self::$dbSetup is false, so addCoreDBData runs, inserting stuff into the tables?
[16:46:07] looks like it
[16:46:08] yes. iirc this didn't used to be the case, but now we just do this always
[16:46:19] I tried it out with a completely blank test and addCoreDBData() is still called
[16:46:43] we could check if needsDB before doing $this->addCoreDBData() ?
[16:47:11] or change the check to check needsDB instead...? *checks the git history here*
[16:48:01] DanielK_WMDE: https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/275596/13/tests/phpunit/MediaWikiTestCase.php
[16:49:22] again, perhaps we are only seeing all of these things now that everything runs on quibble, and all tests run with a single DB, rather than spread over a few different jobs, and also now we run non-DB tests first, then DB tests
[16:50:04] addshore: we could also only run that bit of code if needsDB() returns true, and ignore self::$dbSetup
[16:50:24] I don't actually see why we don't do that...
[16:51:29] maybe it has just been overlooked until now
[16:51:43] * addshore has to dash for 5 mins
[16:58:34] addshore: for your hacky test, change if ( !self::$dbSetup || $this->needsDB() ) to just if ( $this->needsDB() )
[16:58:45] I see no reason for that not to work
[17:00:33] 8) Wikibase\Lexeme\Tests\MediaWiki\WikibaseLexemeExtensionRegistrationTest::testMediaWikiTestCaseParentSetupCalled
[17:00:33] MWException: Can't create user on real database
[17:00:41] i guess there is some more stuff doing a similar check
[17:01:52] and that's because ApiTestCase always calls getTestSysop in setUp
[17:03:35] everything that extends ApiTestCase should be in group Database
[17:04:33] DanielK_WMDE: That change has been suggested before, see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/377041 and T155147.
[17:04:36] T155147: Do not initialise the database in tests when not needed - https://phabricator.wikimedia.org/T155147
[17:05:53] anomie: right, thanks
[17:06:04] i remember discussing it with tim, but i didn't remember the outcome
[17:06:21] so, it's really a security measure.
make sure no test can ever write to the real database
[17:06:45] DanielK_WMDE: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseLexeme/+/459821/ is all I found
[17:08:23] made https://gerrit.wikimedia.org/r/#/c/459823/ too for MediaWikiTestCase, going to see how it performs in CI, Lexeme tests were green for me locally
[17:08:48] * addshore has a few other ideas of things to put in place once we get this fixed.
[17:12:22] addshore: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/459820
[17:12:56] addshore: i hope that fixes the test failure. it's not a real solution (tm), but should work
[17:13:09] i'm not sure how to really test it without merging it
[17:13:22] but merging it should do no harm (if ci on core passes)
[17:14:49] Make a dummy WikibaseLexeme patch depend on the core change; if it passes with it, success.
[17:17:27] James_F: I really don't think that is the one we are looking for :/
[17:17:35] James_F: jenkins doesn't seem to like that, any idea why? https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseLexeme/+/459821/#message-683d1c6637aeb3ad87aff6f11c2099d55bd28a82
[17:18:30] addshore: your change is useful cleanup, but that probably wasn't the problem. i'm trying to use it to test my patch now, though :)
[17:18:41] Hmm, no.
[17:19:31] DanielK_WMDE: https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/459820/ might need a rebase
[17:19:31] perhaps
[17:19:41] unlikely
[17:20:00] more likely it doesn't find it, or can't handle any depends-on from core
[17:20:02] well, gerrit / jenkins might think that it does for some reason
[17:20:09] It should be able to :)
[17:24:02] addshore: hmmm is jenkins running now? or still refusing to try?
I can't tell
[17:24:25] DanielK_WMDE: https://integration.wikimedia.org/zuul/ says running
[17:25:50] ok, let's wait and see
[17:34:33] DanielK_WMDE: I made a patch that caught some stuff in Flow https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/459826/ https://integration.wikimedia.org/ci/job/wmf-quibble-core-vendor-mysql-hhvm-docker/2372/console
[17:35:26] I'll go and patch those now
[17:42:45] addshore: i want to go home soon. if my patch fixes it, let's merge it. if it doesn't, let's revert the MCR mode
[17:44:06] DanielK_WMDE: yup
[17:44:16] there is so much stuff in Flow missing @group Database ...
[17:48:59] addshore: Shocked, shocked I am…
[17:49:43] heh
[17:50:05] addshore: failed.
[17:50:06] bah
[17:50:08] James_F: I think we just conflicted each other on that patch
[17:50:16] this sucks, i was so happy to have this merged
[17:50:28] Ha.
[17:50:33] and this is all due to test leakage, not exposing real problems, just dirty tests
[17:50:41] DanielK_WMDE: Give it a few more minutes before reverting.
[17:50:56] James_F: why?
[17:51:11] addshore: the failures point at Scribunto leaking state.
[17:51:26] shall we revert and keep digging tomorrow?
[17:51:27] but I *really* don't see how, with the new code.
[17:51:38] DanielK_WMDE: See if we can get them fixed now rather than regress.
[17:52:16] James_F: i have no idea where to start. i don't understand why the fix didn't fix, and i can't reproduce locally
[17:52:24] Fair.
[17:52:26] also, i want to go home :)
[17:52:34] if you guys want to keep digging, sure!
[17:52:42] i am at home, but also want to "go home" :P
[17:53:42] i really want this db connection query listener thing...
[17:56:21] addshore: much easier, if needsDB() is false, set the table prefix to nonsense.
[17:57:12] DanielK_WMDE: yeah, well, i tried that, and it failed, but, well, maybe that was a good thing
[17:57:32] i just didn't dig into it too much at the time, maybe I'll look at bringing it back and trying again
[18:01:04] * addshore has no idea where that patch went actually
[18:01:06] addshore: Well, with the Flow patch your throw-on-DB-use patch passes phpunit, so merging.
[18:01:58] anomie: could you clarify how making the arguments non-optional to QueryInfo makes it clearer? I'm not seeing the value in doing that...
[18:01:59] Do we want to land https://gerrit.wikimedia.org/r/c/mediawiki/core/+/459820 / https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseLexeme/+/459821/ too?
[18:03:19] James_F: awesome
[18:03:21] James_F: resetting NameTableStoreFactory apparently doesn't help, though i'm rather confused about why. I thought that was a fool-proof fix
[18:03:27] legoktm: It seems clearer to me to not be arbitrarily leaving out parameters like you do in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/459242/2/includes/Storage/RevisionStore.php#2186
[18:03:28] but it won't hurt
[18:03:33] ahh, I never committed my old patch to gerrit actually... that's why i can't find it
[18:03:52] anyway, the revert is going in, and i re-opened the ticket
[18:03:55] going home now
[18:03:59] addshore: That'd do it.
[18:03:59] \o
[18:04:05] Bye Daniel.
[18:07:08] anomie: hmm, ok. I think in that example it's a bit weird because there's no join at the bottom, so you have three empty arrays. but most cases won't have that
[18:10:05] anomie: the one other thing I've been thinking about is a ->merge( QueryInfo $other ) function that would do the array_merges for you, so you could have ->merge( $actorMigration->getJoin(...) )
[18:17:57] legoktm: Except you need to add at least one join condition to connect the two. The details of that join can vary depending on the details of the query you're building, e.g. whether you need to parenthesize the ['tables'].
[18:18:33] err, ok, ActorMigration gives you a usual version of that join.
[18:19:13] all of the code I've looked at yesterday/today just unconditionally merges ActorMigration stuff in
[18:20:03] * legoktm codesearches
[18:21:33] ah, the API does a lot more detailed stuff with that
[18:21:43] I'll leave it for another time then
[18:25:07] addshore: Are we going to find a bunch of these?
[18:35:26] James_F: probably
[18:35:42] I think the CentralAuth patch is done, I'm testing it with a WikibaseLexeme patch....
[18:36:05] James_F: it probably broke a bunch of extension CI tbh, and we won't see some of it, but the error should be pretty easy for people to fix :)
[18:36:43] Yeah, but we should find out if we broke prod-deployed ones ideally.
[18:37:56] hehe, indeed, I'm not gonna go through and run them all by hand :) everything in the gated extensions list is green, and I'm fixing anything that Wikibase also loads in CI now
[18:47:02] Sure.
[18:47:26] Normally we steal l.egoktm's bot or whatever that pushes a test patch to all 200 repos and we find issues that way.
[18:51:19] actually! I have a prototype script that triggers CI for all extension repos and then compiles a report
[18:58:47] legoktm: oooh, do that!
[18:58:57] :D
[18:59:08] the core patch isn't actually merged yet though << legoktm
[19:01:07] Details, details.
[19:01:13] addshore: hm, I haven't figured out how to run it on unmerged patches yet
[19:01:41] the patch should be ready to be merged :) but also, if it isn't merged, we don't need to run over all extensions :P
[20:39:22] addshore: https://phabricator.wikimedia.org/T204083 if you're still around
[20:41:03] legoktm: I'm watching the Great British Bake Off currently :p
[20:41:17] I just gave it a quick read, sounds like it needs fixing :)
[20:44:38] legoktm: I subscribed and will look more in the morning, I assume it's not going to explode overnight
[20:45:08] ack :)