[00:00:41] and a better solution than eatmydata, which was just convenient for me since I'm not recreating the data dir every time I run tests [02:30:02] why does ServiceContainer::destroy() not reset the service array? [03:23:45] I'm doing it [07:54:59] woot PS15 verified +2 [07:55:05] nice way to finish off the week [15:24:06] hello smart MediaWiki devs! is it true that `category`is not purged, ever? so if I add `[[Category:Foo]]` to a page, then change it to `[[Category:Bar]]`, the record for Foo stays there forever, even if it is no longer used? [15:30:02] musikanimal: no, that used to be the case, but since https://gerrit.wikimedia.org/r/c/mediawiki/core/+/298791 (2016) MediaWiki is supposed to delete `category` table rows that correspond to categories with no members and no description page [15:31:05] (i will not guarantee that this still works properly today, though :) pretty sure it worked in 2016) [15:33:07] ah, yes you are right. I guess it just took a while for the row to be deleted [15:33:54] so there is no true renaming or moving of a category, right? instead a new one is created, and the old one deleted [15:36:25] [[Category:Foo]] I know is a page, and that can be moved. But there's no scenario where `cat_title` in `category` is updated, it seems [15:38:09] I guess this is because categories are defined by wikitext, so you can't tell when a category was "moved" versus "removed" [15:40:35] yeah, i think that's right [15:41:26] mkay, thanks! [19:22:57] RoanKattouw: I'm trying to figure out what exactly triggering user and cause agent mean in MediaWiki, I see you have struggled with that in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/455259 [19:23:26] do you have any idea who might know that? [19:24:02] neither is actually used for anything as far as I can tell, now that Echo dropped the getTriggeringUser call [19:24:26] well, triggeringUser still gets passed to EventBus but that seems to be the only thing [19:54:15] tgr: hello :] If you get sometime next week, I could use a fix for ReadingLists to unbreak it with the installer :] ( https://gerrit.wikimedia.org/r/447086 ) [19:56:41] hashar: ugh, sorry about ignoring that for so long :( [19:56:50] will look at it next week [19:57:08] tgr: no worries, I have a lot of other extensions to deal with :] [19:57:23] thank you! [20:07:38] tgr: I'm not sure. I know we use triggeringRevision in Echo (both for identifying the user, and for linking to the diff), not sure why triggeringUser was ever needed [20:07:56] Maybe legoktm might know, he was involved in the early Echo work [20:09:10] hmm [20:09:49] tgr: I believe they were introduced around the time we were struggling with a lot of refreshLink/HTMLCacheUpdate jobs, such as caused when editing a highly used template, or by a bot that's abusing the API to purge a million pages, and we wanted better logging to find out what triggered these upates. [20:10:10] They are string-based and only for logging. They are not valid user names per se, and could be other descriptive strings. [20:10:30] It was renamed from triggeringUser to causeAgent to better reflect this purpose. [20:10:35] That's my understanding anyway. [20:10:49] Krinkle: that's the cause* stuff, right? triggeringUser seems to be Echo-related [20:11:08] It is similar indeed, but triggerUser was deprecated / is removed. [20:11:15] legoktm: looks like you added this in T116485 [20:11:15] T116485: Echo should not notify the user about his own linking activity - https://phabricator.wikimedia.org/T116485 [20:11:47] Echo no longer uses it though, and it wasn't "right" to begin with since action=purge, or cascading updates result in update jobs triggered by a user that isn't the revision author. [20:11:49] RoanKattouw: tgr: it used to be that the refreshLinks stuff was run in deferred updates, so Echo was using $wgUser to get the triggering user. Once those moved to jobs, we needed a different way to get the User, so we added getTriggeringUser [20:11:56] Echo now uses revision::getUser instead, which is more correct. [20:12:06] what I am trying to figure out is, can we just replace it with $linksUpdate->getRevision()->getId()? [20:12:06] I think LinksUpdate::getRevision() was introduced later on [20:12:45] hm, indeed, 1.28, and https://gerrit.wikimedia.org/r/c/mediawiki/core/+/249302 is 1.27 [20:13:06] it's no problem if I just kill it then, right? [20:14:11] we are updating data updates for MCR and it was very unclear when should it be set to the user who saved the revision and when to the user who e.g. initiated a purge [20:14:22] and most of the time it seems to be just null [20:14:25] aren't they different concepts? [20:14:44] like if I purge a page, then I'm the triggeringUser, but that doesn't mean I'm the author to the last revision on the page [20:15:17] maybe? that doesn't seem what the extensions using it expect [20:15:31] and either of those could be called "triggering" [20:17:24] it seems confusing and not really necessary, unless someone needs it for some kind of logging, but then we already have causeAgent for that [20:17:37] the intial implementation would have used the user that caused the linksupdate to be queued AIUI - whether that was important I don't know. I think if someone edits a template that causes page link notifications to go out, we want the user to be the template editor, not the people who last edited those articles [20:18:23] (for which it is likewise unclear which of the two concepts it should be, and a bunch of callers just set it to 'unknown'. Legacy MediaWiki code is fun.) [20:19:08] oh, ok, I see, this is for inheriting the user in recursive links updates [20:19:42] and I guess you filter out purges and similar non-real updates somehow? [20:20:56] theoretically purges don't add new links to the page, but if they did I think it would trigger a notification [20:23:31] index.php action=purge does not update the link tables [20:23:41] null edits and API action=purge can [20:23:55] the API can even do it recursively [20:24:48] right. [20:25:25] I guess if user A edits a template, and user B null-edits a page using the template since the job gets backlogged, there is no sane way to figure out that LinksUpdate is really caused by user A [20:26:23] is it a good idea to send Echo notifications from recursive links updates, anyway? [20:26:31] seems like you'd get a flood most of the time [20:27:12] We don't really use the agent for page link notifications anyway. It's stored in the DB but not used for display [20:27:36] ...we don't? [20:29:29] legoktm: Krinkle: so anyway, based on that I'd say the correct semantics is: if an edit triggers LinksUpdates (possibly recursively), triggeringUser should be the user who saved the edit. In all other cases (API purge etc) it should be null. causeAgent should be the user who triggered the update, whatever that means (the user who called the API, or some system user for refreshLinks.php). [20:29:38] does that sound correct? [20:29:51] and I suppose causeAgent should also inherit recursively [20:30:31] that seems reasonable [20:30:51] "doesn't seem what the extensions using it expect" - what is this referring to? when grepping I only found Echo, which indeed was wrongly attributing notifs to the purging user, and is now fixed by using rev user instead. [20:32:18] tgr: for refreshLinks I'd prefer it is the user who started that chain of jobs. [20:32:42] If it gets deduplicated it doesn't matter which one is kept, but it should at least in the common case allow us to trace and potentially identify abuse and other problems. [20:32:46] not null. [20:32:48] but if it is now, that's fine. [20:33:10] triggeringUser indeed seems redundant now that we have causeAgent and Revision/User. [20:33:28] I thought the removal of that was already finished. I recall Roan's commits doing that last week or earlier this week [20:35:00] I abandoned them because tgr fixed the feature [20:35:07] But I could un-abandon them [20:36:27] seems like three legitimately different things [20:38:34] oh, right, we have triggeringRevisionId too [20:39:05] so then maybe the user is not really needed [20:39:24] I'll make a phab task about sorting things out, let's follow up there [20:41:47] RoanKattouw: oh, I missed that. thanks [20:42:04] I do recall merging part of it, though. [20:54:42] Are we not supposed to, like, rotate our logs or something? [20:54:44] catrope@mwlog1001:/srv/mw-log$ du -sh api.log [20:54:45] 97G api.log [20:58:50] Oh wait that is just the past 13 hours of logs... [21:03:35] Krinkle: you are correct, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/455259 was merged and Echo does not use triggeringUser since then [21:05:14] RoanKattouw: :D [21:05:40] Turns out we do rotate these and the prior days' logs are around 30GB each after gzip(!) [21:09:37] RoanKattouw: It's almost like we run a big site. [21:09:53] With ever-increasing API usage. [23:00:41] Does phan only check modified files? I'm getting a phan error about LightnCandy being an undeclared class, but the code I changed already referenced it (and phan seems to be erroring on both the existing code and my new code) https://gerrit.wikimedia.org/r/c/mediawiki/core/+/456533 [23:01:03] In other words, is it possible for Phan to fail on files in master without us noticing until someone tries to edit that file?