[13:07:39] [xpost from #mediawiki] What's the best way to find the most recent date on which a given user made an edit on any wiki? It seems the API doesn't help with this. The CentralAuth db doesn't seem promising either.
[13:53:22] Niharika: https://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=Reedy&uclimit=1 ?
[13:53:34] user contribs is date-sorted by default...
[13:53:42] And you can limit it to 1 as you only want the most recent...
[13:53:58] Reedy: Any wiki.
[13:54:05] This is restricted to enwiki, right?
[13:54:15] Well, you'd have to swap en.wikipedia....
[13:54:18] But it'd work elsewhere
[13:54:40] Yeah but I want to query all 800+ projects and find out the most recent date the user was active...
[13:54:57] I guess it's what GlobalContribs would solve
[13:55:07] I think https://github.com/wikimedia/labs-tools-guc does what I want but doesn't have an API for it yet.
[13:56:35] legoktm: Why have we got GlobalContribs and GlobalContributions? Are they forks or something?
[13:57:25] https://github.com/wikimedia/labs-tools-guc/blob/master/api.php
[13:57:57] Niharika: A quick look.. You should be able to add it to its API... With maybe < 20 lines of PHP?
[13:58:03] Mostly copy pasta from https://github.com/wikimedia/labs-tools-guc/blob/master/index.php
[13:58:26] Reedy: Yeah, seems easy.
[13:58:40] Maybe a bit of refactoring to save some duplication.. But shouldn't be much/that necessary
[17:32:45] Niharika: I'd be open to a Gerrit change request for GUC :)
[17:33:36] I took over maintenance of that tool about 2 years ago. Still a bit messy, but getting into better shape. Recently added an API but mainly for secondary data, not for the primary data yet.
[17:35:31] It's far from efficient though (by design), still potentially 800+ dbs being touched, just indirectly so.
[17:36:08] The queries are optimised by grouping together wikis and using a UNION query, given they are hosted on the same server/shard.
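For reference, the per-wiki approach discussed above (one `list=usercontribs` request per wiki with `uclimit=1`, then keep the newest timestamp) could be sketched roughly as follows. This is only an illustration in Python, not code from GUC: the helper names and the sample responses are made up, the host list is assumed to be supplied by the caller, and no network request is made here.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def usercontribs_url(host, user):
    """Build the usercontribs query for one wiki, limited to the newest edit."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "uclimit": 1,
        "format": "json",
    }
    return f"https://{host}/w/api.php?{urlencode(params)}"


def latest_timestamp(response):
    """Return the timestamp of the newest edit in one decoded API response, or None."""
    contribs = response.get("query", {}).get("usercontribs", [])
    if not contribs:
        return None  # the user has no edits on this wiki
    parsed = datetime.strptime(contribs[0]["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def most_recent_edit(responses_by_host):
    """Pick the (host, timestamp) pair with the newest edit across all wikis."""
    best = None
    for host, response in responses_by_host.items():
        ts = latest_timestamp(response)
        if ts is not None and (best is None or ts > best[1]):
            best = (host, ts)
    return best


# Hypothetical, hand-written responses standing in for real API output.
sample = {
    "en.wikipedia.org": {"query": {"usercontribs": [{"timestamp": "2018-03-01T10:00:00Z"}]}},
    "commons.wikimedia.org": {"query": {"usercontribs": [{"timestamp": "2018-04-15T08:30:00Z"}]}},
    "de.wikipedia.org": {"query": {"usercontribs": []}},
}
print(usercontribs_url("en.wikipedia.org", "Reedy"))
print(most_recent_edit(sample))
```

As noted in the discussion, this still means one request per wiki, so with 800+ projects it is simple but slow.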
[17:36:58] Krinkle: Ah, I didn't realize it was on Gerrit. I'll be happy to submit one!
[17:37:14] Niharika: github is too much effort? :P
[17:37:31] Reedy: Gerrit is. :P
[17:50:54] Niharika: Wanna file a task or two? Was thinking maybe one task for exposing the query through the API, and one task for adding a feature (api-only if you like) to get the latest edit only (instead of the default, which is up to 20 from each wiki).
[17:52:18] The current behaviour is to do 1 batch query per shard (7 currently) where we get a true/false from each wiki for whether or not the user has edits there, and then we re-use those 7 connections to make one query to each 'true' wiki to get the actual edit information. Which is 7 + N queries. But for your case, that initial true/false query could instead query the latest edit :)
[17:55:33] Krinkle: Ah, I see. I'll file a task for it.
[17:55:38] Thanks!
[19:00:08] Reedy: they were written independently, I think there was a task to merge them at some point
[19:00:17] heh
[19:00:35] mine was written as a port of the GUC tool, and the other one was written by Brickimedia I think
[20:32:28] anomie: ApiQueryUserContributions.php is quite something
[20:32:51] I'm treating your changes as a rewrite, just reading the code top to bottom
[20:36:31] TimStarling: Yeah, I had to make major changes there to avoid it having really bad queries.
[20:37:53] are there changes to the output? you still don't have release notes, and your commit message is short
[20:39:02] The output should be the same. I probably should add release notes, I probably put it off and then forgot because rebasing them is always a pain.
[20:39:55] maybe they could be in a dependent commit
[20:50:46] TimStarling: I submitted a new patchset with release notes and a longer message.
[20:57:27] I don't think I will be able to merge this today, I'm very sorry to say
[20:59:38] :(
[21:07:26] it's things like
[21:07:28] $munge = function ( $v ) {
[21:07:28]     $v = preg_replace( '/\brev_timestamp\b/', 'revactor_timestamp', $v );
[21:07:28]     $v = preg_replace( '/\brev_id\b/', 'revactor_rev', $v );
[21:07:28]     return $v;
[21:07:28] };
[21:07:30] $where = array_map( $munge, $where );
[21:07:54] it takes time to review these kinds of hacks
[21:59:57] well, at least the strpos/preg_replace stuff is all confined to ApiQueryUserContributions by the looks of it
[22:01:16] and in those cases, the relevant conditions never contain usernames, right?
[22:52:49] anyway, it seems like there is still a big pile of code here that I haven't reviewed
[22:52:58] and I can't blame that on anyone but myself
[22:55:19] well, hopefully it won't be too hard to rebase in 2 weeks' time
[22:56:38] there won't be any deployments between now and then, so there's still a chance to get this into the next deployment branch
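For readers following the review, the `$munge` closure pasted at 21:07:28 rewrites column names inside SQL condition strings using word-boundary regexes. A rough Python port (an illustration only, not code from the patch; the sample conditions are hypothetical) shows what it does, and one plausible reading of why the question about usernames matters: the regex cannot tell a column reference from a quoted string literal.

```python
import re


def munge(condition):
    """Rewrite revision column names to their revision_actor_temp equivalents."""
    condition = re.sub(r"\brev_timestamp\b", "revactor_timestamp", condition)
    condition = re.sub(r"\brev_id\b", "revactor_rev", condition)
    return condition


# Hypothetical WHERE fragments of the kind the closure is mapped over.
where = ["rev_timestamp >= '20180101000000'", "rev_id > 100"]
munged = [munge(c) for c in where]
print(munged)

# \b keeps longer identifiers intact, because "_" counts as a word character.
print(munge("rev_id_extra = 1"))

# But a quoted value that happens to equal a column name is rewritten too,
# which is why conditions containing user-supplied names would be a problem.
risky = munge("actor_name = 'rev_id'")
print(risky)
```

So the approach is safe only as long as the munged conditions never embed arbitrary user text, which is the point the 22:01:16 question is checking.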