[00:04:06] <Krinkle>	 dancecat: thx
[00:19:11] <Krinkle>	 dancecat: So the implicit "else" is now handled by the call below that block
[02:52:37] <Krinkle>	 ^d: Should we remove things like $ps = Profiler::instance()->scopedProfileIn( __METHOD__ . "-{$this->name}" ); at https://github.com/wikimedia/mediawiki/blob/master/includes/filebackend/FileBackendStore.php#L603  given we're using xhprof?
[02:52:54] <Krinkle>	 not sure what the _name suffix is doing
[02:53:34] <Krinkle>	 name of filebackend subclass
[02:56:58] <dancecat>	 registered backend name
[02:57:11] <dancecat>	 could be several for a class
[03:32:28] <Krinkle>	 dancecat: Cool
[03:33:18] <Krinkle>	 dancecat: Question.. What are database groups used for? I see several code paths passing it around but never set anywhere. 'group' in Database, 'groupLoads' in LoadBalancer, 'groupLoadsByDB' in LBFactoryMulti
[03:34:13] <Krinkle>	 in wmf-config groupLoadsByDB is set to an empty array
[03:34:16] <dancecat>	 db-eqiad.php in mediawiki-config is worth looking at
[03:34:32] <dancecat>	 groupLoadsBySection
[03:35:11] <Krinkle>	 Ah, right.
[03:35:18] <dancecat>	 they set the preferred weighted list of servers for certain queries (even if lagged MW will stay in the group)
[03:35:19] <Krinkle>	 There is one more abstraction layer I missed
[03:35:22] <Krinkle>	 groupLoadsBySection pop
[03:35:25] <dancecat>	 though if unreachable other slaves can be used
[03:35:37] <Krinkle>	 Yeah
[03:36:14] <dancecat>	 it's useful for keeping caches warmer by focusing the type of queries better and also when certain DBs have extra indexes
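The group behaviour described above (a preferred weighted list of servers per query group, with other replicas used only when nobody in the group is reachable) can be sketched roughly as follows. This is a minimal illustration in Python, not MediaWiki's LoadBalancer code: `pick_server`, its parameters, and the server names are hypothetical, and replication lag is deliberately ignored, matching the "even if lagged" remark.

```python
import random


def pick_server(group_loads, general_loads, reachable, rng=random):
    """Pick a server name by weight, preferring the query group's own list.

    group_loads / general_loads: dict of server name -> weight (hypothetical shape).
    reachable: set of server names that currently accept connections.
    Returns None when no server in either list is usable.
    """
    for loads in (group_loads, general_loads):
        # Keep only reachable servers with a positive weight.
        candidates = {name: w for name, w in loads.items()
                      if w > 0 and name in reachable}
        if candidates:
            names = list(candidates)
            weights = [candidates[n] for n in names]
            # Weighted random choice among the surviving candidates.
            return rng.choices(names, weights=weights, k=1)[0]
    return None
```

With `group_loads={"db1": 1}` and `general_loads={"db2": 1, "db3": 1}`, every query for that group lands on `db1` while it is reachable, and spills over to `db2`/`db3` only once it is not.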
[03:36:49] <Krinkle>	 dancecat: Hm.. we have db slaves with additional mysql indexes not in mw core?
[03:37:24] <dancecat>	 some DBs using table partitioning too
[03:37:33] <Krinkle>	 Ah nice. This is about MySQL's internal caches for parts of the database?
[03:37:57] <dancecat>	 so the log ones partition by log_user, same for contributions on rev_user
[03:38:17] <dancecat>	 we used to have some DBs with extra covering indexes...not sure if that's true anymore
[03:38:29] <dancecat>	 springle would know best
[03:39:15] <dancecat>	 https://wikitech.wikimedia.org/wiki/MariaDB#Partitioned_Tables
[03:39:40] <Krinkle>	 It makes sense that MySQL keeps popular sections of the db in memory and that focussing similar queries to the same slaves optimises this behaviour. I'm curious what logic MySQL internally uses to decide what to cache, e.g. LRU?
[03:40:09] <Krinkle>	 Or do we control it specifically (e.g. we configure the recentchanges group to cache that table better?)
[03:41:37] <dancecat>	 the buffer pool has my.cnf settings, but we mostly just throw RAM at it, tell it to use it, and have it "do magic"
[03:42:54] <dancecat>	 indexes and records cache in an LRU "two queue" style manner (new stuff is inserted 3/8s from the end, having to work its way up)
[03:43:08] <Krinkle>	 I didn't know about table partitions, interesting. I thought mysql would do that based on the indexes that exist.
[03:43:20] <dancecat>	 that way a huge query that runs once doesn't churn the whole cache with garbage
[03:43:34] <dancecat>	 https://dev.mysql.com/doc/refman/5.1/en/innodb-buffer-pool.html
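The "inserted 3/8s from the end" idea can be illustrated with a toy model. The sketch below is a simplified Python approximation of midpoint-insertion LRU, not InnoDB's actual implementation: the `MidpointLRU` class and its list-based layout are made up for illustration, and the real buffer pool has extra rules for when a page in the old part gets promoted.

```python
class MidpointLRU:
    """Toy LRU list where new pages enter 3/8 of the way from the tail."""

    def __init__(self, capacity, old_fraction=3 / 8):
        self.capacity = capacity
        self.old_fraction = old_fraction
        # Index 0 is the head (hottest); the last element is evicted next.
        self.pages = []

    def access(self, page):
        if page in self.pages:
            # A repeat access promotes the page to the head of the list.
            self.pages.remove(page)
            self.pages.insert(0, page)
            return
        if len(self.pages) >= self.capacity:
            # Cache is full: evict from the tail.
            self.pages.pop()
        # New pages go in at the midpoint, leaving the "old" tail behind them.
        old_len = int(len(self.pages) * self.old_fraction)
        self.pages.insert(len(self.pages) - old_len, page)
```

In this model a long one-off scan only cycles pages through the tail section, so pages that were promoted to the head by repeated access stay cached.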
[03:44:24] <Krinkle>	 Nice
[03:44:29] <dancecat>	 Krinkle: partitioning has to be manual really
[03:44:47] <Krinkle>	 Yeah
[03:45:43] <Krinkle>	 ah, so the caches only work for entire indexes, not for parts of it. Unless a partition is declared, then it can allow a part to be cached. 
[03:47:02] <Krinkle>	 dancecat: The partition steps get bigger over time. I guess that's because we don't want to change existing ones and we have more capacity now than before?
[03:47:25] <Krinkle>	 more RAM I mean
[03:47:51] <dancecat>	 ranges are determined by sean looking at the distribution
[03:48:02] <dancecat>	 post-sul there are LOTS of automatic accounts with little activity
[03:48:52] <Krinkle>	 I guess it would help if users have the same ids across wikis
[03:49:04] <Krinkle>	 SUL step 3 :)
[03:49:06] <dancecat>	 we don't use partitioning to manually map to different disks (some people do that), just to cut down on query scan sizes and index tree heights and such
[03:49:35] <dancecat>	 so it's not like ES where older "stores" are crappier and newer ones have better hardware or anything
[03:49:45] <Krinkle>	 Right
[03:50:34] <Krinkle>	 dancecat: having the global user table used more reminds me.. is it a good thing or not to go in that direction? I remember some controversy about the _user_text columns and whether or not those are better than a join to user.
[03:50:49] <Krinkle>	 what's the current take on that principle?
[03:51:37] <dancecat>	 IMO batch queries (like we do for page existence and it works just fine) are the way to go, with *_text just for IPs and original PoT names
[03:52:03] <dancecat>	 heh, it was rename user jobs that prompted https://gerrit.wikimedia.org/r/#/c/205825/
[03:52:26] <Krinkle>	 Pot?
[03:52:32] <dancecat>	 I guess those could select on range and update by primary key (or we could just use row-based replication)
[03:52:42] <dancecat>	 mariadb 10 parallel replication will help too
[03:52:48] <dancecat>	 point in time
[03:52:56] <Krinkle>	 Right
[03:53:09] <dancecat>	 ...but yeah, it's a lot of work renaming people, and every extension has to register
[03:53:24] <dancecat>	 the code makes that reasonably easy...it's doable, but I don't like it much
[03:53:29] <Krinkle>	 So instead of joining to user, just get the user row ahead of time and when using the rows of a table, map it in php code to the right data about the user.
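The batch-lookup approach summarised above (fetch the rows first, then resolve all user ids in one extra query and map them in application code) might look roughly like this. It is a sketch in Python with an in-memory SQLite database, not MediaWiki code: the two-table schema is a cut-down, hypothetical stand-in, and `make_demo_db` and `fetch_revisions_with_users` are invented names.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, user_name TEXT)')
    conn.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,"
                 " rev_user INTEGER, rev_user_text TEXT)")
    conn.executemany('INSERT INTO "user" VALUES (?, ?)', [(1, "Alice"), (2, "Bob")])
    # rev_user = 0 marks an anonymous edit; only those rows carry a *_text value.
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)",
                     [(1, 1, ""), (2, 0, "192.0.2.7"), (3, 2, ""), (4, 1, "")])
    return conn


def fetch_revisions_with_users(conn, limit=50):
    """Return (rev_id, display_name) pairs without joining to the user table."""
    rows = conn.execute(
        "SELECT rev_id, rev_user, rev_user_text FROM revision"
        " ORDER BY rev_id LIMIT ?", (limit,)).fetchall()
    # Collect the distinct non-anonymous user ids for one batched lookup.
    user_ids = sorted({uid for _, uid, _ in rows if uid})
    names = {}
    if user_ids:
        placeholders = ",".join("?" * len(user_ids))
        names = dict(conn.execute(
            'SELECT user_id, user_name FROM "user"'
            f" WHERE user_id IN ({placeholders})", user_ids).fetchall())
    # Map ids to names in application code; fall back to the stored text for IPs.
    return [(rev_id, names.get(uid, text)) for rev_id, uid, text in rows]
```

Because names are looked up by id at read time, a rename in this model only touches the one `user` row rather than every row that mentions the old name.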
[03:53:33] <dancecat>	 also it's racy if people get renamed twice
[03:53:55] <dancecat>	 reminds me how much it sucks that central auth uses names mostly for foreign keys
[03:54:06] <Krinkle>	 well, as long as there's a user_id field in the same tables
[03:54:08] * dancecat remembers tim's "give me a break" comment about that
[03:54:20] <Krinkle>	 what? That sounds terrible
[03:56:14] <dancecat>	 too bad `localuser` doesn't mention global and local IDs
[03:56:21] <dancecat>	 ah, well, just more tech debt
[03:59:01] <Krinkle>	 dancecat: You reckon we will convert local ids to match centralauth user ids within a year?
[03:59:07] <Krinkle>	 I guess it would require readonly
[03:59:40] <dancecat>	 I don't see that happening really
[04:00:01] <dancecat>	 keeping local accounts around isn't where most of the baggage is IMO
[04:00:32] <dancecat>	 though lazy creation vs pre-population have their tradeoffs
[04:00:50] <dancecat>	 making 950 local rows per new user is lame
[04:01:09] <dancecat>	 lazy creating them is tricky since it randomly requires master queries to update stuff
[04:02:37] <dancecat>	 I guess if accounts are pre-made on the top X wikis, it's not likely to matter
[04:03:06] <dancecat>	 most people don't view random svwikiquote wikis for example
[04:03:17] <dancecat>	 (random assuming they started with en*)
[04:03:46] <dancecat>	 so really one could premake the relative "top Y" in addition based on language...maybe
[08:19:03] <wikibugs>	 MediaWiki-API-Team, MediaWiki-extensions-OAuth, Patch-For-Review: OAuth: Authorisation should not fail because you don't have an account on central wiki - https://phabricator.wikimedia.org/T74469#1235923 (yuvipanda) @legoktm @csteipp is this still relevant now? SULF is done
[17:36:01] <wikibugs>	 MediaWiki-API-Team, MediaWiki-extensions-OAuth, Patch-For-Review: OAuth: Authorisation should not fail because you don't have an account on central wiki - https://phabricator.wikimedia.org/T74469#1236125 (csteipp) Still relevant. Even though accounts are unique, it doesn't mean all accounts actually ex...