[16:37:37] hey jynus ! are you about?
[16:38:05] at a meeting
[16:38:28] end in 20 minutes
[16:38:31] okay!
[16:38:36] i'll poke again in 20!
[16:38:39] ^addshore
[16:58:54] addshore, yes?
[16:59:41] jynus: regarding https://gerrit.wikimedia.org/r/#/c/276446/ and https://phabricator.wikimedia.org/T67058
[17:00:09] Do you have any idea of the max size of the delete queries for the watchlist table that tend to happen?
[17:00:26] Or have some sort of number on the most items you would like to see deleted at once?
[17:01:01] yes, there are 2 things that should be done for batches
[17:01:14] have it configurable, and default to 1000 records
[17:02:08] and second, after every batch, check and wait for lag to become 0; abort if it stays high for more than X time
[17:02:36] if possible, iterate over an index, not with LIMIT
[17:02:51] because that would make the query very slow for large limits
[17:03:03] even if that means it will be slow
[17:03:43] if there can be a large number of items, do not do it synchronously, as it will likely time out; create a job queue entry for it
[17:03:51] yeh
[17:03:56] does all this seem reasonable?
[17:03:59] *goes to remind himself where this thing is used*
[17:04:32] some of these are already implemented/can be reused
[17:04:48] but do not ask me about mediawiki, only about the database :-)
[17:04:53] yup :P
[17:05:12] not sure any of that has been implemented regarding watchlists tbh!
[17:05:16] no
[17:05:29] but there are similar "batch" processes for other tasks
[17:05:38] yup
[17:05:38] e.g. maintenance scripts
[17:05:50] collation update, etc.
[17:08:20] and COUNT(*) WHERE wl_user = X should still be quick for users with a large watchlist, right?
[17:08:31] I have no idea what size some of these watchlists have got up to...
[17:13:20] it depends
[17:15:13] it would not read all rows, but it would use a covering index (still O(n) complexity, where n is the number of records for that user)
[17:15:42] but it would not be instant, like MAX(indexed_column)
[17:16:59] we still have some timeout issues when users read their watchlists, for extreme cases - the query fails for cold retrievals, works when the cache has been warmed
[17:17:19] (e.g. while editing it)
[17:17:45] okay!
[17:18:42] watchlist has an issue, which is the lack of a primary key
[17:19:28] once https://gerrit.wikimedia.org/r/#/c/271435/ is merged
[17:19:50] the idea would be to use the id to iterate, and speed up the process in general
[17:20:35] in extreme cases, we can partition by user, as we already do for recentchanges and logging
[17:38:08] heh, jynus, yeh, the lack of a primary key doesn't help, and neither does not being able to pass a limit to the delete method abstraction in mediawiki :P
[17:38:09] bah
[17:51:15] jynus: any idea what sqlite does if you don't compile it with the SQLITE_ENABLE_UPDATE_DELETE_LIMIT compile option but still pass it LIMIT X in a delete query? I guess it either dies or just deletes everything :P
[18:02:34] no idea
[18:03:11] :D
[18:03:55] well, the index would help doing it as a WHERE, instead of as a LIMIT
[18:04:48] we do not want a LIMIT; LIMIT 100000000,1 will take (almost) the same time as deleting 100000000 records (or at least, be equally problematic)
[18:05:19] yep. So with the index would you do something along the lines of DELETE FROM watchlist WHERE wl_id IN (SELECT wl_id FROM watchlist LIMIT 1000); ??
[18:05:30] no no
[18:06:00] iterating over id in a range
[18:06:51] "WHERE wl_id BETWEEN ? AND ?" would be the batch
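[editor's note] A minimal sketch of the batching pattern described above, assuming the wl_id primary key from https://gerrit.wikimedia.org/r/#/c/271435/ has already been added. This is not MediaWiki code: the pymysql connections, the get_max_replica_lag()/wait_for_replication() helpers, and the batch/timeout constants are illustrative assumptions.

```python
import time
import pymysql

BATCH_SIZE = 1000          # default suggested above; should be configurable
MAX_LAG_SECONDS = 0        # wait for lag to come back to 0 between batches
LAG_TIMEOUT_SECONDS = 60   # abort if lag stays high for longer than this

def get_max_replica_lag(replica_conns):
    """Return the worst Seconds_Behind_Master across the given replica connections."""
    worst = 0
    for conn in replica_conns:
        with conn.cursor(pymysql.cursors.DictCursor) as cur:
            cur.execute("SHOW SLAVE STATUS")
            row = cur.fetchone()
            if row and row["Seconds_Behind_Master"] is not None:
                worst = max(worst, row["Seconds_Behind_Master"])
    return worst

def wait_for_replication(replica_conns):
    """Block until replicas catch up, aborting if lag does not recover in time."""
    deadline = time.time() + LAG_TIMEOUT_SECONDS
    while get_max_replica_lag(replica_conns) > MAX_LAG_SECONDS:
        if time.time() > deadline:
            raise RuntimeError("replication lag did not recover; aborting batch job")
        time.sleep(1)

def delete_watchlist_for_user(master, replica_conns, user_id):
    """Delete a user's watchlist rows by iterating over wl_id ranges in small batches."""
    with master.cursor() as cur:
        # Find the id range to cover, then walk it in fixed-size chunks.
        cur.execute(
            "SELECT MIN(wl_id), MAX(wl_id) FROM watchlist WHERE wl_user = %s",
            (user_id,),
        )
        lo, hi = cur.fetchone()
        if lo is None:
            return  # nothing watched
        while lo <= hi:
            # "WHERE wl_id BETWEEN ? AND ?" is the batch; ids may be sparse,
            # so selecting the chunk boundaries first (as discussed below)
            # would avoid deleting empty chunks.
            cur.execute(
                "DELETE FROM watchlist"
                " WHERE wl_user = %s AND wl_id BETWEEN %s AND %s",
                (user_id, lo, lo + BATCH_SIZE - 1),
            )
            master.commit()  # one small transaction per batch
            wait_for_replication(replica_conns)
            lo += BATCH_SIZE
```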
[18:07:26] obviously, selecting first would make sure we do not delete empty chunks
[18:07:30] hmmm okay.
[18:07:37] we can detect the limits first
[18:07:59] there are many ways; the idea is avoiding complex queries, not because mysql cannot handle them
[18:08:25] yeh, I mean if we were to do batches of 1000 and not know where those batches should start and end when removing all entries for 1 user, there would be lots of empty chunks
[18:08:32] but because with replication, the philosophy is small updates, on separate transactions
[18:08:49] also, thinking about row-based replication, primary key updates are preferred
[18:09:31] it is ok, we can select them ordered by primary key, then delete with an IN, as you said
[18:09:39] but not in the same query
[18:10:02] that contradicts a bit the idea of reducing roundtrips
[18:10:18] but for batched updates, we care more about small updates than fast updates
[18:10:42] so sorry, you were right in the first place, just not literally like that
[18:11:15] I took it literally, and not "along the lines" :-)
[18:14:45] okay, so select 1000 items to delete, wait for lag, delete those 1000 items, wait for lag, repeat
[18:15:07] correct
[18:15:28] 1000 is more or less a rule of thumb
[18:15:32] okay, I'm going to write an implementation now before wl_id is merged, and then we can update the job once wl_id exists! :)
[18:16:10] do not expect that change too soon; watchlist is a very large table, which means it can take a lot of time to be changed, even after it is merged
[18:17:26] yup! hence I'll implement it in this slightly lamer way first! :)
[18:18:33] *more lame
[18:18:58] if you do not want it to be a blocker, any index could work, but not iterating over an index can result in unsafe updates
[18:19:43] e.g. DELETE... LIMIT 1000 without ORDER BY can have different results on masters and slaves, causing all kinds of issues
[18:20:13] yup, as this is deleting for a single user I guess order by nsID and title dbkey
[18:20:25] as I'm 99.99% sure there is an index on them
[18:21:10] not a really good one, that is why the PK creation (among other things)
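[editor's note] And a sketch of the interim approach agreed at the end, before wl_id exists: select a deterministic batch of rows for the user ordered over the (wl_namespace, wl_title) columns, delete exactly those rows in a separate query, wait for replication, and repeat. Again a hypothetical helper, not MediaWiki's database abstraction; wait_for_replication() is the helper from the previous sketch, and the column/index names are taken from the conversation above.

```python
def delete_watchlist_without_pk(master, replica_conns, user_id, batch_size=1000):
    """Interim batched delete for one user's watchlist, without relying on wl_id."""
    while True:
        with master.cursor() as cur:
            # Deterministic ORDER BY keeps masters and slaves consistent;
            # a bare DELETE ... LIMIT would not.
            cur.execute(
                "SELECT wl_namespace, wl_title FROM watchlist"
                " WHERE wl_user = %s"
                " ORDER BY wl_namespace, wl_title"
                " LIMIT %s",
                (user_id, batch_size),
            )
            rows = cur.fetchall()
            if not rows:
                break
            # Delete exactly the rows we selected, in a separate query
            # (select and delete are not combined into one statement).
            placeholders = ", ".join(["(%s, %s)"] * len(rows))
            params = [user_id] + [value for pair in rows for value in pair]
            cur.execute(
                "DELETE FROM watchlist"
                " WHERE wl_user = %s"
                f" AND (wl_namespace, wl_title) IN ({placeholders})",
                params,
            )
            master.commit()  # small transaction per batch
        wait_for_replication(replica_conns)
```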