[16:37:37] hey jynus ! are you about?
[16:38:05] at a meeting
[16:38:28] end in 20 minutes
[16:38:31] okay!
[16:38:36] i'll poke again in 20!
[16:38:39] ^addshore
[16:58:54] addshore, yes?
[16:59:41] jynus: regarding https://gerrit.wikimedia.org/r/#/c/276446/ and https://phabricator.wikimedia.org/T67058
[17:00:09] Do you have any idea of the max size of the delete queries for the watchlist table that tend to happen?
[17:00:26] Or have some sort of number on the most items you would like to see deleted at once?
[17:01:01] yes, there are 2 things that should be done for batches
[17:01:14] have it configurable, and default to 1000 records
[17:02:08] and second, after every batch, check and wait for lag to become 0; abort if it stays high for more than X time
[17:02:36] if possible, iterate over an index, not with LIMIT
[17:02:51] because that would make the query very slow for large limits
[17:03:03] even if that means it will be slow
[17:03:43] if there can be a large number of items, do not do it synchronously, as it will likely time out; create a job queue entry for it
[17:03:51] yeh
[17:03:56] does all this seem reasonable?
[17:03:59] *goes to remind himself where this thing is used*
[17:04:32] some of these are already implemented/can be reused
[17:04:48] but do not ask me about mediawiki, only about the database :-)
[17:04:53] yup :P
[17:05:12] not sure any of that has been implemented regarding watchlists tbh!
[17:05:16] no
[17:05:29] but there are similar "batch" processes for other tasks
[17:05:38] yup
[17:05:38] e.g. maintenance scripts
[17:05:50] collation update, etc.
[17:08:20] and COUNT(*) WHERE wl_user = X should still be quick for users with a large watchlist, right?
[17:08:31] I have no idea what size some of these watchlists have got up to...
[17:13:20] it depends
[17:15:13] it would not read all rows, but it would use a covering index (still O(n) complexity, where n is the number of records for that user)
[17:15:42] but it would not be instant, like MAX(indexed_column)
[17:16:59] we still have some timeout issues when users read their watchlists, for extreme cases - the query fails for cold retrievals, works when the cache has been warmed
[17:17:19] (e.g. while editing it)
[17:17:45] okay!
[17:18:42] watchlist has an issue, which is the lack of a primary key
[17:19:28] once https://gerrit.wikimedia.org/r/#/c/271435/ is merged
[17:19:50] the idea would be to use the id to iterate, and speed up the process in general
[17:20:35] in extreme cases, we can partition by user, as we already do for recentchanges and logging
[17:38:08] heh, jynus, yeh, the lack of a primary key doesn't help, and neither does not being able to pass a limit to the delete method abstraction in mediawiki :P
[17:38:09] bah
[17:51:15] jynus: any idea what sqlite does if you don't compile it with the SQLITE_ENABLE_UPDATE_DELETE_LIMIT compile option but still pass it LIMIT X in a delete query? I guess it either dies or just deletes everything :P
[18:02:34] no idea
[18:03:11] :D
[18:03:55] well, the index would help doing it as a WHERE, instead of as a LIMIT
[18:04:48] we do not want a LIMIT; LIMIT 100000000,1 will take (almost) the same time as deleting 100000000 records (or at least, be equally problematic)
[18:05:19] yep. So with the index would you do something along the lines of DELETE FROM watchlist WHERE wl_id IN (SELECT wl_id FROM watchlist LIMIT 1000); ??
[18:05:30] no no
[18:06:00] iterating over id in a range
[18:06:51] "WHERE wl_id BETWEEN ? AND ?" would be the batch
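[editor's note] A minimal sketch of the batching pattern described above, assuming the wl_id primary key from https://gerrit.wikimedia.org/r/#/c/271435/ has already been added. This is not MediaWiki code: the pymysql connections, the get_max_replica_lag()/wait_for_replication() helpers, and the batch/timeout constants are illustrative assumptions.

```python
import time
import pymysql

BATCH_SIZE = 1000          # default suggested above; should be configurable
MAX_LAG_SECONDS = 0        # wait for lag to come back to 0 between batches
LAG_TIMEOUT_SECONDS = 60   # abort if lag stays high for longer than this

def get_max_replica_lag(replica_conns):
    """Return the worst Seconds_Behind_Master across the given replica connections."""
    worst = 0
    for conn in replica_conns:
        with conn.cursor(pymysql.cursors.DictCursor) as cur:
            cur.execute("SHOW SLAVE STATUS")
            row = cur.fetchone()
            if row and row["Seconds_Behind_Master"] is not None:
                worst = max(worst, row["Seconds_Behind_Master"])
    return worst

def wait_for_replication(replica_conns):
    """Block until replicas catch up, aborting if lag does not recover in time."""
    deadline = time.time() + LAG_TIMEOUT_SECONDS
    while get_max_replica_lag(replica_conns) > MAX_LAG_SECONDS:
        if time.time() > deadline:
            raise RuntimeError("replication lag did not recover; aborting batch job")
        time.sleep(1)

def delete_watchlist_for_user(master, replica_conns, user_id):
    """Delete a user's watchlist rows by iterating over wl_id ranges in small batches."""
    with master.cursor() as cur:
        # Find the id range to cover, then walk it in fixed-size chunks.
        cur.execute(
            "SELECT MIN(wl_id), MAX(wl_id) FROM watchlist WHERE wl_user = %s",
            (user_id,),
        )
        lo, hi = cur.fetchone()
        if lo is None:
            return  # nothing watched
        while lo <= hi:
            # "WHERE wl_id BETWEEN ? AND ?" is the batch; ids may be sparse,
            # so selecting the chunk boundaries first (as discussed below)
            # would avoid deleting empty chunks.
            cur.execute(
                "DELETE FROM watchlist"
                " WHERE wl_user = %s AND wl_id BETWEEN %s AND %s",
                (user_id, lo, lo + BATCH_SIZE - 1),
            )
            master.commit()  # one small transaction per batch
            wait_for_replication(replica_conns)
            lo += BATCH_SIZE
```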
[18:07:26] obviously, selecting first would make sure we do not delete empty chunks
[18:07:30] hmmm okay.
[18:07:37] we can detect the limits first
[18:07:59] there are many ways; the idea is avoiding complex queries, not because mysql cannot handle them
[18:08:25] yeh, I mean if we were to do batches of 1000 and not know where those batches should start and end when removing all entries for 1 user, there would be lots of empty chunks
[18:08:32] but because with replication, the philosophy is small updates, on separate transactions
[18:08:49] also, thinking about row-based replication, primary key updates are preferred
[18:09:31] it is ok, we can select them ordered by primary key, then delete with an IN, as you said
[18:09:39] but not in the same query
[18:10:02] that contradicts a bit the idea of reducing roundtrips
[18:10:18] but for batched updates, we care more about small updates than fast updates
[18:10:42] so sorry, you were right in the first place, just not literally like that
[18:11:15] I took it literally, and not "along the lines" :-)
[18:14:45] okay, so select 1000 items to delete, wait for lag, delete those 1000 items, wait for lag, repeat
[18:15:07] correct
[18:15:28] 1000 is more or less a rule of thumb
[18:15:32] okay, I'm going to write an implementation now before wl_id is merged, and then we can update the job once wl_id exists! :)
[18:16:10] do not expect that change too soon; watchlist is a very large table, which means it can take a lot of time to be changed, even after it is merged
[18:17:26] yup! hence I'll implement it in this slightly lamer way first! :)
[18:18:33] *more lame
[18:18:58] if you do not want it to be a blocker, any index could work, but not iterating over an index can result in unsafe updates
[18:19:43] e.g. DELETE... LIMIT 1000 without ORDER BY can have different results on masters and slaves, causing all kinds of issues
[18:20:13] yup, as this is deleting for a single user I guess order by nsID and title dbkey
[18:20:25] as I'm 99.99% sure there is an index on them
[18:21:10] not a really good one, that is why the PK creation (among other things)
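[editor's note] And a sketch of the interim approach agreed at the end, before wl_id exists: select a deterministic batch of rows for the user ordered over the (wl_namespace, wl_title) columns, delete exactly those rows in a separate query, wait for replication, and repeat. Again a hypothetical helper, not MediaWiki's database abstraction; wait_for_replication() is the helper from the previous sketch, and the column/index names are taken from the conversation above.

```python
def delete_watchlist_without_pk(master, replica_conns, user_id, batch_size=1000):
    """Interim batched delete for one user's watchlist, without relying on wl_id."""
    while True:
        with master.cursor() as cur:
            # Deterministic ORDER BY keeps masters and slaves consistent;
            # a bare DELETE ... LIMIT would not.
            cur.execute(
                "SELECT wl_namespace, wl_title FROM watchlist"
                " WHERE wl_user = %s"
                " ORDER BY wl_namespace, wl_title"
                " LIMIT %s",
                (user_id, batch_size),
            )
            rows = cur.fetchall()
            if not rows:
                break
            # Delete exactly the rows we selected, in a separate query
            # (select and delete are not combined into one statement).
            placeholders = ", ".join(["(%s, %s)"] * len(rows))
            params = [user_id] + [value for pair in rows for value in pair]
            cur.execute(
                "DELETE FROM watchlist"
                " WHERE wl_user = %s"
                f" AND (wl_namespace, wl_title) IN ({placeholders})",
                params,
            )
            master.commit()  # small transaction per batch
        wait_for_replication(replica_conns)
```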