[06:28:30] jynus: there are replags again now, I am not very amused
[06:28:46] it's better to separate as we suggested
[06:32:14] I do not see any of that
[06:34:36] I see heavy queries from Doc Taxon, however, as a potential cause
[06:39:37] lots of Title::purgeExpiredRestrictions on dewiki
[06:42:56] doctaxon, what does your bot do?
[06:43:41] now it has to protect some pages to autoconfirmed
[06:43:59] how fast?
[06:44:02] read and write of page content
[07:15:56] so, I do not believe you to be the root cause, but the api queries to dewiki from bot "Doc Taxon" are the ones causing lag
[07:16:40] separating wikidata would not have achieved anything
[07:16:43] in this case
[07:20:13] however, these queries were taking 15 seconds to execute, and the log shows more than 4 edits per second, which means the "not doing requests in concurrency" rule is being violated
[07:24:38] jynus: ah, sorry. But this was a one-time action
[07:24:53] and is stopped
[07:25:16] no problem, we should be able to support it, I've made changes
[07:25:18] but the lag was before this action, too
[07:25:34] there is currently no lag
[07:25:45] what changes?
[07:26:15] I've changed db1026 role, now it is not used for production traffic (no api or user requests go there)
[07:26:53] I've given more server power to problematic queries until they are fixed
[07:27:46] that doesn't mean that you shouldn't be careful when lag appears: if it does, your bot should sleep for an equivalent time, not continue processing requests
[07:28:06] and above all, not doing requests in concurrency
[07:31:55] okay thanks
[07:32:10] will you state it in the phab task?
[07:33:26] I am waiting to confirm it works
[07:33:44] I have created 2 tickets to track issues
[07:34:17] I will wait for those to be fixed and the server installation before considering it done
[07:35:09] If you reload https://phabricator.wikimedia.org/T135100 and see the subtasks you can see all of that
[08:02:30] volans, why did you choose db1049 as a master if I asked all to be slower than the slaves?
[08:04:15] ah, no, it is ok, it is only 64GB, but it may have faster storage
[08:05:08] that is probably why semisync is timing out so much there
[08:39:45] jynus: db1049 has 15k disks like db1026, just bigger AFAIK
[08:43:36] I am thinking of increasing the stall value again
[08:43:58] so I reduce parallelism
[09:56:34] doctaxon, see https://phabricator.wikimedia.org/T135471#2300457 for the root cause
[10:30:56] jynus: BotNinja? 10 edits per second? what is "s5"?
[10:32:30] s5 is the database group where dewiki and wikidata are
[10:32:53] it has been banned until he corrects that behaviour
[10:33:16] it was going over the recommended API limits
[10:34:56] once banned, lag issues disappeared: https://phab.wmfusercontent.org/file/data/pthjx3j4gp4g2ez3vh3s/PHID-FILE-kgp4i4tdcqpc5zzeqhxk/ninja.png
[10:35:17] it is 0 now
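
The advice given to the bot operators in this log (send requests one at a time, and sleep when lag appears instead of continuing) could look roughly like the Python sketch below. Nothing in it comes from the log itself: `run_serially`, `send`, and `min_wait` are hypothetical names, and `send` stands in for whatever function the bot uses to make a single API request. It is assumed to return `None` on success, or the lag in seconds when the server refuses the request because replication is lagged (for example, a bot might pass the MediaWiki API's `maxlag` parameter and read the lag from the resulting error).

```python
import time
from typing import Callable, Iterable, Optional


def run_serially(
    jobs: Iterable[str],
    send: Callable[[str], Optional[float]],
    sleep: Callable[[float], None] = time.sleep,
    min_wait: float = 5.0,
    max_retries: int = 5,
) -> int:
    """Send jobs strictly one after another and back off when lag is reported.

    `send` is a hypothetical callable: it performs one request and returns
    None on success, or the reported lag in seconds if the server rejected
    the request because of replication lag.
    """
    done = 0
    for job in jobs:
        for _attempt in range(max_retries):
            lag = send(job)  # only one request is ever in flight
            if lag is None:
                done += 1
                break
            # Sleep for at least as long as the reported lag before retrying.
            sleep(max(lag, min_wait))
        else:
            # Lag never cleared: stop instead of hammering the servers.
            raise RuntimeError(f"giving up on {job!r} after {max_retries} lagged attempts")
    return done
```

Because the loop never starts a request before the previous one has returned, a request that takes 15 seconds caps the bot at one request per 15 seconds, which is the reasoning behind the concurrency remark at 07:20:13. The `sleep` argument exists only so the waiting can be replaced in tests.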