[17:21:15] jynus: Have a second?
[17:22:53] yes, hoo
[17:24:48] I have queries like SELECT DISTINCT eu_entity_id FROM `wbc_entity_usage` WHERE eu_entity_id IN ('Q1005059','Q10865115','Q10885463','Q10924077','Q10930884','Q10938107','Q11181347','Q14582744','Q14583061','Q14583132','Q14583192','Q148','Q391942','Q42200');
[17:25:20] aha
[17:25:29] Sometimes they are "Using where; Using index for group-by" and sometimes they're not (depending on the host)
[17:25:46] If they don't use the index for the group by, stuff gets extremely slow
[17:25:51] timing out, even
[17:26:32] E.g. zhwiki on db1054 is problematic, zhwiki on db1060 isn't... different maria versions?
[17:27:16] only on zhwiki?
[17:27:34] No, but that one just came to my attention through the error logs
[17:27:38] seeing that on various wikis
[17:28:44] it could be the statistics
[17:29:18] the idea would be to run analyze on a non-production slave, then copy the index stats to the other hosts
[17:30:33] How much work would that be for $everything
[17:31:10] depends on the size of the table, it could take a day for our largest table
[17:31:54] most of it is checking that, after the statistics have been rebuilt, they produce the right output
[17:32:12] I think these tables have less than 10m rows on all wikis
[17:32:17] we recently found a case in which we needed to force the plan because mariadb was picking the wrong one
[17:32:25] might peak slightly above that on a few big ones, but probably not much
[17:34:32] both boxes have 10.0.16
[17:34:58] Yeah, just saw
[17:34:59] :S
[17:37:35] holy crap
[17:37:41] that table has 50M entries on zhwiki
[17:38:50] https://phabricator.wikimedia.org/P2222
[17:39:41] Yeah, that's what I saw, thanks
[17:41:28] I will open a task; I haven't treated it as an unbreak-now, and will investigate it
[17:41:45] Yes, please do
[17:42:07] but it will probably have to wait until monday, I am in the middle of something
[17:42:12] $ grep -c 'SELECT DISTINCT eu_entity_id' exception.log
[17:42:12] 20
[17:42:21] Those are the ones timing out, so not too many
[17:42:34] but I guess we still have several running annoyingly long
[17:44:42] is this api activity or regular user requests?
[17:45:26] is this for the wikidata item usage?
[17:45:45] Job queue only, I think
[17:45:59] so it's not slowing down actual web requests, I think
[17:46:07] so in that case, it can wait
[17:46:30] I have higher priorities right now, but I will check it on monday
[17:46:36] Ok, good
[17:47:00] do you have the function name, for the ticket?
[17:47:10] the one that is shown in the query comment?
[17:47:26] EntityUsageTable::getUsedEntityIdStrings
[17:47:35] thanks
[17:47:36] or Wikibase\Client\Usage\Sql\EntityUsageTable::getUsedEntityIdStrings if you want it qualified
[17:49:11] https://phabricator.wikimedia.org/T116404
[17:49:19] thanks for the heads up
[17:49:51] I normally monitor the slow queries, but this week I have been focused on pending schema changes, etc.
[17:49:56] *focusing
[17:52:15] I created a dashboard recently: https://logstash.wikimedia.org/#/dashboard/elasticsearch/Slow%20queries
[20:45:03] beta replication from deployment-db1 to deployment-db2 has broken
[20:45:34] I found it had stopped and tried to restart it, but without much luck
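
A minimal sketch of the diagnosis and remediation steps discussed for the eu_entity_id query above, assuming MariaDB 10.0 with persistent InnoDB statistics enabled; the table name and query come from the log, while the copy of index statistics via mysql.innodb_table_stats / mysql.innodb_index_stats and the final FLUSH TABLE are assumptions about how "copy the index stats to the other hosts" might be carried out:

    -- Compare the plan on the two hosts; the fast host shows
    -- "Using where; Using index for group-by" in the Extra column.
    EXPLAIN SELECT DISTINCT eu_entity_id
    FROM wbc_entity_usage
    WHERE eu_entity_id IN ('Q148', 'Q391942', 'Q42200');

    -- Rebuild the statistics on a depooled / non-production slave.
    ANALYZE TABLE wbc_entity_usage;

    -- Assumption: with persistent statistics, the rebuilt numbers live in
    -- mysql.innodb_table_stats and mysql.innodb_index_stats and can be dumped
    -- from the slave and loaded on the other hosts; FLUSH TABLE then makes
    -- the target host reload them.
    SELECT * FROM mysql.innodb_index_stats
    WHERE database_name = 'zhwiki' AND table_name = 'wbc_entity_usage';
    FLUSH TABLE wbc_entity_usage;

If refreshed statistics still do not produce the group-by plan, forcing the plan with a FORCE INDEX hint is the fallback mentioned in the log for an earlier case; the exact index name is not given here, so it is left out of the sketch.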