[06:50:43] fyi codfw wdqs-main has been having trouble keeping up for the last several hours. brian and i did some investigation and deployed a requestctl rule, it didn't seem to help though
[06:51:04] we should consider removing the rule tomorrow and trying another one if the issue still persists. for now i'm not touching anything further though
[08:33:10] update: cluster is happy at the moment
[08:33:59] cpu load is still notably 2x as high as eqiad, but we're processing updates consistently again so lag is at normal levels
[13:52:14] o/
[14:43:26] \o
[14:49:52] do we know anything about `listTaskCounts.php --topictype ores`? I'm not that familiar with GrowthExperiments, re T408052
[14:49:54] T408052: PHP Warning: Trying to access array offset on value of type null (via GrowthExperiments listTaskCounts) - https://phabricator.wikimedia.org/T408052
[14:55:42] not very familiar with what it does
[15:33:56] .o/
[16:19:49] we should be ready to merge the puppet patch to sync cirrus dumps from hdfs to the public host: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1184585
[16:46:13] dcausse: had some questions around whether our wdqs error rate metric includes throttled requests or not here, thought you might have some insight: https://phabricator.wikimedia.org/T393966#11325354
[16:47:09] ebernhardson: shipped it. anything specific i need to do as a post-deploy step?
[16:48:12] ryankemper: looking
[16:51:47] for T409070, i'm leaning towards some sort of version check, or maybe removing the regex start/end anchors from REL1_45. The thought is that it would be generous to let users upgrade sometime during 1.45, before 1.46, instead of forcing them to upgrade elastic and mediawiki at the same time
[16:51:48] T409070: Latest CirrusSearch is incompatible with ES7.10 and the corresponding WMF extra plugin - https://phabricator.wikimedia.org/T409070
[16:53:04] i don't know if we ever decided what exactly we want the dependencies to be in the public cirrus releases
[16:54:32] by version check you mean checking whether we're running elastic or opensearch? and disabling new plugins if running elastic?
[16:55:24] the current bug is an analysis component, so when we do the plugin check we could also do a version check. But i'd need to run cirrus against elastic to double check that's the only bit
[16:55:59] if it's only dropping the add_regex_start_end_anchors, that's easy, but there might be more
[16:56:00] perhaps an elastic version check with a warning could be fine, and some best effort attempt at building an analysis/mapping config that's compatible with elastic
[16:56:23] hmm, actually 1_44 says opensearch 1.3: https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/tree/REL1_44
[16:56:27] so maybe it is time to cut elastic?
[16:57:18] for some reason i was thinking we hadn't done a public opensearch release yet
[16:57:34] yes... worst case they could disable the extra plugin and run bare features
[16:58:26] or we give another release to transition but without strong guarantees that it'll work :/
[16:59:05] yea...i worry we haven't communicated well and that deciding now is a bit late. Maybe we support elastic for now, and update the readme to clearly state this is the last supported version for elastic, and that users must migrate to opensearch?
[16:59:13] Is updating the readme enough?
[16:59:32] at least a big warning should be added to the backend version check
[16:59:55] yea some warnings during maint scripts, maybe all of them as a header
[17:00:02] yes
[17:56:57] draft of the default sort A/B test on en, fr and he wikipedias: https://people.wikimedia.org/~dcausse/T404858-completion-default-sort-en-fr-he.html
[17:57:27] seems like a modest win
[18:02:18] had to increase bootstrap rounds from 1k to 2k to make hebrew wiki margins less random
[18:03:58] and also noticed that we might get empty state (using more like suggestions) click events into our satisfaction logs (esp. if you start searching, empty the search and click a more like suggestion)
[18:04:24] nice! reading
[18:09:04] interesting, indeed small improvement to success rate and no change elsewhere. Suggests perhaps it's solving a new class of queries that makes up a small %, without negatively affecting what it was already solving
[18:10:39] indeed
[18:28:01] dinner
[19:23:58] heh, the regex keyword in cirrus still has the groovy fallback. Maybe it's time to drop that.
[19:36:10] dcausse: nice report! I agree with Erik's analysis. Hopefully over the long run it will continue to improve as people realize it is there and become used to using it.
[21:31:27] stepping out for ~90 mins
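
Note on the 18:02:18 remark about bootstrap rounds: the log does not say how the report computes its margins, so the following is only a minimal sketch assuming a percentile bootstrap over per-session 0/1 success outcomes. The function names, sample sizes and success rates are all hypothetical and the data is synthetic. It shows why the reported interval bounds can jitter from run to run on a smaller sample, and why more resampling rounds typically reduce that jitter.

```python
import random
import statistics


def bootstrap_diff_ci(control, treatment, rounds, seed, alpha=0.05):
    """Percentile bootstrap interval for (treatment rate - control rate).

    Assumption: outcomes are 0/1 per session; this is not taken from the report.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(rounds):
        # Resample each bucket with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    diffs.sort()
    lo = diffs[int((alpha / 2) * rounds)]
    hi = diffs[int((1 - alpha / 2) * rounds) - 1]
    return lo, hi


def ci_bound_spread(control, treatment, rounds, repeats=5):
    """Std dev of the lower bound across repeated runs with different seeds."""
    lows = [
        bootstrap_diff_ci(control, treatment, rounds, seed)[0]
        for seed in range(repeats)
    ]
    return statistics.stdev(lows)


# Synthetic stand-in for a low-traffic wiki (hypothetical rates and sizes).
data_rng = random.Random(0)
control = [1 if data_rng.random() < 0.70 else 0 for _ in range(200)]
treatment = [1 if data_rng.random() < 0.73 else 0 for _ in range(200)]

if __name__ == "__main__":
    for rounds in (1000, 2000):
        spread = ci_bound_spread(control, treatment, rounds)
        print(f"rounds={rounds}: spread of lower bound across runs = {spread:.5f}")
```

More rounds only reduce the resampling noise in the interval endpoints. They do not narrow the interval itself, which depends on how many sessions the wiki contributed.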