[04:24:24] <ryankemper>	 We should add some further retry logic to our rolling operation elasticsearch cookbook. Most common failure scenario is `elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='search.svc.eqiad.wmnet', port=9243): Read timed out. (read timeout=60))` which seems like a pretty easy error to detect and retry a few times on
[09:14:47] <pfischer>	 dcausse: you were right about the failing rdf-spark-tools tests. After replacing the constructor calls with a builder chain, it compiles again.
[09:41:40] <gehel>	 weekly update: https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2024-06-07
[09:43:11] <gehel>	 ryankemper: what operation causes timeouts? 60 seconds seems pretty long already
[09:44:26] <gehel>	 We should still implement retries, but I'm wondering if we have an underlying issue
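A minimal sketch of the retry idea discussed above, not the cookbook's actual code. The helper name `call_with_retries` and its parameters are hypothetical; the built-in `TimeoutError` stands in for the exception to retry on so the example runs without extra dependencies.

```python
import time


def call_with_retries(operation, retry_on=(TimeoutError,), attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call operation(); on a listed exception, wait and try again.

    Hypothetical helper: the names and defaults here are assumptions,
    not taken from the rolling-operation cookbook.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In the cookbook one would presumably pass `retry_on=(elasticsearch.exceptions.ConnectionTimeout,)`, the exception named in the traceback above. Retrying only masks the symptom, so the underlying cause of the 60-second timeouts is still worth tracking down.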
[10:13:58] <dcausse>	 lunch
[12:52:03] <dcausse>	 hm... was about to re-deploy the cirrus-streaming-updater to staging (for https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1039727) but I realize that that might enable the sanitizer there
[12:52:10] <dcausse>	 not sure we want it there...
[13:00:59] <dcausse>	 pfischer: if you're around https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1040151
[13:11:38] <inflatador>	 <o/
[13:11:43] <dcausse>	 o/
[13:52:04] <inflatador>	 dropping off my son, back in ~20
[14:24:30] <inflatador>	 back
[14:33:13] <ebernhardson>	 \o
[14:33:24] <pfischer>	 dcausse: I noticed that yesterday, already set up a patch: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1039736
[14:35:20] <dcausse>	 pfischer: oh thanks!
[14:35:22] <dcausse>	 o/
[14:35:55] <dcausse>	 will get that deployed to upgrade staging
[14:36:08] <inflatador>	 gotta drop off the other kid ;) . Back in ~30
[14:55:04] <ebernhardson>	 hmm, in the mediawiki page_change events for the prior state of a page move, do we think it should always include the namespace id or only if it changed?
[14:55:18] <ebernhardson>	 Currently only the page title is included there, but we need the old namespace id to know if it moved between indices
[14:56:10] <ebernhardson>	 seems like it should simply always be there for consistency, allow consumer to compare .page.namespace_id against .prior_state.page.namespace_id
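A small sketch of the consumer-side comparison described above, assuming the event is a plain dict and that `prior_state.page.namespace_id` is present once the proposed change lands. The function name `moved_between_namespaces` is hypothetical.

```python
def moved_between_namespaces(event):
    """Compare .page.namespace_id against .prior_state.page.namespace_id.

    Returns True or False when both ids are known, and None when the
    prior namespace id is missing (the situation described above, where
    only the prior page title is included).
    """
    prior_page = (event.get("prior_state") or {}).get("page") or {}
    old_ns = prior_page.get("namespace_id")
    if old_ns is None:
        return None  # cannot tell whether the page changed namespaces
    return old_ns != event["page"]["namespace_id"]
```

Always emitting the prior namespace id would let a consumer drop the `None` branch entirely.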
[14:57:32] <dcausse>	 yes, makes to me
[14:57:36] <dcausse>	 *sense
[14:59:19] <gehel>	 dcausse, pfischer: last minute, but if you want to chat about Search and languages, feel free to join
[14:59:41] <gehel>	 I just sent you the invite
[15:01:10] <inflatador>	 back
[15:44:47] <pfischer>	 gehel: I’m sorry, I was AFK
[15:46:09] <pfischer>	 BTW: looks like rate-limiting via envoy is now ready https://phabricator.wikimedia.org/T362310#9870761 - shall we enable this in general or only for backfill setup?
[15:49:40] <inflatador>	 heading back home, be there in ~30
[16:09:21] <dcausse>	 pfischer: no objections to enable it everywhere but is the pipeline ready to slow down on 429 or will we fail more events?
[16:12:29] <dcausse>	 going offline, have a nice week-end
[16:16:12] <inflatador>	 back
[16:17:14] <ebernhardson>	 hmm, deciding what counts as language support is not easy...in a way glad they asked for binary.  I was thinking binary isn't specific enough, but then choosing an appropriate divider is hard 
[17:07:45] <inflatador>	 lunch, back in ~40
[17:21:28] * ebernhardson tries turning off FSLockManager on cindy...seems like many of the tests are failing on upload due to it
[17:56:11] <pfischer>	 ebernhardson: I just noticed, there are two PRs for sending a distinct user agent with requests originating from the SUP: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests
[18:00:35] <pfischer>	 Looks like yours is offering greater flexibility, I'll discard mine.
[18:02:20] <ebernhardson>	 pfischer: i just saw that as well, not sure how i missed yours was already in MR
[18:15:29] <inflatador>	 back, working from my 3rd venue today! It's a new record!
[18:27:34] <ebernhardson>	 lol
[18:37:33] <inflatador>	 too many summer kids' activities ;)
[18:40:20] * ebernhardson realizes while looking at this that event page titles are namespace prefixed, and our redirect update handling isn't stripping them
[18:43:19] <pfischer>	 ebernhardson: which redirect handling? SUP or cirrus?
[19:21:18] <ryankemper>	 gehel: It seems to be the flushing markers causing the timeout
[19:21:28] <ryankemper>	 https://www.irccloud.com/pastebin/qXmZPRTW/stack_trace.log
[20:15:34] <ebernhardson>	 pfischer: in SUP
[20:16:29] <ebernhardson>	 pfischer: the redirect_page_link field is a prefixed db key, so namespace and underscores, but we load it into the TargetDocument.pageTitle which is mostly unused, except in the case of add/remove redirects
[20:19:22] <ebernhardson>	 for extra fun, `Kill Bill: volume 1` and such things have :, but not to delimit the namespace. But as long as we get the namespace id we can then strip when ns > 0
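A sketch of the stripping rule described above, assuming the namespace id is available alongside the prefixed db key. It is written in Python for illustration only, and the function name `strip_namespace_prefix` is hypothetical, not the SUP's actual code.

```python
def strip_namespace_prefix(prefixed_db_key, namespace_id):
    """Drop the leading "Namespace:" part, but only when ns > 0.

    Main-namespace titles (ns == 0) carry no prefix, so a colon in them
    belongs to the title itself and must be left alone.
    """
    if namespace_id > 0:
        _prefix, sep, rest = prefixed_db_key.partition(":")
        if sep:  # a colon was found, so everything before it is the prefix
            return rest
    return prefixed_db_key
```

For example, `strip_namespace_prefix("Talk:Kill_Bill:_volume_1", 1)` splits only on the first colon and keeps the rest of the title, while `strip_namespace_prefix("Kill_Bill:_volume_1", 0)` returns the key unchanged.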