[03:17:08] Amir1: Does your -1 from https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/497537/ still apply?
[03:18:14] Looks like Wikibase is now passing at https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/497443/
[03:19:36] I didn't quite get what you meant about the ephemeral state. As far as I know, all state is committed by the time the deferred update runs; there may indeed be some state held in in-process caches as an optimisation, which will be reloaded. Note that this only applies if the update crashed with a fatal error. The plan is that, rather than leaving it unresolved after such a crash, we queue a job that can retry a few more times.
[03:19:51] The most common reason for such failures is lock timeouts from LinksUpdate-related code.
[11:53:53] Krinkle: I can't check for sure, let me do it
[11:55:19] So as far as I can see, the changes on top of my comment are only a rebase
[11:59:51] The whole class is super complicated, I don't understand most of it, so take whatever I say with a grain of salt. The issue I encountered was that DeferredUpdates has an 'enqueue' mode: "if ( $mode === 'enqueue' && $du instanceof EnqueueableDataUpdate ) {" (line 226). That goes to self::jobify and reduces the RefreshSecondaryDataUpdate to a job (the refreshLinksPrioritized), losing all other deferred update work.
[12:01:05] HTH
[12:01:46] RefreshSecondaryDataUpdate has all of the deferred updates, but when it goes to jobify mode, it loses everything except the refresh links
[17:24:39] Amir1: ok, I recall the job being a superset, not a subset. I'd agree it's a blocker if it's a subset. Thanks
[23:02:53] Pchelolo: Are you aware of something still using RunJobs? (as opposed to RunSingleJob) Looks like the perf profiler still receives a small handful of stack samples from that entry point, which seems suspicious.
[23:04:01] Krinkle: one suspect might be wikitech as it has different jobqueue configuration..
other than that, nothing comes to mind
[23:06:22] also a quick search reveals that there's still some monitoring pointing to rpc/RunJobs.php https://gerrit.wikimedia.org/g/operations/puppet/+/c8f058b5ee8f554b13f82855e1264e9921428067/modules/profile/manifests/mediawiki/jobrunner_tls.pp
[23:08:20] we should probably change that to something else.. and probably drop RunJobs.php entirely. I'll make a ticket
[23:10:20] wikitech seems to be using maintenance/runJobs.php from CLI
[23:10:27] at least in puppet
[23:10:58] only ref in puppet to rpc/RunJobs.php is this icinga check
[23:10:58] https://github.com/wikimedia/puppet/blob/d7d212ab76e/modules/profile/manifests/mediawiki/jobrunner_tls.pp#L27
[23:11:01] maybe that's it?
[23:14:23] yeah, I can't find anything else..
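
Editor's sketch of the subset-versus-superset concern raised at 11:59:51 and 17:24:39. This is a toy Python model, not MediaWiki code: the class names, the method names, and the step names other than the link refresh are hypothetical and chosen only for illustration. It shows why converting a bundled update into a job is only safe as a retry path if the job covers at least as much work as the update it replaces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    """Hypothetical stand-in for a queued job: a name plus the work it covers."""
    name: str
    steps: List[str]


@dataclass
class SecondaryDataUpdate:
    """Toy model of an update that bundles several pieces of deferred work.

    The step names below are illustrative assumptions, not MediaWiki's actual list.
    """
    steps: List[str] = field(
        default_factory=lambda: ["refresh_links", "update_search_index", "purge_cache"]
    )

    def as_job_subset(self) -> Job:
        # The scenario described at 11:59:51: the job only refreshes links.
        return Job("refreshLinksPrioritized", ["refresh_links"])

    def as_job_superset(self) -> Job:
        # The scenario recalled at 17:24:39: the job covers everything.
        return Job("refreshLinksPrioritized", list(self.steps))


def lost_work(update: SecondaryDataUpdate, job: Job) -> List[str]:
    """Return the steps of the update that the job would not perform."""
    return [step for step in update.steps if step not in job.steps]


if __name__ == "__main__":
    update = SecondaryDataUpdate()
    print("subset job loses:", lost_work(update, update.as_job_subset()))
    print("superset job loses:", lost_work(update, update.as_job_superset()))
```

If the job is a subset, every non-empty result from `lost_work` is work that silently disappears when the update is enqueued instead of run directly, which is the blocker described in the log.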