[01:28:42] does MediaWiki sample statsd metrics like ->increment() ?
[01:30:00] Krinkle: ^ I assume you'd know
[01:30:58] I'm looking at https://grafana.wikimedia.org/d/ZIvCK9EMz/globalwatchlist?orgId=1&from=now-30d&to=now and also don't understand why there are fractional numbers there
[01:31:23] it's just `$this->statsdDataFactory->increment( 'globalwatchlist.load_special_page' );`
[01:37:24] legoktm: no, we do not. not anywhere in core or prod
[01:38:08] a good graph will use .rate or .sample_rate and not any of the other (unreliable) fields. https://wikitech.wikimedia.org/wiki/Graphite#Extended_properties
[01:38:34] the one caveat with rate is that it is per second, whereas we buffer out from statsd to graphite once a minute
[01:38:38] so fractional is expected
[01:38:57] 1 data point in a given minute will be a 1/60 rate
[01:39:17] use scale(60) if you prefer per-minute numbers (which will generally be whole numbers)
[01:39:43] (also document in the axis label what the unit is, etc.)
[01:42:06] I mostly just want to know the total count, not too fussed about the exact timing
[01:42:10] * legoktm reads the link
[01:51:58] Krinkle: so on https://grafana-rw.wikimedia.org/d/ZIvCK9EMz/globalwatchlist?viewPanel=4&orgId=1&from=now-7d&to=now&forceLogin=true I updated it to use .rate + scale(3600) and labeled it as "loads per hour", is that accurate?
[01:52:50] legoktm: hm.. not exactly, this is giving you the per-minute average of how many there would be per hour if the rate were constant
[01:53:18] given the per-minute resolution, that might be confusing
[01:54:30] the default for this kind of metric is to plot it as .rate labelled per second, or .rate | scale(60) and labelled per minute
[01:54:57] scaling it further tends to be confusing I think, since it's still a data point per minute, just inflated with a different label
[01:55:06] hm
[01:55:59] okay, switched it to scale(60) / per minute
[01:56:54] I'd also set a Y-min of 0, use bars, and treat null as zero
[01:57:21] since here nulls/absent are essentially zero; it's just that we don't push any data (in prometheus, the pull would get real zeros)
[01:57:44] right now it is cutting off the bottom of the graph, since "no data" will be zero
[01:58:42] lgtm :)
[01:59:15] on the "total" panel, for example, when you hover somewhere in the large empty space, the tooltip shows the closest data from a few days earlier
[01:59:34] although that may be fine there
[01:59:53] but you'd want to avoid that with the more regular rate stuff, and actually show zero etc.
[10:26:55] addshore: to reply to your question from yesterday: I don't even know which API. All I have is the screenshot produced by selenium. That's why I'm asking.
[11:10:21] duesen: can you link me to the code patch again? i'll take a look
[12:47:08] addshore: the patch is here: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/670570
[12:47:08] I suppose the issue will be easy to find once I know what API modules are involved and why.
[17:51:05] Does anyone know how long it took Wikimedia to get a PhotoDNA key?
[18:07:15] RhinosF1: Cindy might, but she seems to have vanished from irc into Slack chatrooms :/
[18:09:32] bd808: is it supposed to take like over 6 months?
[18:10:12] the task was written over a year ago, so yeah, maybe (T247977)
[18:10:13] T247977: Implement Hash Checking of Media Files - https://phabricator.wikimedia.org/T247977
[18:14:13] RhinosF1: I found https://phabricator.wikimedia.org/T246206#5932214 which I think confirms that Cindy is the person who may be able to answer your question.
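To make the rate arithmetic from the 01:38–01:39 messages concrete: statsd flushes to Graphite once a minute, but the .rate field is per second, so a single increment() in a minute shows up as a fraction. The sketch below is only a worked illustration of that math, with made-up variable names; it is not actual MediaWiki or Graphite code.

```php
<?php
// One increment() in a one-minute flush window, expressed as a per-second rate.
$eventsThisMinute = 1;

$ratePerSecond = $eventsThisMinute / 60;  // ~0.0167: the fractional value on the graph
$perMinute     = $ratePerSecond * 60;     // scale(60): whole per-minute counts again
$perHour       = $ratePerSecond * 3600;   // scale(3600): per-minute average of an hourly pace

printf( "rate=%.4f/s, %.0f/min, %.0f/hr\n", $ratePerSecond, $perMinute, $perHour );
```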
[18:14:45] bd808: email best?
[18:15:07] probably, yeah
[18:15:40] she's also on irc, but not in this channel
[18:15:57] she's in wikimedia-cpt though
[18:16:33] my /whowas CindyCicaleseWMF gave no results, but awesome!
[18:16:58] heh
[19:30:46] TimStarling: AaronSchulz: I'm seeing some JobQueue jobs failing due to "The critical section …rdbms… timed out after 180 seconds". I can't tell if this is intentional or not, since in wmf-config we set higher limits for jobs and POST.
[19:31:15] it seems there is a separate limit, wgCriticalSectionTimeLimit, which defaults to 180 and is unmodified.
[19:31:46] naively, I was thinking these would have the same limit as the main one, just interrupted later instead of immediately
[19:32:47] there are also transactionprofiler limits for queries, which is yet another thing
[21:33:17] duesen: https://github.com/wikimedia/Wikibase/blob/master/client/data-bridge/src/data-access/ApiPageEditPermissionErrorsRepository.ts#L41-L49
[21:33:52] it does that api call twice, once for the page on the client, and once for the entity page on the repo (I believe)
[21:51:33] I talked with AaronSchulz about maybe making $wgCriticalSectionTimeLimit infinite by default, but we didn't quite get to the end of that conversation
[21:51:52] I think it would be fine to do that, but Aaron seemed noncommittal
[22:03:04] addshore: thanks for digging that up!
[22:09:23] the script will get killed by something at some point eventually... I guess the idea was to at least give MW a chance to handle it and shut down. Maybe it can be raised for the job queue, though.
[22:11:04] Krinkle: which jobs?
[22:36:56] AaronSchulz: I think they were upload jobs, but I think that's an aside.
[22:37:08] actually, thinking about it some more, I don't understand why it exists at all as a configurable time limit
[22:37:25] I assume under no circumstances do we interrupt a critical section, right?
[22:37:35] so what does it actually control?
[22:38:54] I mean, why would we want a generic, indiscriminate time limit specifically for use around "critical sections", different from the general execution timeout (neither of which is process-killing anyway, so shutdown should be fine either way)?
[22:45:10] I can imagine a use case for wanting a non-global time limit over a closure, but in my mind those (theoretical) use cases would be non-critical, e.g. where you'd want it to stop early. For example, invoking the parser for an interface message with up to ~1 second of wall time allotted. I'm trying to think of when you'd want a sequence to run uninterrupted but then end with an exception if it took longer than a certain amount of time (that'd be easier to implement on your own as well with startTime-endTime, without this library).
[23:27:09] looking at selenium flakiness -- there's a task open since 2019 about the test that's failing
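As a footnote to the 22:45 message: the "implement it on your own with startTime-endTime" alternative could look roughly like the sketch below. runTimedSection is a hypothetical helper, not an existing MediaWiki API; it just demonstrates the semantics being questioned there, namely running a section uninterrupted and only failing after the fact if it overran its budget.

```php
<?php
// Hypothetical sketch of the startTime/endTime alternative mentioned at 22:45:
// never interrupt the section mid-flight; throw only afterwards if it exceeded
// its time limit (the behavior implied for $wgCriticalSectionTimeLimit).
function runTimedSection( callable $fn, float $limitSeconds ) {
	$start = microtime( true );
	$result = $fn(); // runs to completion, uninterrupted
	$elapsed = microtime( true ) - $start;
	if ( $elapsed > $limitSeconds ) {
		throw new RuntimeException(
			"Section took {$elapsed}s, exceeding the {$limitSeconds}s limit"
		);
	}
	return $result;
}

// e.g. runTimedSection( static function () { /* critical work */ }, 180.0 );
```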