[19:56:31] hey all, Research Showcase (“Evolution of privacy loss in Wikipedia”) is starting in 5 - streaming: https://www.youtube.com/watch?v=Xle0oOFCNnk - discussion in #wikimedia-research [20:58:11] ArchCom-RFC meeting coming up in a couple of minutes [20:59:58] https://phabricator.wikimedia.org/E148 [20:59:59] woohoo [21:00:22] #startmeeting E148 [21:00:22] Meeting started Wed Mar 16 21:00:22 2016 UTC and is due to finish in 60 minutes. The chair is robla. Information about MeetBot at http://wiki.debian.org/MeetBot. [21:00:22] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [21:00:22] The meeting name has been set to 'e148' [21:00:53] #topic wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ | https://phabricator.wikimedia.org/E148 [21:01:07] hi folks! [21:01:19] hi [21:01:31] hey [21:01:56] #link https://phabricator.wikimedia.org/T66214 [21:02:01] ok so last we saw, gwicke made some updates to the task description on https://phabricator.wikimedia.org/T66214 [21:02:02] :D [21:02:16] o/ [21:03:01] yeah, I summarized use cases and requirements separately [21:03:14] based on last week's discussion and the follow-up on the task [21:03:24] so the big update is a clearer split between the URL-format structure and the use of hashes as the file id [21:03:53] i think one major question still open is, where we add extra parameters (page, time, language, etc) is the format sufficiently clearly specified? [21:03:58] today I would like to ask you for input on those in particular, so that we can establish what we are trying to solve before exploring solutions further [21:04:02] o/ [21:04:06] and what are implications for extensibility and caching/uniqueness of URLs [21:04:13] #info topic discussed: where we add extra parameters (page, time, language, etc) is the format sufficiently clearly specified? [21:05:04] and we should probably also cover an implementation note; eg on one-off mediawiki installs without a lot of extra services, do we route all image URLs through a PHP router similar to img_auth.php / thumb.php ? [21:05:26] (on large installs we'd tend to customize that thorugh a heavy varnish infrastructure of course) [21:05:28] I would like to focus on use cases & requirements first [21:05:31] ok [21:05:47] #info to discuss in future: implementation details for small/large installs (backlog) [21:05:48] hi all! [21:06:00] * robla notes we'll probably wrap up this half of the conversation at around 21:20 UTC (ish) [21:06:18] legoktm: are you raising your hand to ask question or just waving? :D [21:06:28] waving :) [21:06:31] ok :) [21:06:39] one use case that is not currently listed is client-side selection of complex transformations [21:06:52] like cropping, rotation, sharpening etc [21:07:14] *nod* if we have clear infrastructure for parameters then they could be stashed in there fairly easily [21:07:21] but they may have performance implications [21:07:24] and certainly caching implications [21:07:39] we've had questions in past about arbitrary server-side scaling after all [21:07:43] if we add to that it can get scary [21:08:00] first step is parameter ordering [21:08:26] yeah, there is a trade-off there, and this RFC is primarily focused on the common case, possible at the cost of some convenience for the rare power user case [21:08:42] if one client sks for /hash;rotate=30;p=22/320 and another asks for /hash;p=22;rotate=30/320, do we accept them both? are they stored separately? [21:08:57] as you observed earlier in the discussion, many of those more advanced options tend to be specified server-side [21:09:13] * brion hmmmms [21:09:30] through thumb syntax, for example [21:09:31] one possibility which is quite different from our current model is to embed the additional rendering parameters into its own server-side object, and reference it by id [21:09:51] downside: client can't create a new parameter set except by going through the server [21:10:02] upside: client doesn't have to worry about formatting/details of parameter set [21:10:17] yeah, and the server can be sure that those parameters make sense [21:10:32] (compare with, say, how we do rendering which is actually similar) [21:10:55] the main parameters i see required from arbitrary client stuff tend to be: [21:10:59] 1) size/scaling factor [21:11:13] the current proposal has a compromise that still allows advanced clients to build up options, but the requirements on those clients are quite high [21:11:16] 2) (potentially) squared-off thumbs/manual centering [21:11:28] and 3) (for media viewer) switching between pages [21:11:53] 2) could be seen as a subset of format / quality selection [21:12:18] yeah, that's probably distinct from 'arbitrary crop for art direction' [21:13:27] 3) is more advanced; most clients don't need this, but the feature needs to be there for advanced ones [21:13:36] similarly, selecting offsets of video thumbs [21:13:47] yeah, those are quite similar in principle [21:14:45] things like SVG language feel more like a _context_ than an _option_ [21:15:04] by default you'd pass in a default rendering context of 'in the language of this page/wiki', but it could be overridden [21:15:27] so, do you think it would make sense to more clearly state the priorities wrt power / features vs. caching / simplicity of use for the common use cases? [21:15:28] perhaps that's a good concept to distinguish between presentation attributes (size/page) and lower-level attributes [21:15:43] yeah we're making some tradeoffs so let's be explicit on that [21:16:08] * gwicke wonders if anybody else is following along [21:16:41] okay, I'll add some wording about the trade-off later [21:16:45] * robla and Roan were just talking about gwicke's wondering ;-) [21:16:47] they're all waiting for part 2 i guess ;) [21:16:49] * DanielK_WMDE__ is reading along [21:16:59] * robla is trying to keep up as well [21:17:21] so, requirements: I think the most prominent / limiting one is probably the desire to make URLs deterministic [21:17:29] to avoid cache fragmentation [21:17:35] program note: should we wrap this up at 21:20 UTC? [21:17:56] yes, 1) avoid cache fragmentation while retaining 2) easy parsing/reconstruction of the client-relevant parameters on the client [21:18:04] robla: sounds good [21:18:13] this means that optional parameters need to be serialized in a deterministic order, which means that regular query strings are problematic [21:18:21] yep [21:18:49] deterministic order for parameters means either we enforce serialization order of a hashmap or we order them such as /in/separate/path/components/ [21:19:01] map is more readable and extensible [21:19:05] the current syntax proposal is based on a similar spec / practice for video thumbs, but I'm open to suggestions on that [21:19:12] gwicke: i'd make out-of-order parameters trigger a redirect to the url with the canincial ordering [21:19:13] path components don't scale well past a couple of things (such as the base size) [21:19:43] DanielK_WMDE__: redirects are expensive, so they should be the exception [21:19:50] #info avoiding cache fragmentation will require normalization of parameters (deterministic serialization of hashmap or other ordering) [21:19:57] and, the canonicalization would need to happen in Varnish [21:20:24] #info daniel suggests having the failed-validation case be a redirect, gwicke warns they're expensive [21:20:27] while not impossible, parsing & normalizing query parameters is not currently supported / implemented [21:20:43] #info normalizing parameters within varnish would be painful [21:20:53] A proper routing layer (dispatcher) can normalize internally. Can Varnish do the same (e.g. split, sort, dispatch on the cannonical) [21:20:57] ok we ready to wrap this up or a few more minutes? :) [21:21:17] sounds like we're ready to move on [21:21:27] I think this has already been quite useful, and it's a good point to move on [21:21:33] okee doke [21:21:35] #topic T78639: How to address the long tail of low priority tasks in active projects [21:21:36] bd808: let's follow up on varnish implementation details later :) [21:21:36] T78639: How to address the long tail of low priority tasks in active projects - https://phabricator.wikimedia.org/T78639 [21:22:03] bd808: it would require some custom C code [21:22:15] #topic Wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ | #topic T78639: How to address the long tail of low priority tasks in active projects [21:22:16] T78639: How to address the long tail of low priority tasks in active projects - https://phabricator.wikimedia.org/T78639 [21:22:48] #link https://phabricator.wikimedia.org/T101686 [21:22:57] #link https://phabricator.wikimedia.org/T114419 [21:23:04] andre__: you around? [21:23:15] robla, yepp [21:23:48] what's the most important question to answer today? [21:24:26] If we talk about code review: Does the "Blocked By" list in https://phabricator.wikimedia.org/T101686 make sense and is something we can agree on. [21:24:44] "how do I get my core patches reviewd in a timely manner"? [21:25:07] are we talking about code review or low priority tasks? [21:25:17] DanielK_WMDE__: key word in my question: _today_ ;-) [21:25:30] That's also my question a bit. As I see https://phabricator.wikimedia.org/T78639 as being about tasks only. See my comment in https://phabricator.wikimedia.org/T78639#1910641 [21:25:41] andre__: if I understand you right, the problem is that we don't clearly signal when something is not wanted / a priority? [21:25:42] robla: "what are some ideas of how to get my core patches reviewed in a timely manner"? [21:25:50] (learned that trick on quora) [21:27:15] legoktm: my intention was to have a discussion focussed on code review, but the task we originally linekd is about low prio tasks, which I take includes the review of volunteer patches, among other things [21:27:23] the scope of this discussion is indeed a bit unclear... [21:27:34] I'm not sure if the proposal is to change prioritization towards third party contributions, or just to deal more efficiently with them by rejecting unwanted ones quickly [21:28:08] #info question discussed: what is the scope of this RFC? [21:28:09] gwicke: Is "something" a task, or a patch, or both? [21:28:22] as i said a few hours ago on the mailing list, another option to scale batter is to motivate people that don't have +2 to do +1 reviews, so people with +2 rights can concentrate on patches that have already been pre-screened. [21:28:46] andre__: both are about some feature / change / project, and ultimately that's what we prioritize [21:29:01] I'd love to see an attitude that all patch contributors are on equal footing ("we are all remote") rather than "3rd party contributors" [21:29:11] what does +1 mean anyway? [21:29:12] Reviewer has a working mouse. [21:29:19] is there really a problem of not reviewing completed patches? One issue i see is we often based this off stats like how long a patch lives in gerrit, but the different between an uploaded patch and something that's ready for review and merge is often quite large [21:29:25] gwicke: a lot of such patches are not "unwanted", they are just uneconomic: it takes mroe effor to fix them than to write the code yourself. of yourse, if you do that, you miss the chance to foster a new contributor. [21:29:55] ebernhardson: yes, there is a massive problems. even with patches by established contributors. [21:29:57] many people seem to treat gerrit as remote git branch for WIP's [21:30:18] DanielK_WMDE__: some also just conflict with other goals, or implement explicit non-goals [21:30:18] as long as those patchsets have a [WIP] prefix I don't mind [21:30:40] because we exclude [WIP] prefixed patchsets from the statistics on korma.wmflabs.org, like http://korma.wmflabs.org/browser/gerrit_review_queue.html [21:30:56] DanielK_WMDE__: I agree though that most fall somewhere into a grey area in between [21:30:59] ebernhardson: even if it's WIP, feedback can be useful. If no feedback is wanted, no reviewers should be addded (and ideally, the change should be marked as a draft) [21:31:31] drafts are annoying because they're not public [21:31:36] indeed. [21:31:38] andre__: could you clarify which problem you are mainly trying to solve? [21:31:39] gwicke: that'S true, but the ones that do go against defined goals can (and should!) quickly get a -2. [21:31:41] A "WIP = do not review now" flag could help. [21:31:47] i don't think that is the problematic category [21:32:26] yeah, the grey area is harder to deal with, and unfortunately also quite populated [21:32:34] gwicke: I didn't know about this meeting until 5 hours ago. https://phabricator.wikimedia.org/T101686 is about fixing our code review processes. https://phabricator.wikimedia.org/T78639 is about our queue of low priority tasks in Phabricator. [21:32:51] Purodha: we could try to establish a few markers, and define their semantics a bit better. I often *do* want feedback on a WIP [21:32:52] Both tasks have some potential action items listed in the tasks. [21:33:19] andre__: looking at https://phabricator.wikimedia.org/T78639, I'm still not entirely sure what the problem is you are trying to solve [21:33:48] DanielK_WMDE__: I'd appreciate that [21:34:00] you state that the backlog grows, but I see little on why that is a bad thing, and why we should change it [21:34:10] +1 to gwicke [21:34:12] gwicke: Under "Summary" in https://phabricator.wikimedia.org/T78639#1910648 I outline what I see as possible items. [21:34:27] andre__: do you think https://phabricator.wikimedia.org/T129952 is related/should be linked? [21:34:41] gwicke: where exactly do I state that a growing backlog is a problem? [21:34:44] gwicke: we are losing potential contributors, because they get frustrated. We let good work go to waste, by letting patches grow stale. [21:34:57] https://www.mediawiki.org/wiki/User:MZMcBride/Bugs#Philosophy [21:35:35] DanielK_WMDE__: so the thesis is that we'd lose less contributors by being swift & blunt? [21:35:53] gwicke: Correcting myself: Regarding tasks, I don't see a problem with a growing backlog. A growing patch backlog though is a problem I'd say. [21:36:02] gwicke: swift and blunt, but also helpful. [21:36:28] gwicke: a quick negative response may scare away some, but the chance that the right people stay is higher than when we let stuff just sit & rot. [21:36:32] almost all of my likely) stale patches are owed to extension maintainers not responding [21:37:05] but as I said before: it'S not just about "us". I think we should try to get the many many inactive people on gerrit to review. Want people to review your patch? review three patches. Want +2 rights? review many patches. [21:37:10] gwicke: my personal thesis is that we lose contributors because of many factor and reasons, listed under "Contents" on https://www.mediawiki.org/wiki/User:AKlapper_(WMF)/Code_Review [21:37:29] ...got +2 rights, but have your dashboard flooded? Set it to only show things that already have +1 [21:37:42] DanielK_WMDE__: where "review" doesn't mean just "+1" but actual reviews (comments with suggestions etc) [21:38:19] greg-g: yes, of course. Ideally, we'd have a meta-moderation feature, where you can vote comments/reviews as "helpful". [21:38:35] but even without that - the informal reputation will grow with *actual* reviews, not button pushes [21:38:47] right right [21:38:54] When I see a +1 review, it usually means "this looks good, but I'm too scared to merge it" [21:39:05] legoktm: depends on the person giving it, of course [21:39:05] or, if things go badly: spam me with senseless +1s, and i get you kicked from gerrit... [21:39:13] greg-g: sure [21:39:22] I agree that more timely patch feedback / decisions would be helpful for third party contributors [21:39:49] legoktm: yes. which means "escalate it to someone who feels comfortable with it". But most +1 reviews are see are by people who actually have +2 rights. [21:40:01] legoktm: I've been using +1 along with comments to say things like "I'd merge this but I'll wait for others to give feedback" [21:40:30] DanielK_WMDE__: that's a really good point. we should try to help establish some norms around +1, rather than assuming the "working mouse" theory [21:40:34] gwicke: to achieve this, we need a process that scales better. Building a hierarchie of tiers (formal or not) should help with that. [21:41:06] maybe have some tools to allow patches needing review to get to the hands of people willing to review [21:41:09] bd808: i do that sometimes. i usually add "poke me if nothing happens for a few days" [21:41:14] two contributing factors I see to reviews being slow / timid: 1) lack of faith in our current test coverage, especially in core, and 2) lack of clarity on goals & non-goals [21:41:19] e.g. if I wanted to help now, I'd have little idea where to start [21:42:08] legoktm: I almst everywhere cannot +2 and my +1 usually means "I recommend to merge it" [21:42:16] andre__: one thing that might help is a smaller RFC/document/whatever on standards for +1. should be concise [21:42:40] "you have +1. this is a _big deal_" [21:42:42] Does a "+1" concept exist in Differential? I think the only options there are "accept" and "request changes" [21:42:53] Purodha: that's actually who I think +1 should be used. [21:43:09] and if it is used that way consistently, people with +2 rights can focus on patches that already have a +1 [21:43:20] that would create a two pier system. i think it would help [21:43:36] robla: standards for +1 sounds like in scope of https://phabricator.wikimedia.org/T129067 . Though +1 is an implementation detail for Gerrit (and won't be in a Differential world) [21:43:46] bd808: ugh, that would be bad :/ [21:44:02] +1 is often used as "good idea, but I didn't have the time to actually review in detail" [21:44:03] bd808: DanielK_WMDE__ comments of course, which we want, just not "+1" [21:44:09] bd808: but good point. do we want/need +1? is this a blocker for the move to differential? [21:44:09] An "accept" in differential isn't an automatic merge either [21:44:12] what could be used instead? [21:44:18] DanielK_WMDE__: comments [21:44:30] "accept" is "code review is done. go merge it yourself" [21:44:36] greg-g: if they can be queried consistently, sure [21:44:38] since we want more than just a literal +1 from reviewers, encouraging them to leave a comment is prefered [21:44:43] we'd probably have to agree on tags [21:45:07] greg-g: a +1 should always come with a comment, but we need a simple filter facility [21:45:39] I honestly think designing a new protocol for gerrit is a waste of time [21:45:46] gwicke: i think that's bad. or rath3er, i'd want a way to differeentiate that. [21:45:53] we should focus on a workflow for differential [21:46:09] well, we should design a review workflow that works for us [21:46:14] and then find the tools that fit it best [21:46:17] not the other way around [21:46:47] #info discussed role of +1. question now discussed: how will this work in Differential? [21:46:59] DanielK_WMDE__: too late IMO. We are well down the phab/differential rabbit hole [21:47:09] In gerrit, comments are under-used partly because the interface hides all the context once you start to write one [21:47:24] comments are much nicer in differential [21:47:25] I see we're discussing some details here which are important but much less than actually getting people reviewing things... if we have robust process, it can work with almost any tool [21:47:28] gwicke: yeah, this is something Differential gets really well [21:47:32] without it, no tool would help [21:47:37] easier to read and easier to jump to the code [21:47:47] #info should we discuss workflow for gerrit, for differential, or detach the discussion from the tools [21:48:07] in github, we don't really miss the formal +1 / +2 stuff, and rely on plain comments and @mentions instead [21:48:21] SMalyshev: any insight to processes that work (or don't) from PHP land? [21:48:56] tools discussion can go to T119908 [21:48:56] T119908: [RfC]: Migrate code review / management to Phabricator from Gerrit - https://phabricator.wikimedia.org/T119908 [21:49:01] bd808: well, in PHP land we unfortunately have the same issue with pulls piling up... [21:49:32] gwicke: trying to use github got me to really appreciate gerrit. the UI sucks, but at least it does what i need... [21:49:45] bd808: so I'm sad to say we're in pretty much the same position there. lots of patches sit unreviewed unless somebody really actively nudges somebody specific [21:49:53] DanielK_WMDE__: I only warn that a too theoretical discussion, without input from those outside our normal/aclimated group should be avoided, we need input from outside people (not invovled in our community) to get an idea of what might work better, and, of course, we aren't going to build a code-review system ourselves, so measuring against what we have out there is prudent. That's all :) [21:50:10] bd808: there's RFC process that works pretty well but it's for big things. For smaller ones, not so much [21:50:17] (to clarify, we need input from everyone, not just our close-nit community) [21:50:31] SMalyshev: *nod* I have the same backlog problem in my own tiny projects :/ [21:50:32] greg-g: yes, we should look outside for ideas, but inside for feedback [21:50:48] * robla giggles at close-"nit" ;-) [21:50:53] the discussion seems to have drifted into tools land [21:51:02] DanielK_WMDE__: both for feedback and ideas, is what I'm saying, we can't expect ourselves to give ourselves good feedcback :) [21:51:04] https://en.wiktionary.org/wiki/a_bad_workman_always_blames_his_tools [21:51:06] robla: heh, nice catch [21:51:28] * greg-g thinks he's always typed that, dislikes silent k's :) [21:51:54] we know that greg-g :) [21:51:56] The thing I miss most in a github workflow is an easy way to amend a pul request started by someone else without merging it. [21:51:57] greg-g: how can you get feedback from people who don't use our code, toosl, or process? [21:52:22] DanielK_WMDE__: just like anything, you ask them to review what you're doing and give feedback, I don't see the hurdle [21:52:27] but that's not currently a phab thing either [21:52:43] I think the thing this community needs to talk about is what to do about patches when they come. I think we can have that aspect of the conversation without it being 100% all inclusive [21:52:44] bd808: it is (almost) now thanks to mukunda [21:53:00] greg-g: :) I saw that patch [21:53:03] tools won't give us more trust in merging complex patches without good test coverage, or make decisions on "not sure if this is a good direction" patches [21:53:04] :) [21:54:13] andre__: we're just about out of time. next steps by default are "take the discussion to T78639". any other next steps you'd like to capture? [21:54:13] T78639: How to address the long tail of low priority tasks in active projects - https://phabricator.wikimedia.org/T78639 [21:54:24] bd808: we tend to use branches; you pull the other branch, add your commits on top, open a new PR [21:54:36] gwicke: yay, test coverage! [21:55:11] #idea tools won't give us more trust in merging complex patches without good test coverage, or make decisions on "not sure if this is a good direction" patches [21:55:17] gwicke: *nod* with a reasonably small number of contributors who all have commit rights that can work [21:55:19] robla, no other steps for tasks in T78639. For code review in https://phabricator.wikimedia.org/T101686 , specific feedback and discussion in each blocking task is welcomed. [21:55:20] T78639: How to address the long tail of low priority tasks in active projects - https://phabricator.wikimedia.org/T78639 [21:55:22] robla: we mostly talked about code review. T78639 is not really abotu code review. But yea, we should continue this on phab [21:55:22] T78639: How to address the long tail of low priority tasks in active projects - https://phabricator.wikimedia.org/T78639 [21:55:34] bd808: none of this requires commit rights [21:55:45] I have a question about "implementation details for small/large installs" which Brion brought up in the beginning, and about WUaS which WUaS donated to Wikidata last autumn 2015 as a possible install [21:56:03] gwicke: working in shared branches does doesn't it? [21:56:11] DanielK_WMDE__: then you would also need published coverage, and even more helpful compute the coverage difference a patch creates [21:56:18] it's not a shared branch, it's a public but private branch [21:56:33] in your own private repo [21:56:50] public as in I-can-check-out-your-branch [21:57:06] jzerebecki: yep [21:57:49] gwicke: ah. so you'd send a PR to the contributor and then they would accept it into the branch that is the original PR to upstream? [21:58:27] What's the process for how this works, please? WUaS just heard back from Cecilia d'Oliveira, Associate Dean of Digital Learning at MIT and as former Executive Director of MIT OCW [21:58:30] I think I'll hit "endmeeting" in 60 seconds unless there's anything else we need to capture now [21:58:50] bd808: you can do that if you want them to review first (or aggregate contributions a la linux subsystems), but in small projects we just open a new PR with the entire branch pushed to our own repo [21:59:09] Scott_WUaS: Sorry but I'm afraid that is off-topic for this meeting... [21:59:28] bd808: this model also supports complex rebases quite well [21:59:31] Thanks, @andre__ [21:59:49] alright wrapping up now. thanks everyone! [22:00:02] #endmeeting [22:00:03] Meeting ended Wed Mar 16 22:00:02 2016 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [22:00:03] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-03-16-21.00.html [22:00:03] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-03-16-21.00.txt [22:00:03] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-03-16-21.00.wiki [22:00:03] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-03-16-21.00.log.html [22:00:06] o/