[16:56:39] Language Engineering office hour starts here in a few minutes
[17:00:10] Hello
[17:00:14] #startmeeting Language Engineering monthly office hour - May 2014
[17:00:14] Meeting started Wed May 21 17:00:14 2014 UTC and is due to finish in 60 minutes. The chair is arrbee. Information about MeetBot at http://wiki.debian.org/MeetBot.
[17:00:14] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[17:00:14] The meeting name has been set to 'language_engineering_monthly_office_hour___may_2014'
[17:00:28] Hello, Welcome to the monthly office hour of the Wikimedia Language Engineering team
[17:00:34] Hi santhosh
[17:00:50] Hi arrbee
[17:01:11] My name is Runa. I am the Outreach co-ordinator for our team and will be hosting the session today
[17:01:35] Our office hours are held every 2nd Wednesday of the month
[17:01:53] We delayed it by a week this month as we were traveling
[17:02:28] Before we go further here is an important message:
[17:02:52] The chat today will be logged and publicly posted. It is also mentioned in the channel topic (above).
[17:03:25] Our last office hour was held on April 9, 2014.
The logs are at:
[17:03:31] #link https://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2014-04-09
[17:04:06] aharoni kart_ divec Nikerabbit pginer santhosh are here today and will be answering questions
[17:04:47] In today's session we would like to talk about our recent work and end with an open session for Q & A
[17:05:37] сәлем
[17:05:57] The Wikimedia Foundation's Language Engineering team builds language features and tools to support our wiki communities across the world
[17:06:18] We are a distributed team and operate from various locations around the globe
[17:06:56] As already mentioned in the mail I sent out earlier, we have been concentrating on the Content Translation tool and intend to make the first release very soon
[17:07:27] We have mentioned the Content Translation project several times in our earlier office hours but we will talk more about it today
[17:07:57] The Content Translation tool is a way to create new Wikipedia articles from existing articles in another language
[17:08:05] Hi alolita
[17:08:45] The tool is designed to have an editing interface and several translation aids like dictionaries, and machine translation support
[17:09:12] Hi Runa
[17:09:22] The purpose of the tool is to help create an initial version of an article which can then be published and edited like any other article
[17:09:48] The immediate benefits from this tool are:
[17:10:19] 1. the relative ease in quick article creation for both new and advanced users
[17:10:46] 2. provide assistance to achieve high quality translations
[17:10:47] and
[17:10:57] 3.
prevent errors
[17:11:16] Development on this tool started in February, earlier this year
[17:11:51] The initial phase of development was focused primarily on research and exploration of backend design choices
[17:12:50] Over the past month we have narrowed down on the actual deliverables for the first release and the development plan to support this release
[17:13:43] We do not have a specific release date just yet but are aiming for sometime in the next couple of months when we will be able to present this as a beta feature
[17:14:36] Presently we are fully absorbed in development of key features for the first release (termed as "minimum viable product" or "MVP" in product speak)
[17:15:04] (There was an interesting discussion about MVP in the ECT meeting yesterday) :)
[17:15:40] Alongside we are collaborating with other teams on machine translation support, analytics and performance
[17:16:03] Detailed designs are at:
[17:16:12] #link https://commons.wikimedia.org/wiki/File:Content-translation-designs.pdf
[17:16:32] The workflow we are creating will be:
[17:17:00] 1. Show a red link in the interlanguage area for a missing language version of the displayed article
[17:17:42] 2. Clicking the red link would display the Content Translation editor (for logged-in users only) and users can choose to translate the article or create it from scratch
[17:18:18] 3. The editor displays 3 columns - source article text, target article text, tools column for the translation aids
[17:18:40] 4. Translations are done in the editor by the logged-in user
[17:19:28] 5. Using a publish button from the editor, a new version of the article in the target language is published under the user's namespace on the target language wiki
[17:19:56] ^^ workflow is only for the first release i.e.
MVP
[17:20:41] We went for a simple REST style api that can easily be cached by Varnish
[17:21:46] We are using a special presentation of the rich markup called LinearDoc to enable transfer of markup like bolds and links even if the word order is different in machine translation
[17:22:25] We use the Parsoid service to get the translatable content in HTML, then we segment it and when saving we use Parsoid to convert it back to wikitext
[17:22:43] divec: santhosh : would you like to add anything more about this?
[17:23:44] https://www.mediawiki.org/wiki/Content_translation/Documentation has most of these technical details
[17:25:00] Thanks santhosh
[17:25:06] that's helpful santhosh
[17:25:29] (writing the link again in a bot-friendly way)
[17:25:32] #link https://www.mediawiki.org/wiki/Content_translation/Documentation
[17:25:55] As we go closer to the release date, we will step up on the communications
[17:26:21] Until then the best place to track the project is:
[17:26:27] Does this tool tie in with Wikidata?
[17:26:31] #link https://www.mediawiki.org/wiki/Content_translation
[17:26:45] Hi Pavanaja
[17:26:51] Pavanaja: we intend to do so
[17:26:52] Pavanaja: yes
[17:26:54] Hi Alolita
[17:27:29] Pavanaja: the plan is to release basic wikidata integration in the first deployment
[17:27:33] Is this something like a Translation Memory (TM) tool?
[17:28:09] it will use wikidata to check whether it's possible to insert links to the target article in the target wiki automatically.
[17:28:55] You could view the tool in action here:
[17:28:58] so, for example, if you are translating from English to Telugu, and the English article has a link to [[Philosophy]], and there is a corresponding article in Telugu, then the link to the Telugu article will be automatically inserted into the translation
[17:29:05] #link http://language-stage.wmflabs.org/w/index.php?title=Special:ContentTranslation&page=Food&lang=es
[17:29:16] (you need to login)
[17:29:38] interlanguage links are stored in Wikidata, so this is a kind of Wikidata integration.
[17:29:55] future development may have deeper Wikidata integration.
[17:30:38] about Translation Memory, this is not a translation memory tool; it is a translation tool that will include translation memory as one of its features.
[17:31:08] Thanks aharoni for that neat summary
[17:31:34] The development roadmap and timeline of milestone completion can be found at:
[17:31:42] #link https://www.mediawiki.org/wiki/Content_translation/Roadmap
[17:32:04] (This page is changing very rapidly though)
[17:32:18] More questions?
[17:35:36] just a little announcement
[17:36:41] the "compact interlanguage links list" is currently broken in production because of an incompatible core change
[17:36:49] Oh.
[17:37:02] I couldn´t figure out what broke it.
[17:37:27] a core change in how the sidebar is handled in the Vector skin.
[17:37:30] I already committed a fix, but it needs review and deployment.
[17:37:43] https://gerrit.wikimedia.org/r/#/c/134592/
[17:37:55] Thanks aharoni.
[17:38:13] (leaving as "aharoni", staying as "aharoni|mobile")
[17:39:20] Just in case there is some interest about the technical architecture of the Content Translation tool, you can find it here:
[17:39:24] #link https://www.mediawiki.org/wiki/Content_translation/Technical_Architecture
[17:39:47] Pavanaja, any more questions?
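(Editor's sketch) The link adaptation described at 17:28:58 can be illustrated roughly as follows. This is a minimal sketch, not the Content Translation implementation: the `SITELINKS` table, the helper names and the Spanish sample titles are assumptions made for illustration, and no real Wikidata API call is shown.

```python
from typing import Optional

# Hypothetical sample of interlanguage data: source title -> {language code: target title}.
# In the real tool this information would come from Wikidata; here it is hard-coded.
SITELINKS = {
    "Philosophy": {"es": "Filosofía"},
    "Food": {"es": "Alimento"},
}


def adapt_link(source_title: str, target_lang: str) -> Optional[str]:
    """Return the target-language article title, or None if no article is known."""
    return SITELINKS.get(source_title, {}).get(target_lang)


def adapt_wikilink(source_title: str, label: str, target_lang: str) -> str:
    """Rewrite a [[link]] for the target wiki, falling back to plain text."""
    target = adapt_link(source_title, target_lang)
    if target is None:
        return label  # no corresponding article: keep the text, drop the link
    return f"[[{target}|{label}]]"
```

With this sample data, a link to Philosophy in an English source becomes a link to the Spanish title, while a link with no known counterpart is left as plain text.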
[17:41:49] Okay… if there are no more questions we can wrap up early :)
[17:42:38] I am waiting for the tool
[17:42:49] #info Our next office hour is scheduled for 11 June 2014
[17:43:01] Great to hear
[17:43:23] Pavanaja: we will make sure to communicate about its arrival much in advance
[17:43:35] Thanks RB
[17:43:49] Meanwhile if you like you can look at the staging instance and give us feedback
[17:43:58] :)
[17:44:08] We are starting small, but we have further plans
[17:44:47] Our mailing list is mediawiki-i18n@lists.wikimedia.org and IRC channel is #mediawiki-i18n
[17:45:07] Like a "translation center", which will show suggestions for translation,
[17:45:10] We will be around on #mediawiki-i18n
[17:45:35] Statistics, management of existing translations, etc.
[17:46:00] And more wikidata integration.
[17:47:09] But for a start - a simple translation interface with basic tools.
[17:50:40] Thanks aharoni|mobile
[17:50:53] Thanks everyone for joining
[17:51:13] The office hour ends here today (a tad bit earlier than usual)
[17:51:29] The logs will be on meta very soon:
[17:51:34] #link https://meta.wikimedia.org/wiki/IRC_office_hours
[17:51:44] #endmeeting
[17:51:45] Meeting ended Wed May 21 17:51:44 2014 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[17:51:45] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-17.00.html
[17:51:45] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-17.00.txt
[17:51:45] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-17.00.wiki
[17:51:45] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-17.00.log.html
[17:51:46] thanks!
bye
[20:50:59] in 10 min in this room, we're talking about the square bounding box proposal + the typesafe enums proposal
[20:51:31] https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-05-21
[20:59:44] ok, AndyRussG & cscott - almost ready? :-)
[20:59:58] all set here :)
[21:00:01] #startmeeting Discussion of square bounding boxes and typesafe enums| Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). https://meta.wikimedia.org/wiki/IRC_office_hours
[21:00:01] Meeting started Wed May 21 21:00:00 2014 UTC and is due to finish in 60 minutes. The chair is sumanah. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:00:01] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:00:01] The meeting name has been set to 'discussion_of_square_bounding_boxes_and_typesafe_enums__channel_is_logged_and_publicly_posted__do_not_remove_this_note___https___meta_wikimedia_org_wiki_irc_office_hours'
[21:00:05] #chair sumanah brion Tim-away
[21:00:05] Current chairs: Tim-away brion sumanah
[21:00:10] #link https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-05-21
[21:00:19] First, cscott :-)
[21:00:23] since yours should be faster
[21:00:31] #topic square bounding boxes
[21:00:39] #link https://www.mediawiki.org/wiki/Requests_for_comment/Square_bounding_boxes
[21:00:42] This should be quick
[21:00:45] #link https://gerrit.wikimedia.org/r/#/c/123683/ cscott is looking to get this merged :-)
[21:01:03] is that right C. Scott?
[21:02:04] #chair sumanah brion TimStarling
[21:02:04] Current chairs: Tim-away TimStarling brion sumanah
[21:02:14] * brion double-checks the code
[21:03:07] cscott: so as i understand this only affects the default case where something’s specified as a thumb without a size spec?
[21:03:23] and sets a max height to match with the default thumb width
[21:04:05] i’m pretty happy with that i think
[21:04:06] TimStarling: ?
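(Editor's sketch) The behaviour brion summarises at 21:03, a bare thumb getting a maximum height equal to the default thumb width, amounts to fitting the image into a square box. The following is a minimal sketch under stated assumptions: the default width of 220 px is an assumed value, the function name is hypothetical, and the "never upscale" rule is a simplification; it is not the code in the Gerrit change.

```python
DEFAULT_THUMB = 220  # assumed default thumb width in pixels


def fit_in_box(width, height, max_width, max_height=None):
    """Scale (width, height) to fit the box while preserving the aspect ratio."""
    scale = max_width / width
    if max_height is not None:
        # The square bounding box adds a second constraint on the height.
        scale = min(scale, max_height / height)
    scale = min(scale, 1.0)  # simplification: never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 600x800 portrait image as a bare thumb:
old_size = fit_in_box(600, 800, DEFAULT_THUMB)                 # width-bound only
new_size = fit_in_box(600, 800, DEFAULT_THUMB, DEFAULT_THUMB)  # square bounding box
```

Under these assumptions the portrait image goes from 220x293 to 165x220, while a landscape image such as 800x600 stays at 220x165 in both cases, which is why only non-square portrait thumbs are affected.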
[21:04:47] I gave it +2 already, but then cscott changed the code substantially and I haven't gotten around to reviewing it again
[21:05:04] if it looks good to you, just give it +2
[21:06:13] ok i’ve done so
[21:06:18] let’s make sure it runs its tests and whatnot :)
[21:06:28] Sweet
[21:06:42] #agreed to +2 https://gerrit.wikimedia.org/r/#/c/123683/
[21:06:52] rock, we can move on I think.
[21:06:56] #topic typesafe enums
[21:07:00] #link https://www.mediawiki.org/wiki/Requests_for_comment/Typesafe_enums
[21:07:04] #info Andrew Russell Green calls this "a facility that I've found really useful so far, and is used in other stuff that we'd like to propose moving to core :)"
[21:07:23] #info AndyRussG today seeks "I guess a general path forward for any changes required to merge this to core, or an opinion on the feasibility thereof"
[21:07:31] ok so i think typesafe enums are cute and sometimes nice, but i don’t have a strong opinion
[21:07:31] it looks slow
[21:07:32] #info Andrew says: "I'm sure there are improvements to be made (for example, see MWalker's suggestions on the talk page), so if we can consider what would need to be done, and end up with something really nice that improves MW code quality, I think that'd be fantastic"
[21:07:33] sorry i'm late, what did i miss? ;)
[21:07:38] all of my rfc!
[21:07:42] cscott: brion merged your change
[21:07:43] yes!
[21:07:45] :D
[21:07:51] and then jenkins bot rejected..
[21:08:00] <^d> As I said on the list, I'm not a fan of enums. I don't make use of them in languages that have them natively, and implementing them in userspace seems slow per Tim.
[21:08:04] probably because the release notes conflicted
[21:08:06] quick, rebase before he changes his mind again
[21:08:09] #link http://lists.wikimedia.org/pipermail/wikitech-l/2014-May/076605.html - there have been some questions/discussion on the mailing list by hashar & ^d & others
[21:08:27] James_F thinks i should push to get this in 1.23
[21:08:35] so there’s a few different kinds of enums in the universe i think
[21:08:37] but maybe we can return to discuss this after typesafe enums
[21:08:44] one is ‘i really just need names for a couple flags’
[21:08:44] cscott: yeah, sounds good to me.
[21:08:57] one is ‘i need user-friendly markers for id numbers that go in some API or database’
[21:09:00] also, https://gerrit.wikimedia.org/r/#/c/133600/ is the more contentious part of this, i'd like to discuss that as well
[21:09:03] Personally I'm not a fan of the syntax of having a $ in DayOfTheWeek::$TUESDAY (I know that's really superficial)
[21:09:06] in HHVM, class constants are probably very heavily optimised
[21:09:07] and another is ‘i need a code-only marker for various extensible enum list'
[21:09:10] * cscott will shut up while we discuss enums
[21:09:22] since the JIT knows what the value of the class constant is when it is generating machine code
[21:09:28] so it probably just uses the immediate value
[21:09:56] so I don't think the "$" is superficial, it makes it a hashtable lookup instead of an immediate machine code operand
[21:10:11] Hmmmm :)
[21:10:21] do we tend to use enums in perf-critical code paths?
[21:10:25] (i’m thinking parser/sanitizer)
[21:10:44] Also note that in this implementation nothing happens if the class isn't loaded
[21:10:44] i guess one question is: are we moving to enums for better readability of infrequently used code, or are we moving to enums because we want blazing fast performance on the hot paths through our codebase
[21:10:54] I meant superficial from an aesthetic point of view
[21:10:57] because the SplEnum implementation is almost certainly faster. but not better for readability.
[21:11:16] (although it would be nice to get benchmarks to verify my intuitions)
[21:11:34] <^d> SplEnum requires people installing a non-default pecl extension as bawolff pointed out on-list.
[21:11:34] Re: performance I do think it'd be a matter of a tradeoff, and some benchmarking would be cool
[21:11:35] i don't remember seeing wide-spread enums in the parser
[21:11:39] <^d> That's a -1 from my POV for it.
[21:11:58] ^d: I didn't point that out, but I agree I don't like that :)
[21:11:58] AndyRussG: https://www.mediawiki.org/wiki/Performance_profiling_for_Wikimedia_code might help you do a bit of benchmarking. Maybe.
[21:12:15] well, a standalone microbenchmark would be my first pass
[21:12:26] just to get a rough order of magnitude for the different enums
[21:12:40] also, a standalone microbenchmark would probably be easier to run on HHVM
[21:12:41] <^d> bawolff: Sorry, Tyler, not you.
[21:12:47] a common usecase for enums is in function argument -- instead of having a bare TRUE / FALSE / NULL you have some explicit type
[21:12:49] Definitely ...
Still if it's a trade-off between an extremely small performance difference and better and typesafe code, I'd take the latter
[21:13:11] <^d> mwalker: Well if you're passing boolean params to a function you're breaking coding conventions anyway :)
[21:13:12] For me the ability to typehint is really nice
[21:13:49] The issue I have with SplEnum is it works quite differently from enums in other languages and is extra verbose
[21:13:59] #info http://www.php.net/manual/en/class.splenum.php
[21:14:31] I think Antoine may be the only one arguing for SplEnum, but I could be mistaken. Anyone arguing for it here?
[21:14:41] not i, i don’t like splenum much
[21:14:43] <^d> I don't think anyone's arguing for it.
[21:14:53] Cool
[21:14:55] <^d> I think Antoine was just like "hey maybe this?" not so much a strong advocate.
[21:15:05] so i think there are two questions
[21:15:14] one is: do we want to use this sort of facility widely?
[21:15:24] the other is: when folks do want to use it, should they use a common base class from core?
[21:15:31] -or implement separately in every ext?
[21:15:35] nope, i like the new code. but i'd like to get a little bit of insight into performance.
[21:15:45] so that we know whether to recommend it for hot paths or not
[21:16:00] fwiw, Parser->mOutputType is an enumerated type, but it's not on a hot path
[21:16:44] Parser::setFunctionHook takes an enumeration too, again not performance-sensitive
[21:16:49] __toString() should probably be included in the benchmark
[21:17:04] I can do benchmarks and add them to the RFC
[21:17:05] since that suggested case syntax will call it
[21:17:20] <^d> brion: We'd have to ask more people if they want to use it widely.
[21:18:15] I think that most places where you use a class constant you could use something like this
[21:18:44] SplEnum is PECL?
[21:18:47] so iiuc, basically proposal is to copy https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCampaigns.git/e4714059330ace2c9c8457433919fb90ce1f4af5/includes%2FTypesafeEnum.php into core so people could use it
[21:18:54] that kind of rules that out doesn't it?
[21:19:19] $ php maintenance/eval.php
[21:19:19] > return class_exists('SplEnum');
[21:19:20] bool(false)
[21:19:27] <^d> TimStarling: Yes.
[21:19:29] yeah let’s kill SplEnum it’s not there by default
[21:19:32] I think we could also add what mwalker suggests on the talk page, about setting a constant integer value for database, and some facilities for bitmask when desired
[21:19:50] quick side question: do you plan to discuss square bounding boxes for upright as well in this meeting?
[21:20:26] <^d> Already done.
[21:20:29] <^d> brion merged.
[21:20:40] gwicke: oh i think we forgot about that case — it might make sense to turn uprights into squares if the current patch doesn’t do that
[21:20:44] not for upright, just for default params
[21:20:51] gwicke: We're returning to the topic after the enums chat because there's more to discuss, now that cscott is around.
[21:20:52] <^d> Oh nvm, ignore me.
[21:21:41] it's ok! we were done until we were not :)
[21:22:36] re: the enums, I looked through a few other folks' implementations, seemed a lot of people are doing something similar, though I imagine it's not enough code to make worrying about an external library worth the effort
[21:22:54] it probably won't be a hashtable lookup in HHVM, btw, I think it will just be dereferencing and Variant unboxing
[21:22:56] square bounding boxes now?
[21:22:59] brion, TimStarling, sumanah: thx
[21:23:05] i thought i could fix the rebase conflict before we got back to me
[21:23:15] but i'm multitasking poorly
[21:23:27] are we done with the enums topic? can we have a couple #agreed or #action lines?
[21:23:49] #agreed not gonna use SplEnum - it’s not available by default
[21:24:10] i think we’re still kinda undecided on whether to put the TypesafeEnum class into core
[21:24:20] and even more undecided on whether to change any code to use it
[21:24:22] Is doing some benchmarks #agree-able?
[21:24:31] sounds good. TimStarling ?
[21:24:50] #action AndyRussG to benchmark setup and access performance
[21:25:00] cool! :)
[21:25:03] * TimStarling likes #action
[21:25:08] \o/
[21:25:12] sweet
[21:25:25] ok, we can therefore move on to
[21:25:25] mwalker: you mentioned some other use cases before? should we look at those?
[21:25:25] #topic square bounding boxes redux
[21:25:26] from what TimStarling was saying earlier, HHVM benchmarks are also desirable?
[21:26:03] well; if we don't want to use this for performance critical code then it doesn't make much sense
[21:26:12] but I had suggested use cases of namespaces and permissions
[21:26:32] MartijnH: I can also try HHVM benchmarks
[21:26:35] #info also, https://gerrit.wikimedia.org/r/#/c/133600/ is the more contentious part of this, i'd like to discuss that as well
[21:27:01] hhvm is pretty easy, now that they have packages on hhvm.com
[21:27:18] Ah cool plums
[21:27:22] just run "hhvm -f benchmark.php" instead of "php benchmark.php"
[21:27:47] or "hhvm -v Eval.Jit=true -f benchmark.php" to be sure the JIT is enabled
[21:27:48] sigh rebasing fail, i managed to add https://gerrit.wikimedia.org/r/#/c/134734/, boo me
[21:28:06] so I'm not convinced that upright has much of a use case left with square bounding boxes
[21:28:29] TimStarling: thanks
[21:28:30] the only use case would be resizing images relative to a user's default thumb size preference
[21:29:18] hmm
[21:29:19] which might be better covered by introducing more semantic alternate image types
[21:29:26] you know this change that cscott is merging...
[21:29:36] will that make the site go down when it is deployed?
[21:29:42] what?
[21:29:44] * TimStarling always thinks of these things too late
[21:30:05] well, how many thumbnails do you suppose will be regenerated?
[21:30:11] mwalker: (aside: still enumish: got more details, maybe to put on the talk page or send elsewhere? thanks btw)
[21:30:42] TimStarling: the non-square ones, i guess.
[21:30:44] better make sure Greg and Sam know about it I guess
[21:30:49] we could probably pre-generate them.
[21:30:57] i could give you exact statistics
[21:31:17] i have a db of all figure usage on en/fr/it/de/.. wikis
[21:31:31] the question is whether the image scalers or swift backend will fall over when it is deployed to en or commons, due to increased rate of scaling operations
[21:32:28] ok, patches are rebased, so i can devote brain cells to discussion
[21:32:41] i think it would be a good idea to pre-scale as many images as possible
[21:33:02] i can generate a list of all images on enwiki (say) that would change size as well as the new size
[21:33:25] you think it won't be too difficult to generate that list?
[21:33:25] i should just be able to ask mw for those thumbs in the new size to seed them in swift
[21:33:25] no, less than an hour
[21:33:27] ^d: is mwgrep on the expanded text or the source text?
[21:33:29] ok, sounds good then
[21:33:31] it's a good segue to the *next* topic
[21:33:48] <^d> AaronSchulz: Expanded. We don't store unparsed wikitext.
[21:34:04] <^d> (yet. maybe?)
[21:34:12] since i wrote that code/assembled that db in order to show that https://gerrit.wikimedia.org/r/133600 would be safe
[21:34:25] see https://bugzilla.wikimedia.org/show_bug.cgi?id=63904 for a link to the code, and statistics
[21:34:33] that part of the patch affects far fewer images
[21:35:49] I gave my opinion on that earlier in this meeting
[21:35:54] swift stuff lasts forever, right? so i can start requesting the new sizes for the thumbnails now, regardless of the deployment/merge time of the patch.
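(Editor's sketch) The pre-generation list cscott proposes, every image whose bare-thumb size would change together with its new size, could be computed along these lines. This is a hedged sketch and not cscott's actual script: the function names, the sample file names and the 220 px default are assumptions, and actually requesting the thumbnails from MediaWiki or Swift is left out.

```python
def new_thumb_width(width, height, default=220):
    """Width of a bare thumb once its height is also capped at `default`."""
    if height <= width:
        # Landscape or square: still bound by the width, so nothing changes.
        return min(width, default)
    # Portrait: the height cap wins; never upscale beyond the original width.
    return min(width, max(1, round(default * width / height)))


def widths_to_pregenerate(images, default=220):
    """Return (name, new_width) for every image whose thumb width would change."""
    jobs = []
    for name, width, height in images:
        old_width = min(width, default)  # current behaviour: width-bound only
        new_width = new_thumb_width(width, height, default)
        if new_width != old_width:
            jobs.append((name, new_width))
    return jobs


# Hypothetical sample input: (file name, original width, original height).
sample = [
    ("Portrait.jpg", 600, 800),
    ("Landscape.jpg", 800, 600),
    ("Tiny.png", 100, 150),
]
jobs = widths_to_pregenerate(sample)
```

Only the large portrait image ends up in `jobs`, which matches the "non-square ones" answer given above: landscape images and images already smaller than the box keep their current thumb size.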
[21:36:28] yes
[21:36:34] let's sync up -- so are we still talking about 'thumbs with no explicit size should use a square bounding box' (that's https://gerrit.wikimedia.org/r/123683)
[21:36:41] let's finish up that part
[21:36:49] the patch has been rebased
[21:37:17] my understanding is that someone will +2 it, and i will asap start a process to start requesting thumbs that change size, so that changeover will not be drastic
[21:37:41] and someone (?) will inform greg and sam, so we all know what's going on before the code goes live?
[21:37:47] last meeting we had "ACTION: cscott to patch upright to have a square bounding box, and use dumpGrepper to see whether this breaks too much (cscott, 21:59:55)"
[21:38:09] TimStarling: that's the next part. i'm just trying to make sure we're all on the same page regarding the first part.
[21:38:12] before we move on
[21:38:51] first part being "thumbs with no explicit size" (upright flag semantics unchanged)
[21:39:21] ok? my summary above is correct for the first part?
[21:39:41] sumanah: you want to add some # actions?
[21:39:51] greg-g: for changes that need to be deployed carefully, I'm meant to add you as a reviewer, right?
[21:40:28] I think we pretty much agree that with the right preparations the move to square bounding boxes for bare thumbs is a good idea
[21:40:46] TimStarling: that's good yeah, what are you thinking of?
[21:40:50] ie: how carefully? :)
[21:41:21] TimStarling: ah, I see in -dev
[21:41:27] load on swift and image scalers needs to be watched, since thumbnailing parameters will change
[21:41:30] well, at a minimum you should probably check with me to ensure that my job to pregenerate the new image thumbs has completed?
[21:41:39] be vewy vewy quite
[21:41:56] quiet
[21:41:58] if there is an overload, it should be rolled back
[21:42:09] gotcha
[21:42:35] cscott: can we chat post this discussion re deploy details?
[21:42:36] other than me writing a script to try to pregenerate images, is there any other possible deployment strategy to mitigate?
[21:43:01] * sumanah wants to leave #action items to others to decide
[21:43:11] greg-g: sure.
[21:43:52] and maybe it should be flagged in the release notes more prominently that this will have an effect on scaler load?
[21:44:04] cscott: sounds good
[21:44:14] #action https://gerrit.wikimedia.org/r/123683 expected to be +2'd
[21:44:30] #action cscott will work with greg-g to ensure no deployment hiccups
[21:44:40] #action cscott will write a script to pre-generate thumbnails that will change size, to avoid scaler load
[21:45:19] ok, are we done with that then?
[21:45:52] I believe so
[21:45:54] now we can start the screaming and yelling ;)
[21:46:00] let's move on
[21:46:06] so the second part makes 'upright' simultaneously more and less useful
[21:46:22] https://gerrit.wikimedia.org/r/#/c/133600/ and https://bugzilla.wikimedia.org/63904 for those following along at home
[21:46:31] it makes 'upright' also use a square bounding box.
[21:46:49] so that it actually is useful for upright images, and doesn't require the user to manually specify the aspect ratio of the image
[21:47:17] Something like 0.75% of the images on enwiki use the 'upright' flag
[21:47:26] this would change the size of 0.04% of the images
[21:47:42] it'd be a mis-named 'scale' option, which scales relative to the user's / wiki's default thumb size pref
[21:47:55] since the number of images whose size changes is so small, it is proposed to reset the scale factor to 1.0 at the same time, which makes upright's semantics much less mysterious
[21:48:19] (changing the default scale changes about 110 image references on enwiki)
[21:48:57] yes, the semantics of upright would be pretty much "scale", and one might consider in the future adding an alias. but i'd rather upright go away entirely in the future. ;)
[21:48:58] cscott, how many on frwiki use that flag?
[21:49:09] it's in the bugzilla
[21:49:28] because i remember reading on ve-feedback page that upright is popular/encouraged on frwiki
[21:49:49] 17,870 of 1,242,985 images use 'upright' and the proposed change would alter the size of 1049 or 1053 of them. (the latter if the scale factor was changed to 1.0)
[21:50:08] on frwiki
[21:50:42] for dewiki, it's 38,624 of 1,786,779 and the size would be altered for 1963/1764
[21:50:52] while i'm copying numbers into chat ;)
[21:51:06] can anyone explain to me why upright isn't the stupidest thing ever?
[21:51:09] so deploying this one should be a breeze, at least. ;)
[21:51:17] :)
[21:51:23] TimStarling, that'd be hard
[21:51:30] TimStarling: i think both gwicke and i agree that upright is stupid.
[21:51:44] who approved it?
[21:51:49] this patch makes it slightly less stupid. at least it has some sensible semantics (scales the default thumbnail size)
[21:52:11] instead of being some completely weird thing that i don't want to write a bunch of special-case code for in parsoid
[21:52:19] to me the interesting bit is that there are very few uses of the scale factor
[21:52:26] (actually, that i've already written a bunch of special-case code for in parsoid, multiple times)
[21:52:32] the help says "When the height of an image in thumbnail is bigger than its width (i.e. in portrait orientation rather than landscape) and you find it too large, you may try the option upright=N, where N is the image's aspect ratio (its width divided by its height, defaulting to 0.75)."
[21:52:39] most non-scale factor uses will basically be made unnecessary by square bounding boxes by default
[21:52:50] gwicke, no i think you are misinterpreting the statistics
[21:53:10] if you do that, calculate the aspect ratio and use that as the upright factor, that is equivalent to a square bounding box, correct?
[21:53:40] cscott, you don't break out the explicit factors vs.
those using the default
[21:53:54] but in my experience the explicit factor is very rare
[21:54:15] but without an explicit factor, upright is just equivalent to multiplying the width by 0.75
[21:54:28] i think the ones which change size when the default scale factor is tweaked are the ones without an explicit scale factor.
[21:54:28] that's like 110 images on enwiki.
[21:54:28] so i think most upright images do actually have a scale factor.
[21:54:28] i can crunch the exact numbers for you if you like.
[21:54:28] TimStarling: i wrote that help text, i think.
[21:54:33] the previous help text was much less helpful, and stated behavior that differed from the implementation
[21:55:01] TimStarling: yes. multiplying the width by 0.75 is useful if all your images are either 4:3 or 3:4
[21:55:18] so upright w/o a scale factor gives you 4:3 images that are 'the right size' next to the rest of your 3:4 images
[21:55:29] are least that's my reverse-engineering of the logic
[21:55:33] *at least
[21:55:45] afaik the factor was added later
[21:55:48] which also explains the weird name
[21:56:30] cscott, so I think it would be good to have data on how common the scale is actually set explicitly
[21:56:30] but images are typically not one of those two aspect ratios
[21:56:43] anyway, i think this patch is worth merging, because it affects very few images, and at least gives us semantics for 'upright' that don't require the user to manually compute aspect ratios.
[21:57:14] gwicke: i can compute that, but i'm curious why you want to know.
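(Editor's sketch) The arithmetic behind TimStarling's question at 21:53:10 can be checked with a few lines. This is a minimal illustration, assuming a default thumb width of 220 px and ignoring any rounding of the width that the real parser may apply; the function names are hypothetical.

```python
def upright_width(default_width, factor=0.75):
    """Thumb width when 'upright' scales the default width by `factor`."""
    return round(default_width * factor)


def height_for_width(width, height, thumb_width):
    """Thumb height that keeps the aspect ratio of a (width, height) image."""
    return round(height * thumb_width / width)


DEFAULT = 220  # assumed default thumb width in pixels

# 3:4 portrait (600x800): the default factor 0.75 equals the aspect ratio,
# so the thumb fits exactly in a DEFAULT x DEFAULT square.
w_34 = upright_width(DEFAULT)               # 165
h_34 = height_for_width(600, 800, w_34)     # 220

# 9:16 portrait (900x1600): the default factor leaves the thumb too tall...
h_916_default = height_for_width(900, 1600, upright_width(DEFAULT))  # 293

# ...while using the image's own aspect ratio as the factor fits the square.
w_916 = upright_width(DEFAULT, 900 / 1600)  # 124
h_916 = height_for_width(900, 1600, w_916)  # 220
```

Under these assumptions, setting the factor to the image's aspect ratio caps the height at the default width, which is what a square bounding box does automatically; the bare 0.75 default only achieves this for 3:4 images, as cscott notes.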
[21:57:26] if there are few uses with explicit scale factors then I'd prefer to just deprecate upright
[21:57:44] since it'll be much less useful
[21:57:48] upright with a specified width seems kind of pointless
[21:57:51] and mis-named as well
[21:57:56] there are 37,185 uses of upright on enwiki
[21:58:03] since you could just multiply the specified width by the scale factor and omit upright
[21:58:05] we could deprecate it, but probably not without tool assistance
[21:58:22] upright is ignored if there is a specified with.
[21:58:24] *width
[21:58:28] there are parser tests for that
[21:58:33] right...
[21:58:54] the original use case for upright will disappear with square bounding boxes
[21:58:58] its only use is scaling the "default size" of the thumbnail (which might be user-specified)
[21:58:58] without a specified width -- at least it does something unique
[21:59:34] (and by user specified i mean in the user's preferences, not in wikitext)
[21:59:41] well, the question is whether people actually want square bounding boxes or if they want their images to be shrunk by some arbitrary factor
[22:00:04] the rounding to the nearest 10px is pointless, right?
[22:00:25] TimStarling: i think the real missing feature is a 'scale and crop' option that will give me an exactly XxYpx image if i ask for one, cropping the edges as needed. but that's a different patch set.
[22:00:29] it doesn't actually reduce the number of generated images, like it claims
[22:00:42] the 0.75 default combined with the then-common 3:4 aspect ratio suggests that they wanted the equivalent of square bounding boxes
[22:00:49] it is probably pointless, yes.
[22:01:38] gwicke: but square bounding boxes already existed [22:01:45] at least for specified widths [22:01:56] yeah, but not for a bare 'thumb' [22:02:18] if that's what they wanted, it would have been trivial to implement [22:02:27] remember that upright is only used if there are no explicit dimensions [22:02:28] since the user's preferred thumbnail size was variable, it was formerly impossible to get "a square bounding box of the user's preferred thumbnail size" [22:02:32] at the place where the 0.75 factor is applied, the aspect ratio is loaded and trivially available [22:02:49] i believe it was a short-sighted design originally [22:03:08] that probably implemented what the user asked for, without pausing to think about what the user actually wanted. [22:03:17] * gwicke nods [22:04:14] if we were to design a scale option relative to the default size from scratch, how would we go about it? [22:04:33] (the current patch 133600 includes the 'round width to 10px' behavior, but i would be happy to take that out.) [22:04:47] well, I would call it "scale", not "upright", for a start [22:04:51] gwicke: we wouldn't. 
we'd take scaling decisions out of the user's hands and add semantic classes for images instead [22:05:16] TimStarling, cscott: +1 to both of you ;) [22:05:22] hey folks [22:05:33] but I think we would want to consider very carefully whether such a feature is needed at all [22:05:35] We can make scale an alias for upright w/o breaking the 37k existing uses [22:05:53] even better, i could rename the option to 'scale' in the source and docs, and include 'upright' as the alias [22:06:07] if someone asked me for it out of the blue, I would probably say no [22:06:15] I probably did, that's why I don't know anything about it ;) [22:06:18] cscott, thumb|upright could just become a no-op with square bboxes [22:06:24] but my personal preference is *not* to add scale, because i think the long term plan is to get rid of upright, not legitimatize it [22:06:50] the only interesting case would be upright with an explicit factor [22:07:06] what about deprecating and ignoring the parameter to upright? [22:07:06] gwicke: actually, with upright's scale factor set to 1.0 by default, making upright a no op would probably be not too different. [22:07:37] i can re-run the numbers for (a) what if we just ignore 'upright entirely' and (b) what if we ignore the scale factor -- but isn't (b) equivalent to (a)? [22:07:45] we should find out who requested this feature and who implemented it [22:07:52] i think upright already requires 'thumb' to also be specified, or it has no effect. [22:08:03] git blame will tell you that, i bet. [22:08:25] cscott, at the minimum we should probably ignore a bare upright with thumb [22:08:38] as the square bbox should take care of that [22:08:41] i think setting the default upright scale factor to 1.0 has that effect. [22:08:44] whoever it was should be included in the discussion [22:08:58] (with my patch. 
without my patch, thumb|upright gives you a non-square bbox) [22:09:13] yeah, it works out to the same [22:10:17] so I think we'd all prefer to get rid of upright if possible, at least in the longer term [22:10:19] we could deprecate upright completely, and start discussions with whoever requested the feature in the first place about what they really need [22:10:44] but like I say, they should be included [22:10:48] i'm trying to determine who that is from git-blame, but it's been hidden by layers of patches [22:11:01] it was added before oct 2010, that's all i can say at the moment [22:11:42] let's end this meeting, since we're out of time, we can talk about it later [22:11:55] in the interest of moving things forward, i think that merging my patch would still be a good first step to deprecating upright. ;) [22:12:14] noted [22:12:15] since the scale=1.0 feature then makes bare upright a no-op [22:12:21] #action cscott to do some git archeology to figure out the original author of upright, so that we can include him/her in the discussion [22:12:47] #endmeeting [22:12:47] Meeting ended Wed May 21 22:12:42 2014 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [22:12:47] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-21.00.html [22:12:47] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-21.00.txt [22:12:47] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-21.00.wiki [22:12:47] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-05-21-21.00.log.html [22:12:47] #action cscott to run some additional statistics [22:12:49] whoops, too late [22:12:59] cscott: can add it on the wiki page [22:13:10] anyway, i'll run the statistics and find out how many images would change size if we just ignored 'upright' entirely. 
[22:13:22] since i agree that's an even better way to make upright sane. [22:13:22] Many thanks, all! cheers [22:13:46] cscott: sounds great, thanks! [22:14:30] i'm in 2008 and upright is still there, like a persistent wart [22:14:49] cscott: All the silly things are really old [22:15:06] i'm afraid I'm going to find that TimStarling wrote it. ;) [22:15:16] bawolff: Hey! That's ageism [22:15:24] cscott: if you feel like writing up the tale of upright as a cautionary tale suitable for telling around campfires, that would make me happy [22:16:09] sumanah: Once upon a time, special pages were implemented as functions instead of classes. If that doesn't scare you, nothing will :) [22:16:47] bawolff: I will probably be able to be scared by that in like 2 months. Right now I am oblivious. I am like Donny in The Big Lebowski [22:16:56] Raimond Spekking was responsible, on may 21 2007: git commit f8014e24e5d3292a555fc704b63bd14509e4f774 [22:17:11] bawolff: the code is still there [22:17:36] if ( !$function ) { [22:17:36] $this->mFunction = 'wfSpecial' . $name; [22:17:38] cscott: http://mediawiki.org/wiki/Special:Code/MediaWiki/22305 [22:17:45] Introducing new image parameter 'upright' and corresponding variable $wgThumbUpright. [22:17:45] This allows better proportional view of upright images related to landscape images on a page without nailing the width of upright images to a fix value which makes views for anon unproportional and user preferences useless [22:17:45] Usage: [22:17:45] * [[Image:pix.jpg|thumb|upright|caption]] = Upright image will be scaled down by $wgThumbUpright (default 0.75, seems to me the best value) [22:17:48] * [[Image:pix.jpg|thumb|upright=0.6|caption]] = Upright image will be scaled down by 0.6 [22:17:48] Size of thumb is always rounded to full __0 px to avoid odd thumbsizes and spare the cache [22:17:50] [22:17:50] If used in combination with a width, upright will be ignored. 
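The behavior described in the commit message above can be sketched roughly as follows. This is a minimal Python illustration of the stated rules only, not MediaWiki's actual PHP code: the function name `thumb_width`, its parameters, and the half-up rounding to the nearest 10px are assumptions made for the example.

```python
import math

# Default of $wgThumbUpright according to the quoted commit message.
DEFAULT_UPRIGHT_FACTOR = 0.75


def thumb_width(default_width, explicit_width=None, upright=False, factor=None):
    """Return a thumbnail width in pixels (hypothetical helper, not a MediaWiki API)."""
    # An explicit width in the wikitext wins; upright is ignored in that case.
    if explicit_width is not None:
        return explicit_width
    # Without upright, the default (possibly user-preferred) thumb size applies.
    if not upright:
        return default_width
    # A bare 'upright' uses the default factor; 'upright=0.6' supplies its own.
    scale = DEFAULT_UPRIGHT_FACTOR if factor is None else factor
    # Round half up to the nearest 10px (assumed rounding rule).
    return int(math.floor(default_width * scale / 10 + 0.5)) * 10


print(thumb_width(180, upright=True))                      # bare upright
print(thumb_width(180, upright=True, factor=0.6))          # upright=0.6
print(thumb_width(180, explicit_width=300, upright=True))  # width given, upright ignored
```

Under these assumptions, a 180px default thumbnail comes out at 140px for a bare `upright` and 110px for `upright=0.6`, while an explicit 300px width is returned unchanged. Passing `factor=1.0` returns the default width again, which mirrors the point made in the meeting that a default scale factor of 1.0 turns a bare `upright` into a no-op.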
[22:18:03] TimStarling: And sometimes it haunts my nightmares [22:18:39] of course that is not the original code, it is a backwards compatibility layer that I wrote when I introduced OOP special pages [22:19:54] here is the original code: http://paste.tstarling.com/p/vDwSuz.html [22:20:14] cscott, interesting that the scaling factor was added right in the first version [22:20:41] i personally like the "seems to me the best value" rationale. ;) [22:20:49] I don't even want to know why getting the list of special pages that are restricted to certain groups is a method of the Language class [22:21:41] cscott: no bug report referenced [22:21:50] so.. do we want to schedule another rfc meeting for upright, and invite Raimond Spekking? [22:21:52] cscott: empirically determined [22:22:23] TimStarling: yes, i was poking around looking for that. mw didn't use code review back then, did it? [22:22:25] I didn't realise it was so long ago, he probably doesn't care anymore [22:22:39] yeah, so it was possible for things to sneak in without review [22:22:53] there was a commits mailing list [22:23:12] review basically depended on the "unread" status in brion's mail client [22:23:39] cscott: I say yes. Would you terribly mind checking in with Raimond about what time, in some upcoming M/W/F, he would be available? It's also fine to try to work this out on the mailing list which might be easier [22:23:53] it was a silly time [22:25:10] it's funny, i still don't think of 200x as "a long time ago" [22:26:01] cscott: http://xkcd.com/647/ [22:27:14] i remember java 1.0 [22:27:30] when designing your VM around Sparc architecture seemed a sane and sensible thing to do. ;) [22:27:38] <^d> TimStarling: I'm trying to get rid of wfSpecial*() actually. [22:27:59] <^d> I should probably break that into two commits. Getting rid of wfSpecial*() is easier than making execute() abstract. [23:24:13] Welcome, lilatretikov! [23:24:19] Hello World! [23:24:22] lilatretikov: hi!
[23:24:28] We will be starting office hours with the incoming Wikimedia Foundation ED in just over 5 minutes (at the bottom of the hour) in #wikimedia-office. [apologies for multi channel spam] [23:24:42] * philippe waves at folks... we're still getting set up here. :-) [23:25:50] Oh wow so many great people are here for this. [23:25:59] * marktraceur waves at lvillaWMF and kmaher_ too [23:26:04] Y'all need to come 'round more often [23:26:12] sigh [23:26:15] I miss IRC :( [23:26:19] but I’d miss my productivity more ;) [23:26:23] Truth [23:28:20] hi hi! :) [23:28:26] Hey kmaher_ :) [23:28:58] why lfaraone :-) You told me to come on IRC more.... :) [23:29:27] OK, folks... it's about that time, shall we get started? :-) [23:30:09] Welcome to office hours... today we have with us Lila Tretikov, who is the incoming Executive Director of the Wikimedia Foundation, and therefore my (and many of the people here) new boss. :-) [23:30:35] Hi everyone -- really excited to be here and meet all of you! [23:30:37] Lila, you wanna hello, or impart sage words or anything like that? :) [23:31:06] I think we'll be casual and informal as much as possible today, so I'm not going to queue up questions for Lila... so if you have them just shout them out. [23:31:26] * Krenair waves [23:31:39] * karma says hi [23:31:41] hihi... cool to be back on IRC again. it has been a while :) [23:31:57] So, Lila.... since they're being shy today.... [23:32:06] * lfaraone welcomes lilatretikov , and congratulates philippe on coming back to IRC, if only for a brief time. [23:32:09] What do you think about the WMF after your first couple of weeks with us? :) [23:32:17] * philippe thwaps lfaraone :-) [23:32:18] what have you found most exciting so far, and and the most worrisome? [23:32:28] Suppose an opening question is about your approach: transactional or transformational? 
[23:32:34] lfaraone: You tell 'im [23:32:35] i love the people and the community -- i just started meeting some of you [23:32:54] and i cannot stop talking about it [23:33:02] i think people think i am too excited [23:33:19] but it is truly incredible to see how passionate people are about wikipedia first-hand [23:33:23] :P [23:33:43] Yeah, wikimedia has dedicated editors. :) [23:33:45] that said there are some adjustments i am going through as well :) [23:33:58] Risker's question first.... [23:34:05] Most exciting, and most worrisome :) [23:34:08] exciting: the people, both at the wmf and the community [23:34:47] worrisome: trends, difficulty of maintaining focus with a mission this large, some community dynamics [23:35:28] sDrewth is up next [23:35:30] * marktraceur would like to hear more about what trends are most worrisome but thinks it would count as a separate question [23:35:42] Hi sDrewth... I think it depends on what we are talking about [23:35:47] Mark you're up next then. :) [23:35:51] Righto [23:36:12] some things will be transformational as we embark on our strategy development in the different part of the lifecycle of our project [23:36:56] others are highly operational/transactional... for example improving how we improve software manufacturing process :) [23:37:12] i think we need to think at both levels [23:37:32] marktraceur asks about worrisome trends... [23:37:45] hi marktraceur -- it's a bit early for me to tell -- i am coming up to speed and digging into the data. [23:37:48] but at the high level [23:37:52] I imagine the "Oh shit" graph is one, but I'm curious if there are more [23:38:05] (the editor decline) [23:38:17] i don't like seeing the derivative on our editor and uniques chart flat or negative [23:38:52] but i still need to understand more [23:39:01] That's all the questions so far.... anyone else have any thoughts? Or are we going to be forced to listen to Risker sing for us?
[23:39:16] I sing very well, Philippe [23:39:20] There are worse fates [23:39:23] lilatretikov: apropos of software delivery, I imagine you are involved in the search for the new Vice President of Engineering, and I'd be interested in your opinion of that role and how we're filling it (btw, we haven't met, I'm the QA guy in Tucson AZ) [23:39:27] the way we are looking at the data needs to be more holistic -- i.e. include mobile for example [23:40:08] thanks, Chris, she's working on that one now :) [23:40:23] hi chrismcmahon -- i am involved, but it needs to be the team's decision. [23:40:38] Holistic in what way, may I ask? (aside from mobile) [23:40:41] i am looking to provide guidance and clarity on what that role's primary focus needs to be [23:40:52] so we evaluate against the same criteria [23:41:40] RichardNevellWMU -- we need to look at all of the channels that contribute towards the number individually and together [23:41:43] Lila, could you please define the Wikipedia community? Thanks. [23:42:08] so for example... [23:42:12] Guest38632: nothing like an easy one, huh? That's maybe the hardest question around.... I've been trying for years :) [23:42:18] :) [23:42:53] if we look at the contribution funnel we want to make sure we account for: regions/locations, form factors. [23:43:07] types of contributions [23:43:09] etc. [23:43:37] Moving on to the question from Guest38632, the definition of the Wikipedia community. [23:44:41] hi Guest38632 -- to me wikipedia community as a whole represents all of our audiences: both readers and editors, both contribute to our mission [23:44:58] "lilatretikov: if we look at the contribution funnel we want to make sure we account for: regions/locations, form factors" What does that mean? (Feeling buried by jargon) [23:45:14] SB_Johnny: thanks, you're up next :) [23:45:21] but each group has unique perspectives and needs...
and within those two there are differences as well [23:46:21] SB_Johnny sorry: where people are, what devices are they using, what browsers, how do they get to the point at which they make an edit -- is that better? [23:46:44] lol. yes, much better! [23:46:45] hehe I understood it better. :) [23:47:04] sorry -- it's hard for me to tell -- some questions are pretty technical! [23:47:17] so here are some questions for you: [23:47:49] what do you love most about wikipedia, what are you troubled by most [23:47:59] where do you think is our biggest opportunity [23:48:11] OK, the lady asked some questions, who's gonna take a stab at some answers? :) c'mon lfaraone, you're being quiet, which usually means you're plotting something.... [23:48:13] lilatretikov: some of us prefer other parts than the P 'wikiPedia' [23:48:20] Yeah [23:48:37] the WP has become too much of a bitch fight :-/ [23:48:48] The thing I love most about the Wikimedia movement is that I can help everyone and that everyone can help me [23:48:49] I was about to say something like "I'm a little worried that you're only asking about Wikipedia"! But I'm biased :) [23:48:50] sDrewth tell me more.. what do you mean? [23:48:51] I think that could be described as "almost everyone can find their niche", sDrewth [23:48:55] OK, then please let me rephrase my question. Lila, how many members of the Wikipedia Community should take part in voting (any voting) to call the result of this voting "the action of the community"? Thank you. [23:49:00] lol billing [23:49:00] remember I am new -- need context [23:49:31] <- primarily a wikisourcer (and a steward) [23:49:31] So, Guest38632, thanks, I've got that lined up.... as soon as we've seen some answers to Lila's questions, we'll get to that one.
: ) [23:49:41] lilatrettikov: I think he means that there's more to the wikimedia than wp [23:49:45] yeah [23:50:30] I think that Wiki(p)(m)edia is becoming increasingly conservative and inflexible in its community, which is the antithesis of the founding principles [23:51:13] my loving is opportunity to do something to the world. to share and help as well to learn things [23:51:15] She's typing away :-) [23:51:27] sDrewth -- yes i agree... i still use it a bit interchangeably, but i am learning how different parts of the community identify with the projects [23:51:29] In response to Lila's question, what I love most about Wikipedia is the positive feedback article writers get sometimes from readers. For me, one particular occasion comes to mind, and I hope that other article writers have had that moment where they feel they're people learn. Few things are more motivating. [23:51:51] *they're helping [23:52:02] lilatretikov: don't take me wrong, I have broad rights and broad contributions. I just prefer the lesser politics of the smaller communities [23:52:19] my question is relevant to the movement as a whole, our biggest potential may be with one of the sister-projects [23:53:32] lilatretikov: I love that we have a community that is so driven to create high-quality content, as showcased by the featured articles/pictures that serve as a constant reminder that I could always be a better writer. [23:53:32] In terms of troubles, my focus has been mostly from a process standpoint; I worry that the processes that require participants with community trust (administrators, functionaries, ArbCom etc) will simply fail to scale, and the process quality will decline. (I'm speaking from looking at the English Wikipedia; I admit I'm not familiar with how backlogged e.g. [23:53:32] Commons is, but every time I look things seem to have mostly kept in check) [23:53:32] I quite dislike the 'wikipedia vs. sister-projects as a group' thing.
Wikipedia should be given equal weight to each of the 'sister projects' :/ [23:53:54] ++Krenair [23:54:10] (++ too) [23:54:28] Wiki(m/p)edia has a remarkable mission, that is remarkably easy to share with others and that others recognized immediately, if only superficially. OTOH, academics and such recognize and promote the mission in highly sophisticated ways. [23:54:47] We're furiously reading over here, folks, thanks for your opinions.... for those who are just joining us, Lila is asking what we love about the projects, and what the biggest risks are... [23:54:47] Being part of it is just a cool feeling [23:54:51] Krenair, marktraceur, Jasper_Deng -- if everything is equal weight -- how would you prioritize? [23:55:03] while they are the flagship as the public votes, they tend to fix the poicy, sometimes to the detriment of the sister projects [23:55:11] policy [23:55:23] paid editing policy was one example [23:55:24] What seems to be happening is that enwiki seems to get the most attention because I'd imagine most of our donations are because we're known for enwiki [23:55:33] lilatretikov: Prioritise things that have broad effect, then things that have the most effect in order, IMO - there's a much longer definition to be laid out, but it's definitely possible [23:55:34] Guest38632 -- not sure yet, and I think probably depends on a specific project in question and its audience [23:55:44] Jasper_Deng: granted, it also has the pageviews. [23:56:15] (Not saying that is the correct approach, I just don't think it's purely a donation-generation prioritisation) [23:57:48] Improvement potentials: 1. WMF is not doing enough to measure content quality. 2. Volunteers have asked for help with child protection; nothing has been done. [23:58:46] Lila, you do understand that the actions of that community of mostly anonymous users have great potential to destroy people's lives, both the editors and the subjects of BLPs? [23:58:50] marktraceur -- i agree with broad effect/reach.
but prioritization by definition means that something gets attention, and something else does not. just want to be clear on that. [23:58:58] Another thing I think should be done is more reach out to high schools and colleges about how to use Wikipedia to do research. [23:59:35] lilatretikov: Sure, I think there's enough to go around that WP doesn't wind up looking like our only priority, but at the same time I'm aware that WP is a huge part of our priorities for good reason :) [23:59:48] Whether equal weight means equal prioritisation, I'm not sure