[16:57:06] Niharika: hi [16:57:19] Hi aharoni! [16:57:21] hola [16:57:37] you may want to take a look at https://www.mediawiki.org/wiki/Interlanguage_links/September_2014 [16:58:24] Language Engineering office hour begins in a few minutes .. right here [16:59:26] aharoni: Okay, thanks! I'm working on a couple of bugs. Issue with gerrit, not been able to push them for review. [16:59:41] Hi arrbee. :) [16:59:50] Hey Niharika! Good to see you [17:00:03] Same here. [17:00:07] :) [17:00:08] Hello World! [17:00:19] Hello.. [17:00:30] #startmeeting Language Engineering monthly office hour - September 2014 [17:00:30] Meeting started Wed Sep 10 17:00:30 2014 UTC and is due to finish in 60 minutes. The chair is arrbee. Information about MeetBot at http://wiki.debian.org/MeetBot. [17:00:30] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [17:00:30] The meeting name has been set to 'language_engineering_monthly_office_hour___september_2014' [17:00:42] yay.. all set with meetbot [17:01:05] Hello, Welcome to the monthly office hour of the Wikimedia Language Engineering team [17:01:31] I am Runa, the Outreach and QA co-ordinator for our team [17:01:55] Our office hours are held every 2nd Wednesday of the month [17:02:15] The last office hour was on 9th July [17:02:25] Logs at: [17:02:33] #link https://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2014-07-09 [17:03:03] We did not host an office hour in August as it coincided with Wikimania [17:03:27] we met with a lot of our community members in person at the conference [17:03:43] Before we start, the important announcement: [17:03:58] IMPORTANT: The chat today will be logged and publicly posted [17:04:13] This is also mentioned in the channel topic [17:04:37] A quick introduction of the team (if you haven't met us before) [17:04:55] We are the Wikimedia Language Engineering team [17:05:01] Good day [17:05:09] Hello [17:05:20] Hello [17:05:26] Our team page on mediawiki.org has details about the 
projects we work on and how you can participate in them [17:05:29] LeT. [17:05:38] #link https://www.mediawiki.org/wiki/Wikimedia_Language_engineering [17:06:12] We build and maintain language features and tools for the wikis in more than 300 languages [17:06:23] and support the wiki communities around the world [17:07:21] Along with me, present today are my team mates aharoni kart_ jsahleen Nikerabbit pginer santhosh [17:08:18] jsahleen joined us very recently [17:08:30] Thanks for your great work with and between so many languages! [17:08:36] Hi everyone. [17:08:38] :) [17:08:51] Hi jsahleen! [17:09:07] Scott_WUaS: Thank you. :) [17:09:14] :) [17:09:30] We are an entirely remote team and work from several places around the world [17:10:06] Today we will begin with an update from the project that has kept us busy for quite a few months [17:10:22] Content Translation [17:10:26] #link https://www.mediawiki.org/wiki/Content_translation [17:10:53] and then move on to other topics and Q & A [17:11:13] Please feel free to ask questions at any time [17:11:40] The Content Translation tool is a way to create new Wikipedia articles from existing articles in another language [17:11:58] The first release of the tool was made right after our last office hour in July [17:12:15] You may have seen the announcement on the Wikimedia blog [17:12:18] #link http://blog.wikimedia.org/2014/07/16/first-look-at-the-content-translation-tool/ [17:12:43] This version was deployed on the beta wikis and machine translation support was enabled for Spanish to Catalan [17:13:10] You can try the tool by reading through the instructions on the project page [17:13:16] #link https://www.mediawiki.org/wiki/Content_translation#Try_the_tool [17:13:37] In just over a month more than 150 articles have already been created for the Catalan Wikipedia [17:14:14] Last week, we hosted an online round-table with some of the Catalan editors, who shared their experiences about how they were using the tool to 
create articles [17:14:35] You can watch the video (its an hour long) [17:14:37] #link https://www.youtube.com/watch?v=vHu3vdlE1X8 [17:15:03] The second release is currently in progress and targeted for completion by the last week of September [17:15:26] The details of the release plan is at the following link [17:15:29] #link https://www.mediawiki.org/wiki/Content_translation/Roadmap/CX01Release [17:16:04] I will pass on to aharoni to talk a bit more about what we can expect from this upcoming release [17:16:40] Hallo! [17:16:49] aharoni: Hello [17:16:57] For the next release we are working on the following: [17:17:24] carefully testing more tools for more language pairs [17:17:56] making template handling more stable and predictable, so that in general terms: [17:18:20] 1. Big block templates such as infoboxes are usually skipped completely, unless specific adaptation code was written for them. [17:18:57] 2. Inline templates, such as {{IPA}}, {{lang}} or everybody's favorite {{citation needed}} are copied as HTML (this will be improved later). [17:19:41] 3. Specifically adapted templates are adapted automatically - we did this as an experiment for three templates in Spanish for now. [17:20:05] Where in MediaWiki or on the web, please, is there a list of sister, external, and complementary interlingual projects, if any exist, that are in MediaWiki and in Wikidata / Wikibase, as well? [17:20:40] Further, we are fixing a lot of bugs in the translation interface, based on feedback from the users in the Catalan Wikipedia. [17:21:13] And we are making the algorithms for preventing articles that have nothing but machine translation smarter. [17:22:48] How to link to external dictionaries and glossaries like FUELproject? [17:22:58] Thanks aharoni [17:23:52] Pavanaja, adding more glossaries and dictionaries is definitely on our roadmap. 
[17:24:23] Thanks aharoni [17:24:33] Currently we have simplistic support with a few free dictionaries, but we are going to enhance it in the coming months. [17:25:03] The issue is that for each dictionary we need to write its own API, but we'll get there, step by step. [17:25:50] Scott_WUaS: aharoni may be the best person to answer that. I will wait for him to return with a reply. [17:26:01] Moving on [17:26:31] thanks [17:26:57] To add to what aharoni already mentioned about the next release, we are currently testing more language pairs that are on the roadmap [17:27:17] Catalan to Spanish i.e. the reverse of what currently exists [17:27:32] and also Spanish to Portuguese [17:28:15] pginer is currently testing the tool with users with the new languages [17:28:42] Please do let him know if you would like to participate in the tests [17:29:01] (I will post the link to the sign up form in a few minutes) [17:29:39] For Spanish and Portuguese we are planning for bidirectional support [17:29:56] Scott_WUaS: If I understand your question correctly _and_ as far as I know, the projects supported by Wikidata are Wikipedia, Wikinews, Wikiquote, Wikisource, Wikivoyage and Wikimedia Commons. [17:29:56] You can fill this form: https://docs.google.com/a/wikimedia.org/forms/d/1yCvPS65eWk9S8uXkksAbDbLsbZQd0ISQKBDFfJnSSo0/viewform [17:30:06] Thanks pginer :) [17:30:14] if you are interested in taking part of upcoming testing sessions [17:31:02] We prioritize based on the next languages to be enabled, but we like to test with people that speak as many different languages as possible [17:31:27] and with a variety of scripts [17:32:27] We would also love to hear feedback from users who have used the tool by using the tweak [17:32:42] i.e. by specifying the source and target language in the URL [17:34:01] Scott_WUaS: I see aharoni just replied. Is that the information you were after? [17:35:05] okay.. 
we will come back to that discussion :) [17:35:22] Also, there were two updates related to Universal Language Selector [17:36:13] On request from the respective communities, webfonts have now been enabled on the English Wikisource and all the wikis in Divehi language [17:37:03] So if you are viewing any of these wikis you should not be seeing any tofu or square blocks of text [17:37:55] We have about 20 more minutes left [17:38:21] aharoni: pginer kart_: did you want to add anything more about the Content Translation project? [17:38:58] What about ULS integration with VE? [17:39:23] Good point [17:39:43] divec has made some progress with integrating the input methods with VE [17:40:03] But its still needs more work [17:40:50] He is not here today but he might be reaching out to more users to test the input methods [17:41:12] We or the more likely the VE team will keep you posted when that happens [17:41:41] Thanks for bringing that up Pavanaja [17:42:11] @arrbee :) [17:43:03] Okay, since we mentioned Wikimania earlier on [17:43:57] In case you were not there at the conference or could not attend the talks we presented, here are the videos: [17:44:23] 1. - Santhosh Thottingal, Amir Aharoni: Machine-aided machine translation (Content Translation) [17:44:32] #link http://www.youtube.com/watch?v=b6qvv3eJ_Ag&t=32m40s [17:44:42] 2. Runa Bhattacharjee, Kartik Mistry: Testing multilingual applications [17:44:52] #link http://www.youtube.com/watch?v=0hcjZvateZs&t=32m15s [17:45:00] Thanks Arrbee and Aharoni - much happening at this end - I've heard on the Wikidata email list that there are external sister projects engaging Wikidata and Mediawiki ... and I'm curious if you know where these are listed ... thanks [17:45:27] thousands of websites use MediaWiki :) [17:45:53] in simplistic terms, it's just a content management system. [17:46:15] aharoni: inter-lingually and for many languages (and with Wikidata)? 
[17:46:53] no, mediawiki by itself is *just* a content management system. Its users make it really multilingual; we just provide the tools. [17:47:27] Wikidata is still pretty young and growing, but just a few days ago I heard about http://www.histropedia.com/ , a site that uses Wikidata to show information in a different way. [17:47:40] But it's a completely external site, that we have no influence on. [17:49:28] thanks ... I don't see various languages there yet ... and I'm also interested in ones that engage Wikicommons in a variety of languages ... [17:50:33] as well as SemanticWiki in a variety of languages with MediaWiki [17:51:49] * arrbee will check the wikidata list again to see if we can find more information to help better about this [17:52:08] thanks [17:52:41] Scott_WUaS: if possible could you please PM me your email address, so that I can contact you if we find something? [17:53:24] We have about 5 more minutes left [17:53:49] * arrbee looks around for any more questions :) [17:55:18] oh, before I forget, we also had our quarterly review last week. [17:55:38] The notes and slides from the meeting will be up on this page sometime soon [17:55:42] #link https://meta.wikimedia.org/wiki/WMF_Metrics_and_activities_meetings/Quarterly_reviews [17:56:01] Please let us know if you have questions [17:56:54] arrbee - just sent you my email - thanks ... what's yours? [17:57:26] Scott_WUaS: Thanks. Mine is runa at wikimedia dot org. I will share the contact details for our team in a minute. [17:58:05] Okay, so we are now nearly out of time [17:58:26] We won't keep you back longer :) [17:58:43] As I mentioned earlier, our office hours are generally held every 2nd Wednesday of the month at 1700 UTC [17:58:48] Thank you, Runa! [17:58:52] Thanks arrbee! [17:59:00] So if nothing changes, our next office hour will be on October 8, 2014 [17:59:17] Our mailing list is mediawiki-i18n@lists.wikimedia.org and IRC channel is #mediawiki-i18n [17:59:47] Hi everyone, this is KC. 
I have 2 questions: 1. Could anyone give a guesstimate on a V1.0 release date for the Content Translation Tool? 2. Could anyone give a guesstimate on a V1.0 release date for Extension:Translate/Usability improvements 2014 (https://www.mediawiki.org/wiki/Extension:Translate/Usability_improvements_2014#The_page_language_for_multilingual_wikis_should_not_be_fixed), particularly with reference to Bug 3548 [17:59:56] Also contact details can be found here: [17:59:59] #link https://www.mediawiki.org/wiki/Wikimedia_Language_engineering#Contact_us [18:00:10] be arbitrary (https://bugzilla.wikimedia.org/show_bug.cgi?id=35489) [18:00:35] kclau: Hi, for your first question, its last week of September 2014 i.e this month [18:01:03] For the 2nd I will ping Nikerabbit but you may have to continue the discussion ong #mediawiki-i18n [18:01:14] We will have to leave this channel now, sorry [18:01:33] thank you [18:01:48] Thanks everyone. See you next month. [18:01:50] #endmeeting [18:01:50] Meeting ended Wed Sep 10 18:01:50 2014 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [18:01:50] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-17.00.html [18:01:50] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-17.00.txt [18:01:50] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-17.00.wiki [18:01:51] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-17.00.log.html [18:02:16] kclau: that seems wrong bug number [18:07:23] cndiv: can i come pick up that USB CD player now? [20:26:54] guillom: https://bugzilla.wikimedia.org/show_bug.cgi?id=70576 :) [20:29:03] \o/ [21:00:39] do we have a DanielK_WMDE? 
[21:00:56] I know it's very late there [21:01:28] <^d> Idle >1hr :( [21:02:48] #startmeeting [21:02:48] TimStarling: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee' [21:03:09] #startmeeting RFC meeting 2014-09-10 [21:03:10] Meeting started Wed Sep 10 21:03:09 2014 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot. [21:03:10] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [21:03:10] The meeting name has been set to 'rfc_meeting_2014_09_10' [21:03:18] \o/ [21:03:23] \o/ [21:03:31] <^d> /o\ [21:03:32] Weeeee! [21:03:43] #topic RFC meeting | https://meta.wikimedia.org/wiki/IRC_office_hours | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ [21:04:22] more importantly, do we have andrew green? [21:04:24] Hi all :) [21:04:30] Yes I am had [21:04:35] hello! [21:04:55] #topic Data mapper | RFC meeting | https://meta.wikimedia.org/wiki/IRC_office_hours | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ [21:05:02] #link https://www.mediawiki.org/wiki/Requests_for_comment/Data_mapper [21:05:43] so the general idea of this is to move the ORM stuff from Campaigns to the core? [21:06:19] Well, at least to talk about better ORM in core or somehow accessible by Mediawiki code [21:07:02] * aude waves [21:07:08] so how does this differ from ORMTable? 
[21:07:14] I think that better ORM can go together well with improving our architecture [21:07:18] hi aude [21:07:34] The main difference is it's the data mapper pattern versus the active record pattern [21:07:42] Or at least that's one main difference [21:08:16] So in this pattern domain objects don't worry about storage and retrieval much, and can contain a more succinct expression of domain logic [21:08:39] There are other differences, for example the lack of reliance on hard-coded strings [21:09:08] And also a few features geared specifically for optimizations that are recommended for MW code [21:09:52] reminds me a little of apple’s CoreData [21:10:06] objects are relatively “dumb”, and you insert/update/etc them via a context mapper object [21:10:27] * bd808 dislikes smart data objects [21:10:44] Yeah that's about it, (I don't know much about coredata though) [21:10:48] our core objects like User and Title and WikiPage are heavyweight smart data objects, and it’s painful :D [21:10:57] i’m less familiar with the newer ORMTable stuff [21:11:07] A big motivation is just to have a clean way of separating out relatively low-level persistence stuff [21:11:24] Is there new ORMTable stuff? My experience with it is from the EducationProgram extension [21:11:49] brion means newer as in last century or something [21:11:56] heh [21:11:57] as opposed to the century before [21:12:03] newer* not new :) [21:12:16] my main objection to dumb data objects though is you could have just used database row objects …. [21:12:23] Ah hmmmmmm :) [21:12:24] * brion hrms [21:12:53] * DanielK_WMDE wibbles [21:12:56] Here's a book that I was reading while I worked on this, took several pointers from it: http://books.google.ca/books?id=xColAAPGubgC&printsec=frontcover#v=onepage&q&f=false [21:13:46] it would be nice if db loading, encoding, master/slave differences, lazy-loading, caching, and CRUD were more consistent in core [21:14:10] Yeah! 
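The data mapper idea described above (domain objects that know nothing about storage, plus a separate mapper that inserts and updates them) might look roughly like the following. This is only a minimal sketch in Python rather than MediaWiki's PHP; the `Campaign` and `CampaignMapper` names, the table layout, and the use of SQLite are assumptions made for illustration and are not taken from the RFC.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Campaign:
    """Hypothetical domain object: holds state and domain logic, no SQL."""
    name: str
    enabled: bool = False
    id: Optional[int] = None

    def enable(self) -> None:
        # Domain logic lives here, free of persistence concerns.
        self.enabled = True


class CampaignMapper:
    """Hypothetical mapper: the only class that knows the table layout."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS campaign "
            "(id INTEGER PRIMARY KEY, name TEXT, enabled INTEGER)"
        )

    def save(self, campaign: Campaign) -> Campaign:
        # Insert new objects, update ones that already have an id.
        if campaign.id is None:
            cur = self.conn.execute(
                "INSERT INTO campaign (name, enabled) VALUES (?, ?)",
                (campaign.name, int(campaign.enabled)),
            )
            campaign.id = cur.lastrowid
        else:
            self.conn.execute(
                "UPDATE campaign SET name = ?, enabled = ? WHERE id = ?",
                (campaign.name, int(campaign.enabled), campaign.id),
            )
        return campaign

    def find(self, campaign_id: int) -> Optional[Campaign]:
        # Turn a database row back into a plain domain object.
        row = self.conn.execute(
            "SELECT id, name, enabled FROM campaign WHERE id = ?",
            (campaign_id,),
        ).fetchone()
        if row is None:
            return None
        return Campaign(id=row[0], name=row[1], enabled=bool(row[2]))
```

In an active record design, by contrast, `save()` and `find()` would sit on `Campaign` itself, so the domain class would also carry the SQL.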
[21:14:16] every object does things its own way, even to really just do the same thing [21:14:17] Here's another book I looked at a lot: http://books.google.ca/books?id=vqTfNFDzzdIC&printsec=frontcover#v=onepage&q&f=false [21:14:27] i'd argue somewhere between smart and dumb objects. If your data model is just a class of getters/setters that's not better than a db row. The model should contain the necessary code related to state changes possible. To take a flow example, if a Post can be replied to then $post->reply(...) returns a new Post object that can be stored in the db [21:14:42] ok so ORMTable is about 2 years old, not quite a century yet [21:15:06] <^d> Question: do we have any concrete use cases for data mapper things outside of extensions? [21:15:08] Hmmm so I'm about half as old as it in WMF terms [21:15:11] <^d> That's my complaint with ORMTable. [21:15:15] <^d> We never used it really. [21:15:30] isn't that deprecated? [21:15:37] ORMRow isn't though [21:15:47] <^d> I don't think so? [21:15:55] ebernhardson: right. The idea would be to keep purely infrastructure code out of the logic that tells you about the domain [21:15:58] AndyRussG: would classes have to be manually created for this? i’m a little vague reading the notes [21:16:04] IMO it's not so much about dumbness as the single responsibility principle [21:16:05] * bd808 hides from class names like IORMRow [21:16:09] fwiw, ORMTable isn't really ORM, and it's an Abomination Unto Nuggan. I'd be very happy to see actual ORM, as long as it deals with "stupid" value objects. [21:16:17] <^d> AaronSchulz: Only deprecated stuff I see there is singleton() related things. [21:16:23] * DanielK_WMDE is still reading the rfc [21:16:24] an object should either have detailed knowledge about the domain or the database but not both [21:16:42] tgr: yeah!! 
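The "somewhere between smart and dumb" position above, where a model carries its own state-change rules but no storage code, could be sketched as follows. This is an illustrative Python sketch only; the `Post` fields and the `locked` rule are hypothetical and are not taken from Flow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """Hypothetical model: knows how posts relate, not how they are stored."""
    author: str
    text: str
    parent: Optional["Post"] = None
    locked: bool = False  # assumed rule, for illustration only

    def reply(self, author: str, text: str) -> "Post":
        # The state-change rule lives on the model itself.
        if self.locked:
            raise ValueError("cannot reply to a locked post")
        # Returns a new object; persisting it is left to a separate layer.
        return Post(author=author, text=text, parent=self)
```

The returned `Post` is just a value; whatever persistence layer is in use would be responsible for writing it to the database.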
[21:16:44] tgr: +1 [21:16:54] almost going back to plain old procedural programming here ;) [21:16:56] <^d> DanielK_WMDE: I'm not saying we should follow ORMTable's model, just saying "like ORMTable, I want concrete use cases" [21:17:46] mark: As a reaction against kitchen sink object designs I suppose. [21:17:48] ^d: makes sense, yes the proposed development methodology is meant to address that at least partly [21:17:55] yeah [21:17:56] ^d: oh, i agree. i was just commenting on earlier mention of ormtable [21:18:14] Since I've been working on CentralNotice, I've definitely seen some use cases there [21:18:24] <^d> bd808: I will say: a kitchen is rather lacking without a kitchen sink. [21:18:51] I mean, it's hard _not_ to see a use case anywhere where you might like to deal in objects rather than rows [21:18:59] ^d: Sure, but `new Kitchen()->washDishes()` is wacky [21:19:22] side note: can we please not have interface names start with "I"? The interface for cars should be called "Car". That's what you'll see in type hints. The implementation can have an ugly name, nobody is going to see it anyway. [21:19:27] <^d> Less wacky than `new Bathroom()->washDishes()` [21:19:30] <^d> ^ That's just gross [21:20:03] DanielK_WMDE: interesting, yeah I was grepping about core to see how strong the pattern is [21:20:03] DanielK_WMDE: AndyRussG's code generally has an interface (with I prefix) and a single concrete implementation (without I prefix) [21:20:18] from what I have seen of it [21:20:23] It could be InterfaceName and InterfaceNameImpl [21:20:33] I'd like that better personally [21:20:36] There's nothing in the code that obliges that naming pattern [21:20:38] ^d: it gets worse when Bathroom() extends Kitchen() [21:20:40] class CampaignRepository implements ICampaignRepository { [21:20:51] TimStarling: and i dislike that :) just a pet peeve. 
[21:20:56] Could be CampaignRepository implements CampaignRepositoryImpl [21:21:08] reversed of course [21:21:17] grep -ri "^interface I" includes/ [21:21:20] i feel like i’d like this more if i didn’t have to manually create both a class and a table definition [21:21:21] (from core) [21:21:34] Although for most things I'd argue that an interface is overkill [21:21:39] brion: yes that was a complaint during code review [21:21:46] <^d> mark: In that case, just invoke the House() which has accessors to all the rooms :p [21:21:56] bd808: it could also be just an implementation with no separate interface [21:21:57] for instance i love that with the memcache interface you just need to devise a key and stuff things in :) [21:22:12] so... let me ask an evil question. why make our own, when decent orm implementations exist? [21:22:21] like, err... dbal? [21:22:23] DanielK_WMDE: yesssss! [21:22:34] * bd808 listens for the roar of the crowd [21:22:34] :)) [21:22:38] brion: if we end up using Doctrine, it has pretty clever tools for autogenerating either one from the other [21:22:45] ARE WE NOT ENTERTAINED? [21:22:53] brion: I think Doctrine uses code generation to get around such tedium [21:22:54] dbal+1 ;) [21:23:07] wonder how well that works with hhvm… [21:23:11] AndyRussG: that's for the db table -> php code instance [21:23:18] AndyRussG: it can also go from php code -> db table without autogen [21:23:25] sure, if you are writing your own thing and don't even like it that much compared to 3rd party libraries then we should probably look at the 3rd party libraries [21:23:37] Ah OK [21:23:54] "One of our goals for 2014 is running DBAL and ORM on HHVM with 100% of the testsuites passing." -- http://www.doctrine-project.org/2013/12/23/our-hhvm-roadmap.html [21:23:57] TimStarling: yes I would have gone with 3rd party libraries, I tried that first for the Dependency Injection [21:23:58] now if we can write a PHP class and have it autogenerate tables? 
that’s the shiznit [21:24:12] but it was not approved at that time [21:24:22] i'm pitching dbal for a reason. we are in the process of ripping a dbal dependency out of the query code for wikibase. can't get it reviewed/deployed. but perhaps if there are more good use cases for dbal (we mainly picked it because of its schema generation support), things would be different [21:24:32] <^d> brion: Tried that. It's messy if we don't want to ditch Oracle/Postgres/MSSQL [21:25:04] <^d> (Speaking of: we should really have an RFC on dropping databases nobody cares about) [21:25:08] i honestly feel we should ditch oracle/postgres/mssql for the simple reason that we haven’t historically kept them in working shape [21:25:19] brion: careful now or i'll invite domas in here [21:25:28] :D [21:25:38] <^d> brion: I tried to, but then someone said "I'll care about MSSQL" [21:25:39] Doctrine (and I think any reasonably mature ORM) is definitely much more complete than this implementation, and working up to a similar level of completeness would be a big task [21:25:39] would be nice to keep postgres. [21:25:41] <^d> Turns out it was lies. [21:25:43] <^d> And nobody cared. [21:25:44] bd808: brion: http://hhvm.com/frameworks/ says all doctrine2 tests pass with HHVM [21:25:50] nice tgr thanks [21:26:07] #info doctrine2 tests pass with HHVM, so dbal should be ok [21:26:34] #info for later we should discuss dropping non-my/maria dbs that are not well supported (gasp) [21:26:39] I think a first decision to try to get to (maybe not even today tho) would be 3rd party library vs. homegrown vs. do nothing vs. something else I haven't thought of [21:26:41] tgr: that's for doctrine 2.5, right? i just checked the other day, and Ubuntu 14.04 comes with 2.3... [21:26:48] * aude notes that doctrine dbal !== doctrine orm [21:26:54] Note that so far we are mostly talking about dbal, which is just an abstraction layer and query builder. 
The ORM is another library on top of that, we could use the abstraction layer and schema generation without the full orm [21:26:57] in case anyone is confused [21:27:01] aude: :) [21:27:03] * aude would like the former [21:27:18] would take care of stuff like postgres not breaking all the time [21:27:23] #info doctrine dbal and doctrine orm are separate layers. we are more interested in dbal [21:27:23] ok, can we have an RFC that's more like the template RFC, i.e. with implementation options and pro/con tables and what not? [21:27:27] aude: orm is built on top of dbal, i assume [21:27:36] hashar almost has the Jenkins infrastructure fixed up for testing things that depend on code in our mediawiki/vendor repo which would make dbal reasonably easy to work with. [21:27:40] DanielK_WMDE: it is [21:27:50] AndyRussG: is that RFC something you'd be interested in writing? [21:27:52] and i think was around longer afaik [21:28:00] "The query builder does no validation whatsoever if certain features even work with the underlying database vendor." [21:28:05] * AaronSchulz frowns [21:28:15] bd808: there's the big question of "what 3rd party libraries can we trust" [21:28:24] ^d: yeah, ditching non mysql would make that easy ;) [21:28:26] DanielK_WMDE: for sure [21:28:33] TimStarling: sure! You mean, kind of like this one but taking a step back and less focused on this proposed implementation? [21:28:43] yes [21:28:45] <^d> AaronSchulz: Heck, sqlite isn't hard since it's mostly similar. [21:29:01] the current RFC would be one of the implementation options [21:29:08] sqlite is important for jenkins [21:29:10] *nod* [21:29:14] and dev installs [21:29:16] <^d> Yes. [21:29:17] and i know there are folks that care about postgres [21:29:22] others don't know [21:29:22] yeah sqlite is actually maintained :D [21:29:31] <^d> aude: It'd be nice to have...a maintainer. 
[21:29:40] * aude nods [21:29:54] i would love to care about pg and mssql but unless somebody’s gonna fund work on them i expect them to wither [21:30:02] it also doesn't use a bunch of triggers and crap nothing else does that cause problems and require more hacks (like an anon user entry in the table) [21:30:20] you remember MS did actually briefly fund someone to work on MSSQL support [21:30:23] one worry I have about a 3rd party lib would be precisely legacy support and keeping that through the same database classes underneath [21:30:24] TimStarling, AndyRussG: Uncle Bob tells us we shouldn't bind to a framework directly, but always add a thin layer on top, so we can swap out the framework easily. [21:30:39] <^d> TimStarling: Yes. Didn't last long. [21:30:46] that would mean designing an interface for orm we'd be using, which could be implemented on top of dbal, or something else [21:30:50] | sitenotice | 1082646.90 | 69909 | 15.49 | [21:30:52] they had a lot of internal difficulty with the concept, apparently [21:31:09] sorry [21:31:25] ejegg: not forgiven! No, just kidding ;) [21:31:41] brion: wasn't saper taking care of pg? [21:32:12] not sure [21:32:15] ok, time to move on to the next RFC? [21:32:18] DanielK_WMDE: I would worry about that being just extra cruft, and frameworks like this aren't as similar as SQL implementations [21:33:00] AndyRussG: yes, I have the same worries. but it's a good point to consider, nevertheless. [21:33:20] TimStarling: sure [21:33:21] #action AndyRussG to write a more generic ORM RFC comparing implementation options [21:33:32] * AndyRussG gets a last point in sideways [21:33:42] #topic Dependency injection | RFC meeting | https://meta.wikimedia.org/wiki/IRC_office_hours | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). 
| Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/ (Meeting topic: RFC meeting 2014-09-10) [21:33:49] #link https://www.mediawiki.org/wiki/Requests_for_comment/Dependency_injection [21:34:12] now, we already have some ideas about DI in the core courtesy of DanielK_WMDE [21:34:14] Though in favor of what you say DanielK_WMDE, I confess I would love to see more structured data like Wikidata used for other stuff in MW and a layer like that might be a way in [21:34:53] but the way instances are created is quite different [21:35:19] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCampaigns.git/e99531e0e4f2a30c9b673240752db73b09784f81/Campaigns.php#L98 [21:35:37] see $wgCampaignsDI there [21:36:24] It's true that it's verbose [21:36:40] In an ideal world, I think that might (?) be folded in with an improvement to autoloading [21:36:44] Or something like this? -- https://github.com/wikimedia/wikimedia-wikimania-scholarships/blob/master/src/Wikimania/Scholarship/App.php#L135-L203 [21:37:14] i do prefer the php code, rather than a configuration array. Much more flexible when you have a particular use case [21:37:31] here is Daniel's instance creation idea: https://git.wikimedia.org/blob/mediawiki%2Fcore.git/81041728520a8eb72df2dab422aa35e0052e71b0/includes%2Fspecials%2FSpecialCategories.php#L56 [21:37:34] I tend to dislike auto wiring containers but I may be in the minority on that [21:37:56] I've untangled some nasty Spring systems in the past [21:37:56] ebernhardson: hmm... the reason the config array was suggested was to get things in the fast load path [21:38:19] My DI mindset comes from Guice BTW [21:38:49] It's similar in that you can chain injection in a series of object creations without doing too much [21:38:59] Daniel has promoted the idea of instance creation being done by entry points [21:39:09] Just have your required injections in your constructor [21:39:32] So DanielK_WMDE is doing setter based injection with fallback. 
I like that actually as a first step. [21:39:38] Yes, so you only call the DI container at your entry points and objects created further along just get injected by previous ones [21:39:43] That's in theory of course [21:39:44] AndyRussG: from the example code tim linked, i don't see when and where the actual injection is happening. i see implementation classes being registered for service interfaces. [21:39:55] Setter injection is less likely to lead to the 70 param constructor problem [21:40:06] bd808: that's true [21:40:08] DanielK_WMDE: https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCampaigns.git/e99531e0e4f2a30c9b673240752db73b09784f81/includes%2Fsetup%2FSetup.php [21:40:16] bd808: yea, but i'm trying to get rid of that, by allowing callbacks to be registered instead of class names [21:40:57] Let me pull up some code that actually uses it... [21:40:57] setter is a way to get around that we can't always easily control construction of objects in mediawiki [21:40:57] DanielK_WMDE: then you have e.g. [21:40:58] The other way to stop the constructor arg madness is to make the objects do less. Which is also usually good. [21:41:00] $setup = Setup::getInstance(); [21:41:00] $campaignProvider = [21:41:00] $setup->get( 'Campaigns\Services\ICampaignFromUrlKeyProvider' ); [21:41:03] e.g. special pages [21:41:15] TimStarling: Yeah that's it [21:41:24] not necessarily what we'd want but it can help as a refactoring step [21:41:42] +1 for callbacks [21:41:50] How would callbacks work? [21:41:57] aude: special pages should really be renamed "small applications that run in the same address space as the wiki" [21:42:01] AndyRussG: ok, so it uses reflection to figure out the type of the constructor parameters, and then provides an instance for each such parameter. [21:42:01] i think by callbacks, you mean something like Pimple? 
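The reflection-based approach described above (look at the declared types of the constructor parameters, then supply an instance for each) might be sketched like this. It is a rough Python illustration, not the Campaigns extension's actual `Setup` class; `AutowiringContainer`, `CampaignStore`, `InMemoryCampaignStore` and `CampaignLister` are all hypothetical names.

```python
from typing import Any, Dict, get_type_hints


class AutowiringContainer:
    """Hypothetical container mapping interface types to implementations."""

    def __init__(self) -> None:
        self._bindings: Dict[type, type] = {}

    def register(self, interface: type, implementation: type) -> None:
        self._bindings[interface] = implementation

    def get(self, interface: type) -> Any:
        # Fall back to the requested class itself if nothing is registered.
        impl = self._bindings.get(interface, interface)
        # Introspect the constructor's annotated parameters.
        hints = get_type_hints(impl.__init__)
        hints.pop("return", None)
        # Recursively build one instance per annotated parameter.
        kwargs = {name: self.get(dep) for name, dep in hints.items()}
        return impl(**kwargs)


class CampaignStore:
    """Stand-in for a service interface."""

    def names(self):
        raise NotImplementedError


class InMemoryCampaignStore(CampaignStore):
    def names(self):
        return ["example"]


class CampaignLister:
    def __init__(self, store: CampaignStore) -> None:
        self.store = store


container = AutowiringContainer()
container.register(CampaignStore, InMemoryCampaignStore)
lister = container.get(CampaignLister)
```

This sketch deliberately leaves out the cases raised in the discussion: two implementations of the same interface, and primitive settings passed to a constructor. Those are the situations where this kind of introspection needs extra features.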
[21:42:03] the scholarships thing looks slow to me [21:42:08] bd808: yep [21:42:10] to be honest, i dislike this much magic [21:42:13] and maintenance scripts.... [21:42:13] DanielK_WMDE: yes [21:42:15] Good you guys talked about Doctrine during the data mapper one. :) Now how can I convince you guys to use the Symfony DI container? [21:42:29] i like things like this to be quite explicit (and thus more flexible). [21:42:31] parent5446: you missed the worries about code generation ;) [21:42:51] Yeah I know. That's always my biggest concern about Doctrine [21:43:07] AndyRussG: what if i have different implementations of the same interface (e.g. binding to different databases)? What if I have settings (primitive type parameters)? [21:43:13] TimStarling: It probably is. Slim is not a framework designed for a site like enwiki [21:43:24] parent5446: i meant the symfony DI container, it autogens a giant class to get around the performance limitations of defining hundreds or thousands of closures at runtime [21:43:26] AndyRussG: i'd at least like to have the option to directly supply a callback that will generate the instance, instead of relying on introspection magic [21:43:34] DanielK_WMDE: that's the sort of additional feature that would have to be created [21:43:53] AndyRussG: in my opinion, that's the first thing that should work, because it's simpler, and might just be sufficient [21:44:00] ah yes [21:44:02] lemme reopen the log [21:44:04] the callback helps with lazy initialization [21:44:16] yea, good point aude [21:44:22] re: callbacks, any concrete code examples? 
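One possible shape for the callback approach asked about above, as a minimal PHP sketch rather than anything taken from MediaWiki, Pimple, or the code linked in this log: each service is registered as a closure, and the closure only runs the first time the service is requested, which is the lazy initialization point raised in the discussion. All names here (ServiceRegistry, ExampleConfig, ExampleApiModule, the 'limit' setting) are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical sketch: a tiny callback-based registry. Not MediaWiki or Pimple API.
class ServiceRegistry {
    /** @var array factory callbacks keyed by service name */
    private $factories = array();
    /** @var array instances that have already been built */
    private $instances = array();

    // Registering only stores the callback; nothing is instantiated here.
    public function define( $name, $factory ) {
        $this->factories[$name] = $factory;
    }

    // The callback runs on first request; later requests reuse the instance.
    public function get( $name ) {
        if ( !array_key_exists( $name, $this->instances ) ) {
            if ( !isset( $this->factories[$name] ) ) {
                throw new InvalidArgumentException( 'No such service: ' . $name );
            }
            $this->instances[$name] = call_user_func( $this->factories[$name], $this );
        }
        return $this->instances[$name];
    }
}

// Hypothetical extension-specific config object.
class ExampleConfig {
    private $settings;
    public function __construct( array $settings ) {
        $this->settings = $settings;
    }
    public function get( $key ) {
        return $this->settings[$key];
    }
}

// Hypothetical API module that needs the extension's config injected.
class ExampleApiModule {
    public $config;
    public function __construct( ExampleConfig $config ) {
        $this->config = $config;
    }
}

$registry = new ServiceRegistry();
$registry->define( 'ExtensionConfig', function () {
    return new ExampleConfig( array( 'limit' => 50 ) );
} );
$registry->define( 'ExampleApiModule', function ( ServiceRegistry $r ) {
    // The wiring is explicit: this module gets this particular config.
    return new ExampleApiModule( $r->get( 'ExtensionConfig' ) );
} );

$module = $registry->get( 'ExampleApiModule' );
echo $module->config->get( 'limit' ), "\n"; // prints 50
```

In this sketch the only per-request cost at registration time is creating the closures themselves; how that cost scales with the number of services is exactly the startup-time question raised in the discussion, and the sketch does not settle it.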
[21:44:36] aude: yeah, that's the difference between slow and disastrously slow [21:45:05] AndyRussG: some api modules have recently been converted to use that approach to get the right Config instance [21:45:06] I'd like MW to be fast enough that we can promote the use of api.php [21:45:31] people say we should just ditch the whole of MW because api.php is not fast enough [21:45:35] AndyRussG: can try to dig up links, but i think it was in some extension [21:45:51] I say we can save all that work if we can make startup fast [21:45:56] DanielK_WMDE: OK, thanks much! Or just some other code that demonstrates the same idea [21:46:18] there's not really any point in introducing DI which makes startup slow if it is then too slow for anyone to use MW [21:46:30] AndyRussG: an api module defined by an extension might use a Config object specific to this extension. This could be injected via a callback easily. With registration and introspection magic, this would be complicated and obscure. [21:46:40] Can we maybe do some performance testing on a bunch of the DI solutions? [21:46:42] we are talking single digit milliseconds here, that's the difference between feasible and stupid [21:47:24] TimStarling: Good points. Many DI containers have roots in platforms with long running processes. PHP's shared nothing system makes these tricky to adapt wholesale. [21:47:31] parent5446: that may be useful [21:47:45] startup time is proportional to class count etc. [21:47:46] TimStarling: the most important bit, i think, is to load and create services lazily, when needed. the second important point is to try and avoid the creation of objects in bootstrap code. 
[21:48:04] I'd like to know how it will work with 1k classes, 10k classes, 100k classes [21:48:30] Makes sense [21:49:05] DanielK_WMDE: I think the most important bit is to basically not run any code at all on startup [21:49:11] In the implementation I did instantiation doesn't happen until the object is requested, but setup does [21:49:14] AndyRussG: https://gerrit.wikimedia.org/r/#/c/149183/ is my patch that allows api modules to be registered via callbacks, allowing for simple DI [21:49:18] Also, one interesting thing to note is that many containers (e.g., Symfony, Pimple) have string-index based service loading. In other words, you do $container->get('service') or the like. This destroys IDE compatibility, so we should try and look at containers that allow us to declare aliases, so we can do $container->getDatabase() and have the IDE know the type. Not as big a deal as performance, but something to consider. [21:49:31] DanielK_WMDE: thanks! :) [21:49:47] HHVM and RepoAuthoritative may help with class count problems and request startup times [21:49:52] in HHVM, array literal setup time can be O(1) per request [21:49:57] AndyRussG: same for SpecialPages (not merged yet): https://gerrit.wikimedia.org/r/#/c/152755/ [21:50:03] *poke* poke* [21:50:06] in the number of array elements [21:50:21] TimStarling: Damn, that's pretty nice. [21:50:23] it allows you to reference shared arrays at runtime, they don't need to be cloned [21:50:31] it uses COW [21:51:22] we have to think about how to use that kind of feature if we want startup time to be small numbers of milliseconds [21:51:30] I think that's how the FB release engineers told us they handle their l10n strings [21:51:51] Compiled into the binary basically [21:51:54] if you call O(N) functions for N classes, then that is going to be slow [21:52:23] would it be acceptable for startup time to be slow with zend? 
[21:52:36] the number of classes isn't something that varies tremendously; non-wmf installations of mediawiki presumably have a comparable number [21:53:30] ori: for zend we may have some more flexibility [21:54:21] I remember seeing some previous work on making autoloading less verbose, and that there were performance barriers there [21:54:47] For autoloading we could probably just let composer handle that. [21:54:57] It just scans every file and makes a map for each class in it. [21:55:14] Yeah if composer handles that and DI, I think that'd be nice :) [21:55:28] Well composer won't handle DI [21:55:30] :P [21:55:37] <^d> TimStarling: Getting startup time down would be wonderful :) [21:55:52] Sorry not composer, I meant Symfony [21:56:17] Oh yeah, I'm a big supporter of Symfony DI. Yes the code generation is iffy, but it's a very powerful and complete DI framework. [21:56:27] The only thing I'd want to check is the performance [21:56:41] Because I'm not entirely sure how performant it is compared to something like Pimple [21:56:45] Ok so it could potentially be composer handling autoloading and symfony handling DI? Does anyone else want to explore that route? [21:56:55] SymfonyDI has a steep learning curve and adds two more levels of caching [21:57:12] AndyRussG: composer can dump a classmap [21:57:13] there are some DI performance comparisons here: http://www.sitepoint.com/php-dependency-injection-container-performance-benchmarks/ [21:57:16] Steep learning curve? Service container definitions in YAML is not that difficult [21:57:24] tgr: does it necessarily add that if you're just using the DI part? [21:57:24] so like AutoLoader.php but automatically maintained [21:57:24] they didn't test Pimple, though [21:57:28] if you do the psr stuff [21:57:36] aude: thanks :) [21:57:45] or even if not psr, it can work around it and still do it [21:57:47] parent5446: that's the easy half. 
build a big symfony application and you'll start building compiler passes that dig through annotations and whatnot [21:57:49] aude: yes, but you don't have to do PSR [21:57:55] yep [21:58:11] TimStarling: do you know how performance of creating a closure compares to creating a normal object from a class in hhvm? [21:58:14] * AndyRussG should be more knowledgeable about composer [21:58:19] ebernhardson: Indeed, but considering MediaWiki's complete lack of DI at the moment, I doubt we'd be using advanced features in the near future [21:58:22] AndyRussG: IIRC, one for compiling the config files into arrays and one for compiling the classes into a single php file [21:58:34] which is a pain in the ass when debugging [21:58:43] DanielK_WMDE: probably similar [21:59:10] i was hoping for closures to be specially optimized. but there's not much room for that i guess. [21:59:50] hmmm [22:00:19] I think a closure actually is an object [22:00:29] can we switch to Hack soon? [22:00:35] lol [22:01:21] Well the gist I got from the performance test is that Orno\Di, Dice, and Auri.Di are all leaps and bounds faster than Symfony, which comes in on the midground (with Zend and PHP-DI trailing behind). [22:01:22] maybe [22:01:38] TimStarling: yes it is. but one e.g. with no "virtual" members, no subclassing at work, no public fields, etc. Not sure if you can glue random fields to closures, either. that kind of thing might be used for optimization [22:02:00] #info DanielK_WMDE does not favour Reflection magic [22:02:08] DanielK_WMDE: We could for writing services if we have good interface abstractions. [22:02:21] K, on that specific point DanielK_WMDE I guess we disagree, I do like reflection magic [22:02:45] DanielK_WMDE: did you have an opinion on big lists of concrete classes as opposed to instantiation at the entry point? 
[22:02:55] If it can be made clear enough to a reader of code what's going on, I think it's fine [22:03:02] parent5446: unfortunately if we look at the test code (https://github.com/TomBZombie/php-dependency-injection-benchmarks/blob/master/symfonydi/test1.php) they aren't testing compiled symfony instances [22:03:04] Reflection in php especially of doc strings is sloooow [22:03:20] parent5446: or i should say, compiled symfonyDI, so it has to do everything at runtime instead of taking advantage of the caching [22:03:21] ebernhardson: Yep was just browsing the code. [22:03:23] bd808: ah OK I didn't know that [22:03:25] <^d> +1 to not liking reflection. [22:03:50] DanielK_WMDE: I mean there's no closure type, so I assume it is a KindOfObject [22:03:54] ^d: boo ;p [22:03:57] AndyRussG: Unlike java there is no true internal runtime support for annotations in php [22:03:59] TimStarling: the static entry point would trigger instantiation, but keeping it free of the knowledge of the concrete implementation is nice. so having a global registry / top level factory that contains that mapping is a good thing. [22:04:08] in wikibase, it's simply hard coded into global factory objects [22:04:15] bd808: but there is in hhvm ;) [22:04:19] or hacklang at least :P [22:04:34] #info DanielK_WMDE does like central registry idea [22:04:40] ebernhardson: And in 2047 we will be able to use that? [22:04:59] * bd808 notes we are writing php 5.3 code today [22:04:59] #info DanielK_WMDE warns against accessing global registry anywhere but in static entry points [22:05:02] ;) [22:05:13] although the central registry .... 
good to break it down into subparts [22:05:14] DanielK_WMDE: totally agreed on that point [22:05:21] Yeah I think we need to get to PHP 5.4 before we start thinking about hack XD [22:05:25] DanielK_WMDE: +1 never pass the container into something [22:05:35] AndyRussG: do you want another action item for third party library investigation on this point or would that be too much on your plate? [22:05:50] delegate [22:05:53] Can we break down the objections to reflection more, beyond the possible performance issues? [22:06:31] What type of reflection are we talking about here? [22:06:33] bd808: indeed. [22:06:35] Constructor-type hintings? [22:06:52] TimStarling: I guess it depends, I may have to talk to some of the owners of other things on my plate [22:07:02] I'd be thrilled to do so, though [22:07:03] parent5446: https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FCampaigns.git/e99531e0e4f2a30c9b673240752db73b09784f81/includes%2Fsetup%2FSetup.php#L185 [22:07:10] parent5446: looking at the type hints of the constructor arguments [22:07:15] Ah thanks [22:07:27] I'm objecting primarily to /** @needs Foo **/ inspection [22:07:37] <^d> That ^ [22:08:07] Yeah so the reason I don't like constructor-type-hint injection is because it implies that there is only one service for each class, which may not necessarily be true. I prefer having an explicit specification of what service needs to be injected. [22:08:08] AndyRussG: one problem with reflection is that it tends to create unnecessary coupling [22:08:17] ^that as well [22:08:19] it's a bit like static calls [22:08:21] AndyRussG: in general, i like to keep things simple, and avoid mapping between symbolic names and objects. if possible, use the object, not some symbolic name. if possible, use a function that does something, not an obscure declaration that triggers some hidden behavior. 
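To make the objection to constructor-type-hint injection concrete, here is a hedged PHP sketch contrasting the two styles. It is not the Campaigns Setup.php code linked above; the autowire() helper and every class name are hypothetical. It also assumes a modern PHP (7.1 or later, for ReflectionNamedType), not the PHP 5.3 mentioned in this discussion. The reflection helper can hold only one binding per type name and cannot supply primitive settings, while the explicit callbacks can bind the same interface to two different databases.

```php
<?php
// Hypothetical sketch; none of these names come from MediaWiki or the Campaigns extension.
interface ExampleStore {
    public function describe();
}

class ExampleDbStore implements ExampleStore {
    private $dbName;
    // A primitive setting: type-hint autowiring has no way to guess this value.
    public function __construct( $dbName ) {
        $this->dbName = $dbName;
    }
    public function describe() {
        return 'db:' . $this->dbName;
    }
}

class ExampleReport {
    public $store;
    public function __construct( ExampleStore $store ) {
        $this->store = $store;
    }
}

// Reflection-style wiring: look at each constructor parameter's type hint
// and supply whatever single instance is bound to that type name.
function autowire( $class, array $bindings ) {
    $ref = new ReflectionClass( $class );
    $ctor = $ref->getConstructor();
    $args = array();
    if ( $ctor !== null ) {
        foreach ( $ctor->getParameters() as $param ) {
            $type = $param->getType();
            if ( !( $type instanceof ReflectionNamedType ) || $type->isBuiltin() ) {
                throw new RuntimeException( 'Cannot autowire parameter ' . $param->getName() );
            }
            $name = $type->getName();
            if ( !isset( $bindings[$name] ) ) {
                throw new RuntimeException( 'No binding for ' . $name );
            }
            $args[] = $bindings[$name];
        }
    }
    return $ref->newInstanceArgs( $args );
}

// One binding per interface: every ExampleStore consumer gets the same store.
$bindings = array( 'ExampleStore' => new ExampleDbStore( 'wiki' ) );
$report = autowire( 'ExampleReport', $bindings );
echo $report->store->describe(), "\n"; // prints db:wiki

// Explicit callbacks: the same interface can be bound to different databases.
$factories = array(
    'WikiReport' => function () {
        return new ExampleReport( new ExampleDbStore( 'wiki' ) );
    },
    'CentralReport' => function () {
        return new ExampleReport( new ExampleDbStore( 'central' ) );
    },
);
$central = call_user_func( $factories['CentralReport'] );
echo $central->store->describe(), "\n"; // prints db:central
```

Real autowiring containers typically add named bindings or qualifiers to get around the one-binding-per-type limit; the sketch leaves those out to keep the contrast visible.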
[22:08:24] In some Java frameworks I think you have the option of annotations or declarativeish setup code [22:08:43] Hmmm [22:08:47] The purpose of DI is to keep the service injection external to the actual classes being injected. Having the type hints or doc comments in the service anyway defeats the purpose [22:08:50] It might be because i've always used name based injection rather than auto-wiring, but how do you handle things like having an interface with multiple implementations going to different objects, all of which are just annotated FooInterface? [22:08:59] ok, we need to wind up, anyone want to suggest next actions for this? [22:09:16] do some profiling and publish findings, as parent5446 suggested [22:09:22] ^beat me to it [22:09:22] who? [22:09:39] parent5446: is it something you'd be interested in taking on? [22:09:42] that will likely favor the simplest solution [22:09:44] Yeah I'll do it. [22:09:51] AndyRussG: registering a callback is explicit, registering a class plus an array of parameters is a bit annoying, registering a class name and having constructor arguments magically pop up is obscure. not saying it can't be very handy. but it's not *obvious* [22:09:52] I can at least write up a summary of the notes from this and figure out when I might be able to help with that too, if you like [22:09:53] \o/ thanks. let me know if i can support that in any way. 
[22:09:55] * DanielK_WMDE likes obvious code [22:10:08] #action parent5446 to do some profiling of alternative solutions and publish findings [22:10:09] * gwicke is firmly in the simple data camp too [22:10:34] * DanielK_WMDE seems to always be behind the conversation by half a screen [22:12:11] ok, I'll call a soft end of meeting now, I'll do #endmeeting in a few minutes, since that terminates log collection [22:12:29] TimStarling: sounds great :) [22:12:38] thanks for staying up late DanielK_WMDE [22:12:51] thanks for the RFCs AndyRussG [22:13:07] thanks everyone else for your comments [22:13:08] :) totally my pleasure [22:13:14] <^d> TimStarling: Can we do hitcounters next time? [22:13:24] AndyRussG: Yeah thanks this is good stuff to be debating. [22:13:26] thanks for working on this parent5446 :) [22:13:47] well, we already discussed the agenda for the next meeting in the committee meeting last hour [22:13:48] No problem. I'm always in favor of anything involving not rolling our own DI/data mapper/etc. library [22:14:10] * DanielK_WMDE waves his beer in TimStarling's general direction [22:14:15] bd808: great to hear, yeah that was a big motivation for me [22:14:27] (that would be "south") [22:14:37] :) [22:14:43] oh, right: can we talk about Assertions again some time? [22:14:43] * AndyRussG notes DanielK_WMDE appears to not have minded staying up late [22:15:06] i implemented my proposal :) https://github.com/wmde/Assert [22:15:08] <^d> TimStarling: Aw ok :( [22:15:25] proposed agenda was: "Page protection as a component" and "Linker refactor" [22:15:50] TimStarling: "page protection based on the permission system"? [22:16:07] TimStarling: ok [22:16:17] basically page protection refactored [22:17:00] Symfony may also be able to help with that :D [22:17:08] AndyRussG: yea, thanks from me, too. 
i always have opinions and nitpicks, but it's great that you are pushing for more, well, sanity :) [22:17:31] let's add assertions and hitcounters as possible things to talk about in 2 weeks [22:17:38] \o/ [22:17:45] <^d> Sounds good! [22:17:46] I don't know how much has to be said about hitcounters, maybe we can resolve it before then [22:17:52] * DanielK_WMDE will try not to miss the session about assertions this time around [22:17:52] DanielK_WMDE: opinions and nitpicks are good :) [22:19:08] parent5446: DanielK_WMDE: everyone: please ping me anytime you feel like chatting informally about either of these topics or if you think of something I can do [22:19:20] * robla reminds Tim he has another meeting :-) [22:19:24] OK will do [22:19:50] hmm, 20 minutes ago [22:22:45] #endmeeting [22:22:46] Meeting ended Wed Sep 10 22:22:45 2014 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [22:22:46] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-21.03.html [22:22:46] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-21.03.txt [22:22:46] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-21.03.wiki [22:22:47] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2014/wikimedia-office.2014-09-10-21.03.log.html [22:23:25] Thanks TimStarling and everyone else, tt everyone later [22:31:14] * DanielK_WMDE waves [23:03:23] AndyRussG, TimStarling, all: i just sent a quick benchmark for closure vs. object creation to wikitech-l [23:04:08] closures are about 40% faster on my box, without hhvm, but both are just around one microsecond per instance. [23:06:08] DanielK_WMDE: nice! [23:17:40] DanielK_WMDE, hey. [23:18:08] DanielK_WMDE, I think you forgot to attach the file. :) [23:21:10] Krenair: nope... but maybe it got stripped? 
[23:22:02] DanielK_WMDE, hm, I don't see any attachment in gmail :/ [23:23:31] X-Content-Filtered-By: Mailman/MimeDel 2.1.13 [23:23:41] which doesn't appear in the headers of other wikitech-l stuff. [23:24:26] Sigh. Okay, it appears in other wikitech-l stuff, just not the first one I looked at. [23:24:26] Krenair: yea, mailman probably stripped it. uploaded it and posted a link. thanks for poking :) [23:25:10] Krenair: i guess it appears for messages authored as html, because mailman strips anything but plain text... [23:25:22] maybe [23:25:27] yea, who knows :P
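The benchmark file mentioned above is not reproduced in this log. As a rough idea of what a closure vs. object creation comparison could look like, here is a minimal PHP sketch; it is not the script that was sent to wikitech-l, and the class name, helper function, and iteration count are all assumptions. Results depend heavily on the PHP version and machine, so no expected numbers are given.

```php
<?php
// Hypothetical microbenchmark sketch; not the script posted to wikitech-l.
class ExampleService {
}

// Average wall-clock seconds per call of $fn over $iterations calls.
function timePerCall( $fn, $iterations ) {
    $start = microtime( true );
    for ( $i = 0; $i < $iterations; $i++ ) {
        $fn();
    }
    return ( microtime( true ) - $start ) / $iterations;
}

$iterations = 100000;

// Cost of creating a closure (it is created, never invoked).
$closureCost = timePerCall( function () {
    $c = function () {
        return 1;
    };
}, $iterations );

// Cost of instantiating a trivial object.
$objectCost = timePerCall( function () {
    $o = new ExampleService();
}, $iterations );

// Both figures include the overhead of the measuring wrapper call itself.
printf( "closure: %.3f us, object: %.3f us\n", $closureCost * 1e6, $objectCost * 1e6 );
```

Because each measurement includes the overhead of calling the wrapper, the difference between the two figures is more meaningful than either absolute value.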