[07:47:18] Hello, where can I ask about stats? Is this the right channel?
[07:50:48] Hi tuxnani!
[07:51:01] Depends on the exact kind of stats, but you might try #wikimedia-research
[07:51:05] Emufarmers: Hello
[07:51:07] ok
[07:51:09] Thank you
[07:51:41] Oh, there's also #wikimedia-analytics
[13:13:45] anyone have a clue as to why i wouldn't be able to create pages on the office wiki?
[13:13:58] i just get a never-ending, looping progress bar
[13:18:34] hi bgerstle, this is the public office hours channel. you are looking for -staff
[13:21:51] ah ok
[13:21:52] thanks
[13:22:13] ah, that's invite only :-/
[13:24:35] -> pm
[15:56:34] yo arrbee
[15:56:43] hullo aharoni
[15:56:53] Language Engineering office hour starts here soon
[15:56:56] hello world
[15:57:07] Hello!
[15:57:09] Did I ever tell you why I usually write "Hallo" and not "Hello"?
[15:57:19] Nope.
[16:00:15] and here we go
[16:00:24] #startmeeting Language Engineering monthly office hour - January 2015
[16:00:24] Meeting started Wed Jan 14 16:00:24 2015 UTC and is due to finish in 60 minutes. The chair is arrbee. Information about MeetBot at http://wiki.debian.org/MeetBot.
[16:00:24] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[16:00:24] The meeting name has been set to 'language_engineering_monthly_office_hour___january_2015'
[16:00:36] Hello everyone
[16:00:48] Welcome to the monthly office hour of the Wikimedia Language Engineering team
[16:01:00] The first one for this year
[16:01:08] oh hi Niharika
[16:01:16] Hi arrbee. :)
[16:01:42] * arrbee = Runa, the Outreach coordinator for our team
[16:02:04] First, the message we can't ignore:
[16:02:17] IMPORTANT: The chat today will be logged and publicly posted
[16:03:18] Before we begin, let me introduce my teammates who are also here - aharoni kart_ pginer
[16:03:48] Hallo.
[16:03:54] Our last office hour was on December 10, 2014. Logs at:
[16:04:01] #link https://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2014-12-10
[16:04:20] aharoni: is that the explanation for the 'Hallo'? en-gb?
[16:04:53] [Yes - but I'll give more details later ;) ]
[16:04:58] okay :)
[16:05:34] We should include one language fun fact from aharoni in every meeting. :P
[16:05:45] So it may have been rather noticeable that we are super excited these days about the progress with Content Translation
[16:06:24] We made an announcement earlier this week about preparing Content Translation as a beta feature:
[16:06:35] #link http://blog.wikimedia.org/2015/01/10/content-translation-beta-coming-soon/
[16:07:23] TL;DR version: Very soon we are deploying Content Translation as a beta feature on the Catalan, Spanish, Portuguese, Danish, Esperanto, Indonesian, Malay, and Norwegian (Bokmål) Wikipedias
[16:08:13] This will enable editors to translate articles directly on the Wikipedias and not on the beta servers any more
[16:09:00] translate from any of these Wikipedias to others on that list?
[16:09:05] We will also be deploying Content Translation on the Swedish and Norwegian (Nynorsk) Wikipedias, but these can only be used as a source
[16:09:22] YairRand: No, there are some limitations. Let me show you the list.
[16:09:56] Check the table here: https://www.mediawiki.org/wiki/Content_translation/Deployment_Plan#CX_Deployment_Plan_for_January_2015
[16:10:30] You will see the source languages for each of them, and also whether MT support is available for each source-target combination
[16:10:46] Do you support yet all the language-to-language options offered by Apertium?
[16:11:16] Ah, clearly not.
[16:11:22] Rastus_Vernon: Not yet. We are doing that slowly, based on the language quality survey that is going on.
[16:11:40] Rastus_Vernon: some language pairs on Apertium did not give very helpful results
[16:13:04] Rastus_Vernon: For instance, you may notice that we are not enabling Swedish to Danish MT right now. The initial user tests suggested that the Apertium-generated content was not very helpful
[16:13:19] But the users still wanted to use Swedish as a source to translate from
[16:14:03] For all the languages, English will be available as a source
[16:14:29] But only English to Esperanto will have MT support
[16:15:04] is the plan to eventually support any language to any language?
[16:15:35] YairRand: Well, the real limitation is the MT backends that can support language pairs
[16:16:41] But since we are introducing non-MT pairs too, we would use that as a study to understand what features would be helpful for editors who are using Content Translation without MT
[16:17:02] And yes, eventually it will be enabled for all languages, with or without MT.
[16:18:08] does that include incubator languages?
[16:18:31] YairRand: it's probably too early to say that right now
[16:19:18] YairRand: as arrbee says, it's too early, but I hope that some day it will be possible. It certainly makes a lot of sense.
[16:19:46] YairRand: For instance, we are yet to go deep into languages with complex writing scripts
[16:22:15] One correction. We are also supporting English to Catalan and English to Spanish machine translation
[16:23:33] Over the last 2 months, we tested heavily on the beta servers and fixed numerous bugs to make the tool more efficient and stable
[16:24:23] Currently, we are waiting for the deployment to be completed; we will then run a few more tests to make sure nothing is broken
[16:24:44] After that we will follow up with a bigger announcement
[16:25:22] Right now, the beta server is still open (and will continue to be) for everyone to come and test
[16:25:37] #link http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation
[16:25:46] You will have to enable the beta feature here
[16:26:49] We made several bug fixes for the translation dashboard that we introduced in our last version
[16:27:06] The dashboard looks like this:
[16:27:12] #link https://upload.wikimedia.org/wikipedia/commons/b/b8/Content-Translation-User-Dashboard.png
[16:28:42] aharoni: pginer: Would you like to add something here about the entry points that editors can use once the beta feature is deployed?
[16:28:58] The translation dashboard is global? As in, it is not specific to each wiki?
[16:29:19] The entry points to ContentTranslation are:
[16:29:36] 1. A red link in the interlanguage links list.
[16:29:42] It appears if you have enabled the Content Translation beta feature, your user interface language is one of the supported languages, and there is no article about the given topic in your language.
[16:30:03] 2. A button on the Contributions page. (The logic behind it is that a translation is a kind of contribution.)
[16:30:38] Rastus_Vernon: It opens on each wiki separately, but the information about translation drafts is pulled from one common database.
[16:31:27] Rastus_Vernon: That's a good question - ContentTranslation is a multilingual project by its very nature, so many things there must be cross-wiki.
[16:31:41] That makes sense.
[16:32:03] So, for example, we have to do some trickery about enabling the beta feature:
[16:32:42] the preferences are separate in each wiki, but if you enable the beta feature in one wiki, it will also appear to work in the wiki of the language into which you are translating.
[16:33:03] (It's a bit of a hack, and we hope that MediaWiki will have global preferences as soon as possible.)
[16:33:13] Rastus_Vernon: And it's restricted to displaying only data for the specific user
[16:35:20] So right now, besides all the work related to the actual deployment, we are still looking for more suggestions about other languages we can prioritize for evaluation
[16:36:32] You can help us by filling in this survey:
[16:36:35] #link https://docs.google.com/a/wikimedia.org/forms/d/1JzM2VAbd14bA5NpsMoxzbVO5Njw_bic4V5qtiuScX70/viewform
[16:37:28] Rastus_Vernon: YairRand: Have you tried the tool on the beta servers?
[16:38:30] Yes, though not recently.
[16:38:44] I have not.
[16:38:53] Instructions are here: https://www.mediawiki.org/wiki/Content_translation#Try_the_tool
[16:38:56] Rastus_Vernon: In which languages can you read and write?
[16:40:11] aharoni: None that are supported by Apertium, unfortunately: English and French.
[16:41:04] Rastus_Vernon: Have you tried using it without MT for en-fr?
[16:41:49] I have not, but I shall try.
[16:42:06] Rastus_Vernon: Alright. :) Do let us know how you find it.
[16:42:18] Rastus_Vernon: At the moment, there's a trick that you need to do to translate to a language that doesn't appear in the selector.
[16:42:24] You need to use a special URL: http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?from=en&to=fr
[16:42:35] * arrbee wonders if Niharika has ever tried the tool ;)
[16:42:39] note the values of the "from" and "to" parameters at the end.
[16:42:49] Rastus_Vernon: That would be really useful. If you do, feel free to report any issue, or suggest potential steps that could be improved
[16:42:53] Rastus_Vernon: you can find the instructions in the link that pginer passed a few minutes ago :)
[16:42:54] arrbee: Yes I have! :)
[16:43:13] Rastus_Vernon: If you use that URL, then you'll see English and French when you click "Create new translation".
[16:43:24] Niharika: Nice. I am guessing without MT?
[16:43:32] I actually tried installing it and getting it up and running to make a few contributions, but I couldn't get it running.
[16:43:49] Niharika: ahh ok
[16:44:18] Niharika: you might want to catch hold of kart_ to get that sorted
[16:44:23] :)
[16:44:38] arrbee: Will do. Is it possible to get a vagrant role for this?
[16:44:47] * arrbee pokes jsahleen
[16:44:51] Niharika: We are working on a mediawiki-vagrant role for ContentTranslation. That should make it easier.
[16:45:10] jsahleen: Great! That would be perfect.
[16:45:11] It is under review at the moment. I hope to finish it after the deployment.
[16:45:32] Another option, depending on what you are trying to do, is to install only the CX extension locally but use a remote server
[16:46:15] We have that with the role now, but we would like to have cxserver deployed locally as well. That takes a little more work.
[16:46:26] pginer: Any documentation on how I could do that?
[16:46:53] * arrbee mumbles something about getting that documentation sorted
[16:47:02] :D :P
[16:47:14] No worries. I'll wait for the role to be activated.
[16:47:28] Niharika: Thanks. I'll make it a priority.
[16:47:33] Niharika: I am not sure if you have already seen this: https://www.mediawiki.org/wiki/Content_translation/Setup
[16:47:49] * aharoni mumbles something about being old-school and just cloning stuff from Gerrit ;)
[16:48:08] * jsahleen mumbles something about how vagrant is soooo much easier.
[16:48:13] :)
[16:48:47] arrbee: Did see it. I don't use Ubuntu, hence everything is much easier with vagrant. :)
[16:48:59] :)
[16:49:15] We have just under 10 mins left
[16:50:58] YairRand, Rastus_Vernon, Niharika - do you translate Wikipedia articles into your languages in general?
[16:51:08] I mean manually, without ContentTranslation.
[16:52:22] The most difficult part of translation is clearly the templates, since they are different from one wiki to another.
[16:52:45] arrbee: No. :( I've tried doing that, but I've found it really hard to type in Hindi using on-screen keyboards and the Google Translate tool. I have translated stuff on translatewiki though. That was easier.
[16:53:30] I do not translate Wikipedia articles, but I have contributed to the interface translations on translatewiki.net and also translated some pages on meta.wikimedia.org.
[16:53:45] Meanwhile, pginer has created this very nice screencast of the whole process. If you haven't yet used Content Translation, you can get a preview through this:
[16:53:49] Niharika: it's really great to hear that translatewiki is easier than Google \o/
[16:53:52] #link https://upload.wikimedia.org/wikipedia/commons/e/ee/Content_Translation_Screencast_%28English%29.webm
[16:54:13] or if you prefer YouTube:
[16:54:17] #link https://www.youtube.com/watch?v=nHTDeKW3hV0
[16:55:23] We have just under 5 mins more on the channel
[16:55:38] Rastus_Vernon: Templates are on our radar. We are going to try to translate them automatically in the foreseeable future, but it will require some clever development and help from the editor community.
[16:55:41] Content Translation seems even easier to use than the Translate extension.
[16:55:58] :)
[16:56:16] Woo-hoo! \o/
[16:56:57] I will start wrapping up now, just in case there is another team waiting to use this channel
[16:57:02] Rastus_Vernon: Great to hear!
[16:57:07] aharoni: Hallo fact?
[16:57:10] We will be around on #mediawiki-i18n
[16:57:38] Our office hours are generally held every 2nd Wednesday of the month
[16:57:44] Oh!
[16:58:07] But we are looking to change it to an earlier hour, i.e. not the earlier 1700 UTC slot
[16:58:15] Please let us know if you have any suggestions
[16:58:35] So the story is that I started learning English in the Soviet Union in the 1980s, and Soviet English textbooks used "Hallo". When I moved to Israel, I found out that "Hello" is much more common, but for the fun of it, I still frequently use "Hallo".
[16:59:01] :D
[16:59:10] Our next office hour is scheduled to be on February 11th. But do please check our announcements
[16:59:12] You can't beat a Soviet education.
[16:59:46] :)
[16:59:49] Thanks a lot everyone, and do look out for our big announcement in a few days!
[16:59:52] #endmeeting
[16:59:52] Meeting ended Wed Jan 14 16:59:52 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[16:59:52] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-16.00.html
[16:59:52] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-16.00.txt
[16:59:52] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-16.00.wiki
[16:59:52] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-16.00.log.html
[17:08:34] Hi J-Mo!
[17:09:15] hi harej
[17:09:22] I liked your dissertation :3
[17:09:39] please tell me you didn't read the whole thing.
[17:09:54] I read it from beginning to end. 140 pages or something like that.
[17:09:55] (also, thanks!)
[17:10:02] * J-Mo bows
[17:10:15] It was eminently readable. The same can't be said for most publications of that type.
[17:10:28] you may be the only person who's ever done that. not sure my advisor even did ;)
[17:10:50] You had several typos, and there were some paragraphs that were repeated verbatim throughout. I didn't mind though.
[17:11:43] hehe
[17:11:45] oops
[20:35:59] Real nice work Research Team :-P http://dispenser.homenet.org/~dispenser/temp/qualtrics.png
[20:40:08] The more I read about "services," the more annoyed and concerned I am.
[20:40:18] Seems to be lacking appropriate supervision.
[21:00:05] #startmeeting RFC meeting
[21:00:05] Meeting started Wed Jan 14 21:00:05 2015 UTC and is due to finish in 60 minutes. The chair is TimStarling. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:00:05] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:00:05] The meeting name has been set to 'rfc_meeting'
[21:00:18] o/
[21:00:32] #topic Support for user-specific page lists in core | RFC meeting | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
[21:00:42] #link https://www.mediawiki.org/wiki/Requests_for_comment/Support_for_user-specific_page_lists_in_core
[21:01:07] \o
[21:01:56] So, i've been hoping to push along the user lists RFC (and assume ownership for that), as some work the mobile web team are doing would be made a lot easier by such a move.
[21:02:30] TimStarling, quick possibly OT question: auth service is not on today?
[21:02:53] no, it was bumped because jdlrobson wanted to discuss this one
[21:03:00] kk, thx!
[21:03:05] more background: I want to use the RFC as an opportunity to clean up a lot of our legacy watchlist code and move away from SQL queries in special pages, towards a situation where SpecialWatchlist is a controller and builds the watchlist using essentially a model of the Watchlist class (I suspect using OOUI for the view).
[21:03:10] gwicke: and Chad thought it would be best to leave it until after the kickoff anyway
[21:03:37] Moving the RFC along today: I was hoping we could talk about the proposed DB changes and get some agreement on how that could be done. Does that sound good?
[21:03:37] makes sense; might have more info on the Wikia activity in that area by then as well
[21:04:06] My understanding is that Wikia contributes almost nothing to Wikimedia development.
[21:04:14] jdlrobson: sure
[21:04:24] I've just been reading the POC patch
[21:04:45] TimStarling: The one on the mediawiki page, or the very similar one I posted this week?
[21:04:58] https://gerrit.wikimedia.org/r/#/c/183370/5
[21:05:28] which was linked from https://phabricator.wikimedia.org/T1352
[21:06:15] I also read the IRC log from last time, in which almost my sole contribution was to say "massive intersection" about 6 times
[21:06:22] TimStarling: cool. Yup. There are various other patches that follow, but essentially https://gerrit.wikimedia.org/r/#/c/183757/ is the one I think would be good to talk about today
[21:06:31] I think Pau raised some good issues about the collections/page list feature on the mailing list today.
[21:06:36] i've drawn a diagram too http://i.imgur.com/LtmxsxE.jpg
[21:06:52] in an attempt to articulate how i see the future of watchlists
[21:06:53] jdlrobson: UML? Eww.
[21:07:01] i have some concerns about whether the thing we are rushing into implementation is actually the right thing
[21:07:13] I like UML :P
[21:07:29] cscott: yup agreed, so this is what i'd like to talk about.
[21:07:46] has everybody read the thread on engineering@lists ?
[21:07:53] cscott: That's not a public list.
[21:08:05] Also, nobody is rushing anything.
[21:08:21] well, there's a patch itching for +2 on gerrit ;)
[21:08:40] i know it's not a public list, i guess i was asking: how many people here need that discussion recapped?
[21:08:58] Everyone? Why would you assume knowledge of a private, internal mailing list?
[21:09:01] in a nutshell - I envision the existing watchlist table becoming a generic collection_items table and making the changes proposed in the RFC by ori. My only slight deviation from it is that I see the wl_tag field as a reference to a lists database table, where metadata can be held on a list, such as title, owner(s), privacy status
[21:09:46] cscott: apologies if it seemed that way. My patches were primarily meant as discussion points
[21:09:57] i would like to see some movement though within the next month
[21:10:06] Basically we're moving away from a user having a watchlist to the generic idea of having lists of pages.
[21:10:11] Right?
[21:10:22] Maybe. I think any model should support lists of other things as well.
[21:10:31] and i don't mean to seem like i'm digging in my heels and yelling 'stop'. i'm just trying to facilitate communication.
[21:10:36] Like user contributions, pagelinks, templatelinks, etc.
[21:10:53] here's another suggestion: how is the watchlist edited? are we building special tools?
[21:10:57] parent5446: correct. So a user might have 5 watchlists, say: 1. articles that are likely to get vandalised, 2. pages i want to flesh out, 3. articles i created, etc.
[21:11:00] "lists of anything" seems like we're getting extremely abstract.
[21:11:12] could we instead use some of the more general "data structure editing" tools which wikidata is building?
[21:11:26] Agreed with anomie. I like the idea of generic page lists, but getting too generic is going to shoot us in the foot.
[21:11:40] Even this "pagecollection" stuff seems like we're introducing a lot of abstraction without a clear idea of what exactly we're going for with the abstraction.
[21:11:51] pau said, eg: "I was wondering if collections would work cross-language (that is, based on Wikidata IDs). That would allow people to consume lists in different languages regardless of the language of the creator of those lists and would simplify the life of users participating in multiple wikipedias. On the other hand it would require to deal with articles missing in the local language (e.g., fallback to another language? ask users to translate i
[21:12:07] You truncated at "ask users to translate i"
[21:12:08] that's an interesting point, I think.
[21:12:09] At the moment, the "pagecollection" seems to basically be an array with fancy accessors.
[21:12:14] i don't see how adding two fields makes us too generic. The wl_tag field is all that would be needed to support multiple lists, and a wl_modified field is all that would be needed to support ordering other than alphabetical
[21:12:18] ask users to translate it?)."
[21:12:30] ok anomie, as stated, forget the abstractions - i'd like to discuss the underlying database changes
[21:12:44] also, from pau: "To which extent would these collections integrate with existing collection-like elements? It would be great if article categories (or even lists, or lists of lists) are considered some kind of "official" collections that you can also share/consume/etc in the same way as user generated collections."
[21:12:51] jdlrobson: I think multiple watchlists quickly raise a lot of thorny user interface issues.
[21:13:02] jdlrobson: the DB changes on the RFC?
[21:13:04] "too generic" was in response to Marybelle wanting to talk about lists of users etc
[21:13:07] TimStarling: yes please
[21:13:13] i think that categories are another interesting point where perhaps we could use a more general mechanism of "lists of pages"
[21:13:37] if we were to do them as is, what would they enable; what wouldn't they enable; what issues might we run into?
[21:13:42] I don't think we should extend the watchlist table. It'd make more sense to just make a new table and eventually phase out the original watchlist table altogether.
[21:14:06] Because right now the primary key on the watchlist table is the user and the title, but for generic collections, it would have to be the collection ID and the title
[21:14:07] what are the indexes?
[21:14:30] oh, "likely require an index"
[21:14:40] lol
[21:14:40] seems a bit vague
[21:14:51] parent5446: i think i'm saying more or less the same thing: can we just make the Right Thing (without making it *too* general and hairy) and eventually make the existing watchlists an instance of that, instead of bolting some features onto the existing table.
[21:15:06] if I was going to approve a schema change I would want to see the actual schema change, I think
[21:15:08] my concern is that this project concentrates on lists controlled by a single user, with read sharing only
[21:15:46] Following on from OuKB's statement, are we planning on supporting a full permissions system?
[21:15:49] Is this collections thing the same as https://www.mediawiki.org/wiki/Mobile_web_projects/Collections_Backend (just came up during the research showcase)?
[21:15:50] I'm not even sure it supports read sharing, though that phrase confuses me a little.
[21:16:16] there's no collection ID on the RFC page
[21:16:19] Marybelle, read sharing is all that project is about
[21:16:27] Looking at the UML diagram linked, it has fields like "collaborators", so I'd assume we are supporting write access as well
[21:16:42] OuKB: Maybe I'm confused... watchlists have traditionally been considered private.
[21:16:51] OuKB's statement is another concern i have. The Collections extension allows multiple authors to collaborate on the list of articles included in the collection. Strictly per-user seems like a big step back.
[21:16:57] there seems to be a lot of confusion here.
[21:17:07] Maybe the question is: if we were doing the watchlist from scratch now, how would we do it? Yes, things like full permissions are hard, but how do we make sure that, if we ever wanted to do that, we set it up for success?
[21:17:20] There's a lot of confusion because no one is entirely clear on what we're supposed to be building, IMO
[21:17:45] "Better watchlist" vs "general lists of pages with multiple authors" and so on
[21:17:50] there's wl_tag in the whiteboard photo, is that the collection ID?
[21:18:26] the whiteboard photo has a "lists" table, the obvious way to link it with the list_items table would be with an integer foreign key
[21:18:32] not a string
[21:18:41] TimStarling: Essentially. wl_tag would be the thing linking the entry to some metadata. I spoke to Ori about this and felt an ID in a lists/collections table would make more sense, as it allows you to associate more things with a list, e.g. permissions
[21:18:52] <^d> Why are the only two options "start fresh" or "bolt onto existing watchlists" ... isn't there a third ...? "Clean up/refactor/improve watchlists to be what we want out of them"
[21:18:54] let me propose a starting point: we have a number of similar current features -- watchlists, the Collection extension, this new mobile collection thingy, article categories, ...maybe folks here can suggest some more. can we make this a feature which improves more than one of these?
[21:18:54] the problem then becomes what you would do with the existing entries
[21:18:58] since these lists wouldn't exist
[21:19:15] you could import them easily enough
[21:19:52] cscott makes a good point. We have a lot of things that do different parts of this one big thing we want to create.
[21:19:57] anomie: to be clear, all i want to build is an infrastructure to support multiple versions of the existing watchlist concept - a bunch of pages which can be used to generate feeds of edits/raw editable lists/HTML lists
[21:20:02] i'd love to hack the collection extension to use a better way to build books, for example.
[21:20:15] parent5446: cscott +++
[21:20:40] the collection extension uses page lists in a session, not in a DB
[21:20:46] cscott: I worry that trying to get the extra metadata that Collection needs into the watchlist concept is going to turn hairy.
[21:20:55] TimStarling: it serializes the page lists as wikitext.
[21:21:16] sure, but most of the code is session-based
[21:21:21] temporary page lists are stored in browser local data, but you can save them. it then serializes them as a * list of articles, with some special format.
[21:21:47] Let's not do that
[21:21:51] jdlrobson: But is this a watchlist as in something owned by one user, or are we having multiple-user ownership? If the latter, then wl_user+wl_tag doesn't work too well.
[21:22:01] TimStarling: that's because wp doesn't have any notion of 'draft' articles or edits. but you're actually making my point -- it's a big hack. i'd love to use a better page listy feature instead.
[21:22:12] anomie: well, that's what the metadata table would be for.
[21:22:53] jdlrobson: If your main list table is wl_user+wl_tag, then having only wl_tag pointing to some metadata table is going to bomb as soon as two users want different metadata for the same wl_tag.
[21:22:55] I think it's fair to say that the proposed DB schema is not reflective of what we actually want to do, and thus needs to be revised.
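The two-table design that the discussion converges on just below (an integer-keyed list table plus an item table) may be easier to follow as concrete DDL. The following is only a sketch: writing the real schema was left as an action item in this meeting, so every table and column name here (page_list, pl_id, pli_added, and so on) is an illustrative assumption, not an agreed design.

  -- Hypothetical sketch only; names and types are illustrative.
  -- One row per list: metadata lives here, keyed by an integer
  -- rather than by wl_user+wl_tag.
  CREATE TABLE /*_*/page_list (
    pl_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    pl_owner INT UNSIGNED NOT NULL,              -- owning user (single-owner for now)
    pl_title VARBINARY(255) NOT NULL,            -- e.g. "Jon's 2014 watchlist"
    pl_perm TINYINT UNSIGNED NOT NULL DEFAULT 0  -- privacy flag; leaves room for sharing later
  ) /*$wgDBTableOptions*/;

  -- One row per page on a list; mirrors today's wl_user+wl_namespace+wl_title
  -- primary key, with the user replaced by the list ID.
  CREATE TABLE /*_*/page_list_item (
    pli_list INT UNSIGNED NOT NULL,              -- references page_list.pl_id
    pli_namespace INT NOT NULL,
    pli_title VARBINARY(255) NOT NULL,
    pli_added BINARY(14) NOT NULL,               -- timestamp, for sort orders other than A-Z
    PRIMARY KEY (pli_list, pli_namespace, pli_title)
  ) /*$wgDBTableOptions*/;

  CREATE INDEX /*i*/pli_list_added ON /*_*/page_list_item (pli_list, pli_added);

The integer key is also what would make multi-user ownership possible later: a ListMembers-style table could reference pl_id without touching the item rows.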
[21:23:01] I am fine with the idea of adding another table (say page_list) and linking it by integer primary key with an item table (say page_list_item)
[21:23:05] that seems pretty orthodox to me
[21:23:11] the collection extension also does this wikitext serialization thing because it means you can use the existing wikitext editor(s) to collaboratively edit them. but in theory we have extensible wikidata editors now that can edit data structures not tied to wikitext.
[21:23:13] why, anomie? So you'd have an entry there with id=1, title=Jon's 2014 watchlist, owner=jon; the existing watchlist table would be used for entries that point to id=1
[21:23:25] And if wl_tag is unique, then we don't need wl_user in the page list at all. We just need metadata somewhere saying "$user's watchlist is wl_id=123462"
[21:23:38] you could then have a table ListMembers where there's an entry list_id=1, member=Anomie
[21:25:03] anomie: I think we are in agreement. It would be great if we could shift the user to the metadata table/another table.
[21:25:19] (the how for that I'm not sure about)
[21:25:47] jdlrobson: Yeah, if you're going that route then you want an integer wl_id as the key in your page-list table, not wl_user+wl_tag.
[21:25:47] surely we've done that sort of thing lots of times
[21:25:48] I'm guessing that could be a step 2 though?
[21:26:11] Agreed. This is what I said early on (the RFC page doesn't reflect this since I didn't originally author it)
[21:26:36] what would the value be for existing entries?
[21:26:38] null?
[21:26:40] (catching up on the log) Marybelle: watchlists are private, but we've had the option of sharing watchlists using the watchlist token feature
[21:26:45] would we populate the lists table
[21:26:53] yes, populate it
[21:27:01] TimStarling: got it.
[21:27:16] there is lots of code in maintenance/ for similar past DB migrations
[21:27:23] TimStarling: i would also like to add a modified timestamp for when something was added to the list, as this will support arbitrary sorting other than A-Z
[21:27:39] i suspect this will help cscott where items in a chapter are in a random order
[21:27:43] legoktm: You'd need multiple tokens then, I guess.
[21:27:59] legoktm: And a means of adding pages to one or more watchlists... the UI part of this seems much harder than the internals.
[21:28:14] Marybelle: that is why i mentioned the data editors above
[21:28:24] A timestamp won't really help with something like the Collections extension, unless you're abusing it to store not-really-a-timestamp.
[21:28:32] (cscott: i can imagine a chapter in a book becoming a list, where the chapter title is the title of the list and all the entries are stored in a table pointing to that list) a book would then essentially be tying lists/collections together
[21:28:39] although i'd expect more specialized UI to be built by (say) the mobile team.
[21:29:10] here is a question I had from a user once: why do watchlists show all changes, why do they not show only unreviewed changes?
[21:29:14] cscott: I definitely agree with leveraging data editors.
[21:29:39] we have the timestamp which gives you read/unread status
[21:29:52] but you can't filter on that
[21:30:17] or how about showing changes since your last visit to the site?
[21:30:30] If we're getting into watchlist improvements besides multiple lists, T11790 is a good one.
[21:30:52] https://phabricator.wikimedia.org/T11790
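For the "would we populate the lists table" exchange above, a backfill along these lines is presumably the kind of migration being suggested. This is again only a sketch, reusing the hypothetical page_list/page_list_item names from the previous block: a real migration would be batched from a maintenance/ script, and the default list title and placeholder timestamp are assumptions, not decisions from the meeting.

  -- 1. Create one default list per user who has watchlist rows today.
  INSERT INTO /*_*/page_list (pl_owner, pl_title)
  SELECT DISTINCT wl_user, 'watchlist'
  FROM /*_*/watchlist;

  -- 2. Point every existing watchlist row at its owner's default list.
  -- Existing rows carry no "added" time, so a placeholder answers the
  -- "what would the value be for existing entries?" question above.
  INSERT INTO /*_*/page_list_item (pli_list, pli_namespace, pli_title, pli_added)
  SELECT pl_id, wl_namespace, wl_title, '19700101000000'
  FROM /*_*/watchlist
  JOIN /*_*/page_list ON pl_owner = wl_user AND pl_title = 'watchlist';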
[21:31:13] yeah, that's good
[21:31:22] these are really DB schema issues
[21:31:24] Some people have also asked for a way to see the most recent changes to all pages on their watchlist. Like you'd maintain a list of pages you care about, and you could see the most recent change even if it was outside 30 days or whatever.
[21:31:41] people file bugs like that and we say "impossible because schema"
[21:31:59] well, if we're redoing the schema from scratch, we should probably try to have fewer impossible things
[21:32:56] pau actually made a suggestion of that form as well:
[21:33:02] Considering that articles are living entities, it would be interesting to surface some updates about the content included in a collection. This is something that already happens in the watchlist, but I was thinking about something more focused on readers, where I could view that an article was added to a list on interesting architects, or some piece of information was added there.
[21:33:04] We've previously discussed the difference between making something public --> private and private --> public. (The latter is much more difficult.)
[21:33:31] So it may be best to deprecate watchlists and use a new term for a new thing.
[21:33:35] so, answering my own question: the reason why this isn't just a json data structure handled by https://meta.wikimedia.org/wiki/Wikidata/Notes/ContentHandler is...
[21:33:56] ...that we want to watch changes to the pages mentioned in this list, and trigger events based on that.
[21:33:57] right?
[21:34:39] TimStarling: So yes. I think there is loads of room for improvement around the watchlist feature. Right now I can see just having multiple watchlists as a big win that is relatively easy to achieve, which would address many use cases and open up many doors that are currently closed.
[21:34:45] cscott: you can fill in the DB tables when the JSON is updated
[21:35:02] Again, I don't think it's relatively easy to implement multiple watchlists from the user interface side.
[21:35:03] JsonContent is a pretty generic UI
[21:35:15] Marybelle: not true. I've done a POC
[21:35:33] jdlrobson: Link?
[21:35:43] ok, time to move on, please note next steps etc.
[21:36:00] So what would it look like if watchlists were implemented on top of ContentHandler? i'm guessing we'd still need a db table under the covers somewhere. what would that db table look like?
[21:36:04] who will write the schema in non-whiteboard-photo form?
[21:36:07] Seems a lot like Amazon's "add to wishlist": default to a primary list and allow selecting an alternate, plus a GUI to move an item from list A to list B
[21:36:30] cscott: What, watchlists? Isn't it because we want to be able to do "select ... from watchlist join recentchanges where wl_user=?" instead of "select ... from recentchanges where title in (...list of 10000 titles...)"?
[21:36:33] #action jdlrobson to propose schema
[21:36:45] TimStarling: so it seems we don't have any objections to adding a list_id entry to the database and shifting the user information over there? I will definitely help make that happen, but it would be great to have some help from a SQL expert, e.g. aaron s?
[21:36:55] #info nobody seems quite sure how broad a tent the abstraction should be
[21:37:02] Marybelle: https://gerrit.wikimedia.org/r/#/c/183757/ < changes Special:EditWatchlist
[21:37:39] jdlrobson: maybe I should do it
[21:37:49] (The primary issue with EditWatchlist is the Scunthorpe problem. ;-)
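anomie's aside a few lines up, about joining recentchanges against the watchlist, is the crux of why the list should live in an indexed table rather than only in a content blob, and the two query shapes are worth spelling out. Another sketch, using the hypothetical page_list_item table from earlier; the recentchanges column names are the real ones, everything else is assumed.

  -- With an indexed list table, the feed is a straightforward join:
  SELECT rc.*
  FROM /*_*/recentchanges rc
  JOIN /*_*/page_list_item
    ON pli_namespace = rc_namespace AND pli_title = rc_title
  WHERE pli_list = 123462;  -- the hypothetical wl_id from the discussion

  -- Without it, every title has to be shipped inline, which falls apart
  -- for a 10000-title watchlist:
  SELECT rc.*
  FROM /*_*/recentchanges rc
  WHERE (rc_namespace, rc_title) IN ((0, 'Example_A'), (0, 'Example_B') /* ...x10000 */);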
[21:37:58] Aaron is not here so he can't refuse, seems unfair ;)
[21:38:00] jdlrobson: Not so much "add a list_id entry" as adding a whole new table to store lists identified by a list_id, and one to tie those list_ids to metadata.
[21:38:13] TimStarling: I'm happy to help with that
[21:38:24] Lists or queues or collections.
[21:38:42] #info cscott suggests JsonContent or some such
[21:39:18] anomie: simple enough to do that with JsonContent
[21:39:37] we do such queries on links, and they come from content, right?
[21:39:48] anomie: right, i think you're getting at the crux: we want to be able to easily join "something" with the watchlist. traditionally it's recentchanges, pau suggested a more general "related content" join.
[21:40:02] we already have a shared public watchlist feature, it is recentchangeslinked
[21:40:06] which is of course content-based
[21:40:38] And largely unused.
[21:41:14] #topic Guidelines for extracting, publishing and managing libraries | RFC meeting | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
[21:41:15] Well, the queries on links are joining against the pagelinks table, not supplying a list of every link. The API limits "supplying a list of every link" to 500 for most users, 5000 with apihighlimits.
[21:41:18] #link https://www.mediawiki.org/wiki/Requests_for_comment/Guidelines_for_extracting,_publishing_and_managing_libraries
[21:42:01] anomie: right, so do the same thing with watchlists/collections
[21:42:19] have an indexed table derived from the content, join on it
[21:43:08] It seems something of a waste to have a content blob stored that's basically a serialized list of what's supposed to be in the database table.
[21:43:20] so this next RFC is a funny kind of RFC in that it is basically permanent developer documentation
[21:43:24] Unless we're needing the content blob for some other purpose.
[21:44:17] in theory RFCs are resolved and archived -- this one will maybe be moved elsewhere once we have discussed it?
[21:44:21] I think so
[21:44:24] yeah
[21:44:54] that still needs to be done for the mediawiki/vendor RFC too :/
[21:46:03] GitHub auth: probably not going to happen.
[21:46:31] csteipp: It was an idea :)
[21:46:41] You mean using github to log in and attach accounts, right?
[21:47:12] yes. allowing someone with only github credentials to access phab as a full participant
[21:47:33] to make getting contributions and bugs from github easier
[21:47:54] That plugin looked sketchy, but I can look at it again if we really want it.
[21:48:20] Are we referring to logging into Phab using GitHub?
[21:48:31] Yeah
[21:48:35] parent5446: https://www.mediawiki.org/wiki/Requests_for_comment/Guidelines_for_extracting,_publishing_and_managing_libraries#Issue_tracking_guidelines
[21:48:49] Thanks, I was trying to find which section it was in
[21:49:18] that sync idea actually came from robla, but I liked it conceptually
[21:50:09] I like your punctuation style bd808: (?are we ready to try arc out with libraries?)
[21:50:34] you know Spanish actually has bracketed questions: https://en.wikipedia.org/wiki/Inverted_question_and_exclamation_marks
[21:50:44] *nod*
[21:51:02] I tend to do that leading ? thing with inline notes
[21:51:19] I know that, for sure, allowing contributions via GitHub pull requests in some manner would make contributing a lot easier, and is something we should try and do.
[21:51:58] phab may make it a lot easier. Facebook moves things from github pull requests into phab for review
[21:52:11] i will note that there exists gerrithub.io, and it works reasonably well to allow gerrit-style code review on github-hosted code.
[21:52:34] are there any objections to moving this page out of the RFC namespace and continuing to work on it there?
[21:53:38] Where should it go? Just in the main namespace "somewhere"?
[21:53:51] Manual namespace
[21:54:25] maybe link from https://www.mediawiki.org/wiki/Developer_hub
[21:54:53] legoktm: I guess Manual:, if this is official policy. Everything on mediawiki.org is documentation.
[21:55:16] I'll update Developer hub etc. with links to it
[21:55:49] thanks spagewmf
[21:55:55] spagewmf: please give your tech writer input. It's very welcome
[21:56:26] ok, all done?
[21:56:52] So something like https://www.mediawiki.org/wiki/Manual:Developing_libraries to mirror https://www.mediawiki.org/wiki/Manual:Developing_extensions ?
[21:57:09] genius
[21:57:24] * bd808 pats self on head gently
[21:57:45] #endmeeting
[21:57:45] Meeting ended Wed Jan 14 21:57:45 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[21:57:45] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-21.00.html
[21:57:45] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-21.00.txt
[21:57:45] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-21.00.wiki
[21:57:45] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-01-14-21.00.log.html
[21:58:17] no RFC meeting next week due to the WMF all hands
[21:58:32] the wednesday after will be the day after the dev summit
[21:58:59] Does "Project code review should use the tool most closely associated with the primary git hosting." mean that committing changes suggested in pull requests, instead of merging them with the merge button, should be discouraged?
[21:59:29] some people will be flying, I'm not sure how convenient it will be
[21:59:30] If the idea is to have every change go through a pull request, that means having one "Merge ..." commit for each commit in the repository, which is a bit annoying.
[21:59:44] Rastus_Vernon: i rather like using gerrithub with github, it integrates pretty well.
[21:59:46] but if we do have a meeting, SOA Auth would be a possible topic
[22:00:08] or maybe something that comes out of the SF meetings that week
[22:00:29] mobile mobile mobile mobile
[22:00:35] cscott: How does gerrithub work?
[22:00:44] or maybe, "radical reinvention of the mediawiki software" ;)
[22:01:05] My biggest complaint with using pull requests for everything is that you end up with a gazillion "Merge ..." commits that pollute the Git log.
[22:01:06] Rastus_Vernon: the same way gerrit works, except it pulls & pushes code to github.
[22:01:16] Rastus_Vernon: We've been using pull requests a bit on https://github.com/wikimedia/composer-merge-plugin but I could see how for a high-volume project the merges could get a bit crazy
[22:01:24] gerrit has a "rebase to merge" feature.
[22:01:37] Yes, that's what some projects already do on GitHub.
[22:01:41] There are other ways to land a pull request besides the "merge" button too
[22:01:44] But it requires being done manually, not using the pull request interface.
[22:01:49] but i really got over my distaste for merge commits after a while
[22:02:14] Rastus_Vernon: i'm suggesting using gerrithub to manage patches, not pull requests.
[22:02:18] if the merges are true feature branches rather than individual commits, they are less grating
[22:03:18] Rust and Servo use pull requests for every single commit, and they seem to be getting along fine with it, so maybe people eventually get used to it.
[22:03:20] I actually like seeing the revision history in the repo, but it certainly is something that each library can work out for themselves. Direct pushes to master are lame though.
[22:03:53] pre-commit code review is a valuable thing for stability, in my opinion
[22:04:00] It is.
[22:04:08] TimStarling: we should probably skip the following week as well, and start back up in February
[22:04:26] And unit tests too, unless you use a functional language where the compiler almost guarantees you don't have bugs.
[22:05:05] * bd808 would like to meet the magic compiler and language
[22:05:34] bd808: Look at yesodweb.com!
[22:05:46] a compiler that understands the application domain would be truly exciting :)
[22:07:02] Ur/Web is nice too. http://www.impredicative.com/ur/
[22:07:31] The compiler, with its type system, goes as far as guaranteeing that you have no dead links and never return invalid HTML.
[22:16:22] <^d> i will note that there exists gerrithub.io, and it works reasonably well to allow gerrit-style code review on github-hosted code.
[22:16:32] <^d> I just threw up a little after reading that. It's like the worst of both worlds!
[22:16:49] It looks nice, though.
[22:17:06] ^d: i'm not sure why you think that.
[22:17:18] reviewing a stack of dependent patches on github is a nightmare
[22:17:23] But using gerrit code review on GitHub-hosted code is a bit pointless: you might as well use gerrit to host the code.
[22:17:29] <^d> cscott: Reviewing anything in gerrit is a nightmare!
[22:17:30] <^d> :)
[22:18:05] ^d: what's your preferred tool?
[22:18:46] Rastus_Vernon: perhaps. but that's not how gerrithub built their service. and to some degree it makes sense: github has a lot of community and features around it, it's pointless to try to reinvent that on the gerrit side Just Because.
[22:18:55] <^d> cscott: Phab's my least hated at the moment :)
[22:19:08] cgit with a mailing list to send patches and track bugs!!
[22:19:20] ^d: doesn't phab have the same problem with not supporting patch-level reviews?
[22:20:06] Rastus_Vernon: been there, done that. gerrithub is the closest to how that feels, without making me have to fight with patch formatting in an email client.
[22:20:24] <^d> How do you mean? You can comment on versions of patches.
[22:21:18] mailman is better than GitHub and Bugzilla and Phabricator and gerrit for tracking bugs and patches, and you can have it all unified in one tool! The best part is that it's very user friendly and easy to use.
[22:21:32] It would be nice if GitLab added code review features.
[22:22:35] <^d> I miss code review on wikitech-l.
[22:22:39] <^d> Good times, that.
[22:23:26] ^d: https://phabricator.wikimedia.org/T167
[22:24:53] cscott: "Priority: Low"
[22:25:25] "Assigned To: None"
[22:25:37] yes, because we're not using phab for code review atm
[22:25:46] It should have the new epic tag applied to it