[20:59:09] soon...
[21:01:08] * robla gets plugged in
[21:01:31] #startmeeting ArchCom RFC meeting - Markdown support | Wikimedia meetings channel | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE) | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
[21:01:31] Meeting started Wed Jun 22 21:01:31 2016 UTC and is due to finish in 60 minutes. The chair is brion. Information about MeetBot at http://wiki.debian.org/MeetBot.
[21:01:31] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[21:01:32] The meeting name has been set to 'archcom_rfc_meeting___markdown_support___wikimedia_meetings_channel___please_note__channel_is_logged_and_publicly_posted__do_not_remove_this_note____logs__http___bots_wmflabs_org__wm_bot_logs__23wikimedia_office_'
[21:01:46] i hope that wasn't too many bits for poor meetbot
[21:02:19] #link https://phabricator.wikimedia.org/E218 Phab event for this week's meeting
[21:02:40] #info discussing https://phabricator.wikimedia.org/T137946 develop Markdown support strategy for MediaWiki
[21:03:20] #link https://www.mediawiki.org/wiki/Requests_for_comment/Markdown this week's RFC
[21:03:30] * robla wipes brow
[21:03:41] robla, care to chat a bit on the background?
[21:04:33] sure, this is asking "what should our Markdown strategy be?", where pretty much any answer is valid
[21:04:56] :D
[21:05:12] why I'm asking that: there are many, many flavors of "wiki syntax" out there, of which MediaWiki wikitext is only one
[21:06:00] (but ours is the _real_ wikisyntax... :P )
[21:06:04] many implementations claim "Markdown support", but the interpretation varies quite a bit based on implementation
[21:06:43] YairRand: :-D I think that actually gets to the heart of it
[21:07:51] YairRand: do you (or anyone out there) believe that all other implementations will "see the light" and start using our format? should they?
[21:08:59] a different question is: will all the disparate markdown efforts to go beyond "simple" markdown eventually arrive at the wikitext level of complexity?
[21:09:25] even if the syntax will probably not be wikitext syntax itself.
[21:09:59] (taking off my chair hat momentarily) what's a reason a given wiki might have for choosing to use markdown? preference, or compatibility with existing data or other tools, or?
[21:10:21] (that might affect how one would go about such support)
[21:10:56] I think both questions are very good, and now I'm having trouble choosing :-)
[21:11:00] :D
[21:11:08] let's do em in turn
[21:11:11] migrating from a github wiki to mediawiki might be one reason to want markdown page source
[21:11:54] *nod*
[21:11:54] bd808: yup
[21:11:54] are there any serious limitations regarding wikitext that are solved in other syntaxes? are they pretty freely convertible?
[21:12:09] (Is there a question here about how Wikimedia markdown talked about now will interface with SQID and Wikidata?)
[21:12:35] YairRand: the Pandoc folks aspire to provide complete interchangeability
[21:12:58] #info open question: reasons for choosing markdown? example: moving hosting of a github wiki
[21:12:59] robla: ...
[21:13:06] * subbu is looking at http://pandoc.org/README.html#pandocs-markdown and sees that it is a pretty long spec
[21:13:47] #info open question: complexity and extensions to the markup? example: would we need a syntax extension for templates/parserfunctions/lua/wikidata/etc?
[21:14:39] easy things are easy to convert, hard things are ....... well that's the question isn't it :D
[21:15:04] one good reason to entertain this markdown question for mediawiki is that it might let us abstract the markup / parsing parts of the codebase behind an interface.
[21:15:26] #info for convertibility of markdownish things, see pandoc http://pandoc.org/README.html#pandocs-markdown
[21:15:41] what does cut and paste support mean for users in practice?
[21:15:52] agreed. getting serious about multiple markup formats would lead to cleaning up a lot of entangled cruft in core
[21:15:58] subbu: good point. also, how much do we rely on wikitext eg in the user interface?
[21:16:00] i toyed with that interface idea in https://www.mediawiki.org/wiki/User:SSastry_(WMF)/Notes/Wikitext#Core_ideas
[21:16:35] brion, yes, wikitext in the UI is tricky ...
[21:16:49] site messages are another i guess.
[21:17:25] TimStarling: I know what it means for me, but that's probably a better question for the folks who work with VE regularly, since my understanding is that cut-n-paste bugs happen a lot
[21:17:28] #info question: heavy use of wikitext in UI may require core parser. implications for alternate formats?
[21:17:48] * robla goes to find the Phab component for cut-n-paste issues
[21:17:58] brion, is this (wikitext in UI) used a lot in non-wmf installs of mediawiki?
[21:18:26] https://phabricator.wikimedia.org/project/view/898/ VisualEditor copypaste component in Phab
[21:18:29] would markdown be a third editing mode, after "source" and VE?
[21:18:33] #link https://phabricator.wikimedia.org/project/view/898/ VisualEditor copypaste component in Phab
[21:19:00] TimStarling, I would think not.
[21:19:12] would you have an "insert markdown" toolbar button which gives you a box for pasting markdown?
[21:19:18] subbu: at least some yes, sentences and paragraphs allowing bold, links, etc on various special pages. don't know how scary they are
[21:19:27] as in .. i see robla's proposal as that of using it as an interchange format for copy-paste
[21:20:12] #info question: would cut-and-paste and interchange for markdown add a third editing mode beyond source/visual?
[21:20:38] agreed. getting serious about multiple markup formats would lead to cleaning up a lot of entangled cruft in core
[21:20:43] or it could be done as a ContentHandler
[21:21:17] yeah. then you could have a mixed wiki if you wanted
[21:21:29] then you wouldn't even touch $wgParser or create a Parser base class
[21:21:31] i don't see a use case for mixed-markup-format wikis.
[21:21:31] #info tim sez "getting serious about multiple markup formats would lead to cleaning up a lot of entangled cruft in core"
[21:21:34] that would be pretty confusing.
[21:21:44] no, I was quoting bd808
[21:21:51] #info whoops bd808 sez that
[21:22:04] * brion quote parsing error ;)
[21:22:07] * bd808 denies it all
[21:22:40] it can be the default content handler if you like, the point of doing it as a content handler is that it gives you a convenient pre-existing hook point
[21:22:53] i can see particular uses, such as when a wiki is used as a source repository of documents to be reused.... but they get scary ;)
[21:22:59] (for mixed modes)
[21:23:00] pretty much everything about wikitext has already been abstracted there, for wikidata's benefit
[21:23:36] things like links table updates, redirect syntax, PST and parsing itself
[21:23:42] #info tim is pretty sure ContentHandler can implement a markdown mode well. should already be well-factored. can be used as default contenthandler in theory
[21:23:48] i see ...
[21:24:51] that wouldn't affect site messages because the message system grabs onto $wgParser
[21:25:12] but maybe that's not a bad thing
[21:25:17] but they'd still have to be written in wikitext if they are stored in a wiki page, right?
[21:25:19] yeah, that's the point
[21:25:47] site messages could have the wikitext content type, so you could even preview them using wikitext
[21:26:27] we already support default content types that vary depending on namespace
[21:26:27] #info example of needing core parser: messages in MediaWiki: namespace, such as site notices. force them to use wikitext CH
[21:26:34] again for wikidata's benefit
[21:26:34] is some sort of wikitext always going to be at the heart of MediaWiki or is T112999 foreseeable?
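[editor's note] The ContentHandler idea discussed above — registering a handler per content model, with namespace-based defaults, so markdown pages could coexist with wikitext without touching $wgParser — can be sketched roughly as follows. This is an illustrative Python sketch, not MediaWiki's actual (PHP) ContentHandler API; all class and function names here are made up, and the "parsers" are one-line stand-ins.

```python
# Illustrative sketch (NOT MediaWiki's real ContentHandler API): a per-model
# handler registry, so a wiki could in principle mix wikitext and markdown pages.

class ContentHandler:
    """Base class: one subclass per content model."""
    def render(self, text):
        raise NotImplementedError

class WikitextHandler(ContentHandler):
    def render(self, text):
        # stand-in for the full wikitext parser: handle one '''bold''' span
        return "<p>" + text.replace("'''", "<b>", 1).replace("'''", "</b>", 1) + "</p>"

class MarkdownHandler(ContentHandler):
    def render(self, text):
        # stand-in for a CommonMark/markdown parser: handle one **bold** span
        return "<p>" + text.replace("**", "<b>", 1).replace("**", "</b>", 1) + "</p>"

HANDLERS = {"wikitext": WikitextHandler(), "markdown": MarkdownHandler()}
# default content model varies by namespace, as discussed for site messages
DEFAULT_MODEL_BY_NAMESPACE = {"Main": "markdown", "MediaWiki": "wikitext"}

def render_page(namespace, text, model=None):
    """Pick the handler from an explicit model, else the namespace default."""
    model = model or DEFAULT_MODEL_BY_NAMESPACE.get(namespace, "wikitext")
    return HANDLERS[model].render(text)

print(render_page("Main", "**bold** text"))        # markdown is the wiki default
print(render_page("MediaWiki", "'''bold''' text")) # site messages stay wikitext
```

The point of the sketch is the dispatch shape: callers never see a parser directly, only a handler looked up from the page's content model.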
[21:26:35] T112999: Let MediaWiki operate entirely without wikitext - https://phabricator.wikimedia.org/T112999
[21:27:09] robla: it's conceivable but we'd have to eliminate or make optional the remaining wikitext users ;)
[21:27:36] brion, i don't think robla is saying get rid of wikitext .. but whether mediawiki might support an option without wikitext.
[21:27:48] allowing the parser for site messages to change would be like adding a language variant to every i18n language which seems unlikely to turn out well
[21:28:22] right you'd basically have to change them to plaintext or plaintext with a very limited markup that is not full wikitext
[21:28:31] * subbu is trying to grok what bd808 just said
[21:28:34] I don't think it would really be helpful to attempt to translate i18n into some other markup language
[21:28:46] but we've got all sorts of fun things like grammatical plural and gender markers done via a subset of wiki markup
[21:28:47] you know, i18n really drove the development of a lot of parser features
[21:28:48] subbu: en-wikitext && en-markdown
[21:29:30] #info i18n is heavily dependent on a subset of the core parser for plurals, genders, and other message variants... but that doesn't have to be used for content if you don't want
[21:29:36] let's say that the version of wikitext we have now is "wikitext 1.0". is "wikitext 1.1" something we could do? (and still support i18n)
[21:30:10] * brion ponders
[21:30:36] could we, or would we want to, split a wikitext spec into 'the bits used for i18n' and 'extra fancy-ass markup used in wikipedia-like content'?
[21:30:45] robla, wikitext has evolved over the years .. so, i guess the qn. you are asking is if explicit versioning is needed?
[21:30:48] or is that even worse :D
[21:30:49] i18n of course is a mix of formats
[21:31:05] preprocessed plain text, preprocessed HTML and true wikitext
[21:31:07] plaintext, plaintext plus, wikitext, html, .... oh helllllls
[21:31:14] subbu: yeah, I think so
[21:31:49] well, except the qqq language which is pretty consistently wikitext
[21:32:52] #info question: is explicit versioning needed? can/should we make a 'wikitext 1.1' that is always implemented for i18n and ui messages?
[21:33:18] #info note i18n messages are a mix of plaintext+preprocess, HTML+preprocess, and pure wikitext
[21:34:42] robla, are you proposing any role for markdown on WMF wikis?
[21:35:10] (What are the implications of these MediaWiki markdown choices/decisions re ContentTranslation and Wikipedia's 358 languages, and security questions especially?)
[21:35:47] TimStarling: I think it potentially has a role in normalizing CopyPaste issues, but the path toward that is complicated
[21:35:59] #info question: implications of markdown choices on other tools like CT, need for i18n, and security?
[21:36:19] that requires browsers, doc-creating systems (word, etc.) to support conversion to "standard" markdown.
[21:36:45] it seems very limited as an interchange format
[21:37:02] compared to RTF, HTML, PDF, etc.
[21:37:22] if I were going to copy-paste from a markdown wiki page, bug report, or readme file on github for instance, my choices are to copy-paste the source, or copy-paste the rendered HTML
[21:37:39] subbu: I think at a base level, we have a number of applications that claim "text/html" during copy/paste operations, but that text/html can contain pretty much anything
[21:37:52] we know that pasting text/html is way harder than it should be ;) but we already support it in VE
[21:38:07] brion, from some sources, yes.
[21:38:13] pasting HTML into VE is already good enough to be useful
[21:38:14] brion: we support it today, but it's an arms race, isn't it?
[21:38:17] benefits of source copy?
[21:38:18] I have used it a few times
[21:38:18] hehe yep
[21:39:00] no one (that I'm aware of) has defined a useful subset of HTML that is safe for copy/paste operations
[21:39:08] but so is markdown isn't it?
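[editor's note] The "grammatical plural and gender markers done via a subset of wiki markup" mentioned above refers to message syntax like `{{PLURAL:$1|page|pages}}`. The toy resolver below (a Python sketch; real MediaWiki implements per-language plural rules, GENDER, GRAMMAR, and more, driven by CLDR data) illustrates why even "plaintext" i18n messages still need a small parser.

```python
import re

# Toy expander for the {{PLURAL:$1|singular|plural}} message construct.
# English-only two-form rule for illustration; MediaWiki's real i18n layer
# selects among per-language plural forms (some languages have several).

def expand_message(msg, n):
    """Substitute $1 with n, then resolve any {{PLURAL:...}} constructs."""
    msg = msg.replace("$1", str(n))
    def pick(match):
        count = int(match.group(1))
        forms = match.group(2).split("|")
        return forms[0] if count == 1 else forms[-1]
    return re.sub(r"\{\{PLURAL:(\d+)\|([^{}]*)\}\}", pick, msg)

print(expand_message("Deleted $1 {{PLURAL:$1|page|pages}}", 1))  # Deleted 1 page
print(expand_message("Deleted $1 {{PLURAL:$1|page|pages}}", 5))  # Deleted 5 pages
```

This is the "subset of the core parser" dependency: any non-wikitext future still has to provide something equivalent for every message in every language.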
[21:39:22] if we support github's extensions, next we get asked about someone else's extensions
[21:40:18] #info question is the HTML copy-paste "arms race" good enough vs markup-specific paste converter tools for markdown etc?
[21:40:36] HTML paste is likely to work if the HTML is very simple
[21:41:00] for example if you're copying from a github README.md you'd expect it to work
[21:41:29] TimStarling: is there a "very simple" subset of HTML we can get browser makers to support?
[21:41:43] (for copy/paste purposes)?
[21:41:47] robla, you linked to https://tools.ietf.org/html/draft-ietf-appsawg-text-markdown-12 ... what are your thoughts on how likely it is to be adopted?
[21:42:24] robla: no... but then browsers can't export to markdown either
[21:42:31] #link https://tools.ietf.org/html/draft-ietf-appsawg-text-markdown-12
[21:42:55] subbu: I think something like that could happen
[21:43:18] our original goal for parsoid html2wt (which is still there as a comment in the serialization code) is to be able to accept arbitrary html and convert it to "acceptable" wikitext. but we haven't quite worked on that goal for a while now since we are mostly behind clients whose output is more controlled.
[21:44:05] subbu: what do you mean by "output is more controlled"?
[21:44:27] as in .. VE/CX/Flow etc. don't generate arbitrary html.
[21:44:39] ah, got it
[21:45:25] but, if you, say, took the html from a bbc article and gave it to parsoid to convert to wikitext, the output isn't pretty.
[21:45:35] so...basically, the copy/paste code works when we can control the generation of the HTML, but most implementations don't conform to our spec
[21:45:51] no, VE does its own handling of copy-pasted HTML .. it doesn't go through parsoid.
[21:46:20] fun :D
[21:46:29] you mean it cleans up the HTML before it hands it to parsoid for serialization?
[21:46:41] but, we've talked about creating a library for normalization and cleanup.
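[editor's note] The "easy things are easy to convert" point — e.g. copying from a simple github README — can be made concrete: a few "original markdown" constructs map almost one-to-one onto wikitext. The sketch below is nothing like a real converter (pandoc handles nesting, escaping, code blocks, tables, and many dialects); it only shows that the simple cases are mechanical rewrites.

```python
import re

# Minimal sketch mapping a few original-Markdown constructs to wikitext.
# Deliberately naive: no nesting, no escaping, no lists or code blocks --
# just an illustration that the simple constructs translate one-to-one.

RULES = [
    (re.compile(r"^### (.+)$", re.M), r"=== \1 ==="),      # h3 -> level-3 heading
    (re.compile(r"^## (.+)$", re.M), r"== \1 =="),         # h2 -> level-2 heading
    (re.compile(r"^# (.+)$", re.M), r"= \1 ="),            # h1 -> level-1 heading
    (re.compile(r"\*\*(.+?)\*\*"), r"'''\1'''"),           # **bold** -> '''bold'''
    (re.compile(r"\[([^\]]+)\]\(([^)]+)\)"), r"[\2 \1]"),  # [text](url) -> [url text]
]

def md_to_wikitext(text):
    for pattern, repl in RULES:
        text = pattern.sub(repl, text)
    return text

print(md_to_wikitext("# Title\nSee **the docs** at [MW](https://mediawiki.org)."))
```

The hard direction is exactly what the discussion identifies: templates, parser functions, and extension tags have no markdown counterpart at all.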
[21:47:01] #info for comparison, the HTML paste handling in VE is done by normalizing HTML on the VE end, before it eventually lands in parsoid during save/serialization
[21:47:10] TimStarling, as far as i know ... they strip unrecognized / unsupported attributes.
[21:47:30] #info ideally the parsoid html2wt would take any html and produce 'acceptable' wikitext but is not fully exercised at that right now
[21:49:28] things like html2wt are going to be necessary for a long time, I imagine, but it seems to me we should at least start pulling people toward a world where html2wt isn't necessary
[21:50:32] well, there's the html-only world possibility :)
[21:50:47] where you'd still have some validation stage
[21:50:56] but not a major reparse i guess
[21:51:13] (and presumably a stage to handle composition of templates, media etc)
[21:51:56] for parsoid to accept arbitrary html, we would need to run a sanitization pass on the html and strip unrecognized attributes, normalize html, etc.
[21:52:09] I think we live in a world where wikitext is sanitized and tries to be safe, and HTML is known unsafe
[21:52:28] indeed we'd have "inside html" and "outside html" at the least
[21:52:32] which is also something that needs to happen with a html-only wiki .. sanitization at the very least.
[21:52:32] never, EVER mix em :D
[21:52:44] there's no "sanitized HTML" spec
[21:53:09] :)
[21:53:14] #info an HTML-only storage world needs to carefully sanitize between "outside HTML" and "safe inside HTML".... but there's no spec! we'd need one
[21:53:37] there's the old HTML email spec
[21:53:58] (but yeah, that's not really a good alternative)
[21:54:38] https://en.wikipedia.org/wiki/HTML_email
[21:55:34] probably we need to spec out our extensions as well, such as how you extract the file name from a usage, a wiki page from a link, a template reference and parameter set from a big ol' blob of divs or whatever
[21:55:46] I think if VE's HTML paste can produce reasonable wikitext markup for any HTML generated from original markdown, then that more or less replaces the need for direct markdown paste
[21:56:15] i tend to agree
[21:56:36] "original markdown" as in http://daringfireball.net/projects/markdown/syntax
[21:56:45] which is much simpler than pandoc markdown
[21:57:28] commonmark would be the modern simple version, I think
[21:57:47] http://commonmark.org/
[21:57:50] ok we're getting low on time
[21:58:08] any action items to pursue? decisions made?
[21:58:21] T127329 is the placeholder for the parsoid side work to consolidate html-import/cleanup code into a library for use by whoever.
[21:58:21] T127329: Using Parsoid as a wikitext bridge for importing content into wikitext format - https://phabricator.wikimedia.org/T127329
[21:58:50] #link https://phabricator.wikimedia.org/T127329 related parsoid bridge for html-import-to-wikitext
[21:59:19] Thanks All!
[21:59:26] so I'm fairly skeptical about the idea of direct markdown paste as being superior to markdown->html->wikitext
[21:59:28] subbu: my understanding is that you're working on RFCs as a goal soon, right?
[21:59:28] i was interested in the markdown strategy as a potential benefit for refactoring some code in mediawiki .. but looks like that is mostly already in place?
[21:59:50] yay wikidata -> contenthandler \o/
[22:00:17] robla, rfcs for .. that task i pasted above?
[22:00:21] #info tim is skeptical of direct paste; html import seems to serve well
[22:00:33] subbu: something related to T112999?
[22:00:33] T112999: Let MediaWiki operate entirely without wikitext - https://phabricator.wikimedia.org/T112999
[22:00:43] #action someone should revise the RfC, probably drop the cut-paste
[22:00:44] ah, cscott territory.
[22:00:49] yes.
[22:00:58] #action update T112999 for the ContentHandler era
[22:00:58] T112999: Let MediaWiki operate entirely without wikitext - https://phabricator.wikimedia.org/T112999
[22:01:15] i'll chat with him about it.
[22:01:36] #action subbu will chat with cscott
[22:01:38] thanks all!
[22:01:40] #endmeeting
[22:01:42] Meeting ended Wed Jun 22 22:01:41 2016 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[22:01:42] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-06-22-21.01.html
[22:01:42] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-06-22-21.01.txt
[22:01:42] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-06-22-21.01.wiki
[22:01:42] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2016/wikimedia-office.2016-06-22-21.01.log.html
[22:01:49] thanks brion!
[22:01:55] :D
[22:01:57] see y'all later!
[22:02:19] see ya