[18:55:12] is there a youtube link for elasticsearch?
[19:03:35] jeremyb-phone: Not yet. I'll post as soon as I find one.
[19:03:50] jeremyb-phone: There is just https://plus.google.com/u/0/events/cokipb2senmmvkvdjif7aq55kac for now
[19:05:30] jeremyb-phone: https://www.youtube.com/watch?v=FubXExbAvOA
[19:12:57] (thanks, also following in #-dev)
[19:48:50] pasting some comments on the talk from #mediawiki-parsoid:
[19:48:52] (12:46:04 PM) subbu: cscott, maybe you can follow up on nik's response ... but curious how will a html document store change the search issues? or, maybe this is an offline discussion.
[19:48:52] (12:46:22 PM) subbu: or actually, maybe it won't.
[19:48:52] (12:46:25 PM) subbu: never mind.
[19:48:52] (12:46:38 PM) subbu: in the sense it is still just a api call as far as cirrus is concerned.
[19:48:55] (12:46:46 PM) cscott-free: yeah, i wanted to figure out whether or not their troubles getting good sentence data was due to using non-semantic HTML output from the PHP parser
[19:48:56] (12:47:05 PM) cscott-free: for example, our markup should be easier to identify and separate out
[19:48:57] (12:47:15 PM) subbu: i think so.
[19:48:58] (12:47:39 PM) cscott-free: probably table markup would be comparable, so no difference there
[19:48:59] (12:47:49 PM) cscott-free: but we could allow some more precise info on templates
[19:49:00] (12:48:03 PM) cscott-free: maybe lowering the relevance of hits which occur inside templates, for example
[19:49:02] (12:48:13 PM) cscott-free: that's lost in PHP HTML output