[09:18:43] bd808: sounds like it will be needed for wikitech account conflicts after all?
[09:19:00] SUL rename handling in AuthManager I mean
[15:22:15] tgr: I'm hoping that won't be necessary, but I guess time will tell. My intuition is that there is an existing SUL account for every active wikitech editor (minus bots).
[21:05:19] MaxSem: dude, what is this, ? :D
[21:05:22] also, nice tool
[21:06:24] welt, that's the thing that works. behold its beauty XD
[21:07:51] MaxSem: it doesn't work on ie 8!!!!
[21:08:00] wanna make the buttons OOjs UI? :P
[21:08:04] DIEDIEDIE IE
[21:08:31] there was an ie8? I thought ie6 was the last one ever made
[21:08:31] MaxSem, would it be possible to add this to the comparison? https://www.mediawiki.org/wiki/Extension:WikEdDiff Or a second comparison?
[21:08:43] and +1 nice tool :)
[21:08:59] * bd808 missed something on the other side of the netsplit
[21:09:29] bd808, nope, we're just talking about the very recent https://lists.wikimedia.org/pipermail/wikitech-l/2016-April/085316.html
[21:09:40] quiddity, I can't compare PHP with non-PHP algo because I pregenerate a list of diffs that are actually different
[21:09:47] MaxSem: also: when the two diffs have different length, they don't align :( https://diff-forge.wmflabs.org/w/index.php?title=Special:DiffCompare&oldid=708093439&newid=708092610
[21:09:52] MaxSem: where can i submit a patch? :D
[21:10:14] https://github.com/MaxSem/DiffCompare
[21:10:27] (CAREFUL, NOT WRITTEN FOR WMF DEPLOYMENT!)
[21:11:09] * bd808 orders a bigger monitor so he can use the tool
[21:11:35] MaxSem, are you sure it's not PHP? the description elsewhere says "The library has also been ported to PHP and is available as a MediaWiki extension."
[21:11:52] (at https://en.wikipedia.org/wiki/User:Cacycle/diff )
[21:11:54] ah, right
[21:12:08] MaxSem: https://github.com/MaxSem/DiffCompare/pull/1
[21:12:42] whoa
[21:12:52] github live updated the status to "Merged" as i was viewing it
[21:13:09] also, deployed
[21:13:14] I think github installs a service worker these days
[21:14:02] I haven't looked into the js code but I've noticed things that make me think that is the magic for faster page views
[21:15:17] It would be awesome if we could give that alternative some attention, whilst we're looking at these things. It's used in the WikEd gadget, the second most popular on Enwiki (34,324 users), and has had a lot of work put into it.
[21:16:59] (second most popular overall, too, https://meta.wikimedia.org/wiki/Gadgets/wikipedia )
[21:17:52] quiddity, it's also several times longer than core implementations
[21:18:25] nod. so perhaps not suitable as a default, but perhaps as an alternative in jon's dropdown diff-engine picker?
[21:18:57] the problem is that even native C++ wikidiff2 can take a minute to diff
[21:19:15] deploying a PHP diff on cluter is out of question
[21:19:20] *cluster
[21:21:45] MaxSem: https://github.com/MaxSem/DiffCompare/pull/2 be advised that i did not run this code
[21:21:55] (i did syntax-check though)
[21:22:04] so test before deploying. or when deploying. :D
[21:22:43] see how cute OOUI is? and you get HTML escaping for free, no pesky entities in your code
[21:35:21] MatmaRex, Class undefined: DiffCompare\OOUI\ButtonWidget
[21:35:47] MaxSem: bah
[21:35:59] needs to be \OOUI\ then, i guess
[21:36:02] stupid php namespaces
[21:37:17] fixed
[21:38:11] URLs are now overescaped :P
[21:38:56] bahhhh.
[21:39:54] MaxSem: there
[21:43:24] MatmaRex, thanks - deployed
[21:44:28] yay
[21:47:23] MaxSem: heh, half of the diffs i'm seeing are vandalisms where somebody removes half the article (or reverts of them), and both diffs are very confused.
[21:47:43] heh
[21:56:03] I'm comparing the examples given in https://meta.wikimedia.org/wiki/Community_Tech/Improved_diffs against your tool, WikEdDiff, and the Mobilediff. E.g. http://imagizer.imageshack.com/img921/2880/EUTc7F.png
[21:57:34] in that example, WikEdDiff is the only one to recognize that "braclets"->"bracelets" is a single letter addition. But it also gets confused by "called"->"entitled"
[22:01:02] MaxSem: haha, the "Differences between diffs" section for this one is busted: https://diff-forge.wmflabs.org/w/index.php?title=Special:DiffCompare&oldid=708104305&newid=708102196
[22:01:50] the 10kb limit:P
[22:03:06] quiddity, note that mobile diff has the same internals as normal diff you see on WP
[22:03:30] ...which is basically a straightforward port of DairikiDiff
[22:04:08] ah, good to know. Are these kinda details written anywhere onwiki?
[22:04:18] no idea
[22:04:28] me not documentor
[22:04:37] zanzu!
[22:05:06] MaxSem: hmm, how much time/work are you going to be spending on diffs? and, if anything, what's next after this? just wondering
[22:05:56] https://diff-forge.wmflabs.org/w/index.php?title=Special:DiffCompare&oldid=708074064&newid=708072345 is an example of poorly matched paragraphs (on both sides) :( ideal would be something like https://i.imgur.com/qjr4kRY.png (ignore the UI, just look at the highlighted diff)
[22:06:42] MatmaRex, I have a few simple fixes in mind
[22:07:01] just want to do them in 2 places instead of 3 hence this shootout
[22:07:54] that diff yeah, but fuzzy matching is gonna be ridiculously slow
[22:08:20] yeah. :(
[22:10:18] maybe, with a few shortcuts...
[22:10:51] WikEdDiff gets it ;) http://imagizer.imageshack.com/img924/7866/Egrvwx.png
[22:10:57] oh yeah, and wikidiff3 is actually winning on quality
[22:12:19] quiddity: i think this particular diff is easier to get right in tools that don't split the text into paragraphs first, i think wikeddiff doesn't (it just treats a newline like any other character)
[22:12:49] MaxSem: it handles diffs involving wikilinks better, from what i saw. (maybe that's why it's called wikidiff :P)
[22:13:20] (i should probably stop looking at diffs and get back to real work)
[22:14:51] helping colleagues counts as real work! /me however, looks for lunch.
[22:17:01] another paragraph-matching mess: https://diff-forge.wmflabs.org/w/index.php?title=Special:DiffCompare&oldid=708148088&newid=698665920 i wonder if anyone has ever looked into using e.g. similar paragraph lengths as a heuristic for matching them up? MaxSem?
[22:18:28] yeah, that is the obvious speedup: if lengths within 1%, try Levenshtein distance
[23:40:47] quiddity, so I ran quick perf tests: WikEd is more than 2 times slower than our worst engine, DairikiDiff
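
Editor's sketch of the paragraph-matching shortcut floated at [22:17:01] and [22:18:28]: only run the expensive Levenshtein comparison on paragraph pairs whose lengths are within 1% of each other. This is not code from DiffCompare, wikidiff2 or any other engine mentioned in the log. The function names, the greedy pairing strategy and the `max_ratio` acceptance threshold are all assumptions made for illustration; only the 1% length tolerance comes from the conversation.

```python
from typing import Dict, List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lengths_close(a: str, b: str, tolerance: float = 0.01) -> bool:
    """Cheap pre-filter: True when the lengths differ by at most `tolerance`."""
    longer = max(len(a), len(b))
    return longer == 0 or abs(len(a) - len(b)) <= tolerance * longer


def match_paragraphs(old: List[str], new: List[str],
                     tolerance: float = 0.01,
                     max_ratio: float = 0.2) -> Dict[int, Optional[int]]:
    """Greedily pair each old paragraph with its closest unused new paragraph.

    `max_ratio` (hypothetical) rejects pairs whose edit distance exceeds that
    fraction of the longer paragraph. Unmatched paragraphs map to None.
    """
    matches: Dict[int, Optional[int]] = {}
    used = set()
    for i, p in enumerate(old):
        best_j, best_d = None, None
        for j, q in enumerate(new):
            if j in used or not lengths_close(p, q, tolerance):
                continue  # skip the quadratic comparison entirely
            d = levenshtein(p, q)
            limit = max_ratio * max(len(p), len(q), 1)
            if d <= limit and (best_d is None or d < best_d):
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
        matches[i] = best_j
    return matches
```

The length check costs O(1) per pair, so the quadratic Levenshtein step only runs on plausible candidates. A 1% tolerance is strict: a paragraph that gained a whole sentence will fail the filter and stay unmatched, so a real engine would presumably need a fallback for those cases.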