[07:47:44] i am a newbie but want to contribute to Wikimedia, can someone please help me?
[09:52:17] hey guys, is it correct that Wikipedia creates content dumps roughly once every month?
[11:27:47] I'm importing an XML dump of a wiki with 294k pages (I split the XML dump up into batches of 50,000 pages, though), and my question is: do you recommend setting the --no-updates flag in importDump.php?
[11:28:55] yes
[11:29:42] Would running refreshLinks.php, initSiteStats.php and rebuildTextIndex.php be enough to ensure the database is not in an inconsistent state afterwards, Vulpix?
[11:29:53] especially if it contains all revisions, since last I checked MediaWiki wasn't smart enough to detect whether the revision being imported was the last one, so it ended up updating all tables (links, etc.) on *each historical revision*
[11:30:13] or simply rebuildall.php
[11:30:18] It's a full dump with logs, all revisions, etc.
[11:30:47] It's no problem to rebuild the wiki afterwards, as long as I can be sure the wiki doesn't 'break' or something with --no-updates
[11:31:25] it won't break, but some things won't work as expected, like Special:WhatLinksHere, etc.
[11:31:44] but I can rebuild all that with rebuildall.php, I guess?
[11:32:11] and depending on the order of the imports, some templates included on pages will render as red links because the page was imported before the template, etc.
[11:32:17] yes, rebuildall solves that
[11:32:24] Should be fixable with runJobs.php, I guess
[11:33:08] OK, I will leave importDump.php running until MariaDB crashes again and then I'll redo it with --no-updates. Thank you.
[11:33:13] I'd discard all jobs after the import if you're running rebuildall
[11:33:29] yw
[11:33:54] Perhaps you want to put a small notice on https://www.mediawiki.org/wiki/Manual:ImportDump.php about what to do if you run it without --no-updates.
[11:34:57] will do
[19:57:40] is there somewhere where I can find the specification for defining templates?
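[Editor's note: a minimal sketch of the import-and-rebuild workflow discussed above. The batch file names and paths are hypothetical; --no-updates, rebuildall.php, and discarding the queued jobs are what the conversation recommends.]

```shell
# Import each batch without on-the-fly link/stat updates (much faster
# for dumps containing full revision history, as discussed above):
php maintenance/importDump.php --no-updates dump-batch-0001.xml
# ... repeat for the remaining batches ...

# Then rebuild the link tables, site statistics and search index in
# one pass:
php maintenance/rebuildall.php

# Optionally discard the deferred update jobs the import queued, since
# rebuildall already covers them (manageJobs.php is one way to do this;
# check your MediaWiki version for availability):
php maintenance/manageJobs.php --type refreshLinks --action delete
```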
For example, what's valid vs. what works: {{foo}}, {{ foo}}, {{\nfoo}}, etc.
[19:58:10] I'm doing some parsing and want to make sure that I can find all the variations of how templates are written
[20:04:40] that reminds me of https://tools.wmflabs.org/bash/quip/AU7VU_VJ6snAnmqnK_t0
[20:05:48] but sadly the response seems to be that :(
[20:06:46] hehe
[20:07:14] {{foo}} {{ foo}} {{Template:foo}} {{:Template:foo}} .... I'm pretty sure {{\nfoo}} won't work, though
[20:07:45] I wouldn't be so sure
[20:07:52] add {{TEMPLATE:foo}} or {{tEmPlAtE:Foo}} to the list as well
[20:08:30] nowadays, it's possible that the wikitext folks made a decent grammar for VE
[20:10:41] yeah, a newline works
[20:11:07] interesting, but a newline inside a link breaks it...
[20:12:03] chippy: why do you need to parse that?
[20:13:31] Vulpix: I want to update the value of one of the parameters (in the Map template, on Commons) based on a tool on Labs that the user may or may not use
[20:15:07] we're georeferencing historical maps (wikimaps) and want to let the file page know it has been done, or needs attention, etc.
[20:16:44] ah, yeah, I've written some bots that add or change template parameters... I ended up assuming the syntax wasn't using edge cases
[20:18:04] I think that's sensible. Most of the templates are going to be added via another bot or the GLAMwiki toolkit, or by expert users anyhow, so I think assuming no edge cases is probably safe
[20:18:08] I didn't feel like recreating Parser.php in Python or JavaScript again :P
[20:18:20] * chippy is writing in Ruby
[20:19:05] thanks!
[20:19:32] It might be simpler to use Parsoid
[20:20:03] Get the page as HTML, change whatever, transform back to wikitext and save the result
[20:21:03] change template parameters from HTML? doesn't seem simpler
[20:21:39] compared to wikitext...?
[20:21:59] Krenair: interesting tool, thanks
[20:22:10] It appears it doesn't like the Map template, though
[20:22:36] ?
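[Editor's note: a rough sketch of the "assume no edge cases" parsing approach discussed above, in Python rather than chippy's Ruby. It handles the whitespace, newline and case variations listed in the conversation, but deliberately ignores nested templates and the {{:Template:foo}} form; the function name is hypothetical.]

```python
import re

# Match a simple template invocation: MediaWiki trims whitespace
# (including newlines) around the name, treats the first letter
# case-insensitively, and accepts an optional "Template:" prefix in
# any letter case.
TEMPLATE_RE = re.compile(
    r"\{\{\s*"               # opening braces plus optional whitespace/newlines
    r"(?:template\s*:\s*)?"  # optional namespace prefix (any case via IGNORECASE)
    r"(?P<name>[^|{}]+?)"    # template name (no pipes or braces)
    r"\s*(?:\|[^{}]*)?"      # optional parameters after a pipe
    r"\}\}",
    re.IGNORECASE | re.DOTALL,
)

def template_name(wikitext):
    """Return the normalized name of the first template match, or None."""
    m = TEMPLATE_RE.search(wikitext)
    if not m:
        return None
    name = m.group("name").strip()
    return name[:1].upper() + name[1:]  # first letter is case-insensitive

for sample in ("{{foo}}", "{{ foo}}", "{{\nfoo}}",
               "{{Template:foo}}", "{{tEmPlAtE:Foo}}", "{{Map|lat=1|lon=2}}"):
    print(template_name(sample))
```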
[20:22:41] a parameter change can be translated into very different HTML output, I don't see how it would be easier in HTML
[20:26:04] I had experimented with the rendered HTML, but an HTML table with no IDs in the elements is not that easy either
[20:26:27] i suppose the template could be tweaked to make it easier to parse, but I think parsing the wikitext should be straightforward
[21:07:52] chippy, Vulpix: are you guys talking about plain HTML straight from MediaWiki's parser, or Parsoid's output?
[21:10:35] I don't know what Parsoid returns, so I'm lost here
[21:22:51] Vulpix, http://parsoid-lb.eqiad.wikimedia.org/_wikitext/ -> input "{{done}}" -> submit -> view source
[21:29:32] so the plan is to find the data-mw attribute with the template, change the params: { } part and resend it so it reverse-parses it back to wikitext?
[21:32:26] gtg
[22:25:23] ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
[22:25:30] whoops, my bad
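[Editor's note: a simplified sketch of the data-mw plan described above. Parsoid marks a transclusion with typeof="mw:Transclusion" and stores the template call as JSON in the element's data-mw attribute; editing a parameter means rewriting that JSON and asking Parsoid to convert the HTML back to wikitext. The attribute value below is a hand-written example, not real Parsoid output, and the function name is hypothetical.]

```python
import json

# A simplified data-mw blob for a hypothetical {{Map|georeferenced=no}}
# transclusion, following Parsoid's parts/template/params layout:
data_mw = json.loads(
    '{"parts":[{"template":{"target":{"wt":"Map"},'
    '"params":{"georeferenced":{"wt":"no"}},"i":0}}]}'
)

def set_template_param(blob, name, value):
    """Set one parameter on the first template part of a data-mw blob."""
    params = blob["parts"][0]["template"]["params"]
    params[name] = {"wt": value}
    return blob

# Flip the parameter, then re-serialize the JSON; the rewritten attribute
# would be sent back through Parsoid's HTML-to-wikitext conversion.
set_template_param(data_mw, "georeferenced", "yes")
print(json.dumps(data_mw))
```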