WikipediaProcessor: This program processes, one at a time, the XML files produced by wikipediaDumpSplitter. Each XML file is
converted to an SQL source file with mwdumper-2008-04-13.jar (org.mediawiki.dumper.Dumper). The table names in the SQL source
are prefixed with the locale (e.g. en_US, de). Each SQL source is then loaded into a MySQL database; essentially, the tables
locale_text, locale_page and locale_revision are loaded. Once the tables are loaded, WikipediaMarkupCleaner is used to extract
clean text and a word list. As a result, two tables are created in the database: locale_cleanText and locale_wordList (the
word list is also saved to a file).
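
The table-naming convention described above can be illustrated with a minimal Java sketch. The class name LocaleTables and its methods are hypothetical and are not part of WikipediaProcessor. The sketch assumes the prefix is joined to the base table name with an underscore, which the description suggests but does not state outright.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: derives per-locale table names.
final class LocaleTables {
    // Base names of the tables loaded from the mwdumper SQL output.
    static final String[] LOADED = {"text", "page", "revision"};
    // Base names of the tables created by the markup-cleaning step.
    static final String[] CREATED = {"cleanText", "wordList"};

    // Assumption: the locale prefix and the base name are joined with "_".
    static String tableName(String locale, String base) {
        return locale + "_" + base;
    }

    // All five table names for one locale, loaded tables first.
    static List<String> allTables(String locale) {
        List<String> names = new ArrayList<>();
        for (String base : LOADED) {
            names.add(tableName(locale, base));
        }
        for (String base : CREATED) {
            names.add(tableName(locale, base));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(allTables("en_US"));
    }
}
```

For the locale en_US this yields en_US_text, en_US_page, en_US_revision, en_US_cleanText and en_US_wordList.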