| Alvis-NLPPlatform documentation | view source | Contained in the Alvis-NLPPlatform distribution. |
Alvis::NLPPlatform::Convert - Perl extension for converting files of any format into the ALVIS XML format.
use Alvis::NLPPlatform::Convert;
my %config = &Alvis::NLPPlatform::load_config($rcfile);
my $mm = Alvis::NLPPlatform::Convert::load_MagicNumber(\%config);
my $AlvisConverter = Alvis::NLPPlatform::Convert::html2alvis_init(\%config);
Alvis::NLPPlatform::Convert::conversion_file_to_alvis_xml($ARGV[0], $AlvisConverter, \%config, $mm);
This module provides methods to convert input files into the ALVIS XML format. It determines the type of each input file from its magic number and applies the corresponding converter. Output files are stored in a temporary spool directory.
load_MagicNumber(\%config);
This method loads additional information for magic numbers. The file
to load is given by the variable SupplMagicFile in the CONVERTER
section of the configuration.
It returns the object containing the list of magic numbers.
html2alvis_init(\%config);
The method initializes the HTML2XML Alvis converter. It also
determines the directory where the output files will be stored: either
the directory given by the variable ALVISTMP or, by default, the
current directory. The start number of the files is also
determined.
The method returns the Alvis converter (i.e. the converter from HTML files to Alvis DTD XML).
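The directory choice described above can be sketched as follows. This is only an illustration, not the module's code: the helper name choose_output_dir is hypothetical, and ALVISTMP is assumed here to be a top-level key of the configuration hash.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: return the directory named by ALVISTMP when it is
# set and exists, otherwise fall back to the current directory.
# Assumption: ALVISTMP is a top-level key of the configuration hash.
sub choose_output_dir {
    my ($config) = @_;
    my $dir = $config->{ALVISTMP};
    return $dir if defined $dir && length $dir && -d $dir;
    return getcwd();
}

my $outdir = choose_output_dir({ ALVISTMP => '/tmp' });
print "Output directory: $outdir\n";
```

Checking that the directory exists before using it keeps a stale ALVISTMP value from causing write failures later in the conversion.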
conversion_file_to_alvis_xml($file, $AlvisConv, $config, $mm);
The method converts the input file $file into the ALVIS XML format. The
other arguments are the Alvis converter ($AlvisConv), the NLP platform
configuration ($config), which provides the command lines for conversion,
and the additional magic numbers ($mm).
html2alvis($file, $Alvis_converter);
The method converts the HTML file $file into the ALVIS XML format
(using the ALVIS converter $Alvis_converter) and stores the
output file in the temporary spool directory.
It returns a non-zero value if it fails.
make_meta($filename)
The method generates the meta information associated with $filename,
using default values for the title, date and url, and then returns it.
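As a rough illustration of what default meta information might look like, the sketch below builds the three fields from a file name. The helper name default_meta, the hash layout, and the way each default is derived are all assumptions made for this example. The real method's output format is not described here and may differ.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for make_meta: build default title, date and url
# values from a file name. The field values chosen here are assumptions.
sub default_meta {
    my ($filename) = @_;
    return {
        title => $filename,                            # file name as title
        date  => strftime('%Y-%m-%d', localtime),      # today's date
        url   => "file://$filename",                   # local file URL
    };
}

my $meta = default_meta('/data/report.html');
print "$_: $meta->{$_}\n" for sort keys %$meta;
```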
outputting_default_xmlns_file($outdata, $outfile, $AlvisConverter, $config, $mm);
The method prints the output data $outdata (defined in an empty XML
namespace) into the temporary file $outfile, and carries out the
conversion to the ALVIS XML format with $AlvisConverter.
The additional parameters are the configuration $config and the
additional magic numbers $mm.
applying_stylesheet($file, $xmlns, $config);
This method applies the XML style sheet, defined for the namespace
$xmlns given the configuration $config, to the file $file.
The method returns a two-element array containing the XML namespace and the XML data.
get_type_file($file, $mm);
The method determines and returns the type of the file $file according to its
magic number (using the list $mm) and, when the type is "msword",
according to the extension of the file (PowerPoint and Excel).
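The two-step detection described above (magic number first, then file extension for the "msword" case) can be sketched in plain Perl. This is not the module's implementation: the function name guess_type, the signature table, and the 'powerpoint' and 'excel' labels are assumptions chosen for the example. The extension step is needed because Word, PowerPoint and Excel files in the legacy binary format share the same leading bytes.

```perl
use strict;
use warnings;

# Illustrative signature table (not the module's list): leading bytes
# mapped to a type label.
my @SIGNATURES = (
    [ "%PDF-"                            => 'pdf'    ],
    [ "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1" => 'msword' ],  # OLE2 container
    [ "<?xml"                            => 'xml'    ],
);

# Hypothetical labels used to refine an OLE2 match by file extension.
my %OLE_BY_EXT = (ppt => 'powerpoint', xls => 'excel');

sub guess_type {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my $head = '';
    read $fh, $head, 16;          # the first bytes are enough here
    close $fh;

    for my $sig (@SIGNATURES) {
        my ($magic, $type) = @$sig;
        next unless index($head, $magic) == 0;
        # Same magic number for several Office formats: use the extension.
        if ($type eq 'msword' && $file =~ /\.([^.\/]+)$/) {
            return $OLE_BY_EXT{lc $1} // $type;
        }
        return $type;
    }
    return 'unknown';
}

print guess_type($ARGV[0]), "\n" if @ARGV;
```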
outputting_alvis_from_file($alvisfile, $Alvis_converter, $config);
The method formats and outputs the file $alvisfile. It loads the
file and applies the ALVIS converter. The language of the document(s)
is identified at this point, using the method defined in the
Alvis::NLPPlatform::Document module.
The $config parameter is the hash table containing the configuration
variables.
outputting_alvis($alvisXML, $Alvis_converter, $config);
The method outputs the data contained in $alvisXML using the
ALVIS converter. The $config parameter is the hash table containing
the configuration variables.
making_spool($config, $outputRootDir);
The method creates the spool directory defined in $config, starting
from $outputRootDir.
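Creating a spool directory below a root directory can be sketched with core modules only. The helper make_spool_dir is hypothetical, and the sub-directory name is passed explicitly because the configuration variable that holds it is not named in this documentation.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: create a spool directory below a root directory
# and return its path. Assumption: the spool name is supplied by the
# caller rather than read from $config.
sub make_spool_dir {
    my ($root, $spool_name) = @_;
    my $dir = File::Spec->catdir($root, $spool_name);
    make_path($dir) unless -d $dir;   # also creates missing parents
    return $dir;
}

my $spool = make_spool_dir(File::Spec->tmpdir, 'alvis-spool-example');
print "Spool directory: $spool\n";
```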
Alvis web site: http://www.alvis.info
Thierry Hamon <thierry.hamon@lipn.univ-paris13.fr>
Copyright (C) 2007 by Thierry Hamon
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.