Data::Presenter - Reformat database reports
This document refers to version 1.03 of Data::Presenter, which consists of Data::Presenter.pm and various packages subclassed thereunder, most notably Data::Presenter::Combo.pm and its subclasses Data::Presenter::Combo::Intersect.pm and Data::Presenter::Combo::Union.pm. This version was released February 10, 2008.
use Data::Presenter;
use Data::Presenter::[Package1]; # example: use Data::Presenter::Census
our (@fields, %parameters, $index);
$configfile = 'fields.XXX.data';
do $configfile;
$dp1 = Data::Presenter::[Package1]->new(
$sourcefile, \@fields,\%parameters, $index
);
$data_count = $dp1->get_data_count();
$dp1->print_data_count();
$keysref = $dp1->get_keys();
$seenref = $dp1->get_keys_seen();
$dp1->print_to_screen();
$dp1->print_to_file($outputfile);
$dp1->print_with_delimiter($outputfile, $delimiter);
$dp1->full_report($outputfile);
$dp1->select_rows($column, $relation, \@choices);
$sorted_data = $dp1->sort_by_column(\@columns_selected);
$seen_hash_ref = $dp1->seen_one_column($column);
$dp1->writeformat(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
);
$dp1->writeformat_plus_header(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
title => $title,
);
%reprocessing_info = (
lastname => 17,
firstname => 15,
);
$dp1->writeformat_with_reprocessing(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
reprocess => \%reprocessing_info,
);
$dp1->writeformat_deluxe(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
title => $title,
reprocess => \%reprocessing_info,
);
$dp1->writedelimited(
sorted => $sorted_data,
file => $outputfile,
delimiter => $delimiter,
);
$dp1->writedelimited_plus_header(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
);
@reprocessing_info = qw( instructor timeslot room );
$dp1->writedelimited_with_reprocessing(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
reprocess => \@reprocessing_info,
);
$dp1->writedelimited_deluxe(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
reprocess => \@reprocessing_info,
);
$dp1->writeHTML(
sorted => $sorted_data,
columns => \@columns_selected,
file => 'somename.html',
title => $title,
);
Data::Presenter::Combo objects:
use Data::Presenter;
use Data::Presenter::[Package1]; # example: use Data::Presenter::Census
use Data::Presenter::[Package2]; # example: use Data::Presenter::Medinsure
our (@fields, %parameters, $index);
$configfile = 'fields.XXX.data';
do $configfile;
$dp1 = Data::Presenter::[Package1]->new(
$sourcefile, \@fields,\%parameters, $index
);
# different source file and configuration file
$configfile = 'fields.YYY.data';
do $configfile;
$dp2 = Data::Presenter::[Package2]->new(
$sourcefile, \@fields,\%parameters, $index);
@objects = ($dp1, $dp2);
$dpC = Data::Presenter::Combo::Intersect->new(\@objects);
$dpC = Data::Presenter::Combo::Union->new(\@objects);
If you have not used Data::Presenter prior to version 1.0, skip this section.
writeformat()-Family of Methods Now Takes List of Key-Value Pairs
Since the last publicly available version of Data::Presenter (0.68), the
interface to nine of its public methods has been changed. Previously, methods
in the writeformat()-family of methods took a list of arguments which had
to be provided in a very specific order. For example, writeformat_deluxe()
took five arguments:
$dp1->writeformat_deluxe(
$sorted_data,
\@columns_selected,
$outputfile,
$title,
\%reprocessing_info
);
As the number of elements in the list of arguments increases, it becomes more
difficult to remember the order in which they must be passed. At a certain
point it becomes easier to pass the arguments in the form of key-value pairs.
As long as each pair is correctly specified, the order of the pairs no longer
matters. writeformat_deluxe(), for example, now has this interface:
$dp1->writeformat_deluxe(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
title => $title,
reprocess => \%reprocessing_info,
);
Please study the "SYNOPSIS" above to see how to revise your calls to
methods with writeformat, writedelimited or writeHTML in their names.
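The benefit of key-value arguments can be seen in a short, self-contained sketch. The helper check_named_args() below is hypothetical and is not taken from Data::Presenter's source; it only illustrates how a method might accept pairs in any order and report a missing one by name.

```perl
use strict;
use warnings;

# Hypothetical validator (an assumption for illustration, not module code):
# confirm that every required key is present, whatever order the pairs
# were passed in.
sub check_named_args {
    my ($required, %args) = @_;
    my @missing = grep { !exists $args{$_} } @{$required};
    die "Missing argument(s): @missing\n" if @missing;
    return \%args;
}

# The pairs are deliberately passed in a different order than in the SYNOPSIS.
my $args = check_named_args(
    [qw(sorted columns file)],
    file    => 'census01.txt',
    columns => [qw(lastname firstname cno)],
    sorted  => {},
);
print "Output file: $args->{file}\n";
```

Because each value is looked up by key, a caller who omits a pair gets an error naming that pair rather than a silently shifted argument list.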
$index in Data::Presenter::[Package1]::_init()
Data::Presenter is used by writing and using a subclass in which a new object
is created. Each such subclass must hold an _init() method and each such
_init() method must accomplish certain tasks. One of these tasks is to
store the value of $index (found in the configuration file) in the object
being created. In versions 0.68 and earlier, the code which did this looked
like this:
$data{'index'} = [$index];
In other words, $index was not directly assigned to the hash holding the
Data::Presenter::[Package1] object's data. Instead, a reference to a
one-element array holding $index was passed.
This has now been simplified:
$data{'index'} = $index;
In other words, simply assign $index; no reference is needed. See the
sample packages included under the t/ directory in this distribution for a
live presentation of this change.
Data::Presenter requires Perl 5.6 or later. The module and its test suite require the following modules from CPAN:
List::Compare
By the same author as Data::Presenter: http://search.cpan.org/dist/List-Compare.
IO::Capture
Used only in the test suite to capture output printed to screen by Data::Presenter methods. By Mark Reynolds and Jon Morgan. http://search.cpan.org/dist/IO-Capture.
IO::Capture::Extended
Used only in the test suite to capture output printed to screen by Data::Presenter methods. By the same author as Data::Presenter. Has IO::Capture (above) as prerequisite. http://search.cpan.org/dist/IO-Capture-Extended.
Tie::File
Used only in the test suite to validate text printed to files by Data::Presenter methods. By Mark-Jason Dominus. Distributed with Perl since 5.7.3; otherwise, available from CPAN: http://search.cpan.org/dist/Tie-File.
Each of the prerequisites is pure Perl and should install with the cpan shell by typing 'y' at the prompts as needed.
Data::Presenter is an object-oriented module useful for the reformatting of already formatted text files such as reports generated by database programs. If the data can be represented by a row-column matrix, where for each data entry (row):
then the data structure is suitable for manipulation by Data::Presenter. In Perl terms, if the data can be represented by a hash of arrays, it is suitable for reformatting with Data::Presenter.
Data::Presenter can be used to output some fields (columns) from a database while excluding others (see "sort_by_column()" below). It can also be used to select certain entries (rows) from the database for output while excluding other entries (see "select_rows()" below).
In addition, if a user has two or more database reports, each of which has the same field serving as an index for the data, then it is possible to construct either a:
Whichever flavor of Data::Presenter::Combo object the user creates, the module guarantees that each field (column) found in any of the source databases appears once and once only in the Combo object.
Data::Presenter is not a database module per se, nor is it an interface to databases in the manner of DBI. It cannot be used to enter data into a database, nor can it be used to modify or delete data. Data::Presenter operates on reports generated from databases and is designed for the user who:
Data::Presenter is most appropriate in situations where the user either has no access to (or chooses not to use) commercial desktop database programs such as Microsoft Access(r) or open source database programs such as MySQL(r). Data::Presenter's installation and preparation require moderate knowledge of Perl, but the actual running of Data::Presenter scripts can be delegated to someone with less knowledge of Perl.
Administrator
The individual in a workplace responsible for the installation of Data::Presenter on the system or network, analysis of sources, preparation of Data::Presenter configuration files and preparation of Data::Presenter subclass packages other than Data::Presenter::Combo and its subclasses. (Cf. "Operator".)
Entry
A row in the source containing the values of the fields for one particular item.
Field
A column in the source containing a value for each entry.
Index
The column in the source whose values uniquely identify each entry in the source. Also referred to as ''unique ID.'' (In the current implementation of Data::Presenter, an index must be a strictly numerical value.)
Index Field
The column in the source containing a unique value ("index") for each entry.
Metadata
Entries in the Data::Presenter object's data structure which hold information prepared by the administrator about the data structure and output parameters.
In the current version of Data::Presenter, metadata is extracted from the
variables @fields, %parameters and $index found in the configuration
file fields.XXX.data. The metadata is first stored in package variables in
the invoking Data::Presenter subclass package and then entered into the
Data::Presenter object as hash entries keyed by 'fields', 'parameters'
and 'index', respectively. (The word 'options' has also been reserved for
future use as the key of a metadata entry in the object's data structure.)
Object's Current Data Structure
Non-metadata entries found in the Data::Presenter object at the point a particular selection, sorting or output method is called.
The object's current data structure may be thought of as the result of the following calculations:
construct a Data::Presenter::[Package1] object
less: entries excluded by application of selection criteria found
      in select_rows()
less: metadata entries in object keyed by 'fields', 'parameters' or
      'index'
result: object's current data structure
Operator
The individual in a workplace responsible for running a Data::Presenter script, including:
Source
A report, typically saved in the form of a text file, generated by a database program which presents data in a row-column format. The source may also contain other information such as page headers and footers and table headers and footers. Also referred to herein as ''source report,'' ''source file'' or ''database source report.''
Sample files are included in the archive file in which this documentation is found. Three source files, census.txt, medinsure.txt and hair.txt, are included, as are the corresponding Data::Presenter subclass packages (Census.pm, Medinsure.pm and Hair.pm) and configuration files (fields.census.data, fields.medinsure.data and fields.hair.data).
This section addresses those aspects of the usage of Data::Presenter which must be implemented by the administrator:
If Data::Presenter has already been properly configured by your administrator and you are simply concerned with using Data::Presenter to generate reports, you may skip ahead to "USAGE: Operator".
Data::Presenter installs in the same way as other Perl extensions available from CPAN: either automatically via the CPAN shell or manually with these commands:
% gunzip Data-Presenter-1.03.tar.gz
% tar xf Data-Presenter-1.03.tar
% cd Data-Presenter-1.03
% perl Makefile.PL
% make
% make test
% make install
This will install the following directory tree in your ''site perl'' directory, i.e., a directory such as /usr/local/lib/perl5/site_perl/5.8.7/:
Data/
    Presenter.pm
    Presenter/
        Combo.pm
        Combo/
            Intersect.pm
            Union.pm
Once the Administrator has installed Data::Presenter, she must then decide
which location on the network will be used to hold Data::Presenter::[Package1]
subclass packages, where [Package1] is a Data::Presenter subclass in which a
new object will be created. That location could be the Data/Presenter/
directory listed above or it could be some other location which users can
access in a Perl program via the use lib () pragma.
The Administrator must also decide on a location on the network which will be used to hold the Data::Presenter configuration files -- one for each data source to be used by Data::Presenter. By convention, each configuration file is named by some variation on the theme of fields.XXX.data.
Suppose, for instance, that /usr/share/datapresenter/ is the directory created to hold Data::Presenter-related files accessible to all users. Suppose, further, that in this business two database reports, census and medinsure, will be processed via Data::Presenter. The Administrator would then create a directory tree like this:
/usr/share/datapresenter/
    Data/
        Presenter/
            Census.pm
            Medinsure.pm
    config/
        fields.census.data
        fields.medinsure.data
The Administrator could also create a directory called source/ to hold the source files to be processed with Data::Presenter, and she could also create a directory called results/ to hold files created via Data::Presenter -- but neither of these is strictly necessary.
Successful use of Data::Presenter assumes that the administrator is able to analyze a report generated from a database, distinguish key structural features of such a source report and write Perl code which will extract the most relevant information from the report. A complete discussion of these issues is beyond the scope of this documentation. What follows is a taste of the issues involved.
Structural features of a database report are likely to include the following: report headers, page headers, table headers, data entries reporting values of a variety of fields, page footers and report footers. Of these features, data entries and table headers are most important from the perspective of Data::Presenter. The data entries are the data which will actually be manipulated by Data::Presenter, while table headers will provide the administrator guidance when writing the configuration file fields.XXX.data. Report and page headers and footers are generally irrelevant and will be stripped out.
For example, let us suppose that a portion of a client census looks like this:
CLIENTS - AUGUST 1, 2001 - C O N F I D E N T I A L PAGE 1
SHRED WHEN NEW LIST IS RECEIVED!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 LAST NAME      FIRST NAM  C. NO  BIRTH

 HERNANDEZ      HECTOR     456791 1963-07-16
 VASQUEZ        ADALBERTO  456792 1973-10-02
 WASHINGTON     ALBERT     906786 1953-03-31
The first two lines are probably report or page headers and should be stripped out. The third line consists of table column names and may give clues as to how fields.census.data should be written. The fourth line is blank and should be stripped out. The next three lines constitute actual rows of data; these will be the focus of Data::Presenter.
A moderately experienced Perl programmer will look at this report and say, ''Each row of data can be stored in a Perl array. If each client's 'c. no' is unique, then it can be used as the key of an entry in a Perl hash where the entry's value is a reference to the array just mentioned. A hash of arrays -- I can use Data::Presenter!''
Our Perl programmer would then say, ''I'll open a filehandle to the source
file and read the file line-by-line into a while loop. I'll write lines
beginning next if to bypass the headers and the blank lines.'' For
instance:
next if (/^CLIENTS/);
next if (/^SHRED/);
next if (/^\s?LAST\sNAME/);
next if (/^$/);
Our Perl hacker will then say, ''I could try to write regular expressions to
handle the rows of data. But since the data appears to be strictly columnar,
I'll probably be better off using the Perl unpack function. I'll use the
column headers to suggest names for my variables.'' For instance:
my ($lastname, $firstname, $cno, $datebirth) =
unpack("x A14 x A10 x A6 x A10", $_);
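Putting the two steps together, a minimal sketch of the whole parsing loop might look like the following. It assumes the column widths implied by the unpack template above, and it builds its sample lines in memory with a hypothetical report_row() helper rather than reading census.txt from a filehandle.

```perl
use strict;
use warnings;

# Build fixed-width sample lines that match the unpack template below.
# (Hypothetical stand-in for lines read from the source file.)
sub report_row { return sprintf ' %-14s %-10s %-6s %-10s', @_ }

my @report_lines = (
    'CLIENTS - AUGUST 1, 2001 - C O N F I D E N T I A L PAGE 1',
    'SHRED WHEN NEW LIST IS RECEIVED!!!!',
    report_row('LAST NAME', 'FIRST NAM', 'C. NO', 'BIRTH'),
    q{},
    report_row('HERNANDEZ',  'HECTOR',    456791, '1963-07-16'),
    report_row('VASQUEZ',    'ADALBERTO', 456792, '1973-10-02'),
    report_row('WASHINGTON', 'ALBERT',    906786, '1953-03-31'),
);

my %data;    # hash of arrays keyed by client number
for my $line (@report_lines) {
    # Bypass report headers, the table header and blank lines.
    next if $line =~ /^CLIENTS/;
    next if $line =~ /^SHRED/;
    next if $line =~ /^\s?LAST\sNAME/;
    next if $line =~ /^$/;

    # 'x' skips one byte; 'A<n>' reads n bytes and strips trailing spaces.
    my ($lastname, $firstname, $cno, $datebirth) =
        unpack 'x A14 x A10 x A6 x A10', $line;
    $data{$cno} = [ $lastname, $firstname, $cno, $datebirth ];
}

for my $cno (sort { $a <=> $b } keys %data) {
    print join(';', @{ $data{$cno} }), "\n";
}
```

Keying the hash on 'c. no' only works if that value is unique for each client; a duplicate would silently overwrite the earlier record.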
Having provided a taste of what to do with the rows of the data structure, we now turn to an analysis of the columns of the structure.
For each data source, the administrator must prepare a configuration file,
typically named as some variation on fields.XXX.data.
fields.XXX.data consists of three Perl variables:
@fields, %parameters and $index.
@fields
@fields has one element for each column (field) that appears
in the data source. The elements of @fields must appear in exactly the
same order as they appear in the data source. Each element should be a
single Perl word, i.e., consist solely of letters, numerals or the
underscore character (_).
In the sample configuration file fields.census.data included with this documentation, this variable reads:
@fields = qw(
lastname firstname cno unit ward dateadmission datebirth
);
In another sample configuration file, fields.medinsure.data, this variable reads:
@fields = qw(lastname firstname cno stateid medicare medicaid);
%parameters
%parameters is a bit trickier. There must be one entry
in %parameters for each element in @fields. Hence, there is one entry
in %parameters for each column (field) in the data source. However, the
keys of %parameters are spelled $fields[0], $fields[1], and so on
through the highest index number in @fields (which is 1 less than the
number of elements in @fields). Using the example above, we can begin to
construct %parameters as follows:
%parameters = (
$fields[0] =>
$fields[1] =>
$fields[2] =>
$fields[3] =>
$fields[4] =>
$fields[5] =>
$fields[6] =>
);
The value for each entry in %parameters consists of an array of 4 elements
specified as follows:
A positive integer specifying the maximum number of characters which may be
displayed in any output format for the given column (field). In the example
above, we will specify that column 'lastname' ($fields[0]) may have a
maximum of 14 characters.
$fields[0] => [14,
An upper-case letter 'U' or 'D' (for 'Up' or 'Down') enclosed in single quotation marks indicating whether the given column should be sorted in ascending or descending order. In the example above, 'lastname' sorts in ascending order.
$fields[0] => [14, 'U',
A lower-case letter 'a', 'n' or 's' enclosed in single quotation marks indicating whether the given column should be sorted alphabetically (case-insensitive), numerically or ASCII-betically (case-sensitive). In the example above, 'lastname' sorts in alphabetical order. (Data::Presenter per se does not yet have a facility for sorting in date or time order. If dates are entered as pure numerals in 'MMDD' order, they may be sorted numerically. If they are entered in the MySQL standard format 'YYYY-MM-DD', they may be sorted alphabetically.)
$fields[0] => [14, 'U', 'a',
A string enclosed in single quotation marks to be used as a column header
when the data is outputted in some table-like format such as a Perl format
with a header or an HTML table. The administrator may choose to use exactly
the same words here that were used in @fields, but a more natural language
string is probably preferable. In the example above, the first column will
carry the title 'Last Name' in any output.
$fields[0] => [14, 'U', 'a', 'Last Name'],
Using the same example as previously, we can now complete %parameters as:
%parameters = (
$fields[0] => [14, 'U', 'a', 'Last Name'],
$fields[1] => [10, 'U', 'a', 'First Name'],
$fields[2] => [ 7, 'U', 'n', 'C No.'],
$fields[3] => [ 6, 'U', 'a', 'Unit'],
$fields[4] => [ 4, 'U', 'n', 'Ward'],
$fields[5] => [10, 'U', 'a', 'Date of Admission'],
$fields[6] => [10, 'U', 'a', 'Date of Birth'],
);
$index
$index is the simplest element of fields.XXX.data. It is the
array index for the entry in @fields which describes the field in the data
source whose values uniquely identify each entry in the source. If, in the
example above, 'cno' is the index field for the data in
census.txt, then $index is 2. (Remember that Perl starts counting
array elements with 0.)
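Since the three variables must agree with one another, an administrator may find it useful to check a configuration before handing it to a constructor. The sketch below repeats the census configuration described above and runs it through config_problems(), a hypothetical helper that is not part of Data::Presenter.

```perl
use strict;
use warnings;

our (@fields, %parameters, $index);

@fields = qw(lastname firstname cno unit ward dateadmission datebirth);
%parameters = (
    $fields[0] => [14, 'U', 'a', 'Last Name'],
    $fields[1] => [10, 'U', 'a', 'First Name'],
    $fields[2] => [ 7, 'U', 'n', 'C No.'],
    $fields[3] => [ 6, 'U', 'a', 'Unit'],
    $fields[4] => [ 4, 'U', 'n', 'Ward'],
    $fields[5] => [10, 'U', 'a', 'Date of Admission'],
    $fields[6] => [10, 'U', 'a', 'Date of Birth'],
);
$index = 2;

# Hypothetical checker (an assumption, not module code): returns a list of
# human-readable problems, which is empty when the configuration is consistent.
sub config_problems {
    my ($fields, $parameters, $index) = @_;
    my @problems;
    for my $field (@{$fields}) {
        my $spec = $parameters->{$field};
        if (ref($spec) ne 'ARRAY' || @{$spec} != 4) {
            push @problems, "$field: expected a 4-element array";
            next;
        }
        my ($width, $direction, $type, $title) = @{$spec};
        push @problems, "$field: width must be a positive integer"
            unless $width =~ /^[1-9]\d*$/;
        push @problems, "$field: direction must be 'U' or 'D'"
            unless $direction =~ /^[UD]$/;
        push @problems, "$field: sort type must be 'a', 'n' or 's'"
            unless $type =~ /^[ans]$/;
        push @problems, "$field: column header must not be empty"
            unless defined $title && length $title;
    }
    push @problems, 'index must be a valid subscript of @fields'
        unless defined $index && $index =~ /^\d+$/ && $index < @{$fields};
    return @problems;
}

my @problems = config_problems(\@fields, \%parameters, $index);
print @problems ? join("\n", @problems) . "\n"
                : "Configuration looks consistent\n";
```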
Data::Presenter.pm, Data::Presenter::Combo.pm, Data::Presenter::Combo::Intersect.pm and Data::Presenter::Combo::Union.pm are ready to use ''as is.'' They require no further modification by the administrator. However, each report from which the operator draws data needs to have a package subclassed beneath Data::Presenter and written specifically for that report by the administrator.
Indeed, no object is ever constructed directly from Data::Presenter. All objects are constructed from subclasses of Data::Presenter.
Hence:
$dp1 = Data::Presenter->new( # INCORRECT
$source, \@fields, \%parameters, $index);
$dp1 = Data::Presenter::[Package1]->new( # CORRECT
$source, \@fields, \%parameters, $index);
Data::Presenter::[Package1], however, does not contain a new() method. It
inherits Data::Presenter's new() method -- which then turns around and
delegates the task of populating the object with data to
Data::Presenter::[Package1]'s _init() method!
This _init() method must be customized by the administrator to properly
handle the specific features of each source file. This requires that the
administrator be able to write a Perl script to 'clean up' the source file so
that only lines containing meaningful data are written to the Data::Presenter
object. (See "Analysis of Source Files" above.) With that in mind, a
Data::Presenter::[Package1] package must always include the following
methods:
_init()
This method is called from within the constructor and is used to populate the
hash which is blessed into the new object. It opens a filehandle to the
source file and typically reads that source file line-by-line via a Perl
while loop. Perl techniques and functions such as regular expressions,
split and unpack are used to populate a hash of arrays and to strip out
lines in the data source not needed in the object. Should the administrator
need to ''munge'' any of the incoming data so that it appears in a uniform
format (e.g., '2001-07-02' rather than '7/2/2001' or '07/02/2001'), the
administrator should write appropriate code within _init() or in a separate
module imported into the main package. Each element of each array used to
store a data record must have a defined value. undef is not permitted;
assign an empty string to the element instead. A reference to this hash of
arrays is returned to the constructor, which blesses it into the object.
_extract_rows()
This method is called from within the Data::Presenter select_rows() method.
In much the same manner as _init(), it permits the administrator to
''munge'' operator-typed data to achieve a uniform format.
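As an illustration of the munging described for _init(), the following sketch normalizes dates and replaces undefined values with empty strings. Both helpers, normalize_date() and defined_record(), are hypothetical names chosen for this example, and the date conversion assumes month/day/year order, as in the '7/2/2001' example above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'M/D/YYYY' or 'MM/DD/YYYY' into 'YYYY-MM-DD'.
# Values that do not match that shape are returned unchanged.
sub normalize_date {
    my ($raw) = @_;
    if ($raw =~ m{^(\d{1,2})/(\d{1,2})/(\d{4})$}) {
        return sprintf '%04d-%02d-%02d', $3, $1, $2;
    }
    return $raw;
}

# Hypothetical helper: every element of a record must be defined, so
# undef is replaced by an empty string.
sub defined_record {
    my @record = map { defined $_ ? $_ : q{} } @_;
    return \@record;
}

my $record = defined_record(
    'HERNANDEZ', 'HECTOR', 456791, undef, normalize_date('7/2/2001'),
);
print join(';', @{$record}), "\n";
```

Keeping such helpers in separate subroutines makes it easier to reuse the same normalization in _extract_rows(), where the operator's typed values need the same treatment as the stored data.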
The packages Data::Presenter::Census and Data::Presenter::Medinsure
found in the t/ directory in this distribution provide examples of
_init() and _extract_rows. Search for the lines of code which read:
# DATA MUNGING STARTS HERE
# DATA MUNGING ENDS HERE
Here is a simple example of data munging. In the sample configuration file
fields.census.data, all elements of @fields are entered entirely in
lower-case. Hence, it would be advisable to transform the operator-specified
content of $column to all lower-case so that the program does not fail
simply because an operator types an upper-case letter. See _extract_rows()
in the Data::Presenter::Census package included with this documentation for
an example.
Sample file Data::Presenter::Medinsure contains an example of a subroutine
written to clean up repetitive coding within the data munging section.
Search for sub _prepare_record.
Once the administrator has installed Data::Presenter and completed the preparation of configuration files and Data::Presenter subclass packages, the administrator may turn over to the operator the job of selecting particular source files, output formats and particular entries and fields from within the source files.
Using the hospital census example included with this documentation, the operator would construct a Data::Presenter::Census object with the following code:
use Data::Presenter;
use lib ("/usr/share/datapresenter");
use Data::Presenter::Census;
our @fields = ();
our %parameters = ();
our $index = q{};
my $sourcefile = 'census.txt';
my $configdir = "/usr/share/datapresenter/config";
my $configfile = "$configdir/fields.census.data";
do $configfile;
new()
my $dp1 = Data::Presenter::Census->new(
$sourcefile, \@fields, \%parameters, $index);
get_data_count()
Returns the current number of data entries in the specified Data::Presenter object. This number does not include those elements in the object whose keys are reserved words. This method takes no arguments and returns one numerical scalar.
my $data_count = $dp1->get_data_count();
print 'Data count is now: ', $data_count, "\n";
print_data_count()
Prints the current data count preceded by ''Current data count: ''. This number does not include those elements in the object whose keys are reserved words. This method takes no arguments and returns no values.
$dp1->print_data_count();
get_keys()
Returns a reference to an array whose elements are an ASCII-betically sorted list of keys to the hash blessed into the Data::Presenter::[Package1] object. This list does not include those elements whose keys are reserved words. This method takes no arguments and returns only the array reference described.
my $keysref = $dp1->get_keys();
print "Current data points are: @$keysref\n";
get_keys_seen()
Returns a reference to a hash whose elements are key-value pairs where the key is the key of an element blessed into the Data::Presenter::[Package1] object and the value is 1, indicating that the key has been seen (a 'seen-hash'). This list does not include those elements whose keys are reserved words. This method takes no arguments and returns only the hash reference described.
my $seenref = $dp1->get_keys_seen();
print "Current data points are: ";
print "$_ " foreach (sort keys %{$seenref});
print "\n";
seen_one_column()
Takes as argument a single string which is the name of one of the fields
listed in @fields in the configuration file. Returns a reference to a hash
whose elements are keyed by the entries for that field in the data source and
whose values are the number of times each entry was seen in the data.
For example, if the data consisted of this:
HERNANDEZ HECTOR 1963-08-01 456791
VASQUEZ ADALBERTO 1973-08-17 786792
VASQUEZ ALBERTO 1953-02-28 906786
where the left-most column was described in @fields as lastname, then:
$seenref = $dp1->seen_one_column('lastname');
and $seenref would hold:
{
HERNANDEZ => 1,
VASQUEZ => 2,
}
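The counting behind such a result can be sketched in plain Perl. The function count_column_values() below is a hypothetical stand-in that works on an ordinary hash of arrays rather than on a Data::Presenter object, and the field order in @fields is an assumption matching the three sample rows.

```perl
use strict;
use warnings;

my @fields = qw(lastname firstname datebirth cno);
my %data = (
    456791 => [ 'HERNANDEZ', 'HECTOR',    '1963-08-01', 456791 ],
    786792 => [ 'VASQUEZ',   'ADALBERTO', '1973-08-17', 786792 ],
    906786 => [ 'VASQUEZ',   'ALBERTO',   '1953-02-28', 906786 ],
);

# Hypothetical stand-in for seen_one_column(): count how often each value
# occurs in the named column.
sub count_column_values {
    my ($data, $fields, $column) = @_;
    my ($position) = grep { $fields->[$_] eq $column } 0 .. $#{$fields};
    die "Unknown column '$column'\n" unless defined $position;
    my %seen;
    $seen{ $_->[$position] }++ for values %{$data};
    return \%seen;
}

my $seenref = count_column_values(\%data, \@fields, 'lastname');
print "$_ => $seenref->{$_}\n" for sort keys %{$seenref};
```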
select_rows()
select_rows() enables the operator to establish criteria
by which specific entries from the data can be selected for output. It does
so not by creating a new object but by striking out entries in the
object's current data structure which do
not meet the selection criteria.
If the operator were using Perl as an interface to a true database program, selection of entries would most likely be handled by a module such as DBI and an SQL-like query. In that case, it would be possible to write complex selection queries which operate on more than one field at a time such as:
select rows where 'datebirth' is before 01/01/1960
AND 'lastname' equals 'Vasquez'
# (NOTE: This is generic code,
# not true Perl or Perl DBI code.)
Complex selection queries are not yet possible in Data::Presenter. However, you could accomplish much the same objective with a series of simple selection queries that operate on only one field at a time,
select rows where 'datebirth' is before 01/01/1960
then
select rows where 'lastname' equals 'Vasquez'
each of which narrows the selection criteria.
How do we accomplish this within Data::Presenter? For each selection query,
the operator must define 3 variables: $column, $relation and
@choices. These variables are passed to select_rows(), which in turn
passes them to certain internal subroutines where their values are
manipulated as follows.
$column
$column must be an element of @fields found in the
configuration file.
$relation
$relation expresses the verb part of the selection query, i.e.,
relations such as equals, is less than, >=, after and so
forth. In an attempt to add natural language flexibility to the selection
query, Data::Presenter permits the operator to enter a wide variety of
mathematical and English expressions here:
Equality:
'eq', 'equals', 'is', 'is equal to', 'is a member of',
'is part of', '=', '=='
Non-equality:
'ne', 'is not', 'is not equal to', 'is not a member of',
'is not part of', 'is less than or greater than',
'is less than or more than', 'is greater than or less than',
'is more than or less than', 'does not equal', 'not',
'not equal to ', 'not equals', '!=', '! =', '!==', '! =='
Less than:
'<', 'lt', 'is less than', 'is fewer than', 'before'
Greater than:
'>', 'gt', 'is more than', 'is greater than', 'after'
Less than or equal to:
'<=', 'le', 'is less than or equal to',
'is fewer than or equal to', 'on or before', 'before or on'
Greater than or equal to:
'>=', 'ge', 'is more than or equal to', 'is greater than or equal to',
'on or after', 'after or on'
As long as the operator selects a string from the category desired, Data::Presenter will convert it internally in an appropriate manner.
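One plausible way to perform such a conversion is a lookup table that maps every accepted phrase to a canonical operator name. The sketch below is an assumption about technique, not a copy of Data::Presenter's internals; the table %canonical covers only a handful of the strings listed above, and canonical_relation() is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical lookup table: each accepted phrase maps to one canonical
# operator name. Only a sample of the accepted strings is included.
my %canonical = (
    (map { $_ => 'eq' } 'eq', 'equals', 'is', 'is equal to', '=', '=='),
    (map { $_ => 'ne' } 'is not', 'does not equal', '!='),
    (map { $_ => 'lt' } '<', 'lt', 'is less than', 'before'),
    (map { $_ => 'gt' } '>', 'gt', 'is more than', 'after'),
    (map { $_ => 'le' } '<=', 'le', 'on or before'),
    (map { $_ => 'ge' } '>=', 'ge', 'on or after'),
);

# Hypothetical helper: translate an operator-typed relation or die with a
# message naming the string that was not recognized.
sub canonical_relation {
    my ($relation) = @_;
    my $op = $canonical{$relation};
    die "Unrecognized relation '$relation'\n" unless defined $op;
    return $op;
}

print canonical_relation('before'), "\n";
print canonical_relation('is equal to'), "\n";
```

A table of this kind keeps the natural-language vocabulary in one place, so adding a synonym does not require touching the comparison logic.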
@choices
If the relationship being tested is one of equality or non-equality, then the operator may enter more than one value here, any one of which may satisfy the selection criterion.
my ($column, $relation, @choices);
$column = 'lastname';
$relation = 'is';
@choices = ('Smith', 'Jones');
$dp1->select_rows($column, $relation, \@choices);
If, however, the relationship being tested is one of 'less than', 'greater than', etc., then the operator should enter only one value, as the value is establishing a limit above or below which the selection criterion will not be met.
$column = 'datebirth';
$relation = 'before';
@choices = ('01/01/1970');
$dp1->select_rows($column, $relation, \@choices);
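The effect of chaining two simple selections can be sketched on a plain hash of arrays. The helper strike_rows() is hypothetical and takes a code reference in place of $relation and @choices; the sample dates are assumed to be stored as 'YYYY-MM-DD' strings, so a string comparison orders them correctly.

```perl
use strict;
use warnings;

my @fields = qw(lastname firstname cno datebirth);
my %data = (
    456787 => [ 'VASQUEZ',   'JORGE',     456787, '1956-01-13' ],
    456791 => [ 'HERNANDEZ', 'HECTOR',    456791, '1963-07-16' ],
    456792 => [ 'VASQUEZ',   'ADALBERTO', 456792, '1973-10-02' ],
);

# Hypothetical helper: delete every entry whose value in $column fails
# $test, and return the number of entries that remain.
sub strike_rows {
    my ($data, $fields, $column, $test) = @_;
    my ($position) = grep { $fields->[$_] eq $column } 0 .. $#{$fields};
    die "Unknown column '$column'\n" unless defined $position;
    for my $key (keys %{$data}) {
        delete $data->{$key} unless $test->( $data->{$key}[$position] );
    }
    return scalar keys %{$data};
}

# First narrow by date of birth, then by last name.
strike_rows(\%data, \@fields, 'datebirth', sub { $_[0] lt '1960-01-01' });
strike_rows(\%data, \@fields, 'lastname',  sub { $_[0] eq 'VASQUEZ' });

print join(' ', sort keys %data), "\n";
```

Because each call removes entries from the same structure, the two calls together behave like a single query joined by AND.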
sort_by_column()
sort_by_column() takes only 1 argument: a reference
to an array consisting of the fields the operator wishes to present in the
final output, listed in the order in which those fields should be sorted.
All elements of this array must be elements in @fields. The index field
must always be included as one of the columns selected, though it may be
placed last if it is not intrinsically important in the final output.
sort_by_column() returns a reference to a hash of appropriately sorted data
which will be used as input to Data::Presenter methods such as
writeformat(), writeformat_plus_header() and writeHTML().
To illustrate:
my @columns_selected = ('lastname', 'firstname', 'datebirth', 'cno');
$sorted_data = $dp1->sort_by_column(\@columns_selected);
Suppose that the operator fails to include the index column in
@columns_selected. This risks having two or more identical data entries,
only the last of which would appear in the final output. As a safety
precaution, sort_by_column() throws a warning and places duplicate entries
in a text file called dupes.txt.
Note: If you want your output to report only selected entries from the
source, and if you want to apply one of the complex Data::Presenter output
methods which require application of sort_by_column(), call select_rows
before calling sort_by_column(). Otherwise your report may contain
blank lines.
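To make the role of the sort parameters concrete, here is a minimal sketch of a multi-column sort driven by the direction ('U' or 'D') and type ('a', 'n' or 's') values from %parameters. The function sorted_keys() is hypothetical; it returns an ordered list of keys rather than the hash reference that sort_by_column() returns, and it is not taken from the module's source.

```perl
use strict;
use warnings;

my @fields = qw(lastname firstname cno);
my %parameters = (
    lastname  => [14, 'U', 'a', 'Last Name'],
    firstname => [10, 'U', 'a', 'First Name'],
    cno       => [ 7, 'U', 'n', 'C No.'],
);
my %data = (
    456792 => [ 'VASQUEZ',   'ADALBERTO', 456792 ],
    456787 => [ 'VASQUEZ',   'JORGE',     456787 ],
    456791 => [ 'HERNANDEZ', 'HECTOR',    456791 ],
);

# Map each field name to its position within a record.
my %position;
@position{@fields} = 0 .. $#fields;

# Hypothetical helper: order the keys of %$data by the selected columns,
# moving on to the next column only when the current one compares equal.
sub sorted_keys {
    my ($data, $parameters, $columns) = @_;
    return sort {
        my $result = 0;
        for my $column (@{$columns}) {
            my (undef, $direction, $type) = @{ $parameters->{$column} };
            my $left  = $data->{$a}[ $position{$column} ];
            my $right = $data->{$b}[ $position{$column} ];
            $result = $type eq 'n' ? ($left <=> $right)
                    : $type eq 'a' ? (lc($left) cmp lc($right))
                    :                ($left cmp $right);
            $result = -$result if $direction eq 'D';
            last if $result;
        }
        $result;
    } keys %{$data};
}

my @order = sorted_keys(\%data, \%parameters, [qw(lastname firstname cno)]);
print join(';', @{ $data{$_} }), "\n" for @order;
```

Including the index column last in the list guarantees a deterministic order even when every other selected column is identical for two entries.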
print_to_screen()
print_to_screen() prints to standard output (generally, the computer
monitor) a semicolon-delimited display of all entries in the object's
current data structure. It takes no arguments and returns no values.
$dp1->print_to_screen();
A typical line of output will look something like:
VASQUEZ;JORGE;456787;LAVER;0105;1986-01-17;1956-01-13;
print_to_file()
print_to_file() prints to an operator-specified file a
semicolon-delimited display of all entries in the object's current data
structure. It takes 1 argument -- the user-specified output file -- and
returns no values.
$outputfile = 'census01.txt';
$dp1->print_to_file($outputfile);
A typical line of output will look exactly like that produced by
print_to_screen().
print_with_delimiter()
print_with_delimiter(), like print_to_file(),
prints to an operator-specified file. print_with_delimiter() allows the
operator to specify the character pattern which will be used to delimit
display of all entries in the object's current data structure. It does not
print the delimiter after the final field in a particular data record. It
takes 2 arguments -- the user-specified output file and the character pattern
to be used as delimiter -- and returns no values.
$outputfile = 'delimited01.txt';
$delimiter = '|||';
$dp1->print_with_delimiter($outputfile, $delimiter);
The file created by print_with_delimiter() is designed to be used as an input
to functions such as 'Convert text to tabs' or 'Convert text to table' found
in commercial word processing programs. Such functions require delimiter
characters in the input. A typical line of output will look something like:
VASQUEZ|||JORGE|||456787|||LAVER|||0105|||1986-01-17|||1956-01-13
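The difference between the two output styles can be sketched in plain Perl. This is only an illustration of the delimiter placement described above, not the module's own code; the variable names are assumptions.

```perl
use strict;
use warnings;

my @record = qw(VASQUEZ JORGE 456787 LAVER 0105 1986-01-17 1956-01-13);

# print_to_screen()/print_to_file() style: a semicolon follows every field,
# including the last one.
my $terminated = join '', map { "$_;" } @record;

# print_with_delimiter() style: the delimiter appears only between fields.
my $separated = join '|||', @record;

print "$terminated\n$separated\n";
```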
full_report()
full_report() prints to an operator-specified file each
entry in the object's current data structure, sorted by the index and
explicitly naming each field name/field value pair. It takes 1 argument --
the user-specified output file -- and returns no values.
$outputfile = 'report01.txt';
$dp1->full_report($outputfile);
The output for a given entry will look something like:
456787
lastname VASQUEZ
firstname JORGE
cno 456787
unit LAVER
ward 0105
dateadmission 1986-01-17
datebirth 1956-01-13
writeformat()
writeformat() writes data via Perl's formline function
-- the function which internally powers Perl formats -- to an
operator-specified file. writeformat() takes a list of 3 key-value pairs:
$dp1->writeformat(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
);
sorted
The value of sorted is a hash reference which is the return value of
sort_by_column(). Hence, writeformat() can only be called once
sort_by_column() has been called.
columns
The value of columns is a reference to the array of fields in the data
source selected for presentation in the output file. It is the same variable
which is used as the argument to sort_by_column().
file
The value of file is the name of a file arbitrarily selected by the
operator to hold the output of writeformat().
Using the ''census'' example from above, the overall sequence of code needed
to use writeformat() would be:
@columns_selected = ('lastname', 'firstname', 'datebirth', 'cno');
$sorted_data = $dp1->sort_by_column(\@columns_selected);
$dp1->writeformat(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
);
Assuming $outputfile holds 'format01.txt', the result of the above call would be a file of that name containing:
HERNANDEZ HECTOR 1963-08-01 456791
VASQUEZ ADALBERTO 1973-08-17 786792
VASQUEZ ALBERTO 1953-02-28 906786
The columnar appearance of the data is governed by choices made by the
administrator within the configuration file (here, within
fields.census.data). The choice of columns themselves is controlled by
the operator via \@columns_selected.
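For readers unfamiliar with formline, here is a minimal stand-alone sketch of how a picture line and the format accumulator $^A produce aligned columns. The field widths (14, 10 and 7) and the sample rows are assumptions for illustration; in Data::Presenter the widths would come from the configuration file, and the module's internal code may differ.

```perl
use strict;
use warnings;

# Picture line: three left-justified fields of assumed widths 14, 10 and 7.
# Each field is one '@' followed by (width - 1) '<' characters.
my $picture = '@' . '<' x 13 . ' ' . '@' . '<' x 9 . ' ' . '@' . '<' x 6 . "\n";

my @rows = (
    [ 'HERNANDEZ', 'HECTOR',    '456791' ],
    [ 'VASQUEZ',   'ADALBERTO', '786792' ],
);

my $output = '';
for my $row (@rows) {
    $^A = '';                     # clear the format accumulator
    formline($picture, @$row);    # fill the picture; the result lands in $^A
    $output .= $^A;
}
$^A = '';
print $output;
```

Because every row is rendered through the same picture line, the second column starts at the same offset in each line regardless of the length of the first value.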
writeformat_plus_header()
writeformat_plus_header() writes data via
Perl formats to an operator-specified file and writes a Perl format header to
that file as well. writeformat_plus_header() takes a list of 4 key-value
pairs. Three of these pairs are the same as in writeformat(); the fourth
is:
title
title => $title,
title holds text chosen by the operator.
The complete call to writeformat_plus_header() looks like this:
@columns_selected = (
'unit', 'ward', 'lastname', 'firstname',
'datebirth', 'dateadmission', 'cno');
$sorted_data = $dp1->sort_by_column(\@columns_selected);
$dp1->writeformat_plus_header(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
title => $title,
);
and will produce a header and formatted data like this:
Hospital Census Report
Date Date of
Unit Ward Last Name First Name of Birth Admission C No.
------------------------------------------------------------------
LAVER 0105 VASQUEZ JORGE 1956-01-13 1986-01-17 456787
LAVER 0107 VASQUEZ LEONARDO 1970-15-23 1990-08-23 456788
SAMSON 0209 VASQUEZ JOAQUIN 1970-03-25 1990-11-14 456789
The wording of the column headers is governed by choices made by the administrator within the configuration file (here, within fields.census.data). If a particular word in a column header is too long to fit in the space allocated, it will be truncated.
writeformat_with_reprocessing()
writeformat_with_reprocessing() is an
advanced application of Data::Presenter and the reader may wish to skip this
section until other parts of the module have been mastered.
writeformat_with_reprocessing() permits a sophisticated administrator to
activate ''last minute'' substitutions in the strings printed out from the
format accumulator variable $^A. Suppose, for example, that a school
administrator faced the problem of scheduling classes in different classrooms
and in various time slots. Suppose further that, for ease of programming or
data entry, the time slots were identified by chronologically sequential
numbers and that instructors were identified by a unique ID built up from
their first and last names. Applying an ordinary writeformat() to such
data might show output like this:
11 Arithmetic Jones 4044 4044_11
11 Language Studies WilsonT 4054 4054_11
12 Bible Study Eliade 4068 4068_12
12 Introduction to Computers Knuth 4086 4086_12
13 Psychology Adler 4077 4077_13
13 Social Science JonesT 4044 4044_13
51 World History Wells 4052 4052_51
51 Music Appreciation WilsonW 4044 4044_51
where 11 mapped to 'Monday, 9:00 am', 12 to 'Monday, 10:00 am', 51
to 'Friday, 9:00 am' and so forth and where the fields underlying this output
were 'timeslot', 'classname', 'instructor', 'room' and 'sessionID'. While
this presentation is useful, a client might wish to have the time slots and
instructor IDs decoded for more readable output:
Monday, 9:00 Arithmetic E Jones 4044 4044_11
Monday, 9:00 Language Studies T Wilson 4054 4054_11
Monday, 10:00 Bible Study M Eliade 4068 4068_12
Monday, 10:00 Introduction to Computers D Knuth 4086 4086_12
Monday, 11:00 Psychology A Adler 4077 4077_13
Monday, 11:00 Social Science T Jones 4044 4044_13
Friday, 9:00 World History H Wells 4052 4052_51
Friday, 9:00 Music Appreciation W Wilson 4044 4044_51
Time slots coded with chronologically sequential numbers can be ordered to
sort numerically in the %parameters established in the
fields.[package1].data file corresponding to a particular
Data::Presenter::[package1]. Their human-language equivalents, however, will
not sort properly, as, for example, 'Friday' comes before 'Monday' in an
alphabetical or ASCII-betical sort. Clearly, it would be desirable to
establish the sorting order by relying on the chronologically sequential time
slots and yet have the printed output reflect more human-readable days of the
week and times. Analogously, for the instructor we might wish to display the
first initial and last name in our printed output rather than his/her ID
code.
The order in which data records appear in output is determined by
sort_by_column() before writeformat() is called. How can we preserve
this order in the final output?
Answer: After we have stored a given formed line in $^A, we reprocess
that line by calling an internal subroutine defined in the invoking class,
Data::Presenter::[package1]::_reprocessor(), which tells Perl to splice out
certain portions of the formed line and substitute more human-readable copy.
The information needed to make _reprocessor() work comes from two places.
First, from a hash passed by reference as an argument to
writeformat_with_reprocessing(). writeformat_with_reprocessing() takes
a list of four key-value pairs, the first three of which are the same as those
passed to writeformat(). The fourth key-value pair to
writeformat_with_reprocessing() is a reference to a hash whose keys are
the names of the fields in the data records where we wish to make
substitutions and whose corresponding values are the number of characters
the field will be allocated after substitution. The call to
writeformat_with_reprocessing() would therefore look like this:
%reprocessing_info = (
timeslot => 17,
instructor => 15,
);
$dp1->writeformat_with_reprocessing(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
reprocess => \%reprocessing_info,
);
Second, writeformat_with_reprocessing() takes advantage of the fact that
Data::Presenter's package global hash %reserved contains four keys --
fields, parameters, index and options -- only the first
three of which are used in Data::Presenter's constructor or sorting methods.
Early in the development of Data::Presenter the keyword options was
deliberately left unused so as to be available for future use.
The sophisticated administrator can make use of the options key to store
metadata in a variety of ways. In writing
Data::Presenter::[package1]::_init(), the administrator prepares the way
for last-minute reprocessing by creating an options key in the hash to
be blessed into the Data::Presenter::[package1]() object. The value
corresponding to the key options is itself a hash with two elements
keyed by subs and sources. If $dp1 is the object and %data
is the hash blessed into the object, then we are looking at these two
elements:
$data{options}{subs}
$data{options}{sources}
The values corresponding to these two keys are references to yet more hashes.
The hash which is the value for $data{options}{subs} has keys which
are the names of subroutines, each of which is built up from the
string reprocess_ concatenated with the name of the field to be
reprocessed, e.g.
$data{options}{subs} = {
reprocess_timeslot => 1,
reprocess_instructor => 1,
};
These field-specific internal reprocessing subroutines may be defined by the
administrator in Data::Presenter::[package1]() or they may be imported from
some other module. writeformat_with_reprocessing() verifies that these
subroutines are actually present in Data::Presenter::[package1]()
regardless of where they were originally found.
What about $data{options}{sources}? This location stores all the
original data from which substitutions are made. Example:
$data{options}{sources} = {
timeslot => {
11 => ['Monday', '9:00 am' ],
12 => ['Monday', '10:00 am' ],
13 => ['Monday', '11:00 am' ],
51 => ['Friday', '9:00 am' ],
},
instructor => {
'Jones' => ['Jones', 'E' ],
'WilsonT' => ['Wilson', 'T' ],
'Eliade' => ['Eliade', 'M' ],
'Knuth' => ['Knuth', 'D' ],
'Adler' => ['Adler', 'A' ],
'JonesT' => ['Jones', 'T' ],
'Wells' => ['Wells', 'H' ],
'WilsonW' => ['Wilson', 'W' ],
}
};
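The idea of sorting on the raw codes and substituting readable text only at output time can be sketched independently of the module. Everything below is an assumption made for illustration: the %reprocess dispatch table, the record layout and the tab-joined output are not Data::Presenter's actual internals, which work on the formed line in $^A.

```perl
use strict;
use warnings;

# Lookup tables shaped like the $data{options}{sources} example above.
my %sources = (
    timeslot => {
        11 => [ 'Monday', '9:00 am'  ],
        12 => [ 'Monday', '10:00 am' ],
        51 => [ 'Friday', '9:00 am'  ],
    },
    instructor => {
        Jones  => [ 'Jones',  'E' ],
        Eliade => [ 'Eliade', 'M' ],
        Wells  => [ 'Wells',  'H' ],
    },
);

# Hypothetical field-specific substitutions, one per reprocessed field.
my %reprocess = (
    timeslot => sub {
        my ($day, $time) = @{ $sources{timeslot}{ $_[0] } };
        return "$day, $time";
    },
    instructor => sub {
        my ($last, $initial) = @{ $sources{instructor}{ $_[0] } };
        return "$initial $last";
    },
);

my @records = (
    { timeslot => 51, classname => 'World History', instructor => 'Wells'  },
    { timeslot => 12, classname => 'Bible Study',   instructor => 'Eliade' },
    { timeslot => 11, classname => 'Arithmetic',    instructor => 'Jones'  },
);

# Sort on the numeric codes first, then substitute while building each line.
my @lines;
for my $rec (sort { $a->{timeslot} <=> $b->{timeslot} } @records) {
    push @lines, join "\t",
        $reprocess{timeslot}->( $rec->{timeslot} ),
        $rec->{classname},
        $reprocess{instructor}->( $rec->{instructor} );
}
print "$_\n" for @lines;
```

Because the substitution happens after sorting, the Friday record stays last even though 'Friday' would sort before 'Monday' alphabetically.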
The point at which this data gets into the object is, of course,
Data::Presenter::[package1]::_init(). What the administrator does at that
point is limited only by his/her imagination. Data::Presenter seeks to bless
a hash into its object. That hash must meet the following requirements:
The keys fields, parameters and index must be present. The key
options is required only if some
Data::Presenter method has been written which requires the information stored
therein. writeformat_with_reprocessing() is the only such method currently
present, but additional methods using the options key may be added in
the future.
The author has used two different approaches to the problem of initializing Data::Presenter::[package1] objects.
In the first approach, _init() parses a source file, using unpack,
etc. to build an array for each data record. Keyed by a unique ID, a
reference to this array then becomes the value of an element of the hash
which, once metadata is added, is blessed into the
Data::Presenter::[package1] object. The source for the metadata is the
fields.[package1].data file and the @fields, %parameters and
$index found therein.
In the second approach, the author asked, ''Rather than have _init() do data munging on a
file, why not directly pass it a hash of arrays? Better still, why not pass
it a hash of arrays which already has an 'options' key defined? And
better still yet, why not pass it an object produced by some other Perl
module and containing a blessed hash of arrays with an already defined
options key?'' In this approach, Data::Presenter::[package1]::_init()
does no data munging. It is mainly concerned with defining the three
required metadata elements.
writeformat_deluxe()
writeformat_deluxe() is an advanced application of
Data::Presenter and the reader may wish to skip this section until other
parts of the module have been mastered.
writeformat_deluxe() enables the user to have both column headers (as in
writeformat_plus_header()) and dynamic, 'just-in-time' reprocessing of data
in selected fields (as in writeformat_with_reprocessing()). Call it just
as you would writeformat_with_reprocessing(), but add a key-value pair
keyed by title.
%reprocessing_info = (
timeslot => 17,
instructor => 15,
);
$dp1->writeformat_deluxe(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
reprocess => \%reprocessing_info,
title => $title,
);
writedelimited()
The Data::Presenter::writeformat...() family of
methods discussed above write data to plain-text files in columns aligned
with whitespace via Perl's formline function -- the function which
internally powers Perl formats. This is suitable if the ultimate consumer of
the data is satisfied to read a plain-text file. However, in many business
contexts data consumers are more accustomed to word processing files than to
plain-text files. In particular, data consumers are accustomed to data
presented in tables created by commercial word processing programs. Such
programs generally have the capacity to take text in which individual lines
consist of data separated by delimiter characters such as tabs or commas and
transform that text into rows in a table where the delimiters signal the
borders between table cells.
To that end, the author has created the
Data::Presenter::writedelimited...() family of subroutines to print output
to plain-text files intended for further processing within word processing
programs. The simplest method in this family, writedelimited(), takes a
list of three key-value pairs:
sorted
The value keyed by sorted is a hash reference which is the return value of
sort_by_column(). Hence, writedelimited() can only be called once
sort_by_column() has been called.
file
The value keyed by file is the name of a file arbitrarily selected by
the operator to hold the output of writedelimited().
delimiter
The value keyed by delimiter is the user-selected delimiter character or
characters which will delineate fields within an individual record in the
output file. Typically, this character will be a tab (\t), comma (,)
or similar character that a word processing program's 'convert text to table'
feature can use to establish columns.
Using the ''census'' example from above, the overall sequence of code needed
to use writedelimited() would be:
@columns_selected = ('lastname', 'firstname', 'datebirth', 'cno');
$sorted_data = $dp1->sort_by_column(\@columns_selected);
$dp1->writedelimited(
sorted => $sorted_data,
file => $outputfile,
delimiter => $delimiter,
);
Note that, unlike writeformat(), writedelimited() does not require a
reference to @columns_selected to be passed as an argument.
Depending on the number of characters in a text editor's tab-stop setting, the result of the above call might look like:
HERNANDEZ HECTOR 1963-08-01 456791
VASQUEZ ADALBERTO 1973-08-17 786792
VASQUEZ ALBERTO 1953-02-28 906786
This is obviously less readable than the output of writeformat() -- but
since the output of writedelimited() is intended for further processing by
a word processing program rather than for final use, this is not a major
concern.
writedelimited_plus_header()
Just as writeformat_plus_header() extended
writeformat() to include column headers, writedelimited_plus_header()
extends writedelimited() to include column headers, separated by the same
delimiter character as the data, in a plain-text file intended for further
processing by a word processing program.
writedelimited_plus_header() takes a list of four key-value pairs:
sorted, columns, file, and delimiter. The complete call
to writedelimited_plus_header() looks like this:
@columns_selected = (
'unit', 'ward', 'lastname', 'firstname',
'datebirth', 'dateadmission', 'cno');
$sorted_data = $dp1->sort_by_column(\@columns_selected);
$dp1->writedelimited_plus_header(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
);
Note that, unlike writeformat_plus_header(), writedelimited_plus_header()
does not take $title as an argument. It is felt that any title would be
more likely to be supplied in the word-processing file which ultimately holds
the data prepared by writedelimited_plus_header() and that its inclusion
at this point might interfere with the workings of the word processing
program's 'convert text to table' feature.
Depending on the number of characters in a text editor's tab-stop setting, the result of the above call might look like:
Date Date of
Unit Ward Last Name First Name of Birth Admission C No.
LAVER 0105 VASQUEZ JORGE 1956-01-13 1986-01-17 456787
LAVER 0107 VASQUEZ LEONARDO 1970-15-23 1990-08-23 456788
SAMSON 0209 VASQUEZ JOAQUIN 1970-03-25 1990-11-14 456789
Again, the readability of the delimited copy in the plain-text file here is not as important as how correctly the delimiter has been chosen in order to produce good results once the file is further processed by a word processing program.
Note that, unlike writeformat_plus_header(), writedelimited_plus_header()
does not produce a hyphen line. The author feels that the separation of
header and body within the table is here better handled within the word
processing file which ultimately holds the data prepared by
writedelimited_plus_header().
Note further that, unlike writeformat_plus_header(),
writedelimited_plus_header() does not truncate the words in column headers.
This is because the writedelimited...() family of methods does not impose
a maximum width on output fields as does the writeformat...() family of
methods. Hence, there is no need to truncate headers to fit within specified
column widths. Column widths in the writedelimited...() family are
ultimately determined by the word processing program which produces the final
output.
writedelimited_with_reprocessing()
writedelimited_with_reprocessing()
is an advanced application of Data::Presenter and the reader may wish to skip
this section until other parts of the module have been mastered.
writedelimited_with_reprocessing(), like writeformat_with_reprocessing(),
permits a sophisticated administrator to activate ''last minute''
substitutions in strings to be printed such that substitutions do not affect
the pre-established sorting order. For a full discussion of the rationale
for this feature, see the discussion of "writeformat_with_reprocessing()"
above.
writedelimited_with_reprocessing() takes a list of five key-value pairs.
Three of them (sorted, columns and file) are the same as those passed to
writeformat_with_reprocessing(), and delimiter is the same as in
writedelimited(). The fifth, reprocess, is a reference
to an array holding a list of those columns selected for output upon which
the user chooses to perform reprocessing.
@reprocessing_info = qw( instructor timeslot room );
$dp1->writedelimited_with_reprocessing(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
reprocess => \@reprocessing_info,
);
Taking the classroom scheduling problem presented above,
writedelimited_with_reprocessing() would produce output looking something
like this:
Monday, 9:00 Arithmetic E Jones 4044 4044_11
Monday, 9:00 Language Studies T Wilson 4054 4054_11
Monday, 10:00 Bible Study M Eliade 4068 4068_12
Monday, 10:00 Introduction to Computers D Knuth 4086 4086_12
Monday, 11:00 Psychology A Adler 4077 4077_13
Monday, 11:00 Social Science T Jones 4044 4044_13
Friday, 9:00 World History H Wells 4052 4052_51
Friday, 9:00 Music Appreciation W Wilson 4044 4044_51
Usage of writedelimited_with_reprocessing() requires that the administrator
appropriately define Data::Presenter::[Package1]::_reprocess_delimit() and
Data::Presenter::[Package1]::_init() subroutines in the invoking package,
along with appropriate subroutines specific to each argument capable of being
reprocessed. Again, see the discussion in "writeformat_with_reprocessing()".
writedelimited_deluxe()
writedelimited_deluxe() is an advanced
application of Data::Presenter and the reader may wish to skip this section
until other parts of the module have been mastered.
writedelimited_deluxe() completes the parallel structure between the
writeformat...() and writedelimited...() families of Data::Presenter
methods by enabling the user to have both column headers (as in
writedelimited_plus_header()) and dynamic, 'just-in-time' reprocessing of
data in selected fields (as in writedelimited_with_reprocessing()). Except
for the name of the method called, the call to writedelimited_deluxe() is
the same as for writedelimited_with_reprocessing():
@reprocessing_info = qw( instructor timeslot );
$dp1->writedelimited_deluxe(
sorted => $sorted_data,
columns => \@columns_selected,
file => $outputfile,
delimiter => $delimiter,
reprocess => \@reprocessing_info,
);
Using the classroom scheduling example from above, the output from
writedelimited_deluxe() might look like this:
Timeslot Group Instructor Room GroupID
Monday, 9:00 Arithmetic E Jones 4044 4044_11
Monday, 9:00 Language Studies T Wilson 4054 4054_11
Monday, 10:00 Bible Study M Eliade 4068 4068_12
Monday, 10:00 Introduction to Computers D Knuth 4086 4086_12
Monday, 11:00 Psychology A Adler 4077 4077_13
Monday, 11:00 Social Science T Jones 4044 4044_13
Friday, 9:00 World History H Wells 4052 4052_51
Friday, 9:00 Music Appreciation W Wilson 4044 4044_51
As with writedelimited_with_reprocessing(), writedelimited_deluxe()
requires careful preparation on the part of the administrator. See the
discussion under "writeformat_with_reprocessing()" above.
writeHTML()
In its current formulation, writeHTML() works very much
like writeformat_plus_header(). It writes data to an operator-specified
HTML file and writes an appropriate header to that file as well.
writeHTML() takes the same 4 arguments as writeformat_plus_header():
$sorted_data, \@columns_selected, $outputfile and $title. The
body of the resulting HTML file is more similar to a Perl format than to an
HTML table. (This may be upgraded to a true HTML table in a future release.)
$dp1->writeHTML(
sorted => $sorted_data,
columns => \@columns_selected,
file => $HTMLoutputfile, # must have .html extension
title => $title,
);
It is quite possible that we may have two or more different database reports which present data on the same underlying universe or population. If these reports share a common index field which can be used to uniquely identify each entry in the underlying population, then we would like to be able to combine these sources, manipulate the data and re-output them via the simple and complex Data::Presenter output methods described in the "Synopsis" above.
In other words, if we have already created
my $dp1 = Data::Presenter::[Package1]->new(
$sourcefile, \@fields,\%parameters, $index);
my $dp2 = Data::Presenter::[Package2]->new(
$sourcefile, \@fields,\%parameters, $index);
...
my $dpx = Data::Presenter::[PackageX]->new(
$sourcefile, \@fields,\%parameters, $index);
we would like to be able to define an array of the objects we have created and construct a new object combining them in an orderly manner:
my @objects = ($dp1, $dp2, ... $dpx);
my $dpC = Data::Presenter::[some subclass]->new(\@objects);
We would then like to be able to call all the Data::Presenter sorting,
selecting and output methods discussed above on $dpC without having to
re-specify $sourcefile, \@fields, \%parameters or $index.
Can we do this? Yes, we can. More precisely, we can create two new types of objects: one in which the data entries comprise those entries found in each of the original sources, and one in which the data entries comprise those found in any of the sources. In mathematical terms, we can create either a new object which represents the intersection of the sources or one which represents the union of the sources. We call these as follows:
my $dpI = Data::Presenter::Combo::Intersect->new(\@objects);
and
my $dpU = Data::Presenter::Combo::Union->new(\@objects);
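The set logic behind the two Combo subclasses can be illustrated with plain hashes keyed by a shared index. This is only a sketch of intersection versus union on index values; the sample keys and variable names are assumptions, and the module's own implementation is not shown here.

```perl
use strict;
use warnings;

# Made-up sources: plain hashes keyed by the shared index field (cno).
my %src1 = ( 456787 => 1, 456788 => 1, 456789 => 1 );
my %src2 = ( 456788 => 1, 456789 => 1, 456790 => 1 );

# Count in how many sources each index value occurs.
my %count;
$count{$_}++ for (keys %src1, keys %src2);

my @union        = sort keys %count;                           # found in any source
my @intersection = sort grep { $count{$_} == 2 } keys %count;  # found in every source

print "union:        @union\n";
print "intersection: @intersection\n";
```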
Note the following:
The same index field must appear in
@fields in the fields.XXX.data configuration files corresponding to each of
the objects, though that field does not have to appear in the same element
position in @fields in each such file. Similarly, the parameters on the
value side of %parameters for the index field must be specified
identically in each configuration file. If these conditions are not met, a
Data::Presenter::Combo object cannot be constructed and the program will die
with an error message.
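To illustrate, suppose we have two configuration files, fields1.data and
fields2.data, corresponding to two objects, $obj1 and $obj2. For fields1.data, we
have:
@fields = qw(lastname firstname cno);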
%parameters = (
$fields[0] => [14, 'U', 'a', 'Last Name'],
$fields[1] => [10, 'U', 'a', 'First Name'],
$fields[2] => [ 7, 'U', 'n', 'C No.'],
);
$index = 2;
For fields2.data, we have:
@fields = qw(cno dateadmission datebirth);
%parameters = (
$fields[0] => [ 7, 'U', 'n', 'C No.'],
$fields[1] => [10, 'U', 'a', 'Date of Admission'],
$fields[2] => [10, 'U', 'a', 'Date of Birth'],
);
$index = 0;
Can $obj1 and $obj2 be combined into a Data::Presenter::Combo object?
Yes, they can. cno is named as the index field in each configuration
file, and the values assigned to $parameters{$fields[$index]} in each are identical:
[ 7, 'U', 'n', 'C No.'].
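The kind of element-by-element comparison implied here can be sketched as follows. The same_params() helper is hypothetical and is not part of Data::Presenter; it only shows what "identical" means for two index-field parameter lists.

```perl
use strict;
use warnings;

# Hypothetical helper: true when two parameter lists match element by element.
sub same_params {
    my ($first, $second) = @_;
    return 0 unless @$first == @$second;
    for my $i (0 .. $#{$first}) {
        return 0 unless $first->[$i] eq $second->[$i];
    }
    return 1;
}

my @params1    = (7, 'U', 'n', 'C No.');
my @params2    = (7, 'U', 'n', 'C No.');
my @mismatched = (7, 'U', 'n', 'Serial No.');   # made-up differing header text

print same_params(\@params1, \@params2)    ? "compatible\n" : "incompatible\n";
print same_params(\@params1, \@mismatched) ? "compatible\n" : "incompatible\n";
```

A single differing element, such as the column header text, is enough to make two index definitions incompatible.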
Now suppose we have a third object, $obj3. If the
contents of fields3.data were:
@fields = qw(cno dateadmission datebirth);
%parameters = (
$fields[0] => [ 7, 'U', 'n', 'Serial No.'],
$fields[1] => [10, 'U', 'a', 'Date of Admission'],
$fields[2] => [10, 'U', 'a', 'Date of Birth'],
);
$index = 0;
then $obj3 could not be combined with either $obj1 or $obj2 because
the elements of $parameters{$fields[$index]} in $obj3 are not identical
to those in the first two objects.
Here are some things to consider in using Data::Presenter::Combo objects:
What if $dp1 has entries not found in $dp2 (or vice versa)?
Only those entries found in both $dp1 and $dp2 are included in a
Data::Presenter::Combo::Intersect object. But if you are constructing a
Data::Presenter::Combo::Union object, any entry found in either source file
will be represented in the Union object. These properties would hold no
matter how many sources you used as arguments.
What if both $dp1 and $dp2 have fields named, for instance,
'lastname'?
Only one 'lastname' field is
entered into $dpC. Assuming that $dp1 is listed first in @objects,
all the fields in $dp1 will appear in $dpC. Only those fields in
$dp2 not found in $dp1 will be added to $dpC. If, however,
@objects were defined as ($dp2, $dp1), then $dp2's fields would have
precedence over those of $dp1. If a $dp3 object were constructed based
on yet another data source, only those fields not found in $dp1
or $dp2 would be included in the Combo object -- and so forth. This
left-to-right precedence rule governs both the data entries in $dpC as
well as the selection, sorting and output characteristics.
It was discovered that in versions 0.68 and earlier, sort_by_column()
failed to sort data properly in descending order. This has been fixed.
See Changes.
The fundamental reference for this program is, of course, the Camel book: Larry Wall, Tom Christiansen, Jon Orwant. Programming Perl, 3rd ed. O'Reilly & Associates, 2000, http://www.oreilly.com/catalog/pperl3/.
A careful reading of the code will tell any competent Perl hacker that many tricks were taken from the Ram book: Tom Christiansen & Nathan Torkington. Perl Cookbook. O'Reilly & Associates, 1998, http://www.oreilly.com/catalog/cookbook/.
The object-oriented programming skills needed to develop this program were learned via extensive re-reading of Chapters 3, 6 and 7 of Damian Conway's Object Oriented Perl. Manning Publications, 2000, http://www.manning.com/Conway/index.html.
This program goes to great length to follow the principle of 'Repeated Code is a Mistake' http://www.perl.com/pub/a/2000/11/repair3.html -- a specific application of the general Perl principle of Laziness. The author grasped this principle best following a 2001 talk by Mark-Jason Dominus http://perl.plover.com/ to the New York Perlmongers http://ny.pm.org/.
Most of the code in the _init() subroutines was written before the author
read Data Munging with Perl http://www.manning.com/cross/index.html by
Dave Cross. Nonetheless, that is an excellent discussion of the problems
involved in understanding the structure of data sources.
The discussion of bugs in this program benefitted from discussions on the Perl Seminar New York mailing list http://groups.yahoo.com/group/perlsemny, particularly with Martin Heinsdorf.
Correcting the bug involving sorting in descending order entailed a complete
rewrite of much code. This rewrite was greatly assisted by brian d foy and
Tanktalus in the Perlmonks thread ''Building a sorting subroutine on the
fly'' (http://perlmonks.org/?node_id=512460).
James E. Keenan (jkeenan@cpan.org).
Creation date: October 25, 2001. Last modification date: February 10, 2008. Copyright (c) 2001-5 James E. Keenan. United States. All rights reserved.
All data presented in this documentation or in the sample files in the archive accompanying this documentation are dummy copy. The data was entirely fabricated by the author for heuristic purposes. Any resemblance to any person, living or dead, is coincidental.
This is free software which you may distribute under the same terms as Perl itself.