| MARC-Record documentation | Contained in the MARC-Record distribution. |
MARC::Record - Perl extension for handling MARC records
Version 2.0.3
Module for handling MARC records as objects. The file-handling stuff is in MARC::File::*.
Any errors generated are stored in $MARC::Record::ERROR.
Warnings are kept with the record and accessible in the warnings() method.
Base constructor for the class. It just returns a completely empty record. To get real data, you'll need to populate it with fields, or use one of the MARC::File::* modules to read from a file.
This is a wrapper around MARC::File::USMARC::decode() for compatibility with
older versions of MARC::Record.
The wanted_func() is optional. See MARC::File::USMARC::decode for details.
Following are a number of convenience methods for commonly-retrieved
data fields. Please note that they each return strings, not MARC::Field
objects. They return empty strings if the appropriate field or subfield
is not found. This is as opposed to the field()/subfield() methods
which return undef if something's not found. My assumption is that
these methods are used for quick & dirty reports and you don't want to
mess around with noting if something is undef.
Also note that no punctuation cleanup is done. If the 245a is "Programming Perl / ", then that's what you'll get back, rather than "Programming Perl".
Returns the title from the 245 tag.
Returns the title proper from the 245 tag, subfields a, n and p.
Returns the edition from the 250 tag, subfield a.
Returns the publication date from the 260 tag, subfield c.
Returns a list of all the fields in the record. The list contains a MARC::Field object for each field in the record.
Shortcut method for getting just a subfield for a tag. These are equivalent:
my $title = $marc->field('245')->subfield("a");
my $title = $marc->subfield('245',"a");
If either the field or subfield can't be found, undef is returned.
Appends the field specified by $field to the end of the record.
@fields need to be MARC::Field objects.
my $field = MARC::Field->new('590','','','a' => 'My local note.');
$record->append_fields($field);
Returns the number of fields appended.
Inserts the field specified by $new_field before the field $before_field.
Returns the number of fields inserted, or undef on failures.
Both $before_field and all @new_fields need to be MARC::Field objects.
If they are not an exception will be thrown.
my $before_field = $record->field('260');
my $new_field = MARC::Field->new('250','','','a' => '2nd ed.');
$record->insert_fields_before($before_field,$new_field);
Identical to insert_fields_before(), but fields are added after
$after_field. Remember, $after_field and any new fields must be
valid MARC::Field objects or else an exception will be thrown.
Will insert fields in strictly numerical order. So a 008 will be filed
after a 001 field. See insert_grouped_field() for an additional ordering.
Will insert the specified MARC::Field object into the record in grouped order and return true (1) on success, and false (undef) on failure.
my $field = MARC::Field->new( '510', 'Indexed by Google.' );
$record->insert_grouped_field( $field );
For example, if a '650' field is inserted with insert_grouped_field()
it will be inserted at the end of the 6XX group of tags. After discussion
most people wanted the ability to add a new field to the end of the
hundred group where it belonged. The reason is that according to the MARC
format, fields within a record are supposed to be grouped by block
(hundred groups). This means that fields may not necessarily be in tag
order.
Deletes a given list of MARC::Field objects from the the record.
# delete all note fields
my @notes = $record->field('5..');
$record->delete_fields(@notes);
delete_fields() will return the number of fields that were deleted.
Same thing as delete_fields() but only expects a single MARC::Field to be passed in. Mainly here for backwards compatibility.
This is a wrapper around MARC::File::USMARC::encode() for compatibility with
older versions of MARC::Record.
Returns a pretty string for printing in a MARC dump.
Returns the leader for the record. Sets the leader if text is defined. No error checking is done on the validity of the leader.
A method for getting/setting the encoding for a record. The encoding for a record is determined by position 09 in the leader, which is blank for MARC-8 encoding, and 'a' for UCS/Unicode. encoding() will return a string, either 'MARC-8' or 'UTF-8' appropriately.
If you want to set the encoding for a MARC::Record object you can use the string values:
$record->encoding( 'UTF-8' );
NOTE: MARC::Record objects created from scratch have an a default encoding of MARC-8, which has been the standard for years...but many online catlogs and record vendors are migrating to UTF-8.
WARNING: you should be sure your record really does contain valid UTF-8 data when you manually set the encoding.
Internal function for updating the leader's length and base address.
The clone() method makes a copy of an existing MARC record and returns
the new version. Note that you cannot just say:
my $newmarc = $oldmarc;
This just makes a copy of the reference, not a new object. You must use
the clone() method like so:
my $newmarc = $oldmarc->clone;
You can also specify field specs to filter down only a certain subset of fields. For instance, if you only wanted the title and ISBN tags from a record, you could do this:
my $small_marc = $marc->clone( 245, '020' );
The order of the fields is preserved as it was in the original record.
Returns the warnings (as a list) that were created when the record was read. These are things like "Invalid indicators converted to blanks".
my @warnings = $record->warnings();
The warnings are items that you might be interested in, or might not. It depends on how stringently you're checking data. If you're doing some grunt data analysis, you probably don't care.
A side effect of calling warnings() is that the warning buffer will be cleared.
add_fields() is now deprecated, and users are encouraged to use
append_fields(), insert_fields_after(), and insert_fields_before()
since they do what you want probably. It is still here though, for backwards
compatability.
add_fields() adds MARC::Field objects to the end of the list. Returns the
number of fields added, or undef if there was an error.
There are three ways of calling add_fields() to add data to the record.
my $author = MARC::Field->new(
100, "1", " ", a => "Arnosky, Jim."
);
$marc->add_fields( $author );
add_fields() take care of the objectifying. $marc->add_fields(
245, "1", "0",
a => "Raccoons and ripe corn /",
c => "Jim Arnosky.",
);
$marc->add_fields(
[ 250, " ", " ", a => "1st ed." ],
[ 650, "1", " ", a => "Raccoons." ],
);
A brief discussion of why MARC::Record is done the way it is:
One of the areas Perl excels is in allowing the programmer to create easy solutions quickly. MARC::Record is designed along those same lines. You want a program to dump all the 6XX tags in a file? MARC::Record is your friend.
Currently, I'm using MARC::Record for analyzing bibliographic data, but who knows what might happen in the future? MARC::Record needs to be just as adept at authority data, too.
I use method calls everywhere, and I expect calling programs to do the same, rather than accessing internal data directly. If you access an object's hash fields on your own, future releases may break your code.
One of the tradeoffs in using accessor methods is some overhead in the method calls. Is this slow? I don't know, I haven't measured. I would suggest that if you're a cycle junkie that you use Benchmark.pm to check to see where your bottlenecks are, and then decide if MARC::Record is for you.
A mailing list devoted to the use of Perl in libraries.
The definitive source for all things MARC.
Online version of the free booklet. An excellent overview of the MARC format. Essential.
Follett Software Company's (http://www.fsc.follett.com/) monthly discussion of various MARC tags.
Combine MARC.pm and MARC::* into one distribution.
Imagine something like this:
my @sears_headings = $marc->tag_grep( qr/Sears/ );
(from Mike O'Regan)
Please feel free to email me at <mrylander@gmail.com>. I'm glad
to help as best I can, and I'm always interested in bugs, suggestions
and patches.
An excellent place to look for information, and get quick help, is from the perl4lib mailing list. See http://perl4lib.perl.org for more information about this list, and other helpful MARC information.
The MARC::Record development team uses the RT bug tracking system at
http://rt.cpan.org. If your email is about a bug or suggestion,
please report it through the RT system. This is a huge help for the
team, and you'll be notified of progress as things get fixed or updated.
If you prefer not to use the website, you can send your bug to <bug-MARC-Record@rt.cpan.org>
Ideas are things that have been considered, but nobody's actually asked for.
These could be ASCII or MarcMaker.
This code may be distributed under the same terms as Perl itself.
Please note that these modules are not products of or supported by the employers of the various contributors to the code.
| MARC-Record documentation | Contained in the MARC-Record distribution. |
package MARC::Record;
use strict; use integer; use vars qw( $ERROR ); use MARC::Field; use Carp qw(croak carp);
use vars qw( $VERSION ); $VERSION = '2.0.3'; use Exporter; use vars qw( @ISA @EXPORTS @EXPORT_OK ); @ISA = qw( Exporter ); @EXPORTS = qw(); @EXPORT_OK = qw( LEADER_LEN ); use vars qw( $DEBUG ); $DEBUG = 0; use constant LEADER_LEN => 24;
sub new { my $class = shift; my $self = { _leader => ' ' x 24, _fields => [], _warnings => [], }; return bless $self, $class; } # new()
sub new_from_usmarc { my $blob = shift; $blob = shift if (ref($blob) || ($blob eq "MARC::Record")); require MARC::File::USMARC; return MARC::File::USMARC::decode( $blob, @_ ); }
sub title() { my $self = shift; my $field = $self->field(245); return $field ? $field->as_string : ""; }
sub title_proper() { my $self = shift; my $field = $self->field(245); if ( $field ) { return $field->as_string('anp'); } else { return ""; } }
sub author() { my $self = shift; my $field = $self->field('100|110|111'); return $field ? $field->as_string : ""; }
sub edition() { my $self = shift; my $str = $self->subfield(250,'a'); return defined $str ? $str : ""; }
sub publication_date() { my $self = shift; my $str = $self->subfield(260,'c'); return defined $str ? $str : ""; }
sub fields() { my $self = shift; return @{$self->{_fields}}; }
my %field_regex; sub field { my $self = shift; my @specs = @_; my @list = (); for my $tag ( @specs ) { my $regex = $field_regex{ $tag }; # Compile & stash it if necessary if ( not defined $regex ) { $regex = qr/^$tag$/; $field_regex{ $tag } = $regex; } # not defined for my $maybe ( $self->fields ) { if ( $maybe->tag =~ $regex ) { return $maybe unless wantarray; push( @list, $maybe ); } # if } # for $maybe } # for $tag return unless wantarray; return @list; }
sub subfield { my $self = shift; my $tag = shift; my $subfield = shift; my $field = $self->field($tag) or return; return $field->subfield($subfield); } # subfield()
sub _all_parms_are_fields { for ( @_ ) { return 0 unless UNIVERSAL::isa($_, 'MARC::Field'); } return 1; }
sub append_fields { my $self = shift; _all_parms_are_fields(@_) or croak('Arguments must be MARC::Field objects'); push(@{ $self->{_fields} }, @_); return scalar @_; }
sub insert_fields_before { my $self = shift; _all_parms_are_fields(@_) or croak('All arguments must be MARC::Field objects'); my ($before,@new) = @_; ## find position of $before my $fields = $self->{_fields}; my $pos = 0; foreach my $f (@$fields) { last if ($f == $before); $pos++; } ## insert before $before if ($pos >= @$fields) { $self->_warn("Couldn't find field to insert before"); return; } splice(@$fields,$pos,0,@new); return scalar @new; }
sub insert_fields_after { my $self = shift; _all_parms_are_fields(@_) or croak('All arguments must be MARC::Field objects'); my ($after,@new) = @_; ## find position of $after my $fields = $self->{_fields}; my $pos = 0; my $found = 0; foreach my $f (@$fields) { if ($f == $after) { $found = 1; last; } $pos++; } ## insert after $after unless ($found) { $self->_warn("Couldn't find field to insert after"); return; } splice(@$fields,$pos+1,0,@new); return scalar @new; }
sub insert_fields_ordered { my ( $self, @new ) = @_; _all_parms_are_fields(@new) or croak('All arguments must be MARC::Field objects'); ## go through each new field NEW_FIELD: foreach my $newField ( @new ) { ## find location before which it should be inserted EXISTING_FIELD: foreach my $field ( @{ $self->{_fields} } ) { if ( $field->tag() >= $newField->tag() ) { $self->insert_fields_before( $field, $newField ); next NEW_FIELD; } } ## if we fell through then this new field is higher than ## all the existing fields, so we append. $self->append_fields( $newField ); } return( scalar( @new ) ); }
sub insert_grouped_field { my ($self,$new) = @_; _all_parms_are_fields($new) or croak('Argument must be MARC::Field object'); ## try to find the end of the field group and insert it there my $limit = int($new->tag() / 100); my $found = 0; foreach my $field ($self->fields()) { if ( int($field->tag() / 100) > $limit ) { $self->insert_fields_before($field,$new); $found = 1; last; } } ## if we couldn't find the end of the group, then we must not have ## any tags this high yet, so just append it if (!$found) { $self->append_fields($new); } return(1); }
sub delete_fields { my $self = shift; _all_parms_are_fields(@_) or croak('Arguments must be MARC::Field object'); my @fields = @{$self->{_fields}}; my $original_count = @fields; foreach my $deleter (@_) { @fields = grep { $_ != $deleter } @fields; } $self->{_fields} = \@fields; return $original_count - @fields; }
sub delete_field { return delete_fields(@_); }
sub as_usmarc() { my $self = shift; require MARC::File::USMARC; return MARC::File::USMARC::encode( $self ); }
sub as_formatted() { my $self = shift; my @lines = ( "LDR " . ($self->{_leader} || "") ); for my $field ( @{$self->{_fields}} ) { push( @lines, $field->as_formatted() ); } return join( "\n", @lines ); } # as_formatted
sub leader { my $self = shift; my $text = shift; if ( defined $text ) { (length($text) eq 24) or $self->_warn( "Leader must be 24 bytes long" ); $self->{_leader} = $text; } # set the leader return $self->{_leader}; } # leader()
sub encoding { my ($self,$arg) = @_; # we basically report from and modify the leader directly my $leader = $self->leader(); # when setting if ( defined($arg) ) { if ( $arg =~ /UTF-?8/i ) { substr($leader,9,1) = 'a'; } elsif ( $arg =~ /MARC-?8/i ) { substr($leader,9,1) = ' '; } $self->leader($leader); } return substr($leader,9,1) eq 'a' ? 'UTF-8' : 'MARC-8'; }
sub set_leader_lengths { my $self = shift; my $reclen = shift; my $baseaddr = shift; if ($reclen > 99999) { carp( "Record length of $reclen is larger than the MARC spec allows (99999 bytes)." ); } substr($self->{_leader},0,5) = sprintf("%05d",$reclen); substr($self->{_leader},12,5) = sprintf("%05d",$baseaddr); # MARC21 defaults: http://www.loc.gov/marc/bibliographic/ecbdldrd.html substr($self->{_leader},10,2) = '22'; substr($self->{_leader},20,4) = '4500'; }
sub clone { my $self = shift; my @keeper_tags = @_; # create a new object of whatever type we happen to be my $class = ref( $self ); my $clone = $class->new(); $clone->{_leader} = $self->{_leader}; my $filtered = @keeper_tags ? [$self->field( @keeper_tags )] : undef; for my $field ( $self->fields() ) { if ( !$filtered || (grep {$field eq $_} @$filtered ) ) { $clone->append_fields( $field->clone ); } } # XXX FIX THIS $clone->update_leader(); return $clone; }
sub warnings() { my $self = shift; my @warnings = @{$self->{_warnings}}; $self->{_warnings} = []; return @warnings; }
sub add_fields { my $self = shift; my $nfields = 0; my $fields = $self->{_fields}; while ( my $parm = shift ) { # User handed us a list of data (most common possibility) if ( ref($parm) eq "" ) { my $field = MARC::Field->new( $parm, @_ ) or return _gripe( $MARC::Field::ERROR ); push( @$fields, $field ); ++$nfields; last; # Bail out, we're done eating parms # User handed us an object. } elsif ( UNIVERSAL::isa($parm, 'MARC::Field') ) { push( @$fields, $parm ); ++$nfields; # User handed us an anonymous list of parms } elsif ( ref($parm) eq "ARRAY" ) { my $field = MARC::Field->new(@$parm) or return _gripe( $MARC::Field::ERROR ); push( @$fields, $field ); ++$nfields; } else { croak( "Unknown parm of type", ref($parm), " passed to add_fields()" ); } # if } # while return $nfields; } # NOTE: _warn is an object method sub _warn { my $self = shift; push( @{$self->{_warnings}}, join( "", @_ ) ); return( $self ); } # NOTE: _gripe is NOT an object method sub _gripe { $ERROR = join( "", @_ ); warn $ERROR; return; } 1; __END__