Bio::ASN1::Sequence::Indexer - Indexes NCBI Sequence files.


Bio-ASN1-EntrezGene documentation  | view source Contained in the Bio-ASN1-EntrezGene distribution.

Index


NAME

Top

Bio::ASN1::Sequence::Indexer - Indexes NCBI Sequence files.

SYNOPSIS

Top

  use Bio::ASN1::Sequence::Indexer;

  # creating & using the index is just a few lines
  my $inx = Bio::ASN1::Sequence::Indexer->new(
    -filename => 'seq.idx',
    -write_flag => 'WRITE'); # needed for make_index call, but if opening
                             # existing index file, don't set write flag!
  $inx->make_index('seq1.asn', 'seq2.asn');
  my $seq = $inx->fetch('AF093062'); # Bio::Seq obj for Sequence (doesn't work yet)
  # alternatively, if one prefers just a data structure instead of objects
  $seq = $inx->fetch_hash('AF093062'); # a hash produced by Bio::ASN1::Sequence
                            # that contains all data in the Sequence record

PREREQUISITE

Top

Bio::ASN1::Sequence, Bioperl and all dependencies therein.

INSTALLATION

Top

Same as Bio::ASN1::EntrezGene

DESCRIPTION

Top

Bio::ASN1::Sequence::Indexer is a Perl Indexer for NCBI Sequence genome databases. It processes an ASN.1-formatted Sequence record and stores the file position for each record in a way compliant with Bioperl standard (in fact its a subclass of Bioperl's index objects).

Note that this module does not parse record, because it needs to run fast and grab only the gene ids. For parsing record, use Bio::ASN1::Sequence.

As with Bio::ASN1::Sequence, this module is best thought of as beta version - it works, but is not fully tested.

SEE ALSO

Top

Please check out perldoc for Bio::ASN1::EntrezGene for more info.

AUTHOR

Top

Dr. Mingyi Liu <mingyi.liu@gpc-biotech.com>

COPYRIGHT

Top

CITATION

Top

Liu, M and Grigoriev, A (2005) "Fast Parsers for Entrez Gene" Bioinformatics. In press

OPERATION SYSTEMS SUPPORTED

Top

Any OS that Perl & Bioperl run on.

METHODS

Top

fetch

  Parameters: $geneid - id for the Sequence record to be retrieved
  Example:    my $hash = $indexer->fetch(10); # get Sequence #10
  Function:   fetch the data for the given Sequence id.
  Returns:    A Bio::Seq object produced by Bio::SeqIO::sequence
  Notes:      Bio::SeqIO::sequence does not exist and probably won't
                exist for a while!  So call fetch_hash instead

fetch_hash

  Parameters: $seqid - id for the Sequence record to be retrieved
  Example:    my $hash = $indexer->fetch_hash('AF093062');
  Function:   fetch a hash produced by Bio::ASN1::Sequence for given id
  Returns:    A data structure containing all data items from the Sequence
                record.
  Notes:      Alternative to fetch()

_file_handle

  Title   : _file_handle
  Usage   : $fh = $index->_file_handle( INT )
  Function: Returns an open filehandle for the file
            index INT.  On opening a new filehandle it
            caches it in the @{$index->_filehandle} array.
            If the requested filehandle is already open,
            it simply returns it from the array.
  Example : $fist_file_indexed = $index->_file_handle( 0 );
  Returns : ref to a filehandle
  Args    : INT
  Notes   : This function is copied from Bio::Index::Abstract. Once that module
              changes file handle code like I do below to fit perl 5.005_03, this
              sub would be removed from this module


Bio-ASN1-EntrezGene documentation  | view source Contained in the Bio-ASN1-EntrezGene distribution.