Bio::Phylo::Unparsers::Phylip - Serializer used by Bio::Phylo::IO, no serviceable parts inside


Bio-Phylo documentation Contained in the Bio-Phylo distribution.

NAME

Bio::Phylo::Unparsers::Phylip - Serializer used by Bio::Phylo::IO, no serviceable parts inside

DESCRIPTION

This module unparses a Bio::Phylo data structure into an input file for PHYLIP and RAxML. The file format (as it is interpreted here) consists of:

first line

the number of species, a space, the number of characters

subsequent lines

ten-character species name, sequence

Here is an example of what the output might look like:

 4 2
 Species_1 AC
 Species_2 AG
 Species_3 GT
 Species_4 GG
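The layout described above can be sketched in a few lines of plain Perl. This is a minimal illustration, not the module's own code: `phylip_string` is a hypothetical helper that does not exist in Bio::Phylo, and it assumes the names passed in are already unique and no longer than 10 characters.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Phylo: lays out rows in the
# format described above. Each argument is an array ref holding
# [ $name, $sequence ]; names are assumed unique and at most 10
# characters long.
sub phylip_string {
    my @rows = @_;

    # first line: number of species, a space, number of characters
    my $ntax  = scalar @rows;
    my $nchar = @rows ? length( $rows[0][1] ) : 0;
    my $out   = "$ntax $nchar\n";

    # subsequent lines: species name, a space, the sequence
    for my $row (@rows) {
        my ( $name, $seq ) = @{$row};
        $out .= "$name $seq\n";
    }
    return $out;
}

print phylip_string(
    [ 'Species_1', 'AC' ],
    [ 'Species_2', 'AG' ],
    [ 'Species_3', 'GT' ],
    [ 'Species_4', 'GG' ],
);
```

The sketch takes the character count from the first row only, so it also assumes that all sequences are aligned to the same length.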

Pass a matrix object, or a project (whose first matrix will be serialized), to the unparse() function as the value of the '-phylo' argument. After serialization, each row's PHYLIP-specific name (shortened to 10 characters where necessary) has been stored with set_generic under the 'phylip_name' key, and can be read back with get_generic. Example:

 use Bio::Phylo::IO 'unparse';
 # $matrix is an existing Bio::Phylo matrix object
 my $phylip_string = unparse(
     -format => 'phylip',
     -phylo  => $matrix,
 );
 for my $seq ( @{ $matrix->get_entities } ) {
     # this returns the shortened name, which is unique to the matrix
     my $phylip_name = $seq->get_generic('phylip_name');
 }
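To make the name shortening concrete, here is a minimal standalone sketch of one way 10-character names can be kept unique. The helper `phylip_names` is hypothetical and not part of the Bio::Phylo API, and the counter-suffix scheme is an assumption for illustration; the names the serializer actually assigns may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Phylo: maps arbitrary names to
# unique names that are exactly 10 characters long.
sub phylip_names {
    my @names = @_;
    my ( %seen, @result );
    for my $name (@names) {

        # pad with spaces, or truncate, to exactly 10 characters
        my $short = sprintf( '%-10.10s', $name );

        # on a collision, overwrite the tail with a counter so that the
        # name stays 10 characters long
        my $counter = 1;
        while ( $seen{$short} ) {
            $short = substr( $short, 0, 10 - length($counter) ) . $counter;
            $counter++;
        }
        $seen{$short} = 1;
        push @result, $short;
    }
    return @result;
}

my @short = phylip_names( 'Homo_sapiens_1', 'Homo_sapiens_2', 'Pan' );
print "[$_]\n" for @short;
```

The first two inputs share their first 10 characters, so the second one gets a counter in its last position, while the short name is padded with trailing spaces.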

The phylip module is called by the Bio::Phylo::IO object, so look there to learn about parsing and serializing in general.

SEE ALSO

Bio::Phylo::IO

The phylip unparser is called by the Bio::Phylo::IO object. Look there to learn how to create phylip formatted files.

Bio::Phylo::Manual

Also see the manual: Bio::Phylo::Manual and http://rutgervos.blogspot.com.

CITATION

If you use Bio::Phylo in published research, please cite it:

Rutger A Vos, Jason Caravas, Klaas Hartmann, Mark A Jensen and Chase Miller, 2011. Bio::Phylo - phyloinformatic analysis using Perl. BMC Bioinformatics 12:63. http://dx.doi.org/10.1186/1471-2105-12-63

REVISION

 $Id: Phylip.pm 1660 2011-04-02 18:29:40Z rvos $


package Bio::Phylo::Unparsers::Phylip;
use strict;
use base 'Bio::Phylo::Unparsers::Abstract';
use Bio::Phylo::Util::Exceptions 'throw';
use Bio::Phylo::Util::CONSTANT qw':objecttypes looks_like_object';

sub _to_string {
    my $self = shift;
    my $obj  = $self->{'PHYLO'};
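    my $matrix;

    # first try to treat the object as a matrix; the $@ check below
    # relies on looks_like_object dying on a type mismatch
    eval { $matrix = $obj if looks_like_object $obj, _MATRIX_; };
    if ($@) {
        undef($@);

        # otherwise fall back to the first matrix of a project
        eval {
            ($matrix) = @{ $obj->get_matrices }
              if looks_like_object $obj, _PROJECT_;
        };
        if ( $@ or not $matrix ) {
            throw 'ObjectMismatch' => 'Invalid object!';
        }
    }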
    my $string = $matrix->get_ntax() . ' ' . $matrix->get_nchar() . "\n";
    my ( %seq_for_id, %phylip_name_for_id, @ids, %seen_name );
    for my $seq ( @{ $matrix->get_entities } ) {
        my $id = $seq->get_id;
        $seq_for_id{$id} = $seq->get_char;
        my $name = $seq->get_internal_name;
        push @ids, $id;
        # pad short names with spaces, or truncate long ones, so that the
        # name is exactly 10 characters (the repetition operator takes the
        # string on the left and the count on the right)
        my $phylip_name =
          length($name) <= 10
          ? $name . ( ' ' x ( 10 - length($name) ) )
          : substr( $name, 0, 10 );

        # on a collision, overwrite the tail of the name with a counter
        # until the result is unique within this matrix
        my $counter = 1;
        while ( $seen_name{$phylip_name} ) {
            $phylip_name =
              substr( $phylip_name, 0, ( 10 - length($counter) ) );
            $phylip_name .= $counter;
            $counter++;
        }

        # record the final name as well, so later duplicates cannot reuse it
        $seen_name{$phylip_name}++;
        $phylip_name_for_id{$id} = $phylip_name;
        $seq->set_generic( 'phylip_name' => $phylip_name_for_id{$id} );
    }
    for my $id (@ids) {
        $string .= $phylip_name_for_id{$id} . ' ' . $seq_for_id{$id} . "\n";
    }
    return $string;
}

# podinherit_insert_token

1;