| Bio-Phylo documentation | Contained in the Bio-Phylo distribution. |
Bio::Phylo::Unparsers::Phylip - Serializer used by Bio::Phylo::IO, no serviceable parts inside
This module unparses a Bio::Phylo data structure into an input file for PHYLIP and RAxML. The file format (as it is interpreted here) consists of:
the number of species, a space, the number of characters
ten-character species name, sequence
Here is an example of what the output might look like:
4 2 Species_1 AC Species_2 AG Species_3 GT Species_4 GG
To the unparse() function pass a matrix object or a project (whose first matrix will be serialized) as value of the '-phylo' argument. After serialization, any shortened phylip-specific names (which need to be 10 characters long) will have been assigned to the 'phylip_name' slot of set_generic. Example:
my $phylip_string = unparse(
-format => 'phylip',
-phylo => $matrix,
);
for my $seq ( @{ $matrix->get_entities } ) {
# this returns the shortened name, which is unique to the matrix
my $phylip_name = $seq->get_generic('phylip_name');
}
The phylip module is called by the Bio::Phylo::IO object, so look there to learn about parsing and serializing in general.
The phylip unparser is called by the Bio::Phylo::IO object. Look there to learn how to create phylip formatted files.
Also see the manual: Bio::Phylo::Manual and http://rutgervos.blogspot.com.
If you use Bio::Phylo in published research, please cite it:
Rutger A Vos, Jason Caravas, Klaas Hartmann, Mark A Jensen and Chase Miller, 2011. Bio::Phylo - phyloinformatic analysis using Perl. BMC Bioinformatics 12:63. http://dx.doi.org/10.1186/1471-2105-12-63
$Id: Phylip.pm 1660 2011-04-02 18:29:40Z rvos $
| Bio-Phylo documentation | Contained in the Bio-Phylo distribution. |
package Bio::Phylo::Unparsers::Phylip; use strict; use base 'Bio::Phylo::Unparsers::Abstract'; use Bio::Phylo::Util::Exceptions 'throw'; use Bio::Phylo::Util::CONSTANT qw':objecttypes looks_like_object';
sub _to_string { my $self = shift; my $obj = $self->{'PHYLO'}; my $matrix; eval { $matrix = $obj if looks_like_object $obj, _MATRIX_; }; if ($@) { undef($@); eval { ($matrix) = @{ $obj->get_matrices } if looks_like_object $obj, _PROJECT_; }; if ( $@ or not $matrix ) { throw 'ObjectMismatch' => 'Invalid object!'; } } my $string = $matrix->get_ntax() . ' ' . $matrix->get_nchar() . "\n"; my ( %seq_for_id, %phylip_name_for_id, @ids, %seen_name ); for my $seq ( @{ $matrix->get_entities } ) { my $id = $seq->get_id; $seq_for_id{$id} = $seq->get_char; my $name = $seq->get_internal_name; push @ids, $id; if ( length($name) <= 10 ) { my $phylip_name = $name . ( ( 10 - length($name) ) x ' ' ); if ( !$seen_name{$phylip_name} ) { $seen_name{$phylip_name}++; $phylip_name_for_id{$id} = $phylip_name; } else { my $counter = 1; while ( $seen_name{$phylip_name} ) { $phylip_name = substr( $phylip_name, 0, ( 10 - length($counter) ) ); $phylip_name .= $counter; $counter++; } $phylip_name_for_id{$id} = $phylip_name; } } elsif ( length($name) > 10 ) { my $phylip_name = substr( $name, 0, 10 ); if ( !$seen_name{$phylip_name} ) { $seen_name{$phylip_name}++; $phylip_name_for_id{$id} = $phylip_name; } else { my $counter = 1; while ( $seen_name{$phylip_name} ) { $phylip_name = substr( $phylip_name, 0, ( 10 - length($counter) ) ); $phylip_name .= $counter; $counter++; } $phylip_name_for_id{$id} = $phylip_name; } } $seq->set_generic( 'phylip_name' => $phylip_name_for_id{$id} ); } for my $id (@ids) { $string .= $phylip_name_for_id{$id} . ' ' . $seq_for_id{$id} . "\n"; } return $string; } # podinherit_insert_token
1;