Locale::SubCountry - convert state, province, county etc. names to/from code


Locale-SubCountry documentation Contained in the Locale-SubCountry distribution.

Index


Code Index:

NAME

Top

Locale::SubCountry - convert state, province, county etc. names to/from code

SYNOPSIS

Top

    my $country_code = 'GB';
    my $UK = new Locale::SubCountry($country_code);
    if ( not $UK )
    {
        die "Invalid code $country_code\n"; 
    }
    elsif (  $UK->has_sub_countries )
    {
        print($UK->full_name('DGY'),"\n");           # Dumfries and Galloway
        print($UK->regional_division('DGY'),"\n");   # SCT (Scotland)
    }

    my $australia = new Locale::SubCountry('Australia');
    if ( not $australia )
    {
        die "Invalid code: Australia\n"; 
    }
    else
    {
        print($australia->country,"\n");        # Australia
        print($australia->country_code,"\n");   # AU

        if (  $australia->has_sub_countries )
        {
            print($australia->code('New South Wales '),"\n");     # NSW
            print($australia->full_name('S.A.'),"\n");            # South Australia
            my $upper_case = 1;
            print($australia->full_name('Qld',$upper_case),"\n"); # QUEENSLAND
            print($australia->category('NSW'),"\n");              # state
            print($australia->FIPS10_4_code('ACT'),"\n");         # 01
            print($australia->ISO3166_2_code('02'),"\n");         # NSW

            my @aus_state_names  = $australia->all_full_names;
            my @aus_code_names   = $australia->all_codes;
            my %aus_states_keyed_by_code  = $australia->code_full_name_hash;
            my %aus_states_keyed_by_name  = $australia->full_name_code_hash;

            foreach my $code ( sort keys %aus_states_keyed_by_code )
            {
               printf("%-3s : %s\n",$code,$aus_states_keyed_by_code{$code});
            }   
        }
    }

    # Methods for country codes and names

    my $world = new Locale::SubCountry::World;
    my @all_countries     = $world->all_full_names;
    my @all_country_codes = $world->all_codes;

    my %all_countries_keyed_by_name = $world->full_name_code_hash;
    my %all_country_keyed_by_code   = $world->code_full_name_hash;




DESCRIPTION

Top

This module allows you to convert the full name for a countries administrative region to the code commonly used for postal addressing. The reverse lookup can also be done. Sub country codes are defined in "ISO 3166-2:1998, Codes for the representation of names of countries and their subdivisions".

Sub countries are termed as states in the US and Australia, provinces in Canada and counties in the UK and Ireland.

Names and ISO 3166-2 codes for all sub countries in a country can be returned as either a hash or an array.

Names and ISO 3166-1 codes for all countries in the world can be returned as either a hash or an array.

ISO 3166-2 codes can be converted to FIPS 10-4 codes. The reverse lookup can also be done.

METHODS

Top

Note that the following methods duplicate some of the functionality of the Locale::Country module (part of the Locale::Codes bundle). They are provided here because you may need to first access the list of available countries and ISO 3166-1 codes, before fetching their sub country data. If you only need access to country data, then Locale::Country should be used.

Note also the following method names are also used for sub country objects. (interface polymorphism for the technically minded). To avoid confusion, make sure that your chosen method is acting on the correct type of object.

    all_codes
    all_full_names
    code_full_name_hash
    full_name_code_hash




new Locale::SubCountry::World

The new method creates an instance of a world country object. This must be called before any of the following methods are invoked. The method takes no arguments.

full_name_code_hash (for world objects)

Given a world object, returns a hash of full name/code pairs for every country, keyed by country name.

code_full_name_hash for world objects)

Given a world object, returns a hash of full name/code pairs for every country, keyed by country code.

all_full_names (for world objects)

Given a world object, returns an array of all country full names, sorted alphabetically.

all_codes (for world objects)

Given a world object, returns an array of all country IS) 3166-1 codes, sorted alphabetically.

new Locale::SubCountry

The new method creates an instance of a sub country object. This must be called before any of the following methods are invoked. The method takes a single argument, the name of the country that contains the sub country that you want to work with. It may be specified either by the ISO 3166-1 two letter code or the full name. For example:

    AF - Afghanistan
    AL - Albania
    DZ - Algeria
    AO - Angola
    AR - Argentina
    AM - Armenia
    AU - Australia
    AT - Austria




If the code is specified, such as 'AU' the format may be in capitals or lower case If the full name is specified, such as 'Australia', the format must be in title case If a country name or code is specified that the module doesn't recognised, it will issue a warning.

country

Returns the current country of a sub country object, the format is title case

country_code

Given a sub country object, returns the two letter ISO 3166-1 code of the country

code

Given a sub country object, the code method takes the full name of a sub country and returns the sub country's ISO 3166-2 code. The full name can appear in mixed case. All white space and non alphabetic characters are ignored, except the single space used to separate sub country names such as "New South Wales". The code is returned as a capitalised string, or "unknown" if no match is found.

full_name

Given a sub country object, the full_name method takes the ISO 3166-2 code of a sub country and returns the sub country's full name. The code can appear in mixed case. All white space and non alphabetic characters are ignored. The full name is returned as a title cased string, such as "South Australia".

If an optional argument is supplied and set to a true value, the full name is returned as an upper cased string.

category

Given a sub country object, the category method takes the ISO 3166-2 code of a sub country and returns the sub country's category type. Examples are city, province,state and district. The category is returned as a capitalised string, or "unknown" if no match is found.

regional_division

Given a sub country object, the regional_division method takes the ISO 3166-2 code of a sub country and returns the sub country's regionional_division. This is, an alphanumeric code. The regional_division is returned as a capitalised string, or "unknown" if no match is found.

has_sub_countries

Given a sub country object, the has_sub_countries method returns 1 if the current country has sub countries, or 0 if it does not. Some small countries such as Singapore do not have sub countries.

FIPS10_4_code

Given a sub country object, the FIPS_10_4_code method takes the ISO 3166-2 code of a sub country and returns the sub country's FIPS 10-4 code, or the string 'unknown', if none exists. FIPS is a standard developed by the US government.

ISO3166_2_code

Given a sub country object, the ISO3166_2_code method takes the FIPS 10-4 code of a sub country and returns the sub country's ISO 3166-2 code, or the string 'unknown', if none exists.

full_name_code_hash (for subcountry objects)

Given a sub country object, returns a hash of all full name/code pairs, keyed by sub country name. If the country has no sub countries, returns undef.

code_full_name_hash (for subcountry objects)

Given a sub country object, returns a hash of all code/full name pairs, keyed by sub country code. If the country has no sub countries, returns undef.

all_full_names (for subcountry objects)

Given a sub country object, returns an array of all sub country full names, sorted alphabetically. If the country has no sub countries, returns undef.

all_codes (for subcountry objects)

Given a sub country object, returns an array of all sub country ISO 3166-2 codes, sorted alphabetically. If the country has no sub countries, returns undef.

SEE ALSO

Top

ISO 3166-1:1997 Codes for the representation of names of countries and their subdivisions - Part 1: Country codes

ISO 3166-2:1998 Codes for the representation of names of countries and their subdivisions - Part 2: Country subdivision code Also released as AS/NZS 2632.2:1999

Federal Information Processing Standards Publication 10-4 1995 April Specifications for COUNTRIES, DEPENDENCIES, AREAS OF SPECIAL SOVEREIGNTY, AND THEIR PRINCIPAL ADMINISTRATIVE DIVISIONS

http://www.statoids.com/statoids.html

Locale::Country,Lingua::EN::AddressParse, Geo::StreetAddress::USGeo::PostalAddressGeo::IP

LIMITATIONS

Top

ISO 3166-2:1998 defines all sub country codes as being up to 3 letters and/or numbers. These codes are commonly accepted for countries like the USA and Canada. In Australia this method of abbreviation is not widely accepted. For example, the ISO code for 'New South Wales' is 'NS', but 'NSW' is the abbreviation that is most commonly used. I could add a flag to enforce ISO-3166-2 codes if needed.

The ISO 3166-2 standard romanizes the names of provinces and regions in non-latin script areas, such as Russia and South Korea. One Romanisation is given for each province name. For Russia, the BGN (1947) Romanization is used.

Several sub country names have more than one code, and may not return the correct code for that sub country. These entries are usually duplicated because the name represents two different types of sub country, such as a province and a geographical unit. Examples are:

    AZERBAIJAN : Länkäran; LA (the City), LAN (the Rayon)
    AZERBAIJAN : Säki; SA,SAK
    AZERBAIJAN : Susa; SS,SUS
    AZERBAIJAN : Yevlax; YE,YEV
    INDONESIA  : Kalimantan Timur; KI,KT
    LAOS       : Vientiane VI,VT
    MOLDOVA    : Hahul; CA,CHL
    MOLDOVA    : Bubasari; DU,DBI
    MOLDOVA    : Hrhei; OR,OHI
    MOLDOVA    : Coroca; SO,SOA
    MOLDOVA    : Gngheni; UN,UGI
    MOZAMBIQUE : Maputo; MPM,L

FIPS codes are not provided for all sub countries.

BUGS

Top

None known

AUTHOR

Top

Locale::SubCountry was written by Kim Ryan <kimryan at cpan dot org>.

CREDITS

Top

Alastair McKinstry provided many of the sub country codes and names.

Terrence Brannon produced Locale::US, which was the starting point for this module.

Mark Summerfield and Guy Fraser provided the list of UK counties.

TJ Mather supplied the FIPS codes and many ammendments to the sub country data

COPYRIGHT AND LICENSE

Top


Locale-SubCountry documentation Contained in the Locale-SubCountry distribution.

#-------------------------------------------------------------------------------

use strict;
use warnings;
use locale;
use Exporter;
use Locale::SubCountry::Data;

#-------------------------------------------------------------------------------

package Locale::SubCountry::World;

# Define all the methods for the 'world' class here. Note that because the 
# name space inherits from the Locale::SubCountry name space, the
# package wide variables $::country_lookup and $Locale::SubCountry::subcountry_lookup are
# accessible.


#-------------------------------------------------------------------------------
# Create new instance of a SubCountry::World object

sub new
{
    my $class = shift;

    my $world = {};
    bless($world,$class);
    return($world);
}

#-------------------------------------------------------------------------------
# Returns a hash of code/name pairs for all countries, keyed by  country code.

sub code_full_name_hash
{
    my $world = shift;
    return(  %{ $Locale::SubCountry::country_lookup{_code_keyed} } );
}
#-------------------------------------------------------------------------------
# Returns a hash of name/code pairs for all countries, keyed by country name.

sub full_name_code_hash
{
    my $world = shift;
    return( %{ $Locale::SubCountry::country_lookup{_full_name_keyed} } );
}
#-------------------------------------------------------------------------------
# Returns sorted array of all country full names

sub all_full_names
{
    my $world = shift;
    return ( sort keys %{ $Locale::SubCountry::country_lookup{_full_name_keyed} });
}
#-------------------------------------------------------------------------------
# Returns sorted array of all two letter country codes

sub all_codes
{
    my $world = shift;
    return ( sort keys %{ $Locale::SubCountry::country_lookup{_code_keyed} });
}

#-------------------------------------------------------------------------------

package Locale::SubCountry;

our $VERSION = '1.47';


#-------------------------------------------------------------------------------
# Initialization code must be run first to create global data structure.
# Read in the list of abbreviations and full names defined in the
# Locale::SubCountryData package

{

    unless ( $Locale::SubCountry::Data::xml_data )
    {
      die "Could not locate Locale::SubCountry::Data::xml_data variable";
    }
    
    # Get all the data from the Locale::SubCountryData pakage and place into an array of lines
    my @lines = split(/\n/,$Locale::SubCountry::Data::xml_data);
 
    while ( @lines )
    {
        my $current_line = shift(@lines);
       
        # Data is in XML format, use a simple parser to extract it
        my ($country_name,$country_code);
        if ( $current_line =~ /<country>/ )
        {
            # start of a  /<country> .. </country> block
            my $country_finished = 0;
            until ( $country_finished )
            {
                $current_line = shift(@lines);
                if ( $current_line =~ /\s*<name>(.*)<\/name>/ )
                {
                    $country_name = $1;
                }
                elsif ( $current_line =~ /\s*<code>(.*)<\/code>/ )
                {
                    $country_code = $1;
                }
                elsif ( $current_line =~ /<subcountry>/ )
                {
                    my $sub_country_finished = 0;

                    my ($sub_country_name,$sub_country_code,$category,$regional_division,$FIPS_code);
                    until ( $sub_country_finished )
                    {
                        $current_line = shift(@lines);
                        if ( $current_line =~ /\s*<name>(.*)<\/name>/ )
                        {
                            $sub_country_name = $1;
                        }
                        elsif ( $current_line =~ /\s*<code>(.*)<\/code>/ )
                        {
                            $sub_country_code = $1;
                        }
                        elsif ( $current_line =~ /\s*<FIPS>(.*)<\/FIPS>/ )
                        {
                            $FIPS_code = $1;
                        }
                        elsif ( $current_line =~ /\s*<category>(.*)<\/category>/ )
                        {
                            $category = $1;
                        }
                        elsif ( $current_line =~ /\s*<regional_division>(.*)<\/regional_division>/ )
                        {
                            $regional_division = $1;
                        }
                        elsif ( $current_line =~ /<\/subcountry>/ )
                        {
                            
                            $sub_country_finished = 1;
                            
                            # Some sub countries have no ISO code, such as Shariff Kabunsuan in the
                            # Phillipines. Only index sub country if it has a code
                            if ( defined $sub_country_code )
                            {
                                # Insert into doubly indexed hash, grouped by country for ISO 3166-2 
                                # codes. One hash is keyed by abbreviation and one by full name. Although 
                                # data is duplicated, this provides the fastest lookup and simplest code.
                
                                $Locale::SubCountry::subcountry_lookup{$country_name}{_code_keyed}{$sub_country_code} = $sub_country_name;
                                $Locale::SubCountry::subcountry_lookup{$country_name}{_full_name_keyed}{$sub_country_name} = $sub_country_code;
                            }

                            if ( $category )
                            {
                                $Locale::SubCountry::subcountry_lookup{$country_name}{$sub_country_code}{_category} = $category;
                            }
            
                            if ( $regional_division )
                            {
                                $Locale::SubCountry::subcountry_lookup{$country_name}{$sub_country_code}{_regional_division} = $regional_division;
                            }

                            if ( $FIPS_code )
                            {
                                # Insert into doubly indexed hash, grouped by country for FIPS 10-4 codes
                                $Locale::SubCountry::subcountry_lookup{$country_name}{_FIPS10_4_code_keyed}{$FIPS_code} = $sub_country_code;
                                $Locale::SubCountry::subcountry_lookup{$country_name}{_ISO3166_2_code_keyed}{$sub_country_code} = $FIPS_code;
                            }
                        }
                        else
                        {
                            print "Badly formed sub country data in $country_name\n";
                            print $current_line,"\n";
                            die;
                        }
                    }
                }
                elsif ( $current_line =~ /<\/country>/ )
                {
                    $country_finished = 1;

                    # Create doubly indexed hash, keyed by  country code and full name.
                    # The user can supply either form to create a new sub_country
                    # object, and the objects properties will hold both the countries
                    # name and it's code.
    
                    $Locale::SubCountry::country_lookup{_code_keyed}{$country_code} = $country_name;
                    $Locale::SubCountry::country_lookup{_full_name_keyed}{$country_name} = $country_code;

                }
                else
                {
                    print "Badly formed country data in $country_name\n";
                    print $current_line,"\n";
                    die;                    
                }
            }
        
        }
    }
    
    # use Data::Dumper;
    # print Dumper(\%{ $Locale::SubCountry::subcountry_lookup{_full_name_keyed} });
    # die;
}

#-------------------------------------------------------------------------------
# Create new instance of a sub country object

sub new
{
    my $class = shift;
    my ($country_or_code) = @_;


    my ($country,$country_code);
    

    # Country may be supplied either as a two letter code, or the full name
    if ( length($country_or_code) == 2 )
    {
        $country_or_code = uc($country_or_code); # lower case codes may be used, so fold to upper case
        if ( $Locale::SubCountry::country_lookup{_code_keyed}{$country_or_code} )
        {
            $country_code = $country_or_code;
            # set country to it's full name
            $country = $Locale::SubCountry::country_lookup{_code_keyed}{$country_code};
         }
        else
        {
          warn "Invalid country code: $country_or_code chosen";
          return(undef);
        }
    }
    else
    {
        if ( $Locale::SubCountry::country_lookup{_full_name_keyed}{$country_or_code} )
        {
            $country = $country_or_code;
            $country_code = $Locale::SubCountry::country_lookup{_full_name_keyed}{$country};
        }
        else
        {
            warn "Invalid country name: $country_or_code chosen, names must be in title case";
            return(undef);

        }
    }

    my $sub_country = {};
    bless($sub_country,$class);
    $sub_country->{_country} = $country;
    $sub_country->{_country_code} = $country_code;


    return($sub_country);
}

#-------------------------------------------------------------------------------
# Returns the current country of the sub country object

sub country
{
    my $sub_country = shift;
    return( $sub_country->{_country} );
}
#-------------------------------------------------------------------------------
# Returns the current country code of the sub country object

sub country_code
{
    my $sub_country = shift;
    return( $sub_country->{_country_code} );
}

#-------------------------------------------------------------------------------
# Given the full name for a sub country, return the ISO 3166-2 code

sub code
{
    my $sub_country = shift;
    my ($full_name) = @_;

    my $orig = $full_name;

    $full_name = _clean($full_name);

    my $code = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_full_name_keyed}{$full_name};

    # If a code wasn't found, it could be because the user's capitalization
    # does not match the one in the look up data of this module. For example,
    # the user may have supplied the sub country "Ag R" (in Turkey) but the
    # ISO standard defines the spelling as "Ag r".

    unless ( defined $code )
    {
        # For every sub country, compare upper cased full name supplied by user
        # to upper cased full name from lookup hash. If they match, return the
        # correctly cased full name from the lookup hash.

        my @all_names = $sub_country->all_full_names;
        my $current_name;
        foreach $current_name ( @all_names )
        {
            if ( uc($full_name) eq uc($current_name) )
            {
                $code = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_full_name_keyed}{$current_name};
            }
        }
    }

    if ( defined $code )
    {
        return($code);
    }
    else
    {
        return('unknown');
    }
}
#-------------------------------------------------------------------------------
# Given the ISO 3166-2 code for a sub country, return the FIPS 104-4 code.

sub FIPS10_4_code
{
    my $sub_country = shift;
    my ($code) = @_;

    $code = _clean($code);
    $code = uc($code);

    my $FIPS_code = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_ISO3166_2_code_keyed}{$code};

    if ( $FIPS_code )
    {
        return($FIPS_code);
    }
    else
    {
        return('unknown');
    }
}
#-------------------------------------------------------------------------------
# Given the FIPS 10-4 code for a sub country, return the ISO 3166-2 code.

sub ISO3166_2_code
{
    my $sub_country = shift;
    my ($FIPS_code) = @_;

    $FIPS_code = _clean($FIPS_code);

    my $code = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_FIPS10_4_code_keyed}{$FIPS_code};

    if ( $code )
    {
        return($code);
    }
    else
    {
        return('unknown');
    }
}
#-------------------------------------------------------------------------------
# Given the ISO 3166-2 code for a sub country, return the category,
# being state, province, city, council etc

sub category
{
    my $sub_country = shift;
    my ($code) = @_;

    $code = _clean($code);

    my $category = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{$code}{_category};

    if ( $category )
    {
        return($category);
    }
    else
    {
        return('unknown');
    }
}

#-------------------------------------------------------------------------------
# Given the ISO 3166-2 code for a sub country, return the regional division,

sub regional_division
{
    my $sub_country = shift;
    my ($code) = @_;

    $code = _clean($code);

    my $regional_division = $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{$code}{_regional_division};

    if ( $regional_division )
    {
        return($regional_division);
    }
    else
    {
        return('unknown');
    }
}

#-------------------------------------------------------------------------------
# Given the ISO 3166-2 code for a sub country, return the full name.
# Parameters are the code and a flag, which if set to true
# will cause the full name to be uppercased

sub full_name
{
    my $sub_country = shift;
    my ($code,$uc_name) = @_;

    $code = _clean($code);
    $code = uc($code);

    my $full_name = 
        $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_code_keyed}{$code};
    if ( $uc_name )
    {
        $full_name = uc($full_name);
    }

    if ( $full_name )
    {
        return($full_name);
    }
    else
    {
        return('unknown');
    }
}
#-------------------------------------------------------------------------------
# Returns 1 if the current country has sub countries. otherwise 0.

sub has_sub_countries
{
    my $sub_country = shift;
    if ( $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_code_keyed} )
    {
        return(1);
    }
    else
    {
        return(0);
    }
}
#-------------------------------------------------------------------------------
# Returns a hash of code/full name pairs, keyed by sub country code.

sub code_full_name_hash
{
    my $sub_country = shift;
    if ( $sub_country->has_sub_countries )
    {
        return( %{ $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_code_keyed} } );
    }
    else
    {
        return(undef);
    }
}
#-------------------------------------------------------------------------------
# Returns a hash of name/code pairs, keyed by sub country name.

sub full_name_code_hash
{
    my $sub_country = shift;
    if ( $sub_country->has_sub_countries )
    {
        return( %{ $Locale::SubCountry::subcountry_lookup{$sub_country->{_country}}{_full_name_keyed} } );
    }
    else
    {
        return(undef);
    }
}
#-------------------------------------------------------------------------------
# Returns sorted array of all sub country full names for the current country

sub all_full_names
{
    my $sub_country = shift;
    if ( $sub_country->full_name_code_hash )
    {
        my %all_full_names = $sub_country->full_name_code_hash;
        if ( %all_full_names )
        {
            return( sort keys %all_full_names );
        }
    }
    else
    {
        return(undef);
    }
}
#-------------------------------------------------------------------------------
# Returns sorted array of all sub country ISO 3166-2 codes for the current country

sub all_codes
{
    my $sub_country = shift;

    if ( $sub_country->code_full_name_hash )
    {
        my %all_codes = $sub_country->code_full_name_hash;
        return( sort keys %all_codes );
    }
    else
    {
        return(undef);
    }
}

#-------------------------------------------------------------------------------
sub _clean
{
    my ($input_string) = @_;

    if ( $input_string =~ /[\. ]/ )
    {
        # remove dots
        $input_string =~ s/\.//go;

        # remove repeating spaces
        $input_string =~ s/  +/ /go;

        # remove any remaining leading or trailing space
        $input_string =~ s/^ //;
        $input_string =~ s/ $//;
    }

    return($input_string);
}

return(1);