This module generates a canonical string (a signature) by converting Roman numerals to digits, converting English descriptions of numbers to digits, stripping accents from characters (and folding digraph spellings, e.g. oe = ö), replacing words with symbols (e.g. and = &, plus = +), and removing common variant endings.
In short, the module generates the same signature for all of the strings on each of the following lines:
bjørk = björk = bjoerk = bjork
1,000 maniacs = one thousand maniacs = 1k maniacs
Boyz II Men = Boyz To Men = Boyz 2 Men
ACDC = AC/DC = AC-DC
Rubin and company = Rubin & Company = Rubin & Co.
Third Eye Blind = 3rd eye blind
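
To make the idea concrete, here is a minimal sketch of this kind of canonicalization. It is written in Python purely for illustration and is not the module's implementation or API: the function name canonical, the lookup tables, and the choice to join tokens without separators are all assumptions. It covers only a few of the steps described above (accent stripping, the oe digraph, a handful of Roman numerals, and word-to-symbol replacement), and does not handle number words, ordinals, or variant endings.

```python
import re
import unicodedata

# Hypothetical lookup tables, chosen only for this sketch.
# Letters that NFKD normalization does not decompose need an explicit mapping.
SPECIAL_LETTERS = str.maketrans({"ø": "o", "æ": "ae", "ß": "ss"})

# Word -> symbol and a deliberately tiny Roman numeral table.
# Single letters such as "i", "v", "x" and pairs such as "dc" are left out
# on purpose, because they collide with ordinary tokens (e.g. AC/DC).
TOKEN_MAP = {
    "and": "&", "plus": "+",
    "ii": "2", "iii": "3", "iv": "4",
    "vi": "6", "vii": "7", "viii": "8", "ix": "9",
}

def canonical(text: str) -> str:
    """Return a crude signature for text (illustrative only)."""
    text = text.lower().translate(SPECIAL_LETTERS)
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Crude digraph folding: treats "oe" like "ö" everywhere it occurs.
    text = text.replace("oe", "o")
    # Keep letters, digits, and the two symbols; everything else separates tokens.
    tokens = re.findall(r"[a-z0-9&+]+", text)
    # Join without separators so punctuation variants collapse together.
    return "".join(TOKEN_MAP.get(tok, tok) for tok in tokens)

if __name__ == "__main__":
    for group in (["bjørk", "björk", "bjoerk", "bjork"],
                  ["ACDC", "AC/DC", "AC-DC"],
                  ["Boyz II Men", "Boyz 2 Men"],
                  ["Rubin and company", "Rubin & Company"]):
        print({canonical(s) for s in group})
```

Each group printed by the sketch collapses to a single signature. Table-driven token replacement keeps each rule easy to inspect, at the cost of false positives (for example, the blanket oe folding also alters ordinary English words), which is acceptable only because signatures are compared with each other rather than shown to users.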
INSTALLATION
To install this module, type the following:
perl Makefile.PL
make
make test
make install
DOCUMENTATION
Full documentation is available in the module's POD.