Text::Phonetic::Koeln - Kölner Phonetik algorithm


Contained in the Text-Phonetic distribution.

NAME

Text::Phonetic::Koeln - Kölner Phonetik algorithm

DESCRIPTION

The "Kölner Phonetik" is a phonetic algorithm for indexing names by sound, as pronounced in German. The goal is for names with the same pronunciation to be encoded to the same representation so that they can be matched despite minor differences in spelling.
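As a minimal usage sketch, the following encodes several common spellings of one German surname and checks whether they collapse to a single code. It assumes the `new` and `encode` methods that the Text::Phonetic base class provides to its algorithm modules; the sample names are illustrative and not taken from this documentation.

```perl
use strict;
use warnings;
use Text::Phonetic::Koeln;

# Assumption: Text::Phonetic::Koeln follows the usual Text::Phonetic
# interface (a new() constructor and an encode() method).
my $koeln = Text::Phonetic::Koeln->new();

# Illustrative spellings of one surname (plain ASCII on purpose).
my @spellings = qw(Meier Maier Meyer Mayr);

my %code_for;
for my $name (@spellings) {
    # Encode one string at a time and keep the resulting code.
    my $code = $koeln->encode($name);
    $code_for{$name} = $code;
    printf "%-6s => %s\n", $name, $code;
}

# Count the distinct codes; one distinct code means all spellings match.
my %distinct = map { ($_ => 1) } values %code_for;
print "all spellings share one code\n" if keys(%distinct) == 1;
```

Comparing the codes with plain string equality is usually enough for matching, because the code is the only thing the algorithm produces.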

In contrast to Soundex, this algorithm is suitable for long names because the length of the encoded result is not limited. It finds almost all orthographic variations of a name, but it also produces many false positives.

The result is always a sequence of digits. Special characters and whitespace are ignored. If your text might contain non-Latin characters (other than the German umlauts and 'ß'), you should unaccent it before creating a phonetic code.
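One possible way to do that preprocessing is sketched below using the core module Unicode::Normalize. The helper name `unaccent_keep_german` is hypothetical and not part of Text-Phonetic. It strips combining marks after canonical decomposition but keeps the diaeresis on a, o and u, so German umlauts survive. 'ß' has no decomposition and passes through unchanged, as do letters such as 'ø' or 'Ł', which would need separate handling.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw(NFD NFC);

# Hypothetical helper (not part of Text-Phonetic): remove diacritics
# but keep the German umlauts.
sub unaccent_keep_german {
    my ($text) = @_;

    # Decompose into base letters followed by combining marks.
    my $decomposed = NFD($text);

    # Keep "a/o/u + combining diaeresis" (U+0308) as is and delete
    # every other non-spacing mark.
    $decomposed =~ s/([AOUaou]\x{0308})|\p{Mn}/defined $1 ? $1 : ''/ge;

    # Recompose so that the umlauts become single characters again.
    return NFC($decomposed);
}

binmode(STDOUT, ':encoding(UTF-8)');
print unaccent_keep_german($_), "\n" for ('Dvořák', 'Müller', 'Çelik');
```

The cleaned string can then be passed to the encoder in place of the raw input.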

AUTHOR

    Maroš Kollár
    CPAN ID: MAROS
    maros [at] k-1.com
    http://www.k-1.com

COPYRIGHT

SEE ALSO

A description of the algorithm can be found at http://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik

Hans Joachim Postel: Die Kölner Phonetik. Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse. in: IBM-Nachrichten, 19. Jahrgang, 1969, S. 925-931
