URI::Find::Schemeless - Find schemeless URIs in arbitrary text.


Contained in the URI-Find distribution.


NAME

URI::Find::Schemeless - Find schemeless URIs in arbitrary text.

SYNOPSIS

  require URI::Find::Schemeless;

  my $finder = URI::Find::Schemeless->new(\&callback);

  # The rest is the same as URI::Find.




DESCRIPTION

URI::Find finds absolute URIs in plain text, with only weak heuristics for schemeless URIs. This subclass finds things in free text that might be URIs, such as "www.foo.com" and "lifes.a.bitch.if.you.aint.got.net".

The heuristics are designed to keep false positives to a minimum, but there is no easy way for it to know whether "COMMAND.COM" refers to a web site or a file.
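
As a rough illustration of the behaviour described above, the following sketch runs the finder over a short string and records each candidate. The sample text and the @found array are assumptions made for this example; the callback arguments (a URI object and the matched text) and the find() method follow the URI::Find interface that this class inherits.

```perl
use strict;
use warnings;
use URI::Find::Schemeless;

# Sample input, invented for this sketch.
my $text = 'See www.example.com for details.';

my @found;
my $finder = URI::Find::Schemeless->new(sub {
    my ($uri, $original_text) = @_;

    # $uri is a URI object; $original_text is what appeared in the input.
    push @found, [ "$uri", $original_text ];

    # Returning the original text leaves the input unchanged.
    return $original_text;
});

# find() takes a reference to the text and returns the number of URIs found.
my $count = $finder->find(\$text);

printf "%d candidate(s)\n", $count;
printf "%s (matched as '%s')\n", @$_ for @found;
```

Because the class can only guess, treat each hit as a candidate and filter it in the callback if the input is likely to contain file names or similar dotted words.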

top_level_domain_re

  my $tld_re = $self->top_level_domain_re;

Returns the regex for matching top-level DNS domains. The regex should not be anchored, should not contain capturing groups, and should ignore case on its own.
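
A minimal sketch of a regex that meets these three constraints, returned from a subclass. The package name My::StrictFinder and the three-entry domain list are hypothetical, and the sketch assumes the finder consults this method when it builds its pattern, as the accessor above suggests.

```perl
use strict;
use warnings;

package My::StrictFinder;    # hypothetical subclass name
use parent 'URI::Find::Schemeless';

sub top_level_domain_re {
    # (?i:...) is a non-capturing group that carries its own
    # case-insensitivity, and the pattern has no ^ or $ anchors.
    return qr/(?i:com|net|org)/;
}

package main;

my $tld_re = My::StrictFinder->top_level_domain_re;

# The caller supplies the anchoring, which is why the regex itself has none.
my %verdict;
for my $host (qw(www.example.com WWW.EXAMPLE.ORG command.exe)) {
    $verdict{$host} = $host =~ /\.$tld_re\z/ ? 'accepted' : 'rejected';
    print "$host: $verdict{$host}\n";
}
```

An object of the subclass is constructed the same way as in the SYNOPSIS, with a callback passed to new.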

AUTHOR

Original code by Roderick Schertler <roderick@argon.org>, adapted by Michael G Schwern <schwern@pobox.com>.

Currently maintained by Roderick Schertler <roderick@argon.org>.

SEE ALSO

URI::Find

