Rose::HTML::Util - Utility functions for manipulating HTML.


Rose-HTML-Objects documentation Contained in the Rose-HTML-Objects distribution.

NAME

Rose::HTML::Util - Utility functions for manipulating HTML.

SYNOPSIS

    use Rose::HTML::Util qw(:all);

    $esc = escape_html($str);
    $str = unescape_html($esc);

    $esc = escape_uri($str);
    $str = unescape_uri($esc);

    $comp = escape_uri_component($str);

    $esc = encode_entities($str);

DESCRIPTION

Rose::HTML::Util provides aliases and wrappers for common HTML manipulation functions. When running in a mod_perl 1.x web server environment, Apache's C-based implementation is used where one is available (currently for escape_uri_component()).

This all may seem silly, but I like to be able to pull these functions from a single location and get the fastest possible versions.

EXPORTS

Rose::HTML::Util does not export any function names by default.

The 'all' tag:

    use Rose::HTML::Util qw(:all);

will cause the following function names to be imported:

    escape_html()
    unescape_html()
    escape_uri()
    escape_uri_component()
    encode_entities()
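
According to the source listing further down this page, strip_html and html_attrs_string appear in @EXPORT_OK but not in the 'all' tag, so they would have to be imported by name. The sketch below illustrates that Exporter behavior with a hypothetical stand-in package (Demo::Util is an assumption for illustration and is not part of Rose-HTML-Objects); only the shape of its export lists mirrors the real module.

```perl
use strict;
use warnings;

# Demo::Util is a hypothetical stand-in, not part of Rose-HTML-Objects.
# Only the shape of its export lists mirrors Rose::HTML::Util.
package Demo::Util;
use Exporter 'import';

our @EXPORT_OK   = qw(escape_html strip_html);    # importable by name
our %EXPORT_TAGS = (all => [ qw(escape_html) ]);  # what ':all' expands to

sub escape_html { return "escaped: $_[0]" }       # placeholder bodies
sub strip_html  { return "stripped: $_[0]" }

package main;

# Runtime equivalent of "use Demo::Util qw(:all);" for an inline package.
Demo::Util->import(':all');

my $has_escape = defined &main::escape_html ? 1 : 0;
my $has_strip  = defined &main::strip_html  ? 1 : 0;
print "after :all     escape_html=$has_escape strip_html=$has_strip\n";

# A function outside the tag must be requested by name.
Demo::Util->import('strip_html');
my $has_strip_now = defined &main::strip_html ? 1 : 0;
print "after by-name  strip_html=$has_strip_now\n";
```

With the real module, the by-name form would presumably be written as use Rose::HTML::Util qw(:all strip_html html_attrs_string);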

FUNCTIONS

escape_html STRING [, UNSAFE]

This function passes its arguments to HTML::Entities::encode_entities(). If the list of unsafe characters is omitted, it defaults to these four characters: < > & "
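
As a rough illustration of the default described above, the sketch below calls HTML::Entities directly with the same four-character unsafe set, then with a caller-supplied set. It assumes HTML::Entities is installed; the sample string is made up for the example.

```perl
use strict;
use warnings;
use HTML::Entities ();

# Hypothetical sample input.
my $str = q{<a href="x">Tom & Jerry's</a>};

# The default unsafe set named in the documentation: < > & "
my $esc = HTML::Entities::encode_entities($str, '<>&"');
print "$esc\n";      # &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry's&lt;/a&gt;

# A caller-supplied set replaces the default: here only "<" is escaped.
my $lt_only = HTML::Entities::encode_entities($str, '<');
print "$lt_only\n";  # &lt;a href="x">Tom & Jerry's&lt;/a>
```

Note that the apostrophe is left alone in both cases because it is not in either unsafe set.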

unescape_html STRING

This function is an alias for HTML::Entities::decode().
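
A minimal round-trip sketch, assuming HTML::Entities is installed. It uses decode_entities(), the name under which HTML::Entities documents its decoder; the sample strings are made up for the example.

```perl
use strict;
use warnings;
use HTML::Entities ();

# Hypothetical sample input.
my $orig = q{<p class="x">Tom & Jerry</p>};

# Escape with the four-character set, then decode again.
my $esc  = HTML::Entities::encode_entities($orig, '<>&"');
my $back = HTML::Entities::decode_entities($esc);

print "$esc\n";   # &lt;p class=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/p&gt;
print "$back\n";  # <p class="x">Tom & Jerry</p>

# Numeric character references are decoded as well.
my $numeric = HTML::Entities::decode_entities('&#65;&#x42;');
print "$numeric\n";  # AB
```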

escape_uri STRING

This is a wrapper for URI::Escape::uri_escape() (uri_escape in URI::Escape) that is intended to escape entire URIs. Example:

    $str = 'http://foo.com/bar?baz=1%&blay=foo bar';
    $esc = escape_uri($str);

    print $esc; # http://foo.com/bar?baz=1%25&blay=foo%20bar

In other words, it tries to escape all characters that need to be escaped in a URI except those characters that are legitimately part of the URI: forward slashes, the question mark before the query, etc.

The current implementation escapes all characters except those in this set:

    A-Za-z0-9\-_.,'!~*#?&()/?@:[]=

Note that the URI-escaped string is not HTML-escaped. To make a URI safe to include in an HTML page, call escape_html() as well:

    $h = '<a href="' . escape_html(escape_uri($str)) . '">foo</a>';
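
To show why both steps matter, the sketch below reproduces the two-stage escaping with URI::Escape and HTML::Entities called directly (assuming both are installed). The character set is the one listed above, negated with a leading "^" as in the module's source listing; the variable names are made up for the example.

```perl
use strict;
use warnings;
use URI::Escape ();
use HTML::Entities ();

# Character set from the list above, negated so that everything
# outside it gets escaped.
my $uri_safe = q(^A-Za-z0-9\-_.,'!~*#?&()/?@\:\[\]=);

my $str  = 'http://foo.com/bar?baz=1%&blay=foo bar';
my $uri  = URI::Escape::uri_escape($str, $uri_safe);      # URI-escape first
my $attr = HTML::Entities::encode_entities($uri, '<>&"'); # then HTML-escape

print "$uri\n";   # http://foo.com/bar?baz=1%25&blay=foo%20bar
print "$attr\n";  # http://foo.com/bar?baz=1%25&amp;blay=foo%20bar
print qq(<a href="$attr">foo</a>\n);
```

The "&" separating the query parameters survives URI escaping and only becomes "&amp;" in the HTML-escaping step.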

escape_uri_component STRING

When running under mod_perl 1.x, this is an alias for Apache::Util::escape_uri() (escape_uri in Apache::Util). Otherwise, it's an alias for URI::Escape::uri_escape() (uri_escape in URI::Escape).
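
The difference between component escaping and whole-URI escaping can be sketched as follows. This covers only the non-mod_perl case and calls URI::Escape directly (assuming it is installed); the sample value and URL are made up for the example.

```perl
use strict;
use warnings;
use URI::Escape ();

# Hypothetical query-parameter value containing URI delimiters.
my $value = 'foo bar&baz=1/2';

# Component-style escaping: uri_escape() with its default unsafe set,
# so the delimiters "&", "=" and "/" are escaped too.
my $component = URI::Escape::uri_escape($value);
print "$component\n";  # foo%20bar%26baz%3D1%2F2

# Whole-URI escaping with the set listed for escape_uri() above,
# which leaves those delimiters intact.
my $whole = URI::Escape::uri_escape($value, q(^A-Za-z0-9\-_.,'!~*#?&()/?@\:\[\]=));
print "$whole\n";      # foo%20bar&baz=1/2

# A component is meant to be embedded in a larger URI.
my $url = 'http://foo.com/bar?q=' . $component;
print "$url\n";        # http://foo.com/bar?q=foo%20bar%26baz%3D1%2F2
```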

encode_entities STRING [, UNSAFE]

This function passes its arguments to HTML::Entities::encode_entities(). If the list of unsafe characters is omitted, it defaults to these four characters: < > & "

AUTHOR

John C. Siracusa (siracusa@gmail.com)

LICENSE

Copyright (c) 2010 by John C. Siracusa. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.


package Rose::HTML::Util;

use strict;

require Exporter;
our @ISA = qw(Exporter);

our @EXPORT_OK = 
  qw(escape_html unescape_html escape_uri escape_uri_component
     encode_entities strip_html html_attrs_string);

our %EXPORT_TAGS = 
(
  all => 
  [
    qw(escape_html unescape_html escape_uri escape_uri_component 
       encode_entities) 
  ]
);

use HTML::Entities();
use URI::Escape;

if(exists $ENV{'MOD_PERL'} && require mod_perl && $mod_perl::VERSION < 1.99)
{
  require Apache::Util;

  #*escape_html   = \&HTML::Entities::encode;
  *escape_html   = \&encode_entities;
  *unescape_html = \&HTML::Entities::decode;
  *escape_uri_component = \&Apache::Util::escape_uri;
}
else
{
  #*escape_html   = \&HTML::Entities::encode;
  *escape_html   = \&encode_entities;
  *unescape_html = \&HTML::Entities::decode;
  *escape_uri_component = \&URI::Escape::uri_escape;
}

our $VERSION = '0.011';

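# Encode STRING via HTML::Entities, defaulting the unsafe set to <>&"
sub encode_entities { HTML::Entities::encode_entities($_[0], @_ > 1 ? $_[1] : '<>&"') }

# Escape a whole URI. With one argument, use this module's own set of
# characters to leave alone; a defined second argument replaces that set,
# and an explicit undef falls back to uri_escape()'s default.
sub escape_uri
{
  URI::Escape::uri_escape($_[0],
    (@_ > 1) ? (defined $_[1] ? $_[1] : ()) : q(^A-Za-z0-9\-_.,'!~*#?&()/?@\:\[\]=));
}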

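# Build an HTML attribute string (with a leading space) from a hash
# reference or a flat list of name/value pairs, sorted by name.
# Values are HTML-escaped; names are not.
sub html_attrs_string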
{
  my %attrs;

  if(@_ == 1 && ref $_[0] eq 'HASH')
  {
    %attrs = %{$_[0]};
  }
  elsif(@_ && @_ % 2 == 0)
  {
    %attrs = @_;
  }

  return '' unless(keys %attrs);

  return ' ' . join(' ', map { $_  . q(=") . escape_html($attrs{$_}) . q(") }
                         sort keys(%attrs));
}

sub strip_html
{
  my($text) = shift;

  # XXX: dumb for now...
  $text =~ s{<[^>]*?/?>}{}g;

  return $text;
}

1;


__END__