| Text-Widont documentation | Contained in the Text-Widont distribution. |
Text::Widont - Suppress typographic widows
use Text::Widont;
# For a single string...
my $string = 'Look behind you, a Three-Headed Monkey!';
print widont($string, nbsp->{html}); # "...a Three-Headed&nbsp;Monkey!"
# For a number of strings...
my $strings = [
'You fight like a dairy farmer.',
'How appropriate. You fight like a cow.',
];
print join "\n", @{ widont( $strings, nbsp->{html} ) };
Or the object oriented way:
use Text::Widont qw( nbsp );
my $tw = Text::Widont->new( nbsp => nbsp->{html} );
my $string = "I'm selling these fine leather jackets.";
print $tw->widont($string); # "...fine leather&nbsp;jackets."
Collins English Dictionary defines a "widow" in typesetting as:
A short line at the end of a paragraph, especially one that occurs as the
top line of a page or column.
For example, in the text...
How much wood could a woodchuck
chuck if a woodchuck could chuck
wood?
...the word "wood" at the end is considered a widow. Using Text::Widont,
that sentence would instead appear as...
How much wood could a woodchuck
chuck if a woodchuck could
chuck wood?
Text::Widont exports a hash ref, nbsp, that contains the following
representations of a non-breaking space to be used with the widont function:
html - The &nbsp; HTML character entity.
html_dec - The &#160; HTML character entity.
html_hex - The &#xA0; HTML character entity.
unicode - Unicode's "No-Break Space" character.
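As a rough illustration of how these four representations relate, the sketch below derives each one from the code point U+00A0. The key names html, html_dec, html_hex and unicode are those of the module's nbsp constant; the %nbsp_sketch hash and the entity strings are assumptions made for this example (the standard HTML references for U+00A0), not values read from the module.

```perl
use strict;
use warnings;

# U+00A0 NO-BREAK SPACE. The key names mirror the module's nbsp constant;
# the entity strings are assumed standard HTML references, not module output.
my $code_point = 0x00A0;

my %nbsp_sketch = (
    html     => '&nbsp;',                          # named entity
    html_dec => sprintf( '&#%d;',  $code_point ),  # decimal reference
    html_hex => sprintf( '&#x%X;', $code_point ),  # hexadecimal reference
    unicode  => chr($code_point),                  # the character itself
);

for my $key ( sort keys %nbsp_sketch ) {
    my $value = $nbsp_sketch{$key};

    # Print the raw character as its code point so the output stays readable.
    $value = sprintf 'U+%04X', ord $value if $key eq 'unicode';
    printf "%-8s => %s\n", $key, $value;
}
```

The three HTML forms suit text that a browser will render, while the raw character is likely the better fit for plain-text or already-decoded output.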
The widont function, called as widont( $string, $nbsp ), takes a string and
returns a copy with the space between the final two words replaced with the
given $nbsp. $string can optionally be a reference to an array of strings to
transform. In that case the strings are modified in place and the same array
reference is returned.
In the absence of an explicit $nbsp, Unicode's No-Break Space character
will be used.
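The behaviour described above can be sketched without the module installed. widont_sketch below is a hypothetical stand-in written for this example, not the module's own function; it applies the same substitution pattern that appears in the module source and, as an assumption of the sketch, treats only an undefined $nbsp as "absent".

```perl
use strict;
use warnings;

# Hypothetical stand-in for widont(); not part of Text::Widont itself.
sub widont_sketch {
    my ( $string, $nbsp ) = @_;
    die "widont_sketch requires a string\n" if !defined $string;

    # Fall back to U+00A0 NO-BREAK SPACE when no separator is given.
    $nbsp = "\x{00A0}" if !defined $nbsp;

    # $_ aliases each array element, so an array ref is edited in place;
    # a plain string is only changed in this sub's private copy.
    for ( ref $string eq 'ARRAY' ? @$string : $string ) {
        s/([^\s])\s+([^\s]+\s*)$/$1$nbsp$2/;
    }
    return $string;
}

# Single string: the caller's variable is left alone.
my $title  = 'The Secret of Monkey Island';
my $result = widont_sketch( $title, '&nbsp;' );
print "$title\n";
print "$result\n";

# Array reference: the elements themselves are rewritten.
my $lines    = [ 'You fight like a dairy farmer.', 'How appropriate.' ];
my $returned = widont_sketch( $lines, '&nbsp;' );
print "$_\n" for @$lines;
print "same reference returned\n" if $returned == $lines;
```

In this sketch a single string comes back as a modified copy, while an array reference comes back as the very reference that was passed in, with its elements already changed.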
Text::Widont also provides an object oriented interface.
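The new constructor instantiates a new Text::Widont object. nbsp is an
optional argument that will be used when performing the substitution; it may
be a literal string or one of the key names html, html_dec, html_hex or
unicode. It defaults to Unicode's No-Break Space character.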
The widont method performs the substitution described above, using the
object's nbsp property and the given string.
Text::Widont requires the following modules, all loaded at the top of its source: Carp, Exporter (via base) and constant.
Please report any bugs or feature requests to
bug-text-widont at rt.cpan.org, or through the web interface at
http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Text-Widont.
You can find documentation for this module with the perldoc command.
perldoc Text::Widont
Dave Cardwell <dcardwell@cpan.org>
I was first introduced to the concept of typesetting widows and how they might be solved programmatically by Shaun Inman.
http://www.shauninman.com/archive/2006/08/22/widont_wordpress_plugin
Copyright (c) 2007 Dave Cardwell. All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
package Text::Widont;

use strict;
use warnings;

use Carp qw( croak );

our $VERSION = '0.01';

# By default export the 'widont' function and 'nbsp' constant.
use base qw/ Exporter /;
our @EXPORT = qw( widont nbsp );

use constant nbsp => {
    html     => '&nbsp;',
    html_dec => '&#160;',
    html_hex => '&#xA0;',
    unicode  => pack( 'U', 0x00A0 ),
};

# This function also acts as an object method as described in the next POD
# section.
sub widont {
    my ( $self, $string, $nbsp );
    $string = shift;

    # Check to see if the subroutine has been called as an object method...
    if ( ref $string eq 'Text::Widont' ) {
        $self   = $string;
        $string = shift;
        $nbsp   = $self->{nbsp} eq 'html'     ? nbsp->{html}
                : $self->{nbsp} eq 'html_dec' ? nbsp->{html_dec}
                : $self->{nbsp} eq 'html_hex' ? nbsp->{html_hex}
                : $self->{nbsp} eq 'unicode'  ? nbsp->{unicode}
                :                               $self->{nbsp};
    }

    # Make sure a $string was passed...
    croak 'widont requires a string' if !defined $string;

    # $nbsp defaults to unicode...
    $nbsp ||= shift || nbsp->{unicode};

    # Iterate over the string(s) to perform the transformation...
    foreach ( ref $string eq 'ARRAY' ? @$string : $string ) {
        s/([^\s])\s+([^\s]+\s*)$/$1$nbsp$2/;
    }

    return $string;
}

sub new {
    my $class = shift;
    my $self  = bless { @_ }, $class;

    # Default to No-Break Space.
    $self->{nbsp} ||= nbsp->{unicode};

    return $self;
}

# sub widont {} already defined above.

1;    # End of the module code; everything from here is documentation...
__END__