Filter::Indent::HereDoc - allows here documents to be indented within code blocks


Filter-Indent-HereDoc documentation Contained in the Filter-Indent-HereDoc distribution.

NAME

Filter::Indent::HereDoc - allows here documents to be indented within code blocks

SYNOPSIS

  use Filter::Indent::HereDoc;
  {
    {
      print <<EOT;
      Hello, World!
      EOT
    }
  }
  # This will print "Hello, World!" and stop at EOT
  # even though the termination string is indented.

DESCRIPTION

Normally, when a 'here document' is used, the document text and the terminator string must be flush with the left margin, even if the rest of the code block is indented.

Filter::Indent::HereDoc removes this restriction and behaves in a more DWIM ("do what I mean") way: if the terminator string is indented, that level of indentation is stripped from every line of the document.
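
The stripping rule can be illustrated with a minimal standalone sketch. It does not use the module itself, and `strip_by_terminator` is a hypothetical helper name chosen for this example, not part of the module's API. It assumes the last line passed in is the terminator line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Filter::Indent::HereDoc): given the
# terminator name and the lines of a here document (terminator line last),
# strip the terminator's leading whitespace from every line.
sub strip_by_terminator {
    my ($terminator, @lines) = @_;

    # The whitespace in front of the terminator sets the indent level.
    my ($indent) = $lines[-1] =~ /^(\s*)\Q$terminator\E$/
        or die "last line is not the terminator '$terminator'\n";

    # Remove that prefix from each line that starts with it.
    s/^\Q$indent\E// for @lines;
    return @lines;
}

my @doc = strip_by_terminator(
    'EOT',
    '      Hello, World!',
    '        indented more',
    '      EOT',
);
print "$_\n" for @doc;
# Prints:
# Hello, World!
#   indented more
# EOT
```

Lines indented further than the terminator keep their extra indentation, which is what lets nested formatting inside the document survive.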

If there is no terminator string (so the here document stops at the first blank line), then enough whitespace will be stripped out so that the leftmost character of the document will be flush with the left margin, e.g.

  print <<;
       Hello,
      World!

  # This will print:
   Hello,
  World!
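
As a rough illustration of the blank-line case, the sketch below strips the smallest indentation found on any non-blank line. It is a simplified standalone version, not the module's own code: `strip_common_indent` is a hypothetical name, and the sketch assumes the indentation uses spaces only (it compares indent widths by character count).

```perl
use strict;
use warnings;

# Hypothetical helper: remove the smallest leading indentation shared by
# all non-blank lines, so the leftmost character ends up at the margin.
# Assumption: indentation consists of spaces only.
sub strip_common_indent {
    my @lines = @_;

    my $min;
    for my $line (@lines) {
        next unless $line =~ /^(\s*)\S/;    # skip blank lines
        my $width = length($1);
        $min = $width if !defined $min || $width < $min;
    }
    $min = 0 unless defined $min;

    # Strip at most $min whitespace characters from the start of each line.
    s/^\s{0,$min}// for @lines;
    return @lines;
}

print "$_\n" for strip_common_indent('     Hello,', '    World!');
# Prints:
#  Hello,
# World!
```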

Changes to terminator strings

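In addition to allowing indented here documents, Filter::Indent::HereDoc also supports a more permissive style of terminator string, as specified in Perl6 RFC111. The changes are: whitespace between '<<' and the terminator name is ignored, and the terminator line may be followed by whitespace, a ';' and a comment. The exception is a blank-line terminator, after which only whitespace is allowed.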

You can force the module to revert to the standard Perl5 style of terminator string by specifying 'strict_terminators' when the module is use'd, e.g.

 use Filter::Indent::HereDoc;
 print << EOT
 Hello, World!
 EOT ;             # this will work

 use Filter::Indent::HereDoc 'strict_terminators';
 print << EOT
 Hello, World!
 EOT ;             # this will generate an error
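
The difference between the two modes can be sketched with a pair of regular expressions modelled on the ones in the module source. This is a simplified standalone illustration: `is_terminator` is a hypothetical helper, and it covers only named terminators, not the blank-line case.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $line would end a here document whose
# terminator is $term, under strict (Perl5) or permissive (RFC111) rules.
sub is_terminator {
    my ($line, $term, $strict) = @_;
    my $re = $strict
        ? qr/^\s*\Q$term\E$/                     # terminator only
        : qr/^\s*\Q$term\E\s*;?\s*(?:#.*)?$/;    # optional ';' and comment
    return $line =~ $re ? 1 : 0;
}

for my $line ('EOT', '  EOT ;', 'EOT  # done', 'EOT + 1') {
    printf "%-14s permissive=%d strict=%d\n",
        "'$line'",
        is_terminator($line, 'EOT', 0),
        is_terminator($line, 'EOT', 1);
}
```

In both modes leading whitespace is accepted, since that is the indentation the filter strips. Only the permissive mode accepts the trailing ';' or comment.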

CAVEATS

SEE ALSO

Filter::Simple, http://perl.jonallen.info/modules, perlfaq4, Perl6 RFC111

AUTHOR

Jon Allen, <jj@jonallen.info>

THANKS TO

Michael Schwern for the suggestions about Perl6 RFC111

COPYRIGHT AND LICENSE


package Filter::Indent::HereDoc;

use strict;
use warnings;
use Filter::Simple;

our $VERSION = '1.01';
our %options = ();
our @buffer;            # Temporary storage of current here document
our @termstring;        # FIFO list of here document terminating strings

sub import {
  %options = ();
  $options{$_}++ foreach (@_);
}

FILTER_ONLY
  executable => sub {
    my @code = split /\n/;
    $_ = join '',(map &process_line($_),@code);
  };

sub process_line {
  my $line = shift;
  if (@termstring) {
    # At this point we are in a here document, so all lines of code
    # are buffered until the end of the heredoc is detected
    push @buffer,$line;
    
    # 2 scenarios - terminator is a blank line, or terminator contains non-
    # whitespace. If blank line, then look for same whitespace at start of
    # each line in buffer. Otherwise take the whitespace that precedes the
    # terminator and match this against each line in the buffer.
    #
    # By default, we accept terminator strings in the Perl6 RFC111 format,
    # i.e. whitespace, ';', and comments following the terminator are
    # allowed. The only exception is if the terminator is a blank line,
    # in this case then only whitespace is allowed.
    
    my $termregex;
    unless ($options{strict_terminators}) {
      if ($termstring[0] =~ /\S/) {
        # \Q...\E quotes any regex metacharacters in the terminator
        $termregex = qr/^(\s*)(\Q$termstring[0]\E)(\s*;{0,1}\s*(?:#.*){0,1})$/;
      } else {
        $termregex = qr/^\s*$/;
      }
    } else {
      if ($termstring[0] =~ /\S/) {
        # \Q...\E quotes any regex metacharacters in the terminator
        $termregex = qr/^(\s*)(\Q$termstring[0]\E)$/;
      } else {
        $termregex = qr/^$/;
      }
    }
    
    my ($whitespace,$terminator,$extras);
    if ($line =~ $termregex) {
      ($whitespace,$terminator,$extras) = ($1,$2,$3);
      if ($termstring[0] =~ /\S/) {
        foreach (@buffer) {
          return unless (/^$whitespace/);
        }      
      } else {
        # Terminator string is a blank line
        undef $whitespace;
        foreach (@buffer) {
          if (/^(\s+)\S/) {
            $whitespace = $1 unless ($whitespace and /^$whitespace\s*/);
          }
        }
      }
      # End of heredoc - strip the required amount of whitespace
      $whitespace = '' unless defined $whitespace;   # nothing to strip
      s/^$whitespace// foreach @buffer;
      
      # If we found extra characters after the terminator (Perl6 RFC111
      # style), move them onto a new line to be compatible with Perl5
      if ($extras) {
        pop @buffer;
        push @buffer,$terminator;
        push @buffer,$extras;
      }
      
      # Return captured heredoc back to Perl and reset the buffer
      $line   = join "\n",@buffer;
      @buffer = ();
      shift @termstring;
      return "$line\n";
    }
  } else {
    # Following Perl6 RFC111, whitespace between '<<' and the terminator
    # is ignored
    unless ($options{strict_terminators}) {
      $line =~ s/(?<!<)<<(?!<)\s+/<</g;
    }

    # Can we find the start of any here documents?
    MATCH: while ($line =~ m/(?<!<)<<(?!<)/g) {
      if ($line =~ m/\G(\w+)/) {
        push @termstring,$1;
        next MATCH;
      }

      if ($line =~ m/\G(?:\s*(['"`]))(.*?)(?<!\\)\1/) {
        my ($quote,$string) = ($1,$2);
        $string =~ s/\\$quote/$quote/g;
        push @termstring,$string;
        next MATCH;
      }
      
      # Use of bare << to mean <<"" is deprecated
      # ...but still works so the module needs to support it!
      push @termstring,'';
    }
    return "$line\n";    
  }
}

1;
__END__