Text::DeSupercite - remove supercite quotes and other non-standard quoting from text


Text-DeSupercite documentation Contained in the Text-DeSupercite distribution.


NAME

Text::DeSupercite - remove supercite quotes and other non-standard quoting from text

SYNOPSIS

    use Text::DeSupercite qw/desupercite/;

    # just convert Supercite quotes to '>'s
    $text = desupercite($mail->body());

    # or convert *all* quote characters that aren't '>'s
    $text = desupercite($mail->body(),1);

    # set it back again
    $mail->body_set($text);

    


DESCRIPTION

Supercite is an Emacs package for Gnus (http://www.gnus.org/) that provides a more, err ... comprehensive ... form of quoting, which tends to look like

    >>>>> "Foo" == Foo  <foo@foo.com> writes:
    >> blah blah blah blah blah blah blah blah blah blah blah blah blah blah
    >> blah blah blah blah blah blah blah blah blah blah blah blah blah blah

    Foo> yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak
    Foo> yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak
    Foo> yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak yak




which annoys quite a lot of people who find it too noisy.

There are also people who quote like this

    | this is a quote 

    this is not

which annoys another load of people. The two sets of annoyed people mostly intersect, which is quite understandable.

This module takes a simplistic approach to removing these forms of quoting and replacing them with the more normal

    > this is a quote

    this is not 

    > > this is a quote of a quote 

style.

It has two modes, lenient and harsh. Lenient mode, the default, just desupercites. Harsh mode, selected by passing a true second argument to desupercite, normalises all quoting.
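
The difference between the two modes can be sketched in a few lines of plain Perl. This is only an illustration of how a quote prefix might be rewritten marker by marker, assuming the prefix has already been separated from the quoted text; normalise_quoter is a hypothetical name and not part of the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::DeSupercite's public interface):
# rewrite a quote prefix one whitespace-separated marker at a time.
sub normalise_quoter {
    my ($quoter, $harsh) = @_;
    my @markers = split ' ', $quoter;
    for my $marker (@markers) {
        # Harsh mode replaces every marker; lenient mode only replaces
        # Supercite-style "Name>" markers and leaves the rest alone.
        $marker = '>' if $harsh || $marker =~ /^\w+>/;
    }
    return join ' ', @markers;
}

print normalise_quoter('Foo>', 0), "\n";    # lenient: prints >
print normalise_quoter('|', 0), "\n";       # lenient: prints |
print normalise_quoter('|', 1), "\n";       # harsh: prints >
print normalise_quoter('Foo> |', 1), "\n";  # harsh: prints > >
```

In lenient mode a bare '|' survives because it does not look like a Supercite attribution; only harsh mode flattens it to '>'.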

BUGS

None known, but I haven't really hunted out pathological cases of superciting, so if you find one then please let me know.

It currently fails to desupercite stuff looking like

    Name1> some quote

This is a bug in Text::Quoted. There's a patch included with this module to fix it if it's not fixed by Simon Cozens soon.
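
Until that is fixed, one possible workaround is a line-based pre-pass over the text before it is handed to desupercite. The sketch below is an assumption about how such a pre-pass could look, not the patch shipped with this distribution; strip_name_prefix is a hypothetical helper, and it will also rewrite any unquoted line that happens to start with a word followed by '>'.

```perl
use strict;
use warnings;

# Hypothetical pre-pass, not part of Text::DeSupercite or Text::Quoted:
# replace a leading "Name>" attribution on each line with a plain ">".
sub strip_name_prefix {
    my ($text) = @_;
    $text =~ s/^\w+>/>/mg;
    return $text;
}

my $in = "Name1> some quote\nthis is not\n";
print strip_name_prefix($in);
```

Only the first marker on each line is touched, so nested attributions would need a more careful pass.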

AUTHOR

Simon Wistow <simon@thegestalt.org>

COPYRIGHT

SEE ALSO

Text::Quoted, desupercite



package Text::DeSupercite;

use strict;
use Text::Quoted;
use Exporter;
use vars qw(@EXPORT_OK $VERSION);
use base qw(Exporter);

@EXPORT_OK = qw(desupercite);

$VERSION = '0.6';

sub desupercite ($;$);
sub _desupercite_aux ($$);

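sub desupercite ($;$) {

    # Nothing to do for missing or empty text (a body of "0" is still text).
    my $text      = shift;
    return "" unless defined $text && length $text;
    my $merciless = shift || 0;    # true selects harsh mode

    # extract() comes from Text::Quoted and parses the quoting structure.
    return _desupercite_aux(extract($text),$merciless);

}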


sub _desupercite_aux($$) {
    my $node      = shift || return "";
    my $merciless = shift || 0; # paranoia, paranoia, everybody's coming to get you

    if (ref $node eq 'ARRAY') {
        my $ret="";
        $ret.=_desupercite_aux($_, $merciless) for (@$node);
        return $ret;


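    } elsif (ref $node eq 'HASH') {
        # Blank lines carry no quoting.
        return "\n" if $node->{empty};

        if (!defined $node->{quoter} || $node->{quoter} eq '') {
            # Unquoted text passes through untouched.
            return $node->{raw}."\n";
        } else {
            # Rewrite each whitespace-separated quote marker according to the mode.
            my $new  = join ' ',
                       map { ($merciless)?_merciless($_):_lenient($_) }
                       split /\s+/,
                       $node->{quoter};

            # Swap the old prefix for the new one at the start of every line.
            $node->{raw} =~ s!^\Q$node->{quoter}!$new!mg;
            return $node->{raw}."\n";
        }
    } else {
        die "Eeeek unknown node type - ".(ref $node)."\n";
    }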
    

}


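# Harsh mode: every quote marker becomes a plain '>'.
sub _merciless {
    return '>';
}

# Lenient mode: only Supercite-style "Name>" markers become '>';
# anything else is returned unchanged.
sub _lenient {
    return ($_[0] =~ /^\w+>/)? '>' : $_[0];
}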


1;