Mail::Graph - draw graphical stats for mails/spams


Contained in the Mail-Graph distribution.

NAME

Mail::Graph - draw graphical stats for mails/spams

SYNOPSIS

	use Mail::Graph;

	my $graph = Mail::Graph->new(
	  items  => 'spam',
	  output => 'spams/',
	  input  => '~/Mail/spam/',
	);
	$graph->generate();

DESCRIPTION

This module parses mailbox files in either compressed or uncompressed form and then generates pretty statistics and graphs about them. Although at first developed to do spam statistics, it works just fine for normal mail.

File Format

The module reads in files in mbox format. These can be compressed by gzip, or just plain text. Since the module reads in all files that are in one directory, it can also handle maildir-style folders, e.g. a directory where each mail resides in a separate file.
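
As a rough illustration of reading both kinds of file from one directory, here is a minimal sketch. It is not Mail::Graph's own code: it uses the core module IO::Uncompress::Gunzip instead of calling Compress::Zlib directly, and relies on its Transparent option to pass uncompressed input through unchanged. The helper name read_mail_file and the file names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: return the full text of a plain or gzipped file.
sub read_mail_file {
    my ($path) = @_;
    my $in = IO::Uncompress::Gunzip->new($path, Transparent => 1)
        or die "Cannot read $path: $GunzipError\n";
    local $/;                      # slurp mode: read everything at once
    my $text = <$in>;
    close $in;
    return defined $text ? $text : '';
}

# Build a throwaway directory holding one plain and one gzipped copy.
my $dir  = tempdir(CLEANUP => 1);
my $mail = "From sample_foo\@example.com  Tue Oct 27 18:38:52 1998\n"
         . "Subject: Sorry...\n\nThis is a sample spam\n";

open my $out, '>', "$dir/plain.mbox" or die "Cannot write: $!\n";
print {$out} $mail;
close $out or die "Cannot close: $!\n";
gzip(\$mail => "$dir/packed.mbox.gz") or die "gzip failed: $GzipError\n";

# Read every regular file in the directory, whatever its compression.
opendir my $dh, $dir or die "Cannot open $dir: $!\n";
my @names = sort grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

for my $name (@names) {
    my $text = read_mail_file("$dir/$name");
    printf "%-16s %d bytes\n", $name, length $text;
}
```

Both files should report the same size, since the gzipped copy is decompressed on the fly.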

The file format is quite simple and looks like this:

	From sample_foo@example.com  Tue Oct 27 18:38:52 1998
	Received: from barfel by foo.example.com (8.9.1/8.6.12) 
	From: forged_bar@example.com
	X-Envelope-To: <sample_foo@example.com>
	Date: Tue, 27 Oct 1998 09:52:14 +0100 (CET)
	Message-Id: <199810270852.12345567@example.com>
	To: <none@example.com>
	Subject: Sorry...
	X-Loop-Detect: 1
	X-Spamblock: caught by rule dummy@

	This is a sample spam

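Basically, each message is an email header plus an email body, and the messages are separated by the From lines (the lines starting with "From " followed by the envelope sender and date).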
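
To make the separator rule concrete, here is a minimal sketch that splits mbox text into messages. It is an illustration only, not the parser Mail::Graph uses; the helper name split_mbox and the second sample message are invented. It assumes that body lines starting with "From " have been escaped as ">From ", as mbox writers commonly do.

```perl
use strict;
use warnings;

# Hypothetical helper: split mbox text into one string per message.
sub split_mbox {
    my ($mbox) = @_;
    # Break before every line that starts with "From " (note the space),
    # so "From:" header lines and ">From " body lines do not split.
    return split /^(?=From )/m, $mbox;
}

my $mbox = <<'END';
From sample_foo@example.com  Tue Oct 27 18:38:52 1998
From: forged_bar@example.com
Subject: Sorry...

This is a sample spam
From other@example.com  Wed Oct 28 07:01:10 1998
From: someone@example.com
Subject: Hello

>From here on this is body text
END

my @messages = split_mbox($mbox);
print scalar(@messages), " messages found\n";
```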

The following fields are examined to determine the values listed next to them:

	X-Envelope-To		the target address/domain
	From address@domain	the sender
	From date		the receiving date
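
The table above can be sketched in code as follows. This is only an illustration of which part of a message each value comes from; the helper name extract_fields and the returned hash keys are assumptions, and Mail::Graph's real parsing may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Graph: pull the fields
# described above out of one raw mbox message.
sub extract_fields {
    my ($message) = @_;
    my %info;

    # Envelope line: "From sender@domain  Tue Oct 27 18:38:52 1998"
    if ($message =~ /\AFrom (\S+)[ \t]+(.+)$/m) {
        @info{qw(sender date)} = ($1, $2);
    }
    # Header line: "X-Envelope-To: <target@domain>"
    if ($message =~ /^X-Envelope-To:\s*<?([^<>\s]+)>?/mi) {
        $info{target} = $1;
        ($info{domain}) = $info{target} =~ /\@(.+)$/;
    }
    return \%info;
}

my $sample = join "\n",
    'From sample_foo@example.com  Tue Oct 27 18:38:52 1998',
    'From: forged_bar@example.com',
    'X-Envelope-To: <sample_foo@example.com>',
    '',
    'This is a sample spam';

my $info = extract_fields($sample);
print "sender: $info->{sender}\n";
print "target: $info->{target} (domain $info->{domain})\n";
print "date:   $info->{date}\n";
```

Note that the sender is taken from the envelope line, not from the "From:" header, which in the sample holds a different (forged) address.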

METHODS

new()

Create a new Mail::Graph object.

The following options exist:

	input		Path to a directory containing (gzipped) mbox files
			Alternatively, the name of a (gzipped) mbox file
	index		Directory where to write (and read) the index files
	output		Directory where to write the output stats
	items		Try 'spams' or 'mails' (can be any string)
	generate	hash with names of stats to generate (1=on, 0=off):
			 month		 per each month of the year
			 day		 per each day of the month
			 hour		 per each hour of the day
			 dow		 per each day of the week
			 yearly		 per year
			 daily		 per each day (with average)
			 monthly	 per each month
			 toplevel	 per top-level domain
			 rule		 per filter rule that matched
			 target		 per target address
			 domain		 per target domain
			 last_x_days     items for each of the last x days
				         set it to the number of days you want
			 score_histogram show histogram of SpamAssassin scores
					 set it to the step-width (like 5)
			 score_daily     SA score for each of the last x days
				         set it to the number of days you want
			 score_scatter   SA scatter score diagram, set it to
					 the limit of the score (a line will be
					 drawn there)
	average		set to 0 to disable, otherwise it gives the number
			of days/weeks/months to average over
	average_daily	number of days to average over in the daily graph;
			if not set, uses average; set to 0 to disable
	height		base height of the generated images
	template	name of the template file (ending in .tpl) that is
			used to generate the html output, e.g. 'index.tpl'
	no_title	set to 1 to disable graph titles, default 0
	filter_domains	array ref with list of domains to show as "unknown"
	filter_target	array ref with list of targets (regular expressions)
	graph_ext	extension of the generated graphs, default 'png'
	last_date	in yyyy-mm-dd format: specify the last used date, any
			mail newer than that will be skipped. Defaults to today
	first_date	in yyyy-mm-dd format: specify the first used date, any
			mail older than that will be skipped. Defaults to undef
			meaning any old mail will be considered.
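
A fuller call might look like the following sketch. The option names are taken from the table above, but all values, directory names and dates are made up, and passing generate as a hash reference is an assumption. The module is loaded at run time so that the example also runs where Mail::Graph is not installed.

```perl
use strict;
use warnings;

# Option names come from the table above; the values are illustrative.
my %options = (
    input      => 'archives/',     # hypothetical directory of mbox files
    output     => 'stats/',        # hypothetical output directory
    index      => 'index/',        # hypothetical index directory
    items      => 'spams',
    # Assumption: the "hash" is passed as a hash reference.
    generate   => {
        month           => 1,
        dow             => 1,
        toplevel        => 0,      # switched off
        last_x_days     => 30,     # number of days to show
        score_histogram => 5,      # step width
    },
    average    => 7,
    height     => 200,
    template   => 'index.tpl',
    first_date => '2003-01-01',
    last_date  => '2003-12-31',
);

# Only touch Mail::Graph if it is installed, so the sketch runs either way.
if (eval { require Mail::Graph; 1 }) {
    my $graph = Mail::Graph->new(%options);
    print "Created a ", ref($graph), " object\n";
}
else {
    print "Mail::Graph is not installed; options were only assembled.\n";
}
```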

generate()

Generate the stats, fill in the template and write it out. Takes no options.

error()

Return an error message or undef for no error.
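
One way to combine generate() and error() is sketched below. The wrapper run_stats is hypothetical, and the documentation does not say at which point error() is set, so checking it after generate() is an assumption.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run Mail::Graph and return its error message,
# or undef when no error was reported.
sub run_stats {
    my (%options) = @_;

    # Load the module at run time so the sketch also runs without it.
    eval { require Mail::Graph; 1 }
        or return "Mail::Graph is not installed";

    my $graph = Mail::Graph->new(%options);
    $graph->generate();
    return $graph->error();
}

my $error = run_stats(
    items  => 'spam',
    output => 'spams/',
    input  => '~/Mail/spam/',
);
print defined $error ? "Failed: $error\n" : "Stats written\n";
```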

BUGS

There are a couple of known bugs; some of them are unfinished features or problems in GD::Graph:

Divide by Zero

This is a bug in some versions of GD::Graph: generating a graph with only one bar crashes with this error. If you encounter this, please bug the author of GD::Graph and send me a copy.

Argument "4, 0.7%" isn't numeric

You might get a lot of warnings like

	Argument "4, 0.7%" isn't numeric in numeric lt (<) at 
	/usr/lib/perl5/site_perl/5.8.2/GD/Graph/Data.pm line 231.

This is a problem with GD::Graph: Mail::Graph wants to use labels like "4, 0.7%", but GD::Graph uses the same string for both the label and the value of the point/bar, so Perl warns. This needs a small patch to GD::Graph that strips anything non-numeric out of the label before using it in numeric context. Please bug the author of GD::Graph and send me a copy.
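
The kind of clean-up meant here could look like the sketch below. It is not the actual GD::Graph patch, and the helper name label_value is invented. It assumes the intended value is the number at the start of the label; deleting every non-numeric character instead would turn "4, 0.7%" into 40.7.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading number of a label such as
# "4, 0.7%", or 0 if the label does not start with a number.
sub label_value {
    my ($label) = @_;
    return $label =~ /^\s*(-?\d+(?:\.\d+)?)/ ? $1 + 0 : 0;
}

for my $label ('4, 0.7%', '12.5, 3%', 'n/a') {
    printf "%-12s => %s\n", "'$label'", label_value($label);
}
```

Used in numeric context, the returned value no longer triggers the "isn't numeric" warning, while the full label string stays available for display.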

gzipped archives are not included in the stats

Some of the gzipped archives seem to trigger a bug in Compress::Zlib, at least up to version 1.32. For instance, on my system one of the sample archives in /sample/archives/ is not read properly by Compress::Zlib. I have already notified the author of Compress::Zlib.

LICENSE

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

(c) Copyright by Tels http://bloodgate.com/ 2002.

