LaBrea::Tarpit::Get - LaBrea::Tarpit::Get documentation


LaBrea-Tarpit documentation. Contained in the LaBrea-Tarpit distribution.

NAME

LaBrea::Tarpit::Get

SYNOPSIS

  use LaBrea::Tarpit::Get;

  ($rv,$host,$port,$path)=parse_http_URL($url);
  ($handle,$host,$port,$path)=open_http(*S,$url);
  $rv=parse_http_response(\$buffer,\%response);
  $rv=short_response($url,\%response,\%content,$timeout);
  $line = make_line($url,$err,\%content);
  $rv = not_hour($file);
  $rv = not_day($file);
  $rv=auto_update($url,$file,$cur_ver,$timeout);

DESCRIPTION - LaBrea::Tarpit::Get

This module connects to a web site running LaBrea::Tarpit::Report::html_report.plx and retrieves a short_report, as described in LaBrea::Tarpit::Report.

Run examples/web_scan.pl from an hourly or daily cron job to update the statistics from all known sites running LaBrea::Tarpit. A report can then be generated showing worldwide activity.

 # MIN HOUR DAY MONTH DAYOFWEEK   COMMAND
 30 * * * * ./web_scan.pl ./other_sites.txt ./tmp/site_stats

See: LaBrea::Tarpit::Report::other_sites

($handle,$host,$port,$path)= parse_http_URL($url);

Separate an http URL into its components

  input:	URL of the form
	http://www.foo.com[:8080]/file.html

  https:// service is not supported

  returns: (undef, error message)
		or
	   (file_handle,hostname,port,path)
	where port and path may be empty
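
To make the splitting concrete, here is a minimal stand-alone sketch of the behavior described above. It is not the module's own code: the name split_http_url, the regular expression, and the error strings are assumptions made for illustration, and the first return slot is a plain true value rather than a handle.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LaBrea::Tarpit::Get): split an http URL
# into (true, host, port, path), or return (undef, error message).
sub split_http_url {
    my ($url) = @_;
    return (undef, 'no URL given') unless defined $url && length $url;
    return (undef, 'https is not supported') if $url =~ m{^https://}i;

    # host, then an optional :port, then an optional /path
    if ($url =~ m{^http://([^/:\s]+)(?::(\d+))?(/\S*)?$}i) {
        my ($host, $port, $path) = ($1, $2, $3);
        $port = '' unless defined $port;    # port may be empty
        $path = '' unless defined $path;    # path may be empty
        return (1, $host, $port, $path);
    }
    return (undef, "malformed URL: $url");
}

my ($ok, $host, $port, $path) =
    split_http_url('http://www.foo.com:8080/file.html');
print $ok ? "host=$host port=$port path=$path\n" : "error: $host\n";
# prints "host=www.foo.com port=8080 path=/file.html"
```

On failure the second element carries the error text, mirroring the (undef, error message) convention documented above.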

($handle,$host,$port,$path)=open_http(*S,$url);

Open connection to http target

  input:	*S,$url	[default port = 80]
  returns:	(undef, error)	on error

		(file_handle,
		 hostname,
		 port,
		 path )		on success

$rv=parse_http_response(\$buffer,\%response);

Parse an http server response into a hash of headers.

  e.g.	(representative, will vary)

  rc		 => 200
  msg		 => OK
  date		 => Wed, 24 Apr 2002 21:46:30 GMT
  server	 => Apache/1.3.22
  protocol	 => HTTP/1.1
  content-type	 => text/plain
  content-length => 92
  last-modified	 => Wed, 24 Apr 2002 21:46:34 GMT
  expires	 => Wed, 24 Apr 2002 21:47:04 GMT
  connection	 => close
  content	 => (complete text buffer)

  input:	\$text_in, \%response
  returns:	true on success, %response filled
		false on failure

  NOTE:		$response{rc}	(server response code)
		$response{msg}	(server message)
		are ALWAYS filled with something.
		In the case of server failure, the
		cause of the failure will be inserted
		into $response{msg} and undef returned.
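
The parsing described above can be sketched as follows. This is an illustrative stand-in, not the module's implementation: the name parse_response_sketch, the default values for rc and msg, and the choice to treat any status other than 200 as a failure are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical sketch: fill %$resp from the raw server text in $$buf.
# Returns true on a 200 response, undef otherwise; rc and msg are always set.
sub parse_response_sketch {
    my ($buf, $resp) = @_;
    %$resp = (rc => 0, msg => 'no response');    # assumed defaults
    return undef unless defined $$buf && length $$buf;

    # Headers end at the first blank line; the rest is the content.
    my ($head, $body) = split /\r?\n\r?\n/, $$buf, 2;
    my @lines  = split /\r?\n/, $head;
    my $status = shift @lines;
    unless (defined $status
            && $status =~ m{^(HTTP/\d\.\d)\s+(\d{3})\s*(.*)$}) {
        $resp->{msg} = 'malformed status line';
        return undef;
    }
    @{$resp}{qw(protocol rc msg)} = ($1, $2, $3);

    # Header names are stored lower-cased, as in the listing above.
    for my $line (@lines) {
        next unless $line =~ /^([^:\s]+):\s*(.*)$/;
        $resp->{lc $1} = $2;
    }
    $resp->{content} = defined $body ? $body : '';
    return $resp->{rc} == 200 ? 1 : undef;    # assumption: only 200 is success
}

my $raw = "HTTP/1.1 200 OK\r\nServer: Apache/1.3.22\r\n"
        . "Content-Type: text/plain\r\n\r\nhello";
my %response;
my $rv = parse_response_sketch(\$raw, \%response);
print "$_ => $response{$_}\n" for sort keys %response;
```

Callers can test the return value first and then read $response{msg} to learn why a request failed.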

$rv=short_response($url,\%response,\%content,$timeout);

Fetch the short report from $url, placing the headers in %response and the parsed content in %content. $timeout is optional; the default is 60 seconds.

%response contains http headers

%content contains key => value pairs

  LaBrea	=> version
  Tarpit	=> version
  Report	=> version
  Util		=> version
  now		=> seconds since epoch (local)
  tz		=> time zone (e.g. -0700)
  threads	=> number of threads
  total_IPs	=> total IPs
  bw		=> bandwidth

  input:	URL,		# complete url
		  e.g. www.foo.com/html_report.plx
		\%response,
		\%content,
		$timeout	# optional, default 60 seconds

  returns:	false on success
		error message on failure

$line = make_line($url,$err,\%content);

Make a line of text summarizing the short report, where $err is the return value from short_response.

  Format:

  url threads total_IPs bw time tz version:nn:nn:nn
    or
  url error message
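
As an illustration of the two line formats above, here is a small stand-alone sketch. It is not the module's code: the names make_line_sketch and field are hypothetical, and it assumes that "time" is the now value and that "version:nn:nn:nn" is the LaBrea, Tarpit, Report and Util versions joined with colons. The version numbers in the sample data are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return the hash value for a key, or '' if it is missing.
sub field {
    my ($hash, $key) = @_;
    return defined $hash->{$key} ? $hash->{$key} : '';
}

# Hypothetical sketch of the summary line. The field order follows the
# documented format; the mapping of names to fields is an assumption.
sub make_line_sketch {
    my ($url, $err, $content) = @_;
    return "$url $err" if $err;    # error form: url error message
    my $versions = join ':',
        map { field($content, $_) } qw(LaBrea Tarpit Report Util);
    my @stats = map { field($content, $_) } qw(threads total_IPs bw now tz);
    return join ' ', $url, @stats, $versions;
}

# Sample data only; these values are invented for the example.
my %content = (
    LaBrea  => '2.4',  Tarpit    => '1.00', Report => '1.00', Util => '1.00',
    now     => 1019684790, tz    => '-0700',
    threads => 12,     total_IPs => 34,     bw     => 56,
);
print make_line_sketch('http://www.foo.com/html_report.plx', '', \%content), "\n";
print make_line_sketch('http://www.foo.com/html_report.plx', 'timeout', {}), "\n";
```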

$rv = not_hour($file);

Check whether the file has been accessed during the current hour.

  input:	path/to/file
  returns:	true if not accessed this hour
		false if accessed this hour,
		non-existent, or not readable

$rv = not_day($file);

Check whether the file has been accessed during the current day.

  input:	path/to/file
  returns:	true if not accessed this day
		false if accessed this day,
		non-existent, or not readable
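
Both checks above (not_hour and not_day) can be sketched with one helper that compares time stamps truncated to the hour or the day. This is a guess at the idea, not the module's code: the names not_in_period, not_hour_sketch and not_day_sketch are hypothetical, and the sketch assumes "accessed" means the file's access time (atime) in local time.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Temp qw(tempfile);

# Hypothetical helper: true if $file is readable and its access time falls
# outside the current period, where $format is a strftime pattern that
# identifies the period. False for missing or unreadable files.
sub not_in_period {
    my ($file, $format) = @_;
    return 0 unless -e $file && -r $file;
    my $atime = (stat $file)[8];    # assumption: atime, not mtime
    my $then  = strftime($format, localtime $atime);
    my $now   = strftime($format, localtime time);
    return $then ne $now ? 1 : 0;
}

sub not_hour_sketch { return not_in_period($_[0], '%Y%m%d%H') }
sub not_day_sketch  { return not_in_period($_[0], '%Y%m%d') }

# Demonstration on a temporary file whose time stamps are set by hand.
my ($fh, $name) = tempfile(UNLINK => 1);
close $fh;
my $two_days_ago = time - 2 * 86400;
utime $two_days_ago, $two_days_ago, $name;
print "stale file: ", not_day_sketch($name), "\n";
my $now = time;
utime $now, $now, $name;
print "fresh file: ", not_day_sketch($name), "\n";
```

A caller such as a cron-driven script could use the day check to skip work that has already been done today.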

$rv=auto_update($url,$file,$cur_ver,$timeout);

Update the 'other_sites.txt' file from $url, at most once per day.

  input:	url,		# complete url to 'other_sites.txt', e.g.
				# http://scans.bizsystems.net/other_sites.txt
		file,		# path to your 'other_sites.txt'
		cur_ver,	# optional current version; if not supplied,
				# the current file is opened and scanned
		timeout		# wait for http response,
				# default 60 seconds

  returns:	false on success or no update needed
		error msg on failure

EXPORT_OK

	parse_http_URL
	open_http
	parse_http_response
	short_response
	make_line
	not_hour
	not_day
	auto_update

COPYRIGHT

AUTHOR

Michael Robinton, michael@bizsystems.com

SEE ALSO

perl(1), LaBrea::Tarpit(3), LaBrea::Codes(3), LaBrea::Tarpit::Report(3), LaBrea::Tarpit::Util(3), LaBrea::Tarpit::DShield(3)

