| HoneyClient-Agent documentation | Contained in the HoneyClient-Agent distribution. |
HoneyClient::Agent::Driver::Browser::FF - Perl extension to drive Mozilla Firefox to a given web page. This package extends the HoneyClient::Agent::Driver::Browser package, by overridding the drive() method.
This documentation refers to HoneyClient::Agent::Driver::Browser::FF version 0.98.
use HoneyClient::Agent::Driver::Browser::FF;
# Library used exclusively for debugging complex objects.
use Data::Dumper;
# Create a new FF object, initialized with a collection
# of URLs to visit.
my $browser = HoneyClient::Agent::Driver::Browser::FF->new(
links_to_visit => {
'http://www.google.com' => 1,
'http://www.cnn.com' => 1,
},
);
# If you want to see what type of "state information" is physically
# inside $browser, try this command at any time.
print Dumper($browser);
# Continue to "drive" the driver, until it is finished.
while (!$browser->isFinished()) {
# Before we drive the application to a new set of resources,
# find out where we will be going within the application, first.
print "About to contact the following resources:\n";
print Dumper($browser->next());
# Now, drive browser for one iteration.
$browser->drive();
# Get the driver's progress.
print "Status:\n";
print Dumper($browser->status());
}
# At this stage, the driver has exhausted its collection of links
# to visit. Let's say we want to add the URL "http://www.mitre.org"
# to the driver's list.
$browser->{links_to_visit}->{'http://www.mitre.org'} = 1;
# Now, drive the browser for one iteration.
$browser->drive();
# Or, we can specify the URL as an argument.
$browser->drive(url => "http://www.mitre.org");
This library allows the Agent module to drive an instance of Mozilla Firefox inside the HoneyClient VM. The purpose of this module is to programmatically navigate this browser to different websites, in order to become purposefully infected with new malware.
This module is object-oriented in design, retaining all state information within itself for easy access. This specific browser implementation inherits all code from the HoneyClient::Agent::Driver::Browser package.
Fundamentally, the FF driver is initialized with a set of absolute URLs for the browser to drive to. Upon visiting each URL, the driver collects any new links found and will attempt to drive the browser to each valid URL upon subsequent iterations of work.
For each top-level URL given, the driver will attempt to process all corresponding links that are hosted on the same server, in order to simulate a complete 'spider' of each server.
URLs are added and removed from hashtables, as keys. For each URL, a calculated "priority" (a positive integer) of the URL is assigned the value. When the FF driver is ready to go to a new link, it will always go to the next link that has the highest priority. If two URLs have the same priority, then the FF driver will chose among those two at random.
Furthermore, the FF driver will try to visit all links shared by a common server in order before moving on to drive to other, external links in an ordered fashion. However, the FF driver may end up re-visiting old sites, if new links were found that the FF driver has not visited yet.
As the FF driver navigates the browser to each link, it maintains a set of hashtables that record when valid links were visited (see links_visited); when invalid links were found (see links_ignored); and when the browser attempted to visit a link but the operation timed out (see links_timed_out). By maintaining this internal history, the driver will never navigate the browser to the same link twice.
Lastly, it is highly recommended that for each driver $object, one should call $object->isFinished() prior to making a subsequent call to $object->drive(), in order to verify that the driver has not exhausted its set of links to visit. Otherwise, if $object->drive() is called with an empty set of links to visit, the corresponding operation will croak.
The following functions have been overridden by the FF driver. All other methods were implemented by the generic Browser driver. For further information about the Browser driver, see the HoneyClient::Agent::Driver::Browser documentation.
Drives an instance of Mozilla Firefox for one iteration, navigating either to the specified URL or to the next URL computed within the Browser driver's internal hashtables.
For a description of which hashtable is consulted upon each iteration of drive(), see the next_link_to_visit description of the HoneyClient::Agent::Driver::Browser documentation, in the "DEFAULT PARAMETER LIST" section.
Once a drive() iternation has completed, the corresponding browser process is terminated. Thus, each call to drive() invokes a new instance of the browser.
Inputs:$url is an optional argument, specifying the next immediate URL the browser must drive to.
Output: The updated FF driver $object, containing state information from driving the browser for one iteration.
Warning: This method will croak, if the FF driver object is unable to navigate to a new link, because its list of links to vist is empty and no new URL was supplied.
This package will only run on Win32 platforms. Furthermore, it has only been tested to work reliably within a Cygwin environment.
In a nutshell, this object is nothing more than a blessed anonymous reference to a hashtable, where (key => value) pairs are defined in the DEFAULT PARAMETER LIST, as well as fed via the new() function during object initialization. As such, this package does not perform any rigorous data validation prior to accepting any new or overriding (key => value) pairs.
However, additional links can be fed to any FF driver at any time, by simply adding new hashtable entries to the links_to_visit hashtable within the $object.
For example, if you wanted to add the URL "http://www.mitre.org" to the FF driver $object, simply use the following code:
$object->{links_to_visit}->{'http://www.mitre.org'} = 1;
In general, the FF driver does not know how many links it will ultimately end up browsing to, until it conducts an exhaustive spider of all initial URLs supplied. As such, expect the output of $object->status() to change significantly, upon each $object->drive() iteration.
For example, if at one given point, the status of percent_complete is 30% and then this value drops to 15% upon another iteration, then this means that the total number of links to drive to has greatly increased.
Lastly, we assume that the Mozilla Firefox browser has been preconfigured to not cache any data. This ensures the browser will render the most recent version of the content hosted at each URL.
Kathy Wang, <knwang@mitre.org>
Thanh Truong, <ttruong@mitre.org>
Darien Kindlund, <kindlund@mitre.org>
Brad Stephenson, <stephenson@mitre.org>
Copyright (C) 2007 The MITRE Corporation. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, using version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
| HoneyClient-Agent documentation | Contained in the HoneyClient-Agent distribution. |
####################################################################### # Created on: May 11, 2006 # Package: HoneyClient::Agent::Driver::FF # File: FF.pm # Description: A specific driver for automating an instance of # the Firefox browser, running inside a # HoneyClient VM. # # CVS: $Id: FF.pm 773 2007-07-26 19:04:55Z kindlund $ # # @author knwang, ttruong, kindlund, stephenson # # Copyright (C) 2007 The MITRE Corporation. All rights reserved. # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation, using version 2 # of the License. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA # 02110-1301, USA. # # #######################################################################
package HoneyClient::Agent::Driver::Browser::FF; use strict; use warnings; use Carp (); # Traps signals, allowing END: blocks to perform cleanup. use sigtrap qw(die untrapped normal-signals error-signals); ####################################################################### # Module Initialization # ####################################################################### BEGIN { # Defines which functions can be called externally. require Exporter; our ( @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS, $VERSION ); # Set our package version. $VERSION = 0.98; # Define inherited modules. use HoneyClient::Agent::Driver::Browser; @ISA = qw(Exporter HoneyClient::Agent::Driver::Browser); # Symbols to export automatically # Note: Since this module is object-oriented, we do *NOT* export # any functions other than "new" to call statically. Each function # for this module *must* be called as a method from a unique # object instance. @EXPORT = qw(); # Items to export into callers namespace by default. Note: do not export # names by default without a very good reason. Use EXPORT_OK instead. # Do not simply export all your public functions/methods/constants. # This allows declaration use HoneyClient::Agent::Driver::Browser::FF ':all'; # If you do not need this, moving things directly into @EXPORT or @EXPORT_OK # will save memory. # Note: Since this module is object-oriented, we do *NOT* export # any functions other than "new" to call statically. Each function # for this module *must* be called as a method from a unique # object instance. %EXPORT_TAGS = ( 'all' => [qw()], ); # Symbols to autoexport (when qw(:all) tag is used) @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } ); $SIG{PIPE} = 'IGNORE'; # Do not exit on broken pipes. } our ( @EXPORT_OK, $VERSION );
####################################################################### # Include the Global Configuration Processing Library use HoneyClient::Util::Config qw(getVar); # Include Win32 Job Library use Win32::Job; # Include Logging Library use Log::Log4perl qw(:easy); # The global logging object. our $LOG = get_logger(); ####################################################################### # Public Methods Implemented # #######################################################################
sub drive { # Extract arguments. my ($self, %args) = @_; # Sanity check: Make sure we've been fed an object. unless (ref($self)) { $LOG->error("Error: Function must be called in reference to a " . __PACKAGE__ . "->new() object!"); Carp::croak "Error: Function must be called in reference to a " . __PACKAGE__ . "->new() object!\n"; } # Sanity check, don't get the next link, if # we've been fed a url. my $argsExist = scalar(%args); if (!$argsExist || !exists($args{'url'}) || !defined($args{'url'})) { # Get the next URL from our hashtables. $args{'url'} = $self->_getNextLink(); } # Drive the generic browser before opening with FF $self = $self->SUPER::drive(%args); # Sanity check: Make sure our next URL is defined. unless (defined($args{'url'})) { $LOG->error("Error: Unable to drive browser - no links left to browse!"); Carp::croak "Error: Unable to drive browser - no links left to browse!\n"; } # Indicates how long we wait for each drive operation to complete, # before registering attempt as a failure. my $timeout : shared = $self->timeout(); # Create a new Job. my $job = Win32::Job->new(); # Sanity check. if (!defined($job)) { $LOG->error("Error: Unable to spawn a new process - " . $^E . "."); Carp::croak "Error: Unable to spawn a new process - " . $^E . ".\n"; } # Spawn the job. my $processExec = getVar(name => "process_exec"); my $processName = getVar(name => "process_name"); my $status = $job->spawn($processExec, $processName . " " . $args{'url'}); # Sanity check. if (!defined($status)) { $LOG->error("Error: Unable to spawn a new browser - " . $^E . "."); Carp::croak "Error: Unable to spawn a new browser - " . $^E . ".\n"; } # Run the job. $job->run($timeout); # Check to see if run fails. $status = $job->status(); # Sanity check. if (!defined($status) || !scalar(%{$status})) { $LOG->error("Error: Unable to retrieve job status from spawned process."); Carp::croak "Error: Unable to retrieve job status from spawned process.\n"; } # Figure out the correct Process ID. my @keys = keys(%{$status}); my $processID = pop(@keys); # Sanity checks. if (!defined($processID) || !exists($status->{$processID}->{'exitcode'}) || !defined($status->{$processID}->{'exitcode'})) { $LOG->error("Error: Unable to retrieve job status from spawned process."); Carp::croak "Error: Unable to retrieve job status from spawned process.\n"; } # TODO: We may want to report this suspicious activity back to the Agent; # perhaps force the Agent to do an early integrity check, to make sure nothing # sketchy is going on. # Check to make sure the exitcode is '293', meaning, that the # application didn't unexpectedly die early. if ($status->{$processID}->{'exitcode'} != 293) { $LOG->warn("Unexpected: '" . $processName . "' process (ID = " . $processID . ") terminated early!"); } # Return the modified object state. return $self; } ####################################################################### 1; ####################################################################### # Additional Module Documentation # ####################################################################### __END__