MultiProcFactory - Base class for multiprocess batch processing.


MultiProcFactory documentation  | view source Contained in the MultiProcFactory distribution.

NAME

MultiProcFactory - Base class for multiprocess batch processing.

SYNOPSIS

  #!/usr/bin/perl

  use strict;
  use MultiProcFactory;

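  # Runs in every child: bump the shared scalar, then record this
  # child's PID and the counter value under its process key.
  my $do_child = sub {
    my $self = shift;
    $self->inc_scalar();
    $self->set_hash_element($self->get_prockey() => " $$: " . $self->get_scalar());
  };

  # Runs in the parent once all children are complete: log what each
  # child stored in the shared hash.
  my $do_parent_final = sub {
    my $self = shift;

    foreach my $key ($self->get_prockeys()) {
      my $value = $self->get_hash_element($key);
      $self->log_parent("$key: $value\n");
    }
  };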

  my $link_obj = MultiProcFactory->factory(
    work_by => 'MultiProcFactory::Generic',
    do_child => $do_child,
    do_parent_final => $do_parent_final,
    partition_list => [
      'A',
      'B',
      'C',
      'D',
      'E',
      'F',
      'G',
      'H'
    ]
  );

  $link_obj->run();

ABSTRACT

  This is a factory framework interface for multiprocess batch processing.
  To use it, write a class that inherits from this class and fits your data model.
  It can be a very powerful data processing tool for system-wide application patterns.

DESCRIPTION

This class is a factory class for multiprocess batch processing. The processing bins are defined in the subclass whose object factory() returns. The run() method manages the child processes and executes the supplied code references. Depending on the subclass logic, do_child can be executed as an iterator for batch processing. Shared memory through IPC::Shareable (SysV IPC) is available by default: two shared variables are set up, a scalar and a hash.
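The kind of process management described above can be pictured with core Perl alone. The following is a minimal sketch, not the module's actual implementation: the helper name run_bins, its arguments and the exit-status convention are all assumptions made for illustration. It forks one child per bin, never keeps more than $max_proc children alive at once, and collects each child's exit status in the parent.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MultiProcFactory: fork one child per
# bin, keeping at most $max_proc children alive at any time.
sub run_bins {
    my ($max_proc, $do_child, @bins) = @_;
    my (%bin_for_pid, %exit_for_bin);

    # Wait for any one child and record its exit status under its bin.
    my $reap_one = sub {
        my $pid = wait();
        if ($pid == -1) {               # no children left to wait for
            %bin_for_pid = ();
            return;
        }
        my $bin = delete $bin_for_pid{$pid};
        $exit_for_bin{$bin} = $? >> 8 if defined $bin;
    };

    for my $bin (@bins) {
        # Block until a slot is free.
        $reap_one->() while keys(%bin_for_pid) >= $max_proc;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: process exactly one bin, then exit with its result.
            exit($do_child->($bin));
        }
        $bin_for_pid{$pid} = $bin;      # parent bookkeeping
    }

    $reap_one->() while %bin_for_pid;   # drain the remaining children
    return \%exit_for_bin;
}

# Each child exits with the position of its letter in the alphabet.
my $status = run_bins(2, sub { ord($_[0]) - ord('A') + 1 }, qw(A B C D));
print "$_ exited with $status->{$_}\n" for sort keys %$status;
```

Passing results back through the exit status only works for small integers; the shared scalar and hash described in this document are the richer channel.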

PUBLIC METHODS

factory()

This method takes all constructor arguments. Your subclass may need additional parameters.

Base Class Required Parameters

* work_by => 'BFI::MultiProcFactory::Schema::Mailing' ## package name of the subclass to use
* do_child => $code_ref_a ## code executed in each child
* do_parent_final => $code_ref_b ## code executed by the parent when the child processes are complete

Base Class Optional Parameters

* max_proc => N ## max number of concurrent child processes (default 20)
* log_children => 0|1 (default 0)
* log_parent => 0|1 (default 1)
* do_parent_init => $code_ref_c ## code executed in the parent before forking
* parent_sig => {INT => $coderef, TERM => $coderef, ...}
* child_sig => {INT => $coderef, TERM => $coderef, ...}
* IPC_OFF => 0|1 (default 0) ## set to 1 to turn off the default allocation of shared memory
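As an illustration, the required and optional parameters can be collected in one argument list before it is handed to factory(). This is a sketch under assumptions: the values and handler bodies are only examples, HUP is chosen so that the default handlers described under SIGNALS are left alone, and the factory() call is commented out so the fragment runs without the module installed.

```perl
use strict;
use warnings;

my %args = (
    # Required parameters
    work_by         => 'MultiProcFactory::Generic',
    do_child        => sub { my $self = shift; $self->inc_scalar() },
    do_parent_final => sub { my $self = shift; $self->log_parent("done\n") },

    # Optional parameters (example values only)
    max_proc       => 4,    # default is 20
    log_children   => 1,    # default is 0
    log_parent     => 1,    # default is 1
    do_parent_init => sub { warn "about to fork\n" },
    parent_sig     => { HUP => sub { warn "parent received HUP\n" } },
    child_sig      => { HUP => sub { exit 1 } },

    # Subclass parameter used by MultiProcFactory::Generic in the SYNOPSIS
    partition_list => [qw(A B C D)],
);

# With the module installed:
# my $factory = MultiProcFactory->factory(%args);
# $factory->run();
```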

run()

This method is called after initialization. It contains all forking and subroutine execution logic.

log_parent()

Logs the input string to the parent filehandle.

log_child()

Logs the input string to the child filehandle.

set_parent_logname()

Default: $0 minus any extensions, followed by '.log'. Override the default by redefining this method in a subclass.

set_child_logname()

Default: $0 minus any extensions, followed by "_$instance.log". Override the default by redefining this method in a subclass.
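The default log names can be reproduced with a short sketch. The helper names below are hypothetical, and reading "minus any extensions" as "strip every trailing dot-suffix" is an assumption; the module's own code may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper: drop trailing extensions such as ".pl" or ".tar.gz"
# while leaving a leading "./" untouched.
sub strip_extensions {
    my ($script) = @_;
    (my $base = $script) =~ s{(?:\.[^./]+)+\z}{};
    return $base;
}

# Parent log: script name without extensions, plus ".log".
sub parent_logname {
    my ($script) = @_;
    return strip_extensions($script) . '.log';
}

# Child log: script name without extensions, plus "_<instance>.log".
sub child_logname {
    my ($script, $instance) = @_;
    return strip_extensions($script) . '_' . $instance . '.log';
}

print parent_logname('./batch.pl'), "\n";
print child_logname('./batch.pl', 'A'), "\n";
```

In a real script the first argument would be $0.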

get_prockey()

Returns the current child's process key. This key maps back to the process slot in partition_hash. It has no meaning when called from the parent, where it should return undef.

get_prockeys()

Returns the list of process keys in partition_hash, used for iterating over all children.

scalar_lock()

Wrapper for IPC::Shareable shlock() on the shared scalar.

hash_lock()

Wrapper for IPC::Shareable shlock() on the shared hash.

scalar_unlock()

Wrapper for IPC::Shareable shunlock() on the shared scalar.

hash_unlock()

Wrapper for IPC::Shareable shunlock() on the shared hash.

set_hash_element()

Wrapper to set key => value in the shared hash. Calls hash_lock() and hash_unlock().

get_hash_element()

Wrapper to get the value stored in the shared hash under $key.

set_scalar()

Wrapper to set the shared scalar to $value.

inc_scalar()

Wrapper to increment the current value of the shared scalar by 1.

dec_scalar()

Wrapper to decrement the current value of the shared scalar by 1.

get_scalar()

Wrapper to read the shared scalar value.
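When a child has to read, modify and write back the shared scalar as one step, the lock wrappers can bracket the whole sequence. The sketch below shows that pattern inside a do_child code reference. So that it runs without the module or IPC::Shareable installed, it uses a hypothetical in-process stand-in class, Local::FakeFactory, that only mimics the documented method names and counts lock calls; it performs no real locking. Whether set_scalar() or inc_scalar() already lock internally is not stated in this document, so check the source before nesting locks in real code.

```perl
use strict;
use warnings;

# Hypothetical stand-in that mimics the documented method names in a single
# process. It only counts lock and unlock calls.
package Local::FakeFactory;

sub new {
    my ($class) = @_;
    return bless({ value => 0, locks => 0, unlocks => 0 }, $class);
}
sub scalar_lock   { my $self = shift; $self->{locks}++;   return }
sub scalar_unlock { my $self = shift; $self->{unlocks}++; return }
sub get_scalar    { my $self = shift; return $self->{value} }
sub set_scalar    { my ($self, $value) = @_; $self->{value} = $value; return }

package main;

# do_child body: add 10 to the shared scalar as one guarded step.
my $do_child = sub {
    my $self = shift;
    $self->scalar_lock();                 # keep other children out
    my $current = $self->get_scalar();    # read
    $self->set_scalar($current + 10);     # modify and write back
    $self->scalar_unlock();               # release the lock
};

my $fake = Local::FakeFactory->new();
$do_child->($fake) for 1 .. 3;
print $fake->get_scalar(), "\n";
```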

PRIVATE METHODS

new()

Called internally by factory().

_set_do_child()

Sets the child code reference.

_set_do_parent_init()

Sets the parent initialization code reference.

_set_do_parent_final()

Sets the parent cleanup code reference.

_set_parent_signals()

Sets the parent signal handlers if they are passed in with the hash ref parent_sig => {}. This allows you to override the default signal handling behavior.

_set_child_signals()

Sets the child signal handlers if they are passed in with the hash ref child_sig => {}. This allows you to override the default signal handling behavior.

SIGNALS

* Parent - by default TERM, ABRT, INT and QUIT are set to call IPC::Shareable->clean_up. Unless you like calling ipcrm, this is a good thing.

* Child - by default TERM, ABRT, INT and QUIT are reset to undef.
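The parent_sig and child_sig hash references map signal names to code references. A minimal sketch of how such a hash could be applied, assuming plain assignment into Perl's %SIG; the helper name install_signals is hypothetical and the module's internals may differ:

```perl
use strict;
use warnings;

# Hypothetical helper: install every handler from a { NAME => coderef } hash.
sub install_signals {
    my ($handlers) = @_;
    $SIG{$_} = $handlers->{$_} for keys %$handlers;
    return;
}

my $seen = 0;
install_signals({ USR1 => sub { $seen++ } });

kill 'USR1', $$;         # send the signal to this very process
sleep 1 unless $seen;    # give a deferred handler a chance to run
print "USR1 handled $seen time(s)\n";
```

Keep in mind that replacing the parent's default TERM, ABRT, INT or QUIT handler also replaces the shared memory clean-up described above, so a custom handler for those signals would have to take care of it.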

SHARED MEMORY

* Sets up two shared variables with IPC::Shareable, a scalar and a hash.

* For the curious, semaphores and memory are stored in:

  * $self->{share_scalar}{handle}
  * $self->{share_scalar}{var}
  * $self->{share_hash}{handle}
  * $self->{share_hash}{var}

PUBLIC DATA

* $self->{prockey} - defines each process bin

INTERFACE IMPLEMENTATION METHODS

init()

Called from the constructor, in the parent. It contains the partitioning algorithm, which bins data into $self->{partition_hash}. A child process is forked for each of these bins.

do_child_init()

This method does any basic child process level initialization.

work()

At the bare minimum, this method must call do_child(). It can be written to iterate do_child() over a result set.
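The three interface methods can be sketched as a small subclass. Everything beyond the method names, $self->{partition_hash} and $self->{prockey} is an assumption made for illustration: the parameters items and chunk_size, the array references stored per bin, the current_item slot, and the stand-in base class Local::StubBase, which exists only so the sketch runs without the module installed. With the real module, the subclass would inherit from MultiProcFactory instead.

```perl
use strict;
use warnings;

# Stand-in for the real base class (hypothetical, for this sketch only).
package Local::StubBase;

sub new {
    my ($class, %args) = @_;
    my $self = bless({ %args }, $class);
    $self->init();                       # the constructor calls init()
    return $self;
}
sub do_child { my $self = shift; return $self->{do_child}->($self) }

package Local::ByChunk;
our @ISA = ('Local::StubBase');

# init(): bin the input into $self->{partition_hash}, one key per child.
sub init {
    my $self  = shift;
    my @items = @{ $self->{items} };
    my $n     = 0;
    while (my @chunk = splice(@items, 0, $self->{chunk_size})) {
        $self->{partition_hash}{ 'chunk' . $n++ } = [@chunk];
    }
    return;
}

# do_child_init(): basic per-child setup.
sub do_child_init { my $self = shift; $self->{current_item} = undef; return }

# work(): iterate do_child() over the items in this child's bin.
sub work {
    my $self = shift;
    for my $item (@{ $self->{partition_hash}{ $self->{prockey} } }) {
        $self->{current_item} = $item;
        $self->do_child();
    }
    return;
}

package main;

my @seen;
my $obj = Local::ByChunk->new(
    items      => [1 .. 5],
    chunk_size => 2,
    do_child   => sub { my $self = shift; push @seen, $self->{current_item} },
);

# Simulate what one forked child would do for the bin 'chunk1'.
$obj->{prockey} = 'chunk1';
$obj->do_child_init();
$obj->work();
print "chunk1 processed: @seen\n";
```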

AUTHOR

Aaron Dancygier, <adancygier@bigfootinteractive.com>

COPYRIGHT AND LICENSE

SEE ALSO

perl(1), IPC::Shareable, MultiProcFactory::Generic

