| Gearman-Driver documentation | Contained in the Gearman-Driver distribution. |
Gearman::Driver - Manages Gearman workers
package My::Workers::One;

# Yes, you need to do it exactly this way
use base qw(Gearman::Driver::Worker);
use Moose;

# this method will be registered with gearmand as 'My::Workers::One::scale_image'
sub scale_image : Job {
    my ( $self, $job, $workload ) = @_;
    # do something
}

# this method will be registered with gearmand as 'My::Workers::One::do_something_else'
sub do_something_else : Job : MinProcesses(2) : MaxProcesses(15) {
    my ( $self, $job, $workload ) = @_;
    # do something
}

# this method won't be registered with gearmand at all
sub do_something_internal {
    my ( $self, $job, $workload ) = @_;
    # do something
}

1;

package My::Workers::Two;

use base qw(Gearman::Driver::Worker);
use Moose;

# this method will be registered with gearmand as 'My::Workers::Two::scale_image'
sub scale_image : Job {
    my ( $self, $job, $workload ) = @_;
    # do something
}

1;

package main;

use Gearman::Driver;

my $driver = Gearman::Driver->new(
    namespaces => [qw(My::Workers)],
    server     => 'localhost:4730,otherhost:4731',
    interval   => 60,
);

# Alternatively, keep all settings in a YAML config file and read them from there:
# my $driver = Gearman::Driver->new( configfile => '/etc/gearman-driver/config.yml' );

$driver->run;
Warning: This framework is still EXPERIMENTAL!
Having hundreds of Gearman workers running in separate processes can consume a lot of RAM. Often many of these workers share the same code/objects, for example a database layer built on DBIx::Class. This is where Gearman::Driver comes in handy:
You write a base class which inherits from Gearman::Driver::Worker. Your base class loads the shared code, for example your database layer. Each of your worker classes inherits from that base class. In the worker classes you can register single methods as jobs with gearmand. You can even control how many workers run a given job/method in parallel. This is where you save RAM: instead of starting each worker in a separate process, Gearman::Driver forks each worker from the main process. On Linux this takes advantage of copy-on-write, so memory pages loaded before the fork are shared between workers until one of them writes to them.
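The fork-from-one-parent idea can be sketched in plain Perl. This is only an illustration of the pattern, not Gearman::Driver's own code: the %shared hash stands in for an expensive shared layer, and fork_workers is a hypothetical helper name.

```perl
use strict;
use warnings;

# Stand-in for something expensive loaded once in the parent
# (for example a database schema). Hypothetical data.
my %shared = map { $_ => "row $_" } 1 .. 1000;

# Fork $count children, run $work->($n) in each of them, and return
# how many children exited with status 0.
sub fork_workers {
    my ( $count, $work ) = @_;
    my @pids;
    for my $n ( 1 .. $count ) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: %shared is already in memory, nothing is reloaded.
            exit( $work->($n) ? 0 : 1 );
        }
        push @pids, $pid;
    }
    my $ok = 0;
    for my $pid (@pids) {
        waitpid( $pid, 0 );
        $ok++ if $? == 0;
    }
    return $ok;
}

my $finished = fork_workers( 3, sub { my ($n) = @_; return exists $shared{$n} } );
print "$finished workers finished cleanly\n";
```

Each child only reads %shared here, which is the case where copy-on-write helps most; pages a child writes to are copied for that child.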
There's only one mandatory parameter which has to be set when calling the constructor: namespaces
use Gearman::Driver;
my $driver = Gearman::Driver->new( namespaces => [qw(My::Workers)] );
See also: namespaces. If you do not set the server (gearmand) attribute, the default is used:
localhost:4730
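The server value is a comma-separated list of host[:port] entries, as in the synopsis ('localhost:4730,otherhost:4731'). The following sketch shows one way such a string could be split up. parse_servers is a hypothetical helper, not part of Gearman::Driver, and falling back to port 4730 when the port is omitted is an assumption made for illustration.

```perl
use strict;
use warnings;

# Split "host[:port][,host[:port]]" into a list of { host, port } hashes.
# Assumption: an omitted port falls back to 4730.
sub parse_servers {
    my ($list) = @_;
    my @servers;
    for my $entry ( split /,/, $list ) {
        my ( $host, $port ) = split /:/, $entry, 2;
        push @servers, { host => $host, port => defined $port ? $port : 4730 };
    }
    return @servers;
}

for my $s ( parse_servers('localhost:4730,otherhost:4731,thirdhost') ) {
    print "$s->{host} -> $s->{port}\n";
}
```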
Each module found in your namespaces will be loaded and introspected, looking for methods having the 'Job' attribute set:
package My::Workers::ONE;
sub scale_image : Job {
    my ( $self, $job, $workload ) = @_;
    # do something
}
This method will be registered as a job function with gearmand. You can verify this as follows:
plu@mbp ~$ telnet localhost 4730
Trying ::1...
Connected to localhost.
Escape character is '^]'.
status
My::Workers::ONE::scale_image 0 0 1
.
^]
telnet> Connection closed.
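A status line like the one above can be turned into a small hash for further processing. This is a sketch, not the parser used by Net::Telnet::Gearman: parse_status_line is a hypothetical name, the column order (function name, queued jobs, running jobs, available workers) follows gearmand's admin "status" command, and computing free as running minus busy is an assumption chosen to match the $status keys shown later in this document.

```perl
use strict;
use warnings;

# Parse one line of gearmand "status" output.
# Columns are separated by whitespace (tabs on the wire).
sub parse_status_line {
    my ($line) = @_;
    my ( $name, $queue, $busy, $running ) = split /\s+/, $line;
    return {
        name    => $name,
        queue   => $queue,               # jobs waiting or running
        busy    => $busy,                # jobs currently being worked on
        running => $running,             # workers able to do this job
        free    => $running - $busy,     # assumption: idle workers
    };
}

my $status = parse_status_line('My::Workers::ONE::scale_image 0 0 1');
print "$status->{name}: queue=$status->{queue} free=$status->{free}\n";
```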
If you don't want to use the full package name, you can also specify a custom prefix:
package My::Workers::ONE;
sub prefix { 'foo_bar_' }
sub scale_image : Job {
    my ( $self, $job, $workload ) = @_;
    # do something
}
This would register 'foo_bar_scale_image' with gearmand.
See also: prefix
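The naming rule can be sketched as "prefix plus method name". The classes below are hypothetical stand-ins, not Gearman::Driver::Worker itself, and the default prefix of package name plus '::' is an assumption that is consistent with the registered names shown above.

```perl
use strict;
use warnings;

package Sketch::Worker;    # hypothetical base class
sub new    { my ($class) = @_; return bless {}, $class }
sub prefix { my ($self) = @_; return ref($self) . '::' }    # assumed default

package Sketch::Custom;    # hypothetical worker with a custom prefix
our @ISA = ('Sketch::Worker');
sub prefix { 'foo_bar_' }

package main;

# The name registered with gearmand: the worker's prefix plus the method name.
sub registered_name {
    my ( $worker, $method ) = @_;
    return $worker->prefix . $method;
}

print registered_name( Sketch::Worker->new, 'scale_image' ), "\n";
print registered_name( Sketch::Custom->new, 'scale_image' ), "\n";
```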
See also ATTRIBUTES in Gearman::Driver::Loader.
server: A list of Gearman servers the workers should connect to. The format for the server list is: host[:port][,host[:port]]
See also: Gearman::XS
server defaults to localhost:4730 (isa: Str).
console_port: Gearman::Driver has a telnet management console, see also: Gearman::Driver::Console
console_port defaults to 47300 (isa: Int). Set it to 0 to disable the management console entirely.
interval: Every n seconds, Net::Telnet::Gearman is used in Gearman::Driver::Observer to check the status of free/running/busy workers on gearmand. The result is used to fork more workers depending on the queue size and the MinProcesses/MaxProcesses attributes of the job method. See also: Gearman::Driver::Worker
interval defaults to 5 (isa: Int).
max_idle_time: Whenever Gearman::Driver::Observer notices that more processes are running than necessary (depending on the min_processes and max_processes settings), it kills them. By default this happens immediately. If you change this value to 300, an unnecessary process is killed after 300 seconds.
Remember that this also depends on the value of interval: max_idle_time is only checked every n seconds, where n is interval. Besides that, it only makes sense for workers where MinProcesses in Gearman::Driver::Worker is set to 0.
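The interplay of queue size, MinProcesses/MaxProcesses and max_idle_time can be sketched as a pure function. It follows the decision made in _observer_callback in the module source further down, simplified to a single job. plan_processes and its parameter names are hypothetical, and clamping a negative start count to 0 is a simplification made here.

```perl
use strict;
use warnings;

# Returns a positive number of processes to start, a negative number of
# processes to stop, or 0 to leave the job alone.
sub plan_processes {
    my (%a) = @_;
    my ( $count, $min, $max ) = @a{qw(count min_processes max_processes)};
    my ( $queue, $busy )      = @a{qw(queue busy)};

    # Jobs are queued and every running process is busy: start more,
    # but never exceed max_processes.
    if ( $queue && $count <= $busy ) {
        my $want  = $queue - $busy;
        my $free  = $max - $count;
        my $start = $want < $free ? $want : $free;
        return $start > 0 ? $start : 0;
    }

    # Nothing queued and more processes than min_processes: stop the
    # surplus once the job has been idle for at least max_idle_time.
    if ( $queue == 0 && $count > $min && $a{idle} >= $a{max_idle_time} ) {
        return -( $count - $min );
    }

    return 0;
}

print plan_processes(
    count => 1, min_processes => 1, max_processes => 5,
    queue => 6, busy => 1, idle => 0, max_idle_time => 0,
), "\n";
```

With max_idle_time left at 0, the second branch fires on the first check after the queue drains, which is the "killed immediately" behaviour described above.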
max_idle_time defaults to 0 (isa: Int).
logfile: Path to logfile.
logfile defaults to gearman_driver.log (isa: Str).
loglayout: See also Log::Log4perl.
loglayout defaults to [%d] %p %m%n (isa: Str).
loglevel: See also Log::Log4perl.
loglevel defaults to INFO (isa: Str).
unknown_job_callback: Whenever Gearman::Driver::Observer sees a job that isn't handled, it calls this CodeRef, passing the following arguments: $driver and $status.
my $driver = Gearman::Driver->new(
    namespaces           => [qw(My::Workers)],
    unknown_job_callback => sub {
        my ( $driver, $status ) = @_;
        # notify nagios here for example
    }
);
$status might look like:
$VAR1 = {
    'busy'    => 0,
    'free'    => 0,
    'name'    => 'GDExamples::Convert::unknown_job',
    'queue'   => 6,
    'running' => 0
};
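When the callback fires can be sketched as a filter over status rows. According to the module source further down, the callback is invoked for rows whose name matches no registered job method and whose queue is greater than 0. unknown_jobs and the sample data are hypothetical.

```perl
use strict;
use warnings;

# $known:  hashref whose keys are the registered job names
# $status: arrayref of status rows like the $VAR1 dump above
# Returns the rows for which unknown_job_callback would be called.
sub unknown_jobs {
    my ( $known, $status ) = @_;
    return grep { !$known->{ $_->{name} } && $_->{queue} > 0 } @$status;
}

my %known  = ( 'GDExamples::Convert::convert_to_jpeg' => 1 );
my @status = (
    { name => 'GDExamples::Convert::convert_to_jpeg', queue => 3, busy => 1, free => 0, running => 1 },
    { name => 'GDExamples::Convert::unknown_job',     queue => 6, busy => 0, free => 0, running => 0 },
    { name => 'GDExamples::Convert::stale_job',       queue => 0, busy => 0, free => 0, running => 0 },
);

print "unhandled: $_->{name} (queue: $_->{queue})\n" for unknown_jobs( \%known, \@status );
```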
worker_options: You can pass runtime options to the worker modules. They are passed on to the worker constructor.
Example:
has worker_options => (
    default => sub {
        {
            'My::App::Worker::MysqlPing' => {
                'dsn' => 'DBI:mysql:database=test;host=localhost;mysql_auto_reconnect=1;mysql_enable_utf8=1;mysql_server_prepare=1;',
            },
            'My::App::Worker::ImageThumbnail' => {
                'default_format' => 'jpeg',
                'default_size'   => '133x100',
            }
        }
    }
);
You should define these in a runtime config file (see also configfile), which might look like this:
---
worker_options:
    'My::App::Worker::MysqlPing':
        'dsn': 'DBI:mysql:database=test;host=localhost;mysql_auto_reconnect=1;mysql_enable_utf8=1;mysql_server_prepare=1;'
        'user': 'root'
        'password': ''
    'My::App::Worker::ImageThumbnail':
        'default_format': 'jpeg'
        'default_size': '133x100'
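How these options reach a worker can be sketched as follows. In the module source further down, _add_jobs looks up the options by module name, adds the server setting, and passes the hashref to the worker constructor. The class Sketch::Worker::ImageThumbnail and the helper build_worker are hypothetical stand-ins, and unlike the original this sketch copies the options hash instead of modifying it in place.

```perl
use strict;
use warnings;

package Sketch::Worker::ImageThumbnail;    # hypothetical worker class

sub new {
    my ( $class, $options ) = @_;
    return bless { %{ $options || {} } }, $class;
}

package main;

my %worker_options = (
    'Sketch::Worker::ImageThumbnail' => {
        default_format => 'jpeg',
        default_size   => '133x100',
    },
);

# Look up the options for $module, add the server setting, and hand
# everything to the worker constructor as a single hashref.
sub build_worker {
    my ( $module, $options, $server ) = @_;
    my %module_options = %{ $options->{$module} || {} };
    $module_options{server} = $server;
    return $module->new( \%module_options );
}

my $worker = build_worker( 'Sketch::Worker::ImageThumbnail', \%worker_options, 'localhost:4730' );
print "$worker->{default_format} $worker->{default_size} $worker->{server}\n";
```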
job_runtime_attributes: You can override a job attribute by its name here. This makes it easy to tune runtime-related job options (such as max_processes and min_processes): you change the options in a config file and no longer need to modify the worker code.
Currently only 'max_processes' and 'min_processes' make sense. The hash key is "worker_module::job_key", where job_key is the ProcessGroup attribute or the job method name.
# in your config file: /etc/gearman-driver.yml (YAML)
---
job_runtime_attributes:
    'My::App::Worker::job1':
        max_processes: 25
        min_processes: 2
    # job has a ProcessGroup attribute named 'group1'
    'My::App::Worker::group1':
        max_processes: 10
        min_processes: 2
# then run as:
gearman_driver.pl --configfile /etc/gearman-driver.yml
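The override rule can be sketched as a small merge: values declared on the method are used unless the config contains a defined value under the "worker_module::job_key" key. effective_limits is a hypothetical helper that mirrors the lookup in _add_jobs in the module source further down.

```perl
use strict;
use warnings;

# $config:   the job_runtime_attributes hashref from the config file
# $declared: limits taken from the MinProcesses/MaxProcesses attributes
sub effective_limits {
    my ( $config, $module, $job_key, $declared ) = @_;
    my $override = $config->{ $module . '::' . $job_key } || {};
    my %limits   = %$declared;
    for my $key (qw(min_processes max_processes)) {
        $limits{$key} = $override->{$key} if defined $override->{$key};
    }
    return \%limits;
}

my %config = (
    'My::App::Worker::job1' => { max_processes => 25, min_processes => 2 },
);

my $limits = effective_limits( \%config, 'My::App::Worker', 'job1',
    { min_processes => 1, max_processes => 5 } );
print "min=$limits->{min_processes} max=$limits->{max_processes}\n";
```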
configfile: Path to the runtime config file. You can provide a default configfile pathname like so:
has '+configfile' => ( default => '/etc/gearman-driver.yaml' );
You can also pass an array of filenames:
has '+configfile' => ( default => sub { [ '/etc/gearman-driver.yaml', '/opt/my-app/etc/config.yml' ] } );
This might be interesting for subclassing Gearman::Driver.
jobs: Stores all Gearman::Driver::Job instances. There are also two methods: get_job and has_job.
Example:
{
    'My::Workers::ONE::scale_image'       => bless( {...}, 'Gearman::Driver::Job' ),
    'My::Workers::ONE::do_something_else' => bless( {...}, 'Gearman::Driver::Job' ),
    'My::Workers::TWO::scale_image'       => bless( {...}, 'Gearman::Driver::Job' ),
}
jobs is read-only (isa: HashRef).
observer: Instance of Gearman::Driver::Observer.
observer is read-only (isa: Gearman::Driver::Observer).
console: Instance of Gearman::Driver::Console.
console is read-only (isa: Gearman::Driver::Console).
add_job: There's one mandatory param (hashref) with the following keys:
max_processes: Maximum number of processes that may be forked.
min_processes: Minimum number of processes that should be forked.
name: Job name/alias that the method should be registered with Gearman.
methods: ArrayRef of HashRefs containing the following keys:
    body: CodeRef to the job method.
    name: The name this method should be registered with gearmand.
    decode: Name of a decoder method in your worker object.
    encode: Name of an encoder method in your worker object.
worker: Worker object that should be passed as first parameter to the job method.
You never really need this method if you use namespaces. But namespaces depends on method attributes, which some people dislike. In that case, feel free to set up your $driver this way:
package My::Workers::One;

use Moose;
use JSON::XS;
extends 'Gearman::Driver::Worker::Base';

# this method will be registered with gearmand as 'My::Workers::One::scale_image'
sub scale_image {
    my ( $self, $job, $workload ) = @_;
    # do something
}

# this method will be registered with gearmand as 'My::Workers::One::do_something_else'
sub do_something_else {
    my ( $self, $job, $workload ) = @_;
    # do something
}

sub encode_json {
    my ( $self, $result ) = @_;
    return JSON::XS::encode_json($result);
}

sub decode_json {
    my ( $self, $workload ) = @_;
    return JSON::XS::decode_json($workload);
}

1;

package main;

use Gearman::Driver;
use My::Workers::One;

my $driver = Gearman::Driver->new(
    server   => 'localhost:4730,otherhost:4731',
    interval => 60,
);

my $worker = My::Workers::One->new();

# run each method in its own process
foreach my $method (qw(scale_image do_something_else)) {
    $driver->add_job(
        {
            max_processes => 5,
            min_processes => 1,
            name          => $method,
            worker        => $worker,
            methods       => [
                {
                    body   => $worker->meta->find_method_by_name($method)->body,
                    decode => 'decode_json',
                    encode => 'encode_json',
                    name   => $method,
                },
            ]
        }
    );
}

# share both methods in a single process
$driver->add_job(
    {
        max_processes => 5,
        min_processes => 1,
        name          => 'some_alias',
        worker        => $worker,
        methods       => [
            {
                body   => $worker->meta->find_method_by_name('scale_image')->body,
                decode => 'decode_json',
                encode => 'encode_json',
                name   => 'scale_image',
            },
            {
                body   => $worker->meta->find_method_by_name('do_something_else')->body,
                decode => 'decode_json',
                encode => 'encode_json',
                name   => 'do_something_else',
            },
        ]
    }
);

$driver->run;
get_jobs: Returns all Gearman::Driver::Job objects ordered by job name.
run: This must be called after the Gearman::Driver object is instantiated.
shutdown: Sends TERM signal to all child processes and exits Gearman::Driver.
has_job: Params: $name. Returns true if the job exists, false otherwise.
get_job: Params: $name. Returns the job instance.
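The semantics of these three accessors can be sketched with a plain hash keyed by job name. This is an illustration only: in Gearman::Driver they are provided through Moose on the jobs attribute, and the hashes below are hypothetical stand-ins for Gearman::Driver::Job objects.

```perl
use strict;
use warnings;

# Stand-ins for Gearman::Driver::Job instances, keyed by job name.
my %jobs = (
    'My::Workers::TWO::scale_image'       => { name => 'My::Workers::TWO::scale_image' },
    'My::Workers::ONE::scale_image'       => { name => 'My::Workers::ONE::scale_image' },
    'My::Workers::ONE::do_something_else' => { name => 'My::Workers::ONE::do_something_else' },
);

sub has_job  { my ( $jobs, $name ) = @_; return defined $jobs->{$name} }
sub get_job  { my ( $jobs, $name ) = @_; return $jobs->{$name} }
sub get_jobs { my ($jobs) = @_; return map { $jobs->{$_} } sort keys %$jobs }

print "$_->{name}\n" for get_jobs( \%jobs );
```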
There's also a script gearman_driver.pl which is installed with
this distribution. It just instantiates Gearman::Driver with its
default values, having most of the options exposed to the command
line using MooseX::Getopt.
usage: gearman_driver.pl [long options...]
--loglevel Log level (default: INFO)
--lib Example: --lib ./lib --lib /custom/lib
--server Gearman host[:port][,host[:port]]
--logfile Path to logfile (default: gearman_driver.log)
--console_port Port of management console (default: 47300)
--interval Interval in seconds (see Gearman::Driver::Observer)
--loglayout Log message layout (default: [%d] %p %m%n)
--namespaces Example: --namespaces My::Workers --namespaces My::OtherWorkers
--configfile Read options from this file. Example: --configfile ./etc/gearman-driver-config.yml
Johannes Plunien <plu@cpan.org>
Uwe Voelker, <uwe.voelker@gmx.de>
Night Sailer <nightsailer@gmail.com>
Copyright 2009 by Johannes Plunien
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
package Gearman::Driver;

use Moose;
use Moose::Util qw(apply_all_roles);
use Carp qw(croak);
use Gearman::Driver::Observer;
use Gearman::Driver::Console;
use Gearman::Driver::Job;
use Gearman::Driver::Job::Method;
use Log::Log4perl qw(:easy);
use MooseX::Types::Path::Class;
use POE;

with qw(MooseX::Log::Log4perl MooseX::SimpleConfig MooseX::Getopt Gearman::Driver::Loader);

our $VERSION = '0.02005';

has 'server' => (
    default       => 'localhost:4730',
    documentation => 'Gearman host[:port][,host[:port]]',
    is            => 'rw',
    isa           => 'Str',
    required      => 1,
);

has 'console_port' => (
    default       => 47300,
    documentation => 'Port of management console (default: 47300)',
    is            => 'rw',
    isa           => 'Int',
    required      => 1,
);

has 'interval' => (
    default       => '5',
    documentation => 'Interval in seconds (see Gearman::Driver::Observer)',
    is            => 'rw',
    isa           => 'Int',
    required      => 1,
);

has 'max_idle_time' => (
    default       => '0',
    documentation => 'How many seconds a worker may be idle before its killed',
    is            => 'rw',
    isa           => 'Int',
    required      => 1,
);

has 'logfile' => (
    coerce        => 1,
    default       => 'gearman_driver.log',
    documentation => 'Path to logfile (default: gearman_driver.log)',
    is            => 'rw',
    isa           => 'Path::Class::File',
);

has 'loglayout' => (
    default       => '[%d] %p %m%n',
    documentation => 'Log message layout (default: [%d] %p %m%n)',
    is            => 'rw',
    isa           => 'Str',
);

has 'loglevel' => (
    default       => 'INFO',
    documentation => 'Log level (default: INFO)',
    is            => 'rw',
    isa           => 'Str',
);

has 'unknown_job_callback' => (
    default => sub {
        sub { }
    },
    is     => 'rw',
    isa    => 'CodeRef',
    traits => [qw(NoGetopt)],
);

has 'worker_options' => (
    isa     => 'HashRef',
    is      => 'rw',
    default => sub { {} },
    traits  => [qw(Hash NoGetopt)],
);

has 'job_runtime_attributes' => (
    isa     => 'HashRef',
    is      => 'rw',
    default => sub { {} },
    traits  => [qw(Hash NoGetopt)],
);

has '+configfile' => (
    documentation => 'Gearman-driver runtime config path',
);

has 'jobs' => (
    default => sub { {} },
    handles => {
        _set_job => 'set',
        get_job  => 'get',
        has_job  => 'defined',
        all_jobs => 'values',
    },
    is     => 'ro',
    isa    => 'HashRef',
    traits => [qw(Hash NoGetopt)],
);

has 'observer' => (
    is     => 'ro',
    isa    => 'Gearman::Driver::Observer',
    traits => [qw(NoGetopt)],
);

has 'console' => (
    is     => 'ro',
    isa    => 'Gearman::Driver::Console',
    traits => [qw(NoGetopt)],
);

has 'session' => (
    is     => 'ro',
    isa    => 'POE::Session',
    traits => [qw(NoGetopt)],
);

has 'pid' => (
    default => $$,
    is      => 'ro',
    isa     => 'Int',
);

has '+logger'  => ( traits => [qw(NoGetopt)] );
has '+wanted'  => ( traits => [qw(NoGetopt)] );
has '+modules' => ( traits => [qw(NoGetopt)] );
sub add_job {
    my ( $self, $params ) = @_;

    $params->{name} = $params->{worker}->prefix . $params->{name};

    foreach my $key ( keys %$params ) {
        delete $params->{$key} unless defined $params->{$key};
    }

    my @methods = ();
    foreach my $args ( @{ delete $params->{methods} } ) {
        foreach my $key ( keys %$args ) {
            delete $args->{$key} unless defined $args->{$key};
        }
        $args->{name} = $params->{worker}->prefix . $args->{name};
        push @methods, Gearman::Driver::Job::Method->new( %$args, worker => $params->{worker} );
    }

    my $job = Gearman::Driver::Job->new(
        driver  => $self,
        methods => \@methods,
        %$params
    );

    $self->_set_job( $params->{name} => $job );

    $self->log->debug( sprintf "Added new job: %s (processes: %d)", $params->{name}, $params->{min_processes} || 1 );

    return 1;
}

sub get_jobs {
    my ($self) = @_;
    my @result = ();
    foreach my $name ( sort keys %{ $self->jobs } ) {
        push @result, $self->get_job($name);
    }
    return @result;
}

sub run {
    my ($self) = @_;
    push @INC, @{ $self->lib };
    $self->load_namespaces;
    $self->_start_observer;
    $self->_start_console;
    $self->_start_session;
    POE::Kernel->run();
}

sub shutdown {
    my ($self) = @_;
    POE::Kernel->signal( $self->{session}, 'TERM' );
}

sub DEMOLISH {
    my ($self) = @_;
    if ( $self->pid eq $$ ) {
        $self->shutdown;
    }
}
sub BUILD {
    my ($self) = @_;
    $self->_setup_logger;
}

sub _setup_logger {
    my ($self) = @_;
    Log::Log4perl->easy_init(
        {
            file   => sprintf( '>>%s', $self->logfile ),
            layout => $self->loglayout,
            level  => $self->loglevel,
        },
    );
}

sub _start_observer {
    my ($self) = @_;
    if ( $self->interval > 0 ) {
        $self->{observer} = Gearman::Driver::Observer->new(
            callback => sub {
                my ($response) = @_;
                $self->_observer_callback($response);
            },
            interval => $self->interval,
            server   => $self->server,
        );
    }
}

sub _start_console {
    my ($self) = @_;
    if ( $self->console_port > 0 ) {
        $self->{console} = Gearman::Driver::Console->new(
            driver => $self,
            port   => $self->console_port,
        );
    }
}

sub _observer_callback {
    my ( $self, $response ) = @_;

    # When $job->add_process is called and ProcessGroup is used
    # this may end up in a race condition and more processes than
    # wanted are started. To fix that we remember what kind of
    # processes we need to start in each single run of this callback.
    my %to_start = ();

    my $status = $response->{data};
    foreach my $row (@$status) {
        if ( my $job = $self->_find_job( $row->{name} ) ) {
            $to_start{$job->name} ||= 0;
            if ( $job->count_processes <= $row->{busy} && $row->{queue} ) {
                my $diff = $row->{queue} - $row->{busy};
                my $free = $job->max_processes - $job->count_processes;
                if ($free) {
                    my $start = $diff > $free ? $free : $diff;
                    $to_start{$job->name} += $start;
                }
            }
            elsif ( $job->count_processes && $job->count_processes > $job->min_processes && $row->{queue} == 0 ) {
                my $idle = time - $job->lastrun;
                if ( $idle >= $self->max_idle_time ) {
                    my $stop = $job->count_processes - $job->min_processes;
                    $self->log->debug( sprintf "Stopping %d process(es) of type %s (idle: %d)", $stop, $job->name, $idle );
                    $job->remove_process for 1 .. $stop;
                }
            }
        }
        else {
            $self->unknown_job_callback->( $self, $row ) if $row->{queue} > 0;
        }
    }

    foreach my $name (keys %to_start) {
        my $job   = $self->get_job($name);
        my $start = $to_start{$name};
        my $free  = $job->max_processes - $job->count_processes;
        $start = $free if $start > $free;
        if ($start) {
            $self->log->debug( sprintf "Starting %d new process(es) of type %s", $start, $job->name );
            $job->add_process for 1 .. $start;
        }
    }

    my $error = $response->{error};
    foreach my $e (@$error) {
        $self->log->error( sprintf "Gearman::Driver::Observer: %s", $e );
    }
}

sub _find_job {
    my ( $self, $name ) = @_;
    foreach my $job ( $self->all_jobs ) {
        foreach my $method ( @{ $job->methods } ) {
            return $job if $method->name eq $name;
        }
    }
    return 0;
}

sub _start_session {
    my ($self) = @_;
    $self->{session} = POE::Session->create(
        object_states => [
            $self => {
                _start            => '_start',
                got_sig           => '_on_sig',
                monitor_processes => '_monitor_processes',
            }
        ]
    );
}

sub _on_sig {
    my ( $self, $kernel, $heap ) = @_[ OBJECT, KERNEL, HEAP ];

    foreach my $job ( $self->get_jobs ) {
        foreach my $process ( $job->get_processes ) {
            $self->log->info( sprintf '(%d) [%s] Process killed', $process->PID, $job->name );
            $process->kill();
        }
    }

    $kernel->sig_handled();

    exit(0);
}

sub _start {
    $_[KERNEL]->sig( $_ => 'got_sig' ) for qw(INT QUIT ABRT KILL TERM);
    $_[OBJECT]->_add_jobs;
    $_[OBJECT]->_start_jobs;
    $_[KERNEL]->delay( monitor_processes => 5 );
}

sub _add_jobs {
    my ($self) = @_;
    my $worker_options         = $self->worker_options;
    my $job_runtime_attributes = $self->job_runtime_attributes;

    foreach my $module ( $self->get_modules ) {
        my $module_options = $worker_options->{$module} || {};
        $module_options->{server} = $self->server;
        my $worker = $module->new( $module_options );
        my %methods = ();
        foreach my $method ( $module->meta->get_nearest_methods_with_attributes ) {
            apply_all_roles( $method => 'Gearman::Driver::Worker::AttributeParser' );

            $method->default_attributes( $worker->default_attributes );
            $method->override_attributes( $worker->override_attributes );

            next unless $method->has_attribute('Job');

            my $name = $method->get_attribute('ProcessGroup') || $method->name;
            $methods{$name} ||= [];
            push @{ $methods{$name} }, $method;
        }

        foreach my $name ( keys %methods ) {
            my @methods = ();
            my ( $min_processes, $max_processes );

            foreach my $method ( @{ $methods{$name} } ) {
                warn sprintf "MinProcesses redefined in ProcessGroup(%s) at %s::%s", $method->get_attribute('ProcessGroup'), ref($worker), $method->name
                  if defined $min_processes && $method->has_attribute('MinProcesses');

                warn sprintf "MaxProcesses redefined in ProcessGroup(%s) at %s::%s", $method->get_attribute('ProcessGroup'), ref($worker), $method->name
                  if defined $max_processes && $method->has_attribute('MaxProcesses');

                $min_processes ||= $method->get_attribute('MinProcesses');
                $max_processes ||= $method->get_attribute('MaxProcesses');

                push @methods,
                  {
                    body   => $method->body,
                    name   => $method->name,
                    decode => $method->get_attribute('Decode'),
                    encode => $method->get_attribute('Encode'),
                  };
            }

            my $job_runtime_attributes = $self->job_runtime_attributes->{$module.'::'.$name} || {};
            if (defined $job_runtime_attributes->{min_processes} ) {
                $min_processes = $job_runtime_attributes->{min_processes} ;
            }
            if (defined $job_runtime_attributes->{max_processes}) {
                $max_processes = $job_runtime_attributes->{max_processes};
            }

            $self->add_job(
                {
                    max_processes => $max_processes,
                    min_processes => $min_processes,
                    methods       => \@methods,
                    name          => $name,
                    worker        => $worker,
                }
            );
        }
    }
}

sub _start_jobs {
    my ($self) = @_;

    foreach my $job ( $self->get_jobs ) {
        for ( 1 .. $job->min_processes ) {
            $job->add_process();
        }
    }
}

sub _monitor_processes {
    my $self = $_[OBJECT];
    foreach my $job ( $self->get_jobs ) {
        if ( $job->count_processes < $job->min_processes ) {
            my $start = $job->min_processes - $job->count_processes;
            $self->log->debug( sprintf "Starting %d new process(es) of type %s", $start, $job->name );
            $job->add_process for 1 .. $start;
        }
    }
    $_[KERNEL]->delay( monitor_processes => 5 );
}

no Moose;

__PACKAGE__->meta->make_immutable;
1;