Revision history for Perl extension Parallel::Depend (most recent first).
4.09 Tue Jun 30 12:14:45 EDT 2009
4.08 Sun Jun 28 18:52:54 EDT 2009
This has an advantage for long-lived schedules since purging the extra data keeps the queue at a reasonable size if multiple sets of ad-hoc jobs are added over time.
Catch: The object that dispatched ad_hoc may not be the one used to unalias the job within execute.
Fix: the ad_hoc method adds the current object into the attributes for its hidden group. Within unalias, the ad_hoc_mgr attribute is used if it exists to resolve method calls.
The object will stay alive for the group execution due to being stored in the group's attributes, which are purged on the way out of the group (see previous item).
Result is that a main handler can use something like:
my $child = $other_class->new;
$mgr->share_queue( $child );
...
$child->ad_hoc( @stuff );
and know that $child will be used to
dispatch the jobs added via ad_hoc
and also that $child will be destroyed
after all of its ad-hoc jobs have been
completed.
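The lifetime rule above can be illustrated with plain Perl reference counting. This is only a sketch: Toy::Child and %group_attr are hypothetical stand-ins rather than module code; only the attribute name ad_hoc_mgr comes from the notes above.

```perl
use strict;
use warnings;

my @destroyed;

package Toy::Child;    # hypothetical stand-in for $other_class
sub new     { my ( $class, $name ) = @_; return bless { name => $name }, $class }
sub DESTROY { push @destroyed, $_[0]{name} }

package main;

my %group_attr;        # hypothetical per-group attribute storage

{
    my $child = Toy::Child->new( 'child-1' );
    $group_attr{ad_hoc_mgr} = $child;    # the stored reference keeps it alive
}

# $child is out of scope, but the attributes still hold the object.
my $alive_after_scope = @destroyed == 0;

# Purging the attributes drops the last reference and triggers DESTROY.
delete $group_attr{ad_hoc_mgr};
my $gone_after_purge = @destroyed == 1;
```

The object survives its lexical scope only because the attribute hash holds a reference, which is why skipping the purge would leave it alive.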
This is called from the group method; call it via SUPER or NEXT if the method is overridden. This is where the ad hoc manager object is dropped, so failing to call it will leave any number of otherwise-short-lived objects floating around in the attributes until the schedule completes.
4.07 Mon Jun 8 19:43:45 EDT 2009
This is for staged operations where a set of modules handle, say, the processing of a group of files in one dataset. They all have the same methods for each stage:
serialize < Nodes = pass1 >
serialize < Names = pass1 >
serialize < Division = pass1 >
serialize < Delnodes = pass1 >
serialize < Merged = pass1 >
serialize < Nodes : >
serialize < Names : >
serialize < Division : >
serialize < Delnodes : >
serialize < Merged : >
becomes just:
serialize < alias % pass1 >
serialize < Nodes : >
serialize < Names : >
serialize < Division : >
serialize < Delnodes : >
serialize < Merged : >
The alias is not inherited to avoid accidentally polluting ad_hoc or nested groups. This defines a default alias, allowing for:
serialize < alias % pass1 >
serialize < Nodes = foo > # explicit alias is foo
serialize < Nodes : > # default alias is pass1
serialize < Names : >
serialize < Division : >
serialize < Delnodes : >
serialize < Merged : >
serialize < alias % pass1 >
serialize < Nodes ~ ad_hoc>
serialize < Names ~ ad_hoc>
serialize < Division ~ ad_hoc>
serialize < Delnodes ~ ad_hoc>
serialize < Merged ~ ad_hoc>
serialize < Nodes : > # default alias is pass1
serialize < Names : >
serialize < Division : >
serialize < Delnodes : >
serialize < Merged : >
to just:
serialize < alias % pass1 >
serialize < pass1 ~ ad_hoc>
serialize < Nodes : >
serialize < Names : >
serialize < Division : >
serialize < Delnodes : >
serialize < Merged : >
Which leaves each of the jobs (Nodes, Names, etc.) handed off as ad_hoc jobs via the manager's 'pass1' method, as $mgr->pass1( $job ).
4.06 Thu Jun 4 16:14:16 EDT 2009
make && perl -Mblib -d:Profile Profile/many-jobs-fork; less prof.out;
Note that these require Devel::Profile, which is not in the dependency tree.
4.05 Fri May 29 12:36:43 EDT 2009
4.04 Thu May 21 15:57:40 EDT 2009
The new syntax also allows setting verbose, etc., for single jobs. The default for most items is true (1); follow the attribute name with whitespace and a value to store something else:
peaches : cream
peaches ~ ad_hoc # peaches is not forked.
peaches ~ verbose # verbose % 1 for peaches.
cream ~ verbose 2 # verbose % 2 for cream.
peaches = frobnicate # ad_hoc doesn't affect alias
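A rough sketch of how "job ~ attribute [value]" lines could be reduced to a per-job attribute hash with a default of 1. parse_attributes is a hypothetical helper written for illustration, not the module's parser.

```perl
use strict;
use warnings;

# Hypothetical helper: collect "job ~ attr [value]" lines into a hash.
sub parse_attributes {
    my %attr;
    for ( @_ ) {
        my $line = $_;
        $line =~ s/\s*#.*$//;    # drop trailing comments

        # Lines without a "~" (dependencies, aliases) are ignored here.
        my ( $job, $name, $value )
            = $line =~ m/^\s*(\S+)\s*~\s*(\S+)(?:\s+(\S+))?\s*$/
            or next;

        # A missing value defaults to true (1).
        $attr{$job}{$name} = defined $value ? $value : 1;
    }
    return \%attr;
}

my $attr = parse_attributes(
    'peaches : cream',
    'peaches ~ ad_hoc    # peaches is not forked.',
    'peaches ~ verbose',
    'cream ~ verbose 2',
    'peaches = frobnicate',
);
```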
4.03 Tue May 19 14:15:18 EDT 2009
4.02 May 11 2009
4.00 Mon Mar 23 11:01:50 EDT 2009
Keeping the group items in one place required adding a namespace for jobs within groups -- the older method hid the groups from one another via forks. This leaves the keys for job, alias, attribute, etc., entries looking like $que->{ $namespace, $job }. The default (global) namespace is an empty string; successive levels of groups are appended to their parent namespace to get their own.
The namespace and name can be taken from:
my $i = rindex $fullname, $Parallel::Depend::job_id_sep;
my $namespace = substr $fullname, 0, $i;
my $name = substr $fullname, $i + length $Parallel::Depend::job_id_sep;
The separator variable is a read-only copy of $; (which gets messy in syntax use).
Namespaces are also used to generate log and runfiles, with the $; replaced with '-'.
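For illustration, here is a self-contained sketch of the key layout. It assumes $SUBSEP (the English name for $;) as a stand-in for $Parallel::Depend::job_id_sep so that it runs without the module; split_fullname and the sample keys are hypothetical, and the module's exact file naming is not shown here.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Assumption: $SUBSEP stands in for $Parallel::Depend::job_id_sep.
my $sep = $SUBSEP;

# Hypothetical helper: split a composite key at the last separator.
sub split_fullname {
    my ( $fullname ) = @_;
    my $i = rindex $fullname, $sep;
    return ( '', $fullname ) if $i < 0;    # no separator at all
    return (
        substr( $fullname, 0, $i ),
        substr( $fullname, $i + length( $sep ) ),
    );
}

my %que;
$que{ 'transform', 'mungethis' } = 'pending';    # key is joined with $;
$que{ '', 'load' }               = 'pending';    # global (empty) namespace

my @pairs;
for my $fullname ( sort keys %que ) {
    my ( $namespace, $name ) = split_fullname( $fullname );
    push @pairs, "[$namespace] $name";
}

# Replace the separator with '-' to get a filesystem-friendly name.
( my $log_base = join $sep, 'transform', 'mungethis' ) =~ s/\Q$sep\E/-/g;
```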
The main external differences are that the run files include a timestamp and the log files use a group-nested namespace for everything. Logging also includes the namespace (i.e., nested group) for all jobs to uniquely identify them (i.e. no more collisions between job names in separate groups). Unneeded subs were also stripped from P::D::Util (see 00* tests for quick list of public methods).
Dependencies between groups are also a bit more reliable with group namespaces for the jobs.
More, smaller tests give a better illustration of how to use the code.
This also leaves the queue structure suitable for persistence management via DBM::Deep, which would allow simpler restarts without depending on the local filesystem for bookkeeping.
3.07 Wed Jan 28 16:15:58 EST 2009
3.06 Sun Jan 25 19:48:58 EST 2009
3.05 Thu Jan 22 18:49:29 EST 2009
3.04 Mon Jan 19 16:44:06 EST 2009
3.03 Sat Dec 27 15:00:13 EST 2008
3.02 Wed Dec 24 13:05:32 EST 2008
3.01 Tue Dec 23 12:59:47 EST 2008
3.0 Mon Dec 22 18:05:42 EST 2008
2.5 Fri Apr 29 19:48:08 EDT 2005
Allows que methods to use:
my ( $que, $config ) = &handle_args;
or
my ( $config ) = &handle_args;
and yanks $que off of the stack. It also sets $DB::single = 1 if $que->debug (i.e., automatic breakpoint in debug mode) and logs the caller and arguments.
This replaces the top few lines of nearly every que method.
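The "&handle_args;" call form (ampersand, no parentheses) passes the caller's own @_ to the callee, which is what lets a shared prologue shift $que off the caller's argument list. The sketch below covers only the two-value form; Toy::Que, its log array, and runjob are hypothetical, and how the module supports the single-value form is not described here.

```perl
use strict;
use warnings;

package Toy::Que;    # hypothetical stand-in for the que class

sub new   { my ( $class, %arg ) = @_; return bless { debug => 0, log => [], %arg }, $class }
sub debug { return $_[0]{debug} }

# Shared prologue: when called as "&handle_args;" it sees the caller's
# @_, so the shift below also removes $que from the caller's arguments.
sub handle_args {
    my $que = shift;
    { no warnings 'once'; $DB::single = 1 if $que->debug; }
    push @{ $que->{log} }, join ' ', ( caller( 1 ) )[3], @_;
    return ( $que, @_ );
}

sub runjob {
    my ( $que, $config ) = &handle_args;
    return ( $config, scalar @_ );    # @_ no longer holds $que
}

package main;

my $que = Toy::Que->new;
my ( $config, $remaining ) = $que->runjob( 'cfg' );
```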
2.4 Tue Apr 26 16:54:56 EDT 2005
$que->subque( %blah )->validate->execute;
when executing via runsched in debug mode.
2.3 Sat Jan 15 10:17:11 EST 2005
2.2 Mon Jan 10 08:20:47 UTC 2005
2.1 Sun Dec 26 06:34:50 EST 2004
2.00 Sun Dec 26 06:34:50 EST 2004
1.00 Fri Apr 9 01:28:58 CDT 2004
0.33 Mon Mar 15 23:07:33 CST 2004
0.31 Sat Jan 4 21:40:03 CST 2003
0.30 Mon Nov 4 17:11:56 CST 2002
0.29 Thu Sep 12 19:31:32 CDT 2002
0.28 Thu Sep 12 19:31:11 CDT 2002
0.27 Tue Sep 10 14:32:58 CDT 2002
0.26 Thu Jun 6 09:06:45 CDT 2002
0.25 Wed May 22 02:17:08 CDT 2002
groupname < job1 job2 : job3 >
groupname < job4 : job5 job6 >
creates a single group with 6 jobs in it. Syntax for
the group is a full schedule. The default handler is
group (i.e., groupname is aliased to "group" if none
already exists when the group is processed).
The group mechanism will look something like:
Create a lookaside list for the group with the group's
schedule in it. The schedule is keyed by the groupname.
When the group becomes runnable then $que->group('groupname')
is called, which prepares the schedule from the lookaside
list and then executes it. If the groupname is aliased to
another method then that one will be used to dispatch the
group's schedule (e.g., $que->mygroup('groupname') via
"groupname = mygroup" in the schedule).
The schedule then looks something like:
transform = group # optional, added automatically
transform < mungethis : mungethat >
transform < mungeother : >
transform : extract
load : transform
This creates a group with munge* jobs in it (they can be spread
out however is most useful). The group handler converts this into
something like:
transform = group
transform : extract
load : transform
When "transform" becomes runnable it is called via
$que->group( 'transform' ) which calls subque with
( sched => 'mungethis mungethat mungeother :' ) then
prepares and executes the sub-que.
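As a rough sketch of the lookaside step, the following separates "group < ... >" lines from the outer schedule and stores each group's schedule under its name. split_groups is a hypothetical helper, and it keeps one entry per source line rather than merging them as the module might.

```perl
use strict;
use warnings;

# Hypothetical helper: pull "group < ... >" lines into a lookaside hash.
sub split_groups {
    my ( @outer, %lookaside );
    for my $line ( @_ ) {
        if ( my ( $group, $body ) = $line =~ m/^\s*(\S+)\s*<\s*(.*?)\s*>\s*$/ ) {
            push @{ $lookaside{$group} }, $body;    # keyed by group name
        }
        else {
            push @outer, $line;                     # stays in the outer schedule
        }
    }
    return ( \@outer, \%lookaside );
}

my ( $outer, $lookaside ) = split_groups(
    'transform = group',
    'transform < mungethis : mungethat >',
    'transform < mungeother : >',
    'transform : extract',
    'load : transform',
);
```

Here $outer holds the three non-group lines and $lookaside->{transform} holds the group's own schedule lines, ready to be handed to a sub-queue when "transform" becomes runnable.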
- More doc updates. They now reflect most of reality.
Included example of using multi-line entries in the
schedule via array ref.
- Fix typos in progress message trim, docs.
0.24
0.23 Wed May 15 11:12:44 CDT 2002
0.22 Tue May 7 22:48:42 CDT 2002
alias()
alias( @items )
returns a copy of the alias hash (not the
original) or a hash slice of the requested
items from the hash.
status()
Returns an anon. hash with the values of
alias(), restart, noabort, abort, verbose,
and debug keyed by their names.
These are mainly useful for creating sub-queues with
largely the same values as the parent.
And single-value items:
restart() true/false if que is restarting
noabort() true/false if que is running noabort.
abort() current abort value (string).
verbose() integer of verbosity.
debug() true/false if in debug mode.
jobz() $jobz{$pid} = $job (i.e., pid => job string ).
pidz() $pidz{$pid} = $pidfilepath
rundir() path of run directory
logdir() path of log directory
- Minor cleanups.
0.21 Sat May 4 13:43:47 CDT 2002
0.19 Sat Apr 20 22:27:38 CDT 2002
0.18 Fri Apr 20 2002
0.17 Pretend this never happened...
0.16 Wed Apr 17 20:54:30 CDT 2002
0.15 Tue Apr 16 12:13:59 CDT 2002
For example:
foo = { do_this($dir) || do_that($dir) or croak "$$: Neither that nor this!" }
...
foo : bar
Will call the do_this or do_that sub's with the
value of $dir expanded in the parent process at
runtime.
Note that the code must be on a single line.
Since unalias() is called in the parent this can
have side effects on subsequent children (e.g.,
foo = {do_this(pop @dirlist)}).
0.12 Fri Apr 12 14:05:41 CDT 2002
At this point unalias can automatically detect methods, local subs and subs w/ package names before dispatching to the shell.
Order of lookups is also changed, from Package::Sub through method to sub.
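One reading of that lookup order, sketched with plain Perl: a package-qualified sub first, then a method on the queue object, then a local sub, and finally the shell. Toy::Que, Other::frob, local_job, and this unalias are all hypothetical and are not the module's code.

```perl
use strict;
use warnings;

package Toy::Que;    # hypothetical queue class
sub new     { return bless {}, shift }
sub cleanup { return "method:$_[1]" }

package Other;       # hypothetical package with a plain sub
sub frob { return "package sub:$_[0]" }

package main;

sub local_job { return "local sub:$_[0]" }

# Returns ( kind, code ref ) for a job string; falls back to the shell.
sub unalias {
    my ( $que, $run ) = @_;
    no strict 'refs';
    return ( 'Package::Sub', \&{$run} ) if $run =~ m/::/ && defined &{$run};
    return ( 'method', $que->can( $run ) ) if $que->can( $run );
    return ( 'sub', \&{"main::$run"} ) if defined &{"main::$run"};
    return ( 'shell', sub { system $run } );
}

my $que = Toy::Que->new;
my @kinds;
for my $job ( 'Other::frob', 'cleanup', 'local_job', 'echo hi' ) {
    my ( $kind ) = unalias( $que, $job );
    push @kinds, $kind;
}
```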
0.11 Fri Apr 12 03:46:48 CDT 2002
0.10 Thu Apr 11 14:16:40 CDT 2002
The mechanism for finding sub's is to unalias the jobname first, look for the ->can or CODE ref, and pass it the original jobname if it's found. This allows:
abc = foo
xyz = foo
bar = foo
bar : abc xyz
to call foo('abc'), foo('xyz') and foo('bar').
This might be useful for, say, cleaning up multiple
directories. To pass more information just use
the arguments as hash keys into a package/global
value with the extra info.
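A small sketch of that pattern, with abc, xyz and bar all dispatched to foo and the job name used as a key into a package-level table. %dir_for, %alias, and the paths are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical package-level table holding the extra information.
our %dir_for = (
    abc => '/tmp/abc',
    xyz => '/tmp/xyz',
    bar => '/tmp/bar',
);

my @log;

# The handler receives the original job name, not the alias.
sub foo {
    my ( $job ) = @_;
    push @log, "cleaning $dir_for{$job} for $job";
    return 0;
}

# Hypothetical alias table: three job names share one handler.
my %alias = ( abc => 'foo', xyz => 'foo', bar => 'foo' );

for my $job ( qw( abc xyz bar ) ) {
    my $handler = __PACKAGE__->can( $alias{$job} )
        or die "No handler for $job";
    $handler->( $job );
}
```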
- Assigning an empty or 'PHONY' alias is a noop.
Thus:
foo =
or
foo = PHONY
Will insert an immediate return of zero job into
the schedule. This is mainly to neaten things up:
foo =
foo : long_named_job_one
foo : even_longer_named_job_two
foo : something_you_surely_do_not_want_to_type
bar : foo
now bar depends on all of the other three jobs without
the line-from-hell in the middle of a schedule. This can
also be handy for generating schedules on the fly, where
bar is a placeholder and the others are pushed onto a
stack if they are needed.
- Doc updates.
0.09
Put the distribution into a tarball rather than
cpio archive, no change to the code.
0.08 Fri Apr 5 10:10:08 CST 2002
Using an unalias with:
no strict 'refs';
\&$job
to expand the tag from the schedule allows
queueing subroutines -- or a mixture of
subroutines and shell execs. This could also
return a closure to push evaluation parameters
even later into the cycle (or for testing).
See notes for unalias & runjob for examples.
- Added switch for handling failed jobs without
aborting the queue. This offers the same effect
as "make -k". If "noabort => true" is passed
into prepare then jobs that fail will have their
dependencies marked for skipping and the pidfiles
will get a nastygram + exit status of -1 (i.e.,
they will be re-executed on a restart).
The noabort code doesn't seem to break anything,
but has not been fully tested (yet).
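The skipping behaviour can be sketched as a walk over the dependency graph: once a job fails, everything that depends on it, directly or indirectly, is marked. %depends_on and mark_skipped are hypothetical, and pidfile handling is left out.

```perl
use strict;
use warnings;

# Hypothetical schedule: job => [ jobs it depends on ].
my %depends_on = (
    extract   => [],
    transform => [ 'extract' ],
    load      => [ 'transform' ],
    report    => [],
);

# Mark every job that depends, directly or indirectly, on a failed job.
sub mark_skipped {
    my ( $failed, $depends_on ) = @_;
    my %skip;
    my @todo = ( $failed );
    while ( defined( my $bad = shift @todo ) ) {
        for my $job ( sort keys %$depends_on ) {
            next if $skip{$job};
            next unless grep { $_ eq $bad } @{ $depends_on->{$job} };
            $skip{$job} = 1;
            push @todo, $job;    # its own dependents get skipped too
        }
    }
    return \%skip;
}

my $skip = mark_skipped( 'extract', \%depends_on );
```

Unrelated jobs such as "report" still run, which gives the "make -k" effect.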
- pidfile and output directories can be passed in with the arguments,
picked up from the environment or default to the current
executable's directory name. The environment method can be
handy for the single-argument version.
- Serious updates to pod and comments.
0.07
Default for $que->{alias}{logdir} and {rundir}
are "dirname $0". Simplifies running multiple
copies of the same file and schedule from
different directories via soft link.
Revised exit status writing. The child uses
system instead of exec and writes $? to its
own pidfile; the parent writes $? to the pidfile
if it is non-zero. This allows either the parent
or child to get zapped by a signal, leave the
other running and correctly record the status.
It also means that the pidfiles may be 4 lines
long on failure. Fix there is to read [0..2]
to check the status on the way in for restarts.
Updated comments to reflect reality a bit better.
0.06
Remove some of the extra newlines -- they aren't
required since individual job output goes to
stdout/stderr files.
Shuffled verbose prints a bit to give saner
combinations of output. At this point test.log
should give a reasonable idea of what silent,
progress and detailed output look like.
Process verbose as an alias for setting verbosity.
Add sanity check for odd number of arg's > 1
in prepare. Makes it harder to zap thyself by
adding "verbose => X" after a single-value
schedule entry.
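A minimal sketch of such a check, assuming the arguments are either a single schedule or a list of key => value pairs. check_args is a hypothetical name; the sched key is the one used with subque elsewhere in these notes.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical check: accept one schedule or an even key => value list.
sub check_args {
    croak 'Odd number of arguments (' . @_ . '): expected one schedule or key => value pairs'
        if @_ > 1 && @_ % 2;

    # A lone argument is treated as the schedule itself.
    return @_ == 1 ? ( sched => $_[0] ) : @_;
}

my %single = check_args( 'foo : bar' );
my %pairs  = check_args( sched => 'foo : bar', verbose => 1 );

# Adding "verbose => 2" after a single-value schedule is caught.
my $error = eval { check_args( 'foo : bar', verbose => 2 ); 1 } ? '' : $@;
```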
test.pl checks for forkatotis in the module by
comparing the initial pid running test.pl with
what's running after the test_blah call; croaks
if the pid has changed since test.pl startup.
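The pid comparison amounts to something like the sketch below. test_blah here is a trivial stand-in and run_checked is a hypothetical wrapper, not the code in test.pl.

```perl
use strict;
use warnings;
use Carp;

my $initial_pid = $$;          # pid at startup

sub test_blah { return 1 }     # hypothetical stand-in for a real test

# Run a test and croak if we come back in a different process.
sub run_checked {
    my ( $test ) = @_;
    my $result = $test->();

    # A child that escaped the module would reach this point with a new pid.
    croak "Fork leak: started as $initial_pid, now running as $$"
        if $$ != $initial_pid;

    return $result;
}

my $ok = run_checked( \&test_blah );
```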
Updated comments, pod to reflect the changes
since 0.03.
Note: Still need to come up with a reasonable
definition for processing the debug alias/arg
during prepare and execution. It may require
debug levels like: 1 => don't fork, 2 => also
don't check or write pidfiles. Problem there
is making sure that mixing $que->debug with
$que->prepare( ... debug => X ) doesn't cause
unnecessary errors.
0.05:
Output of individual jobs goes to $logdir/$job.out
and $logdir/$job.err. Main purpose is to keep the
top-level schedule logs clean.
test.pl puts stdout to test.log -- saves a lotta
stuff flying by on make test.
verbose and debug arg's to prepare and execute
are independent (i.e., you can now debug in silent
mode and get minimal output).
0.04 Fri Mar 1 13:52:34 CST 2002
debug uses copy of queue, doesn't consume original
object during debug, returns original object if
debug is successful. see comments for use.
updated verbose handling, now has three levels: 0, 1, 2.
0 == stop/start messages and nastygrams, 1 == progress
messages from fork/reap; 2 == fairly detailed. $q->{verbose}
overrides the debug switch; no verbose setting w/ debug
gives verbose == 2. added description of changes to POD.
all verbose-controlled output goes to STDOUT, nastygrams
and que start/complete messages to STDERR.
doc updates to reflect changes in verbosity.
0.03 Wed Feb 27 12:20:18 CST 2002
Doc updates.
test.pl updated.
0.02 Wed Feb 6 17:25:02 CST 2002
Release-able version.
0.01 Wed Feb 6 10:20:32 2002
Beta