TM::Graph - Topic Maps, trait for graph-like operations


TM documentation Contained in the TM distribution.

NAME

TM::Graph - Topic Maps, trait for graph-like operations

SYNOPSIS

  use Data::Dumper;
  use TM::Materialized::AsTMa;
  my $tm = new TM::Materialized::AsTMa (file => 'old_testament.atm');
  $tm->sync_in;
  Class::Trait->apply ( $tm => 'TM::Graph' );

  # find groups of topics connected
  print Dumper $tm->clusters;




  # use association types to compute a hull
  print "friends of Mr. Cairo: ".
   Dumper [
       $tm->frontier ([ $tm->tids ('mr-cairo') ], [ [ $tm->tids ('foaf') ] ])
   ];

  # see whether there is a link (direct or indirect)
  print "I always knew it" 
     if $tm->is_path ( [ 'gw-bush' ],              # there could be more
                      (bless [ [ 'foaf' ] ], '*'),
                      'osama-bin-laden');




DESCRIPTION

A topic map can also be viewed as a graph: the topics are the nodes and the associations form the edges. Unlike ordinary graph edges, an association may connect more than two nodes at once.

This package provides some functions which focus more on the graph-like nature of Topic Maps.

INTERFACE

Methods

This trait provides the following methods:

clusters

$clusters = $tm->clusters (%options)

Computes the islands of topics. It figures out which topics are connected via associations and collates connected topics into clusters. The result is a reference to a list of list references, each holding the topic ids of one cluster.

In default mode, this function regards topics as belonging to the same cluster only if they play roles in one and the same maplet. The role topics themselves, the type, and the scope are ignored.

You can change this behaviour by passing in options like

  use_scope => 1

  use_roles => 1

  use_type  => 1

Obviously, with use_scope => 1 you will let a lot of topics collapse into one cluster as most maplets usually are in the unconstrained scope.

NOTE: This is still a somewhat expensive operation.
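
The merging idea behind clusters can be tried out without a map at all. The following is a minimal, self-contained sketch and not the module's code: the function name cluster_players and the plain lists of player ids standing in for maplets are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for clusters(): each "association" is simply the
# list of topic ids playing a role in it. All players are assumed to be
# listed in @$topics.
sub cluster_players {
    my ($topics, $assocs) = @_;

    my $n = 0;
    my %cluster_of = map { $_ => $n++ } @$topics;     # every topic starts alone

    foreach my $players (@$assocs) {
        my ($first, @rest) = @$players;
        next unless defined $first;
        my $i = $cluster_of{$first};
        foreach my $p (@rest) {
            my $j = $cluster_of{$p};
            next if $i == $j;
            foreach my $t (keys %cluster_of) {        # relabel cluster $j as $i
                $cluster_of{$t} = $i if $cluster_of{$t} == $j;
            }
        }
    }

    my %members;                                      # cluster number => topic ids
    push @{ $members{ $cluster_of{$_} } }, $_ for sort keys %cluster_of;
    return [ map { $members{$_} } sort { $a <=> $b } keys %members ];
}

my $clusters = cluster_players(
    [qw(adam eve cain abel noah)],
    [ [qw(adam eve)], [qw(eve cain abel)] ],
);
print join(' ', @$_), "\n" for @$clusters;
```

The relabelling loop visits every topic for every merge, which is one way to see why the operation gets expensive on larger maps.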

frontier

@hull = $tm->frontier (\@start_lids, $path_spec)

This method computes a qualified hull, i.e. a list of all topics which are reachable from @start_lids via a path specified by $path_spec. The path specification is a (recursive) data structure, describing sequences, alternatives and repetition (the * operator), all encoded as lists of lists. The topics in that path specification are interpreted as assertion types.

Examples (laid out over several lines for readability):

   # a single step: start knows ...
   [             ]            # outer level: sequence (there is only one)
     [ 'knows' ]              # inner level: alternatives (there is only one)

   # two subsequent steps: start knows ... isa ...
   [                        ] # outer level: two entries
     [ 'knows' ], [ 'isa' ]   # inner level, one entry each

   # repetition: start knows ... knows ... knows ... ad infinitum
   bless [             ], '*' # outer level: one entry, but blessed
           [ 'knows' ]        # inner level

   # alternatives: start knows | hates ...
   [                      ]   # outer level: one entry
     [ 'knows', 'hates' ]     # inner level: alternatives

   # nesting: first follow an 'eats', then any number of 'begets'
   [                                              ]
     [ 'eats' ], [                              ]
                    bless [              ], '*'
                            [ 'begets' ]

NOTE: All tids have to be made map-absolute with tids.

NOTE: Cycles are detected.

NOTE: I am not sure how this performs on rather large graphs, uhm, maps.
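
The layouts in the examples above are spread out for reading and are not valid Perl as written. A sketch of how the same specifications might be spelled as ordinary Perl values follows. The variable names and the helper describe_spec are hypothetical, and no map is needed to run it.

```perl
use strict;
use warnings;

# The example specifications, written as plain Perl values.
my $single = [ [ 'knows' ] ];
my $two    = [ [ 'knows' ], [ 'isa' ] ];
my $repeat = bless [ [ 'knows' ] ], '*';
my $alt    = [ [ 'knows', 'hates' ] ];
my $star   = bless [ [ 'begets' ] ], '*';
my $nested = [ [ 'eats' ], [ $star ] ];

# Hypothetical helper: render a specification as a regex-like string.
# A specification is a sequence of steps, each step a list of alternatives,
# each alternative either a plain id or a nested specification.
sub describe_spec {
    my ($spec) = @_;
    return $spec unless ref $spec;                    # atomic step
    my @steps;
    foreach my $alts (@$spec) {
        my @shown = map { describe_spec($_) } @$alts;
        push @steps, @shown > 1 ? '(' . join('|', @shown) . ')' : $shown[0];
    }
    my $seq = join ' ', @steps;
    return ref($spec) eq '*' ? "($seq)*" : $seq;      # blessed into '*': repetition
}

print describe_spec($_), "\n" for $single, $two, $repeat, $alt, $nested;
```

Blessing into the package name '*' is only a marker: ref returns '*' for such a list, which is how a repetition can be told apart from a plain sequence.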

is_path

$bool = $tm->is_path (\@start_lids, $path_spec, $end_lid)

This method returns a true value if there is a path from any of the start LIDs to the end LID which matches the path specification. See frontier for the format of the path specification.
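
To make the semantics concrete, here is a minimal sketch of a frontier computation and the path test derived from it, working on a plain hash of edges. It is an illustration under assumptions, not the module's implementation: the %edges layout and the functions reach, _seq and has_path are hypothetical, and unlike the real code the sketch does not consult a topic map or treat isa and iko specially.

```perl
use strict;
use warnings;

# Hypothetical graph: $edges{$type}{$from} = [ @to ]
my %edges = (
    knows => { alice => ['bob'], bob => ['carol'], carol => ['alice'] },
    eats  => { carol => ['apple'] },
);

# Everything reachable from @$from via $spec.
sub reach {
    my ($from, $spec) = @_;
    if (ref($spec) eq '*') {                          # repetition: run to a fixpoint
        my %seen = map { $_ => 1 } @$from;            # zero repetitions count too
        my @todo = @$from;
        while (@todo) {
            @todo = grep { !$seen{$_}++ } _seq(\@todo, $spec);
        }
        return sort keys %seen;
    }
    return _seq($from, $spec);
}

# Follow a sequence of steps, each step being a list of alternatives.
sub _seq {
    my ($from, $spec) = @_;
    my @front = @$from;
    foreach my $alts (@$spec) {                       # one step after the other
        my %next;
        foreach my $alt (@$alts) {
            my @got = ref($alt)
                    ? reach(\@front, $alt)            # nested sub-path
                    : map { @{ $edges{$alt}{$_} || [] } } @front;
            $next{$_} = 1 for @got;
        }
        @front = sort keys %next;
    }
    return @front;
}

sub has_path {
    my ($from, $spec, $to) = @_;
    return scalar grep { $_ eq $to } reach($from, $spec);
}

my $knows_star = bless [ [ 'knows' ] ], '*';
print "alice reaches an apple\n"
    if has_path(['alice'], [ [ $knows_star ], [ 'eats' ] ], 'apple');
```

The %seen hash is what keeps the repetition from looping forever on the alice, bob, carol cycle.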

neighborhood

@neighbors = $tm->neighborhood ($MAXDEPTH, \@start_lids)

This method returns a list of neighbors for the given start LIDs. It follows paths up to the maximal length given as the first parameter. The paths of length 0, which end at the starting nodes themselves, are always included.

Each neighbor is represented by a hash (reference) with the path and the end LID. The path is a list (reference) holding the LIDs of the association types visited along the path.
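
The shape of the result can be illustrated with a small stand-alone sketch. The function neighbors_of, the helper _plays and the @assocs records below are hypothetical stand-ins for the map and its associations. Only the path and end keys mirror what the text above describes.

```perl
use strict;
use warnings;

# Hypothetical associations: a type plus the topics playing in it.
my @assocs = (
    { type => 'begets', players => [qw(adam cain)]  },
    { type => 'begets', players => [qw(cain enoch)] },
    { type => 'kills',  players => [qw(cain abel)]  },
);

sub _plays {
    my ($assoc, $topic) = @_;
    return scalar grep { $_ eq $topic } @{ $assoc->{players} };
}

sub neighbors_of {
    my ($max_depth, $starts) = @_;
    my @found = map { +{ path => [], end => $_ } } @$starts;   # paths of length 0
    my %seen  = map { $_ => 1 } @$starts;
    my @layer = @found;
    for my $depth (1 .. $max_depth) {
        my @next;
        foreach my $n (@layer) {
            foreach my $assoc (grep { _plays($_, $n->{end}) } @assocs) {
                push @next,
                    map  { +{ path => [ @{ $n->{path} }, $assoc->{type} ], end => $_ } }
                    grep { !$seen{$_}++ }                      # skip topics already reached
                    @{ $assoc->{players} };
            }
        }
        push @found, @next;
        @layer = @next;
    }
    return @found;
}

printf "%-6s via [%s]\n", $_->{end}, join(', ', @{ $_->{path} })
    for neighbors_of(2, ['adam']);
```

Each entry can then be inspected through $_->{end} and $_->{path}, as the printf line does.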

SEE ALSO

TM

COPYRIGHT AND LICENSE

package TM::Graph;

use strict;
use Data::Dumper;

use TM;

use Class::Trait 'base';

sub clusters {
    my $tm    = shift;

    my %opts = @_;
# by default
#   not using scope
#   not using type
#   not using roles
    $opts{use_lid} = 1 unless defined $opts{use_lid}; #   always use maplet ID

    my $i = 0;
    my $clusters = { map { $_ => $i++ } map { $_->[TM->LID] } $tm->toplets };   # we store every toplet into its own cluster

    foreach my $m ($tm->match (TM->FORALL, nochar => 1)) {

	my   @candidates;
#	push @candidates, $m->[TM->LID]         if $opts{use_lid};
	push @candidates, $m->[TM->TYPE]        if $opts{use_type};
	push @candidates, $m->[TM->SCOPE]       if $opts{use_scope};
	push @candidates, @ { $m->[TM->ROLES] } if $opts{use_roles};
	push @candidates, @ { $m->[TM->PLAYERS] };

	my $i = $clusters->{shift @candidates};
	foreach (@candidates) {
	    my $j = $clusters->{$_};
                                                                   # now all entries which have currently $j must be turned into $i
	    unless ($i == $j) {
		foreach my $t (keys %{$clusters}) { $clusters->{$t} = $i if $clusters->{$t} == $j; }
	    }
	}
    }
    my @clusters = map { [] } values %$clusters;
    push @{ $clusters[ $clusters->{$_} ] }, $_ foreach keys %$clusters;        # element access, not a one-element slice
    return [ grep (@$_, @clusters)];  # get rid of empty clusters
}

sub frontier {
    my $tm = shift;
    my $as = shift;
    my $ps = shift;

    my @bs = _frontier_star ($tm, $as, {}, $ps);                                     # {} are the axes followed so far
    return @bs;                                                                      # explicit, rather than relying on the last evaluated statement

sub _frontier_star {
    my $tm = shift;
    my $as = shift;                                                                 # the list (ref) of things where we are now
    my $vs = shift;                                                                 # what have we visited so far, hash ref
    my $ps = shift;                                                                 # a list ref for the sequence of path items left to be done

    if (ref ($ps) eq '*') {                                                         # if *'ed then add the starting points
	my @front = @$as;                                                           # start off with the starting points, they belong to it
	while (1) {                                                                 # repeat ad-infinitum
	    my @bs = _frontier_seq ($tm, \@front, $vs, $ps) ;                       # compute from the current front
	    last unless @bs;                                                        # there might not be any new ($vs has side effects!!!!!!!!)
	    push @front, @bs;                                                       # what we got we collect
	    { my %X; map { $X{$_}++ } @front; @front = keys %X; }                  # make that unique (otherwise too many identical entries)
	}
	return @front;  # and finally return it

    } else {
	return _frontier_seq ($tm, $as, $vs, $ps)
    }
}

sub _frontier_seq {
    my $tm = shift;
    my $as = shift;                                                                 # the list (ref) of things where we are now
    my $vs = shift;                                                                 # what have we visited so far, hash ref
    my $ps = shift;                                                                 # a list ref for the sequence of path items left to be done

    my $front = $as;
    foreach my $p (@$ps) {                                                          # one step after the other
	$front = [ _frontier_alt ($tm, $front, $vs, $p) ];                          # compute what we can reach from there
	{ my %X; map { $X{$_}++ } @$front; $front = [ keys %X ]; }                  # make that unique (otherwise too many identical entries)
    }
    return @$front;
}

sub _frontier_alt {
    my $tm = shift;
    my $as = shift;
    my $vs = shift;                                                                 # what have we visited so far, hash ref
    my $os = shift;                                                                 # a list of alternative paths

    my @front;
    foreach my $o (@$os) {
	push @front, _frontier_step ($tm, $as, $vs, $o);
    }
    { my %X; map { $X{$_}++ } @front; @front = keys %X; };                          # make that unique (otherwise too many identical entries)
    return @front;
}

sub _frontier_step {
    my $tm = shift;
    my $as = shift;
    my $vs = shift;                                                                 # what have we visited so far, hash ref
    my $s  = shift;                                                                 # the step

    if (ref ($s) eq '*') {                                                          # a starred sub-path, to be repeated
	return _frontier_star ($tm, $as, $vs, $s);
    } elsif (ref ($s)) {                                                            # something more complex, can only be sequence
	return _frontier_seq  ($tm, $as, $vs, $s);
    } else {                                                                        # atomic step
	my @as = grep { $vs->{$_}->{$s}++ == 0 } @$as;                              # get rid of those where we already followed that axis
	if ($s eq 'isa') {                                                          # instance of, easy
	    return map { $tm->typesT ($_) } @as;                                    # compute their types

	} elsif ($s eq 'iko') {                                                     # subclasses, easy
	    return map { $tm->superclassesT ($_) } @as;                             # compute their superclasses

	} else {
	    my $tt = $tm->mids ($s);
	    return 
		map { $tm->get_players ($_) }
	        map { $tm->match_forall (type => $tt, iplayer => $_) }
	        @as;
	}
    }
}

}

sub is_path {
    my $tm = shift;
    my $as = shift;
    my $ps = shift;
    my $b  = shift;

    my $bt = $tm->mids ($b) or $TM::log->logdie ("end topic not in map");
    return grep { $_ eq $bt } $tm->frontier ( [ $tm->mids (@$as) ], $ps);
}

sub neighborhood {
    my $self   = shift;
    my $DEPTH  = shift;
    my $starts = shift;
    my @starts = grep { $_ } $self->mids (@$starts);                  # make sure we only have defined ones

    my @ns = map { { path => [], end => $_ } } @starts;               # bootstrap result, will be accumulated below
    return _neighborhood ($self,
			  \@ns,                                       # current paths and frontiers
			  $DEPTH, 1,                                  # max depth and current depth
	                  );

sub _neighborhood {
    my $self = shift;
    my $ns   = shift;
    my $DEPTH = shift;
    my $depth = shift;

    if ($depth > $DEPTH) {                                            # if we went too far
	return @$ns;                                                  # this is the result
    } else {                                                          # still not at DEPTH
	my %seen = map { $_ => 1 }                                    # build already seen hash
	           map { $_->{end} }                                  # find end points
                   @$ns;                                              # walk through all paths we have
	my @ns;
	foreach my $n (@$ns) {
	    my @as = $self->match_forall (iplayer => $n->{end});
	    foreach my $a (@as) {
		push @ns, 
		           map  {                                     # construct path/end combo
		                  { path => [ @{$n->{path}}, $a->[TM->TYPE] ],
				    end  => $_
				  }
	                         }         
		           grep { !$seen{ $_ }++ }                    # filter out those which we have already (and mark them seen)
	                   $self->get_players ($a);                   # get all players of this assoc
	    }
	}
	return _neighborhood ($self,                                  # the map
			      [ @ns, @$ns ],
			      $DEPTH, $depth+1,                       # max depth and current depth
	                      );
    }
}
}

our $VERSION  = 0.3;
our $REVISION = '$Id: Graph.pm,v 1.1 2007/07/28 16:40:31 rho Exp $';


1;

__END__

sub _is_path {
    my $tm = shift;
    my $a  = shift;
    my $b  = shift;
    my $vs = shift;                                                                 # what have we visited so far
                                                                                    # @_ contains a sequence of steps
    return $a eq $b if scalar @_ == 0;                                              # empty path? then a == b
                                                                                    # ok, there is more of a path
    $vs->{$a} = 1;                                                                  # make an entry in the visitor's guestbook
    my $r = shift;                                                                  # take the first step
# or list
    foreach my $s (@$r) { # this is a list of or'ed steps, s is an atom (= list reference, possibly blessed, contains only ONE element)
	my $t = $s->[0];
	if (ref ($s) eq '*') {
	    die;
	} else {
	    my @bs = _make_step ($tm, $vs, $tm->mids ($t), $a);


	    return grep { $b eq $_ } @bs unless @_;                                 # if there is more, we have to continue, otherwise, all if a == b
	    foreach my $b2 (@bs) {
		return 1 if _is_path ($tm, $b2, $b, $vs, @_);
	    } 
	    return 0;
	}



    if (! ref ($r)) {                                                               # something simple, one id
	if ($r =~ /(\w+)\*$/) {                                                     # this is a repetition*
	    my $t2 = $tm->mids ($1);
	    return 1 if _is_path ($tm, $a, $b, $vs, @_);                            #     empty step is also ok
	    my @bs = _make_step ($tm, $vs, $t2, $a);                                 #     get the next in the front
	    foreach my $b2 (@bs) {
		return 1 if _is_path ($tm, $b2, $b, $vs, ($r, @_));                 # use the original expression
	    }
	    return 0;

	} else {                                                                    # one single step from $a via $r
	}
    } else { # an OR
	die;
    }
}

xxx=cut


computes a tree of topics based on a starting topic, an association type
and two roles. Whenever an association of the given type is found and the given topic appears in the
role given in this very association, then all topics appearing in the other given role are regarded to be
children in the result tree. There is also an optional C<depth> parameter. If it is not defined, no limit
applies. Starting from XTM::base version 0.34 loops are detected and are handled gracefully. The returned
tree might contain loops then.

Examples:


  $hierarchy = $tm->induced_assoc_tree (topic      => $start_node,
					assoc_type => 'at-relation',
					a_role     => 'tt-parent',
					b_role     => 'tt-child' );
  $yhcrareih = $tm->induced_assoc_tree (topic      => $start_node,
					assoc_type => 'at-relation',
					b_role     => 'tt-parent',
					a_role     => 'tt-child',
					depth      => 42 );

B<Note>


x=cut


x=pod