| Astro-Catalog documentation | Contained in the Astro-Catalog distribution. |
Astro::Catalog::Query::SkyCat - Generate SkyCat catalogue query clients
On load, automatically parse the SkyCat server specification file
from ~/.skycat/skycat.cfg, if available, and dynamically
generate query classes that can send queries to each catalog
server and parse the results.
Simple constructor. Forces read of config file if one can be found and the config has not been read previously. If no config file can be located the query object can not be instantiated since it will not know the location of any servers.
$q = new Astro::Catalog::Query::SkyCat( catalog => 'gsc', %options ); $q = new Astro::Catalog::Query::SkyCat( catalog => 'gsc@eso', %options );
The catalog field must be present, otherwise the new object will
not know which remote server to use and which options are mandatory
in the query. Note that the remote catalog can not be changed after
the object is instantiated. In general it is probably not wise to
try to change the remote host via either the query_url or
url methods unless you know what you are doing. Modifying your
skycat.cfg file is safer.
Currently only one config file is supported at any given time.
If a config file is changed (see the cfg_file class method)
the current config is overwritten automatically.
It is not possible to override the catalog file in the
constructor. Use the cfg_file class method instead.
Obviously a config per object can be supported but this is probably not that helpful. This will be reconsidered if demand is high.
Catalog name selected by the user and currently configured for this object. Not to be used outside this class..
configureConfigure the object. This calls the base class configure , after it has made sure that a sky cat config file has been read (otherwise we will not be able to vet the incoming arguments.
Construct a query URL based on the options.
$url = $q->_build_query();
All the SkyCat servers return data in TST format. Need to make sure that column information is passed into the TST parser.
This method declares which options can be configured by the user of this service. Generated automatically by the skycat config file and keyed to the requested catalog.
Get the default options that are relevant for the selected catalog.
%defaults = $q->_get_default_options();
These methods are not associated with any particular object.
Location of the skycat config file. Default location is
$SKYCAT_CFG, if defined, else $HOME/.skycat/skycat.cfg,
or $PERLPREFIX/etc/skycat.cfg if there isn't a version
in the users home directory
This could be made per-class if there is a demand for running queries with different catalogs. This would also move the config contents into the query object itself.
'directory' entries are not followed by default although the class can be configured to do so by setting
$Astro::Catalog::Query::SkyCat::FOLLOW_DIRS = 1;
to true.
This class could simply read the catalog config file and allow queries on explicit servers directly rather than going to the trouble of auto-generating a class per server. This has the advantage of allowing a user to request USNO data from different servers rather than generating a single USNO class. ie
my $q = new Astro::Catalog::Query::SkyCat( catalog => 'usnoa@eso',
target => 'HL Tau',
radius => 5 );
as opposed to
my $q = new Astro::Catalog::Query::USNOA( target => 'HL Tau',
radius => 5 );
What to do with catalogue mirrors is an open question. Of course, convenience wrapper classes could be made available that simply delegate the calls to the SkyCat class.
SkyCat FTP server. [URL goes here]
SSN75 [http://www.starlink.rl.ac.uk/star/docs/ssn75.htx//ssn75.html] by Clive Davenhall.
At the very least for testing, an up-to-date skycat.cfg file should be distributed with this module. Whether it should be used by this module once installed is an open question (since many people will not have a version in the standard location).
Copyright (C) 2001-2003 University of Exeter and Particle Physics and Astronomy Research Council. All Rights Reserved.
This program was written as part of the eSTAR project and is free software; you can redistribute it and/or modify it under the terms of the GNU Public License.
Tim Jenness <tjenness@cpan.org>, Alasdair Allan <aa@astro.ex.ac.uk>
| Astro-Catalog documentation | Contained in the Astro-Catalog distribution. |
package Astro::Catalog::Query::SkyCat;
use 5.006; use strict; use warnings; use warnings::register; use vars qw/ $VERSION $FOLLOW_DIRS $DEBUG /; use Data::Dumper; use Carp; use File::Spec; use base qw/ Astro::Catalog::Transport::REST /; $VERSION = '0.02'; $DEBUG = 0; # Controls whether we follow 'directory' config entries and recursively # expand those. Default to false at the moment. $FOLLOW_DIRS = 0; # This is the name of the config file that was used to generate # the content in %CONFIG. Can be different to the contents ofg_file # if that my $CFG_FILE; # This is the content of the config file # organized as a hash indexed by remote server shortname # this has the advantage of removing duplicates my %CONFIG;
sub new { my $proto = shift; my $class = ref($proto) || $proto; # Instantiate via base class my $block = $class->SUPER::new( @_ ); return $block; }
sub _selected_catalog { my $self = shift; if (@_) { # The class has to be configured as a hash!!! $self->{SKYCAT_CATALOG} = shift; } return $self->{SKYCAT_CATALOG}; }
sub configure { my $self = shift; # load a config if we do not have one read yet # Note that this may force a remote URL read via directory # directives even though we do not have a user agent configured... $self->_load_config() unless %CONFIG; # Error if we have no config yet croak "Error instantiating SkyCat object since no config was located" unless %CONFIG; # Now we need to configure this object based on the # supplied catalog name. This is not really a public interface # let's call it a protected interface available to subclases # even though we are not technically a subclass... my %args = Astro::Catalog::_normalize_hash(@_); croak "A remote service catalog name must be provided using a 'catalog' key" unless exists $args{catalog}; # case-insensitive my $cat = lc($args{catalog}); # if we have an entry in %CONFIG then we can use it directly # else we may have a root name without a remote server if (! exists $CONFIG{$cat}) { my $name = $cat; # clear it and look for another $cat = undef; # if name does not include an @ we probably have a generic catalog # and just need to choose a random specific version if ($name !~ /\@/) { # look through the catalog for my $rmt (keys %CONFIG) { if ($rmt =~ /^$name\@/) { # a match $cat = $rmt; } } } # No luck finding catalog name croak "unable to find a remote service named $name" unless defined $cat; } # Now we know the details we need to store this somewhere in # the object so that it won't get clobbered. Otherwise the # super class configure will not be able to get the information # it needs. We can not simply store this in options since configure # does not know it is an allowed option... $self->_selected_catalog( $cat ); # delete catalog from list delete $args{catalog}; # Configure $self->SUPER::configure( %args ); }
sub _build_query { my $self = shift; my $cat = $self->_selected_catalog(); # Get the URL my $url = $CONFIG{$cat}->{url}; # Translate all the options to the internal skycat format my %translated = $self->_translate_options(); print "Translated query: ".Dumper(\%translated,$url) if $DEBUG; # Now for each token replace it in the URL for my $key (keys %translated) { my $tok = "%". $key; croak "Token $tok is mandatory but was not specified" unless defined $translated{$key}; $url =~ s/$tok/$translated{$key}/; } print "Final URL: $url\n" if $DEBUG; return $url; }
sub _parse_query { my $self = shift; # Get the catalog info my $cat = $self->_selected_catalog(); # and extract formatting information needed by the TST parser my %params; for my $key (keys %{ $CONFIG{$cat} }) { if ($key =~ /_col$/) { print "FOUND column specified $key\n" if $DEBUG; $params{$key} = $CONFIG{$cat}->{$key}; } } # If this catalogue is a GSC, pass in a GSC parameter $params{gsc} = 1 if $cat =~ /^gsc/i; print $self->{BUFFER} ."\n" if $DEBUG; # Make sure we set origin and field centre if we know it my $newcat = new Astro::Catalog( Format => 'TST', Data => $self->{BUFFER}, ReadOpt => \%params, Origin => $CONFIG{$cat}->{long_name}, ); # set the field centre my %allow = $self->_get_allowed_options(); my %field; for my $key ("ra","dec","radius") { if (exists $allow{$key}) { $field{$key} = $self->query_options($key); } } $newcat->fieldcentre( %field ); return $newcat; }
sub _get_allowed_options { my $self = shift; my $cat = $self->_selected_catalog(); return %{ $CONFIG{$cat}->{allow} }; }
sub _get_default_options { my $self = shift; # Global skycat defaults my %defaults = ( # Target information ra => undef, dec => undef, id => undef, # Limits radmin => 0, radmax => 5, width => 10, height => 10, magfaint => 100, magbright => 0, nout => 20000, cond => '', ); # Get allowed options my %allow = $self->_get_allowed_options(); # Trim the defaults (could do with hash slice?) my %trim = map { $_ => $defaults{$_} } keys %allow; return %trim; }
sub _get_supported_init { croak "xxx - get supported init"; }
{ my $cfg_file; if( exists $ENV{SKYCAT_CFG} ) { $cfg_file = $ENV{SKYCAT_CFG}; } elsif ( -f File::Spec->catfile( $ENV{HOME}, ".skycat", "skycat.cfg") ){ $cfg_file = File::Spec->catfile( $ENV{HOME}, ".skycat", "skycat.cfg"); } else { # generate the default path to the $PERLPRFIX/etc/skycat.cfg file, # this is a horrible hack, there is probably an elegant way to do # this but I can't be bothered looking it up right now. my $perlbin = $^X; my ($volume, $dir, $file) = File::Spec->splitpath( $perlbin ); my @dirs = File::Spec->splitdir( $dir ); my @path; foreach my $i ( 0 .. $#dirs-2 ) { push @path, $dirs[$i]; } my $directory = File::Spec->catdir( @path, 'etc' ); # reset to the default $cfg_file = File::Spec->catfile( $directory, "skycat.cfg" ); # debugging and testing purposes unless ( -f $cfg_file ) { # use blib version! $cfg_file = File::Spec->catfile( '.', 'etc', 'skycat.cfg' ); } } sub cfg_file { my $class = shift; if (@_) { $cfg_file = shift; $class->_load_config() || ($cfg_file = undef); } return $cfg_file; } }
sub _load_config { my $self = shift; my $cfg = $self->cfg_file; print "SkyCat.pm: \$cfg = $cfg\n" if $DEBUG; if (!defined $cfg) { warnings::warnif("Config file not specified (undef)"); return; } unless (-e $cfg) { my $xcfg = (defined $cfg ? $cfg : "<undefined>" ); return; } my $fh; unless (open $fh, "<$cfg") { warnings::warnif( "Specified config file, $cfg, could not be opened: $!"); return; } # Need to read the contents into an array my @lines = <$fh>; # Process the config file and extract the raw content my @configs = $self->_extract_raw_info( \@lines ); print "Pre-filtering has \@configs " . @configs . " entries\n" if $DEBUG; # Close file close( $fh ) or do { warnings::warnif("Error closing config file, $cfg: $!"); return; }; # Get the token mapping for validation my %map = $self->_token_mapping; # Currently we are only interested in catalog, namesvr and archive # so throw everything else away @configs = grep { $_->{serv_type} =~ /(namesvr|catalog|archive)/ } @configs; print "Post-filtering has \@configs " . @configs . " entries\n" if $DEBUG; # Process each entry. Mainly URL processing for my $entry ( @configs ) { # Skip if we have already analysed this server if (exists $CONFIG{lc($entry->{short_name})}) { print "Already know about " . $entry->{short_name} . "\n" if $DEBUG; next; } print " Processing " . $entry->{short_name} . "\n" if $DEBUG; print Dumper( $entry ) if( $DEBUG ); # Extract info from the 'url'. We need to extract the following info: # - Host name and port # - remaining url path # - all the CGI options including the static options # Note that at the moment we do not do token replacement (the # rest of the REST architecture expects to get the above # information separately). This might well prove to be silly # since we can trivially replace the tokens without having to # reconstruct the url. Of course, this does allow us to provide # mandatory keywords. $url =~ s/\%ra/$ra/; if ($entry->{url} =~ m|^http:// # Standard http:// prefix ([\w\.\-]+ # remote host (?::\d+)?) # Optional port number / # path separator ([\w\/\-\.]+\?) # remaining URL path and ? (.*) # CGI options without trailing space |x) { $entry->{remote_host} = $1; $entry->{url_path} = $2; my $options = $3; # if first character is & we append that to url_path since it # is an empty argument $entry->{url_path} .= "&" if $options =~ s/^\&//; # In general the options from skycat files are a real pain # Most of them have nice blah=%blah format but there are some cases # that do ?%ra%dec or coords=%ra %dec that just cause more trouble # than they are worth given the standard URL constructor that we # are attempting to inherit from REST # Best idea is not to fight against it. Extract the host, path # and options separately but simply use token replacement when it # comes time to build the URL. This will require that the url # is moved into its own method in REST.pm for subclassing. # We still need to extract the tokens themselves so that we # can generate an allowed options list. # tokens have the form %xxx but we have to make sure we allow # %mime-type. Use the /g modifier to get all the matches my @tokens = ( $options =~ /(\%[\w\-]+)/g); # there should always be tokens. No obvious way to reomve the anomaly warnings::warnif( "No tokens found in $options!!!" ) unless @tokens; # Just need to make sure that these are acceptable tokens # Get the lookup table and store that as the allowed options my %allow; for my $tok (@tokens) { # only one token. See if we recognize it my $strip = $tok; $strip =~ s/%//; if (exists $map{$strip}) { if (!defined $map{$strip}) { warnings::warnif("Do not know how to process token $tok" ); } else { $allow{ $map{$strip} } = $strip; } } else { warnings::warnif("Token $tok not currently recognized") unless exists $map{$strip}; } } # Store them $entry->{tokens} = \@tokens; $entry->{allow} = \%allow; print Dumper( $entry ) if( $DEBUG ); # And store this in the config. Only store it if we have # tokens $CONFIG{lc($entry->{short_name})} = $entry; } # if entry } # for loop # Debug print Dumper(\%CONFIG) if $DEBUG; return; }
sub _extract_raw_info { my $self = shift; my $lines = shift; # Now read in the contents my $current; # Current server spec my @configs; # Somewhere temporary to store the entries for my $line (@$lines) { # Skip comment lines and blank lines next if $line =~ /^\s*\#/; next if $line =~ /^\s*$/; if ($line =~ /^(\w+):\s*(.*?)\s*$/) { # This is content my $key = $1; my $value = $2; # Assume that serv_type is always first if ($key eq 'serv_type') { # Store previous config if it contains something # If it actually contains information on a serv_type of # directory we can follow the URL and recursively expand # the content push(@configs, $self->_dir_check( $current )); # Clear the config and store the serv_type $current = { $key => $value }; } else { # Just store the key value pair $current->{$key} = $value; } } else { # do not know what this line signifies since it is # not a comment and not a content line warnings::warnif("Unexpected line in config file: $line\n"); } } # Last entry will still be in %$current so store it if it contains # something. push(@configs, $self->_dir_check( $current )); # Return the entries return @configs; }
my %followed_dirs; sub _dir_check { my $self = shift; my $current = shift; if (defined $current && %$current) { if ($current->{serv_type} eq 'directory') { # Get the content of the URL unless we are not # reading directories if ($FOLLOW_DIRS && defined $current->{url} && !exists $followed_dirs{$current->{short_name}}) { print "Following directory link to ". $current->{short_name}. "[".$current->{url}."]\n" if $DEBUG; # Indicate that we have followed this link $followed_dirs{$current->{short_name}} = $current->{url}; # Retrieve the url, pass that array to the raw parser and then # return any new configs to our caller # Must force scalar context to get array ref # back rather than a simple list. return $self->_extract_raw_info(scalar $self->_get_directory_url( $current->{url} )); } } else { # Not a 'directory' so this is a simple config entry. Simply return it. return ($current); } } # return empty list since we have no value return (); }
sub _get_directory_url { my $self = shift; my $url = shift; # Call the base class to get the actual content my $content = ''; eval { $content = $self->_fetch_url( $url ); }; # Need an array my @lines; @lines = split("\n", $content) if defined $content; if (wantarray) { return @lines; } else { return \@lines; } }
sub _token_mapping { return ( id => 'id', ra => 'ra', dec => 'dec', # Arcminutes r1 => 'radmin', r2 => 'radmax', w => 'width', h => 'height', n => 'nout', # which filter??? m1 => 'magfaint', m2 => 'magbright', # Is this a conditional? cond => 'cond', # Not Yet Supported cols => undef, 'mime-type' => undef, ws => undef, ); }
sub _from_dec { my $self = shift; my $dec = $self->query_options("dec"); my %allow = $self->_get_allowed_options(); # Need colons $dec =~ s/\s+/:/g; # Need a + preprended $dec = "+" . $dec if $dec !~ /^[\+\-]/; return ($allow{dec},$dec); } sub _from_ra { my $self = shift; my $ra = $self->query_options("ra"); my %allow = $self->_get_allowed_options(); # need colons $ra =~ s/\s+/:/g; return ($allow{ra},$ra); }
sub _translate_one_to_one { my $self = shift; # convert to a hash-list return ($self->SUPER::_translate_one_to_one, map { $_, undef }(qw/ cond /) ); }
1;