| CPAN-Search-Lite documentation | view source | Contained in the CPAN-Search-Lite distribution. |
CPAN::Search::Lite::Populate - create and populate database tables
This module is responsible for creating the tables
(if setup is passed as an option) and then for
inserting, updating, or deleting (as appropriate) the
relevant information from the indices of
CPAN::Search::Lite::Info and CPAN::Search::Lite::PPM and the
state information from CPAN::Search::Lite::State. It does
this through the insert, update, and delete
methods associated with each table.
Note that the tables are created with the setup argument
passed into the new method when creating the
CPAN::Search::Lite::Index object; existing tables will be
dropped.
The tables used are described below.
The mods table contains module information, and is created as
  mod_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  dist_id SMALLINT UNSIGNED NOT NULL
  mod_name VARCHAR(100) NOT NULL
  mod_abs TINYTEXT
  doc bool
  mod_vers VARCHAR(10)
  dslip CHAR(5)
  chapterid TINYINT(2) UNSIGNED
  PRIMARY KEY (mod_id)
  FULLTEXT (mod_abs)
  KEY (dist_id)
  KEY (mod_name(100))
mod_id is the primary (unique) key of the table.
dist_id corresponds to the id of the associated distribution
in the dists table.
mod_name is the module's name.
mod_abs is a description, if available, of the module.
doc, if true, signifies that documentation for the
module exists, and is located, e.g., in dist_name/Foo/Bar.pm
for a module Foo::Bar in the dist_name distribution.
A further value, if true, signifies that the source code for the
module exists, and is located, e.g., in dist_name/Foo/Bar.pm
for a module Foo::Bar in the dist_name distribution.
mod_vers, if present, gives the version of the module.
dslip is a 5-character string expressing the dslip (development, support, language, interface, public license) information.
chapterid corresponds to the chapter id of the module, if present.
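The file location convention mentioned above (dist_name/Foo/Bar.pm for a module Foo::Bar) can be sketched as follows. This is an illustrative Python sketch only, not code from the module itself; the function name module_path is a hypothetical helper introduced here.

```python
def module_path(dist_name, mod_name):
    """Map a module name to its file location within a distribution.

    Hypothetical helper: splits the module name on '::' and joins the
    pieces under the distribution name, adding a '.pm' suffix.
    """
    parts = [dist_name] + mod_name.split("::")
    return "/".join(parts) + ".pm"


print(module_path("dist_name", "Foo::Bar"))
```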
The dists table contains distribution information, and is created as
  dist_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  stamp TIMESTAMP(8)
  auth_id SMALLINT UNSIGNED NOT NULL
  dist_name VARCHAR(90) NOT NULL
  dist_file VARCHAR(110) NOT NULL
  dist_vers VARCHAR(20)
  dist_abs TINYTEXT
  size MEDIUMINT UNSIGNED NOT NULL
  birth DATE NOT NULL
  readme bool
  changes bool
  meta bool
  install bool
  PRIMARY KEY (dist_id)
  FULLTEXT (dist_abs)
  KEY (auth_id)
  KEY (dist_name(90))
dist_id is the primary (unique) key of the table.
stamp is a timestamp indicating when the entry was either inserted or last updated.
auth_id corresponds to the CPAN author id of the distribution
in the auths table.
dist_name corresponds to the distribution name (e.g., for
My-Distname-0.22.tar.gz, dist_name will be My-Distname).
dist_file corresponds to the CPAN file name.
dist_vers is the version of the CPAN file (e.g., for
My-Distname-0.22.tar.gz, dist_vers will be 0.22).
dist_abs is a description of the distribution. If not directly
supplied, the description for, e.g., Foo::Bar, if present, will
be used for the Foo-Bar distribution.
size corresponds to the size of the distribution, in bytes.
birth corresponds to the last modified time of the distribution, in the form YYYY/MM/DD.
readme, if true, indicates that a README file for the distribution is available.
changes, if true, indicates that a Changes file for the distribution is available.
meta, if true, indicates that a META.yml file for the distribution is available.
install, if true, indicates that an INSTALL file for the distribution is available.
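The relationship between the CPAN file name, the distribution name, and the version, along with the description fallback described above, can be illustrated with a small sketch. This is Python for illustration only and a deliberate simplification: the helper names split_dist_file and fallback_dist_abs, the list of archive suffixes, and the rule that a version starts with a digit are all assumptions made here, not the module's actual parsing logic.

```python
# Archive suffixes recognised by this sketch (an assumption).
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip")


def split_dist_file(dist_file):
    """Split a CPAN file name into (dist_name, dist_vers).

    Returns (base_name, None) when no trailing version is found.
    """
    base = dist_file
    for suffix in ARCHIVE_SUFFIXES:
        if base.endswith(suffix):
            base = base[: -len(suffix)]
            break
    name, sep, vers = base.rpartition("-")
    # Treat the last dash-separated piece as a version only if it
    # starts with a digit (simplifying assumption).
    if sep and vers[:1].isdigit():
        return name, vers
    return base, None


def fallback_dist_abs(dist_name, dist_abs, mod_abs_by_name):
    """Use the distribution's own description if supplied; otherwise
    fall back to the description of the like-named module
    (Foo-Bar -> Foo::Bar), if one is present."""
    if dist_abs:
        return dist_abs
    return mod_abs_by_name.get(dist_name.replace("-", "::"))


print(split_dist_file("My-Distname-0.22.tar.gz"))
```

Real CPAN file names are more varied than this sketch handles, so it should be read as an illustration of the two fields rather than as a parser.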
The auths table contains CPAN author information, and is created as
  auth_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  cpanid VARCHAR(20) NOT NULL
  fullname VARCHAR(40) NOT NULL
  email TINYTEXT
  PRIMARY KEY (auth_id)
  FULLTEXT (fullname)
  KEY (cpanid(20))
auth_id is the primary (unique) key of the table.
cpanid gives the CPAN author id.
fullname is the full name of the author.
email is the supplied email address of the author.
The chaps table contains chapter information associated with
distributions. PAUSE allows one, when registering modules,
to associate a chapter id with each module (see the mods
table). This information is used here to associate chapters
(and subchapters) with distributions in the following manner.
Suppose a distribution Quantum-Theory contains a module
Beta::Decay with chapter id 55, and
another module Laser with chapter id 87. The
Quantum-Theory distribution will then have two
entries in this table - chapterid of 55 and
subchapter of Beta, and chapterid of 87 and
subchapter of Laser.
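The Quantum-Theory example can be sketched in a few lines. This is Python for illustration only, not the module's code; the function name chapter_entries is hypothetical, and two points are assumptions drawn from the example rather than stated rules: that the subchapter is the first component of the module name, and that duplicate (chapterid, subchapter) pairs are recorded once.

```python
def chapter_entries(modules):
    """Derive (chapterid, subchapter) pairs for one distribution.

    modules: iterable of (mod_name, chapterid) pairs, where chapterid
    is None for modules that have no chapter assigned.
    """
    entries = []
    seen = set()
    for mod_name, chapterid in modules:
        if chapterid is None:
            continue  # module has no chapter id; nothing to record
        subchapter = mod_name.split("::")[0]  # Beta::Decay -> Beta
        key = (chapterid, subchapter)
        if key not in seen:  # assumption: record each pair once
            seen.add(key)
            entries.append(key)
    return entries


quantum_theory = [("Beta::Decay", 55), ("Laser", 87)]
print(chapter_entries(quantum_theory))
```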
The table is created as follows.
  chap_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  chapterid TINYINT UNSIGNED NOT NULL
  dist_id SMALLINT UNSIGNED NOT NULL
  subchapter TINYTEXT
  KEY (dist_id)
chap_id is the primary (unique) key of the table.
chapterid corresponds to the chapter id.
dist_id is the id corresponding to the distribution in the
dists table.
subchapter is the subchapter (e.g., Beta in the example above).
This table lists the prerequisites of the distribution,
as found in the META.yml file (if supplied - note that
only relatively recent versions of ExtUtils::MakeMaker
or Module::Build generate this file when making a
distribution). The table is created as
  req_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  dist_id SMALLINT UNSIGNED NOT NULL
  mod_id SMALLINT UNSIGNED NOT NULL
  req_vers VARCHAR(10)
  KEY (dist_id)
req_id is the primary (unique) key of the table.
dist_id corresponds to the id of the distribution in the
dists table.
mod_id corresponds to the id of the prerequisite module
in the mods table.
req_vers is the version of the prerequisite module, if specified.
The ppms table contains information on Win32 ppm
packages available in the repositories specified
in $repositories of CPAN::Search::Lite::Util.
The table is created as
  ppm_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT
  dist_id SMALLINT UNSIGNED NOT NULL
  rep_id TINYINT(2) UNSIGNED NOT NULL
  ppm_vers VARCHAR(20)
  KEY (dist_id)
ppm_id is the primary (unique) key of the table.
dist_id is the id of the distribution appearing in the
dists table.
rep_id is the id of the repository appearing in the
$repositories data structure.
ppm_vers is the version of the ppm package found.
This table contains information on the Win32 ppm
repositories specified in $repositories of
CPAN::Search::Lite::Util.
The table is created as
  rep_id SMALLINT UNSIGNED NOT NULL
  abs TINYTEXT
  browse TINYTEXT
  perl VARCHAR(10)
  alias VARCHAR(20)
  KEY (rep_id)
rep_id is the primary (unique) key of the table, and
corresponds to the rep_id of the ppms table.
abs is a description of the repository.
browse is a URL where one can browse the repository.
perl specifies the perl version the repository corresponds to.
alias specifies a short alias for the repository.
This contains information on the chapters. The table is created as
  chapterid SMALLINT UNSIGNED NOT NULL
  chap_link TINYTEXT
  KEY (chapterid)
This is the id of the distribution appearing in the
dists table.
chapterid is the primary (unique) key of the table, and
corresponds to the chapterid of the dists, mods,
and chaps tables.
chap_link is a description of the chapter that chapterid corresponds
to (e.g., File_Handle_Input_Output).
When uploading a module to PAUSE, there is an option to assign it to one of 24 broad categories. However, many modules have not been assigned such a category, for one reason or another. When populating the tables, the AI::Categorizer module is used to guess a possible category for those modules that haven't been assigned one, using a training set built from the modules that have been assigned a category (see AI::Categorizer for general details). If this guess scores above a configurable threshold (see CPAN::Search::Lite::Index), the guess is accepted and inserted into the database, and the categories associated with the module's distribution are updated accordingly.
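The acceptance rule for a guessed category can be sketched as below. This is Python for illustration only and does not use AI::Categorizer; the function name accept_guess, the dictionary of per-category scores, and the example numbers are hypothetical, and "above" is read here as strictly greater than the threshold.

```python
def accept_guess(scores, threshold):
    """Return the best-scoring category if its score is strictly
    above the threshold; otherwise return None.

    scores: dict mapping a category id to a classifier score
    (hypothetical input shape chosen for this sketch).
    """
    if not scores:
        return None
    category, score = max(scores.items(), key=lambda item: item[1])
    return category if score > threshold else None


# Made-up scores for a module with no assigned category.
guess = accept_guess({55: 0.91, 87: 0.40}, threshold=0.80)
print(guess)
```

A module whose best score does not clear the threshold is simply left without a category in this sketch.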