TITLE

CMap Code Overview


VERSION

$Revision: 1.10 $

CMap is a CGI application for viewing comparative and genetic maps. Written entirely in Perl, it runs on many different operating systems and relational database management systems (RDBMS), including Oracle, MySQL, Sybase and PostgreSQL. CMap can create images in standard formats such as PNG and JPEG using "libgd", and it can also produce SVG (Scalable Vector Graphics). The code was originally written for the Gramene project (http://www.gramene.org/), a comparative mapping resource for crop grasses, but much has been done to make the application generic enough to be used with many different types of data.


ARCHITECTURE OVERVIEW

Care has been taken to separate functionally different parts of the code into different modules, roughly corresponding to a traditional "three-tiered" structure of data, logic, and presentation layers. You'll find all the database interaction encapsulated in the Bio::GMOD::CMap::Data* modules, all the "logic" (the code that lays out the map components) in the Bio::GMOD::CMap::Drawer* modules, and all the HTML generation in the Bio::GMOD::CMap::Apache* modules.


DATA MODULES

As stated above, all the database interaction happens in the Bio::GMOD::CMap::Data* modules. One goal of this project has always been compatibility with multiple RDBMSs (if only from necessity, as the system was developed using MySQL but is deployed on Oracle). As a consequence, all the SQL will eventually be placed into object-oriented modules where the statements can be subclassed and modified to run with a particular database without affecting any other SQL.

The Bio::GMOD::CMap::Data module has as a component an "SQL" object, with the choices right now confined to Bio::GMOD::CMap::Data::[Generic|MySQL|Oracle]. The "Generic" module is the superclass of the other two (and conceivably of any others, such as classes for PostgreSQL, Sybase, etc.). All SQL statement methods are defined in the Generic module, and any that don't work for a particular RDBMS can be overridden in a subclass. This also allows users of other systems to create their own modules and drop them into place with very little effort. All that needs to happen is to subclass Bio::GMOD::CMap::Data::Generic (as noted in the perldocs), and then add a line to Bio::GMOD::CMap::Constants to point to the new module.
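
The override pattern described above can be sketched roughly as follows. This is only an illustration, not CMap's actual code: the package names (Sketch::SQL::*), the method names, the "map" table and the %sql_module lookup are all hypothetical stand-ins for the real Generic module, its subclasses and the entry in the Constants module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the generic SQL superclass.
package Sketch::SQL::Generic;

sub new { my $class = shift; return bless {}, $class }

# A statement that is portable enough to be shared by every subclass.
sub map_count_sql {
    return 'select count(*) from map';
}

# A statement whose syntax differs between databases.
sub first_rows_sql {
    my ( $self, $limit ) = @_;
    $limit = int $limit;
    return "select name from map order by name fetch first $limit rows only";
}

# Hypothetical subclass: override only the statements that need it.
package Sketch::SQL::MySQL;
our @ISA = ('Sketch::SQL::Generic');

sub first_rows_sql {
    my ( $self, $limit ) = @_;
    $limit = int $limit;
    return "select name from map order by name limit $limit";
}

package main;

# Hypothetical lookup table, playing the role of the line in Constants
# that points a database driver at its SQL module.
my %sql_module = (
    generic => 'Sketch::SQL::Generic',
    mysql   => 'Sketch::SQL::MySQL',
);

# Return an SQL object for a driver, falling back to the generic class.
sub sql_for {
    my $driver = shift;
    my $class  = $sql_module{ lc $driver } || $sql_module{generic};
    return $class->new;
}

my $sql = sql_for('mysql');
print $sql->map_count_sql, "\n";        # inherited from the generic class
print $sql->first_rows_sql(10), "\n";   # overridden for MySQL
```

Supporting another database in this sketch would mean adding one more subclass and one more entry in the lookup table, leaving every other statement untouched.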


LOGIC MODULES

All the modules that actually do something toward laying out the comparative maps live in the Bio::GMOD::CMap::Drawer* namespace. The top level, "Drawer.pm," is basically the coordinator of the objects it manipulates. The Drawer creates a "Map" object for each map (or map set) that the user has requested. It asks each Map to lay itself out, then it adjusts the frame, and writes the image to a file. It then is able to tell the calling object the filename of the image and its height and width.
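
As a rough sketch of that coordination, the toy classes below follow the same sequence: each map lays itself out, the coordinator grows the frame to fit, and the caller then asks for the filename and dimensions. The class names (Sketch::Map, Sketch::Drawer), the accessor names, and the fixed widths, gaps and margins are assumptions made for illustration; they are not taken from the real Drawer modules, and no image is actually written.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical map object that knows how to lay itself out.
package Sketch::Map;

sub new {
    my ( $class, %args ) = @_;
    return bless { name => $args{name}, length => $args{length} }, $class;
}

# Lay the map out starting at the given x offset and return its
# bounding box as (left, top, right, bottom).
sub layout {
    my ( $self, $x ) = @_;
    my $width = 40;    # assumed fixed width of a drawn map
    return ( $x, 0, $x + $width, $self->{length} );
}

# Hypothetical coordinator, playing the role of the top-level drawer.
package Sketch::Drawer;

sub new {
    my ( $class, %args ) = @_;
    my $self = bless { maps => $args{maps}, file => $args{file} }, $class;
    $self->draw;
    return $self;
}

sub draw {
    my $self = shift;
    my ( $x, $max_x, $max_y ) = ( 0, 0, 0 );

    # Ask each map to lay itself out, left to right.
    for my $map ( @{ $self->{maps} } ) {
        my @box = $map->layout($x);
        $max_x = $box[2] if $box[2] > $max_x;
        $max_y = $box[3] if $box[3] > $max_y;
        $x     = $box[2] + 20;    # assumed gap between maps
    }

    # Adjust the frame so it encloses every map plus a margin.
    my $margin = 10;
    $self->{width}  = $max_x + 2 * $margin;
    $self->{height} = $max_y + 2 * $margin;

    # A real drawer would render and write the image file here.
    return $self;
}

sub image_name { return $_[0]{file} }
sub width      { return $_[0]{width} }
sub height     { return $_[0]{height} }

package main;

my $drawer = Sketch::Drawer->new(
    file => 'maps.png',
    maps => [
        Sketch::Map->new( name => 'A', length => 200 ),
        Sketch::Map->new( name => 'B', length => 150 ),
    ],
);

# prints "maps.png 120x220"
printf "%s %dx%d\n", $drawer->image_name, $drawer->width, $drawer->height;
```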

Eventually other modules should fall within this classification, especially the module for administrative functions such as creating and editing map sets, maps, features, correspondences, etc. All of those functions are currently spread around in the Bio::GMOD::CMap::Admin and Bio::GMOD::CMap::Apache::AdminViewer modules and the cmap_admin.pl script. Eventually I hope to move all the logic into Bio::GMOD::CMap::Admin and have the web and command-line interfaces simply invoke methods on this Admin object.


PRESENTATION MODULES

The modules in the Bio::GMOD::CMap::Apache namespace are responsible for actually displaying the maps through a web interface. All of the modules are basic Perl classes whose objects inherit from the Bio::GMOD::CMap::Apache superclass. This superclass creates the Template Toolkit object and the "page" object (see perldocs), and it handles any errors thrown by the derived classes, reducing the amount of code needed to create a new handler.
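
The idea of a superclass that traps errors on behalf of its derived classes can be sketched like this. The names here (Sketch::Handler, run, handler, the map_set parameter) are hypothetical and the output is plain text to keep the example short; the real superclass also sets up the Template Toolkit and "page" objects, which this sketch leaves out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical superclass: owns the shared entry point and error handling.
package Sketch::Handler;

sub new { my $class = shift; return bless {}, $class }

# Run the derived class's handler and trap anything it throws, so that
# each new handler only has to implement its own logic.
sub run {
    my ( $self, %params ) = @_;
    my $output = eval { $self->handler(%params) };
    if ( my $err = $@ ) {
        chomp $err;
        return "Error: $err";
    }
    return $output;
}

# Derived classes are expected to override this.
sub handler { die "handler() not implemented\n" }

# Hypothetical derived class: contains only the viewer-specific logic.
package Sketch::Handler::MapViewer;
our @ISA = ('Sketch::Handler');

sub handler {
    my ( $self, %params ) = @_;
    die "No map set specified\n" unless $params{map_set};
    return "Showing map set $params{map_set}";
}

package main;

my $viewer = Sketch::Handler::MapViewer->new;
print $viewer->run( map_set => 'Rice' ), "\n";    # normal page
print $viewer->run, "\n";                         # error caught by superclass
```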

You'll notice that there is no HTML mixed with Perl code, as all the web pages are generated with the Template Toolkit Perl module (http://www.template-toolkit.com/) written by Andy Wardley. Template Toolkit is a powerful and freely available Perl templating system, and the hope is that by using it, non-technical people who want to tweak the HTML can do so without interfering with the code.


CONFIGURATION MODULES

The Bio::GMOD::CMap::Config module handles reading and parsing the config files.

All local configuration of CMap should be done through the "cmap.conf" directory. Of course, the directory doesn't have to be called "cmap.conf." It can be called whatever you like, so long as the absolute path to the directory is in the Bio::GMOD::CMap::Constants file. This path is automatically written during installation if you do the standard "perl Build.PL; ./Build; ./Build install" process.

There are now two types of config files. There is the "global.conf" that handles information that all the data sources need, like the default data source. There is also one config file for each data source. This file handles most of the configurable options.

There are defaults provided for almost every option in the local config file, with the exception of the database connection info and the template and image cache directories. The latter two should be set during installation, and the first should be set by the installer after installation (they are prompted to do this after "./Build install"). If you comment out any of the options in "cmap.conf" (except "database," "template_dir" and "cache_dir"), the defaults in the Bio::GMOD::CMap::Constants file are used.
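
The fallback behaviour can be pictured with a small sketch. The three required option names come from the text above; everything else (the %DEFAULT hash, its option names and values, the resolve_config function and the placeholder paths) is hypothetical and does not reflect the actual contents of the Constants module or the real config file format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical defaults, standing in for those kept in the Constants module.
my %DEFAULT = (
    image_type => 'png',
    max_rows   => 25,
);

# Options that have no default and must be set locally.
my @REQUIRED = qw( database template_dir cache_dir );

# Merge the local settings over the defaults, insisting on the options
# that cannot be defaulted.
sub resolve_config {
    my %local   = @_;
    my @missing = grep { !defined $local{$_} } @REQUIRED;
    die 'Missing required option(s): ' . join( ', ', @missing ) . "\n"
        if @missing;
    return ( %DEFAULT, %local );    # local values win over defaults
}

my %config = resolve_config(
    database     => 'placeholder connection info',
    template_dir => '/path/to/templates',
    cache_dir    => '/path/to/cache',
    image_type   => 'svg',          # overrides the default
);

print "$_ = $config{$_}\n" for sort keys %config;
```

Here "max_rows" is not set locally, so the default applies, while "image_type" takes the local value.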

Feature, map and evidence types are now defined and controlled in the data source config files.


GENERAL FLOW FOR HANDLERS

The web presentation modules are all located under the Bio::GMOD::CMap::Apache namespace and are instantiated as objects. In order to understand how they are invoked, I will describe how the main map viewer (Bio::GMOD::CMap::Apache::MapViewer) works.

The above scenario is probably the most involved process in the comparative maps, but it shows the way that distinct pieces of the problem are split into specialized modules and objects.


SQL CONVENTIONS

The tables used by the comparative maps follow a fairly rigid naming convention so that they should be able to integrate easily with existing databases.


TABLE DESCRIPTIONS

In the "docs" directory, you will find a schema diagram illustrating the structure and relationships of the tables. In the "sql" directory, you will find create statements for the tables for MySQL, Oracle and PostgreSQL. Following is a general description of each table, what kind of data it is supposed to hold, and how it fits in with the others. Most of the fields in the tables are described more fully in the ADMINISTRATION document when discussing the forms presented in the web admin tool.


TIPS

If you'd like to write a script to access the database directly, you can get a handle to the database quite easily using the Bio::GMOD::CMap modules. Here's an example:

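  #!/usr/bin/perl
  use strict;
  use warnings;
  use Bio::GMOD::CMap;
  # create the CMap object, which reads the installed configuration
  my $cmap = Bio::GMOD::CMap->new or die Bio::GMOD::CMap->error;
  # optional, only if you have multiple data sources defined
  # $cmap->data_source('Foo');
  # get the database handle for the current data source
  my $db = $cmap->db or die $cmap->error;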


AUTHOR

Ken Y. Clark, kclark@cshl.edu
Ben Faga, faga@cshl.edu

Copyright (c) 2002-5 Cold Spring Harbor Laboratory
