NAME

CMap and GBrowse integration


INTRODUCTION

There has been considerable interest in integrating GBrowse and CMap. The solution presented here is to interweave their database schemas by merging the cmap_feature table (from CMap) and the fgroup table (from GBrowse). This allows each program to reference the features (CMap) or groups (GBrowse) stored by the other.

This document explains how to set up an integrated installation.

Note: Currently, MySQL is the only RDBMS supported by the integration.


CREATE THE DATABASE

There are several different ways to create the integrated database.

Fresh Install

If you are installing both for the first time, you can use the sql/cmap_gbrowse_create.mysql file. This will create a usable CMap database that also includes the integrated GBrowse schema. For instructions on how to load an SQL file, see INSTALL.pod.

You can also install the regular CMap database and then follow the "Previous CMap Install" directions below.

Previous CMap Install

If you currently have CMap version 0.13 or higher, don't panic. The database update is non-destructive to the CMap side. (If you are running an older version of CMap, you will need to upgrade first.)

Simply run the upgrade SQL script, sql/cmap_gbrowse.adapt.mysql. This will add the GBrowse tables and the new columns to the CMap tables.

    Example:
        $ mysql CMAP_GBROWSE < sql/cmap_gbrowse.adapt.mysql

Create Database Using bp_load_gff.pl or bp_bulk_load_gff.pl

You will need a recent version of bioperl-live from CVS.

If you wish to import GBrowse data first (and use that data to populate the CMap side), you can use either of these loaders: bp_load_gff.pl or bp_bulk_load_gff.pl. These programs can create the database while they import data.

You must specify mysqlcmap as the adaptor. Note that, in the examples below, bp_bulk_load_gff.pl is given the name as mysqlcmap while bp_load_gff.pl is given dbi::mysqlcmap.

    Examples:
        $ bp_bulk_load_gff.pl -d dbi:mysql:db_name -a mysqlcmap --create yeast_data.gff
        $ bp_load_gff.pl -d dbi:mysql:db_name -a dbi::mysqlcmap --create yeast_data.gff


CONFIGURING CMAP

Starting with a regular CMap config, configuring for integration can be fun and easy. (If you need help with the CMap configuration part, please see ADMINISTRATION.pod in the docs folder.)

New Fields

The following fields need to be added to the individual config file for this data source, as in this example.

    #
    # The database includes gbrowse tables.
    #
    gbrowse_compatible 1
    #
    # The GBrowse class that a map will have.
    #
    gbrowse_default_map_class CMap_Map
    #
    # The CMap feature_type_acc that a map "feature" will have.
    #
    gbrowse_default_map_feature_type_acc gbrowse_map
    <feature_type gbrowse_map>
        feature_type_acc gbrowse_map
        feature_type GBrowse Map
        default_rank 10
        color
        shape in-triangle
        drawing_lane 10
        drawing_priority 1
    </feature_type>

Modifying Feature Types

Each feature type that will be shared with GBrowse needs a gbrowse_class and a gbrowse_ftype.

    Example:
    <feature_type read>
        feature_type_acc read
        feature_type Read
        default_rank 3
        shape out-triangle
        drawing_lane 3
        drawing_priority 1
        gbrowse_class read
        gbrowse_ftype Read
    </feature_type>

Modifying Map Types

Each map type whose map sets will be shared with GBrowse needs a gbrowse_ftype.

    Example:
    <map_type sequence>
        map_type_acc sequence
        map_type Sequence
        map_units bp
        is_relational_map
        width 1
        shape span
        color
        display_order 1
        gbrowse_ftype Sequence
    </map_type>


CONFIGURING GBROWSE

To use the integrated database, you must select the dbi::mysqlcmap adaptor as in the following example.

    description   = CMap Integrated 
    db_adaptor    = Bio::DB::GFF
    db_args       = -adaptor dbi::mysqlcmap
                    -dsn     CMAP_GBROWSE
    user          = mysql
    pass          = mysql

Also, remember to set the "reference class" to the same value that was defined in the CMap config file (gbrowse_default_map_class).

If you are going to copy data from the GBrowse side of the database into the CMap side, you will also need to add a "cmap_feature_type_acc" value to each track that you want to be able to copy. An example track would look like this:

    [contig]
    feature      = contig
    glyph        = generic
    key          = Contig
    link         = sub {my $self = shift; return $self->cmap_viewer_link('GCMAP');}
    cmap_feature_type_acc = contig


DATA

There are multiple ways to get data into the database. You can load CMap data first using cmap_admin.pl (which comes with CMap). Then you can either copy that data into the GBrowse side or manually insert corresponding GBrowse data using bp_load_gff.pl (from BioPerl). Alternatively, you can load GFF data into the GBrowse side first and then copy that data into the CMap side.

LOADING CMAP DATA

You can load the CMap data as normal. Keep in mind that, for data that will be matched with separately imported GBrowse data, it is (currently) essential that the CMap data be loaded first. Under this system, there can be CMap data that is not displayed in GBrowse, but not the other way around.

COPY DATA FROM CMAP

You can also copy data that was imported as CMap data into the GBrowse side of the database using cmap_admin.pl. If your data source config file has "gbrowse_compatible" set, the menu option "Copy CMap data into GBrowse" will appear in the cmap_admin.pl main menu.

Simply select the copy option and select the map sets and feature types you want to import.

Be sure that the feature types you want to copy have both a gbrowse_class and a gbrowse_ftype. The map sets' map type must have a gbrowse_ftype associated with it in the config file as well.
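
The requirement above can be checked before running the copy. The following is only a sketch: it assumes the config file uses the one-setting-per-line layout shown in CONFIGURING CMAP, and the function name check_gbrowse_keys is made up for this example. It reports every feature type and map type in the file, so entries you do not intend to copy can be ignored.

```shell
#!/bin/sh
# check_gbrowse_keys CONFIG_FILE   (hypothetical helper, not part of CMap)
# Prints one line for each <feature_type> section lacking gbrowse_class or
# gbrowse_ftype, and for each <map_type> section lacking gbrowse_ftype.
# Returns non-zero if anything was reported.
check_gbrowse_keys() {
    awk '
        # Opening tag: remember the section kind and its accession.
        /^[ \t]*<(feature_type|map_type)[ \t]/ {
            kind = $1; sub(/^</, "", kind)
            name = $2; sub(/>$/, "", name)
            has_class = 0; has_ftype = 0
            next
        }
        # A key only counts if it is followed by a value.
        kind != "" && $1 == "gbrowse_class" && NF > 1 { has_class = 1 }
        kind != "" && $1 == "gbrowse_ftype" && NF > 1 { has_ftype = 1 }
        # Closing tag: report whatever the section was missing.
        /^[ \t]*<\/(feature_type|map_type)>/ && kind != "" {
            if (kind == "feature_type" && !has_class) {
                print kind, name ": missing gbrowse_class"; bad = 1
            }
            if (!has_ftype) {
                print kind, name ": missing gbrowse_ftype"; bad = 1
            }
            kind = ""
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Usage (the path is a placeholder for your own data source config):
#   check_gbrowse_keys /path/to/data_source.conf
```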

MANUALLY IMPORT CORRESPONDING GBROWSE DATA

If you have a GFF file with data that corresponds to the CMap data, you can import that data separately. Doing this rather than copying from the CMap side will allow you to import more detailed data.

When importing, it is very important that the feature names and classes correspond to the feature names and gclasses already in the CMap database.
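
To see which classes and names a GFF file will bring in, the group column can be listed before importing. This is a rough sketch under stated assumptions: the file is tab-delimited with the group in column 9, the group is written as a class followed by a name (optionally quoted, optionally followed by further attributes after a semicolon), and names do not themselves contain semicolons. The function name gff_groups is made up for this example.

```shell
#!/bin/sh
# gff_groups GFF_FILE   (hypothetical helper)
# Prints each distinct "class<TAB>name" pair found in column 9.
gff_groups() {
    awk -F '\t' '
        /^#/ || NF < 9 { next }            # skip comments and short lines
        {
            group = $9
            sub(/[ \t]*;.*/, "", group)    # keep only the first attribute
            sub(/^[ \t]+/, "", group)
            class = group
            sub(/[ \t].*/, "", class)      # first word is the class
            name = group
            # Drop the class and the space after it; skip lines with no name.
            if (sub(/^[^ \t]+[ \t]+/, "", name) == 0) next
            gsub(/"/, "", name)            # strip optional quotes
            print class "\t" name
        }
    ' "$1" | sort -u
}

# Usage:
#   gff_groups yeast_data.gff
```

The resulting pairs can then be compared by eye, or with your own query, against the gclass and feature name values on the CMap side.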

PREPARING CMAP DATA FOR GBROWSE DATA INSERT

Using cmap_admin.pl, select the correct data source and then choose the "Prepare the Database for GBrowse data" option. It will ask you to select a map set and some feature types to prepare. You will want to prepare the data that corresponds to the GBrowse data you will then be importing.

This will copy the gbrowse_class into the gclass column of each feature. The gbrowse_class is defined by the feature type in the config file (see Modifying Feature Types above).

It also creates a feature for the whole map. This feature will have the gclass that is defined in the config file as gbrowse_default_map_class and the cmap feature type that is defined in the config file as gbrowse_default_map_feature_type_acc.

When this is done, the database will be ready for importing the GBrowse data.

IMPORTING THE GBROWSE DATA

You can use bp_load_gff.pl to load data into the GBrowse side of the database (making sure that the data is already prepared in the CMap side, of course). Use the dbi::mysqlcmap adaptor as follows.

    /usr/bin/bp_load_gff.pl --dsn CMAP_GBROWSE --adaptor dbi::mysqlcmap file_name.gff

LOADING GBROWSE DATA

You can load GBrowse data on its own from a GFF file. This method is also described in Create Database Using bp_load_gff.pl or bp_bulk_load_gff.pl.

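Do NOT use the --create option unless you want to wipe the database and start over.

You can use either of these loaders: bp_load_gff.pl or bp_bulk_load_gff.pl.

You must specify mysqlcmap as the adaptor. Note that, in the examples below, bp_bulk_load_gff.pl is given the name as mysqlcmap while bp_load_gff.pl is given dbi::mysqlcmap.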

    Examples:
        $ bp_bulk_load_gff.pl -d dbi:mysql:db_name -a mysqlcmap yeast_data.gff
        $ bp_load_gff.pl -d dbi:mysql:db_name -a dbi::mysqlcmap yeast_data.gff
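
Because --create re-initializes the database, one optional safeguard is a small shell wrapper that refuses to pass that option along. The wrapper below is purely illustrative: the function name load_gff_keep_db and the GFF_LOADER variable are made up here, and it only catches the option when spelled out in full as --create, not any abbreviated form.

```shell
#!/bin/sh
# load_gff_keep_db ARGS...   (hypothetical wrapper)
# Runs the loader named in $GFF_LOADER (default: bp_load_gff.pl) with ARGS,
# but refuses to run at all if --create is among them.
load_gff_keep_db() {
    for arg in "$@"; do
        case $arg in
            --create)
                echo "refusing to run with --create: it would wipe the database" >&2
                return 1
                ;;
        esac
    done
    "${GFF_LOADER:-bp_load_gff.pl}" "$@"
}

# Usage:
#   load_gff_keep_db -d dbi:mysql:db_name -a dbi::mysqlcmap yeast_data.gff
```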

COPYING DATA FROM GBROWSE

In order to copy data from the GBrowse side to the CMap side of the database, you will need to make some changes to the config files. These are described in CONFIGURING CMAP and CONFIGURING GBROWSE.

Using cmap_admin.pl, select the "Copy GBrowse into the CMap database" option (this option will only appear if the data source config file that you are currently using has gbrowse_compatible set).

Then select the map set that you want to copy the data into and the program will take care of the rest.


LINK THE TWO

Now you will want to link from each to the other. You will have to make some adjustments to both GBrowse and CMap to be able to do this.

Link to CMap from GBrowse

To create a link from a GBrowse feature to CMap, you simply need to add a link option to the track in the GBrowse configuration file, like the following.

 link = sub {my $self = shift; return $self->cmap_viewer_link('data_source');}

This calls a method of the feature that returns a link to CMap. If it is able to, it will return a link to the viewer showing the feature highlighted. If it is unable to create the link, either because the data cannot be found in the CMap database or because it is not using an adaptor with the create_cmap_viewer_link method (only mysqlcmap.pm has this currently), it will return a link to the feature search page using the feature name.

The 'data_source' is the name of the CMap data_source that the link will use.

With this in place, you have connected GBrowse to CMap.

Link to GBrowse from CMap

You can change the URL that a CMap feature links to by defining it in the CMap config file, as in the following example. Pay attention to the $url variable in the area_code string.

 <feature_type read>
 feature_type_acc read
 feature_type Read
 default_rank 3
 shape span
 drawing_lane 3
 drawing_priority 1
 gbrowse_class read
 gbrowse_ftype read
 area_code <<EOF
 $code=sprintf("onMouseOver=\"window.status='View %s in GBrowse';return true\"",$feature->{'feature_type_acc'});
 $alt = $feature->{'feature_name'};
 $url=sprintf("/cgi-bin/gbrowse/cmap?name=%s:%s",$self->feature_type_data($feature->{'feature_type_acc'},'gbrowse_ftype'),$feature->{'feature_name'});
 EOF
 </feature_type>

The area_code option allows you to change three variables that affect the behavior of the area box around the feature (and its label). The $code variable can contain JavaScript, and the $alt variable contains the string displayed on mouse over. We are currently interested in the $url variable, which contains the URL that the feature will link to when clicked.

 $url=sprintf("/cgi-bin/gbrowse/cmap?name=%s:%s",$self->feature_type_data($feature->{'feature_type_acc'},'gbrowse_ftype'),$feature->{'feature_name'});

GBrowse needs two bits of information to find a feature: its ftype and its name. To get the name, we look into the feature object that we have access to. It is contained in $feature->{'feature_name'}.

To get the ftype, we need to query the config. We can do this by calling the 'feature_type_data' method, which takes the feature type accession ($feature->{'feature_type_acc'}) and the key that you are looking for ('gbrowse_ftype'). Although you could cheat and hard code the ftype, it would make it more difficult to copy and paste this bit of code. Besides, this is more fun.
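
As a worked illustration of the format string, the shell snippet below fills in the same two placeholders by hand. It is not part of the CMap configuration; the function name gbrowse_url and the feature name read_001 are made up for this example, and "read" is the gbrowse_ftype from the feature type above. Like the Perl line, it does no URL escaping.

```shell
#!/bin/sh
# gbrowse_url FTYPE FEATURE_NAME   (hypothetical helper)
# Mirrors the sprintf format used for $url in the area_code block.
gbrowse_url() {
    printf '/cgi-bin/gbrowse/cmap?name=%s:%s\n' "$1" "$2"
}

gbrowse_url read read_001
# prints: /cgi-bin/gbrowse/cmap?name=read:read_001
```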

That is all you need to connect the feature back to GBrowse.


CONCLUSION

Then you're done, and you can happily use both CMap and GBrowse in concert.


AUTHOR

Ben Faga <faga@cshl.org>
