This document is intended to briefly describe each of these scripts and point to other documentation where an individual script is described in more detail.

ADMINISTRATION

The chief administration script is cmap_admin.pl. It will be the most frequently used script because it handles most of the tasks needed to get CMap running and keep it running. The other administration scripts help diagnose problems or address issues that an administrator might face.

cmap_admin.pl

Description

This is the main administration script. Most administrative tasks can be completed using the cmap_admin.pl script, from importing data to clearing the cache. For much more information, please read the ADMINISTRATION document included with the distribution.

Usage

  $ cmap_admin.pl -d datasource [options] [data_file]

cmap_data_diagnostics.pl

Description

This script checks the CMap database for problems. It reports any problems that it finds, such as missing configurations for map or feature types.

Usage

Stats and warnings are printed to standard output and errors are printed to standard error. To separate the two streams, redirect them as in the following example.

  $ cmap_data_diagnostics.pl -d datasource [options] 1>stats_file 2>error_file
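
Once the two streams are in separate files, a wrapper script can check whether anything was written to the error file. The following is a minimal sketch of that idea. The function fake_diagnostics is a hypothetical stand-in that only mimics the documented behaviour (stats on standard output, errors on standard error); it is not part of CMap, and the sketch makes no assumption about the exit status of the real script.

```shell
# Hypothetical stand-in for cmap_data_diagnostics.pl. It only mimics the
# documented behaviour: stats/warnings on stdout, errors on stderr.
fake_diagnostics() {
    echo "stat: example statistic"
    echo "error: example problem" >&2
}

workdir=$(mktemp -d)

# Same redirection as in the usage line above.
fake_diagnostics 1>"$workdir/stats_file" 2>"$workdir/error_file"

# A non-empty error file means at least one error was reported.
if [ -s "$workdir/error_file" ]; then
    problems_found=yes
else
    problems_found=no
fi
echo "problems found: $problems_found"
```

In real use, replace the call to fake_diagnostics with the cmap_data_diagnostics.pl command line shown above.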

cmap_validate_config.pl

Description

This script checks a config file (but not global.conf) to see if it is valid. It will report any problems it finds.

Usage

  $ cmap_validate_config.pl config_file.conf

cmap_reduce_cache_size.pl

Description

From the documentation: This script cycles through each CMap data_source and reduces the size of the query cache to the value given as 'max_query_cache_size' in the config file.

An optional --config_dir value can be set to run this script on config files stored in a secondary location.

Usage

  $ cmap_reduce_cache_size.pl

cmap_metrics.pl

Description

From the script documentation: A simple script to tell you how many records are in each table. If no data-source is provided, the default is used.

Usage

  $ cmap_metrics.pl -d DATASOURCE

cmap_matrix_compare.pl

Description

From the script documentation: This script is designed to compare the CMap correspondence matrix data between different loads of the database.

Usage

  $ ./cmap_matrix_compare.pl --store monday.dat

  $ ./cmap_matrix_compare.pl --store tuesday.dat

  $ ./cmap_matrix_compare.pl --compare monday.dat --compare tuesday.dat
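
The store-then-compare sequence above can be wrapped in a small shell function so that each new load is stored and, if an earlier snapshot exists, compared against it. This is only a sketch: the function name snapshot_and_compare and the CMAP_MATRIX_COMPARE variable are made up for illustration, and only the --store and --compare options shown above are used. It assumes the script exits with a non-zero status when storing fails.

```shell
# Path to the script; override CMAP_MATRIX_COMPARE if it lives elsewhere.
CMAP_MATRIX_COMPARE=${CMAP_MATRIX_COMPARE:-./cmap_matrix_compare.pl}

# snapshot_and_compare PREVIOUS_FILE NEW_FILE
# Stores the current matrix data in NEW_FILE, then compares it with
# PREVIOUS_FILE if that file exists.
snapshot_and_compare() {
    previous=$1
    new=$2
    # Assumption: a failed store is signalled by a non-zero exit status.
    "$CMAP_MATRIX_COMPARE" --store "$new" || return 1
    if [ -f "$previous" ]; then
        "$CMAP_MATRIX_COMPARE" --compare "$previous" --compare "$new"
    fi
}
```

For example, snapshot_and_compare monday.dat tuesday.dat stores tuesday.dat and then compares it with monday.dat when monday.dat is present.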

cmap_examine_attribute.pl

Description

The goal of this script is to help look for a specific attribute that should be on every object of the specified type (map set, feature, etc.). It examines the CMap database and finds all instances where the attribute should be.

It prints the value of the attribute if it exists for the object and a warning if it is missing.

For example, suppose every map set is supposed to have a "Description" attribute. Running

  $ ./cmap_examine_attribute.pl -d DATASOURCE -a "Description" -o "map_set"

would check to make sure all the map sets in the database have a Description attribute and provide a list of those that were missing it.

Usage

  $ ./cmap_examine_attribute.pl -d DATASOURCE -a ATTRIBUTE_NAME -o OBJECT_TYPE
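
When several attributes need to be checked on the same object type, the script can be run once per attribute from a small shell loop. The sketch below is one way to do that. The function name check_attributes and the CMAP_EXAMINE variable are hypothetical, and only the -d, -a and -o options shown above are used.

```shell
# Path to the script; override CMAP_EXAMINE if it lives elsewhere.
CMAP_EXAMINE=${CMAP_EXAMINE:-./cmap_examine_attribute.pl}

# check_attributes DATASOURCE OBJECT_TYPE ATTRIBUTE_NAME...
# Runs the script once for each attribute name given.
check_attributes() {
    datasource=$1
    object_type=$2
    shift 2
    for attribute in "$@"; do
        # Print a header so the output of each run can be told apart.
        echo "== $object_type: $attribute =="
        "$CMAP_EXAMINE" -d "$datasource" -a "$attribute" -o "$object_type"
    done
}
```

For example, check_attributes DATASOURCE map_set "Description" checks the same attribute as the example above; further attribute names can be appended to the argument list.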

DATA PARSING AND IMPORT

Getting data into the CMap database is an important step. These scripts will help get data into a format that can be imported into CMap. Some of them will directly import the data while others will rely on cmap_admin.pl to read the files they create.

They also provide a good jumping-off point if a new or custom parser is required.

cmap_validate_import_file.pl

Description

This script can be used to check an import file to see if it will import correctly. Any problems will be reported.

Usage

  $ cmap_validate_import_file.pl -d DATASOURCE -f IMPORT_FILE

cmap_parseWashUAceFiles.pl

Description

This script will parse an ACE file of super contigs and output a tab-delimited file that is readable by the CMap importer.

The script was written for Washington University and may not be useful to most people out of the box but can be modified to suit your needs. It can also provide some insight into writing a parser for your favorite file format.

It is best to follow this script with cmap_manageParsedAceFile.pl to remove reads that aren't interesting and mark ones that are.

Usage

  $ ./cmap_parseWashUAceFiles.pl ace_file > cmap_import_file

cmap_parsefpc.pl

Description

This script will parse an FPC (fingerprint contig) file and output a tab-delimited file that is readable by the CMap importer.

If an assembly file created by cmap_manageParsedAceFile.pl is provided, the script will read through that file and output (into a separate file) the lines that define clones that share a name with one of the FPC clones. The feature type accession of the assembly clones must be "clone".

Usage

  $ cmap_parsefpc.pl [-a assembly_file] [options] fpc_file > CMAP_IMPORT_FILE

cmap_parseagp.pl

Description

This script will parse an AGP formatted file and output a tab-delimited file that is readable by the CMap importer.

Usage

  $ cmap_parseagp.pl agp_file > CMAP_IMPORT_FILE
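
Because the parser writes a CMap import file, its output can be passed straight to cmap_validate_import_file.pl before anything is loaded. The following sketch chains the two documented command lines. The function name agp_to_import and the two override variables are hypothetical, and the sketch assumes the parser exits with a non-zero status on failure.

```shell
# Script locations; override these if the scripts are not on the PATH.
CMAP_PARSEAGP=${CMAP_PARSEAGP:-cmap_parseagp.pl}
CMAP_VALIDATE_IMPORT=${CMAP_VALIDATE_IMPORT:-cmap_validate_import_file.pl}

# agp_to_import DATASOURCE AGP_FILE IMPORT_FILE
# Parses AGP_FILE into IMPORT_FILE, then validates IMPORT_FILE.
agp_to_import() {
    datasource=$1
    agp_file=$2
    import_file=$3
    # Assumption: the parser signals failure with a non-zero exit status.
    "$CMAP_PARSEAGP" "$agp_file" > "$import_file" || return 1
    "$CMAP_VALIDATE_IMPORT" -d "$datasource" -f "$import_file"
}
```

For example, agp_to_import DATASOURCE agp_file CMAP_IMPORT_FILE writes CMAP_IMPORT_FILE and reports any problems the validator finds in it.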

cmap_manageParsedAceFile.pl

Description

This script reads a CMap import file which was output from cmap_parseWashUAceFiles.pl and modifies the data to be more easily viewed in CMap.

It will create read-depth features, which allow the user to see the number of reads over a window without having to load each read (which would bog down the viewer).

It also finds problem reads such as singleton reads and read pairs that are too far apart.

The output can then be loaded into CMap using cmap_admin.pl.

Usage

  $ ./cmap_manageParsedAceFile.pl options CMAP_IMPORT_FILE > NEW_CMAP_IMPORT_FILE

cmap_import_alignment.pl

Description

This script directly imports alignment data recognized by BioPerl's SearchIO module (it has been tested for BLAST) into a CMap database.

It uses the name field from both the query and subject (as parsed by BioPerl) to determine which CMap map is being referred to. If no map with that name is currently in the specified map set, the map will be created.

The HSPs are created as features with the type defined from the command line. Correspondences between the HSPs are created.

Usage

  $ cmap_import_alignment.pl -d DATASOURCE -f ALIGNMENT_FILE \
    -q QUERY_MAP_SET_ACC -s SUBJECT_MAP_SET_ACC -t FORMAT \
    --fta HSP_FEATURE_TYPE --eta CORRESPONDENCE_EVIDENCE_TYPE

cmap_insert_gnomspace_xml.pl

Description

This script parses an XML file from the GnomSpace program, inserts the fragment data into CMap, and creates correspondences between the fragments as defined in the map_alignment section of the file. It is an excellent example of how to use the CMap API to insert features.

Usage

  $ cmap_insert_gnomspace_xml.pl -f FILE -d DATASOURCE -r REF_MAP_SET_ACC \
    -a ALIGN_MAP_SET_ACC -t FEATURE_TYPE_ACC -e EVIDENCE_TYPE_ACC

DATA MODIFICATION

These scripts modify the data, ideally in a way that makes it more understandable for the user.

cmap_create_stacked_maps.pl

Description

When there are a large number of comparative maps, the CMap view loads slowly and the resulting view can be unusably dense. This script takes maps from a relational map set, groups them based on correspondences to a relational map and stacks them. This creates a smaller number of stacked maps that are made up of the original maps. These stacked maps can then be displayed much more quickly and legibly than before.

It is important to note that this is a non-destructive script. The stacked maps are inserted into a new map set. It is recommended that the original map sets be kept in the database, but if database size is an issue, the original maps can be removed. Be aware that no script exists, or is in development, to reverse the process.

Usage

See the documentation within the script for more details on the required options.

  $ cmap_create_stacked_maps.pl [options]

cmap_fix_map_display_order.pl

Usage

  $ cmap_fix_map_display_order.pl -d DATASOURCE --ms-accs=msacc1[,msacc2...]

DEVELOPMENT

The script in this section is only useful for debugging or benchmarking.

profile-cmap-draw.pl

Usage

  $ sudo profile-cmap-draw.pl -u CMAP_URL