The Genome U-Plot is a JavaScript tool to visualize chromosomal abnormalities in the Human Genome using a U-shape layout.

gaitat/GenomeUPlot

Genome U-Plot sample implementation

The Genome U-Plot is a JavaScript tool to visualize chromosomal abnormalities in the Human Genome using a U-shape layout.

Figure: Whole Genome U-Plot. Visible are the 24 human chromosomes arranged in a U-shape, the cytobands, the chromosome junctions and the copy number variations (CNVs). The axes at the bottom right of the graph apply to the chromosomes on the right side of the plot.

Node

Node.js is an open-source, cross-platform JavaScript runtime environment for developing a diverse variety of server tools and applications.

This project relies on Node throughout, so it must be installed. Please visit the download page for macOS or Windows binaries, or the package manager installations page for Linux distributions.

In this project we used Node.js v6.10.0 LTS.

Node Version Management Tools

If you need the flexibility to use multiple versions of Node, check out NVM or Windows NVM.

NPM

NPM is the default package manager for Node and is installed automatically alongside it. Package managers are used to install and manage packages (modules of code that you or someone else wrote). This project depends on many packages, but it manages them with Yarn, another package manager.

Yarn

Yarn is a Node.js package manager that aims to be faster than NPM, supports offline installs, and fetches dependencies more predictably.

To install yarn

Use NPM and run:

> $ npm install --global yarn

To install the project dependencies

Start a command shell, change to the project directory and install the project dependencies using:

> $ yarn install

To run the project

Use:

> $ yarn start

Using a modern browser visit:

http://localhost:8000/GenomePlot.html?sampleId=LNCAP

Data Visualization

A sample (LNCAP) with all required files is provided in the public/data directory:

LNCAP/LNCAP_alts_comprehensive.csv  (Sample Rearrangements)
LNCAP/LNCAP_cnvIntervals.csv        (Sample Copy Number Variation - Intervals)
LNCAP/LNCAP_genomePlot_cnv30.json   (Sample Copy Number Variation - Raw Frequency)
LNCAP/LNCAP_visualization.json      (Sample Definition)

To run the application against a different sample (e.g. MY_SAMPLE), create the same directory and file structure with LNCAP replaced by MY_SAMPLE in the directory name and in every file name. Finally, set the sampleId URL parameter of the app to your sample name.
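As a quick check of the naming convention, the sketch below lists the files and URL that a sample named MY_SAMPLE would use. The helper name `sampleLayout` is hypothetical and not part of the project; the file names simply mirror the LNCAP example, and the host and port are those shown in the run instructions.

```javascript
// Hypothetical helper: derives the expected file paths (relative to public/data)
// and the viewer URL for a sample, following the LNCAP naming pattern.
function sampleLayout(sampleId, base = "http://localhost:8000/GenomePlot.html") {
    const files = [
        `${sampleId}/${sampleId}_alts_comprehensive.csv`,
        `${sampleId}/${sampleId}_cnvIntervals.csv`,
        `${sampleId}/${sampleId}_genomePlot_cnv30.json`,
        `${sampleId}/${sampleId}_visualization.json`
    ];
    const url = new URL(base);
    url.searchParams.set("sampleId", sampleId); // same parameter as in the LNCAP URL
    return { files, url: url.toString() };
}

const layout = sampleLayout("MY_SAMPLE");
console.log(layout.files.join("\n"));
console.log(layout.url);
```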

Reference file

  • A Human Genome Assembly GRCh38 cytobands reference file is provided with the visualization (public/reference/cytobands/hg38/cytoBand.json). If you want to use your own, download and uncompress a definition file from ftp://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/cytoBand.txt.gz, then convert it to JSON of the following form:
[
    {
        "chrom": "chr1",
        "chromStart": 0,
        "chromEnd": 2300000,
        "gieStain": "gneg",
        "name": "p36.33"
    }, {
        "chrom": "chr1",
        "chromStart": 2300000,
        "chromEnd": 5300000,
        "gieStain": "gpos25",
        "name": "p36.32"
    },
    ...
]
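The conversion can be sketched in a few lines of JavaScript. This is a minimal example, not the script used by the authors: it assumes the UCSC file has five tab-separated columns per line in the order chrom, chromStart, chromEnd, name, gieStain, and the function name `cytoBandTextToJson` is hypothetical. Whether the viewer expects non-chromosome contigs to be filtered out is not stated here, so the sketch keeps every line.

```javascript
// Convert the text of an uncompressed cytoBand.txt into the array-of-objects
// layout shown above. Assumed column order: chrom, chromStart, chromEnd, name, gieStain.
function cytoBandTextToJson(text) {
    return text
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "") // skip blank lines
        .map((line) => {
            const [chrom, chromStart, chromEnd, name, gieStain] = line.split("\t");
            return {
                chrom,
                chromStart: Number(chromStart),
                chromEnd: Number(chromEnd),
                gieStain,
                name
            };
        });
}

// Two lines in the cytoBand.txt layout, matching the entries shown above.
const sampleText =
    "chr1\t0\t2300000\tp36.33\tgneg\n" +
    "chr1\t2300000\t5300000\tp36.32\tgpos25\n";

const bands = cytoBandTextToJson(sampleText);
console.log(JSON.stringify(bands, null, 4));
```

To convert a real file under Node, read it with `fs.readFileSync(path, "utf8")`, pass the text to the function, and write the `JSON.stringify` result to cytoBand.json.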

Sample Definition

A sample-specific JSON file must be provided (as in LNCAP/LNCAP_visualization.json):

{
    "fileFormatVersion": 1,
    "altsComprehensive": "sampleId_alts_comprehensive.csv",
    "cnvBinned30KJson": "sampleId_genomePlot_cnv30.json",
    "cnvIntervals": "sampleId_cnvIntervals.csv"
}
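A definition like this can be generated rather than written by hand. The sketch below assumes that `sampleId` in the file names stands for the actual sample name, as it does in the LNCAP example; the function name `buildSampleDefinition` is hypothetical.

```javascript
// Hypothetical helper: builds the sample definition object for a given sample name,
// using the same keys and file-name pattern as the example above.
function buildSampleDefinition(sampleId) {
    return {
        fileFormatVersion: 1,
        altsComprehensive: `${sampleId}_alts_comprehensive.csv`,
        cnvBinned30KJson: `${sampleId}_genomePlot_cnv30.json`,
        cnvIntervals: `${sampleId}_cnvIntervals.csv`
    };
}

const definition = buildSampleDefinition("MY_SAMPLE");
// JSON.stringify quotes the keys, so the output is valid JSON.
const definitionJson = JSON.stringify(definition, null, 4);
console.log(definitionJson);
```

The printed text would be saved as MY_SAMPLE/MY_SAMPLE_visualization.json inside the public/data directory.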

Sample Rearrangements

To visualize chromosomal rearrangements, a CSV file is required (as in LNCAP/LNCAP_alts_comprehensive.csv) that supplies the following integer columns:

Nassoc,chrA,chrB,posA,posB

where Nassoc is the number (integer) of fragments supporting each event.
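Before loading a new sample, it can help to check that the rearrangements file has the required columns and that they hold integers. The following is a minimal sketch, not the application's own loader: the function name `parseRearrangements` is hypothetical, the data row is made up for illustration, and the parser splits on plain commas without handling quoted fields.

```javascript
const REQUIRED_COLUMNS = ["Nassoc", "chrA", "chrB", "posA", "posB"];

// Parse rearrangement CSV text and verify that each required column exists
// and contains an integer in every row. Extra columns are ignored.
function parseRearrangements(csvText) {
    const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
    const header = lines[0].split(",").map((name) => name.trim());
    const index = {};
    for (const column of REQUIRED_COLUMNS) {
        index[column] = header.indexOf(column);
        if (index[column] === -1) {
            throw new Error(`Missing column: ${column}`);
        }
    }
    return lines.slice(1).map((line, i) => {
        const cells = line.split(",");
        const row = {};
        for (const column of REQUIRED_COLUMNS) {
            const cell = (cells[index[column]] || "").trim();
            if (cell === "" || !Number.isInteger(Number(cell))) {
                throw new Error(`Row ${i + 1}: ${column} is not an integer`);
            }
            row[column] = Number(cell);
        }
        return row;
    });
}

// Illustrative values only, not taken from the LNCAP sample.
const events = parseRearrangements("Nassoc,chrA,chrB,posA,posB\n12,1,5,1000,2000\n");
console.log(events);
```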

Sample Copy Number Variation

In order to visualize copy number, two files of a specific format must be supplied. First, a file (as in LNCAP/LNCAP_genomePlot_cnv30.json) with the raw frequency data from a 30000 bin moving window.

The second file contains the copy number state information; a csv file (as in LNCAP/LNCAP_cnvIntervals.csv) with the following columns must be supplied:

chr,start,end,cnvState,nrd

where cnvState is one of 1 (loss), 2 (normal) or 3 (gain) and nrd is a floating point value corresponding to the Normalized Read Depth score that provides a quantitative measure of how far the CNV deviates from the calculated normal level (nrd = 2.0).
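The cnvState encoding can be made explicit when reading the intervals file. The sketch below is an illustration under stated assumptions rather than the application's loader: the function name `parseCnvIntervals` is hypothetical, the rows are made-up values, chr is kept as a string because its type is not specified here, and fields are split on plain commas.

```javascript
// Meaning of the cnvState codes, as described above.
const CNV_STATE_LABELS = { 1: "loss", 2: "normal", 3: "gain" };

// Parse CNV interval CSV text with the columns chr,start,end,cnvState,nrd
// and attach a human-readable label to each interval.
function parseCnvIntervals(csvText) {
    const [headerLine, ...rows] = csvText
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "");
    const header = headerLine.split(",").map((name) => name.trim());
    return rows.map((line, i) => {
        const cells = line.split(",");
        const record = {};
        header.forEach((name, col) => {
            record[name] = (cells[col] || "").trim();
        });
        const cnvState = Number(record.cnvState);
        const label = CNV_STATE_LABELS[cnvState];
        if (label === undefined) {
            throw new Error(`Row ${i + 1}: cnvState must be 1, 2 or 3`);
        }
        return {
            chr: record.chr,
            start: Number(record.start),
            end: Number(record.end),
            cnvState,
            label,
            nrd: Number(record.nrd) // 2.0 corresponds to the normal level
        };
    });
}

// Illustrative values only, not taken from the LNCAP sample.
const intervals = parseCnvIntervals(
    "chr,start,end,cnvState,nrd\n1,0,30000,3,2.9\n1,30000,60000,1,1.1\n"
);
console.log(intervals);
```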

Variant Call Format (VCF) file Support

To run the application against a sample stored in a VCF file, we provide an R script, vcftoUplot.R, which resides in the public/data directory. The script was tested with R-3.3.3 and requires the R package VariantAnnotation, which is installed automatically if not present. The script takes a VCF file as input (tested with VCF v4.1 and v4.2) and produces the file structure required by the Genome U-Plot to visualize the sample. Finally, set the sampleId URL parameter of the app to your sample name.

To run vcftoUplot.R

Given a VCF sample file NA12878.vcf (provided in the public/data directory), run

Rscript vcftoUplot.R NA12878.vcf

This will produce the following directory hierarchy

NA12878/
├── NA12878_alts_comprehensive.csv
└── NA12878_visualization.json

Then, using a modern browser visit:

http://localhost:8000/GenomePlot.html?sampleId=NA12878

Note: For this particular example you should use the "Filter on # of Frags" GUI option to reduce the number of visualized chromosomal abnormalities. You can also uncheck "Line width to # Frags" to decouple the line thickness from the number of fragments supporting the event.

Note II: The Human Genome Assembly GRCh38 is assumed.

Commercial use

If you want to use Genome U-Plot in commercial settings, please contact us.

How to cite

Gaitatzes AG, Johnson SH, Smadbeck JB and Vasmatzis G.; Genome U-Plot: a whole genome visualization. Bioinformatics 2017 Dec 21. https://doi.org/10.1093/bioinformatics/btx829
