Calling card

From wubrowse wiki
  • The calling card track is a new track type invented by Rob Mitra's lab. It has been integrated into the browser since version 46.

Format the Data

Calling card data must be stored in a tab-delimited, plain text format. This format requires a minimum of four columns and can support up to six. The four required columns are CHROM, START, STOP, and COUNT, where COUNT refers to the number of reads for that insertion. The START and STOP columns can be either 0- or 1-indexed. The fifth and sixth columns are optional and represent STRAND and BARCODE, respectively. Here is an example of a four-column calling card file:

chr1    41954321        41954325        1
chr1    41954321        41954325        18
chr1    52655214        52655218        1
chr1    52655214        52655218        1
chr1    54690384        54690388        3
chr1    54713998        54714002        1
chr1    54713998        54714002        1
chr1    54713998        54714002        13
chr1    54747055        54747059        1
chr1    54747055        54747059        4
chr1    60748489        60748493        2

Here is an example of a six-column calling card file:

chr1    51441754        51441758        1       -       CTAGAGACTGGC
chr1    51441754        51441758        21      -       CTTTCCTCCCCA
chr1    51982564        51982568        3       +       CGCGATCGCGAC
chr1    52196476        52196480        1       +       AGAATATCTTCA
chr1    52341019        52341023        1       +       TACGAAACACTA
chr1    59951043        59951047        1       +       ACAAGACCCCAA
chr1    59951043        59951047        1       +       ACAAGAGAGACT
chr1    61106283        61106287        1       -       ATGCACTACTTC
chr1    61106283        61106287        7       -       CGTTTTTCACCT
chr1    61542006        61542010        1       -       CTGAGAGACTGG
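The column layout described above can be checked before uploading. The following is a minimal Python sketch, not part of the browser itself: the names Insertion and parse_ccf_line are hypothetical, and the restriction of STRAND to "+" or "-" is an assumption based only on the examples shown here.

```python
from typing import NamedTuple, Optional


class Insertion(NamedTuple):
    """One calling card record (hypothetical helper type)."""
    chrom: str
    start: int
    stop: int
    count: int
    strand: Optional[str] = None   # optional fifth column
    barcode: Optional[str] = None  # optional sixth column


def parse_ccf_line(line: str) -> Insertion:
    """Parse one tab-delimited record with four to six columns."""
    fields = line.rstrip("\n").split("\t")
    if not 4 <= len(fields) <= 6:
        raise ValueError(f"expected 4 to 6 columns, got {len(fields)}: {line!r}")

    chrom = fields[0]
    # int() raises ValueError if START, STOP, or COUNT is not a whole number.
    start, stop, count = (int(value) for value in fields[1:4])
    if stop < start:
        raise ValueError(f"STOP is smaller than START: {line!r}")

    # A fifth column is always STRAND; BARCODE can only appear as a sixth column.
    strand = fields[4] if len(fields) >= 5 else None
    if strand is not None and strand not in ("+", "-"):
        # Assumption: only "+" and "-" are valid, as in the examples above.
        raise ValueError(f"unexpected strand value {strand!r}: {line!r}")
    barcode = fields[5] if len(fields) == 6 else None

    return Insertion(chrom, start, stop, count, strand, barcode)
```

Calling parse_ccf_line on every line of a file and stopping at the first ValueError is one simple way to catch formatting problems early.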

Your text file must be sorted by the first three columns. If your filename is example.ccf, sort it with the following command:

sort -k1,1V -k2,2n -k3,3n example.ccf > example_sorted.ccf
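For readers who prefer to sort within a script, the same ordering can be approximated in Python. This is a sketch under assumptions: natural_key and sort_ccf_lines are hypothetical names, and natural_key only approximates the version ordering that sort applies to the chromosome column, so unusual contig names may be ordered differently.

```python
import re
from typing import Iterable, List, Tuple


def natural_key(chrom: str) -> List[Tuple[int, object]]:
    """Split a name into digit and non-digit runs so chr2 sorts before chr10."""
    parts = re.findall(r"\d+|\D+", chrom)
    # Tag each run so numbers and text are never compared with each other.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


def sort_ccf_lines(lines: Iterable[str]) -> List[str]:
    """Sort records by CHROM (natural order), then START, then STOP."""
    def key(line: str):
        fields = line.rstrip("\n").split("\t")
        return (natural_key(fields[0]), int(fields[1]), int(fields[2]))

    return sorted(lines, key=key)
```

This loads every line into memory, so for very large files the sort command remains the more practical choice.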

Note that you can have strand information without a barcode, but you cannot have barcode information without a strand column.

Place your sorted text file in the public folder. Since genomic data is often large, we must compress and index it for fast retrieval. Use the following commands to do so:

bgzip example_sorted.ccf
tabix -p bed example_sorted.ccf.gz

The tabix command above is for 1-indexed coordinates. If your data is 0-indexed, replace it with tabix -0 -p bed example_sorted.ccf.gz.
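When many files need the same treatment, the two steps can be wrapped in a small script. The sketch below only assembles the bgzip and tabix command lines described above and passes them to subprocess; index_commands and compress_and_index are hypothetical names, and it assumes both tools are installed and on the PATH.

```python
import subprocess
from typing import List


def index_commands(sorted_path: str, zero_based: bool = False) -> List[List[str]]:
    """Build the bgzip and tabix command lines for one sorted file."""
    # bgzip replaces the input file with <input>.gz by default.
    compressed = sorted_path + ".gz"
    tabix = ["tabix"]
    if zero_based:
        # Following the text above: add -0 for 0-indexed coordinates.
        tabix.append("-0")
    tabix += ["-p", "bed", compressed]
    return [["bgzip", sorted_path], tabix]


def compress_and_index(sorted_path: str, zero_based: bool = False) -> None:
    """Run both commands in order, stopping if either one fails."""
    for command in index_commands(sorted_path, zero_based):
        subprocess.run(command, check=True)
```

For example, compress_and_index("example_sorted.ccf") should leave example_sorted.ccf.gz and example_sorted.ccf.gz.tbi next to each other in the same folder.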

Upload the Data

A JSON file is needed to upload data and create tracks on the browser. Here is the structure of a simple, two-track JSON file:

[
    /* this is a comment */
    {
        "type":"bedgraph",
        "url":"https://htcf.wustl.edu/files/vdY5b0dP/test2.bedgraph.gz",
        "name":"test2",
        "mode":"show",
        "colorpositive":"#ff33cc",
        "height":50
    },
    {
        "type":"callingcard",
        "url":"https://htcf.wustl.edu/files/vdY5b0dP/test5.cc.gz",
        "name":"N2A Brd4",
        "mode":"show",
        "colorpositive":"#ff33cc",
        "height":50
    }
]

The “url” field should specify the bgzipped file (ending in .gz), not the tabix index file (ending in .gz.tbi). Note that all strings must be in quotation marks, while numerical values need not be. Here, the colorpositive and height fields are optional. To add more tracks, enclose each one in braces within the brackets and separate the braces with commas. More information about supported file types and how to format them can be found at the WashU EpiGenome Browser wiki.
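Writing the JSON by hand is error-prone once there are more than a few tracks, so it can help to generate it. The following Python sketch builds the same structure with the standard json module. The helper names calling_card_track and datahub_json are hypothetical, and only the fields shown in the example above are used.

```python
import json
from typing import Any, Dict, List, Optional


def calling_card_track(
    url: str,
    name: str,
    color: Optional[str] = None,
    height: Optional[int] = None,
) -> Dict[str, Any]:
    """Describe one calling card track using the fields from the example above."""
    if url.endswith(".tbi"):
        raise ValueError("url must point to the .gz file, not the .gz.tbi index")
    track: Dict[str, Any] = {
        "type": "callingcard",
        "url": url,
        "name": name,
        "mode": "show",
    }
    # colorpositive and height are optional, so add them only when given.
    if color is not None:
        track["colorpositive"] = color
    if height is not None:
        track["height"] = height
    return track


def datahub_json(tracks: List[Dict[str, Any]]) -> str:
    """Serialize the track list; json.dumps handles quoting and commas."""
    return json.dumps(tracks, indent=4)


if __name__ == "__main__":
    tracks = [
        calling_card_track(
            "https://htcf.wustl.edu/files/vdY5b0dP/test5.cc.gz",
            "N2A Brd4",
            color="#ff33cc",
            height=50,
        )
    ]
    print(datahub_json(tracks))
```

The printed text can be saved to a file such as datahub.json (a hypothetical filename) for use with the browser.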

Save the JSON file on your local computer; it is not necessary to save this file to the HTCF cluster. To upload data, open the appropriate reference genome in the browser, click on “Tracks”, then “Custom Tracks”, then “Add new tracks”. Click on the “Datahub by upload” button, then select your JSON file. If everything works perfectly, your data should now be visible on the browser!