From wubrowse wiki

A categorical matrix is a map, or two-dimensional matrix, laid out along the genome. Each column of the matrix is a region of a chromosome, and the rows are stacked on top of one another over that region. Each cell holds a single integer, which is its category id.

The categorical matrix track is rendered as a 2-D colormap, with the color of each cell defined by its category.


Data format

Each line of the track file has four columns:

  1. chromosome name
  2. start, 0-based
  3. stop
  4. layers:[an array of category ids]

An example of the 4th column:

layers:[15,15,15,15,15,15,15,15,15,15,15,15,...more data omitted...]

The length of the array in the 4th column equals the number of rows in the matrix. All lines in a track file must have arrays of equal length.

The matrix row count is set by the "rowcount" attribute in the datahub track definition.
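
The equal-length rule can be checked before the file is compressed and indexed. The following is a minimal sketch, not part of the browser's own tooling: "check_layers" is a hypothetical helper name, and the script assumes that the four columns are tab-separated and that the 4th column has exactly the form layers:[id,id,...] with no spaces. On success it prints the common array length, which is the number to use for "rowcount".

```shell
# check_layers FILE
# Hypothetical helper: verifies that every line has 4 tab-separated columns
# and that every "layers" array holds the same number of category ids.
check_layers() {
  awk -F'\t' '
    BEGIN { bad = 0; expected = -1 }
    NF != 4 {
      printf "line %d: expected 4 columns, found %d\n", NR, NF
      bad = 1
      next
    }
    $4 !~ /^layers:\[[0-9]+(,[0-9]+)*\]$/ {
      printf "line %d: 4th column is not layers:[id,id,...]\n", NR
      bad = 1
      next
    }
    {
      # Count the category ids between the square brackets.
      body = $4
      sub(/^layers:\[/, "", body)
      sub(/\]$/, "", body)
      n = split(body, ids, ",")
      if (expected < 0) {
        expected = n            # first valid line sets the reference length
      } else if (n != expected) {
        printf "line %d: %d ids, expected %d\n", NR, n, expected
        bad = 1
      }
    }
    END {
      if (expected < 0) { print "no valid data lines found"; bad = 1 }
      if (!bad) printf "ok: %d rows per line\n", expected
      exit bad
    }
  ' "$1"
}
```

Calling check_layers old_file returns a non-zero exit status if any line breaks the assumed format or has a different array length.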

Track definition

A categorical matrix track can only be submitted through a datahub. An example track definition:

name:'catmat test',
          '1':['Active TSS','#ff0000'],
          '2':['Flanking Active TSS','#ff4500'],
          '3':['Transcr at gene 5\' and 3\'','#32cd32'],
          '4':['Strong transcription','#008000'],
          '5':['Weak transcription','#006400'],
          '6':['Genic enhancers','#c2e105'],
          '8':['ZNF genes & repeats','#66cdaa'],
          '10':['Bivalent/Poised TSS','#cd5c5c'],
          '11':['Flanking Bivalent TSS/Enh','#e9967a'],
          '12':['Bivalent Enhancer','#bdb76b'],
          '13':['Repressed PolyComb','#808080'],
          '14':['Weak Repressed PolyComb','#c0c0c0'],
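
When writing the category entries, it can help to know which category ids actually occur in the track file. The following is a minimal sketch under stated assumptions: "list_category_ids" is a hypothetical helper name, the columns are assumed to be tab-separated, and the 4th column is assumed to have the form layers:[id,id,...]. Whether every id in the file must have an entry in the definition is not stated here, so treat the output as a checklist rather than a requirement.

```shell
# list_category_ids FILE
# Hypothetical helper: prints each distinct category id found in the
# 4th column, one per line, in ascending numeric order.
list_category_ids() {
  cut -f4 "$1" |
    sed -e 's/^layers:\[//' -e 's/\]$//' |  # strip the "layers:[" prefix and "]" suffix
    tr ',' '\n' |                            # one id per line
    sort -n -u                               # numeric sort, duplicates removed
}
```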

Prepare track file

Take the following steps to prepare a categorical matrix track file:

1. sort the track file by chromosome name and start position using the "sort" command:

sort -k1,1 -k2,2n old_file > new_file

2. compress the sorted file with bgzip:

bgzip new_file

3. index the compressed file with tabix:

tabix -p bed new_file.gz

4. put the files "new_file.gz" and "new_file.gz.tbi" on a web server. The two files must be in the same directory. Obtain the URL of "new_file.gz" for use in the datahub.
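
Steps 1 to 3 can be chained so that a failure in one step stops the rest. The following is a minimal sketch, assuming bgzip and tabix are installed and on the PATH; "sort_track" and "prepare_track" are hypothetical helper names. Unlike step 2 above, it uses "bgzip -c", which writes the compressed data to standard output and so leaves the uncompressed sorted file in place.

```shell
# sort_track IN OUT
# Step 1: sort by chromosome name, then numerically by start position.
sort_track() {
  sort -k1,1 -k2,2n "$1" > "$2"
}

# prepare_track IN OUT
# Hypothetical wrapper around steps 1-3. Produces OUT.gz and OUT.gz.tbi
# in the same directory as OUT.
prepare_track() {
  sort_track "$1" "$2" &&
    bgzip -c "$2" > "$2.gz" &&   # step 2: compress, keeping OUT as well
    tabix -p bed "$2.gz"         # step 3: index, writes OUT.gz.tbi
}
```

For example, prepare_track old_file new_file leaves "new_file.gz" and "new_file.gz.tbi" ready to be copied to the web server in step 4.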