Make gene track

Last modified: Feb 19, 2014

Making gene track by parsing UCSC Genome Browser track files
You can apply these methods to build your own custom gene track: prepare your gene data in the formats described by "refGene.sql" and "refLink.sql", then run the ucscsimplegene.py script to generate the track.
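To make the expected input format concrete, here is a minimal sketch that splits one tab-delimited refGene.txt row into named fields. The column order is the one UCSC publishes for its refGene table as best I recall it, so check it against "refGene.sql" before relying on it. The helper name parse_refgene_line and the sample row are hypothetical and are not part of ucscsimplegene.py.

```python
# Column names of UCSC's refGene table (genePredExt layout with a leading
# "bin" column). Assumption: verify against refGene.sql for your assembly.
REFGENE_COLUMNS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd",
    "exonCount", "exonStarts", "exonEnds", "score", "name2",
    "cdsStartStat", "cdsEndStat", "exonFrames",
]


def parse_refgene_line(line):
    """Split one tab-delimited refGene.txt row into a dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(REFGENE_COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(REFGENE_COLUMNS), len(fields))
        )
    row = dict(zip(REFGENE_COLUMNS, fields))
    # Coordinates and counts are stored as integers.
    for key in ("txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount"):
        row[key] = int(row[key])
    # exonStarts/exonEnds are comma-separated lists that end with a comma.
    for key in ("exonStarts", "exonEnds"):
        row[key] = [int(x) for x in row[key].split(",") if x]
    return row


# Made-up row for illustration only; not real annotation data.
example = "\t".join([
    "585", "NM_000001", "chr2L", "+", "1000", "5000", "1200", "4800",
    "2", "1000,3000,", "2000,5000,", "0", "geneA", "cmpl", "cmpl", "0,1,",
])
row = parse_refgene_line(example)
print(row["name"], row["chrom"], row["txStart"], row["txEnd"], row["exonStarts"])
```

A custom data file that passes through such a parser without raising has at least the right number of columns and well-formed coordinates.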


RefGene and xenoRefGene



Take the fruit fly as an example:

$ cd ~/data/dm3/gene
$ wget http://hgdownload.soe.ucsc.edu/goldenPath/dm3/database/refLink.txt.gz
$ wget http://hgdownload.soe.ucsc.edu/goldenPath/dm3/database/refGene.txt.gz
$ wget http://hgdownload.soe.ucsc.edu/goldenPath/dm3/database/xenoRefGene.txt.gz
$ gunzip *.gz

$ python ~/subtleKnife/script/genescript/ucscsimplegene.py refGene.txt refGene > load.sql

$ python ~/subtleKnife/script/genescript/ucscsimplegene.py xenoRefGene.txt xenoRefGene >> load.sql
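The download step above can be adapted to another assembly by swapping the assembly name in the URL. The sketch below builds the same three URLs and shows one way to fetch and decompress a table with the Python standard library. The function names table_urls and fetch_and_unpack are hypothetical, and the sketch assumes the assembly you pick actually provides these tables on the UCSC download server.

```python
import gzip
import os
import shutil
import urllib.request

UCSC_BASE = "http://hgdownload.soe.ucsc.edu/goldenPath"


def table_urls(assembly, tables=("refLink", "refGene", "xenoRefGene")):
    """Return the .txt.gz URLs for the given assembly (hypothetical helper)."""
    return ["%s/%s/database/%s.txt.gz" % (UCSC_BASE, assembly, t) for t in tables]


def fetch_and_unpack(url, dest_dir="."):
    """Download one .txt.gz table and write it decompressed into dest_dir."""
    out_name = os.path.basename(url)[: -len(".gz")]
    out_path = os.path.join(dest_dir, out_name)
    with urllib.request.urlopen(url) as resp, \
            gzip.GzipFile(fileobj=resp) as gz, \
            open(out_path, "wb") as fh:
        shutil.copyfileobj(gz, fh)
    return out_path


# Only print the URLs here; call fetch_and_unpack(url) to actually download.
for url in table_urls("dm3"):
    print(url)
```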



### If you are preparing a custom track, you are done here.
### The following steps are only for building a native gene track.


$ cat load.sql |mysql -uroot -pxzhou dm3


$ mv *gz *tbi /srv/epgg/data/data/subtleKnife/dm3/

$ cd ~/subtleKnife/config/dm3

# Add these two lines to the "decorInfo" file
refGene RefSeq genes \N 2 24 0 http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&doptcmdl=GenBank&term=
xenoRefGene non-fly RefSeq genes \N 2 24 0 http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&doptcmdl=GenBank&term=

# Add these two lines to the "track2Details" file
refGene download date=Nov. 28, 2013; source=http://hgdownload.soe.ucsc.edu/goldenPath/dm3/database/; note=this is a copy of RefSeq Genes track of UCSC Genome Browser
xenoRefGene download date=Nov. 28, 2013; source=http://hgdownload.soe.ucsc.edu/goldenPath/dm3/database/; note=this is a copy of non-fly RefSeq Genes track of UCSC Genome Browser

# Add these two lines to the "track2Style" file
refGene 'textcolor':'rgb(0,0,0)','fontsize':'8pt','fontfamily':'sans-serif','fontbold':false,'bedcolor':'rgb(0,77,153)',dbsearch:true
xenoRefGene 'textcolor':'rgb(0,0,0)','fontsize':'8pt','fontfamily':'sans-serif','fontbold':false,'bedcolor':'rgb(0,77,77)',dbsearch:true

$ cat makeDb.sql |mysql -uroot -pxzhou dm3



Ensembl Genes

mm9 ensGene build log
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/ensGene.txt.gz
gunzip ensGene.txt.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/knownToEnsembl.txt.gz
gunzip knownToEnsembl.txt.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/kgXref.txt.gz
gunzip kgXref.txt.gz

python ~/subtleKnife/script/ucsc/ensGene.py ensGene.txt ensGene
mv ensGene.gz* /srv/epgg/data/data/subtleKnife/mm9/
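The log does not say how knownToEnsembl.txt and kgXref.txt are used. One plausible reading is that they let Ensembl transcript IDs be labelled with gene symbols. The sketch below shows that join under stated assumptions: knownToEnsembl has the UCSC known gene ID in column 1 and the Ensembl transcript ID in column 2, and kgXref has the known gene ID in column 1 and the gene symbol in column 5. Check both against the matching .sql files. The function name and the sample rows are hypothetical, and this is not necessarily what ensGene.py does.

```python
def symbols_by_ensembl_id(known_to_ensembl_lines, kgxref_lines):
    """Map Ensembl transcript ID -> gene symbol via the UCSC known gene ID."""
    kg_to_symbol = {}
    for line in kgxref_lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: column 1 = kgID, column 5 = geneSymbol.
        kg_to_symbol[fields[0]] = fields[4]
    result = {}
    for line in known_to_ensembl_lines:
        # Assumed layout: column 1 = kgID, column 2 = Ensembl transcript ID.
        kg_id, ens_id = line.rstrip("\n").split("\t")[:2]
        symbol = kg_to_symbol.get(kg_id)
        if symbol:
            result[ens_id] = symbol
    return result


# Made-up rows for illustration only.
known_to_ensembl = ["uc000aaa.1\tENSMUST00000000001\n",
                    "uc000bbb.1\tENSMUST00000000002\n"]
kgxref = ["uc000aaa.1\tNM_000001\tP00001\tP1_MOUSE\tGeneA\tNM_000001\tNP_000001\tdesc\n"]
mapping = symbols_by_ensembl_id(known_to_ensembl, kgxref)
print(mapping)
```

Transcripts without a matching kgXref row are left out of the mapping rather than given an empty symbol.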

## The following steps are only for building a native gene track (and are expected to become obsolete soon)

mysql -uroot -pxzhou mm9

drop table if exists ensGene;
create table ensGene (
chrom varchar(20) not null,
start int unsigned not null,
stop int unsigned not null,
name varchar(100) not null
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
load data local infile 'ensGene_load' into table ensGene;
create index name on ensGene (name);
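The four-column table above expects a load file with one gene per line. As a rough illustration of that shape, the sketch below turns an ensGene.txt row into a chrom, start, stop, name line. It assumes ensGene.txt follows the same genePred layout as refGene.txt (bin, name, chrom, strand, txStart, txEnd, ...) and that 'ensGene_load' is tab-delimited, which is the default field separator for MySQL's load data. How ensGene.py really writes this file, and whether name holds the transcript ID or a gene symbol, is not stated in this log. The function name to_load_row and the sample row are hypothetical.

```python
def to_load_row(genepred_line):
    """Reduce one genePred-style row to 'chrom<TAB>start<TAB>stop<TAB>name'."""
    fields = genepred_line.rstrip("\n").split("\t")
    # Assumed positions: 1 = name, 2 = chrom, 4 = txStart, 5 = txEnd (0-based).
    name, chrom, tx_start, tx_end = fields[1], fields[2], fields[4], fields[5]
    # Fail early on malformed coordinates instead of loading bad rows.
    if int(tx_start) > int(tx_end):
        raise ValueError("start is greater than stop for %s" % name)
    return "\t".join([chrom, tx_start, tx_end, name])


# Made-up row for illustration only.
sample = "\t".join([
    "585", "ENSMUST00000000001", "chr1", "+", "3000", "9000", "3100", "8900",
    "1", "3000,", "9000,", "0", "ENSMUSG00000000001", "cmpl", "cmpl", "0,",
])
print(to_load_row(sample))
```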

Add to "decorInfo":
ensGene Ensembl genes \N 2 24 0 http://dec2011.archive.ensembl.org/Mus_musculus/geneview?gene=

Add to "track2Detail":
ensGene download_date=Feb. 19, 2014; source=http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/; note=this is a copy of Ensembl gene track of UCSC Genome Browser

Add to "track2Style":
ensGene 'textcolor':'rgb(0,0,0)','fontsize':'8pt','fontfamily':'sans-serif','fontbold':false,'bedcolor':'rgb(143,71,36)',dbsearch:true

cat makeDb.sql |mysql -uroot -pxzhou mm9


mm10 ensGene build log
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/ensGene.txt.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/knownToEnsembl.txt.gz
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/kgXref.txt.gz
gunzip *.gz
python ~/subtleKnife/script/ucsc/ensGene.py ensGene.txt ensGene
mv ensGene.gz* /srv/epgg/data/data/subtleKnife/mm10/
mysql -uroot -pxzhou mm10
... load sql
ensGene Ensembl genes \N 2 24 0 http://www.ensembl.org/Mus_musculus/geneview?gene=