4. Appendix
4.1. Data
The data directory contains several types of data:
Genometable files for several species. To generate a genometable file for another species or genome build, use makegenometable.pl, found in the scripts directory:
$ makegenometable.pl genome.fa > genometable.txt
Ideogram files for visualizing chromosomes with the GV command.
Mptable files that describe the mappable bases of each chromosome.
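For orientation, the following is a minimal sketch of the kind of table a genometable holds, assuming it is a two-column, tab-delimited list of chromosome name and length. The function name genometable_sketch and the toy FASTA are hypothetical and for illustration only; makegenometable.pl remains the tool to use for real genomes.

```shell
# Hypothetical toy FASTA in a temporary file (illustration only).
toy_fa=$(mktemp)
printf '>chr1\nACGTAC\nGT\n>chr2\nACG\n' > "$toy_fa"

# Sum the sequence-line lengths of each FASTA record and print
# "name<TAB>length". Assumes Unix line endings; the record name is
# the header text up to the first whitespace.
genometable_sketch() {
  awk '
    /^>/ { if (name != "") print name "\t" len
           name = substr($1, 2); len = 0; next }
         { len += length($0) }
    END  { if (name != "") print name "\t" len }
  ' "$1"
}

genometable_sketch "$toy_fa"
```

Comparing such output against a generated genometable.txt is a quick way to confirm that the chromosome names match those used in the mapping files.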
4.2. Mappability files
DROMPAplus accepts mappability files generated by the scripts provided with MOSAiCS, which are based on code from PeakSeq. The binary mappability files (chr*_map_binary.txt) can be generated by running the MOSAiCS scripts with the appropriate fragment length and bin size (here, 150 bp and 100 bp, respectively).
Next, gzip the files and rename them for DROMPAplus:
for i in $(seq 1 22) X Y M; do
    gzip chr${i}_map_binary.txt
    mv chr${i}_map_binary.txt.gz map_chr${i}_binary.txt.gz
done
After this, make the “mappability table”, a tab-delimited file describing the number of mappable bases for each chromosome, using makemappabilitytable.pl, found in the otherbins directory:
$ makemappabilitytable.pl genometable.txt map > mptable.txt
where the second argument “map” is the prefix of the binary mappability files.
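Before building the table, it can help to confirm that every chromosome has a renamed mappability file. The sketch below assumes that the first column of genometable.txt holds chromosome names such as chr1 and that files follow the map_<chromosome>_binary.txt.gz pattern produced by the loop above. check_mappability is a hypothetical helper, not part of DROMPAplus.

```shell
# List chromosomes from a genometable that lack a mappability file.
# Usage: check_mappability genometable.txt map
# Returns 0 if all files exist, 1 otherwise.
check_mappability() {
  gt_file=$1
  prefix=$2
  status=0
  while read -r chr rest; do
    [ -n "$chr" ] || continue          # skip empty lines
    f="${prefix}_${chr}_binary.txt.gz"
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      status=1
    fi
  done < "$gt_file"
  return "$status"
}

# Toy demonstration in a scratch directory (file contents are placeholders).
demo_dir=$(mktemp -d)
printf 'chr1\t1000\nchr2\t800\n' > "$demo_dir/genometable.txt"
: > "$demo_dir/map_chr1_binary.txt.gz"
( cd "$demo_dir" && check_mappability genometable.txt map ) || true
```

In a real run, the same call would be made in the directory holding the renamed files, just before makemappabilitytable.pl.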
4.3. Gene-density files
Gene-density files can be generated with makegenedensity.pl, found in the scripts directory.
To use a 500-kbp window, type:
$ makegenedensity.pl genometable.txt refFlat.txt 500000
and the gene-density files chr*-bs<binsize> will be generated in the current directory.
These data for several species can also be downloaded from the DROMPAplus website.
Next, make a new directory for the gene-density files:
$ mkdir gene_density_hg19
$ mv chr*-bs* gene_density_hg19
The gene-density files can be specified as follows:
$ drompa+ GV $s1 $s2 $s3 $s4 -o ChIPseq-wholegenome --gt genometable.txt \
--GD gene_density_hg19/ --gdsize 500000