4. Appendix

4.1. Data

The data directory contains several types of data:

  • Genometable files for several species. To generate a genometable file for another species or genome build, run makegenometable.pl, found in the scripts directory, as follows:

    makegenometable.pl genome.fa > genometable.txt
    
  • Ideogram files for visualizing chromosomes with the GV command.

  • Mptable files that describe the number of mappable bases for each chromosome.
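To make the genometable step more concrete, the sketch below computes sequence lengths from a FASTA file with awk. It assumes that a genometable is a two-column, tab-delimited file (sequence name, length in bp); the function name fasta_lengths is hypothetical, and this is an illustration of the idea, not the code of makegenometable.pl.

```shell
# Hypothetical helper: print "<sequence name><TAB><length>" for each FASTA record.
# Assumption: a genometable is a two-column, tab-delimited file (name, length in bp).
fasta_lengths() {
  awk '
    /^>/ {
      # A new record starts: emit the previous one, if any.
      if (name != "") printf "%s\t%d\n", name, len
      name = substr($1, 2)   # header up to the first whitespace, without ">"
      len = 0
      next
    }
    { len += length($0) }    # sequence line: add its length
    END { if (name != "") printf "%s\t%d\n", name, len }
  ' "$1"
}
```

For example, "fasta_lengths genome.fa | diff - genometable.txt" would show any disagreement with an existing table, provided the two-column assumption holds.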

4.2. Mappability files

DROMPAplus accepts mappability files generated by the scripts provided with MOSAiCS, which are based on code from PeakSeq. Generate the binary mappability files (chr*_map_binary.txt) with the MOSAiCS scripts, using the appropriate fragment length and bin size (here, 150 bp and 100 bp, respectively).

Next, gzip the files and rename them for DROMPAplus (the loop below covers the human chromosomes 1 to 22, X, Y, and M):

for i in $(seq 1 22) X Y M; do
   gzip chr${i}_map_binary.txt
   mv chr${i}_map_binary.txt.gz map_chr${i}_binary.txt.gz
done
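For genomes with other chromosome names, the same renaming can be driven by the files that are actually present instead of a fixed list. The following is a minimal sketch; the function name rename_mappability_files is hypothetical, and it assumes only the chr*_map_binary.txt input pattern and the map_chr*_binary.txt.gz output pattern shown above.

```shell
# Hypothetical helper: gzip and rename every chr*_map_binary.txt in a directory.
# $1: directory holding the files (defaults to the current directory).
rename_mappability_files() {
  dir=${1:-.}
  (
    cd "$dir" || exit 1
    for f in chr*_map_binary.txt; do
      [ -e "$f" ] || continue        # no match: the glob is left unexpanded
      chr=${f%_map_binary.txt}       # e.g. "chr1"
      gzip "$f" && mv "$f.gz" "map_${chr}_binary.txt.gz"
    done
  )
}
```

The subshell keeps the caller's working directory unchanged, and a file is renamed only if gzip succeeded.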

Then create the “mappability table”, a tab-delimited file listing the number of mappable bases for each chromosome, using makemappabilitytable.pl, found in the otherbins directory:

$ makemappabilitytable.pl genometable.txt map > mptable.txt

Here, the second argument, “map”, is the prefix of the binary mappability files.
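A quick sanity check on the resulting table is that no chromosome has more mappable bases than its total length. The sketch below assumes that both genometable.txt and mptable.txt are two-column, tab-delimited files (chromosome name, number of bases); the function name check_mptable is hypothetical.

```shell
# Hypothetical helper: compare a mappability table against a genome table.
# $1: genome table, $2: mappability table (both assumed "name<TAB>bases").
# Returns non-zero if a chromosome is unknown or has more mappable than total bases.
check_mptable() {
  awk -F'\t' '
    NR == FNR { len[$1] = $2; next }     # first file: remember chromosome lengths
    !($1 in len) { printf "%s: not in genome table\n", $1; bad = 1; next }
    $2 + 0 > len[$1] + 0 {
      printf "%s: %d mappable > %d total\n", $1, $2, len[$1]; bad = 1
    }
    END { exit bad }
  ' "$1" "$2"
}
```

Running "check_mptable genometable.txt mptable.txt" prints one line per problem and is silent when the two tables are consistent.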

4.3. Gene-density files

Gene-density files can be generated with makegenedensity.pl, found in the scripts directory. To use a 500-kbp window, type:

$ makegenedensity.pl genometable.txt refFlat.txt 500000

The gene-density files (chr*-bs<binsize>) are written to the current directory. Gene-density data for several species can also be downloaded from the DROMPAplus website.

Next, move the gene-density files into a new directory:

$ mkdir gene_density_hg19
$ mv chr*-bs* gene_density_hg19

Pass the directory and the window size to drompa+ GV with the --GD and --gdsize options:

$ drompa+ GV $s1 $s2 $s3 $s4 -o ChIPseq-wholegenome --gt genometable.txt \
  --GD gene_density_hg19/ --gdsize 500000
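Before running the command above, it can help to confirm that the directory holds a file for every chromosome at the chosen window size. The sketch below assumes that each file is named <chromosome name>-bs<binsize>, matching the chr*-bs<binsize> pattern, and that the first column of genometable.txt holds the chromosome names; the function name check_gene_density_dir is hypothetical. A reported file is only a hint, because some sequences may legitimately have no gene-density file.

```shell
# Hypothetical helper: list chromosomes without a gene-density file.
# $1: genome table, $2: gene-density directory, $3: window size in bp.
# Assumption: files are named "<chromosome name>-bs<window size>".
check_gene_density_dir() {
  missing=0
  while IFS="$(printf '\t')" read -r chr rest; do
    [ -n "$chr" ] || continue                 # skip blank lines
    if [ ! -e "$2/$chr-bs$3" ]; then
      echo "missing: $2/$chr-bs$3" >&2
      missing=1
    fi
  done < "$1"
  return "$missing"
}
```

For the example in this section, the call would be "check_gene_density_dir genometable.txt gene_density_hg19 500000".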