Welcome to the homepage of DREEP



The website is under construction; we appreciate your feedback.

Download the DREEP package (v 0.7) from here

A NUMTs database can be found here

Scripts to generate the input for mtDNAble.exe can be found here

Note: Heterozygous sites are not considered in the present version, so the scripts here are suitable only for haploid genomes (mtDNA, Y, and X in males).

Test data:

A test dataset (sequencing data for 134 whole mitochondrial genomes, 3 artificially mixed samples) is publicly available from the European Nucleotide Archive (http://www.ebi.ac.uk/ena/) under accession number ERP000879.

More data (double-indexed, paired-end) will be released soon, together with the original project.

Citation:

Mingkun Li, Mark Stoneking. A new approach for detecting low-level mutations in next-generation sequence data. Genome Biology 2012, 13:R34. Find it at Genome Biology

An example of using DREEP can be found in: Li et al. 2012. Fidelity of capture-enrichment for mtDNA genome sequencing: influence of NUMTs. Nucleic Acids Research. doi:10.1093/nar/gks499. Find it at NAR

Perl Modules required to run DREEP:

Text::NSP::Measures::2D::Fisher2::right (Fisher method)
Text::NSP::Measures::2D::Fisher2::left (Fisher method)
Text::NSP::Measures::2D::Fisher::twotailed (Dreep_result_filter.pl)
Math::CDF (Poisson method)
Statistics::Descriptive (all)
Statistics::PointEstimation (EMP method)
Statistics::TTest (EMP method)
Statistics::Distrib::Normal (EMP method)

Version:

  • v0.7 (Sep 17 2012) The default value for -r is now 0 (deactivated), because we found that the distribution of reads across strands is not as random as commonly assumed. A bug in filter_and_summary.pl has been fixed: in previous versions, a mismatch number greater than 9 was read as 0. Please note that when you run BWA in paired-end mode, the mismatch number can be much greater than the value you set. The number of distinct reads for the major allele has to be greater than 10 on each strand.
  • v0.6 (Jun 7 2012) A range can be given to filter_and_summary.pl (-a) in the format ChrX:Y-X, e.g., MT:100-1500; all subsequent analysis is then restricted to this region. A bug has been fixed in Dreep_poisson.pl and Dreep_fisher.pl: if there are no reads in one direction (normally at the two ends of the reference sequence), a quality score of "0" is now given instead of nothing.
  • v0.5 (May 30 2012) A Fisher exact test has been added to Dreep_result_filter.pl (-r) to quantify the discrepancy between the minor allele counts and the major allele counts on the two strands.
  • v0.4 (May 7 2012) The two-tailed Fisher exact test has been replaced by a one-tailed Fisher exact test (taking the direction of the null hypothesis into account). The -b option has been added to output the original bias statistics rather than the Phred-scaled quality score. The Dreep script itself no longer reports any LLM candidates (only log files are generated); please use Dreep_result_filter.pl to specify your own criteria for calling LLMs. For this update, we really appreciate the comments from the third reviewer at Genome Biology.
  • v0.3 (Apr 4 2012) Added an option -p in Dreep_result_filter_v2.pl to remove mutations adjacent to an indel (in either the major or the minor allele, within 10 bp in both directions), thanks to the reviewer of our manuscript.
  • v0.2 The EMP method has a pipeline similar to the POISSON method, and the output files have the same format. The thresholds for standard output (on screen in Dreep_poisson.pl) were updated according to the results of the NUMTs project (MAF >= 0.02, DQS >= 10, frequency of other alleles, i.e., other than the major and secondary alleles, <= 20%).
  • v0.1 First version.
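
To illustrate the region restriction from v0.6 and the on-screen thresholds from v0.2 listed above, here is a minimal Python sketch. It is not part of DREEP (whose scripts are Perl) and does not reproduce its code: the names Region, parse_region, and passes_screen are hypothetical, the argument layout is invented for illustration, and treating the region bounds as inclusive is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    """A region such as MT:100-1500 (hypothetical helper type)."""
    chrom: str
    start: int
    end: int


def parse_region(text: str) -> Region:
    """Parse a string like 'MT:100-1500' into a Region."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a region of the form NAME:START-END: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"region start is greater than end: {text!r}")
    return Region(chrom, start, end)


def passes_screen(chrom: str, pos: int, maf: float, dqs: float,
                  other_freq: float, region: Optional[Region] = None) -> bool:
    """Apply the thresholds quoted in the v0.2 entry to one site.

    maf        minor allele frequency (must be >= 0.02)
    dqs        quality score (must be >= 10)
    other_freq combined frequency of alleles other than the major and
               secondary alleles (must be <= 0.20)
    region     optional restriction; bounds are assumed to be inclusive
    """
    if region is not None:
        if chrom != region.chrom or not (region.start <= pos <= region.end):
            return False
    return maf >= 0.02 and dqs >= 10 and other_freq <= 0.20


if __name__ == "__main__":
    region = parse_region("MT:100-1500")
    print(passes_screen("MT", 500, maf=0.05, dqs=30, other_freq=0.0, region=region))
```

The thresholds are hard-coded here only to mirror the values quoted in the changelog; for actual LLM calling, Dreep_result_filter.pl is the place to specify your own criteria.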