P-clouds

2013 - 2016

P-clouds is a method for identifying transposable elements in genomes and for estimating the false positive and false negative rates of that identification. While using the software, I found a number of critical bugs that led to incorrect results, and I tracked down and fixed each one as it surfaced. I then developed several new identification methods based on ideas similar to those in P-clouds and tested their efficacy against older methods.

Associated publication:

  • Gu, Wanjun, et al. "Identification of repeat structure in large genomes using repeat probability clouds." Analytical Biochemistry 380.1 (2008): 77-83.

Tools: C, Perl, R, Git

Repository: https://github.com/PollockLaboratory/pclouds