Friday, June 20, 2008

Making MAT RepeatLib

Download the repeat_mask.txt, simple_repeat.txt and segment_dups.txt files from UCSC, then run:

python Rep.py -m repeat_mask.txt simple_repeat.txt segment_dups.txt

Rep.py is available in the lib subdirectory of the MAT install.
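
Before running the command it can help to confirm that Rep.py and the three downloaded files are actually where you think they are. Below is a minimal sketch of such a check. The helper name build_rep_command is my own and is not part of MAT; the command it builds is just the one shown above.

```python
import os


def build_rep_command(rep_script, inputs):
    """Return the argument list for the Rep.py call shown above.

    Hypothetical helper, not part of MAT. Raises IOError if the script
    or any of the input files is missing.
    """
    inputs = list(inputs)
    missing = [p for p in [rep_script] + inputs if not os.path.isfile(p)]
    if missing:
        raise IOError("missing file(s): " + ", ".join(missing))
    # Same layout as the command line above: python Rep.py -m <files...>
    return ["python", rep_script, "-m"] + inputs
```

Passing the returned list to subprocess.call should then run the same command as typing it at the shell.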

For genomes such as Drosophila melanogaster, for which UCSC doesn't provide a segmental duplication file, the repeat library can be generated as follows:

1) Remove every use of "segdup" from the Rep.py file.
2) Remove the original Rep.py and Rep.pyc files from the compiled MAT.
3) Recompile MAT.
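
For step 1, a quick way to see what needs editing is to list every line of Rep.py that mentions "segdup". The sketch below only reports the lines and does not modify the file, since which statements can be safely dropped depends on the Rep.py version you have. The function name find_mentions is my own, not something from MAT.

```python
def find_mentions(path, needle="segdup"):
    """Return (line_number, text) pairs for lines containing needle.

    Hypothetical helper for reviewing Rep.py by hand. The match is
    case-insensitive and the file is only read, never changed.
    """
    hits = []
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            if needle.lower() in line.lower():
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

Calling find_mentions("Rep.py") and printing each pair gives a checklist of lines to review before recompiling.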

RepeatLib and the modified Rep.py files are also available on request from me.
parantu dot shah at gmail dot com
