We provide all duplications of length >= 1 Kbp with >= 90% identity in the human genome [1], computed using Mashmap [2] (visualization).
=============================
Format of file mashmap.segdup.out.gz:
Space-separated file with the following columns:
query chromosome, chromosome length, 0-based start, end, strand, reference chromosome, chromosome length, start, end, alignment nucleotide identity, alignment length
The alignment identity and length were computed using LAST [3].
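
As an illustration of the column layout above, here is a minimal Python sketch for reading the file. It is not part of the released tooling. The names SegDup, parse_line and read_segdups are hypothetical, and the sketch assumes that every record has exactly 11 fields in the order listed, that lengths and coordinates are integers, and that identity parses as a floating-point number.

```python
import gzip
from typing import Iterator, NamedTuple


class SegDup(NamedTuple):
    """One record, fields in the column order listed above (names are hypothetical)."""
    query_chrom: str
    query_chrom_len: int
    query_start: int      # 0-based, per the format description
    query_end: int
    strand: str
    ref_chrom: str
    ref_chrom_len: int
    ref_start: int
    ref_end: int
    identity: float       # alignment nucleotide identity
    aln_len: int          # alignment length (assumed to be an integer)


def parse_line(line: str) -> SegDup:
    """Split one whitespace-separated record into typed fields."""
    f = line.split()
    if len(f) != 11:
        raise ValueError(f"expected 11 columns, found {len(f)}")
    return SegDup(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        float(f[9]), int(f[10]),
    )


def read_segdups(path: str) -> Iterator[SegDup]:
    """Yield records from the gzip-compressed file, skipping blank lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Example use, assuming the file is in the current directory: iterate over read_segdups("mashmap.segdup.out.gz") and filter on rec.identity or rec.aln_len as needed.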
=============================
References:
1. hg38 genome downloaded from the UCSC Genome Browser. We use chromosomes 1-22, X, Y and M.
2. Chirag Jain, Sergey Koren, Alexander Dilthey, Adam M. Phillippy, and Srinivas Aluru. A Fast Adaptive Algorithm for Computing Whole-Genome Homology Maps. bioRxiv, 2018.
3. Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC. Adaptive seeds tame genomic sequence comparison. Genome Res. 2011 21(3):487-93. URL: http://last.cbrc.jp