Long range “independence”

The repetitive nature of DNA strings is one of the challenges in read alignment. Longer substrings of DNA, however, appear less repetitive: the longer the substring, the more likely it is to be unique.
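As a rough illustration of this effect, the sketch below computes the fraction of k-mer positions in a sequence whose k-mer occurs exactly once. The function name `unique_kmer_fraction` and the toy sequence are hypothetical, chosen only for this example; the fraction is a simple proxy for uniqueness, not a formal measure of the property.

```python
from collections import Counter


def unique_kmer_fraction(seq: str, k: int) -> float:
    """Return the fraction of k-mer positions in seq whose k-mer occurs exactly once."""
    n = len(seq) - k + 1  # number of k-mer start positions
    if k <= 0 or n <= 0:
        raise ValueError("k must be between 1 and len(seq)")
    counts = Counter(seq[i:i + k] for i in range(n))
    # A k-mer with count 1 accounts for exactly one position.
    unique_positions = sum(1 for c in counts.values() if c == 1)
    return unique_positions / n


if __name__ == "__main__":
    toy = "ACGTACGA"  # hypothetical toy sequence, not real genomic data
    for k in (2, 3, 5):
        print(k, unique_kmer_fraction(toy, k))
```

On real genomes the same count would normally be done with a dedicated k-mer counter rather than an in-memory `Counter`; the sketch only shows the idea on a short string.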

One way to measure this property is the “resolution length,” which is discussed in A Note about the Resolution-Length Characteristics of DNA.

This property has been used in gapped-seed algorithms since PatternHunter. In Using the Long Range “Independence” in DNA: Coupled-Seeds and Pre-Alignment Filters, we propose algorithms that can use longer ranges, and thus stronger “independence.”
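For readers unfamiliar with gapped seeds, here is a minimal sketch of how one is matched: two windows count as a hit when they agree at every `1` position of the seed pattern, while `0` positions are ignored. The pattern shown is the weight-11, length-18 seed commonly associated with PatternHunter; the function names and the indexing scheme are assumptions made for this example. It does not implement the coupled-seed or pre-alignment filter algorithms of the paper mentioned above.

```python
from collections import defaultdict

# Weight-11, length-18 spaced seed commonly associated with PatternHunter.
PATTERNHUNTER_SEED = "111010010100110111"


def masked_key(seq, start, care):
    """Concatenate the characters of seq at the seed's '1' positions, offset by start."""
    return "".join(seq[start + p] for p in care)


def spaced_seed_hits(query, target, seed=PATTERNHUNTER_SEED):
    """Return all (i, j) where query[i:] and target[j:] agree at every '1' position of seed."""
    care = [p for p, c in enumerate(seed) if c == "1"]
    span = len(seed)
    # Index every target window by its masked key.
    index = defaultdict(list)
    for j in range(len(target) - span + 1):
        index[masked_key(target, j, care)].append(j)
    # Look up every query window in the index.
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(masked_key(query, i, care), []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Hypothetical toy strings: they differ only at position 3, a '0' position of the seed.
    target = "ACGTACGTACGTACGTAC"
    query = "ACGAACGTACGTACGTAC"
    print(spaced_seed_hits(query, target))
```

Because the mismatch falls on a don't-care position, the gapped seed still reports a hit, whereas a contiguous seed of the same span would not.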