The Genetic Layer Below the Single-Nucleotide Map
When most people think about genetic variation, they think about single-nucleotide polymorphisms — SNPs — the one-letter differences in the DNA sequence that account for a large share of easily detected variation between individuals. SNPs are the backbone of consumer DNA testing. They are also, in an important sense, only the surface of the genome.

Below the SNP layer lies structural variation: insertions, deletions, inversions, and, most relevant here, copy number variants (CNVs). These are regions of the genome where different individuals carry different numbers of copies of a stretch of DNA, ranging from a few hundred bases to many thousands or more. The typical dog genome contains hundreds to thousands of CNV regions. For some traits, their cumulative contribution to phenotype is thought to rival that of SNPs.
In herding breeds specifically, copy number variation underlies several well-characterised traits and diseases, and the rate of discovery of new CNV associations is accelerating.
What CNVs Are and How They Arise
A copy number variant is a stretch of DNA present in a different number of copies in different individuals. A typical diploid region would contain two copies (one from each parent). A CNV might contain zero, one, three, four, or more copies — a deletion or amplification relative to the reference.
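The arithmetic behind "zero, one, three, four, or more copies" can be sketched in a few lines. This is a minimal illustration, assuming copy number is inferred from sequencing read depth relative to a diploid baseline; the function names and the depth values are hypothetical and not taken from any particular CNV caller.

```python
def estimate_copy_number(region_depth: float, baseline_depth: float, ploidy: int = 2) -> int:
    """Estimate integer copy number from read depth.

    Assumes depth scales linearly with copy number and that
    baseline_depth is the average depth over normal two-copy regions.
    """
    if baseline_depth <= 0:
        raise ValueError("baseline_depth must be positive")
    return round(ploidy * region_depth / baseline_depth)


def classify(copy_number: int, ploidy: int = 2) -> str:
    """Label a copy number relative to the expected diploid state."""
    if copy_number < ploidy:
        return "deletion"
    if copy_number > ploidy:
        return "amplification"
    return "reference"


# Hypothetical example: a region covered at half the baseline depth
cn = estimate_copy_number(region_depth=15.0, baseline_depth=30.0)
label = classify(cn)  # cn is 1, label is "deletion"
```

Real callers also correct for GC content, mappability, and noise before rounding to an integer state; the sketch only shows the core ratio.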
CNVs arise through several mechanisms: non-allelic homologous recombination between repeat sequences, replication errors, and retrotransposon-mediated rearrangements. Some CNVs are ancient and segregate within breeds over many generations. Others arise recurrently as new mutations in specific genomic locations prone to instability.
CNVs with Established Effects in Herding Breeds
MDR1 (ABCB1) deletion. The MDR1 mutation is a 4-bp deletion in exon 4 of ABCB1 that shifts the reading frame and introduces a premature stop codon. Strictly speaking it is a small indel rather than a copy number variant, a term conventionally reserved for much larger segments, but it is the paradigmatic example of how a tiny change to the structure of a gene can have breed-defining medical consequences. The full practical implications are covered in the dedicated MDR1 gene guide; the key point here is that it was an early warning that variation beyond single-letter substitutions matters.
CDH23 region in Border Collies. Certain structural variants in the region encompassing CDH23 have been linked to early-onset hearing loss in specific lines of Border Collies. Research is ongoing; this is an example of a CNV whose breed-specific distribution has only become clear as sequencing coverage in the breed has increased.
KIT region and white spotting. Variation in the KIT gene region, including structural variants, contributes to the extreme white spotting patterns seen in some herding breeds. This is where coat genetics and structural variation overlap — see the broader discussion in coat color genetics.
MC1R and e-locus variants. Some color variation arises from structural rearrangements that reshape regulatory regions rather than altering coding sequence directly.
Idiopathic epilepsy. Research in several herding breeds is identifying CNVs associated with increased epilepsy risk, though current commercial tests do not yet integrate these markers.
Detection Technologies
CNV detection has lagged behind SNP detection because standard SNP arrays are not optimised for it. The technologies that actually work:
| Method | Strengths | Limitations |
|---|---|---|
| Array CGH | Established; good for medium-to-large CNVs | Requires a reference sample for comparison; coarse resolution |
| SNP array with CNV calling | Available on many existing platforms | Misses small CNVs; noisy for rare variants |
| Whole-genome sequencing (short-read) | Comprehensive for small-to-medium variants | Misses large structural variants in repetitive regions |
| Long-read sequencing (PacBio, Nanopore) | Best current resolution of structural variation | Higher cost; still less common in canine genomics |
| Optical mapping | Excellent for large-scale rearrangements | Specialist; rarely used commercially |
Most consumer and breeder-facing genetic tests in 2026 rely on SNP array or targeted panel technology. These platforms can call some common CNVs but will miss many rare structural variants. As long-read sequencing costs continue to fall, the range of CNVs reliably detectable in routine testing will expand.
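To see why array-based calling tends to miss small CNVs, consider a simplified sketch of calling from per-probe log2 intensity ratios. A one-copy loss in a diploid region gives a theoretical log2 ratio of log2(1/2) = -1, and a one-copy gain gives log2(3/2), roughly 0.58. The cutoffs, the minimum probe count, and the names below are illustrative assumptions, not the settings of any real platform, and production callers use statistical segmentation rather than simple run detection.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    start: int  # index of the first probe in the run
    end: int    # index of the last probe in the run (inclusive)
    state: str  # "loss" or "gain"


def call_segments(
    log2_ratios: Sequence[float],
    loss_cutoff: float = -0.4,  # assumed threshold, for illustration only
    gain_cutoff: float = 0.3,   # assumed threshold, for illustration only
    min_probes: int = 3,        # runs shorter than this are discarded as noise
) -> List[Segment]:
    """Report runs of consecutive probes that all point to a loss or a gain."""

    def state_of(value: float) -> str:
        if value <= loss_cutoff:
            return "loss"
        if value >= gain_cutoff:
            return "gain"
        return "neutral"

    states = [state_of(v) for v in log2_ratios]
    segments: List[Segment] = []
    run_start = 0
    for i in range(1, len(states) + 1):
        # Close the current run at the end of the data or when the state changes
        if i == len(states) or states[i] != states[run_start]:
            run_state = states[run_start]
            if run_state != "neutral" and i - run_start >= min_probes:
                segments.append(Segment(run_start, i - 1, run_state))
            run_start = i
    return segments


# Hypothetical probe data: a three-probe loss, a lone high probe, a three-probe gain
ratios = [0.0, -0.9, -1.1, -1.0, 0.05, 0.6, 0.02, 0.5, 0.55, 0.6]
calls = call_segments(ratios)
```

In this example the isolated high probe at index 5 is dropped because a single probe cannot be distinguished from noise. A genuine CNV smaller than the spacing of a few probes is lost for the same reason.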
Why CNVs Matter for Breeding Decisions
Effect size per variant is often larger than for SNPs. Because CNVs can disrupt whole genes or regulatory regions, a single CNV tends to have a larger phenotypic effect than a single SNP. The MDR1 deletion, although only 4 bp long, illustrates the principle: one deletion with a profound pharmacological consequence.
Rare CNVs contribute disproportionately to disease burden. Research in both humans and dogs has shown that rare CNVs — many of them new mutations or low-frequency variants — account for a substantial fraction of severe phenotypic outcomes.
Some CNVs are breed-specific. A CNV that rises to meaningful frequency in one breed may be effectively absent in another. Breed-specific genetic panels are therefore not just a marketing convenience; they reflect genuine differences in which structural variants are segregating.
Standard SNP-based relatedness and coefficient of inbreeding (COI) calculations can miss CNV-driven similarity. Two dogs can share a rare CNV that substantially increases shared disease risk without appearing unusually related on SNP-based metrics. This is a genuine limit of SNP-based relatedness estimates, and pedigree-based COI does not capture it either.
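The gap between SNP-based similarity and CNV sharing can be illustrated with a toy comparison. This is a sketch under stated assumptions: genotypes are coded as 0, 1, or 2 copies of the alternate allele, similarity is a simple identity-by-state (IBS) fraction rather than any provider's relatedness metric, and the region names and values are invented for the example.

```python
from typing import Dict, Sequence, Set


def ibs_similarity(geno_a: Sequence[int], geno_b: Sequence[int]) -> float:
    """Fraction of alleles shared identical-by-state across SNPs.

    Genotypes are coded 0, 1, or 2 (count of the alternate allele).
    """
    if len(geno_a) != len(geno_b) or len(geno_a) == 0:
        raise ValueError("genotype lists must be non-empty and of equal length")
    shared = sum(2 - abs(a - b) for a, b in zip(geno_a, geno_b))
    return shared / (2 * len(geno_a))


def shared_non_reference_cnvs(
    cnv_a: Dict[str, int], cnv_b: Dict[str, int], ploidy: int = 2
) -> Set[str]:
    """Regions where both dogs carry a copy number other than the diploid state.

    Regions missing from a dog's record are assumed to be at reference copy number.
    """
    return {
        region
        for region, copies in cnv_a.items()
        if copies != ploidy and cnv_b.get(region, ploidy) != ploidy
    }


# Hypothetical dogs: only moderately similar on SNPs, yet sharing one rare CNV
dog_a_snps = [0, 1, 2, 0]
dog_b_snps = [2, 1, 0, 0]
snp_similarity = ibs_similarity(dog_a_snps, dog_b_snps)

dog_a_cnvs = {"region_x": 1, "region_y": 2}
dog_b_cnvs = {"region_x": 1}
shared = shared_non_reference_cnvs(dog_a_cnvs, dog_b_cnvs)
```

The SNP comparison knows nothing about `region_x`, so the shared deletion only appears when copy number calls are examined separately.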
What to Ask Your Testing Provider
If you are considering a canine DNA panel and structural variants matter to you — particularly for the herding breeds where work is most advanced — these are the questions that will separate serious providers from casual ones:
- Which of the variants on your panel are CNVs as opposed to SNPs?
- What CNV-calling methodology do you use?
- How is performance validated for your CNV assays? Is there published validation in my breed?
- If I request raw data, do you include CNV calls or only SNP genotypes?
- Are you currently assessing novel CNVs identified through research, or only those already published?
Most providers will struggle to give specific answers, and that struggle is itself diagnostic. The commercial canine genetic testing market has optimised heavily for SNP-based panels and is only now beginning to integrate CNV calling at scale.
The Direction of Travel
Expect CNV content in canine genetic panels to increase steadily over the next three to five years, especially as long-read sequencing becomes commercially viable at a per-dog cost comparable to current panel testing. The herding breeds, because they are genetically well-characterised and have active research communities, are likely to be among the first to see clinically useful CNV panels. For breeders of these dogs, building an understanding of structural variation now — even before it is fully integrated into commercial testing — will pay off when the tests catch up to the science. The single-gene, single-variant model of canine genetic disease is incomplete. The structural layer is the next chapter.