Overview of repeats
Goldfarb and colleagues (Goldfarb et al., 1991) reported repeat expansions in the PRNP gene that are associated with Creutzfeldt–Jakob disease (CJD) in several families.
The affected individuals were heterozygous for alleles with 10, 12 or 13 octapeptide repeats, and some alleles carried "wobble" nucleotide substitutions. Notably, an individual with 9 repeats showed no signs of neurological disease, suggesting that a minimum of 10 repeats may be necessary to predispose an individual to CJD. Familial patients with insertion mutations typically had prolonged illnesses (average of 7 years) beginning at a relatively young age (average of 38 years) and marked by progressive dementia together with cerebellar and other neurological symptoms. Additional cases with 11 and 14 repeats also presented with prolonged illnesses starting at an early age (22-28 years).
Unequal crossover during recombination was proposed as the probable mechanism generating the extra repeats: because the repeat units are so similar in sequence, misaligned pairing can create alleles with varying numbers of repeats. This process could explain all observed repeat variations, including the 11-repeat insertions, except for the "R2c" repeat, which might be an independent point mutation.
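The effect of a misaligned exchange on repeat count can be illustrated with a minimal sketch. It treats an allele as a plain list of repeat-unit names and assumes a single breakpoint at a unit boundary in each parent; the function name `unequal_crossover` and the breakpoint indices are hypothetical choices made for illustration, not a model taken from the study.

```python
from typing import List, Tuple

Allele = List[str]

def unequal_crossover(a: Allele, b: Allele, i: int, j: int) -> Tuple[Allele, Allele]:
    """Exchange the tails of two alleles at unit boundaries i (in a) and j (in b).

    When i == j the exchange is aligned and both products keep their length.
    When i != j the pairing is misaligned: one product gains units, the other loses them.
    """
    return a[:i] + b[j:], b[:j] + a[i:]

# Five-unit structure of the common allele, as given in this section.
normal = ["R1", "R2", "R2", "R3", "R4"]

# Aligned exchange: repeat count is unchanged.
same_1, same_2 = unequal_crossover(normal, normal, 3, 3)

# Misaligned exchange (illustrative breakpoints): a longer and a shorter product.
longer, shorter = unequal_crossover(normal, normal, 4, 2)

print("-".join(longer), len(longer))
print("-".join(shorter), len(shorter))
```

In this toy model the total number of units across the two products is conserved, so every gain in one product is matched by a loss in the other. Repeated rounds of such exchanges could in principle build up the longer alleles, but the sketch makes no claim about how often or where such events occur.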
In unaffected individuals, the allele consists of four octapeptide tandem repeats preceded by a nonapeptide (R1). The repeat types differ in nucleotide sequence, but R2, R3 and R4 encode the same peptide. The sequence of the locus in unaffected individuals is R1-R2-R2-R3-R4, where:
- R1: CCT CAG GGC GGT GGT GGC TGG GGG CAG (Pro Gln Gly Gly Gly Gly Trp Gly Gln)
- R2: CCT CAT GGT GGT GGC TGG GGG CAG (Pro His Gly Gly Gly Trp Gly Gln)
- R3: CCC CAT GGT GGT GGC TGG GGA CAG (Pro His Gly Gly Gly Trp Gly Gln)
- R4: CCT CAT GGT GGT GGC TGG GGT CAA (Pro His Gly Gly Gly Trp Gly Gln)
Expanded alleles have been reported with the following repeat structures:
- R1-R2-R2-R3-R2-R3g-R2-R2-R3-R4
- R1-R2-R2-R2-R3-R2-R3g-R2-R2-R3-R4
- R1-R2-R2c-R3-R2-R3-R2-R3-R2-R3g-R3-R4
- R1-R2-R2-R3-R2-R2-R2-R2-R2-R2-R2-R2a-R4
The variant repeats in these structures each differ from R2 or R3 by a single synonymous substitution:
- R2a: CCT CAT GGT GGT GGC TGG GGA CAG (Pro His Gly Gly Gly Trp Gly Gln)
- R2c: CCT CAT GGC GGT GGC TGG GGG CAG (Pro His Gly Gly Gly Trp Gly Gln)
- R3g: CCC CAT GGT GGT GGC TGG GGG CAG (Pro His Gly Gly Gly Trp Gly Gln)
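The repeat definitions listed above can be checked mechanically. The sketch below translates each unit, confirms that all 24-nucleotide units encode the same octapeptide, and verifies that they differ only at third codon positions. The codon table is a subset of the standard genetic code, restricted to the codons that occur here; the names `REPEATS`, `translate`, `codon_differences` and `allele_length_bp` are illustrative and do not come from any particular tool.

```python
from itertools import combinations

# Subset of the standard genetic code covering only the codons used below.
CODON_TO_AA = {
    "CCT": "Pro", "CCC": "Pro",
    "CAG": "Gln", "CAA": "Gln",
    "CAT": "His",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "TGG": "Trp",
}

# Repeat units copied from the lists in this section.
REPEATS = {
    "R1":  "CCT CAG GGC GGT GGT GGC TGG GGG CAG",
    "R2":  "CCT CAT GGT GGT GGC TGG GGG CAG",
    "R3":  "CCC CAT GGT GGT GGC TGG GGA CAG",
    "R4":  "CCT CAT GGT GGT GGC TGG GGT CAA",
    "R2a": "CCT CAT GGT GGT GGC TGG GGA CAG",
    "R2c": "CCT CAT GGC GGT GGC TGG GGG CAG",
    "R3g": "CCC CAT GGT GGT GGC TGG GGG CAG",
}

def translate(unit: str) -> str:
    """Translate a space-separated codon string into three-letter amino acids."""
    return " ".join(CODON_TO_AA[codon] for codon in unit.split())

def codon_differences(a: str, b: str):
    """Return (index, codon_a, codon_b) for each codon position where two units differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a.split(), b.split())) if x != y]

def allele_length_bp(structure: str) -> int:
    """Length in base pairs of an allele written as, e.g., 'R1-R2-R2-R3-R4'."""
    return sum(len(REPEATS[name].replace(" ", "")) for name in structure.split("-"))

# All units except R1 are octapeptide repeats.
octa_units = [name for name in REPEATS if name != "R1"]
octapeptides = {translate(REPEATS[name]) for name in octa_units}

# True if every difference between octapeptide units is confined to the third codon base.
wobble_only = all(
    x[:2] == y[:2]
    for a, b in combinations(octa_units, 2)
    for _, x, y in codon_differences(REPEATS[a], REPEATS[b])
)

print(octapeptides)
print(wobble_only)
print(allele_length_bp("R1-R2-R2-R3-R4"))  # 27 + 4 * 24 = 123
```

Because R1 is 27 nucleotides long and every other unit is 24, the allele length in base pairs follows directly from the unit count, which is how `allele_length_bp` can be used to compare a written structure against an observed fragment size.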
NB! The provided motif here is R2, but the reference region includes repeats from R1 to R4.