ExpansionHunter's catalogue creator

Create a custom variant catalogue for ExpansionHunter. You can either convert a BED file containing any STR loci into a variant catalogue, or manually select the known pathogenic loci you wish to target in your analysis from the list. The same catalogue can also be generated from the command line (see API instructions).

Convert a BED file to variant catalogue

You can convert a BED file, or any tab- or space-delimited text file, containing up to approximately 1 million loci into a variant catalogue. Each locus must be on its own line, with the columns in this order: 1) chromosome, 2) STR start position, 3) STR end position, 4) repeat unit, and 5) locus ID (optional). If the locus ID is missing, an ID is generated from the reference coordinates.

Example content of a regions file:
chrX 67545316 67545385 GCA
chr4 3074876 3074933 CAG HTT
chr9 69037286 69037304 GAA FXS
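
To illustrate the conversion described above, here is a minimal Python sketch that turns such region lines into catalogue entries. It is not the tool's actual implementation. The JSON field names (LocusId, LocusStructure, ReferenceRegion, VariantType) follow ExpansionHunter's variant catalogue format. The function name regions_to_catalogue and the fallback ID scheme (chromosome_start_end) are hypothetical, since the exact ID format the tool generates is not stated here. The sketch also assumes that coordinates are passed through unchanged.

```python
import json


def regions_to_catalogue(text):
    """Convert whitespace-delimited region lines into catalogue entries.

    Hypothetical helper: a sketch of the conversion, not the tool's own code.
    """
    catalogue = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # accepts tabs or spaces as delimiters
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(fields) < 4:
            raise ValueError(f"line {line_number}: expected at least 4 columns")

        chrom, start, end, motif = fields[:4]
        start, end = int(start), int(end)
        if start >= end:
            raise ValueError(f"line {line_number}: start must be less than end")

        # Assumption: the real generated ID format may differ from this one.
        locus_id = fields[4] if len(fields) > 4 else f"{chrom}_{start}_{end}"

        catalogue.append({
            "LocusId": locus_id,
            "LocusStructure": f"({motif.upper()})*",
            "ReferenceRegion": f"{chrom}:{start}-{end}",
            "VariantType": "Repeat",
        })
    return catalogue


REGIONS = """\
chrX 67545316 67545385 GCA
chr4 3074876 3074933 CAG HTT
chr9 69037286 69037304 GAA FXS
"""

catalogue = regions_to_catalogue(REGIONS)
print(json.dumps(catalogue, indent=4))
```

The first region has no fifth column, so it receives the coordinate-based fallback ID, while the other two keep the IDs supplied in the file.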

 
Available loci
Loci in the catalogue
Catalogue's content
Reference genome
Chromosome naming
Extended analysis

* While off-target regions enable genotyping of alleles longer than the fragment length, there is also a chance of obtaining overestimated genotypes (which may vary depending on the locus). Additionally, you should ensure that there are no other expansions of the same repeat unit in the genome. Overall, interpret your results with caution and visualise reads using the REViewer tool.

Known issues with using the catalogue and ExpansionHunter (v4.0.2 & v5.0.0):
• The ARX_1 and ARX_2 tracts, as well as the HOXA13_1, HOXA13_2 and HOXA13_3 tracts, are too close to each other, and false-positive results for these loci are often observed.
• Genotypes returned for Replaced and Nested types of repeats also include the non-pathogenic repeats. You can use STRipy to determine whether the pathogenic motif is present in your sample.
• Long homozygous alleles (longer than the fragment length, on average above 400-500 bp) are likely to be called as heterozygous, with one allele overestimated and the other underestimated (see figure C on this plot).
Additional issue with ExpansionHunter v4.0.1 and earlier: genotyping of the AR, ATXN1 and TCF4 loci is limited to approximately the read length.