A Locus Reference Genomic (LRG) record contains stable reference sequences used for reporting sequence variants with clinical implications.
What is unique about LRG reference sequence records?
Created specifically for clinical reporting by manual curation
- Include a minimum set of transcripts for reporting at a locus (ideally one)
- Manually curated by expert scientists
Stable
- Include transcript sequences that are stable and independent of changes to transcript models (RefSeq and GENCODE)
- Establish a coordinate system that is independent of upgrades to the reference genome assembly, and provide mappings to present and past assemblies
- Use a unique and stable identifier that is not versioned
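To illustrate how a stable LRG coordinate can be translated to more than one genome assembly, here is a minimal Python sketch. The `AssemblyMapping` class, the `lrg_to_genome` function, and all coordinates are hypothetical and chosen only for illustration. The sketch assumes each mapping is a single ungapped span; real mappings may contain gaps or mismatches that this does not handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyMapping:
    """One ungapped span aligning an LRG to a genome assembly (hypothetical)."""
    assembly: str
    chrom: str
    lrg_start: int     # 1-based, inclusive, in LRG coordinates
    lrg_end: int
    genome_start: int  # 1-based, inclusive, in assembly coordinates
    genome_end: int
    strand: int        # +1: LRG runs along the forward strand; -1: reverse


def lrg_to_genome(pos: int, mapping: AssemblyMapping) -> int:
    """Convert an LRG genomic position to a position on the mapped assembly."""
    if not mapping.lrg_start <= pos <= mapping.lrg_end:
        raise ValueError(f"position {pos} is outside the mapped LRG span")
    offset = pos - mapping.lrg_start
    if mapping.strand == 1:
        return mapping.genome_start + offset
    # On the reverse strand, LRG positions count down from the span's end.
    return mapping.genome_end - offset


# Illustrative numbers only; they are not taken from any real LRG record.
MAPPINGS = {
    "GRCh37": AssemblyMapping("GRCh37", "17", 1, 10000, 500001, 510000, -1),
    "GRCh38": AssemblyMapping("GRCh38", "17", 1, 10000, 700001, 710000, -1),
}

if __name__ == "__main__":
    for name, mapping in MAPPINGS.items():
        print(name, mapping.chrom, lrg_to_genome(2500, mapping))
```

The LRG position stays the same across assembly upgrades; only the mapping table changes.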
Flexible and collaborative
- Allow inclusion of legacy or community-requested reference sequences
- Individual records are created at the request of the clinical community and in collaboration with gene-specific experts
Connect the past, present and future of clinical variant reporting
- Part of the effort to rationalise differences between the NCBI (RefSeq) and EMBL-EBI (Ensembl/GENCODE) gene sets
- Aim to achieve faster convergence between NCBI (RefSeq) and EMBL-EBI (Ensembl/GENCODE) on key high-value annotations to provide a common minimal set of transcripts per gene
- Facilitate unambiguous multi-directional data exchange between NCBI (RefSeq), EMBL-EBI (Ensembl/GENCODE) and the reference genome assemblies (GRCh37, GRCh38)
- Define the relationship between legacy and community-requested sequences and the reference assembly
- Ensure compatibility with genome reference assembly-based NGS variant reporting systems
Well-supported
- Integrated into the Ensembl, NCBI and UCSC genome browsers to allow visualization in genomic context alongside all other existing annotations
- Compatible with the Human Genome Variation Society (HGVS) nomenclature and supported by nomenclature checker systems (e.g. Mutalyzer, VariantValidator)
- Generated and maintained by the NCBI and EMBL-EBI
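
As an illustration of how LRG identifiers appear in HGVS-style variant descriptions, the Python sketch below splits a description such as `LRG_1t1:c.100A>G` into its reference and variant parts. The function name `parse_lrg_hgvs`, the `LrgVariant` type, and the example variants are hypothetical. The pattern assumes the identifier form `LRG_<number>` with an optional transcript (`t<number>`) or protein (`p<number>`) suffix and no version number. It does not validate the variant description itself; a nomenclature checker such as Mutalyzer or VariantValidator is needed for that.

```python
import re
from typing import NamedTuple, Optional

# Reference part: LRG_<n>, optionally followed by t<n> (transcript) or
# p<n> (protein). There is deliberately no ".<version>" component.
LRG_HGVS = re.compile(
    r"^(?P<lrg>LRG_[1-9]\d*)"
    r"(?:(?P<kind>[tp])(?P<index>[1-9]\d*))?"
    r":(?P<description>[gcnp]\..+)$"
)


class LrgVariant(NamedTuple):
    lrg_id: str             # e.g. "LRG_1"
    feature: Optional[str]  # e.g. "t1", "p1", or None for genomic
    description: str        # e.g. "c.100A>G" (not validated here)


def parse_lrg_hgvs(text: str) -> LrgVariant:
    """Split an LRG-based variant description into its components."""
    match = LRG_HGVS.match(text.strip())
    if match is None:
        raise ValueError(f"not an LRG-based variant description: {text!r}")
    feature = None
    if match["kind"]:
        feature = match["kind"] + match["index"]
    return LrgVariant(match["lrg"], feature, match["description"])


if __name__ == "__main__":
    # Made-up variants, used only to exercise the parser.
    for example in ("LRG_1t1:c.100A>G", "LRG_1:g.5000A>G"):
        print(parse_lrg_hgvs(example))
```

A versioned reference such as `LRG_1.2:g.5A>G` is rejected by this pattern, which reflects the fact that LRG identifiers are not versioned.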