Retrieval sources

NCBI

Reference annotations are retrieved using the Eutils sviewer endpoint (example), while the FASTA sequences are retrieved using the Entrez API (example).
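As a rough illustration of the FASTA side, the sketch below builds an E-utilities efetch URL for a nucleotide record. The helper name fasta_url and the choice of the nuccore database are assumptions made for this example; the retriever itself may use different parameters or a client library instead of a hand-built URL.

```python
from urllib.parse import urlencode

# Public E-utilities efetch base URL.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(reference_id, db="nuccore"):
    """Build an efetch URL returning the record as FASTA text (hypothetical helper)."""
    params = {"db": db, "id": reference_id, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


print(fasta_url("NM_003002.4"))
```

Fetching the printed URL with any HTTP client should return the sequence as plain FASTA text.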

Assemblies

For human chromosomal references (NC_), the annotations are retrieved manually from the following FTP location, making sure that the history is taken into account.

Ensembl

Ensembl offers an API from which the most recent reference versions can be retrieved. Queries that include the version, e.g., ENST00000383925.1, are not accepted; the query must use the bare id, e.g., ENST00000383925, and the version is reported as part of the response. For this reason we check whether the provided reference id includes a version, in which case we use the following endpoint to check whether the most recent version equals the provided one. If not, for human references we check whether the version matches the one served by the dedicated GRCh37 API. If the reference has the same id in GRCh37 and GRCh38, the one from GRCh38 is retrieved. The transcript archive may be used in the future to retrieve other versions, but the annotation it currently provides is not complete.
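The version check described above can be sketched roughly as follows. This is a minimal illustration, not the retriever's actual code: split_version and choose_source are hypothetical names, and latest_version stands in for whatever call queries the default or the GRCh37 API and reports the most recent version of an id. The version numbers in the stub are made up.

```python
def split_version(reference_id):
    """Split 'ENST00000383925.1' into ('ENST00000383925', 1); version is None if absent."""
    base, sep, version = reference_id.partition(".")
    return base, (int(version) if sep else None)


def choose_source(reference_id, latest_version, human=True):
    """Return which API to retrieve from, or None if the requested version is unavailable.

    latest_version(base_id, build) is a hypothetical callable returning the most
    recent version number known for base_id on that build, or None.
    """
    base, version = split_version(reference_id)
    # No version requested, or the default API already serves the requested one.
    if version is None or latest_version(base, "default") == version:
        return "default"
    # Fall back to the GRCh37 API, for human references only.
    if human and latest_version(base, "GRCh37") == version:
        return "GRCh37"
    return None


# Stub with made-up version numbers, standing in for the real API calls.
stub = {("ENST00000383925", "default"): 5, ("ENST00000383925", "GRCh37"): 1}


def lookup(base, build):
    return stub.get((base, build))


print(choose_source("ENST00000383925.1", lookup))
```

Because the default API is checked first, an id whose version is available from both sources resolves to the default one, which mirrors the GRCh38 preference stated above.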

Note that the retriever accepts only stable Ensembl ids, which start with ENS.
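A guard for this rule could look like the sketch below. The function name is hypothetical, and the check is deliberately limited to the ENS prefix mentioned above; it does not validate the rest of the id.

```python
def require_ensembl_id(reference_id):
    """Return the id unchanged if it starts with 'ENS', otherwise raise ValueError."""
    if not reference_id.startswith("ENS"):
        raise ValueError(f"not a stable Ensembl id: {reference_id!r}")
    return reference_id


print(require_ensembl_id("ENST00000383925"))
```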

LRG

LRG references are retrieved from the following location.