Related references
The related reference ids are retrieved through the following steps:
Get a summary of the reference id from the NCBI using the Eutils ESummary endpoint (for example LRG_303).
We consider the nucleotide database in the query.
If the accessionversion in the response differs from the input reference id, we include it in the related reference ids (NG_008376.4 for the previous LRG_303 example).
The assemblyacc value is also considered as related (AC254562.1 for LRG_303).
If a new version is available, the replacedby key is present in the response (see NG_012337.1). In this case we use the ESummary endpoint recursively to retrieve all the newer versions.
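The ESummary handling described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name related_from_summary and the fetch_summary callback are hypothetical, and the sketch assumes each ESummary record has already been parsed into a flat dictionary keyed by accessionversion, assemblyacc and replacedby. The XX_ identifiers in the sample data are invented purely to show the recursion.

```python
from typing import Callable, Dict, Optional, Set


def related_from_summary(
    reference_id: str,
    fetch_summary: Callable[[str], Dict[str, str]],
    seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Collect related ids from ESummary records, following `replacedby`.

    `fetch_summary` is a hypothetical callback that returns the parsed
    ESummary record (a dict) for one reference id.
    """
    if seen is None:
        seen = set()
    seen.add(reference_id)  # guards against cycles in `replacedby`

    record = fetch_summary(reference_id)
    related: Set[str] = set()

    # A different accession.version is a related reference.
    accession = record.get("accessionversion")
    if accession and accession != reference_id:
        related.add(accession)

    # The assembly accession is also considered related.
    assembly = record.get("assemblyacc")
    if assembly:
        related.add(assembly)

    # Follow newer versions recursively.
    newer = record.get("replacedby")
    if newer and newer not in seen:
        related.add(newer)
        related |= related_from_summary(newer, fetch_summary, seen)

    return related


# Illustrative records only. The LRG_303 values come from the text above;
# the XX_ chain is invented to demonstrate the recursion.
SAMPLE_SUMMARIES = {
    "LRG_303": {"accessionversion": "NG_008376.4", "assemblyacc": "AC254562.1"},
    "XX_000001.1": {"accessionversion": "XX_000001.1", "replacedby": "XX_000001.2"},
    "XX_000001.2": {"accessionversion": "XX_000001.2", "replacedby": "XX_000001.3"},
    "XX_000001.3": {"accessionversion": "XX_000001.3"},
}


def fetch_sample(reference_id: str) -> Dict[str, str]:
    """Stand-in for a real ESummary request."""
    return SAMPLE_SUMMARIES.get(reference_id, {})


print(sorted(related_from_summary("LRG_303", fetch_sample)))
```

Passing the fetcher as a callback keeps the collection logic independent of how the ESummary request is actually made, which also makes it easy to exercise without network access.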
Get the NCBI linked uids by using the Eutils ELink endpoint (for example LRG_303).
We use nucleotide in both database parameters (db and dbfrom).
We extract all the NCBI uids from the linksetdbs.
Note that for chromosomes we do not want all the transcript ids. For this reason, if genome equals chromosome in the previous ESummary response (for example NC_000022.11), we keep only the uids from the linksetdbs whose linkname is nuccore_nuccore_comp or nuccore_nuccore_rsgb in the ELink response.
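The uid extraction and the chromosome filter might look like the sketch below. It assumes the ELink response was requested as JSON and parsed into a dictionary with a linksets list, each containing linksetdbs entries with linkname and links keys. The function name linked_uids, the is_chromosome flag and the sample uids are hypothetical and only serve the illustration.

```python
from typing import Any, Dict, List

# Linknames kept when the input reference is a chromosome.
CHROMOSOME_LINKNAMES = {"nuccore_nuccore_comp", "nuccore_nuccore_rsgb"}


def linked_uids(elink_response: Dict[str, Any], is_chromosome: bool) -> List[str]:
    """Return the linked uids, in order of appearance and without duplicates."""
    uids: List[str] = []
    for linkset in elink_response.get("linksets", []):
        for linksetdb in linkset.get("linksetdbs", []):
            # For chromosomes, skip every linksetdb outside the allowed set.
            if is_chromosome and linksetdb.get("linkname") not in CHROMOSOME_LINKNAMES:
                continue
            for uid in linksetdb.get("links", []):
                if uid not in uids:
                    uids.append(uid)
    return uids


# Invented response for illustration; the uids and the third linkname
# are placeholders, not real NCBI values.
SAMPLE_ELINK = {
    "linksets": [
        {
            "dbfrom": "nuccore",
            "linksetdbs": [
                {"linkname": "nuccore_nuccore_comp", "links": ["111", "222"]},
                {"linkname": "nuccore_nuccore_rsgb", "links": ["222", "333"]},
                {"linkname": "some_other_linkname", "links": ["444"]},
            ],
        }
    ]
}

print(linked_uids(SAMPLE_ELINK, is_chromosome=True))
print(linked_uids(SAMPLE_ELINK, is_chromosome=False))
```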
Use the ESummary endpoint with the NCBI uids extracted in the previous step to obtain their reference ids (for example LRG_303).
If the reference id belongs to a transcript, we use the NCBI Datasets REST API to obtain further related references (see NM_003002.2).
We query the Ensembl REST API to obtain their related references (see LRG_303). Currently, we consider only the entries for which dbname is ENS_LRG_gene, LRG, or Ens_Hs_gene.
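The dbname filter on the Ensembl entries can be sketched as follows. The sketch assumes the Ensembl response has been parsed into a list of dictionaries that each carry a dbname key. Reading the related reference from primary_id is an assumption made for this illustration, as are the function name ensembl_related and the placeholder ids in the sample data.

```python
from typing import Any, Dict, List

# Database names currently considered, as listed in the text above.
ACCEPTED_DBNAMES = {"ENS_LRG_gene", "LRG", "Ens_Hs_gene"}


def ensembl_related(entries: List[Dict[str, Any]]) -> List[str]:
    """Return the ids of the entries whose dbname is accepted."""
    related: List[str] = []
    for entry in entries:
        if entry.get("dbname") not in ACCEPTED_DBNAMES:
            continue
        # Assumption: the related reference id is stored under `primary_id`.
        reference_id = entry.get("primary_id")
        if reference_id and reference_id not in related:
            related.append(reference_id)
    return related


# Invented entries for illustration; the ids are placeholders.
SAMPLE_ENTRIES = [
    {"dbname": "ENS_LRG_gene", "primary_id": "PLACEHOLDER_LRG_GENE"},
    {"dbname": "Ens_Hs_gene", "primary_id": "PLACEHOLDER_HS_GENE"},
    {"dbname": "OtherDB", "primary_id": "PLACEHOLDER_IGNORED"},
    {"dbname": "LRG", "primary_id": "PLACEHOLDER_LRG_GENE"},
]

print(ensembl_related(SAMPLE_ENTRIES))
```

Keeping the accepted names in a single set means the list can be adjusted in one place if the considered databases change.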