Related references
==================

To retrieve the related reference IDs the following steps are performed:

- Get a summary of the reference id from the NCBI using the Eutils
  `ESummary `_ endpoint (for example `LRG_303 `_).

  - We consider the `nucleotide` database in the query.
  - If the `accessionversion` in the response differs from the input
    reference id, we include it in the related reference ids
    (`NG_008376.4` for the previous `LRG_303` example).
  - The `assemblyacc` value is also considered as related
    (`AC254562.1` for `LRG_303`).
  - If a new version is available, the `replacedby` key is present in the
    response (see `NG_012337.1 `_). In this case we use the ESummary
    endpoint recursively to retrieve all the newer versions.

- Get the NCBI linked uids by using the Eutils `ELink `_ endpoint
  (for example `LRG_303 `__).

  - We use `nucleotide` in both database parameters (`db` and `dbfrom`).
  - We extract all the NCBI uids from the `linksetdbs`.
  - Note that for chromosomes we do not want all the transcript ids. For
    this reason, if `genome` equals `chromosome` in the previous ESummary
    response (for example `NC_000022.11 `_), we consider only the
    `linksetdbs` entries whose `linkname` in the ELink `response `_ is
    `nuccore_nuccore_comp` or `nuccore_nuccore_rsgb`.

- Use the ESummary endpoint with the NCBI uids extracted in the previous
  step to obtain their reference ids (for example `LRG_303 `__).
- If the reference id is from a transcript we make use of the
  `NCBI Datasets REST API `_ to obtain further related references
  (see `NM_003002.2 `_).
- We query the `Ensembl REST API `_ to obtain further related references
  (see `LRG_303 `_). Currently, we consider only the entries for which
  `dbname` is either `ENS_LRG_gene`, `LRG` or `Ens_Hs_gene`.
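The ESummary and ELink rules above can be sketched as three small helpers that work on already decoded JSON records. This is only an illustration, not the actual implementation: the function names, the `fetch_summary` callable and the assumption that each `linksetdbs` entry carries its uids under a `links` key are hypothetical, and the `replacedby` chain is followed with a loop (guarded against cycles) rather than with literal recursion.

```python
# Link names kept for chromosome references (taken from the steps above).
CHROMOSOME_LINKNAMES = {"nuccore_nuccore_comp", "nuccore_nuccore_rsgb"}


def related_from_summary(reference_id, summary):
    """Collect related ids from one ESummary record (a decoded JSON dict)."""
    related = []
    accession = summary.get("accessionversion")
    # Keep the accession only when it differs from the queried id.
    if accession and accession != reference_id:
        related.append(accession)
    assembly = summary.get("assemblyacc")
    if assembly:
        related.append(assembly)
    return related


def newer_versions(summary, fetch_summary):
    """Follow `replacedby` until no newer version is reported.

    `fetch_summary` is a hypothetical callable mapping a reference id to
    its ESummary record; in practice it would wrap the HTTP request.
    """
    versions = []
    seen = set()
    newer = summary.get("replacedby")
    while newer and newer not in seen:
        seen.add(newer)
        versions.append(newer)
        newer = fetch_summary(newer).get("replacedby")
    return versions


def linked_uids(linksetdbs, is_chromosome):
    """Extract uids from ELink `linksetdbs` entries.

    `is_chromosome` would be `summary.get("genome") == "chromosome"`.
    The `links` key is an assumption about the response layout.
    """
    uids = []
    for entry in linksetdbs:
        if is_chromosome and entry.get("linkname") not in CHROMOSOME_LINKNAMES:
            continue
        uids.extend(entry.get("links", []))
    return uids


if __name__ == "__main__":
    # Trimmed record holding only the two values quoted in the text.
    record = {"accessionversion": "NG_008376.4", "assemblyacc": "AC254562.1"}
    print(related_from_summary("LRG_303", record))
    # ['NG_008376.4', 'AC254562.1']
```

Keeping the parsing separate from the HTTP calls makes each rule easy to check against a saved response.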