The topic of interest

The topic is defined by a set of related scientific articles. It can be given as a standard PubMed query, as biomedical terms from the Medical Subject Headings (MeSH) database, or as a list of PubMed identifiers (PMIDs). See the tutorials on retrieving PMID lists and MeSH terms (see also the MeSH Browser), or visit the PubMed help page for details.
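
As an illustration of the three input forms, the sketch below turns a free-text query or a MeSH term into a search URL for the public NCBI E-utilities `esearch` endpoint, and validates a pasted list of PMIDs. It is not a description of how this tool queries PubMed internally; the function names `topic_to_esearch_url` and `parse_pmid_list` and the default `retmax` are assumptions made for the example.

```python
import re
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def topic_to_esearch_url(query=None, mesh_term=None, retmax=1000):
    """Build an esearch URL from a PubMed query or from a single MeSH term.

    Hypothetical helper: exactly one of `query` or `mesh_term` must be given.
    """
    if (query is None) == (mesh_term is None):
        raise ValueError("give exactly one of query or mesh_term")
    # The [MeSH Terms] field tag restricts the search to MeSH annotations.
    term = query if query is not None else f'"{mesh_term}"[MeSH Terms]'
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def parse_pmid_list(text):
    """Split a comma- or whitespace-separated string into a list of PMIDs."""
    pmids = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        if not token.isdigit():
            raise ValueError(f"not a PMID: {token!r}")
        pmids.append(token)
    return pmids
```

The URL is only built here, not fetched, so the sketch runs without network access.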

The genes to be ranked

The selected genes are ranked by their relevance to the query topic. Relevance is computed by a text mining algorithm from the number of abstracts related to both the topic and the gene. Adjusting the P-value cutoff for abstract selection may be critical for the specificity of gene selection: with low cutoff values, individual genes will be more specifically associated with the query topic. Lowering the false discovery rate cutoff yields a smaller final list of genes. See all taxonomic identifiers in the NCBI Taxonomy database, or all gene identifiers in the NCBI Entrez Gene database.
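
The effect of the false discovery rate cutoff can be sketched with a small example. The text above does not name the statistical test, so the code assumes a one-sided hypergeometric test on abstract counts followed by a Benjamini-Hochberg correction; both choices, and all function and parameter names, are assumptions for illustration rather than the tool's actual algorithm.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n abstracts are drawn from N, of which K are topic-related."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_genes(gene_counts, total_abstracts, topic_abstracts, fdr_cutoff=0.01):
    """Rank genes by adjusted p-value and keep those at or below the cutoff.

    gene_counts maps a gene to (topic_hits, gene_abstracts): how many of the
    gene's abstracts were classified as topic-related, and how many it has.
    """
    genes = list(gene_counts)
    pvals = [
        hypergeom_sf(hits, total_abstracts, topic_abstracts, n)
        for hits, n in (gene_counts[g] for g in genes)
    ]
    qvals = benjamini_hochberg(pvals)
    ranked = sorted(zip(genes, qvals), key=lambda pair: pair[1])
    return [(gene, q) for gene, q in ranked if q <= fdr_cutoff]
```

Under these assumptions, a stricter `fdr_cutoff` can only shorten the returned list, which matches the behaviour described above.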

Performance boost

A performance boost is available to the document classifier when a MeSH term is used to define the input topic. In this case, abstracts annotated by NLM's curators with the same MeSH term are directly assigned a p-value of zero (shown as ♥ in the output page). If the MeSH term is not found in the annotations, or if the topic is defined in another way (PubMed query or list of PMIDs), the document classifier uses only words from the abstract for scoring.
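
The shortcut described above amounts to a simple branch, sketched below. The abstract layout (a dict with `mesh_terms` and `text` keys), the `abstract_pvalue` name, and the `text_classifier` callable are hypothetical stand-ins, not the tool's real data model.

```python
def abstract_pvalue(abstract, topic_mesh_term, text_classifier):
    """Return the p-value linking one abstract to the topic.

    abstract: dict with 'mesh_terms' (set of str) and 'text' (str); assumed layout.
    topic_mesh_term: the MeSH term defining the topic, or None when the topic
        was given as a PubMed query or a list of PMIDs.
    text_classifier: callable mapping abstract text to a p-value (placeholder).
    """
    # Curated annotation with the same MeSH term: no text scoring is needed.
    if topic_mesh_term is not None and topic_mesh_term in abstract["mesh_terms"]:
        return 0.0
    # Otherwise fall back to scoring the words of the abstract.
    return text_classifier(abstract["text"])
```

Abstracts that take the first branch skip the text scoring step entirely.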

Gene literature expansion from ortholog genes

Abstracts retrieved for a gene can be extended with abstracts of its orthologous genes. Selecting the target species again has no effect. This feature works only among species from release 68 of the NCBI HomoloGene database.
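
A minimal sketch of the expansion, assuming the ortholog relations have already been loaded into plain dictionaries. The function name, the data layout, and the gene labels in the usage example are hypothetical; the explicit skip of the target species is one possible way to model the "no effect" rule stated above.

```python
def expand_gene_abstracts(gene_id, target_species, ortholog_species,
                          gene_pmids, homologs):
    """Return the PMIDs of a gene plus those of its selected orthologs.

    gene_pmids: dict mapping gene id -> iterable of PMIDs (assumed layout).
    homologs: dict mapping gene id -> iterable of (other_gene_id, species) pairs.
    ortholog_species: species chosen as sources of additional abstracts.
    """
    pmids = set(gene_pmids.get(gene_id, ()))
    for other_id, other_species in homologs.get(gene_id, ()):
        # Re-selecting the target species adds nothing.
        if other_species == target_species:
            continue
        if other_species in ortholog_species:
            pmids |= set(gene_pmids.get(other_id, ()))
    return pmids


# Usage with NCBI Taxonomy identifiers 9606 (human) and 10090 (mouse);
# the gene labels and PMIDs are made up.
gene_pmids = {"human_gene": {"1", "2"}, "mouse_gene": {"2", "3"}}
homologs = {"human_gene": [("mouse_gene", 10090)]}
expanded = expand_gene_abstracts("human_gene", 9606, {10090}, gene_pmids, homologs)
```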

