Documentation

Tutorials

The topic of interest

The topic is defined by a set of related scientific articles. It can be given as a standard PubMed query, as biomedical terms from the Medical Subject Headings (MeSH) database, or as a list of PubMed identifiers (PMIDs). Please see the tutorials on retrieving PMID lists and MeSH terms (see also the MeSH Browser), or visit the PubMed help page for details.
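As an illustration of the first two input forms, the sketch below builds a PubMed query string from MeSH headings and the matching NCBI E-utilities `esearch` URL, which is one possible way to obtain a PMID list outside the web interface. It is not necessarily the procedure described in the tutorials. The helper names `mesh_query` and `esearch_url` and the default `retmax` value are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def mesh_query(terms):
    """Combine MeSH headings into a single PubMed query string (hypothetical helper)."""
    return " AND ".join(f'"{term}"[MeSH Terms]' for term in terms)


def esearch_url(query, retmax=100):
    """Build an esearch URL that would return up to `retmax` PMIDs for `query`."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    query = mesh_query(["Alzheimer Disease", "Inflammation"])
    print(query)
    print(esearch_url(query))
```

Fetching the printed URL returns a JSON document whose `esearchresult.idlist` field holds the PMIDs. The same query string can be pasted directly into the PubMed search box.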

Importantly, results may differ for a given topic depending on the type of query. Querying using PubMed models the topic of interest with the latest articles published in the literature. Querying using MeSH terms models the topic of interest with a random selection of articles manually annotated with the MeSH terms. Thus, the former models the latest research on the topic, while the latter models general properties of the topic.

The chemicals to be ranked

The chemicals (including drugs) are ranked by their relevance to the query topic, which is assessed by mining the related scientific abstracts. The internal list of chemicals comes from the MeSH database.

The first option is to use all biomedical documents published during the last 1-3 years to rank the chemicals discussed in those documents. Links between documents and chemicals are set manually. Each chemical is ranked using the text-mining classification of its related abstracts as relevant or not relevant to the topic of interest.

The second option is to select a few chemicals by name. Chemical names must be entered exactly as defined in MeSH. Please see the 'Chemicals statistics' page for a selected list of names (chemicals associated with at least 100 abstracts) or the MeSH Browser for a full list. With this second option, all chemical-abstract links of the selected chemicals are used in the computation. Please keep the list of selected chemicals small. Otherwise, the computation may take a very long time or produce no output (e.g. a blank HTML page).

Chemical identifiers are provided by the MeSH database as Chemical Abstracts Service (CAS) registry numbers, Enzyme Commission (EC) numbers, or U.S. Food and Drug Administration Substance Registration System Unique Ingredient Identifiers (UNIIs) (see the detailed information page). CAS registry numbers are composed of a number, a hyphen, two digits, a hyphen, and a single final digit (e.g. 14484-47-0). EC numbers start with the letters 'EC'.
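The identifier formats described above can be told apart with a few pattern checks. The following is a minimal sketch and not the tool's own code. The function names are hypothetical. The UNII pattern (ten uppercase alphanumeric characters), the 2 to 7 digit length of the first CAS block, the requirement of a dot after the first EC number, and the CAS check-digit rule are assumptions taken from the general definitions of these identifiers rather than from this page.

```python
import re

# CAS: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit (assumed lengths).
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")
# EC: the letters 'EC', optional whitespace, then a dotted number (assumed shape).
EC_PATTERN = re.compile(r"^EC\s*\d+\.")
# UNII: ten uppercase letters or digits (assumed shape).
UNII_PATTERN = re.compile(r"^[A-Z0-9]{10}$")


def cas_checksum_ok(identifier):
    """Return True if `identifier` looks like a CAS number with a valid check digit."""
    match = CAS_PATTERN.match(identifier)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight digits 1, 2, 3, ... from right to left, then compare modulo 10.
    total = sum(int(d) * weight for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def identifier_type(identifier):
    """Classify a registry identifier as 'CAS', 'EC', 'UNII' or 'unknown'."""
    identifier = identifier.strip()
    if EC_PATTERN.match(identifier):
        return "EC"
    if CAS_PATTERN.match(identifier):
        return "CAS"
    if UNII_PATTERN.match(identifier):
        return "UNII"
    return "unknown"


if __name__ == "__main__":
    for value in ["14484-47-0", "EC 3.4.21.4", "not an identifier"]:
        print(value, "->", identifier_type(value))
```

The check-digit test is optional: it catches most single-digit typing errors in a CAS number, but the ranking itself relies on chemical names, not on these identifiers.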

The ranking relevance is computed by a text-mining algorithm using the number of abstracts related to both the topic and the chemical. Adjusting the P-value cutoff for abstract selection may be critical for the specificity of the chemical selection. With a low cutoff, individual chemicals are more specifically associated with the query topic. Lowering the false discovery rate cutoff selects a smaller final list of chemicals with higher confidence.
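To illustrate how the false discovery rate cutoff shapes the final list, the sketch below applies a Benjamini-Hochberg adjustment to per-chemical P-values and keeps those at or below the cutoff. This page does not state which correction the tool uses, so Benjamini-Hochberg is an assumption here. The function names and the P-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def select_chemicals(chemical_p_values, fdr_cutoff):
    """Keep chemicals whose adjusted P-value is at or below `fdr_cutoff`."""
    names = list(chemical_p_values)
    q_values = benjamini_hochberg([chemical_p_values[name] for name in names])
    kept = [(name, q) for name, q in zip(names, q_values) if q <= fdr_cutoff]
    return sorted(kept, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical P-values for five placeholder chemicals.
    example = {"A": 0.001, "B": 0.008, "C": 0.039, "D": 0.041, "E": 0.6}
    for cutoff in (0.1, 0.05):
        selected = [name for name, _ in select_chemicals(example, cutoff)]
        print(cutoff, selected)
```

With these made-up values, a cutoff of 0.1 keeps four chemicals, while a cutoff of 0.05 keeps only two, which matches the behaviour described above: a stricter cutoff gives a shorter list with higher confidence.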

Chemicals statistics

The Chemicals statistics page shows chemicals that occur with high frequency in the literature. Chemical names, numbers of associated abstracts, and registry numbers are shown for chemicals with at least 100 associated abstracts. The statistics are derived from PubMed data, in particular from its annotations with chemical MeSH terms.
