The training set

The training set is a set of scientific articles on the same topic. You can write a PubMed query to build a training set automatically. Alternatively, you can use biomedical terms from the Medical Subject Headings (MeSH). Please enter exact terms only; they can be found in the MeSH Browser. The procedure is detailed in the tutorials. The training set can also be defined with your own list of articles, identified by PubMed identifiers (PMIDs). You can get a list of PMIDs in a few clicks from the PubMed interface: go to the PubMed page, run a query, select 'UI List' in the Display menu, and send the results to a file. Please see the related tutorial.
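
If you assemble your own PMID list, it can help to check it before submitting it. The following is a minimal sketch, assuming the list is plain text with one PMID per line; the function name `parse_pmid_list` and the sample identifiers are hypothetical placeholders and are not part of MedlineRanker.

```python
def parse_pmid_list(text):
    """Return the unique PMIDs found in `text`, keeping first-seen order.

    Assumes one identifier per line; blank lines are ignored.
    """
    seen = set()
    pmids = []
    for line in text.splitlines():
        token = line.strip()
        if not token:
            continue  # skip blank lines
        if not token.isdigit():
            # PMIDs are plain integers, so anything else is rejected
            raise ValueError(f"not a PMID: {token!r}")
        if token not in seen:
            seen.add(token)
            pmids.append(token)
    return pmids


# Placeholder identifiers, used only to show the call.
example = "12345678\n\n23456789\n12345678\n"
print(parse_pmid_list(example))
```

Duplicates are dropped and the original order is kept, so the cleaned list can be pasted back as is.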

The background set

The background set should be the whole MEDLINE database or a random selection of articles. You can also provide your own list to account for a bias in the test set.
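
If you provide your own background list, one way to obtain a random selection is to sample from a larger pool of PMIDs. This is only a sketch under stated assumptions: the helper `sample_background`, its parameters, and the idea of sampling locally are illustrative, and it does not describe how MedlineRanker builds its own random selection.

```python
import random


def sample_background(pmids, size, seed=None):
    """Draw `size` distinct PMIDs at random from `pmids`.

    A fixed `seed` makes the selection reproducible.
    """
    unique = sorted(set(pmids))  # remove duplicates, fix the order
    if size > len(unique):
        raise ValueError("size exceeds the number of distinct PMIDs")
    rng = random.Random(seed)  # private generator, global state untouched
    return rng.sample(unique, size)


# Hypothetical pool of identifiers, used only to show the call.
pool = [str(n) for n in range(1000, 1100)]
background = sample_background(pool, 10, seed=0)
print(len(background))
```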

The test set

The test set defines the abstracts that the MedlineRanker program will rank. Ranking all MEDLINE abstracts from the last months or years may take a long time. After the initialization steps, the processing speed is approximately 1 million abstracts per minute (about 2 years of abstracts). The speed may vary depending on the server load.
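
As a rough guide, the ranking time can be estimated from the approximate speed quoted above. The sketch below simply divides the number of abstracts by that rate; the function name is hypothetical, and the result ignores initialization time and server load, so treat it as an order-of-magnitude figure.

```python
ABSTRACTS_PER_MINUTE = 1_000_000  # approximate rate quoted in the text


def estimated_minutes(n_abstracts, rate=ABSTRACTS_PER_MINUTE):
    """Estimate ranking time in minutes, excluding initialization steps."""
    if n_abstracts < 0:
        raise ValueError("n_abstracts must be non-negative")
    return n_abstracts / rate


print(estimated_minutes(5_000_000))  # 5.0
```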
