TREC 2005 Genomics Track Data

This page lists the files in the distribution for the TREC 2005 Genomics Track. More detail about these files can be found in the 2005 track protocol. The data files themselves are available.

Ad Hoc Retrieval Task



Relevance judgments

Categorization Task


PMID-full text crosswalk

Sample, training, and test files

File contents      Sample output file         Training data file  Test data file
A (allele)         sample.Atrain.txt (18 KB)  Atrain.txt (4 KB)   Atest.txt (4 KB)
E (expression)     sample.Etrain.txt (64 KB)  Etrain.txt (1 KB)   Etest.txt (2 KB)
G (GO annotation)  sample.Gtrain.txt (28 KB)  Gtrain.txt (5 KB)   Gtest.txt (6 KB)
T (tumor)          sample.Ttrain.txt (16 KB)  Ttrain.txt (1 KB)   Ttest.txt (1 KB)

Gene tagging of MEDLINE corpus

Evaluation program cat_eval

The required version of the program is 2.0, updated on Sept. 9, 2005. Sample data for testing the program are listed in the table above. The program is provided as source code and as a Windows executable (see the protocol page for documentation).
Last update - April 6, 2015