TREC 2024 Biomedical Generative Retrieval (BioGen) Track

Large language models (LLMs) adapted for the biomedical domain show exceptional performance on many tasks but are also known to provide false information, i.e., hallucinations or confabulations. Such inaccuracies may be particularly harmful in high-risk situations, such as making clinical decisions or appraising biomedical research. The TREC 2024 BioGen task will focus on reference attribution as a means to mitigate the generation of false statements by LLMs.

The goal of the TREC 2024 BioGen task will be to cite references that support each sentence of the LLM output, as well as the overall answer, for each topic. Each run will be scored by the proportion of its sentences and overall answers that have correctly supporting attributions. (If the sentences themselves are not true, then presumably they will not be supported by references.) We will pool the sentences for each topic, along with their references, across all the runs from all the participants. Please see the full task description for more details.
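
As a rough illustration of the sentence-level score described above, the Python sketch below computes the fraction of answer sentences that have at least one cited reference judged as supporting. It is not the official scoring code. The names `Sentence`, `supported_fraction`, and `judged_support`, the document identifiers, and the rule that a single supporting citation is enough are all assumptions made for this example; the full task description defines the actual metric.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Sentence:
    """One sentence of an LLM answer plus the document IDs it cites (hypothetical structure)."""
    text: str
    cited_ids: List[str] = field(default_factory=list)


def supported_fraction(sentences: List[Sentence],
                       judged_support: Set[Tuple[int, str]]) -> float:
    """Return the share of sentences with at least one supporting citation.

    judged_support holds (sentence_index, doc_id) pairs that an assessor
    marked as supporting. Assumption: one supporting citation is enough
    for a sentence to count as supported.
    """
    if not sentences:
        return 0.0
    supported = sum(
        1
        for index, sentence in enumerate(sentences)
        if any((index, doc_id) in judged_support for doc_id in sentence.cited_ids)
    )
    return supported / len(sentences)


# Toy answer with made-up sentences and document IDs.
answer = [
    Sentence("Drug A lowers blood pressure.", ["doc1", "doc2"]),
    Sentence("It is taken once daily.", ["doc3"]),
    Sentence("Side effects are rare.", []),
    Sentence("It interacts with drug B.", ["doc4"]),
]
# Made-up assessor judgments: sentence 0 is supported by doc2, sentence 3 by doc4.
judged_support = {(0, "doc2"), (3, "doc4")}

print(supported_fraction(answer, judged_support))  # 0.5
```

In this toy case two of the four sentences have a supporting citation, so the score is 0.5. A sentence with no citations, or with only unsupported ones, counts against the run.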

Organizers

Bill Hersh, Oregon Health & Science University
Dina Demner-Fushman, National Library of Medicine
Deepak Gupta, National Library of Medicine
Steven Bedrick, Oregon Health & Science University
Kirk Roberts, University of Texas Houston

Timeline (tentative)


Dataset release	June 2024
Topics release	July 2024
Submission deadline	August 30, 2024
Official results	October 2024