Large language models (LLMs) adapted for the biomedical domain show exceptional performance on many tasks, but are also known to provide false information, i.e., hallucinations or confabulations. Inaccuracies may be particularly harmful in high-risk situations, such as making clinical decisions or appraising biomedical research. The TREC 2024 BioGen task will focus on reference attribution as a means to mitigate generation of false statements by LLMs.
The goal of the TREC 2024 BioGen task will be to cite references that support each sentence of the LLM output, as well as the overall answer, for each topic. Each run will be scored by the proportion of its sentences and overall answers that have correctly supporting attributions. (Sentences that are not true will presumably not be supported by any reference.) For each topic, we will pool the sentences and their references across all runs from all participants. Please see the full task description for more details.
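As a rough illustration of the sentence-level scoring idea described above, the following minimal sketch computes the proportion of sentences in one answer that are judged to be supported by their cited references. It is not the official evaluation, which is defined in the full task description. The names `Sentence`, `supported`, and `supported_proportion` are hypothetical, the reference identifiers are placeholders, and the `supported` flag stands in for an assessor's judgment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sentence:
    """One sentence of a generated answer (hypothetical structure)."""
    text: str
    # Identifiers of the references cited for this sentence (placeholders).
    references: List[str] = field(default_factory=list)
    # Assumed assessor judgment: True if the cited references support the text.
    supported: bool = False


def supported_proportion(sentences: List[Sentence]) -> float:
    """Return the fraction of sentences judged as correctly supported.

    An empty answer scores 0.0 in this sketch; that is an assumption,
    not a rule taken from the task description.
    """
    if not sentences:
        return 0.0
    return sum(1 for s in sentences if s.supported) / len(sentences)


# Toy answer with three sentences, two of them judged as supported.
answer = [
    Sentence("First claim.", ["ref-A"], supported=True),
    Sentence("Second claim.", ["ref-B", "ref-C"], supported=True),
    Sentence("Third claim.", [], supported=False),
]
print(f"{supported_proportion(answer):.2f}")  # prints 0.67
```

A sentence with no references is simply counted as unsupported here, which matches the remark that untrue or uncited sentences will presumably not earn credit.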
| Milestone | Date |
|---|---|
| Dataset release | June 2024 |
| Topics release | July 2024 |
| Submission deadline | August 30, 2024 |
| Official results | October 2024 |