Protocol for evaluating ChatGPT in biomedical association generation and verification using a RAG enabled, cross-model majority voting workflow

Authors: Hamed, Ahmed Abdeen; Rocha, Luis M.
Dates: 2026-06-01; 2026-06-01; 2026-06-19
Record UUID: 86b79572-383d-4002-9814-6a03816ee030
Handle: http://hdl.handle.net/10400.14/57864
Language: eng
Type: research article
DOI: 10.1016/j.xpro.2026.104533
Other identifiers (as recorded): 42133493001770377700001

Abstract: We present a protocol to evaluate ChatGPT’s ability to generate disease-centric biomedical associations. It outlines how we generate the associations, validate the biological entities using biomedical ontologies, and verify associations using literature. The protocol includes a self-consistency strategy to assess generative reliability across ChatGPT models. To address ontology exact-match limitations, we provide a use case performing semantic verification through a workflow enabled by Retrieval-Augmented Generation (RAG) powered by open-source large language models (LLMs). This enables LLMs to establish truth over content generated by other LLMs and expose hallucination.
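
The cross-model majority voting named in the title can be illustrated with a minimal Python sketch. This record does not specify the protocol's actual voting rule, so everything here is an assumption made for illustration: the function name `majority_vote`, the verdict labels, the model names, and the strict-majority threshold are all hypothetical.

```python
from collections import Counter


def majority_vote(verdicts):
    """Return the verdict given by a strict majority of models, else None.

    `verdicts` maps a (hypothetical) model name to its verdict label for
    one generated association. A tie or plurality without a strict
    majority yields None, i.e. no consensus.
    """
    if not verdicts:
        return None
    label, count = Counter(verdicts.values()).most_common(1)[0]
    return label if count > len(verdicts) / 2 else None


# Hypothetical verdicts from three verifier models for a single association.
verdicts = {
    "model_a": "supported",
    "model_b": "supported",
    "model_c": "unsupported",
}
consensus = majority_vote(verdicts)
```

Returning `None` when no strict majority exists keeps unresolved associations separate from those that are rejected outright; a real workflow might instead route them to manual review.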