State of AI Reasoning for Theoretical Physics - Insights from the TPBench Project
Moritz Munchmeyer - University of Wisconsin–Madison
Cappello, F. (2025). EAIRA: Establishing a methodology to evaluate LLMs as research assistants. Perimeter Institute. https://pirsa.org/25040059
Cappello, Frank. EAIRA: Establishing a methodology to evaluate LLMs as research assistants. Perimeter Institute, Apr. 08, 2025, https://pirsa.org/25040059
@misc{ pirsa_PIRSA:25040059,
doi = {10.48660/25040059},
url = {https://pirsa.org/25040059},
author = {Cappello, Frank},
keywords = {},
language = {en},
title = {EAIRA: Establishing a methodology to evaluate LLMs as research assistants},
publisher = {Perimeter Institute},
year = {2025},
month = {apr},
note = {PIRSA:25040059, see \url{https://pirsa.org}}
}