Models deployed on HuggingFace or RunPods.
AI & ML interests
LLM Evaluation
Papers
A benchmark for tip-of-the-tongue search and reasoning.
- PatronusAI/lynx-70b-instruct-covidqa-generations
  Viewer • Updated • 1k • 6
- PatronusAI/lynx-70b-instruct-drop-generations
  Viewer • Updated • 1k • 7
- PatronusAI/lynx-70b-instruct-financebench-generations
  Viewer • Updated • 1k • 5
- PatronusAI/lynx-70b-instruct-halueval-generations
  Viewer • Updated • 10k • 6