Philip Walsh

Philip-Walsh
https://philip-walsh.github.io/
  • Philip-Walsh
  • philip-walsh-01

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
upvoted a paper about 2 months ago
Reliable Weak-to-Strong Monitoring of LLM Agents
upvoted a paper about 2 months ago
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools

Organizations

Hugging Face MCP Course

upvoted 4 papers about 2 months ago

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 45

Reliable Weak-to-Strong Monitoring of LLM Agents

Paper • 2508.19461 • Published Aug 26, 2025 • 1

TheMCPCompany: Creating General-purpose Agents with Task-specific Tools

Paper • 2510.19286 • Published Oct 22, 2025 • 8

Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky

Paper • 2507.03336 • Published Jul 4, 2025 • 6