
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition

⚡ TL;DR · AI summary

The paper introduces NovelAPIBench, a dynamic benchmark designed to evaluate large language models' ability to use novel APIs. It highlights the importance of both retrieval and parametric adaptation in enhancing model performance. The study finds that usage examples are crucial for effective learning, while adding more context can sometimes lead to errors.

Original article: arXiv cs.AI

Opening excerpt (first ~120 words)

arXiv:2606.03657 (cs), Computer Science > Artificial Intelligence
Submitted on 2 Jun 2026

Title: Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition

Authors: Jinnuo Liu, Yue Peng, Jinhan Niu, Hongyi Wen

Abstract: Large language models for code generation often need to use APIs that are absent from their pretraining data. This requires more than recalling a function name: models must coordinate signatures, module paths, input-output contracts, semantics, and executable usage patterns.

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
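
To make the abstract's point concrete, here is a minimal Python sketch of why recalling a function name is not enough. It is not the paper's method, and the excerpt does not describe how NovelAPIBench scores models. The module `novelpkg.geometry`, the function `scale_points`, and the helper `check_usage` are all hypothetical names invented for illustration. The helper checks a candidate call against four of the facets the abstract lists (module path, signature, executable usage, input-output contract) and reports the first one that fails.

```python
import importlib
import inspect
import sys
import types

# Hypothetical "novel" API, registered in-process so the sketch is self-contained.
# Neither the module path nor the function exists in any real library.
def scale_points(points, *, factor=1.0):
    """Return a new list of (x, y) tuples scaled by `factor` (keyword-only)."""
    return [(x * factor, y * factor) for x, y in points]

_geometry = types.ModuleType("novelpkg.geometry")
_geometry.scale_points = scale_points
sys.modules["novelpkg"] = types.ModuleType("novelpkg")
sys.modules["novelpkg.geometry"] = _geometry


def check_usage(module_path, func_name, args, kwargs, expected):
    """Report the first facet of API knowledge that a candidate call gets wrong."""
    try:
        module = importlib.import_module(module_path)      # module path
    except ImportError:
        return "wrong module path"
    func = getattr(module, func_name, None)                 # function name
    if not callable(func):
        return "unknown function"
    try:
        inspect.signature(func).bind(*args, **kwargs)       # signature
    except TypeError:
        return "signature mismatch"
    try:
        result = func(*args, **kwargs)                      # executable usage
    except Exception:
        return "runtime failure"
    # Input-output contract: the call ran, but did it return what was expected?
    return "ok" if result == expected else "contract violated"


# A candidate that knows the name but passes `factor` positionally still fails.
print(check_usage("novelpkg.geometry", "scale_points", ([(1, 2)], 2), {}, [(2, 4)]))
print(check_usage("novelpkg.geometry", "scale_points", ([(1, 2)],), {"factor": 2}, [(2, 4)]))
```

In this sketch the first call has the right function name and module path but is rejected at the signature check, while the second passes every check.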
