12 results for "performance analysis"
Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis
Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…
An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources
Efficient job-shop scheduling with transportation resources is critical for high-performance manufacturing. With the rise of "decentralized factories", multi-agent reinforcement learning has emerged a…
An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress
As large language models (LLMs) are increasingly deployed in high-stakes and operational settings, evaluation strategies based solely on aggregate accuracy are often insufficient to characterize system r…
The Ithaka Group 1Q26 Performance: Contributors And Detractors
Ithaka's portfolio trailed the Russell 1000 Growth Index in Q1 2026 as tech names faced AI-driven headwinds.…
Third Avenue Real Estate Value Fund Q1 2026 Shareholder Letter
Third Avenue Real Estate Value Fund discusses Q1 2026 performance and new investments in Brookdale and Hang Lung.…
Deep Sail Capital Q1 2026 Investor Letter
Deep Sail Capital Partners analyzes its Q1 2026 performance and the impact of the AI infrastructure supercycle.…
GreensKeeper Value Fund Q1 2026 Performers And Detractors
GreensKeeper Value Fund reports Q1 2026 performance and reviews core holdings like Hershey and Visa.…
An Intelligent Fault Diagnosis Method for General Aviation Aircraft Based on Multi-Fidelity Digital Twin and FMEA Knowledge Enhancement
Fault diagnosis of general aviation aircraft faces challenges including scarce real fault data, diverse fault types, and weak fault signatures. This paper proposes an intelligent fault diagnosis frame…
Don't Make the LLM Read the Graph: Make the Graph Think
We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanab…
Expert Evaluation of LLM's Open-Ended Legal Reasoning on the Japanese Bar Exam Writing Task
Large language models (LLMs) have shown strong performance on legal benchmarks, including multiple-choice components of bar exams. However, their capacity for generating open-ended legal reasoning in …
QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems
We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…
A systematic evaluation of vision-language models for observational astronomical reasoning tasks
Vision-language models (VLMs) are increasingly proposed as general-purpose tools for scientific data interpretation, yet their reliability on real astronomical observations across diverse modalities r…