NOVAID: Natural-language Observability Visualization Assistant for ITOps Dashboard Widget Generation. Pratik Mishra, Caner Gözübüyük, et al. 2026. IAAI 2026.
Small Models Exhibit Limited Answer Consistency in Repetition Trials of the Multiple-Choice MMLU-Redux and MedQA Benchmarks. Claudio Santos Pinhanez, Paulo Rodrigo Cavalin, et al. 2026. AAAI 2026.
AutoTuneX: Interactive Automated Fine-Tuning for Large Language Models. Daniel Karl I. Weidele, Priyanshu Rai, et al. 2026. AAAI 2026.
ToolSmith: A Multi-Agent Framework for Enterprise Tool Creation. Purna Chandra Sekhar Vakudavathu, Kushal Mukherjee, et al. 2026. AAAI 2026.
SOFAI-LM: A Cognitive Architecture for Building Efficient and Reliable Reasoning Systems with LLMs. Vedant Khandelwal, Francesca Rossi, et al. 2026. AAAI 2026.
Enhancing Geospatial Chain-of-Thought Reasoning in Visual Question Answering Models for Multispectral Remote Sensing Data. Shambhavi Shanker, Manikandan Padmanaban, et al. 2025. AGU 2025.
Eliciting Reasoning in Language Models with Cognitive Tools. Brown Ebouky, Andrea Bartezzaghi, et al. 2025. NeurIPS 2025.
Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods. Isha Puri, Shivchander Sudalairaj, et al. 2025. NeurIPS 2025.
Musings on AI Muses: Support for Human Creativity. John Richards, Jacquelyn Martino, et al. 2025. NeurIPS 2025.