Foundational Theory
🌐 Language: English | 中文
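This directory collects papers and code implementations related to the foundational theory of embodied intelligence.
Main Contents
- Cognitive foundations of embodied intelligence
- Computational models of embodied intelligence
- Learning theory for embodied intelligence
- Evaluation methods for embodied intelligence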
Manually Added Papers
Automatically Updated Papers
Date | Title | Paper | Code | Rating |
---|---|---|---|---|
2025-08-20 | [TransLLM] TransLLM: A Unified Multi-Task Foundation Framework for Urban Transportation via Learnable Prompting | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-20 | [MCP-Universe] MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-20 | Can LLM Agents Solve Collaborative Tasks? A Study on Urgency-Aware Planning and Coordination | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | [CausalPlan] CausalPlan: Empowering Efficient LLM Multi-Agent Collaboration Through Causality-Driven Planning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | [Virtuous Machines] Virtuous Machines: Towards Artificial General Science | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | Structured Prompting and Multi-Agent Knowledge Distillation for Traffic Video Interpretation and Risk Inference | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | [Tooth-Diffusion] Tooth-Diffusion: Guided 3D CBCT Synthesis with Fine-Grained Tooth Conditioning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-19 | [RynnEC] RynnEC: Bringing MLLMs into Embodied World | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-18 | Contrastive Representations for Temporal Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-18 | Hierarchical Evaluation Function (HEF): A Multi-Metric Approach for Optimizing Demand Forecasting Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-17 | [MedKGent] MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-16 | [FutureX] FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-15 | Embodied Edge Intelligence Meets Near Field Communication: Concept, Design, and Verification | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | [ComoRAG] ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | [LeanRAG] LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | [FROGENT] FROGENT: An End-to-End Full-process Drug Design Agent | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | Promoting Efficient Reasoning with Verifiable Stepwise Reward | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | A Unified Multi-Agent Framework for Universal Multimodal Understanding and Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-14 | [Retro-Expert] Retro-Expert: Collaborative Reasoning for Interpretable Retrosynthesis | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-13 | [EvoCurr] EvoCurr: Self-evolving Curriculum with Behavior Code Generation for Complex Decision-making | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-13 | [GoViG] GoViG: Goal-Conditioned Visual Navigation Instruction Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-13 | RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-12 | [Efficient Agent] Efficient Agent: Optimizing Planning Capability for Multimodal Retrieval Augmented Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-12 | Designing Memory-Augmented AR Agents for Spatiotemporal Reasoning in Personalized Task Assistance | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-12 | [BrowseMaster] BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-12 | Rational Inverse Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-11 | Breaking Down and Building Up: Mixture of Skill-Based Vision-and-Language Navigation Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-11 | [InterChart] InterChart: Benchmarking Visual Reasoning Across Decomposed and Distributed Chart Information | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-11 | [GVGAI-LLM] GVGAI-LLM: Evaluating Large Language Model Agents with Infinite Games | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-11 | [TeamMedAgents] TeamMedAgents: Enhancing Medical Decision-Making of LLMs Through Structured Teamwork | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-11 | Progressive Bird’s Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-09 | Simulating Biological Intelligence: Active Inference with Experiment-Informed Generative Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-08 | Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | Information-Theoretic Graph Fusion with Vision-Language-Action Model for Policy Reasoning and Dual Robotic Control | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | [GRAIL] GRAIL: Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | A Novel Architecture for Symbolic Reasoning with Decision Trees and LLM Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | [R-Zero] R-Zero: Self-Evolving Reasoning LLM from Zero Data | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | Do Robots Really Need Anthropomorphic Hands? | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-07 | Resource-Limited Joint Multimodal Sentiment Reasoning and Classification via Chain-of-Thought Enhancement and Distillation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-06 | [DRIVE] DRIVE: Dynamic Rule Inference and Verified Evaluation for Constraint-Aware Autonomous Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-06 | [OmniPlay] OmniPlay: Benchmarking Omni-Modal Models on Omni-Modal Game Playing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-06 | [ViFP] ViFP: A Framework for Visual False Positive Detection to Enhance Reasoning Reliability in VLMs | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-06 | Synthetic POMDPs to Challenge Memory-Augmented RL: Memory Demand Structure Modeling | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-06 | [Voost] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | [AGENTiGraph] AGENTiGraph: A Multi-Agent Knowledge Graph Framework for Interactive, Domain-Specific LLM Chatbots | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | [Tree-of-Reasoning] Tree-of-Reasoning: Towards Complex Medical Diagnosis via Multi-Agent Reasoning with Evidence Tree | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | [UniFucGrasp] UniFucGrasp: Human-Hand-Inspired Unified Functional Grasp Annotation Strategy and Dataset for Diverse Dexterous Hands | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | A Survey of AI Agent Registry Solutions | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | [ToolVQA] ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-05 | [Point2Act] Point2Act: Efficient 3D Distillation of Multimodal LLMs for Zero-Shot Context-Aware Grasping | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-04 | [NaviMaster] NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-04 | Clinically Grounded Agent-based Report Evaluation: An Interpretable Metric for Radiology Report Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-02 | A Survey on Agent Workflow – Status and Future | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-02 | Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-02 | [WinkTPG] WinkTPG: An Execution Framework for Multi-Agent Path Finding Using Temporal Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-01 | [UAV-ON] UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-08-01 | [REACT] REACT: A Real-Time Edge-AI Based V2X Framework for Accident Avoidance in Autonomous Driving System | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-31 | Towards Affordable Tumor Segmentation and Visualization for 3D Breast MRI Using SAM2 | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-31 | [MPCC] MPCC: A Novel Benchmark for Multimodal Planning with Complex Constraints in Multimodal Large Language Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-31 | [SimuRA] SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-31 | Punching Bag vs. Punching Person: Motion Transferability in Videos | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-30 | Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-30 | Early Goal-Guided Multi-Scale Fusion for Real-Time Vision-Language Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-29 | Large Language Models for Wireless Communications: From Adaptation to Autonomy | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-29 | Exploring the Link Between Bayesian Inference and Embodied Intelligence: Toward Open Physical-World Embodied AI Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-29 | Hebbian Memory-Augmented Recurrent Networks: Engram Neurons in Deep Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-29 | Foundation Models for Demand Forecasting via Dual-Strategy Ensembling | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-28 | Free Energy-Inspired Cognitive Risk Integration for AV Navigation in Pedestrian-Rich Environments | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-28 | A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-28 | Enhancing QoS in Edge Computing through Federated Layering Techniques: A Pathway to Resilient AI Lifelong Learning Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-28 | Advancing Compositional LLM Reasoning with Structured Task Relations in Interactive Multimodal Communications | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-28 | Projecting the New Body: How Body Image Evolves During Learning to Walk with a Wearable Robot | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-27 | [VLMPlanner] VLMPlanner: Integrating Visual Language Models with Motion Planning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-25 | [OS-MAP] OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth? | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-24 | [E.A.R.T.H.] E.A.R.T.H.: Structuring Creative Evolution through Model Error in Generative AI | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-24 | DepthDark: Robust Monocular Depth Estimation for Low-Light Environments | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-24 | A Foundation Model for Massive MIMO Precoding with an Adaptive per-User Rate-Power Tradeoff | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-24 | Evaluation of facial landmark localization performance in a surgical setting | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-23 | Agent Identity Evals: Measuring Agentic Identity | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-23 | Dynamic Modeling and Dimensional Optimization of Legged Mechanisms for Construction Robot | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-23 | Confidence Calibration in Vision-Language-Action Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-22 | Design and Dimensional Optimization of Legged Structures for Construction Robots | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-22 | Towards Robust Foundation Models for Digital Pathology | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | [EgoPrune] EgoPrune: Efficient Token Pruning for Egomotion Video Reasoning in Embodied Agent | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | [HAMLET] HAMLET: Hyperadaptive Agent-based Modeling for Live Embodied Theatrics | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | Scaling Decentralized Learning with FLock | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | [Data Mixing Agent] Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-21 | Strong, Accurate, and Low-Cost Robot Manipulator | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-20 | TriCLIP-3D: A Unified Parameter-Efficient Framework for Tri-Modal 3D Visual Grounding based on CLIP | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-18 | A Recursive Lie-Group Formulation for the Second-Order Time Derivatives of the Inverse Dynamics of parallel Kinematic Manipulators | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-17 | [SE-VLN] SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-17 | [FormulaOne] FormulaOne: Measuring the Depth of Algorithmic Reasoning Beyond Competitive Programming | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-16 | [Aime] Aime: Towards Fully-Autonomous Multi-Agent Framework | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-16 | Assessing the Value of Visual Input: A Benchmark of Multimodal Large Language Models for Robotic Path Planning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-16 | [MindJourney] MindJourney: Test-Time Scaling with World Models for Spatial Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-15 | [CogDDN] CogDDN: A Cognitive Demand-Driven Navigation with Decision Optimization and Dual-Process Thinking | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-15 | FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-15 | Galaxy image simplification using Generative AI | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-14 | [Hyper-Dexterous] Demonstrating the Octopi-1.5 Visual-Tactile-Language Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-14 | Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-14 | Supporting SENĆOTEN Language Documentation Efforts with Automatic Speech Recognition | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-11 | [KG-Attention] KG-Attention: Knowledge Graph-Guided Attention at Test-Time via Bidirectional Information Aggregation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-11 | Making VLMs More Robot-Friendly: Self-Critical Distillation of Low-Level Procedural Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-11 | [BioAnalyst] BioAnalyst: A Foundation Model for Biodiversity | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-10 | On the capabilities of LLMs for classifying and segmenting time series of fruit picking motions into primitive actions | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-10 | ROS Help Desk: GenAI Powered, User-Centric Framework for ROS Error Diagnosis and Debugging | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-09 | Foundation Model Self-Play: Open-Ended Strategy Innovation via Foundation Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-09 | [Gradientsys] Gradientsys: A Multi-Agent LLM Scheduler with ReAct Orchestration | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-09 | Temporal Information Retrieval via Time-Specifier Model Merging | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-09 | A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-09 | Frontier LLMs Still Struggle with Simple Reasoning Tasks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-08 | [NeoBabel] NeoBabel: A Multilingual Open Tower for Visual Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-08 | Graph Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-08 | SCCRUB: Surface Cleaning Compliant Robot Utilizing Bristles | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-07 | Rule Learning for Knowledge Graph Reasoning under Agnostic Distribution Shift | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-07 | On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-07 | [DeepRetro] DeepRetro: Retrosynthetic Pathway Discovery using Iterative LLM Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-06 | “Hi AirStar, Guide Me to the Badminton Court.” | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-06 | [ZERO] ZERO: Multi-modal Prompt-based Visual Grounding | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-05 | Accurate and Efficient World Modeling with Masked Latent Transformers | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-04 | [LTLCrit] LTLCrit: A Temporal Logic-based LLM Critic for Safe and Efficient Embodied Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-04 | [BMMR] BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-04 | Effects of structure on reasoning in instance-level Self-Discover | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-03 | [CyberRAG] CyberRAG: An agentic RAG cyber attack classification and reporting tool | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-03 | Knowledge Graph-Based Explainable and Generalized Zero-Shot Semantic Communications | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-02 | [RALLY] RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-02 | A Survey on Vision-Language-Action Models: An Action Tokenization Perspective | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-01 | Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-01 | Cognitive Load-Aware Inference: A Neuro-Symbolic Framework for Optimizing the Token Economy of Large Language Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-07-01 | Large Language Model Powered Intelligent Urban Agents: Concepts, Capabilities, and Applications | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-30 | A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-30 | [PokéAI] PokéAI: A Goal-Generating, Battle-Optimizing Multi-agent System for Pokemon Red | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-30 | A Survey on Vision-Language-Action Models for Autonomous Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-30 | Towards foundational LiDAR world models with efficient latent flow matching | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-30 | SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-29 | Are Large Language Models Capable of Deep Relational Reasoning? Insights from DeepSeek-R1 and Benchmark Comparisons | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-29 | SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-29 | [UrbanLLaVA] UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-27 | Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-27 | [SPADE] SPADE: Spatial Transcriptomics and Pathology Alignment Using a Mixture of Data Experts for an Expressive Latent Space | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-26 | [Agent-RewardBench] Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-26 | Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-26 | [THE-Tree] THE-Tree: Can Tracing Historical Evolution Enhance Scientific Verification and Reasoning? | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-26 | SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-25 | The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-25 | Leveraging Vision-Language Models to Select Trustworthy Super-Resolution Samples Generated by Diffusion Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-25 | [IMA-Catcher] IMA-Catcher: An IMpact-Aware Nonprehensile Catching Framework based on Combined Optimization and Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-25 | Generating and Customizing Robotic Arm Trajectories using Neural Networks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-24 | Robotic Perception with a Large Tactile-Vision-Language Model for Physical Property Inference | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-24 | Is an object-centric representation beneficial for robotic manipulation? | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-24 | [UniTac-NV] UniTac-NV: A Unified Tactile Representation For Non-Vision-Based Tactile Sensors | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-24 | Robust Embodied Self-Identification of Morphology in Damaged Multi-Legged Robots | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-24 | Evolutionary Gait Reconfiguration in Damaged Legged Robots | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | Optimization-Induced Dynamics of Lipschitz Continuity in Neural Networks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | [TAMMs] TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | [Matrix-Game] Matrix-Game: Interactive World Foundation Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | Dynamic Knowledge Exchange and Dual-diversity Review: Concisely Unleashing the Potential of a Multi-Agent Research Team | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | [Drive-R1] Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-23 | Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-21 | [DRAMA-X] DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-20 | With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-20 | Kinematic Model Optimization via Differentiable Contact Manifold for In-Space Manipulation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-18 | [EmojiVoice] EmojiVoice: Towards long-term controllable expressivity in robot speech | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-18 | [MEM1] MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-18 | [KG-FGNN] KG-FGNN: Knowledge-guided GNN Foundation Model for Fertilisation-oriented Soil GHG Flux Prediction | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-17 | [EVA02-AT] EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-17 | From Points to Places: Towards Human Mobility-Driven Spatiotemporal Foundation Models via Understanding Places | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-16 | Knowledge Graph Fusion with Large Language Models for Accurate, Explainable Manufacturing Process Planning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-16 | Towards a Formal Specification for Self-organized Shape Formation in Swarm Robotics | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-16 | Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-16 | IKDiffuser: Fast and Diverse Inverse Kinematics Solution Generation for Multi-arm Robotic Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-15 | Building Trustworthy AI by Addressing its 16+2 Desiderata with Goal-Directed Commonsense Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-14 | [AgentOrchestra] AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-14 | A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-13 | Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | Reliable Reasoning Path: Distilling Effective Guidance for LLM Reasoning with Knowledge Graphs | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | [LogiPlan] LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | [SlotPi] SlotPi: Physics-informed Object-centric Reasoning Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | [RICE] RICE: Reactive Interaction Controller for Cluttered Canopy Environment | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | An $O(n)$-Algorithm for the Higher-Order Kinematics and Inverse Dynamics of Serial Manipulators using Spatial Representation of Twists | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-12 | TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | [OctoNav] OctoNav: Towards Generalist Embodied Navigation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | [CausalVQA] CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | Intelligent Design 4.0: Paradigm Evolution Toward the Agentic AI Era | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | Know What You Don’t Know: Uncertainty Calibration of Process Reward Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-11 | [HopaDIFF] HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-10 | Hybrid Reasoning for Perception, Explanation, and Autonomous Action in Manufacturing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-10 | ROS-related Robotic Systems Development with V-model-based Application of MeROS Metamodel | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-09 | SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-09 | Reproducibility in the Control of Autonomous Mobility-on-Demand Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | Learn as Individuals, Evolve as a Team: Multi-agent LLMs Adaptation in Embodied Environments | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | [Mind the Web] Mind the Web: The Security of Web Use Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | Less is More: some Computational Principles based on Parcimony, and Limitations of Natural Intelligence | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | LLM-Enhanced Rapid-Reflex Async-Reflect Embodied Agent for Real-Time Decision-Making in Dynamically Changing Environments | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-08 | [Theorem-of-Thought] Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-06 | [MOGO] MOGO: Residual Quantized Hierarchical Causal Transformer for High-Quality and Real-Time 3D Human Motion Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-06 | Object Navigation with Structure-Semantic Reasoning-Based Multi-level Map and Multimodal Decision-Making LLM | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-06 | Multi-Modal Multi-Task Federated Foundation Models for Next-Generation Extended Reality Systems: Towards Privacy-Preserving Distributed Intelligence in AR/VR/MR | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-05 | Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-05 | [CzechLynx] CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-05 | Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-05 | [MORSE-500] MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-05 | Towards provable probabilistic safety for scalable embodied AI systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-04 | [SemNav] SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-04 | [AssetOpsBench] AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-04 | Enhancing Safety of Foundation Models for Visual Navigation through Collision Avoidance via Repulsive Estimation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-04 | [“Don’t Do That!”] “Don’t Do That!”: Guiding Embodied Systems through Large Language Model-based Constraint Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-04 | A Framework Leveraging Large Language Models for Autonomous UAV Control in Flying Networks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-03 | Corrigibility as a Singular Target: A Vision for Inherently Reliable Foundation Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-03 | Geometric Visual Servo Via Optimal Transport | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-02 | [iQUEST] iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-06-02 | [Fire360] Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-30 | [GridRoute] GridRoute: A Benchmark for LLM-Based Route Planning with Cardinal Movement in Grid Environments | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-30 | Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-30 | P: A Universal Measure of Predictive Intelligence | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-30 | Bi-Manual Joint Camera Calibration and Scene Representation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | Conceptual Framework Toward Embodied Collective Adaptive Intelligence | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | [GAM-Agent] GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | [MAPLE] MAPLE: A Mobile Assistant with Persistent Finite State Machines for Recovery Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | Eye-tracking-Driven Shared Control for Robotic Arms: Wizard of Oz Studies to Assess Design Choices | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | [Data-to-Dashboard] Data-to-Dashboard: Multi-Agent LLM Framework for Insightful Visualization in Enterprise Analytics | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | Representing local protein environments with atomistic foundation models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-29 | [MAPLE] MAPLE: A Mobile Agent with Persistent Finite State Machines for Structured Task Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | [3DLLM-Mem] 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | [EPiC] EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | [WorkForceAgent-R1] WorkForceAgent-R1: Incentivizing Reasoning Capability in LLM-based Web Agents via Reinforcement Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | New Tools are Needed for Tracking Adherence to AI Model Behavioral Use Clauses | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | Spring-Brake! Handed Shearing Auxetics Improve Efficiency of Hopping and Standing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-28 | [ASyMOB] ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-27 | Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-27 | [RRO] RRO: LLM Agent Optimization Through Rising Reward Trajectories | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-27 | [CoDA] CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-27 | Prostate Cancer Screening with Artificial Intelligence-Enhanced Micro-Ultrasound: A Comparative Study with Traditional Methods | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-27 | Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-26 | [ReasonPlan] ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-26 | Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-26 | [RFTF] RFTF: Reinforcement Fine-tuning for Embodied Agents with Temporal Feedback | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-26 | [SaSi] SaSi: A Self-augmented and Self-interpreted Deep Learning Approach for Few-shot Cryo-ET Particle Detection | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-26 | [DFIR-Metric] DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-25 | [LIMOPro] LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-23 | [CXReasonBench] CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-23 | Controlled Agentic Planning & Reasoning for Mechanism Synthesis | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-23 | [en] HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-23 | [en] Knot So Simple: A Minimalistic Environment for Spatial Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | [Date Fragments] Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | [SpatialScore] SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | [SPaRC] SPaRC: A Spatial Pathfinding Reasoning Challenge | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | [en] SEM: Enhancing Spatial Understanding for Robust Robot Manipulation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-22 | [Beyond Correlation] Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-21 | [VERDI] VERDI: VLM-Embedded Reasoning for Autonomous Driving | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-20 | [KORGym] KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-20 | [en] Memory-Centric Embodied Question Answer | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-20 | Visual Agentic Reinforcement Fine-Tuning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-19 | [PLAICraft] PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-19 | [MM-PRM] MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-19 | Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-19 | [PEER pressure] PEER pressure: Model-to-Model Regularization for Single Source Domain Generalization | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-19 | [ARIW-Framework] ARIW-Framework: Adaptive Robust Iterative Watermarking Framework | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-18 | [en] BeliefNest: A Joint Action Simulator for Embodied Agents with Theory of Mind | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-16 | [PoE-World] PoE-World: Compositional World Modeling with Products of Programmatic Experts | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-16 | A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron? | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-16 | [en] PARSEC: Preference Adaptation for Robotic Object Rearrangement from Scene Context | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-16 | [en] Multi-Modal Multi-Task (M3T) Federated Foundation Models for Embodied AI: Potentials and Challenges for Edge Integration | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-16 | [en] Real-Time Verification of Embodied Reasoning for Generative Skill Acquisition | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-14 | [EWMBench] EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-13 | [ARC-NCA] ARC-NCA: Towards Developmental Solutions to the Abstraction and Reasoning Corpus | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | [LAMM-ViT] LAMM-ViT: AI Face Detection via Layer-Aware Modulation of Region-Guided Attention | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | Learning from Peers in Reasoning Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | Learning Dynamics in Continual Pre-Training for Large Language Models | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | Joint Graph Convolution and Sequential Modeling for Scalable Network Traffic Estimation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | Towards user-centered interactive medical image segmentation in VR with an assistive AI agent | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-12 | [en] Cooperative Assembly with Autonomous Mobile Manipulators in an Underwater Scenario | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-10 | [en] STRIVE: Structured Representation Integrating VLM Reasoning for Efficient Object Navigation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Towards Robust Few-Shot Text Classification Using Transformer Architectures and Dual Loss Strategies | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Adapting a Segmentation Foundation Model for Medical Image Classification | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Ohana trees and Taylor expansion for the $λ$I-calculus. No variable gets left behind or forgotten! | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | [en] Neuro-Symbolic Concepts | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Representation of tensor functions using low-order structural tensor set: two-dimensional point group | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Revisiting the connection of baryon number, lepton number, and operator dimension | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Rydberg atomic spectrum analyzer with microwave-dressed-state-locking and multimode Floquet theory | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Preferential Attachment Trees with Vertex Death: Persistence of the Maximum Degree | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-09 | Bi-LSTM based Multi-Agent DRL with Computation-aware Pruning for Agent Twins Migration in Vehicular Embodied AI Networks | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation | [pdf] | yc4ny/SVAD | ⭐️⭐️⭐️ |
2025-05-08 | A Survey | [pdf] | hzxie/awesome-3d-scene-generation | ⭐️⭐️⭐️ |
2025-05-08 | Predicting Structure and Motion via Ray Origin and Endpoint Diffusion | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | An Omni Foundation Model for Interleaved Multi-Modal Generation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Evaluating Legally Consistent Bias in Machine Learning | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Training Flow Matching Models via Online RL | [pdf] | yifan123/flow_grpo | ⭐️⭐️⭐️ |
2025-05-08 | Generating Physically Stable and Buildable LEGO Designs from Text | [pdf] | AvaLovelace1/LegoGPT | ⭐️⭐️⭐️ |
2025-05-08 | A Solovay-Kitaev theorem for quantum signal processing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Comparison of integral equations used to study $T_{cc}^+$ | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Preference Alignment via Comparison Oracles | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Understanding Perception and Reasoning through Model Merging | [pdf] | shiqichen17/vlm_merging | ⭐️⭐️⭐️ |
2025-05-08 | Primordial black-hole formation and heavy r-process element synthesis from the cosmological QCD transition. Two aspects of an inhomogeneous early Universe | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Marsden–Meyer–Weinstein reduction for $k$-contact field theories | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Representation Stability for Marked Graph Complexes | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Reduced Basis Method for Driven-Dissipative Quantum Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | A Dataset of Misleading Narratives Surrounding Recent UK General Elections | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Emergence of Spin-Polarized Unconventional Skin Effect in Hatano-Nelson Model | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | A Study on Improvement of Image Quality in Quantum Polarized Microscopy using an Entangled-Photon Source | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | towards Spatial Intelligence Thorough Evaluation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Calculation of ground state energy of Lithium and Beryllium based on variational method | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Resolution of the Solar Convective Conundrum? New Results Using the Time-Distance Deep-Focus Method | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Conversational Process Model Redesign | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Reinforcement Learning-Driven Data Assimilation with Uncertainty-Aware Constrained Ensembles | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Novel Forms of Early Dark Energy | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Manifest Gauge Invariance for Structure Dependent Radiative Corrections to Processes Involving Atoms and Nuclei | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Quantum effects in rotating thermal states on anti-de Sitter space-time | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | todd | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | a cosmic explosion with a complex off-axis jet and cocoon from a massive progenitor | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | implications for the observed abundance of ultra-violet luminous galaxies at z>10 | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Stabilization of Kac polynomials | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Scalable Bernoulli factories for Bayesian inference with intractable likelihoods | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Cell size heterogeneity controls crystallization of the developing fruit fly wing | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | The effective energy of a lattice metamaterial | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Asymmetric decay of quantum many-body scars in XYZ quantum spin chains | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Artifact Sharing for Information Retrieval Research | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Non-Markovianity in collision models with initial intra-environment correlations | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Boundary Energy-Momentum Tensors for Asymptotically Flat Spacetimes | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Statistical Characterization of Entanglement Degradation Under Markovian Noise in Composite Quantum Systems | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Two-dimensional water waves with constant vorticity and general bottom topography | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Theoretical modeling of approximate universality of tidally deformed neutron stars | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Empowering Scientific Workflows with Federated Agents | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Efficient Data Filtering and Verification for High-Quality LLM Training Data | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | Sideways on the highways | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | On differentiation of integrals in Lebesgue spaces | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | An efficient second-order cone programming approach for dynamic optimal transport on staggered grid discretization | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
2025-05-08 | an LLM-based Literary Translation evaluation metric with Professional Question Answering | [pdf] | ⚠️ | ⭐️⭐️⭐️ |
📊 Statistics
- Total papers: 346
- Code implementations: 5
- Last updated: August 2025
Last updated: 2025-08-22