Agent Memory Daily Paper Digest - 2026-05-09

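This report is generated automatically from papers.cool/arxiv/cs.AI

Filter criteria: the title or abstract contains keywords such as agent, memory, RAG, or episodic memory

Generated at: 2026/5/9 11:30:42

📊 Today's Overview

Total papers scanned: 25
Agent Memory related: 12

📝 Related Papers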

1. AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

arXiv ID: 2605.06651
Keywords: mathematician, mathematicians, agentic, mathematical, workflows, workbench, ideation, frontiermath, accelerating, stateful

2. MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems

arXiv ID: 2605.06623
Keywords: maspo, prompts, prompt, agent, llm, agents, wangzx1219, joint, across, optimization

3. SkillOS: Learning Skill Curation for Self-Evolving Agents

arXiv ID: 2605.06614
Keywords: skill, curation, skillos, curator, skillrepo, skills, executor, agents, tasks, evolving

4. NeuroAgent: LLM Agents for Multimodal Neuroimaging Analysis and Research

arXiv ID: 2605.06584
Keywords: neuroagent, neuroimaging, preprocessing, analysis, smri, llm, multimodal, 470, fmri, pet

5. Market-Alignment Risk in Pricing Agents: Trace Diagnostics and Trace-Prior RL under Hidden Competitor State

arXiv ID: 2605.06529
Keywords: hotel, revpar, pricing, competitor, trace, market, revenue, adr, prior, failure

6. Process Matters more than Output for Distinguishing Humans from Machines

arXiv ID: 2605.06524
Keywords: process, human, cognitive, task, humans, machines, agents, fine, tuning, mimicry

7. Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors

arXiv ID: 2605.06490
Keywords: behaviour, instrumental, agents, propensity, 680, dangerous, stakes, choices, roleplay, measuring

8. Beyond Task Success: Measuring Workflow Fidelity in LLM-Based Agentic Payment Systems

arXiv ID: 2605.06457
Keywords: tsr, hf1, payment, agentic, asr, agent, checkpoint, success, workflow, llm

9. PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors

arXiv ID: 2605.06455
Keywords: prefixguard, prefix, webarena, auprc, monitor, terminalbench, warning, monitors, llm, bench

10. Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification

arXiv ID: 2605.06434
Keywords: rtl, svas, syntax, formal, specification, verification, agentic, coverage, specifications, grounding

11. Automated alignment is harder than you think

arXiv ID: 2605.06390
Keywords: alignment, research, agents, human, outputs, automated, supervise, mistakes, likely, assessments

12. From Agent Loops to Deterministic Graphs: Execution Lineage for Reproducible AI-Native Work

arXiv ID: 2605.06365
Keywords: memo, lineage, execution, artifact, replay, dag, update, unrelated, final, producing

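AI Agent Memory In-Depth Insight Report

1. Research Trends

Today's AI Agent Memory research shows diversification and specialization advancing side by side. On one hand, the range of agent applications keeps widening and deepening, from mathematical reasoning ("AI Co-Mathematician") to neuroimaging analysis ("NeuroAgent") to hotel pricing ("Market-Alignment Risk in Pricing Agents"). On the other hand, agent reliability has become a focus, including workflow fidelity measurement ("Beyond Task Success"), failure-warning systems ("PrefixGuard"), and alignment ("Automated alignment is harder than you think"). Compared with previous days, the emphasis has moved from pursuing task success alone toward examining the agent's decision process and reliability. Emerging directions include joint optimization across multiple agents ("MASPO: Joint Prompt Optimization"), self-evolving skills ("SkillOS: Learning Skill Curation"), and execution lineage tracking ("From Agent Loops to Deterministic Graphs"), which suggests that agent systems are moving from simple task execution toward complex autonomous decision-making.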

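2. Technical Evolution

Memory system architecture is evolving from simple retrieval-augmented generation (RAG) toward richer memory systems. Early RAG systems focused mainly on retrieving external knowledge, whereas modern memory systems have grown into combined architectures that include working memory, long-term memory, and procedural memory. Among today's papers, "From Agent Loops to Deterministic Graphs" proposes execution lineage tracking, which turns the agent's decision process into a directed acyclic graph (DAG) so that the reasoning path can be recorded in full and replayed. "SkillOS" shows how managing a skill library can give an agent the ability to evolve itself. Key advances include "PrefixGuard", which extracts failure patterns from agent execution traces to provide online warnings, and "Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification", which combines knowledge graphs with formal verification to make reasoning more reliable. Together these techniques push Agent Memory from passive storage toward active understanding, prediction, and decision-making, and lay groundwork for a World Model in which an agent builds an internal representation of the world and plans ahead.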
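The record-and-replay idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Step`, `Lineage`, `record`, and `replay` are hypothetical, and the sketch assumes every step is a deterministic function of its parents' outputs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Any, Callable


@dataclass
class Step:
    """One recorded step: a function applied to the outputs of its parents."""
    name: str
    func: Callable[..., Any]
    parents: tuple[str, ...] = ()


@dataclass
class Lineage:
    """Hypothetical container that stores steps as a DAG and can replay them."""
    steps: dict[str, Step] = field(default_factory=dict)

    def record(self, name: str, func: Callable[..., Any],
               parents: tuple[str, ...] = ()) -> None:
        # Parents must already exist, so a cycle can never be introduced.
        missing = [p for p in parents if p not in self.steps]
        if missing:
            raise KeyError(f"unknown parent steps: {missing}")
        self.steps[name] = Step(name, func, parents)

    def replay(self) -> dict[str, Any]:
        """Re-run every step in dependency order and return all outputs."""
        graph = {name: step.parents for name, step in self.steps.items()}
        outputs: dict[str, Any] = {}
        for name in TopologicalSorter(graph).static_order():
            step = self.steps[name]
            outputs[name] = step.func(*(outputs[p] for p in step.parents))
        return outputs


# Toy pipeline; the step names and functions are invented for illustration.
lineage = Lineage()
lineage.record("load", lambda: [3, 1, 2])
lineage.record("sort", sorted, parents=("load",))
lineage.record("top", lambda xs: xs[-1], parents=("sort",))
result = lineage.replay()
```

Because each step only reads its parents' outputs, replaying the graph twice yields the same result, which is the property that makes a recorded run reproducible. Steps that call an LLM would need cached or pinned outputs to keep that property.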

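3. Key Insights

Insight 1: Agent reliability evaluation is shifting from outcomes to process. "Process Matters more than Output for Distinguishing Humans from Machines" suggests that an agent should be assessed on its decision process and not only on its final result. For system design, this means recording and analyzing intermediate steps and building a process-level evaluation mechanism, instead of relying solely on the quality of the final output.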

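Insight 2: Multi-agent systems need coordinated optimization. "MASPO: Joint Prompt Optimization" argues that jointly optimizing prompts across agents is essential in a multi-agent system. When building complex collaborative agent systems, the prompt optimization framework should account for information flow and task dependencies between agents, instead of tuning each agent in isolation.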
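A toy example shows why tuning each agent in isolation can miss the best team configuration. It is not MASPO's algorithm: the prompts, the `team_score` function, and its integer scores are all invented, and the joint search is a brute-force enumeration that only works for a tiny candidate space.

```python
from itertools import product

# Hypothetical candidate prompts for a two-agent pipeline.
PLANNER_PROMPTS = ["plan briefly", "plan as JSON"]
EXECUTOR_PROMPTS = ["expect prose", "expect JSON"]


def team_score(planner: str, executor: str) -> int:
    """Invented end-to-end score: mismatched formats hurt, matched JSON is best."""
    planner_json = "JSON" in planner
    executor_json = "JSON" in executor
    if planner_json != executor_json:
        return 3  # the executor cannot parse what the planner emits
    return 9 if planner_json else 7


def optimize_independently(baseline: tuple[str, str]) -> tuple[str, str]:
    """Tune each agent while the other stays fixed at the baseline prompt."""
    planner0, executor0 = baseline
    planner = max(PLANNER_PROMPTS, key=lambda p: team_score(p, executor0))
    executor = max(EXECUTOR_PROMPTS, key=lambda e: team_score(planner0, e))
    return planner, executor


def optimize_jointly() -> tuple[str, str]:
    """Score every prompt pair together and keep the best one."""
    return max(product(PLANNER_PROMPTS, EXECUTOR_PROMPTS),
               key=lambda pair: team_score(*pair))


independent = optimize_independently(("plan briefly", "expect prose"))
joint = optimize_jointly()
```

Starting from the prose baseline, each agent on its own sees a switch to JSON as a regression, so independent tuning stays at the baseline, while the joint search finds the pair where both agents switch format together.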

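Insight 3: Agent skill management needs to evolve dynamically. "SkillOS" highlights the value of self-evolving skills: an agent should be able to adjust and extend its skill set as tasks demand. In practice, this calls for an extensible skill library that supports composing, evaluating, and iteratively updating skills, instead of a static skill collection.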
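As a rough sketch of a skill library that evaluates and updates its contents, the following tracks a success rate per skill and prunes skills that keep failing. The classes `Skill` and `SkillLibrary`, the `check` callback, and the pruning thresholds are assumptions made for illustration, not the design used in "SkillOS".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    name: str
    run: Callable[[str], str]
    uses: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.uses if self.uses else 0.0


class SkillLibrary:
    """Hypothetical registry that records outcomes and drops weak skills."""

    def __init__(self) -> None:
        self._skills: dict[str, Skill] = {}

    def add(self, name: str, run: Callable[[str], str]) -> None:
        self._skills[name] = Skill(name, run)

    def execute(self, name: str, task: str,
                check: Callable[[str], bool]) -> str:
        """Run a skill and record whether `check` accepts its output."""
        skill = self._skills[name]
        output = skill.run(task)
        skill.uses += 1
        skill.successes += int(check(output))
        return output

    def prune(self, min_uses: int = 3, min_rate: float = 0.5) -> list[str]:
        """Remove skills with enough evidence and a low success rate."""
        dropped = [s.name for s in self._skills.values()
                   if s.uses >= min_uses and s.success_rate < min_rate]
        for name in dropped:
            del self._skills[name]
        return dropped

    def names(self) -> list[str]:
        return sorted(self._skills)


# Toy usage: the "task" is to upper-case a string.
library = SkillLibrary()
library.add("upper", str.upper)
library.add("noop", lambda text: text)
for task in ("a", "b", "c"):
    for name in ("upper", "noop"):
        library.execute(name, task, check=str.isupper)
dropped = library.prune()
```

The `min_uses` guard keeps a newly added skill from being pruned before it has been tried enough times to judge it.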

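Insight 4: Recognizing failure patterns is key to agent safety. "PrefixGuard" demonstrates that it is feasible to extract failure patterns from agent traces and build a warning system from them. This points toward real-time monitoring and anomaly detection for agent execution, backed by a library of failure patterns, so that problems can be flagged and handled early.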
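One simple way to turn past traces into an online warning is to score a running prefix by the failure rate of the action pairs it contains. The sketch below is a stand-in technique, not the method in "PrefixGuard"; the class `PrefixMonitor`, its thresholds, and the example traces are all invented.

```python
from collections import Counter


def bigrams(actions: list[str]) -> list[tuple[str, str]]:
    """Consecutive action pairs in a trace."""
    return list(zip(actions, actions[1:]))


class PrefixMonitor:
    """Hypothetical monitor that learns which action pairs precede failure."""

    def __init__(self, threshold: float = 0.8, min_support: int = 2) -> None:
        self.threshold = threshold
        self.min_support = min_support
        self.seen: Counter = Counter()
        self.failed: Counter = Counter()

    def fit(self, traces: list[tuple[list[str], bool]]) -> None:
        """Count, per action pair, how many traces and failed traces contain it."""
        for actions, did_fail in traces:
            for gram in set(bigrams(actions)):
                self.seen[gram] += 1
                self.failed[gram] += int(did_fail)

    def risk(self, prefix: list[str]) -> float:
        """Highest failure rate among well-supported pairs in the prefix."""
        rates = [self.failed[g] / self.seen[g]
                 for g in bigrams(prefix)
                 if self.seen[g] >= self.min_support]
        return max(rates, default=0.0)

    def warn(self, prefix: list[str]) -> bool:
        return self.risk(prefix) >= self.threshold


# Invented history: (actions, did_fail).
history = [
    (["open", "search", "click", "submit"], False),
    (["open", "search", "retry", "retry", "abort"], True),
    (["open", "click", "retry", "retry", "abort"], True),
    (["open", "search", "click", "done"], False),
]
monitor = PrefixMonitor()
monitor.fit(history)
healthy = monitor.warn(["open", "search", "click"])
looping = monitor.warn(["open", "search", "retry", "retry"])
```

The warning fires on the repeated `retry` before the trace reaches `abort`, which is the point of monitoring prefixes instead of completed runs. A real monitor would need calibration on held-out traces to control false alarms.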

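Insight 5: Agent memory needs structured representation and reasoning. "Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification" indicates how much knowledge graphs matter for agent reasoning. In practice, unstructured memory should be converted into structured knowledge representations, combining symbolic reasoning with neural networks to improve the agent's reasoning ability and interpretability.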
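A minimal sketch of structured memory is a store of (subject, relation, object) triples with a transitive lookup. The class `TripleMemory` and the example facts are hypothetical and only illustrate the representation; they are not taken from the paper.

```python
from collections import defaultdict


class TripleMemory:
    """Hypothetical in-memory store of (subject, relation, object) facts."""

    def __init__(self) -> None:
        self._by_subject: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._by_subject[subject].add((relation, obj))

    def query(self, subject: str, relation: str) -> list[str]:
        """Objects directly linked to `subject` by `relation`."""
        facts = self._by_subject.get(subject, ())
        return sorted(obj for rel, obj in facts if rel == relation)

    def reachable(self, start: str, relation: str) -> set[str]:
        """Follow one relation transitively from `start`."""
        seen: set[str] = set()
        frontier = [start]
        while frontier:
            node = frontier.pop()
            for nxt in self.query(node, relation):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen


# Invented facts about a toy hardware design.
memory = TripleMemory()
memory.add("alu", "part_of", "core")
memory.add("core", "part_of", "soc")
memory.add("alu", "has_signal", "carry_out")
containers = memory.reachable("alu", "part_of")
signals = memory.query("alu", "has_signal")
```

The transitive lookup is the kind of question a flat text memory answers poorly: nothing states directly that `alu` belongs to `soc`, yet the graph makes it derivable.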

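4. Links to Open-Source Projects

Today's research connects closely with mainstream open-source projects. The skill library management idea in "SkillOS" has common ground with LangChain's AgentExecutor and LlamaIndex's query engines, whose modular design is worth borrowing. The execution lineage tracking in "From Agent Loops to Deterministic Graphs" fits closely with Mem0's memory-link recording, and a similar capability is worth implementing in the MyClaw project. The failure-warning mechanism in "PrefixGuard" could be integrated into LangChain's callback system, while the prompt optimization framework in "MASPO" complements LlamaIndex's prompt template system. For the MyClaw project, the recommendation is to build a structured memory layer using the knowledge graph representation from "Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification", and to borrow the skill management mechanism from "SkillOS" so that agent capabilities can be extended dynamically. In addition, the process evaluation idea from "Process Matters more than Output for Distinguishing Humans from Machines" could be folded into MyClaw's agent evaluation framework to provide more complete performance metrics.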

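5. Next Steps

Build an execution lineage tracking system: based on "From Agent Loops to Deterministic Graphs", design a DAG representation of the agent's decision process and implement full recording and replay of execution paths to support agent debugging and optimization.
Develop a skill library management framework: following the design ideas in "SkillOS", build an extensible agent skill library that supports dynamic loading, composition, and evaluation of skills so that agent capabilities can keep evolving.
Implement multi-agent collaborative optimization: based on "MASPO", design a framework for jointly optimizing prompts across agents, with particular attention to information transfer efficiency and task dependencies, to improve the overall performance of the multi-agent system.
Establish a failure pattern warning system: drawing on the approach in "PrefixGuard", extract failure patterns from agent execution traces and build a real-time monitoring and warning system to improve agent reliability and safety.
Integrate knowledge graphs with the memory system: building on "Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification", convert unstructured memory into a knowledge graph representation to strengthen agent reasoning and interpretability, and to lay groundwork for a more complex World Model.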

📚 Appendix

Search keywords

agent, memory, memory-augmented, episodic, long-term, recall, retrieval, knowledge base, RAG, retrieval-augmented, episodic memory, working memory, memory system, remember, experience replay, memory network, external memory, vector database
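How a keyword list like this could select papers can be sketched as follows. The report does not state its exact matching rule, so the function `matched_keywords`, the case-insensitive substring default, and the `whole_words` option are assumptions.

```python
import re

KEYWORDS = [
    "agent", "memory", "memory-augmented", "episodic", "long-term", "recall",
    "retrieval", "knowledge base", "RAG", "retrieval-augmented",
    "episodic memory", "working memory", "memory system", "remember",
    "experience replay", "memory network", "external memory",
    "vector database",
]


def matched_keywords(title: str, abstract: str,
                     keywords: list[str] = KEYWORDS,
                     whole_words: bool = False) -> list[str]:
    """Return the keywords found in the title or abstract, ignoring case.

    With whole_words=False (assumed default) a keyword matches anywhere,
    so "agent" also matches "agents" and "agentic".
    """
    text = f"{title}\n{abstract}".lower()
    hits = []
    for keyword in keywords:
        needle = re.escape(keyword.lower())
        pattern = rf"\b{needle}\b" if whole_words else needle
        if re.search(pattern, text):
            hits.append(keyword)
    return hits


# Invented title and abstract.
loose = matched_keywords("Pricing Agents", "We leverage traces.")
strict = matched_keywords("Pricing Agents", "We leverage traces.",
                          whole_words=True)
```

Substring matching catches plurals and derived forms but also produces false positives: here "RAG" matches inside "leverage". Whole-word matching removes that hit but then misses "Agents", so a production filter would likely need per-keyword rules.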

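This report was generated automatically by OpenClaw (GLM-5 deep analysis edition)
Intended for designers of Agent Memory systems, offering insights from frontier research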