Software Engineering Daily - Redis与AI代理内存:Andrew Brookins分享

Redis与AI代理内存:Andrew Brookins分享

Redis and AI Agent Memory with Andrew Brookins

本集简介

设计AI代理时的一个关键挑战在于,大型语言模型是无状态的且上下文窗口有限。这需要精心设计工程方案,以确保在连续的LLM交互中保持连续性和可靠性。为了表现优异,代理需要快速系统来存储和检索短期对话、摘要及长期事实。

Redis是一款开源的、内存中的数据存储,广泛用于高性能缓存、分析和消息代理。最新进展已将其能力扩展到向量搜索和语义缓存,使其日益成为代理应用栈中的热门组件。

Andrew Brookins是Redis的首席应用AI工程师。他与Sean Falconer一同参与节目,探讨构建AI代理的挑战、记忆在代理中的作用、混合搜索与纯向量搜索的对比、世界模型的概念等话题。

完整声明:本集节目由Redis赞助播出。

Sean曾担任学者、初创公司创始人和谷歌员工,发表过涵盖从AI到量子计算等广泛主题的著作。目前,Sean是Confluent的AI驻场企业家,致力于AI战略和思想领导工作。您可以通过LinkedIn联系Sean。

点击此处查看本集节目文稿。

赞助咨询:sponsor@softwareengineeringdaily.com

《Redis与AI代理记忆——Andrew Brookins访谈》一文首发于Software Engineering Daily。

双语字幕

仅展示文本字幕,不包含音频。

Speaker 0

设计AI代理的一个关键挑战在于,大型语言模型是无状态的且具有有限的上下文窗口。

A key challenge with designing AI agents is that large language models are stateless and have limited context windows.

Speaker 0

这需要精心的工程设计,以确保在连续LLM交互中保持连续性和可靠性。

This requires careful engineering to maintain continuity and reliability across sequential LLM interactions.

Speaker 0

为了表现良好,代理需要快速系统来存储和检索短期对话、摘要以及长期事实。

To perform well, agents need fast systems for storing and retrieving short term conversations, summaries, and long term facts.

Speaker 0

Redis是一种开源的内存数据存储,广泛用于高性能缓存、分析和消息代理。

Redis is an open source, in memory data store widely used for high performance caching, analytics, and message brokering.

Speaker 0

最新进展已扩展Redis的能力至向量搜索和语义缓存,这使其日益成为智能体应用堆栈中的热门组件。

Recent advances have extended Redis' capabilities to vector search and semantic caching, which has made it an increasingly popular part of the agentic application stack.

Speaker 0

Andrew Brookins是Redis的首席应用AI工程师。

Andrew Brookins is a principal applied AI engineer at Redis.

Speaker 0

他与Sean Falconer一同参与节目,讨论构建AI代理的挑战、代理中记忆的作用、混合搜索与纯向量搜索的对比、世界模型的概念等内容。

He joins the show with Sean Falconer to discuss the challenges of building AI agents, the role of memory in agents, hybrid search versus vector only search, the concept of world models, and more.

Speaker 0

本期节目由Sean Falconer主持。

This episode is hosted by Sean Falconer.

Speaker 0

查看节目备注获取更多关于Sean工作的信息以及如何联系他。

Check the show notes for more information on Sean's work and where to find him.

Speaker 1

Andrew,欢迎来到我们的节目。

Andrew, welcome to the show.

Speaker 2

谢谢。

Thank you.

Speaker 2

感谢邀请。

Thanks for having me.

Speaker 2

我是你们的忠实粉丝,所以这次访谈很有趣。

I'm a big fan, so this is fun.

Speaker 1

不错。

Nice.

Speaker 1

是啊。

Yeah.

Speaker 1

很高兴你能来参加,我们终于把时间安排妥当了。

Well, I'm glad you could be here, and that we could work it out.

Speaker 1

节目里有粉丝参与总是件好事。

Always good to have a fan on the show as well.

Speaker 2

确实如此。

Absolutely.

Speaker 1

所以我想要先从宏观视角或者一个快速的大局问题开始。

So I wanted to kind of start with the big picture, or a quick big-picture question.

Speaker 1

你知道的,很多人。

You know, a lot of people.

Speaker 1

很多人都在说2025年将成为AI智能体的爆发年。

Are saying that 2025 is going to be this breakout year for AI agents.

Speaker 1

你知道的,就是智能体之年。

It's you know the year of the agent.

Speaker 1

目前市场上有很多炒作的声音。

There's a lot of hype going on in the market right now.

Speaker 1

我们正在超越基础的聊天功能。

We're moving beyond just basic chat.

Speaker 1

那么从你的角度来看,构建这些更自主的智能体系统有哪些难点?为什么记忆或其他组件在这里扮演如此核心的角色?

So from your perspective, what makes building these more autonomous, agentic systems hard, and why do memory and other components play such a central role here?

Speaker 2

是的。

Yeah.

Speaker 2

嗯,我当然一直在深入思考这个问题。

Well, I think I've been thinking a lot about this, of course.

Speaker 2

我认为其中一个主要困难在于,许多我们可以放入概念验证来展示由大语言模型支持的智能体能力的任务,恰恰避开了大语言模型的弱点。

And one of the reasons I think it's so difficult is that many of the tasks that we can put into a POC to show off what an agent can do backed by an LLM, they sidestep the weaknesses of the LLM.

Speaker 2

它们基于信息和训练数据

They draw on information and training.

Speaker 2

它们利用生成能力加上上下文工程来高效产出信息

They use that generative ability plus context engineering to produce information effectively.

Speaker 2

大语言模型确实能把这些做得很好,我们也已经做了大量工作,让智能体同样能做到这一点。

And LLMs can do that really well, and we've done a lot of work now to make agents be able to do that as well.

Speaker 2

对吧?

Right?

Speaker 2

但棘手的地方在于,我认为是当智能体需要融入任何环境并实际执行操作时,真正改变某些事物,并且关键是要能够预测这种改变的结果。

But the tricky part is, I think, when the agent has to integrate in any kind of environment and do something, actually change something and crucially, be able to predict, like, the outcome of the change.

Speaker 2

而这部分恰恰是,你知道的,大语言模型实际上并未建模的内容。

And that's the part that, you know, LLMs just don't don't model, actually.

Speaker 2

它们无法为环境建模那种状态转换。

They don't they don't model state transitions like that for environments.

Speaker 2

这正是它们容易失效的地方,也是智能体容易崩溃的环节。

That's where they tend to break down, where agents tend to break down.

Speaker 1

那么你认为这是否主要是导致企业困在演示验证阶段,由于这个限制因素而无法推进到产品化层面的原因?

And do you think that's primarily why, you know, companies get stuck in sort of that demo, POC mode, where they just can't move to a productionization standpoint because of that limiting factor?

Speaker 2

不是。

No.

Speaker 2

我认为那是另一个完全不同的问题,当然也是个问题。

I think that's that's a whole other problem and is also a problem.

Speaker 2

我的观点更倾向于:构思'让我们为这个问题构建一个智能体'很简单,但更难的是分析问题本身,并将其映射到智能体真正擅长的领域,以及你需要做些什么才能让它在特定任务中表现出色。

I think it's more along the lines of: it's really easy to think about, well, let's build an agent for this problem, but it's harder to think about the problem and map that back to what the agent will actually be good at and what you will need to make it good at certain tasks.

Speaker 2

因此,用智能体和大型语言模型解决某些问题,需要更多考虑预测组件,而不仅仅是聊天机器人这类应用。

So tackling certain problems with an agent and with an LLM requires more thought about the predictive component beyond just, you know, chatbots and things like that.

Speaker 2

然后还有整个关于如何从概念验证阶段走出来,真正实现有意义的规模化生产部署的问题。

And then there's the whole side of just getting out of the POC phase and getting things into production in any meaningful capacity.

Speaker 2

这个问题本身由于其他原因也是个重大挑战。

And that itself is also a large problem for other reasons.

Speaker 1

嗯。

Mhmm.

Speaker 1

是的。

Yeah.

Speaker 1

就比如在构建这些智能体系统时,关于记忆功能或挑战,或者说记忆的价值方面的问题——要知道每次调用模型都是无状态的。

In terms of some of the challenges around memory, or the value of memory when building some of these agentic systems, you know, every sort of call to the model is stateless.

Speaker 1

当然它们还有容量限制,比如你能在上下文中发送多少信息给它们。

And then of course they have a limited capacity in terms of, like, how much information you can be sending to them in the context.

Speaker 1

那么你需要考虑的记忆组件有哪些?这些如何影响上下文?你又该如何解决在正确时机提供正确上下文的优化挑战呢?

So what are sort of the components of memory you need to be thinking about and how do those influence context and how do you work through sort of the optimization challenges of feeding it the right context at the right time?

Speaker 2

是的,完全同意。

Yeah, absolutely.

Speaker 2

好问题。

Great question.

Speaker 2

我认为关于记忆的思考,甚至在你的问题中,就隐含了一些关于它究竟是什么、属于什么类型的假设。

So I think thinking about memory, you know, even in your question, right, there's some implicit assumptions about what it is exactly, what type of thing it is.

Speaker 2

每次我和人们讨论记忆时都会遇到这种情况,这非常有趣。

This happens all the time when I talk to folks about memory, and and it's really interesting.

Speaker 2

最基础的层面来说,如果我们甚至不讨论智能体,就像网页应用那样——我是用户,访问网页应用,进行操作,返回时希望它能延续上次的进度。

So the most basic level, right, if we just are not even talking about agents, if we're just talking like a web application, I'm a user, I go to a web application, I do something, I come back, I expect it to continue where we left off.

Speaker 2

对吧?

Right?

Speaker 2

这本质上就是一种有状态的交互。

It's just like fundamentally, they're stateful interactions.

Speaker 2

大多数场景都是这种延续性的有状态交互。

Most things are just a stateful interaction where it continues.

Speaker 2

而LLM作为更复杂应用中的一个组件本身是无状态的,对吧?

And just the fact that the LLM being this component inside of more complex applications is itself stateless, right?

Speaker 2

所以我们经常思考这个问题。

So we tend to think about that a lot.

Speaker 2

但实际上,就像我们一直构建应用的网络服务器也都是无状态的。

But really it's just kind of like, well, so are all of the web servers that we've been building applications on.

Speaker 2

我们需要做些什么才能让它正常工作?

What do we have to do to make that work?

Speaker 2

我们需要存储数据。

Well, we have to store data.

Speaker 2

我们需要使用数据库。

We have to use a database.

Speaker 2

对我来说,关键点在于:好吧,这必须是个前提条件。

And so for me, the first part is the one that's like, okay, this has to be a given.

Speaker 2

在开始构建基于LLM的应用时,我们必须以需要存储数据为前提。

We have to start the process of building something with an LLM with the assumption that we'll have to store data.

Speaker 2

这就是一种记忆形式。

And that is a form of memory.

Speaker 2

你可以将其映射到2025版记忆盒中的各个小组件。

And you can map this into the 2025 version of what are the small pieces inside of that memory box.

Speaker 2

其中通常包含某种消息历史记录。

One of them is typically message history of some kind.

Speaker 2

因为很显然,如果要延续对话,你需要过去发送的消息。

Because obviously, if what you're continuing is a conversation, you need the messages that were sent in the past.

Speaker 2

因此,消息往往是这种最底层或者说基础的部分。

So messages tend to be this lowest level or, you know, fundamental part.

Speaker 2

这更多是关于存储,因为我们不需要过多考虑如何存储它。

That's more about storage because we don't have to think a lot about storing that.

Speaker 2

我们只需存储消息即可。

We can just store messages.
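As a concrete sketch of that lowest level, here is message storage with a plain Python dict standing in for Redis. In Redis itself this maps onto LPUSH, LTRIM, and LRANGE against a per-session list key; the key format and the message cap below are illustrative assumptions, not Redis defaults.

```python
# Stand-in for a Redis message list: dict of key -> list of messages.
# In real Redis this would be LPUSH (append), LTRIM (cap the list),
# and LRANGE (read back) on a key like "chat:{session_id}:messages".
store = {}

MAX_MESSAGES = 50  # illustrative cap, not a Redis default


def append_message(session_id, role, content):
    key = f"chat:{session_id}:messages"
    messages = store.setdefault(key, [])
    messages.append({"role": role, "content": content})
    # Keep only the most recent MAX_MESSAGES (LTRIM in Redis).
    del messages[:-MAX_MESSAGES]


def get_history(session_id):
    return store.get(f"chat:{session_id}:messages", [])


append_message("s1", "user", "I'm vegan.")
append_message("s1", "assistant", "Noted! I'll suggest vegan recipes.")
print(len(get_history("s1")))  # 2
```

A real implementation would swap the dict for a Redis client and likely put a TTL on the session key so abandoned conversations expire.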

Speaker 2

这本来不是什么大问题,直到它变成大问题——比如用户从未开启过新对话。

It's not a big deal until it is a big deal because, like, the person never started a new chat.

Speaker 2

他们实际上从未真正开始过任何事情。

They never actually started anything.

Speaker 2

他们只是在同一个对话线程里开启了新的话题。

They just started a new conversation within the same conversation thread.

Speaker 2

我想我们很多人都见过这种情况。

I think many of us have seen this.

Speaker 2

我们都知道现在讨论的核心问题——上下文工程与提示工程(上个月可能还是热门话题)——当消息积累到一定程度时,要准确判断哪些内容与当前问题或输入相关其实相当困难。

Many of us know the whole point that we talk about now, context engineering versus prompt engineering, which was last month probably, is like: it's actually quite difficult at a certain point, when you have enough messages, to try to figure out what exactly is relevant to the incoming question or the input.

Speaker 2

于是这就变成了一个问题:好吧,这种情况已经持续一段时间了。

And so then it becomes a question of, okay, this has gone on for some time.

Speaker 2

我们知道这个模型的局限性,但研究也表明,即使是长上下文模型,尽量精简输入内容仍然很重要,因为它们依然会迷失方向。

You know, we know the limit of this model, but we also know that research suggests, right, even with long context models, it's still important for us to send as little as possible, right, because they still get lost.

Speaker 2

所以关键在于压缩或总结对话内容。

So it's a matter of compacting that or summarizing the conversation.

Speaker 2

现在我们讨论的已经是另一个话题了。

And now we're talking about a different thing.

Speaker 2

这只是我的个人观点。

This is just my opinion.

Speaker 2

对吧?

Right?

Speaker 2

所以消息只是数据。

So messages are just data.

Speaker 2

我们只是在存储现有的内容。

We're just storing what we have.

Speaker 2

摘要处理开始变成一个超越单纯存储的工程问题,我们需要实际解决如何为这个应用和用户正确总结对话历史,从而减少发送的上下文量。

Summarization starts to become an engineering problem beyond just storage, where we have to actually figure out how to get this conversation history, for this application and this user, summarized correctly so that we can reduce the amount of context we're sending.
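A minimal sketch of that compaction step, assuming a rough character-based token estimate and a placeholder `summarize` function standing in for an LLM call (both are assumptions for illustration):

```python
def rough_tokens(text):
    # Very rough token estimate (~4 chars per token); an assumption,
    # not a real tokenizer.
    return max(1, len(text) // 4)


def summarize(messages):
    # Placeholder for an LLM summarization call (hypothetical).
    return "Summary of %d earlier messages." % len(messages)


def compact(history, budget=100, keep_recent=4):
    """If the history exceeds the token budget, replace everything
    but the last `keep_recent` messages with a single summary."""
    total = sum(rough_tokens(m["content"]) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [{"role": "system", "content": summarize(old)}] + recent


history = [{"role": "user", "content": "word " * 60} for _ in range(10)]
compacted = compact(history)
print(len(compacted))  # 5: one summary message plus 4 recent ones
```

The point is only the shape of the loop: measure, split, summarize the old part, keep the recent tail verbatim.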

Speaker 2

然后由此继续推进。

And then it goes from there.

Speaker 2

对吧?

Right?

Speaker 2

那么我就此作结。

So then I'll wrap it up.

Speaker 2

对吧?

Right?

Speaker 2

但这就像是下一个层级。

But that's like the next level.

Speaker 2

更进一步说,想象一个人或认知系统整天与人互动,实际发生的是他们在睡眠时,大脑会在后台线程(可以说是一个进程)中筛选出重要事项并倾向于记住它们。

Beyond that, where if you imagine a person or a cognitive system, let's say, that's interacting with people all day, what's happening is they go to sleep and their brain in a background thread, let's say, a process, picks things out that are important and tends to remember them.

Speaker 2

另一种可能是有人57次询问关于玉米饼的事,或者57个人都问起玉米饼,这个人也会记住关于玉米饼的事。

The other thing that can happen is that somebody asks about tacos 57 times or 57 people ask about tacos, and the person also remembers the taco thing.

Speaker 2

尽管那可能并不那么重要,但由于出现频率太高,大脑通常会将其烙印下来。

Even though maybe that's not that important, it's so frequent that their brain usually, like, stamps it in.

Speaker 2

于是就有了这种机制:提取长期事实存入大脑某处,之后再调取出来。

So you get this thing that's extracting long term facts and putting them somewhere in the brain and then pulling them back out later.

Speaker 2

这通常就是我们理解的智能体认知系统运作方式。

And that usually is how we think about, like, cognitive system of of an agent.

Speaker 2

对吧?

Right?

Speaker 2

所以还有第三种情况,就是从对话中提取片段,将它们存放在某处。

So there's that third thing of taking pieces out of this conversation, putting them somewhere.

Speaker 2

我们可以在之后使用它们,通常是当用户回来想要再次互动时。

We can use them later, typically when the person comes back, the user, and wants to interact again.

Speaker 2

但我们知道一些信息。

But we know something.

Speaker 2

我们不需要查看所有总结过的消息历史。

We don't have to look at all of the summarized message histories.

Speaker 2

我们只需要知道他们是素食主义者,所以当然会给他们素食食谱。

We just know that they're vegan, and so of course we would give them a vegan recipe.

Speaker 2

对吧?

Right?

Speaker 2

这就是我思考记忆时考虑的三大方面。

So those are three big areas that I think about with memory.
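The frequency-based "stamping in" described with the taco example can be sketched as a counter that promotes a fact to long-term memory once it recurs often enough. The threshold here is an arbitrary illustration:

```python
from collections import Counter

mention_counts = Counter()
long_term_memory = set()

PROMOTE_AFTER = 3  # illustrative threshold, not from any real system


def observe_fact(fact):
    """Count each extracted fact; once it recurs often enough,
    'stamp it in' to long-term memory (the taco effect)."""
    mention_counts[fact] += 1
    if mention_counts[fact] >= PROMOTE_AFTER:
        long_term_memory.add(fact)


for _ in range(3):
    observe_fact("user asks about tacos")
observe_fact("user mentioned rain once")

print("user asks about tacos" in long_term_memory)    # True
print("user mentioned rain once" in long_term_memory)  # False
```

A production system would also run the importance-based path described above (a background process scoring salience), not just raw frequency.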

Speaker 1

是的。

Yeah.

Speaker 1

此外,还有一类充当潜在长期记忆的参考数据:无论我是从向量数据库还是其他类型的数据存储中检索,我可能都希望把它引入,构成特定上下文的一部分;随后这些内容会被纳入这种循环压缩、循环摘要的流程中,可能在后续周期中再次使用。

And then there's also the sort of reference data that serves as this potential long term memory: whether I'm looking that up in a vector database or some other type of data store, I might want to pull that in to form part of a particular context, which then gets factored into this sort of compaction loop, summarization loop, and might also be used in a later cycle.

Speaker 2

是的,完全正确。

Yeah, absolutely.

Speaker 2

思考知识库与长期记忆之间的区别确实很有意思,因为检索过程可能看起来很相似。

It's really interesting to think about what's the difference between the knowledge base and long term memory, for example, because retrieval looks probably similar.

Speaker 2

我们会使用混合搜索、向量搜索或关键词搜索来提取信息。

We're going to use hybrid search or vector or keyword search to pull stuff out.

Speaker 2

这往往只是时间因素的问题。

It tends to be just a factor of time.

Speaker 2

通常长期记忆——我可能与你观点相左,这只是我的思考方式——

Typically long term memory, and again, I may even be contradicting you here, this is just how I think about it.

Speaker 2

长期记忆是我们在运行时学到的内容,也就是智能体学习的内容,而RAG或知识库则更多是我们开发者已知要包含或动态注入的内容,但我们仍明确知道需要注入这些信息。

Long term memory is stuff that we learned, the agent learned at runtime, let's say, whereas RAG or the knowledge base tends to be stuff that we, the developers of the agent, knew to include or dynamically injected, but still we knew that we wanted to inject it.

Speaker 2

所以两者存在细微差别,但最终效果往往非常相似。

So there's a slight difference, but the result tends to be very similar.

Speaker 2

我们仍然要检索它。

We still retrieve it.

Speaker 1

是的,而且即使在模型直接交互之外,考虑整个作为智能体的软件部分,因为这些确实是完整的系统。

Yeah, and then even outside of the direct interactions with the model, thinking about the whole piece of software that's, like, an agent, because these are really full-blown systems.

Speaker 1

这不仅仅是与模型互动那么简单。

It's not like you're just interacting with the model.

Speaker 1

模型会指示你去调用一个工具,而工具可能会与API进行通信。

The model is telling you to go and call a tool, where the tool may be going to communicate with an API.

Speaker 1

此外,还有模型之外的其他记忆系统,用于维护状态以确保如果API调用失败你不会重复操作,或者成功时不会多次发起相同调用。

Then there's the other sort of memory systems that are part of this, outside of the model: how do you maintain state to make sure that if the API call fails, or if it succeeds, you're not making that call multiple times?

Speaker 1

这些都属于事件驱动系统或持久化执行等关键特性的范畴。

These types of sort of key characteristics of event driven systems or durable execution and so forth too.

Speaker 2

哦,是的,完全同意。

Oh yeah, absolutely.

Speaker 2

没错。

Yeah.

Speaker 2

这真的很有趣,对吧?

That's really fascinating, right?

Speaker 2

所以这就是为什么我认为人们会在这方面犯糊涂,因为它比听起来要复杂得多。

So this is why I think people get tripped up thinking about this because it's so much more complicated than it sounds.

Speaker 2

因为即使了解很多这方面的知识,我个人也往往会将代理和模型混为一谈,交替使用。

Because personally, even knowing a lot of this stuff, I'll tend to think about or talk about an agent and be like, the model, the agent, the model, just interchangeably.

Speaker 2

但实际上,它两者都不是。

But actually, it's neither of those things.

Speaker 2

它是一个涉及多种不同状态类型的复杂系统。

It's a complex system that involves lots of different types of state.

Speaker 2

当人们考虑构建一个能完成复杂任务、进行多重工具调用的深度研究代理时,他们往往没有意识到——也许直到为时已晚,或者恰好在合适的时候意识到——我们实际上是在讨论某种程度上的持久执行。

And when folks think about building out an agent that does complex things, multiple tool calls, a deep research agent, let's say, you know, often what they don't realize perhaps until too late maybe, or they realize and understand in the right amount of time, is that we're really talking about durable execution at some level.

Speaker 2

在某种程度上,我们讨论的是动态工作流,在生产环境中它通常会在执行中途失败然后必须重启,或者用户会突然改变主意说'等等,我搞错了'。

At some level, we're talking about dynamic workflow that typically in production is going to die in the middle of something and then have to restart, or the user is going to be like, woah, actually, I was wrong.

Speaker 2

这不是一只狗。

It's not a dog.

Speaker 2

我刚才看的是一只猫。

It's a cat that I was looking at.

Speaker 2

所以先退一步,从我们之前的另一个节点重新开始。

So just back up and start at the other point where we were at.

Speaker 2

因此从那个节点重启,就像工作流崩溃后我们需要重新运行,但不必重新执行每个步骤。

And so restarting from that other point, similar to the workflow crashed and we need to rerun it but not re execute every step.

Speaker 2

这就是所谓的检查点机制,LangGraph会这么称呼它。

So there's checkpointing, right, which is what LangGraph would call that.

Speaker 2

但这已经存在很长时间了,对吧?

But it's been around for a while, right?

Speaker 2

它存在于所有这些不同的工作流系统中。

It's been around in all these different workflow systems.

Speaker 2

所以没错,完全同意。

So yes, a big yes.
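A toy version of that checkpointing idea: each step's result is saved under the run ID, so rerunning the workflow after a crash skips the steps that already finished. A dict stands in for whatever durable store (Redis, in a Redis-backed checkpointer) would actually hold the checkpoints:

```python
checkpoints = {}  # stand-in for durable storage keyed by (run_id, step)
executed = []     # records which steps actually ran, for illustration


def run_step(run_id, step_name, fn):
    """Run a workflow step at most once; on rerun, return the saved result."""
    key = (run_id, step_name)
    if key in checkpoints:
        return checkpoints[key]
    result = fn()
    executed.append(step_name)
    checkpoints[key] = result
    return result


def workflow(run_id):
    a = run_step(run_id, "fetch", lambda: 2)
    b = run_step(run_id, "process", lambda: a * 10)
    return b


first = workflow("run-1")
second = workflow("run-1")  # "restart" after a crash: no step re-executes
print(first, second, executed)  # 20 20 ['fetch', 'process']
```

The same shape covers the "actually it was a cat" case: invalidate the checkpoints from a chosen step onward and resume from there.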

Speaker 1

那么说到Redis在这其中的角色,如果我们从短期记忆开始看——这也是我们最初讨论不同记忆系统的起点

So in terms of, you know, Redis' role in all of this, if we look at just starting even with short term memory, which is kind of where we first started talking about, you know, what are these different memory systems?

Speaker 1

LLM存在这些有限的上下文窗口。

LLMs have these limited context windows.

Speaker 1

有些代理可能通过相同的会话或线程进行通信。

You have agents that are maybe communicating over sort of the same session or thread.

Speaker 1

如果它与另一个代理交互或与需要追踪的用户交互,就会存在这些来回传递的消息历史记录。

If it's interacting with another agent or interacting with a user that, you know, needs to be tracked, you have those sort of message histories that you're passing back and forth.

Speaker 1

Redis在这种架构中扮演什么角色?

Where does Redis fit into that architecture?

Speaker 2

在我看来,这与非代理性或确定性的Web应用非常相似。

So, you know, I view it as very similar to a web application that is not agentic, a deterministic web application, let's say.

Speaker 2

从我的角度看它适合的位置,以及从普通生产工程师的角度看它适合的位置。

You've got where does it fit from my perspective, and then you've also got where does it fit from the perspective of, you know, a random production engineer.

Speaker 2

它们往往有重叠,但有些方面我的思考方式可能有所不同。

They tend to overlap, but there are some ways that I tend to think about it differently maybe.

Speaker 2

我先说说我的看法。

So I'll give you my perspective first.

Speaker 2

首先,我们讨论的是智能体,所以紧扣主题,我们经常会提到短期记忆这个概念,但最近我更倾向于将其视为工作记忆。

First of all, we're talking about agents and of course, so sticking closely to agents, there is this concept of often we'll call it short term memory, but lately, I think about it more as as working memory.

Speaker 2

就像人类解决问题需要工作记忆一样,我们需要一个地方来存放当前正在处理的信息。

Just like I need working memory to solve problems as a human being, we need somewhere to put the stuff that we're juggling right now.

Speaker 2

对于这些应用场景来说,这通常是指消息历史记录,或者说是我们六个月前已经总结过的历史消息摘要。

And that tends to be, for these applications, a message history, you know, and a summary of the past message history from six months ago that we summarized already.

Speaker 2

因此Redis在这方面绝对是绝佳选择。

So Redis is absolutely a great fit for that.

Speaker 2

它速度极快——无论是通过Redis中已有的传统数据结构,还是通过查询引擎(这对开源版本来说算是比较新的功能)。

It's super fast, you know, from just accessing it through the direct, like, data structures that we have in Redis that have been around forever, through the query engine, which is kind of new for, you know, the open source version, I think, as a core component of Redis.

Speaker 2

这个功能已经包含在Redis8社区版中。

It's in Redis 8 Community Edition, actually, yeah.

Speaker 2

是的,它就在Redis8里。

So it's in Redis 8.

Speaker 2

不过在此之前,它仍然可以通过模块等方式使用。

Before that, though, it was still available in modules and things.

Speaker 2

但我们真正想说的是数据结构——如果你想定义模式并使用查询语言进行查询,Redis同样可以做到。

But really what we're saying is data structures, or, if you wanna define, like, a schema and make queries with a query language, you can also do that with Redis.

Speaker 2

无论采用哪种方式,它的速度都极快,比大多数数据处理工具都要快。

In both of those ways, it's extremely fast, faster than most things you're gonna use for data.

Speaker 2

因此对于工作记忆来说非常合适,特别是因为你可以把工作记忆想象成键值查找。

So for working memory, it's a great fit, especially because you can also think about working memory as being, like, key value lookups.

Speaker 2

我们非常清楚自己需要什么。

We know exactly what we need.

Speaker 2

我们需要获取特定用户的特定数据块,这个数据块就是智能体在工作记忆中处理的内容。

We need that thing for that user, and we'll just pull it out, that blob is what the agent's working with in working memory.
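That key-value shape of working memory can be sketched as one JSON blob per session, fetched by a known key. A dict stands in for Redis here; in Redis itself this is just SET and GET (or the JSON commands), and the key format is an illustrative assumption:

```python
import json

kv = {}  # stand-in for Redis string keys (SET / GET)


def save_working_memory(session_id, state):
    # In real Redis: SET agent:{session_id}:working '<json blob>'
    kv[f"agent:{session_id}:working"] = json.dumps(state)


def load_working_memory(session_id):
    raw = kv.get(f"agent:{session_id}:working")
    return json.loads(raw) if raw else {}


save_working_memory("s1", {"summary": "planning a vegan dinner",
                           "open_tasks": ["find recipe"]})
state = load_working_memory("s1")
print(state["open_tasks"])  # ['find recipe']
```

No search involved: the agent knows exactly which blob it needs, which is why this access pattern suits a key-value store so well.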

Speaker 2

然而作为向量数据库,它在检索方面也能发挥作用。

However, as a vector database, it can also serve a purpose in retrieval.

Speaker 2

对吧?

Right?

Speaker 2

从知识库中检索内容,你可以将其融入记忆概念或长期记忆中。

Retrieving stuff from the knowledge base that you can incorporate in the idea of memory or from long term memory.

Speaker 2

所以如果你长期将这些提取的事实存储在长期记忆中,它在检索方面同样表现出色,速度也非常快。

So if you actually store these extracted facts over time in long term memory, it's great at retrieval as well, also very quickly.

Speaker 2

这还不是全部。

So and that's not even it.

Speaker 2

对吧?

Right?

Speaker 2

实际上,我认为Redis的终极亮点在于流处理。

So actually, just to finish it off, the final place where I view Redis fitting is streams.

Speaker 2

我认为流是Redis中一个被严重低估的功能。

I think streams are a really overlooked thing, perhaps, about Redis.

Speaker 2

对吧?

Right?

Speaker 2

在我最近在Redis负责的生产级智能体中,核心就是后台任务与流处理。

So in the production agent that I most recently worked on at Redis, it's all about background tasks and streams.

Speaker 2

我们使用一个名为docket的后台任务库,在流中跟踪这个工作流的状态。

We track the state of this workflow in streams with a background task library called docket.

Speaker 2

所以Redis就在那里帮助我们管理,比如协调这个作为代理的动态工作流的状态。

So Redis is right there helping us manage, like, orchestrating the state of this dynamic workflow that is an agent.
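A toy model of the Streams pattern being described: an append-only log plus a per-consumer read offset, so background workers can pick up where they left off. In Redis this is XADD to append and XREAD (or consumer groups) to consume; the sketch below only mimics those semantics in pure Python:

```python
# Append-only log plus per-consumer read offsets: a toy model of
# Redis Streams (XADD to append, XREAD from a last-seen position).
stream = []
offsets = {}


def xadd(fields):
    stream.append(fields)
    return len(stream) - 1  # stand-in for the stream entry ID


def xread(consumer):
    """Return entries this consumer hasn't seen yet and advance its offset."""
    start = offsets.get(consumer, 0)
    entries = stream[start:]
    offsets[consumer] = len(stream)
    return entries


xadd({"task": "embed_docs", "status": "queued"})
xadd({"task": "embed_docs", "status": "done"})
first_read = xread("worker-1")
second_read = xread("worker-1")
print(len(first_read), len(second_read))  # 2 0
```

Real Streams add blocking reads, consumer groups, and acknowledgement, which is what makes them usable for durable background task tracking.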

Speaker 2

它同时存在于短期记忆、长期记忆以及基于知识的检索中。

It's also there in short term memory and in long term memory and in retrieval for knowledge based stuff.

Speaker 2

当然,我现在为Redis工作,所以你知道,我有动力尽可能在所有场景使用它。

Now, of course, I work for Redis, so, you know, I'm incentivized to use it for everything possible.

Speaker 2

但实际上我是重返Redis工作的。

But I actually came back to work for Redis.

Speaker 2

我之前在这里工作过,有一天我正在为一个RAG应用构建代码检索功能,当时用的是Postgres。

I worked here once before, and there was a day I was building out code retrieval for, you know, a RAG application, basically, and I was using Postgres.

Speaker 2

那时候我已经大量使用Postgres,甚至是极端规模级别的Postgres,而它在生产环境中频繁崩溃。

And I had been working with Postgres quite a bit at that time, like, at extreme scale levels of Postgres, and it had been crashing in production a ton.

Speaker 2

我对Postgres非常恼火,然后意识到用Redis可以完成我正在做的所有事情,而且它能按我期望的方式扩展,速度也会非常快。

I was really angry at Postgres, and I realized that you could do everything I was doing with Redis, and it would scale out the way that I wanted, and it would be really fast.

Speaker 2

总之,我们拥有所有这些功能,我认为它们以各种不同方式展现出来。

So anyway, we have all these things and I view them as showing up in all these different ways.

Speaker 1

我是说,确实,这项技术已经存在好几年了。

So I mean, right, this has been around for a number of years.

Speaker 1

产品中是否有必须扩展的特定功能,以支持我们现在看到的来自智能体或AI领域其他类型应用的新型工作负载,而这些在Redis早期版本中并不具备?

Were there specific things that had to be extended in the product to support some of these kind of new workloads that we're seeing from, like, agents or other types of applications in the AI realm, that weren't already available in, you know, prior versions of Redis?

Speaker 2

倒不是在最近这段时间。

Not in like immediate recent history.

Speaker 2

比方说今年,我认为我们并没有被迫引入什么新功能来解锁这些智能体的应用场景。

Let's say this year, I don't think we've, like, been forced to introduce something new that unlocks one of these use cases with agents.

Speaker 2

但长远来看,查询引擎本身就是我们添加到Redis的一个模块,而非核心数据结构。

But, I mean, in the fullness of time, right, like, the query engine itself was a module we added to Redis, not a core data structure.

Speaker 2

当我思考Redis为支持这些用例新增了什么时,这对我来说是最重要的一点。

And that's the big one for me when I think about what Redis has added to enable some of these use cases.

Speaker 2

这就是头等大事。

And that's the number one thing.

Speaker 2

对吧?

Right?

Speaker 2

能够索引向量,用查询语言动态查询它们,这可以说是头等大事。

Being able to index vectors, query them dynamically with a query language, that's, like, number one.

Speaker 2

但更近期的进展是,作为补充或者说另一种新方法,我们正在尝试的Redis创建者Salvatore开发的向量集合。

But more recently, you know, as an accompaniment to that, or like a different approach, a new approach that we're experimenting with: Salvatore, the creator of Redis, worked on vector sets.

Speaker 2

这是一种核心数据结构,与集合等并列,让你能实现类似于当前查询引擎的功能。

And that's a core data structure that sits alongside things like sets and lets you do, you know, similar things to what you can do with the query engine right now.

Speaker 1

关于向量搜索方面的支持,Redis在向量导入方面具体支持哪些功能?

Terms of some of the support around like vector search, what is supported in terms of getting those vectors actually into Redis?

Speaker 1

比如,这部分工作主要是由我负责吗?像是需要在Redis外部完成向量导入的管道工作,然后Redis只作为索引和检索的终端?

Like, am I mostly doing that work, the kind of pipeline work that I would need to get those vectors in, outside of Redis, and then Redis ends up being the landing point for indexing and allowing me to retrieve?

Speaker 1

还是说Redis会承担部分管道工作?

Or is there other things where Redis kind of owns part of that pipeline?

Speaker 2

答案是肯定的,在当前Redis架构中,这部分主要属于客户端的工作范畴。

So the answer is yes: it is mostly a client-side concern in Redis' current architecture.

Speaker 2

如果你想将向量数据存入Redis,那么你需要自己生成嵌入向量,然后它会如你所说存入Redis。

If you want to put something into Redis as a vector, you know, then you're gonna be generating the embedding yourself, and and it's landing in Redis, as you said.

Speaker 2

我们今年还推出了一款新产品叫LangCache。

We also have a new product this year called LangCache.

Speaker 2

我记得它目前处于私有预览阶段。

It's in private preview, I believe.

Speaker 2

它就像是基于Redis的一个更高层抽象,用于我们所说的语义缓存。

It's like a slightly higher level abstraction over using Redis for what we call, like, semantic caching.

Speaker 2

我们确实有计划尝试在这个产品流程中集成嵌入模型,你只需把数据发送给这个产品,它会帮你完成这部分工作。

And we do have plans to experiment with including the embedding model, like, within that pipeline, so you just send things to that product and it will do that part for you.

Speaker 2

但就核心Redis和我们今天讨论的现有功能而言,这部分工作需要你自己完成。

But core Redis and what we're talking about today as far as what's available, you do it yourself.

Speaker 2

这是第一个问题的答案。

So that's the first answer.

Speaker 2

我忘记你刚才提到的另一点了。

I forget your other point there.

Speaker 1

它主要是一个外部构建管道的着陆区,还是Redis本身具备某些功能,能让我直接在Redis内部构建部分管道,而无需依赖其他第三方工具?

Is it primarily a landing zone where the pipeline is being built externally or are there essentially functionality in Redis that allows me to build some of that pipeline directly within Redis, not having to leverage some other third party tools?

Speaker 2

明白了。

Gotcha.

Speaker 2

是的。

Yeah.

Speaker 2

是的。

Yeah.

Speaker 2

那么我想这就是您问题的答案。

Then I think that is the answer to your question.

Speaker 2

对吧?

Right?

Speaker 2

大部分工作在于构建生成向量的流程,而Redis则帮助您快速存储和检索这些向量。

Most of that stuff you're building to produce the the vectors, and then Redis is helping you to store them and retrieve them quickly.

Speaker 1

您提到了语义缓存。

You mentioned semantic caching.

Speaker 1

什么是语义缓存?

What is semantic caching?

Speaker 2

语义缓存是我们所说的不基于确定性模式(比如必须依赖键)的缓存方式,而这是我们多年来常用的缓存方法。

Semantic caching is what we would call caching based not on a deterministic pattern, like a key necessarily, which is how most of us have done caching for many years.

Speaker 2

对吧?

Right?

Speaker 2

而是基于相似性。

And instead on similarity.

Speaker 2

因此,输入内容的相似性成为我们从缓存中提取内容的依据。

So the similarity to an input is the thing that lets us pull something out of the cache.

Speaker 2

这对于那些我们已经从大语言模型获得过答案的问题特别有帮助。

And that's really helpful for questions that we have the answer for already from the LLM.

Speaker 2

我们经常会遇到完全相同或极其相似的问题。

So we've gotten the exact same question, or something very close to it, many, many times.

Speaker 2

这样我们就不必反复向供应商付费获取相同答案,可以运用语义缓存来解决这个问题。

We don't have to keep paying a vendor to produce the same answer, so we can use semantic caching for that.
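A bare-bones sketch of that idea: cache entries keyed by an embedding, with a cosine-similarity threshold deciding whether an incoming question counts as "the same". The toy vectors and the threshold are illustrative assumptions; a real system would use an embedding model and an indexed vector store:

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


cache = []       # list of (query_embedding, cached_answer) pairs
THRESHOLD = 0.9  # illustrative similarity cutoff, tune per application


def cache_put(embedding, answer):
    cache.append((embedding, answer))


def cache_get(embedding):
    """Return a cached answer whose stored query is similar enough."""
    best = max(cache, key=lambda e: cosine(e[0], embedding), default=None)
    if best and cosine(best[0], embedding) >= THRESHOLD:
        return best[1]
    return None  # cache miss: call the LLM, then cache_put the answer


# Toy embeddings; a real system would embed the question text.
cache_put([1.0, 0.1, 0.0], "Tacos are corn or flour tortillas...")
hit = cache_get([0.99, 0.12, 0.01])  # near-duplicate question
miss = cache_get([0.0, 0.0, 1.0])    # unrelated question
print(hit is not None, miss)  # True None
```

The linear scan here is the part a vector index replaces; the threshold logic is the semantic-cache part.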

Speaker 1

对。

Right.

Speaker 1

如果我在构建一个代理来解决特定问题,并且使用Redis作为代理的内存服务,那么我是否需要在编写代码和业务逻辑时主动利用Redis来实现某些功能?比如确保不会因为相同问题或任务多次出现而导致令牌开销过大。

If I'm building out an agent to solve some particular problem, and I'm using Redis as basically the service for the memory for my agent, am I writing the code and the business logic to leverage Redis for some of this stuff myself? Let's say I want to make sure that I'm not going and overspending on tokens because the same question has come in multiple times or the same task has come in multiple times.

Speaker 1

我可以在构建过程中利用语义缓存来建立这种钩子,还是说存在某种现成框架可以将Redis集成进去,从而省去这些抽象工作?

Can I leverage the semantic caching by essentially, you know, building that hook in myself, or is there something where I can hook Redis into, I don't know, an existing framework, where that's kind of abstracted away?

Speaker 2

我们付钱让你问这个问题了吗?

Do we pay you to ask this question?

Speaker 2

因为这听起来是个绝妙的计划。

Because this feels like a great a great plan.

Speaker 2

你不需要自己编写那段代码。

You don't have to write that code yourself.

Speaker 2

不过你也可以选择自己写。

You could, though.

Speaker 2

现在用Claude Code来写这个还挺有趣的。

It's quite fun to write it now with Claude Code.

Speaker 2

你可以放手让他去做很多事情。

You can just let him go or let it go and do a lot of stuff.

Speaker 2

不过,如果你正在使用许多流行的框架来创建代理,比如LangGraph,或者你只是在使用LangChain而甚至没有用到LangGraph,那么我们有一些即插即用的组件。

However, if you're using many of the popular frameworks that exist for creating agents, like LangGraph, or you're just doing LangChain stuff and not even using LangGraph, then we have drop-in components.

Speaker 2

其中一些可以通过外部或我应该说独立的代码库获取,比如LangGraph-Redis,这是我们部分开源LangGraph组件的仓库。

And some of those are available through external, or I should say separate, repositories like LangGraph-Redis, which has some of our, like, open source LangGraph parts.

Speaker 2

但还有一个由我们维护的库。

But there is also a library that we maintain.

Speaker 2

我所在的Redis团队维护了一个名为Redis向量库或Redis VL的库,它可以在PyPI上获取。

The team at Redis that I'm on maintains it; it's called the Redis Vector Library, or RedisVL, and it's available on PyPI.

Speaker 2

我们在那里有很多即插即用且更具通用性的组件。

We've got a lot of components in there that are drop in and a little bit more general purpose.

Speaker 2

所以你可能不一定在使用某个特定框架,或者你有一个由多个框架拼凑而成的系统,而你想要一个消息历史或语义缓存对象,可以直接放进一个Python项目里。

So you might not necessarily be using a particular framework, or you've got a Frankenstein of them, and you want a message history or a semantic cache object that you can drop into, like, a Python project.

Speaker 2

你可以用那个库来实现这一点。

You can do that with that library.

Speaker 2

所以我们为你准备了几种不同的选择。

So we've got a few different options for you.

Speaker 1

你认为最终会进入这样一个世界吗?在众多代理框架中,内存将变得更像可插拔组件,比如‘我想用Redis作为我的记忆提供者’。

Do you think that will end up moving into a world with, you know, with the plethora of agent frameworks that are out there where memory will be sort of more of this like pluggable thing where it's like, hey, I want to use Redis as my memory provider.

Speaker 1

我只需简单接入,这些框架就能支持它。

I can just like plug that in and these frameworks support it.

Speaker 1

几乎像是某种开放的内存标准。

Almost like some kind of open standard for memory.

Speaker 2

开放的内存标准。

Open standard for memory.

Speaker 2

我认为这是个非常有趣的话题,因为我对这个问题没有确切的答案。

I think it's a really interesting topic because I don't have a good answer on that.

Speaker 2

不过,如果你看看LangGraph这类框架的运作方式,我想你能大致看出我预期的演变方向。

However, you know, if you look at the way that frameworks like LangGraph work, I think you can see what I would expect to shape up, more or less.

Speaker 2

对吧?

Right?

Speaker 2

类似于LangChain有一个向量存储接口,可以标准化地将向量存储插入链中,你也可以为你的LangGraph代理更换记忆提供者。

So similar to how LangChain has a vector store interface that standardizes just dropping in a vector store into a chain, you can also swap out the memory provider for your LangGraph agent.
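
As a rough sketch of what such a swappable memory-provider interface might look like (the names and shapes here are illustrative assumptions, not LangGraph's or Redis' actual API):

```python
from dataclasses import dataclass, field
from typing import Protocol
import time

@dataclass
class Memory:
    text: str
    created_at: float = field(default_factory=time.time)

class MemoryProvider(Protocol):
    """Illustrative pluggable-provider interface; not a real framework API."""
    def put(self, namespace: str, memory: Memory) -> None: ...
    def search(self, namespace: str, query: str, limit: int = 5) -> list[Memory]: ...

class InMemoryProvider:
    """Toy provider; a Redis-backed one would implement the same two methods."""
    def __init__(self) -> None:
        self._store: dict[str, list[Memory]] = {}

    def put(self, namespace: str, memory: Memory) -> None:
        self._store.setdefault(namespace, []).append(memory)

    def search(self, namespace: str, query: str, limit: int = 5) -> list[Memory]:
        # Naive substring match stands in for keyword or vector search.
        hits = [m for m in self._store.get(namespace, [])
                if query.lower() in m.text.lower()]
        return hits[:limit]
```

An agent framework could then accept any object satisfying `MemoryProvider`, which is roughly the shape of the swappability being described.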

Speaker 2

我认为这非常合理。

And I think that makes a lot of sense.

Speaker 2

根据我使用这些抽象层的经验,我可以告诉你,它并不总是有效,因为所有数据库都有这些略微不同的个性特点和特性。

Having worked with those abstractions, I can tell you it doesn't always work, right, because all databases have these slightly different personality quirks and traits.

Speaker 2

特别是当你考虑像过滤向量搜索这样的事情时。

So especially if you think about things like filtered vector search.

Speaker 2

当然,如果是单纯的向量搜索那还好,我们只是在讨论执行相似性搜索。

So of course, like vector search, that's fine, you know, if we're just talking about just do a similarity search.

Speaker 2

但当我们讨论执行相似性搜索并包含25个不同的分面搜索点时,事情就开始变得复杂了。

But when we're talking about do a similarity search and include these 25 different faceted search points around that, that's where it starts to be like, okay.

Speaker 2

有些数据库根本不支持这个功能,或者某个数据库有特定的查询类型但实现方式不同。

Well, this database doesn't even support that at all, or this one has this particular query type for that, but it's different.

Speaker 2

所以它并不总是像理论上那么顺畅。

So it's not always quite as smooth as it could be.

Speaker 2

因此,这正是标准可以发挥重要作用的地方。

So that's where a standard could be useful for sure.

Speaker 2

不过说实话,我真正感兴趣的是像Cogni、MEM0这类代理记忆产品和框架。

I guess what I'm really interested in, though — just to be totally, brutally honest — is stuff like Cogni and Mem0: agent memory products, agent memory frameworks.

Speaker 2

我在思考这究竟是什么样的存在?

And thinking about that as, like — what is that, exactly?

Speaker 2

因为我曾参与过这类系统的开发,这个领域非常有趣——你能将与记忆相关的内容从即插即用的组件中提取出来,集中放置到核心位置。

Because I've worked on systems like that now, and it's just a really interesting area, where you take more of the memory-related work out of the "just drop in a component and use it somewhere" model and put it somewhere central.

Speaker 2

总之,我认为这是个极具创新潜力的领域。

So anyway, it's a really interesting area of innovation, I think.

Speaker 1

确实,从很多方面来说,你确实需要一个位于存储层之上的、专为代理设计的抽象层。

Yeah, I mean, in a lot of ways you do need essentially a layer above the storage that's agent specific.

Speaker 1

就像拥有查询引擎还不够,还需要某种专为代理设计的记忆引擎——它能理解代理特有的摘要压缩需求,这些功能通常超出了传统数据库的支持范围。

Kind of like you have a query engine, but you need some sort of, I don't know, memory engine for agents — one that understands the summarization and compaction needs of an agent, which go above and beyond what's typically supported by a conventional database.

Speaker 2

没错。

Yeah.

Speaker 2

确实如此。

Absolutely.

Speaker 2

所以我认为目前的情况是,而且我们可能会看到的是,代理框架将会尝试在这方面开辟新天地。

And so I think where that falls currently — and maybe what we'll probably see — is that agent frameworks are gonna try to carve that out.

Speaker 2

对吧?

Right?

Speaker 2

你可以看到LangGraph已经在通过LangMem尝试这样做,其中存储某种程度上是一个可替换的组件。

You can see LangGraph already trying to do that with LangMem, where the storage is sort of a swappable component.

Speaker 2

然后围绕管理所有这些记忆内容的智能和工程,以及其认知系统方面,更像是框架工作的一部分,或者说它承担并试图完成的任务。

And then the intelligence and the engineering around managing all of that memory stuff and the cognitive system aspect of it is more like part of the framework's job or the job that it's taking on and trying to do.

Speaker 2

但这确实是个大工程。

But it is a big job.

Speaker 2

现在实际参与这项工作后,发现工作量相当大。

Having now worked on that, it's quite a lot.

Speaker 2

而且我认为你看看我之前提到的这些产品,就会发现为什么它是一个独立于代理运作的完整系统——代理运作通常更偏向持久化执行或工作流执行。

And I think if you look at these products that I mentioned before, I mean, you can see why it's a whole system, separate from the workings of the agent — which tends to be more like durable execution or workflow execution.

Speaker 2

所以这件事本身比听起来要复杂得多。

Which is itself more complicated than it sounds, too.

Speaker 1

是的。

Yeah.

Speaker 1

是的。

Yeah.

Speaker 1

不过至少可以依赖,我不知道,大约十年左右的工程积累,这些主题已经投入了大量研究。

Although there is — you know, you can rely on at least, I don't know, a decade or so of engineering that's been poured into some of those topics.

Speaker 2

没错,完全同意。

Yeah, absolutely.

Speaker 1

关于长期记忆,是否存在不同类别的思考方式?

In terms of like long term memory, is there different sort of classes of how we think about long term memory?

Speaker 1

不仅仅是'这是我要跟踪的数据',你是否认为需要将这些数据分门别类,以便在智能体工作过程中进行差异化管理和处理?

It's not just, you know, here's the data I want to keep track of, but do you think about those as being in sort of different buckets that you need to be able to manage and treat differently as you know agents progress through their work?

Speaker 2

是的,完全正确。

Yeah, absolutely.

Speaker 2

我认为如果你仔细想想,不管出于什么原因,当我测试一个智能体及其记忆功能时,我特别喜欢用苹果来举例。

I think — for whatever reason, when I'm testing out an agent and its memory in particular, I use apples.

Speaker 2

我真的很喜欢苹果,我想。

I I really like apples, I guess.

Speaker 2

所以我经常和智能体讨论苹果。

So I'm constantly talking to agents about apples.

Speaker 2

每个人都有自己的怪癖,但我觉得用苹果举例是个有趣的切入点。

Everybody has their own weird thing that they do, but I think the apples thing is like an interesting one.

Speaker 2

所以我可以告诉智能体我喜欢苹果。

So I can tell an agent that I like apples.

Speaker 2

对吧?

Right?

Speaker 2

这只是一个事实。

And that's just a fact.

Speaker 2

我确实在某个时刻告诉过它我喜欢苹果,但你可以把这看作是关于我的一个事实。

I did tell it at some point that I liked apples, but you could look at it as a fact about me.

Speaker 2

事实上,如果你要考察这个关于我的事实是否随时间稳定,你会发现它相当稳定。

In fact, if you were to look at whether or not it's stable over time — about me in particular — you would find it's pretty stable.

Speaker 2

这可能只是一个普遍事实。

It's probably just a general fact.

Speaker 2

根据一些论文的分类方式,我喜欢苹果可能属于语义记忆的事实类型。

It might be a semantic fact that I like apples, in the words of some of the papers that break this down into different types of memory.

Speaker 2

而如果你和我妻子谈论我和苹果,她会告诉你,'嗯,他确实喜欢苹果'。

Whereas, you know, if you were talking to my wife about me and apples, she would tell you, like, well, he likes apples.

Speaker 2

对吧?

Right?

Speaker 2

但具体是这周的哪种苹果?

But which type of apple this week?

Speaker 2

因为我偏好的苹果品种每周都在变。

Because it changes every week, which is the apple that I prefer.

Speaker 2

这种事实就更像与时间或时效性紧密相关的事实。

And that is more like a fact that is very tightly bound to time or duration.

Speaker 2

所以,没错,上周我又喜欢上了蜜脆苹果。

So, yeah, last week I did like Honeycrisp again.

Speaker 2

这周我更倾向于宇宙脆苹果。

This week it's more about Cosmic Crisp.

Speaker 2

你知道,我需要苹果带点酸味,这更像是那种情景记忆。

You know, I need a little bit more tartness in the apple, and that's more like what you would consider like an episodic memory.

Speaker 2

或者如果我们再次将这些不同的事实分类,可以将其归为一种有时间限制的记忆类型。

Or again, if we were to break these different facts you could store down into types, that could be one type: a type of memory that is time bound.

Speaker 2

时间限制这个方面非常重要,因为当代理去检索时——就像我说的,我们实际上是在讨论上下文工程的过程——通过我们的工程过程和运行时部分,我们最终为LLM提供了合适的上下文,以便给应用程序的用户或代理一个像样的响应。

And the time-bound aspect is really important, because when the agent goes and retrieves — like I said, we're really talking about this process of context engineering, where by the end of our engineering process and the runtime portion of it, we arrive at the right context for the LLM to give the user of the application, or the agent, a decent response.

Speaker 2

因此我们思考如何达成那个结果。

And so we think about how to get to that outcome.

Speaker 2

我们如何才能让代理——比如我让它去买菜什么的——出去买到正确的苹果?

How do we get to the outcome that the agent that I'm asking it to buy my groceries or whatever is gonna go out and buy the right apple?

Speaker 2

我可不想它买蜜脆苹果。

I don't want it to buy Honeycrisp.

Speaker 2

它需要知道我这周想要宇宙脆苹果,而不需要我每次都去告诉它,因为那样就失去了自主性的价值。

It needs to know that I want Cosmic Crisp this week without me having to go in and tell it every single time because that erases the value of autonomy.

Speaker 2

所以它需要存储我喜欢宇宙脆苹果这一事实以及时间信息,这样当它检索信息构建上下文时,可以按时间排序并叠加最近可能的情景记忆。

So it needs to store the fact that I like Cosmic Crisp apples with the time so that when it retrieves stuff to build that context, it can order them by time and overlay the most recent facts that it has that are episodic perhaps.
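
A minimal sketch of that idea — storing facts with structured time fields and letting the newest time-bound (episodic) fact win at retrieval; the record shape here is an illustrative assumption, not Redis' actual memory schema:

```python
# Each memory is stored as structured data, not just text: the fact itself,
# a memory type, and a timestamp (simplified to an integer tick here).
memories = [
    {"fact": "likes Honeycrisp",   "kind": "episodic", "t": 1},
    {"fact": "likes apples",       "kind": "semantic", "t": 0},
    {"fact": "likes Cosmic Crisp", "kind": "episodic", "t": 2},
]

def current_preference(mems):
    """Overlay episodic facts by recency: the newest time-bound fact wins.

    Semantic facts ("likes apples") stay stable; episodic ones are ordered
    by time so the agent buys this week's apple, not last week's.
    """
    episodic = [m for m in mems if m["kind"] == "episodic"]
    return max(episodic, key=lambda m: m["t"])["fact"] if episodic else None
```

With the timestamps stored, the grocery agent in the example would resolve "which apple?" to the most recent episodic fact rather than the durable semantic one.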

Speaker 2

我想大多数人会说,什么?

Now most people, I think, would say, well, what?

Speaker 2

等等。

Well, wait.

Speaker 2

情景记忆到底是什么?

What is episodic exactly?

Speaker 2

我认为你也可以从另一个角度看待情景记忆,比如说去年夏天我去了奥卡斯岛。

And I think you could also look at episodic in a different way, which is to say, you know, last summer I visited Orcas Island.

Speaker 2

去年夏天也是一个有时间界限的事实。

Last summer is also a time bound fact.

Speaker 2

我去了奥卡斯岛。

I visited Orcas Island.

Speaker 2

所以它们其实非常相似。

So they kinda are very similar.

Speaker 2

因为第一个例子唯一的区别就是我们没有存储时间。

Because really the only difference with the first one is we just didn't store the time.

Speaker 2

而且它非常笼统。

And it's really general.

Speaker 2

这其实不太实用。

It's like not that useful.

Speaker 2

对吧?

Right?

Speaker 2

总之,这些其实是我对程序性记忆(另一种记忆类型)的一些犀利见解。

So anyway — I actually have a kind of spicy take on procedural memory, which is another type of memory.

Speaker 2

我们可以单独讨论那个问题,但这是我的第一反应。

We could talk about that separately, but that's my first response.

Speaker 1

是的。

Yeah.

Speaker 1

那么,在建模这类内容时,如果需要一个代理来追踪这些事件的发生时间,我是否必须像设计数据库模式那样,先考虑记录时间戳,然后可能还需要加入某种知识随时间衰减的机制?这样就不会向安德鲁提起六年前他早已不关心且无关紧要的事情。

Well, in terms of what goes into modeling this stuff with an agent — if I need to keep track of when these types of things happen, do I basically have to go through the process of modeling this like a schema? Like, okay, I need to track the timestamps, and then I might need to factor in some sort of, I don't know, decay on this knowledge over time, so that I'm not telling Andrew about something that happened six years ago that he no longer cares about and is no longer relevant.

Speaker 1

本质上,如何让代理能够综合考虑所有这些不同因素?

Like, how do you essentially inform the agent so it's able to take into account all these different things?

Speaker 2

是的,完全正确。

Yeah, absolutely.

Speaker 2

我刚才记得你说过,我们在工作流和持久执行方面已经积累了几十年的相关知识。

So I remember, just a moment ago, you were like, well, with workflows and durable execution stuff, we have all these decades of knowledge about how to do it.

Speaker 2

关于AI技术,我最欣赏的一点是,几乎所有问题在某种程度上都可以归结为我们拥有数十年经验的领域。

What I love about AI stuff is that almost every problem, in some way, boils down to something we have decades of experience with.

Speaker 2

具体到这个问题,上下文工程和检索本质上都属于信息检索范畴,意味着我们在这方面拥有极其丰富的经验,正是针对这类问题的经验。

So in particular, this one, the fact that context engineering and retrieval are all really a form of information retrieval means that we have tons of experience with this problem, exactly this problem.

Speaker 2

长期以来,我们一直在将内容存入搜索引擎并尝试提取,力求获取最相关的信息。

So we we've been putting things into search engines and trying to pull them out and to pull out the most relevant things for a long time.

Speaker 2

而且这种方法已经存在很长时间了。

And this is one that's, you know, that's been around.

Speaker 2

如果你曾经开发过任何带搜索功能的产品,你就会知道,总会有那么一天,有人会走到你桌前说,嘿。

If you've ever worked on any kind of product with search, you'll know that at some point somebody is gonna come to your desk and be like, hey.

Speaker 2

你知道,就是网站上这周视频在打折促销。

You know, the thing is, videos are on sale this week on the site.

Speaker 2

所以我需要这种临时提升。

So I kinda need this boost.

Speaker 2

搜索引擎应该直接把视频排名推上去。

The engine should just push videos up.

Speaker 2

对吧?

Right?

Speaker 2

这是一种基于索引中内容类型的动态提升。

It's a dynamic boost based on the type of content that's in the index.

Speaker 2

通常还会有一个针对时效性的稳定提升机制,对吧,这本质上会逐渐降低旧内容的权重。

Typically, you would also have a stable boost for recency, right, that starts to essentially downvote stuff that's older.
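
That recency boost can be sketched as a simple exponential decay applied to a relevance score; the half-life parameter here is an illustrative choice, not a standard value:

```python
def boosted_score(base_score: float, age_days: float,
                  half_life_days: float = 30.0) -> float:
    """Recency-weighted relevance: an item loses half its score every
    half-life, so older results are progressively 'downvoted'."""
    return base_score * 0.5 ** (age_days / half_life_days)
```

A search layer would apply this to each hit's raw relevance score before the final sort, so a slightly-less-relevant but fresh memory can outrank a stale one.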

Speaker 2

在信息检索方面也是同样的道理。

The same thing is true for retrieving information.

Speaker 2

所以答案是肯定的。

So the answer is yes.

Speaker 2

你确实需要这样做。

You definitely do.

Speaker 2

你确实需要存储的不仅仅是文本。

You definitely do wanna store more than just the text.

Speaker 2

你需要存储结构化的数据,比如时间。

You wanna store things, structured data like the time.

Speaker 2

我会说可能是越多越好。

And I would say probably as much as possible.

Speaker 2

对吧?

Right?

Speaker 2

如果用户记忆中提到的时间能被提取出来,我们后续就能在查询中包含这一点。

So the time that the user referenced in the memory — if you can extract that into a time, then we can include that in queries later.

Speaker 2

这样你就能按用户在记忆中提及的时间来排序信息——当然,这对他们来说通常比记忆系统中创建的时间更重要,但后者也同样重要。

Then you could order information by the times that users were talking about in the memory — which, of course, tends to be more important to them than the time that you created it in the memory system, though that's also important.

Speaker 2

对吧?

Right?

Speaker 2

所以,是的,这就是我兴奋的回应——需要模式、日期。

So, yeah, that's my like excited response there is yes, schemas, dates.

Speaker 1

我们的上下文有限,即使我们在来回交流——这本质上就是一次会话——即使对当前会话重要的短期互动,最终也可能超出上下文窗口,需要以某种方式修剪等等。

We have limited context, and even as we're going back and forth — this is essentially a session — even the short-term interaction that's important to that session could eventually grow outside the context window, and we'd want to trim it in some fashion, and so forth.

Speaker 1

但同时,我们也必须考虑存储在长期记忆中的内容,我们正在恢复这些内容,并希望将其纳入考量。

But then at the same time, we also have to take into account sort of what is stored in this kind of long term memory where we're restoring that and we want to take that into account as well.

Speaker 1

那么你如何看待优先分享这些信息的能力?

So how do you think about being able to prioritize the sharing of that information?

Speaker 1

因为就像一个人最近的对话,我可能更倾向于优先考虑它,而不是长期记忆存储中的某些内容。

Because presumably just like a person, you know, recent conversation I had, I probably want to take into account at a higher priority than maybe something that is sort of in my long term memory store.

Speaker 1

但那些仍然很重要。

But that's still important.

Speaker 1

那么,在上下文构建和上下文工程中,我该如何确定如何强制实施短期与长期的优先级排序呢?

So how do you figure out, in the construction of the context — the context engineering piece — how to actually enforce a prioritization of short term versus long term?

Speaker 2

这个问题问得好,因为我觉得答案还在探索中。

That's a great question, because I would say — you know, I feel like the answer is still out there.

Speaker 2

因为...因为我只是在不断调整提示词,试图获得更好的检索效果。

Because I'm just trying to, you know, screw with the prompt until I get better retrieval, a lot of the time.

Speaker 1

是啊。

Yeah.

Speaker 1

所以还是有点靠感觉来调整提示词。

So still a little vibe prompting.

Speaker 2

没错。

Yeah.

Speaker 2

说实话,每次我调整提示词做上下文工程时,都感觉本质上是在用英语文本帮助理解——虽然实际根本不是这么回事。

I mean, honestly, every time I'm, like, manipulating the prompt for context engineering, I do feel as if at the end of the day, I'm writing English text to try to help someone understand, although that's not even what's happening.

Speaker 2

对吧?

Right?

Speaker 2

说白了,我的解决方案就是尝试用不同方式构建提示词结构,试图在提示中传达意图。

Literally the answer is I will tend to try to structure the prompt in different ways so that I'm attempting to communicate in the prompt.

Speaker 2

这些信息属于长期记忆范畴。

This information is long term — it's from long-term memory.

Speaker 2

我们认为它具有持久性或非常重要。

We consider it durable or we consider it very important.

Speaker 2

然而,用户刚刚才说了这些内容。

However, the user just said this stuff.

Speaker 2

这是即时对话内容,因此如果他们推翻了之前说过的话,你应该以此为准。

This is the immediate conversation, so you should consider this canonical if they override something else that they said.
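
One way to sketch that prompt structure — separating durable long-term memory from the canonical current conversation; the section labels are assumptions for illustration, not a standard format:

```python
def build_prompt(long_term: list[str], recent_turns: list[str],
                 user_message: str) -> str:
    """Assemble context so the model can tell durable facts from the live
    conversation, and knows the conversation wins on conflict."""
    sections = [
        "## Long-term memory (durable facts; may be stale):",
        *long_term,
        "## Current conversation (canonical; overrides memory on conflict):",
        *recent_turns,
        "## User:",
        user_message,
    ]
    return "\n".join(sections)
```

Whether the model actually honors that priority is, as the conversation notes, something you hope for and then measure.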

Speaker 2

不过是否能产生正确结果——要知道它并不总是完全按照我刚描述的方式运作——某种程度上你只能期待并测量。

Whether or not that'll produce the right result, though — you know, it's not always doing it exactly the way I just described — is sort of a "you just hope and measure" thing.

Speaker 2

我们会测量结果的。

We'll measure the results.

Speaker 1

我觉得我们在把这些事情转化为更工程化的规范方面有所进步,但仍有大量需要迭代测试的环节,说白了就是摸索如何操控模型,让它通过调用各种外部信息源(无论是向量存储还是某种数据库)来产生你需要的帮助。

I think we're getting a little bit better at turning some of these things into more of an engineering discipline, but there's still a lot of these iterate-and-test cycles of — for lack of a better word — figuring out how to manipulate the model to get the help you need it to produce, in terms of using some of these various places where we're going to bring in additional information, whether that's a vector store or some sort of database.

Speaker 1

要知道两年前——或许是一年半前——所有人都在谈论RAG和向量数据库这类东西,而现在我们似乎已经意识到,向量和语义搜索并非唯一需要做的事情。

You know, two years ago — or maybe it was a year and a half ago — everyone was always talking about RAG and vector databases and things like that, and it feels like we've gotten to a place where we've realized that vectors and semantic search aren't the only thing that we need to do.

Speaker 1

我们还需要考虑一些其他因素。

We also need to take into account some other things.

Speaker 1

所以你能谈谈为什么会出现过度依赖向量数据库的断层吗?

So can you talk a little bit about why there was a gap with using, or relying solely on, vector databases?

Speaker 1

这些混合搜索技术在这些场景中又是如何发挥作用的?

And how do some of these hybrid search techniques sort of play a role with all this?

Speaker 2

是的,这也是个很棒的话题.

Yeah, this is a great topic as well.

Speaker 2

想想看,回到我们正在进行的上下文工程到底是什么?

I think, you know, again, getting back to: what is the engineering that we're doing in context engineering?

Speaker 2

其中一部分就是这种英语语言,甚至不限于语言层面的提示词操控,对吧?

Part of it is this sort of English-language — or not even language — manipulation, right, within the prompt.

Speaker 2

但这还不是全部,对吧?

That's not all, right?

Speaker 2

那就只是提示词工程了。

That's that would be prompt engineering.

Speaker 2

这是撰写优质提示所面临的挑战之一。

That's part of the the challenge of writing a good prompt.

Speaker 2

但其他工程环节往往围绕信息检索展开。

But the other engineering parts tend to be around information retrieval.

Speaker 2

所以,这就是我对这个问题的思考方式。

And so, you know, that's how I think about this problem.

Speaker 2

我认为如果你回顾当时究竟发生了什么,这只是我个人的看法——我从事这行太久了。

So I think if you just look back at what is that exactly happened back then, this is just my, like, I've been doing this too long kind of take.

Speaker 2

我观察到很多人试图将向量数据库用于本质上是基于文本的数据,或者我们需要通过有效关键词搜索获取输入的场景。

I observe a lot of people trying to use a vector database for data that is fundamentally text based or where we're fundamentally going to get the input that we need to do an effective keyword search.

Speaker 2

然而,这种做法曾非常流行,而且我认为很多这类技术给人的感觉比实际要新得多。

However, it was very popular and a lot of this stuff, I think, feels much newer than it really is.

Speaker 2

所以我们往往会想,哦,没错。

So we tend to think, oh, right.

Speaker 2

当然,我们需要一些并非人人都在使用的新工具,比如用向量数据库来为AI内容进行搜索,因为AI是新生事物。

Of course, we would need something new that not everybody uses, like a vector database to do this search for AI stuff because AI is new.

Speaker 2

但归根结底,我认为Claude Code是这个问题或实际运作方式的绝佳例证。

But at the end of the day, I think Claude Code is a really good example of this question, or how this really plays out.

Speaker 2

如果你观察Claude Code和Cursor,或者随便拿其他任何东西举例,只要在另一边放上任何东西然后说,好吧。

So if you look at Claude Code and you look at Cursor — or, really, let's just say anything else; if you just put anything on the other side and say, well, okay.

Speaker 2

这个团队正在构建这个东西,他们选择了向量搜索,并且全力投入向量搜索。

This team is building this thing and they chose vector search and they leaned hard into vector search.

Speaker 2

向量搜索在输入不太具体的情况下非常出色,这时你仍希望用向量找到相关事物的集群。

Vector search is great in cases where the input's gonna be less specific, right, and you still want to try to find clusters of related things with vectors.

Speaker 2

这正是它的优势所在。

That's what it's good at.

Speaker 2

或者当输入是非语言数据时它很管用,比如你想进行图像搜索这类操作。

Or it's good when the input is non-language, or you want to do something like image search.

Speaker 2

但它往往不太针对具体应用或数据特性,而是要看哪种方式更有效。

But which of these things is going to be more effective tends to be not so much application specific or, let's say, data specific.

Speaker 2

通常这往往取决于具体任务。

It often tends to be task specific.

Speaker 2

一个很好的例子是,我打开编辑器或代码代理(不管具体用哪个),然后开始探索一个新的代码库。

So a really good example is: I open up my editor or my code agent — whichever of the two I'm using — and I'm exploring a new code base.

Speaker 2

你知道吗?

You know?

Speaker 2

我真正需要的是能帮我识别相关事物集群的工具。

What I really need is something to help me identify the clusters, the clusters of related things.

Speaker 2

因为在项目中,这些内容虽然概念相关,却经常分散存储在各种不同的文件中。

Because in projects, they're often stored in all of these files in other places, even though they're actually conceptually related.

Speaker 2

因此你可以通过整个应用程序中的一堆不同目录下的不同文件来获取其中的一个切片。

So you could take a slice of that through the entire application in a bunch of different files in different directories.

Speaker 2

这些才是相互关联的内容,而不是目录中的文件。

Those are the things that are related, not the things in the directory.

Speaker 2

或者说它们是以另一种维度相关联的,而这正是我所关心的。

Or they're related in a different dimension, and that's the one I care about.

Speaker 2

所以在探索新代码库或类似场景时,语义搜索或向量搜索会非常有用。

So semantic search or vector search can be very useful in that case of exploring a new repo or whatever.

Speaker 2

但有时候我会让代理重命名一个变量。

But then there's the time that I tell the agent to rename a variable.

Speaker 2

我绝对不需要模糊搜索来做这件事。

I definitely don't need fuzzy search for that.

Speaker 2

我只需要使用关键词搜索,精确匹配那个变量名。

I just need to use keyword search and do exact matches on that variable name.

Speaker 2

就像我说的,对吧,这是同样的数据。

So like I said, right, it's the same data.

Speaker 2

这是一个代码仓库。

It's a repository.

Speaker 2

我是同一个用户。

I'm the same user.

Speaker 2

我可能在使用同一个应用程序,但任务不同。

I'm using the same application probably, but the task is different.

Speaker 2

因此对搜索的需求也就不同了。

And so the demands on search are different.

Speaker 2

因此我认为,在某些类似情况下,你可以轻松将其分解为:这是适合该任务的搜索类型。

So I think in some cases like that, you can really break it down easily into: this is the search type that's appropriate for this task.

Speaker 2

但还有其他情况,比如我们的知识库包含大量不同结构化的、分层的或层级化的数据——听起来可能有点高级,但其实我指的就是像书籍这样的东西。

But there are other cases where we're doing something like — you know, our knowledge base has lots of different structured and tiered, or hierarchical, data. Maybe that sounds fancy, but I'm just talking about something like a book.

Speaker 2

比如,一本书就是书,它有章节——也许我索引了几本书,对吧,它们都有章节。

Like, a book is a book and it has chapters — maybe I indexed several books, right, and they all have chapters.

Speaker 2

这些章节又包含段落。

Those chapters have paragraphs.

Speaker 2

所以有一篇名为《Raptor论文》的文章,他们在将内容拆分存入索引时,逐层递归地总结了这些内容。

And so there's a paper called the RAPTOR paper, where they went through and recursively summarized each of these things when they broke them up to go into the index.

Speaker 2

当你观察这个过程时,我们实际上在讨论:我究竟需要什么?

And when you look at that, then we're talking about, well, what do I really need exactly?

Speaker 2

因为在他们案例中,论文最后会——抱歉,剧透预警——

Because, you know, in their case, right, by the end of the paper — sorry, spoiler alert.

Speaker 2

但他们发现,如果在完成所有工作后进行分层搜索,数据库中就已经包含了全部内容。

But what they found is, if you do it as a hierarchical search after doing all that work, you've got everything in the database.

Speaker 2

在他们的案例中,数据库里存储的是每一层级的嵌入式摘要。

In their case, what they have in the database is embedded summaries of each of those layers.

Speaker 2

所以他们拥有整个集合的嵌入式摘要、集合部分的嵌入式摘要,以及叶子节点(如小段落等)的嵌入式摘要。

So they have an embedded summary of the whole collection of things, embedded summaries of the parts of the collection, and embeddings of the leaves — the tiny paragraphs or whatever.

Speaker 2

抱歉。

Or sorry.

Speaker 2

具体到他们的案例中,叶子节点直接嵌入了实际段落内容,而其他层级则是摘要。

And in their case, specifically, they have the actual paragraph at the leaf node embedded directly, and then the other things are summaries.

Speaker 2

但当作为用户的我使用这类应用程序时(我曾参与开发过),从代理端发起的搜索并不完全属于其中某一类,因为某些内容会受益于精确匹配。

But when, as a user, I go and use an application like that — which I have worked on — the search that we wanna do from the agent side isn't so cleanly one or the other, because one of those things is gonna benefit from exact matches.

Speaker 2

这通常是指直接存在于数据库中的文本内容。

That's gonna be the text that's directly in there.

Speaker 2

在实际应用中,这些文本不仅以向量形式存在,还会保留原始文本以便进行两种类型的搜索和某种混合搜索。

And it's usually directly in the database in a real application, right — not just as vectors, but also as text, so that we can do both types of searches and some kind of hybrid search.

Speaker 2

当你这样组织数据——部分内容使用摘要,其他保留原始内容——并融合两种搜索结果的排名时,往往会取得良好效果。

And it tends to be effective when you've laid the data out like that — with summaries of some things and the actual content of the others — to do both searches and fuse the ranks.
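
Fusing the ranks of a keyword search and a vector search is often done with something like Reciprocal Rank Fusion; a minimal sketch (the `k` constant is the commonly used default, and many databases now do this fusion server-side):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: combine ranked lists from keyword and
    vector search by rank position, so their incomparable raw scores
    never have to be mixed directly."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears near the top of both lists ends up ranked above one that tops only a single list.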

Speaker 1

关于如何根据任务选择查找方式的问题

In terms of making this choice of how to do the lookup based on the task.

Speaker 1

这必须是一个预先确定的工程选择,还是可以依赖模型智能决策?

Does that have to be an engineering choice that's essentially predetermined or is this something that you can rely on the model to be able to intelligently decide that?

Speaker 1

好的,在这种情况下我意识到可以进行基于键的查找,所以我会调用某种工具端点来执行基于键的查找,而不是语义性质的查找

Okay, in this circumstance I realize I can do a key based lookup, so I'm going to talk to some sort of tool endpoint that can do the key based lookup versus something that's more like semantic in nature.

Speaker 2

这要看情况

It depends.

Speaker 2

我想说的是,我们团队经常与人们讨论这个问题。

I would say — so, we talk to people about this a lot on my team.

Speaker 2

这就是为什么测量如此重要。

This is why measurements are so important.

Speaker 2

所以这个项目中你首先应该做的是尝试衡量质量或准确性,也就是你正在构建的这个东西对你来说重要的那些方面,这样你才能进行实验,因为这会因情况而异。

So the first thing that you should work on in this project is trying to measure the quality or accuracy — you know, the things that matter to you about this thing that you're building — so that you can experiment, because it will depend.

Speaker 2

但我的观点是——我的观点是这还取决于你使用的数据库。

But my opinion is, it also depends on the database that you're using.

Speaker 2

这其实也是为什么我当时想,你知道吗?

This is actually another reason why I was like, you know what?

Speaker 2

Redis正处于这样一个位置,我觉得这是件好事。

Redis is in this position where — and I think this is a good thing —

Speaker 2

我要回去研究AI相关的东西了。

I'm gonna go back and work on AI stuff.

Speaker 2

因为如果我需要进行两次缓慢的搜索来完成融合,或者现在很多数据库会自己处理融合操作。

Because if I have to make two slow searches to do a fusion — or, you know, these days, a lot of databases will just do the fusion on their side.

Speaker 2

但他们的数据仍然存储在磁盘上。

Still, their data is on disk.

Speaker 2

对吧?

Right?

Speaker 2

所以即使他们在数据库服务器端处理了部分工作,搜索仍然会很慢。

So the search is still gonna be slow, even if they manage some part of that on the server side of the database.

Speaker 2

如果真是这样,无论哪种方式都可能是个问题。

If that's true, it could be a problem either way.

Speaker 2

对吧?

Right?

Speaker 2

所以如果我们为了提升准确性而进行过多搜索,反而会增加延迟。

So then we could add latency by doing too many searches to try to improve accuracy.

Speaker 2

而且问题在于我们是否进行了过多搜索,并且每次都机械式地执行——因为我们采用确定性方式,不让模型自主选择。

And maybe we do too many searches, and we do them every single time, because we do it deterministically — we don't let the model choose.

Speaker 2

这确实是个严重问题,因为每次交互都会这样。

That's that's a real problem because it's like every single time.

Speaker 2

明白吗?

Right?

Speaker 2

用户与代理之间的每次互动,都会触发模型发起一系列工具调用或其他操作。

Every turn of every interaction between the user and the agent spawns off a series of tool calls from the model, or whatever happens.

Speaker 2

但作为工具调用,这种方式通常效果很好。

But as a tool call, that often works really well, I think.

Speaker 2

这取决于具体模型,但就我的经验而言,将任务分解为工具调用往往很成功——这样既能拆分任务,模型也常能做出正确决策。

It depends on the model, you know, but I've had a lot of success with moving things into tool calls, so that you can break it up, and it will often make good decisions.
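
A toy sketch of exposing the two search modes as separate tools so the model can pick per task; the registry shape loosely mirrors common function-calling setups but is not any specific provider's API, and the vector tool is a stub:

```python
# Each tool gets a description the model uses to choose, plus a callable.
TOOLS = {
    "keyword_search": {
        "description": "Exact keyword match. Use for specific identifiers, "
                       "e.g. finding every occurrence of a variable name.",
        "fn": lambda corpus, q: [d for d in corpus if q in d],
    },
    "vector_search": {
        "description": "Semantic similarity. Use for fuzzy, conceptual queries.",
        # Placeholder: a real implementation would query an embedding index.
        "fn": lambda corpus, q: corpus,
    },
}

def dispatch(tool_name: str, corpus: list[str], query: str) -> list[str]:
    """Run whichever tool the model selected for this turn."""
    return TOOLS[tool_name]["fn"](corpus, query)
```

The model's choice per turn replaces a deterministic "always run both searches" pipeline, trading a little predictability for lower average latency.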

Speaker 2

但除非你实际测量过,否则你不会知道它长期的表现如何。

But you don't know unless you've measured, you know, how that performs over time.

Speaker 1

对。

Right.

Speaker 1

嗯。

Yeah.

Speaker 1

我是说,所有这些归根结底,希望你不只是凭感觉行事,用那种方式衡量。

I mean, all of these things come down to — hopefully you're not just kind of putting your finger in the wind and measuring it that way.

Speaker 1

但没错,这更像是一种更正式的方式,就像测试和实际评估你之前提到的关于混合搜索的问题——对于我们讨论的这些记忆系统,你真的需要从零开始构建,从一开始就建模吗?

But yes, there's a more formal way of testing and actually evaluating it. In terms of what you were talking about before, around hybrid search, for any of these memory systems we're discussing — do you really need to be building these from scratch, thinking about what you're building and modeling it from the beginning?

Speaker 1

或者如果你已经拥有这些数据,部分数据已经存在于数据库中,你能利用这些系统吗?还是你真的需要考虑如何将这些数据转换成易于消费的形式来服务于AI用例?

Or, if you already had some of this data somewhere, like in a database, can you leverage those systems? Or do you really need to think about how to get that data into a form that can be easily consumed and serve the AI use case?

Speaker 2

我认为你需要。

I think you do.

Speaker 2

我认为你确实需要考虑这个问题,因为就像其他数据工程特有的问题一样,数据几乎从来不会以适合任何用途的形式存在。

I think you do need to be thinking about it, because — like with other data-engineering-specific problems — data is almost never in the right form for anything.

Speaker 2

就像我这样的应用开发者会一股脑儿地把一堆乱七八糟的东西塞进数据库,还自以为这样没问题。

Like, application developers like me will just cowboy through and put a bunch of junk in the database, and you just expect that that's good.

Speaker 2

从操作层面来说,这勉强能用。

Operationally, it's fine.

Speaker 2

但当你真正想用它做点什么时,就会发出疑问——

But then you try to use it for anything, and it's like, why?

Speaker 2

为什么你要把所有这些东西用逗号塞在同一个字段里?

Why did you put all this stuff with commas in one field?

Speaker 2

这简直太荒谬了。

It's just ridiculous.

Speaker 2

在不了解背景的情况下根本没法拆分这些数据。

There's no way to split these up without knowing.

Speaker 2

所以数据格式往往是不对的,在AI领域尤其如此——就像我之前说的,假设我有个合法授权购买的书籍数据集。

So it tends to be the wrong format, and that's particularly true with AI stuff where, like I was describing earlier, we could imagine that I have this dataset of books that I have licensed and purchased.

Speaker 2

这些不是盗版书,我正在把它们存入数据库。

They are not, you know, pirated books, and I'm putting them in a database.

Speaker 2

对吧?

Right?

Speaker 2

所以我可以直接把它们放进去,就像那样。

So I could just put them in, you know, like that.

Speaker 2

我完全可以只把它们全部放进一条记录、一个文档里,比如我的文档数据库或Redis,全都只是文本。

I could literally just put them all in one record — one document, you know, in my document database or Redis — and it's all just text.

Speaker 2

但你知道,这对很多很多事情都不适用,尤其不适用于AI相关的东西。

But, you know, that's not gonna work for many, many different things, and especially not for AI stuff.

Speaker 2

所以你真的需要审视你在用AI做什么。

So you really need to look at what you're doing with AI.

Speaker 2

我们会称之为分块处理。

We would call it chunking.

Speaker 2

对吧?

Right?

Speaker 2

就像猛禽方法那样,他们把内容拆分成片段,然后递归总结不同层级的内容。这正是你需要考虑的事情,即使数据已经存在,也要根据你的智能体将要进行的搜索类型,对现有数据进行这样的分块处理。

So, like with the RAPTOR approach, where they break these things up into pieces and then recursively summarize the different levels of them — that's really the kind of thing you need to be thinking about. Even if the data already exists, take the data that already exists and chunk it out like that, depending on the type of search that your agent is going to be making.
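
A rough sketch of that RAPTOR-style layout — leaves hold raw paragraphs, each higher level holds summaries of groups from the level below. The summarizer here is a stub assumption; a real pipeline would call an LLM and embed every node into the index:

```python
def summarize(texts: list[str]) -> str:
    """Stub summarizer; a real pipeline would call an LLM here."""
    return " / ".join(t[:20] for t in texts)

def build_tree(paragraphs: list[str], fanout: int = 2) -> list[list[str]]:
    """Build level-by-level summaries: level 0 is the raw paragraphs,
    each subsequent level summarizes groups of `fanout` nodes below it,
    until a single root summary remains."""
    levels = [paragraphs]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([summarize(below[i:i + fanout])
                       for i in range(0, len(below), fanout)])
    return levels
```

Every node at every level would then be embedded, so a query can match a high-level summary or an individual paragraph, whichever is closer.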

Speaker 1

在处理向量时尤其需要注意,要考虑你的重新索引策略是什么,比如当需要重新处理一个网站或网页时,网页更新后就需要重新处理。

Especially with vectors, you have to be thinking about what your re-indexing strategy is as well, where if the source, let's say a website or web page, updates, then you need to reprocess it.

Speaker 1

你知道有哪些策略可以实际更新存储该页面分块的向量库吗?

You know, what are the strategies for actually updating the vector store that has the chunks of that page?

Speaker 1

我是否需要彻底清除原有的副本?

Do I need to sort of blow away the original copy of it?

Speaker 1

重新注入新数据并重新索引,还是有更好的方式来处理这种更新或增量更新?

Ingest the new data and sort of re-index, or is there a better way of handling that kind of update or upsert?

Speaker 2

这取决于,你知道的,答案其实是取决于数据库,甚至取决于你在数据库中存储数据的方式。

So the answer is it really depends on the database, and even within the database, how you're storing the data.

Speaker 2

我特别想到Redis,对吧?

And I'm thinking specifically of Redis, right?

Speaker 2

所以在Redis中,你可以使用哈希结构存储多个字段,这种情况下向量只是其中一个字段。

So with Redis, you know, you can use a hash and store multiple fields, in which case the vector is just one of the fields.

Speaker 2

然后你需要具备这样的能力:给定特定输入,能判断它在Redis中的表示(比如一个哈希)现在与源数据不同了。

And then you need the ability, given a particular input, to know that its representation in Redis, let's say as a hash, is now different from the source.

Speaker 2

对吧?

Right?

Speaker 2

所以如果你愿意——抱歉再次用哈希举例——但如果你能对一组稳定的输入进行哈希处理,它就能告诉你这是同一个网页,只是内容发生了变化。

So, sorry to reuse the word hash, but if you can hash a set of inputs that's stable, it will tell you that it's the same web page but the content has changed.

Speaker 2

那么是的。

Then yes.
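The stable-inputs hashing just described can be sketched in a few lines (the key scheme and the dict-based store here are illustrative; any database that supports lookup by key can back this):

```python
# Change detection via two hashes: a stable key derived from the URL
# identifies the record, and a content hash reveals whether the body
# changed since the last ingest.
import hashlib

def record_id(url: str) -> str:
    # Stable inputs -> stable key: the same URL always maps to the same record.
    return "page:" + hashlib.sha256(url.encode()).hexdigest()[:16]

def content_hash(body: str) -> str:
    return hashlib.sha256(body.encode()).hexdigest()

store = {}  # stand-in for the database

def needs_reindex(url: str, body: str) -> bool:
    key = record_id(url)
    new_hash = content_hash(body)
    if store.get(key) == new_hash:
        return False          # same page, same content: skip re-embedding
    store[key] = new_hash     # same page, new content: re-chunk and re-embed
    return True

print(needs_reindex("https://example.com/a", "v1"))  # True: first sighting
print(needs_reindex("https://example.com/a", "v1"))  # False: unchanged
print(needs_reindex("https://example.com/a", "v2"))  # True: content changed
```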

Speaker 2

我的意思是,你可以直接重新索引。

I mean, you can just re index.

Speaker 2

你可以只修改哈希的部分内容。

You can just change part of the hash.

Speaker 2

如果你使用Redis的查询引擎,它会用新的变更值重新索引该哈希。

And if you're using the query engine in Redis, it will re-index that hash with the changed value, for example.

Speaker 2

JSON也是同样的道理。

JSON is is the same thing.

Speaker 2

现在Redis有JSON数据类型,其工作方式几乎相同。

So Redis now has a JSON data type, and it works pretty much the same way.

Speaker 2

对吧?

Right?

Speaker 2

我们可以对JSON中的特定字段建立索引。

You can index specific fields in the JSON.

Speaker 2

所以是有这个功能的。

So there's that.
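A toy in-memory stand-in for the partial-update pattern just described (with real Redis you would issue an HSET on the changed field, and if the query engine indexes that hash it re-indexes the document for you; this sketch only shows that updating one field leaves the vector field untouched):

```python
# Simulated Redis hash: text, metadata, and the embedding live side by
# side as fields of one record, so a partial update can replace just the
# changed field instead of deleting and re-ingesting the whole record.

db = {}

def hset(key, mapping):
    # Mirrors the semantics of Redis HSET: merge fields into the record.
    db.setdefault(key, {}).update(mapping)

# Initial ingest of the document.
hset("doc:1", {"url": "https://example.com", "text": "old body",
               "embedding": [0.1, 0.2, 0.3]})

# The page text changed; update just that field (its embedding would be
# recomputed and written the same way).
hset("doc:1", {"text": "new body"})

print(db["doc:1"]["text"])        # new body
print(db["doc:1"]["embedding"])   # [0.1, 0.2, 0.3], untouched
```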

Speaker 2

但显然,就像我刚说的,有几件事你需要考虑。

But obviously, like I just said, there's a couple of things you have to think about.

Speaker 2

你如何进行映射?

How do you map them?

Speaker 2

通常是通过哈希处理来获取相同的记录。

Usually, it's hashing stuff to get the same, you know, record.

Speaker 2

然后要看数据库是否支持这种操作。

And then whether the database can support that.

Speaker 2

说实话,我实在想不出有哪个数据库不支持这种策略。

And honestly, I would struggle to think of a database that couldn't support that strategy.

Speaker 1

你认为在智能体记忆系统领域,下一个前沿方向会是什么?

What do you think is the next frontier around these like memory systems for for agents?

Speaker 1

除了我们讨论过的那些需要靠感觉摸索的上下文工程问题外,目前还缺少哪些关键要素?

Like, what is missing, besides some of the things we talked about, like having to sort of vibe our way through the context engineering?

Speaker 1

从数据基础设施的角度来看,是否存在某些核心组件的缺失,阻碍了我们解决智能体在生产系统中高效运行的问题?

But like, you know, is there sort of core pieces of like the data infrastructure that's missing that would help us solve some of these problems with getting some of these agents to perform really well in production systems?

Speaker 2

是的,完全同意。

Yeah, absolutely.

Speaker 2

我认为有个重大缺失环节。

There's the big missing piece, I think.

Speaker 2

今年谷歌DeepMind团队发表了一篇相关论文。

There's a paper this year from Google DeepMind research.

Speaker 2

可能很多人都读过了。

A lot of people probably have read it.

Speaker 2

我不太确定。

I don't know.

Speaker 2

我不知道那是什么原理。

I don't know how that works.

Speaker 2

我猜只有少数人会熬夜读学术论文吧。

Only some people stay up at night reading academic papers, I guess.

Speaker 2

如果你读过的话,就是《通用智能体需要世界模型》那篇。

If you do, it's the one called "General Agents Need World Models."

Speaker 2

这篇论文一直萦绕在我心头。

This paper is haunting me.

Speaker 2

我也不知道为什么。

I don't know why.

Speaker 2

可能同事都在说:你能别再提那篇论文了吗?

Probably everybody at work is like, would you shut up about the paper?

Speaker 2

它其实没那么长。

It's really not that long.

Speaker 2

我们不需要每天都听你讲这个。

We don't need to hear about it every day.

Speaker 2

但它一直萦绕在我心头,因为今年早些时候出于好玩,我就想做一个专门玩文字游戏的智能体,因为那些游戏很有趣。

But it haunts me because just for fun earlier this year, I was like, I want to make an agent that just plays text games because those are fun.

Speaker 2

唉,就像这样。

Like, ugh.

Speaker 2

为智能体设计一个文字游戏,然后能向大家展示:看,这是个文字游戏智能体,它会利用记忆来学习。

Making a text game for an agent and then being able to show people: oh, this is a text-game-playing agent, and it uses memory to learn.

Speaker 2

听起来就很有意思。

That sounds like a lot of fun.

Speaker 2

确实很有趣,直到后来发现智能体其实没多大进步。

And it was fun until the agents didn't actually improve all that much.

Speaker 2

正是这点让我开始思考这个问题,结果发现和论文里的观点完全吻合。

And so this is what got me thinking about this, and then it aligned perfectly with what I I see in this paper.

Speaker 2

所以问题就像我开场说的那样,因为它一直盘旋在我脑海里。

So the problem is what I actually started this this conversation with because it's always on my mind.

Speaker 2

我在考虑接下来几个月要开发一个能与实时基础设施交互并进行调整的智能体。

I'm thinking about building, you know, an agent over the next few months that'll interact with live infrastructure and make changes.

Speaker 2

这与玩文字游戏的智能体非常相似。

That's very similar to an agent playing a text game.

Speaker 2

相似之处在于,要在那种环境中生存并成功完成任务,智能体需要预测环境将如何变化。

The thing that's similar is that to survive that environment, to do things successfully, the agent needs to predict how the environment will change.

Speaker 2

这本质上不是预测之后会用什么语言来描述,而是真正成功预测如果采取某个行动,下一个状态会如何改变。

Fundamentally, that's not about predicting the language that will describe the outcome afterward; it's about actually succeeding at predicting the next state change if it takes an action.

Speaker 2

所以对于玩这些文字游戏的智能体,你可以做很多上下文工程。

So agents that are playing these text games, you could do a lot of context engineering.

Speaker 2

你可以提升它们的表现,但说真的,考虑到你投入的工作量,它们应该能掌握一个游戏并将从过去游戏中学到的经验泛化应用,但往往并非如此。

You can improve their performance, but by golly, with the amount of work that you could pour into these things, they should be able to pick up a game and generalize from what they've learned on a past game, and often that's not true.

Speaker 2

这是所有智能体都会遇到的相同问题。

And that's the same problem that any agent is gonna run into.

Speaker 1

是啊。

Yeah.

Speaker 1

这几乎就像是强化学习,但针对的是...

It's almost like reinforcement learning, but for...

Speaker 1

这个智能体。

The agent.

Speaker 1

是的。

Yeah.

Speaker 2

没错。

Right.

Speaker 2

嗯,问题就在于。

Well, that's the problem.

Speaker 2

确实如此。实际上,我为这个智能体使用的TextWorld框架就是为强化学习设计的。

So it's true, and in fact TextWorld, this framework I use for this agent, is made for reinforcement learning.

Speaker 2

因为关键在于,如果你确切知道智能体会做什么以及它将在什么环境中运作,或者说,对于游戏玩家智能体而言,它要玩的特定游戏,那么你就可以生成交互数据。

Because the thing is, you know, if you know exactly what the agent's gonna do and the environment it's gonna work with, or in other words, in a game playing agent, the specific game it plays, then you can generate interaction data.

Speaker 2

你可以生成状态转换数据来进行强化学习。

You can generate the, you know, the state transition data to be able to do reinforcement learning.

Speaker 2

它会在那个游戏中表现得更好,但不一定适用于其他游戏。

And it will become better at that game, but not necessarily at other games.
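The "state transition data" mentioned here is the classic (state, action, reward, next state) tuples of reinforcement learning. A minimal sketch, with a made-up two-room environment standing in for a real text game (TextWorld itself exposes a richer Gym-style API; this only shows the shape of the data):

```python
# Collecting state-transition data from a toy text-game environment.
# The two-room world and reward scheme are invented for illustration.
import random

ROOMS = {"hall": {"go east": "kitchen"}, "kitchen": {"take key": "kitchen"}}

def step(state, action):
    next_state = ROOMS[state].get(action, state)  # unknown action: stay put
    reward = 1.0 if action == "take key" else 0.0
    return next_state, reward

def collect_transitions(episodes=10, horizon=5, seed=0):
    """Roll out random-action episodes and record (s, a, r, s') tuples."""
    rng = random.Random(seed)
    data = []
    for _ in range(episodes):
        state = "hall"
        for _ in range(horizon):
            action = rng.choice(["go east", "take key"])
            next_state, reward = step(state, action)
            data.append((state, action, reward, next_state))
            state = next_state
    return data

transitions = collect_transitions()
print(len(transitions))  # 50 transitions: 10 episodes x 5 steps
```

Data like this can train a policy for this one game, which is exactly the limitation being described: the learned behavior doesn't necessarily transfer to a different game.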

Speaker 2

这就是通用智能体需要世界模型的原因,因为每次遇到新问题就想着'我们该怎么办?'这种想法根本不现实。

And that's the whole "general agents need world models" point, because it's just not realistic to think that every time a general agent encounters a new problem, it's like, well, what are we going to do?

Speaker 2

难道我们要每次都停下一切,启动新的强化学习周期,专门教你做这一件事吗?

Are we going to just shut everything down and spin up a new reinforcement learning cycle and teach you how to do that one thing?

Speaker 2

因为情况总是在不断变化。

Because things change all the time.

Speaker 2

即便在同一个环境中,变化也非常频繁。

And even within one environment they change a lot.

Speaker 2

所以我认为这个问题我们仍需解决。

So I think we still have to figure that out.

Speaker 1

是的,而且我认为回到你之前提到的基础设施管理案例,除了必要的学习过程外,可能还存在许多其他内存方面的挑战,这将是一个长期持续的进程,可能需要长时间等待某些事件发生。

Yeah, and I think also, going back to your infrastructure-management example, besides the learning that would have to happen, there are probably a number of other memory challenges that come with that, where it's going to be this kind of long-running, continuous thing with long wait cycles for certain things to happen.

Speaker 1

从分布式系统的角度,以及在这些长期运行过程中如何管理智能体状态的问题,都有大量需要解决的难题。

You know, there's a lot both from a distributed system standpoint and also from how you're managing the state with the agent over these long running processes that you would have to figure out.

Speaker 1

要让这样的智能体真正达到你有信心在自己的基础设施中放手使用的程度,其复杂性是相当高的。

There's a lot of complexity in making that agent actually something that you feel confident unleashing within your infrastructure.

Speaker 2

是的。

Yeah.

Speaker 2

有人在讨论这个智能体,你知道,讨论这听起来是否像个好项目。

Somebody was talking about this agent, you know, whether this sounds like a good project.

Speaker 2

我当时就觉得,这听起来绝对是个会失败的项目。

And I was like, that sounds like a project that absolutely is gonna fail.

Speaker 2

但我完全支持,因为这正是令人兴奋的地方。我还没真正看到让它完美运作的路径。

And I am all in, because that's what's exciting. I don't really see the path yet to that working really well.

Speaker 2

所以我很期待能找到这个解决方案。

So I'm very excited to, like, find it.

Speaker 2

不过,是的,我同意你的观点。

But, yeah, I agree with you.

Speaker 1

嗯。

Yeah.

Speaker 1

好的,太棒了,安德鲁。

Well, awesome, Andrew.

Speaker 1

我们即将结束,你还有什么想分享的吗?

As we start to wrap up here, is there anything else you'd like to share?

Speaker 2

没有了,我想我们已经讨论了很多我关心的话题,特别是关于智能体和记忆的部分,这次对话非常愉快。

No, I think we covered a lot of things that are just on my mind, a lot about agents and memory, and we had a really good conversation.

Speaker 2

能和你深入探讨具体细节,谈论我的担忧和梦想,真是太棒了。

It was awesome to chat with you and kind of dig into the specifics and talk about my fears and my dreams.

Speaker 1

太棒了。

Fantastic.

Speaker 1

感谢你分享你的专业知识。

Well, thank you for sharing your expertise.

Speaker 1

我也非常享受这次对话。

I enjoyed the conversation as well.

Speaker 2

太棒了。

Excellent.

Speaker 2

再见。

See you.
