Google AI: Release Notes

科莱·卡武克乔格鲁:“我们就是这样打造通用人工智能的”

Koray Kavukcuoglu: “This Is How We Are Going to Build AGI”

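Episode summary

Join Logan Kilpatrick and Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect of Google, for a discussion of Gemini 3 and the state of AI. Their conversation covers the reception of Gemini 3, continued progress in AI research, and the role of benchmarks in pushing into new areas. They dig into the key areas Gemini is focused on, with an emphasis on instruction following, tool calling, and internationalization, and share how Google collaborates on AI development.

Watch on YouTube: https://www.youtube.com/watch?v=fXtna7UrL44

Chapters:
0:00 - Intro
2:00 - Gemini 3 launch reception
4:16 - Continued progress and innovation
6:47 - Key areas of improvement for Gemini
11:45 - Product scaffolding for model improvement
13:56 - The Chief AI Architect role
17:04 - Engineering mindset and collaboration
18:37 - Future growth areas for Gemini
20:33 - Shifting from a research to an engineering mindset
23:22 - The rise of generative media
27:22 - Nano Banana Pro capabilities
29:31 - Toward a unified model checkpoint
36:26 - Organizational structure for AI success
38:26 - Balancing exploration and scaling
41:40 - DeepMind's collaborative culture
45:21 - Innovation at Google
48:37 - Closing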

Bilingual subtitles

Speaker 0

Gemini 3,我们正坐在这里。

Gemini three, we're sitting here.

Speaker 0

反响看起来非常积极。

Reception seems super positive.

Speaker 0

这个模型的氛围很好。

The vibes of the model are good.

Speaker 1

我对进展感到非常兴奋。

I'm very excited about the progress.

Speaker 1

我对这项研究感到兴奋。

I'm excited about the research.

Speaker 0

我们实际上已经在多个维度上推进了前沿。

We had actually pushed the frontier on a bunch of dimensions.

Speaker 1

这就是我们构建AGI的方式。

This is how we are gonna build AGI.

Speaker 1

我们希望以正确的方式实现它,这正是我们投入所有智慧与创新的方向。

We wanna do it the right way, and that's where we are putting all our minds, our innovations.

Speaker 1

并不是

It's not

Speaker 0

像那种纯粹的研究工作,在某个实验室里孤立进行。

like it's this purely research effort that's off in a lab somewhere.

Speaker 0

而是与我们共同在现实世界中的协作努力。

Like, it's a joint effort with us in the world.

Speaker 1

这是一个崭新的世界,对吧?

This is a new world, right?

Speaker 1

有一项新技术正在定义用户的大部分期望。

There's a new technology that is defining a lot of what users expect.

Speaker 0

在某种意义上,我们正在与客户共同构建AGI。

We are, in some sense, like co building AGI with our customers.

Speaker 1

突然间,你让更多人成为了创造者。

So all of a sudden, you enable a lot more people to be builders.

Speaker 0

让一切构想变为现实。

Bring anything to life.

Speaker 1

让一切构想变为现实。

Bring anything to life.

Speaker 1

对吧?

Right?

Speaker 0

是啊。

Yeah.

Speaker 0

我觉得接下来的六个月会和过去六个月以及更早的六个月一样令人兴奋。

I feel like the next six months are gonna be probably just as exciting as the last six months and the previous six months before that.

Speaker 1

我们很幸运能生活在这个时代。

We are lucky to be living in this age.

Speaker 1

这一切正在当下发生。

It's happening right now.

Speaker 1

非常令人振奋。

It's very exciting.

Speaker 0

嘿,大家好。

Hey, everyone.

Speaker 0

欢迎回到Release Notes。

Welcome back to Release Notes.

Speaker 0

我是Logan Kilpatrick。

My name is Logan Kilpatrick.

Speaker 0

我来自DeepMind团队。

I'm on the DeepMind team.

Speaker 0

今天很荣幸邀请到DeepMind首席技术官、Google新任首席AI架构师Koray Kavukcuoglu加入我们。

Today, it's an honor to be joined by Koray Kavukcuoglu, who is the CTO of DeepMind and the new chief AI architect of Google.

Speaker 0

Koray,感谢你的到来。

Koray, thanks for being here.

Speaker 0

很高兴能进行这次对话。

I'm excited to chat.

Speaker 1

我也是。

Me too.

Speaker 1

是啊。

Yeah.

Speaker 1

非常兴奋。

Very excited.

Speaker 1

感谢邀请。

Thanks for inviting.

Speaker 0

当然。

Of course.

Speaker 0

Gemini 3。

Gemini three.

Speaker 0

我们正坐在这里。

We're sitting here.

Speaker 0

我们已经发布了这个模型。

We've launched the model.

Speaker 0

反响看起来非常积极。

Reception seems super positive.

Speaker 0

我想我们之前就有预感这个模型会有多出色。

Like, I think we went out and we obviously had a hunch about how good the model was going to be.

Speaker 0

排行榜看起来很棒,但我觉得真正让用户上手使用才是关键

Leaderboards looked awesome, but I think putting it in the hands of users and getting out is like a

Speaker 1

这才是真正的考验,对吧?

That's always the test, right?

Speaker 1

我是说,我们首先进行了基准测试,然后一直在做各种实验

I mean, we have been like benchmarking is the first step, and then we have been doing tests.

Speaker 1

我们与可信的测试者合作,进行了同行评审等各种验证

We have been like with trusted testers, with peer reviews and everything.

Speaker 1

所以你能感觉到,是的,这是个好模型,它确实很强大

So, you get a feeling that, yes, it's a good model, it's capable.

Speaker 1

它并不完美,对吧?

It's not perfect, right?

Speaker 1

但我对目前的用户反响确实相当满意

But like I think I'm quite pleased with the reception really.

Speaker 1

大家似乎都喜欢这个模型,而且我们觉得有趣的功能,用户们也觉得很吸引人

People seem to like the model and the kinds of things that I think we found interesting, they also found interesting.

Speaker 1

所以,到目前为止情况不错,这挺好的。

So, like that's good so far, like this is good.

Speaker 0

是啊。

Yeah.

Speaker 0

我们昨天还在讨论,话题的主线就是关于珍惜这个进步速度不减的时刻,我觉得这让我深有共鸣。

We were talking yesterday and the thread of the conversation was just around, like, appreciating this moment that the progress isn't slowing down, which I think resonates with me.

Speaker 0

当我回想起上次坐在你旁边时,我们正在I/O大会上发布2.5版本,听着Demis和Sergey谈论AI那些事。

As I was reflecting back to the last time I sat next to you, we were at I/O as we launched 2.5, and we were listening to Demis and Sergey talk about AI and all that stuff.

Speaker 0

我感觉进步速度完全没有放缓,这真的很有意思。

I feel like the progress has not slowed down, which is really interesting.

Speaker 0

当我们发布2.5时,感觉那是个顶尖模型,感觉我们在多个维度都推进了前沿,而3.0版本再次实现了这种突破。

Like, when we launched 2.5, it felt like a state of the art model, and it felt like we had actually pushed the frontier on a bunch of dimensions, and I feel like 3.0 delivers that again.

Speaker 0

没错。

Yeah.

Speaker 0

关于规模扩展能否持续的讨论还在继续,我很好奇你目前的直觉是什么?

And I'm curious, as the scaling conversation of "can it continue" continues to go, what's your sense right now?

Speaker 1

是的,我的意思是,我对这些进展感到非常兴奋。

Yeah, I mean, I'm very excited about the progress.

Speaker 1

我对这项研究感到兴奋。

I'm excited about the research.

Speaker 1

当你真正置身于研究中时,在各个领域都能感受到许多令人振奋的东西,对吧?

Like when you are actually there in the research, there are a lot of excitement in terms of like in all areas of this, right?

Speaker 1

无论是数据、预训练还是后训练阶段,处处皆是。

Like coming from data, pre training, post training, everywhere.

Speaker 1

我们看到许多令人振奋的成果,大量进展和新想法不断涌现。

We see a lot of excitement, see a lot of progress, a lot of new ideas.

Speaker 1

归根结底,这一切都建立在创新和想法的基础上,不是吗?

At the end of the day, this whole thing is really running on innovation, running on ideas, right?

Speaker 1

我们做的事情影响力越大,越贴近现实世界,越被人们使用,你实际上会获得更多灵感——因为你的接触面扩大了,接收到的信号类型也增多了。我认为问题会变得更难,也会更加多样化。

The more we do something that is impactful, that is in real world, that people use, you actually get more ideas because your surface area increases, the kinds of signals that you get increases, and I think the problems will get harder, the problems will get more varied, right?

Speaker 1

正因如此,我认为我们将面临挑战,而这类挑战是良性的。

And with that, I think we will be challenged and these kinds of challenges are good.

Speaker 1

我认为这也是推动我们构建智能的动力所在。

And I think that is the driver for going towards building intelligence as well.

Speaker 1

没错,事情就会这样发展。

Right, that's how it's going to happen.

Speaker 1

我觉得有时候如果只看一两个基准测试,可能会觉得进展受限,但这很正常,因为基准测试是在某项技术尚具挑战性时定义的,随着技术进步,这些基准自然就不再代表前沿水平。

I feel like sometimes if you look at one or two benchmarks you can see squeeze but I think that's normal because benchmarks are defined at a time when something was a challenge, you define that benchmark and then of course as the technology progresses that benchmark becomes not the frontier.

Speaker 1

它已无法界定技术前沿。

It doesn't define the frontier.

Speaker 1

于是你会定义新的基准测试。

And then what happens is you define a new benchmark.

Speaker 1

这在机器学习领域非常常见,对吧?

It's very normal in machine learning, right?

Speaker 1

基准测试和模型开发总是相辅相成的。

Like benchmarks and model development is always hand in hand.

Speaker 1

你需要基准测试来指导模型开发,但只有当你接近前沿时,才能知道下一个突破点在哪里,从而定义新的基准。

Like, you need the benchmarks to guide the model development, but you only know what the next frontier is when you get close to it so that you can define with the new benchmark.

Speaker 0

是的,我也有同感。比如HLE这个基准测试,最初所有模型的表现都很糟糕,只有1%或2%的水平,而现在最新的Deep Think模型能达到40%多,这简直不可思议。

Yeah, I feel this way. There were a couple of benchmarks like HLE that, originally, all the models were horrible on, doing like 1 or 2%, and I think now the newest with Deep Think is like 40-something percent, which is crazy.

Speaker 0

ARC-AGI-2最初所有模型几乎都无法完成任何部分。

ARC-AGI-2, originally, all the models could barely do any of that.

Speaker 0

现在却能达到40%以上。

It's now, like, 40 plus.

Speaker 0

这确实很有趣,而且观察那些经得起时间考验的静态基准测试也很有意思,虽然我不清楚具体原因。

So it is interesting, and then it's also interesting to see, and I don't have the context on why, the benchmarks that are static that do have a little bit of, like, the test of time, if you will.

Speaker 0

我认为这些测试可能已经接近饱和状态了。

And, like, I think they are probably close to saturated.

Speaker 0

以GPQA Diamond为例,尽管我们只能勉强提升1%的性能,但它仍然保持着参考价值。

But, like, GPQA Diamond, as an example, like, continues to stick around even though we're eking out 1%

Speaker 1

或者类似的情况。

or whatever.

Speaker 1

这说明其中确实存在一些非常困难的问题。

It's like there are really hard questions there.

Speaker 0

是啊,是啊。

Yeah, yeah.

Speaker 1

我是说,那些难题我们至今仍无法解决。

Like, I mean, and like those hard things we are still not able to do.

Speaker 1

对。

Yeah.

Speaker 1

对吧?

Right?

Speaker 1

而且它们仍在测试某些东西。

And they still test something.

Speaker 1

但想想我们在GPQA上的进展,并不是说从20分要提升到90分,对吧?

But if you think about, like, where we are with GPQA, it's not like, oh, you're at twenties and you need to go to nineties, right?

Speaker 1

你们已经接近目标了,所以被定义为未解决的课题数量自然在减少。

So you're getting close there, so the number of things that it defines as unsolved is, of course, decreasing.

Speaker 1

所以在某个阶段,寻找新的前沿、新的基准是很有必要的。

So at some point it's good to find new frontiers, new benchmarks.

Speaker 1

制定基准确实非常重要,对吧?

And defining benchmarks is really, really important, right?

Speaker 1

因为如果你要考虑——如果我们把基准视为进步的定义,但这并不总是完全一致。

Because if you're going to think about if we think about benchmarks as the definition of progress, which does not necessarily always align.

Speaker 1

对吧?

Right?

Speaker 1

就像在进步和基准之间存在某种关系。

Like there's this thing between like there's progress and then there's the benchmarks.

Speaker 1

在理想情况下应该是完全一致的,但现实中从来不会完全一致。

In an ideal case it's 100% aligned, but it's never 100% aligned.

Speaker 1

对我来说最重要的进步衡量标准是:我们的模型在现实世界中被科学家、学生、律师、工程师使用,人们用它们来做各种事情——写作、创意写作、邮件处理,无论是简单还是困难的任务,对吧?

Like to me the most important measure of progress is like we have our models in real world and scientists use them, students use them, lawyers use them, engineers use them, and then like people use them to do like all sorts of things writing, creative writing, emails, easy or hard, right?

Speaker 1

这个应用范围很重要,涵盖不同主题和领域。

Like, that spectrum is important, and different topics, different domains.

Speaker 1

如果你能持续在那里提供更大的价值,我认为这就是进步。

If you can actually continue delivering larger value there, I think that's progress.

Speaker 1

而这些基准能帮助你量化这一点。

And these benchmarks help you quantify that.

Speaker 0

是的。

Yeah.

Speaker 0

你是怎么考虑的,或许可以举个具体例子,比如从2.5到3.0版本,或者其他任何版本。

How do you think about and, like and maybe even there's a particular example from, like, 2.5 to three or, like, whatever.

Speaker 0

我们可以选择你想要的任何模型版本变更。

We could choose whichever model version change you want.

Speaker 0

我们在哪些方面进行优化提升?

Where are we hill climbing?

Speaker 0

实际上,在如今有无数基准测试的世界里,你可以选择优化方向,你是如何为Gemini整体,特别是Pro模型考虑这个问题的?

And, like, like, how actually, how in a world where there's, like, a zillion benchmarks now actually of, you know, you could choose where you wanna hill climb, how are you thinking about for just, like, broadly Gemini, but also maybe the pro model specifically?

Speaker 0

我们该在哪些方面重点优化它呢?

Like, where do we try to hill climb for that?

Speaker 1

我认为有几个重要领域,对吧?

I think there are several important areas, right?

Speaker 1

比如其中很重要的一点是指令遵循。

Like one of them is instruction following is important.

Speaker 1

指令遵循是指模型需要理解用户的请求并执行,你不希望模型随意回答它认为应该回答的内容,所以这种指令遵循能力很重要,这也是我们一直在做的。

Instruction following is where the model needs to be able to understand the request of the user and to be able to follow that. You don't want the model just answering whatever it thinks it should answer, so that instruction following capability is important and that's what we always do.

Speaker 1

其次对我们来说国际化很重要,谷歌是非常国际化的公司,我们希望触达全球每一个用户,这部分很重要。

And then for us internationalisation is important. Google is very international and we want to reach everyone in the world, so that part is important.

Speaker 0

我觉得3.0 Pro版本至少在这方面做得很好,今早我和Tulsi交流时,她还惊叹于模型在那些我们历来不太擅长的语言上的惊人表现,这真的很棒。

And I feel like 3.0 Pro, at least, I was talking to Tulsi this morning and she was remarking about how incredible the model is for languages that historically we haven't been really good at, which is awesome to see.

Speaker 1

所以需要持续聚焦在这些关键领域,对吧?

So, like, continuously, you have to put the focus on some of these areas, right?

Speaker 1

这些领域可能看起来不像知识前沿那么耀眼,但它们确实至关重要,因为正如我所说,关键是要从用户那里获取反馈信号。

They might not look like, okay, it's the frontier of knowledge, but they're really, really important because you want to be able to interact with the users there because, as I said, it's all about getting that signal from the users.

Speaker 1

如果再深入一些技术领域,函数调用、工具调用、智能体行为和代码这些都非常重要。

And then if you come to actually a little bit more technical domains, function calls, tool calls, agentic actions, and code, these are really important.

Speaker 1

函数调用和工具调用之所以重要,我认为它能带来智能水平的指数级提升,既能让模型自然使用我们创建的所有工具和函数,又能将其融入自身的推理过程。

Function calls and tool calls are important because I think it's a whole different multiplier of intelligence that comes from there, both from the point of view of the models being able to just naturally use all the tools and functions that we have created ourselves and then use it in its own reasoning.

Speaker 1

但模型也能自己编写工具,对吧?

But also the model writing its own tools, right?

Speaker 1

某种程度上你可以认为模型本身也是工具。

Like you can think that the models are in a way the models are tools in themselves as well.

Speaker 1

所以这一点很重要。

So that one is a big thing.

Speaker 1

显然代码也很重要,不仅仅因为我们都从事软件工程,更因为我们知道通过代码能在你的笔记本电脑上构建任何东西。

Like obviously like code is because, like not just because we are all software engineers, but also because like we know that with that you can actually build anything that is happening on your laptop.

Speaker 1

而在你的笔记本电脑上发生的远不止软件工程。

And on your laptop it's not just software engineering that happens.

Speaker 0

让一切变为现实。

Bring anything to life.

Speaker 1

让一切变为现实,对吧?

Bring anything to life, right?

Speaker 1

是的。

Yeah.

Speaker 1

所以我们当下做的很多事情都发生在数字世界,而代码就是这一切的基础,它能与生活中的几乎所有事物实现整合。

So a lot of the things that we do right now happens in the digital world and code is the basis for that to be able to integrate with anything that happens, pretty much anything that happens in your life.

Speaker 1

虽然不是全部,但确实涵盖了很多方面。

Not everything, but like a lot of things.

Speaker 1

因此我认为这两者结合能为用户带来更广泛的触达。

That's why these two things together I think makes up for a lot of reach for users as well.

Speaker 1

我举这个氛围编程(vibe coding)的例子,对吧?

I give this example of vibe coding, right?

Speaker 1

我喜欢它,为什么呢?

I like it, why?

Speaker 1

因为很多人富有创造力,他们灵光一现时你就能让他们高效产出——这种从创意到产出的转化方式,你只需写下来就能看到应用程序呈现在眼前,大多数时候它都能顺利运行,效果很棒,不是吗?

Because a lot of people are creative, they have ideas and all of a sudden you make them productive. Like going from creative to productive in a way that you can just write it down and then like you see the application in front of you and like it is like, I mean most of the time it works and when it works it's great, right?

Speaker 1

我觉得这个循环非常出色。

I mean, that loop I think is great.

Speaker 1

于是突然间你就能让更多人成为创造者

So all of a sudden you enable a lot more people to be builders,

Speaker 0

比如

like

Speaker 1

构建一些东西,我是说,这真的很棒。

building something like, I mean, like, it's great.

Speaker 0

我很喜欢。

I love it.

Speaker 0

是的。

Yeah.

Speaker 0

没错。

Yeah.

Speaker 0

感谢你……这简直是AI Studio的推介词。

Thank you for... this is the AI Studio pitch.

Speaker 0

感谢,我们会把这段剪成片段。

Appreciate the... We'll clip this part out.

Speaker 0

我们会把它发布到网上。

We'll put it out online.

Speaker 0

你提到的一个有趣话题是,作为Gemini 3发布的一部分,我们推出了Google Antigravity,一个全新的智能体编程平台。

One of the interesting threads that you mentioned is, like, how... and actually, as part of this Gemini three moment, we launched Google Antigravity, a new agentic coding platform.

Speaker 0

从模型角度来看,你认为这种产品框架对质量提升的重要性有多大?

How much do you think about, like, the importance of having this product scaffolding to hill climb on quality from a model perspective?

Speaker 0

嗯。

Yeah.

Speaker 0

工具调用与编码。

Tool calling and coding.

Speaker 0

对。

Yeah.

Speaker 1

是的。

Yeah.

Speaker 1

对我来说,这确实非常非常重要。

It's like, to me, it's very, very important.

Speaker 1

我认为Antigravity作为产品本身固然令人兴奋,但从模型角度思考,它是双向的。我们先从模型层面来讨论。

And I think, like, Antigravity as a product itself, yes, it's exciting, but, like, from a model perspective, if you think about it, so it's double sided. Let's talk about first the model perspective.

Speaker 1

从模型角度来看,能够与终端用户(在这里是软件工程师)进行这种整合,并直接从他们那里学习以理解模型需要改进的地方,这对我们来说非常关键。

From the model perspective, being able to have this integration with the end users, in this case software engineers, and learning from them directly to understand where the model needs to improve is really critical for us.

Speaker 1

我的意思是,在Gemini应用等领域,出于同样的原因,这一点也很重要。

I mean it is important in areas like Gemini app is important for the same reason.

Speaker 1

对吧?

Right?

Speaker 1

我是说,直接了解用户的需求非常非常重要。

Like I mean understanding users directly is very very important.

Speaker 1

Antigravity也是如此,AI Studio也是如此。

Antigravity is the same way, AI Studio is the same way.

Speaker 1

因此,拥有这些我们紧密合作的产品,并通过理解和学习获取这些用户信号,我认为是非常重要的。

So having these products that we work really closely with and then understanding and learning that getting those user signals, I think is really massive.

Speaker 1

而Antigravity一直是一个非常关键的发布合作伙伴。

And Antigravity has been a very critical launch partner.

Speaker 1

他们加入的时间并不长,对吧,但在我们发布流程的最后两三周里,他们的反馈确实起到了非常关键的作用。

It hasn't been long that they have joined, right, but in the last two, three weeks of our launch process, their feedback has been really, really instrumental.

Speaker 1

搜索AI模式也是同样的道理,对吧?

The same thing with Search AI mode, right?

Speaker 1

我的意思是,就连Overviews,我们也从那里获得了大量反馈。

Like, I mean, Overviews even, we get a lot of feedback from there.

Speaker 1

所以对我来说,与产品的这种整合以及获取这种信号是我们理解的主要驱动力。

So like to me this integration with the products and getting that signal is the main driver that we understand.

Speaker 1

当然我们也有基准测试,所以我们知道如何推动STEM、科学、数学这类智力。

Like of course we have the benchmarks, so like we know how to push STEM, the sciences, the math, that kind of intelligence.

Speaker 1

但真正重要的是我们理解现实世界的用例,因为这必须在现实中有用。

But it's really important that actually we understand the real world use cases, because this has to be useful in the real world.

Speaker 0

是的。

Yeah.

Speaker 0

在你的新职位上作为首席AI架构师,你现在负责确保我们不仅有好的模型,还要确保产品实际采用这些模型并在谷歌内部构建出色的产品体验。

In your new Chief AI Architect role, you're now responsible for also making sure that we don't just have good models, but the products actually take the models and implement them and sort of build great product experiences across Google.

Speaker 0

显然,我认为这对用户来说是正确的方向。

Obviously, I think this is the right thing for users.

Speaker 0

比如,让Gemini 3和所有产品服务在首日上线,对谷歌来说是一项了不起的成就。

Like, getting Gemini three and all the product services on day one is, like, an awesome accomplishment for Google.

Speaker 0

而且我认为未来还会有更多产品服务加入,希望如此。

And I think it'll even more so, hopefully, more product services in the future.

Speaker 0

从DeepMind的角度来看,你认为尝试这样做会增加多少额外的复杂性?

How much additional complexity from the DeepMind perspective do you think it adds to try to do it?

Speaker 0

在某种程度上,一年半前的生活要简单得多。

In some sense, life was simpler a year and a half ago.

Speaker 0

确实。

Sure.

Speaker 1

但我们正在构建的是智能,对吧?

But like, we are building intelligence, right?

Speaker 1

很多人问我,你同时担任这两个角色。

A lot of people ask me, you have these two roles.

Speaker 1

我的意思是,我虽然有两个头衔,但它们本质上是一回事。

Like, I mean, I have these two titles in a way, but they are very much the same thing.

Speaker 1

如果我们要构建智能,就必须通过产品来实现,与用户建立连接。

If we are going to build intelligence, we have to do it with the products, through the products, connecting with the users.

Speaker 1

在首席AI架构师这个角色上,我的目标是确保谷歌产品能获得最先进的技术支持。

With the Chief AI Architect role, what I'm trying to do is make sure that the products in Google have the best technology that is available to them.

Speaker 1

我们并非在做产品开发——我们不是产品经理,而是技术开发者,对吧?

We are not trying to do the products, like we are not product people, we are technology developers, right?

Speaker 1

我们专注于技术研发,构建模型。

Like we develop the technology, we do the models.

Speaker 1

当然,就像所有人对任何事都有见解一样,对我来说最重要的是打造模型,以最佳方式让技术可用,然后与产品团队协作,在这个AI时代构建最优秀的产品。

And of course, I mean, just like everyone is opinionated on anything, like I mean people are opinionated but like the most important thing for me is making the models, making the technology available in the best way that is possible and then work with the product teams to enable them to build the best product in this AI world.

Speaker 1

因为这是一个新世界,新技术正在定义用户预期、产品行为模式以及应传递的信息。

Because this is a new world. There's a new technology that is defining a lot of what users expect and how the products should behave, what information that they should carry over.

Speaker 1

还有这项新技术能实现的所有新功能。

And all the new things that you can do with this new technology.

Speaker 1

所以对我来说,关键在于推动这项技术在整个谷歌的应用,与所有产品线协同工作。

So to me it's about enabling that across Google, working with all the products.

Speaker 1

我认为这从产品角度、用户获取角度都令人兴奋,正如我所说,这也是我们的主要驱动力。

I think that's exciting both from the product perspective, from what users getting perspective, but also from the point of view of like as I said, that's our main driver.

Speaker 1

对我们而言,能够感知用户需求、获取用户信号至关重要。

It's really important for us to be able to feel that user need, to be able to get that user signal.

Speaker 1

这对我们非常关键。

That's critical for us.

Speaker 1

正因如此我才想这么做,这就是我们构建通用人工智能(AGI)的路径。

So that's why I wanted to do it, this is how we are going to build AGI.

Speaker 1

这就是我们通过产品来构建智能的方式。

This is how we are going to build intelligence like, with the products.

Speaker 1

我认为事情就会这样发展。

That's how I think it's going to happen.

Speaker 0

这真是你将来可以发的一条绝佳推文,因为我觉得这个观点很有意思。

This is a great tweet for you to put out at some point because I do think it's interesting.

Speaker 0

分享这个视角:在某种意义上,我们正在与客户、与其他产品团队共同构建通用人工智能。

Share this perspective that we are, in some sense, co building AGI with our customers, with the other PAs.

Speaker 0

这不像那种在某个实验室里孤立进行的纯粹研究项目。

It's not like it's this purely research effort that's off in a lab somewhere.

Speaker 0

这是与我们共同在现实世界中的协作努力。

It's a joint effort with us in the world.

Speaker 1

而且我认为这实际上是一个非常值得信赖、经过验证的系统。

And I think it is actually a very trusted, tested system as well.

Speaker 1

这是一种我们越来越适应的工程思维方式。

It's a very engineering mindset that I think we are adapting more and more.

Speaker 1

我认为在这种事情上保持工程思维很重要,因为当某样东西被精心设计时,你就知道它是稳健且使用安全的。

And I think it's important to have an engineering mindset in this one because when something is nicely engineered you know that it is robust, it is safe to use.

Speaker 1

所以我们正在现实世界中实践,并以某种方式整合所有经过验证的建设理念。

So we are doing something in the real world and we are adapting all the trusted tested in a way ideas of how to build things.

Speaker 1

我认为这反映在我们对安全性、对保障性的思考方式上。

And I think that's reflected in how we think about safety, how we think about security.

Speaker 1

我们试图再次从工程思维出发,从一开始就考虑这些问题,而不是事后补救。

We try to think about it again from that engineering mindset of think about it from the ground up, from the beginning, not something that comes at the end.

Speaker 1

因此,当我们进行模型后训练、预训练或审视数据时,始终秉持这一理念。

So when we are doing post training models, when we are doing pre training, when we are looking at our data, we always have this.

Speaker 1

每个人都需要思考这一点。我们有安全(safety)团队吗?

Everyone needs to think about this. Do we have a safety team?

Speaker 1

显然我们是有安全团队的。

Obviously we have a safety team.

Speaker 1

他们正在整合所有相关技术资源。

And they are bringing in all the technology that is related to them.

Speaker 1

我们还有安防(security)团队,他们不仅引入技术,更让Gemini的每位成员都能深度参与这个以此为第一原则的开发流程。

We have a security team, they're bringing in all the technology but enabling everyone in Gemini to actually also heavily be part of that development process that is taking this as a first principle.

Speaker 1

而这些团队本身也是我们后训练体系的重要组成部分,对吧?

And those teams are themselves part of our post training teams, right?

Speaker 1

所以当我们开发这些迭代版本和候选发布时。

So when we are developing these, when we are doing these iterations, release candidates.

Speaker 1

就像我们关注GPQA、HLE这类基准测试一样。

Just like we look at GPQA, HLE, those kinds of benchmarks.

Speaker 1

我们同样会审视其安全防护措施。

We look at its safety security measures as well.

Speaker 1

我认为这种工程思维模式非常重要。

I think that is a very, like that engineering mindset is important.

Speaker 0

是的,我完全同意你的观点。

Yeah, I completely agree with you.

Speaker 0

我觉得这对谷歌来说也很自然,因为协作性强且投入巨大,这很有帮助

I think it also feels natural to Google, which is also helpful because of how collaborative and how big effort is

Speaker 1

现在要

now to

Speaker 0

将Gemini模型推向市场。

ship Gemini models out the door.

Speaker 1

我是说,对于Gemini 3,我们刚刚还在反思这一点。

I mean, with Gemini three, I think we were just reflecting on this.

Speaker 1

在我看来,关键之一是这款模型真正体现了谷歌团队的协作精神。

To me, one of the important things is this model has been a very Team Google model.

Speaker 0

我们应该查看一下数据。

We should look into the data.

Speaker 0

这可能就像某些项目,比如NASA的阿波罗计划动员了大量人员,但这次谷歌全球范围内所有团队的共同努力规模之大,简直令人难以置信。

It might be like one of the, I mean, some of maybe the Apollo NASA programs had a lot of people, but like it is, I think this massive Google global, also global effort across all of our teams to make it happen, is crazy.

Speaker 1

每个Gemini版本的发布都需要来自欧洲、亚洲乃至全球各地的人员参与。

Every Gemini release takes people from this continent, Europe, Asia, all around the world.

Speaker 1

我们在全球各地都有团队,他们都在贡献力量,不仅仅是GDM团队,对吧?

We have teams all around the world and they contribute and not just GDM teams, right?

Speaker 1

谷歌的所有团队。

All teams across Google.

Speaker 1

这是一项巨大的协作努力,我们与AI模式同步发布,与Gemini应用同步发布,对吧?

It's a huge cooperative effort and we sim-shipped with AI Mode, sim-shipped with Gemini app, right?

Speaker 1

这些都不容易实现,因为它们在我们开发过程中一直与我们同在。

These are not easy to do because they were together with us during our development.

Speaker 1

只有这样,我们才能在模型准备就绪的第一天同时全面推出,并且我们一直在这样做。

That's the only way that on day one we can actually go all together out at the same time the model is ready and we have been doing that.

Speaker 1

当我们说'整个谷歌'时,不仅仅是指那些直接参与模型构建的人。

When we say across Google, it's not just people actively trying to build the model.

Speaker 1

所有产品团队也在各司其职贡献力量。

All the product teams doing their parts as well.

Speaker 0

是啊。

Yeah.

Speaker 0

我有个可能不算争议的问题,Gemini 3在很多基准测试中达到了SOTA(最先进水平),大量基准测试。

I have a... maybe this isn't a controversial question, but, you know, Gemini three, we're sort of SOTA on many benchmarks, a lot of benchmarks.

Speaker 0

我们正同步在谷歌产品服务和合作伙伴生态系统中进行部署。

We're sort of sim-shipping across the Google product services, our sort of partner ecosystem services.

Speaker 0

市场反响非常积极。

The reception is very positive.

Speaker 0

这个模型给人的整体感觉很好。

Sort of the vibes of the model are good.

Speaker 0

你们动作相当迅速。

You sort of fast Gunches.

Speaker 0

敲敲木头。

Knock on wood.

Speaker 0

如果我们快进到谷歌下一个主要模型发布时,你还有什么希望我们做到X、Y、Z的事情仍在你的清单上吗?

If we sort of fast forward to the next major Google model launch, are there things that you are still on your list of you wish we were doing X, Y, and Z?

Speaker 0

或者说它怎样才能比Gemini更好?还是说我们应该先享受Gemini 3的当下?

Or how does it get better than the Gemini or should we just enjoy the moment of Gemini three?

Speaker 0

我认为我们应该两者兼顾。

I think we should do both.

Speaker 1

我们应该享受当下,因为享受当下的一天是件好事。

We should enjoy the moment, because one day of enjoying the moment is a good thing.

Speaker 1

今天是发布日,我认为人们都很欣赏这个模型。

This is the launch day, and I think people are appreciating the model.

Speaker 1

我希望团队能享受这一刻,但同时,我们审视每个领域时也都能看到不足。

I'd like the team to enjoy this moment as well. But at the same time, every area we look at, we also see gaps.

Speaker 1

对吧?

Right?

Speaker 1

比如,它在写作方面完美吗?

Like, is it perfect in writing?

Speaker 1

不,它在写作方面并不完美。

No, it's not perfect in writing.

Speaker 1

它在编程方面完美吗?

Is it perfect in coding?

Speaker 1

它在编程方面也不完美。

It's not perfect in coding.

Speaker 1

特别是,我认为在代理行为和编程领域,还有很大的提升空间。

I mean, especially, I think, on the area of agentic actions and coding, I think that there's a lot more room there.

Speaker 1

这是最令人兴奋的增长领域之一,我们需要明确可以在哪些方面做得更多,然后就会做得更多,对吧?

That's one of the most exciting growth areas and like we need to identify where we can do more and we'll do more, right?

Speaker 1

我认为我们已经取得了很大进展。

Like I think we have come a long way.

Speaker 1

对于这个模型,我想说大概90%到95%会以某种方式接触编程的人,不管是软件工程师还是想创造东西的创意人士,

The model is, I would say, for pretty much like maybe 90, 95% of the people who will engage with coding in some way, whether they are software engineers or these creative people who want to build something...

Speaker 1

是的。

Yeah.

Speaker 1

我认为这个模型是他们能使用的最佳工具。

I'd like to think that this model is the best thing that they can use.

Speaker 1

对吧?

Right?

Speaker 1

但可能在某些情况下,我们仍需做得更好。

But there are some cases probably that is we still need to do better.

Speaker 0

嗯。

Yeah.

Speaker 0

关于编程和工具使用,我还有个尖锐的问题。

I have another sort of pointed question for coding and tool use.

Speaker 0

回顾Gemini的发展历程,1.0版本我们非常注重多模态,而2.0版本我们开始构建类似代理的基础设施。你认为...

What do you think has it just been, if you sort of look at the history for Gemini, and obviously we had like a very multimodal focus for 1.0, and I think for 2.0 we started to make some of the like agentic infrastructure work.

Speaker 0

你是否能理解——虽然进展速度看起来很强劲——但为什么...

Like, do you have a sense of, like, why we and I'll make the caveat that, like, I think the rate of progress looks really strong, but, like, why?

Speaker 0

是否仅仅因为关注点不同,导致我们一开始就没有达到最先进的工具使用水平?

Has it just been, like, a focus thing why we haven't been, like, state of the art agentic tool use from the get go?

Speaker 0

但以多模态为例,Gemini 1在多模态方面确实是最先进的,我们在这方面保持了很长时间的优势。

But for example, multimodal, we have been... literally, Gemini 1 was state of the art in multimodal, and we've sort of held that for a long time.

Speaker 1

你看。

Look.

Speaker 1

我不认为这是有意为之。

I don't think it was a deliberate thing.

Speaker 1

说实话,我认为回顾起来,这与我们使用模型的方式有关——开发环境与现实世界紧密相连。

I think it was like I mean, honestly, I think, like, if anything, when I reflect back, I tie it to using the models, development environment being closely tied to real world.

Speaker 1

我们联系得越紧密,就越能更好地理解实际发生的需求。

The more we are tied, then we are more better understanding these real requirements that is happening.

Speaker 1

我认为在Gemini的发展历程中,我们起点很高——毕竟谷歌在AI研究方面有着深厚历史,对吧?

And I think in our journey in Gemini we started from a point where, of course, the AI research in Google has a huge history, right?

Speaker 1

我们拥有众多杰出的研究人员和辉煌的AI研究历史,这很棒。但Gemini也是一个从研究环境转型的过程,正如我们所说,转向工程思维,进入真正与产品相连的领域。

The amount of amazing researchers that we have and the amazing history of AI research that has been done in Google, I think it's great but Gemini is also a journey of moving from that research environment into this, like as we talk, this engineering mindset and getting into a space where we are really connected with the products.

Speaker 1

当我看着这个团队时,我必须说我感到非常自豪,因为这个团队的主要成员包括我在内,四五年前我们还在写论文、研究人工智能。而现在,我们实际上已经站在这项技术的前沿,并且通过产品与用户共同开发这项技术。

When I look at the team, I have to say I feel really proud because this team is still majority formed by people, including me, who, like, four or five years ago, we were writing papers, we were researching AI. And here we are actually, we are at that frontier of that technology and that technology you are developing it via products with the users.

Speaker 1

这完全是另一种思维方式——我们每六个月构建模型,然后每个月或一个半月进行更新。

It's a completely different mindset that we are building models every six months and then we are doing updates every month, month and a half.

Speaker 1

这是一个惊人的转变。

It's an amazing shift.

Speaker 1

我想我们已经完成了这个转变过程。

I think we walked through that shift.

Speaker 1

是的。

Yeah.

Speaker 0

我很喜欢这样。

I love that.

Speaker 0

Gemini 3的进展非常出色。

Gemini three progress has been awesome.

Speaker 0

另一个我一直在思考的主线是,我们如何看待生成式媒体模型的定位——从历史上看,它们并不算是重点领域,我的意思是并非完全没有被关注过。

Another thread that was top of mind is just generally sort of how we're thinking about where the gen media models, which I think historically have, like, not been a huge I mean, not that they haven't been a focus.

Speaker 0

它们一直很有趣,但我觉得我们通过Veo 3、Veo 3.1和Nano Banana模型取得了不少成果。

They were they've always been interesting, but I feel like we've had with Veo 3, Veo 3.1, with the Nano Banana model.

Speaker 0

从产品外化的角度来看,我们取得了巨大的成功。

We've had, like, so much success from, like, a product externalization standpoint.

Speaker 0

我很好奇,在你看来,在我们追求构建AGI的过程中,

And I'm curious how you think about in this, like, pursuit of we wanna build AGI.

Speaker 0

有时我会说服自己,视频模型似乎与这个目标无关。

Sometimes I think sometimes I can convince myself that, like, a video model is, like, not part of that story.

Speaker 0

但我不认为这是事实。

I don't think that's true.

Speaker 0

我认为总体上,模型应该理解世界、物理规律等等。

I think in general, can sort of, you should understand the world and physics and all this other stuff.

Speaker 0

所以我很好奇你如何看待所有这些因素如何交织在一起。

So I'm curious how you see all these things intertwining together.

Speaker 1

如果你回溯十到十五年前,生成模型主要还集中在图像领域。

If you actually go back like ten, fifteen years ago, generative models were mostly on images.

Speaker 1

对吧?

Right?

Speaker 1

因为我们可以更好地观察发生了什么,而且理解世界、理解物理现象这一理念,正是推动图像等生成模型发展的主要动力。

Because we could much better inspect what is going on in terms of and also this idea of understanding the world, understanding the physics was the main driver of doing generative models with images and so on.

Speaker 1

就像我们在生成模型领域做过的一些激动人心的工作,可以追溯到十年前,感觉像是二十年前的事了。

Like some of the exciting things that we have done with generative models date back to ten years ago, like maybe ten years. Feels like twenty.

Speaker 1

对吧?

Right?

Speaker 1

二十年前我们还在做图像模型,对吧?

Twenty years ago we were still doing image models, right?

Speaker 1

所以我刚才有点犹豫,但在我读博期间我们就在做生成式图像模型了,对吧?

That's why I was hesitating a little bit, but during my PhD we were doing generative image models, right?

Speaker 1

那时候所有人都在做这些。

Like everyone was doing those at that time.

Speaker 1

我们经历过那个阶段,有过像PixelCNN这样的模型,对吧?

We walked through that, we had things called PixelCNNs, right?

Speaker 1

它们属于图像生成模型。

They were image generative models.

Speaker 1

某种程度上,我认为一个重大发现是文本领域实际上更易于实现快速进展。

In a way what happened was I think it was also a big realization that text actually was the better domain to have very fast progress.

Speaker 1

但我认为图像模型的回归很自然,比如在GDM,我们长期拥有强大的图像视频音频模型。

But I think it is very natural that the image models are coming back and like at GDM we have had really strong image video audio models for a long time.

Speaker 1

我想这大概就是我要解释的观点。

I think that's what I'm trying to explain maybe.

Speaker 1

将两者结合是很自然的发展。

Bringing those together I think is natural.

Speaker 1

所以我们当前的方向是始终讨论的多模态性,对吧?

So where we are going right now is we have always talked about this multimodality, right?

Speaker 1

当然,我们一直在探讨输入输出的多模态性,这正是我们的发展方向,对吧?

And of course, naturally, like we have always talked about input-output multimodality and that's where we are going, right?

Speaker 1

随着技术进步,你会发现这两个不同领域之间的架构和理念正在相互融合。

And when you look at it, as the technology progresses, the architectures, the ideas in between those two different domains have been merging with each other.

Speaker 1

过去这些架构差异很大,对吧?

It used to be that these architectures were very different, right?

Speaker 1

但它们现在正高度融合。

But they are getting together quite a lot.

Speaker 1

这并非我们强行拼凑,而是技术自然趋同的结果。

So like it's not like we are forcing something in, what is happening is naturally the technology is converging.

Speaker 1

技术之所以趋同,是因为人们都清楚效率提升的方向,理念正在进化,我们看到了一条共同的道路,而这条共同道路正很好地融合成型。

As the technology is converging, it is converging because everyone understands where to get more efficiency from, where the ideas are evolving, and we see a common path and that common path I think is getting together well.

Speaker 1

所以Nano Banana是那些最初的时刻之一,对吧,你可以迭代图像,可以和模型对话。

So Nano Banana is one of those first moments, right, where you can iterate over images, you can talk to the model.

Speaker 1

因为文本模型拥有大量对世界的理解,对吧,这些理解来自文本。

Because what happens is that text models have a lot of world understanding, right, like from the text.

Speaker 1

它们拥有大量对世界的理解。

They have a lot of world understanding.

Speaker 1

而图像模型则从不同视角获得对世界的理解。

And then the image model has the world understanding from a different perspective.

Speaker 1

所以当你把这两者结合起来时,我认为会产生令人兴奋的结果,因为人们会觉得这个模型理解了他们想要传达的细微之处。

So like when you merge those two, I think you get exciting things because, like, I think people feel that this model understands the nuance that they want to get through.

Speaker 0

我还有个关于Nano Banana的问题。

I have another question about Nano Banana stuff.

Speaker 0

你觉得我们是不是该给所有模型都起些滑稽的名字?

Do you think we should just have goofy names for all of our models?

Speaker 0

你觉得这样会有帮助吗?

Do you think that would help?

Speaker 1

并不觉得。

Not really.

Speaker 1

听着,我的意思是,我们并不是故意这样做的。

Look, I mean, like, I think we didn't do it on purpose.

Speaker 0

Gemini三号。

Gemini three.

Speaker 0

如果我们没把它命名为Gemini三号,你觉得应该叫什么?

If we didn't name it Gemini three, what would you what would we have called it?

Speaker 0

一些荒谬的名字。

Something ridiculous.

Speaker 1

我...我不知道。

I I don't know.

Speaker 1

我不擅长取名。

I'm not good at names.

Speaker 1

我想我其实喜欢...比如叫'裂隙奔跑者'。

I think I like I like mean, it was Rift Runner.

Speaker 1

对吧?

Right?

Speaker 1

就是叫'裂隙奔跑者'。

Like, it was Rift Runner.

Speaker 1

实际上我们用的是Gemini模型。

Like we actually use Gemini model.

Speaker 1

那些都是代号。

Those are code names.

Speaker 1

我们也用Gemini模型来生成这些代号。

We use Gemini models to come up with those code names too.

Speaker 1

而Nano Banana不在其中。

And Nano Banana was not one of those.

Speaker 1

就像我们没用Gemini模型,对吧?

Like we didn't use Gemini, right?

Speaker 1

这背后有个故事。

There's a story about it.

Speaker 1

我记得应该已经公开发表过。

I think, like, it's published somewhere.

Speaker 1

只要这些命名是自然有机的,我就很满意。因为我认为对开发团队来说,这种联系很重要。

I mean, as long as these things are natural and organic, I think I'm happy because I think the teams who are building the models, it's good for them to sort of have that connection.

Speaker 1

是的。

Yeah.

Speaker 1

当我们发布时,这种情况之所以发生是因为我们一直用代号测试模型,对吧?

And then when we release them, like I think that just like, I mean that happened because we were testing the model with the code name, right?

Speaker 1

在LM竞技场上,大家都很喜欢。

On LM Arena and people loved it.

Speaker 1

我觉得,虽然不确定,但我愿意相信它是如此自然自发地流行起来的。

And I think, I don't know, I'd like to think that it was so organic that like, sort of it caught on.

Speaker 1

我不确定是否能人为设计出这样的流程。

I'm not sure if you can create a process to generate that.

Speaker 0

我同意你的看法。

I agree with you.

Speaker 0

这正是我的感受。

That's my feeling.

Speaker 1

既然我们有,就应该好好利用。

If we have it, we should use it.

Speaker 1

如果没有的话,使用标准名称也不错。

If you don't have it, it's good to have standard names.

Speaker 0

是啊。

Yeah.

Speaker 0

我们应该谈谈Nano Banana Pro,这是我们基于Gemini 3 Pro构建的最新尖端图像生成模型。

We should talk about Nano Banana Pro, which is our new state-of-the-art image generation model built on top of Gemini 3 Pro.

Speaker 0

我认为团队在接近完成时,Nano Banana其实已经显示出专业级应用的早期迹象——比如在文本渲染、世界理解等更细微的用例上能获得显著更强的性能表现。

And I think the team, I think actually, even as they were sort of finishing, Nano Banana sort of like had early signal that potentially doing this in a pro capacity, like you could sort of get a lot more performance on a bunch of like more nuanced use cases like text rendering and world understanding and stuff like that.

Speaker 0

目前有什么重点事项...我知道我们手头有很多

Any anything sort of top of mind for I know we're we're a lot of

Speaker 1

事情要处理。

stuff going on.

Speaker 1

我觉得这大概就是各种技术协同发力的交汇点了吧?

I think, like, this is like probably where we see this like alignment of different technologies is coming into play, right?

Speaker 1

我的意思是,Gemini系列模型我们一直强调每个版本都是模型家族,对吧?

Like I mean because with Gemini models we have always said like every model version is a family of models, right?

Speaker 1

比如我们有Pro版、Flash版、Flash-Lite版这样的模型家族。

Like we have the Pro, Flash, Flash-Lite, like this family of models.

Speaker 1

因为不同规模的模型在速度、准确性、成本等方面需要做出不同权衡。

Because at different sizes you have different compromises in terms of speed, accuracy, cost, those kinds of things.

Speaker 1

随着这些技术的融合,我们在图像领域也获得了相同的体验。

As these things are coming together, of course we have the same experience on the image side as well.

Speaker 1

是的。

Yeah.

Speaker 1

所以团队很自然地会考虑:既然有3.0 Pro架构,我们能否利用第一版的经验并扩大规模,将这个模型进一步调优为生成式图像模型?

So I think it's natural that the teams thought about okay, there's the 3.0 Pro architecture, can we actually tune this model more to be generative image using everything that we learned in the first version and increasing the size?

Speaker 1

我认为最终我们得到的是一个能力更强、能理解非常复杂内容的模型。

And I think like where we end up with is something a lot more capable, understands really complex.

Speaker 1

最令人兴奋的应用场景之一是:当你拥有大量复杂文档时,可以输入这些文档,依靠这些模型来提问。

Like some of the most exciting use cases are you have a large set of really complex documents, you can feed those in, We rely on these models to ask questions.

Speaker 1

你还可以要求它生成相关信息图,它也能完美执行。

You can ask it to generate an infographic about that as well, and then it works.

Speaker 1

对吧?

Right?

Speaker 1

这就是这种自然的输入输出模式发挥作用的地方,非常棒。

So this is where this natural input modality input output modality just comes into play, and it's great.

Speaker 0

是啊,感觉就像魔法一样。

Yeah, it feels like magic.

Speaker 0

我不知道,希望这期视频发布时大家已经看过那些示例了,但我觉得真的很酷,看到那么多内部案例被分享出来。

I don't know, hopefully folks will have seen the examples by the time this video comes out, but I think it's just, it's so cool, seeing a bunch of the internal examples being shared around.

Speaker 0

太疯狂了。

It's crazy.

Speaker 1

是的,我同意。

Yes, I agree.

Speaker 1

就像当你突然看到那些内容时,天啊,太令人兴奋了,那么多文字、概念和复杂的事物被一张图解释清楚。

Like it's exciting when you see that all of a sudden, oh my god, yes, like that's sort of huge amount of text and concepts and like complicated things explained in one picture.

Speaker 1

多么美妙的方式啊。

Such a nice way.

Speaker 1

当你看到那些东西时,真的很棒,对吧?

Like when you see those things like it is, it's nice, right?

Speaker 1

你会意识到这个模型有多强大。

You realize the model is capable.

Speaker 0

而且,是的,这其中还有很多微妙的细节,非常有趣。

And it's, yeah, there's so much nuance to it too, which is really interesting.

Speaker 0

我有个相关的问题,大概是在2024年12月,Tulsi曾承诺我们将如何实现这些统一的Gemini模型检查点。

I have a parallel question to this, which is probably December, December 2024, Tulsi was promising how we were going to sort of have these unified Gemini model checkpoints.

Speaker 0

我认为你所描述的实际上是我们现在已经非常接近那个目标了,历史上曾经是

And I think what you're describing is actually that we've gotten really close to that now, where the historically was

Speaker 1

在图像生成方面的统一,哦,我明白了。

Unified in terms of image generation and oh, I see.

Speaker 0

我明白了。

I see.

Speaker 0

是的。

Yeah.

Speaker 0

我很好奇,你认为这是否是一个目标。

And I'm curious, do you think that I assume that's a goal.

Speaker 0

我们希望这些特性真正融入模型,但现实中存在阻碍因素,我想知道是否有任何背景或高层面的

We want these things actually made mined into the model, and there's natural things that stop that happening, and I'm curious if any context or sort of high level

Speaker 1

听着,正如我所说,技术和架构正在趋于一致,我们正目睹这一过程。

Look, I think as I said, the technology, the architectures, they're aligning, so we see that happening.

Speaker 1

人们定期尝试,但这仍是一个假设,你不能基于意识形态行事,对吧?

At regular intervals, people are trying, but it's a hypothesis, and like you can't be ideology based in this, right?

Speaker 1

科学方法就是科学方法。

The scientific method is the scientific method.

Speaker 1

我们尝试各种方案,提出假设并观察结果,有时成功有时失败,但这就是我们必经的发展过程。

Like we try things, we have an hypothesis and you see the results, sometimes it works, sometimes it doesn't, but that's the progression that we go through.

Speaker 1

距离目标越来越近了。

It's getting closer.

Speaker 1

我确信在不久的将来我们会看到某种整合,而且我认为逐渐会越来越接近单一模型。

I'm pretty sure near future we are going to see something getting together and I think gradually it's going to be more and more like one single model.

Speaker 1

但这需要大量创新,对吧?

But it will require a lot of innovation, right?

Speaker 1

仔细想想,这确实很困难。

Like it is hard, like if you think about it.

Speaker 1

模型的输出空间非常关键,因为学习信号正是来源于此。

The output space is very critical for the models because that's where your learning signal comes from.

Speaker 1

对吧?

Right?

Speaker 1

目前我们的学习信号来自代码和文本。

Right now our learning signal comes from code and text.

Speaker 1

这是输出空间的主要驱动力,也是你在这方面表现优异的原因。

That's the most of the driver of that output space and that's why you are getting good at there.

Speaker 1

现在能够生成图像——我们对图像质量的要求如此之高,这确实是件难事,比如要生成真正高质量的图像,像素级完美很难做到;而且从概念上讲,图像必须高度连贯,每个像素既要保证质量,又要符合画面的整体构思,这很重要对吧?

Now being able to generate images is we are so tuned for the quality in images, like it is a hard thing to do, right, like generating really like the quality of the images, the pixel perfectness is hard and then images are also conceptually it has to be very coherent like every pixel both the quality matters but also how it fits with the general concept of the picture like it matters right?

Speaker 1

要训练一个同时兼顾这两方面的模型更加困难。

It is harder to train something that does both.

Speaker 1

在我看来,我认为这绝对是有可能的。

The way I look at this is to me, I think it's definitely possible.

Speaker 1

这终将成为可能。

It will be possible.

Speaker 1

关键在于找到模型中的正确创新点来实现这一目标。

It's just about finding the right innovations in the model to make it happen.

Speaker 0

是啊。

Yeah.

Speaker 0

我很喜欢。

I love it.

Speaker 0

我很期待。

I'm excited.

Speaker 0

如果我们能实现单一模型检查点,希望也能让我们的服务部署变得更简单。

It'll hopefully make our serving situation easier too if we have

Speaker 1

单一模型检查点。

single model checkpoint.

Speaker 1

现在还不好说。

It's impossible to say.

Speaker 0

确实不可能。

It's impossible.

Speaker 0

我同意你的观点。

I agree with you.

Speaker 0

我们坐在这里讨论时,有个有趣的线索——DeepMind拥有世界上最好的一批AI产品,但愿如此。

The sort of interesting thread as we sort of sit here and, you know, DeepMind has a bunch of the world's best AI products, hopefully.

Speaker 0

氛围编程与AI Studio、Gemini应用、Antigravity,以及正在谷歌各处发生的种种创新。

Vibe coding and AI Studio, Gemini app, Antigravity, and sort of across Google that's happening now.

Speaker 0

我们拥有最先进的Gemini三代模型。

We have a great state of the art model with Gemini three.

Speaker 0

我们有Nano Banana。

We have Nano Banana.

Speaker 0

我们还有Veo。

We have Veo.

Speaker 0

我们拥有所有这些处于前沿的模型。

We have all these models that are sort of at the frontier.

Speaker 0

世界在十年前,甚至十五年前,看起来都大不相同。

The world looked very different, like, ten years ago, or even, like, fifteen years ago.

Speaker 0

我有点好奇,你是如何一步步走到今天这个位置的。

And I'm sort of curious, like, for your personal journey to get to this point.

Speaker 0

昨天我们聊天时你提到的事我完全不知道,后来我跟别人说起,他们也表示同样毫不知情。

You when we were talking yesterday, you had mentioned, which I had no idea, and I mentioned this to someone else, they also were like, I had no idea of this.

Speaker 0

你是DeepMind的第一位深度学习研究员。

You were the first deep learning researcher at DeepMind.

Speaker 0

我觉得从那时到现在的发展轨迹简直是个疯狂的跨越,毕竟当初人们对这项技术并不热衷。

And I think taking that thread to this place that we're at now feels like it's a crazy jump, to go from just like the fact that people weren't excited about this technology.

Speaker 0

不知道你是多久以前加入DeepMind的,大概十年前?

I don't know how long ago you started at DeepMind, like ten years?

Speaker 1

2012年。

Twenty twelve.

Speaker 0

十三年了?

Thirteen years?

Speaker 0

对。

Yeah.

Speaker 0

这太疯狂了。

That's crazy.

Speaker 0

十三年前,人们对这项技术并不热衷(或者说,当时只有DeepMind对它充满热情),而如今它几乎驱动了所有产品,成为了核心所在。

Thirteen years ago, people weren't excited about this technology to the place or I guess, DeepMind was excited about this technology to the place now where, like, it is literally powering all these products and is, like, the main thing.

Speaker 0

我很好奇,当你回顾这段历程时,脑海中浮现的是什么?

And I'm curious, as you reflect on that, what comes to mind?

Speaker 0

是感到意外,还是觉得理所当然

Is it surprising or like was it obviously

Speaker 1

嗯,我想这就是我们期待中的积极发展情景,对吧?

Well, I mean, I think this is the hopeful positive outcome scenario case, right?

Speaker 1

用我的话来说,就像我读博士时那样——我想每个博士生都如此——你坚信自己的研究很重要或将会很重要,对吧?

The way I say it is like, when I was doing my PhD, I think it's the same for everyone doing their PhDs, you believe that what you do is important or is going to be important, right?

Speaker 1

你对那个领域充满热情,认为它将产生重大影响。

Like you're really interested on that topic, you think that it's going to make a big impact.

Speaker 1

我想我当时也抱着同样的心态,所以当Demis和Shane联系我并交谈时,我对DeepMind感到无比兴奋。

And I think I was in the same mindset that's why I was really excited about DeepMind when Demis and Shane reached out and we talked.

Speaker 1

得知有一个真正专注于构建智能的机构,而深度学习是其核心,这让我感到非常兴奋。

I was really excited to learn that there was a place that was really focused on building intelligence and deep learning was at the core of it.

Speaker 1

实际上我和我的朋友Karol Gregor都曾在纽约大学Yann的实验室工作过。

It's actually like me and my friend Karol Gregor, actually we were both in Yann's lab at NYU.

Speaker 1

我们同时加入了DeepMind,具体来说就是这样。

We joined DeepMind at the same time, just to be very specific.

Speaker 1

那时候,即使是专注于深度学习和AI的初创公司也非常罕见。

And then at those times it was very unnatural that you would have a deep learning focused and AI focused, startup even.

Speaker 1

所以我认为那非常有远见,是个令人惊叹的地方,真的非常令人兴奋。

So like I think that was very visionary and an amazing place to be like it was really, it was really exciting.

Speaker 1

然后我组建了深度学习团队,团队逐渐壮大。

And then I started the deep learning team, it grew.

Speaker 1

我认为我喜欢的一点是,我对深度学习的态度始终是一种解决问题的思维方式。

I think one of the things that I like, I mean my approach to deep learning has always been that like a mentality of how you approach problems.

Speaker 1

而首要原则始终是基于学习的。

And the first principle it's always learning based.

Speaker 1

这就是DeepMind的核心理念。

That's what DeepMind was about.

Speaker 1

一切都基于学习。

Everything is based on learning.

Speaker 1

从最初的日子开始,到强化学习、智能体以及我们一路走来所完成的一切,这是一段激动人心的旅程。

It was an exciting journey to start from where we were at the days and then RL and agents and everything that we have done along the way.

Speaker 1

就像你投身于这些事情时——至少我是这么想的——我带着对积极结果的期待参与其中,但事后反思时,我会说我们其实很幸运,对吧?

Like you go into these things, at least like this is how I think, I go into these things hoping that a positive outcome happens, but I reflect and I say that we are lucky, right?

Speaker 1

我们很幸运生活在这个时代,因为我认为许多人投身AI或他们真正热衷的领域时,都坚信这是属于他们的时代,成果终将显现。

Like we are lucky to be living in this age because I think a lot of people have worked on AI or the topics that they are really passionate about thinking that this is their age and this is when it's going to pan out.

Speaker 1

但变革正在当下发生,我们必须认识到AI的崛起不仅归功于机器学习和深度学习,还因为硬件发展已达到特定阶段,互联网和数据积累也趋于成熟,对吧?

But it's happening right now and we have to also realise that AI is happening right now not just because machine learning and deep learning but also because it's like the hardware evolution has come to a certain state, like internet and data has come to a certain state, right?

Speaker 1

诸多因素共同作用形成了这种局面,能亲身参与AI研究并见证这一时刻,我感到十分幸运。

So there are a lot of things that align together and I feel lucky to be actually be doing AI and sort of like working up to this moment.

Speaker 1

当我回顾时,确实如此认为——选择AI道路是我们共同的决定,而我个人也做出了专攻AI领域的特定选择。

I think it is a like when I reflect that's how I feel that like yes, they were all choices that we worked on AI and we made and I made like specific choices to work on AI.

Speaker 1

但同时,我也感到非常幸运,我们此刻能处于这个位置。

But also at the same time, I also feel very lucky at this time we are in this position.

Speaker 1

这非常令人兴奋。

It's very exciting.

Speaker 0

是的,我同意你的观点。

Yeah, I agree with you.

Speaker 0

我很喜欢这样。

I love that.

Speaker 0

我很好奇,比如,我在观看《The Thinking Game》这部影片时,想了解更多关于AlphaFold的事情,但我当时并不在场。

I'm curious, like, what are some of the and I was watching The Thinking Game video and sort of, like, see like, learning more about, like, which I hadn't and I wasn't I wasn't around for AlphaFold.

Speaker 0

所以我仅有的背景就是阅读相关资料和听人们谈论它。

So that's the only context that I have is, like, reading about it and seeing people talk about it.

Speaker 0

我很好奇,当你回顾并经历过这一切后,现在的情况与过去有何不同。

And I'm curious, like, as you reflect and having lived through a bunch of that, how things are different today versus what they were before.

Speaker 0

我先给你举个例子,就是你刚才在镜头外提到的那个——虽然这不是你的原话。

And I'll sort of tee you up with one example, which is what you kind of alluded to off camera right before this, which is, and this is not exactly your words.

Speaker 0

你曾提到,我们已经摸索出如何构建这些模型并将它们推向世界。

You were like, we've kind of figured out how to make these models and bring them to the world.

Speaker 0

这有点像你所要表达的核心意思

It was like sort of an essence of what you're getting

Speaker 1

at,

Speaker 0

对此我表示认同,并且很好奇这是否与之前某些迭代阶段的情况相似或不同

which I agree And I'm curious if that felt like, yeah, how that is similar or not to how things were for some of the previous iterations

Speaker 1

我在思考如何组织,或者说哪些文化特质对于将棘手的科学技术问题转化为成功成果至关重要。

I think of how to organize or the cultural traits of what is important to be successful to turn hard scientific and technical problems into successful outcomes.

Speaker 1

我认为我们通过许多项目学到了很多,从DQN、AlphaGo、AlphaZero到AlphaFold都是如此。

I think we learned to do that a lot with many of the projects that we have done starting from DQN, AlphaGo, AlphaZero, AlphaFold.

Speaker 1

所有这些项目都产生了相当大的影响力。

All these kinds of things have been quite impactful.

Speaker 1

而且通过这些项目,我们学到了很多关于如何围绕特定目标、特定任务组织团队,以及如何作为大型团队运作的经验。

And in their ways, like we learned a lot on how to organise around the particular goal, particular mission, organise as a large team.

Speaker 1

我记得在DeepMind早期,我们会有25个人共同做一个项目,然后25个人一起写论文,所有人都会对我们说‘肯定没有25个人参与这个工作吧’。我会回答‘是的,确实有25人参与’。

Like I remember in the early days of DeepMind, we would work on a project with like 25 people and we would write papers with 25 people and then everyone would say to us 'surely 25 people didn't work on this.' I would say yes they did, they did.

Speaker 1

对吧?

Right?

Speaker 1

我的意思是,我们会这样组织,因为在科学和研究领域这并不常见,对吧?

Like, I mean, we would organize because in sciences and in research that wasn't common, right?

Speaker 1

我认为那种认知、那种思维方式是关键。

And I think like that knowledge, that mentality, I think is key.

Speaker 1

我们就是那样发展过来的。

We evolved through that.

Speaker 1

我认为这真的非常重要。

I think that is really, really important.

Speaker 1

同时,我认为就像我们最近两三年讨论的那样,我们所融合的是这样一种理念:现在这更像是一种工程思维模式,我们有一条主线模型在开发,并学习如何在这条主线上进行探索,如何用这些模型进行探索。

At the same time, I think like with the latest, like the last two, three years as we talked, what we have been merged, like what we have merged this with is like the idea that now this is more like an engineering mindset where we have a main line of models that we are developing and we learn how to do exploration on this main line, how to do exploration with these models.

Speaker 1

一个让我每次看到或想到都感到非常欣慰的好例子就是我们的Deep Think模型。

The good example where I see this and every time I see this or think about this I feel quite happy is our Deep Think models.

Speaker 1

这些就是我们带去参加国际数学奥林匹克竞赛和国际大学生程序设计竞赛的模型。

Those are the models that we go to the IMO competition with, to the ICPC competition with.

Speaker 1

我认为这是个非常酷且很好的例子,因为我们进行探索时,会选择像国际数学奥林匹克这样重要的重大目标,对吧?

And I think that's a really, really cool and good example because we do the exploration, you pick these big targets like, IMO competition is really important, right?

Speaker 1

这些问题确实非常难,要向所有参加这些竞赛的学生致敬,他们做的真是了不起。

Like, it's really hard problems and kudos to every student out there who's competing in those competitions, amazing stuff really.

Speaker 1

当然,能够将模型应用其中时,你自然会想要为其定制些特别的东西。

And like being able to put a model there, of course, like you have the urge to do something custom for that.

Speaker 1

我们试图做的是将其视为一个机会,来完善现有技术或提出与现有模型兼容的新想法,因为我们相信所持技术的通用性。

We sort of what we try to do is use that as an opportunity to evolve what we have or to come up with new ideas that are compatible with the models that we have because we believe in the generality of the technology that we have.

Speaker 1

这就是Deep Think这类成果的诞生过程:我们先提出构想,然后将其开放给所有人使用。

And then that's how things like Deep Think happen and then like we come up with something and then we make it available for everyone.

Speaker 1

对吧?

Right?

Speaker 1

这样每个人都能使用一个真正参与过国际数学奥林匹克竞赛的模型。

So everyone can use a model that is actually one that is used in the IMO competition.

Speaker 0

是的,只是想在你提到的论文中的25人和现在的情况之间做个类比,我相信Gemini三代的贡献者名单即将或已经公布

Yeah, just to draw a corollary between what you said, the 25 people in the paper, and I think now the today version of that is you look at like, I'm sure there's a Gemini three contributors list that will come out or is already

Speaker 1

2500人。

2,500.

Speaker 0

大概有2500人,而且我相信人们会……这还是保守的数字。

And there's like 2,500 people, and then I'm sure people are... Conservatively.

Speaker 0

对,我敢肯定有人觉得2500人不可能都真正做出了贡献,但他们确实做到了。

Yeah, I'm sure people are thinking there's no way that 2,500 people contributed to actually They did.

Speaker 0

但他们确实做到了,这让人不禁感叹现在某些项目的规模之大。

But they did, which is And it is fascinating to see how large scale some of these problems are now.

Speaker 0

确实如此。

Really.

Speaker 1

我认为这对我们很重要,这也是谷歌的一大优势。

And I think it is important for us and that's one of the great things about Google.

Speaker 1

这里有太多在各自领域堪称专家的人才。

There are so many people who are amazing experts in their areas.

Speaker 1

我们从中受益。

We benefit from that.

Speaker 1

谷歌采用全栈式方法。

Google has this full stack approach.

Speaker 1

我们从中受益。

We benefit from that.

Speaker 1

因此,从数据中心到芯片,再到网络,再到如何大规模运行这些系统,每个环节都有专家。

So you have experts at every layer from data centers to chips to networking to how to run these things at scale.

Speaker 1

这再次回归到工程思维的核心——这些环节本就是不可分割的整体,对吧?

It comes to a state again going on this engineering mindset it comes to a state that these things are not separable, right?

Speaker 1

比如我们设计模型时,会预先考虑它将运行的硬件平台;而设计下一代硬件时,又会预测模型的发展方向。

Like when we design a model, we design it knowing what hardware it's going to run on and we design the next hardware knowing where the models will probably go.

Speaker 1

这种协同固然美妙,但协调数千人共同贡献确实需要付出,我们应该认可这种努力,这本身就是件美好的事。

But this is beautiful, But coordinating this, yes of course you have thousands of people working together and contributing and I think we need to recognize it, and that's a beautiful thing.

Speaker 1

这很棒。

That's great.

Speaker 0

是啊。

Yeah.

Speaker 0

要做到这一点并不容易。

It's not easy to pull off.

Speaker 0

其中一个有趣的线索是回归到DeepMind的传统,采用各种科学方法尝试解决这些非常有趣的问题。而如今我们确实知道这项技术已在多个领域发挥作用,现在真正需要做的就是持续扩大规模。显然这需要创新才能实现,但我很好奇你如何看待当今时代的DeepMind,在纯粹的科学探索与仅仅扩大Gemini规模之间如何平衡。

One of the one of the interesting threads is around back to this sort of like DeepMind legacy sort of doing all these different scientific approaches and trying to solve these really interesting problems, and today where we actually know that this technology works in a bunch of capacities, and we truly just need to keep scaling it up, obviously there's innovation that's required to keep doing that, but I'm curious how you think about DeepMind in today's era balancing purely doing scientific exploration versus we're just trying to scale up Gemini.

Speaker 0

或许我们可以用我最喜欢的例子来说明——Gemini扩散模型,某种程度上体现了这种决策过程的具体实践?

And maybe we can use my favorite example for you, which is Gemini diffusion as an example of that decision making come to life in some capacity?

Speaker 0

那个

That

Speaker 1

是最关键的因素。

is the most critical thing.

Speaker 1

找到这种平衡真的很重要。

Finding that balance is really important.

Speaker 1

即使现在,当人们问我,Gemini面临的最大风险是什么时

Even now, when people ask me, what is the biggest risk for Gemini?

Speaker 1

当然,我经常思考这个问题。

And of course, I think about this a lot.

Speaker 1

Gemini面临的最大风险是创新枯竭。

The biggest risk for Gemini is running out of innovation.

Speaker 1

因为我真的不认为我们已经找到了秘诀,从此只需按部就班执行。

Because I really don't believe that we figured out the recipe and we're just going to execute from here.

Speaker 1

我不相信这种说法。

I don't believe in that.

Speaker 1

如果我们的目标是构建智能,我们当然要与用户和产品一起实现,但面临的问题极具挑战性。

If our goal is to build intelligence and we're going to do that of course with the users, with the products, but the problems out there are very challenging.

Speaker 1

我们的目标依然充满挑战,我并未觉得我们已经掌握了只需扩大规模或执行的秘诀。

Our goal is still very challenging and it's out there and I don't feel like we have the recipe figured out that it's just scaling up or executing.

Speaker 1

唯有创新才能实现这一目标。

It is innovation that is going to enable that.

Speaker 1

而创新,你可以从不同规模或与你现有方向相切的不同角度来思考。

And innovation, you can think about it as at different scales or at different tangential directions to what you have right now.

Speaker 1

比如我们当然拥有Gemini模型,在Gemini项目中我们进行了大量探索。

Like of course we have Gemini models and inside the Gemini project we explore a lot.

Speaker 1

我们探索新架构,尝试新想法,研究不同的实现方式。

We explore new architectures, we explore new ideas, we explore different ways of doing things.

Speaker 1

我们必须这样做,并持续下去,这正是所有创新的源泉。

We have to do that, we continue to do that and that's where all the innovation comes from.

Speaker 1

但与此同时,我认为DeepMind或整个Google DeepMind正在进行更广泛的探索。

But like also at the same time I think DeepMind or Google DeepMind as a whole doing a lot more exploration.

Speaker 1

我认为这对我们至关重要。

I think it is very critical for us.

Speaker 1

我们必须做这些事情,因为Gemini项目本身可能在某些方面的探索上存在局限性。

We have to do those things because like again, like there might be some things that like the Gemini project itself might be too constraining to explore some things.

Speaker 1

所以我认为我们能做的最好的事情就是在Google DeepMind和Google Research同时推进。

So like I think the best thing that we can do is both in Google DeepMind, also in Google Research.

Speaker 1

对吧?

Right?

Speaker 1

我们会探索各种想法,并将这些想法引入,因为归根结底,Gemini并非架构本身。

Like we would explore all sorts of ideas and we will bring those ideas in because at the end of the day Gemini is not the architecture.

Speaker 1

对吧?

Right?

Speaker 1

Gemini是你们想要实现的目标。

Gemini is the goal that you want to achieve.

Speaker 1

你们想要实现的目标是智能,并通过产品助力谷歌真正运行在这个AI引擎上的目标。

The goal that you want to achieve is the intelligence and you want to do it with your products enabling goal of Google to really run on this AI engine.

Speaker 1

从某种意义上说,具体采用什么架构并不重要。

In a way it doesn't matter what particular architecture it is.

Speaker 1

我们目前拥有一些方案,并有途径逐步完善,我们将通过这种方式持续演进。

We have something currently and we have ways of evolving through that and we will evolve through that.

Speaker 1

而驱动这一切的引擎将是创新。

And the engine of that will be innovation.

Speaker 1

创新永远是第一动力。

It will always be innovation.

Speaker 1

因此,找到这种平衡点或发现以不同方式实现目标的机会,我认为非常关键。

So finding that balance or finding opportunities of doing that in different ways, I think is very critical.

Speaker 0

是的。

Yeah.

Speaker 0

我有个与之相关的问题,在I/O大会上,我和谢尔盖坐下来交谈时,我向他提到过,我个人在I/O大会上深切感受到的是:你们召集所有人共同发布这些模型并推动创新。

I have a parallel question to that, which is at I/O, I sat down with Sergey, and I made the comment to him that sort of when and I I personally felt this at I/O, is you bring all these people together to launch these models and and have this innovation.

Speaker 0

在这个过程中,你会感受到一种人性的温暖,这非常有趣。

You sort of, like, feel the the warmth of of humanity as you do that, which is really interesting.

Speaker 0

我提到这个是因为,当时坐在你旁边听他们讲话时,我也感受到了你散发出的温暖。

And I I was referencing this because of, you know, I was sitting next to you also listening to them, and I sort of, was feeling your warmth.

Speaker 0

我这话说得很个人化,因为我认为这反映了DeepMind整体的运作方式。

And I I mean this very personally because I think this translates into, like, how DeepMind sort of as a whole operates.

Speaker 0

就像德米斯也具备这种特质——既有深厚的科学根基,又拥有一群友善亲切的人。

I think, like, Demis has this as well where it's, this deep scientific roots, but also it's just like people who are like nice and friendly and kind.

Speaker 0

这里有件很有意思的事——我不确定人们是否充分意识到这种文化的重要性及其具体体现。

And there there is something interesting where like, I don't I don't know how much people appreciate, like, how much that culture matters and how it manifests.

Speaker 0

我很好奇,当你思考如何帮助塑造和运营这一切时,这对你而言是如何体现的?

I'm curious, as you think about helping sort of shape and run this, how that manifests for you?

Speaker 1

首先,非常感谢你。

First of all, thank you very much.

Speaker 1

你让我有点不好意思了。

You're embarrassing me.

Speaker 1

但我认为,重要的是——我相信我们的团队,也相信给予人们信任和机会,团队精神至关重要。

But like, I think it is important to be I believe in the team that we have and I believe in giving people, like trusting people, giving people the opportunity and that team aspect is important.

Speaker 1

我想说,至少对我而言,这也是在DeepMind工作中学到的,因为我们曾是个小团队,自然在那里建立了信任,而如何在成长中保持这种信任。

And I think this is something that, at least for my part, I can say I've learned through working at DeepMind as well, because we were a small team, and of course you build that trust there, and then the question is how you maintain that as you grow.

Speaker 1

我认为营造一个让成员感到'我们真心关注解决那些能对现实世界产生重要影响的挑战性技术科学问题'的环境很关键。

I think it is important to have this environment where people feel like, okay, we really care about solving the challenging technical and scientific problem that makes an impact that matters for the real world.

Speaker 1

我想这依然是我们正在践行的。

And I think that is still what we are doing.

Speaker 1

对吧?

Right?

Speaker 1

就像我说的,Gemini正是关于这一点的。

Like Gemini, as I said, is about that.

Speaker 1

构建智能是一个极具技术挑战性的科学难题。

Building intelligence is a highly technical challenging scientific problem.

Speaker 1

我们必须以这种方式来应对。

We have to approach it that way.

Speaker 1

我们还需要带着谦逊的态度去面对,对吧?

We have to approach it with that humility as well, right?

Speaker 1

就像我们必须不断质疑自己。

Like we have to always question ourselves.

Speaker 1

希望团队也能有这样的感受。

Like hopefully the team feels like that too.

Speaker 1

这就是为什么我总是说,我为团队能如此出色地协作感到无比自豪。

And that's why I always keep saying I'm really proud of the team, that they work together amazingly well.

Speaker 1

就像今天我们在高岭的茶水间聊天时说的,是的这很累,是的很难,是的我们都精疲力尽,但这就是我们的使命。

Like we were just talking upstairs at the micro kitchen today at Takamine, I said to them, yes it's tiring, yes it's hard, yes we are all exhausted, but this is what it is.

Speaker 1

我们对此并没有一个完美的架构体系

Like we don't have a perfect structure for this.

Speaker 1

每个人都在齐心协力、相互支持

Everyone is coming together and working together and like supporting each other.

Speaker 1

虽然艰难,但正是这种团队协作让攻克难题变得有趣且令人愉悦——我认为很大程度上取决于是否拥有合适的团队共同奋斗

It is hard, but what makes it fun and enjoyable, and also what lets you tackle really hard problems, is, I think, to a big extent having the right team working together.

Speaker 1

在我看来,真正的压力在于要清晰认识我们现有技术的潜力

The burden, the way I see it, is more to be clear about the potential of the technology that we have.

Speaker 1

我无法断言二十年后LLM架构还会完全保持不变

I can't definitely say that twenty years from now it's the exact same LLM architecture.

Speaker 1

我确信它一定会进化

I'm sure it won't be.

Speaker 1

对吧?

Right?

Speaker 1

因此我认为推动新的探索才是正确的方向

So I think pushing for new exploration is the right thing to do.

Speaker 1

我们讨论过,GDM作为一个整体与Google Research一起,必须与学术研究社区合作推进。

As we talked about, GDM as a whole, together with Google Research, has to be doing this with the academic research communities.

Speaker 1

作为一个整体,我们必须推动多个不同的方向。

As a whole we have to push many different directions.

Speaker 1

我认为这完全没问题。

I think that's perfectly fine.

Speaker 1

什么是正确、什么是错误,我觉得这不是重要的讨论点。

What is right and what is wrong, I don't think that's the important conversation.

Speaker 1

我认为,真正的能力及其在现实世界中的展示效果才是最有力的证明。

I think the capabilities, and the demonstrations of those capabilities in the real world, are the real thing that should speak for itself.

Speaker 0

是的。

Yeah.

Speaker 0

我最后还有一个问题,也很好奇您对此的看法。

I have one last question, and I'm curious to hear your reflection on this as well.

Speaker 0

就我个人而言,在谷歌的前一年半时间里,其实我很喜欢某种程度上谷歌作为挑战者的故事——尽管拥有所有基础设施优势,但对我来说...

For me personally, my first year and a half plus at Google felt like a Google underdog story to a certain extent, which I actually really liked, despite all the infrastructure advantage and all that. For me personally, showing...

Speaker 1

你是什么时候加入的?

When did you join?

Speaker 0

2024年4月。

April 2024.

Speaker 0

2024年。

2024.

Speaker 0

好的。

Okay.

Speaker 0

是的。

Yeah.

Speaker 0

还有AI工作室的背景下,我们正在打造这款产品

And also the AI studio context, we were building this product and

Speaker 1

对。

Right.

Speaker 1

哦,现在我想起来了。

Oh, now I remember.

Speaker 0

当时我们既没有用户,或者说只有3万用户,也没有收入,Gemini模型的生命周期还处于非常早期阶段。而如今快进到现在,显然情况已经不同——就像这两天随着模型逐步发布,我收到了大量消息提醒。

We had no users, or we had 30,000 users, we had no revenue, and we were very early in the Gemini model life cycle. Fast forward to today, and it's obviously not like that. I was getting a bunch of pings over the last couple of days as this model has been rolling out.

Speaker 0

而且你知道,来自整个生态系统的同行们——我相信你也收到了不少——人们似乎终于意识到这件事正在发生。

And from folks across the ecosystem, and I'm sure you got a bunch of these as well, I think people are finally realizing that this is happening.

Speaker 0

但我很好奇从你的角度来看,你是否也感受到了那种劣势感?我当初是怀着信念加入谷歌的,相信我们会走到今天这一步。

But I'm curious, from your perspective: again, I had belief that we were going to get to this point, that's why I joined Google, but did you feel that underdog-ness too?

Speaker 0

我还想知道,你认为团队在跨越这个转折点时会有怎样的表现?

And I'm curious how you think that will manifest for the team as we turn that corner.

Speaker 1

我确实在那之前就意识到了,因为当大语言模型真正展现出强大能力时,说实话,我觉得我们就是前沿AI实验室,就像在DeepMind那样,但同时我也觉得我们在某些方面作为研究者投入得还不够,这对我来说是个深刻的教训。

I definitely did, even before that. When it really became apparent that LLMs are really powerful, very honestly, I felt like we were the frontier AI lab, right, at DeepMind. But at the same time I felt like, okay, there's something that we haven't invested in as much as we should have as researchers, and that's a big learning for me as well.

Speaker 1

对吧?

Right?

Speaker 1

这就是为什么我总是非常谨慎地认为我们需要广泛撒网,这非常重要,探索精神至关重要。

Like that's why I'm always very careful about like we need to cast a wide net, that's really important, that exploration is important.

Speaker 1

这不是关于这种架构或那种架构的问题。

It's not about this architecture, that architecture.

Speaker 1

我一直非常坦诚地告诉团队,当我们开始更认真地对待大语言模型,大约两年半前启动Gemini项目时。

And I've been very open with the team about this, from when we started taking LLMs a lot more seriously with the Gemini program about two and a half years ago.

Speaker 1

我认为我们始终如此,我也非常诚实地告诉团队,我们离行业最前沿水平还差得很远。

I think we have always been, and I've been, very honest with the team that we are nowhere near what is state of the art here.

Speaker 1

我们对许多事情都还束手无策。

We don't know how to do a lot of things.

Speaker 1

虽然我们掌握了不少技术,但确实还没达到那个水平,这是一场漫长的追赶赛。

There are a lot of things we know how to do but like we are not at that level yet and it's a catch up and it has been a catch up for a long while.

Speaker 1

我感觉如今我们已跻身领导集团。

I feel like nowadays we are in that leadership group.

Speaker 1

对我们当前的发展节奏,我感到非常乐观和积极。

I feel really good and positive about the pace that we are operating at.

Speaker 1

我们正处于良性的发展节奏中。

We're in a good sort of rhythm.

Speaker 1

团队氛围非常融洽。

We have a good dynamic.

Speaker 1

我们节奏不错,但确实,我们一直在追赶。

We have a good rhythm. But yeah, we have been catching up.

Speaker 1

你必须对自己诚实,当你处于追赶状态时,就是在追赶。

You have to be honest with yourself. When you are catching up, you are catching up.

Speaker 1

你必须观察别人在做什么,学习你能学的,但也要为自己创新。

You have to see what others are doing and learn what you can learn, but you have to innovate for yourself.

Speaker 1

这正是我们做的,我觉得从某种意义上这是个不错的逆袭故事,对吧?

And that's what we did, and that's why I feel like it's a good underdog story in that sense, right?

Speaker 1

我们为自己创新,在技术、模型、流程和运营方式上都找到了自己的解决方案。

Like we innovated for ourselves and like we found our own solutions both like technology wise, model wise, process wise and how we run.

Speaker 1

对吧?

Right?

Speaker 1

这是我们独有的,我们是与整个谷歌共同前进的。

And it's unique to us. We run together with all of Google.

Speaker 1

看看我们在做的事,规模完全不同。

Like look at what we are doing, it's a very different scale.

Speaker 1

我从不把这些视为问题,虽然人们常说‘谷歌太大了,运作起来很困难’。

I never saw these things as problems. Sometimes people say, "Oh, Google is big and it is hard."

Speaker 1

我认为我们可以将其转化为优势,因为我们有独特的资源和能力。

I see that as we can turn it into our advantage because we have unique things that we can do.

Speaker 1

我对我们目前的进展感到满意,但我们必须持续学习和创新。

I'm quite pleased with where we are, but we have to keep learning and innovating through that.

Speaker 1

这是我们取得当前成就的正确方式,而且还有更多工作要做。

That's a good way to achieve what we have achieved right now and there's a lot more to do.

Speaker 1

对吧?

Right?

Speaker 1

我的意思是,我感觉我们某种程度上还在追赶阶段。

Like, I mean, I feel like we are sort of just catching up.

Speaker 1

我们才刚刚到达那个位置。

We are just getting there.

Speaker 1

总是会有比较,但我们的目标是构建智能,对吧?

There's always comparisons, but our goal is to build intelligence, right?

Speaker 1

我们正是想要实现这个目标。

Like we want to do that.

Speaker 1

我们希望以正确的方式来实现它。

We want to do it the right way.

Speaker 1

这正是我们集中所有智慧和创新力量的方向。

And that's where we are putting all our minds, all our innovation.

Speaker 0

是啊。

Yeah.

Speaker 0

我觉得接下来的六个月,可能会像过去六个月以及更早前的六个月一样令人振奋。

I feel like the next six months are going to be probably just as exciting as the last six months and the six months before that.

Speaker 0

感谢您抽时间参与这次对话。

Thank you for taking the time to sit down.

Speaker 0

这次交流非常愉快。

This was a ton of fun.

Speaker 0

希望明年IO大会前我们能再次相聚——虽然感觉还很遥远,但时间转眼就会到来。

I hope we get to sit down again before I/O next year, which feels like forever away, but it is going to sneak up on us.

Speaker 0

我确信下周就会有类似IO 2026规划会议,让一切如期实现。

I'm sure there are going to be meetings next week for I/O 2026 planning to make everything happen.

Speaker 0

所以,感谢你抽出时间。

So, thank you for taking the time.

Speaker 0

再次祝贺你、DeepMind团队以及模型研究团队的所有成员,让Gemini 3、Nano Banana Pro等一切成果成为现实。

Congrats again to you and the DeepMind team and everyone on the model research team for making Gemini 3, Nano Banana Pro, and everything else happen.

Speaker 1

是的。

Yeah.

Speaker 1

非常感谢。

Thank you very much.

Speaker 1

能进行这次对话真是太棒了。

It's been amazing having this conversation.

Speaker 1

这也是一段奇妙的旅程,很高兴能与团队一起分享,也与你分享这些。

It's an amazing journey as well, and I'm glad to share it with all the team, and also with you.

Speaker 1

真的很棒。

It's great.

Speaker 1

非常感谢你邀请我。

Thank you very much for inviting me.

Speaker 0

我们准备了一份特别的小礼物。

We got a special little gift.

Speaker 0

感谢你和团队让这一切成为现实。

It's to thank and congratulate you and the team for making this happen.

Speaker 1

哦,太棒了。

Oh, nice.

Speaker 1

非常感谢。

Thank you very much.

Speaker 1

非常切中要点。

Very much on point.

Speaker 0

1500分的首个Elo模型,对吧?

First 1,500-point Elo model, right?

Speaker 0

1501分。

1501.

Speaker 1

是的,第一个模型。

Yes, first model.

Speaker 1

非常贴心。

Very kind.

Speaker 1

非常感谢。

Thank you very much.
