超越钓鱼攻击：AI时代的网络威胁与弗林四世

本集简介

延伸阅读： CodeMender：https://deepmind.google/discover/blog/introducing-codemender-an-ai-agent-for-code-security/ 谷歌网络安全：https://blog.google/technology/safety-security/ai-security-frontier-strategy-tools/ 威胁情报报告：https://cloud.google.com/blog/topics/threat-intelligence/adversarial-misuse-generative-ai 特别感谢为此付出努力的每一位成员，包括但不限于：主持人：汉娜·弗莱教授系列制片人：丹·哈杜恩编辑：拉米·察巴尔监制与制片人：艾玛·尤西夫音乐作曲：埃莱妮·肖音频工程师：理查德·考蒂斯视频剪辑：比拉尔·梅尔希音频工程师：佩里·罗甘廷视觉设计：罗伯·阿什利由谷歌DeepMind委托制作若您喜欢本期节目，请在Spotify或Apple Podcasts上为我们留下评价。我们始终期待听众的反馈，无论是意见、新想法还是嘉宾推荐！由Simplecast（AdsWizz旗下公司）提供托管服务。有关我们收集和使用个人数据用于广告的信息，请访问pcm.adswizz.com。

双语字幕

仅展示文本字幕，不包含中文音频；想边听边看，请使用 Bayt 播客 App。

Speaker 0

该项目的目标正是通过发现漏洞并帮助开源社区解决这些问题，让全球各地的人们都能受益，从而提升整个行业的安全性。

The goal of the project is nothing short of helping improve the security for the whole industry by finding vulnerabilities and helping the open source community get them resolved for everybody to benefit from around the world.

Speaker 1

欢迎收听谷歌DeepMind播客。我是汉娜·弗莱教授。如今，网络攻击从未如此容易。从足以欺骗你家人的逼真深度伪造视频，到看起来与真实邮件无异的钓鱼邮件，人工智能让这些攻击以惊人的速度规模化。

Welcome to Google DeepMind Podcast. I'm professor Hannah Fry. Now cyberattacks have never been easier. From deepfakes that are so convincing they could fool your own family, to phishing emails that look just like the real thing. AI has allowed these attacks to scale at a dizzying pace.

Speaker 1

但希望在于，推动这些攻击的同一技术或许也是阻止它们的关键。而对此战局了解最深的人，莫过于我今天的嘉宾。索尔·弗林是谷歌DeepMind的安全副总裁，也是网络安全界的传奇人物。他曾在2009年2月的极光行动期间亲历现场，当时对Gmail的大规模攻击改写了网络安全规则。如今，他再次站在前线，应对新一轮由AI驱动的网络攻击。

But there is some hope that the same technology that's fueling these attacks could also be the key to preventing them. And few people know this battle better than my guests today. Thor Flynn is VP of Security at Google DeepMind and a cybersecurity legend. He was in the room during Operation Aurora back in 02/2009, when a massive attack on Gmail rewrote the rules of cybersecurity. Today, he is on the front lines again, taking on a new wave of AI powered cyber attacks.

Speaker 1

事实上，索尔有太多要说的，太多引人入胜的见解要分享，以至于我们决定将这期播客分成上下两集。下次，我们将探讨网络犯罪中的人为因素，我们如何被恶意行为者操纵和欺骗，以及这一切在智能体AI时代如何变化。但在本集中，我们想聚焦于战斗本身，攻击者试图利用的系统漏洞，以及我们如何防御它们。非常感谢您加入我的节目，非常荣幸。

And in fact, Four had so much to say, so many totally fascinating insights to share, that we decided to make this into a podcast of two halves. Next time, we're going to be talking about the human side of cybercrime, how we can be manipulated and tricked by bad actors, and how all of that is changing with the era of agentic AI. But for this episode, we wanted to focus on the battle itself, the ways intersystems that attackers seek to exploit and what we can do to defend them. Well, thank you so much for joining me for. It's a pleasure.

Speaker 1

那么，我想先从谷歌历史上最著名的安全事件之一——极光行动开始谈起。您在那次事件中扮演了什么角色？

Now, thought I might start by talking about one of the most notable security incidents in Google's history, Operation Aurora. How do you fit into that story?

Speaker 0

是的，极光行动确实是网络安全历史上的一个重大时刻，对整个行业来说都是如此。我认为，一个国家竟会入侵私营公司，这对我们所有人来说都是一个巨大的震惊。当时的情况是，中国正在入侵谷歌，或试图入侵谷歌，作为该行动的一部分，他们还试图入侵其他一些公司，这是他们长期对西方众多机构进行间谍活动的一部分。

Yeah, well, so Operation Aurora was a huge moment in the history of cybersecurity writ large, really, for the industry as a whole. I think the idea that a nation state would compromise a private company was quite a shock to really all of us. We had essentially a case in which China was compromising Google, or attempting to compromise Google, and as part of that campaign, they actually attempted to compromise a number of other companies, and it was part of a long running process that they had to conduct espionage across a great deal of various institutions in the West.

Speaker 1

具体来说，他们是在寻找那些曾公开批评中国侵犯人权行为的人。

And specifically, they were looking for people who had been vocal against human rights abuses in China.

Speaker 0

这就是我们当时的看法。当然，在处理这类情况时，很难确定攻击者是谁或他们试图获取什么。早在最初发现攻击的时候——这是我团队负责的工作，就是发现并应对攻击——谷歌内部有许许多多的人共同参与调查事件真相。在我们弄清楚发生了什么之后，自然要将攻击者驱逐出我们的环境，而更重要的是，在随后的几年里，根据我们吸取的教训来加强环境安全。

That's that's what we believe at this point. Of course, at the time, whenever you're dealing with these situations, it's very hard to figure out who the actors are or what they're attempting to gain access to. And so, way back in the early days when we first detected the attack, which was something my team was responsible for, is sort of finding that attack and responding to it. There's many, many, many people across Google that were contributing to figuring out what happened. And then, of course, after we figured out what happened, to evict the attacker from our environment, and then, you know, maybe even more importantly over the following years, to harden our environment based on the lessons that we learned.

Speaker 0

但在当时，确实存在所谓的‘战争迷雾’，你完全不知道发生了什么，甚至对攻击的具体细节一无所知，只能通过取证来尝试理清头绪。至今我仍与谷歌许多当时在调查中起到关键作用的同事共事，比如希瑟·阿德金斯等人，他们都是这个领域绝对顶尖、令人难以置信的专家，我很幸运能与他们合作。

But definitely in the moment, have this thing called the fog of war, where you really have no idea what's going on, you really have no idea even what the bits are of the attack, and you're doing forensics to try to figure that out. And so there's still many people I work with here at Google that were instrumental in the determining of that, Heather Adkins and others that are just absolute, unbelievable experts in the subject matter that I'm lucky to work with.

Speaker 1

那就带我回到那个时候。你最初是什么时候意识到出事了？第一次检测到异常是什么时候？

Just take me back to that time then. When did you first realise that something was up? When was the first moment of detection?

Speaker 0

那时候这几乎成了个经典现象，可能现在也是。你几乎可以确定，当你计划了一个重要的圣诞假期时，网络攻击很可能就在那时曝光。我记得是在12月，所有细节开始浮出水面，我们很多人都不分昼夜地工作，整个假期乃至随后几个月都在努力查明真相，试图拼凑起拼图，而你甚至不知道拼图的全貌是什么。所以你面对的就是一堆零散的技术数据片段。

Back then, it was sort of famous. Maybe it still is. You can almost guarantee that when you have a big Christmas vacation planned, most likely that's when the cyber attack is going to come to light. And so it was in December, I remember, when all the details started to come out, and many of us worked tirelessly, really over the break, but also for months, trying to ascertain what happened and trying to put the puzzle pieces together when you don't even really know what the puzzle looks like. So you're faced with this picture of just bits and pieces of technical data.

Speaker 1

我想，那年你们恐怕没过成圣诞节假期吧。

I mean, presumably you didn't get Christmas holiday that year.

Speaker 0

没有。确实没有。我们谁都没过成。

No. No. None of us did.

Speaker 1

嗯，你就是和谢尔盖·布林一起每天在房间里待了十五个小时。

Well, you just sat in a room with Sergey Brin for, fifteen hours a day.

Speaker 0

那天我们在那个房间里待了很长时间，人很多。回想起来，现在想起来，我仍然会感到一阵心慌。但他们

There was a lot of us in that room for a lot of hours in the day. And, you know, when we think about it, we look back on it. You know, I still get a pit in my stomach, think. But they

Speaker 1

不过，当时压力有多大呢？我对此很感兴趣。我的意思是，一方面这感觉像是一个技术挑战，但另一方面也涉及到人为因素，对吧？

Well, how stressful was it, though? I am intrigued by this. I mean, sort of because the thing is, is on the one level, this feels like quite a technical challenge. And I mean, there is a human element to this too, right?

Speaker 0

这么说吧，像我这样毕生致力于保护他人的人，我知道我所有的同事也是如此，这感觉就像是一种失败，对吧？至少对我来说，压力的来源就在于你觉得让人民失望了。我一生都在保护人们的数据、他们的账户、公司的网络，全是为了让人们的日常生活尽可能美好。这当然与谷歌服务在很多方面都有交集，无论你是使用手机、浏览器还是搜索。我们对此非常重视。

Well, I mean, look, I'll say I'll say those of us that have pledged our life to defending people, as I have, and I know all the people I work with have, it feels like a failure, right? And that's where, at least for me, the source of stress comes from, is you feel like you've let people down. I mean, I spent my entire life to protect people's data, protect their accounts, protect companies' networks, all in service to helping people's daily lives be as great as it can be. You know, that intersects, of course, with Google services in a bunch of different ways, whether you're using a phone or a browser or even search. And we take that super seriously.

Speaker 1

他们是怎么入侵的？你们现在知道了吗？

How did they get in? Do you know now?

Speaker 0

是的，我们现在知道了。你知道，那是安全领域的早期阶段，但从某种意义上说，有些东西从未改变。那是在Internet Explorer时代——不知道听众们还记不记得，但那是微软当时的主流浏览器——漏洞存在于浏览器中，是通过对一名谷歌员工的钓鱼攻击实现的。

Yes. We do know now. You know, this was an earlier era in security, but in some sense, some things have never changed. This was back when Internet Explorer I don't know if any of the listeners remember that, but that was a big browser back then by Microsoft, and it was a vulnerability in the browser that was conducted via a phishing attack against somebody who was employed at Google.

Speaker 1

基本上就是有人点击了什么东西。

Someone clicked on something, basically.

Speaker 0

没错，正是如此。所以钓鱼攻击至今仍然是一个——如果不是比以往更大的威胁的话——但当时的浏览器利用还处于早期阶段。我们称之为客户端攻击，基本上就是利用弱点。请允许我稍微离题说一句：在安全的最早期，攻击者入侵系统的主要方式是通过服务器攻击。

Yeah, exactly. And so phishing is still very much a, if not a bigger threat today than it's ever been, but the browser exploitation was still in its earlier days back then. We called it client side attacks, basically people leveraging weaknesses. Let me just say one brief thing as a digression. In the earliest days of security, most of the ways that attackers broke into systems were through server attacks.

Speaker 0

对吧？它们会出现在互联网上。你会有一个银行或类似的东西，你会有一个大型主机或某个大网站，人们会通过该网站的前门进行攻击。因此，奥罗拉攻击就是这种演变的一个例子，它再也没有回到我们所谓的客户端攻击。所以攻击转向了组织的最薄弱环节，通常是用户。

Right? They would be on the Internet. You would have a bank or something like that, and you would have a big mainframe or some big website, and people would attack through the front door of that website. And so it was really this Aurora attack as an example of that evolution that really has never changed back to what we call client side attacks. And so the attacks shifted to be against the weakest link of organizations, which are often the users.

Speaker 0

因此，他们会通过社会工程学利用用户，例如对他们的密码进行钓鱼攻击，我相信你在生活中也遇到过需要定期更换密码的情况，但同时也利用了在笔记本电脑或台式机上运行的东西，而不是在服务器端。这是行业中的一个巨大变化的一部分，历史上我们建立了一种类似护城河和吊桥的安全模型，我们建造了这些巨大的防火墙城堡墙，把所有人和服务器都放在里面，这基本上是公司安全运作的普遍智慧。对吧？而这个模型中有很多弱点在演变。例如，我们意识到员工不再总是坐在同一栋大楼里，移动性变得越来越普遍。

And so they would exploit the users through social engineering, do phishing attacks against their passwords, for example, something I'm sure you've had to deal with in rotating your passwords in your own life, but also taking advantage of things that were running on the laptop or on the desktop computer, not on the server side. And it was part of a huge change in the industry where we had sort of historically built this sort of moat and drawbridge model for security, where we'd built these big castle walls with these big firewalls and had all the people and the servers inside, and that was essentially the common wisdom for how the security of companies would work. Right? And there's a whole bunch of weaknesses in that model that evolves. For example, we realized that employees weren't always sitting in the same building anymore, that mobility became more and more pervasive.

Speaker 0

对吧？首先是笔记本电脑的兴起，然后是个人手机的兴起。智能手机打破了那个模式的一个轴，但被打破的另一个轴是客户端攻击，因为人们不再试图攻击这些防御严密的服务器，比如大城堡墙后面的城堡。相反，他们攻击客户端，客户端有几个弱点。第一，所有客户端软件现在都像我们在服务器端做的那样被加固了，所以那里有一个很大的攻击面。

Right? First, the rise of the laptop, and then the rise of the personal phone. The smartphone broke that one axis of the mold, but then the other axis of the mold that got broken was client side attacks, because people were no longer trying to attack these very well defended servers, such as the castle behind the big castle wall. Instead, they were attacking the client, which had several weaknesses. One, all the client side software had now been hardened like that we'd done on the server side, so there was a big attack surface there.

Speaker 0

但此外，人类因素通过社会工程学在客户端更容易被利用。因此，这也导致了奥罗拉之后我们创建的一个演变，称为BeyondCorp，在行业中也称为零信任，这是一种全新的重新思考企业安全应该如何运作的方式，远离这种护城河和吊桥模型，而是承认客户端和用户作为需要防御的最高优先级的重要性。

But also, the human element was much more easy to exploit using social engineering on the client. And so that also led to an evolution post Aurora that we created called BeyondCorp, that's also known in the industry as Zero Trust, which is a whole new way of rethinking how enterprise security should work away from this moat and drawbridge model and instead sort of acknowledging the importance of the client and the user as the supreme thing to defend.

Speaker 1

就像，你几乎假设 perpetrator 已经 infiltrated 了网络，然后是如何阻止他们，减轻他们一旦得手可能造成的潜在损害。

Like, you've almost assumed that a perpetrator has infiltrated the network and it's sort of how to stop them, mitigate against potential damage they might do once they have.

Speaker 0

是的，这实际上是你提出的另一个元素，也就是我们安全领域所谓的假定漏洞。这也是另一个创新，因为我认为我们意识到，尽管我们开始构建的检测系统来发现这些攻击有些相当不错，包括我们为检测奥罗拉攻击而构建的那个，但你知道你无法捕捉到一切，而且随着攻击者，尤其是国家行为体，演变得更隐蔽和低于雷达的技术，即使市场上最好的检测系统也无法检测到，那么你必须采取一种双管齐下的方法。所以你不能再仅仅依赖你的检测系统向你的分析师标记这些事物，你还必须采取一个单独的步骤，即假定所有这些都失败了，并确保你正在做我们所谓的假定漏洞。所以这意味着你要做一些事情，比如威胁狩猎，例如，你假设坏人已经在你系统内部，你通过部署的任何系统都没有抓住他们，然后你去彻底搜查整个系统，找到那些已经渗透你防御的人。

Yeah, that's actually another element of it that you raise, which is this assumed breach, is what we call it in security. And that also was another innovation, because I think what we realized is that as good as the detection systems we had started to build to find these attacks were, and some of them were pretty good, including the one that we built to detect the Aurora attack, that said, you know you're not gonna get everything, and that as attackers, especially nation states, had evolved more stealthy and below the radar techniques that even the best detection systems on the market couldn't detect, then you had to sort of take a two pronged approach. So no longer could you just rely on your detection system to flag these things to your analysts, you had to also take a separate step, which is assume that all those things failed and make sure that you were doing what we call assume breach. And so that means that you do things like threat hunts, for example, where you assume that you have the bad guys already on your systems internally, that you didn't catch them by any of the systems you've deployed, and that you're gonna go and scour the the entire systems to find the people that had already penetrated your defenses.

Speaker 0

因此，这又是奥罗拉后时代出现的另一个新事物。还有很多其他东西，你知道，多因素认证令牌，谷歌至今仍然比我所知的任何其他公司做得更好，使用我们的泰坦安全密钥，对吧？所以它是一种不可钓鱼的多因素凭证。

And so, again, that was another novelty that grew up in this post Aurora era. And many other things, you know, multifactor authentication tokens that Google does better still today than any other company that I'm aware of using our Titan Security Keys, right? So it's an unphishable multifactor credential.

Speaker 1

也就是说，这不仅仅是一条可以轻易转发到另一部手机的短信。

As in, like, it's not just a text message which you could divert to another phone very easily.

Speaker 0

没错，完全正确。我们在谷歌能够做到的事情之一就是与一些合作公司共同发明了我们自己的硬件密钥，这样就不只是像你说的那样发送一条可能被别人在邮件或短信中找到并重放攻击的信息。这是一个相当不错的开端。很多公司已经部署了这种方案，总比没有好。

Right. Exactly. That's one of the things we were able to do at Google is invent our own hardware keys in partnership with some other companies we were working with, such that it's not just about sending you a message like you say that somebody can find in your email or in your text messages and replay that attack. That's a pretty good start. A lot of companies have deployed that, and it's better than nothing.

Speaker 0

但这个系统有很多弱点。如果有人入侵了电话系统，或者入侵了你的邮箱，你的账户仍然可能被接管。嗯，在谷歌，在这个事件之后，我们发明了一种不可钓鱼的多因素硬件令牌，它直接与浏览器连接，作为第二因素进行认证，而无需使用可能被攻击者窃取的字符串。

But there's a lot of weaknesses of that system. If somebody compromises the phone system, if somebody compromises your email, then you can still have your account taken over. Well, at Google, again, in the aftermath of this era, one of the things we invented was a non phishable multifactor hardware token, which connects directly to the browser and authenticates as a second factor without having some string of characters that could be stolen by an attacker.

Speaker 1

这件事还有另一个因素让它成为一个真正具有历史意义的时刻，那就是谷歌决定公开发生的事情。是的，我的意思是，他们为什么要这样做？

There's one other element of this that made it a really historic moment, which is Google's decision to go public with what had happened. Yeah. I mean, why did they do that?

Speaker 0

这里还有几个背景点我认为很有用。这类攻击在私营行业之外已经持续了一段时间。嗯。我们看到这些攻击发生在国防部，以及在军工复合体中通过网络攻击进行的间谍活动，你知道，组成这个复合体的各种公司，这些都是对谷歌攻击的前兆。所以我认为做出这个决定的部分考量是提高人们对这一持续了一段时间的事件的认识。

There's a couple other points of context here that I think are useful. These types of attacks had been going on outside of private industry for some time. Mhmm. We'd seen these attacks happening in the Department of Defense and espionage happening with cyber attacks in the military industrial complex, you know, the various companies that make that up, and that had all been precursors to the attack on Google. And so I think part of the calculus for that decision was bringing awareness to this thing that had been going on for some time.

Speaker 0

在这种情况下，它导致了行业内一系列非常积极的变化。你知道，现在很多地方都有了数据泄露披露的法律。我认为它导致了后来成为负责任的漏洞披露、最佳实践，并且总体上为安全带来了透明度，我认为这是谷歌多年来在各个领域真正带来的贡献。

And in this case, it led to a whole bunch of really positive changes in the industry. You know, data breach disclosure laws that are now on the books in a lot of places. I think it led to what then became responsible vulnerability disclosure, best practices, and and generally just bringing transparency to security overall is something Google's, I think, really brought to the table across the board for many years.

Speaker 1

在事件发生后不久，也就是2009年2月，对吧，其他公司是否迅速从谷歌发生的事情中吸取了教训？

In the immediate aftermath, so this is 02/2009, right, did other companies learn quickly from what had happened at Google?

Speaker 0

我认为意识很快就被意识到了，但进展非常缓慢的是我们安全最佳实践为应对这一风险所做的调整。总的来说，在整个行业中，采用更现代的安全方法，如多因素认证，进展极其缓慢。多因素认证现在已相当普及，但从那次事件到现在，花了十五年时间才真正成为企业能够有效部署的东西。零信任是另一个例子。

I think the awareness came quickly, but what happened very slowly was the adaptation of our security best practices to confront this risk. And in general across the industry, it's been incredibly slow to adopt the more modern approaches to security. Like multi factor authentication. Multi factor authentication is now getting fairly pervasive, but it took fifteen years from, you know, that event till now for it to really become something enterprises were able to deploy in effective ways. Zero trust is another example.

Speaker 0

我知道我在不同公司的同行们，如果你能相信的话，仍在努力使其环境现代化以适应这一新现实。尤其是，我认为政府，正如你可以想象的，那些建立了遗留环境的政府正在艰难地进行转型。所以我想我学到的教训是，不幸的是，改变根深蒂固的事物是很困难的。

I know peers of mine at various companies are still struggling, if you can believe it, with the modernization of their environments to this new reality. And especially, I would say governments, as you can imagine, that built a legacy environment are struggling to make that pivot. And so I guess the lesson I've learned is that unfortunately, it's hard to change things that are entrenched.

Speaker 1

但结果就是，这会导致一些相当重大、戏剧性的问题。我想到2014年的名人照片泄露事件（Celebgate），如果她们使用了多因素认证，这事就不会发生，对吧？

But then as a result of that, there are sort of quite big, dramatic issues that arise. I mean, I'm thinking here about Celebgate in 2014, where celebrities' photos were leaked. That wouldn't have happened had they been using multi factor authentication, right?

Speaker 0

没错。

That's right.

Speaker 1

是的。而那已经是极光行动（Operation Aurora）五年后的事了。

Yeah. And yet that was five years after Operation Aurora.

Speaker 0

极光行动，是的。所以你指出了另一点，即企业安全与消费者安全之间的差异。另一部分是消费者安全往往落后于企业安全，而企业安全本身有时也落后于最佳实践。这是一个很好的例子，我记得当时苹果甚至支持将多因素认证作为iCloud的一个可选功能，但很少有人采用。实际上，我们在消费者安全中看到，真正能推动变革的不是要求消费者改变他们的行为，而是改变默认设置。

Operation Aurora, yeah. So you're highlighting another point, which is the difference between enterprise security and consumer security. The other piece is consumer security often does lag behind enterprise security, and enterprise security itself sometimes lags behind the best practices. And so that's a good example where I think Apple even supported multifactor authentication as an opt in for iCloud at the time, if I recall, but very few people had adopted it. And really, what we see in consumer security is the thing that really moves the needle is not asking consumers to change their behavior, but changing the defaults.

Speaker 0

没错。让默认设置更安全才是真正推动人们在日常生活中变得更安全的关键。

Right. And making the defaults more secure is the real needle mover to make people in their daily lives more secure.

Speaker 1

因为公众对变革有抵触情绪。

Because the public are resistant to change.

Speaker 0

我认为这是一部分原因，或者他们只是没有受过教育，不知道应该怎么做。所以，说句公道话，我认为Android、Chrome OS、Chrome浏览器、苹果以及生态系统中的其他许多参与者实际上做了大量工作，将消费设备和消费应用的默认安全级别提升到了相当高的程度。我认为尤其是消费类移动设备在这方面是一个非常好的案例研究。相比十年前，现在你在Android或iOS上看到的默认设置已经截然不同了。很多人认为这是理所当然的，但这背后付出了很多努力。

I think that's part of it, or they're just not educated, right, on what they should be doing. And so, you know, credit where it's due, I think Android and Chrome OS and in the Chrome browser and Apple and a bunch of other players in the ecosystem have actually done a lot to increase the sort of default level of security on consumer devices and consumer applications to a pretty high degree. I think especially consumer mobile devices is a really brilliant case study in this. I think the defaults that you find now versus ten years ago on, say, Android or iOS is just dramatically different. A lot of people take that for granted, but it's a lot of hard work.

Speaker 1

嗯，如果说有很多积极的方面值得感谢，我想我也需要理解这个潜在问题的规模。在我们讨论大型语言模型如何改变游戏规则、生成式AI时代如何改变一切之前，让我先问问我们可能存在的不同脆弱点。社会工程学当然是一种，但还有哪些其他入侵方式？更技术性的系统入侵手段有哪些？

Well, if there are lots of positives to be sort of thankful for, I think I also want to understand the scale of the potential problem here. So before we get on to talking about how large language models have kind of changed the game, the era of generative AI have changed things, let me ask you about the different ways that we are potentially vulnerable. So there's social engineering, sure, but what are the other ways in? What are the more technical ways into a system?

Speaker 0

我们可以从三类安全故障的角度来思考。正如你提到的，其中之一，可能也是最常被利用的，就是社会工程学。LLM的一个有趣特点是它们具有类似人类的行为特征。实际上，你可以通过类似迷惑人类的方式来混淆LLM。但我认为另外两类是配置问题和完整性问题。

Think of it in terms of three categories of security failures, let's say. So as you mentioned, one of them, and probably the most frequently abused one, is social engineering. And one of the interesting quirks of LLMs is that they have somewhat human like behaviors. And so, in fact, you can cause an LLM to get confused through similar types of things that humans do. But I think the other two categories are issues with configuration and issues of integrity.

Speaker 0

基本上，保护系统的方式是：你需要通过配置使其安全，然后确保没有方法可以绕过该配置。我认为安全领域的几乎所有问题都归入这两类，至少是安全预防方面的所有问题。我们随便举个具体例子。访问控制，对吧？

So basically, the way to think about protecting a system is you have to configure it so that it's secure, and then you have to make sure there's not a way to bypass that configuration. And I think pretty much everything in security falls into those two categories, Everything in terms of security prevention, at least. And so let's just pick any particular example. Access control. Right?

Speaker 0

假设你在一家公司，共享一个Google文档之类的东西，你可能将其共享给整个公司的所有人。这意味着公司任何一个账户被黑的人都可以查看该文档。这就是配置问题。正确的配置意味着只有适当数量的人有权访问该文档，这应该根据文档内容的敏感程度来决定。对吧？

So if you have a company where you're sharing a Google Doc or something like that, you might have it shared with everybody at the whole company. And that means any one particular person that has an account that's hacked at the company could view that document. And so that's an issue of configuration. Getting the configuration right means having only the right number of people with access to that document that really should persuade it to the level of sensitivity of the content of the document. Right?

Speaker 0

好的，我们都知道这一点。这是基本的安全常识。那么完整性在哪里起作用呢？假设你已经很好地限制了该文档的访问权限，只让正确的人看到，但还可能发生的是，托管该文档的服务器存在漏洞，比如缺少补丁。

Okay. We all know this. This is normal security one zero one. Now, where does integrity come in? So let's say you've done a good job of getting that document locked down to the right number of people, but what could also happen is that there's a vulnerability like a patch missing on the server that's hosting that document.

Speaker 0

因此，有人可能潜在地入侵服务器，完全绕过访问控制机制。第三个问题是，拥有文档密码的人的账户被盗，然后该账户获得了文档访问权限。所以你看，我们有完整性、配置和人员这三类问题，我认为这几乎涵盖了所有类型的安全隐患。

And so somebody could potentially compromise the server, bypassing the access control situation altogether. And then the third issue is that somebody who does have access to the document's password is stolen, and then that account gets access to the document. So you see we have integrity, configuration, and people as the sort of three classes of issues that I think really is every kind of security problem.

Speaker 1

但有时候这会以有点奇怪的方式表现出来。我前阵子读到一个故事，讲的是拉斯维加斯赌场里一个鱼缸，上面装了个智能温度计。

But then sometimes that manifests in, like, slightly strange ways. I mean, read one story a little while ago about a fish tank in a Las Vegas casino that had a smart thermometer on it.

Speaker 0

是的，我记得读过这个。基本上是有个人试图进行金融欺诈，或者某种对赌场的滥用行为。他们有个鱼缸接在内部网络上。而运行鱼缸的系统，当然就像现在所有东西一样，有IP地址并且连接着互联网。

Yeah. I think I remember reading about this. It was basically somebody that was trying to cause financial fraud, I believe, or some sort of abuse of that casino. And they had had a fish tank that was on their internal network. And the system that was running the fish tank, of course, like everything these days, has an IP address and is connected to the Internet.

Speaker 0

对吧？我敢说我的烤面包机现在可能都有IP地址了。这让攻击者能够在该网络上获得立足点，然后以此为跳板攻击后台那些防御薄弱的关键系统。这是物联网系统的典型问题，其困境在于它们通常没有足够的CPU、内存和功耗预算来实施很多安全最佳实践。结果你会看到公司在这方面偷工减料，导致这些物联网系统被广泛部署却存在隐患。

Right? My toaster probably does have an IP address at this point, I'm sure. And so what that allowed the attackers to do is to gain a foothold on that network and then use that as a pivot point to attack the more sensitive systems that were undefended behind the scenes. This is a classic issue with IoT systems, where the problem with IoT is that they often don't have enough CPU, memory, and power budget in order to do a lot of the security best practices. And so you end up seeing companies that skimp on those things, and so therefore you end up with these IoT systems that are deployed somewhat pervasively.

Speaker 0

如果这再与后台系统（比如服务器，就像我们之前讨论的吊桥模式中的那种）相结合，那些系统防御薄弱且依赖网络信任——我认为这现在是安全领域的反模式——那么只要你有个东西凭借联网就获得了一定权限能与那系统交互，这通常就是灾难的配方。

And then if that's compounded with a system that's behind the scenes, like a server, like we talked about in that old model of the mode and drawbridge, that is poorly defended and relies on network trust, which I think is an anti pattern now in security. But if you have something that has, like, by virtue of being on the network, gains you some amount of privilege to being able to interact with that system, then generally speaking, that's a recipe for disaster.

Speaker 1

但我觉得这确实展示了你可能遭受攻击的多种不同潜在途径。

But I mean, think this does just demonstrate the number of different potential ways that you can be vulnerable.

Speaker 0

是的，说得好。我们称之为‘防御者困境’，这是行业术语。本质上这是种不对称：防御方需要防范所有可能危害你人员或公司的途径，而攻击者实际上只需找到一条入口即可。

Yeah, it's a great point. Mean, we call this the Defender's Dilemma, is the term we use for it in the industry. It's essentially this asymmetry between the folks that have to protect against all potential ways to compromise your people or your company and an attacker that really only has to find one avenue in.

Speaker 1

我的意思是，在某种程度上，你几乎必须假设你的系统存在一些你尚未知晓的漏洞。

I mean, suppose in a way, you almost have to assume that you have some vulnerabilities that you don't yet know about on your system.

Speaker 0

是的。每个从事安全防御工作的人都有这个假设。

Yes. Everybody that does security defence has that assumption.

Speaker 1

这类漏洞有专门名称对吧？什么是零日漏洞？

Those have a name, right? What is zero day vulnerability?

Speaker 0

零日漏洞绝对是极难控制的一类漏洞。因为即使你做好了一切防护措施——给系统打补丁、设置正确的访问控制等等——零日漏洞依然能够攻破完全打好补丁的安全系统。这正是让我们大多数人夜不能寐的那类漏洞，历史上它们都极难防御。

Zero day vulnerabilities are definitely a type of vulnerability that is very difficult to control for, because those are vulnerabilities where even if you've done everything right, patching your system, putting your access controls in the right place, and so on and so forth, a zero day vulnerability is something that can compromise a fully patched, secure system. And so those are the class of vulnerabilities that when most of us are laying awake at nights, those are the ones that we're often most scared of because historically, it's challenging to defend against those.

Speaker 1

就是那些你根本不知道存在的漏洞。

The vulnerabilities you don't know are there.

Speaker 0

没错。要知道代码非常复杂，我们日常依赖的这些系统都是由数百万行复杂代码构建的。虽然漏洞数量可能是有限的，但这个数字非常庞大，其中很多漏洞从未被发现过。因此始终存在这种潜在风险：代码中可能隐藏着前所未见的漏洞。

Yeah, exactly. And, you know, code is complex, right? It's important to remember, these systems that we all depend on in our daily lives are built on millions of lines of very complicated code. And so, while the number of vulnerabilities is probably finite, it's also a very big number, and many of them have never been discovered before. And so, there's always this latent risk of code that might have a vulnerability in it that is never seen before.

Speaker 0

说个让你稍微安心的事：安全领域有个'纵深防御'概念。通过构建多层防护系统，确保任何单一漏洞（无论是假设性还是实际存在的）都不会导致整个系统崩溃。以零日漏洞为例，现代操作系统设有多重防御层。即便发现漏洞，如今由于底层操作系统添加的安全保护措施，实际利用这些漏洞变得极其困难。

You know, just to make you feel slightly better, you know, we have this concept in security called defense in-depth. The idea is that you build systems such that any one particular flaw, hypothetical or otherwise, doesn't lead to a catastrophic failure of the whole system. So let's talk about a zero day vulnerability in system. There's all these layers of defense in modern operating systems. And so even if there's a vulnerability discovered, these days, it's very hard to exploit those vulnerabilities because of all these added security protections in the underlying operating system that make it very difficult to, even if there's a vulnerability you can discover, to actually exploit it in real life.

Speaker 0

你需要学习各种技巧，比如利用漏洞、在内存中放置特定数量的字节，然后使操作系统跳转到该内存执行。这些是不同类型的溢出攻击等等。现代内核和操作系统内置了许多内存安全功能，试图防范这些攻击。此外，当你从宏观角度思考一家大公司时，一个零日漏洞可能会危及你的手机或浏览器。我们还讨论了一个概念——希望我没有引入太多新概念——我们称之为杀伤链。

And there's all this trickery you have to learn how to do, such as exploiting the vulnerability, landing a certain amount of bytes in the memory, and then causing the operating system to jump over to that memory. These are different kinds of overflows and so on. There's all these memory safety features that have been built into modern kernels and modern operating systems to try to protect against these things. Moreover, when you zoom out and you think about a larger company, a zero day vulnerability could perhaps compromise your phone or your browser. But the other thing that we talk about and hopefully I'm not introducing too many concepts that are novel but we talk about this concept called the kill chain.

Speaker 0

杀伤链是我们从军事领域借鉴的概念，现已成为网络安全领域的固定术语。基本思想是，我们一直与防御者的困境作斗争：我们必须阻止攻击者的所有可能途径，而他们只需找到一种方法即可。杀伤链让我们能够以不同方式思考问题：虽然攻击者只需找到一种入口，但他们仍然必须经历一系列可预测的阶段。这些阶段包括侦察、漏洞利用的投递、利用后的活动，以及在网络环境中横向移动等等。

And the kill chain is a concept we borrowed from the military and now is a sort of fixture of cybersecurity. And basically, the idea was, you know, we had been struggling against this Defenders' Dilemma that we have to stop every possible avenue for an attacker, but they only have to find one way in. And what the kill chain allowed us to think about is to think about the problem differently. Like, that's true, but even though they have to only find one way in, they still have to go through this series of stages that you understand what they're going to be. And it's things like reconnaissance, delivery of the exploit, post exploit, sort of moving around the environment in the network, and so on and so forth.

Speaker 0

因此，我认为这实际上重新赋予了防御者力量，在一定程度上重新平衡了双方的力量对比。因为当你纵观整个公司时，显然，仅仅钓鱼攻击一名员工是不够的，因为攻击目标通常位于系统深处，该员工可能无法访问。在钓鱼攻击员工后，攻击还必须经历一系列阶段：他们在员工笔记本电脑上获得代码执行权限，试图在环境中扩散，尝试确定要访问的服务器或代码库。这些阶段提供了众多检测机会、设置绊网、实施深度防御，以阻止后续攻击阶段。现在，整个公司都成为你的战场，你可以部署预防和检测技术， inevitably 阻止、检测并减缓攻击者的行动。

And so this actually, I think, re empowered the defenders and sort of re tilted the scales and balanced them a little bit more. Because when you zoom out and you look at a whole company, obviously if they just phish one employee, that's not good enough, because typically they have a target that's deep inside a system that that employee might not necessarily have access to, and so there's this whole series of stages the attack has to go through after they've phished that employee, they've gotten code execution on their laptop, now they're trying to spread out through the environment, now they're trying to figure out what server they're trying to access, or what code base they're trying to access, and so there's all these opportunities to detect, to set tripwires, to have defense in-depth, try to block those additional stages. And the whole entire company now is your field of battle in which you can deploy prevention and detection technologies that inevitably can stop and detect and slow the attacker down.

Speaker 1

所以，尽管他们可能有多个攻击区域，但你现在也有了多个防御区域？

So even though they may have multiple areas of attack, you've now got multiple areas of defence?

Speaker 0

正是如此。现在整个公司都可以成为你的防御战场。所以，

Exactly. The whole company now can become your field of defence. So,

Speaker 1

好的，那么大语言模型如何改变这种情况？有若干

okay, how do large language models change this situation? A number

Speaker 0

种方式。

of ways.

Speaker 1

我是说，有没有新的漏洞？系统有没有新的失效方式？

I mean, there's like new are there new vulnerabilities? Are there new ways in which the systems can fail?

Speaker 0

是的，随着大型语言模型的出现，涌现出许多有趣的新事物，我认为这对防御者和攻击者都是如此。但在深入之前，我想先奠定一个基础。我认为我们行业仍在努力应对的一个问题是，传统计算系统本质上是确定性的。从安全角度来看，大型语言模型的不同之处在于它们通常是非确定性的。很多时候，你可以向大型语言模型输入相同的提示，它会根据随机因素给出不同的答案，对吧？

Yeah, so there's a whole bunch of interesting new things that come out of the advent of large language models, I think both for defenders and for attackers. But before I get to that, there's one other thing I'd like to start as a foundation. I think one of the things that we're still wrestling with as an industry is that ultimately traditional computing systems are deterministic. I'd say the fundamental thing from a security point of view about LLMs that's different is that they're generally nondeterministic. Oftentimes, you can give the same prompt to a large language model, and it will give you different answers depending on random things, right?

Speaker 0

当它追踪那些标记的路径并通过模型的‘大脑’生成这些标记时，你会得到非确定性的答案。我们会详细讨论，但我认为值得从这一点开始，因为这对我们防御者和安全领域来说是一个相当大的突破。我认为我们中的一些人仍在努力适应这种差异。现在，就攻击而言，我认为大型语言模型在某些方面仍然是新的。显然，它们已经存在多年了，但从安全角度来看，我认为我们仍在试图了解风险格局会是什么样子。

As it's tracing those paths of those tokens and producing those tokens through the model's sort of brain, if you will, you'll get nondeterministic answers. And so we'll get into the details, but I just think it's worth starting with that as that's a pretty big break from the past for us defenders and security. And I think some of us are still kind of wrestling with that difference. Now, terms of attacks, I think large language models are still new in some ways. I mean, obviously, they've been around for a number of years, but I think from a security point of view, I think we're all still trying to learn what the risk landscape is going to look like.

Speaker 1

攻击者能否使用大型语言模型来创建恶意软件？

Could attackers use large language models to create malware?

Speaker 0

初步迹象表明，攻击者开始摸索如何使用大型语言模型创建恶意软件。我应该说的是，我们与威胁情报团队密切合作，仔细检查恶意行为者的活动，并定期发布报告。实际上，我们在1月份发布了一份相当详尽的报告，涵盖了所有不同的国家威胁行为体以及他们如何使用Gemini。但我们已经看到了野外的攻击，以及在实验室中构建的使用AI和LLMs作为恶意软件攻击链一部分的原型攻击。其中一个例子是，我们开始看到人们使用LLMs来实现多态性。

So there's initial signs that attackers are starting to figure out how to use large language models to create malware. We work, I should say, closely with our threat intelligence teams to carefully examine what the bad actors are doing, and we put out periodic reports. We actually released a pretty exhaustive report of all the different nation state threat actors and how they were using Gemini in January. But we have seen attacks in the wild already and prototype attacks that have been built in the lab that use AI and LLMs to be part of the malware attack chains. So one example of that is we're starting to see people use LLMs for the polymorphism.

Speaker 0

让我解释一下这是什么意思。创建恶意软件的一个问题是，它经常会被标记——人们过去称之为杀毒软件，现在叫EDR。但本质上，这些系统运行在你的笔记本电脑上，寻找恶意代码。因此，恶意软件作者面临的一个问题是，他们希望确保他们的东西不会被现代杀毒引擎标记并被删除或禁用。

So let me explain what that is. One of the problems with creating malware is that oftentimes it can get flagged by I mean, people used to call them antiviruses. Now they're called EDR. But essentially, the systems that are running on your laptop that are looking for malicious code. And so one of the problems that malware authors face is they want to make sure that their stuff can't get flagged by a modern antivirus engine and be deleted or disabled.

Speaker 0

历史上，解决这个问题的办法是让某种东西在你的电脑上出现，作为完全全新的、从未被任何人见过的东西。这意味着你需要为那次攻击的精确实例定制创建一些东西。这在历史上是昂贵的，但不幸的是，我们开始看到大型语言模型在帮助制作定制恶意软件并使其具有多态性，或者至少能够在它们植入的每个系统上保持独特性方面越来越有用。这是一个例子。我们已经看到了这方面的原型。

And so the solution historically to that has been to have something that shows up on your computer as something completely brand new that's never been seen before by anybody out there before. And so what that entails is you creating something custom crafted for that exact instance of that attack. Now that's been expensive historically, but unfortunately, we're starting to see that large language models are increasingly useful for helping craft bespoke malware and having them be polymorphic, or at least being able to have them be unique on every system that they're planted on. So that's one example. We've seen prototypes of this.

Speaker 0

我还不确定我们是否已经见过那种性质的实际攻击案例，但我确实看到了许多有趣的不同实验，我认为这类真实世界的攻击很可能即将发生。

I don't know that we've seen an in the wild attack yet of that nature, but I've definitely seen a bunch of different interesting experiments out there, and I think real world attacks of that nature are probably imminent.

Speaker 1

我也在思考新的漏洞问题，现在大型语言模型某种程度上成了系统的入口点。是的，因为提示词注入是另一种方式。

I also wonder about the new vulnerabilities where now a large language model is sort of your entry point into a system. Yes. Because prompt injections is another way.

Speaker 0

这个观点非常精彩。这回到了我刚才提到的确定性与非确定性行为的观点。提示词注入、越狱攻击这些例子表明，随着语言模型变得更智能，它们也会像人类一样容易受到某些攻击。提示词注入实际上在某种程度上混淆了模型的心理处理过程——本质上就是让模型困惑于用户指令的来源。

That is a really great point. So, this gets back to the point I made a moment ago about deterministic versus nondeterministic behavior. And prompt injection, jailbreaks, these are examples where LMs are susceptible to some of the things humans are as they become more intelligent. Prompt injection is actually, in some ways, kind of a confusion of the model's mental processing. Basically, prompt injection is, is getting confused about where the command from the user is coming from.

Speaker 0

假设你正在使用LLM，你说'请总结这个网站'。是你告诉LLM该做什么，它应该专注于你的指令。但在攻击场景中，你要求总结的网站可能是恶意的，这个网站可能会劫持LLM的思考过程并说'忽略之前给你的指令'。

So let's say you're using an LLM, and you say, Hey, summarize this website. You're the one telling the LLM what to do. It should be focusing on what you're asking it to do. But what could happen in an attack scenario is that that website you're asking it to summarize is actually malicious. And so that website might actually hijack the thought process of the LLM and say, Ignore the instructions you've previously been given.

Speaker 0

'改为执行这个其他操作'。当然，在网站总结的案例中听起来可能无关紧要，但当你考虑到未来将这些系统部署为具有越来越强独立性和工具使用能力的智能体，它们可能会接触潜在敌对内容时，这就成为了一个越来越重要的问题——这些系统的可信度究竟如何。因此，提示词注入确实是我花大量时间持续改进Gemini防御机制的领域之一，我认为业界所有人都在努力提升针对这类攻击的防御能力。

Do this other thing instead. And of course, it sort of sounds trivial in a case of summarizing a website. I mean, who cares, right? But as you think about the future of deploying these things as agentic systems that have increasing levels of independence and increasing tool use where they're engaging with potentially hostile content, this becomes a bigger and bigger issue as to how trustworthy these systems can be. And so, yeah, I would say prompt injection is definitely one of the things that I am spending a lot of my time continuing to improve Gemini's defense of, and I think all of us in the industry are working on improving defences against this class of attack.

Speaker 1

我想到你刚才提到的恶意网站。因为大型语言模型通过阅读互联网进行训练，是否存在数据投毒这样的潜在漏洞？

I'm thinking about you talking about malicious websites there. Because is there also a potential vulnerability that large language models are reading the internet for their training. Is there something about data poisoning as well that could go on here?

Speaker 0

是的，存在多种令人担忧的数据投毒攻击类型。一个是用于训练模型的预训练数据海量性本身就是一个潜在风险。不过实践中有个缓解因素：预训练阶段输入大型语言模型的数据通常非常庞大，任何单一数据单元通常不会在模型输出中被过度体现。假设整个互联网中只有几个恶意网站，很难让一个基于数万亿token训练的模型被这么小的数据成分所破坏。

Yeah, there's a number of different types of data poisoning attacks that are potentially concerning. One is just sort of the mass of pre training data that's used to train these models is definitely a potential risk. I think one thing that mitigates that risk a bit in practice is that the data that's generally fed into a large language model in pre training tends to be so expansive that any one unit of that data generally isn't overrepresented in the outcome of the model. You know, let's say you have a couple of websites in the entire Internet that are malicious. It's very difficult to get the model that's trained on trillions of tokens or whatever the number is to be compromised by that very small component of the data input.

Speaker 0

现在，在训练后数据阶段更令人担忧一些，因为通常用于训练后的数据集较小，因此存在一些我在文献和学术研究中看到的潜在场景，表明训练后数据可能对训练大型语言模型具有恶意行为的风险。一旦你对模型进行了预训练和训练后处理，在部署模型时还需要考虑一类有趣的风险。假设你完成了这些模型权重的训练，然后将其发布在开源仓库中，你就拥有了一个开放权重的模型。然而，也有论文和攻击案例表明，攻击者实际上可以恶意操纵磁盘上的最终模型。

Now, it's a little bit more concerning in the post training data because there generally is a smaller set of data that's used for post training, and so there are potential scenarios that I've seen literature on and academic work on that indicates that post training data could be potentially risky for training a large language model to be malicious. Once you pre and post train a model, there's also an interesting class of risk that you have to think about when you're serving the model. So let's say you have these model weights that you've finished doing training on, and then let's say you put that out in an open source repository. You have an open weights model. Well, there's also papers and attacks where you can actually have an attacker maliciously manipulate that finalized model on disk.

Speaker 0

所以，你知道，这就像是对实际模型本身的热补丁或随便你怎么称呼它。如果你不验证下载的模型与最初生成的模型是否相同，那么也存在那种我们都担心的攻击方式。

And so, you know, that's like a a hot patch or whatever you wanna call it to the actual model itself. And if you don't validate that the model you're downloading is the same as the one that was produced in the first place, there's also an attack of that variety that we all worry about as well.

Speaker 1

好的。那么，面对攻击者可能通过这些方式入侵的所有潜在途径，你们是如何为Gemini防止这种情况发生的呢？

Okay. So when it comes to all of these potential ways that attackers could get in, how do you prevent this from happening for Gemini?

Speaker 0

我们的做法是首先构建一个可防御的模型。因此，我们努力提升Gemini发现并防御这些攻击的能力。在模型中，我们做了大量有趣的工作，包括提示注入防御、越狱防御，利用训练后处理、SFT（监督微调）以及模型内部的RL（强化学习），以确保我们不断提升模型对攻击的抵抗力。如果允许我稍微离题一点，DeepMind在这方面做的一项创新是，我们是所谓自适应攻击方法的先驱之一，即为了学会如何防御模型，你必须真正构建一套能代表恶意行为者可能对你采取的行动的攻击套件。

The way we do that is we sort of first start with building a defendable model. And so we're trying to improve Gemini's ability to find and defend itself against these attacks. So in the model, we do a whole bunch of interesting work, prompt injection defense, jailbreaking defense, using post training, using SFT, using RL inside the model to make sure that we are constantly improving the model's resistance to attack. And if you allow me a brief digression just on one point, one of the novel things we do at DeepMind on this is that we're one of the innovators in the so called adaptive attacks approach, which is in order to learn how to defend a model, you have to really build a good suite of attacks that are representative of what the bad actors might do to you.

Speaker 1

那么，这是否类似于你是一个祖母，正在给孙女读一个关于凝固汽油弹的故事这样的例子？

So is an example of this something like you are a grandmother reading a story to your granddaughter about napalm?

Speaker 0

是的，类似那样的东西。这样做的目的是让你能够测试一系列攻击场景，然后看看你对它们的抵抗力如何，同时生成训练数据。所以，就像你提到的，可能是一些例子，比如这是一封恶意邮件。现在我们让模型忽略指令，通过调用工具并将我的私人数据发送到某个恶意电子邮件地址来采取恶意行动，诸如此类。

Yeah. Mean, something like that. And so what that allows you to do is to test a bunch of attack scenarios and then see how resistant you are against them, and also to generate training data. And so that might be to your point, it might be examples like, here's a malicious email. Now we're causing the model to ignore the instructions and to take a malicious action by calling a tool and sending my private data off to some bad email address, something like that.

Speaker 0

对吧？这些都是构建在该框架中的所有内容。但除此之外，我们还拥有这些自适应攻击，使我们能够通过这种学习过程反复不断地攻击模型，直到我们获胜。我们有多种不同的算法，并就此撰写了一篇论文。

Right? That's all the stuff that's built into that framework. But what we do beyond that, we have these adaptive attacks that allow us to constantly hit the model over and over again using this sort of learning process until we win. Have a number of different algorithms. We've written a paper on it.

Speaker 0

我鼓励你去阅读它。但这确实是一种巧妙的方法，为我们所要防御的内容设定了一个更高的标准，而不是依赖于一份可能无法代表攻击方式演变的静态预设攻击列表。因此，模型本身需要具备强大的防御能力，但在模型周围，同样地，防御和纵深，你还需要有其他层次的防御。我们做了大量工作来确保模型的安全，但我们也做了诸如部署非常智能的分类器在模型周围的事情，这些分类器也会观察并标记这类行为，以便我们也能防御它们。分类器的好处在于，它们既能增强模型自身的防御能力，又能更快速地应对新型攻击的演变。

I encourage you to read it. But it's really a clever way to set a higher bar for what we're defending against than a static list of canned attacks that may or may not represent how attacks are evolving. So the model itself needs to be able to have strong defenses, but around the model, also, again, defense and depth, you want to have other layers of defense. We do a bunch of work to make the model secure, but we also do things like really intelligent classifiers that we put around the model that also look and flag for these sorts of behavior so that we can defend them too. And what's nice about classifiers is that they both augment the model's own ability to defend itself, but also they allow more rapid evolution against novel attacks.

Speaker 0

因此，构建一个新的分类器并将其部署出去，比从头开始重新训练整个模型要轻量得多。

And so it's a much lighter weight to build a new classifier and get it out there than it is to retrain an entire model from the ground up.

Speaker 1

那么，如果这些是攻击方式，我们也来谈谈人工智能如何被用来预防它们。告诉我关于 Big Sleep 的情况。

So if these are the attacks, let's also talk about the ways that AI can be used to prevent them. Tell me about Big Sleep.

Speaker 0

是的，所以 Big Sleep

Yeah, so Big Sleep

Speaker 1

它是来自……有人刚才在来的路上告诉我，好像有个小憩之类的。

Did it come from someone told me this just on the way in, that there was like a little nap or something.

Speaker 0

我有个午睡时间。

I had a nap time.

Speaker 1

就是这样。

That's it.

Speaker 0

是的。在我正式参与这个项目之前，有一个名为“Project Nap Time”（小憩计划）的研究项目，你现在仍然可以在谷歌博客上零星看到相关提及。那是最初的前身。我被告知，这个命名惯例来源于一个想法：安全漏洞研究人员——也就是那些发现新型漏洞的人，我们稍后会详细讨论——可以利用这个系统打个盹，因为他们可以让AI替他们完成所有工作。

Yeah. So before I got involved in the project formally, there was a research project called Project Nap Time, which you can still find mention of on Google blogs here and there. And that was the original precursor. And I'm told that the naming convention comes from the idea that security vulnerability researchers, who are the people that find new types of vulnerabilities, which we'll get into in a moment, could use this system to take a nap, Because they could just let the AI do all the work for them.

Speaker 1

替他们搜索漏洞。

Search for vulnerabilities on their behalf.

Speaker 0

没错，在他们打盹的时候发现漏洞。这就是名字背后的想法。然后随着项目获得更多动力，它演变成了‘The Big Sleep’（长眠计划）。

Yeah, find vulnerabilities while they took a nap. So that was the idea behind the name. And then as we sort of got more momentum behind the project, it evolved into The Big Sleep.

Speaker 1

现在他们真的可以冬眠了。

Now they can actually hibernate.

Speaker 0

是的，现在他们可以整个冬天都冬眠。但这个项目本质上是一个大赌注，就像DeepMind非常自豪的那样，利用AI来发现新型漏洞。你可能会问，为什么这是一件好事？对吧？因为我以为我们刚刚讨论过漏洞是坏的。

Yeah, now they can just hibernate all winter. But yeah, what the project essentially is, is kind of a big bet, like DeepMind is so proud of, of using AI to find novel vulnerabilities. Now, you might ask, why is that a good thing? Right? Because I thought we just discussed vulnerabilities are bad.

Speaker 0

这是一个有趣的观点。我认为我们这些在安全行业的人发现，透明度始终是安全的最佳消毒剂。我们所做的是，我们采用了最新最棒的Gemini版本，并通过一个代理框架，让它们基本上成为一个漏洞研究员，在代码中发现新的零日漏洞。这个项目的目标毫不逊色，旨在通过发现漏洞并帮助开源社区解决它们，让全世界受益，从而帮助提升整个行业的安全水平。

And it's an interesting point. I think those of us in the industry in security have found that transparency is always the best disinfectant for security, and what we've done is we've taken the latest and greatest versions of Gemini, and we're using them to, with an agentic harness, basically become a vulnerability researcher and find novel zero days in code. And the goal of the project is nothing short of helping improve the security for the whole industry by finding vulnerabilities and helping the open source community get them resolved for everybody to benefit from around the world.

Speaker 1

所以它是在寻找这些无人知晓存在的未受保护的后门。

So it's hunting for these unprotected backdoors that nobody knows is there.

Speaker 0

完全正确。是的。我们正在发现那些人们从未听说过或见过的漏洞，这些代码在很多情况下实际上是互联网大部分底层架构的基础。

That's exactly right. Yeah. We're finding the vulnerabilities that people have never heard of or seen before in code that is really underlying much large portions of the Internet in many cases.

Speaker 1

那么以前人们是怎么做的？人工搜索的方式是什么？

So how did people do it before? What was the human way of searching

Speaker 0

寻找它们？没错。嗯，存在一个漏洞黑市。它们非常昂贵，通常一次交易就价值数百万美元。如果你找到一个呢？

for them? Right. Well, there's a black market for vulnerabilities. They're very expensive, millions of dollars often at a time. If you find one?

Speaker 0

是的，如果你找到一个。我提到这个只是为了说明这些东西有多么精巧和罕见。实际上，在实践中运作的方式是，这是安全领域中唯一有点像电影情节的部分。你知道吗？就像有人穿着深色连帽衫在黑暗中，整晚盯着六个显示器，吃着麦片什么的。

Yeah, if you were to find one. And the reason I bring that up is just how exquisite these things are and how rare they were. And basically, the way it works in practice is, this is the one part of security that actually is kind of like the movies. You know? It's like somebody in a dark hoodie in the dark, staring at six monitors all night, eating Cheerios or whatever.

Speaker 0

所以这需要大量非常密集的脑力劳动，通常持续数周或数月，你拿到一段代码，试图理解它是如何工作的，你对系统中可能存在漏洞的地方做出假设，你输入可能危险的输入。这种方式的一种情况是，软件开发人员会根据他们头脑中的假设来设计系统，这些假设通常是未明说的。哦，是的，这个系统只会接收图像文件，对吧？你知道，为什么有人会发送其他东西呢？然后攻击者说，嗯，如果我在这里放一个音乐文件呢？

So it's a lot of really intensive mental work that happens often over weeks or months, where you are getting a piece of code, trying to understand how it works, you are making a hypothesis of where a vulnerability might be in that system, you're putting in inputs that are potentially dangerous. So one of the ways that this works is software developers will design a system with assumptions in their head, oftentimes unstated assumptions. Oh yeah, this system will only ever get image files, Right? And it's, you know, why would anybody ever send me something else? And then the attacker says, Well, what if I put a music file in here?

Speaker 0

对吧？或者如果我放一个我创建的、里面有一堆疯狂内容的文件呢？所以这是一种结合：尝试那些非常非正统、开发者未曾预料到的事情，然后逐步跟踪代码，看看发生了什么，试图以某种意外的方式使其崩溃。然后一旦它崩溃，你实际上必须发现它是可被利用的。因此，它不仅因为意外的输入而以某种方式崩溃，还需要以一种特殊的方式崩溃，使得你提供的输入能够控制其下的系统。

Right? Or what if I put a file that I created that has a bunch of crazy stuff in it? And so it's a combination of trying things that are really unorthodox, that were not expected by the developers, and then stepping through the code and seeing what happened, trying to cause it to break in a certain way that's unexpected. And then once it breaks, you actually have to find that it's exploitable. And so not only is it broken in a certain way with an unexpected input, but it needs to be broken in a specialized way that allows that input that you're providing to cause the system underneath it to be controlled by that input.

Speaker 0

所以本质上，整个想法是，我给它一个恶意的输入，导致它做我试图让系统做的事情，而不是它原本想做的事情。

So you're essentially, the whole idea is I'm giving it something hostile as an input that would cause it to do what I am trying to get the system to do, not what it wants to do.

Speaker 1

我是说，你能看出这简直是AI的完美应用场景，对吧？非常严谨地搜索许多以不同方式组合在一起的代码行，然后尝试多种不同选项来寻找某些东西。它某种程度上就是如此适合。

I mean, you can see how this is like the perfect situation for AI, right? Search really rigorously through many lines of code that are constructed together in different ways, and then try lots different options in order to find something. It sort of just So makes

Speaker 0

我们确实发现了这一点。我们正在开发的系统是Big Sleep。我们发现它在某些方面非常超乎人类能力，因为漏洞研究人员面临的挑战之一是要全面掌握所有这些复杂框架的百科全书式知识。

we've been finding exactly that. The system that we're working on is the Big Sleep. We've found that in some ways it's very much superhuman, because one of the challenges in vulnerability researchers is having a comprehensive encyclopedic knowledge of all these complicated frameworks.

Speaker 1

这些庞大的语言体系。

These gigantic languages.

Speaker 0

是的，这些巨大的代码库。然后，你知道，比如我看到一个Big Sleep发现的漏洞，当我阅读模型思考这个问题的过程时，很明显存在一些非常超人类的元素，因为它会提出一个假设，意识到这个框架的这个版本是这样工作的，但那个框架的第三版是那样工作的，第四版又是这样工作的，基本上对构成这个系统的所有不同库和框架有着惊人难以置信的广度理解，这对任何具体人类来说都很难在脑海中全部掌握。所以，是的，我认为我们确实在我们构建的系统中看到了令人惊讶的性能，并且我们已经用这个系统发现了多个新颖的零日漏洞。而且，如我所说，我们整个项目的目标是防御和帮助世界，因为我们知道坏人会随着时间推移使用AI来发现这些漏洞。我们希望做到最好，这样我们就能帮助开源社区和依赖这些软件的人们尽快修复这些问题。

Yeah, these huge code bases. And then, you know, like, I saw a vulnerability that we found with Big Sleep, where I was reading through the way the model was thinking through the issue, and it was clear there was elements that were very superhuman there, because it would come up with a hypothesis where it realized that, oh, this version of the framework this depends on works like this, but version three of that framework works like that, and version four works like this, essentially having this amazing, unbelievable breadth of understanding of basically all the different libraries and frameworks that would go into making this system, which is very hard for any particular human to hold in their head. And so, yeah, I think we've definitely seen surprising performance out of the system that we've built, and we've already found a number of novel Zero Days with the system. And, you know, as I say, our entire goal of this project is defense and to help the world because we know the bad guys will be using AI to find these vulnerabilities over time. We want to be the best so that we can help the open source community and those people that depend on this software to get these things fixed as quickly as possible.

Speaker 1

因为这是一个关键点，对吧？你们不只是拿Big Sleep瞄准谷歌的内部代码。你们还瞄准了各种开源软件。

Because that's a key point, right? You're not just taking big sleep and pointing it at Google in house code. You're also pointing at all sorts of open source software.

Speaker 0

正是如此。我的意思是，当然，谷歌建立在开源社区的许多优秀工作之上。我们使用的许多系统都依赖于此，所以这对谷歌也有好处。但我们明确尝试选择那些在谷歌内外广泛部署的项目，以此尝试帮助整个社区。因为我们确实担心，坦率地说，AI未来会被用来发现和利用这些漏洞。

That's exactly it. I mean, of course, Google is built on a lot of really great work by the open source community. A lot of the systems we use depend on that, so there is a benefit to Google as well. But we're explicitly trying to pick things that are widely deployed both in and outside of Google as a way to try to be helpful to the community as a whole. Because we do worry, I'd say frankly, about the future of AI being used to find and exploit these vulnerabilities.

Speaker 0

因此，我们尝试帮助的方式之一是尽可能先发现这些漏洞，并尽快帮助开源社区。

And so one of the things we're trying to do to help is to try to find the vulnerabilities first and to help the open source community as quickly as possible.

Speaker 1

为了提前防范。因为，正如你之前提到的，互联网的很大一部分运行在这种开源代码上，任何人都可以查看。几年前有一个相当普遍的观点，认为开源软件正因为这个原因更安全，因为每个人都能查看它，所以有很多双眼睛在关注漏洞。你对私有代码和开源代码之间的这种争论持什么立场？

To get ahead of it. Because, I mean, as you said, you mentioned earlier, a lot of the internet runs on this open source code, which is available to view for anybody. There was an argument, a quite prevalent argument, a couple of years ago in particular that open source software is safer for this very reason, that everybody can look at it and so you have many, many eyes who are watching for vulnerabilities. Where do you stand on that argument between sort of private code and open source code?

Speaker 0

我认为这是对的。开源代码确实允许更多的眼睛来审视。它也允许更多不同的技术以及人工智能来对系统进行分析。总的来说，正如我在讨论中多次提到的，透明度是防御者拥有的首要武器，用以对抗那些试图保密、不透明的攻击者。

I mean, think that's right. I think open source code does allow for more eyes. It does allow for more different techniques and AI as well to be run against the systems. I think in general, transparency, as I said a few times in our discussion, is the number one ingredient that defenders have on their side to help against the attackers who are the ones that are trying to hold things close to their chest and to not be transparent.

Speaker 1

那么，你能否也使用大型语言模型来构建工具修复这些漏洞，或者就如何修复这些漏洞提出建议？

Can you then also use large language models to build tools to fix those vulnerabilities or to make suggestions for how those vulnerabilities might be fixed.

Speaker 0

是的，这是一个非常好的见解，因为这显然是你接下来会遇到的问题：如果你开始扩大发现漏洞的能力，那么很容易想象，坦白说，我们自己和更广泛的社区会被大量需要修复的问题淹没。

Yeah, so this is a really good insight because that's obviously the next problem that you run into, is if you start to scale up your ability to find vulnerabilities, then it's very easy to imagine overwhelming both ourselves, frankly, and the broader community with a large volume of things that everybody has to fix.

Speaker 1

突然之间，你就有成千上万的后门需要修补。没错。而且，你知道，

Suddenly, you've got thousands of backdoors that need patching. Exactly. And, you know,

Speaker 0

说实话，很多这些开源维护者都是尽力而为的志愿者。他们是我们共同依赖的IT世界大部分领域的真正动手英雄。因此，我们希望让这个社区尽可能容易地吸收我们认为在这个AI新世界中可能大量出现的更改。所以我们开始的第二个大项目是一个叫做MENDER的项目。它还处于相当早期的阶段，但我们对进展感到非常兴奋。

to be honest, a lot of these open source maintainers are volunteers that are doing their best. They're really hands on heroes of large portions of the IT world that we depend on together. And so we want to make it as easy as possible for that community to absorb what we think is a high, potentially high volume of changes in this new world of AI. And so the second big project that we've started is a project we call MENDER. It's pretty early stage, but we're really excited about the progress.

Speaker 0

它是一个旨在根据我们发现的漏洞自动生成补丁的系统。这还有几个其他方面，对吧？一是你要确保修复的方式不会破坏重要的功能，以及其他所有人依赖的一切。你显然要确保它从根本上修复了安全问题，而不是仅仅做一些表面功夫。

And what it is is a system that's designed to automatically generate patches based on a vulnerability that we've discovered. There's a couple other dimensions of that, right? So one is that you want to make sure that you're fixing it in a way that doesn't break important functionality Everything else. That everybody else depends on. You obviously want to make sure it fixes the security issue fundamentally, not just some cosmetic way.

Speaker 0

此外，还有一个问题是确保你维护开源维护者偏好的编码习惯，然后我们使用一系列验证器来验证我们输出的代码质量足够好，可以提交。我们如何做到这一点呢？有几种技术。我们使用大语言模型（LLMs）来判断输出代码是否良好。我们还采用了一些形式化方法、概念以及其他技术，以确保我们产出的内容足够优秀。

Then there's also the issue of making sure that you maintain coding idioms that the open source maintainers prefer, and then have a series of things that we use as validators to validate that the code that we're producing on the output is good enough to submit. And how do we do that? Well, there's a couple of techniques. We're using LLMs to ascertain whether or not the output code is good. We're using some formal methods, concepts, and a number of other technologies to essentially make sure that the thing that we've produced is good enough.

Speaker 0

当然，至少目前，我们还会继续进行人工审核，以确保在我们与开源社区以及我们自己的系统达到一定的信任水平之前，不会向社区提交一堆乱七八糟的东西。希望最终我们能实现整个系统的端到端全自动化。

And then of course, for now at least, we'll continue to have human review just to make sure that we're not throwing a bunch of crazy stuff over the fence to the open source community until we get to a certain comfort level with them and with our own system. And then we'll hopefully get to the point where we can fully automate the whole system end to end.

Speaker 1

我的意思是，这非常复杂，对吧？简直复杂得惊人。先不说发现漏洞本身，实际上修复漏洞并不是一贴万能膏药就能解决的。漏洞所在的复杂性、细微差别，以及漏洞如何与系统的其他部分连接，漏洞所使用的语言——

I mean, is phenomenally complicated, right? Like, phenomenally complicated. Forget about finding the vulnerabilities in the first place. Actually fixing it is not like there's one sticking plaster fits all. The intricacies and the nuances of where that vulnerability lies, how that vulnerability plugs into other parts of the system, the language the vulnerability

Speaker 0

是的，没错，说得太对了。好消息是，我们现在有了能够生成代码的语言模型，实际上它们也能生成漏洞补丁。问题在于我们需要在系统中建立制衡机制，确保产出的内容质量非常高。要知道，仅靠大语言模型自行其是，至少目前我们还没法做到每次都能一次性生成完美的补丁。也许将来能做到，那将是非常美好的一天。

is Yeah, no, mean, is such a great point. I mean, the good news is we now have language models that can produce code, and indeed, they can produce patches to vulnerabilities. The problem is that we need to build a check and balance into the system to make sure that what's being produced is really, really high quality. And, you know, LLMs alone, left to their own devices, at least we haven't figured out yet how that alone could just one shot a perfectly good patch every time. Maybe we'll get there, and that'll be a wonderful day.

Speaker 0

但目前来看，我认为构建一组候选补丁，然后通过一系列验证器进行筛选，是合适的技术组合。

But for now, think building a set of candidate patches and then having a pass with a number of suite of validators is the right technology combination.

Speaker 1

因为这样一来，我猜会有层层叠叠的问题，因为你可能还得开始担心这个工具本身的安全性，对吧？比如有人入侵了这个工具，然后所有它贴的‘膏药’——

Because then I guess, layers and layers of this, because you would then also have to start potentially worrying about the security of that tool itself, right? Like someone gets into the tool and then all the plasters that it's

Speaker 0

贴得到处都是——这实际上又是另一个非常好的观点。如何端到端地解决一个问题？我认为一方面是我们发现漏洞，并希望能够自动为每一个漏洞生成补丁，理想情况下是以一种非常出色的方式，让任何人都无需担心这些补丁的质量。这样它们就能顺利融入。但你提出的最后一点也非常重要，即使你已经修补并修复了所有遗留代码，越来越多的人正在依赖大语言模型来生成项目。

sticking This all over the is actually another really good point. How do you solve a problem end to end? I think one is that we find vulnerabilities, and you want to be able to automatically generate a patch for each and every one of those, ideally in such a great way that nobody has to worry about the quality of those patches. And so we can just have those sail right in. But the final point you raise is such a good one as well, in that even if you've patched and fixed all the legacy code, more and more people are depending on large language models to generate projects.

Speaker 0

你知道，现在大家都在某种程度上进行氛围编程，甚至不完全是氛围编程。很多工程师都在使用大型语言模型，这是应该的，因为它们是非常高效的工具。但你怎么知道那个大型语言模型生成的代码是安全的呢？这很难判断。所以我们正在进行的另一个项目就是确保我们不仅教会Gemini如何编写优秀的代码，还要教会它以安全的方式编写。

You know, everybody's sort of vibe coding or even not vibe coding. A lot of engineers now are using large language models as they should because they're very productive tools. But how do you know that that large language model is producing secure code? You know, it's very difficult to tell. And so that's another project that we're working on, is making sure that we're teaching Gemini not just how to create great code, but how to do it in a secure way.

Speaker 0

我们认为这三件事结合起来，能够真正显著提升全球软件安全的质量。

We think those three things together can really make a major dent in the quality of software security for the world.

Speaker 1

那么你认为这是一个真正的雄心吗？理论上说，你能找到并修补地球上所有代码中的漏洞吗？

Do you think this is a real ambition then, that you could theoretically find and patch every vulnerability in CodeOn Earth?

Speaker 0

这正是我想要做的。但我认为我们面临的问题超出了技术层面，还涉及到如何确保全球所有人都将这些补丁应用到实际系统中。这是一个涉及人类的问题，涉及到风险规避等问题。所以，虽然我认为这是解决所有技术挑战的一个良好起点，但我仍然认为还有其他组织和人类层面的问题需要我们去应对，以确保我们不仅能生成优秀的安全补丁，还能让这些补丁真正应用到现实世界的系统中。

That's what I want to do. And I think that there are issues that we have to contend with that go beyond the technical, that involve how do we make sure everybody in the world is applying these patches to the real systems. And so that's kind of a human issue, and it comes into issues like risk aversion. So while I think this is a good starting point to handle all the technical challenges, I still think there's other organizational and human problems that we would have to contend with to really make sure not only do we generate great security in patches, but get those actually applied to real life systems.

Speaker 1

这些就是我想在这个两集播客的第二部分问你的问题。但在我们结束这一部分之前，确实感觉谷歌在这个领域所做的，至少在技术方面，与其他公司截然不同。这是你的看法吗？

So those are the questions that I want to ask you for the second part of this two part podcast. But just before we wrap up on this part, it does feel as though what Google is doing in this space, at least on the technical side, is fundamentally different from what we're seeing from other companies. Is that your view?

Speaker 0

我认为是的。当然，其中一些部分其他人也在做。但我们认为，凭借我们所拥有的所有数据、多年来在谷歌积累的所有代码，以及我们拥有世界上最好的工程和安全人才这一事实，这给了我们一个独特的优势。我很自豪能成为其中的一员。

I mean, I think it is. I mean, there's definitely pieces of this that are being done by others. But we think because of the strengths of all the data we have, all the code that we've got, that we've built up over the years at Google, and the fact that we have the best engineering and security talent in the world gives us a unique vantage point. And it's something I'm proud to be part of.

Speaker 1

绝对令人惊叹。好吧，我想你能看到这次对话还有更多内容。我知道我们才刚刚谈到精彩的部分，但我们决定将其分成两集，因为内容太多，一集放不下。所以请关注你的订阅源，等待第二部分。如果你担心可能会错过，那么现在就是你订阅我们频道的好机会，这样你就能随时知道我们何时发布新视频。

Absolutely amazing. Well, I think you can see that there is so much more to come in this conversation. And I know that we are just getting to the good stuff, but we have decided to split this across two episodes because there was just too much to fit into one. So keep an eye on your feed for part two. And if you are worried that you might miss it, well, is a great opportunity for you to subscribe to our channel so you always know when we have a new video out.

Speaker 1

嘿，顺便说一句，你还可以点赞和添加评论。虽然只是小事，但这是确保我们能继续制作这些内容的一种非常实际的方式。下次再见。

And hey, while you're at it, you might as well like and add a comment. Very small thing, but a very tangible way to make sure that we can keep making these. Until next time.