Training Data

用AI“攻克所有疾病”的探索:Isomorphic Labs的Max Jaderberg

The Quest to ‘Solve All Diseases’ with AI: Isomorphic Labs’ Max Jaderberg

双语字幕

Speaker 0

我们从公司成立第一天起就为实现这一宏大抱负而努力。这不仅仅是针对某个特定适应症或靶点开发疗法,而是思考如何用AI打造一个通用的药物设计引擎——一个不仅能应用于单一靶点或模式,还能反复应用于不同疾病领域的技术。这正是我们当前正在推进的方向。

We have set up the company from day one to really go after this big ambition. This isn't about developing therapeutics for a particular indication or particular target. It's really thinking about how we create a very general drug design engine with AI, something that we can apply to not just a single target or even a single modality, but we can apply this again and again across any different disease area. And that's what we're stepping towards at the moment.

Speaker 1

今天我们很荣幸邀请到Isomorphic Labs的首席AI官Max Jaderberg做客节目。这家从DeepMind分拆出来的公司,旨在用AI彻底改变药物发现领域。去年夏天,他们发布了AlphaFold 3,这项惊人突破不仅能模拟蛋白质结构,还能模拟所有分子及其相互作用。这项成果让德米斯·哈萨比斯赢得了去年的诺贝尔化学奖。马克斯将阐述他们对药物设计'圣杯'模型的构想,以及科学智能体的形态。他结合了开发AlphaStar和《夺旗》游戏的经验,更广泛地探讨了构建智能体与游戏的研究方向。

Today, we're excited to welcome Max Jaderberg to the show, Chief AI Officer of Isomorphic Labs, which launched out of DeepMind with a goal of revolutionizing drug discovery using AI. Last summer, they released AlphaFold 3, a stunning breakthrough that allows us to model not just the structure of proteins, but of all molecules and their interactions with each other. That led to Demis Hassabis winning the Nobel Prize in Chemistry last year. Max describes their vision for what a holy grail model for drug design and agents for science look like. He draws parallels to his experiences building AlphaStar and Capture the Flag, and the research directions of building agents and games more broadly.

Speaker 1

具体而言,面对10的60次方种可能的药物分子结构,我们需要构建既能生成又能探索整个设计空间的智能体模型。马克斯还描述了该领域的'GPT-3时刻'愿景——就像AlphaGo著名的第37手棋那样,当AI在药物设计中展现出超人类水平的创造力时,连我们自己都会为之震撼。这是我最喜欢的节目之一,请尽情享受。马克斯,非常感谢你今天来到伦敦参加我们的节目。

Specifically, with 10 to the power of 60 possible drug molecule structures, we need to build both generative models and agents that can learn how to explore and search through the whole potential design space. Max also describes his vision for what a GPT-3 moment for the field might look like, describing it as more akin to AlphaGo's famous Move 37, when we start to see things that exhibit superhuman levels of creativity in AI drug design and that stun even us humans. This is one of my favorite episodes yet. Enjoy the show. Max, thank you so much for joining us today here in London.

Speaker 0

不,能来这里是我的荣幸。是的,这太棒了。

No. It's a pleasure to be with you here. Yeah. It's fantastic.

Speaker 1

AlphaFold三号的发布和德米斯获得诺贝尔化学奖的时机也堪称完美,这真实印证了你和团队过去几年的所有努力。

Awesome timing too with the launch of AlphaFold 3 and with Demis winning the Nobel Prize in Chemistry, which is a true testament to everything that you and your team have done over the last couple of years.

Speaker 0

是的,2024年对我们来说绝对是忙碌的一年,取得了很多重大突破。诺贝尔奖更是令人难以置信,我认为这是对这项开创性工作的绝佳认可。

Yeah, 2024 was definitely a busy year for us. Lots of big breakthroughs. Nobel Prize was just incredible to see. I think amazing recognition for this seminal piece of work.

Speaker 1

好的。我想先聊聊你的个人经历。你从深度学习领域起步就有着非凡的职业生涯,在DeepMind期间发表了多篇开创性论文,包括关于《夺旗》游戏和深度学习突破的研究。能否为我们讲讲你当时在强化学习研究领域关注的关键问题?

Yes. Well, I'd love to start by talking a little bit about your own personal story. You've had an incredible career in the world of deep learning from the very start, authoring many seminal papers while at DeepMind, including for Capture the Flag and breakthroughs in the world of deep learning. Can you walk us through some of the key questions that you had in your field of research around reinforcement learning at the time?

Speaker 0

是的。在DeepMind工作时,我涉足了许多领域,包括计算机视觉和深度生成模型的早期研究,但真正让我着迷的是强化学习。当时DeepMind是全球强化研究的前沿阵地。我们思考的核心问题是:如何才能开发出能自主完成任意任务的AI?而当时的主流范式还是监督学习。

Yes. So at DeepMind, I worked on a whole host of stuff, early days of computer vision and deep generative models, but it was really reinforcement learning that ended up hooking me there. DeepMind was the place in the world to be working on reinforcement learning at that time. Really the question in our minds was, how can we actually get to a point where we could get an AI that could go off and do any task you wanted it to do? And the dominant paradigm at that point in time was supervised learning.

Speaker 0

没错。监督学习与强化学习截然不同。虽然都是学习技术,但监督学习需要预先知道问题的答案,并以此训练模型。是的。

Yeah. And supervised learning is very different from reinforcement learning. They're both learning techniques. But supervised learning, you need to know what the answer to your question is, and that's how you train the model. Yeah.

Speaker 0

在监督学习中,你需要提供示例并给出标准答案。如果你已经完全掌握了要训练AI(或神经网络)解决的问题,这种方法会很有效。

So in supervised learning, you give an example, and then you supply the model with the answer to that question. Now, that can be great if you already know everything about the problem that you're training this AI to do, this neural network to do.

Speaker 1

但多数情况下我们并不掌握这些知识。

But most times you don't.

Speaker 0

确实。世界上有太多问题的答案我们尚未知晓。当我思考如何将AI应用于现实世界时,我认为真正的突破在于:能否将AI应用到人类束手无策的领域,或是突破人类能力的极限。

Yeah. I mean, there are just so many problems in the world where we don't know what the answer is. We don't know what the solution is. And when I think about how I want AI to be applied to the world, yes, it's gonna be great to be able to apply it where we're already good as humans, but really, the big frontier is whether we can start applying AI to places where humans don't know how to do this stuff, or where there's a limit to human performance.

Speaker 0

这正是强化学习成为关键工具的原因——它不需要预先知道正确答案,只需判断模型给出的方案优劣即可,甚至能评估优劣程度。这为模型训练开辟了全新的问题领域。

And that's where reinforcement learning is one of the key tools and has real promise here because in reinforcement learning, you don't need to know what the answer to the question is. You just need to be able to say whether the answer that the model gave you was good or not good. Yeah. Maybe even how good or not good. So this opens up a completely new field of problems to train these models against.
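
As a rough illustration of the distinction drawn here, the following is a minimal sketch in Python. It is not taken from any DeepMind system: the three-action toy task, the `score` function, and both learner functions are hypothetical names invented for this example. The supervised learner is handed the correct answer with every example, while the reward-based learner (a simple greedy bandit with optimistic starting values) only ever sees how good its own attempts were.

```python
# Hypothetical toy task: choose one of three actions. Action 2 is the best,
# but only the scoring function "knows" that.
def score(action):
    """Stand-in for an environment: rates an answer, never reveals the best one."""
    return {0: 0.1, 1: 0.5, 2: 0.9}[action]


def learn_supervised(labelled_examples):
    """Supervised learning: every example arrives with its correct answer."""
    counts = {}
    for _example, label in labelled_examples:
        counts[label] = counts.get(label, 0) + 1
    # Return the answer that the labels point to most often.
    return max(counts, key=counts.get)


def learn_from_reward(n_actions=3, steps=20):
    """Reward-based learning: try actions and average the scores received."""
    values = [1.0] * n_actions  # optimistic start, so every action gets tried
    pulls = [0] * n_actions
    for _ in range(steps):
        # Pick the action that currently looks best.
        action = max(range(n_actions), key=lambda a: values[a])
        reward = score(action)  # "how good was that?", not "what was right?"
        pulls[action] += 1
        # Incremental running mean of the rewards seen for this action.
        values[action] += (reward - values[action]) / pulls[action]
    return max(range(n_actions), key=lambda a: values[a])


print(learn_supervised([("a", 2), ("b", 2), ("c", 1)]))
print(learn_from_reward())
```

The point of the contrast is the input each learner needs: `learn_supervised` cannot run without labels, whereas `learn_from_reward` needs only a way to score attempts, which is what makes it usable on problems where nobody knows the answer in advance.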

Speaker 0

因此强化学习,特别是DeepMind早期的重大突破,都是从雅达利这类游戏开始的。

And so reinforcement learning and really starting from what was one of the big breakthroughs of DeepMind in the early days was working on games like Atari.

Speaker 1

是的。

Yes.

Speaker 0

问题是,好吧,我们如何将这项技术从《乒乓》和《太空侵略者》这样的游戏,扩展到真正开始解决现实世界中的实际问题?是的。随着这些方法的升级,研究领域取得了惊人的进展。是的。

The question was, okay, so how can we scale this up from the world of Pong and Space Invaders to things that really start to look like real problems in the world? Yeah. And so there was an amazing track of research as we scaled up these methods. Yeah.

Speaker 1

你知道吗,红杉资本当年是雅达利的首个投资人?

Did you know that Sequoia was the first investor in Atari back in the day?

Speaker 0

哦,真的吗?我...我不知道这事。这...这太不可思议了。是啊。确实。

Oh, really? I didn't know that. That's incredible. Yeah. Yeah.

Speaker 0

不。那些雅达利游戏其实...回想起来玩起来特别有意思,尤其是当我们有个智能体在旁边,比如一边研究一边还能来局《乒乓》什么的。所以

No. Those Atari games were, you know, great fun, actually, to sort of go back and play in the context of, hey, we've got an agent, and, you know, I'm just gonna have a game of Pong on the side as well. So

Speaker 1

我们红杉办公室里有一面很棒的墙,上面记录着所有传奇IPO和并购案例的名字。其中有个叫'披萨公司'的,我特别喜欢问别人知不知道这是什么。其实它指的是查克芝士餐厅,那是红杉早期的投资项目之一。

There's a wonderful wall at Sequoia in our office where we have all these names of legendary IPOs and M&As that have happened. There's one, I think it's called the Pizza Company, and I love asking folks if they know what that is. It's actually Chuck E. Cheese's, which was an original Sequoia investment at the time.

Speaker 0

太棒了。太棒了。

Amazing. Amazing.

Speaker 1

当时《夺旗战》和《阿尔法星》是突破性的成就。你能详细说说这些突破具体是什么,以及为什么选择这些特定游戏吗?

So Capture the Flag and AlphaStar were incredible breakthroughs at the time. Can you share a little bit about what exactly those breakthroughs were and maybe why you chose those specific games?

Speaker 0

是的。要知道,回顾AI使用电子游戏的历史,我们为何要用电子游戏?游戏就像可塑性极强的封闭世界,作为研究人员,我们可以操控它们,测试不同算法,设置各种情境。

Yeah. So, you know, if you think about the history of AI using video games, why do we use video games at all? Video games are these sort of malleable, perfectly encapsulated worlds that, as researchers and scientists, we can manipulate. We can test out different algorithms in them. We can set up different situations.

Speaker 0

对我们来说这是开发新算法的绝佳试验场。作为强化学习研究者,思考如何让AI尽可能通用时,总会想:我们已经攻克了雅达利游戏,如何挑战更复杂的游戏?

So they're the perfect test ground for us to develop new algorithms. And then you can imagine, as an RL researcher, as someone who's thinking about how we can get AI to be as general as possible, you're always thinking, okay, we've cracked Atari. How do we get a more complex game?

Speaker 1

嗯。

Yeah.

Speaker 0

我个人痴迷的是让智能体具备零样本学习能力——能直接执行任何任务。这与当时人们在雅达利游戏上的训练模式不同,传统强化学习是:给你一个游戏,训练精通后再用相同算法从零开始训练其他游戏。

And the thing that I was personally obsessed with is I want these agents to be able to do any task zero-shot. And this is a slightly different paradigm from what people were doing at the time with training on Atari where, you know, normally in reinforcement learning, you think: here's a game. Now you get to train on it and get good at it. Yeah. And then you apply that same algorithm from scratch, training on different games.

Speaker 1

对。

Yes.

Speaker 0

我向往另一种模式:训练一个智能体后,能直接移植到任何新任务上,无需额外训练就能表现出色。这本质上要求的是任务空间的泛化能力。没错。

I'd love a different scenario where instead we train an agent and then we can lift it and put it on any new task, and that agent will be able to perform well in that task without any more training. Yeah. And so to do that, what you're really asking for is generalization over task space. Yes.

Speaker 0

这意味着你需要大量大量的训练任务。所以在这个强化学习中,训练数据就是任务。对,不是图像,不是文本片段,而是任务。

And that means you need lots and lots of training tasks. So the training data in this RL for agents becomes tasks. Yeah. Not images, not pieces of text, but tasks.

Speaker 0

你可以想象,可以坐下来,比如接管整个游戏工作室,尝试手动编写数百种不同的任务,制作大量迷你小游戏

And so you can imagine you could go and sit and, you know, take a whole game studio and try and hand author, you know, hundreds of different tasks, lots of little mini games

Speaker 1

是的。

Yes.

Speaker 0

在这些虚拟世界里。我们确实这么做了。我们当时做了很多这类工作。然后你会想,其实我们可以超越手动编写,通过程序化生成

In these virtual worlds. And we did that. You know, we were doing lots of that. And then you can think, yeah, we can actually go further than hand authoring. We can procedurally generate

Speaker 1

对。

Yeah.

Speaker 0

这些任务和游戏,生成世界、地图以及不同目标,我们做到了。但你总会遇到复杂度上限,能移交或设计的复杂度终究有限

These tasks and games, generating worlds and maps and different objectives, and we did that. But you keep running into this complexity ceiling, that there's only so much complexity that you can hand-author or you can design

Speaker 1

没错。

Yeah.

Speaker 0

人类层面。但这就是多人游戏的用武之地。是的,因为一旦从单人模式转向多人模式,就不只是智能体在玩了。游戏中还有另一个玩家,那个或其他玩家可以呈现出许多不同的特性和行为。

Humanly. But that's where multiplayer games come in. Yeah. Because as soon as you go from single player to multiplayer, it's not just the agent playing. You've got another player in this game, and that other player or other players can take on many different characteristics and many different behaviors.

Speaker 0

所以每个不同的对手、每种不同的策略都会从根本上改变游戏规则和智能体的目标。你知道吗,我常常回想,为什么人们至今仍痴迷下棋?是的,职业棋手为何持续对弈?这明明是同一个游戏。

So every different player, every different strategy that you're up against changes fundamentally the game and what the agent is trying to do. You know, I go back and think, why are people still obsessed with playing chess? Yeah. You know, why does a professional chess player still keep playing chess? It's the same game.

Speaker 0

但其实并非如此,因为你每天面对的对手完全不同,世界上总有新玩家加入。因此游戏始终在变化。多人游戏和多智能体游戏真正囊括了因其他玩家存在而产生的巨大任务多样性。夺旗战实际上是我们首次探索如何利用多人游戏来突破强化学习算法的极限,迫使我们深入思考如何泛化到新任务,如何处理多智能体动态。

Yeah. But it's actually not, because you're playing completely different opponents day after day, and new people come into the world. So the game is continually changing. So multiplayer games and multi-agent games really encapsulate that huge diversity of tasks that you might encounter just from other players being there. And so Capture the Flag was actually one of our first forays into how we can use multiplayer games to really stretch what our reinforcement learning algorithms can do, to really force us to think hard about how we can generalize to new tasks and how we deal with these multi-agent dynamics.

Speaker 0

夺旗战确实是个惊人突破,它证明了我们能在第一人称多人游戏中达到人类水平的表现。当然,随后《星际争霸》带来了更高复杂度,成为我们必须攻克的新前沿。

So Capture the Flag was a fantastic breakthrough, and really showed that we could get to human-level performance for these multiplayer first-person games. Yeah. And then, of course, StarCraft added on a huge amount of complexity and was sort of the next frontier that we had to go after for this.

Speaker 1

你们在这个领域如此超前,这些概念至今仍在语言领域极具现实意义。看到部分研究持续产生影响,你作何感想?

You were so early in this, and so many of these concepts are very, very relevant today in the world of language. How does it feel to see some of this work continue to play out?

Speaker 0

太棒了。这真的非常美妙。要知道,我们当时讨论的许多议题...

Yeah. It's brilliant. It's just fantastic, actually. You know, there were so many things that we were talking about

Speaker 1

在七年前就有了雏形。是的。

Seven years ago. Yeah.

Speaker 0

是的。2015、2016、2017年那会儿。看到所有这些核心基础概念如今在大语言模型领域真正发挥作用并具有实际应用价值,最终实现了我们当年只能梦想的性能表现,这真的令人无比满足。

Yeah. You know, 2015, 2016, 2017. And to see all of these core fundamental concepts being really useful and really applicable today in the world of large language models, you know, and resulting in performance that we could only really dream about at the time, that's incredibly satisfying. Yeah.

Speaker 1

那么,用你自己的话来说,你提到自己从开发玩具转向寻找实际应用。你是何时确信自己找到了正确的方法?

So then, you know, in your own words, you said that you moved from building toys to then finding real applications. When did you know that you found the right recipe?

Speaker 0

我热爱深度学习。过去十到十五年来我一直对此痴迷。最让我着迷的是那些底层核心概念——这些基础构建模块能以某种方式在不同应用领域间惊人地迁移复用。2012年我们在计算机视觉使用的构建模块,与早期语言生成模型、强化学习等领域使用的如出一辙。我反复见证着这种能力:运用相同核心概念,集结真正理解如何组合这些模块的顶尖人才——他们就像是调配概念的主厨。

So I just love deep learning. I've been obsessed with deep learning for ten, fifteen years now. And the thing that I love about it is that you have these underlying core concepts, these fundamental building blocks that are somehow incredibly transferable between different application spaces. So it's the same building blocks that we were using in computer vision in 2012 as we were using in early generative models in language, then reinforcement learning, etcetera, etcetera. So what I was seeing just again and again was this ability to take these same core concepts and take incredible people who are almost like master chefs at putting these concepts and these different building blocks together.

Speaker 0

组建杰出团队,攻克真正具有挑战性的难题。那些你在学术会议上提出时,领域权威研究者会说'不不不,这还要十年才能实现'的问题。而在你心底知道:我们其实已经破解了。

Take a team of incredible people and go after really, really challenging problems. Problems where you go to conferences at the time and you talk to leading researchers in the field, and they say, "No, no, no, this is ten years away." And in the back of your mind it's, "Okay, we've basically cracked it."

Speaker 1

哇。

Wow.

Speaker 0

我目睹这种场景一次次重演:汇聚卓越人才、卓越算法和强大算力,面对极端困难的课题,我们现在能找到破解众多难题的方法。我一直执着于这些技术的实际应用,希望见证它们为世界带来真正变革性的积极影响。是时候真正开始追求这个目标了——我认为这个时机已经成熟好几年了。

And I saw that happen again and again and again. You take amazing people, amazing algorithms, amazing compute on really challenging problems, and we can find recipes now to crack so many problems. And so it just got to the point where, and I've always been quite obsessed with the application of these methods, I want to see this technology have real transformative positive impacts in the world. And we need to start actually going after that. And the time has been right for, I think, a few years now.

Speaker 1

确实。你和当代最伟大的科学家、技术专家及创始人之一Demis已合作十年。他在你还在牛津时就联系了你。后来你们的公司Vision Factory和DeepMind在2014年左右同时被谷歌收购,从那时起你们两人就开始合作至今。

Yeah. Well, so you've now had a decade-long relationship working together with one of the greatest scientists, technologists and founders of our lifetime, Demis. He called you while you were still at Oxford. And then your company, Vision Factory, and DeepMind were both acquired by Google back in 2014, around the same time. And that's when the two of you started to work together, now for over ten years.

Speaker 1

与Demis共事是怎样的体验?

What was it like or what has it been like to work with Demis?

Speaker 0

是的,我是说,Demis是个了不起的人,你知道,他个性鲜明且极具远见。而且,他还非常平易近人。我认为这真的很能激励人们。

Yeah. I mean, Demis is an incredible person, you know, a real character and a real visionary. Yeah. And, you know, also amazingly human and relatable. And I think that really inspires people.

Speaker 0

所以,你知道,只需要五分钟的交谈,他就能展现出那种野心的深度

So, you know, it only takes a five-minute conversation for him to sort of really bleed out the depth of ambition

Speaker 1

嗯。

Yeah.

Speaker 0

他所思考的。以及实现这些野心的可能性之近在咫尺。我觉得他有一种非凡的能力,能给一群非常聪明的人注入大量能量,让人们看到眼前之外的东西。我记得有次站在早期DeepMind办公室大厅里的场景,那是在为DeepMind的第一篇《自然》论文举行的庆祝活动上。

That he thinks about. And just the immediacy of the potential to, you know, step towards these ambitions. So I think he has this great ability to inject a lot of energy into, you know, a group of very smart people, to get people to see beyond what's right in front of them. I remember moments sitting, well, standing in the lobby of one of the early DeepMind offices. I think it was a toast.

Speaker 0

我们当时在庆祝DeepMind发表的首篇《自然》论文。

It was a celebration we were having for the first Nature paper from DeepMind.

Speaker 1

哇。

Wow.

Speaker 0

德米斯曾说,这实际上只是我们将在《自然》杂志发表的数十篇论文中的第一篇。当时,这基本上算是《自然》杂志发表的第一篇机器学习论文——就是那篇关于Atari DQN的论文。说到要在《自然》发表数十篇论文,听起来可能有点天方夜谭。但他更进一步表示,我们还将因此赢得各类奖项。

And Demis was saying, you know, this is actually just gonna be the first of dozens of Nature papers. And at the time, this was basically the first machine learning paper in Nature. This was the Atari DQN paper. And the prospect of dozens of Nature papers, you know, seemed a bit far-fetched. And actually, he went further and said, and we're gonna be winning prizes as a result of this.

Speaker 0

那是十年前的事了。是啊,他的前瞻性简直不可思议。他拥有我称之为'推演型思维'的特质。

And that was ten years ago. Yeah. That's incredible forethought that he has. He's got what I call, like, one of these rollout minds.

Speaker 0

或许这源于他丰富的国际象棋经验,但他总是能将思维推演到未来——现在需要采取哪些步骤才能实现这个宏大目标?这十年来与他共事非常棒,我们现在依然在Isomorphic实验室紧密合作,而我们的雄心壮志丝毫未减。

Maybe it comes from all of his experience playing chess, but he's always, you know, rolling out into the future. What are the steps now that are gonna lead, you know, to this big ambition? So, yeah, it's been fantastic. I've been working with him for about ten years now, and, you know, we still work really closely together on Isomorphic Labs. And the ambition is as big as ever.

Speaker 1

听到你们从一开始就怀揣这样的抱负真是引人入胜,更不可思议的是这些愿景正在成为现实

It's so interesting to hear that you had this ambition and that he had this ambition from the very start. And it's incredible that it's played out that way.

Speaker 1

我很想聊聊Isomorphic实验室。你们正在开启我们这代人最雄心勃勃的使命之一——用AI重塑药物发现与开发流程。如果一切顺利,当Isomorphic的愿景实现时,世界会变成什么样子?

Well, I'd love to talk a little bit about Isomorphic. You're now embarking on one of the most ambitious missions of our generation to reimagine drug discovery and drug development with AI. If everything goes right and you realise your vision for Isomorphic, what does the world look like?

Speaker 0

是的,我们在Isomorphic有着宏大的构想——我们要解决所有疾病问题,而且是真正实现这个规模。关键在于,我们正在构建的这项技术以及整个AI领域,将彻底改变我们理解生物学的方式,提升我们操控化学物质来调节生物系统的能力。我们真正憧憬的未来是:AI不仅能帮助我们发现、创造和设计新疗法,更能让我们深入理解生物世界的运作机制,了解细胞工作原理和疾病根源,从而开辟全新的治疗路径。

Yeah, you know, we think really big at Isomorphic. We want to be solving all diseases here, and genuinely at that scale. And the point is that this technology that we're building, and AI as a whole field, is going to be completely transformative in how we understand biology, in our ability to manipulate and craft chemistry to modulate that biology. So we really think about a future where we are solving all diseases, where AI is not just helping us discover and create and design new therapeutics, but also just understand so much more about our biological world, about how our cells are working, what are the root causes of disease, and therefore opening up new pathways that we can think about modulating.

Speaker 1

那么

So

Speaker 0

我们从公司成立第一天起就确立了这一宏大目标。这不是针对某个特定适应症或靶点开发疗法,而是思考如何用AI打造一个通用的药物设计引擎——不仅能应用于单一靶点或模式,更能反复适用于任何疾病领域。这正是我们当前正在努力实现的方向。

we have set up the company from day one to really go after this big ambition. This isn't about developing therapeutics for a particular indication or a particular target. It's really thinking about how we create a very general drug design engine with AI, something that we can apply to not just a single target or even a single modality, but we can apply this again and again across any different disease area. And that's what we're stepping towards at the moment.

Speaker 1

怀着这种通用性目标,实际操作中从第一天起就改变了你们的建设方式吗?

How does setting out with this ambition of being general change how you built in practice from day one?

Speaker 0

这是个好问题。当我思考AI在药物设计领域的现状时,发现化学和生物学中已大量应用机器学习模型,但我会称其为第一代应用

Yeah, it's a good question. When I think about some of the status quo of AI in drug design, there's been a lot of use of machine learning models in chemistry and biology, but I would consider a lot of the first generation of this sort

Speaker 1

的应用

of application

Speaker 0

更多属于局部模型——你可能掌握某个特定靶点的数据,或某类分子的行为特征,然后针对这些数据训练小型多层感知机来生成预测,指导下一轮设计。这与我们的做法完全相反。因此从第一天起,我们就致力于创建能跨越化学空间和靶点空间的通用模型。AlphaFold和AlphaFold3就是典型例证,

to be more local models, where you might have some data about a particular target or about how a particular class of molecules is behaving, and you'll fit a small MLP against this data to help you generate some predictions that lead to your next round of design. This is the complete opposite of the approach that we were trying to take. So from day one, we were setting out to create models that generalise across chemistry and across target space. And a key example of this is something like AlphaFold and AlphaFold3,

Speaker 1

其中

where

Speaker 0

这是一个可以应用于各种不同目标的模型。你可以将其应用于蛋白质组中的任何蛋白质,乃至整个蛋白质宇宙。它适用于任何你能设计的小分子,无需微调,也无需适配任何本地数据。可以想象,这彻底改变了化学家使用这些模型的方式——不再需要为每个具体应用调整模型。我们所有的内部研究项目都是如此。顺便说一句,要实现我们正在构建的突破性药物设计引擎,我们大概需要六七个类似AlphaFold这样的突破。

this is a model that you can apply to a whole host of different targets. You can apply it to any protein in the proteome, in the universe of proteins. You can apply it to any small molecule that you can think of designing without needing to fine-tune it, without needing to fit any local data. And so you can imagine that completely changes the way that chemists can use these models if you don't need to be adapting this model to every single application. So every single one of our internal research projects, and by the way, when I think about what we're going to need to get this breakthrough drug design engine that we've been building, we need like half a dozen AlphaFolds.

Speaker 1

哇。

Wow.

Speaker 0

AlphaFold只是故事的一部分。

AlphaFold is just part of the story.

Speaker 1

哇。

Wow.

Speaker 0

因此从一开始,我们就设立了这些内部研究项目,致力于解决这六七个关键问题。除了在AlphaFold和结构预测方面取得重大突破外,其他关键领域也有显著进展。所有这些模型都具有通用性,可应用于任何靶点。而且我们发现,它们实际上可以应用于多种模式,至少是许多不同的模式。

So from day one, we've been setting up these internal research programmes, going after these half dozen problems. We've had significant breakthroughs obviously in AlphaFold and structure prediction, but also in other key areas. And in all of these, these models are general. They can be applied to any target. And what we're finding, actually, is that they can be applied to any modality, or lots of different modalities at least.

Speaker 1

嗯。这是我第一次听你提到需要六七个类似AlphaFold的突破。能详细说说这意味着什么吗?

Yeah. So that's the first time I've heard you say half a dozen AlphaFolds. Can you share a little bit more about what that means?

Speaker 0

当然。AlphaFold显然是理解生物分子结构的一次重大突破——即蛋白质结构是什么。现在有了AlphaFold3,我们还能理解蛋白质与小分子以及DNA、RNA等物质结合时的结构。这代表着根本性的变革。

Yeah. So AlphaFold was obviously a massive breakthrough in understanding biomolecular structure. So what is the structure of proteins? And now with AlphaFold3, the structure of proteins with small molecules and things like DNA and RNA. That's a fundamental step change.

Speaker 0

这让我们能够以实验级别的精度理解生物化学的核心概念,为化学家们开启了一系列思考和设计工作。但我的观点是,我们可能还需要大约六项类似的突破——即在生物学和化学不同核心概念上达到实验级精度——才能将这些整合成真正变革药物设计的成果。药物设计确实非常非常困难,它不单是一个问题,也不仅关乎理解蛋白质结构。

It allows us to get experimental level accuracy of a really core concept of biochemistry that unlocks a whole bunch of thinking and design work for chemists. But my comment here is actually we're probably going to need something like half a dozen more of these sort of breakthroughs, this sort of getting to experimental level accuracy of different core concepts of biology and chemistry to be able to put this together into something that's really transformative for drug design. Drug design is really, really hard. It's not just a single problem. It's not just about understanding the structure of a protein.

Speaker 0

甚至不只是设计一个能按你期望方式调节蛋白质的分子。你希望这个分子最好能制成药丸,通过身体吸收,以正确方式到达目标细胞类型,真正进入细胞而不被肝脏以某种方式分解。作为药物设计师,需要掌控的复杂性实在太多。而每一项都像是我们正在创造的AlphaFold级别的突破。

It's not even just about designing a molecule that will modulate that protein in the way that you want. You want this molecule to be able to ideally be taken as a pill and go through the body and be absorbed in the right way and reach the right cell type and actually go into the cell and not be broken down by the liver in a certain way. There's just so much complexity to hold onto as a drug designer. And each one of those is like an AlphaFold level style breakthrough that we've been creating.

Speaker 1

真有意思。我还听你提到过'药物设计的圣杯模型'和'科学代理',能详细解释一下吗?

So interesting. Well, I've also heard you use the words a holy grail model for drug design and agents for science. Can you explain a little bit more about what you mean?

Speaker 0

是的。我们正在攻关的某些研究领域——预测分子结构和性质,以及这些生物分子如何随时间相互作用和演变——这些确实是药物预测领域的圣杯级难题。我们已取得一些惊人突破,彻底震撼了化学家团队,并逐步改变了ISO内部的药物设计方式。但我觉得最值得思考的是:即便你创建了世界最佳预测模型(甚至优于实验级模型)来预测分子特性(比如预测真实实验结果),我们可能拥有一整套这样的模型,却仍无法解决药物设计问题。关键在于,存在一个10的60次方数量级——这可能是所有潜在药物分子的可能数量。

Yes. So some of these research areas that we've been going after, predicting structure and properties of these molecules and how all of these biomolecules interact and play out over time, these really are sort of holy grail predictive problems for drug design. And we've made some incredible breakthroughs there, which have really stunned our chemists and step-changed how we do drug design internally at ISO. But what's, I think, a really interesting thing to think about is that you could create the best possible predictive model of the world, like an even better than experimental-level model to predict a particular property of a molecule, for example, to be able to predict the outcome of a real experiment. We could have a whole suite of those, but that still wouldn't solve drug design. The way to think about this is, there's this number, 10 to the power of 60, which is perhaps all of the possible drug-like molecules that could exist.

Speaker 0

这个数字可能考虑了很多因素。即便我们将其减少20个数量级到10的40次方,仍然是个天文数字。假设你拥有全球最佳预测模型,能筛选十亿种分子(即10的9次方),我们仍有10的31次方分子未被探索。

That maybe takes into account a lot of things. We could even reduce that by 20 orders of magnitude and get to 10 to the 40. That's still a lot of things. And even if you had the best predictive models in the world, so let's say you could screen a billion different molecules, you could go and test a billion different molecules, that's 10 to the 9. Now we've still got, like, 10 to the 31 molecules left on the table.
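
The arithmetic in this answer can be checked directly. Below is a minimal sketch that takes the speaker's round figures at face value (a design space of 10 to the 40 and a screening budget of 10 to the 9); the variable names are illustrative only. One reading worth noting: the 10 to the 31 figure is most naturally a ratio, meaning each screened molecule stands in for about 10 to the 31 unscreened ones, while the absolute number left untested is still essentially 10 to the 40.

```python
from fractions import Fraction

full_space = 10**60      # speaker's estimate of all possible drug-like molecules
reduced_space = 10**40   # after discounting 20 orders of magnitude
screened = 10**9         # a billion molecules actually tested

# Orders of magnitude removed when going from 10^60 to 10^40.
orders_removed = len(str(full_space // reduced_space)) - 1

# How many unscreened molecules each screened molecule "stands in" for.
unscreened_per_screened = reduced_space // screened

# Exact fraction of the reduced space that a billion screens would cover.
fraction_covered = Fraction(screened, reduced_space)

# Absolute count of molecules still untested (barely changed from 10^40).
still_untested = reduced_space - screened

print(f"orders of magnitude removed: {orders_removed}")
print(f"unscreened per screened: 10^{len(str(unscreened_per_screened)) - 1}")
print(f"fraction covered: {float(fraction_covered):.0e}")
```

Either way the conclusion in the conversation holds: exhaustive screening covers a vanishingly small fraction of the space.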

Speaker 0

因此即便使用最佳预测模型,你甚至还没触及应该探索的分子空间的表面。这就是为什么我们需要超越实验预测模型,发展生成式模型和能导航整个10的40次方到60次方空间的智能代理。

So even with the best predictive models, you're still not even scratching the surface of the molecular space that you should be exploring. And this is why we need to go beyond just predictive models of experiment to models like generative models, like agents, that can actually navigate that whole 10 to the 40, 10 to the 60 space.

Speaker 1

这真是太有趣了。

That's so interesting.

Speaker 0

利用我们的预测模型,显然能理解如何驾驭这一点,这样我们就不必进行穷举搜索,因为我们永远无法穷尽整个分子宇宙的搜索——就像AlphaGo无法穷尽所有可能的围棋走法一样。而国际象棋则不同,理论上可以穷尽所有可能的走法。

Using our predictive models, obviously, to understand how to navigate that, so we don't have to exhaustively search, because we can never exhaustively search the whole universe of molecules, if that makes sense. In the same way that AlphaGo couldn't exhaustively search all of the possible Go moves. Unlike chess, where you could exhaustively search all possible chess moves.

Speaker 1

是啊,是啊。

Yeah, yeah.

Speaker 0

但确实,分子设计更像围棋而非国际象棋。这正是生成模型发挥作用的地方。利用生成模型的智能体,结合搜索技术和这些惊人的预测能力,才能真正打开整个分子空间的大门。说实话,即使没有人工智能,我们也能在10^40量级的空间中找到药物,这仍然让我感到震撼。这说明实际上可能存在大量冗余设计,蕴藏着巨大潜力。

But yeah, molecular design is much more like Go than it is like chess. So that's where generative models come into play. Agents that utilise generative models, utilise search techniques, as well as these amazing predictive capabilities to really open up the entirety of molecular space. Now, to me, it's actually still amazing that even without AI, we managed to find drugs in this 10 to the 60 space, 10 to the 40 space. It just says that actually there's probably a lot of redundancy, there are a lot of potential designs.

Speaker 0

如果你考虑某个特定疾病适应症或靶点,应该存在许多既有效又符合治疗要求的设计方案。我认为真正的潜力在于这些生成模型和智能体能够探索这个空间,真正发掘出整个潜在的设计空间。

If you think about a particular disease indication, a particular target, there should be many designs that exist that would be good for that and would be the right sort of product profile for this therapeutic. And I think the real potential here is for these generative models, these agents as well, to be able to search through this space and really uncover that whole potential design space.
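
To make the idea of combining a generator, a predictive model, and search a little more concrete, here is a deliberately tiny sketch in Python. Nothing in it reflects Isomorphic Labs' actual methods: the four-letter alphabet, the hidden `TARGET` string, `predicted_score`, `propose`, and `guided_search` are all hypothetical stand-ins, and the scoring landscape is far smoother than any real chemical one. It only illustrates the general shape of the argument, namely that a proposer guided by a scorer can find a good design after evaluating a tiny fraction of a space that would be impractical to enumerate.

```python
import random

ALPHABET = "CNOS"        # hypothetical alphabet of four "building blocks"
LENGTH = 12              # 4**12 = 16,777,216 possible strings
TARGET = "CCNOCSNOCCNS"  # hidden optimum, visible only to the scorer


def predicted_score(candidate):
    """Stand-in for a predictive model: counts positions matching the optimum."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def propose(parent, rng):
    """Stand-in for a generative model: change one random position of a parent."""
    i = rng.randrange(LENGTH)
    return parent[:i] + rng.choice(ALPHABET) + parent[i + 1:]


def guided_search(budget=2000, seed=0):
    """Hill climbing: keep a proposal whenever the scorer rates it no worse."""
    rng = random.Random(seed)
    best = "".join(rng.choice(ALPHABET) for _ in range(LENGTH))
    best_score = predicted_score(best)
    evaluations = 1
    while evaluations < budget and best_score < LENGTH:
        candidate = propose(best, rng)
        candidate_score = predicted_score(candidate)
        evaluations += 1
        if candidate_score >= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score, evaluations


best, best_score, evaluations = guided_search()
print(best, best_score, evaluations, "of", 4 ** LENGTH, "possible strings")
```

The design choice that matters is the loop itself: the scorer is only ever asked about candidates the proposer suggests, so the number of evaluations is set by the budget rather than by the size of the space.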

Speaker 1

这太有趣了。用最简单的门外汉说法,你们既在模拟学习过程,又在模拟游戏规则,试图打造能解决各类游戏的最强玩家。

That's so interesting. I think, in very simplistic layman's terms, you're both modeling learning and modeling the game, and trying to build the best player to solve different types of games.

Speaker 0

没错。说实话,我对游戏有着难以避免的偏爱。从小玩电子游戏长大,在那个世界里成长。但确实,这正是我的思考方式。

Yeah. I mean, you know, I'm incredibly biased by games. I've been playing video games since I was a kid. I grew up in that world. But, you know, that's exactly how I think about it.

Speaker 0

我们必须建立自己的世界模型——包括生化世界和生物世界的模型。但这还不够,我们还需要创造能够探索这些空间的智能体和生成模型,在化学宇宙的干草堆中找到那些可能改变数百万人生命的金针。

We've gotta be creating our world models, our models of the biochemical world, our biological world. And then we don't stop there. We actually then need to be creating agents and generative models that can work out how to explore, how to traverse that, and to basically uncover these amazing needles in the haystack in chemical space, which could be life changing therapeutics for so many millions of people.

Speaker 1

我很喜欢这个观点,这正是我们今天要强调的。AlphaFold 3确实具有开创性意义:它让我们从仅能模拟蛋白质结构,跃升至能够模拟所有分子结构及其相互作用。能否请您分享一下,我们该如何从准确性、速度效率、以及探索以往无法解决的问题领域等角度来理解这一突破?

I love that. That is our punchline today. So, AlphaFold 3 is truly groundbreaking. You've taken us from being able to model just the structure of a protein to now being able to model the structure of all molecules and their interactions with each other. Can you share a little bit about how we should think about that in terms of the impact in accuracy, in speed and efficiency, and also potentially in being able to explore problem spaces that we couldn't solve before this?

Speaker 0

没错。AlphaFold二代是最大的突破对吧?它让我们能理解蛋白质结构,后来又有了AlphaFold2 Multimer,不仅能解析单个蛋白质结构,还能解析蛋白质复合体的结构——即这些蛋白质如何组合在一起。这帮助我们解答了许多生物学问题,但要实现药物设计仍有很大距离。而小分子药物正是其中一大类重要药物。

Yeah. So AlphaFold 2 was the biggest breakthrough, right? To be able to understand the structure of proteins. And then there was something called AlphaFold2 Multimer, which allows you to understand not just the structure of proteins by themselves, each individual protein, but the structure of proteins as they come together in what we call complexes, so how these proteins fit together. That opens up and helps us answer a lot of questions in biology, but there's still a big hop to designing therapeutics. And one of the big classes of therapeutics is what's called small molecules.

Speaker 0

这类分子并非蛋白质,比如咖啡因或扑热息痛这类通常以药丸形式服用的物质。这类小分子药物的工作原理是:它们通过人体进入细胞,然后附着在蛋白质上。蛋白质是生命的基本构建单元,通过与其他蛋白质相互作用形成分子机器。

So these are molecules that are not proteins. These would be things like caffeine or paracetamol, things that more often you can take as a pill. And the way that these therapeutics work with these small molecules is that they go through the body, they go into the cell, and they actually come and attach themselves to these proteins. These proteins, they're the fundamental building blocks of life. They form these molecular machines by interacting with other proteins.

Speaker 0

想象一下,如果你的药物分子附着在某处蛋白质上,就可能破坏该蛋白质与日常运作中其他蛋白质的正常互动能力。也就是说,你正在用这个小分子调节蛋白质功能。这就是药物设计的核心原理。作为化学家或药物设计师,你的工作就是设计能精准对接特定蛋白质的小分子,破坏或增强其正常功能。因此理解小分子与蛋白质的相互作用至关重要。

And so you can imagine that if you have another molecule, your drug, that comes in and attaches itself to a protein over here, then it might disrupt the ability of that protein to interact with another protein in its normal machine in day-to-day life. And so you're modulating the function of that protein with this small molecule. And that's the essence of drug design and how therapeutics work. And so you can imagine that your job as a chemist or a drug designer is to design a small molecule that's going to fit to this protein over here and disrupt how it normally functions, or in some cases enhance how it normally functions. And so it'd be really helpful to understand how this small molecule interacts with the protein.

Speaker 0

它会形成什么结构?会产生哪些实质性的物理相互作用?

What's the structure that it might make? What are the interactions, the literal physical interactions, that are being made?

Speaker 1

而且

And

Speaker 0

这直接催生了AlphaFold3的诞生——现在我们拥有的模型不仅能预测蛋白质结构,还能预测蛋白质与小分子的相互作用,包括DNA等其他基本分子机器的构建模块。这为从结构层面理解小分子(药物设计的核心)开辟了新途径,拓展了靶标类别。比如转录因子这类结合在DNA上读取遗传信息的蛋白质,现在可以尝试设计小分子来改变或破坏其功能了。

so that really inspired the creation of AlphaFold3, where now we have a model that not only predicts the structure of proteins, but how these proteins interact with small molecules, and also with other fundamental molecular machine building blocks, things like DNA. And this basically opens up the ability to understand small molecules structurally, which is a core part of drug design. It opens up new classes of targets. There are things like transcription factors, which are proteins that sit on DNA and read DNA. You can imagine now trying to design a small molecule to change or disrupt the function of something like that.

Speaker 0

因此要做到这一点,你确实需要能够以三维视角直观地看到整体结构。如果我修改这个小分子,它会如何改变与这个生物分子系统中蛋白质的相互作用?AlphaFold三现在极其精确,让我们能完全通过计算机模拟来解答这类问题。

So to do that, you'd really want to be able to see literally in 3D how this all looks. And if I make changes to my little molecule, how will that change the way it interacts with this protein in this biomolecular system? So AlphaFold three is now very, very accurate. It allows us to answer a lot of these questions purely in silico

Speaker 1

是的。

Yeah.

Speaker 0

或者说完全在电脑上完成,而以前你必须去实验室,实际结晶这些物质。这可能耗时六个月,甚至数年,有时根本不可能实现。如今在Iso,我们的药物设计师只需坐在笔记本电脑前,通过浏览器界面就能理解设计、进行修改并实时观察效果。

Or purely on a computer, where before you would have to go to the lab and literally crystallize this stuff. This can take six months. It can take years. Sometimes it's even impossible. Now at Iso, our drug designers are literally sitting with their laptops, using a browser-based interface, able to understand their designs, make changes to them and see the impact of that.

Speaker 1

太不可思议了。那么主要关注的是蛋白质与核酸、蛋白质与配体、以及抗体与抗原的相互作用。能否分享几个AlphaFold3目前对这些不同类型蛋白质分子相互作用产生实际影响的典型案例?

Incredible. So there are a couple of interactions that it focuses on: proteins and nucleic acids, proteins and ligands, and antibody to antigen. Can you give us some good examples of the impact that AlphaFold3 now has on the interaction of these different types of proteins and molecules?

Speaker 0

没错,蛋白质与配体其实就是蛋白质与小分子的关系。配体和小分子这两个术语是同义的。这让我们能理解小分子药物的作用机制,进而研究蛋白质相互作用。还有一类被称为生物制剂的治疗方法。

Yeah, so protein and ligands, that's the same as protein and small molecules. Those two terms, ligands and small molecules, are synonymous. That allows us to understand how small molecule drugs interact. Then we can think about protein interactions. There's a whole class of therapeutics called biologics.

Speaker 0

比如抗体,它能帮助我们理解如何与靶标结合,从而开辟新治疗模式。这也涵盖了抗体抗原界面。所以设计抗体时,你需要了解抗体设计会如何与目标蛋白表面相互作用。我们可以在所有这些不同应用中使用同一个模型。

These are things like antibodies. The model allows us to understand how they might interact with our targets, which opens up new modalities. That also encapsulates the antibody-antigen interface. So if you're designing an antibody, you want to understand how your antibody design is going to interact with the protein surface there. It's the same model that we can use across all of these different applications.

Speaker 1

训练AlphaFold三这类模型有哪些技术细节?采用扩散式架构又有什么优势?

What are the nuances of training a model like AlphaFold three and what are the benefits of using a diffusion based architecture?

Speaker 0

是的,这是个很棒的问题。为了让AlphaFold三号成功运行,我们不得不克服许多挑战。最有趣的一点实际上就是:我们如何将原本仅能处理蛋白质的AlphaFold,扩展到能够输入RNA、DNA和小分子这些新模态、新数据类型。因此我们不仅要解决已知的蛋白质标记化问题,还要研究如何标记DNA和小分子。对于DNA和RNA这类物质,解决方案相对明显些。

Yes, a great question. There were a lot of challenges we had to overcome to get AlphaFold three to work. One of the most interesting things was actually just how do we take something like AlphaFold, which was only working with proteins, and then input these new modalities, these new data types of RNA, DNA, small molecules. So we had to work out how to tokenize not just proteins, which we kind of knew how to do, but how to tokenize then DNA, how to tokenize small molecules. For things like DNA and RNA, that's a little bit more obvious.

Speaker 0

我们可以对碱基进行标记化处理。但对于小分子,我们尝试了各种方法,最终发现原子级分辨率的标记化效果极佳。随之而来的问题是:如何预测这种混合分子类型的结构?是的,我们无法沿用AlphaFold二代的框架,而这正是扩散模型大放异彩之处。

We could tokenize the bases. But then for small molecules, we tried a whole bunch of different stuff. It really ended up that this atomic-resolution tokenization worked super well. And then you have the question of, okay, how do you actually predict the structure of this mixture of different molecule types? You couldn't use the same framework as AlphaFold two, and this is where diffusion modelling just really shone.
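As an illustration of the tokenization scheme described here, the following is a minimal Python sketch, not Isomorphic's or AlphaFold3's actual code. It assumes one token per amino-acid residue for proteins, one token per base for DNA or RNA, and one token per atom for small molecules. The `Token` class, the function names, and the three-atom example ligand (the heavy atoms of ethanol) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str    # "residue", "base", or "atom" (hypothetical labels)
    symbol: str  # e.g. "M" for methionine, "A" for adenine, "C" for carbon
    chain: str   # which molecule in the complex the token came from


def tokenize_polymer(sequence, kind, chain):
    # Proteins and nucleic acids: one token per sequence position.
    return [Token(kind, symbol, chain) for symbol in sequence]


def tokenize_ligand(atoms, chain):
    # Small molecules: one token per atom (atomic resolution).
    return [Token("atom", element, chain) for element in atoms]


def tokenize_complex(protein, dna, ligand_atoms):
    # Concatenate all molecule types into a single token list.
    tokens = []
    tokens += tokenize_polymer(protein, "residue", "protein")
    tokens += tokenize_polymer(dna, "base", "dna")
    tokens += tokenize_ligand(ligand_atoms, "ligand")
    return tokens


# Toy complex: a 3-residue peptide, a 4-base DNA strand, a 3-atom ligand.
tokens = tokenize_complex("MKT", "ACGT", ["C", "C", "O"])
for token in tokens:
    print(token)
```

The point of the sketch is only that very different molecule types end up in one shared token sequence, which a single model can then consume.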

Speaker 0

在这里我们实际上可以单独建模每个原子及其三维坐标,让扩散模型生成这些三维坐标。而我们讨论的标记化技术,正是对这个扩散过程推理的条件设定。

Here we could actually model every single individual atom and the coordinates of every atom individually, and have a diffusion model be producing those 3D coordinates. And the tokenization that we talked about is conditioning the inference of that diffusion process.
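The idea of a diffusion model producing per-atom 3D coordinates, conditioned on the tokens, can be sketched roughly as follows. This is a toy Python illustration under loud assumptions, not AlphaFold3's sampler: `toy_denoiser` is a hypothetical stand-in for a trained network (it simply returns a target structure stored in the conditioning), and the noise schedule `sigmas` is made up. The update is a plain deterministic Euler step that moves noisy coordinates toward the denoiser's estimate as the noise level drops.

```python
import random


def toy_denoiser(coords, sigma, conditioning):
    # Hypothetical stand-in for a trained network. A real denoiser would
    # predict clean coordinates from the noisy ones, the noise level, and
    # the token conditioning; this one just returns a stored target.
    return [list(point) for point in conditioning["target"]]


def sample_coordinates(conditioning, n_atoms, sigmas, seed=0):
    rng = random.Random(seed)
    # Start from pure Gaussian noise: one (x, y, z) triple per atom.
    x = [[rng.gauss(0.0, sigmas[0]) for _ in range(3)] for _ in range(n_atoms)]
    # Walk down the noise schedule, one denoising step per pair of levels.
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        x0_hat = toy_denoiser(x, sigma, conditioning)
        ratio = sigma_next / sigma
        # Euler step: keep a shrinking fraction of the remaining noise.
        x = [
            [c0 + ratio * (c - c0) for c, c0 in zip(point, point0)]
            for point, point0 in zip(x, x0_hat)
        ]
    return x


# Made-up 3-atom target structure and a made-up decreasing noise schedule.
target = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.2, 0.0]]
coords = sample_coordinates({"target": target}, n_atoms=3,
                            sigmas=[10.0, 5.0, 1.0, 0.0])
print(coords)
```

Because the stand-in denoiser already knows the answer, the loop only demonstrates the sampling mechanics: every atom gets its own coordinates, and the conditioning steers where the noise is removed to.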

Speaker 1

太有趣了。

So interesting.

Speaker 0

这确实是个重大突破。在我们的排行榜上,这代表着质的飞跃——特别是在小分子与蛋白质相互作用的预测精度方面。这一突破为整个项目的后续发展扫清了障碍。

And this was a huge breakthrough. So, you know, we're talking about on our leaderboard just a massive step change, particularly in small molecule protein interaction accuracy. It was a massive step change and something that really unblocks the rest of the project.

Speaker 1

哇。数据、算力和算法——我们知道这三者在所有相关领域都很重要。但读到Demis的访谈时我很惊讶,他说我们在生物学领域并不受数据限制。你能分享一下对这个观点的看法吗?

Wow. So data, compute, and algorithms. We know those three are important in all other adjacent fields. But I was surprised to read an interview with Demis where he shared that we're not data constrained in biology. Can you share your point of view on that?

Speaker 0

我认为无论身处机器学习哪个领域,都会感受到某种数据约束。但Demis想表达的是:这并非真正的瓶颈——我们可以利用现有数据、可生成的数据取得实质性进展。我们不需要干等五十年让世界产生足够数据才能产生影响。完全不是这样。在某些建模领域,那些积压多年的数据正让我们取得前所未有的重大突破。

You know, I think it doesn't matter what field of machine learning you're in, you're going to feel some data constraint. And I think the point here from Demis is that it's not a real bottleneck, as in we can make real progress with the data that is out there and the data we can generate. It's not that we've got to sit and wait fifty years for the world to generate data before we can actually make impact here. No, we're not seeing that at all. There are modelling spaces where the data has been sitting around for years, where we can see that we can make really substantial progress beyond anything that people have experienced before.

Speaker 0

这是否意味着生物学领域的数据没有发展机会?绝非如此。这将成为我们持续开发这些模型的基础部分,而我们将生成的数据将决定这些系统的走向。在我看来,这里蕴藏着巨大机遇。我认为,真正适用于机器学习的生物学数据其实尚未被创造出来。

Now, does that mean there's no opportunity for data in biology? Absolutely not. A fundamental part of how we continue to develop these models and these systems will be what data we go out and generate. And there I think there's just a massive opportunity. In my mind, the data for machine learning in biology hasn't actually been created yet.

Speaker 1

是的。

Yes.

Speaker 0

没错,虽然存在大量历史数据,但这些数据当初并非为机器学习目的而创建。因此当你思考'如何创建数据来训练模型'时,其思维方式与传统数据生成方式截然不同。这正是一个值得探索的重大机遇。

Yes, there's a lot of historical data, but that historical data hasn't been created for the purposes of machine learning. And so when you're going out and thinking, how do I create data to actually train my model? You're thinking in a very different way to how people have gone out and generated data in the past. And there's a big opportunity there to explore.

Speaker 1

你认为当前我们缺乏哪些类型的数据?你觉得我们需要合成数据吗?

What kind of data do you think we're missing here right now? And do you think that we need anything in synthetic data?

Speaker 0

是的。我是合成数据的忠实拥趸,事实上从我职业生涯初期就是如此。当时我还是博士生,只能接触几千张图片,而谷歌拥有数百万图像资源。于是我就生成了海量合成文本数据来突破这个瓶颈。如今我们在化学领域也看到了类似情况,毕竟我们拥有完善的理论基础。

Yes. So I'm a massive fan of synthetic data. Actually, I have been since the very beginning of my career, where I was generating synthetic text data just to overcome the fact that I was a PhD student with access to a couple of thousand images, and Google had millions and millions of images. And so instead, I just generated tonnes and tonnes of synthetic data, and that unblocked things. And we're seeing the same thing, especially in the chemistry space, where we have good theory.

Speaker 0

我们对物理规律已有深刻认知,拥有量子化学和量子力学理论,可以据此创建模拟器。通过近似计算,我们能构建更具扩展性的分子动力学模拟,这为各类合成数据奠定了基础。特别是现有的生成模型,配合评分系统,可以生成并增强数据的信息含量。

We actually know a lot about physics. We have the theory of quantum chemistry and quantum mechanics, and we can create simulators out of that. We can approximate that and create more scalable molecular dynamics simulations. This gives the basis for a whole host of synthetic data. Then we have the models themselves, especially generative models, which can actually generate data, and we can use scoring systems to help really enhance the information content of this data.

Speaker 0

但我认为最大的空白领域在于所谓的体内数据,即通常通过小鼠等活体动物测量的数据。虽然有些历史记录,但这类数据根本无法大规模生成。对吧?

But I think one of the big open spaces will be on what's called in vivo data. So data that you would normally measure on a real animal, something like a mouse or a rat. There's some historical data on that, but you can't generate tons of it. You can't really generate any at all. Right?

Speaker 0

因此,探索新型数据生成技术蕴藏着巨大机遇。有些杰出人才正在研发诸如芯片器官这样的技术——即通过芯片完全实现原本需要在真实动物身上进行的测量工作。我认为

So then there's a big opportunity to look to new data generating technologies. There are some incredible people doing things like organoids on a chip. So ways of starting to measure things that you would normally measure on a real animal, but completely on a chip. You know, I think

Speaker 1

这太有意思了。

So interesting.

Speaker 0

没错。在数据生成技术领域,特别是在生物学和化学方面,将会涌现一系列全新突破,这些突破将深刻影响我们构建世界模型的方式。

Yeah. There's gonna be a whole host of new breakthroughs in data generating technology in biology and chemistry that are going to have a big impact on how we think about modeling that world as well.

Speaker 1

你们内部有开展相关研究吗?还是希望其他参与者来填补这部分空白?

Are you working on any of that internally, or are you hoping that other players could fill in some of that gap?

Speaker 0

实际上我们在同构实验室内部没有自建实验室,但和许多其他公司保持着合作关系。

So internally, we actually don't have any of our own labs at Isomorphic Labs, but we work with a whole bunch of other companies.

Speaker 0

我们自身也产生了大量数据

We generate a lot of data ourselves,

Speaker 1

是的。

Yeah.

Speaker 0

大量专有数据。我们已经看到了其惊人的影响力。

A lot of proprietary data. We've seen amazing impact of that.

Speaker 1

这很有道理。有一种观点认为,分子结构的建模及其功能和调控功能的建模非常重要,但并不总是药物开发中的限制因素。你对此有何看法?

It makes a lot of sense. So there's a point of view that modeling the structure of molecules, their function, and the modulation of that function is very important, but not necessarily always the limiting factor in drug development. What's your point of view on that?

Speaker 0

是的,正如我之前提到的,药物设计确实非常复杂。甚至在进入药物开发阶段之前——也就是将设计应用于真人时,整个设计和开发过程中存在许多瓶颈。药物开发涉及如何开始临床试验?我们应如何在人体中测试这些药物?如何在保证安全的前提下高效完成这些工作?

Yeah, as I touched on before, drug design is really, really complex. And even before you get to drug development, which is where you take those designs and you start putting them into real people, there are so many bottlenecks throughout this whole design and development space. Drug development is, how do we start to approach clinical trials? How should we test these drugs out in people? How can we do this in a really timely manner, but still a really safe manner?

Speaker 0

这一领域存在许多瓶颈,我认为整个行业都需要思考如何创新,特别是随着我们对分子与人体相互作用及毒性的预测模型越来越完善。这些预测模型进步后,我们必须改变临床试验的方式以充分利用它们,最终让急需治疗的患者获得药物。正如之前讨论的,分子设计不仅需要理解结构,也不仅需要理解它们如何改变特定蛋白质的功能,而是需要理解这些分子如何改变体内几乎所有蛋白质的功能——因为如果制成药丸服用,它会遍布全身。

There are a lot of bottlenecks there, and I think the industry as a whole will need to work out how to innovate in that space, especially as our predictive models get better at how these molecules will interact with people and how toxic they will be. As these predictive models get better and better, we will have to change the way that we approach clinical trials to really make use of them and, ultimately, to get therapeutics into the hands of patients who really desperately need them. Even in the design of molecules themselves, as we talked about before, it's not just understanding the structure of these molecules. It's not even just understanding how these molecules change the function of these proteins, but we need to understand how these molecules change the function of pretty much every single protein in our body. Because if we take this as a pill, it's gonna go everywhere.

Speaker 0

而毒性的主要来源正是:当你设计出一个完美调控特定靶点的分子(已知该靶点对疾病至关重要时)——

And the major cause of toxicity is when, yes, you've designed this amazing molecule that perfectly modulates your specific target that you know is key to your disease.

Speaker 1

但它同时也会影响其他方面。

But also affects other things.

Speaker 0

但它也影响其他方面。当然,现在你们做了大量筛选来防范这种情况,但预测能力越强越好。从我的角度看,真正令人兴奋的是,如果我们能创建这些通用模型来理解分子如何与特定靶点相互作用,同时也能适用于其他靶点,那我们为何不能用同一模型来理解这些分子如何与我们身体的其他部分互动呢?

But it also affects other things. Now, of course, you do a lot of screening to protect against that, but the more we can predict that, the better. What's really exciting from my perspective is if we're creating these general models that understand how this molecule interacts with this target, but also any other target, then why can't we just use that same model to understand how these molecules interact with the rest of our body?

Speaker 1

没错,非常有趣。那么AlphaFold3现在能为药物设计师提供哪些可能性?你们内部是如何运用它的?

Right. So interesting. So what is now possible with AlphaFold3 for drug designers? How are you using it internally?

Speaker 0

AlphaFold3让我们的药物设计师能够理解他们设计的分子如何真正与蛋白质靶点(即疾病靶点)相互作用。因此设计师可以调整分子设计,并立即看到这种改变如何影响分子与蛋白质靶点的物理互动。这确实非常强大。在AlphaFold3之前,人们对此完全处于盲区,甚至可能根本不知道自己的分子是如何与蛋白质相互作用的。

So AlphaFold3 gives our drug designers the ability to understand how their molecule designs really interact with this protein target, and this is the target of disease. And so our drug designers can make changes to the design and then see instantly how that changes the way that this molecule physically interacts with the protein target. That's really, really powerful. Before AlphaFold3, you would be completely blind to this. You probably wouldn't actually know how your molecule is interacting with your protein.

Speaker 0

那时只能依赖最佳直觉。或许在药物设计项目的某个阶段,你会得到某个设计与结构的结晶数据。这意味着如果运气好,六个月后去实体实验室才能获得解析出的三维结构。但即便如此,那也只是一个单一设计的三维结构,无法涵盖你做出的每个细微调整。是的。

You'd be using your best intuition. Maybe somewhere down the line in a drug design project you would get your structure crystallised with a particular design. That means going out to a real lab and, six months later, if you're lucky, getting a resolved 3D structure. But even then, that's just the 3D structure of a single design, not every single change that you make.

Speaker 0

因此AlphaFold3彻底改变了化学家开展设计工作的方式。但我要强调,这距离我们的终极目标还很远。因为这不仅关乎分子相互作用时的形态,我们还想知道这些分子与蛋白质结合的强度,以及分子的其他特性。

So AlphaFold three completely changes the way chemists can do this design work. But I would stress that's nowhere near as far as we want to go. Because it's not just about what these molecules look like in terms of interacting. We actually want to know how strongly these molecules interact with this protein. We want to know other properties of these molecules.

Speaker 0

我们需要理解这些分子与蛋白质的相互作用方式如何改变蛋白质的折叠构象,如何影响蛋白质功能,甚至可能如何改变细胞动力学。还有太多问题需要探索,而我们正在攻关的其他类似AlphaFold的突破性技术——我们的化学家在设计过程中使用的那些惊人模型——也将推动这些领域的发展。

We want to understand how the way that these molecules interact with this protein changes the fold, or the conformation, of the protein, how that changes the function of the protein, how it might actually change the dynamics of the cell. There are so many questions, and these are the other AlphaFold-like breakthroughs that we're working on. We have created incredible models for these that our chemists are using in this design process.

Speaker 1

有意思。所以你们正在内部设计一些药物。目前主要关注哪些靶点和研究项目?

Interesting. So you're designing some drugs internally. What targets and programmes are you focused on?

Speaker 0

我们内部有一个非常激动人心的药物设计项目计划,主要集中在免疫学和肿瘤学领域。我们在这方面取得了惊人的进展,特别令人兴奋的是看到这些模型如何彻底改变了我们在这些项目中设计药物的方式。

So we have a really exciting internal programme of drug design projects. These are focused on immunology and oncology. We've been making some incredible progress there and it's been really exciting to see especially how these models have transformed the way that we're actually approaching drug design on these programmes.

Speaker 1

你们还与礼来和诺华合作,最近还宣布扩大与诺华的合作关系。能否简单介绍一下这些合作的具体情况?

You're also working with Eli Lilly and Novartis, and recently you announced an expansion of the Novartis partnership. Can you share a little bit about what these partnerships look like?

Speaker 0

是的,我们最初签署了两项合作协议,一项是与礼来,另一项是与诺华。这非常棒。他们给我们带来了一些极具挑战性的难题。比如诺华带来的那些靶点,业内和诺华已经研究了十年以上,这已不是什么秘密。

Yes, so we signed these initial partnerships, two partnerships, one with Eli Lilly, one with Novartis. That was fantastic. They brought some really, really challenging problems to us. I think it's no secret that the sort of targets that, for example, Novartis brought to us are targets that the field and Novartis have been working on for ten years plus.

Speaker 1

哇。

Wow.

Speaker 0

这些可不是'随便试试看'的问题,而是真正棘手的难题。去年是令人惊叹的一年,无论是内部项目还是合作项目,我们都真切看到了这些模型的卓越表现。这让我们发现了新的化学物质,找到了调控这些长期研究靶点的新方法。这次扩大与诺华合作的新协议,我认为正是对这些合作初期成果的最好证明。

So these aren't sort of "oh, we'll try things out" problems. These are real, hard things. Last year was an amazing year, both for our internal projects and for these partner projects, to really see how well these models are working. It's allowed us to really uncover new chemical matter, working out new ways to modulate these targets that people have worked on for a long time. It's been amazing to see this new deal, which has expanded the Novartis collaboration, and which I think is a real testament to some of the success of the early days of these partnerships.

Speaker 1

恭喜你们。这是个了不起的里程碑,尤其是在短短一年内就取得这样的成就。我想谈谈你们的团队,你们组建了一支真正卓越的队伍,汇聚了人工智能、化学、生物学等多个领域的顶尖人才。

Congratulations. I think it's an incredible milestone, especially just one year in. Yeah. So I'd love to talk a little bit about the team. You've built a truly excellent team composed of the highest caliber talent across many different fields, AI, chemistry, biology.

Speaker 1

你们还引入了外部人才来帮助挑战传统思维。能否分享一下你们在这方面的考虑?

And you've also brought outsiders into the field to help question traditional thinking. Can you share a little bit about how you thought about this?

Speaker 0

确实,AI在药物设计领域的应用时间并不长。因此,要找到一位既是药物设计领域的世界级专家,又同时精通机器学习或深度学习的人才,基本上

Yeah, so the space of AI for drug design hasn't really existed for very long. So the chances of finding a world expert at drug design who's also a world expert at machine learning or deep learning is basically

Speaker 1

零。

Zero.

Speaker 0

零,没错。仅仅因为这些领域共存的时间还不够长。我真心认为Iso正在孕育一种新型科学领域,因为我们拥有真正生活并呼吸于这些交叉领域的人才。

Zero, yeah. Just because these fields haven't coexisted for long enough. I genuinely think there's a new sort of field of science that Iso is breeding, because we have these people who really live and breathe the intersection of this.

Speaker 1

是啊。

Yeah.

Speaker 0

所以,正因为我们无法直接雇佣这类人才,我一直在思考如何将药物设计与药物化学的世界级专家,与机器学习和深度学习的世界级专家汇聚一堂,让这些杰出人才并肩工作。因为仅仅让他们各自孤立在团队中是远远不够的。我们需要他们比邻而坐,能够互相理解对方的专业语言

So, because we can't hire these people, I really think about how we bring together the world experts at drug design and medicinal chemistry and the world experts at machine learning and deep learning, and get these incredible people sitting side by side, because it's not enough just to have these amazing people sitting in their isolated teams. We need people sitting side by side, speaking each other's languages

Speaker 1

确实。

Yeah.

Speaker 0

怀着极大的同理心和好奇心——那种渴望理解这门新科学、真正建立跨学科直觉的好奇心。我们已经见证这种动态协作产生的惊人成果:当一位对化学或生物学一无所知的通用型机器学习专家开始理解药物化学家和药物设计师面临的问题时。甚至在我们招聘从事相关研究的机器学习科学家和工程师时,团队中60%到80%的成员此前都没有化学或生物学背景,顶多是中学或大学基础水平。

With a lot of empathy, a lot of curiosity, curiosity to understand this new science, to really build intuitions in your own language. And we've seen just such amazing things come out of this dynamic where you really have a generalist machine learner, who doesn't know anything about chemistry or biology, start to come in and understand the problems of a medicinal chemist and a drug designer. And when I think about even hiring machine learners and machine learning scientists and engineers for the research that we're doing, I'd say sixty, seventy, eighty percent of the people on our team have no prior knowledge of chemistry or biology, maybe high school or university level.

Speaker 0

这实际上可以成为一种真正的优势,因为你带着某种天真无邪的状态加入。是的。只要你保持好奇心,我认为关键在于提出那些充满好奇的问题,甚至是那些看似愚蠢的问题。这样我们就能从第一性原理出发来解决问题。

And that can actually be a real asset because you come in sort of a little bit naive. And as long as you're curious, I think one of the key things is asking the curious questions, asking the, like, stupid questions. And then that allows us to come at the problems from first principles.

Speaker 0

这几乎让我们能够突破以往经验的教条和人们传统上处理这些问题的方式。我们可以从零开始重新思考,这正是我们创造研究突破时的重要思维方式。

It almost allows us to break through the dogma of previous experience and how people traditionally approach these problems. We can think ground up from scratch, And that's a lot of the mentality of how we think about creating these research breakthroughs.

Speaker 1

带着些许天真、高度好奇心和强烈自主性是非常好的特质。

A little naive and highly curious and high agency is a very good thing.

Speaker 0

是的,完全正确。正是如此。

Yes, exactly. Exactly.

Speaker 1

那么在11月,你们还推出了AlphaFold服务器这一重大举措,开放了代码和模型权重供学术使用。能否分享一下背后的原因?

So in November, you also made a very big move in launching the AlphaFold server, which releases code and model weights for academic use. Can you share a little bit about why?

Speaker 0

是的。AlphaFold长期以来一直保持着对学术科研用途的开放性。在AlphaFold3取得最新突破之际,确保科学界能够使用这一功能至关重要。虽然AlphaFold3在药物设计领域已经展现出巨大价值,但它对基础生物学研究和理解生命现象同样意义重大。研究人员正在以极具创意的方式使用我们的AlphaFold3服务器和模型。因此,确保非商业学术工作能够免费使用这项技术对我们来说非常重要。

Yeah. So, I mean, AlphaFold has a long, long lineage of being open for this academic and scientific use. And it was really important with this latest breakthrough of AlphaFold3 that we make sure that this scientific community has access to this functionality because, yes, AlphaFold3 is going to be incredibly useful for drug design, it already is, but it's also useful for many other areas of fundamental biology and just understanding biology. People are using our AlphaFold3 server and model in very, very creative ways. So it's very important for us to make sure that there is that sort of free use for non-commercial academic work.

Speaker 0

看到服务器的迅速普及和广泛应用,这种感觉真是太棒了。

And it's been incredible to see the uptake of that and the use of the server.

Speaker 1

我很想聊聊未来。你能透露一下AlphaFold接下来还有什么新进展吗?

I'd love to talk a little bit about the future. Can you give us a tease of what else is to come with AlphaFold?

Speaker 0

你知道,就结构预测这个问题而言,在我心里,我希望能彻底解决它。我认为AlphaFold3是通往这个目标的重要一步。这是个重大突破,但你知道,准确率并非100%。话说在这个领域,100%准确率究竟意味着什么?

You know, in terms of structure prediction as a problem, in my mind, I want to completely solve this. I think AlphaFold3 is a fantastic step on the way to that. It's a significant breakthrough, but it's not 100% accuracy. What does 100% accuracy even mean in this space?

Speaker 0

科学研究的许多领域都是这样,当你开始突破边界时,会发现问题会衍生出更多问题。这正是科研令人着迷的地方,对吧?我认为AlphaFold3就是个很好的例子,当你获得这些能力时,会发现实际上还有更深层的问题等待我们去攻克。是的,越来越精确地理解结构永远会让我们感兴趣,但这不仅限于静态结构。所以AlphaFold3建模的这些晶体结构,本质上只是分子相互作用的静态结晶形态。

With a lot of areas of science, as you start to push the boundaries, you see that the problem opens up into even more problems. That's the addictive part of doing science, right? And I think that AlphaFold3 is a good example of that, where as you start to get these capabilities, you see that actually there are even deeper problems that we want to be working on and stepping towards. Yes, understanding structure better and better and more accurately is always going to be interesting for us, but then it's also not just necessarily about static structure. So AlphaFold3 models these crystal structures, which are almost static, crystallized versions of how these molecules interact.

Speaker 0

但实际上我们体内并不存在晶体。这些分子存在于溶液中,处于动态运动状态。你可能会想,或许理解这些系统的动态特性也会非常有趣。

But in reality, we don't have crystals inside of us. These molecules are in solution, they're moving about, they're dynamic. You can think, okay, well maybe understanding the dynamics of these systems is actually also going to be really interesting.

Speaker 1

在AI生物领域,GPT-3时刻会是什么样子?我们何时能迎来这个时刻?

What does a GPT-three moment look like in AI biology and when do we get there?

Speaker 0

说到GPT-3,它本质上是个生成模型——专门生成文本。对我来说GPT-3时刻就是跨越了那个临界点:我们确实有能生成文本的模型,它们生成的内容看起来像文本,但我不确定是否出自人类之手。而GPT-3是第一个让你惊呼'天啊'的模型。

So if I think about GPT-three, this is really a generative model. So something that's generating text. And the GPT-three moment for me was crossing over that boundary from, yeah, we've got generative models of text and they generate some stuff and it looks like text, but I'm not convinced that it's generated by a human. And GPT-three started to be that first point where you're like, oh shit.

Speaker 0

它生成的内容开始看起来像是人类写的。这说明生成模型实际上重构了训练数据的分布特征。什么是生成模型?就是能拟合其所训练数据流形的模型。

This kind of looks like a human. And so this generative model is actually recreating the distribution of data that it's trained on. And what is a generative model? A generative model is something that fits the manifold of the data that it is trained on.

Speaker 0

当我思考这如何应用于生物学时,你可以想象这些生成模型实际上正在重现那个GPT-3时刻,重现事物在现实中的真实样貌。这非常令人兴奋,因为这意味着这些模型输出的内容要么确实存在于世界上——我们可以验证这一点,甚至可能发现世界上存在的新事物;要么它们有可能存在于世界上。

When I think about this applied to biology, you can think about these generative models actually starting to recreate that GPT-three moment, recreate what things would actually look like in reality. And that's quite exciting because that means that these models are spitting out things that either actually exist in the world, and we can kind of validate that or maybe even discover new things that exist in the world, or could exist in the world.

Speaker 0

这意味着它们可能是我们可以设计、制造或创造的东西,这些设计将在物理现实中稳定存在并发挥作用。我认为生物学领域最酷的地方在于,与语言不同,当它生成达到人类水平的内容时,我们能理解它,因为语言是人类衍生的。但在化学和生物学领域,许多问题我们自己都难以理解。所以当我们迎来那个GPT-3时刻时,我认为它看起来会远不像GPT-3,而更像是AlphaGo中的第37手。有意思。

Which means that they could be things that we could design or manufacture or create that would actually be stable and work and exist in our physical reality. And I think the cool thing about this in biology is that unlike with language, when it generates something at human level quality, we can understand that because it is human derived. But a lot of problems in chemistry and biology, we even struggle to understand ourselves. And so when we get to that GPT-three moment, I think it will look a lot less like GPT-three but feel a lot more like Move 37 in AlphaGo. Interesting.

Speaker 0

我们将开始看到超越人类理解范畴的事物,但它们确实存在于现实世界,存在于我们的物理现实中,却超出了人类的认知范围。这将会令人震撼。事实上,我们已经在内部生成模型中看到这种现象——我们创造的设计会让人类药物设计师说‘我不太确定这个,我更喜欢那个’,但实际测试时发现生成模型是正确的,而人类错了。

Where we're starting to see things that are beyond human understanding, but that do exist in the real world, that exist in our physical reality, but are beyond sort of human comprehension. And that's just going to be mind blowing. In fact, we're starting to see that internally with our generative models, that we're creating designs that a human drug designer would say, I'm not so sure about that. I much prefer this. And then you test it out in physical reality and the generative model is correct and the human is wrong.

Speaker 1

这太迷人了。我喜欢第37手的类比。

That's fascinating. I love the Move 37 analogy.

Speaker 0

是啊。

Yeah.

Speaker 1

当模型开始展现出创造力的元素时

When the model starts to see elements of creativity and

Speaker 0

正是如此。

Exactly.

Speaker 1

超越人类。

surpass the human.

Speaker 0

第37手是AlphaGo对阵李世石时的一记惊人妙招。这是整盘棋的第37步,它震撼了世界围棋界,因为人类棋手完全无法解读这一手。是的,它看起来像是个失误。在人类数千年围棋史上,从未有人下出过这样的棋着。

Move 37 was this amazing move during the AlphaGo games against Lee Sedol. It was the thirty-seventh move of the game, and it stunned the Go world because it was uninterpretable by a human. It looked like a mistake. No one had ever played this move in the entirety of thousands of years of human history playing Go.

Speaker 0

随着棋局展开,人们发现这记关键妙手正是AlphaGo能在比赛中击败李世石的原因。是的,我们将会在这些模型中看到大量类似的行为。特别是在我们将它们应用于化学、生物学等超出人类固有认知范畴的领域时。

And it turned out as you unrolled the game that this was the critical move that allowed AlphaGo to beat Lee Sedol in that match. Yeah. And we're gonna see so much of that sort of behaviour coming out of these models. Yeah. Especially when we're applying them to things outside of native human understanding, like chemistry and biology.

Speaker 1

没错,我太喜欢这个观点了——这也是我们今天的核心议题。那么什么时候能看到首个AI研发的药物进入临床阶段,并完成一期、二期和三期试验呢?

Yeah. I love that. Also our punch line today. So when will we see our first AI generated drug in clinic and also in phase one, two and three trials?

Speaker 0

我们在药物设计项目上取得了惊人进展。我实际思考的是:当开始批量获得这些AI设计的分子资产进入临床阶段时,该如何重新构想临床开发流程?如何以最快最安全的方式让这些分子惠及患者?毕竟存在大量未满足的医疗需求。这里涉及到与监管机构合作的新模式,以及如何运用预测模型——不仅要预测分子对疾病的疗效,更要预测它与人体的相互作用和潜在毒性。随着AI模型进化,我们能以更精准的方式快速设计分子,这将带来彻底改革临床试验体系的机遇,可能完全改变现有范式。

So we're making amazing progress on our drug design programmes. And the thing I think about actually is, as we start to get a whole bunch of these AI-designed assets, as these molecules get into the clinical phase, how can we engage in that clinical development to get these molecules to people as fast and as safely as possible? Because there's so much unmet medical need. So here I think about what are going to be new ways to engage with regulatory bodies, and what are going to be new ways to incorporate our predictive models for not only how this molecule works for the disease but, as we talked about, how it interacts with the rest of the body and the types of toxicity it may induce. I think there'll be a lot of opportunities to streamline and speed up this process, maybe even completely change the way we think about human clinical trials, as our AI models let us design these molecules so much quicker, in a much more targeted manner, with so much more knowledge about how they work. So that'll change the game.

Speaker 0

但我认为整个行业要真正实现这种变革,还有很长的路要走。

But I think we've got a long way to go as an industry to really work out how that changes.

Speaker 1

是的。最后一个问题:随着Isomorphic公司乃至整个领域的成功,传统制药行业将面临怎样的变革?

Yeah. Last question. As Isomorphic succeeds and potentially as the whole field succeeds, what happens to the traditional world of pharma?

Speaker 0

我认为在某种程度上,制药行业将会运用人工智能。五年后,没有人工智能参与的药物设计将不复存在,这已是必然趋势。就像试图不用数学做科学研究一样,AI将成为生物学和化学的基础工具——在Isomorphic的世界里,这已是现实,未来所有人都将使用它。

I think, in some sense, pharma will be using AI. I think there's no world where in five years' time you will be designing a drug without AI. That is an inevitability. It'll be like trying to do science without using maths. AI will be this fundamental tool for biology and chemistry that everyone will be using; it already is, at least in Isomorphic's world.

Speaker 0

所以问题不会是非此即彼的'制药还是AI',而是整个行业都将适应这种融合,二者将合为一体。

So it's not gonna be, oh, is it pharma or is it AI? They're gonna be one and the same, in the sense that the whole industry will adapt to that.

Speaker 1

确实令人惊叹。马克斯,非常感谢你今天参与我们的节目,这次对话非常精彩。

Yeah. Amazing. Max, thank you so much for joining us today. This was a fascinating conversation.

Speaker 0

这是我的荣幸,谢谢。

It's been a pleasure. Thank you.
