
第五集 - 定义通用人工智能(AGI)与未来之路

Episode 5 - Defining AGI and the road ahead

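Episode description

How close are we to automating scientific discovery? And what do AI's competition wins actually reveal about progress toward artificial general intelligence (AGI)? OpenAI Chief Scientist Jakub Pachocki and researcher Szymon Sidor share the inside story, from the International Math Olympiad gold medal to the unexpected breakthrough in reasoning, and discuss where AI is headed next.

1:20 – From high school in Poland to leading AI research
4:50 – Decoding AGI: the technical view and the everyday understanding
6:30 – Automating scientific discovery with AI
7:50 – Breakthroughs in medicine, AI safety, and alignment
10:30 – Today's achievements were a decade in the making
14:30 – Benchmark saturation and its limits
16:50 – Why math competitions matter for AI progress
18:15 – How models reason without tools
21:45 – Recognizing problems the model cannot solve
23:30 – Story time: the AtCoder contest in Japan
26:50 – How the reasoning breakthrough really happened
28:55 – What's next for scaling and long-horizon reasoning
30:30 – Predicting what AGI will look and feel like
36:25 – Balancing trust and personal value
34:00 – Advice for high school students in 2025

Hosted by Acast. See acast.com/privacy for more information.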

Bilingual transcript


Speaker 0

大家好,我是安德鲁·梅恩,这里是OpenAI播客。今天我们邀请到的嘉宾是OpenAI的首席科学家雅各布·帕霍茨基和西蒙·西多尔。我们将探讨如何衡量AI进展、如何定义AGI,以及下一个突破可能来自何方。

Hello, I'm Andrew Mayne and this is the OpenAI Podcast. Today our guests are OpenAI's Chief Scientist, Jakub Pachocki, and Szymon Sidor. We're going to talk about measuring AI progress, how you determine AGI, and where the next breakthrough might come from.

Speaker 1

该模型能够正确识别出它在这个问题上没有取得进展。

The model was able to correctly identify that it didn't make progress on the problem.

Speaker 2

我们开始非常严肃地思考这个问题:作为一个组织,我们是否准备好迎接极其快速的发展步伐?

We started asking very, very seriously the question like, are we ready as an organization for incredibly fast paced progress?

Speaker 1

当我们规划OpenAI的研究方向时,我们的目标是创造具有高度通用性的人工智能。

When we think about how we shape our research program at OpenAI, we seek to create intelligence that is very general.

Speaker 0

我想先了解一下你们的职责。雅各布,你是OpenAI的首席研究员还是首席科学家?

I want to first start off by understanding your roles. So Jakub, you're the chief researcher or chief scientist at OpenAI?

Speaker 1

是首席科学家。

Chief scientist, yes.

Speaker 0

好的。首席科学家具体负责什么?

Okay. What does chief scientist mean?

Speaker 1

我主要负责制定公司的研究路线图,决定我们将押注哪些技术路径,以及开展哪些长期基础研究。

So the primary thing I'm responsible for is setting the research roadmap for the company: deciding what technical path we are going to bet on and what underlying long-term research we are going to pursue.

Speaker 0

那么西蒙你呢?你负责什么工作?

So how about you, Szymon? What do you do?

Speaker 2

随机事务。随机事务?好吧。我主要做独立贡献者(IC)的工作,偶尔也会参与一些领导事务。我努力去做最有价值的事情。

Random things. Random things? Okay. Yeah, I mostly do IC work, with maybe a sprinkle of leadership somewhere in there. I try to do whatever is most useful.

Speaker 0

你们俩在加入OpenAI之前就认识,对吧?

Now you two knew each other before working at OpenAI, right?

Speaker 2

是的,我们上的是同一所高中。

Yeah, we went to the same high school.

Speaker 0

同一所高中?嗯。你们当时是朋友吗?

Same high school? Yeah. Were you guys friends?

Speaker 2

我觉得我们是毕业后才成为挚友的。来美国的这段情感经历让我们建立了深厚纽带。高中时我们更像同事关系。

I think we became best friends after we left. I think coming to the US is the kind of emotional experience that forms bonds. In high school, we were more like colleagues.

Speaker 0

什么样的高中能培养出你们这样的人才?

What kind of high school produces guys like you?

Speaker 1

我们在波兰的格丁尼亚上的高中。吸引我们的是那里的计算机科学老师 Ryszard Szubartowski 先生,他在我们入学前就培养出许多计算机科学家和程序员,特别注重编程竞赛和追求专业领域的卓越。那段经历让我们受益匪浅,他是位伟大的导师。

Well, yeah, we went to this high school in Gdynia in Poland. I think we were both drawn there by this computer science teacher, Mr. Ryszard Szubartowski, who had a great track record, before we went there, of bringing up computer scientists and programmers, with a big focus on programming competitions and pursuing excellence in this one field. So I think that was a very formative experience, and he was a great mentor for us.

Speaker 2

确实如此。凯瑟当时在编程领域钻研得很深,课程远超普通高中水平,涉及图论、矩阵等复杂内容。现在有了ChatGPT,人们或许能更轻松进行深度学习——毕竟没有良师和大量练习,很难复制那种成长体验。

Oh, wow. Yeah, definitely. I think Kether was really going deep on programming. It went way beyond the typical high school curriculum: there was graph theory, matrices, and all sorts of stuff like that. I actually hope that maybe with ChatGPT it's a little bit easier for people now to do this kind of deep dive, because without the right mentor and without a lot of work, it's kind of hard to replicate that experience.

Speaker 0

我最近用它解释蒙提霍尔问题(三门问题),只要输入'制作交互式图表',它就能动态展示不同选择的结果。这让我特别兴奋——AI不仅能文字解释,还能构建多媒体演示。但这也引出一个问题:我们缺乏衡量这种能力的标准。关于AGI的定义也很模糊,我很想听听你们从技术角度和普通人视角如何描述它。

I've been using it to explain things. Like, there's the Monty Hall problem, where you have to choose which door. You go into ChatGPT and you say, make a graphic, interactive version of this, and all of a sudden you can see it; it can show you the different outcomes if you do one thing or the other. I think it's one of these things where I'm excited about the ability to not just explain in text but to build multimedia. And it does get into the area where there's not really a measure for that. You know, it's a use case that didn't exist before. And we talk about AGI, but we kind of have very loose definitions, and I'd love to hear how you would describe it, both from a technical and also from a layperson's understanding of it.
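Editorial aside: the Monty Hall result mentioned here can be checked with a short simulation. This is a minimal sketch, not something shown in the episode. The names `play_round` and `win_rate` are hypothetical, and it assumes the standard rules: three doors, and the host always opens a door that hides no car and was not picked by the player.

```python
import random


def play_round(switch: bool, rng: random.Random) -> bool:
    """Play one Monty Hall round; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    host_opens = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Switch to the single door that is still closed and was not picked.
        final_pick = next(d for d in doors if d != first_pick and d != host_opens)
    else:
        final_pick = first_pick
    return final_pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the probability of winning for the given strategy."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(f"stay:   {win_rate(switch=False):.3f}")
    print(f"switch: {win_rate(switch=True):.3f}")
```

Under these assumptions, staying should win roughly one time in three and switching roughly two times in three.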

Speaker 1

是的,或许可以谈谈教学这一点,比如对某些概念的更好解释或采用苏格拉底式教学法,确实,ChatGPT在这方面大有可为,我认为它特别适合与像我们Szubartowski先生这样的教师配合使用。但与此同时,我认为他所能提供的更多是一种情感支持和空间。嗯。我觉得这是AI难以单独完成的。

Yeah, well, maybe to address the point about teaching: this sort of better explanation of some concept, or teaching through Socratic methods, is definitely a powerful use of ChatGPT, and I think it works well with a teacher like our Mr. Szubartowski. But at the same time, I think the thing that he was able to provide was more kind of emotional support and space, which I think will be hard for AI to do alone.

Speaker 0

这个观点很棒。我觉得这一点经常被忽视,因为人们总在说AI会取代教育。对我而言,有些老师可能讲的知识点未必完全准确,但他们用心良苦,关心学生并解答问题。所以你说的很对,这些工具是教育的辅助者。我认为教师运用这些工具可以变得更有能力。

That's a great point. I think that gets lost a lot, because sometimes you hear people talk about, oh, AI will replace education. And for me, I had teachers whose facts maybe weren't always right, but their heart was there, they were caring, and they answered questions. So I think that's a good point, that these are companions to that. And I think that a teacher using these tools can be an even more capable teacher.

Speaker 1

是啊。

Yeah.

Speaker 0

不过关于通用人工智能(AGI),我想先听听你的技术定义。或者其实不用技术定义,就像你向年幼的弟弟妹妹解释那样描述它。

But also on the subject of AGI, though, I want to hear first, like, give me your technical definition of it. Or actually not a technical definition. Give me like how you would describe it to like if you were talking to, you know, a younger sibling.

Speaker 1

几年前我们讨论AGI时,深度学习技术虽然前景广阔,但整个概念仍显得抽象而遥远。无论是谈及人类水平智能、自然对话能力,还是解决数学问题或开展研究,这些目标似乎都处于同一维度。但随着技术进步,现在我们发现这些其实是截然不同的能力。显然,AI已能在广泛话题上自然交流了。

A few years ago, when we would talk about AGI, it felt like the technology, deep learning, had incredible promise. But at the same time, the concept still felt a little bit abstract, right, and far away. And so whether you talked about human-level intelligence, the ability to converse naturally, or the ability to solve math problems or pursue research, they all kind of felt like they were in the same space. As the technology has progressed, we now see that these are actually quite distinct capabilities. And I think we pretty clearly are at the point where AI is able to converse naturally on a wide range of topics.

Speaker 1

它能解决数学问题。获得国际数学奥林匹克(IMO)金牌曾被我们视为AGI发展的重要里程碑,而这已经实现。我认为解决国际数学奥林匹克的所有题目其实更难一些,这是通往AGI的另一个里程碑。但越来越明显的是,这类单点衡量标准已不够充分。

It is able to solve math problems. I think getting a gold medal at the IMO is something we've long discussed as a milestone on the path to AGI, and that happened. I think solving all the problems on the International Math Olympiad is actually a little bit harder, and I think it's another milestone on the path there. But increasingly, we see that these kinds of pointwise measures are less adequate.

Speaker 1

因此我们开始思考:它究竟会对世界产生什么实际影响?

And so we turn to thinking about what its actual impact in the world is.

Speaker 1

对于我个人而言,当思考AI进步如何真正影响世界时,我首先想到的是它在自动化新技术发现与生产方面的潜力。我们常将新创意和根本性技术进步归功于人类智慧,并用重大发明和技术革命来衡量进步。但人们很难真正意识到,这个过程大部分是可以被自动化的,计算机确实可能产生彻底改变我们世界认知的创意。

For me, personally, the thing that I think about when I think about how AI progress really impacts the world meaningfully is, first, its potential for automating the discovery and production of new technology. I think we tend to associate new ideas and fundamental technological progress with just human ingenuity, and we measure our progress by these major milestone inventions and technological revolutions. And I think it is just hard to internalize that it is possible to automate most of this process. It is possible to have a computer that is coming up with ideas that fundamentally change our understanding of the world.

Speaker 1

实际上我认为这并不遥远。所以我现在思考的是:我们与这个目标之间还存在哪些障碍?以及这项技术会带来什么后果?

And I actually think that is not that far away. And so thinking about that, what separates us from that and what are the consequences of such technology is my first thought.

Speaker 0

我刚订购了一台小型Mac Studio,因为我想运行开源模型gpt-oss,让它不间断地运作。让它全天候生成内容、处理事务,这个想法本身就让我着迷。但你谈论的是大规模自动化科学的层面。那么,我们预期最先可能从中看到哪些类型的发现或成果呢?

I just ordered a little Mac Studio because I want to take the open-source model, gpt-oss, and just let it run nonstop. Because just the idea of letting it generate and do stuff 24/7 is fascinating to me. But you're talking about basically automating science at a huge scale. So what kind of discoveries, what kind of things, do we think might be the first we could see from that?

Speaker 1

在规划OpenAI的研究方向时,我们致力于创造通用性极强的智能。虽然自动化研究者是我们的优先目标,但我们并非局限于特定领域来部署技术。这种方式或许能加速局部进展,但真正重大发现和最具意义的技术突破往往源于这种通用性。当然,某些领域——尤其是需要结合大量推理与领域知识直觉的——确实更容易应用这些系统。

When we think about how we shape our research program at OpenAI, we seek to create intelligence that's very general. We drive towards this automated researcher as a priority, but we don't really think of it as, let's take these specific domains and deploy this technology there. I think that is a way to make faster pointwise progress, but I think the potential for the really big discoveries and the most meaningful technological advancement comes from this generality. Still, I think we see that the technology is easier to apply in some domains than others. Places that combine a large amount of reasoning with a lot of domain knowledge and intuition seem especially amenable to these systems.

Speaker 1

特别值得一提的是,我们在医学领域看到了令人振奋的惊人成果。嗯...我对此抱有很高期望。作为AI研究公司,我们自然经常思考如何自动化自身工作。

I think, in particular, we see, like, pretty incredible results on medicine, which is very encouraging. Mhmm. I have high hopes about that. Yeah. I think, naturally, being a company of AI researchers, we think a lot about automating our own work.

Speaker 1

这某种程度上也是...如果AI真能实现自动化AI研究,那将是至关重要的突破。同样地,我们也在探索如何用自动化促进AI对齐与安全研究。

I think it is also, you know, if AI can indeed reach a point where you can automate AI research, then that is probably a very important thing to automate. And similarly, we think about how we can help with automating research on AI alignment and safety.

Speaker 2

IMO的AI成果确实令我震撼。几年前我们讨论Jakob的IMO时,连AGI的定义都尚未明确。当时我们考虑过将解决所有国际数学奥林匹克问题作为标准——因为拥有超强数学推理能力的模型,理应能颠覆所有可数学建模的领域。

I'm obviously impressed by the IMO AI results. I was actually about to add that in the past, when we were talking about the IMO with Jakub, that was a few years ago, we were still trying to even figure out what our definition of AGI might be. One concept we were considering was something like solving all the problems on the Math Olympiad. And why did that feel appropriate? It's just, okay, if you have a model with such superior mathematical reasoning, then it should be able to disrupt a bunch of different domains that can be mathematically modeled.

Speaker 2

借这个播客分享些内幕视角:我对进展速度感到震惊。当看到'AI经济影响仅3%-5%'的标题时——这些报道常附带'AI降温''过度炒作'的论调——我就想起十年前用深度学习做自然语言处理的困境。当时Jakob测试我们研发的语句情感检测技术...

I mean, in general, I think maybe this podcast is just a good opportunity to share a little bit more of an inside perspective: I was astounded by the progress. Sometimes I see those headlines where people say, oh, the economic impact of AI is only 3% or 5%. Those headlines are often accompanied by comments like, well, so AI is slowing down, or people are overhyping AI so much and it's only 3%, so what's up with that? When I see headlines like this, I remember that maybe ten years ago I was working on natural language processing with deep learning, and back then it just didn't really work. I remember Jakub once came to test one of the technologies we were working on, which was trying to detect the sentiment of sentences.

Speaker 2

他说'这电影很烂',系统正确判定为负面;'这电影很好',正确判定为正面。但当他输入'这电影不烂'时,模型却判定为负面。那是十年前的水平。

And he was saying, this movie is bad, correctly classified as negative. This movie is good, correctly classified as positive. And then he would say, this movie is not bad. And the model is like, oh, negative. So that was ten years ago.
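Editorial aside: the failure described here is typical of models that score words independently of context. The sketch below is a hypothetical toy, not the system discussed in the episode. The `POLARITY` lexicon and the function `bag_of_words_sentiment` are assumptions made for illustration. Because the word "not" carries no weight and word order is ignored, "not bad" is scored exactly like "bad".

```python
# Hypothetical word-level polarity lexicon (an assumption for this sketch).
POLARITY = {"good": 1, "great": 1, "bad": -1, "terrible": -1}


def bag_of_words_sentiment(sentence: str) -> str:
    """Classify a sentence by summing per-word polarity, ignoring word order."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    score = sum(POLARITY.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["This movie is bad", "This movie is good", "This movie is not bad"]:
        print(f"{text!r} -> {bag_of_words_sentiment(text)}")
    # 'This movie is bad' -> negative
    # 'This movie is good' -> positive
    # 'This movie is not bad' -> negative  (negation is ignored)
```

Handling negation requires a model that takes context into account, which is the gap the speaker describes later systems closing.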

Speaker 2

此后我们逐步攻克了这类任务:判断词性是名词还是动词(情感神经元)、GPT-1/2生成连贯段落(当时是重大突破)、GPT-3/4演进。GPT-4让我首次感受到AGI时刻——它偶尔会给出令我惊讶的回答。

And since then, we slowly started solving tasks like this, tasks like, is this word a noun or a verb? That was the sentiment neuron. Then we had GPT-1 and GPT-2, which started producing, like, a paragraph of text that made sense, right? That was such a breakthrough. Right now it feels so simple, but back then it was such a breakthrough.

Speaker 2

虽然初期ChatGPT对我而言只是更精致的谷歌,但深度研究版能透彻解答问题且较少虚构内容,这很实用。如今模型已能参与编程竞赛——这对我和Jakob而言都是艰辛的里程碑。

Then we had GPT-3, GPT-4. GPT-4 was, to me, my personal AGI moment, because it would sometimes say things that surprised me, and I was like, can this model actually surprise me? Still, back then ChatGPT for my personal use felt a little bit more like a nuanced and maybe slightly better Google, so what was the big deal? And then suddenly we get to deep research, and this can actually answer questions thoroughly and rarely makes things up; that felt useful. And then finally, now we have models that can compete in programming competitions, which was very hard earned for me personally, and even more so for Jakub, obviously.

Speaker 2

从技术研发者视角看,进展速度惊人。所谓3%的影响?十年前可能只有0.00001%。这些数字需要放在时间维度理解——完全有理由相信明年会达到10%,后年20%,以此类推。

The pace of progress from the perspective of somebody working on this technology is absolutely amazing. So when you see that 3%, I raise you: ten years ago, if we had to quantify it, it would probably have been 0.00001% or something. So really, I think those numbers need to be put in perspective, and there is no reason not to believe that in a year it will be 10%, in two years it will be 20%, and so on and so forth.

Speaker 0

是的,我听过一种说法:如果你观察从90年代初互联网诞生至今的经济走势图,试图指出互联网对经济产生影响的转折点,你根本找不到那个节点。没有哪个时刻会让你恍然大悟‘啊,蒂姆·伯纳斯-李宣布了这个技术’。我认为AI也是如此,人们总说我们只能衡量某些方面,测量本身就很困难——既难以统计使用者规模,也说不清具体使用方式。你提到一个很好的观点:长期关注AI发展的人会记得,我曾在个人电脑上训练过一个简单的下一个字符预测模型,效果非常糟糕。一方面是因为电脑性能有限,但即便后来用Bert做情感分析时也只是略有改善。直到GPT-2问世,我逐行研读了GitHub上的所有输出,当时就确信这项技术蕴含着某种突破——正是这种痴迷最终让我加入了OpenAI。接触GPT-3后我更确信:这就是未来的发展方向。

Yeah, I've heard it said that if you looked at a graph of the economy from, let's say, the World Wide Web in the early '90s forward, and you said, point to the internet happening to the economy, you can't find the point. There's no point where you go, oh, okay, Tim Berners-Lee announced this. And I think AI is a lot like that, where people go, oh, we've only measured this. Our measures are hard: it's hard to know who's using it and how they're using it. And you brought up a very good point too about having followed it for a while. I remember training a very simple next-character predictor on my computer, and it was terrible, right? For one, I was using a small computer. And then you've got sentiment analysis, you're playing with BERT, and it gets a little bit better. But then GPT-2 comes out, and I read every single output on GitHub, every single output GPT-2 put out, because I'm like, there is something going on with this. That's how I ended up working at OpenAI, because I was this obsessive person about that. And then with access to GPT-3 I kept saying, oh, this really is the path that's moving forward.

Speaker 0

但现在的情况有点疯狂——如果六周内没有新基准被突破,人们就开始嚷嚷‘我们遇到瓶颈了’。部分问题在于,基准测试的改进往往很有限。我听说某些测试本身存在缺陷,有些甚至包含错误答案,导致根本不可能达到100%准确率。我们在内部讨论时也提到过这个概念,有人称之为‘饱和’现象。

But it's kind of crazy now, because if six weeks go by and a benchmark hasn't been broken, people are like, oh, we hit the wall, we hit the wall. And I would say part of the problem, though, is that on some benchmarks you'll only see modest improvement. I've heard some of the benchmarks have problems, and some of them actually have wrong answers, so it's impossible to get 100% even if you answer everything correctly. But there's also a term we talk about internally: I've heard people talk about this as saturation.
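Editorial aside: the point about wrong answer keys is simple arithmetic. If a fraction of a benchmark's reference answers is mislabeled, a model that answers every question correctly is marked wrong on exactly those items. The sketch below uses made-up data, and the names `score`, `truth`, and `answer_key` are hypothetical.

```python
def score(predictions: list[str], answer_key: list[str]) -> float:
    """Fraction of predictions that match the benchmark's answer key."""
    matches = sum(p == k for p, k in zip(predictions, answer_key))
    return matches / len(answer_key)


# Ten multiple-choice questions with their true answers (made-up data).
truth = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B"]

# The published answer key contains one mislabeled item (index 3).
answer_key = list(truth)
answer_key[3] = "A"

# A model that answers every question correctly still cannot reach 100%.
perfect_model = list(truth)

if __name__ == "__main__":
    print(score(perfect_model, answer_key))  # 0.9
```

With one bad label in ten, the best achievable score is 90%, so small gains near the top of such a benchmark say little about the model.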

Speaker 0

Per,你想聊聊这个吗?

Do you want to talk about that, Per?

Speaker 1

是的,我认为当前基准测试面临几个问题。最明显的就是饱和现象——模型确实已经达到了人类水平,特别是在标准化智力测试方面。当模型能在国际中学生竞赛中跻身顶尖选手之列时,这种受限的测量方式就显得力不从心了。回想GPT-1到GPT-4的发展阶段,基准测试本质上只是在测量整体能力的提升。

Yeah, I think there are a few issues that we're hitting with benchmarks right now. A pretty clear one is saturation, and that is just the models genuinely reaching a point where, for a lot of the standardized forms of measuring intelligence or ability, they are at human level. If you're able to perform among the top in these high school competitions, where we have the best competitors from around the world, it just becomes quite hard to have this very constrained measurement. Previously, when we were looking at just the GPT-1, GPT-2, GPT-3, GPT-4 scaling paradigm, the benchmarks were really just measuring the rising of the tide.

Speaker 1

如今这个领域已经发展出更高效的数据训练方法,可以针对性培养特定能力。比如训练出数学能力远超写作能力的模型,这会导致数学基准分数虚高,却无法反映其整体智能水平。这两个问题叠加之下,我们确实需要重新思考模型的可靠性,特别是它们发现新见解的能力。

I think now the field has developed a lot of more data efficient ways to train for specific abilities, right? It doesn't mean train on these benchmarks, but you can train models that are disproportionately good at math compared to their ability to write, for example. So they will do better on math benchmarks, but it's no longer as representative of their overall intelligence in other topics. I think these two issues combined, yeah, I think we really have to think about the reliability of these models and especially their ability to discover new insights.

Speaker 0

有件事经常被忽视:你可以造出应试能力超强的模型,但这个模型未必实用。理想情况是模型既能考高分又具备实用价值,但高分本身不等于好用。当前面临的挑战在于,当人们评价模型好坏时,就像试图用统一标准衡量100种不同用途。某个模型可能擅长创意写作却数学糟糕,反之亦然——这形成了巨大挑战。

Yeah, I guess that's a thing that kind of gets overlooked: you can build a model that's a really good test taker, but that model may not really be that useful for work. Ideally your model should score well on tests, but just because a model got these scores doesn't mean you're going to find it personally useful. And I certainly think that's a challenge right now: when people ask whether a model is good or bad, they're trying to create a blanket assessment when there are a hundred different use cases for it. Is a model good or bad? Maybe it's great at creative writing and bad at math, or maybe it's great at math and bad at creative writing, and that becomes a really big challenge.

Speaker 0

我们之前讨论过国际数学奥赛这类指标。为什么它们重要?为什么要把AI置于人类级竞赛中评估?

And we've talked about this with one for math, the International Math Olympiad, and these kinds of metrics. Why are they important? Why is it important to put it into these sort of human level competitions?

Speaker 1

我们对国际数学/信息学奥赛感兴趣,是因为这类竞赛题目受限、不依赖庞杂知识,真正考验持续思考能力。大量参赛者的实际表现证明这些题目确实具有挑战性。对于过去那些知识渊博但缺乏深度思考的模型而言,突破这类竞赛正是理想的发展里程碑。

I think the reason we've been excited about competitions like the International Math Olympiad and the Informatics Olympiad is that they are a pretty interesting example of a test that is constrained and doesn't require that much knowledge, but really tests your ability to think hard about a problem for an hour or two or three. And we have very good evidence that these problems are hard: there are a lot of people who try to solve them and compete at solving them, and it matters to them. So for models that in the past excelled at knowing a lot of things but not necessarily at thinking very hard, that really seemed like the right milestone to be working towards.

Speaker 0

据我所知,获得金牌水平的模型没有使用计算器或其他工具框架,纯粹依靠推理能力完成。

Now, as I understand it, the model that scored gold medal level on that wasn't using like a calculator, it wasn't using other tools, it wasn't using some of the frameworks, it was doing it purely through reasoning.

Speaker 1

没错,参加国际数学奥赛的模型确实没有借助任何外部工具。

Yeah, that's right. For International Math Olympiad, the model was not using other tools.

Speaker 0

对啊。而且那时候,大概两年前吧,你让它计算两个四位数的乘法,它都会出错。

Like yeah. And again and that was, like, two years ago, you asked it to multiply two four digit numbers. It would fail.

Speaker 1

确实。不过这种比赛本质上,虽然限定在数学领域,但真正考验的是创造性思维,而不是套用公式。

Yeah. But definitely, like, you know, for this kind of contest, it's really, like it is, of course, like in a limited domain of math, but it really is about, like, fairly creative thinking, not about applying a formula.

Speaker 0

但我觉得挑战在于,一旦超出数学范畴,难度就陡增。比如可以设计出像 Humanity's Last Exam(人类最后的考试)这样的测试,我觉得这是个很巧妙的考核。不过你会发现某些模型掌握工具用法后,解题能力会突飞猛进。我在想我们需要什么样的新基准?究竟要看哪些指标才能客观评估这种能力?

I guess that's part of the challenge, though: once you start moving outside of math, it gets harder. You can start to come up with things like Humanity's Last Exam, which I think is a pretty neat test. But you find that certain models, after they learn a certain kind of tool use, figure out how to solve these problems better. And I wonder what kind of benchmarks we are going to need. What are you looking at to say, okay, this is how I can get an objective measure of a capability?

Speaker 2

有件事让我很意外:有次和同事Anna Makanju聊起IMO的进展,她反问我'IMO是什么'。这让我意识到我们可能活在信息茧房里。对我而言,这类竞赛特别是计算机方向的IOI意义重大,但对其他领域工作者——比如历史爱好者——可能就完全不同了。

One thing that surprised me in the past: I was talking to one of our coworkers here, Anna Makanju, and I was telling her about the IMO. I was excited about some progress, and she's like, what's IMO? And that was important for me, because I do realize that with some of those benchmarks we kind of live in a bubble a little bit. For me that competition feels important, especially the computer science counterpart, the IOI, because it was a big part of my life. That's true for many coworkers here. But actually, for an average person working in other fields, or maybe not as interested in mathematics or computer science, maybe they're interested in history or something that the

Speaker 0

Anna会讲五种语言呢。我觉得针对她这类人群设计不同的评估标准会很有意思。

Yeah, Anna speaks like five languages too. I could see a different metric based on that being interesting for her.

Speaker 2

没错。虽然不完美,但ChatGPT用户基数至少能帮我们打破信息茧房——毕竟全民都在用,应用场景包罗万象。当然这标准有缺陷,但至少避免了因个人偏好导致的评估偏差。

Yeah. So I think one thing that, it's not a perfect metric, but it at least helps keep us honest and helps us escape the bubble, is just ChatGPT users, right? Because everybody uses ChatGPT, and they use it for all sorts of use cases. And obviously, there are a lot of pitfalls to using that as a metric, but at least it avoids that particular problem where there are just some things that I'm more familiar with and other people might appreciate other things. This gives you very wide coverage.

Speaker 0

而且用户里还有开发GPTs的进阶群体。你之前提到模型延长推理时间的特点,这似乎也是个很有潜力的评估维度?

Yeah, and in there too you have subsets of users, people who are building GPTs and doing more complicated stuff. You mentioned before too, the fact that the model will reason longer, and that seems like a very interesting way to evaluate capabilities.

Speaker 1

对。我们正把AI普及度作为进步指标——虽然目前影响有限,但爆发在即。未来我们将能调用远超个人用户承受的算力,创造出普惠性技术产品,这对我而言才是真正的里程碑。

Yeah, yeah. And I think this is also maybe one challenge with focusing on the usage of ChatGPT and broad adoption of AI as the metric of progress. I think this hasn't really happened to a very meaningful extent yet, but I think it will start happening pretty soon: we should be able to use vastly more compute than a user would normally be willing to buy for themselves to produce technology artifacts that are useful to a lot of people. And I think that, for me, will be a very important measure of progress.

Speaker 0

这些突破里哪个最让你意外?

Which of these wins were the most surprising to you?

Speaker 1

其实推理模型初见成效时我们就预料到这一天。不过IMO确实来得比预期早——尤其是第六题,那种突破常规的思维方式,通常都是压轴难题。

I think we definitely anticipated getting to this point when we saw the reasoning models starting to work. At the same time, I think this recent set of things is very impressive. Maybe out of those, I think the IMO came a little bit sooner than I expected. Again, all the IMO problems require creative thought and some new insights, I think. But typically, there's this problem six that requires very out-of-the-box thinking.

Speaker 1

而且这通常确实超出了其他问题的典型范畴。过去我们实际上是在划定一个界限,一边是获得金牌、解决那些问题,一边是考虑解决所有问题,尤其是第六题。所以某种程度上看到我们自己还有谷歌DeepMind同时说‘啊,我们完美解决了第一到第五题,但第六题一点进展都没有’还挺滑稽的。我觉得这正好清晰展现了那个挑战的难度。

And it's usually really outside the typical domains of the other problems. So in the past, we were actually drawing a boundary between getting a gold by solving these other problems and actually solving all the problems, in particular problem six. So it was pretty hilarious in some way to see ourselves and also Google DeepMind say at the same time, oh yeah, we solved problems one to five perfectly and we didn't make any progress on problem six. I think that makes that challenge pretty clear.

Speaker 0

是啊,有趣的是,OpenAI的模型直接表示‘我觉得我解不了这个甚至没尝试’或者说它遇到困难了。这个判断准确吗?

Yeah, what's interesting is that I think the OpenAI model said something like, yeah, I don't think I can solve this, and didn't even try, or said that it had a problem with it. Is that correct?

Speaker 1

没错,模型能正确识别自己在那个问题上没有进展。

Yeah, the model was able to correctly identify that it didn't make progress on the problem.

Speaker 0

这个能力想想挺神奇的,因为关于幻觉的讨论很多——我觉得这是个被误解的概念——流体思维与晶体思维有区别,前者关乎模型的知识储备,后者是解决问题的能力。当它能主动说‘我觉得我回答不了这个’时,就达到了一个非常有意思的节点。对了,有人让我问个关于日本直播的问题。

That's pretty fascinating to think about that, that the model is able to sort of determine that because there's a lot of conversations about, people talk about hallucination which I think is a kind of a poorly understood thing and there's a difference between fluid and crystalline thinking and one is how much knowledge a model has and the other is its problem solving capability. And when you get to the point where it's able to do that, it's able to say, hey, I think I won't be able to answer this, that's a pretty interesting sort of point to get to. I've been told to ask this question about a live stream in Japan.

Speaker 1

哦,其实过去几周我们的模型在三个竞赛中表现惊人。刚聊了IOI和IMO两个,还有个面向全民(不仅是高中生)的AtCoder大赛——这是日本举办的世界级高水平赛事。那次比赛侧重长周期启发式问题,选手只需攻克单一赛题。

Oh, so I think in the past few weeks, actually, our models have performed incredibly well in three competitions. We talked about two of them, the IOI and the IMO. There is also this competition that is open to everyone, not just high schoolers, called AtCoder. It's a very prestigious, very high-quality competition organized in Japan but open to competitors worldwide. And this particular contest was about longer-horizon, heuristic problems, where you're given only a single problem.

Speaker 1

你有十小时来解题,参赛者要竞相找出这个复杂优化问题的最佳方案。这很不同,因为没有标准答案或固定套路,任务极其多样化,你可以十小时专注一道题。我们让模型参加了这个比赛。

You have ten hours to solve it, and so you have competitors racing to figure out the best approach to this difficult optimization problem. So it's a bit different because there isn't a single correct solution. There isn't a single pattern to follow. These tasks are extremely diverse, and you can focus on a single task for ten hours. And so we entered our model into this contest.
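Editorial aside: in heuristic contests of this kind there is no single correct answer, so a common approach is to improve a candidate solution repeatedly until the time budget runs out. The sketch below is a generic illustration and not the method used by any contestant or model mentioned here. It applies time-limited hill climbing with segment reversal to a toy shortest-tour problem. The names `tour_length` and `hill_climb` and the time budget are assumptions.

```python
import math
import random
import time


def tour_length(points, order):
    """Total length of the closed tour that visits points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def hill_climb(points, time_budget_s=0.2, seed=0):
    """Improve a tour by random segment reversals until the time budget ends."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    best = tour_length(points, order)
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        i, j = sorted(rng.sample(range(len(points)), 2))
        # Reverse the segment between positions i and j (a 2-opt style move).
        candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
        length = tour_length(points, candidate)
        if length < best:  # Keep the move only if it shortens the tour.
            order, best = candidate, length
    return order, best


if __name__ == "__main__":
    rng = random.Random(42)
    points = [(rng.random(), rng.random()) for _ in range(50)]
    start = tour_length(points, list(range(len(points))))
    _, best = hill_climb(points, time_budget_s=0.5)
    print(f"initial tour: {start:.2f}, after hill climbing: {best:.2f}")
```

A longer budget usually gives a better tour, which is why a ten-hour contest rewards sustained, focused iteration rather than one clever insight.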

Speaker 1

对我个人而言有点特殊意义:我以前是IOI这类短时封闭式竞赛的活跃选手,而我朋友Psyho(当时也在OpenAI)擅长这种马拉松式比赛。共事时他总调侃说我的比赛类型会先被自动化取代,因为他的赛制更需要持久专注。结果这次日本赛中,Psyho本人就是顶尖选手之一。

And, you know, to me, this had a little bit of personal significance. I used to be a very engaged competitor in the more short-form, closed-form contests like the IOI. And my friend Psyho, who also worked at OpenAI at the time, excelled at these long-duration contests. When we worked together, he would mock me a little bit that my sort of contest would be automated long before his, because his are longer in duration and require more focused work. And it turns out that in this contest in Japan, Psyho was actually one of the top contenders.

Speaker 1

我全程看着直播里我们的模型和Psyho角逐。最终模型拿了第二,冠军是Psyho。全靠他一人,他的预言才没有落空。

And so I was watching this live stream, watching our model race with Psyho throughout the competition. In the end, our model actually got second place and Psyho won. So, you know, he alone stood in the way of his prediction not coming true.

Speaker 0

OpenAI还是赢了两场嘛。

Still two wins for OpenAI.

Speaker 2

我印象很深的是赛后精疲力尽的Psyho。比赛中途我采访他时,虽然不能直接引用原话,但大意是:‘你们的模型太差劲了!我想睡觉!累死了!’

I think one thing that also stood out to me is that Psyho, at the end of the competition, was really tired. I interviewed him a little bit in the middle of the competition to talk about his experience. And I don't think I can quote him directly on this podcast, but he was like, your models are very, very bad. I want to go to sleep. I am tired.

Speaker 0

是的。我们之前讨论过类似‘墙’的话题,当时就觉得很有意思,因为这种推理能力似乎凭空出现。虽然之前有些论文和线索暗示过,但人们并未真正串联起来。突然某个模型问世,带来了全新理念——不仅能回答问题,还能让模型进行内心独白、自我对话来解决问题。你认为这足以实现通用人工智能(AGI)吗?还是需要其他突破?或者你认为还会有哪些突破?

Yeah. We have heard talks about like the wall, we mentioned that before and I think it was interesting because reasoning kind of came out of nowhere. I mean there were hints of stuff, some papers and stuff but people really hadn't kind of drawn the line and all of a sudden the one model comes out and the whole idea that you can not just have a model give answers, you can let the model kind of have an inner monologue talk to itself and solve things through. Do we think that's enough to take us to AGI or are there other breakthroughs needed or are there just other breakthroughs you think are going to happen?

Speaker 2

我必须指出团队为这件事付出了极大努力。表面看只是延长思维链这么简单,但实际实现过程异常艰辛。这让我想起你之前的问题:当我们首次发现模型有效运转,或首次意识到通过提供更多数据能让它们持续进步时,那种震撼感。那是个让我们严肃思考‘组织是否准备好迎接如此迅猛发展’的关键时刻。记得某个晚上11点,我们和Sam、Mira连线测试,那些结果让我们有点被吓到了。

I just need to point out that the team here worked extremely hard on this particular thing. It feels like something simple, like you just need a longer chain of thought, but actually making it work was really hard earned. Going back to your previous question about what the surprising result was: when we first noticed that it was working, when we first noticed that we could train those models and give them more data and they got better, that was, I think, one of the most shocking moments, the moment where we started asking very, very seriously the question: are we ready as an organization for incredibly fast-paced progress? I remember there was one particular evening, I think 11 PM, when we were on the line with Sam and Mira, just kind of trying things. I think we got a little bit freaked out by those results.

Speaker 2

这种情况时有发生。

Sometimes that happens.

Speaker 0

发展速度确实惊人。就像我说的,人们六周看不到进展就以为停滞了,但纵观全年变化巨大。这很合理——内部研发可能持续数年,论文成果也非一蹴而就,背后是大量积累。

The pace is fast. I mean, it is a fast thing, and like I said, the joke is that if nothing happens for six weeks, people think it's slowed down. But if you look year over year, it is fast. It's a fair point, because you have things you're aware of internally when you work on something for a couple of years, and there's a research paper, but it's not like it came out last night. A lot of work went into it.

Speaker 0

我认为世界震惊的是发现了一种根本性的新方法,能基于现有架构大幅提升模型能力。你觉得下一个突破会出现在哪个领域?

I'd say the world was sort of surprised by the fact that there is this really fundamental new way to make these models do even more, to take the existing infrastructure, so to speak, and get a lot more capability out of it. Where do you think the next breakthroughs are going to happen?

Speaker 1

我们始终提醒自己不要低估规模扩展的重要性。即便观察这些新模型,预训练的规模化范式依然有效。这些因素会产生复合效应。同时我们也在探索新方向,比如延长模型的规划与推理时间跨度。

I think one thing we always try not to underestimate is the importance of scaling. Even as we look at these reasoning models, it's not like the previous scaling paradigm of pretraining has vanished. I think we will see these things compound. And I think there are also new directions that we can move in. In particular, we were talking about extending the horizon that these models can plan for and reason in.

Speaker 1

从计算消耗角度看,GPT-4每次回答的算力需求,到GPT-5 Pro可能增长10-20倍——虽然显著但不算惊人。关键在于:对于真正重要的领域,比如医学研究或下一代模型开发,人们愿意投入的算力规模根本不可同日而语。

And I think if you look at it from the perspective of just compute spend, we say, okay, GPT-4 was spending some amount of compute for every answer, and we went from that to GPT-5 Pro, which maybe uses, I don't know, 10x, 20x, some nontrivial but in some ways not that impressive amount of compute more.

Speaker 1

因此模型持久性和长期专注解决问题的能力,显然是下一个关键突破点。

Right? And it can produce much better answers. But think on the scale of what amount of compute you would be willing to spend on a problem that actually matters to a lot of people, right, progress on a medical research question, progress on developing the next generation of models. These are incomparably larger amounts. And so I think that question of model persistence and the ability to work for a very long time on a focused problem is a pretty clear next step.

Speaker 0

如果向普通ChatGPT用户描述AGI的实用价值,未来五年(其实转瞬即逝,想想五年前GPT-3刚问世)他们的使用体验会怎样?AGI级模型能达到什么水平?

How would you put the practical implications of AGI if we were talking to a typical ChatGPT user? What would their experience be like a few years from now, or five years from now, which sounds far away but really isn't, because five years ago GPT-3 came out and that feels like a blur. What would an AGI-like model be capable of?

Speaker 1

以自动化研究为例,我设想中的场景是:一个主要由AI组成的顶尖研究团队。这样的系统将以多元方式与世界交互,绝非简单的黑箱模型。

So I was talking about automating research. My picture of what that would actually look like is, imagine a company of very capable researchers and engineers that is largely automated. Right? And again, I think that is something that will interface with the world in all sorts of ways. It won't be just kind of a black box.

Speaker 1

比如,它会与人们交谈,接收输入信息,进行实验。但我认为,拥有这种开发新技术和其他类型产物(如代码库、设计)的潜力,将极大加速技术进步的步伐。因此,我认为这是我们将会感受到的,并且需要从技术和社会角度做大量工作来实现。

Like, it will talk to people. It will take in inputs. It will run experiments. But I think having this sort of potential for developing new technology and other kinds of artifacts, codebases, designs, will radically accelerate the pace of technical progress. So I think that is something that we will feel, and we need to do a lot of work to get it right, from both a technical and a societal perspective.

Speaker 1

但我想,现在正是我们应该期待实际交互界面取得重大进展的时候。我们看到ChatGPT已经显得非常人性化,人们会对其产生情感依赖。随着它变得更持久,能以不同形式和文本表达自己,对吧?

But I think that is kind of where our time... I think we should also expect a lot of progress on the actual interfaces that we interact with. We see ChatGPT can feel quite human-like. We can form attachments with it. I think as it becomes more persistent, as it becomes capable of expressing itself in different forms than text. Right?

Speaker 1

我认为这些影响会越来越强。这将成为非常重要且需要深入讨论的话题。

Like, I think that those effects will become stronger. And again, I think that will become a very big and important conversation.

Speaker 0

我刚获得让ChatGPT读取我Gmail日历的权限,突然意识到我们已经走了多远——现在我为这个功能兴奋,而不是担心它会开始写《星球大战》伊沃克人的同人小说。这就像跨越了一个信任的临界点。

I just got access in ChatGPT to have it actually read my calendar in Gmail, and I realized how far we've come, because I'm excited about that now. I'm not really terrified that it's gonna start writing, you know, Ewok fan fiction to somebody. I think that's this neat threshold that we've crossed, this level of trust.

Speaker 1

我们确实处于一个艰难的权衡点:让模型访问大量数据能带来明显的经济和个人价值,但同时这些模型的稳健性尚未达到让我们完全信任其不会被恶意利用的程度。这是个需要整个领域不断迭代解决的大问题。

I think we are definitely in a place where there's a very tough trade-off, where there's such clear economic and personal value you can extract by having the model have access to a lot of your data. At the same time, I think we are not at the threshold of robustness where we can fully trust these models not to be exploited by someone trying to exploit them. Yeah, it's definitely a big problem that I think we as a field will have to iterate on.

Speaker 0

你会对高中时代的自己说什么?如果回到当年的教室,你现在会给他们什么关于未来的建议?

What would you tell the two versions of you guys in high school? If you were visiting your old classroom, what would you say right now? What would you tell them about the future? What advice would you give?

Speaker 2

投资比特币。

Invest in Bitcoin.

Speaker 0

不,我是说现在——即便是2025年,你会给高中生什么建议?

No, I mean today, even today. In 2025, what would you tell a high school student?

Speaker 2

现在的高中生?这是个好问题。网上有很多误导性信息,但我要说:一定要学会编程。能将复杂问题分解的结构化思维能力是现在乃至未来的稀缺技能。

High school students today? Oh yeah, that one is also, I think, a great question, right? Because I hear a lot of what I would consider misinformation on that online. So you should absolutely learn to code. One skill that is at a premium, and will continue being at a premium, is having a really structured intellect that can break complicated problems into pieces.

Speaker 2

未来这种能力可能不局限于编程,但编程确实是培养这种技能的好方法。其他需要深度思考的领域也是如此。所以别听信那些'不必学编程'的论调。

That might not be programming in the future, but programming is a fine way to acquire that skill. So are other domains where you need to think a lot. So don't let people tell you that you should not learn to code.

Speaker 0

是的,我是在人生后期才学会编程的,这也正是我最终成为OpenAI工程师的原因。我常向人们解释,仅仅因为一个系统能完成某项任务,并不意味着你就不再需要了解它的工作原理。正如你所说,当你懂得如何分解任务时——我在OpenAI从事提示工程工作时,我的编程理解力帮助我既能驾驭语言又能将其拆解,从而做出更出色的成果。我认为那些能弥合这些鸿沟的人确实具有优势。所以每当我听到有人说‘别学编程’时,我就会想:难道我想要一个不懂空气动力学的飞机驾驶员吗?

Yeah, I learned to code late in life, and that's actually why I ended up working at OpenAI as an engineer. I try to explain to people that just because a system can do the thing doesn't mean you don't want to know how it works anymore. As you said, it's about understanding how to break down a task. When I worked at OpenAI in prompt engineering, my coding understanding helped me take language, break it down, and make it do better things. I think people who bridge those gaps are really at an advantage. So whenever I hear people say, don't learn to code, it's like, do I want an airplane pilot who doesn't understand aerodynamics?

Speaker 0

这在我看来不太合理。

Like this doesn't make much sense to me.

Speaker 1

回想高中时的思维方式,我觉得很不可思议——当你真正深入思考时,会发现许多所谓的限制其实并不存在。我的第一个重大觉醒是:如果我真的热爱计算机科学,其实可以牺牲其他12门课的部分学习时间,投入更多精力在这方面。但后来更让我震撼的是,原来我某天真的可以去美国留学——这个选项原本根本不在我的认知范围内。在硅谷的时光也让我看到,这里的人们如何以雄心壮志去攻克难题,坚信自己能真正改变世界。

Well, thinking about how I thought about things in high school, I think it's pretty incredible how many perceived constraints are not actually there when you really think about it. Maybe the first revelation to me was, hey, if I'm really passionate about this computer science stuff, I can actually spend a bit more time on it at the cost of maybe spending a bit less time on, you know, 12 subjects in school. But then, again, it was a big revelation to me that I can actually go and study in the USA at some point. That's not really something that seemed obviously in the action space. And then spending some time here in Silicon Valley, seeing how people are willing to really attack these big problems with ambition and the belief that you can actually make a meaningful positive change in the world.

Speaker 1

是的,这一切都令人无比振奋。这正是我珍视这个社群的原因。

Yeah, I think it has been incredibly inspiring. Yeah, it's something I cherish about this community.

Speaker 0

有哪本书曾给过你启发吗?

Is there a book or something that inspired you?

Speaker 1

确实有几本。现在回想起来很有趣,当时没意识到其中的关联:我15岁左右、对未来还很迷茫的时候,父亲给了我一本波兰语版的书,作者我当时并不认识,书名叫《黑客与画家》。

I think there's a couple of books. It's actually hilarious thinking about it now. I didn't really connect the dots, but my dad gave me this book once when I was, I think, 15, and pretty unsure what I wanted to do. It was the Polish version of a book by some author I didn't know, called Hackers and Painters.

Speaker 1

对,作者其实就是保罗·格雷厄姆(Paul Graham)。所以说,又是和这个社群有关。我觉得那本书很鼓舞人心。

Yeah. Yeah, it was actually Paul Graham. So I guess, yeah, again, kind of like this community. So I found that pretty inspiring.

Speaker 0

听到‘敢于梦想并付诸行动’这样的信息确实很有帮助。越多人意识到这点,世界就会变得越好。有哪本书/电影/剧集影响过你吗?

Yeah, there's something helpful, I think, to hearing the message of like, no, it's okay to dream big and go do stuff that you can just make things happen in the world. And I think that the more people realize that, kind of the better the world gets to be. Was there any book that influenced you? Or movie, TV show?

Speaker 2

电影啊...我有个很蠢的答案。

Oh, movie. I have a stupid answer to that question.

Speaker 0

我就想听蠢答案。

I want stupid answers.

Speaker 2

但在那深刻的影响之后。好吧,我看了《钢铁侠》。是的,它激励我开始攻读机器人学博士学位。

But after the profound one. But like, okay, so I watched Iron Man. Yeah. And it inspired me to start a PhD in robotics.

Speaker 0

不过这是个很棒的答案。安迪·威尔的《火星救援》,我在NASA遇到一位植物学家科学家,他读了那本书,我说他们搞错了大气物理之类的。他说,所以我才在这里工作。我就想,哦,好吧。

That's a great answer though. With The Martian by Andy Weir, I met a scientist at NASA, a botanist, who had read that book, and I'm like, well, they got the atmospheric physics wrong and all this. He's like, well, that's why I'm here. I'm like, oh.

Speaker 2

对,我还没说到愚蠢的部分。愚蠢的是,当我开始研究机器人时,非常失望于它们的糟糕表现。不知怎的,我没想到电影只是电影。是的,那段经历对我来说挺糟的,如果不是在那里遇到了一位对深度学习感兴趣的朋友,情况会更糟。当时我以为所有机器学习都是炒作,但那是个有趣的系统问题。

Well, yeah, I guess I didn't get to the stupid part. The stupid part was, when I started working on robotics, I was very disappointed by how bad those robots were. Somehow it didn't occur to me that maybe the movie is a movie. So that whole experience was kind of bad for me, or would have been, if not for the fact that I met a friend there who was into deep learning. At the time I thought all of machine learning was hype, but it was an interesting systems problem.

Speaker 2

然后毫无预兆地,AlphaGo出现了——我这么说可能会让DeepMind的一些人感到沮丧。我确信它是突然出现的。

And then out of nowhere, and as I'm sure I would frustrate some DeepMind folks by saying that, AlphaGo came out. I'm sure it was out of nowhere.

Speaker 1

我确信

I'm sure

Speaker 2

这是多年努力的成果。那对我们俩都非常鼓舞人心,真的。从那以后,想不投入工作都难。

it was years in the making. And that was very inspiring, I actually think, to both of us. And since then, it was just hard not to work.

Speaker 1

是的。我花了一段时间才确信深度学习不只是昙花一现。因为我们并不真正理解底层的优化机制。我认为这某种程度上成为了我们研究的主题,试图在这些问题上取得进展,比如它究竟如何运作。但这确实像是在研究某种物理现象。

Yeah. It took me a while to become confident that deep learning is more than a fad, because we don't really understand the underlying optimization. And I think this has kind of been the story of our research here, trying to make progress on these questions about how it really works. But it really is like studying a physical phenomenon in some way.

Speaker 1

而且,你知道,对接受传统训练的计算机科学家来说,接受这点很奇怪。

And, you know, to a classically trained computer scientist, that was a weird thing to accept.

Speaker 2

嗯。我确实记得雅各布跟我讲扩大原则性凸优化的时候。那是在AlphaGo之前。

Mhmm. I do remember when Jakob was telling me about scaling up principled convex optimization. That was before AlphaGo.

Speaker 0

AlphaGo有趣在于,起初觉得‘哇,解决了围棋’,然后意识到‘但它只是通过观察学习’。后来他们做了AlphaGo Zero,完全自学,你就明白游戏结束了。这有个发展轨迹,我认为它还在继续。但我想,如果你当初看的是《雷神》而不是《钢铁侠》,也许结果会更好。

And AlphaGo was interesting because at first it was like, oh cool, it solved Go, and then we're like, yeah, but it just learned by watching all this. Then they did AlphaGo Zero, where it's self-taught, and you're like, okay, game over, folks. There's a trajectory here and I think that's continued on. But I think that, yeah, if you hadn't watched Iron Man, maybe Thor instead, you know, maybe things would have turned out better.

Speaker 2

谁知道呢?我有点希望当初学的是数学。那会更有用。学什么?数学。

Who knows? I kind of wish I'd studied maths instead. It would have been more useful. Study what? Maths.

Speaker 2

数学。好吧。理论计算机科学,这两者中的任何一个都行。

Mathematics. Okay. Theoretical computer science, either of those like this.

Speaker 1

可能是物理学。物理学,

Physics probably. Physics,

Speaker 0

是啊。我最初是个魔术师。不知道你们知不知道。实际上我有自己的真人秀节目,所以你们会发现一条非常奇特的路径最终来到这里。那么,雅各布、希蒙,和你们两位交谈真是莫大的荣幸,希望我们能再次见面,聊聊你们默默研究的、即将横空出世的下一个重大突破,到时候我们都会觉得那好像是一夜之间发生的事。

Yeah. I started off as a magician. I don't know if you know that. I actually had my own reality TV show, so you can find a very strange path to end up here. So, Jakob, Szymon, it's been an absolute pleasure to talk to you both, and I hope we can meet again and talk about the next big breakthrough that you guys have been stupidly working on that's going to come out of nowhere, and we'll be like, that was an overnight thing.
