本集简介
双语字幕
人们在这些模型上花了很多钱。
People are spending a lot on these models.
他们这么做大概是因为从中获得了价值。
They're presumably doing this because they're getting value from them.
你或许可以争辩说,呃,我不认为那些价值是真实的。
You can maybe argue, like, oh, well, I don't think that value is real.
我觉得人们只是在瞎折腾,随便吧。
I think people are just playing around, whatever.
不过,随便啦。
But, like, whatever.
他们愿意为此付费。
They're paying for it.
这是个相当有力的信号。
That's a pretty solid sign.
我们几乎要给你一个实用答案了:我不认为这是泡沫,因为它还没破裂。
We're almost giving you here the useful answer of, like, I don't think it's a bubble because it's not burst yet.
当它破裂时,你就会知道这是个泡沫了。
When it's burst, then you'll know it's a bubble.
好吧。
Okay.
人们经常提出这样的论点:人工智能目前尚未盈利,他们正在投入更多资金以实现盈利。
People often make the case, oh, AI hasn't been profitable yet, and they're spending more to make it profitable.
实际上,他们很快就会收回过去所有开发投入的成本。
In reality, they'll have paid off the cost of all of the development they've done in the past very soon.
只是他们正在为未来进行更多开发。
It's just that they're doing more development for the future.
他们会后悔这些投入吗?
Will they regret that spending?
他们投入了多少?
How much are they spending?
你可以看看英伟达每年卖出了多少产品,观察其增长趋势,就能判断情况是否持续向好。
You can look at Nvidia and how much they're selling each year, and you can see whether it keeps on growing, and you can see whether stuff is kinda looking good to continue.
数学题对AI来说异常简单。
Math seems unusually easy for AI.
老实说。
I'm gonna be honest.
人们经常声称这是某种直觉性的深层事物,意味着AI已经达到了某种高水平的智能才能解决。
People often make claims about it being like this, you know, intuitive deep thing that it would mean that AI has achieved something, some huge level of intelligence for it to solve.
我认为实际上,这就像是创作一件艺术品。
I think in practice, this is just like, you know, making a piece of art.
事实证明,在能力这条路上,它走得比人们预想的要远得多。
It turns out to be farther down the Capability Street than people might have guessed.
几十年前我们在国际象棋上就遇到过类似的情况。
We sort of had this with chess decades ago.
对吧?
Right?
就像计算机在国际象棋上表现得非常出色,当时大家都认为这是推理的巅峰,结果人们都倾向于认为,哦,计算机当然能下棋。
Like, computers solved chess very well, and everyone was thinking of this as the pinnacle of reasoning, and everyone as a result kind of concluded by, oh, well, of course, computers can do chess.
这个有趣的设想,大约有20%到30%的概率在未来十年内发生,比如AI导致失业率在短短六个月内骤升5%。
The, like, interesting scenario to think about, you know, 20% chance, 30% chance something like this will happen in the next decade, is, like, you know, a 5% increase in unemployment over a very short period of time, like six months, due to AI.
公众对此的反应将决定很多事态发展。
The public's reaction to this will determine a lot.
一旦这种情况发生,人们对AI将产生极其强烈的情绪。
There will be very, very strong feelings about AI once this happens.
我认为届时会形成一系列强烈共识,围绕那些我们通常不认为会被纳入考量的议题。
I think there will be a bunch of very strong consensus on what to do, around things that we don't normally think of as things that people are considering.
就像新冠疫情时那样,当时迅速通过了数万亿美元的刺激方案。
I know when this happened with COVID, there was a several trillion dollar stimulus package.
整个过程只用了短短几周甚至几天,速度快得惊人。
In a matter of weeks to days, it was breakneck speed.
我不知道AI领域会采取什么措施。
I don't know what that will look like for AI.
但我觉得这和其他AI相关事件一样。
But I think it's like everything else in AI.
它是指数级增长的,这意味着从人们稍微关心到真正高度重视的转变会非常迅速。
It's exponential, which means it will pass from the point where people sort of care about it to where people really care about it quite fast.
他们只是预期无论最终结果如何,都会出现一些我们一年前还认为不可思议的事情。
They just expect wherever we end up, there will be this certain thing, which we would have considered unimaginable a year ago.
我们正在迈向人类历史上最大的经济繁荣还是最快的崩溃?
Are we building towards the biggest economic boom in human history or the fastest collapse?
目前,AI实验室正在计算资源上烧掉数十亿美元。
Right now, AI labs are burning billions on compute.
Anthropic刚建了一个数据中心,耗电量堪比印第安纳州首府,而微软正在规划一个能与纽约市匹敌的数据中心。
Anthropic just built a data center that uses as much power as Indiana's state capital and Microsoft's planning one that rivals New York City.
赌注是什么?
The bet?
赌的是在资金耗尽之前,AI将彻底消灭某些工作类别。
That AI will eliminate entire categories of work before the money runs out.
来自Epoch AI的David Owen和Yafah Edelman做了一些不寻常的事情。
David Owen and Yafah Edelman from Epoch AI have done something unusual.
他们实际测量了正在发生的情况。
They've actually measured what's happening.
他们追踪许可证、分析卫星图像,并精确计算这些数据中心扩张的速度。
They tracked down permits, analyzed satellite imagery, and calculated exactly how fast these data centers are scaling.
他们的结论同时挑战了怀疑论者和忠实信徒。
Their conclusion challenges both the skeptics and the true believers.
他们没有看到泡沫。
They don't see a bubble.
他们看到收入每年翻倍,推理环节已经实现盈利。
They see revenue doubling every year with inference already profitable.
但他们也没看到某些人预测的纯软件奇点——那种AI能在一夜之间递归自我改进的情况。
But they also don't see the software only singularity that some predict, where AI recursively improves itself overnight.
相反,他们预测了一个更奇怪的世界:AI能在可靠叠好你的衣服之前解决黎曼假设。
Instead, they forecast something stranger: a world where AI solves the Riemann hypothesis before it can reliably fold your laundry.
在这个世界里,10%的现有工作会消失,但失业率可能几乎不会波动。
Where 10% of current jobs vanish, but unemployment might barely budge.
我们实现通用人工智能的方式不是一蹴而就,而是通过一系列不断刷新认知的超现实里程碑,持续重新定义目标。
Where we hit artificial general intelligence not with a bang, but through a series of increasingly surreal milestones that keep moving the goalposts.
我们与a16z合伙人Marco Mascorro一起探讨了他们的时间线预测、哪些因素会阻止而哪些不会阻止规模化发展,以及为何政治反应可能比任何人预期的都要快。
Along with a16z partner Marco Mascorro, we cover their timeline predictions, what does and doesn't stop the scaling, and why the political response might happen faster than anyone expects.
各位,关于宏观经济的讨论很多。
Guys, there's a lot of conversation about the macro.
我们是否身处泡沫之中。
Are we in a bubble?
我们应该如何思考这个问题?
How should we even think about this question?
我们稍后会深入预测部分,但你们何不先初步谈谈如何处理如此宏大而宽泛的问题?
We're gonna get into forecasting later on, but why don't you just take a first stab at how you approach such a big, general question?
是的。
Yeah.
至少对我来说,我的思考方式是观察人们在计算资源等方面的支出规模这个重要指标,或许某种程度上还包括他们未来是否会后悔这笔支出——这很关键。
I mean, for me at least, the way that I thought about this a little bit is I look at kind of the big indicator being how much people are spending on stuff like compute, and I guess maybe some sense of will they regret that spending, that's relevant.
但他们具体投入了多少资金呢?比如你可以看看英伟达每年的销售额,观察这个数字是否持续增长,以此判断发展态势是否良好。
But the "how much are they spending" thing, like, you can look at Nvidia and how much they're selling each year, and you can see whether it keeps on growing, and you can see whether stuff is kind of looking good to continue.
至于他们是否后悔这部分投入
The "will they regret it" side.
我的意思是,这还有待观察,对吧?
I mean, that's just to be seen, right?
我们实际上还得观望一段时间才能判断。
Like, we'll actually have to wait and see.
目前看来,大部分计算资源确实用于推理环节,而且企业到目前为止并不后悔把这些资源用于提供产品。
It does seem as if most compute gets spent on inference that companies don't so far regret like using to offer their products.
所以从这方面来看,我觉得目前还不算太泡沫化,不过我把握不大,还有其他因素需要考虑。
So, I mean, on that side, I'm, like, thinking not too bubbly yet, but, yeah, low confidence, so there's other stuff to think about.
嗯。
Yeah.
目前企业实际获得的利润(不包括初始开发成本)看起来非常可观。如果他们停止开发更大的模型,仅维持现有规模,按照目前的利润率,他们很快就能盈利。
Right now, the amount of money companies are actually earning in profit, not including the cost to develop the models initially, seems to be, like, very positive, such that if they stopped developing bigger and bigger models and just stuck with the ones they've had, they'd have earned a profit pretty quickly at the current margins.
从这个意义上说,这看起来并不像泡沫。
And in this sense, it doesn't seem bubbly.
另一方面,他们随时都在投资构建越来越大的模型。
On the other hand, at any given time, they're investing and building even larger and larger models.
如果进展顺利,他们将赚取更多利润。
And if that goes well, then they'll earn more money.
如果进展不顺,那么无论当前盈利如何,相比投入都将只是九牛一毛。
And if that doesn't go well, then no matter how profitable they are right now, it'll be a small amount of money compared to how much they would have spent.
因此我认为目前并没有出现泡沫的财务迹象。
So I think right now there are not financial signs that there's a bubble.
许多担忧泡沫的人,只是还不适应这种量级的支出和规模化带来的成功水平。
A lot of people worrying about bubbles just aren't necessarily used to the level of spending and just, like, the level of success that sort of happened in, like, scaling.
但如果泡沫真的出现,可能会突然爆发并造成严重后果。
But if there is a bubble, it could happen very suddenly and be pretty bad.
所以
So
是啊。
Yeah.
我想我们几乎是在给你一个有用的答案,就是我不认为这是泡沫,因为它还没破裂。
I think we're almost giving you here the useful answer of, like, I don't think it's a bubble because it's not burst yet.
等它破裂的时候,你就会知道是泡沫了。
When it's burst, then you'll know it's a bubble.
没错。
Yeah.
我确实认为,你可以想象这样一个世界:所有这些投入和当前的成功水平并不匹配,人们经常提出这样的观点,说AI目前还没有盈利,他们正在投入更多资金使其盈利。
I do think, like, you could imagine a world at which there's all this spending and the current level of success does not like, people often make the case, oh, AI hasn't been profitable yet, and they're spending more to make it profitable.
但目前它还没有产出任何东西。
But right now, it's not making anything.
而实际上,它们正在产出。
And in reality, they're making.
他们很快就会收回过去所有研发投入的成本。
They'll have paid off the cost of all of the development they've done in the past very soon.
只是他们正在为未来做更多开发工作。
It's just that they're doing more development for the future.
所以我认为到目前为止存在这种潜在的财务成功,如果这至少是一个明显的泡沫,我不会期望看到这种情况。
So I think there is this underlying financial success so far that I wouldn't expect to see if there were, at the very least, an obvious bubble.
是的。
Yeah.
这确实看起来非常相关。
That does seem very relevant.
人们在这些模型上投入了大量资金。
People are spending a lot on these models.
他们大概是自己选择使用这些模型的。
They presumably, like, you know, choose to use them.
他们这样做大概是因为从中获得了价值。
They're presumably doing this because they're getting value from them.
你可能会争辩说,哦,我不认为这种价值是真实的。
You can maybe argue like, oh, well, I don't think that value is real.
我觉得人们只是在随便玩玩,但不管怎样,他们愿意为此付费。
I think people are just playing around, whatever, but like, whatever they're paying for it.
这是个相当可靠的迹象。
That's a pretty solid sign.
我想快速提个相关问题,你在2030年AI报告中提到,基本上没有看到这些模型出现平台期迹象,能力持续提升,有基准测试数据,有不断增加的数据量和算力。
I guess one quick question related to this is like you talked in the report of the AI in 2030, basically that you haven't seen signs of basically these models kind of plateauing or like the capabilities keep increasing and you have the benchmarks, you have the amount of data that is going, the amount of compute.
不过你认为模型某些方面或部分是否正在进入平台期?
Do you think phases or parts of the models are plateauing though?
比如预训练阶段,我们是否看到某种停滞迹象?还是你认为人们在这个阶段仍在探索创新?
Like for instance, pre training, are we seeing some sort of plateauing in that, or do you think people are still exploring some innovations in that stage?
很想听听你对这个问题的看法。
Curious on what do you think about that?
是的。
Yeah.
我认为这个问题看起来会稍微复杂些。
I think this gets a bit harder to look at.
比如,我们进入了一个没有太多公开数据可以讨论的领域,对吧?
Like, we get to an area where there isn't as much public data to say a lot, right?
看起来预训练相对而言不像以前那么受关注了,部分原因是你们有了这个令人兴奋的新方向——算是比较新的方向——后训练,他们在推理等方面做了很多工作。
It seems as if pre training is comparatively less of a focus than it was before partly because like you have this exciting new direction of, well, newish direction of post training where they've done so much about reasoning, whatever.
但我并不一定把这当作证据,比如认为预训练无法进一步扩展之类的。
But then I don't necessarily take that as evidence of like, oh no, and that means pre training, you couldn't scale further, whatever.
似乎确实有更多有意义的数据存在。
Like it seems as if there is meaningfully more data out there.
看起来似乎很多这些东西都具有相当强的协同效应。
It seems as if plausibly like even a lot of this stuff is quite synergistic.
你开发出一个更好的模型,通过后训练手段让它更出色,还能获取大量关于模型实际使用效果的数据。
You develop a better model, you like use post training stuff to make it better, you get a load of data of the model actually being used successfully or not.
其中很多数据很可能可以融入下一次的预训练中。
A lot of that can probably go into pre-training next time.
你预见的并非纯软件的奇点——即AI能自动化AI研究,而只是一个自动化的反馈循环。
You aren't projecting a software only singularity where AI is able to automate AI research, but just automated feedback loop.
为什么不呢?
Why not?
是啊。
Yeah.
我的意思是,我觉得,我还没认真到能多说些什么。
I mean, I guess, like, I'm not serious enough to say more.
对我来说,就像那份报告,没有一个人能断言,这就是预测,对吧?
It's like, for me, it's like that report, it's no one person's kind of, oh, this is like the forecast, this is the prediction, right?
这份报告非常具体地分析了当前的趋势是什么?
This report very specifically looks at what are the current trends?
是否存在明显无法持续或可能中断的原因?
Are there reasons that they clearly couldn't continue or might not?
如果趋势持续下去,会导致什么结果?
And if they do continue, where do they lead?
我认为这种自我改进的现象,很难从趋势外推的角度来评估,对吧?
I think whether you see this self improvement thing, that's very hard to do from a sort of trend extrapolation basis, right?
目前AI技术确实在某种程度上辅助了AI研发,比如在编程、数据集筛选和创建等方面,但实际效果很难量化,而且远未达到这种自我改进理论所暗示的重大突破程度。
Like currently AI stuff does help AI R and D at least a little in terms of stuff like coding or selecting your datasets and creating those, whatever, but it's quite hard to actually measure and it's not really helping in some big way like this kind of self improving thing would suggest.
确实存在一些原因会让你认为这可能非常困难。
There are reasons that you might think it could be very hard.
之前有人讨论过,如果发展主要依赖算力扩展,那么自动化大量研发工作可能帮助有限。
People have discussed before how possibly, if stuff just depends a lot on scaling up compute, then maybe automating a lot of the R&D isn't that helpful.
我觉得这个观点有一定说服力,但同时也存在很大不确定性。
I find that somewhat compelling, but I think it's also just, it's pretty uncertain.
对于这种超出常规范畴的事物,确实很难做出准确推测。
It's hard to speculate about something that's quite out of regime like that.
要实现纯软件的奇点,关键前提是必须处于这样一个世界:通过增加研究人员投入时间就能充分提升AI能力,从而弥补无法扩展实验算力或预训练的缺陷。
One thing that needs to happen in order for a software only singularity to occur is you need to be in this world where scaling up the amount of researcher r and d time basically allows you to, like, improve AI enough that it makes up for the lack of being able to scale experimental compute or pretraining.
如果这种情况属实,我们预期看到的实践现象可能是:实际使用的实验算力并不多,而大部分资金都流向了研究人员。
I think that something you would expect to see if this were the case is maybe not that much experimental compute being used in practice, and instead all of the money going towards researchers.
现在确实有充分证据表明,有大量资金正在流向研究人员。
Now there's a very good case that there's a very large amount of money going towards researchers.
但据我们观察,实验算力(看起来是做研究所必需的)获得的资金与之相当,事实上,它获得的资金比最终实际发布的模型的训练环节还要高出许多倍。
But as far as we can tell, experimental compute, which you seem to need to do research, is receiving a similar amount of money, and, in fact, it's receiving many times more money than the final training runs of the models that are actually being released.
在我看来,这强烈暗示着:要开展研究就必须进行大规模实验,而我们并没有确凿证据表明仅靠研究人员就能在不增加实验的情况下加速进程。
I think this is, in my mind, is a strong update towards, oh, you need to do very large scale experiments to do research, and that we don't really have good evidence that researchers and just researchers would be able to speed things up without doing more experiments.
不过,这个问题的正反两面都存在相当有力的论据。
However, there are, like, pretty good arguments on either side of this.
我个人倾向于持否定态度。
I tend to lean towards, no.
实际上你需要进行更多实验,这意味着你无法实现仅靠软件的奇点。
You actually need to do more experiments, and that means you can't get this software only Singularity.
但我不认为持相反观点的人是疯狂的。
But I don't think the people who claim otherwise are, like, crazy.
我认为他们有一些非常合理的不同见解,我们都在对当前数据相当匮乏的领域进行推测。
I think they're making some, like they have, like, very reasonable differences, and we're both speculating on something where the data is currently pretty sparse.
实际上,与此相关的是,你对研究人员正在尝试的一些探索有什么看法?
Actually, related to that, like, what do you think about some of the exploration that researchers are trying?
我的意思是,显然人们正在大量探索强化学习,试图超越可验证领域。
I mean, obviously, like, people are exploring a lot with RL, trying to go beyond verifiable domains.
那么你对这个论点怎么看,比如说梯度下降在当前数据集上的学习效果非常好?
And what do you think about the argument, for instance, that gradient descent is really good at learning the current dataset that you're giving it?
对吧?
Right?
如果你反复持续训练,它就会开始遗忘之前学过的内容。
And if you just keep training this over and over, it's gonna start forgetting things that it was trained on before.
对吧?
Right?
比如灾难性遗忘。
Like, catastrophic forgetting.
而且还有这个论点。
And there's this argument.
对吧?
Right?
比如说,孩子们可不是这样学习的。
Like, well, kids don't learn that way.
可能孩子们会进行一些模仿学习。
Like, maybe there's some imitation learning that kids do.
也许他们还会进行某种探索性学习。
Maybe there's some sort of exploration that they do.
我想知道你怎么看这个问题。
And I wonder what you think about it.
我是说,这个观点听起来挺有道理的。
I mean, it sounds right.
如果孩子们真的只通过模仿学习,那父母养育孩子应该会轻松得多。
Like, if kids really would just learn on imitation learning, I think parents would have a great time just raising kids.
但现实中父母之所以觉得育儿艰难,正是因为孩子们会探索各种事物。
But it seems like the reason why they have such a hard time raising kids is because they explore all these different things.
你觉得在算法层面,除了数据和算力,我们还需要不断改进些什么?
What do you think about it in terms of the algorithms and, like, the things we need to keep improving these models over and over beyond the data and the compute?
我对将AI学习方式与人类学习方式进行类比持谨慎态度,并非认为两者不可比,而是因为目前我们对AI学习机制的了解远多于对人类学习机制的了解。
I am cautious about comparing, like, how AIs learn to how humans learn, not because I don't think they are comparable, but because I think we know a lot more about how AIs learn right now than we know about how humans learn.
人们总喜欢对人类学习方式做出各种假设,然后说'哦,对,人类不是那样学习的'。
And people like making sort of assumptions about how human learning works and saying, oh, yeah.
人类不是那样学习的。
It doesn't do it that way.
我也不知道。
And I don't know.
也许确实如此。
Maybe that's true.
也许人类儿童确实通过强化学习来学习。
Maybe human kids learn via RL.
我对此不太确定……我觉得,是的。
I'm not very... I think that, yeah.
我对是否需要改用更接近我们认为儿童当前学习方式的方法并没有强烈意见。
I don't have strong opinions on whether or not, you know, you need to change to a method that's more like what we think kids do right now.
我怀疑人们会找到某种方法来利用现有的计算资源,因为他们过去在这方面一直很成功。
I suspect people will find some method that works to use the compute available because they've been able to do this in the past.
是啊。
Yeah.
我也有些犹豫。
I'm also sort of reluctant.
同样地,这也是那种情况——当我们指出特定问题时,比如灾难性遗忘的例子,这某种程度上,
I guess as well, it's one of those things where when we point to particular issues, like the example of catastrophic forgetting, it's sort of,
嗯,
well,
但随着规模扩大,我们已经成功让模型记住越来越多的东西。
okay, but as we've scaled up, we have managed to do quite well at having models that remember more and more things.
这并不是说问题就此解决,我们可以高枕无忧,再也不需要任何算法创新之类,但我也不会完全否定它。
This isn't to say that, hence the problem is solved, hence we're done, hence no more algorithmic innovations necessary or anything like that, but I'm not exactly gonna write it off.
没错。
Yeah.
我并不认为我们目前已经看到这些担忧导致能力发展出现任何放缓。
I definitely don't think we've seen any slowdown yet in capabilities from any of these concerns people have.
我觉得人们总是会有这类担忧。
I think that people always have these sorts of concerns.
除非在图表上看到具体数据,否则我不愿轻易相信其中任何一个说法——而我认为目前尚未出现这种情况。
I'm reluctant to believe any given one of them until this actually shows up in numbers I can see on a graph, which I just don't think has happened yet.
Anthropic的Dario曾在2025年3月表示,六个月内AI将编写90%的代码。
Dario at Anthropic said in March 2025 that within six months, AI will write 90% of code.
当然,这至今尚未实现。
And, of course, that hasn't happened yet.
他还说过,我们可能最早在2026或2027年就能拥有相当于"数据中心里的一个天才之国"的AI系统。
He also said, you know, we could have AI systems equivalent to a country of geniuses in a data center as soon as 2026 or 2027.
你如何评估Anthropic为何如此乐观?或者说他们与你们观点差异的关键点是什么?
How do you evaluate why Anthropic is so bullish or what is the crux of difference between what they believe and perhaps what you believe?
至少按照我的理解(虽然不确定是否正确),他们更像是相信通过自动化研发能实现快速突破的那类人。
My model, at least, which I don't know if it's right, but what it is, is that they think a bit more like the people who believe in you automate R and D and that gives you very quick takeoff.
所以他们将其视为:没错,我们正在开发这些擅长研究工程类编程的AI,到某个时间点它们就会派上用场,并将快速推动我们开发下一代AI,然后进展就会突飞猛进。
So they see it as like, yep, we're working on these AIs that are great for kind of research engineering type coding, and at some point, they're going to be useful and that's going to rapidly accelerate us to develop the next ones, and then it's gonna be quick progress.
是啊。
Yeah.
我认为很难判断。我不认为我们已经得到很多证据表明纯软件起飞这类观点是错误的,只是就目前而言,AI要达到能让你实现这一点的最低能力水平,所需时间确实比预想的稍长一些。
I think that it's hard to tell. I don't think we've gotten a lot of evidence that those sort of views of this, like, software only takeoff are wrong, insofar as, like, they're taking a little bit longer to get to, like, the minimum level of competence for AI to get you there.
显然情况就是如此。
Definitely seems to be the case.
但我不确定。
But I don't know.
很难判断我们在这个问题上的看法实际上有过多大的更新。
It's hard to tell the extent to which we've actually had significant updates on this.
我知道达里奥经常用'最快'之类的措辞来限定他的预测。
I know Dario often qualifies what he says by, like, saying as soon as or something like this.
所以这可能更像是他给出的更激进时间表,虽然我也不确定。
So this is, like, maybe more so the faster timelines he gives, although I'm not sure.
是啊。
Yeah.
还有一种类似塔木德式评论的现象,人们仔细研究他的确切措辞,以及其他人在讨论Anthropic某些团队的代码有多少行是由Claude Code生成时的措辞,争论这是否符合你所说的标准。
There has also been a thing sort of, you know, Talmud style commentary where people are carefully looking at his exact wording and then the wording of other people's discussion of how many lines of code at some teams at Anthropic are generated by Claude Code and whether this does or doesn't satisfy what you said.
所以这变得有点复杂。
So it gets a bit tricky.
嗯。
Yeah.
我记得Uplift那篇论文声称模型实际上会拖慢开发速度。
I remember there was the Uplift paper that was claiming that actually models would slow you down.
但我认为关键在于他们当时使用的模型,因为报告出来时那些模型已经相当过时了。
But I think, like, it mattered a lot what models they were using at the time because I think they were pretty outdated by the time the report came out.
就我个人经验而言,效率绝对是大幅提升的。
And, I mean, in my personal experience, you definitely become way faster.
而且它能帮你完成的工作量也大得多。
And it just solves so much more for you.
比如,你可以直接获得整个代码库的上下文。
Like, you're just having the whole context on your code base.
这是一个巨大的优势,我认为对人类来说真的很难做到。
That's such a huge advantage that I think for a human would just be really hard to do.
我是说,如今我写的代码90%以上都是由AI生成的。
I mean, far more than 90% of the code I write is written by AI these days.
但我知道自己完全不是那种普通程序员。
But I know I'm not, like, the average coder at all.
但这绝对是——我甚至不认为这在现阶段是个大胆的预测——90%的代码将由AI编写。我的意思是,据我所知,在OpenAI某个角落,可能正有人用AlphaCode运行进化算法,通过海量试错来攻克难题。
But it's definitely, I don't think it's, like, a wild prediction at this point that 90% of code is gonna be written by AI. I mean, for all I know, somewhere at OpenAI, there's someone just, you know, or, you know, with AlphaCode, doing evolutionary algorithms, having tons and tons of trials trying to, you know, million-shot some hard problem.
但问题在于,目前根本无从得知实际有多少代码是由AI生成的。
But it's just, like, really unclear how many lines of code are actually being written by AI right now.
我不认为这有那么疯狂,很多人的直觉感受是,AI是否完成了程序员90%的工作?
I don't think it's such a wild... By a lot of, like, people's intuitive sense, in terms of, oh, is 90% of the job of a programmer being done by AIs?
绝对不是。
Definitely not.
但这里面有个更复杂的问题,就是到底有多少代码是由AI编写的。
But there's this more complicated sense of, like, how much is being written by AI.
可能没有90%,但这很难判断。
Probably not 90%, but it's hard to tell.
是啊。
Yeah.
而且我认为这是个非常有意义的区别。
And I think that is a very meaningful distinction.
对。
Yeah.
就像,如果你要衡量有多少代码是通过制表补全'编写'的,那比例可能相当高。
Like, if you were to measure how many lines of code are being written, quote unquote, by, like, tab completion, then it's probably quite high.
但你并不一定认为这承担了程序员真正困难的大部分工作。
But you don't necessarily expect that that's taking on that much of the programmers' really hard work.
你提到的那篇Uplift论文,我觉得非常有趣而且质量很高。
That uplift paper that you mentioned, I find it really interesting and really good.
而且从某种角度来说,这还相当新近,你提到过这些模型已经过时了。
And it's also surprisingly recent in a way, you mentioned, oh, the models are outdated.
这是2025年初的事。
This was early 2025.
所以这些确实是人们当时认为对他们有帮助的模型。
So these were models that people actually did think were helping them.
在论文中,他们甚至提前让程序员预估这会让他们提速多少。
And in the paper, they even got them to say ahead of time, how much do you think this will speed you up?
他们回答说,是的,我认为能提速多少。
And they said, yeah, I think how much.
之后研究人员又问他们,实际提速了多少?
They then asked them afterwards, how much do you think this sped you up?
他们回答,确实,确实提速了。
And they're like, yeah, yeah, it sped me up.
我觉得这确实揭示了一个问题:我们可能很难判断自己是否真的被提速了。
And I feel it does reveal actually, it might be hard for us to judge whether we were sped up or not.
是啊。
Yeah.
这里可能发生的一个情况是,AI生成的很多代码原本是不会被编写出来的。
One thing that might be happening here is that a lot of the code that's getting written by AI is code that wouldn't have been written otherwise.
所以它并没有真正加速那些常规会发生的事情。
So it's not really speeding up things that would normally happen.
但你知道,有很多简单的图表或模拟程序,如果没有AI可能就不会被编写出来。
But, you know, there's a lot of simple graphs or simulations I run that might have not gotten written otherwise.
因此很难确切判断这里具体产生了什么影响。
And so it's hard to tell exactly what's going on here in terms of the impacts.
我认为归根结底,最可靠的指标还是看这些公司从程序员以及整体订阅中赚到多少钱。
I think at the end of the day, the most reliable indicator here is going to be how much money these people are making from programmers and from, you know, subscriptions in general.
金额相当可观。
And it's a lot of money.
我认为这些数据明确显示出人们确实从中获得了实用价值。
I think there's definitely indications that people are finding a use for them.
而且很可能其中相当一部分用途是用于编码,但并不完全是为了替代现有程序员90%的工作量这一指标。
And probably a decent amount of that use is for coding, but not exactly for the metric of doing 90% of an existing coder's job.
是的。
Yeah.
Balaji有个最近被频繁引用的说法:AI不是端到端的。
Balaji has this phrase that's been being used a lot, which is AI isn't end to end.
实际上它是中间到中间的,这可能意味着我们需要比人们通常想象的更多的人为参与。
It's middle to middle, which is maybe meant to imply that, you know, we're gonna need a lot more human involvement than some people, you know, typically think.
你对AI在未来十年内对劳动力市场的影响有什么心理模型?无论是低端还是高端市场?
What is your mental model of what AI is going to do for labor markets, either on the sort of lower end or on the higher end, in the next, you know, decade, let's say?
哦,在未来十年内,就高端市场而言,我确实认为很可能会创造新的工作岗位。
Oh, in the next decade, I like, on the higher end, I'm definitely like, you know, probably I expect new jobs to be created.
人人都还能当网红。
Everyone could still be influencers.
但在高端领域,目前你很难明确指出哪些工作是AI明显无法自动化的。
But on the higher end, it's like there are not very good individual things that you can point to where it's very obvious that AI can't automate that job at this point.
你可以辩称,好吧,但存在一些未知因素,我认为这相当合理。
Now you could argue, okay, but there's some unknowns, and I think it's, like, pretty reasonable.
但这些未知因素,有时当AI遇到瓶颈时,我们会发现这些限制,然后它就能学会突破。
But those unknowns, sometimes, you know, AI gets up against its limits and we figure out what they are, and then it surpasses that.
在高端领域,这似乎完全有可能自动化几乎所有现有工作,除了那些需要人工操作且人们真正在乎由人类完成的工作。
At the higher end, it definitely seems plausible that it could just automate, basically, all of the existing jobs, with the exceptions of ones that require manual labor or that people actually care about being done by a human.
在我看来,这种情况发生或迅速发生并非不可能,但需要注意的是,如果真的发生可能会遇到一些监管阻力。
It just, like, does not seem at all implausible to me that that can happen or that that could happen very fast with the caveat there being, like, there's probably some regulatory pushback if that happens.
在低端市场,我不确定。
On the lower end, I don't know.
它可能只是一个泡沫,不会产生任何影响。
You know, it could be a bubble and not have any impact.
我在讨论时提到的那个有趣的设想场景,我也不确定。
The thing I talk about when I'm talking about, like, the interesting scenario to think about, which, I don't know.
你知道,有20%或30%的概率这种情况会发生,未来十年可能会出现由于AI的发布导致失业率在很短的时间内(比如六个月)上升5%,我认为这将对世界产生非常重大的影响,无论是人们对AI的看法还是它受到的关注程度。这对我来说似乎是有可能的,但远非确定无疑。
You know, 20% chance, 30% chance something like this will happen in the next decade: like, you know, a 5% increase in unemployment over a very short period of time, like six months, due to AI being released. That's something that I think will have a very substantial impact on the world, both in terms of how people think about AI and sort of how much attention it gets, and it seems plausible to me, but, you know, far from guaranteed.
是啊。
Yeah.
我认为确实存在高度不确定性。
I think I strongly agree with being just highly uncertain.
在我看来这完全有可能——这一代人工智能实际上就是我们当前技术发展的极限了。
It seems very plausible to me that you end up more or less kind of, you know, this generation actually is exactly where we run out of progress.
这虽然听起来很疯狂,但确实可能发生。
It would be kind of crazy, but it could happen.
然后就会是这样:一切主要是在为技术人员创造更多岗位,让他们设法把AI整合进人们现有的各种工作中,做些有用但粗糙的事情。
And then it's like, oh, okay, everything is very much just generating more jobs for technical people to try to integrate it into doing kind of useful but janky things for all of the existing work people do.
至于那种演变成疯狂失控局面、真的能用它自动化大量远程工作的情形,
The stuff where it kind of becomes a crazy runaway thing that you can really automate large swaths of remote work with.
我的时间线预测可能比其他人要长一些,但很难排除十年内发生重大突破的可能性。
I mean, my timelines are, I guess, probably a bit longer than the others, but, I mean, it seems hard to rule out that something really big happens in a decade.
十年时间相当长了。
A decade's quite a long time.
我认为如果未来十年内AI没有自动化掉现有5%的工作岗位,我会感到惊讶。
I think I would be surprised if there were not 5% of jobs that exist now, which AI has automated away over the course of the next decade.
说实话,如果自动化比例达不到现有工作岗位的10%,我反而会觉得意外。
I'd honestly, I'd be surprised if it's not 10% of the jobs that exist now, I think.
但这一进程的速度以及这些人员能否找到其他工作,我认为目前还没有足够证据能说明,这可能取决于各领域发展速度以及具体哪些工作被自动化。
But how fast that happens and, like, the extent to which those people find other jobs is something which I don't think I have seen compelling evidence for either way, and and probably depends on how fast various things go and exactly what jobs are automated.
我认为未来十年自动化10%的现有工作岗位是个相当合理的下限预测——虽然不完全是我的最低预期,但这个数字很合理,不过这可能不会在整体就业数据中显现出来。
I think that 10% of current jobs over the next decade seems like a pretty reasonable number. It's not quite my lower bound, but, you know, it's pretty reasonable. But this might not show up in overall employment numbers.
是啊。
Yeah.
这很有意思。
This is interesting.
我的意思是,从主流经济学观点来看,自动化更可能是发生在任务层面而非职业层面,因此职业数量可能会大幅减少,但很多时候你是在跨多个工作岗位中自动化这些类似任务。
I mean, definitely, like, the kind of to the extent there is a mainstream economics view of this stuff, it would probably be that automation happens at the level of tasks rather than occupations and occupations can as a result, go down quite a bit, but a lot of the time you're automating these like similar tasks across lots of jobs.
我认为这与你的说法是一致的。
I think this is compatible with what you're saying.
只是有些工作会受到很大冲击。
It's just that some jobs get really hit by it.
我不知道。
I don't know.
我觉得,是的,这个问题确实很难思考。
I find it, yeah, quite hard to think about.
我甚至不确定历史上职业消失的基本速率是多少。
I'm not sure what even the historic base rate for kind of jobs ceasing to exist is.
我知道这里面存在问题,比如历史就业数据系列中,实际上有相当高的基础变化率——工作内容在变、职业本身在变、职业种类此消彼长。
I know there are problems with this, like the historic employment data series, there is actually quite a high, I believe base rate of just the tasks in a job changing, jobs themselves changing, jobs kind of going away coming in.
所以即便是这个5%的说法,我也不知道该怎么看。
So yeah, even this 5% thing, don't know what to think.
是的,那将产生巨大影响,或者说...这实际上已经接近软件行业带来的变革规模了。
Yeah, that would be like a big effect or kind of, yeah, that's actually roughly the size of the effect you've already seen from something like software.
我不知道。
I don't know.
是的,可能有5%在软件出现前就存在的工作岗位现在已经不存在了。
Yeah, probably 5% of jobs that existed before software no longer exist.
这个比例看起来相当合理。
It seems pretty reasonable.
我对此并不太有信心。
But I'm not confident in this.
这绝对是个未知领域。
It's definitely something which, like I don't know.
我预计,特别是如果收入趋势持续下去,我预计一两年内——很可能就在明年——我们会了解更多,因为届时AI创造的收入将足以成为经济的重要组成部分。
I expect, especially if revenue trends continue, to know a lot more about this in a year or two, probably within the next year, because it will just be the case that, okay, we'll have AIs earning enough to be, like, a substantial part of the economy.
如果这没有体现在失业率上,那我们就明白了它的作用机制。
If it's not showing up in unemployment, then we've learned something about what it's doing.
我们会发现它能在不影响失业率的情况下实现这些变化。
We've learned that, like, it's able to do this without showing up in unemployment numbers.
或者它可能会影响失业率,届时我们就能看清具体情况了。
Or maybe it will show up in unemployment numbers, and we'll see exactly what.
已经有了一些早期研究在关注这方面的指标。
There's been, like, some early work looking at, like, indicators of this.
有很多因素使这个问题变得复杂,因为利率也会影响你可能关心的那些方面,或者只是正常的行业波动。
There's a lot of things that complicate looking into this because interest rates also have effects on, like, the sort of things you might care about or just, normal churn.
还有一种可能是科技公司会解雇一批程序员,以便有资金建设数据中心。
Or also, it's possible that tech companies maybe they'll lay off a bunch of programmers so that they have the capital to build data centers.
这些程序员被解雇是因为AI吗?
And are those programmers being laid off because of AI?
我不知道。
I don't know.
也许吧。
Maybe.
如果你有个刚上大学的孩子,他们问:'嘿,如果我想有个好职业,我该主修什么?'
If you had a kid that was a freshman in college, and they were asking, hey, what should I major in if I wanna have a great career?
你会怎么回答他们?
What might you tell them?
如果他们问起计算机科学、数学或技术工程。
If they asked you about computer science or math or Tech engineer.
对,正是如此。
Yeah, exactly.
你会怎么说?
What would you say?
我想我大概会说不要选择提示工程师这个方向,总体来说。
I mean, I'd probably say not prompt engineer, I think, in general.
人们在使用AI方面会越来越熟练,它非常容易上手。
People get better at using AI; it's very easy to use.
是的。
Yeah.
我觉得这是个好问题。
I think it's a good question.
我认为他们应该选择这样的专业:如果主修编程或计算机科学,他们应该关注的重点不是成为那种只掌握某种编程语言技能的人。
I think they should probably major in something where... if they're majoring in programming or computer science, the skills that are gonna be useful are not going to be knowing a programming language.
更重要的是通用技能,比如与他人协作的能力、沟通技巧这类东西。
It's going to be more general purpose skills, ability to, like, work with other people, communication skills, that sort of thing.
我不完全确定这是否指向某个特定专业。
I don't really know entirely if this points to a particular major.
大多数专业可能与你实际从事的工作并不直接相关。
Most majors are probably not majors that are, like, actually relevant for your job.
是的。
Yeah.
我想我可能会说,对于超级疯狂的未来,其实没有太多可以提前规划的空间。
I guess I'd sort of be like, well, there's not too much that you can do to plan around the super crazy futures.
所以我觉得应该选择你热爱且对世界有用的事物,但不要在这方面走极端。
So I guess go for something that you're passionate about that's useful in the world, but don't go crazy in that way.
我其实认为,计算机科学和数学,如果你对它们充满热情,那是非常好的选择,因为你能学到在许多领域都有价值的趣味知识。
I actually think that, computer science, maths, if you're passionate about them, they're very good because you'll learn interesting things that are valuable in many worlds.
不过我也不确定,最近我给一位年轻亲戚提建议,结果他们选择了学习戏剧。
But I don't know, I gave advice to a younger relative recently and they chose to study drama instead.
确实,我认为如果你在大学期间过得更愉快,那相当于你生命中四年时光都更美好。
I do think that, you know, one of the things that if you have a better time in college, that's like four years of your life you've had a better time during.
说到底,如果未来充满不确定性,你怎么知道哪条路最终会让你更开心?
And at the end of the day, you know, it's a crapshoot which of those things is actually gonna give you a better time in the future.
活在当下确实容易得多。
Planning for the present is a lot easier.
是啊。
Yeah.
我的意思是,现在真的越来越难判断了。
I mean, it's definitely becoming really hard to know.
对吧?
Right?
我记得两年前大家都把提示工程师当笑话,因为那时人们觉得这职业根本不靠谱。
I mean, I remember, like, the prompt engineer was obviously a joke because everyone believed two years ago that that was some sort of viable thing.
但显然,现在的AI模型在生成提示方面已经强得离谱。
And, obviously, models are phenomenally better at, like, just being great prompters.
所以很明显,这算是已经发生的一个趋势。
So, obviously, like, that that's kinda, like, one thing that has been happening.
随着这些模型不断进步,真的很难预测未来会发生什么。
It's just really hard to predict what's happening as these models keep getting better.
我有个相关的问题:代码显然是个巨大的市场,已经产生了重大影响。
One question that I have related to this is, obviously, code is such a big market, and it has had such a big impact.
还有一个让我非常兴奋但尚处早期的领域,我认为是计算机使用。
One that I'm very excited about, but it's still much earlier, I think, is computer use.
对吧?
Right?
它基本上是在自动化你在电脑上完成的所有数字任务。
It's basically automating all the digital tasks that you're doing in your computer.
而且这方面的基准测试非常少,无论是WebArena还是OSWorld。
And there's very few benchmarks around this, like, whether it's WebArena or OSWorld.
你在报告中稍微提到了一些关于基准测试的内容。
You talk a little bit on your report about benchmarks.
我很好奇,你觉得这个领域还缺少什么?
Curious on, like, what do you think is missing in that space?
比如,为什么我们还没看到像Sonnet 3.5、Claude Code或Codex发布时那种编码能力显著提升的突破时刻?
Like, why haven't we yet seen that moment, like, for example, when Sonnet 3.5 came out, or Claude Code or Codex, where we saw a significant improvement on coding in general?
在计算机应用领域我们还没迎来那样的转折点。
We haven't had that moment for computer use.
你认为关键缺失是什么?
What do you think is missing there?
有意思。
Interesting.
其实计算机应用方面确实已有不少改进。
I mean, there have been improvements on computer use for sure.
或许我的观点有点冒险,但我确实认为模型的视觉能力某种程度上人为限制了它们的发挥。
I do have I mean, maybe I'm going out on a limb here slightly, but also I do think that there is a sense in which models are a little bit artificially hobbled by their vision capabilities.
当你让模型操作图形界面时,常见情况是它们会对交互方式感到困惑——就像处理复杂编程问题时容易陷入思维僵局,但这个问题被放大了,因为你无法直观地回看并意识到哪里出错了。
Like, it does seem as if a common pattern you see when you try to get models to do stuff with a GUI is they kind of get a bit confused about manipulating it, in a way where it's like, okay, this is interacting with your general propensity to get confused on long tasks, as you would in difficult long coding problems, but it's kind of exacerbated because you're not able to just easily look back on the thing and see, oh, I was wrong.
你反而会陷入一个可怕的死胡同,就是不停地点击这个按钮,一遍又一遍。
You instead go down like some awful dead end of just, I'm just gonna click this again and again and again.
所以我认为这是问题的一部分。
So I think that's part of it.
我觉得这里可能还涉及长上下文连贯性的问题,比如那些代表图形界面的标记相当庞大,随着操作进行,你的上下文窗口会被逐渐填满——就像'哦对,之前发生的所有事情',然后你就会陷入一个输出越来越不合理的恶性循环。
I think there is something here also, probably, about long-context coherence: those tokens to represent the GUI are pretty big, and then you're filling up your context window as you go with, like, oh yeah, well, I had all of this stuff that's happened before, and you seem to just run into a kind of spiral of increasingly less sensible outputs.
所以我觉得这是两个主要问题,但不确定这是否回答了你的疑问。
So I feel like these are two of the big things, but I don't know if that answers your question.
我...我发现计算机使用...我也说不清楚。
I I found computer use I don't know.
今年是我第一次真正觉得计算机使用很有帮助。
This was the first year I found computer use actually useful.
我们在数据中心研究中使用了ChatGPT代理,因为很多工作涉及查找许可证——这些许可证都分散在德州阿比林县这种地方的各种简陋的县级数据库里。
We use ChatGPT agent in our data center research because a lot of what we have to do is find permits, which are all going to be on janky county-by-county databases of air permits for, you know, the county that Abilene, Texas is in.
而且我不可能知道美国每个县都有哪些数据库。
And I don't know what databases exist for every county in the US.
ChatGPT可以做到。
ChatGPT does.
普通的ChatGPT无法搜索这些数据库,因为它们都是些实际的用户界面。
Normal ChatGPT can't search them because it's these, you know, these actual user interfaces.
你不能简单地用URL来搜索它们,因为它们根本没那么好用。
You can't just search them with, you know, URLs because they definitely don't work that well.
但它能成功导航这些界面,我只需让它帮我找某个城市数据中心的许可证,它就能返回空气污染许可证、税收减免文件等资料,让我获取大量信息。
And it's able to navigate this such that I can just ask it to find me permits on a data center in a particular city, and it will come back with air pollution permits and, like, tax abatement documents and all of this stuff that let me learn a huge amount.
这完全得益于过去一年左右计算机应用技术的进步。
And this is just, like, because of the improvements we've seen in computer use over the past year or so.
我对此感到非常兴奋。
I'm excited to yeah.
我认为它只会变得越来越好,而且确实已经开始变得真正有用了。
I think it's just just gonna get better from there, but I've definitely found it starting to get to the point where it's actually useful.
你更广泛地认为生产力或整体经济统计数据会发生什么变化?
What's your mental model more broadly for what is going to happen to productivity or just sort of economic statistics in general?
有些人说GDP增长会达到5%。
Some people say GDP growth would be, you know, 5%.
我认为这是泰勒·考恩的观点。
I think it's a Tyler Cowen view.
我想有些人会说不会。
I think some people would say, no.
不会。
No.
如果我们真的拥有通用人工智能,按照我们的理解方式,增长率应该达到10%甚至更高。
It should get up to 10% growth or maybe even higher if we truly have AGI in terms of how we understand it.
你对生产力会发生什么变化的模型是什么?
What's your model of what happens to the productivity?
我认为我的基本预测是,如果收入继续保持这种增长趋势,理论上值得投入那么多资金购买芯片进行推理,那么届时你应该能获得与投入相当的价值回报。
I think my kind of baseline guess would be, you know, I forecast out: if revenue keeps growing the way it has, then in theory, for it to be worth spending that much on those chips to do that inference, you should be getting something kind of similar to that value out of those chips by then.
所以你可以从中得出结论,哦,好吧。
So then you could just draw from that kind of like, oh, okay.
所以推算到2030年,你需要
So extrapolating to 2030, you need...
我认为报告中提到过,虽然我不确定具体数字,但我计算得出大概是GDP增长1%左右的水平。
And I think for there it was in the report, I don't know, I calculated it, but I think it was on the order of like a percent kind of GDP increase.
那是在几年后对吧?
That's in a few years, right?
这并非假设通用人工智能出现,而是假设英伟达股票收入继续保持之前的增长趋势,并且假设他们能产出与之前相当的计算能力等等。
That's not presuming AGI, that's presuming like if Nvidia stock revenues keep growing as they sort of previously have, and you assume that they make roughly as much compute from it as before and so on.
如果真的实现了通用人工智能,人们会将其用于无数不同领域。
If you actually get something... I mean, AGI is like, yeah, people use it to mean umpteen different things.
我认为如果真的实现了能远程完成人类所有任务的技术,那么很可能会看到巨大的增长。
I think if you actually get something that can do any tasks that humans can do remotely, then presumably you see a lot of growth.
要准确预测会出现什么样的滞后效应似乎相当困难。
It feels sort of difficult to guess exactly what kind of a lag you're going to see.
我认为有理由认为,人们可能对新事物的接受速度较慢,他们需要学习如何信任它等等。
I think there's reasons to think, oh, well, maybe people will be slow to adopt stuff, how do they learn to trust it, whatever.
也有其他理由认为,他们已经在使用这些技术,很多情况下实际进展可能比大多数增长都要快,事实上语言模型的采用速度确实比以往许多技术更快。
There's other reasons to think, they're already using these technologies, a lot of it might actually be quicker than most growth, and indeed adoption has been quicker for LLMs than for many previous technologies.
所以是的,我认为到那个阶段建模就变得相当困难了。
So yeah, I think it sort of gets hard at that point to model.
我们网站上曾有过一些粗略数据,比如假设虚拟劳动力翻倍会怎样?
At some point on our site, we had some rough numbers where it was stuff like, what if you doubled the virtual labor force?
如果是十倍呢?
What if you 10x'd that?
随便吧。
Whatever.
然后你就会看到这些疯狂的GDP增长。
Then you see these like crazy GDP boosts.
我不确定这是否是最合理的思考方式。
I don't know whether that's the most reasonable way to think about it.
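As an aside, the "double the labor force, see a big GDP boost" arithmetic can be sketched with a textbook Cobb-Douglas production function. This is only an illustration, not necessarily the model behind the numbers mentioned above: the function name `gdp_multiplier` and the labor exponent of 0.6 are assumptions chosen for the example, and capital and technology are held fixed.

```python
def gdp_multiplier(labor_multiplier: float, labor_share: float = 0.6) -> float:
    """Output multiplier under Cobb-Douglas, Y = A * K**(1 - s) * L**s.

    With A and K held fixed, scaling labor L by a factor m scales
    output Y by m**s. The default s = 0.6 is a hypothetical labor
    share picked for illustration, not a figure from the conversation.
    """
    if labor_multiplier <= 0:
        raise ValueError("labor_multiplier must be positive")
    return labor_multiplier ** labor_share


if __name__ == "__main__":
    # The two scenarios mentioned in the conversation: 2x and 10x labor.
    for m in (2, 10):
        print(f"{m}x labor -> {gdp_multiplier(m):.2f}x output")
```

Under these assumptions the boost is large but less than proportional, because fixed capital becomes the bottleneck; letting capital grow alongside labor would push the result closer to the full labor multiple.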
我认为很大程度上取决于你是否想象真的能获得一个可以完成所有事情的系统,还是说最初只能完成远程任务中的一部分,但可能无法处理整个类别的工作,从而导致更多瓶颈。
I think a lot of it comes down to whether you imagine that, yeah, you really get something that can do everything, versus you first get something that can do a meaningful fraction of remote tasks but maybe can't do an entire bucket of them, and then it bottlenecks you more.
所以我想这又回到了那个问题,根据当前趋势,我最好的猜测是到2030年这会带来相当明确的几个百分点的GDP增长,按经济标准来看已经相当惊人了。
So I guess it's again, this thing of like, my best guess on current trends is this fairly well defined, few percent of GDP in 2030 thing, which is already pretty crazy by economic standards.
但当你展望更远的未来时,天啊,你知道,我的预测会变得更加疯狂。
But then once you go much further, it's like, god, you know, my predictions are just gonna be even crazier.
我...我不太愿意做这些预测。
I I'm reluctant to make them.
我会稍微不那么犹豫,提出一些主张。
I am gonna be slightly less reluctant and make some claims.
我们是
We're
在这里为了
here for.
假设在未来十年内,我们获得能够像人类一样胜任任何远程工作的人工智能,我认为30%的GDP增长似乎是一个合理的下限——当然这建立在一个重大假设基础上,即很多人会接受这个观点,要知道这个假设本身包含了许多复杂因素。
Assuming in the next ten years we get AI that is capable of doing any remote job as well as any human, I think, you know, 30% GDP growth seems like a lower bound on something that's reasonable. Assuming you get this is a big assumption; there's a lot going on in that assumption.
但如果这种情况真的发生,我认为结果要么是获得30%的GDP增长,要么就是负100%的GDP增长——因为人类都灭亡了。
But assuming that happens, I think you are either gonna get, like, 30% GDP growth or, you know, negative 100% GDP growth because everyone's dead.
说白了,最终看来,你将会拥有可扩展的人工智能。
It's just like, at the end of the day, it seems like you're gonna have AI that can scale.
但如果你拥有可扩展的人工智能,那很可能还能进一步扩展。
But if you have AI that can scale there, you could probably have AI that scales even farther.
就目前而言,我所见过的经济模型显示,如果实现这种全面替代——即可以自动化工作岗位——要么会出现极速爆发式增长,要么就像有些人尝试后说的那样,翻看几段文字后假设当前水平——假设AI具备GPT-3的能力。
And right now, I think the economic models I have seen of what happens if you get this sort of full replacement, where you can automate a job, either show this sort of extremely fast, wild takeoff, or you have some people attempting to do this, and then you look down through the paragraphs and it's assuming AI is as capable as GPT-3.
我认为较小的数字要么是近期预测,要么是没有考虑到未来十年可能出现的更高端能力。
You know, I think the smaller numbers are either nearer-term predictions or predictions that aren't looking at the upper end of what sort of capabilities you might see in the next ten years.
是啊。
Yeah.
确实很难想象一个拥有这种虚拟劳动力的世界——它能完成人类所有工作——却不会引发疯狂变革。
I mean, it does seem hard to imagine a world where you have this supply of virtual labor that literally can do any stuff that humans can do, and then it doesn't lead to crazy things.
这点我完全同意。
I definitely agree with that.
我在想,或许会出现某种严格监管的情况吧。
I guess perhaps maybe some sort of a, I don't know, a heavy regulation situation.
但是确实存在。
But There are.
确实如此。
Do yeah.
我认为存在一些世界在那之后不会变得疯狂。
I think there exist worlds in which things don't go crazy after that.
这些世界似乎并不处于无限稳定的状态。
It does seem like those worlds are not in an indefinite stable state.
但你知道,这并非不可能,但默认情况似乎要么疯狂上升,要么疯狂下降。
But, you know, it's not impossible, but it does seem like the default there is you either go crazy up or you go crazy down.
很可能会是这两种情况之一。
And it's probably gonna be one of those two.
如果你进入一个AI真的能像人类一样胜任任何工作的世界。
If you get to a world where it's like, genuinely AI can do any job as well as any human.
我想人们...我不知道。
I think people I don't know.
在我看来,声称默认情况应该是不会出现特别荒谬的变化,这种说法很疯狂。
It seems wild to me to claim that, given that, your default case should be, you know, not super ridiculous changes.
就像是,你的AI在那里已经能做很多事情了。
It's just like, that that's a lot of things that your AI can do right there.
对,就是这样。
And that's like yeah.
这似乎应该从根本上改变了经济,无论是向好还是向坏的方向。
It it just like seems like it should have fundamentally changed the economy in one direction or another.
我的直觉是很多分歧...
My intuition is a lot of the disagreement.
我是说,可能部分确实源于人们已有的固化信念,但我确实也认为部分原因是当人们谈论'通用人工智能'、'能做我工作的AI'之类的话题时,尽管感觉我们在讨论同一件事,但可能有时并非如此。
I mean, probably some of it does come down to sort of cached beliefs people already have, but I do also think some of it is that when people talk about, like, oh yeah, AGI, AI that can do my job, whatever, even though we feel like we're talking about the same thing, maybe sometimes we're not.
我不知道,我确实有过这样的对话例子:'是啊是啊,它能做任何远程工作'。
I don't know, I've certainly had examples of conversations where it's like, yeah, yeah, it can do any remote job.
然后他们讨论它做不到的事情,那些它确实做不到的事情。
And then they discuss stuff that it can't do and the stuff that it can't do.
这就像是,不,那也是一种远程工作。
It's like, well, no, like that's also a remote job.
就像这是人们目前正在做的那种工作。
Like that's the kind of thing people currently do.
所以我认为这其中存在一些这样的情况。
So I think there is some of this.
你怎么看?我是说,你在报告中谈到了基准测试,但我好奇,到2027、2028年,除了经济增长,衡量模型能力的基准会是什么?比如模型的智能水平?
What do you think... I mean, you're talking about benchmarks in your report, but I wonder, like, in 2027, 2028, what are gonna be the right benchmarks for measuring progress, less the economic growth and more the capabilities of the model, like the intelligence of the model?
就像我们在2012年有AlexNet,显然这个问题早就解决了,但那无论如何都不是衡量AGI的标准。
Like, we had AlexNet in 2012. Obviously, that got solved long ago, but that was probably not a measure of AGI by any means.
你认为我们现有的基准测试也会面临同样的情况吗?
Do you think the same would happen with the current benchmarks we have?
比如说SWE-bench、MMLU,假设我们在这些基准上已经达到了极限。
So SWE-bench, MMLU, let's say we maxed out on those benchmarks.
那之后又该用什么来衡量呢?
What comes after that?
我们该如何衡量这一点?
How do we measure that?
这是否类似于这些模型的GDP增长?
Is it a sort of, like, GDP growth with these models?
是科学突破吗?
Is it sort of breakthroughs in science?
你认为未来的正确衡量标准是什么?
How do you think is the right measure going forward?
是的。
Yeah.
我的意思是,我认为我们现有的大部分问题很可能都会被解决。
I mean, I think most of what we have is likely to be solved.
事实上,你举的例子已经非常接近了,比如我不确定,你们基本上已经解决了SWE基准测试,可能接近程度取决于某些问题的模糊性,还有一些细节,但确实快达到了。
And indeed, the examples you gave are pretty close already. Like, I don't know, SWE-bench is basically solved, or possibly close; it depends a bit on how ambiguous some of the questions are, there's some details, but it's really getting there.
我认为有些方向是明确的,你们可以做类似但更难、更好的事情,并尝试让它们更现实一些,人们正在这样做。
I mean, I think some directions are obvious, you kind of do similar things but harder and a bit better and try to make them a bit more realistic and people are doing this.
例如,人们已经投入更多努力去构建涵盖更大任务的更难的软件基准测试。
There are harder software benchmarks that people have made more of an effort to try to curate and that cover larger tasks, for example.
我认为可能还涉及到一些预算方面的问题。
I think there's also perhaps some question of kind of budgets involved.
我确实认为存在这样一种情况:显然,单纯烧钱并不会本质上让基准测试变得更好,但你可能需要平均投入更多资源。
I do think there's this kind of thing where like, obviously if you just burn money, it doesn't intrinsically make the benchmark better, but probably you are gonna see something where you're just gonna have to devote more resources on average to them.
如果你想证明更高层次的能力并达到更高的验证标准,可能需要投入更多开发精力。
Like if you're trying to prove a sort of higher level of capabilities to a higher standard of proof, probably it's gonna involve kind of more effort in developing them.
不过我也认为,你会看到一些数量相对较少但非常令人印象深刻的事例。
I do also think though, you're going to see examples of relatively small kind of small numbers of things that are just very impressive.
这些也是宝贵的信息信号。
And these are also valuable signal.
比如当你看到大语言模型能够做到这样的事情——'哦,重构了整个代码库而且非常有用'——这就会很有价值。
Like, when you see LLMs being able to do things like, oh yeah, it just refactored this entire code base and it was really useful, then this is gonna be useful.
即使它尚未被形式化为基准测试,如果你亲眼所见,它对你来说就是一种有用的证据。
And even if it's not yet formalized into a benchmark, if you've seen it for yourself, it's gonna be kind of useful for you as evidence.
然后人们可能会制定涵盖这类内容的基准测试,试图将其系统化。
And then people are probably going to make benchmarks that cover things like this to try to systematize them.
我想回到我们关于时间线的问题,并询问你对几个不同里程碑的看法,了解你对这些时间线的观点。
I wanna go back to our question on on timelines, and I wanna ask you about a few different sort of milestones and get your perspective on on timelines there.
那么首先,你认为AI解决一个重大未解数学问题的大致时间线是怎样的?
So first, what is your rough timeline for a major unsolved math problem being solved by AI?
哦,其实我也想过这个问题。
Oh, I actually wondered.
是的。
Yeah.
因为你之前提到过几个类似的问题说要关注。
Because you had a a few of these that you said trust to look at.
当你说它解决了这个问题时,是指完全无人协助的情况下吗?
When you say that it solves this, I mean, is this unassisted entirely?
是类似新闻报道还是有人发推说,嘿,我把这个丢给GPT它就解决了?
Is it kind of a news report or someone tweets that, hey, like I dumped this at GPT and it solved it.
什么才算是重大成就?
And what counts as major?
我们都能认同的那种。
Something that we would all agree.
比如实质性的、重要的版本。
Like a substantive, you know, version of it.
不是那种道听途说的个人描述。
Not not a, you know, just an anecdotal, you know, person describing it.
但它必须完全独立解决吗?
But does it have to solve it on its own?
是的。
Yeah.
我们就这么定义吧。
Let's go with that.
当然。
Sure.
是的。
Yes.
老实说。
Honestly.
哦,是的。
Oh, yeah.
因为我认为已经有一些案例表明,语言模型确实可以做到。
Because I think there's already cases, it seems, of LLMs being... yeah.
就像,人们还在争论,但那些看起来值得信赖的数学家们都在说,哇。
Like, people are debating a little bit, but mathematicians who seem trustworthy are saying like, wow.
我用过这个,它在我证明过程中真的很有帮助。
I used this, and it was really helpful during my proof.
是啊。
Yeah.
如果AI在未来五年内解决了像黎曼猜想这样的重大未解数学难题,我一点都不会惊讶。
I would not be surprised if AI solves, like, a major unsolved math problem like the Riemann hypothesis or similar in the next five years.
我不会说这一定是我的中位预期情况,但我确实不会感到特别惊讶。
I'm not gonna say that, like, that's my, you know, median case necessarily, but I definitely wouldn't be that surprised.
就像现在看起来,数学对AI来说似乎没那么困难。
It's like right now, it doesn't look like math is that hard for AI.
就像有些事结果很难,有些则不然。
It's just like some things turn out to be hard and some things don't.
数学恰好是强化学习表现相当好的领域之一,而在大多数其他领域,它还没达到对正教授同样有用的程度——我认为数学领域已经达到或非常接近这个程度了。
And math is just one of the domains where RL seems to work pretty well. In most other domains, it's not at the point where it's, like, useful to a full professor to the same extent I think it is, or is getting very close to being, for math.
是啊。
Yeah.
而且归根结底,它在某些方面异常出色的能力最终可能被证明非常非常有用的程度也极不明朗。
And also, it's very unclear to what extent certain capabilities that it has unusually well might actually turn out to be very, very useful.
比如,说不定它会发现四篇含有冷门结论的论文,这些结论组合起来就能解决某个重大猜想——这类事情AI可能比人类更容易发现。
Like, maybe it'll turn out that there's, like, four papers out there that it knows about that have obscure results in them that when combined solve some big conjecture, which is the sort of thing that it, like, might be much more feasible to figure out with AI than for a human to figure out or something similar.
这里有很多不确定性,但目前看来AI似乎不会在这方面遇到困难。
There's a lot of uncertainty here, but it just, like, does not currently seem like something that AI is actually gonna struggle with.
人们经常声称这像是某种直觉层面的深奥事物,意味着AI要达到某种极高的智能水平才能解决。
People often make claims about it being like this, you know, intuitive deep thing that it would mean that AI has achieved something, some huge level of intelligence for it to solve.
我认为实际上,这就像是创作一件艺术品。
I think in practice, this is just like, you know, making a piece of art.
事实证明AI在能够记住超过几天或其他事情之前,就已经能做到了。
It turns out AI could just do that before it can do a lot of other before it can, you know, remember things for more than a couple of days or whatever.
是啊。
Yeah.
结果它位于能力树上比人们预想的更远的位置。
It turns out to be farther down the capabilities tree than people might have guessed.
没错。
Yeah.
不过我也持乐观态度,尽管确实如此。
I think I'm also bullish, though I do think that... yeah.
这是那种棘手的事情,你真的需要明确定义它才能做出好的预测,希望能得到准确的预测。
And it's one of those things where it's tricky and you really probably do need to define it quite well to get a good forecast on it, to hope to get a good forecast on it.
比如,我不知道,我们在数学基准测试中有过这样的经历,让数学家解决一些我认为难度不及你所说的那些问题。
Like, I don't know, we've had this experience with benchmarking mathematics: we got mathematicians to come up with problems that I think aren't as difficult as the kind of problems you're talking about.
但尽管如此,他们还是觉得,是的,AI能解决这个问题。
But nevertheless, they're like, yeah, AI could solve this.
这对AI进展来说会是件大事。
It'd be like a big deal for AI progress.
这对我来说意味着什么。
It would mean something to me.
然后AI确实解决了它们。
And then AI has solved them.
通常他们的反应都是,哦,这让我稍微更新了一下认知。
And usually their response has been kind of like, oh yeah, that updates me a bit.
或者当我仔细看时,才意识到,是的,你可以用蛮力解决这个,可以取巧通过。
Or then man, when I look at it, I just realized like, yeah, you can kind of brute force this, you can kind of cheese this, you can get through.
这有点像,哦好吧,如果有个对人类来说算得上重大的问题,AI解决了它,然后大家就说,好吧,它解决了,就这样。
And it's a bit like, oh, okay, I mean, what if there's a problem that for humans we consider sort of, oh, this would be quite big and then yeah, AI solves it and okay, ah, well, it solved it, whatever.
我们几十年前在国际象棋上就有过类似经历,对吧?
We sort of had this with chess decades ago, right?
就像计算机非常擅长下棋,大家都认为这是推理的巅峰,然后它们做到了,结果人们反而觉得‘哦,计算机当然会下棋’。
Like, computers solved chess very well, and everyone had been thinking of this as the pinnacle of reasoning, and then they did it, and everyone as a result kind of concluded, "Oh, well, of course computers can do chess."
所以,嗯,我也不确定。
So, yeah, I don't know.
我怀疑数学对AI来说是个不错的领域。
I suspect that math is quite nice for AI to do.
我不太愿意断言说‘AI肯定会在未来几年解决某些千禧年难题’,但如果它在未来几年解决了一些看似惊人的问题,我一点都不会感到惊讶。
I'm reluctant to go out and assert like, Oh yeah, definitely AI is gonna solve some of the Millennium Prize problems in the next few years, but it would not at all surprise me if it solves quite impressive seeming things in the next few years.
那么,生物学或医学领域的突破呢?
And then, what about a breakthrough in biology or medicine?
我们已经看到一些成果了,那个叫什么来着?
And we've already seen some of that with the what's it called?
AlphaFold。
AlphaFold.
数学对AI来说似乎异常简单。
Math seems to be unusually easy for AI.
说实话。
I'm gonna be honest.
以至于我在想,它会不会达到同样的水平,比如独自完成这样的大事?
So to the extent where I'm like, ah, is it gonna do the same exact level of, like, oh, it on its own did this huge thing?
这在我看来似乎是个更大的跨越。
That seems to be a much bigger stretch to me.
这确实看似合理,但还存在许多其他问题,比如它需要能够实际进行实验、获取数据并与现实世界互动,而这些在数学领域完全不需要。事实上,它们看起来更遥远。
It definitely seems plausible, but there's a lot of other concerns there, where it needs to be able to actually do experiments and get data and interact with the real world for a lot of these, in a way that does not need to happen at all for math in particular. So they, in fact, seem farther off.
在我看来更可能的是,我们会看到AI工具在生物学、化学等有用领域的某些方面变得无处不在,从而增强这些领域的特定能力。
What seems more plausible to me is that it becomes ubiquitous to use AI tools in some aspect of, like, biology or chemistry or something useful like that, so that certain aspects of it are enhanced.
当然,AI也可能在没有人类协助的情况下取得惊人进展。
It also is possible that AI will, you know, make incredible strides without Yeah.
我认为没有人类参与会更困难。
I think without humans, but it's it's harder.
是的,我认为这又回到了界限划定的难题上。
Yeah, I think, again, it's a bit tricky for where you draw the line.
我是说,你之所以没把AlphaFold这类工具算进去,如果算的话,你大概会支持这个观点对吧?
I mean, I think you're not counting tools like AlphaFold because if you were, then probably you'd argue for that, right?
其发明者共同获得了诺贝尔奖。
The inventors shared the Nobel Prize.
不过确实,我想研究方向是有所不同的。
But yeah, I mean, I guess there's kind of different directions.
在生物学领域,AI可能实现精准的特定预测,也可能发展成更通用的'协科学家'模式——它能查阅文献并产生优质创意,而人类参与程度各有不同。
In biology, you could have AI being able to predict quite specific things like that, or you could have something that's more general purpose, this so-called "co-scientist" approach, or whatever they want to call it. It's more about, like, oh, it was able to look through the literature and have good ideas, and there's different extents of human involvement.
目前似乎已出现一些令人印象深刻的成果。
There already seem to be some results where impressive stuff is happening.
我尚未深入验证这些成果是否已达到你所说的——
I've not vetted them enough to really have a sense of like, would this already count as having satisfied?
那种令人惊艳的标准。
Yeah, the sort of level of impressiveness you're looking for.
我大致认为,发现最终有意义的事物很快就会实现,如果尚未实现的话。
I sort of assume that finding things that end up being meaningful will happen pretty soon if it hasn't already happened.
但接下来可能有个问题,就是它是否能像人类研究者那样,真正优先处理最优的几个方向?
But then maybe there's a question of kind of, okay, but is it doing as well as human researchers actually prioritizing the best few ones to work on?
我认为这些协同科学研究成果中,大部分可能都有人类深度参与优先级的设定。
I think most of these co-scientist results have probably had pretty involved humans prioritizing.
不过重申一下,我了解得还不够多,无法断言。
Though, again, I've not looked enough to say.
最后,关于真正的超级智能,按照你对超级智能的定义怎么看?
Lastly, how about for real superintelligence, for your definition of superintelligence?
我...我有...我想我...我在公开场合说过...我讨论过的中位数时间线或众数时间线...
I think I am on the record as saying that the median timeline I discussed... or the modal timeline.
抱歉。
Sorry.
我想是众数。
I think it's modal.
是的。
Yeah.
这可能比我预估的中位数时间要早一些。
Which might be on the early side compared to where my median is.
你知道,2045年是我和Jaime做播客时讨论的时间点,我们谈到预测体系崩溃、一切都变得疯狂(这是我用的术语)。
As, you know, 2045 was where when I did the podcast with Jaime, we discussed, like, our forecasting breaking down and everything going bananas is the terminology I have used.
那看起来就像是超级智能。
And that, like, looks like super intelligence.
我认为,如果我们在不久的将来能研发出可以完美胜任所有人类工作的AI,那就意味着通过简单扩展就能实现极大提升,也意味着距离开发出远超人类能力的AI可能只差几步之遥。
You know, I think it's the case that if we get AI that can do every single job that a human can do, as well as any human can do that job, in the near future, then this means that scaling just works to get things much, much better, and it probably means that you are just a bit more scaling away from getting AI that could do anything that humans... sorry,
做事远超人类水平。
do things vastly better than humans.
是的。
Yeah.
预测变得困难。
It gets hard to predict.
而且我认为这也变成了那种预测开始脱离你可以适当建模的东西的情况。
And I think as well it gets to be one of these things where the predictions get a bit unmoored from the stuff that you can like properly model.
就像我的那种猜测,用专业术语来说就是我的判断性预测,认为可以完成任何远程工作任务的AI,其中位数可能在20到25年左右。
Like, my sort of guesses, my judgmental forecasts, to use the fancy term, for AI that can do any remote work task probably have a median of about, like, twenty, twenty-five years.
我有点难以想象一个这样的世界:这种事情已经发生,人们正在部署它进行研究,却没能进一步取得更大突破。
I kind of struggle to imagine a world where that happens and people are like deploying it and doing research and yet they're not making further progress to being able to do stuff much better.
所以我猜按照某种超级智能的定义,这个时间点之后应该不会太久。
So I guess it has to be, like, not too much longer after that for some definition of superintelligence.
但是,是的,一切都非常不确定,而且看起来确实有点崩溃。
But, yeah, all very uncertain and, yeah, it seems to break down a bit.
你经常谈论数据中心、基准测试和生物学方面的进展。
You talk a lot about the progress in data centers, benchmarks, biology.
我注意到一个有趣的部分,在机器人领域,世界模型和物理空间方面正在取得很大进展。
And there was one interesting part that I noticed, which is that the field of robotics is making a lot of progress with, let's say, world models and, like, the physical space a little bit.
我很好奇你对这方面有什么看法?
Curious on, like, what is your take here?
你觉得呢?看起来机器人领域的许多问题似乎仅通过模仿学习就能解决。
Like, what do you think? It seems like a lot of the problems in robotics can be solved purely with imitation learning.
你可能不需要数学或其他领域的重大突破。
You might not need, like, a lot of sort of, like, breakthroughs in math or whatever.
基本上,只要从大量数据中学习就够了。
Like, you can just basically learn it from a lot of data.
我认为过去几年在机器人技术和世界模型方面取得的进展非常显著。
And I think the progress in the last couple of years has been remarkable, just in robotics and world models overall.
想听听你对此的看法,以及你是否在这个领域做过相关研究。
Curious on your take a little bit on this and if you did some kind of research in this space.
我们研究了实际用于这些训练的计算资源规模。
So we've looked into what sort of amount of compute is actually being used to, like, do these training runs.
我们发现用于机器人技术的训练计算量,比前沿模型训练所用的计算量要小100倍。
And what we found is that, in compute, the training runs that are being used for robotics are, like, 100 times smaller than the training runs that are being used for frontier models.
因此在这方面还有很大的提升空间。
And so there's a lot of scaling you could do there.
我认为直到最近、非常近的时候,才有人真正尝试大规模收集机器人技术所需的数据。
I don't think that until plausibly until very, very recently, there have been serious attempts to gather data for robotics at a massive scale.
实际情况是,如果需要的话,你可以雇一群人穿着动作捕捉服来移动。
It's just the case that you can hire a bunch of people to move around in motion capture suits if you need to.
虽然这方面的尝试还不多,但我认为这种情况可能正在改变。
And there haven't been a lot of attempts to do that, although I think this might be changing.
我认为机器人技术主要是个硬件问题。
I think of robotics as mostly a hardware problem.
是硬件和...经济问题。对。
A hardware and, like, economics problem. Yeah.
如果建造一个机器人要花费10万美元,那么它未必比年薪2万美元的人类工人更划算,或者在有些国家可以雇佣非常廉价的人力,抱歉这么说。
If it costs $100,000 to build a robot, then, you know, it's not necessarily better than a human who could work for $20,000 a year, or a very cheap human in certain countries or something. Sorry.
有些国家的最低工资水平,可能让你能够负担得起人工成本。
Or, like, the sort of minimum wage in some countries that you might be able to afford labor for.
在我看来,这里并不存在明显的软件问题。
It's just not obvious to me that there is a software problem here.
硬件方面,确实看起来不太明确。
The hardware, it does seem unclear.
对我来说,硬件问题还剩多少非常不明确。
It's very unclear to me how much of a hardware problem is left.
特别是有些任务机器人或许能完成,但这些真的是你希望机器人能做的任务吗?
In particular, there's certain tasks which robots might be able to do, but are they actually the tasks that you care about a robot being able to do?
如果你希望你的机器人能灵活走动、举起重物、快速移动并做出反应,那确实很难。
If you want your robot to be able to, like, nimbly walk around while lifting up heavy things and moving fast and reacting, then that's hard.
这是个我们尚未看到解决方案的硬件难题。
That's a hardware problem that I don't think we've seen solutions for yet.
是的。
Yeah.
我的印象大致与此相符。
I think my impression roughly matches this.
某种程度上,我不知道,人们经常谈论远程工作与体力劳动之间的这种区别。
It's sort of, I don't know, people fairly often talk about this distinction between remote work and physical work.
我认为这是因为人们普遍感觉机器人技术的进展有些滞后,甚至有种直觉认为物理操控这类事情可能确实更难,但我不太确定这个结论。
I think it's because there's this perception of robotics progress lagging behind a bit. And there even is some intuition that maybe this physical manipulation stuff is actually just harder, but I wouldn't conclude that with much certainty.
就像杰夫说过的,你会想看看如果按类似方式扩大规模后会发生什么,才能有个概念判断。
Like Jeff has said, it feels like you'd kind of also want to see, well, okay, what happens if it gets scaled up in a similar way, to even get a sense of, like: oh, okay,
到底是确实更难,还是只是优先级被降低了?
Was it actually harder versus was it just deprioritized?
还有什么我们没谈到但你觉得很重要,应该留给观众的内容吗?
Is there anything we didn't get to that you feel is important that we leave our audience with?
我们还没有讨论刚刚发布的数据中心项目。
We didn't discuss the data centers release we just did.
我不确定是否有好的方式向观众传达这一点。
I'm not sure if there's a good way to leave the audience with that.
是的。
Yeah.
让我们开始深入讨论吧。
Let's get into it.
好的。
Okay.
所以你们最近刚完成了一个项目。
So you guys just, you know, recently did this as a project.
为什么不谈谈你们在那里试图实现什么目标,以及希望人们从中获得什么?
Why don't you talk a little bit about what you were trying to achieve there and what you hope people take from it?
是的。
Yeah.
我们选取了能找到的13个最大数据中心。
So we took 13 of the largest data centers we could find.
其中包括美国各大实验室的几个数据中心。
These include a few from each of the major labs in the US.
我们还找到了许可证。
And we found permits.
我们拍摄了所有这些数据中心的卫星图像,包括最新拍摄的。
We took satellite images, including new satellite images of all these data centers.
我们通过分析他们正在建设的冷却基础设施,以及上线时间和未来规划,找出了如何确定这些数据中心计算能力的方法。
We figured out how to determine how much compute is in them based off the cooling infrastructure that they're building as well as when they're coming online and their future timelines.
所以我们掌握了这些真实世界的数据,并且全部免费公开在我们的网站上。
So we understand this, like, real world data, and it's all available online on our website for free.
这就像是为了洞察正在发生的巨大基础设施建设和其发展速度,其中有些情况让我非常惊讶。
This is, like, to give insight into this giant infrastructure build-up that's happening and the pace of it. There are some things about it that surprised me a lot.
例如,我们发现最有可能建成首个千兆瓦级数据中心的候选者是Anthropic,这完全出乎我的意料。
For instance, we learned that the most likely candidate to have the first gigawatt scale data center is Anthropic, which would not have been my pick.
但Anthropic与亚马逊位于New Carlisle的Project Rainier项目似乎按计划将于一月上线,随后不久是Colossus 2。
But the Anthropic-Amazon Project Rainier development in New Carlisle seems on track to come online in January, followed shortly thereafter by Colossus 2.
我们还了解到很多关于规模最大的具体计划的信息,而不仅仅是营销计划。
We also learned a lot about what the largest concrete plans are rather than just, like, marketing plans.
有些人会随口抛出数字,但我们发现真正在进行中、已获许可并正在建设电力基础设施的是微软的一个项目,该项目至少部分将由OpenAI在Mount Pleasant使用。
Some people will throw around numbers, but the one we found that's actually seriously underway, has permits, and is, you know, having its electrical infrastructure set up is one by Microsoft, which is gonna be used by OpenAI, at least in part, in Mount Pleasant.
他们称之为微软Fairwater,这个项目将使用的电力规模虽不及纽约市,但我想会超过一半。
They're calling it Microsoft Fairwater, and that one's gonna use not quite as much power as New York City, but I think more than half.
是什么在阻止我们大幅增加集群规模?
What's stopping us from significantly increasing the cluster size?
是成本问题吗?
Is it cost?
是供应链交付周期问题吗?
Is it supply lead times?
还需要其他工程突破吗?
Are there any other engineering breakthroughs required?
电力?
Power?
我认为人们的看法大致是错误的,认为有什么在阻碍我们——我们正在以资金允许的速度尽可能快地扩展。
I think that people are approximately wrong that there's something stopping us. We are scaling up approximately as fast as there is money to scale up.
我猜他们可能希望今天就能拥有所有的计算集群,但实际上扩张速度已经相当快了。
I suppose they could want there to be all of the clusters literally today, but they're scaling up really quite fast.
你看这些数据中心,比如我提到的Anthropic亚马逊那个,其耗电量几乎与印第安纳州首府相当,而它正位于该州。
You're seeing these data centers, like, I think the one I mentioned for Anthropic and Amazon is using nearly as much power as the state capital of Indiana, which is where it's located.
其中一些项目的时间表,比如Colossus二代,大约两年或更短,这简直疯狂——建造一个耗电量堪比整座城市的东西。
And the timelines on some of these, like Colossus 2, are, you know, two years or less, which is just an insane pace for building this thing that's using as much power as a city.
我认为,从某种程度上说,你现在并不想购买芯片。
I think that, plausibly, you know, you don't wanna buy chips now.
你想等待更好的芯片出现。
You wanna wait for there to be better chips.
我觉得人们对于扩展过程中遇到的困难有很多噪音。
I think there's a lot of noise about things being difficult in scaling up.
我认为这是因为人们不得不比平时多花一些钱。
And I think this is because people are having to spend a little bit more than they would ordinarily have to spend.
你不能使用普通的电力输送管道,那是为缓慢提供廉价基础设施设计的。
You can't use the ordinary sort of power pipeline, which is designed to deliver this affordable infrastructure at a slow pace.
你必须购买一些平时不需要的东西,花费比平时更多的资金,但多出的部分还不足以拖慢进度。
You have to, you know, buy things that you wouldn't ordinarily have to buy and spend more than you would ordinarily have to spend, but not by enough to slow it down.
所有这些成本与你的GPU相比都相形见绌。
All of these things pale in comparison to the cost of your GPUs.
所以我从这些情况中得出的实际结论是:我们在扩展规模方面并没有遇到太多困难。
So my actual takeaway from a lot of this has been, oh, we're not having too much trouble scaling up.
只是这些计划推进得非常快,而且人们是否真的有财力和意愿更快地实施这些计划还不太明显。
It just like, these plans are going really quite fast, and it's not obvious that people would actually have the finances and desire to do them faster.
当人们谈论能源可能成为主要瓶颈,或者我们需要显著提升能力时,你并不担心这会成为一个长期持续的瓶颈吗?
When people are talking about energy as a major potential bottleneck, or us having to, you know, increase our capabilities significantly, you're not worried that that's gonna be a sort of durable, sustained bottleneck?
这种观点是不正确的。
That's not right.
我认为人们喜欢抱怨,因为他们不能简单地使用传统的电网接入方式来获得四年后廉价电力。
I think people like complaining because they can't just use the traditional "plug into the grid for cheap, affordable power four years down the line" pipeline.
说到底,现在确实存在一些昂贵的技术。
At the end of the day, there are expensive technologies that exist right now.
你可以选择太阳能加电池储能。
You could pay for solar power plus batteries.
这个方案的交付周期相对较短。
This has fairly short lead times.
虽然成本可能是普通电力的两倍,但仍远低于你的GPU开销。
It might cost twice as much as normal power, but that's still way less than your GPUs.
所以必要时你还是会这么做。
So you're gonna do it if you have to.
你会看到人们采取这类成本略高的应急措施,比如启动他们的数据中心。
And you see people doing these sort of emergency things that cost them a bit more, you know, starting up their data centers.
我们常见的情况是,人们在数据中心尚未接入电网时就先行启动。
A common thing we see is people starting their data centers before their data centers are connected to the grid.
我想阿比林就是个例子。
I think Abilene was an example.
xAI的Colossus 1就是个典型案例,展示了人们如何不惜成本解决这个问题。
xAI's Colossus 1 is a prominent example of just finding ways around this that are expensive.
你会抱怨是因为,能用更便宜的方式当然更好,没人习惯付出这么高的成本。
And you complain about it because, you know, it'd be nice if you could do it the cheaper way, and no one's used to having to do it this expensive way.
但归根结底,只要像AI行业这样愿意支付足够费用,解决方案似乎并不匮乏,我不认为这会成为重大瓶颈。
At the end of the day, though, there just seem to be enough solutions, especially if you are as willing to pay as people are in AI, that I don't really expect it to be a significant bottleneck.
也许我们就以此作为结束吧。
Maybe let's close with this.
随着这些系统变得像我们讨论的那样强大,我很好奇政治体系会如何应对。
As these systems get as powerful as we're discussing, I'm curious how the political system is going to respond.
我想知道你是否认同阿申布伦纳的观点,即可能会出现某种国有化的情况。
I'm curious if you're sympathetic to the Aschenbrenner view that there's some potential nationalization that occurs.
但你预计政府会如何回应呢?
But how do you expect governments to respond?
考虑到它已经如此强大,却几乎不在政治讨论中出现,这相当引人注目。
It's kind of remarkable how absent it is from the political discourse, given how powerful it is already.
我很好奇你对这个问题是怎么看的。
I'm curious how you think about that.
我预期的是——就像我前面提到的,这种可能在六个月内导致失业率上升5%的概念。
So, calling back to what I mentioned earlier, there's this concept of, you know, the potential for a 5% unemployment increase in, like, six months.
我认为公众对此的反应将决定很多事。
I think that the public's reaction to this will determine a lot.
一旦这种情况发生,人们对AI将产生非常非常强烈的情绪。
There will be very, very strong feelings about AI once this happens.
我认为届时将会形成关于应对措施的强烈共识。
I think there will be, you know, a very strong consensus on what to do,
在一些我们通常不认为人们会考虑的事情上。
on things that we don't normally think of as things that people are considering.
我知道新冠疫情发生时,短短几周甚至几天内就通过了数万亿美元的刺激方案。
I know when this happened with COVID, there was a several-trillion-dollar stimulus package passed in, like, you know, a matter of weeks to days.
速度快得惊人。
It was breakneck speed.
我不知道AI领域会是什么样子,但我觉得就像AI的其他方面一样,是指数级的——这意味着如果事态持续发展,人们的关注度将从漠不关心迅速转变为极度重视。
I don't know what that will look like for AI, but I think, like everything else in AI, it's, you know, exponential, which means it will go from the point where people sort of care about it to the point where people really care about it quite fast if things keep going.
我只是不知道最终会发展到什么程度。
I just don't know where we're gonna end up.
我只预期无论结果如何,最终都会看起来像是:哦,大家突然都一致认为我们必须
I just expect, you know, wherever we end up, it will look like, oh, everyone suddenly agrees that we have
去做某件一年前我们还认为不可想象的事情。
to do this certain thing, which we would have considered unimaginable a year ago.
而我也不知道那会是什么样子。
And I don't know what that will look like.
可能是国有化。
It might look like nationalization.
可能是暂停发展。
It might look like pausing.
也可能是加速推进,或者保障更好的失业福利。
It might look like, I don't know, going faster, guaranteeing better unemployment benefits.
谁知道呢?
Who knows?
我只是觉得会有某种强烈的应对措施,而且会来得非常快。
I just think there's gonna be some sort of, like, strong response, and it's gonna happen very fast.
是啊。
Yeah.
我是说,你知道,你提出的观点是政府现在可能没有你预期的那么感兴趣。
I mean, you know, you make the point that governments are maybe less interested than you'd expect now.
但我的意思是,目前的影响其实并不那么大。
But I mean, the current impacts, I think, aren't really that large.
我觉得关注度在增加,但就目前而言,AI还没有那么强大。
I feel like the attention is getting larger, but it's not that AI as of right now is that powerful.
然而政府已经在大量讨论这个话题了,对吧?
And yet governments are already talking about it a lot, right?
而且你看到各家硬件制造商和AI公司的人在会见国家元首,还有各国在讨论他们的AI战略之类的事情。
And you have people from various hardware manufacturers and AI companies meeting with heads of state, and, like, countries talking about their AI strategy, stuff like this.
所以我明显感觉到国家政府将会深度参与其中。
So I feel like, clearly, national governments are going to be quite involved.
问题只是以何种方式参与。
It's just a question of how.
而且,是的,我对此也有点不太清楚。
And, yeah, I also am a bit unclear on that.
我认为目前我们在收入和财务方面已经看到这种现象,每年都在翻倍或三倍增长。
I think that right now we've seen this thing in revenue and finances where it's been doubling or tripling every year.
我的基本假设是,政策制定者和政府对AI的关注度也将遵循类似的趋势,每年翻倍或三倍增长。
And my default assumption is that the attention AI gets from policymakers and governments is going to follow a similar trend, where it will double or triple every year.
这意味着如果趋势持续下去,未来可能会引起极大关注,同时也说明现在的关注度已比去年高出许多。
This means that in the future, if trends continue, there will be a huge amount of attention, and it means that right now, there's a lot more attention than last year.
但关注度不会突然从极少跃升至全面爆发,尽管我们确实正在以相当快的速度推进。
But you don't suddenly skip from very little attention to all of the attention, although we are moving, I think, quite fast.
我想我们已经做了足够多的预测,明年年底还得请你回来复盘,看看进展如何,然后再为明年制定计划。
I think we made enough predictions that we'll have to have you back at the end of next year to check in and see where we're at, and then make new ones for the next year.
是啊。
Yeah.
大卫,非常感谢你能来参加播客节目。
David, thank you so much for coming on the podcast.
谢谢你,埃里克。
Thank you, Eric.
谢谢你。
Thank you.
非常感谢邀请我们。
Thanks so much for having us.
感谢收听本期a16z播客。
Thanks for listening to this episode of the a16z podcast.
如果你喜欢本期节目,请点赞、评论、订阅、给我们评分或留言,并与亲友分享。
If you like this episode, be sure to like, comment, subscribe, leave us a rating or a review, and share it with your friends and family.
更多节目请前往YouTube、Apple Podcasts和Spotify。
For more episodes, go to YouTube, Apple Podcasts, and Spotify.
在X平台关注我们@a16z,并订阅我们的Substack:a16z.substack.com。
Follow us on X at @a16z, and subscribe to our Substack at a16z.substack.com.
再次感谢收听,我们下期节目再见。
Thanks again for listening, and I'll see you in the next episode.
温馨提示:本节目内容仅供信息参考,不应视为法律、商业、税务或投资建议,也不应用于评估任何投资或证券,且并非针对任何a16z基金的现有或潜在投资者。
As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund.
请注意,a16z及其关联机构可能持有本播客讨论公司的投资。
Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast.
更多详情,包括我们的投资链接,请访问a16z.com/disclosures。
For more details, including a link to our investments, please see a16z.com/disclosures.