本集简介
双语字幕
仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。
你们买GPU的方式就像买可卡因。先打几个电话,发几条短信,然后问:嘿,你有多少货?
How you buy GPUs is like buying cocaine. You call up a couple people, you text a couple people, and you ask, yo, how much you got?
价格是多少?
What's the price?
如果你的两大死敌突然联手,这绝对是最糟糕的消息。我完全没预料到,这真是个惊人的发展。
If your two arch nemeses suddenly team up, right, it's the worst possible news you can have. I did not see this coming. I think it's an amazing development.
就像沃伦·巴菲特入股某支股票。黄仁勋就像是半导体界的巴菲特效应。
Like Warren Buffett coming into a stock. Jensen is, like, the Buffett effect for the semiconductor world.
颇具讽刺意味的是,一切兜兜转转,英特尔现在反倒要仰仗英伟达了。
It's kinda poetic that everything's gone full circle and Intel's sort of crawling to NVIDIA.
今天我们要讨论半导体行业多年来最惊人的消息:英伟达向英特尔注资50亿美元,这两个长期竞争对手将在定制数据中心和PC产品上合作,这桩交易出乎所有人意料。对英伟达而言是巴菲特效应,对英特尔是救命稻草,而对AMD、ARM及全球芯片竞赛格局可能产生巨大冲击。
Today, we're talking about one of the biggest surprises in semiconductors in years. NVIDIA just put $5 billion into Intel, two long-term rivals now teaming up on custom data center and PC products, a deal nobody saw coming. For NVIDIA, it's the Buffett effect. For Intel, it's a lifeline. And for AMD, ARM, and the global chip race, the fallout could be massive.
为深入解析,我们邀请到Semi Analysis首席分析师迪伦·帕特尔、a16z普通合伙人Sarah Wang,以及a16z合伙人、英特尔前数据中心与AI事业部CTO圭多·阿彭策勒。迪伦,欢迎再次来到播客。
To break it all down, I'm joined by Dylan Patel, chief analyst at SemiAnalysis; Sarah Wang, general partner at a16z; and Guido Appenzeller, partner at a16z and former CTO of Intel's data center and AI business unit. Let's get into it. Dylan, welcome back to the podcast.
谢谢邀请。是的。它
Thanks for having me. Yeah. It
正巧在我们邀请您的时候,NVIDIA宣布向英特尔投资50亿美元,双方将合作开发定制数据中心和PC产品。您对这次合作有何看法?
just so happens that there's some big news just as we're having you on: NVIDIA announcing a $5 billion investment in Intel, and the two teaming up to jointly develop custom data center and PC products. What do you think about the collaboration?
我觉得这简直太滑稽了,NVIDIA居然能投资。消息一公布,他们的投资就已经涨了30%。50亿美元的投资,转眼就赚了20亿。对吧?我觉得有趣的是他们需要客户真正大规模买单。
I think it's hilarious that NVIDIA could invest, it gets announced, and their investment's already up 30%. A $5 billion investment, roughly $2 billion of profit already. Right? I think it's fun because they need their customers to really have big buy-in.
所以当他们的潜在定制客户买单并承诺采购某些类型产品时,这就非常合理。而且某种程度上很讽刺,因为过去英特尔曾因芯片组反竞争行为被起诉,而NVIDIA当年还从英特尔那里拿到了和解金——那时候显卡还没集成到CPU里,而是放在包含USB等各种IO的芯片组上。所以现在轮到英特尔要制造小芯片,和NVIDIA的小芯片打包成PC产品,这剧情反转实在有点幽默。
So when their potential custom customers buy in and commit to certain types of products, it makes a lot of sense. Right? And it's kind of funny in a way, because in the past there was this whole thing around how Intel was sued for being anticompetitive with their chipsets, and NVIDIA actually got a settlement from Intel, way back when graphics weren't yet integrated into the CPU and were instead put on the chipset, which had all this other IO, like USB. So it's kind of a funny turn of events that now Intel is going to make a chiplet and package it alongside a chiplet from NVIDIA, and that's a PC product. Right?
你看,这种轮回挺有诗意的,英特尔现在算是向NVIDIA低头了。但这可能真会造就最佳设备——我不想要ARM笔记本因为它功能有限,而搭载NVIDIA显卡的x86笔记本若能完全集成,可能会成为市面最棒的产品。所以...
So, you know, it's kind of poetic that everything's gone full circle and Intel's sort of crawling to NVIDIA, but it might actually just be the best device. Right? I don't want an ARM laptop because it can't do a lot of things. An x86 laptop with NVIDIA graphics, fully integrated, would probably be the best product in the market. So are
您持乐观态度吗?您认为合作前景如何?
you optimistic? How do you think this will go?
当然希望成功。毕竟我对英特尔必须保持永恒乐观。
I mean, sure. I hope so. Right? I'm a perpetual optimist on Intel because I have to be.
我原本以为这笔交易的结构至少是政府人员和英特尔试图推动的那种模式,即让大客户和主要供应商直接向英特尔注资。但现在情况恰恰相反,他们购买部分股份获得一定所有权,但并未真正稀释其他股东的权益。等到英特尔最终从资本市场筹集资金时,其他股东才会被稀释/所有人都会被稀释。不过由于这些已宣布的交易规模都很小,对吧?
I was thinking that the structure of the deal that at least a lot of the government folks and Intel were trying to go for was having big customers and the biggest suppliers directly give capital to Intel. But this is sort of the other way around: they're buying some of the stock and getting some ownership, but they're not really diluting the other shareholders. The other shareholders will get diluted, slash everyone will get diluted, when Intel finally does raise capital from the capital markets. But they've announced these deals and they're pretty small. Right?
英伟达50亿,软银20亿,美国政府100亿。你看,这些金额仍然相对较小,相当小。
$5 billion from NVIDIA, $2 billion from SoftBank, and the US government was $10 billion. You know, these are still relatively small. Pretty small.
是啊。
Yeah.
没错。从本质上讲是这样,对吧?我上次说过英特尔需要500亿资金,对吧?
Yeah, in the nature of things. Right? I mean, last time I think I said Intel needs $50 billion. Right?
现在当他们进入资本市场时情况会好转。而且希望他们还能再公布几笔类似交易。现在有各种猜测说特朗普在推动这些公司投资——先是英伟达,现在政府也加入,接下来苹果会不会也来投资英特尔?或者其他企业跟进?这才能真正提振投资者信心,之后他们才能进行股权稀释/债务融资。
Now when they go to the capital markets, it's better. And hopefully they get another couple of these announcements. There's all sorts of speculation that Trump is involved in getting these companies to invest: NVIDIA, and now the government as well, of course. Is Apple going to come invest and also do something with Intel, or who else will come in? That'll really boost investor confidence, and then they can dilute, slash go get debt.
就像沃伦·巴菲特入股某支股票带来的效应。黄仁勋现在就像是半导体界的巴菲特。Guido,你曾任英特尔数据中心与AI事业部的首席技术官对吧?
Like Warren Buffett coming into a stock. Jensen is like the Buffett effect for the semiconductor world. Guido, you were the CTO of Intel's Data Center and AI BU. Yep.
对此你有什么看法?
What are your thoughts?
我认为短期内这对客户和消费者确实非常有利。对吧?尤其是在笔记本电脑市场,英特尔与英伟达的合作简直太棒了。
I think it's really good for customers and consumers in the short term. Right? Having Intel and NVIDIA collaborate, especially for the laptop market, is amazing.
我在想英特尔的集成显卡或AI产品线会怎样发展。他们可能会暂时按下重启键放弃这部分业务。毕竟目前他们缺乏有竞争力的产品。
I wonder what's gonna happen with any of the internal graphics or AI products at Intel. Right? They might just push a reset and give up on those for now. Right? They currently don't have anything competitive.
对吧?高迪项目基本已经结束了。他们的集成显卡芯片也从未真正在高端市场形成竞争力。
Right? There was the Gaudi effort, which is more or less done. There were the internal graphics chips, which never really competed at the high end. Right?
所以从这个角度看,这次合作非常合理。这对双方都有利。听着,英特尔确实需要注入新鲜空气。
So from that perspective, it makes a lot of sense. Right? It's good for both sides. Look, I think for Intel, they needed a breath of fresh air.
他们当时已经走投无路了。所以我认为这是件大好事。AMD这下完蛋了——两个死对头突然联手,没有比这更糟的消息了,对吧?
Right? They were sort of desperate. So I think it's a very good thing. I think AMD is fucked. I mean, if your two arch nemeses suddenly team up, that's the worst possible news you can have, right?
AMD本来就在苦苦挣扎。他们的显卡不错但软件生态不行,市场反响很有限。现在他们面临更严峻的问题。ARM的处境也有点不妙。
They were already struggling, right? Their cards are good; their software stack is not. They were getting very limited traction. They now have a bigger problem on that side. I think ARM is a little bit screwed as well.
因为ARM最大的卖点本是"我们可以与所有不想和英特尔合作的企业结盟"。而英伟达某种意义上正是未来CPU领域最具威胁的竞争者,现在他们突然能获得英特尔技术,可能会朝那个方向发展。
Right? Because their biggest selling point was sort of, look, we can partner with everybody that doesn't want to partner with Intel. And NVIDIA, in a sense, is probably the most dangerous of the future CPU competitors. Right? And they now suddenly have access to Intel technologies and might go in that direction.
这重新洗牌了。对吧?我完全没预料到,我认为这是个惊人的发展。
It reshuffles the cards. Right? I did not see this coming. I think it's an amazing development.
是啊。看这事态发展会非常有趣。正如埃里克所说,这周新闻真是密集。迪伦,既然你在这儿,我们还想请教你另一个话题:华为公布了他们的AI路线图。显然,他们正在大肆宣传其能力。
Yeah. It will be very interesting to see this play out. To Eric's point, packed news week. The other thing that we wanted to pick your brain on, since we have you here, Dylan, is the other news dropping on Huawei unveiling their AI road map. And obviously, they're hyping up the capabilities.
我觉得你们团队在评估方面一直走在前列,比如950超级集群究竟能做什么?但很想听听你对中国战线所有动态的看法。对吧?而且这还与深度求索宣布下一代模型将采用国产芯片的消息相呼应。
I think you guys have been sort of ahead of the curve in trying to gauge, hey, what can the 950 supercluster actually do? But would love your thoughts on everything that's going on on the China front. Right? And this is coupled with DeepSeek saying their next models are gonna be on domestically produced Chinese chips.
中国政府禁止企业购买专为中国生产的英伟达芯片。所以现在中国半导体市场就像多米诺骨牌接连倒下,但很想听听你的整体观点,以及一些细节分析。
The Chinese government kind of banning companies from buying the NVIDIA chips produced specifically for China. So there are a lot of dominoes falling right now in the semi market in China, but we'd love your take overall and, I mean, to drill into some detail.
没错。我认为如果拉远视角看,比如从2020年开始梳理很重要——华为的实力有多强,或者说他们历来就很强。早期他们确实窃取了思科源代码和固件等,但很快就超越了思科和所有其他电信公司。2020年他们发布昇腾芯片并提交给公正的公开基准测试,是首个将7纳米AI芯片推向市场的企业。
Yeah. I think when you zoom out, let's walk from 2020, because I think it's really important to recognize how cracked Huawei is, or even just historically, they've always been really good. Sure, initially they stole Cisco source code and firmware and all this stuff, but then they rapidly passed them up, as well as every other telecom company. In 2020, they released an Ascend chip and submitted it to impartial public benchmarks, and they were the first to bring seven nanometer AI chips to market.
他们是第一个做到的。对吧?虽然当时英伟达仍领先,但差距微乎其微。而且那时他们还能完全接入国际供应链。
They were the first to have that. Right? Now you could still say NVIDIA was ahead, but the gap was, like, nothing. Right? And this is when they could access the full foreign supply chain.
那时他们刚超越苹果成为台积电最大客户。从制造供应链设计和整体实力来看,他们明显领先所有人。当然英伟达市场份额更高,但当时市场才刚起步。他们本有可能真正占领市场的。
This was when they had just passed Apple to be TSMC's largest customer. They were clearly ahead of everyone from a manufacturing, supply chain, and design standpoint on a total basis. Right? Now, of course, NVIDIA still had higher market share, but the market was so nascent then. They could have really taken over the market.
华为在特朗普政府时期被禁止获取技术,这项禁令于2020年全面生效,对吧?因此他们只能小批量生产这些芯片,但当时已用这些自产芯片训练了大量模型。而随后几年,英伟达则持续加速发展。
Huawei got banned by the first Trump administration from accessing it, and the full ban went into effect in 2020. Right? So they were only able to make a small volume of these chips, but they had trained significant models on the chips they made then. And then over the next couple of years, NVIDIA continued to accelerate.
由于被台积电断供,华为不得不转向中芯国际(国内版台积电)寻求制造方案。同时他们还试图通过空壳公司继续从台积电代工,并从韩国等地获取存储芯片。到2024年时,这套操作已形成完整产业链但最终败露——被查获时他们已通过这些渠道从台积电获得了290万至300万枚芯片。
Huawei, because they were banned from TSMC, had to go figure out how to manufacture at SMIC, the domestic TSMC. In parallel, they were also trying to go through shell companies to manufacture at TSMC and acquire memory from Korea, and so on. By 2024, this had gotten into full swing, and it was caught. Right? It was caught and finally shut down, but they were able to acquire 2.9 to 3 million chips from TSMC through these other entities.
对吧?这批价值约5亿美元的订单最终导致台积电被美国政府处以10亿美元罚款(如果我没记错的话)。路透社有篇报道提到——虽然不确定罚款是否真的执行——这个数字很重要,因为目前市面上Ascend芯片的数量还未完全消化掉这批产能。
Right? Roughly $500 million worth of orders, which ends up being the billion-dollar fine that the US government gave TSMC, if I recall correctly. At least there was a Reuters article; I don't know if they actually issued it. Which is important and interesting to gauge, because the number of Ascends floating out there has not consumed this entire capacity yet. Right?
时间来到2025年。H20芯片在年初遭禁,英伟达不得不进行巨额资产减记。仅H20芯片在中国市场的预期收入就超过200亿美元——这正是他们原先规划的产能规模/后来被迫减记的金额。
So now we get to 2025. Right? The H20 got banned at the beginning of the year. NVIDIA had to write off huge amounts of money. Our revenue estimate for NVIDIA in China for just the H20 was north of $20 billion, because that's what they were booking in capacity, slash had to write off.
没错。禁令出台后供应链直接被切断,就像突然宣布'我们不再供货'。他们的库存后来虽获重新批准...
Yeah. And then it got banned. They cut the supply chain; they just said, no, we're not doing this anymore. Then their inventory got reapproved.
库存得以转售,但英伟达现在面临的问题是:是否要重启生产?而中国方面则表示'我们不需要英伟达,有华为、寒武纪等本土替代方案'。
They resell the inventory, but now NVIDIA's question is, do we even restart production? And now you have China saying, hey, we don't need NVIDIA, we have domestic alternatives, right? Whether it be Huawei or Cambricon. Yeah.
这些企业确实具备产能,但大部分仍依赖境外生产——无论是台积电的晶圆,还是韩国三星、SK海力士的存储芯片。
These companies have capacity, but most of this capacity is still foreign produced. Right? Whether it be wafers from TSMC or memory from Korea, from Samsung and SK Hynix.
所以问题大致是,他们在国内能做到什么程度?这里有两方面需要考虑。对吧?一方面是逻辑芯片,即替代台积电,另一方面是存储芯片。
So the question is, how much can they do domestically? And there are two fronts there. Right? There's the logic, i.e., replacing TSMC, and there's the memory.
替代海力士、三星、美光。在逻辑芯片方面,他们虽然落后,但正在全力追赶。我认为他们能够达到所需的生产能力预估。而且美国目前仍允许他们进口几乎所有必要设备,禁令主要针对的是超越当前七纳米工艺的技术。
That is, replacing Hynix, Samsung, and Micron. On the logic side, they're behind, but they're really ramping there, and I think they can get to the production capacity estimates needed. And the US is still allowing them to import pretty much all the equipment necessary. The bans are really for beyond the current generation of technology, beyond seven nanometer.
这些禁令实际上针对的是五纳米及以下工艺。尽管政府声称针对十四纳米,但被禁设备仅适用于七纳米以下。因此他们将能大量生产七纳米AI芯片,甚至可能利用现有设备触及五纳米工艺,而非采用新技术。所以逻辑芯片是一方面,存储芯片是另一方面。而华为声明中令人惊讶的是他们正在研发定制存储芯片。
The bans are really for five nanometer and below. Even though the government says they're for 14 nanometer, the actual equipment that's banned is only for below seven nanometer. So they'll be able to make a lot of seven nanometer AI chips, and maybe even get to five by using existing equipment for five nanometer rather than taking on new techniques. So there's the logic side, and then there's the memory side. And the aspect of Huawei's announcement that was surprising was that they're doing custom memory.
嗯哼。对吧?是的。这部分确实让人感到振奋。他们宣布明年将推出两款不同类型的芯片。
Mhmm. Right? Yep. That's the part that's sort of, hey, this is really exciting. They announced two different types of chips for next year.
一款专注于推荐系统和预填充,另一款专注于解码。这是当前趋势。英伟达也是如此,他们最近刚发布了一款专门针对预填充的芯片。
One that's focused on recommendation systems and prefill, and one that's focused on decode. That's the trend these days. Yeah. NVIDIA is doing the same thing; they just announced a prefill-specific chip recently.
现在有许多AI硬件初创公司都在区分预填充与解码工作负载。华为明年的芯片也采用这种将推理分为两种工作负载的方式。有趣的是,解码芯片采用了定制HBM(高带宽内存)。这意味着什么?其制造供应链是怎样的?
There are numerous AI hardware startups that are really focusing on prefill versus decode. So this splitting of inference into two workloads, Huawei is doing the same thing for their next-year chips. And what's interesting is the decode one has custom HBM. What does that mean? What is the manufacturing supply chain?
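For listeners unfamiliar with the split the guests are describing: prefill processes the whole prompt in one compute-heavy pass, while decode generates one token at a time and is bound by memory bandwidth, since each step re-reads the growing KV cache. A minimal sketch of that disaggregation, with all names and data hypothetical:

```python
from dataclasses import dataclass, field

# Toy sketch of disaggregated inference: prefill is compute-bound
# (one big pass over the prompt), decode is memory-bandwidth-bound
# (one token at a time, re-reading the KV cache each step).

@dataclass
class Request:
    prompt_tokens: list
    kv_cache: list = field(default_factory=list)
    output_tokens: list = field(default_factory=list)

def prefill_worker(req: Request) -> Request:
    """Runs on the compute-heavy chip: materializes the KV cache
    for every prompt token in a single pass."""
    req.kv_cache = [f"kv({t})" for t in req.prompt_tokens]
    return req

def decode_worker(req: Request, max_new_tokens: int) -> Request:
    """Runs on the bandwidth-heavy chip: each step touches the whole
    KV cache (the bandwidth bottleneck) and appends one token."""
    for step in range(max_new_tokens):
        _ = len(req.kv_cache)          # stands in for reading the cache
        tok = f"tok{step}"
        req.output_tokens.append(tok)
        req.kv_cache.append(f"kv({tok})")
    return req

# A scheduler hands the request (and its KV cache) from one pool to the other.
req = decode_worker(prefill_worker(Request(["the", "quick", "fox"])), 4)
print(len(req.kv_cache))  # 3 prompt entries + 4 decoded entries
```

The handoff of the KV cache between the two pools is exactly why the decode-side chip wants the fattest memory it can get, which is where the custom HBM discussion below comes in.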
因为这正是棘手之处。对吧?他们能生产多少这种定制HBM?英伟达等公司也是从明年才开始采用定制HBM。对吧?
Because that's the part that's tricky. Right? How much of that custom HBM can they manufacture? And NVIDIA and others are also only starting to adopt custom HBM next year. Right?
所以这并不是说,你知道的,制造能力不存在。或许它会消耗更多电力,带宽可能稍低,但他们能做到英伟达和AMD计划在其内存中实现的某些功能,这证明他们正在迎头赶上。但关键问题仍是产能。就像现在,英伟达在中国被禁了。
So it's not like, you know, yes, the manufacturing capacity is not there. The maybe maybe it consume it is gonna consume a bit more power. It's gonna be slightly lower bandwidth, but the fact that they're able to do, you know, some of the same things that NVIDIA plans to do, AMD plans to do in their memory is is, you know, evidence that they're catching up. But then, you know, the the the main question that remains is production capacity. So as far as like, hey, NVIDIA's banned in China.
对吧?他们在说别买英伟达芯片。我觉得短期内对中国没问题。从某种角度看,嘿——
Right? They're saying, don't buy NVIDIA chips. I think for a period of time, that's fine for China. Right? From the perspective of, hey,
我是中国。这没问题,因为你们2024年运进来的产能还没转化成AI芯片。现在你们正把它们转为AI芯片,消耗所有库存。
I'm China. That's fine, because you have all this capacity that you shipped in in 2024 that hasn't been turned into AI chips. Now you're turning it into AI chips and running that whole stockpile down.
从消耗库存到增产新品的过渡期怎么办?对吧?这个过渡期才是真正棘手的。中国要么在这段时间不买英伟达芯片自损实力,要么就能成功增产。
What about the transition from running that stockpile down to ramping your new stuff? Right? That transition is the one that's really tricky. China's either shooting itself in the foot by not purchasing NVIDIA chips during that period, or China's able to ramp.
我认为他们能增产,只是需要更长时间,中间会出现空窗期,中国可能会退缩说没关系。就像字节跳动会哀求英伟达芯片,他们不想用寒武纪或华为的方案,他们真正想要的是更优秀的英伟达。
I think they'll be able to ramp. I think it'll take a little longer, and there will be a sort of gap in between where China probably backtracks and says it's fine. Like, ByteDance is begging for NVIDIA chips, right? They use some Cambricon, they use some Huawei, but they really want to use NVIDIA because it's way better.
他们不在乎国内供应链,只想打造最佳模型,最高效地部署AI。政府可以强制他们不用,但——
They don't care about the domestic supply chain. They want to make the best models. They want to deploy their AI as efficiently as possible. The government can mandate that they not do it. Right?
所以不是英伟达缺乏竞争力,而是政府在推动替代。最后还有个论点:如果禁售英伟达对中国如此有利,为何中国不早主动实施?现在他们终于自己动手了,这会很有趣。
So it's not that NVIDIA is not competitive; it's that the government is sort of trying to instigate it. And I guess the last thing is, there's always the argument of, hey, if banning NVIDIA chips to China is so good for China, why didn't China do it for itself? Well, they're finally doing it for themselves. So it'll be interesting to see.
走私仍在发生。对吧?从其他国家向中国再出口芯片的情况仍在以一定规模进行,规模不大,属于中低水平。对吧?但另一方面,目前合法对华直售的英伟达芯片不一定在发货,但可能在某个时间点必须重启,因为中国将不具备足够的生产能力,他们国内部署的AI芯片数量将远远少于美国。
Smuggling is still happening. Right? Re-exportation of chips from other countries to China is still happening at some volume, low to lower-medium volume. Right? But the direct shipments of NVIDIA chips that are legally allowed to China are not necessarily happening today, and may have to restart at some point, because China won't have the production capacity; they would just have so many fewer AI chips being deployed domestically versus the US.
在某个时刻,你不得不做出选择,比如,我是全力投入内部供应链,还是全力追逐超级强大的AI?
And at some point, you kind of have to pick: am I all about the internal supply chain, or am I all about chasing super powerful AI?
是的。那么这里是否存在一个谈判角度的考量?因为目前仍在讨论哪些是边界,哪些可以出口到中国。所以这些公告发布的时间点很巧妙,如果你想表达美国应该允许更多出口的观点。你认为这是一个因素吗?
Yeah. So is there an angle here, a negotiation angle as well? Because currently there are still discussions ongoing about what exactly the boundaries are, what can be exported to China. So these are well-timed announcements if you want to make the point that the US should allow more exports. Do you think that's a factor or not?
是的。所以,我在几周前关于华为生产能力和供应链的报告中提到了一点,我们写道,说实话,如果你是中国,你确实想要英伟达芯片。你该如何应对?对吧?是的。
Yeah. So in the report we did a few weeks ago about Huawei's production capacity and supply chain, there was a bit in there we wrote about how, honestly, if you were China, you actually do want NVIDIA chips. So how do you play this? Right? Yeah.
而且正是通过这种方式:夸大你的国内供应能力。是的。就像是在说:没错,我们什么都能做。
And it's by exactly this. It's by hyping up your domestic supply. Yeah. It's like, yes, we can do everything.
然后政府官员会考虑国内玩家的游说。当然,我们想向他们出口更好的AI芯片。我们正在失去这个市场。我们不能失去这个市场。这就像是,智商一万。
Then the government official is going to think, alongside lobbying from domestic players: of course we want to ship them better AI chips. We're losing this market. We can't lose this market. And it's, like, 10,000 IQ.
对吧?而我们还在玩跳棋,他们已经在下国际象棋了。
Right? And we're here playing checkers while they're playing chess.
那么,抛开谈判筹码不谈,在那份报告中你提到HBM(高带宽内存)是华为的瓶颈。关于公告中令人意外的部分,你认为他们声称这不再是瓶颈的说法可信吗?还是只是炒作?
Well, negotiating chip aside, in that report you talked about HBM, high-bandwidth memory, being a bottleneck for Huawei. To your point on one of the surprising aspects of the announcement, do you think it's credible that it's no longer a bottleneck, based on what they're saying, or is it just hype?
我认为从产能角度来看,这绝对仍是个瓶颈。制造HBM所需的某些设备仍需进口。他们正在研发国产解决方案,但据我们所知,他们尚未进口足够的设备。不过,如果查看中国各类设备的进口数据,你会发现不同工艺技术的晶圆厂在光刻、蚀刻、沉积和计量设备上的投入比例各不相同。
I think production-capacity-wise, it is still absolutely a bottleneck. Certain types of equipment required for making HBM need to be imported. They're working on domestic solutions, but as far as we know, they have not imported enough equipment for this. Although, if you look at Chinese import data for different types of equipment: fabs spend roughly different amounts, depending on the process technology, on lithography, etch, deposition, and metrology.
对吧?这些不同的生产环节。历史上光刻设备占比约18%,采用EUV技术后增长到25%。
Right? These different steps. And historically, lithography has hovered around 18%. With EUV, it grew to 25%. Right?
但中国因担忧即将到来的禁令,曾以远超常规比例进口光刻设备——其设备进口总额的40%都用于光刻设备囤积。如今情况逆转:通过分析中国各省的月度进出口数据及各国出口数据,可见蚀刻设备进口量正在飙升。
But China, because they wanted to stockpile lithography and were worried about the coming ban, was importing lithography at a much higher rate than that. Like, 40% of their equipment imports were lithography; they were stockpiling lithography equipment. This has sort of reversed now. If you look at the monthly import and export data, both into provinces in China and out of the exporting countries, you can see that etch specifically is skyrocketing.
而HBM堆叠工艺的关键在于,每片晶圆需要通过蚀刻形成硅通孔(TSV)实现上下层互联,然后进行12层或16层堆叠——这就是制造超高带宽内存的方式。目前中国蚀刻设备的进口量确实在激增。
And the main thing about stacking HBM is that on each wafer, you have to etch this thing called a through-silicon via (TSV) so it can connect from top to bottom, and then you stack them on top of each other. Right? Twelve-high, sixteen-high for HBM. That's how you make super-high-bandwidth memory. And their imports of etch equipment are skyrocketing now.
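To put the 12-high and 16-high stacking in perspective: stacking more dies multiplies capacity, while a stack's bandwidth comes from a very wide interface running at a moderate per-pin rate, which is why packaging and TSV etch, not raw pin speed, are the hard part. A back-of-the-envelope sketch using public ballpark HBM3-class figures (not numbers from the episode):

```python
# Back-of-the-envelope HBM numbers, to put "12-high, 16-high" stacking
# in perspective. Figures are public ballpark HBM3-class specs, not
# numbers from the episode.

def stack_capacity_gb(dies_high: int, gb_per_die: int) -> int:
    """Stacking more DRAM dies (connected by TSVs) multiplies capacity."""
    return dies_high * gb_per_die

def stack_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Per-stack bandwidth in GB/s: every bit of the wide interface moves
    pin_rate_gbps gigabits per second; divide by 8 for bytes."""
    return bus_width_bits * pin_rate_gbps / 8

cap_12_high = stack_capacity_gb(12, 2)       # 24 GB with 2 GB dies
cap_16_high = stack_capacity_gb(16, 2)       # 32 GB
per_stack = stack_bandwidth_gbs(1024, 6.4)   # ~819 GB/s per stack
total = 8 * per_stack                        # ~6.5 TB/s with 8 stacks
print(cap_16_high, round(per_stack), round(total))
```

The 1024-bit bus is why HBM must sit on an interposer right next to the compute die, and why each extra die in the stack needs those TSV etch steps the import data is picking up.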
所以...是的,他们目前确实还不具备产能。产能提升速度取决于能获取多少设备,这是第一点。
So, yeah, they don't have the production capacity yet. How fast they can ramp it is a function of how much equipment they can get. That's A.
第二点是良品率问题。提升制造良率极其困难——英特尔和三星做得不错,台积电更是惊人。不是说这些公司不行,换个说法更准确。
And B, the yields. Right? Improving yields in manufacturing is really hard. Intel and Samsung are really good, and TSMC is just amazing; it's not that those companies suck, I think that's a better way to put it.
所以,你看,就是这两件事。我认为产量,他们甚至还没开始高速生产HBM3,对吧?他们只对HBM2进行了一些样品测试。HBM3是几年前才推出的。
And so it's those two things. On yield, they haven't even started production of high-speed HBM3. Right? They've only done some sampling of HBM2, and HBM3 came out a few years ago.
因此,在攀登学习曲线上还有相当长的路要走。显然,我预计他们追赶的速度会比当初技术开发所需的时间快,因为技术已经存在。对吧?全世界都知道怎么做,关键在于实际执行而非发明。
So there's still quite a way to go up the learning curve. Obviously, I expect them to catch up faster than it took the technology to be developed, because it already exists. Right? The world knows how to do it; it's a matter of actually doing it versus inventing it.
另一个问题是产能。几个月的进出口数据不足以支撑多年的供应链建设,对吧?这就是如今韩国企业所拥有的。现在海力士也在美国伊利诺伊州投资,而美光主要在日本布局。
And then the other one is production capacity. A couple of months of import-export data is not enough to set up years' worth of supply chain buildup. Right? Which is what exists today in Korea for the Korean companies. Now Hynix is also investing in the US, in Illinois, and Micron is primarily in Japan.
美国内存企业主要分布在日本和台湾,但他们也在新加坡和美国扩张。已经投入了巨额资本,中国需要时间才能建立起与西方匹敌的产能。我说的西方,指的是非中国的东亚生产产能。所以这需要时间。
The American memory companies are primarily in Japan and Taiwan, but they're also expanding in Singapore and the US now. So much capital has been invested; it would take some time for China to build up production capacity to actually match the West. And when I say the West, I mean non-China East Asia in production capacity. So it'll take some time to get there.
我不认为问题在于'我们能设计这个吗',而始终是'我们能制造吗'。就像黄仁勋说的,你是在赌中国无法制造。对吧?这是时间问题,而非能否问题。
And I think it's like, hey, we can design this. It's always a question of, can we manufacture it? And then the thing Jensen would say is, you're betting on China not being able to manufacture. Right? It's a matter of when, not if.
这就是美国政府必须考虑的整体算计:'我们该出售什么级别的AI芯片?全部卖吗?可能不行,因为AI更强大且终端市场将远超半导体设备市场。我们该在什么级别设限?'
And that's the whole calculus that I think the US government has to be aware of: what level of AI chips do we sell? Do we sell everything? Probably not, because AI is far more powerful, and the end market for AI is going to be way larger than the end market for semiconductors and equipment. So what level do we sell at?
中国在每个特定性能层级能生产多少?分析其产量后,再决定什么是可接受的——也许略高于或接近同等水平。
Well, how much can China make at each specific performance tier? Analyze that and the volume, and then figure out what is okay, which is maybe a little bit above or around the same level.
是的。所以,就像你说的下国际象棋和跳棋的比喻,如果你是Jensen,面对当前局势,你会采取什么下一步行动?
Yep. So, to your point on playing chess versus checkers, if you're Jensen, what would your next move be, given the situation at hand?
某种程度上说,他更害怕华为而非AMD,这话部分正确。
It's partially true that he's afraid of Huawei more than he is of, like, an AMD.
没错。他称华为为劲敌。
Right. He called them formidable.
对。我是说,华为在很多方面已经超越苹果了,对吧?他们在台积电的订单量超过了苹果,在全球多个地区的手机市场份额也超过了苹果——当然不包括美国——这些都是在禁令出台前的事。
Yeah. I mean, Huawei has beaten Apple. Right? They passed Apple in TSMC orders. They passed Apple in phone market share, not in the US, but in many parts of the world, before the bans came down.
即便现在,他们又在没有西方供应链的情况下重新夺回市场份额。你看,他们在多个行业都做到了这一点。苹果确实是个强劲的竞争对手,但他们也击败过许多行业巨头。
And even now, they're growing back in market share without Western supply chains. They've done this to numerous other industries. And I would say Apple is a formidable competitor, right? They've beaten a lot of industries.
所以他害怕华为是合理的。我认为对他而言,最好的做法就是尽量把华为宣布的目标当作既成事实而非愿景来宣传,从而打消外界对其制造产能的一切质疑,但我觉得这种说法并不公平。
And so it's reasonable that he's afraid of them. So I think the best thing for him is to treat what Huawei announced as reality rather than their hoped-for target, and so wave away all doubt about manufacturing capacity, which I think is not fair.
对吧?我认为制造能力确实是他们的瓶颈。良率提升的学习过程也是真正的瓶颈——可能只是暂时的。我们要看这个瓶颈会持续多久,也要看英伟达的技术进步能比华为快多少,明白吗?
Right? I think manufacturing capacity is a real bottleneck for them. And the yield learning is a real bottleneck, though maybe a temporary one. We'll see how long it lasts, and we'll see how fast NVIDIA's technology advances past what Huawei is capable of. Right?
以及华为能以多快的速度缩小差距。但我认为,他的主要观点是华为是真实存在的。他们是一个强大的竞争对手,不仅将占领中国市场,还会进军国外市场,对吧?
And how fast Huawei is able to close the gap. But I think his main pitch would be: Huawei is real. They're a formidable competitor. They're going to take over not just the Chinese market, but also foreign markets.
对吧?无论是中东、东南亚、南亚、欧洲还是拉丁美洲。对吧?除了美国以外的所有地方。而且,我觉得诺亚·史密斯有个类比。
Right? Whether it be the Middle East, Southeast Asia, South Asia, Europe, or LatAm. Right? Everywhere besides America. And I think Noah Smith has this analogy.
对吧?整个理念就是应该让中国‘加拉帕戈斯化’。对吧?让他们发展出与全球截然不同的本土产业。对吧?
Right? This whole idea is that you should Galapagos China. Right? Make them have their own domestic industry that is so different from the rest of the world. Right?
有点像七八十年代和九十年代的日本。他们的个人电脑极度特化,针对日本市场进行了超优化,比如那些奇怪的——不知道你见过没——日本电脑上的奇怪滚轮。就像这样滑动就能滚动,对吧?触摸板还是个圆形,周围环绕着它。
Kind of like what happened with Japan in the seventies, eighties, and nineties. Their PCs were so specific and hyper-optimized to the Japanese market, with, like, the weird scroll wheel on these Japanese PCs, I don't know if you've seen it. You literally go like this and it scrolls. Right? And then the touchpad is a circle, and the wheel is around it.
这类设计实在太怪异了。但日本市场就吃这套,对吧?他的核心观点就是:让我们把他们‘加拉帕戈斯化’。
Things like that are so weird. Totally. And the rest of the world doesn't care, but the Japanese market likes it. Right? And his whole idea is, let's Galapagos them.
也就是把他们的技术限制在中国境内,形成沉没成本,永远无法向外扩张,而我们服务全球。但风险在于反向情况也可能发生——我们的技术过度优化于运行这种规模的语言模型和强化学习,硬件软件协同设计可能让我们陷入技术路径的死胡同。
That is, keep their technology within China, and then that's deadweight loss and they never expand outside, versus us serving the whole world. But the risk is that the opposite can also happen. Right? Our technology is hyper-optimized to running language models at this scale and RL, and hardware-software co-design can take you down a branch of the tree that is a dead end.
而中国因为被禁止接触这条技术路径,反而可能误打误撞找到最优解。对吧?我们困在一个局部最优,他们却找到了全局最优。
And then China, because they're not allowed to access this branch, might end up in the optimal spot. Right? We end up at a local maximum; they end up at the global maximum.
对吧?这种技术上的"加拉帕戈斯化"大概就是诺亚·史密斯的比喻。我非常喜欢这个说法。虽然不确定是否准确,但确实很有趣。
Right? That sort of technological Galapagos-ing is Noah Smith's analogy. I like it a lot. I don't know if it's accurate, but it's an interesting one.
是啊是啊,我超爱这个观点。不过或许我们该暂时从当前事件中抽身——尽管现在可聊的话题实在太多。上次你参加我们节目时,英伟达自然是被提及了。
Yeah. Yeah. I love that. Well, actually, maybe just taking a step back from current events, even though there's so much to talk about right now. Last time you appeared with us, NVIDIA came up, obviously.
当时你还讨论了英伟达未来可能发展的几条路径。
And you talked about a couple of the potential paths forward for NVIDIA.
不妨给我们讲讲看涨和看跌的观点吧。
Give us maybe the bull case, the bear case.
说得对。他们现在的数据包含了很多信息。但有趣的是,银行业的共识主要集中在超大规模云服务商身上——微软、CoreWeave、亚马逊、谷歌和甲骨文。对吧?
Fair enough. There's a lot embedded in their numbers now. But what's interesting is the banks' consensus across, like, the hyperscalers. So Microsoft, CoreWeave, Amazon, Google, and Oracle. Right?
还有Meta。所以总共是六家超大规模云服务商。至少我是这么定义这个范畴的。
Meta. Right? So it's the six hyperscalers. Right? What I would consider hyperscalers.
银行业预估这些企业明年的总支出将达到3600亿美元。而我的预测数字更接近4500亿到5000亿美元。这个估算是基于我们对数据中心的所有研究,包括追踪每个独立数据中心的
The consensus from the banks is $360 billion of spend next year across all of them. And my number is closer to, like, $450 to $500 billion. And that's based on all the research we do on data centers, tracking each individual data center and the
供应链,对吧?所以这只是英伟达的支出。
supply chains. Right? So so So this is just NVIDIA spend.
这是超大规模企业的资本支出,对吧?这部分支出会被分摊给不同公司,但绝大部分仍然流向英伟达,明白吗?
This is this is CapEx for the hyperscalers. Right? And that CapEx Got it. Gets split up across different companies, but the vast, vast majority still goes to NVIDIA. Right?
对。
Right.
英伟达现在的情况不是抢占市场份额,而是随市场同步增长/守住份额。所以问题在于:超大规模企业和其他用户的资本支出增速有多快?
And NVIDIA is in a position not where they take they can't take share. Right? It's they grow with the market slash defend share. Yeah. And so the question is like, how fast is the growth rate of of CapEx for hyperscalers and other users?
对吧?我把甲骨文和CoreVue也算作超大规模企业——尽管传统上不这么称呼——因为它们就是OpenAI的超大规模服务商。你看甲骨文那则公告就明白了。
Right? And the reason I included Oracle and CoreVue as hyperscalers, even though they're traditionally not called hyperscalers, is because they are OpenAI's hyperscaler. Right. Right. So, you know, when you look and you you look at the Oracle announcement.
说真的,首先,我不理解为什么人们不觉得甲骨文的公告更疯狂。他们做了上市公司史上最空前的事:发布了四年业绩指引,这让拉里成了世界首富,懂吗?
Right? Like, first of all, the Oracle announcement, I don't understand why people don't think this is crazier. They did the most unprecedented thing in the history of, like, stocks and and public and companies ever. They gave a four year guidance, and it made Larry the richest man in the world. You know?
总之关键问题是:营收增速能有多快?你觉得与甲骨文签了3000亿+美元协议的OpenAI,真能支付得起这笔钱吗?
Like, all these things. Yeah. They got anyways, you know, the the question is, like, how fast does revenue grow? Right? Do you think Oracle and open do you think OpenAI, which signed a 300,000,000,000 plus deal with with Oracle, will actually be able to pay $300,000,000,000.
对吧?无论是融资还是营收。我认为大多数情况下,它会在短短几年内达到每年超过800亿甚至900亿美元的规模。对吧?所以问题是,你相信市场会增长得那么快吗?
Right? Across raising capital and revenue. And I think most and and and it gets to a rate of, like, over 80,000,000,000 eight over $90,000,000,000 a year in just a handful of years. Right? So it's like, do you believe the market will grow that fast?
这非常有可能。是的。对于OpenAI来说,他们明年的营收会是多少?有人认为350亿,有人认为400亿。
It's it's very possible. Yes. And it's very possible for, like, you know, OpenAI, what is their revenue gonna be exiting next next year? Some people think 35,000,000,000. Some people think 40,000,000,000.
还有人认为是450亿。你知道,到明年年底的年化营收。今年他们已经达到了200亿。对吧?年化营收。
Some people think 45,000,000,000. You know, ARR by the end of the year next year. This year, they hit 20. Right? ARR.
明白吗?如果这种增长率持续下去,那么所有这些成本都将用于计算,加上他们持续筹集的资金。对吧?而且,他们在上一轮融资中向投资者透露的财务状况大概是,明年我们将烧掉150亿美元。实际可能更接近200亿。但你知道,他们目前还没有实现正向现金流。
You know? So so if that growth rate is maintained, then all of that cost goes to compute plus all the capital they continue to raise. Right? And again, there are financials that they sort of like gave to investors for their last round was like, hey, we're gonna bend we're gonna burn like $15,000,000,000 next year. It's probably more likely gonna be like 20, but like, you know, and you you stack this on, and they're not turning a cash flow.
他们要到2029年才能盈利。所以情况就是,他们每年将继续烧掉150亿到250亿美元的现金,再加上营收增长。这些是他们的计算支出。Anthropic也是如此,OpenAI也是如此。
They're not gonna be profitable until 2029. So you sort of have like, they're gonna continue to bet burn $15.20, $25,000,000,000 of cash each year plus revenue growth. That's their compute spend. And you do this for Anthropic. You this for OpenAi.
所有实验室都是这样。整个市场规模很可能明年会超过5000亿,不是3600亿,而是5000亿的总资本支出。对于超大规模云服务商来说,这块蛋糕还在继续增长。英伟达表示,实际上AI基础设施的年支出将达到数万亿美元,而他们将占据其中很大一部分。这就是他们的看涨理由。对吧?
Do this for all the labs. It's very possible that the pie does get to, you know you know, more than 505 you know, not 360,000,000,000 next year, 500,000,000,000 next year and for cat total CapEx, and the pie continues to grow for hyperscalers. NVIDIA says, actually, it's gonna be multiple trillions a year on AI infrastructure, and he's gonna capture a huge portion of it. That's his bull case. Right?
看涨的理由是AI确实具有变革性,世界将被数据中心覆盖。你大部分的交互都将与AI进行,无论是商业生产力中让智能体编写代码,还是和你的AI女友安妮聊天。对吧?这些都主要运行在英伟达的平台上。
That's the bull case is is AI is actually so transformative, and the world just gets covered in data centers. And and the majority of your interactions are with AI, whether it's like, you know, business productivity and telling an agent to do some code or you're just talking to your AI girlfriend, Annie. Right? Like, it doesn't matter. You know, all of this is running on NVIDIA for the most part.
啤酒行业的情况是,即使它真的大幅增长,对吧。所以,是的,你继续说吧。
The beer case is, you know, even if it does grow a lot So Yeah. You go ahead.
先别急着讨论乐观情况。我认为从根本上说,价值创造是确实存在的,对吧?我是说,用AI创造数万亿美元的价值,我完全能预见这一幕发生。所以假设这是真的。
Save the bull case for a second. I think fundamentally, the value creation, I think, personally, is there. Right? I mean, to create trillions of dollars of value with AI, I I can totally see this happen. So so assume it's true.
英伟达的市值峰值会达到多少?
Where will NVIDIA top out?
我想问,你有多相信技术突飞猛进的可能性?对吧?是的。所以,如果存在那种爆发式发展场景,强大的AI创造出更强大的AI,然后不断迭代,每级智能都能为经济带来更多价值。
I guess, how how much do you believe in takeoffs? Right? Yes. Yeah. So so, like, if there if there is, like, a takeoff scenario, right, where, like, powerful AI builds more powerful AI builds more powerful AI or, you know, that creates more and more, you know, each level of intelligence, like enables more for the economy.
对吧?就像你企业里能雇佣多少猴子,对比能雇佣多少人类员工?或者说多少条狗?明白我的类比吗?
Right? Like how many how many monkeys can you employ in your business versus how many like humans? Right? You know, sort of the same or how many dogs. Right?
就像,人类和狗创造的价值差异,AI也是类似道理。在这种情况下,价值创造可能达到数百万亿,甚至在那之后更夸张。但我是说,
Like, you know, the they're sort of like, what is the value creation of a human versus a dog? Sort of like the same with AI. So so, like, I mean, in in this case, the true the value creation could be hundreds of trillions if not, you know, the day after that. But I mean,
你真的需要这种假设吗?如果我们让每个白领员工借助AI实现双倍生产力,那已经是数百万亿级别的价值了,不是吗?
do you even do you need this? I mean, if we take every white collar worker and make them twice as productive with AI, that's in the hundreds of trillions, isn't it?
是啊。但是,比如,什么是所谓的双倍效率?我是说,如果你和实验室的人聊,他们说效率翻倍,这到底意味着什么?其实就是取代他们。对吧?而且实际效果比那还要好上十倍。
Yeah. But, like, what is the what is twice you know, like I mean, like, if you talk to people at the labs, right, like, twice as productive, what does that even mean? It's replace them. Right? It's and it's been 10 times better than that.
我是说,我不确定这种情况会多快发生,如果某种程度上——
Like, I I mean, like, I don't I don't know how soon that If sort of
如果白领工作离开了持续的大语言模型标记流就基本无效,正是这些标记让他们保持高效。对吧?到那时,你基本上可以对全球每个知识工作者征税,而长期来看这涵盖了世界上大多数劳动者。对,确实如此。
if it's sort of white collar work is essentially useless without a constant stream of LLM tokens, right, that make them that make them productive. Right? At that point, you basically can can tax every single knowledge worker in the world, right, which is most most workers in the world long term. Yeah. Yeah.
这同样...我也不确定。你觉得呢?给个具体数字吧。上限是——
That's also I don't know. I mean, what what's what's your guess? Give give us a number. What's the
MB的上限?上限?我是说,为什么我们不造个马特廖什卡大脑?不知道。可能在某个时刻,机器会判定人类无需存活,而我们需...我——
cap for for MB? Cap? Mean, like, why why aren't we making a Matrioshka brain? Like, I don't know. Like like I mean, at some point, the the machine says humans don't need to live, and we need we I
需要更多算力。在那之前还有一步,雷。我们是否已经——
need even more compute. One step before that, Ray. Are we are
已经开始殖民火星了?待定。唉,老兄,我真的不知道。面对如此剧变的局势,我觉得预测五年后的事情完全不可能。
we are we colonizing Mars yet? TBD. Yeah. I I don't know, man. It's it's it's I I find it, like, completely, like, impossible to predict anything beyond five years given how much stuff is changing.
比如五年。五是个大数字。我会留给经济学家去研究。对吧?说实话,供应链这类事情大概也就三四年就能见分晓。
Like Five years. Five is a large number. I'll leave it to economists. Right? Like, you know, like, honestly, like, you know, supply chain stuff is like three, four years out and that's it.
是啊。然后第五年就有点像黄色预警了。对吧?所以我尽量用供应链的事情让自己保持清醒。
Yeah. And then fifth year is like sort of like yellow. Right? So like, I I just try and ground myself with the supply chain stuff. Right?
比如,你知道的,供应链之后就是AI的采用率?价值创造在哪里?使用情况如何?这些短期内就能看出来。再往后,我就不清楚了。
Like, it's like, you know, supply chain and then like, what is the adoption of AI? What's the value creation? What's the usage? Like and you can see that in like a short horizon. Beyond that, like, I don't know.
比如,我们是不是都会接入电脑,像脑机接口之类的?兄弟,我真不知道。人形机器人会普及吗?你看到埃隆的演示了吧?
Like, are we all gonna be connected to computers, like BCIs and stuff? Like, I don't know, dude. Are are humanoid robots, are they gonna be you know? I mean, you saw Elon's thing. Right?
他那个样子,好像在说人形机器人就是特斯拉市值突破万亿的原因。行吧,随便。但它们的训练数据从哪来?
Like, he's like, yeah. Humanoid robots are why Tesla's worth more than 10,000,000,000,000. So go ahead. Great. What is all that being trained on?
好极了。英伟达。行。厉害。所以它也该值个万亿市值。
Great. NVIDIA. Okay. Awesome. So that that's worth also 10,000,000,000,000.
对吧?我真的搞不懂。这些话题太玄乎了,我不喜欢这种不着边际的讨论。
Right? Like, I don't I don't know. Like, it's too it's too out there for me. I don't like the out there discussions.
非常公平。
Very fair.
读一些科幻书籍。
Read some sci fi books.
所以刚才提到你谈到的那条线索——虽然这有点像随口一提——关于市场份额无法真正增长,因为它已经占据了如此主导的地位。上次我们讨论过,或者说你们讨论过媒体的护城河。显然,这个护城河与维持他们目前拥有的极高市场份额息息相关。我很喜欢你刚才带我们回顾华为的历史旅程。你能详细讲讲英伟达在历史上是如何构建他们的护城河的吗?
So just pulling out the thread where you talked about I mean, this is kinda a throwaway comment, but how market share can't really grow just because it's it's such a dominant market share. And we talked about or you guys talked about the moat of a media last time. And, obviously, this moat is tied to maintaining that very high market share that they current currently have. And I love this sort of historic journey you took us through with Huawei just earlier. Can you kind of walk through what NVIDIA did throughout history to build their moat?
这超级棒,因为你知道,他们一开始失败了很多次,还多次押上整个公司。对吧?就像,黄仁勋疯狂到敢把整个公司赌上去。比如,某些芯片在还没确认能否成功时,他就下了订单,那几乎是他们剩下的所有钱;或者为尚未拿下的项目提前订购大量芯片。
It's super awesome because, you know, they failed multiple times in the beginning and they bet the whole company multiple times. Right? Like, Jensen is just crazy enough to bet the whole company. Right? Like, whether it was, like, certain chips ordering volume before he knew it even worked, and it was, like, all the money he had left or, like, ordering volumes for projects he had not won yet.
我听到一个传闻——或者说不是传闻,而是一位业内资深人士讲的故事,我觉得他应该知道内情——他说,没错,英伟达在微软还没下订单前,就为Xbox预订了产量。
Like, I heard a rumor that or not a rumor, but like a story from someone who's like a gray beard in the industry, and I think would know was like, yeah. No. No. No. Like, NVIDIA ordered the volume for the Xbox before Microsoft gave them the order.
他们就是……他当时的态度就是:管他呢,豁出去了。对吧?我也不确定。
They're just like they're he was just like, fuck it. Yolo. Yeah. Right? I don't I don't know.
虽然我不清楚这事有多真实,肯定还有更多细节,比如口头承诺之类的。但据说订单确实是在正式签约前就下了。类似的情况也出现在加密货币泡沫时期。
Like, I don't know how real true this I'm sure there's more nuance there, like, you know, verbal indication or whatever. But, like, the order was placed before he got the order. Right? Like, is is what he said. You know, there's there's cases like with the crypto bubbles.
对吧?比如,确实有几个因素,但英伟达竭尽全力让供应链上的每个人都相信这不是加密货币驱动的,而是游戏需求,是持久、真实的需求,来自游戏、数据中心和专业可视化领域。因此,你们应该扩大生产,于是大家都增加了产能,投入大量资本支出提升产量、为他们新建生产线。他们按件付费,买下产品后转手卖出赚得盆满钵满。而当泡沫破裂时,他们只需计提一个季度的库存减值。
Right? Like, there's a couple of them, but, like, NVIDIA did their damn best to convince everyone in the supply chain that it wasn't crypto and that it was gaming, that it was durable, real demand, and it was date gaming and data center and and professional visualization. And, therefore, you guys should ramp your production, and they all ramp production, spent all this CapEx on increasing production and and building out new lines for them. And they pay they pay per item, and then they bought them and sold them at and made shitloads of money. And then and then when it all fell apart, they just had to write down a quarter's worth of inventory.
无所谓了。是啊。其他厂商都傻眼了:靠,现在生产线全空着呢。对吧?
Whatever. Yeah. Everyone else was like, well, crap. I have all these empty production lines. Right?
所以这就像...完全就是。懂吧?但问题是AMD当时怎么做的?他们的芯片其实更适合挖矿。
And so it's like Totally. You know? But but, like, what did AMD do then? Right? Their chips, they were actually better for crypto mining.
对吧?从硅成本与算力比来看确实如此,但AMD就是...呃...我们就不大幅增产了。明白?就像个理智的决定。是吧?
Right? On a on a, you know, amount of silicon cost versus how much you hash, but, like, they just didn't AMD was like, ah, we're gonna not really raise production. Right? Like, as a reasonable, you know, thing. Right?
这不是那种'趁热打铁'的操作。所以你看,英伟达现在也遇到同样情况。最近他们下了没人相信的产能订单。
It wasn't a it's a sort of like strike while the iron's hot. And so, like, you know, the same has happened with NVIDIA. Right? They've in recent in recent times, like, of they've ordered capacity that no one believes. Right?
不止一次。他们显然看到了需求。但很多时候,比如给微软的预估量比微软内部规划还高。后来微软确实调高了规划,但英伟达的预估量还是远超微软需求。
Multiple times. They they see the in demand, obviously. But in many cases, they're just like their number for, like, Microsoft was higher than Microsoft's internal planning. Right? And and then Microsoft's internal planning went up, but, like, their number for Microsoft was way higher.
这就很离谱:'虽然微软说需要这么多,但我们觉得他们用不了'——简直了,哪有供应商对客户说'不,你必须买更多'的?
And it's like, oh, we just don't think Microsoft's gonna need this much even though they tell us this. It's like, who the heck is like, no. No. No. Customer, you're gonna buy more.
比如订单之类的,对吧?然后当订单通过供应链进来时,就像我必须支付NCNR(不可取消、不可退货)款项。你知道,就是这样。我在台湾曾经问过一个问题。
Like and and orders. Right? And then and then when the orders come through the supply chain, it's like, I have to put pay NCNR, right, non cancelable, non returnable. Like, you know, this is Right. You know, this is I asked a question in Taiwan once.
当时在场的有Colette(首席财务官)和Jensen(首席执行官)。他们俩都在场,房间里坐满了金融界人士,他们在财报发布前三天问些愚蠢的财务问题。显然他们什么都不能回答,因为要遵守SEC规定。但我的问题是:Jensen,你是个直觉敏锐、富有远见的人。
There was like a it was it was Colette, which is the CFO and Jensen, CEO. They were they were both there. And it was it was a room full of, like, mostly finance bros, and they're asking stupid finance questions, like, three days before earnings. So, obviously, they just could not answer anything because it's, you know, SEC regulations. But then my question to them was, like, look, Jensen, you're, like, so vibes, like, driven and, like, very gut feel and, like, very visionary.
而Colette作为CFO,她本身非常出色,但你们性格迥异。你们如何合作?他回答说:我讨厌电子表格,我从不看那些东西。
And then Colette's, you know, CFO. Like, she's she's amazing in her own right, but, like, you know, that that those those personalities clash. How do you work together? And he's like, I hate spreadsheets. I don't look at them.
我就是知道答案。这种回应很典型,世界上最优秀的创新者确实拥有出色的直觉判断力。
I just know. Right? And like, this response? And it's like, of course, you know, the the best innovators in the world have really good gut instinct. Right.
对吧?所以完全靠直觉去下不可撤销的订单,尽管存在未知风险。他们在历史上多次被迫进行资产减值。
Right? So like Totally. The gut instinct to like order with, you know, with non cancelable, what you don't know. And they've had to write down over their history multiple times. Right?
累计订单价值高达数百亿美元。无论是受监管较多的A20项目,还是其他不得不取消的订单案例,都是数千亿规模的损失?
Many, many billions of dollars in accumulative orders. Right? So accumulate in total orders. Whether it be, you know, the age 20, which is more regulatory, but like other cases they've ordered and had to cancel. Is that many billions?
确实是数千亿。小意思?这要看情况。当年加密货币资产减值就达数十亿,而那时公司市值还不到1000亿美元。
It's many billions. Peanuts. Well, it depends. Right? The crypto write down was, like, multiple billion when their stock was, like, less than a 100,000,000,000.
对吧?就像,你知道的,这简直是九牛一毛。相比潜在的收益来说。对吧?我觉得这太疯狂了。
Right? Like, it's like a you know, it's it's peanuts. Compared to the upside. Right? I think I think It's crazy.
我认为你做的每件事都是对的。是的。而且我觉得AMD做的每件事都是错的,就像,你知道的,在那个情境下。但是,这确实很疯狂,尤其是在半导体这种周期性行业里,公司破产是常有的事。没错。这就是为什么每次低谷期都会出现行业整合——总有公司撑不下去。
I think everything you did was right. Yeah. And I think everything AMD did was wrong, like, you know, in that in that scenario. But, like, it it it is it is crazy to especially in a cyclical industry like semiconductors where companies go bankrupt Yeah. All the time, which is why we have all this consolidation is every down cycle, companies go bankrupt.
我的意思是,如果从风险回报的角度来看,这些赌注绝对值得下。是的。但如果站在CEO的角度,想要为华尔街提供可预测的季度业绩,那就是另一回事了。
I mean, if you look at it from a risk return perspective, right, these bets were totally worth taking. Yes. If you look at it from, I'm a CEO. I want to have, you know, predictable quarters for Wall Street. It's a very different story,
我觉得这某种程度上
and I think that's sort
就是当前分歧的根源所在。
of where part of detention is from now.
没错。所以我们——不知道你有没有看过那些李光耀的混剪视频?就是他在发表激情演讲,配上酷炫的背景音乐,最后展示他不同时期的照片。我们最近给黄仁勋也做了个类似的,发在社交媒体上,比如Instagram、抖音、小红书、微博。当然还有推特。
Yeah. So so we we, I don't know if you've seen these, like, Lee Kuan Yew edits where they're like him, like, saying some, like, fiery speech, and then, like, and then it's, like, some cool music at the end, and it's, like, showing different pictures of him. And so we made one of Jensen recently and put it on social media right on, like, Instagram, TikTok, XHS, Redbook. Right? Twitter, of course.
对吧?就是覆盖所有主流平台。我特别喜欢这个视频,因为他说——你知道的——参赛的目标就是为了赢。或者说...抱歉,你赢是为了能继续参赛。
Right? Like, all the different social media. And and I really liked it because he's, like, he's, like, you know, the goal of, like, playing is to win. And and the goal or sorry. And the reason you win is so you can play again.
对吧?你还把它比作弹球游戏,实际上就是整天玩个不停,不断获得更多回合。他的整个理念就是,我想赢才能玩下一局。而且,只关乎下一代。对吧?
Right? And you compared it to pinball where, like, actually, you just play all day and you keep getting more rounds. And it's like, his whole thing is like, I wanna win so I can play the next game. And like, it's only about the next generation. Right?
只关乎现在,下一代。不是十五年后的未来,因为那时整个游戏规则都变了,或者五年后。我想你是对的,风险回报比是合理的。
It's only about now, next generation. It's not about fifteen years from now because that's it's a whole new playing field every time or five years from now. I think that's that's you're right. It's the risk reward is is is correct.
是啊。但很少有人愿意冒这种风险。
Yeah. But There's few people take these kind of risks.
这是唯一一家估值超过100亿美元的半导体公司,成立时间却那么晚。联发科是九十年代初的,英伟达也是,其他大多都是七十年代的老牌企业。对,那些巨头们。
It's the only semiconductor company that's worth, you know, I think even north of $10,000,000,000, that was founded as late as it was. Like, MediaTek was in the in the early nineties and then, NVIDIA, and everyone else is, like, from the seventies mostly. Yeah. The big ones. Yeah.
没错。我觉得你提出了个很好的观点,关于这种押上全部身家的豪赌,而且按你说的,他其实已经失败过几次了。
Yeah. Yeah. I think you raised this great point on these bet the the bet the farm, and he's actually been wrong a couple times to your point.
移动领域对吧?比如,移动业务到底出了什么问题?
Mobile. Right? Like, what the hell happened with mobile?
正是如此。但他依然坚持冒险。我记得马克和埃里克有过一次精彩对话,谈到创始人领导的公司会铭记当初为取得今日成就所冒的风险。而后来接任的CEO往往只会说'好吧'。
Exactly. And he still takes them. And I think Mark actually had this great conversation with Eric where he talked about being founder run, where you have this memory of the risks you took to get to where you are today. Right? And so in in a lot of cases, if you're a CEO brought on later on, you're sort of like, okay.
继续按现状掌舵公司。但这次,他记起了所有那些他们几乎倾覆的时刻。他心想,我必须下注。要继续做这样的豪赌。你认为他这些年有什么变化——我是说,他已成为任职最久的CEO之一,超过30年,现在几乎和拉里·埃里森齐名了。
Continue to steer the ship as is. But in this case, he he remembers all the times they they almost went belly up. And he's like, I've gotta bet. Keep making bets like that. How do you think he's changed over I mean, he's been one of the longest running CEOs over 30 he's kinda right up there with Larry Ellison now.
你觉得过去三十多年里他发生了哪些变化?
How do you think he's changed over the last thirty years or so?
我、我意思是,显然,我才29岁。不知道他当年什么样。我看过很多老访谈。我不会说他当时没有...
I I I mean, obviously, like, I'm I'm 29. Don't know he was like. I've I've watched a lot of old interviews. I I won't say he wasn't
他当CEO的时间比你年龄还长。
CEO longer than you've been alive.
没错。完全正确。英伟达创立时我还没出生呢,我是96年的。
Yeah. Exactly. Exactly. Like, Nvidia was founded before I was born. I'm 96.
对吧?你懂的。嗯。
Right? Like, you know? Yeah.
或许最近几年有什么变化?我觉得...
Maybe anything over the last couple of years? I think that.
我觉得,就算是看那些老采访视频,对吧?比如我看了很多他过去的采访和演讲。有一点很明显,他现在简直是魅力爆棚,气场全开,那种吸引力只增不减。对吧?
I think even, like, watching old interviews. Right? Like, watched a lot of old interviews, a lot of old, like, presentations he's given. One thing is that he's just, like, sauced up and dripped up, like, way like, the charisma he's gotten has only gotten stronger. Right?
是啊。这倒是个有趣的观点,虽然我不确定是否完全相关。
Yeah. Which is which is an interesting point. I don't know if it's quite relevant.
完全同意这一点。
Totally agree with that.
没错。但你看,这家伙现在完全是个摇滚巨星范儿了,尽管他一直很有魅力,但现在简直是登峰造极。其实十年前他也是个摇滚巨星,只是人们可能没意识到。我记得看的第一场现场演讲,那叫一个震撼,是在CES展会上,大概是2014还是2015年来着。
Yeah. But, like, the man, like, has learned to be a rock star more even though he was always charismatic, it was like, he's a complete rock star now. And he was a rock star, you know, a decade ago too. It's just people maybe didn't recognize it. I think I think the first live presentation that I watched, it was extreme, was like it was what's the what's the con it was CES, like, 2014 or 2015 or whatever.
那是消费电子展。我当时还在管理游戏硬件相关的Reddit板块,对吧?那时候我还是个青少年。
He's he's he's it's it's consumer electronics show. I'm I'm, like, moderating, like, gaming sub gaming hardware subreddits. Right? Like Yeah. I At the time, I'm a teenager.
结果这老兄全程都在讲AI,对着满场游戏玩家大谈AlexNet和自动驾驶。对吧?首先得了解听众啊,但另一方面又觉得...太厉害了!虽然和消费电子展主题毫不相干。
And, like, the dude is, like, talking only about AI. He's telling he's telling, like, all these gamers about AlexNet and self driving cars. Right? It's like, know your audience, first of all, but also, like like Amazing. It's not has nothing to do with consumer electronics that gave me.
要知道当时我一边觉得'卧槽这也太酷了',一边又盼着他赶紧发布新游戏显卡。论坛上大家很快就炸锅了,都在说'搞什么鬼'之类的。
You know, at the time, I was also, like I was half like, holy crap. This is amazing, but also half like, I want you to announce new gaming GPU. Right? Like, you know but I know, like, on the forums on the forums, quickly, everyone was like, you know, screw this. You know?
对,对。我想听听关于游戏显卡的事,英伟达的价格压榨。你知道的,英伟达向来如此,我们定价时会考虑价值,再加一点溢价。对吧?
Yeah. Yeah. I wanna hear about the g gaming GPUs, NVIDIA's price gouging. Like, you know, of course, NVIDIA's always had to, like we price the value and, like plus a little bit. Right?
因为我们足够聪明,懂得这种怪异操作。我猜黄仁勋就是凭直觉定价的。他会在发布会前最后一刻调整游戏显卡的价格。哇哦。
Because we we're just smart enough to know weird. You know, I'm guessing Jensen just has the gut feel of how to price things. Right? He'll he'll change the price, like, at least on gaming watches, he'll change the price up until, like, right before the presentation. Wow.
所以这很可能真是靠直觉。总之他有那种魅力能判断对错。但很多人当时都觉得:'得了吧,老黄错了'。
So, like, it really is like a a gut feel thing probably. And, anyway, so so he he had that charisma to know what was right. But I think people a lot of people were like, oh, no. Whatever. Jensen's wrong.
说他根本不懂自己在说什么。但现在他发言时,人们都变得...你懂的,非常非常...可能只是因为他正确的次数够多。
He doesn't know what he's talking about. But now, like, he he talks. People are like, oh, they're very, very you know, the so it might just be that he's been right enough.
没错。最近X上有帖子说他已晋升'神级'CEO行列,但具体是...
Yeah. There's a post on X recently that said he had moved up into god mode with a select group of CEOs, but that this was re like, it's exactly Who
哪位神?还有哪些神?
who's god who's who's the other gods?
是扎克伯格。另一位大神。
It was Zuck. Pretty other god.
埃隆?
Elon?
埃隆。埃隆、扎克和杰森。
Elon. Elon, Zuck, and Jensen.
不错。不错。好的。
Nice. Nice. Okay.
好团队
Good crew to
加入其中。所以我们向硅谷祈祷。是的。这是
be in. So so we pray to Silicon Valley. Yeah. It's part of
现在成了邪教的一部分,是吗?
the cult now, is it?
周期性存在。关于人员还有最后一点。你提到了他的首席财务官科莱特,要知道,尽管所有元老级人物现在都可以退休了,但英伟达内部仍有一支出了名忠诚的团队。目前在英伟达,是否有类似SpaceX的格温·肖特维尔,或者以前苹果公司蒂姆·库克之于史蒂夫·乔布斯那样的人物存在?
Cyclically. Just on one more one last thing on people. You mentioned Colette, his CFO, and, you know, there's there's sort of a famously loyal crew at NVIDIA even though all of the OGs could retire at this point. Is there anyone akin to a Gwynne Shotwell at SpaceX or previously a Tim Cook to Steve Jobs at Apple that is at NVIDIA today?
我是说,曾经有两位联合创始人。对吧?比如,你知道的,我们别忽视这一点。其中一位,就像,你知道的,已经很久没有参与了,但另一位直到几年前还在参与。对吧?
I mean, had two cofounders. Right? Like, that's you know, let's not overlook that. One of them one of them is, like, you know, not involved and hasn't been for a long time, but the other one was involved up until just, know, few years ago. Right?
所以不只是Jensen在主导一切。对吧?
So it's not just Jensen running the show. Right?
完全同意。
Totally.
虽然他在主导大局。硬件部门有不少人。我一直觉得NVIDIA里有个人对我来说很神秘。比如,当你和工程团队交谈时,他领导着许多工程团队。他是个低调的人,所以我其实不想说出他的名字。
Although he was running the show. There's quite a few people on the hardware side. I've always there there's someone at at NVIDIA that's, like, mythical to me. Like, when you talk to the engineering teams, he leads a lot of engineering teams. He is a private person, so I don't wanna say his name actually.
我有点害怕。但,你知道,他他他实际上就像是首席工程官的角色,他团队里的人都知道他是谁。我觉得公司里有不少这样的人,但,你知道,他极其忠诚,而且这类人不止一个。还有另一个家伙,就像,你知道的,NVIDIA有所有这些创新想法,而他正是那个会说‘我们现在必须推出这款芯片,要削减功能’的人。
I'm scared off. But, you know, he he's he's he's like he's like effectively, like, chief engineering officers, like his role, and people within his org will know who he is. And I think I think there are people like that, but, you know, there there there he's intensely loyal, and there's there's a number of these types of people. There's another fella who's like, you know, like, there's all these, like, innovative ideas at NVIDIA, and he's the guy who literally is like, we need to get this silicon out now. We're cutting features.
这就是他出名的原因,而NVIDIA的所有技术专家都讨厌他。这是第二个家伙。第二个家伙。同样对NVIDIA极其忠诚,已经待了很久,但你知道,当你身处这样一家充满远见、前瞻的公司时,一个问题就是容易迷失方向。对吧?
And that's like that's like what he's famously known for, and all the technologists in NVIDIA hate him. This is this is like a second guy. This is a second guy. Also intensely loyal to NVIDIA, has been around for a long time, but it's like you know, it's sort of like when you have such a visionary company and forward, you know, one one problem is that you get lost in the sauce. Right?
你知道吗?哦,我想做这个。它必须完美。太棒了。然后,你知道,你需要那种人——这些人显然与Jensen关系密切是有原因的,因为Jensen也相信这些理念。
You know? Oh, I wanna make this. It's gotta be perfect. Amazing. And it's like, you know, you gotta have that sort of like and and these people are like, you know, obviously, they're close to Jensen for a reason because Jensen also believes like these things.
对吧?既要展望未来,又得抱着'管他呢,砍掉重练'的心态。我们会把它放到下一个版本里。直接发布。
Right? Have the visionary future looking, but also like, screw it. Cut it. We'll put it in the next one. Ship.
对吧?就像在硅谷这样的环境里,要快速发布、加速迭代,这真的很难做到。英伟达从一开始就令人超级印象深刻的是——他之前也提到过——他们的首款成功芯片研发时资金即将耗尽,他不得不四处筹钱才完成开发,即便如此资金也刚够用,因为此前他们已经有过一次芯片失败的经历了。
Right? Like, you know, ship now, ship faster, like, in in a space like silicon, which is, like, really hard to do so. And and and sort of like the thing about NVIDIA that's always been, you know, super impressive, and it's from the beginning days where he's talked about this before, is their first chip their their first successful chip, they they're gonna run out of money. And he had to go get money from other people to even finish the development. And even then, he just had enough money because he'd already had a failed chip before this.
芯片必须一次成功,否则就完蛋了。因为他们只付得起一套光刻掩模版的费用。简单来说,就是把类似模板的东西放进光刻机,机器根据模板图案在晶圆上沉积材料、蚀刻处理,反复操作来定位元件位置。
Was the chip came back and it had to work. Otherwise, it would not, you know and so they they were like because they could only pay for it's called a mask set. Right? Basically, you put these like I'll call them stencils into the lithography tool, and then it like says where the patterns are and you, you know, you put that stencil in, you deposit stuff, you etch stuff, you deposit materials on the wafer, etch it away, and you put the stencil in and, like, you you, like, tell it where to put stuff. Right?
然后在这些指定区域不断进行沉积和蚀刻,堆叠数十层后最终形成芯片。这些掩模版是每款芯片专属定制的,如今成本高达数百亿美元,即便在当时也是笔巨款。
And then the the deposition and etch keeps happening in those spots and you stack dozens of layers on top of each other and you make up a chip. These stencils are custom to each chip. Right? And they cost today in the orders of tens and tens of billions of dollars. But even back then, it was still a lot of money.
不,当然那时候没这么贵。但他们只够支付一套掩模版的费用。半导体制造的常态是:无论模拟验证做得多么完善,设计投产后总需要修改。
No. It it wasn't that much then, of course. You know? It it it sort of he could they could only pay for one set. But the typical thing with semiconductor manufacturing is, you know, as good as you can simulate, as good as you can do all the verification, you'll send a design in and you have to change it.
总会有意料之外的问题。完美模拟所有变量实在太难了。但英伟达厉害之处在于他们总能一次成功。即便是AMD或博通这样的优秀企业,也常需要发布A版、B版等修订版本。
You there's gonna be something. It's it's so hard to simulate everything perfectly. And the thing about NVIDIA is they tend to just get it, right, the first time. Yeah. Even like even great executing companies like AMD or Broadcom or whoever, they often have to ship you know, they're denoted in like a and then a number or b and then a number.
这些版本号对应着掩模版的不同部分。英伟达几乎总是直接发布零版本,偶尔发布一版本。很多时候即便开始生产,A版主要指晶体管层,而数字编号版本则是连接晶体管的布线层。
So that's like two different parts of the masks. So like, NVIDIA always ships a zero. Almost always. They sometimes ship a one. And a lot of times, even if they'll they'll start production of the you know, the a is basically the transistor layer than the numbers like the wiring that connects all the transistors together.
所以英伟达会开始生产A版本并大幅提升产量,然后在即将切换到金属层前先按兵不动,以防万一需要修改金属层设计。这样一旦确认芯片可用,他们就能迅速大规模投产。而其他厂商还在反复折腾:‘哎呀,芯片拿回来了。哦,A0版本不行啊。’
So NVIDIA will start production of the a and ramp it really high, and then you just hold it right before you transition to the metal just in case they do need to change the metal layers. And so, like, the moment they're ready and they've confirmed that it works, they can just, you know, blast through a lot of production. Whereas everyone else is like, oh, let's get the chip back. Oh, okay. A zero doesn't work.
我们得调整这个,修改那个,然后退回到上个版本。对,这就叫步进修订(stepping),懂吧?
We gotta make this tweak, make this tweak, and make it step back. Yeah. It's called a stepping. Right?
我们当时...我们当时特别嫉妒英伟达。他们总能按时交付产品。而我们的初版就搞砸了。
We we we didn't we were very jealous of of NVIDIA at that time. Right? They consistently delivered. And the first one, we did not.
有趣的是数据中心CPU部门有款产品,我说要么做A1版本,要么如果连晶体管层都要改就直接跳到B版本。结果英特尔有一次搞到了E2版本,E2啊!
It's funny because the the the data center CPU group, there was one product where, you know, I said a one, a you know, a zero a one, or you go to b if it's have to change the transistor layer as well. So it's like b. NVIDIA sorry. Intel got to, like, e two once. E two.
这相当于15次修订版。AMD市场份额飙升反超英特尔的关键时期,就是英特尔还在折腾E2版本的时候。15次步进修订,想想看。
Like, that's like a 15 revision. This is this is, like, this is, like, the peak of AMD's, smart like, when they went skyrocketing on market share versus Intel was when Intel was at e two. Right? Like, 15 stepping.
因为每次延期就是整个季度啊。对上市计划简直是灾难性的打击。
Like, because it's quarters of delay. Right? I mean, it's it's it's catastrophic for a go to market. Yeah.
每次修订都要耽误个把季度。太荒谬了。所以说英伟达厉害就厉害在敢说‘去他的,直接干’这种魄力。
Each each time is is a quarter of delay or something. Right? Yeah. So it's it's it's absurd. So I think that's the other thing about NVIDIA is like, you know, screw it.
我们赶紧把它发布出去。尽快提升产量。就是,你知道的,让我们把这些事情搞定。总之,他们拥有一些最棒的模拟验证技术等等,这让他们能够从设计阶段,也就是从构想到产品上市,以最快速度推进,砍掉任何可能拖延进度的不必要功能,确保无需返工,以便能迅速响应市场需求。有个关于Volta的故事,那是NVIDIA首款搭载张量核心的芯片——嗯。
Let's ship it. Let's let's get the volume ASAP. Let's let's let's, you know, let let's do these things that you know? And and so anyways, they they, like, you know, have some of the best simulation verification, etcetera, that lets them sort of go from design, you know, from idea to shipment as fast as possible, you know, cutting out any unnecessary features that could delay it, making sure they don't have to do revisions so that they can get you know, they can respond to the market ASAP. There's a story about how Volta, which was the first NVIDIA chip with tensor cores Mhmm.
你知道,他们在前一代P100 Pascal架构上看到了所有AI应用的潜力,于是决定全力投入AI。他们在Volta上增加了张量核心,距离送厂量产只有短短几个月时间。就像他们说的,管他呢,直接改。这太疯狂了。
You know, they saw all the AI stuff on the prior generation p 100 Pascal, and they decided we should go all in on AI. And they added the tensor cores to Volta, like, only a handful of months before they sent it to the fab. Like, they said, screw it. You know, let's change it. And that's crazy.
试想如果他们没这么做,或许别人就会抢占AI芯片市场了,对吧?所以这些关键时刻——虽然都是重大改动,但往往还需要调整些小细节,比如数值格式或某些架构设计。
And it's like, if they hadn't done that, maybe someone else would have taken the AI chip market. Right? So there are all these times where those are major changes, but there are often, like, minor things that you have to tweak too. Right? Number formats or, like, some architectural detail.
NVIDIA就是
NVIDIA is just
快得惊人。更疯狂的是他们还有个能跟得上节奏的软件部门。想想看,如果你推出一款芯片,基本无需迭代就能直接上市,同时还能准备好驱动程序和上层基础设施支持,这简直太厉害了。确实。
so fast. And the other crazy thing is they have a software division that can keep up with that. Right? I mean, if you come out with a chip, right, and basically no stepping is required, it's immediately in the market, then being ready with drivers and, you know, all the infrastructure on top of that, it's just super impressive. Yeah.
我特别赞同这个观点。人们总说NVIDIA接连遇上风口,但你们两位的意思是:你必须动作足够快、执行足够好才能抓住这些机遇。顺便说,我很喜欢你讲的CES故事——我都能想象十几年前他谈论自动驾驶汽车的样子。但想想他们把握住了游戏产业风口、VR、比特币挖矿,现在又是AI...今天黄仁勋还提到机器人和AI工厂。关于NVIDIA我最后想问:
I love that point, because you think of NVIDIA benefiting from tailwind after tailwind, but I think both of you are saying you have to move fast enough and execute well enough to take advantage of those tailwinds. And by the way, I loved your CES story. I was just envisioning him more than ten years ago talking about self-driving cars. But, you know, if you think about nailing the video game tailwind, VR, Bitcoin mining, obviously AI now, one of the things that Jensen talks about today is robotics, AI factories. Maybe my last question on NVIDIA.
你们怎么看未来十到十五年的发展?我知道预测超过五年都很难,但NVIDIA的业务会变成什么样?
What do you think about the next ten to fifteen years? I know calling beyond five is hard, but, like, what does NVIDIA's business look like?
这这这确实是个问题,每次我和NVIDIA的高管交谈时都会问这个,因为我真的很想知道——虽然他们显然不会回答——就是,你们打算怎么处理资产负债表?你们现在是现金流最充沛的公司,现金流多到离谱。现在超大规模企业都在大幅削减现金流,对吧?因为他们正在投入资金...
It's really a question of, and I think every time I've talked to some executives at NVIDIA, I've asked this question because I really wanna know, and they won't answer it obviously, but it's like, what are you gonna do with your balance sheet? Like, you are the highest-cash-flow company, like, you have so much cash flow. Now the hyperscalers are all taking their cash flow way down. Right? Because they're spending it.
你们打算怎么处理所有这些现金流?对吧?要知道,在这次爆发式增长之前,他们连收购ARM都不被允许。那么这么多资本和现金能用来做什么呢?
What are you gonna do with all this cash flow? Right? Like, you know, even before this whole takeoff, he wasn't allowed to buy Arm. Right? So what can he do with all this capital and all this cash?
对吧?就连这50亿美元投资英特尔也要接受监管审查。公告里明确写着呢。
Right? Even this $5,000,000,000 investment in Intel, there's regulatory scrutiny there. Right? Like Yeah. It's in the announcement.
是啊,这需要经过审查。我估计能通过,但他没法进行任何大型收购。未来资产负债表上会躺着数千亿现金。
Like, yeah, this is subject to review. Right? Like Yep. You know, I I imagine that'll get passed, but, like, he can't buy anything big. He's gonna have hundreds of billions of dollars of cash on his balance sheet.
怎么办?是不是要开始建设AI基础设施和数据中心?也许吧。但既然能让别人去做自己坐收现金,何必亲自下场?不过他现在确实在投资...
What do you do? Is it start to build AI infrastructure and data centers? Maybe. But, like, why would you do that if you can just get other people to do it and just take the cash? Well, he's investing in those.
对吧?但投的都是小钱。最近他提供了担保,因为现在确实很难找到大量GPU来应对突发需求。
Right? Investing peanuts. Right? You know, like, he recently gave, like, a backstop, because today it's really hard to find a large number of GPUs for burst capacity. Yeah.
对吧?比如有人想用三个月训练模型。有基础算力做实验,但需要三个月集中训练大型模型——这就搞定了。
Right? Like, hey. I wanna train a model for three months. Right? I have my base capacity where I do my experiments, but I wanna train a big model for three months. Done.
我们从投资组合中了解到。是的。是的。
We know from our portfolio. Yeah. Yeah.
是的。所以,就像,英伟达看到了这个问题。他们认为这是初创公司面临的实际问题。这就是为什么实验室有如此大的优势。但如果我能,你知道,现在,就像,你知道,今天硅谷的大多数公司在一轮融资中花费了大约75%在GPU上。
Yeah. So, like, NVIDIA sees this issue. They think it's a real problem for startups. It's why the labs have such an advantage. But right now, you know, most companies today in the Valley spend, what, 75% of their round on GPUs.
对吧?或者至少。是的。是的。我们看到了。
Right? Or At least. Yeah. Yeah. We see.
如果你能在三个月内通过一次模型运行完成75%的工作呢?对吧?你知道吗?是的。并且真正扩大规模,拥有某种有竞争力的产品,然后你有了模型,然后你可以筹集更多资金,对吧,或者开始部署。
What if you could do 75% in three months on one model run? Right? You know? Yeah. And and really scale and have some sort of, like, competitive product, and then you have the model, then you raise more capital, right, or start deploying.
对吧?你用它做什么?是不是开始购买大量人形机器人并部署它们,但是,就像,它们并没有真正做出好的软件。他们在模型方面并没有真正做出那么惊人的软件。对吧?
Right? What do you do with it? Is it start buying a crapload of humanoid robots and deploying them? But, like, they don't really make good software. They don't make that amazing software for them in terms of the models. Right?
他们做的,你知道,底层很棒。他们部署资金的地方才是问题所在。
They make, you know, the layer below is great. Where they deploy their capital is, like, the question.
不过,他一直在供应链上下游进行一些投资。对吧?投资于新云公司,投资于一些模型训练公司。
He has been investing up and down the supply chain a little bit, though. Right? Investing in the neoclouds, investing in some of the model training companies.
是啊。不过,再说一次,这都是小打小闹。他要是愿意,完全可以参与整个Anthropic轮融资。当然他没这么做,对吧?
Yeah. But, again, it's small fries. Like, he could have just done the entire Anthropic round if he wanted to. Of course he didn't. Right?
然后真的让他们用上GPU什么的,或者他本可以参与整个OpenAI轮融资,任何XAI轮融资都行。你觉得这些是
And then really got them to use GPUs. Or, like, he could have done the entire OpenAI round. He could have done any xAI round. Do you think these are
他应该做的事吗?还是说...
things he should be doing or or what's I
我是说...嗯,好问题。
mean, like Yeah. Good question.
我...我...
I I
不知道,对吧?我觉得...
don't know. Right? I think I think, like
我们会...我们会在下一轮融资时引用你的话。不过话说回来,他...
We'll quote you on the next round that we're raising. But, anyways, he
他能让风投行业成为死水一潭吗?不。拿下所有最佳轮次。
could he could make venture a dead industry. No. Take all of the best rounds.
但这可是笔大生意。是啊。
But it's a lot of business. Yeah.
你知道的,你可以先投种子轮再让Jensen给你抬价。这就是我搞不定的原因,但我喜欢这样。不,我觉得...我觉得我不喜欢这样。要知道,挑选赢家对他显然非常困难,因为他的客户遍布整个生态系统。
You know, you can do the seeds and then have Jensen mark you up. That's why it couldn't work, but I like it. No, I don't like it. I think, like, picking winners is obviously really tough for him because he has customers all across this ecosystem.
如果他开始押注赢家,那么他的客户们会更加焦虑地想要离开,加倍努力转向AMD、某家初创公司或他们的内部项目等等。对吧?买TPU也好,其他什么也罢,人们会这么做。他不能就这么大举投资这些公司——你懂的,他可以稍微参与,对吧?在OpenAI轮投个几亿没问题,或者在xAI轮投个几亿也行。
If he starts picking winners, then his customers will be even more anxious to leave and put even more effort into AMD or, you know, some startup or their internal efforts, etcetera. Right? Buying TPUs, whatever it is, people will. He can't just invest in these, like, you know, he can do a little bit, right? A few $100,000,000 in an OpenAI round is fine, or a few $100,000,000 in an xAI round is fine.
Core Weave对吧?虽然大家都在大惊小怪,但他不过投了两亿多,早期阶段而已,还租用他们的集群做内部开发——这可比租用超大规模云服务商对英伟达更划算。明白吗?从他们那里租比从超大规模云服务商租更好。这真的算是在给CoreWeave兜底吗?
CoreWeave, right? Like, yeah, everyone's throwing a fuss about it, but it's like he invested a couple $100,000,000 plus early on, plus rented a cluster from them for internal development purposes instead of renting it from a hyperscaler, which is cheaper for NVIDIA to do. Right? It's better for them to rent from CoreWeave than from the hyperscalers. It's like, is he really backstopping CoreWeave that much?
对吧?或者其他客户比如Neo Clouds?是有一些投资,但更像是...这是个不错的云服务。懂吗?我们顶多投个5%到10%的份额。
Right? Or, you know, any of the other customers or neoclouds? Like, there's some investment, but it's more like, this is a good cloud. You know? We'll throw in, like, 5 or 10% of the round.
对吧?他并没有吃掉50%以上的融资份额。
Right? It's not like he's taking 50% plus of the round.
他也在重塑他的市场吗?我是说,看。几年前,这些卡片有四次大采购。你刚刚列出了六次。这在多大程度上是Assume、Nevious和Leia的作用。
Is he also reshaping his market? I mean, look. A couple of years ago, there were four big purchasers of these cards. You just listed six. To what extent is that, you know, Nebius and the like?
那里有一长串名单。
There's a long list there. Of
当然。是的。
course. Yeah.
这是否是一种策略?我认为是的。
Is that a strategy? It is. I think
这绝对是。但他不需要投入太多资金就能做到这一点。比如,
it absolutely is. But he didn't have to put much capital down to do this. Like, what
芯片一是否比另一个更早?
Does he ship to one earlier than
我不知道。是的。那不是。但就像,如果你看看他们在新云上投入的总资金,那是几十亿美元。
the other? I don't know. Yeah. That's No. But it's like, if you look at the grand amount of capital they spent investing in the neoclouds, it's a few billion dollars.
但他拥有很多
But he has a lot
其他杠杆手段,如果他愿意的话。
of other levers if he wants to.
对,对。就像你提到的分配问题。好处在于,历史上你们给超大规模客户提供批量折扣,但由于他能以反垄断为由,现在所有人都享受与超大规模客户相同的价格。非常公平,非常公平。
Right. Right. Allocations, as you mentioned. What's nice is, you know, historically, you gave volume discounts to hyperscalers, but because he can use the argument of antitrust, he's like, everyone gets the same price. So fair. It's very fair.
非常公平。
It's very fair.
明白吗?那么他应该如何处理这些现金,或者说应该以什么为指导,我是说——
You know? So what should he do with the cash, or what should guide his, I mean,
我认为,他应该投资数据中心,仅限于数据中心层面,而不是数据中心内部的内容,这样更多人会建设数据中心。如果市场需求持续增长,数据中心和电力就不会是问题,对吧?投资数据中心和电力。我已经向他们提过这个建议了。
I think, like, you know, the argument is he should invest in data centers, and only the data center layer, not what goes in the data center, so that more people build data centers. And then if market demand continues to grow, data centers and power are not the issue. Right? Invest in data centers and power. I've said that to them.
他们应该投资数据中心和电力,而非云服务层,因为云服务层虽未完全商品化,但已接近成为商品化的补充。明白吗?这是完整的逻辑。我不会说云服务已商品化,但确实现在有许多实力相当的竞争者。而且你们已经教育了商业地产和其他基础设施投资公司,让他们也开始涉足AI基础设施领域。
They should invest in data centers and power, not in the cloud layer, because the cloud layer is not quite commoditized, but it's the "commoditize your complement" idea. Right? That's the whole phrase. And I won't say being a cloud is commoditized, but you certainly have a lot of competitors who are decent now. And you've educated the commercial real estate and other infrastructure investment firms into going into AI infra as well.
所以,我觉得你投资的不是云层本身,对吧?你是投资数据中心和能源吗?对。你投资这些吗?
So, like, I don't think it's the cloud layer that you invest in. Right? Do you invest in data centers and energy? Yeah. Do you invest in that?
因为对你来说增长的真正瓶颈在于,一是人们愿意且能够花多少钱,二是实际将资源投入数据中心的能力。至于机器人这类领域,我觉得他确实可以投资,但没有什么需要3000亿美元资本的项目。那你拿这些资本怎么办?我真的完全不知道。我感觉黄仁勋肯定有某种计划。
Because that's the bottleneck for your growth, really: a, how much people wanna spend and can spend, and b, the ability to actually put it all in data centers. And then robotics and, like, there are areas he could invest in, but nothing requires $300,000,000,000 of capital. So what do you do with the capital? Like, I really don't know. And I feel like Jensen has to have some idea.
这里肯定存在某种远景规划,因为这决定了公司的走向,对吧?
There's some visionary plan here because that's what shapes the company. Right?
没错。
Yep.
我是说,他们可以继续...我提到每年2000亿到2500亿美元的自由现金流。他们怎么处理这些钱?永远回购股票吗?走苹果的老路?苹果近十年没做出什么创新,就是因为缺乏有远见的领导者。
I mean, they could just continue to, you know, I mentioned $200,000,000,000 of free cash flow, $250,000,000,000 of free cash flow a year. What do they do with it? Like, do they just buy back stock forever? Do they go the Apple route? The reason why Apple hasn't done anything interesting in, like, nearly a decade is, you know, they've got a non-visionary at the head.
蒂姆·库克擅长供应链管理,但他们只是把钱砸在股票回购上。自动驾驶汽车项目失败了,AR/VR领域还有待观察,可穿戴设备也是未知数,对吧?
Tim Cook's great at supply chain, and they're just plowing the money into buybacks. You know, the self-driving car thing failed. We'll see what happens with AR/VR. We'll see what happens with wearables. Right?
但Meta和OpenAI可能比他们更出色。其他领域我们拭目以待。所以他到底投资什么?我毫无头绪,但肯定不是现在这些。
But, like, Meta and OpenAI might be even better than them. We'll see in the others. Right? So what does he invest in? I have no clue, but nothing
什么项目需要这么多资金?这是个棘手的问题。实际上它确实能带来回报。
What requires so much capital that actually gets a return? That is the tough question.
是啊。
Yeah.
因为简单来说,就像我的股权成本。对吧?我直接回购就行。
Because the easy thing is, like, my cost of equity. Right? I just buy back.
而且不会彻底改变公司文化。我认为这是另一点。对吧?可能有些领域你可以投资,但突然之间公司会同时做两件完全不同的事,这很难持续下去。
And it doesn't completely change the company culture. I think that's another thing. Right? There are probably areas you could invest it in, but you suddenly end up with the company doing two completely different things, which is very difficult to keep going.
但他们确实在做10件完全不同的事。对吧?我的意思是,一种理解方式是我们在构建AI基础设施。然后全球各地建造AI基础设施、机器人、人形机器的团队,或是数据中心和能源,都属于AI基础设施。对吧?
But they do, like, 10 completely different things. Right? I mean, one way to look at it is: we build AI infrastructure. And then, under the guise of "we build AI infrastructure," robots, humanoids around the world are AI infrastructure, or data centers and energy are AI infrastructure. Right?
就像,你知道的,就像
Like, you know, like
所以人形机器人项目完全可行。对吧?但如果突然开始浇筑混凝土和建造发电厂,那将形成完全不同的文化,需要完全不同的人员配置,难度会大得多。好吧,我同意。
So the humanoids would totally work. Right? But if you were suddenly pouring concrete and building power plants, that's a completely different culture, a completely different set of people, and it gets much, much harder. Okay. I agree.
有不同的方式可以操作,比如投资于不同的公司,或者为发电厂的建设提供支持。对吧?因为没人愿意建发电厂,毕竟那是需要三十年承保周期的事情。明白吧?在这些不同领域,确实需要资金来促成某些事情的发生,对吧?
There are different ways to do it, like invest in the various companies or, like, backstop the building of the power plants. Right? Like, you know, because no one wants to build power plants, because they're thirty-year underwriting things. Right. You know, there are all these different areas where he could use capital to, you know, allow something to happen, right?
不一定非要自己亲自拥有。而且
Not necessarily owning it himself. And
听着,记住这一点,我们面临的最大问题之一是客户基础太糟糕了。我是说,大部分芯片都卖给了大型超大规模运营商,这些客户过于集中,他们自己也在造芯片,所以会压价。说实话,把钱花在云服务多元化上才是——
look, and bear in mind, at Intel, one of the biggest problems we had was that our customer base sucked, right? I mean, most of the chips went to the large hyperscalers, which are way too concentrated, and they build their own chips, and so they can push down your prices. So honestly, spending it on diversifying the cloud, you know, the Well,
纸浆在PLV14项目里。你们当初就该把价格定得超高,让利润率达到80%。全世界能拿你们怎么办?
pulp was in p l v 14. You guys should have just charged so much that your margins were 80%. What would the world have done?
毫无办法。那时候利润率其实挺不错的,那不是问题所在。
Nothing. The margins were pretty good back then. That wasn't the problem.
那就是主要问题。60%,65%的利润率。本来可以做到80%的。现在还是。
That was the primary problem. Sixty, sixty-five. They could have been 80. Still. Yeah.
哎呀,截然不同...是Jensen负责的。现在还是Jensen或者Nestor管吧。对。
Oh, boy. Different different It was Jensen. It's still Jensen or Nestor. Yes.
PTSD症状开始发作了。好吧,
PTSD is kicking in here. Well,
等等。我认为Guido的评论实际上很好地引出了我们想和你讨论的另一个话题,那就是超大规模云服务商。我喜欢阅读SemiAnalysis的原因之一,是你们经常能做出与市场共识不同的准确预测。最近就有一个——等等,只是"经常"正确?
wait. I think Guido's comment is actually a really good segue into something else we wanted to talk to you about, which is the hyperscalers. And one of the reasons that I love reading SemiAnalysis is you guys make these out-of-consensus calls that you're often right about. And one of them recently Well. Only often?
你们有Jensen级别的预测命中率。准确率非常高。
You have a Jensen hit rate. It's very high.
我那个价值十亿美元、押注电动车行业向好的头寸去哪了?
Where's my billion dollar, you know, EV positive bet?
最吸引我注意的是亚马逊在AI领域的复兴。我想就此深入探讨,因为我们在实地帮助投资组合公司选择合作伙伴时,发现这个现象很有趣。虽然我们掌握了一些微观数据,但希望你能系统分析他们落后的原因。
The one that caught my eye was Amazon's AI resurgence. So I wanted to talk to you a little bit about that, just because, you know, I think we found it pretty interesting being on the ground helping our portfolio companies pick who their partners are. And so we have some micro data on this, but can you sort of walk through why they're behind?
没错。2023年第一季度,我写过一篇题为《亚马逊的云危机》的文章。核心观点是新兴云服务商将使亚马逊的业务商品化,并指出亚马逊的基础架构只适配上一代计算范式——
Yeah. So in Q1 2023, I wrote an article called Amazon's Cloud Crisis. And it was about how all these neoclouds are gonna commoditize Amazon. It was about how Amazon's entire infrastructure was really good for the last era of computing. Right?
他们的弹性网络架构(ENA/EFA)、网卡设计、底层协议体系,以及定制CPU方案等,在传统横向扩展计算时代表现优异,却无法适应当前纵向扩展的AI基础设施需求。当时认为他们的芯片团队过度专注成本优化,而当今AI时代的关键指标是单位成本性能最大化——这意味着有时需要不计代价地追求极致性能。
What they do with their elastic fabric, ENA and EFA, right, their NICs, the whole protocol and everything behind them, what they do for custom CPUs, etcetera. Right? Like, it was really good for the last era of scale-out computing and not this era of scale-up AI infra, and how neoclouds were gonna commoditize them, and how their silicon teams were focused on cost optimization, whereas the name of the game today is max performance per cost. Right? And, like, that often means you just drive up performance like crazy.
即使成本翻倍,性能提升超过三倍,单位性能成本仍在下降。这某种程度上就是如今NVIDIA硬件的制胜法则。事实证明这是个非常明智的判断。当时所有人都质疑我们,比如亚马逊还是最被看好的股票,微软尚未真正起飞,甲骨文等公司也是如此。但自那以后,亚马逊成了超大规模服务商中表现最差的。
Even if cost doubles, if you drive up performance more, say it triples, then cost per performance still falls. That's sort of the name of the game today with NVIDIA's hardware. And it ended up being, like, a really good call. Everyone was calling us out like, no, you're wrong, and this was when Amazon was, like, the best stock, and Microsoft really hadn't started taking off yet, nor had all these others, Oracle and so on and so forth. And since then, Amazon has been the worst-performing hyperscaler.
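The performance-per-cost arithmetic here is simple enough to make concrete. A minimal sketch, with the doubling and tripling figures taken as illustrative rather than measured:

```python
# Illustrative figures only: cost doubles, performance triples.
base_cost, base_perf = 1.0, 1.0
new_cost, new_perf = 2.0 * base_cost, 3.0 * base_perf

# Cost per unit of performance still falls, so the pricier part wins.
base_cost_per_perf = base_cost / base_perf  # 1.00
new_cost_per_perf = new_cost / new_perf     # ~0.67

print(f"cost/perf: {base_cost_per_perf:.2f} -> {new_cost_per_perf:.2f}")
```

So long as the performance multiple exceeds the cost multiple, performance per cost improves even as the absolute price rises.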
完全正确。关键问题在于他们仍存在结构性问题。对吧?他们仍在使用弹性架构网络,虽然有所改进,但仍落后于NVIDIA的网络技术,也不及博通/Arista这类网络接口卡。其自研AI芯片也只能算差强人意。
Totally. And the call here is that, you know, they still have structural issues. Right? They still use elastic fabric, although that's getting better, still behind NVIDIA's networking, behind Broadcom slash Arista type networking NICs. And their internal AI chip is just okay.
但重点是,他们现在正觉醒并开始真正获取业务。明白吗?核心论点是自那份报告发布后,AWS收入增速持续放缓,同比增速连年下滑。而我们的大胆预测是:这种趋势即将逆转回升。
But the main thing is that they're now waking up and being able to actually capture business. Right? So the main call here is that since that report, AWS has been decelerating. Year-on-year revenue growth has been falling consistently. And our big call is that it's actually going to start reaccelerating.
对吧?这得益于Anthropic的合作,也得益于我们在数据中心的全方位布局——我们实时追踪每个上线数据中心的配置、成本流向(包括芯片成本、网络成本、电力成本),掌握这些设备的常规利润率后,就能初步估算收入。综合这些数据,我们可以明确断言:AWS收入增长将在本季度触底。
Right? And that's because of Anthropic, and it's because of all the work we do on data centers, right? Tracking every single data center, when it goes online and what's in there, the flow-through on costs. If you know how much the chips cost, the networking cost, the power cost, and you know what margins generally are for these things, then you can sort of start estimating revenue. So when we build all that up, it's very clear to us that AWS revenue growth troughs this quarter. Right?
这将是未来至少一年内AWS同比收入增长的最低点。而随着搭载Trainium芯片和GPU的超大规模数据中心投入使用,增速将重新突破20%——具体取决于采用哪种配置方案。
This is the lowest AWS revenue growth will be on a year-over-year basis for at least the next year. Right? And it's reaccelerating to north of 20% again because of all these massive data centers they have coming online with Trainium and GPUs. Right? Depends on which one.
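The bottom-up estimation being described can be sketched as a toy model; every input below is a hypothetical placeholder, not a SemiAnalysis figure:

```python
# Toy bottom-up model: cost out a site's hardware and power, then gross
# it up by an assumed margin to imply the revenue it should generate.
def implied_annual_revenue(chip_cost: float, networking_cost: float,
                           power_cost: float, gross_margin: float) -> float:
    total_cost = chip_cost + networking_cost + power_cost
    return total_cost / (1.0 - gross_margin)

# Hypothetical site: $8B chips, $1B networking, $0.5B power, 30% margin.
revenue = implied_annual_revenue(8e9, 1e9, 0.5e9, 0.30)
```

Summing estimates like this across every tracked site, quarter by quarter, is what lets a trough and reacceleration be called ahead of reported numbers.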
实际效果因客户而异,体验可能不如CoreWeave等竞争对手。但当前决胜关键是产能——CoreWeave部署能力有限,数据中心容量受制约(尽管他们的建设速度极快)。而全球数据中心容量最大的企业,虽然未来两年可能被超越,但目前仍是亚马逊。
It depends on which customer. The experience is not as good as, say, a CoreWeave or whatever, but the name of the game is capacity today. CoreWeave can only deploy so much. They can only get so much data center capacity, and they're really fast at building. But the company with the most data center capacity in the world today, although they may get passed up in the next two years, is Amazon.
根据我们的观察,亚马逊确实即将失去这个宝座。但就增量而言,在未来一年内,亚马逊仍拥有最多可转化为AI营收的闲置数据中心容量。
Actually, they will get passed up based on what we see, but incrementally, Amazon still has the most spare data center capacity that's going to ramp into AI revenue over the next year.
让我问一个问题。这是正确的数据中心容量类型吗?比如,对于当今高密度AI构建,你需要大量的冷却系统。附近需要有充足的水源和电力供应。
Let me ask one question. Is that the right type of data center capacity? Like, for the high-density AI build-outs today, you need massively more cooling. You need to have enough water close by. You need to have enough power close by.
这是否是合适的地点,或者说这是否是错误类型的容量?
Is it the right place, or is it the wrong type of capacity?
所以从这种意义上讲的数据中心容量,我指的是从确保电力供应到建设变电站、变压器,再到能为机架提供电力连接。显然,数据中心容量会有所不同。对吧?实际上,亚马逊拥有全球最高密度的数据中心。
So data center capacity, in this sense, I mean everything from power secured, to substations built, to transformers, to being able to provide the power whips to the racks. Now, obviously, the data center capacity will differ. Right? You know, historically, Amazon's actually had the highest-density data centers in the world. Right?
当其他公司还在使用12千瓦机架时,亚马逊已经用上了40千瓦的。如果你曾走进大多数数据中心,它们通常凉爽干燥。但走进亚马逊数据中心,感觉就像沼泽地,就像我长大的地方。
They went to, like, 40-kilowatt racks when everyone was still at 12. And if you've ever stepped foot inside most data centers, they're, like, pretty cool and dry-ish. If you step inside an Amazon data center, it feels like a swamp. It feels like where I grew up. Right?
那里又湿又热,因为他们正在优化每一个百分点。所以你的观点是亚马逊的数据中心不适合新型基础设施,但与GPU成本相比,建立复杂的冷却系统是可以接受的。几个月前我们以90美元的价格推荐了Astera Labs,之后由于亚马逊的订单,股价涨到了250美元。但亚马逊的基础设施有其特殊性。
It's, like, humid and hot, because they're optimizing every percentage point. And so sort of your point here is that Amazon's data centers aren't equipped for the new type of infrastructure, but when you compare it to the cost of the GPUs, having a complex cooling arrangement is fine. Right. You know, we made a call on Astera Labs a couple months ago, when they were at, like, 90, and it's gone to $250 in the months after because of the orders Amazon is placing with them. But there are certain things with Amazon's infrastructure.
我不想深入讨论,但他们的机架基础设施需要使用更多Astera Labs的连接产品。冷却系统也是如此。在网络和冷却方面,他们必须使用更多这类设备。
I won't get too much into it, but their rack infrastructure requires them using a lot more of, like, Astera Labs connectivity products. And the same applies to cooling. Right? So it's on the networking and cooling side. They just have to use a lot more of this stuff.
但再次强调,与GPU相比,这些设备的成本微不足道。
But, again, this stuff is inconsequential on cost compared to the GPU.
你们可以建造,对吧?而我的问题更像是,听着,现阶段我可能需要一条大河在附近用于冷却,对吧?
You can build. Right? And my question was more like, look. I may need a major river close by for cooling at this point. Right?
在很多地区,我就是无法获得足够的水源。而且,你知道,同一区域可能还有电力供应。
It's in many areas, I just can't get enough water. And, you know, there's probably power in the same region.
吉瓦级规模的站点,他们已确保电力供应,湿式冷却塔和干式冷却塔都已到位。就像,一切都没问题。只是效率没那么高,但你知道,这没关系,对吧?他们即将增加收入。
These are gigawatt-scale sites where they have power all secured, wet chillers and dry chillers all secured. Like, everything's fine. It's just not as efficient, but, you know, that's fine. Right? Like, you know, they're gonna ramp the revenue.
他们会增加收入。并不是说亚马逊的内部模型一定会很棒,或者他们的内部方案比英伟达的更好,或能与TPU竞争,又或者他们的硬件架构是最佳的。我不一定这么认为。但他们能建很多数据中心,并塞满可供出租的设备,对吧?这其实挺简单的,是的。
They're gonna add the revenue. It's not that I necessarily think Amazon's internal models are gonna be great, or that their internal chip is better than NVIDIA's or competitive with TPU, or that their hardware architecture is the best. I don't necessarily think that's the case. But they can build a lot of data centers, and they can fill them up with stuff that will be rented out. Right? And it's a pretty simple Yeah.
简单,这是个相当简单的论点。
Simple. It's a pretty simple thesis.
Anthropic对Trainium的协同设计有多重要?因为我记得我们有一家投资组合公司。那是2023年夏天。他们邀请他们到AWS,花了大概一周时间,总共八小时试图搞懂当时的Trainium。
How important has Anthropic been to the co-design for Trainium? Because I remember we had a portfolio company. This was summer 2023. They invited them to AWS. They spent, man, I think eight hours with them over the course of a week trying to figure out Trainium back then.
那时候简直没法操作。显然,那家投资组合公司现在还没回去尝试,但根据你听到的,现在情况有多大不同?
It was just impossible to work through. Obviously, that portfolio company hasn't gone back and tried it now, but how different is it now, based on what you're hearing?
哦,还是很糟糕。好吧。好吧。不错。你知道,这东西用起来挺费劲的。
Oh, it's still bad. Okay. Okay. Good. You know, it's it's tough to use.
所以这有点像,每个推理公司都会提出这种论点对吧?包括那些AI硬件初创公司——因为我最多只运行三四种模型,完全可以手动优化所有东西,为每个环节编写内核,甚至深入到汇编级别。能有多难呢?实际上确实相当困难。
So this is sort of the argument that every inference company offers, right? Including the AI hardware startups: because I'm only running, like, three different models at most, I can just hand-optimize everything and write kernels for everything, and even, like, go down to an assembly level. Right? How hard can it be? It is pretty hard.
确实相当困难。但生产环境中的推理本来就需要这么做。你不会使用NVIDIA的cuDNN这类易用库——那个能超级简单地生成内核之类的工具。明白吗?在实际操作中,你不会用这些便捷库的。
It is pretty hard. But you tend to do this for production inference anyways. Like, you aren't using cuDNN, which is NVIDIA's, like, super-easy-to-use library. Right? Anyways, you're not using these ease-of-use libraries.
运行推理时,你要么用CUTLASS,要么自己编写PTX代码,有些团队甚至深入到SASS层面。看看OpenAI或Anthropic在GPU上跑推理时就是这么做的。一旦深入到那个层级,整个生态其实并不那么理想。
You know, when you're running inference, you're either using CUTLASS or stamping out your own PTX, or, in some cases, people are even going down to the SASS level. Right? And when you look at, say, an OpenAI or an Anthropic, when they run inference on GPUs, they're doing this. Right? And the ecosystem is not that amazing once you get all the way down to that level.
使用NVIDIA GPU并不像想象中那么简单。虽然你对硬件架构有直观理解,毕竟长期接触,大家也都熟悉可以互相交流。但说到底并不轻松。而Anthropic训练更多用TPU。
It's not like using NVIDIA GPUs is easy now. I mean, you have an intuitive understanding of the hardware architecture because you work on it so much and everyone's worked on it and you can talk to other people. But at the end of the day, it's not, like, easy. Right? Whereas, you know, Anthropic trains more on TPUs.
实际上TPU硬件架构比GPU更简单些,核心更大更精简,功能没那么通用,编程反而容易些。Anthropic员工发推说过,做底层开发时他们更喜欢用TPU,就因为架构简单。真的吗?
Actually, the hardware architecture is a little bit simpler than a GPU's: larger, simpler cores rather than all this functionality, less general. So it's a little bit easier to code on. There are tweets from Anthropic people saying that when they're doing that low-level work, they actually prefer working on Trainium and TPU because of the simplicity. Really?
不,有意思。
No. Interesting.
明确地说,Trainium和TPU,尤其是Trainium,非常难用。是的,不适合胆小的人。它非常困难,但如果你只是运行,比如我是Anthropic公司,必须只运行Claude 4.1 Opus或Sonnet,那还是可以做到的。管它呢。
To be clear, Trainium and TPU, I mean, Trainium especially, is very hard to use. Yeah. Like, not for the faint of heart. It's very difficult, but you can do it if you're just running, like, if I'm Anthropic and I must only run Claude 4.1 Opus or Sonnet. And screw it.
我甚至不会运行Haiku。我只会在GPU或其他设备上运行Haiku。对吧?我只打算运行两个模型。实际上,管它呢。
I won't even run Haiku. I'll just run Haiku on, like, GPUs or whatever. Right? I'm just gonna run two models. And actually, screw it.
我也只会在GPU以及Trainium和TPU上运行Opus。反正Sonnet占了我大部分流量。我可以花这个时间。而且我多久才改变一次架构?每——对。
I'm just gonna run Opus on GPUs too, and on Trainium and TPUs. Sonnet is the majority of my traffic anyways. I could spend the time. And how often am I changing that architecture? Every Right.
四到六个月。对吧?
Four or six months. Right?
老实说,变化其实并不大。对吧?
Like, how much It's it's not even changing that much, honestly. Right?
我认为从三代到四代确实有变化。对吧?
And I think I think from three to four definitely did change. Right?
是的。我是说,定义架构变化。你知道,从高层次来看,过去几代的基本构件大致相同。
Yeah. I mean, define architectural change. You know, at a high level, like the primitives are more or less the same across the last couple of generations.
老实说,我对Anthropic的模型架构了解不够多,但从我在其他地方看到的情况来看,已经发生了足够多的变化,这需要时间来,你知道,编程实现,真正关键的是,比如,如果我是Anthropic,现在有70亿美元的年度经常性收入(ARR),或者到明年年底超过这个数字,对吧?ARR可能甚至达到300亿,而且我的利润率是50%到70%。那就是150亿美元的培训费用。对吧?这些可以在Sonnet上运行。
I don't know enough about Anthropic's model architecture, to be honest, but from what I've seen at other places, there have been enough changes that it takes time to, you know, program this. And really, the main thing is, like, you know, if I'm Anthropic and I have, what, $7,000,000,000 of ARR now, or north of that by the end of next year, right? Like, ARR is maybe even 30. And my margins are 50%, 70%. That's $15,000,000,000 of training spend that I need. Right? That can run on Sonnet.
其中大部分将使用Sonnet三、五,或者抱歉,四、五,不管具体是哪个版本。对吧?会有一个模型服务于大多数用例。所以,你知道,我可以花时间,它能在这些硬件上运行。
And most of that's gonna be Sonnet 3.5, or sorry, 4.5, whatever it is. Right? It's gonna be one model serving most of the use cases. So, like, you know, I could spend the time, and it'll work on this hardware.
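The back-of-envelope behind the $15,000,000,000 figure, using the round numbers from the conversation (roughly $30,000,000,000 of ARR and the low end of the 50 to 70% margin range):

```python
# Round numbers from the conversation, not company guidance.
arr = 30e9           # ~$30,000,000,000 ARR
gross_margin = 0.50  # low end of the 50-70% range mentioned

# What's left after gross margin is the budget for training and serving.
compute_budget = arr * (1.0 - gross_margin)
print(f"${compute_budget / 1e9:.0f}B")  # prints "$15B"
```

At the 70% end of the range, the same arithmetic leaves about $9,000,000,000 instead, so the quoted figure implicitly assumes the lower margin.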
是的,完全同意。也许关于你做出的非共识性决策的话题,我可能会转到另一个云服务。六月时,你们说Oracle正在赢得AI计算市场。然后在这个播客中,我们已经提到了Oracle的巨大飞跃。
Yeah. Totally. Maybe on the topic of nonconsensus calls you've made, I'll move to another cloud. In June, you guys said that Oracle is winning the AI compute market. And in this pod, we've already referenced the big jump, obviously, that Oracle had.
我认为这是市值超过5000亿美元的公司有史以来最大的单日涨幅。所以,而且——
I think it was the single largest gain that a company with over $500,000,000,000 of market cap has ever had. So
2023年第一季度的NVIDIA涨幅没那么大吗?可能可能更小。好吧。
Was the Q1 2023 NVIDIA one not bigger? It might have been, might have been smaller. Okay.
我觉得可能接近。我们会自我核实一下。这是——
I think it was maybe close. We'll we'll fact check ourselves. This is
因为这太不可思议了。
because it's amazing.
但是,你知道,显然,这是一个宣布的重大承诺。你能带我们回顾一下当时为何做出这个决定,以及为什么甲骨文在如此竞争激烈的领域中能表现得如此出色吗?
But, you know, obviously, this is the massive commitment that was announced. Can you walk us through why you made that call then, and why Oracle's poised to do so well in such a competitive space?
是的。甲骨文拥有行业内最大的资产负债表,而且他们对任何硬件类型都不固执己见。对吧?他们对网络类型也不固执。他们会与Arista一起部署以太网。
Yeah. So Oracle, they have the largest balance sheet in the industry that is not dogmatic about any type of hardware. Right? They're not dogmatic about any type of networking. They will deploy Ethernet with Arista.
他们会通过自己的白盒设备部署以太网,也会部署NVIDIA的网络技术,无论是InfiniBand还是Spectrum X,而且他们拥有非常优秀的网络工程师。他们的软件整体上也非常出色,对吧?比如ClusterMax,他们曾是ClusterMax金牌,因为他们的软件很棒。
They'll deploy ethernet through their own white boxes. They'll deploy NVIDIA networking, InfiniBand or Spectrum X, and they have really good network engineers. They have really great software across the board, right? Again, like ClusterMax. They were ClusterMax Gold because their software is great.
他们需要添加一些东西来提升自己,现在正在将这些升级到白金级别,对吧?这正是CoreWeave所处的位置。所以,当你把两件事结合起来看,比如OpenAI有巨大的计算需求,而微软则显得相当保守。
There were a couple of things that they needed to add that would take them higher, and they're adding those, right, to get to Platinum, right? Which is where CoreWeave was. And so you couple two things, right? Like, OpenAI has insane compute demand, and Microsoft is quite pansy.
他们不愿意投资,因为他们不相信OpenAI真的能支付那么多钱。对吧?我之前提到过。对吧?对吧。
They're not willing to invest because they don't believe OpenAI can actually pay the amount of money. Right? I mentioned earlier. Right? Right.
那个3000亿美元的交易。是的,OpenAI,你没有3000亿美元,而甲骨文愿意下这个赌注。当然,这个赌注有点像是有一定保障的,因为甲骨文真正需要确保的只是数据中心容量。对吧?所以,这就是我们如何看待这个赌注的,对吧?
The $300,000,000,000 deal Yeah. OpenAI, you don't have $300,000,000,000, and Oracle's willing to take the bet. Now, of course, the bet is a bit like there's a bit more security in the bet in that Oracle really only needs to secure the data center capacity. Right? So, so this is sort of like how we, how we came across the bet, right?
我们一直在向我们的机构客户详细说明这一点,无论是超大规模企业、AI实验室、其他公司,还是我们数据中心模型的投资者,因为我们正在追踪全球每一个数据中心。甲骨文也不自己建造数据中心,对吧?顺便说一句,他们是从其他公司获取的。他们共同设计,但不亲自建造。因此,他们在评估新数据中心和设计方面非常灵活。
And we've been telling our institutional clients, especially in like a super detailed way, whether it be the hyperscalers or AI labs or some of your different companies or, you know, investors in our data center model because we're tracking every single data center in the world. Oracle doesn't build their own data centers either, right? By the way, they get them from other companies. They co engineer, but they don't physically build them themselves. And so they're quite nimble in terms of like being able to assess new data centers, engineer them.
所以我们看到甲骨文在深入讨论中抢占、签约等各种数据中心。我们了解到,这里一个吉瓦,那里一个吉瓦,对吧?比如阿比林,两个吉瓦,明白吗?他们正在签约讨论所有这些不同的站点,我们都在记录。然后我们还有时间线,因为我们在追踪整个供应链。
So we saw all these different data centers Oracle is snatching up in deep discussions, snatching up, signing, etcetera. And so we have, you know, hey, a gigawatt here, a gigawatt there, right? Abilene, you know, two gigawatts, right? You have all these different sites that they're signing or in discussions on, and we're noting them. And then we have the timeline because we're tracking the entire supply chain.
我们追踪所有许可证、监管文件,通过语言模型不断使用卫星照片,以及冷却设备、变压器设备、发电机等的供应链。我们能够相当准确地按季度估算每个数据中心站点的电力供应。对吧?有些我们知道的地点甚至要到2027年才会启动,但我们知道甲骨文已经签约了。对吧?
We're tracking all the permits, regulatory filings, you know, through language models, using satellite photos constantly, and then supply chain of, like, chillers, transformer equipment, generators, etcetera. We're able to make a pretty strong estimate in our data center model, quarter by quarter, of how much power there is for each of these sites. Right? So some of these sites that we know of aren't even ramping until 2027, but we know that Oracle signed it. Right?
我们掌握了大致的启动路径。于是问题就变成了,假设你有一个兆瓦,对吧?为了简单起见,虽然一兆瓦是很大的电力,但现在感觉不算什么了。
And we have the sort of ramp path. So then it's this question of, like, okay, let's say you have a megawatt. Right? For simplicity's sake. Which is a ton of power, but now it doesn't feel like much.
我们现在讨论的是吉瓦级别的。是的。但如果你说一个兆瓦,对吧,用GPU填满它。一个兆瓦的GPU要花多少钱?
It's you know, we're on to gigawatts now. Yep. But, you know, if you talk about a megawatt, right, you fill it up with GPUs. How much do the GPUs for a megawatt cost?
对吧?或者实际上,算起来更简单。对吧?如果我说的是一台GB200。对吧?
Right? Or actually, it's even simpler to do the math. Right? If I'm talking about a GB200. Right?
每个单独的GPU是1200瓦。但当你考虑CPU和整个系统时,大约是2000瓦。同时,简单来说,每个GPU大约5万美元,对吧?GPU本身不花那么多钱,还有各种外围设备,对吧?
Each individual GPU is 1,200 watts. But when you talk about the CPU, the whole system, it's roughly 2,000 watts. At the same time, you know, all-in, for simplicity's sake, $50,000 per GPU, right? And the GPU itself doesn't cost that much. There's all the peripheries, right?
所以2000瓦的资本支出是5万美元。那么1000瓦就是2.5万美元。然后GPU的租赁价格是多少?如果是长期大批量协议,大约2.6到2.7美元,那么最终每兆瓦的租赁成本大约是1200万美元。是的。
So $50,000 CapEx for 2,000 Watts. So $25,000 for 1,000 Watts. And then what's the rental price for GPU? If you're on a really long term deal volume, $2.70, right, $2.60 in that range, then you end up with, oh, it costs like $12,000,000 per megawatt to rent a megawatt. Yeah.
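The rental math above can be checked in a few lines. This is an editor's sketch using the rounded figures from the conversation (~2,000 W and ~$50,000 all-in per GPU, ~$2.60 per GPU-hour on a long-term volume deal); actual deals vary.

```python
# Back-of-envelope sketch of the megawatt rental math described above.
# All constants are the rounded figures quoted in the conversation.
WATTS_PER_GPU_ALL_IN = 2_000   # GPU + CPU + networking + peripherals
CAPEX_PER_GPU = 50_000         # dollars, all-in
RENTAL_PER_GPU_HOUR = 2.60     # dollars, long-term volume deal
HOURS_PER_YEAR = 8_760

# How many GPUs fit in one megawatt of IT power? (500)
gpus_per_megawatt = 1_000_000 // WATTS_PER_GPU_ALL_IN

# Capex to fill that megawatt ($25M, i.e. $25k per kW as stated),
# and the yearly rental it generates (roughly the $12M/MW figure quoted).
capex_per_megawatt = gpus_per_megawatt * CAPEX_PER_GPU
rent_per_megawatt_year = gpus_per_megawatt * RENTAL_PER_GPU_HOUR * HOURS_PER_YEAR

print(f"{gpus_per_megawatt} GPUs/MW, ${capex_per_megawatt / 1e6:.0f}M capex, "
      f"${rent_per_megawatt_year / 1e6:.1f}M/yr rental")
```

At $2.70/hour instead of $2.60 the yearly figure lands closer to the ~$12M per megawatt quoted in the conversation.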
然后每个芯片都不同。所以我们追踪每块芯片的资本支出、网络配置。这样你就知道每块芯片的情况。可以预测他们会在哪些数据中心部署哪些芯片,数据中心何时上线,每季度消耗多少兆瓦电力。最终得出‘星际之门’将在这一时间段投入运营。
And then you've and then each chip is different. So we track each chip, what the CapEx is, what the networking is. So you know what each chip is. You can predict what each you know, what chips they're putting in which data centers, when those data centers go online, how many megawatts by quarter. And then you end up with, oh, well, Stargate goes online in this time period.
他们将在这个时间开始租用。每个星际之门站点有这么多芯片。对吧?因此,OpenAI需要支付这么多租金。
They're gonna start renting it this time. It's this many chips. Each Stargate site. Right? And so therefore, this is how much OpenAI would have to spend to rent it.
接着你细化这些数据,我们就能以相当高的确定性预测甲骨文的收入,对其2025、2026、2027年的公告预测几乎完全吻合。2028年也相当接近。让我们意外的是,他们宣布了2028、2029年的一些数据中心计划,这些我们尚未发现,但总会找到的。对吧?当然。
And then you project that out, and we were able to predict Oracle's revenue with pretty high certainty, and we matched pretty dead on what they announced for '25, '26, '27. And we were pretty close on '28. The surprise for us was that they announced some stuff for '28, '29 data centers that we haven't found yet, but we'll find them. Right? Of course.
这种方法论能让你看清:究竟获得了哪些数据中心?多少电力?签了哪些合约?
And sort of like this methodology lets you see Yeah. Sort of, hey. What data centers are you getting? How much power? What are they signing?
这些设施上线时能带来多少增量收入?这就是我们投资甲骨文的基础逻辑。显然简报里细节更简略,但核心论点就是:他们手握大量容量,即将签下这些合约。
How much incremental revenue that is when that comes online? And so that's sort of the basis of our Oracle bet. Obviously, the newsletter, we included a lot less detail, but, know, you know, sort of it was it was that thesis, right? That like, hey, they have all this capacity. They're going to sign these deals.
我们在简报中主要讨论两点:OpenAI业务和字节跳动业务。预计周五会有关于TikTok的公告——字节跳动业务方面,甲骨文也将出租大量数据中心容量给他们。我们采用相同方法分析:字节跳动盈利稳定付款可靠,OpenAI则不然。
And in our newsletter, we talked about two main things. We talked about the OpenAI business, and then we talked about the ByteDance business and presumably tomorrow on Friday, there's going to be an announcement about TikTok and all this. But, like, the ByteDance business, you know, huge amounts of data center capacity that Oracle is also going to lease out to ByteDance, right? And so we've done the same methodology there. You know, with ByteDance, it's pretty certain they'll pay because they're a profitable company. With OpenAI, it's not.
因此远期预测需要误差区间:OpenAI能否撑到2028、2029、2030年?能否支付与甲骨文签约的每年800多亿美元?这是唯一风险。即便发生意外,甲骨文也有保障,因为他们只签约了数据中心(成本占小头)。对吧?
And so there's gotta be some, like, error bars as you go further out in terms of, like, will OpenAI exist in '28, '29, '30, and will they be able to pay the 80 plus billion dollars a year that they've signed up for with Oracle? Right? That's the only, like, risk here. And if that happens, then Oracle's downside is also somewhat protected because they only signed the data center, which is a minority of the cost. Right?
GPU就是一切,他们在开始租赁前一两个季度就会采购。所以对他们来说,即使交易没谈成,下行风险也相当低。拿不到交易就拿不到收入,但至少不会囤积一堆变得毫无价值的资产。对,确实如此。
The GPUs are everything, and the GPUs they purchase one to two quarters before they start renting them. So the downside risk is pretty low for them if they don't get the deal. Well, they don't get the revenue, but it's not like they're stuck with a bunch of assets they bought that are worthless. Yeah. Yeah.
这里还有其他角度吗?
Is there another angle here?
我是说,OpenAI和微软的关系逐渐冷淡,现在他们等于递交了离婚文件,只想多元化发展,这就促使他们转向其他供应商。
I mean, OpenAI and Microsoft, the relationship wears off, and now they've filed the divorce papers, and they just wanna diversify, and that's pushing them away towards other providers.
没错。微软曾是独家计算服务提供商,后来重组获得了优先拒绝权。结果微软真的拒绝了——现在它成了你们的
Yeah. So Microsoft was the exclusive compute provider. It got reorged to a right of first refusal, you know, and then Microsoft refused to do it Was it now your
末选方案还是什么来着?
last choice or something like that?
不,优先拒绝权仍然有效。但这两者并不互斥。比如OpenAI说'我们要签800亿或3000亿美元的五年合约,你们接不接',微软拒绝后,他们转头就去找甲骨文了。
No. It's still right of first refusal, but it's like Microsoft Those two are not mutually exclusive. Well, if OpenAI is like, we're gonna sign an $80,000,000,000 contract or a $300,000,000,000 contract for the next five years, do you guys want it? And they're like, no. Okay, cool. And then they go to Oracle.
OpenAI需要有个雄厚资产负债表的企业来承担费用。虽然他们最终能通过计算资源和基础设施等边际利润赚大钱,但前期必须有人兜底——OpenAI自己没有这个财力,甲骨文有。不过考虑到合约规模,我们还有个消息源透露他们当时也在接触债务市场,对吧?
And OpenAI sort of OpenAI needs someone with a balance sheet to actually be able to pay for it. And then they'll make tons of money off of OpenAI on the margin, on the compute, and the infra, and all these things. But someone's got to have a balance sheet, and OpenAI doesn't have a balance sheet. Oracle does. Although given the scale of what they signed, we also had another source of information, which was that they were talking to debt markets, right?
因为甲骨文实际上只需要通过举债来分期支付这么多GPU的费用。他们不会立即这么做。比如,他们可以用自有资金支付今年和明年的全部费用。但到了2027、2028、2029年,他们将不得不开始用债务来支付这些GPU,这正是CoreWeave和许多新兴云服务商的做法,其资金大部分来自债务融资。就连Meta也为他们在路易斯安那州的巨型数据中心举债。
Because Oracle actually just needs to raise debt to pay for this many GPUs over time. Now they won't do it, like, immediately. Like, they can pay for everything this year and next year from their own cash. But, like, in '27, '28, '29, they'll start to have to use debt to pay for these GPUs, which is what, you know, CoreWeave has done and many of the neoclouds most of it is debt financed. Even Meta went and got debt for their Louisiana mega data center.
不是因为别的,纯粹从财务角度看,用现金回购股票并通过举债融资更划算,因为债务成本低于股票回报率。这就像一种财务工程手段,你知道的,现在还有谁在这么做?可能是亚马逊,可能是谷歌,也可能是微软。
Not because they need to, just because it's literally better on a financial basis to do buybacks with your cash and take on debt, because the debt is cheaper than the return on your stock. It's like a financial engineering thing. So, like, who's out there, right? It could be Amazon. It could be Google. It could be Microsoft.
这个名单非常短。或者可能是甲骨文或Meta,对吧?Meta显然已经退出。微软退缩了。剩下的就是亚马逊、谷歌和甲骨文了。
It's a very short list. Or it could be Oracle or Meta, right? Meta's obviously not. Microsoft's chickened out. Amazon, Google, and Oracle, right?
就剩这些了。谷歌会显得格格不入。所以
That's all that's left. Google will be an awkward fit. So
完全同意。是的,谷歌确实会显得格格不入。亚马逊则非常合适,你知道的,就是这样。对吧?就像是
absolutely. Yeah, Google would be an awkward fit. Amazon would be a fine fit, you know, exactly. Right? It's like,
这简直是虚拟版的瓶颈。好吧,我想,既然谈到这些巨型数据中心的建设,你们刚发布了关于xAI和Colossus二代的文章。你们对这些在六个月内建成庞然大物的壮举是越来越不以为然了,还是仍然觉得非常震撼?
it's a virtual bottleneck. Yeah. Well, I guess maybe, you know, on the topic of these giant data center build outs, you guys just released a piece on xAI and Colossus 2. Are you getting less impressed by these feats of building something this massive in six months, or is it still very impressive to you guys?
你知道,这就是我对AI研究者的评价——他们是第一批以数量级尺度思考问题的人类。是的。而自从工业革命以来,人们总是以百分比增长来思考问题。在那之前,则完全是用绝对数值。对吧?
You know, this is the, like, thing I've said about AI researchers is that they're, like, the first class of humans to think about things on an order of magnitude scale. Yeah. Whereas, like, people have always thought about things in terms of, like, percentage growth, like, ever since industrialization. And before that, it was just, absolute numbers. Right?
是啊。你看,某种程度上,人类的思维方式正在进化,因为事物变化得更快了。一切都是对数级的。所以,当GPT-2用那么多芯片训练时,真的很令人印象深刻,然后GPT-3……哦抱歉,是GPT-4,用了2万块A100训练。
Yeah. You know, sort of, like, humanity is evolving in terms of how we think because things are changing faster. Everything is log scale. And so, like, you know, it was, like, really impressive when GPT-2 was trained on so many chips, and then GPT-3 was trained on, you know or sorry, GPT-4 on 20k A100s.
GPT,你知道,简直让人惊呼'天啊'。接着就进入了10万GPU集群的时代对吧?我们还做过关于10万GPU集群的报道。但现在已经有100万GPU集群了,没错。
GPT, you know, sort of like it's like, holy crap. And then it was like, oh, the era of 100k GPU clusters. Right? And we did some reports around 100k GPU clusters. But now there's, like, 1,000k GPU clusters Exactly.
全球范围内。我当时想,好吧,这有点无聊了。但10万GPU集群意味着超过100兆瓦的功耗。现在呢?我们在Slack和一些频道里讨论的都是'又发现了一个200兆瓦的数据中心'。
In the world. I was like, okay. That's kind of boring. But, like, 100k GPUs is, you know, over a 100 megawatts. Now, literally, you know, in our Slack and in some of these channels, it's like, oh, we found another 200 megawatt data center.
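The "100k GPUs is over 100 megawatts" claim is easy to sanity-check. The ~1,400 W all-in figure per H100-class GPU below is an editor's assumption (roughly 700 W for the GPU itself plus host, networking, and cooling overhead), not a number from the episode.

```python
# Sanity check: does a 100k-GPU cluster exceed 100 MW?
gpu_count = 100_000
watts_per_gpu_all_in = 1_400  # assumed: H100-class GPU + host/network/cooling

total_megawatts = gpu_count * watts_per_gpu_all_in / 1e6
print(f"{total_megawatts:.0f} MW")  # 140 MW, comfortably over 100 MW
```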
每次有人发这种消息,总有个家伙会发打哈欠的表情。我就想说:老兄,现在只有吉瓦级别的项目才值得兴奋了,这很棒。
There there there's a there's someone who, like, puts the yawning emoji every time. And I'm like, dude, what like, now it's only it's only exciting if you do gigawatt scale. It's great
吉瓦时代。对啊。没错。
gigawatt era. Yeah. Yeah. Yeah.
而且我确信——其实也不确定——可能连这个规模我们很快也会觉得无聊。但这种对数级的增长...资本数字简直疯狂,对吧?
And and I'm sure, like, you know, and you know, I'm not sure. Maybe maybe we'll start yawning to that too. But, like, you know, the log scale of this is like Yeah. The capital numbers are crazy. Right?
就像OpenAI进行1亿美元的训练已经够疯狂了,然后是10亿美元的训练,现在我们在谈论100亿美元级别的训练。我们竟然用对数尺度来思考,这本身就很疯狂。但确实,只有这种量级才够震撼。
Like, you know, it's crazy enough that OpenAI did, like, a $100,000,000 training run, and then they did a billion dollar training run. Now we're talking about $10,000,000,000 training runs. Right? Like, you know, it's crazy that we think in log scale. But, yes, things are only impressive Yeah.
当他们像埃隆那样操作时——就是埃隆在田纳西州孟菲斯做的事,第一次简直疯狂对吧?六个月内搞定了10万块GPU。他大概在2024年2月买下工厂,半年内就让模型开始训练了。对吧?
When they do it like what Elon's doing in Tennessee, in Memphis the first time was crazy. Right? 100k GPUs in six months. He bought a factory in, like, February '24 and had models training within six months. Right?
他还搞了液冷系统,要知道这是首个大规模AI数据中心采用这种级别的液冷方案。各种疯狂创举——在室外部署卡特彼勒(Cat)燃气轮机发电机,用移动变电站等各种野路子搞电力,甚至接上了工厂旁边的天然气管道。没错,他就这么干了,简直离谱。而且是为10万块GPU搞的。
And he did liquid cooling, you know, first large scale data center with liquid cooling at this scale for AI, all these sorts of crazy firsts, putting generators outside, like Cat turbines, all these different things to get the power, you know, mobile substations, all these different crazy things, tapping the natural gas line that's, like, running alongside the factory, all these yeah. So he does this. It's like, holy crap. And he did it for 100k GPUs. Right.
想想看,200到300兆瓦的规模。现在他要搞千兆瓦级别了,速度还一样快。对吧?
You know, 200, 300 megawatts. Right? Now he's doing it for a gigawatt scale, and he's doing it just as fast. Right? And and so, like Yeah.
按理说这次应该更震撼才对。但可能我麻木了,就像给孩子喂太多糖果那样。
You would think, like, this is obviously way more impressive that he did it again. Yeah. But, like Fair. Maybe I'm desensitized, but, like, it's like, you know, like, you've you've given the child too much candy. Yeah.
对吧?
Right?
有道理。
That makes sense.
现在这孩子已经...怎么说,连苹果都不爱吃了。懂我意思吗?总之那个千兆瓦数据中心——他孟菲斯工厂周围还闹过抗议呢。
And now, like, the child has no you know, is, you know, doesn't like apples. Right? Like, I don't know. So so so, like, yeah, a gigawatt data center. There was all these protests around his Memphis facility.
人们会说,哦,你在破坏空气。但问题是,你们去过孟菲斯那片区域吗?那里有一座千兆瓦级的燃气轮机发电厂,专门为那片区域供电。还有一座污水处理厂,服务于整个孟菲斯市。更别提那些露天矿坑了。
People were like, oh, you're destroying the air. And it's like, have you looked around that area of Memphis? Like, there is, like, a gigawatt gas turbine plant that's just powering that general area. There's a sewage plant that's servicing the entire city of Minnesota or sorry, city of Memphis. And there's, like, open air mining pits.
那里到处都是各种恶心的东西,但这些都是必需的。明白吗?我们需要这些设施来维持国家运转。懂我意思吧?
Like, there's all sorts of disgusting shit around there, which is needed. Right? We need that stuff to have a country run. Right? Like, to be clear.
然后呢,人们却在抱怨区区几百兆瓦的发电量。结果他遭到了各种人的抗议,事情变得超级政治化。
And, you know, it's like people are complaining about, like, a couple hundred megawatts Yeah. of generation. So he got, like, protests from all sorts of people. It got super into the politics side of things.
连全国有色人种协进会(NAACP)都抗议他了。
NAACP even protested him.
哇哦。
Like Wow.
所以当地政府机构也开始表示不满,导致他无法在孟菲斯大展拳脚。但他仍需就近建设数据中心,以实现超近距离的高带宽连接,况且那里已有大量基础设施。于是他又买了个配送中心,仍在孟菲斯境内。但孟菲斯妙就妙在与密西西比州仅一河之隔。
And so, like, he really got some local municipalities being like, oh, I don't like this. And so he couldn't do as much as he wanted to in Memphis, but he still needed the data center to be close because he wanted to connect these data centers, super high bandwidth, super close. And he also already had a lot of infrastructure set up there. So he bought another distribution center at this time, and it's still in Memphis. But the cool thing about Memphis is it's right across the border from Mississippi.
明白吗?新址距原址约10英里,但离密西西比州边界只有1英里。他在密西西比买了座发电厂,正在那里安装涡轮机。对,就是这样。
Right? So now, you know, it's like 10 miles away from his original one, but his facility is like a mile away from Mississippi, and he bought a power plant in Mississippi. And he's putting turbines there. Yeah. Yeah.
监管规定完全不同,对吧?如果问题真的在于快速动员资源并迅速建设,也许埃隆确实领先于所有人。要知道,他还没有打造出最佳模型,或者说至少目前还没有,我认为。你可以说Grok四曾短暂成为最佳,但老实说,他能如此迅速地构建这些东西,真的令人惊叹。
The regulation is completely different. Right? And if the question is really, like, galvanizing resources and building it really fast, maybe Elon is ahead of everyone. You know, he hasn't made the best model yet, or he doesn't have the best model at least today, I think. You could argue Grok 4 was the best for a little period of time, but, you know, it's truly amazing how fast he's able to build these things.
而从第一性原理出发,大多数人会想,见鬼。他们会觉得我们无法在这里建设电力设施,无法解决供电问题。是啊,大概只能另寻新址了。
And from first principles, most people are like, fuck. You know, we can't build the power. We can't do power here anymore. Yeah. I guess we have to find a new site.
但其实完全不必。直接跨过州界去密西西比州就行。最有趣的是阿肯色州就在旁边,所以密西西比州会气得跳脚,懂我意思吗?
And it's like, no. Just go across the border. Go to Mississippi. And the but my favorite thing is, like, Arkansas is right there, so Mississippi gets mad. You know?
这个...我也不太清楚。
I I don't know. You know,
关于未来数据中心的监管规定,是否应该建在多州交界处?这是不是...
the regulations are different so future data centers, you know, get built in places where multiple states meet? Is that the
四州交界点。没错。
Four Corners. Yep.
最优选址方案...我想起来了!美国是否存在五州交界点?我知道有四州交界处,四个州的边界在那里交汇。对。
The optimal regulatory there you go, I think there's one. Is there a point in The US with five? I know there's a point with four. Four states intersect there. Yeah.
也许那里就是建数据中心的好地方。好吧。
Maybe that's the spot for a data center. Alright.
我打算在那个地区投资房地产,我要抢先一步。嗯,关于新硬件的话题,你们有篇分析GB200总拥有成本(TCO)的文章。我想代表我们的投资组合公司问个问题,听起来你们已经在协助他们了。其中一个让我觉得特别有趣的发现是,GB200的TCO大约是H100的1.66倍。
I'm gonna buy real estate in that area. I'll front run it. Well, I guess on the topic of just maybe new hardware, you had this piece analyzing TCO for the GB200s. And I'm kinda gonna ask this question on behalf of our portfolio companies, which it sounds like you're helping them already. But one of the findings that I thought was really interesting was TCO was sort of 1.66x H100s for GB200s.
所以很明显,这就引出一个问题:至少需要达到这样的性能提升基准,才能让转换硬件带来的性能成本比显得合理。能否谈谈你从性能角度观察到的现象,以及对于规模不如xAI的投资组合公司,在面临产能限制时,你会如何建议他们考虑新硬件的采购?
And so, obviously, you know, there's this point on, okay, that's sort of the benchmark for the performance boost that you're gonna need to at least make the sort of performance cost, ratio benefit, from switching over. Maybe just talk about what you've seen, from a performance standpoint, and what do you recommend to portfolio companies, maybe in a smaller scale than x AI who are, you know, thinking about new hardware, try to get it. There's capacity constraints, obviously.
是的。这确实是个挑战。每代GPU速度提升如此之大,让人总想换新款。某些指标下,GB200比前代快三倍甚至两倍。
Yeah. I mean, that's a challenge. Right? Is with each generation of GPU, it gets so much faster that you end up like you want the new one. And and and, you know, in some metrics, you could say GB 200 is three times faster than or two times faster than the prior generation.
其他指标可能显示差距更大。比如预训练和推理场景下的表现差异,对吧?
Other metrics, you can say it's way more than that. Right? So if you're doing pre training versus inference. Right?
它们能暂时运行所有任务,对吧?
They can run everything for a bit. Right?
对。如果能短期运行或仅用于推理,并利用NVLink 72的巨大带宽——你可以勉强说GB200只比H100快两倍,这样1.6倍的TCO还算划算。确实如此。
Yeah. If you could run it for a bit or just inference and take advantage of the huge NVLink NVL72 domain, there's ways you could squint and say GB200 is only two x faster than H100, in which case, at 1.6x TCO it's, you know, it's worthwhile. Right? It's Yeah.
值得升级到下一代,但边际效益更小了。确实更有限。是的,不是什么大问题。
Worth going to the next gen. But more marginal. It's more marginal. Yeah. It's not a big deal.
还有些情况是,比如在运行DeepSeek推理时,每块GPU的性能差异能达到六到七倍以上,而且针对DeepSeek推理还在持续优化。所以你看,虽然价格只贵了60%,性能却提升了六倍。
Then there's other cases where it's like, well, if you're running DeepSeek inference, the performance difference per GPU is, like, north of six, seven x, and it continues to optimize for DeepSeek inference. And so then it's like, well, I'm only paying 60% more for six x. You know, it's
没错。
like Right.
每美元能获得三到四倍的性能提升。绝对的,对吧?如果你在运行DeepSeek推理,这也可能包括强化学习。
It's a four x or three x performance per dollar gain. Like, absolutely. Right? And if you're running inference of DeepSeek, that can also include RL. Right?
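The perf-per-dollar figure follows directly from the two numbers quoted above. A minimal sketch, using the ~1.66x TCO ratio from the SemiAnalysis piece and the ~6x per-GPU speedup on the favorable workload:

```python
# Perf-per-dollar comparison sketch: GB200 vs H100 on a workload
# (like DeepSeek-style inference) where the per-GPU speedup is large.
tco_ratio = 1.66  # GB200 total cost of ownership relative to H100
speedup = 6.0     # per-GPU throughput gain on the favorable workload

perf_per_dollar = speedup / tco_ratio
print(f"{perf_per_dollar:.1f}x perf per dollar")  # ~3.6x, i.e. "three or four x"
```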
所以问题在于……另外还有个问题是,这些GPU是新的。比如有B200,也有GB200。从硬件角度看,B200更简单。
And so then the other question is like, well, the GPU is new. You know, there's a B200. There's a GB200. The B200 is much more simple from a hardware perspective.
它就是个八GPU的机箱。所以在推理性能上提升不大,但稳定性有保障。八GPU机箱不会不可靠。
It's just eight GPUs in a box. So then it's not as much of a performance gain, especially an inference, but you have you have all the stability. Right? It's an eight GPU box. It's not gonna be unreliable.
GB200系列仍存在可靠性问题,正在逐步解决,每天都在改善,但仍是挑战。而H100或H200的八GPU服务器,只要一块出故障,整台服务器就得下线维修。
The GB200s are still having some reliability challenges. Those are being worked through. It's getting better and better by the day, but it's still a challenge. But, you know, when you have an H100 box, right, or H200, eight GPUs, one of them fails, you take the entire server offline. You have to fix it.
对吧?通常来说,如果你的云服务表现良好,他们会直接替换进来。
Right? So usually, if your cloud's good, they'll swap it in.
嗯。
Mhmm.
对吧?但如果用的是GB200,当其中一个GPU故障时,72块GPU你该怎么办?难道整套拆掉换一套新的72块?故障的影响范围太大了。对吧?不行的。
Right? But if it's GB200, what do you now do with 72 GPUs if one fails? Do you break the whole thing and get a new 72? The blast radius of a failure. Right? Nope.
GPU的故障率最好情况下持平,很可能更糟。对吧?代际之间因为设备越来越热、越来越快等等。所以最好情况下故障率持平。即便假设故障率完全相同,从八分之一到七十二分之一,这也是个大问题。
GPU failure rates at best are the same and likely worse. Right? Gen on gen, because everything's getting hotter, faster, etcetera. So at best, the failure rates are the same. Even if you model the failure rates as the exact same, because you go from one out of eight to one out of 72, it's a huge problem.
所以现在很多人选择在64块GPU上运行高优先级任务,剩下8块运行低优先级任务。这就变成了,好吧,整个基础设施的挑战。我得区分高优先级和低优先级任务。当高优先级任务出现故障时,不是整机架下线,而是从低优先级任务抽调GPU补充,坏掉的GPU就放着等后续维护。
So now what a lot of people are doing is they run a high priority workload on 64 of them, and then the other eight, you run low priority workloads, which is then like, okay. This is this whole, like, infrastructure challenge. Like, I have to have high priority workloads. I have to have low priority workloads. When a high priority workload has a failure, instead of taking a whole rack offline, you just take some of the GPUs from the low priority one, put it in the high priority one, and then, like, you just let the dead GPU sit there until you service the rack at a later date.
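The blast-radius argument can be made concrete with basic probability. This is an editor's illustration: the per-GPU failure probability `p` below is hypothetical, not a measured rate, but it shows why a 72-GPU coherent domain gets hit far more often than an 8-GPU box at the same per-GPU reliability, and why the 64-plus-8-spares strategy helps.

```python
# With the same per-GPU failure probability p over some service window,
# the chance that a coherent domain sees at least one failure grows
# quickly with domain size.
p = 0.01  # hypothetical per-GPU failure probability over the window

def domain_hit_prob(n: int, p: float) -> float:
    """Probability that at least one of n GPUs fails in the window."""
    return 1 - (1 - p) ** n

hit_8 = domain_hit_prob(8, p)    # ~7.7% for an 8-GPU HGX box
hit_72 = domain_hit_prob(72, p)  # ~51.5% for a 72-GPU NVL rack

# The workaround described above: run the high-priority job on 64 GPUs
# and keep 8 on low-priority work as warm spares. On a failure, a spare
# is pulled into the high-priority pool, and the dead GPU just sits there
# until the rack is serviced, so the job only degrades once failures
# exceed the 8 spares.
print(f"8-GPU box: {hit_8:.1%}, 72-GPU rack: {hit_72:.1%}")
```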
结果就是,这些复杂的基础设施问题导致——等等,实际上预训练中三倍或两倍的性能提升?其实更低,因为停机时间更长了。嗯。/ 我没法始终利用所有GPU / 我也没能力——你知道,我不够聪明或缺乏基础设施来区分高低优先级任务。但这并非不可能。
And it's like, there's all these complicated infrastructure things that make it so oh, wait. Actually, that three x or two x performance increase in pre training Yeah. Is lower because the downtime is higher Mhmm. Slash I'm not using all the GPUs always, slash I'm not smart enough or I don't have the infra to, like, have low priority and high priority workloads. Like, it's not impossible.
是的。实验室正在这么做。对吧?就像,
Yeah. The labs are doing it. Right? Like, it's
我是说,如果我在运营一个云服务,那实际上非常困难。对吧?因为我可能得把备用GPU当作竞价实例之类的租出去?
I mean, if I'm running a cloud, it's actually really hard. Right? Because I probably have to rent the spares out as spot instances or something?
不,不是的。因为这是一个连贯的领域,用的是NVLink技术。你肯定不希望任何人去动它。
No. No. Because then because it's a it's a coherent domain. It's NVLink. You don't want anyone touching that.
所以必须由终端客户来操作。必须保持原状
So it has to be the end customer. Have to leave
把它们当作空置备用资源。那样更糟。
them as empty spares. That's even worse.
不。终端客户通常会说,我需要这些资源,然后服务等级协议(SLA)和定价都会把这些考虑进去。明白吗?所以一般来说,云服务都有SLA对吧?
No. The end customer usually would just be like, I want them, and I'll I will you know, and the the SLAs and the pricing, everything is, like, accounting for that. Right? So, like, generally, when you have a cloud, you have an SLA. Right?
就是说,听着。它会保持运行时间,比如99%的可用性,诸如此类。对吧?在这个期限内。
That is, hey. It's gonna be uptime. It's gonna be 99%, you know, blah blah blah. Right? For this period.
对于GB200来说,64块GPU的可用性是99%,不是72块。而72块的可用性是95%。不同云服务商的SLA都不一样,每家都有各自的条款。
With GB200, it's 99% for 64 GPUs, not 72. And then it's, like, 95% for 72. Now it differs across every cloud. Every cloud is a different SLA.
明白了。
Got it.
是的。但他们已经为此调整过了,因为你看,这硬件就是很挑剔。你还想要吗?要的。
Yeah. But, like, they've adjusted for this because they're like, look. This hardware is just finicky. Do you still want it? Yep.
要知道,我们会保证其中64个能一直正常工作,对吧?不是72个。所以这种挑剔的特性意味着终端客户必须能应对这种不可靠性。而且终端客户完全可以继续使用b200。
You know, we will credit you in that 64 of them will always work. Right? Not 72. And so there's this whole finicky nature, and the end customer has to be capable of dealing with the unreliability. And the end customer can just continue to use B200.
对吧?性能提升没那么大。你想要这72域的唯一原因就是为了获得某些增益。但你必须足够聪明才能做到。
Right? Performance gains aren't as much. The whole reason you want this 72 domain is so you can have, you know, some of these gains. Right. But you have to be smart enough to be able to do it.
这对小公司来说很有挑战性。完全同意。
And and that's challenging for small companies. Totally.
NVIDIA刚发布了Rubin预填充卡,叫CTX还是CPX来着。CPX,对了。你怎么看?会造成市场蚕食吗?老兄,顺便说一句,
And NVIDIA just announced the Rubin prefill cards, like CTX or CPX. CPX, there we go. What's your take on that? Does it cannibalize? Dude, and by the way,
我不知道这是不是脑残了还是怎么的,但我连昨天午饭吃了什么都记不清,却他妈记得每块芯片的型号。
I don't know if this is like brain rot or like, I don't know, but like, I can't remember what I had for lunch yesterday. But I know the model number of every fucking chip, like
在梦中击中你。
Hots you in your dreams.
我们完了。我们完了。
We're broken. We're broken.
活在梦里。不。
Living the dream. No.
不。不。不。你知道,为什么
No. No. No. You know, why do
你要预先宣布一个在某些使用场景下
you preannounce a product that's
快五倍的产品?
five times faster for certain use cases?
有那么快吗?
Is it that much?
我认为有几个原因。对吧?从历史上看,AI芯片就是AI芯片。然后我们开始听到很多人说,这是一款训练芯片。
I think it's a couple things. Right? Like, historically, AI chips were AI chips. Right? And then we started getting a lot of people saying, this is a training chip.
这是推理芯片。实际上,训练和推理的需求切换得非常快。对吧?现在基本上还是同一款芯片。实际上,工作负载层面的动态仍有差异,但主要工作负载即使在训练中也是推理。
This is an inference chip. Actually, training and inference are shifting so fast in what they require. Right? That, like, now it's still, like, one chip. Actually, there are still workload level dynamics that differ, but the main workload is inference even in training.
对吧?正是如此。因为强化学习中大部分工作都是在环境中生成内容并试图获得奖励。对吧?所以本质上仍是推理。
Right? Exactly. Because of RL, most of that is is, you know, generating stuff in an environment and trying to, you know, achieve a reward. Right? So it's it's inference still.
对吧?如今训练过程也主要由推理主导。但推理包含两大主要操作。对吧?首先是计算预填充的键值缓存。
Right? Training is now becoming mostly dominated by inference as well. But inference has, like, two main operations. Right? There is calculating the KV cache for prefill.
对吧?这里有所有这些文档。在所有token之间进行注意力计算。无论你使用哪种注意力机制。
Right? Here's all these documents. Do the attention between all of them. Right? Between all the tokens, however, you know, whatever type of attention you use.
然后是解码阶段,即自回归地生成每个token。这是截然不同的工作负载。最初的想法或基础设施技术、ML系统技术是:好吧,我就把每次前向传播的批处理大小固定为某个值,比如设为1000。
And then there's decode, which is autoregressively generating each token. These are very, very different workloads. And so initially, the ideas or infrastructure techniques, the ML systems techniques were, oh, okay. I will just make the batch size of every single, you know, forward pass this big. Let's call it a thousand big.
或许我会同时运行32个用户。这样还能剩下约960的容量。对吧?那剩余的960实际上用于预填充——当有新请求进来时进行分块处理,这被称为分块预填充(chunked prefill)。
And maybe I'll run 32 users concurrently. That way, you know, now I still have 960 left. Right? That 960 is actually doing the prefill for, you know, if a request comes in, it chunks it. It's called chunked prefill.
你现在可以分块预填充。GPU利用率确实很高,但这最终会影响解码工作进程,对吧?也就是那些自回归生成每个token的进程,导致每秒token数(TPS)变慢。
You prefill chunks of it now. You get really good utilization on GPUs, but then that ends up impacting the decode workers. Right? The workers autoregressively generating each token end up having slower TPS.
而每秒令牌数对用户体验和其他方面至关重要,对吧?所以关键在于,这两种工作负载截然不同,它们本质上是分开的。预填充是一回事,解码是另一回事。
And tokens per second is really important for user experience and all these other things. Right? So then the idea is like, these two workloads are so different and they are literally separate. Right? You prefill and then you decode.
它们并不是交替进行的。那为什么不彻底拆分它们呢?而且这还是用同类型芯片完成的。OpenAI、Anthropic、Google,基本上所有人都是这么做的。
It's not like you're inter interleaving them. So why don't we split them entirely? And this is this is done on the same type of chip. Right? OpenAI, Anthropic, Google Pretty much everybody does that.
每个人,每个人都做得很好。
Every everyone everyone good.
是啊。所有人?那些大公司。
Yeah. Everyone Really? Big guys.
Together、Fireworks,这些公司都采用分离式预填充/解码架构。他们用一组GPU处理预填充,另一组处理解码。为什么这样更有利?因为可以自动扩展资源。
Together, Fireworks. All these guys do disaggregated prefill decode. So they run prefill on one set of GPUs, decode on a certain set of GPUs. Why is this beneficial? Because you can auto scale them.
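The disaggregation idea can be sketched in a few lines. This is an editor's illustration, not any lab's actual serving stack; `Request` and `split_pools` are hypothetical names, and sizing decode purely by output tokens ignores its memory-bound nature, but it shows how the two pools scale independently with the traffic mix.

```python
# Minimal sketch of disaggregated prefill/decode pool sizing: prefill
# workers compute the prompt's KV cache, decode workers generate tokens,
# and the two pools are sized independently from the observed traffic mix.
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int   # drives prefill work (compute-bound)
    output_tokens: int   # drives decode work (first-order proxy)

def split_pools(requests: list[Request], total_gpus: int) -> tuple[int, int]:
    """Allocate GPUs to prefill vs decode in proportion to the work each
    stage sees, always keeping at least one GPU in each pool."""
    prefill_work = sum(r.prompt_tokens for r in requests)
    decode_work = sum(r.output_tokens for r in requests)
    share = prefill_work / (prefill_work + decode_work)
    prefill_gpus = min(total_gpus - 1, max(1, round(total_gpus * share)))
    return prefill_gpus, total_gpus - prefill_gpus

# Long-input / short-output mix (document Q&A): most GPUs go to prefill.
doc_qa = [Request(prompt_tokens=64_000, output_tokens=500)] * 10
# Short-input / long-output mix (reasoning): most GPUs go to decode.
reasoning = [Request(prompt_tokens=500, output_tokens=16_000)] * 10

print(split_pools(doc_qa, 16))     # (15, 1) - heavily prefill
print(split_pools(reasoning, 16))  # (1, 15) - heavily decode
```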
对吧?比如突然需要更多长上下文处理能力时,我可以分配更多资源给预填充。或者当流量模式逐渐从'长输入短输出'转变时——当然不是突然的,而是随着时间变化的调整。
Right? Hey, all of a sudden, I have a lot more long context work. I allocate more resources to prefill. Or, you know, not all of a sudden, but, like, over time, my traffic mix is not long input, short output.
这是短输入长输出的情况。我有更多的解码工作线程。这样我就能确保资源可以按需自动扩展,同时保证预填充时间——你知道的,在搜索中真正重要的是页面开始加载的速度,而非资源何时到位。游戏里人们怎么做?比如加载界面通常会设计成互动环境或渐进式呈现。
It's short input long output. I have more decode workers. This way, I can auto scale the resources differently, and I can also guarantee my prefill time. You know, what's really important in search is how fast you get the page to start loading, not when every resource finishes. What do people do in games? Like, the loading screen often has some sort of interactive environment or blends in over time or whatever it is.
它会提供技巧提示来分散注意力。研究表明用户更看重首令牌的响应速度,即使获取全部令牌的总时间会稍长些,只要首个令牌能更快地流式传输出来就行。
It has tips and tricks, ways to distract you. Same thing here there's, like, studies and papers out there showing that users prefer a faster time to first token. Right? The first token gets streamed to me sooner even if the total time to get all my tokens is a
反正人类阅读速度也跟不上。对吧?所以
little bit longer. You can't read that fast anyways. Right? So
我是说,我喜欢快速浏览。
I mean, I like to skim.
没错,我也喜欢速读。但大多数模型的返回速度其实已经超过人类速读速度了。
Yeah, I like to skim too. But, I mean, most models return above speed-reading speed.
但这是必要的。用户体验要求我们必须确保首令牌响应时间控制在特定阈值,否则用户就会觉得'去他的,不用AI了'。
But you need that. Right? The idea is that you want to guarantee time to first token at a certain level for user experience reasons. Otherwise, people are like, screw this, I'm not using AI.
解码速度固然重要,但远不及首令牌时间关键。通过分离预填充和解码环节就能实现——这些其实都在同一套基础设施里完成了。
The decode speed matters a lot too, but not as much as time to first token. And so by having separate prefill and decode, you do this. Right? And this is all in the same infrastructure — you've already done this.
那么现在的问题是,下一步该怎么做?这些工作负载差异太大了。解码阶段,你必须加载所有参数和KV缓存才能生成一个token。虽然可以批量处理几个用户,但很快你就会耗尽内存容量或带宽,因为每个人的KV缓存都不同。确实如此。
So now it's like, what's the next logical step? These workloads are so different. Decode, you have to load all the parameters in and the KV caches to generate a single token. You batch a couple users together, but very quickly, you run out of memory capacity or memory bandwidth because everyone's KV cache is different. Yeah.
所有token的注意力机制,对吧?而预填充阶段,我甚至可以一次只服务一两个用户——如果他们发来64,000个上下文的请求,那可是海量浮点运算。对吧?64,000个上下文的请求啊。
The attention over all the tokens. Right? Whereas on prefill, I could even just serve, like, one or two users at a time, because if they send me a 64,000-token context request, that is a lot of flops. Right? A 64,000-token context request.
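To put rough numbers on why decode is memory-bound, here's a back-of-the-envelope KV-cache sizing. The shape (80 layers, 8 GQA key/value heads, head dim 128) is the published Llama-2-70B configuration; fp16 storage and the 80 GiB HBM figure are assumptions for illustration.

```python
# Per-user KV cache for a Llama-70B-class model: each token stores a key
# and a value vector per layer per KV head, so caches grow linearly with
# context and are unique to each user -- batching cannot share them.
n_layers, n_kv_heads, head_dim = 80, 8, 128   # Llama-2-70B shape (GQA)
bytes_per_elem = 2                            # fp16

kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")   # 320 KiB

context = 64_000
per_user = kv_bytes_per_token * context
print(f"per user at 64k context: {per_user / 2**30:.1f} GiB")       # 19.5 GiB

hbm = 80 * 2**30   # one H100's HBM, ignoring the weights for simplicity
print(f"concurrent 64k users per 80 GiB: {hbm // per_user}")        # just 4
```

Even before the model weights are counted, a handful of long-context users saturates a GPU's memory — which is the capacity/bandwidth wall being described.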
我用Llama 70B来举例,因为计算方便。700亿参数,每个token大约需要140吉(Giga)次浮点运算。140吉乘以64,000——那就是好几拍(Peta)次浮点运算。
I'll use Llama 70B because it's simple to do the math on. 70,000,000,000 parameters — that's about 140 gigaflops per token. 140 gigaflops times 64,000 — that's many, many petaflops.
你可能要占用整个GPU将近一秒钟。理论上。具体取决于GPU的性能,就为了完成这个预填充阶段。
You could use the entire GPU for, like, a second. Right? Potentially — depending on the GPU — just to do that prefill.
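The arithmetic can be checked directly. The 2-FLOPs-per-parameter-per-token rule of thumb and the H100-class peak and utilization figures below are assumptions for illustration, not numbers from the conversation.

```python
# Rough prefill cost of a 64k-token prompt on a 70B dense model.
params = 70e9
flops_per_token = 2 * params          # ~140 GFLOPs/token rule of thumb
context = 64_000

total_flops = flops_per_token * context
print(f"prefill compute: {total_flops / 1e15:.2f} PFLOPs")   # 8.96 PFLOPs

peak = 0.99e15   # illustrative H100-class BF16 dense peak, FLOP/s
mfu = 0.5        # assumed utilization during prefill
seconds = total_flops / (peak * mfu)
print(f"one GPU, just this prefill: {seconds:.1f} s")

# Sharded across an 8-GPU tensor-parallel group, it lands near the
# "about a second" scale being gestured at here:
print(f"8-GPU group: {seconds / 8:.1f} s")
```

The single-GPU figure is intentionally pessimistic; the point is that one long prompt is pure compute, with almost none of decode's memory traffic.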
对吧?而这还只是单次前向传播。所以我并不在乎能多快把所有参数和KV缓存加载进来,我只在乎浮点运算总量。这也就引出了——我知道解释得有点啰嗦,但CPX这个概念确实很难让人理解。
Right? And that's just one forward pass. So I don't necessarily care about loading all the parameters and the KV cache in fast. All I care about is the flops. And that leads us to — you know, I know it was a long-winded explanation, but it's hard for people to understand what CPX is.
我遇到过太多这种情况,甚至我的客户也是。我们准备了多份说明文档,他们还是表示不理解。真是让人头疼。
I've had a lot of, like — even my own clients. We sent, like, multiple notes explaining it, and they're like, we still don't understand. I'm like, shit.
明白了。所以《注意力就是一切》那篇论文...
Okay. So the Attention Is All You Need paper and —
你不能指望...我是说,比如想想一个搞网络的人。他们会觉得,我不需要懂这个。你知道,注意力就是全部。对吧?或者想想投资者。
You can't expect — I mean, think about, like, a networking person. They're like, I don't need to know about this. You know, attention is all you need. Right? Or think about an investor.
对吧?你知道,还有数据中心运营商。他们会说,哦,有两种芯片。我为什么要为这个改造数据中心?
Right? Like, you know, there's all these people. Or maybe the data center operator. They're like, oh, there's two chips — why should I build my data center differently?
就像...是啊。我得解释清楚一切,或者说不用改造。但总之你现在明白了
It's like — yeah. You know, I gotta explain everything, or just be like, no, you don't have to build differently. But, anyways, you get it now.
在斯坦福,全校25%的学生——不只是计算机专业,是所有学生——都读过他们的论文。什么论文?《注意力就是全部》。
At Stanford, 25% of all students — not just CS students, all students — read their paper. Read what paper? Attention Is All You Need.
这比例太低了。得让体育专业的学生,还有……
That's low. They need the gym majors and, you know, like, the
哲学系的人也来读。我觉得这太不可思议了。
philosophy guys. I find this amazing.
总之抱歉。中东某个国家——我记不清是哪个了——从8岁就开始AI教育。高中生必须读《注意力就是全部》。哇。有人告诉我他们的CNN课程也要求读这个,这...我不知道该怎么说。
Anyways, sorry. The Middle East — I can't remember which country — has AI education starting at, like, age eight. And in high school, they have to read Attention Is All You Need. Wow. Someone told me that their CNN had to read Attention Is All You Need, which is — you know, I don't know. Look.
看吧。自上而下的教育指令。你知道吗?可能有效,也可能无效。比如,有些人可能更喜欢在家教育孩子。
Look. Top down mandates for education. You know? Maybe they work, maybe they don't. Like, you know, maybe people like homeschooling their kids.
我不确定。我是上过学的,但是,
I don't know. I went to school, but, like,
回到你的读者话题上。
back to your readers.
对。就在硬件周期这个话题上,我想或许可以聊聊。
Yeah. Just on the topic of hardware cycles, I wanted to maybe yeah.
抱歉,我还没真正解释CPX是什么。所以CPX
Sorry. I never actually explained what CPX is. So CPX
好的。是的。是的。是的。
Okay. Yes. Yes. Yes.
是一款高度计算优化的芯片,用于预填充;而解码,简而言之,仍跑在配备HBM的常规芯片上。HBM占GPU成本一半以上。去掉这部分,传递给客户的芯片就便宜得多。或者说,如果英伟达保持相同利润率,这款预填充芯片的成本就低得多。这样整个流程都更便宜、更高效。
is a very compute-optimized chip for prefill, whereas decode — just to say it succinctly — runs on the normal chips with HBM. HBM is more than half the cost of the GPU. If you strip that out, you end up with a much cheaper chip passed on to the customer. Or, you know, if NVIDIA takes the same margin, then the cost of this prefill chip is much, much lower. And now the whole process is way cheaper and more efficient.
现在可以采用长上下文了,对吧?
Now long context can be adopted. Right?
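The cost argument can be made concrete with placeholder numbers. Everything below is made up for illustration — only the "HBM is more than half the cost" claim comes from the conversation, and the 55% share and fleet sizes are arbitrary assumptions.

```python
# Illustrative economics of a compute-only prefill part (CPX-style).
gpu_cost = 100.0    # normalize a standard HBM GPU to 100 cost units
hbm_share = 0.55    # "more than half the cost" -> assume 55%

prefill_chip = gpu_cost * (1 - hbm_share)   # strip the HBM, keep compute
print(f"prefill chip vs standard GPU: {prefill_chip:.0f} vs {gpu_cost:.0f}")

# A disaggregated fleet serving prefill on the cheap parts:
n_prefill, n_decode = 4, 4
disagg = n_prefill * prefill_chip + n_decode * gpu_cost
uniform = (n_prefill + n_decode) * gpu_cost
print(f"fleet cost: {disagg:.0f} vs {uniform:.0f} for the same workload")
```

Since prefill for long prompts is compute-bound anyway, the stripped-down parts do the same work for a fraction of the cost — which is why long context becomes economical.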
是的。嗯,我很高兴我们实际讨论了所有这些细节,因为我本来有个更宏观的问题要问你。我对SEVY市场的关注没有你那么密切,我大概是从A100开始接触的。我记得2023年6月在Character帮Noam到处抢购GPU,那时唯一重要的就是交付日期,因为产能严重紧缺。
Yeah. Well, I love that we're actually going into all this detail, because I had a more 10,000-foot-view question for you. I haven't been following the SEVY market as closely as you have. I probably started with the A100, and I remember helping Noam at Character — this is June 2023 — chase down GPUs. And the only thing that mattered at that time was delivery date, because there was a huge capacity crunch.
然后看到过去两年里情况发生了变化,比如大约6到12个月前,人们会向20家Neo Clouds发RFP(征求建议书)。对吧?某种程度上唯一重要的就是价格。
And then to see that over the last two years evolve where, you know, let's say, six to twelve months ago, people were doing these RFPs to 20 Neo Clouds. Right? And the only thing that mattered to some degree was price.
人们真的会为GPU发RFP?
People actually do RFPs for GPUs?
是的。
Yes.
所以,为了说清楚:我认为买GPU的方式就像买可卡因或其他毒品。这是别人向我描述的说法,不是我——我不买可卡因。
So just to be clear, my opinion on how you buy GPUs is that it's like buying cocaine or any other drug. This was described to me — not by me. I don't buy cocaine.
对。所以有人告诉我这个。有人告诉我这个。我当时就想,天啊,说得太对了。
Right. So someone tells me this. Someone tells me this. I'm like, holy shit. It's right.
你打电话联系几个人,发短信给几个人,然后问,嘿,你有多少货?价格是多少?
You call up a couple people. You text a couple people. You ask, yo. How much you got? What's the price?
这简直就像,妈的,就像在买毒品。哦,抱歉,抱歉。不。
It's like Exactly. This is fucking like, buy drugs. Like oh, sorry. Sorry. No.
我是说,非常像,我就是用同样的方式。你只需要发送消息,比如我们Slack上连接了大约30家Neo Clouds。就是这样。还有一些主要的供应商,我们直接给他们发消息说,嘿,客户需要这么多。
I mean, it's the same way. We have Slack Connects with, like, 30 Neo Clouds, as well as some of the major ones, and we just send them a message like, hey. Customer wants this much.
你知道,这是他们在找的东西。然后他们发报价过来,然后我认识这个人。
You know, this is what they're looking for. And then they send quotes, and then I I know this guy.
我认识一个人。嗯,我觉得这个描述其实非常准确。我给无数被投公司转发过你那篇ClusterMax原帖,因为我觉得它把这些厂商剖析得很好。但我最后想问的一个问题是:随着Blackwell上线,我们现在处于什么时代?我们是不是又回到了2023年夏天那种周期?
I know a guy. Well, I think that's actually a very accurate description. And I've sent countless portcos your original ClusterMax post, because I thought it did a really good job breaking them down. But maybe one question to end on for me: what era are we in now, with Blackwell coming online? Are we sort of back to the summer-2023 era — is that the cycle we've just entered?
或者你对当前阶段有什么看法?
Or what's sort of your view on where we are?
所以,首先,这是个非常好的问题。
So so first very good question.
关于你们的一个被投公司,我们当时想,在经历了与亚马逊的困难后,我们尝试着说,好吧,让我们实际帮你们搞些GPU。最初给你们谈的交易已经没了,但这里还有些其他交易。结果发现,多家主要的新兴云服务商的Hopper算力已售罄,而他们的Blackwell算力要几个月后才能上线。
For one of your portcos — after their difficulties with Amazon, we were like, okay, let's actually get you GPUs. The original deals we got you were gone, but here's some other deals. Right? It turned out that multiple major Neo Clouds had sold out of Hopper capacity, and their Blackwell capacity comes online in a few months.
所以这确实有点挑战性,对吧?因为推理需求?今年推理需求正在飙升。对吧?
So it's a bit of a challenge. Right? In that — Due to inference? Inference demand has been skyrocketing this year. Right?
对。哦,是的。现在的推理模型。这些推理模型就是收入来源。今年这方面的需求确实在飙升。
Right. Oh, yeah. Reasoning models now. These these reasoning models are revenue. It's been skyrocketing this year.
然后还有,Blackwell虽然即将上线,但部署起来很困难。所以需要些时间适应,部署它有个学习曲线。相比之下,买Hopper的话,装进数据中心,一两个月就能运行起来。而Blackwell由于可靠性问题,部署周期会更长些。
And then also, you know, Blackwell comes online, but it's hard to deploy — there's a learning curve to deploying it. Whereas with Hopper, you buy it, you install it in the data center, and it's running within, like, a month or two. Right? For Blackwell, it's a longer time frame because of reliability challenges.
这是新GPU。我的意思是,总要经历成长的阵痛。对吧?所以市场上出现了个断层——有多少GPU正在上市,而收入曲线正要开始上扬。
It's a new GPU. I mean, it's just growing pains. Right? So there was this gap between how many GPUs were coming onto the market and revenue starting to inflect.
于是大量算力被迅速消耗。实际上,Hopper的价格在三四个月前,或者说五六个月前触底了。现在价格已经小幅回升了。
And so a lot of capacity got sucked up. Right? And, actually, prices for Hopper bottomed, like, three or four months ago or, like, five or six months ago. Yeah. And, actually, they've, like, crept up a little bit now.
虽然还算快...你知道,还不至于...我不认为我们完全回到了2023、2024年GPU紧缺的时代。但如果你只要几块GPU很容易,想要大量的话就很困难了。
They're still, like — you know, I don't think we're quite back to the 2023–2024 era of GPUs being tight. But certainly, if you just want a few GPUs, it's easy. If you want a lot, it's hard.
是啊是啊,就像,你不可能立刻获得那样的能力。
Yeah. Yeah. Like, you you can't get capacity that instantly.
哇哦,真是个好时代。
Yeah. Wow. What a time.
我们要不要,要不要就此结束?
Shall we, shall we wrap on that?
迪伦,这又是一期即时经典。非常感谢你来参加播客。
Dylan, this was another instant classic. Thank you so much for coming on the podcast.
已经,两小时了,兄弟。哦不,我错过了。谢谢你。
It was, two hours, bro. Oh, no. I missed it. Thank you.
我们停不下来,我停不下来。
We couldn't, I couldn't stop.
非常感谢,真是太棒了。
Thanks so much. It was great.
非常感谢邀请我参加。
Thank you so much for having me.
感谢收听a16z播客。如果您喜欢本期节目,请前往ratethispodcast.com/a16z留下评论告诉我们。我们还有更多精彩对话即将呈现,下次见。请注意,此处内容仅供信息参考,不应视为法律、商业、税务或投资建议,也不应用于评估任何投资或证券,且不针对任何a16z基金的现有或潜在投资者。
Thanks for listening to the a16z podcast. If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com/a16z. We've got more great conversations coming your way. See you next time. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund.
请注意,a16z及其关联机构可能持有本播客讨论公司的投资。更多详情,包括我们的投资链接,请参见a16z.com/disclosures。
Please note that a sixteen z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see a 16z.com/disclosures.