本集简介
双语字幕
仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。
那么在美国以外,甚至是在那些较大、最发达经济体之外的地方呢?你认为人工智能将如何更广泛地影响世界?
How about, sort of, outside the United States, maybe even outside of the larger, most developed economies? How do you think AI is going to affect the world more generally?
我希望人工智能能产生极大的民主化效应,因为在当今世界,最昂贵的东西之一就是智力。聘请一位高技能的专科医生诊断病情,或是请一位高中家教一对一辅导孩子,都需要花费大量金钱。虽然我看不到降低人类智力成本的途径——培养一个熟练人才本就耗资巨大——但降低人工智能成本却是有可能的。这意味着,如今只有相对富裕的人才能雇佣特定类型的员工为他们服务。而在未来,我期待每个人都能拥有一支由聪明、见多识广的‘员工’组成的队伍,为我们处理各种事务。
I feel like, I hope, that AI will have a very large democratizing effect, because one of the most expensive things in today's world is intelligence. It costs a lot to get a highly skilled specialist doctor to tell you what's going on, and it costs a lot to get a tutor to mentor your high school child one on one. And whereas I don't see a path to making human intelligence cheap, it just costs a lot to train a skilled human being, there is a path to making artificial intelligence cheap. And what this means is that today, only the relatively wealthy can hire certain types of staff to do certain types of things for them. But in the future, I would love for every single person to have kind of an army of smart, well-informed staff to do all sorts of things for us.
我认为这将
And I think this would
比如健康顾问、家庭教师之类的角色。
Their health advisor, their tutor, that kind of thing.
没错。是的。而且我认为,为每个人配备如今只有相对富裕者才能享有的‘员工团队’,这将让许多人受益。
Yeah. Yes. And I think that giving everyone an army of staff to help them, which today is available only to the relatively wealthy, this will lift up a lot of people.
安德鲁,欢迎回来。真高兴再次见到你。
Andrew, welcome. It's so good to have you back.
是啊,阿斯特罗,见到你总是很棒。
Yep. Always great to see you, Astro.
太棒了。我们要聊很多话题。但我想先从一个温馨的回忆开始——我们读研的时间大致重合。记得当初和你讨论在X(后称Google Brain)启动项目时,你的毕业论文做了件非常特别且令人难忘的事。
Awesome. We're gonna talk about a lot of different things. But I wanna start with something that I remember fondly. We were in graduate school at roughly the same time. And when we were talking to you about starting something at X that later got called Google Brain, you had done something really unusual and memorable for your graduate thesis.
能向听众简单介绍一下你的毕业论文吗?既包括技术层面最精要的亮点,也包括你实际实现的成果?
Can you tell the audience kind of what your graduate thesis was, both kind of what was technically interesting at a very, very high level, but also, like, what you made happen?
在伯克利的博士论文中,我构建了一个操控直升机飞行的小型神经网络。这在当时很特别,因为如今热门的强化学习那时还未兴起。我请朋友借了架直升机,用我们发明的小算法训练这个神经网络保持悬停,结果它稳如磐石。你看视频时会疑惑‘这是实拍吗?’——盯着画面看半天。我觉得这很酷。
For my PhD thesis at Berkeley, I built a little neural network that flew a helicopter. And I guess it was unusual at the time, because reinforcement learning, which is hot now, wasn't hot back then. But I wound up asking some friends to let me use a helicopter, and then, you know, trained a little neural network with a little algorithm that we invented to keep it hovering in the air, and it stayed rock solid. So you could watch a video of it, and you'd go, is this a real video? You'd stare at the picture, and I thought that was cool.
这确实让强化学习获得了比当时更多的关注。而且驾驶直升机也很有趣。
That actually got reinforcement learning a lot more attention than it was getting at the time. And it was fun to fly helicopters.
没错。现在听这些的人可能不记得了,因为他们看到所有这些垂直起降飞行器,会认为在空中保持稳定是件很容易的事。但在那时,能学会做到这一点的东西,我记得对领域来说就像一记令人兴奋的重击。
Exactly. People listening to this now might not remember, because now that they see all these vertical takeoff and landing kinds of craft, they think that something that's rock steady in the air is pretty easy. But back then, something that learned how to do that was kind of a kick in the head, in an exciting way, to the field, I remember.
那很有趣。我觉得自己很幸运,做了很多不走寻常路的事情,尝试了些古怪的东西,有时行不通。这就是偏离常规时会发生的事,但当它成功时,你知道,我认为直升机的结果当时获得了大量关注,并推动了强化学习的发展。
It was fun. I think I've been fortunate that a lot of the things I do go off the beaten path, trying something weird, and sometimes it doesn't work out. That's part of what happens when you go off the beaten path. But when it works, you know, I think the helicopter result got a lot of attention and moved reinforcement learning forward at the time.
是的。我记得从X的角度讨论启动类似最终成为Google Brain的项目时——先是在X,后来在谷歌由你共同创立。我记得你提到这段历史:你作为斯坦福年轻教授时写下的、关于你认为我们或斯坦福(后来的X)应该做什么的构想。我还没细看这份资料,非常好奇想去看看。
Yeah. I remember, as we were talking from the X perspective about starting something sort of like what ultimately became Google Brain, here at X and later at Google, that you co-founded. And you brought this piece of history: what you had written when you were at Stanford as a young professor about what you thought we, or Stanford, or later X, should do. I haven't skimmed it recently, and I'm very curious to go look at it.
我记得其中有两点是你的核心论点——我和当时的X联合创始人/联合主任塞巴斯蒂安都非常认同。其一是规模很重要。杨立昆等人已在学术上有所展示,但没人真正实现过规模化。所以尽管大家强烈怀疑,但尚未被证实。
There were two things that I think were your thesis, that I remember myself and Sebastian, my co-founder and co-director of X at the time, very much bought into. One of them was that scale matters. Yann LeCun and others had sort of shown this academically, but no one had actually managed to deliver scale. And so it wasn't proved yet, even though everyone strongly suspected it.
应该说当时人们在学术上已有所改进。实际上我记得参加NeurIPS(那时还叫NIPS)会议时,我到处对人说必须扩大深度学习算法规模。而一些资深前辈却建议说:‘安德鲁,为什么要构建更大的神经网络?去发明新算法吧。’所以2000年其实是...
I would say at that time, people had improved it academically. And in fact, I remember going to NeurIPS, at that time called the NIPS conference, and I was talking to people saying, we've gotta scale up deep learning algorithms. And I was getting advice from very senior people saying, hey, Andrew, why are you trying to build bigger neural networks? Like, go invent new algorithms. So 2000 was actually
2010年?2011年?
2010? 2011?
对。我记得2010年向拉里·佩奇推销后来成为Google Brain的项目。大约2008年,我就在学术会议上宣传这个。那时规模化其实存在争议,人们并不相信。
Yeah. I remember pitching what became Google Brain to Larry Page in 2010. And it was around 2008 that I was going around academic conferences talking about this. At that time, scale was actually controversial. People did not believe in it.
而且提建议的都是出于好意、非常资深的学者。
And well-meaning, very senior people.
记得约书亚·本吉奥说过:‘安德鲁,这对你职业发展不利。’现在回想起来极具讽刺——不仅因为你完全正确,更因这尤其成就了你的事业。另一个论点(如果我记错了请纠正)我记得是你认为:人类和其他生物大脑中,来自鼻子、眼睛、耳朵甚至味蕾的不同信号都会通过大脑相似区域处理。你提出的问题是:强制系统以这种方式超负荷运作,要求它处理多种迥异任务,或许能使其更健壮、更智能?
You know, I remember Yoshua Bengio saying, hey, Andrew, this is not good for your career. That's very ironic in retrospect, given not only how right you were, but how good for your career in particular it's been. The other theory that I remembered, correct me if I'm wrong, that I felt was part of your thesis, was that in the human brain and in other brains, a lot of different signals, the signals from our nose, the signals from our eyes, the signals from our ears, even from our taste buds, go through similar parts of our brain. And you were asking the question: maybe there's something useful about that, about forcing a system to be overloaded in that way, and that asking a system to do a lot of very different things might cause it to be less brittle, more intelligent.
我的记忆准确吗?这是你当时论文的一部分吗?
Am I remembering that correctly? Is it part of the thesis that you had at the time?
是的。它大致分为两部分。核心其实是单一学习算法假说。当时我受到神经可塑性实验的启发——比如当某人脑部某区域不幸受损时,其他区域的相同脑组织可以学会替代原本负责听觉的功能来处理视觉信息。这让我不禁思考(当然不止我一人),我们是否需要完全不同的软件或算法来处理视觉、听觉等不同功能?还是说存在一种通用学习算法,只需根据输入数据(无论是文本、图像、音频还是其他)就能自主适应?
Yeah. It was kind of two parts. It was really the one learning algorithm hypothesis. So I was inspired by these neural rewiring experiments, which show things like: if someone tragically has damage in part of the brain, other parts of the brain, the same physical brain tissue that was previously learning to hear, could learn to see. So it really made me wonder, and I wasn't the only one, but it made me wonder: do we need totally different software or algorithms for seeing and hearing and doing all these different things? Or is there one learning algorithm that, depending on what data you give it, be it text or images or audio or something else, will learn to figure out how to deal with that data?
事后看来,这个单一学习算法假说被证明基本是正确的。不过现在回顾,我认为自己当时过分依赖神经科学的启发,那些具体神经学细节大多被证实并无助益。但高层次的想法——人脑可能用同一套算法处理多种任务,因此我们应当让计算机也采用统一算法而非让上万研究者发明数千种算法——这个理念确实奏效了。一个小团队开发的单一算法,只需喂入不同数据就能应对各种任务。
In hindsight, I think this one learning algorithm hypothesis turned out to be much more right than wrong. But again, looking back, I think I overemphasized looking to neuroscience for inspiration, and the specifics from neuroscience mostly turned out not to be helpful. But this high-level idea, that maybe the human brain has one algorithm that does a lot of stuff, so we should try to get a computer to have one algorithm, rather than having 10,000 people invent a thousand algorithms, maybe you can have a small team invent one algorithm and just feed it very different data, and that really worked out.
没错,这在当时是异端邪说,如今却成了行业共识。
Right, which at the time was heresy, but is now what everyone is doing.
确实。我清楚记得在国家科学基金会研讨会上谈论这个假说时的场景。那时我还年轻,调侃计算机视觉领域的手工特征工程。有位德高望重的计算机视觉学者当场站起来呵斥我,对年轻教授来说这简直造成心理阴影。
Yes. I actually remember speaking at a National Science Foundation workshop, where I was talking about the one learning algorithm hypothesis. I was, you know, younger at the time, and I was kind of poking fun at computer vision people hand-engineering features. I remember one very senior computer vision researcher stood up in public and kind of yelled at me. And as a young professor, that was slightly traumatizing.
不过多年后再回首,可以笑着承认:这个方向最终走通了。
But, yeah, maybe years later, I can look back on it and smile and say, you know, it actually worked out okay.
是啊,像你这样推动领域变革的人,路上总会遭遇反对声。那么当你带着这个愿景加入X时,本可以选择其他平台或方式尝试,为什么最终选择了X?
Yeah. And people who are changing a field, as you did, generally do get yelled at along the way. So as you were coming into X... you know, you had this vision. You could have tried it in different places and in different ways. Why come to X?
比如,关于选择加入X实施这个想法的记忆,你最初与我和Sebastian、Larry、Sergey共事的经历是怎样的?
Like, what was your memory of choosing to come to X with this idea? Your initial experiences with me, with Sebastian, with Larry and Sergey?
记得当时Sebastian Thrun(与你共同管理X)对Google Brain的贡献远超目前公认程度——说实在的。我们在斯坦福的办公室仅一墙之隔,敲敲墙板就能互相呼应。我的学生Adam Coates等人已证明:神经网络规模越大,学习系统表现就越好。
Yeah. So I remember at the time, Sebastian Thrun, who was running X together with you at the time... Sebastian Thrun deserves much more credit for the starting of Google Brain than I think he's received so far, candidly. Sebastian and I had offices at Stanford that were next to each other, so we shared a wall. If I banged on my wall, you know, he could hear it on the other side. And my students at Stanford, Adam Coates and others, had demonstrated that the bigger we built a neural network, the bigger we built these learning systems, the better it performed.
可以说我掌握着这个'秘密数据'(其实公开讨论过,只是没人相信——某种程度上或许是好事)。这些数据清晰表明:神经网络规模与性能呈正相关。
And so I felt like I had this, you know, secret data. It wasn't secret; I talked about it, but people just didn't believe me, which maybe, you know, was a positive thing, I don't know. With this data showing that the bigger we build neural networks, the better they perform.
当时我正对人们说我们应该扩大这些算法的规模,塞巴斯蒂安向我指出,谷歌拥有大量计算机资源。何不向谷歌提议让我利用其庞大的计算基础设施构建比任何人之前都更大的神经网络?于是塞巴斯蒂安安排了我向拉里·佩奇做提案的会议。记得我准备了笔记本电脑上的幻灯片,但我们在日料店用餐时不便打开电脑,最终只能口头向拉里阐述,塞巴斯蒂安也在场。
And so I was talking to people saying we should scale up these algorithms, and Sebastian pointed out to me: Google has a lot of computers. Why don't you pitch Google to let me use Google's massive compute infrastructure to build much bigger neural networks than anyone had built before? So Sebastian set up a meeting for me to pitch to Larry Page. And I remember I prepared slides on my laptop, came all prepared with slides, but we were eating at a Japanese restaurant, so it was inconvenient to pull up my laptop. So I wound up just talking to Larry, with Sebastian there as well.
幸运的是,拉里当时认可了我的想法,授权我与塞巴斯蒂安以及X共同推进这个后来成为Google Brain的项目。那次晚餐至今记忆犹新,对我而言是场高风险对话,我至今仍感激拉里·佩奇接受了当时看似疯狂的神经网络愿景。
And fortunately, you know, Larry happened to buy into whatever I was telling him, and then authorized letting me work with Sebastian and you and X to take forward the project that later became Google Brain. I still remember that dinner clearly. It was a very high-stakes conversation for me, and I'm really grateful to this day to Larry Page for, you know, buying into what was a pretty crazy vision at the time: neural networks.
到2012年形势已变,人们开始对神经网络热情高涨。但即便在2010年,神经网络在人工智能领域仍长期处于非主流地位。你当时的思考是什么?规模固然重要——
By 2012, things had changed. People were starting to get pretty excited about neural networks. But even in 2010, neural networks were mostly still out of vogue, and had been for a long time, in the artificial intelligence world. What was your thinking? Look, scale is one thing.
我们可以扩大规模,规模可能确实关键。但更根本的问题是:扩大什么?对于如今被视为理所当然的神经网络表征方式,你如何看待?记得2008年时,神经网络绝非主流,没人敢断言它会是AI的突破口。
We can scale these things up, and scale may really matter. Separate question: what to scale up? Any thoughts on neural networks as a representation, now something that everyone just takes for granted as being sort of in the water? But as you remember, back in 2008, neural networks were not what everyone was doing, and it was not at all taken for granted that they were gonna be the breakthrough for artificial intelligence.
确实。神经网络长期被AI界冷落,回想起来,当时在顶级会议发表神经网络论文都很困难,我早期工作多发表在研讨会上而非主会议。那时学术界的兴奋点在于精巧的数学工作——需要绝妙创意或定理证明才能赢得同行尊重。
Yeah. Neural networks had been out in the wilderness, kind of rejected by a lot of AI, for a long time. In fact, looking back, it was difficult to publish neural network papers in the leading conferences, which is why a lot of my early work was published in workshops rather than in the main conferences at the time. Yeah. I think back then, a lot of the intellectual excitement, you know, the way you got a paper published at a top conference, was by doing really tricky mathematical work: have a really clever idea, maybe prove a theorem.
而我却提出'用大量计算机扩大规模',这被视作缺乏学术严谨性——'你不过是在堆砌硬件'。这种做法当时极具争议性。
That's how you got a research paper published; you earned the respect of your peers with your very clever ideas. And I showed up and said, you know what? Let's get a lot of computers and make this much bigger. And that was viewed as, gee, you know, where's the intellectual rigor in that? You're just building stuff.
当深度学习规模化真正起步时,我目睹那些花了二十年微调算法的人们承受着情感冲击。他们毕生钻研精巧算法,而我们却说'建个巨型计算机,灌入海量数据'。当我们的方法开始超越他们数十年的智力成果时,这种颠覆性创新确实令人难以适应。
Why do you want to do that? So I think it was really controversial. And then frankly, as scaling up deep learning started to really take off, I saw some of the people that, frankly, had spent twenty years of their career tweaking algorithms. It was actually emotionally wrenching for them, because they had spent decades of their career fiddling with these algorithms in very clever ways. And then a bunch of people like me, we said, you know what?
我们首篇关于用GPU扩展神经网络的论文同样充满争议,最终只能发表在研讨会上——如今GPU的应用已成共识。当时加拿大CIFAR会议上,我们少数人(包括杰夫·辛顿等)展示的数据已显现真实势头。
Let's build a really big computer and throw a lot of data at it. And when it started to outperform their, you know, decades of intellectual work, it was actually, you know, it was tough. Many of them adapted and kept on doing good work, but when a disruptive innovation comes along that obsoletes what someone has worked on for a long time, it sometimes takes a while to adjust. It turns out our first paper pushing to use GPUs to scale up neural networks, another controversial idea, we published in a workshop, because we couldn't get it into the conference at the time.
颠覆性技术往往不与现有技术直接竞争——那时我们训练的神经网络确实逊于传统计算机视觉和文本处理算法。但我们确信方向正确,因为虽然尚未超越,但进步速度惊人。
And now it's so obvious that everyone knows we should use GPUs. I think one of the things I was seeing back then was that there was a small group of us, not just me, but also, you know, at this CIFAR meeting in Canada where Geoff Hinton and others hung out as well. A few of us were generating data showing real momentum. A lot of times, when there's a disruptive innovation, it doesn't really compete with the incumbent technology at first, and at that time, the neural networks we were training were definitely worse than traditional computer vision algorithms, traditional text processing algorithms. But we knew we were onto something, because while not yet competitive, it was rapidly getting better.
我和斯坦福的学生们意识到:只要构建更大规模的网络,就终将具备竞争力——这正是我们想要下的赌注。
And I think my students and I at Stanford saw that if only we could build bigger versions, then it would become competitive, and that was the bet we wanted to place.
作为斯坦福大学的一名非常年轻的教授,你之前描述过在提出规模可能确实重要的观点时遭到了很多反对。在早期,当有人不仅不同意你,甚至对你提出这是前进方向的说法感到愤怒时,是什么给了你继续推动这一点的信心?
And as a very young professor at Stanford, you were describing before how you got a lot of pushback on the idea that scale might really matter. What gave you the confidence in those early days to keep pushing for it, when there were people who not only disagreed with you, but were angry at you for saying that this was the way forward?
我有秘密数据。好吧,其实不算秘密。我们确实发表了,但其他人不相信我,所以和秘密没什么两样。但我的学生Adam Coates和Honglak Lee制作了一张图表,横轴是模型的大小,纵轴是它的表现,我们尝试了大量不同的模型。在那项研究中,我们尝试的每一个模型,曲线都是向右上方延伸的。
I had secret data. Well, not really secret. We actually published it, but others didn't believe me, so it might as well have been secret. But my students, Adam Coates and Honglak Lee, generated a chart where the horizontal axis was the size of the model, the vertical axis was how well it performed, and we tried a ton of different models. And for every single model we tried in that research, the curves just went up and to the right.
所以我知道,基于数据,我们构建的模型越大,它的表现就越好。我认为作为一个科学家或创新者,你不能仅仅通过询问每个人的意见并取平均值来做出好的工作。这没有意义。询问别人的想法是可以的,但最终你必须对你所相信的东西有一个假设。而我的假设是由我们在斯坦福生成并发表的数据塑造的,但不知为何,我很难让人们注意到这一点。
So I knew, based on data, that the bigger we could build the models, the better they would perform. And I think as a scientist or as an innovator, you don't get to do good work by just asking what everyone thinks and taking an average. That makes no sense. It's fine to ask people what they think, but in the end, you have to have a hypothesis for what you believe in. And mine was shaped by data that we had generated at Stanford, that we published, but somehow I, you know, struggled to get people to pay attention to it.
所以实际上,在其他团队也加入之前,我们在扩展方面已经有了很长的领先优势。
So we actually had a long head start on scaling before other teams then jumped onto that too.
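The chart Andrew describes, model size on one axis and performance on the other, with every curve going "up and to the right," is the kind of trend later formalized as a scaling law. As a rough illustration only, with entirely made-up numbers (none of these figures come from the interview), such a trend can be summarized by fitting a power law in log-log space:

```python
import numpy as np

# Hypothetical data in the spirit of the chart described: error rate falls
# as model size grows. These numbers are invented for illustration.
sizes = np.array([1e4, 1e5, 1e6, 1e7])   # hypothetical parameter counts
error = 0.5 * (sizes / 1e4) ** -0.15     # hypothetical error rates

# A power law error = c * size^alpha is a straight line in log-log space,
# so alpha can be recovered with a simple linear fit.
alpha, log_c = np.polyfit(np.log(sizes), np.log(error), 1)
print(round(alpha, 2))  # -0.15: error falls as a power of model size
```

The negative exponent is what "curves go up and to the right" looks like when the y-axis is error instead of accuracy: bigger models, predictably lower error.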
当Google Brain开始建立时,杰夫·迪恩最终成为了你在构建Google Brain过程中的合作伙伴。当时是怎样的,你们俩是怎么认识的?那是怎么运作的?你们是如何分工的?我想我
Once Google Brain was starting to get set up, Jeff Dean kind of ended up as your partner in crime in building out Google Brain. What was that like? How did you two meet? How did that work out? How did you divide up the work? I think I
非常幸运杰夫·迪恩加入了这个项目。我想当塞巴斯蒂安·特伦和我在拉里·佩奇的指导下努力建立这个项目时,拉里让我和谷歌的很多人交谈,所以我记得和杰夫·迪恩、格雷格·科拉多、汤姆·迪恩、杰伊·亚格尼克等许多人聊过。我记得向杰夫推销这个想法:只要我们能够构建更大的神经网络,事情就会变得更好,这个想法让杰夫兴奋。随着项目的推进,当时参与项目的所有人都知道,如果我们能让杰夫更多地参与进来,他会是一个巨大的价值增值。
was really fortunate that Jeff Dean joined the project. I think when Sebastian Thrun and I, under Larry Page's direction, were working to set up the project, Larry asked me to speak with a lot of people in Google, and so I remember chatting with Jeff Dean, Greg Corrado, Tom Dean, Jay Yagnik, many, many others. And I remember pitching to Jeff this idea that if only we could build bigger neural networks, things would get better, and that was the idea that excited Jeff. And then as the project got going, all of us working on the project at the time knew that if we could get Jeff more and more involved, he would be a tremendous, you know, value add.
团队的倍增器。
Multiplier for the team.
所以,我不知道我们中是否有人告诉过杰夫这一点,但我们确实有过一些对话。比如格雷格·科拉多和我聊天:好吧,我们该怎么做才能确保杰夫保持兴奋并持续参与?我们总是希望确保他兴奋,并希望他做更多。
And so, you know, I don't know if any of us ever told Jeff this, but we actually had some conversations. Like, Greg Corrado and I were chatting: alright, what do we do to make sure Jeff is excited and keep him engaged? We were always wanting to make sure he was excited and wanted to do more and more.
幸运的是,他确实如此。我想你知道,当他高度参与时,当我们日常思考时,他最终成为了系统方面的人。我的意思是,他构建了谷歌的很多基础设施。他在非常深的层次上理解扩展,而我最终成为了机器学习方面的人。我认为正是这种合作,我带来了机器学习的专业知识,杰夫带来了计算机系统的专业知识,使我们能够利用谷歌的基础设施来扩展机器学习算法,从而取得了实际成果。
And fortunately, he did. I think, as you know, when he became highly involved, when we were pondering things day to day, he wound up being the systems person. I mean, he built a lot of Google's infrastructure. He understands scaling at a very deep level, and I wound up being the machine learning person. And I think it was that partnership, with me bringing the machine learning expertise and Jeff bringing the computer systems expertise, that allowed us to use Google's infrastructure to scale up machine learning algorithms, and that started to deliver real results.
是的。当时有一个有趣的类比。你知道,杰夫为谷歌和世界带来的一件事是解决了一个非常困难的问题,比如查看世界上所有潜在的搜索信息,找到你要找的内容并在毫秒内返回,这意味着你必须拆分问题。事实证明,这种拆分问题然后重新组合结果的方法,当时是谷歌解决这些问题的核心,与你们在训练越来越大的神经网络时所做的训练工作非常相似,对吧?
Yeah. There was an interesting analogy at the time. You know, one of the things that Jeff had brought to Google and to the world was taking a really hard problem, like looking at all of the potential search information in the world, finding what you're looking for, and returning it within milliseconds, which means you have to split up the problem. And it turned out that that splitting up of the problem and then recombining of the results, which was central at the time to how Google solved these problems, turned out to be very similar to the training work that you all did as you were training larger and larger neural networks. Right?
是的。我认为Jeff发明了这项名为MapReduce的技术,它能将任务分解,在多台计算机上并行处理后再整合结果。这是我们最初进行部分训练的第一版方案。随着我们不断迭代版本,最终发展出TensorFlow等技术,整个技术栈持续演进。但我觉得谷歌在采用GPU方面行动较慢,部分原因在于谷歌拥有极其出色的CPU计算基础设施。
Yeah. So I think Jeff had invented this technology called MapReduce that takes work, splits it up, does it on lots of computers, and brings it back together. That was version one of how we did some of the training. And then as we built more and more versions, which eventually led to things like TensorFlow, the technology stack kept on evolving. I would say one thing that I think we were slower to do at Google was embrace GPUs, partly because Google had such a brilliant CPU compute infrastructure.
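The split-then-recombine pattern described above can be sketched in a few lines. This is only a toy illustration of the idea on a linear model, not Google's actual MapReduce or DistBelief code: each simulated "worker" computes a partial gradient on its own shard of the data (the map step), and the partial results are summed (the reduce step), which recovers exactly the gradient you would get on the full dataset.

```python
import numpy as np

def partial_gradient(w, X_shard, y_shard):
    # Un-normalized least-squares gradient on one worker's shard of the data.
    return X_shard.T @ (X_shard @ w - y_shard)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                  # toy dataset
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])    # toy targets
w = np.zeros(5)                                 # current model parameters

# Map: split the data across 4 simulated workers; each computes its partial gradient.
shards = zip(np.array_split(X, 4), np.array_split(y, 4))
partials = [partial_gradient(w, Xs, ys) for Xs, ys in shards]

# Reduce: sum the partial gradients and normalize by the dataset size.
grad = np.sum(partials, axis=0) / len(X)

# The recombined gradient matches the gradient computed on the full dataset.
full_grad = X.T @ (X @ w - y) / len(X)
print(np.allclose(grad, full_grad))  # True
```

Because the sum of per-shard gradients equals the full-data gradient, the work parallelizes cleanly across machines, which is the property that made this style of training fit Google's infrastructure.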
没错。让我们深入探讨这一点——X部门的核心原则是聚焦于具有硬件组件的项目。早期我们欢迎Google Brain加入X,正是因为我们预见到专用软件可能需要配套的专用硬件,就像大脑需要匹配心智那样。但随着你们团队取得巨大成功,短期内大家对专用硬件的兴趣反而消退了。
Yeah. So let's dive into that for a second, because our commitment at X generally was that we would be mostly focused on things that had some hardware component. So in the early days, we were excited about Google Brain being at X partly because we imagined that there might need to be specialized hardware that would go with the specialized software. A brain to go with the mind, as it were. And then over time, you and the team were having so much success that, at least for a while, the interest in specialized hardware went away.
不过后来Google Brain又启动了TPU项目,虽然那是在离开X之后。你还记得我们是如何经历从暂时放弃硬件到最终回归硬件的过程吗?
But then later, Google Brain ended up starting the whole TPU process, though that was after it left X. Any memories about how we got to both temporarily not doing hardware and then later getting back to hardware?
确实。Google Brain做过许多英明决策,但我希望我们能更早调整GPU和TPU的决策方向。记得当时我和Jeff经常与数据中心运维人员交流,他们在搭建Google集群等设施。当时存在一个合理担忧:如果我们开始零星部署GPU...
Yeah. So, you know, I think Google Brain made a ton of great decisions. One that I wish we'd made differently, even earlier, was the GPU, maybe TPU, decision. And I remember Jeff and I were actually speaking with a lot of the data center operators, you know, building up the Google clusters and so on. And there was a concern at the time, which was legitimate, that if we started sticking a few GPUs here and there,
会导致计算环境变得高度异构化,难以管理使用。
It would create a very heterogeneous compute environment that became hard to use.
能否向听众解释下GPU和TPU分别是什么?让大家了解这些术语。
Can you unpack for people what a GPU is and what a TPU is just so they recognize those terms?
当然。多数计算机依赖CPU(中央处理器),而GPU(图形处理器)最初为图形渲染设计,后来发现特别适合训练大型AI系统和神经网络。TPU则是谷歌大脑团队的发明,全称张量处理器,是谷歌专门为大规模张量处理设计的定制硬件。
Sure. So most computers are powered by CPUs, central processing units. And then GPUs are graphics processing units, originally designed for making computer graphics, but they turned out to be fantastic for training very large AI systems, very large neural networks. And then TPUs, Google's invention, the Google Brain team's invention, are Google's take on specialized hardware for training these very large models: tensor processing units.
是的。我们很早就发现GPU表现优异。Google Brain初期做语音识别时,我们只有一两台GPU服务器...
Yeah. Right. And so I think we saw that GPUs were working well. And in fact, early on in Google Brain, we were working on speech recognition. And we actually had a GPU server, one, maybe two.
我至今记得那台放在员工桌下、缠满电线的机器。通过那台设备,我们看到了GPU的潜力。但从谷歌数据基础设施角度看,当时谷歌的计算架构能让代码几乎无缝运行在任何地方,而GPU是完全不同的硬件类型,需要程序员专门适配。我们当时在考虑:如果大规模采购GPU,能否同时用于YouTube转码?除了训练AI模型还有其他用途吗?
I can still picture it sitting under someone's desk with a nest of wires feeding into it. And with that one computer, I think we saw that GPUs were promising. But the concern from the, you know, Google data infrastructure point of view was: Google, at that time, had a compute infrastructure where someone could write code and have it run pretty seamlessly almost anywhere. But GPUs are a very different type of hardware, and so it would change the work that programmers would have to do to specialize for it. And so we were looking at: boy, if we buy a lot of GPUs, could we use them also for YouTube transcoding? Is it good for anything other than training AI models?
正是这些考量让我们稍有迟疑,没有在谷歌内部全力推进GPU应用。后来我反而在斯坦福团队用GPU做演示项目,因为那个小团队可以接受混乱的基础设施。不过话说回来,我们先用CPU取得了长足进展,后来逐步引入GPU,最终开发TPU,结果证明这条路完全可行。
And I think because of those things, we stalled out a little bit, and did not pursue that as aggressively here at Google as maybe I should have, you know, pushed harder for. I wound up actually doing demo stuff using GPUs with my Stanford group, because that was a scrappy team that didn't mind a very messy infrastructure. But having said that, I think, you know, we got quite far with CPUs, and then as Brain started to move more into GPUs a little bit later, and then building TPUs, it clearly worked out just fine.
那是在Google Brain离开X之后,Transformer才正式被发明出来,论文才得以发表。你是否看到过一些Transformer或类似Transformer工作的雏形?能否谈谈Google Brain早期遇到的一些障碍,以及当时看似探索性但后来证明非常重要的事情?
It was after Google Brain had left X that the transformer was formally invented, that the paper was published. Did you see little bits of transformer or transformer-like work? Tell us about maybe some of the snags that you hit in the early days of Google Brain, and also some of the things that felt exploratory then but turned out to be really important.
你知道,Transformer论文的卓越之处在于——我认为至今仍只有部分人真正理解——作者们是在Google Brain崇尚规模的传统中成长起来的。因此,关于如何构建Transformer网络的许多决策,都围绕着设计一个能在GPU上良好扩展的神经网络。比如注意力机制这类设计,就是一种让神经网络决定关注句子哪些部分的巧妙方法。
You know, the brilliant thing about the transformer paper was, and I think to this day only some people understand this, that the authors grew up in that Google Brain tradition of scale. And so a lot of the decisions about how to architect the transformer network were all about designing a neural network that would scale well on GPUs. So a lot of things, like the attention mechanism, which is a very clever way for a neural network to decide which parts of a sentence to pay attention to...
能否向听众解释Transformer在注意力机制方面的创新是什么?
Can you explain to the audience what the transformer innovation was around attention?
我认为在Transformer论文之前,传统的算法是这样工作的:比如我们要把英语句子翻译成法语,系统会先读取整个句子,尝试记住整个句子,然后试图复述出法语翻译——这种方式勉强可行。
So I think before the transformer paper, there used to be algorithms that said: we wanna translate a sentence from English to French. You would read the whole sentence, try to memorize the whole sentence, and then try to regurgitate the French translation. And it kinda worked okay.
这相当困难。
That's pretty hard.
没错,确实很困难。特别是长句子。而Transformer论文提出了一种创新架构,它能保留英语原句的完整信息。当你在生成法语句子时,根据当前输出的位置,可以动态关注英语原句中需要翻译的特定部分。
Right. It was pretty hard, especially with a long sentence. And the transformer paper had this innovative architecture that would keep the English sentence around. And then as you're trying to write the French sentence, depending on where you are in generating the output, you could pay attention to the specific part of the English sentence you're translating.
事实证明这需要大量计算——要能同时查看完整的英语句子和正在生成的法语句子,并判断何时该关注什么。但由于它在GPU和TPU等并行硬件上扩展性极佳,效果非常出色。后来这成为了现代基础模型的基石——我们不再局限于英法翻译,而是能将用户提示转化为任何问题的答案。Transformer论文之所以如此成功并获得巨大影响力,很大程度上在于作者极巧妙地设计了神经网络架构,确保每个步骤都高度可并行化,能在GPU上高效运行。这为海量数据训练提供了绝佳的计算基础。
And it turned out that this requires a lot of computation, to be able to look at the entire sentence in English and the entire sentence in French and figure out what to look at when you're doing what. But because it scaled really well on parallel hardware, GPUs and TPUs, it worked really well. And this later became, as we all know, the foundation for modern foundation models, where instead of translating from English to French, we translate from a user prompt to the answer to whatever the user is asking. And a large part of why the transformer paper worked so brilliantly and got so much traction was because the authors designed the neural network architecture very cleverly to make sure that every single step was highly parallelizable and could run well on GPUs. And so that gave it a fantastic compute substrate to train on tons of data, and that made it work really well.
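The mechanism described above, each position of the output sentence looking back over every position of the input sentence, with everything computed in parallel as dense matrix multiplies, can be sketched as scaled dot-product attention. This is a generic NumPy sketch of the idea, not code from the transformer paper itself; the shapes and variable names are illustrative:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every output position computes a
    softmax-weighted average of the source values V, weighted by how well
    its query in Q matches each key in K. All positions are handled at
    once by matrix multiplies, which is why it parallelizes so well."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # (m, n) query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over source positions
    return weights @ V, weights

# Toy example: 3 target-side positions attending over 4 source-side positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # queries from the sentence being generated
K = rng.normal(size=(4, 8))   # keys for the source sentence
V = rng.normal(size=(4, 8))   # values for the source sentence
out, weights = attention(Q, K, V)
print(out.shape)  # (3, 8): one blended source vector per output position
```

Each row of `weights` sums to 1 and says how much that output position "looks at" each input position, the translation-time behavior Andrew describes; because the whole computation is a few large matrix multiplies, it maps directly onto GPU and TPU hardware.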
在Google Brain还隶属于X的早期阶段,你可以选择任何研究方向——机器翻译、语音转文字或图像识别。你是如何选定几个重点方向的?
In the early days of Google Brain, when it was still at X, you could have worked on anything. You could have worked on translation. You could have worked on turning speech into text, or image recognition. How did you pick a few things to focus on?
有没有因为效果不佳或商业价值有限而放弃某些项目?
And did you throw some away because they weren't working as well or because they would be less commercially useful?
其实我加入后做的第一件事是在谷歌内部教授神经网络课程。记得是与Tom Dean和Greg Corrado密切合作的。虽然参与者不到百人,但我们每周聚会,分享关于神经网络、规模化以及Google Brain工作的奇思妙想。这意外地帮助我们在全公司结交了许多盟友。因此我们最先合作的团队之一是语音组,这主要出于两个原因...
So one of the first things I worked on when I started at X was actually to teach a class within Google on neural networks. I think Tom Dean and Greg Corrado and I worked closely on this, if I remember correctly. And it turned out to be great: I think something just shy of a 100 people came. We were meeting every week, sharing my weird ideas about neural networks and scale and what we were doing at Google Brain. And, fortunately, this helped us make, you know, many friends and find many allies across Google. And so one of the first teams we ended up working with was the speech team, for two reasons.
首先,我们认为在规模上存在巨大潜力可以提升语音识别技术。
First, we felt that there was a lot of potential in scale to improve speech recognition.
你说的语音,是指通过听取语音音频并识别出所说的词汇,从而将音频内容转化为文字记录。
By speech, you mean listening to the audio of speech and figuring out the words that they're saying so you could literally write down the text that comes from that audio.
没错,就是语音识别。那时候语音搜索还没现在这么成熟,但能用语音与手机应用交互、在谷歌进行语音搜索的想法确实令人振奋。所以我们想提高语音转写的准确率。
Yes. Speech recognition. Right. At that time, I think voice search wasn't yet as mature as it is today, but the idea that you could talk to your mobile app and use your voice to search on Google, that was really exciting. So we wanted to improve the accuracy of speech transcription.
不过我记得当时语音团队已经开始初步探索神经网络,我们觉得通过帮他们扩大规模,可以推动谷歌语音识别的进步。所以最终有点机会主义——看哪些团队愿意合作,哪些团队能帮我们验证规模假设。
But I think at that time, the speech team was already looking a little bit into neural networks, and we felt that by helping them scale, we could help Google's speech recognition improve. And so it wound up being a little bit opportunistic, based on who wanted to work with us and who we thought we could work with to drive the scale hypothesis.
这既给了你们合作的团队,也帮你们验证自身技术是否达标。毕竟他们有明确的基准线来衡量什么是突破性进展。
That gave you a team to work with and help. It also helped you understand if you were becoming good enough to be useful. Right? Because they had a very clear benchmark
嗯。
Mhmm.
他们认定哪些是难点,相应地也会判断哪些进展算得上惊艳。
Of what they considered hard, and so what they would consider impressive in terms of progress.
是的。我们很幸运能深耕技术创新,比如设计神经网络架构,同时还要快速交付实际业务成果。我记得除了语音项目,还做过谷歌街景——当时用计算机视觉识别街景图像中的门牌号,为谷歌地图提供更精准的房屋定位。结果这个项目的影响力反而超过了语音识别。我们还讨论过如何优化广告业务。
Yeah. So we were fortunate that we could work on deep tech innovation, you know, invent neural network architectures, and at the same time we were held accountable to deliver real business results relatively quickly. So I remember working on speech, working on Google Street View, where at the time we were using computer vision to look at Street View images and read house numbers to more accurately geolocate houses in Google Maps. And that turned out, at the time, to be a bigger-impact thing than speech recognition. We also had conversations on how to help advertising.
记得早期网页搜索团队对此持怀疑态度,我...我花了很大力气说服他们。好在广告团队接受度更高。所以...
I remember some early skepticism about web search. I struggled to convince the web search team at the time. Fortunately, the advertising team was much more open to this. So
我没记错的话,你们当时也在分析YouTube视频内容做违规过滤?
Do I remember correctly that you were also looking at YouTube videos and doing some filtering for inappropriate content?
是的。当时Jay Yagnik的团队已经在YouTube上运行AI,并做了大量优秀工作,比如根据内容帮助标记YouTube视频,以及一些审核过滤。所以实际上有很多外部兴趣——由于我带领的那个约有100名谷歌员工参与的课程,不同应用团队表现出浓厚兴趣。我们很幸运,甚至在早期就有远超我们编制名额的人想加入Google Brain。但有时他们无法全职加入,我们就说一起合作吧,这促成了许多协作关系。
Yes. Jay Yagnik's team at the time was already running AI on YouTube and did a lot of really good work on helping tag YouTube videos based on their content, and also some of the moderation filtering. And because of the class that I led, with about a hundred Googlers, there was actually a lot of interest from different application teams. We were fortunate to have a lot more people wanting to join Google Brain, even early on, than we had headcount for. But sometimes, when people wanted to join us, we just couldn't bring them in full time, so we'd say, you know, let's just work together, and that set up a lot of collaborations.
从你加入X到Google Brain从X毕业并入谷歌,大概不到两年时间?你对这次‘毕业’怎么看?是时机成熟了,还是像被逐出伊甸园,或是介于两者之间?转移到谷歌的过程感受如何?或者觉得反正都是谷歌旗下,无所谓?
It was maybe just under two years, I think, from the day you started at X till Google Brain graduated from X and moved to Google. What did you think of that graduation? Was it about time, or being kicked out of the Garden of Eden, or something in between? How did that process of moving over to Google feel? Or was it, whatever, it's all just kinda Google, so it doesn't matter?
说实话,这些感受都有一点。X过去是、现在依然是个非常特别的地方。记得当初在X大楼工作时,十步之外就是当时的Chauffeur(现Waymo)团队,再远点是气球项目组,还有Glass团队就在我工位旁——所有这些团队都在进行疯狂而令人兴奋的探索。虽然离开X被称作‘毕业’,但最终搬到谷歌核心部门、更贴近业务并获得更多资源并不是坏事。
It was a little bit of all of the above, to be honest. You know, X was, and still is, a very special place. I remember when I was working in the X building way back then, it was really nice that maybe ten feet from me would be the then Chauffeur, now Waymo, team, and then a team working on balloons, and also the Glass team, just several feet from my desk, with all of these teams doing all of these wild, exploratory, insanely exciting things. So while leaving X was presented as a graduation and the next step, I think, ultimately, it wasn't a bad thing that we moved to Google core, became closer to the business, and got more resources.
对此我毫无遗憾,这助推了我们的成功。不过离开那个每天都有疯狂创意在身旁迸发的X大楼,确实带着些许苦涩。
So no regrets with that at all. I think it helped set us up for success. But it was also a little bit bittersweet to leave behind that wildly exciting X building, with crazy stuff happening just a few feet from where I was sitting every day.
团队迁移后发生了什么变化?你们转入谷歌后又待了一年半左右?
What changed after the team moved? You stuck around at Google for another maybe year and a half after it moved?
应该没有。迁移后我们更专注于神经网络和规模化研究,少了和Waymo团队厮混、试乘早期原型车的机会。某种程度上我们变得更‘企业化’——但绝非贬义。
I don't think so. Yeah. I think after we moved, we became more focused on one thing, which was neural networks and scaling. We probably spent less time hanging out with the Waymo people and getting free rides in the very early Waymo prototypes at the time. And then I think we became more, I would hesitate to say corporate, but not at all in a bad way.
对Brain团队而言,与更多谷歌业务建立连接非常有益。我一直坚信:这项技术令人兴奋,我们应该钻研深度技术,但孤立状态下毫无价值,关键在于找到应用场景。搬入谷歌主楼后,我们与重要应用团队仅一分钟步行距离,便于开展合作。后来我逐渐将重心从Google Brain转向Coursera的日常运营。
I think it was helpful for the Brain team that we became much more connected to a lot more Google businesses. One thing I believed back then, and still believe now, is that this technology is exciting, and we should work on deep tech, but in isolation it's completely useless; the value is when we find applications for it. So when we moved into the main Google building, we were physically much closer to a lot of the important application teams, a minute's walk away from different teams building really important applications we could collaborate on. I wound up shifting gradually from Google Brain to running Coursera more day to day.
我和联合创始人Daphne原本通过斯坦福机器学习课程创立了Coursera。由于Google Brain进展顺利,我有信心将团队领导权交给杰出伙伴Jeff Dean;而Coursera作为新生事物更需要日常管理。经过约一年时间,我逐步将主导权移交给他——这个过渡最终也很顺利。
So I had started Coursera out of my machine learning class at Stanford, and my co-founder, Daphne, and I were running it day to day. Partly because Google Brain was going well, I was confident I could hand off the leadership of the team to Jeff Dean, who was a wonderful partner. Coursera, in contrast, was a very early thing and needed much more day-to-day leadership. So I talked to Jeff, and very gradually, over the course of about a year, I handed over the reins to him, and fortunately that worked out well too.
确实。你现在仍是Coursera董事会成员吧?
Yeah. For sure. You're still on the board of Coursera, aren't you?
是的,担任董事会主席。
Yes. I'm chairman of the board.
恭喜你。
Congratulations.
谢谢。
Thank you.
所以我很想听听你对人工智能和机器学习发展方向的看法,以及你之后的经历,现在在做什么。
So I'd love to hear from you a little bit where you think AI and machine learning is going, and also sort of where you've gone afterwards, what you're doing now.
是的。最近这些日子,包括从你、Astro和X那里学到的一些早期经验,我大部分时间都在运营AI Fund,这是一个风险工作室,我们平均每月创建一家新初创公司。我继续通过deeplearning.ai和Coursera做很多AI教育方面的事情。但我认为AI非常令人兴奋。像谷歌这样的公司在训练基础模型方面做得非常出色。
Yeah. So these days, taking some early lessons I learned by watching you, Astro, and X operate, I spend a lot of my time running AI Fund, which is a venture studio; we build an average of about one new startup per month. I also continue to do a lot of AI education through deeplearning.ai and through Coursera. But I think AI is wildly exciting. Companies like Google are doing a fantastic job training foundation models.
我认为最新版本的Gemini确实,你知道,团队做得非常棒。我对基于这些基础模型构建的大量应用感到无比兴奋。感觉每天都有很多很酷的应用,我认为有明确的市场需求,能让人们的生活变得更好,只是还没有人着手去构建。所以我
With the latest version of Gemini, I think the team really did a great job. And I'm wildly excited about the number of applications that will be built on top of these foundation models. It feels like every day there are so many cool applications with clear market demand, that would make people's lives better, where it's just that no one's gotten around to building them yet. So I
觉得这非常令人兴奋。你说每月一家新初创公司。这是它们从你的渠道中产生的速度。它们在你们的渠道中停留多久?
find that very exciting. You said one new startup per month. That's the rate at which they're coming out of your pipeline. How long do they spend in your pipeline?
从想法到启动一家初创公司大约需要六个月。但其中大约一半的时间是用来招聘CEO。一旦招聘到CEO,他们会和我们一起度过三个月,之后,大约有75%的毕业率,25%的情况下我们会决定不继续推进。所以基本上,一旦CEO加入我们,三个月后,我们就会启动一家初创公司。我认为AI领域的一个变化是原型设计的成本大幅下降。
So about six months from idea to launching a startup. But about half of that time goes to hiring a CEO. Once we hire a CEO, they spend three months with us, with, let's say, about a seventy-five percent graduation rate; twenty-five percent of the time we decide together not to move forward. So basically, once a CEO is with us, three months in, we will launch a startup. And I think one of the things that's changed in AI is that the cost of prototyping has gone down dramatically.
所以,如果你有一个想法,现在构建一个原型然后获取用户反馈,验证或否定它,成本非常低。如果被否定,那也没关系。你只是浪费了两天时间和大约5000美元。这确实加快了创新的步伐,尤其是在应用层,我们利用AI构建应用。而不是AI技术基础模型层,那仍然需要数十亿美元的预算和大规模的数据中心建设。
So if you have an idea, it's so inexpensive now to build a prototype, take it to users, and validate or falsify it, and if it's falsified, fine. You just wasted, you know, two days and $5,000 or something. And this is really picking up the pace of innovation, especially in the application layer, where we take AI and build applications, as opposed to the foundation model layer, which still needs these massive billion-dollar budgets and massive data center buildouts.
是的。我把它看作是电力和晶体管之间的区别,这些是二十世纪末计算机行业的基础层,互联网基础设施,所有这些都具有深远的推动作用。但我们还有成千上万的东西需要在那之上构建,以实现其价值。同样,基础模型、机器学习以及这些现在全世界都可以使用的大型模型就像是电力,就像是晶体管。
Yeah. I think about it as the difference between electricity and transistors, the foundational layers of what by the end of the twentieth century was the computer industry, the Internet infrastructure; all of that was profoundly enabling. But we then had tens of thousands of things to build on top of that to realize its value. In that same way, foundation models, machine learning, and these large models that are now available to everybody in the world are like electricity. They're like transistors.
它能够实现无数的事情,但你需要去用它做些什么。
It enables an incredible number of things, but you have to go do things with it.
确实。事实上,如果你回顾美国及其他国家的电气化进程,建造发电厂曾是门极其庞大的生意。许多人投身其中并获利颇丰。但相比之下,消费电子产业及电力驱动的产品规模远超发电行业。我认为人工智能的发展也将遵循这一规律。
Yeah. In fact, if you look at the electrification of the United States and other countries, building electric power plants was a big, great business. Lots of people built electric power plants, and they did very well. But if you look at the consumer electronics industry, or the things built using electricity, that's far bigger than the power plant industry. And I think it will be like that for AI.
要知道,构建AI模型将是个巨大机遇,规模惊人,但远不及在其基础上开发海量应用所形成的集体效应。
You know, building AI models will be huge. It will be massive, but it won't be nearly as massive as the collective set of things people will do by building tons of applications on top.
我欣赏你对人工智能的热忱,稍后我们会继续探讨。你对教育同样充满激情,提到Coursera时你曾担任教授,但我猜你对教育的热情远超实际教学时间。请谈谈这份教育热忱。
I love your passion about artificial intelligence and we'll come back to that one. You also have a passion for education. You mentioned Coursera. You were a professor for a while, but I'm betting that your passion for education is much bigger than the time that you spent actually teaching. Tell me about your passion for education.
我从小受父母教导明白:重点不在于我个人成就,而在于为他人铺就成功之路。在斯坦福教授机器学习时,年复一年面对相同教室讲授相同内容,连笑话都重复使用。后来我开始质疑:这对学生成功真是最有效的方式吗?于是几年间,我逐步将课程视频录制并免费公开,尝试自动评分测验,从可汗学院汲取经验采用短视频形式。
I grew up, you know, trained by my parents and so on to realize it's not about me; it's always about setting others up for success. At Stanford, I remember teaching machine learning and walking into the same room delivering the same lecture year after year, even telling the same jokes. And after a while, I asked, is this really the best use of my time in terms of setting students up to be successful? So over a span of a few years, I wound up getting the videos recorded and posted online, free for anyone to access, prototyping things like auto-graded quizzes, and learning from Khan Academy that we should do shorter videos.
事实上在Coursera爆红前,我们迭代过五个无人知晓的版本,有些用户仅20人左右。这些试错让我领悟如何构建可扩展的在线教育平台。当模式跑通后,我意识到面向大众的教育机遇已至,于是邀请Daphne Koller共同创立了Coursera。
And it turns out that before Coursera suddenly went viral, there were, I think, five other versions that you would never have heard of, some of which had, like, 20 users or something, but they allowed me to learn important lessons about how to build a scalable online educational platform. And when that worked out, I felt there was an opportunity to take education to a very large audience. So I invited Daphne Koller to join me, and we wound up building Coursera from there.
现在聊聊人工智能与机器学习吧。你何时对其产生热情?这种痴迷始于何时?具体年龄是?
And tell me about artificial intelligence and machine learning. When did you get passionate about that? How did that sort of bug start with you? At what age?
记得高中时在某办公室实习,终日与复印机为伴——那实在枯燥得令人抓狂。少年时的我常想:要能发明自动复印的机器,我就能去做更有趣的事了。
I remember it was in high school. I did an internship as an office admin, and I just remember doing so much photocopying. Not my favorite. And frankly, boy, it was boring. I remember as a teenager thinking, oh boy, if only there was some sort of automation that could do all this photocopying for me, maybe I could do something more fun.
因此自幼年起,我就对解放人类时间的自动化技术充满憧憬。幸运的是,当时我父亲——一名医生——正尝试用原始AI算法进行医疗诊断。对复印工作的厌恶与少年时接触的神经网络知识,让我始终将AI视为超级自动化工具来痴迷。
So since a very young age, I was really excited about automation and how it can free up people's time. And I was fortunate also that my father, a doctor, was at the time actually experimenting with very rudimentary AI algorithms for medical diagnosis. So between my distaste for the amount of photocopying I had to do as an office admin and learning about neural networks as a teenager, I think since then I've been passionate about AI as a form of automation on steroids.
既然谈到这个曾遭众人质疑的愿景,你认为未来十年人工智能领域哪些即将到来的变革会出乎大众预料?并非要讨论虚无的通用人工智能,而是那些可能令听众惊讶的具体改变?
So since we started talking about this vision you had, which at the time literally had people getting up and yelling at you: what do you think are some of the things coming down the pipe for us as humanity, with respect to artificial intelligence and machine learning, that people don't clearly see coming yet? I'm not trying to suck you into some unproductive AGI conversation, but how do you think the world's gonna be in, call it, ten years, in ways that might surprise listeners?
我期待全民学会编程——或者说这种AI辅助的新型编程模式。不仅因职业需求,在私人生活中我也常为子女编写程序。比如上周刚开发了生成乘法表卡片的应用,甚至能用手机语音交互定制学习主题,整个原型开发用时不足一天。
There's one thing that I'd love to see: everyone learning to code, with this new style of coding, which is much more AI assisted. And the reason is, in my professional life, obviously, I do a lot of coding, but in my personal life, I write applications for my kids. You know, a week ago, I was writing an application to print out flashcards for my daughter to practice the multiplication table. It took me less than a day to build a prototype that I could call up on my phone and talk to, to get custom prompts and have it talk to me about topics.
但许多过去需要数周或数月构建的原型,现在几乎无需自己动手写代码,就能在几小时或不到一天内完成,因为AI能替你写代码。我认为软件工程的需求极其庞大。我们很多人都希望编写更多程序,但成本一直太高。美国50个州中已有4个州要求高中文凭必须包含计算机教育。我希望这个数字能变成50/50,因为如果能让每个人都学会用计算机创造而不仅是使用,与计算机并肩构建事物。
But a lot of these prototypes that used to take weeks or months to build can now be built in hours, or maybe less than a day, without writing much code yourself, because AI can write the code for you. And I think that demand for software engineering is massive. So many of us would love for so many more programs to be written; it's just been too expensive. I think four states in the United States out of 50 already require some computing education to get a high school diploma. I hope it will become 50 out of 50 states, because if we can get everyone to learn how to use computers to build things, not just to be users of computers, but to build alongside computers,
我认为每个人都能变得更强大。未来最重要的技能之一将是让计算机执行你想做的事,因为计算机正变得越来越强大。如果我们以新的编码方式教会所有儿童编程,这样的世界将让下一代比现在强大得多。
I think every human can be much more powerful. It turns out that one of the most important skills going forward will be the ability to get computers to do what you want them to do, because computers are becoming more and more powerful. And I feel like a world where we teach all children to code, in this new way of coding, will set up the next generation to be much more powerful than the current one.
那么在美国之外,甚至超出最发达经济体的范围,你认为AI会如何更广泛地影响世界?
And how about outside the United States, maybe even outside of the larger, most developed economies? How do you think AI is going to affect the world more generally?
我希望AI能产生巨大的民主化效应。因为当今世界最昂贵的事物之一就是智力——聘请高技能专科医生诊断病情,或是雇佣一对一高中家教辅导孩子都花费巨大。虽然我看不到降低人类智力成本的途径(培养专业人才本就昂贵),但降低人工智能成本是可行的。这意味着现在只有相对富裕的人才能雇佣特定人员提供服务。
I hope that AI will have a very large democratizing effect, because one of the most expensive things in today's world is intelligence. It costs a lot to get a highly skilled specialist doctor to tell you what's going on, and it costs a lot to get a tutor to mentor your high schooler one on one. And whereas I don't see a path to making human intelligence cheap, it just costs a lot to train a skilled human being, there is a path to making artificial intelligence cheap. What this means is that today, only the relatively wealthy can hire certain types of staff to do certain types of things for them.
但在未来,我期待每个人都能拥有一个由聪明博学的'员工'组成的军团来协助处理各类事务。我认为这将...
But in the future, I would love for every single person to have kind of an army of smart, well-informed staff to do all sorts of things for them. And I think this would
他们的健康顾问、家庭教师之类的角色。
Their health advisor, their tutor, that kind of thing.
是的。我认为赋予每个人一个现在仅限富裕阶层享有的'员工军团',这将提升很多人的生活。
Yeah. Yes. And I think that giving everyone an army of staff to help them, which today is available only to the relatively wealthy, will lift up a lot of people.
我很好奇你的定义——虽然有点开玩笑,但作为长期从业者(我知道你显然也是),我一直觉得AI就像不断后退的边界线:当某项技术开始奏效并融入日常生活,我们就不再称之为人工智能。所以我最爱的实用定义是'电影里电脑会做的事',这某种意义上很不公平。记得电脑下棋超越人类的那一刻吗?
I'm really curious to hear your definition. This is a little tongue in cheek, but I've always felt, as a longtime practitioner, and I know you are as well, obviously, that AI has traditionally been this receding frontier: as things start to work and become an everyday part of our lives, we stop calling them artificial intelligence. So my favorite working definition of artificial intelligence is the things that computers do in the movies, which in a certain sense is totally unfair. But you remember that moment when computers started being better than people at chess.
突然间那就不算智能了,因为电脑比人强。我当时想:好吧,这可不是个有用的智能定义。那你会如何定义人工智能?
And all of a sudden, that didn't count as intelligence anymore, by definition, because computers were better than people at it. And I was like, well, okay, that's not a very useful definition of intelligence then. How would you define artificial intelligence?
AI成功的因素之一在于:虽然有时感觉目标永远遥不可及,但这个领域对任何想加入并称之为AI的研究都很包容。就我个人而言,如果有人让计算机展现出某种智能迹象并想称之为AI,我完全赞同。我们持开放态度——你想把自己的工作称作AI?没问题。这总比太多人四处否定强。
Yeah. One of the things that I think has contributed to AI's success is that, while on one hand it sometimes feels like it's always far off, it's a field that's been pretty embracing of whoever wants to enter it and call their work AI. So I find that, just for myself, if someone is doing something that makes a computer demonstrate some semblance of intelligence and they want to call it AI, fine with me. I will agree with them. I think the fact that we're quite embracing, that if you wanna call your work AI, it's okay, as opposed to too many of us going around saying, no.
那并不算真正的AI。这让我们的领域得以持续发展。
That's not really AI. This lets our field keep growing.
或许可以在此基础上延伸,我们会广义地称之为AI——只要人类做出类似行为时,我们会称那种行为是智能的。
So maybe as a build on that: we would call it AI, in a very general sense, to the extent that if a person did the same thing, we would call that behavior intelligent.
对。是的。没错。而批评观点认为,你知道,非常简单的程序用if语句就能做基础决策。这是智能,但这真的是人工智能吗?
Yeah. Yes. And the criticism is, you know, a very simple program can use an if statement to make simple decisions. Is that intelligence, and is that really artificial intelligence?
我很乐意说:是的。如果你觉得它智能,就叫它AI吧。我完全支持这种观点。而且我认为,当学科领域能包容有效的方法而非过度防御地说'你不属于我们'时,往往会更成功。
I'd be glad to say yes. If you think it's intelligent, call it AI. I will fully support that. And I find that disciplines tend to be more successful when we just embrace whatever works, rather than being too defensive and saying, no, you're not one of us.
我觉得AI避开了这种狭隘思维。
I think AI has avoided that.
我同意。所以Google Brain在X部门的巅峰成果是那篇引起《纽约时报》大量报道的论文,关于猫和猫视频的。能说说这项被重点报道的工作背后有什么故事吗?具体突出了哪些内容?
I agree. I agree. So the culmination, in a way, of Google Brain's time at X was a paper that was published to quite a bit of fanfare in the New York Times, about cats and cat videos. Can you tell us a little bit about what led up to the work that was featured there? What specifically was being highlighted?
因为这可以说是Google Brain的首次公开亮相时刻。
Because this was sort of the coming out moment for, Google Brain.
是的。我们通过那篇现在有点'臭名昭著'的谷歌猫论文宣布了Google Brain。记得当时我们有个想法:要获取足够的学习数据,我们希望从无标注数据中学习。标注数据需要人工查看图片并标注'这是狗''这是猫''这是人'。
Yeah. We announced Google Brain through that now slightly infamous Google cat paper. I remember we had this idea that to get enough data to learn from, we wanted to learn from unlabeled data. Labeled data means you have someone look at pictures and say, this is a dog, this is a cat, this is a person.
这些标注工作极其耗时。但我们想从无标注数据学习,具体来说,我们构建了一个可能是当时全球最大的神经网络,让它浏览海量YouTube视频,从画面中自主学习。记得当时我在斯坦福的博士生、Google Brain实习生郭乐(现仍在谷歌做杰出工作)有天喊我:'吴恩达,来看看我电脑上的东西'。我走过去,看到他屏幕上有个算法通过观看YouTube视频自主发现的——略显模糊的黑白猫脸图像。毕竟YouTube上猫视频确实多。但关键在于,这个算法在没有任何人类干预的情况下,仅通过分析海量数据就'发现'了猫脸的存在。
Those labels are very labor intensive. But we wanted to learn from unlabeled data, and specifically, we built a very large neural network, quite likely the largest in the world at that time, which would go to YouTube, just watch tons of YouTube videos, and see what it could learn from them. And I remember it was my then Stanford PhD student and Google Brain team intern, Quoc Le, who's still here at Google and doing great work, who one day called me over and said, hey Andrew, take a look at what I have on my computer. I walked over, and he showed me this ghostly picture of a slightly fuzzy black-and-white cat that the algorithm had discovered all by itself by watching YouTube videos, because, stereotypically, there are a lot of cat videos on YouTube. But the fact that an algorithm, just by looking through tons of data, without any human intervention to tell it there's even such a thing as a cat, had, quote, discovered this cat face by itself.
那真是个惊人的突破性时刻。
That was an incredible breakthrough moment.
没错。你有一句广为人知的名言。我很想请你分享一下关于人工智能与工作的那句名言,是什么让你产生这种观点,以及你认为人类未来如何最好地与人工智能协作?
Yeah. Exactly. You have a quote that's somewhat well known. I'd love it if you'd share the quote about AI and work, what's led you to believe it, and how you think humanity can best work with artificial intelligence going forward.
我认为当前每位知识工作者都能从AI获得显著的生产力提升,但AI还远不能自动化大多数人能做的所有事情。这意味着我不认为AI会取代人类,但会使用AI的人将取代不会使用AI的人。这里我转述我的朋友Curt Langlotz最初针对放射科医生说的观点。更广泛地说,如今我很难想象雇佣一个不会使用谷歌搜索的员工——在知识经济时代,这简直不可思议。
I think every knowledge worker right now can get a significant productivity boost from AI, but AI is still far from automating everything that most people do. And what that means is, I don't think AI will replace people, but people that use AI will replace people that don't. I'm paraphrasing my friend Curt Langlotz, who first said this about radiologists. But more generally, today I can't imagine hiring an employee for most roles who doesn't know how to do a Google search. It's just strange, in the knowledge economy, to not know how to search on Google.
未来我认为对大多数岗位而言,我们根本不会雇佣那些无法高效使用AI的人。
In the future, I think for most roles, we just won't hire anyone that doesn't know how to use AI in a really effective way.
在多元领域。
In polygram.
但话虽如此,薪资往往会随着生产力水平逐步调整。AI将极大提升人类生产力,因此我认为许多人通过AI加速工作后,实际收入会大幅增长。
But having said that, it turns out that salaries often get adjusted over time to productivity. So AI will make people much more productive, and I think a lot of people will actually do much better financially, and get paid a lot more, by becoming faster with AI as well.
是的,我同意。这确实有很多令人兴奋和期待的地方。对了,能否分享些你在X时期的趣事,或是推动登月计划成功的关键要素?
Yeah. I agree. There's a lot to be excited about and hopeful for. Any thoughts, either some fun stories from your time at X, or takeaways on what it takes to successfully drive a moonshot?
在你们和Sebastian领导下的早期X公司里,最珍贵罕见的就是这种信任催化的氛围。记得有次Waymo团队的人突然问我:'Andrew,想试乘无人驾驶车吗?'我当即答应,然后就坐进早期Waymo原型车,在芒廷维尤市区兜了一圈。
One thing that was really precious and rare in the early days of X, under your and Sebastian's leadership, was the trust and the cross-fertilization of ideas. I remember one day someone from the Waymo team just came by and said, hey, Andrew, wanna ride in a driverless car? I said, sure. And then I hopped into one of the early Waymo prototypes and rode around downtown Mountain View in a driverless car.
这种开放共享的文化,愿意尝试疯狂创意的态度——其中有些最终成就非凡——实在难能可贵。
And I think that degree of openness and idea sharing, and the willingness to just do weird things, some of which really worked out brilliantly, is really rare and precious.
确实要感谢你。这种交流肯定是双向的,那次试乘想必也给了你灵感。但我敢说Waymo同样受益匪浅——如今他们大量运用基础大模型技术,这种交叉滋养是真正双向互利的。
Oh, yeah. And thank you. And I'm sure it went in both directions. You probably got inspiration and ideas from Waymo from that ride even, but I guarantee you it went in the other direction as well. Like fast forward to today and Waymo uses large foundation models for a lot of what they do.
对吧?这种交叉融合正如字面意义所示,是双向奔赴的共赢。
Right? So that cross-fertilization is literally cross. It goes in both directions and helps both parties.
是的。而且我真的很喜欢那时候的每个人,我相信现在也是,大家都有种感觉,你知道,就是觉得自己在做有意义的工作。对吧?你不是来干些无聊的事情的。但我记得,实际上,拉里过去常常会问大家一个问题:如果你做的事情超乎想象地成功了,会有人在意吗?
Yeah. And I really liked that everyone back then, and I'm sure even now, felt like you're doing work that matters. Right? You're not showing up to do some boring thing. But I remember, actually, Larry used to go around and ask people: if what you're doing succeeds beyond your wildest dreams, will anyone care?
这其中的潜台词很明显,就是去从事那些你能给出肯定回答的工作。我觉得这种感觉很棒,即使在那个时候,在整个X公司里,人们都觉得,虽然不确定会不会成功,但如果成功了,天哪,会有很多人关注的。没错。我想分享一个我个人特别执着的东西,那就是速度。
And the obvious implication being, you know, go work on something that you can answer yes to. I think that felt really good; even back then and throughout everywhere at X, people felt like, I don't know if it'll work, but if it works, boy, a lot of people are gonna care. Yeah. Exactly. I wanna just share one thing I personally tend to obsess over, which is speed.
因为当你创新时,几乎从定义上来说,你并不真正知道自己在做什么,而你能快速执行并尝试多种不同事物的能力,我认为是成功的重要因素。我发现,当我面试求职者时,每个人都会说自己知道如何快速行动,但实际执行速度的差异巨大,轻松就能差出10倍,甚至100倍。我曾与一些领导者交谈,他们可能在15分钟的对话后就做出决定;而另一些领导者在类似情况下会说,很好,让我们研究三个月,三个月后再讨论。
Because when you're innovating, almost by definition, you don't really know what you're doing, and your ability to execute really quickly and try lots of different things is an important component, I think, of success. And what I find is that when I interview people for jobs, everyone will say they know how to move fast, but there are dramatic differences, easily 10x, maybe 100x, between the paces at which people execute. I've spoken with leaders who will have a fifteen-minute conversation and just make a decision. And I've also spoken with leaders who, in a similar situation, will say, great, let's study this for three months, and we'll reconvene in three months.
我发现这些差异非常显著。创新的关键部分在于,对于像谷歌这样的大公司,你不希望某个工程师的随意行为导致谷歌网页搜索瘫痪,这是不可接受的。但X公司通过创造一个安全的环境,在很大程度上解决了这个问题。
And I find that there are these dramatic differences. And part of the key to innovating is, I think, that for a big company like Google, you don't want one random engineer to do something that takes down Google web search. It's just not acceptable. But X, for the most part, solved this by creating a safe environment. Yeah.
我们可以在Google Brain上随心所欲地尝试,不必担心会意外搞垮谷歌网页搜索——反正我也没有那个权限。这让我们能快速行动、大胆实验。我认为,这种在确保没人能做出危及母舰的疯狂行为的前提下保持高速执行的能力,这种组合很难实现,但X成功做到了。
We could do whatever we wanted on Google Brain. There was no risk that I would accidentally do something to take down Google web search, although I didn't have the authority to do so anyway. That let us move really quickly and try things out. And I think that combination, speed of execution with the sandboxing and guardrails to make sure no one can do something crazy to take down the mothership, is hard to create. I think X managed to do that.
谢谢,安德鲁。我同意。我对这个理念的理解是学习循环的紧密程度。我不在乎我们花多长时间达到卓越或发现走错了路,我在意的是从提出假设到获得可评估结果之间的时间间隔。
Thanks, Andrew. I agree. My version of that same mantra is the tightness of the learning loops. I don't care how long it takes for us to get to greatness, or to figure out that we're on the wrong track. What I care about is the length of time between hypothesis and results that we can assess.
没错。如果这个过程需要一小时或一个月,那两种情况简直天壤之别。
Yeah. If that takes an hour or that takes a month, like, you're in a different universe in the second case relative to the first case.
说得好。很多创新其实关乎学习——如果我们早知道答案,一周就能重建整个系统。所以关键在于弄清楚要构建什么以及如何构建。
Yeah. Well said. A lot of innovation is about learning, because if only we knew, we could have rebuilt the whole thing in a week. So it's about figuring out what to build and how to build it.
完全正确。说得太对了。真是精彩绝伦。谢谢你和我进行这次对话,安德鲁。
Exactly. Exactly. Well said. Absolutely wonderful. Thank you for doing this with me, Andrew.
也谢谢你,阿斯特罗。
And thank you, Astro.
关于 Bayt 播客
Bayt 提供中文+原文双语音频和字幕,帮助你打破语言障碍,轻松听懂全球优质播客。