本集简介
从工程师角度看,公司的规模如何?
What is the scale of the company from an engineer's perspective?
综合来看,我们每天捕获超过一万亿次事件,涵盖消费者互动、以及支持决策制定的各类产品和服务活动。
When you add that up, we have more than a trillion events that we're capturing every day between consumer interactions, things that are happening across products and services that support decision making.
Live是去年推出的重要项目。
Live was a big launch last year.
我们从Paul Tyson那里学到了很多,因为那是个规模空前的活动。
We had a lot of learnings from Paul Tyson because it was such a large event.
我经常提到……
I've often mentioned...
那是全球规模最大的。
It was the world's largest.
对吧?
Right?
6500万条并发流。
65,000,000 concurrent streams.
看着数字不断攀升,那可能是我们注册量最高的一天。
Watching that tick up, I think one of our biggest ever days of sign ups.
现场大概有100人。
There were probably 100 people on-site.
我当时和大约三四十人同处一室,包括工程师和数据科学家。
I was sitting in a room with maybe 30 or 40, both engineers and data scientists.
我们架着笔记本电脑和临时屏幕在那里工作。
We had our laptops and makeshift screens sitting there.
每当我想起我们为保罗·泰森筹备的那段日子,我常跟人开玩笑说,那一晚让我折寿十年。
When I think about where we were for Paul Tyson, I joke with people, I feel like I lost ten years of my life in that one night.
我们没有正式的绩效考核,这可能是第一件不同寻常的事。
We don't have formal performance reviews, which is probably the first unusual thing.
所以在Netflix,我们的处理方式是
So the way we approach it at Netflix is
Netflix无需过多介绍,但其规模仍会让许多人感到惊讶。
Netflix needs no introduction, but its scale can still surprise many people.
但作为一名软件工程师,在流媒体公司工作是什么体验?
But what is it like to work at a streaming company as a software engineer?
我采访了Netflix的首席技术官伊丽莎白·斯通,了解更多细节。
I sat down with Netflix CTO Elizabeth Stone to get more details.
在今天的对话中,我们将探讨Netflix面临的独特工程挑战,包括三年直播业务的经验总结、Netflix的工程原则、为何伊丽莎白最推崇'求知若渴'理念、Netflix为何取消绩效考核及其替代方案、公司如何运用AI工具、以及他们发现的动物检测分析这一绝佳应用案例等更多内容。
In today's conversation, we cover the unique engineering challenges at Netflix, including the learnings from three years of Netflix live, Netflix's engineering principles, and why Elizabeth's favorite is Yearn to Learn, how Netflix has no performance reviews, and what they do instead, how the company uses AI tools, and why animal detection and analysis is a great use case that they found for them, and many more details.
如果你希望深入了解Netflix软件工程师的工作机制,以及如何在他们所处的环境中表现出色,那么本期节目正适合你。
If you are interested in understanding more about how Netflix works as a software engineer and what it takes to do well in the kind of environment they operate in, then this episode is for you.
本期播客由Statsig赞助播出,这是集功能开关、分析、实验等功能于一体的统一平台。
This podcast episode is presented by Statsig, the unified platform for flags, analytics, experiments, and more.
查看下方节目说明,了解更多关于他们及本季其他赞助商的信息。
Check out the show notes below to learn more about them and our other season sponsor.
那么,伊丽莎白,欢迎来到我们的播客节目。
So, Elizabeth, welcome to the podcast.
谢谢。
Thank you.
感谢邀请,欢迎来到Netflix。
Thank you for having me and welcome to Netflix.
能来到Netflix真是太好了。
It is so nice to be here at Netflix.
我正坐在导演椅上。
I'm sitting in a director chair.
哦,看到导演椅了。
Oh, seeing director chairs.
上面还印着Netflix的标志。
It has Netflix on it.
这是在Netflix办公室内部,感觉真的很特别。
So, it's inside the Netflix offices, which truly feels special.
这让我想起Netflix本质上是一家娱乐公司。
And it just reminds me that Netflix is an entertainment company at its core.
是啊。
Yeah.
这里创造了许多魔法般的成果。
A lot of magic happens here.
即使在技术团队,我们也非常重视视频等内容。
Even in the tech teams, we take things like video very seriously.
很多人当然是通过视频内容、电影认识Netflix的。而在幕后,从工程师的角度看,公司的规模有多大?
A lot of people know Netflix, of course, from the video offerings, the films, the movies. Behind the scenes, what is the scale of the company from an engineer's perspective?
我能理解这个业务的规模有多大吗?
Can I make sense of how large this operation is?
这可能比人们意识到的规模要大得多。
It's probably larger than people realize.
经常有人在我的私人生活中问我:'真的需要那么多工程师来构建Netflix产品吗?'
Very often, I'll get questions from people in my personal life saying, Well, how many engineers can it really take to build the Netflix product?
首先,当你思考如何让技术运作得如此完美以至于近乎无形时——理想状态下用户完全感受不到技术存在,只需享受体验——这确实需要相当多的工程师。
So, first of all, it takes quite a few when you think about how do you make the tech work so well that it's basically seamless, in some ways ideally invisible, because members just get to enjoy their experience.
但除此之外,作为技术团队,我们还要为影视工作室制作、广告技术栈等开发工具和产品。
But then, we also, as a tech team, build tools and products for studio productions, our advertising tech stack.
我们构建了大量游戏开发平台功能和发布能力。
We build a lot of the developer platform capabilities and launch capabilities for games.
想想所有与商业相关的内容:套餐方案、定价系统、支付流程、合作伙伴关系。
Think about anything related to commerce, plans, pricing, payments, partnerships.
这些都由技术团队提供支持。
Those are all things that are supported by the tech team.
把这些加起来,我们每天要处理超过万亿次事件——包括用户交互数据、跨产品服务数据以及决策支持系统的各种信息。
So, when you add that up, we have more than a trillion events that we're capturing every day between consumer interactions, things that are happening across products and services that support decision making.
所以现在这已经是个相当庞大的全球性企业了。
So, it's quite a global enterprise at this point.
你提到的某些方面,我觉得在其他大公司也较为典型,比如支付系统建设、广告平台开发这类工作。
Some of the things that you mentioned, I feel, are kind of somewhat typical at other large companies, for example, building payments, building ads, building some of those things.
但你提到了一些我在其他公司从未听过的内容,比如为自家制片工作室定制开发软件。
But you mentioned things that I haven't heard in any other company, which is, for example, building custom software for your production studio.
确实。
Yeah.
在你们构建的软件栈或软件中,有哪些部分是Netflix独有而工程师在其他地方难以找到的?
What are some of the parts of the software stack, or the software that you're building, that sound like they might be pretty unique to Netflix, that you might not have found elsewhere as an engineer?
确实,这很大程度上是我们的超能力——将技术引入娱乐产业。
Yeah, it's actually very much part of our superpower that we've been able to bring technology to entertainment.
我们是全球最大的制片厂之一。
We're one of the biggest studios in the world.
我们在思考如何为这些制作解决独特问题时具有优势。
And we have an advantage in thinking about what are some of the problems we could uniquely solve for those productions.
典型的例子比如我们的媒体制作套件,它彻底革新了原本陈旧、缓慢且昂贵的全球创意团队间媒体文件传输方式。
So, good examples would be things like our media production suite, which took something that was a fairly antiquated, slow, and expensive way for media files to travel across creative teams around the world, and really modernized what that looks like.
比如当某个制作在欧洲拍摄,而审片人员坐在洛杉矶时,他们能实现媒体文件全球流转、实时批注反馈,第二天就能投入新一轮制作。
So that if you have a production that's shooting somewhere in Europe, and you have someone sitting in Los Angeles who's gonna review daily clips and footage, they're able to have those media files travel around the world, provide notes or input on that, have those notes travel back, and be ready for another day of production just the next day.
这类媒体制作套件和进度监控工具,正是Netflix极具创新性的体现。
So, things like that media production suite or other tools that allow us to monitor how are we progressing on some of the productions, is something that's very novel from Netflix.
我们通过Scanline和Eyeline(几年前收购的视效工作室)进行前沿研究,在数据采集、视效技术和拍摄策略创新方面突破传统摄像技术的局限。
We also have a big presence through Scanline and Eyeline, which is a visual effects studio that was an acquisition a few years ago, that does really cutting edge research and technology for things that affect how we do data capture, visual effects, different ways to think about strategies that bring life to productions that wouldn't be easy just based on standard camera technology.
这背后的工程技术面临哪些挑战?
In terms of the engineering work behind that, what are some of the challenges here?
因为我听到你提到电影文件之类的
Because what I'm hearing is you're saying, you know, movie files, etc.
对我来说,这让我联想到海量数据处理问题。
To me, what rings a bell is like, it'll be like large amounts of data, probably.
我猜延迟可能是个有趣的挑战。
I assume latency might be an interesting challenge.
当你想到同一时刻有数百甚至数千个项目在同时进行时,这个规模简直难以置信。
It's unbelievable scale when you think about hundreds or thousands of productions that are in progress at any given moment.
哦,确实如此。
Oh, yeah.
因此,在所有那些项目中,你会遇到媒体文件,它们尤其庞大、复杂且难以迁移。
So, across all those productions, you have media files, which are also especially large, complex, and difficult to move.
关于规模问题,要考虑存储成本、计算成本以及数据传输成本。
So, about scale, think about the cost of storage, compute, and travel of that data.
延迟问题在某些情况下取决于具体应用场景。
Latency, in some cases, it depends on the use case.
有些情况下,采集的媒体内容可以次日再审核。
For some cases, it's capturing media that is going to be reviewed in a way that can be the next day.
但对于其他场景,比如直播制作,媒体传输必须做到近乎即时。
But for other things, we've got media traveling for things like live productions that has to be essentially instantaneous.
我想说的另一个独特挑战是我们追求的质量标准。
The other thing I would say about some of the unique challenges in that space is the level of quality that we're trying to bring.
当你考虑这些挑战时,如何制作高质量图像视频——无论是内容本身还是平台上的推广内容。
So, think about some of the challenges: how do you create very high-quality images and videos, whether that's the content itself or the promotion of that content on the service?
要达到这样的标准需要克服许多工程挑战。
There are a lot of engineering challenges that come with meeting that type of bar.
另一个外界熟知的,就是Open Connect内容分发网络,Netflix拥有这个极其独特的系统。
The other thing that's familiar externally, which I'm sure you've heard about, is Open Connect, our content delivery network. It is extremely unique that Netflix has that.
这是我们十多年前做出的重大决策——自建内容分发网络。
It was a big bet that we made more than ten years ago to build our own content delivery network.
其规模常常令人惊叹。
And the scale of that often surprises people.
全球6000个地点,覆盖超过175个国家。
So, 6,000 locations around the world, more than 175 countries.
这使我们能够为电影、电视和游戏部署本地文件,确保会员无论在哪里点击播放,都能获得极低延迟和极高画质。
And that actually allows us to place local files for film, TV, games that you're going to play, so that there can be very low latency and very high quality for members no matter where they click play.
本质上,这些服务器分布在城市内的6000个不同地点,构成了你们的边缘网络,对吧?
Basically, these are servers at 6,000 different locations inside cities and whatnot. It's like your edge network, right?
没错。
That's right.
我们与互联网服务提供商合作,确保当用户在手机、电视或笔记本电脑上点击播放时,内容能通过最后一英里传输到会员或消费者手中。
And we integrate with internet service providers so that when someone clicks play on their phone, on their TV, or on their laptop, the content actually gets through that last mile to the member or the consumer.
在Netflix内部工作时,你能接触到这些细节层面的信息,这可能是其他公司的工程师无法企及的。
And then, when you're inside Netflix, you get to be exposed to some level of this detail, which I guess most engineers at other places wouldn't be.
在那些地方,这部分由CDN供应商负责,就是个黑箱。
It would be the CDN provider, and it would be a black box.
但你们正在构建这套系统,对吧?
But you're building this thing, right?
我们正在构建这套系统。
We're building this thing.
当我们考虑新内容类型时,这给了我们惊人的先发优势。
And it's been an incredible head start as we think about new content types.
当我们开始涉足直播、游戏特别是云游戏或流媒体游戏时,考虑到我们影视内容的广度,Open Connect已成为巨大的战略优势,我们正将其扩展到不同类型的内容交付。
So when we started to go into live, into games, especially cloud or streaming games, as we think about just the breadth of our film and TV offering, Open Connect has been a huge strategic advantage, and we're extending that to be able to deliver different types of content.
另一个独特之处在于,Open Connect作为内容分发或边缘网络,是内容流转漫长生命周期中的最后一环。
The other thing that's unique is Open Connect, as a content delivery or edge network, is sort of the end of a very long integrated life cycle that content moves through.
我之前提到过那个媒体制作套件。
So I mentioned that media production suite.
那是用于影视制作的。
That's on a studio production.
文件正在传输以供审核,确保质量符合创意愿景。
Files are being transferred for review, quality, making sure that we're aligning with the creative vision.
当一部作品准备上线时,会经过我们设计的其他流程:宣传物料是否就位?
Once a title is ready to launch, that flows through other pipelines that we think about, Do we have the promotional assets?
我们是否已准备好向目标观众提供精准推荐?
Are we ready to give great recommendations to the right audiences?
我们如何编码这些文件,使其真正准备好通过Open Connect内容分发网络传输?
How do we encode those files so that they're actually ready to be transmitted through Open Connect as the content delivery network?
我们有时把这个过程戏称为'从提案到播放'。
When you think about that, sometimes we lightly call it pitch to play.
整个生命周期都贯穿着工程元素——这很特殊,因为其他很多公司都没有像Netflix这样逐步建立起端到端的流程体系。
There's an element of engineering all the way along that lifecycle, which is unusual because at many other companies, they haven't built that end to end pipeline themselves, as Netflix has over time.
什么是'从提案到播放'?
And what is pitch to play?
想象从内容团队批准某个提案的那一刻开始:'好,我们要开发制作这个项目'。
So, think about from the moment a title is pitched, that someone in the content team green lights, Yes, we're going to develop and produce this title.
有数据科学团队和工程团队协助支持这些内容规划决策。
There's data science teams, there's engineering teams that help to support those decisions on programming.
还有技术团队协助支持内容创作、内容推广、推荐算法,以及最终的交付环节。
Then there's tech teams that help support the creation of that content, the promotion of that content, the recommendations, and ultimately the delivery of that.
所以,技术基本上支撑着整个生命周期。
So, tech basically underlies that whole life cycle.
哇。
Wow.
听起来这涉及更多工作流程。
So, this sounds like a lot more workflows.
通常在工程领域,当我听到“流水线”这个词时,我们会想到CI/CD流水线。
Usually, when I hear the word pipeline, you know, in engineering, we would think of a CI/CD pipeline.
我想我们对这个非常熟悉。
And I think we're very familiar with that.
你知道,有代码审查、测试运行,流程就这样进行下去。
You know, you have your code reviews, tests run, and it kind of goes on.
但我理解的范围要大得多。
But what I understand is that this is just a lot bigger.
想象一下那个CI/CD流水线乘以数千倍,因为它实际上要跟随一个完整的'为会员赋予内容生命'的周期。
Imagine that CI/CD pipeline times thousands, because it's actually going to follow an entire "bring content to life for members" cycle.
现在正是提到我们本季赞助商Linear的好时机。
This is a great time to mention Linear, our season sponsor.
毕竟,Linear的诞生部分归功于企业在规模化过程中遇到的痛点。
After all, Linear was born partially thanks to the pain points that happened during companies scaling up.
Linear的创意源自其创始人在Airbnb、Coinbase和Uber经历超高速增长阶段时。
The idea for Linear came about when their founders were going through hyper growth phases at Airbnb, Coinbase and Uber.
正如你在真实规模中预料的那样,这些公司开始放缓发展速度。
As you'd expect with real scale, these companies started to slow down.
过去只需数日完成的工作开始需要数周甚至数月,并非因为人们不够努力,而是需要协调的环节大幅增加。
What used to take days started taking weeks and sometimes even months, not because people worked less hard, but because there were a lot more moving parts that needed to be coordinated.
每当适应规模扩张时,总会新增各种工作流程、规范以及必须处理的事务。
Whenever you're adapting to scale, you often pick up new workflows, processes, just things you need to do.
久而久之,这就形成了实实在在的工作流程债务。
Over time, this creates real workflow debt.
软件工程师在这方面往往受影响最大。
Software engineers often get hit the hardest here.
我们都经历过这种情况:创建问题时必须勾选五个复选框和三个标签,只为确保某人的仪表板能正确显示数据。
Having to check five boxes and three labels when creating an issue just so someone's dashboard populates correctly, we've all been there.
正是这些步骤的不断累积拖慢了组织效率,也让工程团队倍感沮丧。
It's this accumulation of steps that slows orgs down, and it frustrates engineering teams.
对于已转向Linear的公司(如OpenAI、Coinbase、Scale),这一转变就像是对所有这些流程债务的一次彻底重置。
For companies that have made the switch to Linear, like OpenAI, Coinbase, and Scale, the move has been like a hard reset on all this process debt.
效果非常显著。
The results are striking.
以Scale为例,通过改用Linear,他们的缺陷解决时间缩短了一半。
Scale, for example, cut their bug resolution time in half by switching to Linear.
如果你对转型感到好奇,实际操作比想象中更简单。
If you're curious about making the switch, it's more straightforward than you think.
Linear原生支持从Jira、GitHub问题、Asana导入数据,过渡期甚至可以并行使用新旧系统。
Linear has native imports for Jira, GitHub Issues, and Asana. You can even run them side by side during the transition.
团队也乐意配合你开展4-6周的试点运行,与现有工具并行使用以验证效果。
The team is also happy to work with you to run a four to six week pilot alongside your existing tool just to prove the impact.
看看linear.app/switch这个网站。
Check out linear.app/switch.
他们有一份迁移指南,会带你走完全部流程。
They have a migration guide that walks you through the entire process.
现在回到节目内容。
And now back to the episode.
看到这个更长的流程线,我的第一反应是它听起来会很僵化,但显然Netflix的行动非常迅速。
And now, my first association seeing this longer pipeline is, it sounds like it would be rigid, but of course, Netflix is moving really fast.
从软件工程师的角度,从工程团队的角度来看,完成一个项目是什么样的情况?
What does it look like from a software engineer's perspective, from an engineering team's perspective, getting a project done?
这些通常是怎么完成的?
How are these typically done?
是基于某种时间表来执行,还是更加灵活?
Is it based on following some sort of schedule, or is it a lot more elastic?
这里可能体现了Netflix文化的独特性。
This is probably where the uniqueness of Netflix's culture comes into play.
我们很多工程系统、产品和工具的构建方式,很大程度上是由个人贡献者驱动的,他们思考如何构建这些系统。
So, a lot of the way that our engineering systems, products, and tools were built was highly driven by individual contributors, thinking about how to build those systems.
因此,创新真正是由团队内部驱动的,而非自上而下。
So, innovation is really driven from within the teams, rather than top down.
在构建方式上,我们拥有很大的自主权和本地决策权。
There's a lot of autonomy and local judgment in how we build things.
这使我们能够构建这种端到端的视角,以我们认为最高效、最优质的方式交付内容,并让我们能够随着新需求的出现,灵活调整其中的各个组成部分。
That has allowed us to build this end to end view of how to deliver content in a way that we think delivers the best quality, most efficiently, and allows us to play with the puzzle pieces of that as we have new needs that come up.
正如我提到的,我们已经在点播视频、电影和电视方面做了很多准备。
So like I mentioned, we had many of those things in place for video on demand, film and TV.
当我们转向直播领域时,工程师们需要重新思考:鉴于直播的特殊要求,我们必须如何改变内容交付的思维方式。
When we went into live, engineers needed to rethink how are we going to have to change how we think about how we're delivering content, given the requirements of live.
他们能够基于我们已有的基础进行开发,同时在如何升级系统和产品以支持新内容类型方面拥有大量自主决策权。
They were able to start with what we've already built, but also have a lot of their own decision making on how to evolve all of our systems and products to be able to deliver new content types.
因此长期以来,Netflix的构建方式主要由团队内部的工程师驱动,而非自上而下的顶层设计——比如先画好架构图再按图施工。
So over time, the way Netflix has been built has been very driven by engineers within the teams, rather than some top down overarching, Let me draw architecture for you and now let's build it in that direction.
这种方式既有优势,也让我们不得不随着时间推移不断演进。
Which has both advantages, but also things that we've had to evolve towards over time.
因为随着公司规模扩大,扩展性成为更大挑战,我们要确保构建方式能支持这种增长,而不是一成不变。
Because as the company becomes much bigger, scale becomes more of a challenge, we wanna make sure that we're building things in a way that supports that, so it's not like static.
用你的话说,这种弹性正是让我们能够为今天的Netflix(而非十年前的Netflix)进行工程创新的关键。
And that elasticity, to use your word, is something that has allowed us to actually engineer for what Netflix requires today versus what it required ten years ago.
我们能具体聊聊直播吗?
Can we talk specifically about Live?
因为直播是去年最重要的发布之一。
Because Live was a big launch last year.
我记得杰克·保罗和迈克·泰森那场拳击赛是个重大事件。
I remember the big boxing match between Jake Paul and Mike Tyson was a huge event.
能否分享一下这个项目是如何启动的?工程团队是如何参与的?是最初只有一个小团队吗?
Can you give us a little insight into how that project started, how engineering teams got involved, like was it a small team?
还是多个团队共同参与的?
Was it multiple teams?
我猜当时肯定有多支团队在协同工作。
I'm assuming there must have been multiple teams working together.
另外,你们推动这个项目上线的流程是怎样的?
And, again, what was your process for getting this to launch?
是经过周密规划的吗?
Was it overly planned out?
还是纯粹靠碰运气?
Was it just YOLO?
或是介于两者之间?
Something in between?
碰运气。
YOLO.
也不完全是碰运气。
Not quite YOLO.
我们的首个直播节目是克里斯·洛克专场,我记得是在2023年3月。
Our first live title was a Chris Rock special, I believe in March 2023.
那是我们首次为Netflix会员提供直播内容。
That was our first time bringing live to Netflix members.
由此开启了一段高强度时期——如果以保罗·泰森那场比赛为终点,那是在2024年11月。
And that started what was a very intense period. If I take it through to that Paul Tyson match, that was November 2024.
所以从首次尝试直播,到史上最大规模流媒体赛事,前后差不多十八个月。
So, you think about that as basically eighteen months from our first ever outing on live, to the largest streamed event ever.
保罗·泰森那场比赛最终创下了这个纪录。
Which is what Paul Tyson ended up being.
这种方式得以实现的关键在于紧迫感、大量临时拼凑的方案,以及我刚才提到的工程师们的努力。
The way that came to life was with urgency, a lot of scrappiness, and like I mentioned, engineers making it happen.
通常我们会设定一个目标,比如我们原计划在2024年7月举办保罗·泰森的活动。
So, typically, we would set a goal saying, so we've got Paul Tyson scheduled for July 2024.
由于泰森的健康原因,活动被推迟到2024年11月,这给了我们多几个月的准备时间。
It was rescheduled because of Tyson's health to November 2024, which gave us a few more months.
想象一下Open Connect团队、编码团队、内容制作与推广团队、内容发现团队都在思考:该让哪些合适的人参与进来共同实现这个目标?
But picture teams from Open Connect, Encoding, our content production and promotion teams, our discovery teams, thinking, Who are the right people to lean in here and help bring this to life?
但他们自发组织起来,制定自己的路线图,思考哪些人需要负责哪些事项,以及我们需要确保哪些系统能够真正支撑直播的稳定性。
But they self organize, they develop their own roadmaps, they think about who needs to be on point for what things, what are some of the systems that we need to make sure are actually resilient enough for live.
从头到尾的时间安排都极其紧张。
It was an incredibly tight timeline, end to end.
更不用说我们从保罗·泰森活动中学到了很多,因为那是个超大型活动。
Not to mention that we had a lot of learnings from Paul Tyson because it was such a large event.
是的。
Yeah.
我常……
I've often...
那是全球规模最大的,对吧?
It was the world's largest, right?
6500万同时在线观看量。
65,000,000 concurrent streams.
看着数字不断攀升,我想那是我们有史以来注册量最大的一天。
Watching that tick up, I think one of our biggest ever days of sign ups.
那天,我们看着报名人数直线飙升。
So, we were watching the sign ups go through the roof that day.
接着观察时,我想当我们进行到最初几场垫场赛时,比赛的规模已经超出了我们的预期。
Then watching, I think by the time we were in some of the first couple of undercard fights, we'd already exceeded our expectations for how big the fight would be.
洛斯加托斯这间发布室里的能量简直触手可及。
The energy in the launch room here in Los Gatos was palpable.
你能感受到那种兴奋、紧张,以及工程师们实时解决问题的真实场景。
You could feel excitement, nervousness, very real-time problem solving by engineers.
因为没人见过那种规模的场面。
Because no one had ever seen scale like that.
所以你们需要实时摸索如何实时交付内容。
So you're figuring out in real time how to deliver something in real time.
团队找出需要调控的关键环节以保持系统稳定,这让我感到无比自豪。
I've never been prouder of the team for figuring out what are the levers we need to pull to keep this as stable as possible.
带着2024年11月这些经验,我们只有五周时间为圣诞节的两场NFL比赛做准备——为会员和粉丝提供优质服务的标准非常高。
With those learnings in November 2024, we had about five weeks to be ready for two American NFL football games on Christmas Day, where the bar is very high to deliver well for members and for fans.
团队立即借鉴保罗·泰森的案例,开始思考:如何构建更强的容错能力?
And so the team immediately took the learnings from Paul Tyson to say, how do we build greater resilience?
如果某些市场最终出现带宽限制,我们该如何规划内容调度?
How do we think about how we're gonna direct content if we end up bandwidth constrained in some markets?
如何通过质量调控手段真正优化用户体验?
How can we really optimize by using some of our quality levers for what that experience is gonna be?
最终这些NFL比赛呈现得完美无瑕。
And those NFL games ended up being flawless.
从克里斯·洛克到《爱情盲选》的失败案例,再到保罗·泰森在该规模下积累的大量经验,再到NFL,如今每周的WWE赛事,更不用说其他众多大型活动——这一切都是由地面团队不懈努力驱动的,他们不断追问:我们如何把这件事做好?
That, from Chris Rock, to a Love is Blind failure, to Paul Tyson with lots of learnings at that scale, to NFL, and now weekly WWE, not to mention lots of other big events, that was all driven by teams on the ground being relentless and saying, How do we do this well?
我们需要解决哪些问题?
What are the problems we need to solve?
快速学习的责任并不意味着我们不会遭遇失败。
And there's accountability to learn fast. It doesn't mean we're not going to have failures.
但当我们失败时,要快速学习、迭代并交付更好的成果。
But when we have failures, learn fast, iterate, and deliver ever better.
这正是Netflix文化精髓的体现。
That's really where the beauty of the Netflix culture comes into play.
我目睹了同样的模式在我们自建广告技术栈、游戏交付能力以及新电视UI发布中的成功应用。
And I watched the same thing happen standing up our own ads tech stack, being able to deliver games, launching our new TV UI.
每个项目都凝聚着工程师团队的智慧结晶——他们不断探索最佳实现方案。
Each of those was a group of engineers coming up with what's the best way to bring this to life.
你提到了控制室,但我想大多数人没有这种体验。
You mentioned control room, but I think most people have not had this experience.
这是现场直播活动。
It's a live event.
当然,你可以调整各种参数。
Of course, you can tweak things.
能给我描述下控制室是什么样子的吗?
Can you explain to me what was the control room like?
我猜肯定有各种指标的数据仪表盘对吧?
I assume it must have been a bunch of dashboards on all sorts of metrics, right?
是有点像那样吗?
Was it a little bit like that?
是的。
Yeah.
所以,就连我们的仪表板都是全新的。
So, even our dashboards were brand new.
它们是
They were
你们是专门为那场活动搭建的?
You built them for that event?
工程团队临时搭建的。
The engineering team put it together.
数据科学和工程团队共同搭建了一套仪表板,让我们能直观看到一些核心体验质量指标。
The data science and engineering team collectively put together a set of dashboards that would give us visibility into some core quality of experience metrics.
比如渲染时间、应用启动时长、重新缓冲率等数据。
So things like time to render, app start timing, rebuffer rates.
正是在保罗·泰森期间,我们开始看到重新缓冲率飙升。
It was the rebuffers that we started to see amp up during Paul Tyson.
现场大概有100人。
There were probably 100 people on-site.
我当时和大约三四十人同处一室,既有工程师也有数据科学家。
I was sitting in a room with maybe 30 or 40, both engineers and data scientists.
我们都带着笔记本电脑和临时搭建的屏幕在那里工作。
We had our laptops and makeshift screens sitting there.
所有人都通过有线方式接入互联网,以免在……上冒任何风险。我们还准备了VPN备份,以防出现问题。
Everyone was hardlined into the internet so that we weren't risking anything with the... We had VPN backups if anything was to go wrong.
我们有一位戴着耳机的发射指挥官,与制作车里的工作人员保持通话。
We had a Launch Commander with a headset dialed in, talking to people in the production truck.
但这套系统是全新的。
But it was new.
所以它还不够流畅,不够完美,不像我想象中许多直播制作通常拥有的那种高级发射室。
So it wasn't streamlined, it wasn't perfect, it wasn't like the fancy launch rooms I imagine a lot of live productions typically have.
于是我们监控着各项指标,你知道,有些东西开始闪红,引起你的注意。
And so we're watching metrics, you know, some things start flashing in red, causing you to draw attention to it.
我们会临时创建Google Meet会议室,让小组人员能够快速处理问题。
We would create makeshift Google Meet rooms, so small groups of people could triage.
我们安排了专人负责担任信息主管或决策者。
We had people who were on the hook for being the informed captains or decision makers.
所以如果我们遇到Open Connect问题、播放问题或发现功能问题——用户真的能找到节目吗?
So if we have an issue with Open Connect, if we have an issue with Playback, if we have an issue with Discovery, are people actually able to find a title?
我们在发射计划中基本都安排了对应人员。
There were people that we had in basically the launch plan.
想象一下,我觉得那份文档最终写满了40到50页的'如果-那么'条件语句。
Imagine, I think the document ended up being 40 or 50 pages of if then statements.
如果发生这种情况,那么怎么办?
If this thing happens, then what?
这对我们来说是全新的。
It was new for us.
当我回想我们为保罗·泰森所做的一切时,我常开玩笑说那一夜让我折寿十年。
When I think about where we were for Paul Tyson, I joke with people, I feel like I lost ten years of my life in that one night.
因为压力实在太大了,而我却无法通过键盘操作提供任何实质帮助。
Because it was so stressful, and there's no hands on keyboard thing I can do to help.
我的职责是支持团队,信任团队做出的决策。
I'm there to support the team, to trust the team in making decisions.
现在看我们处理NFL比赛、卡内洛对克劳福德的拳赛和WWE的方式,在系统韧性建设、指标看板和事件可视化方面都成熟得多。
When I look now at how we're doing NFL games, or the Canelo-Crawford fight, or WWE, it's much more sophisticated in the sense of the resiliency we've built, the metrics and dashboards, the visibility we have into what's happening.
最初起步阶段主要依赖人工操作,这也正是我们获取经验的重要来源。
That was very human driven when we were first coming out of the gate, which was a lot of where the learnings came from.
但我常对团队说:能在一家成熟企业里从零打造新事物的机会多难得?
But I say to the team, how often do you get to work at a mature company, but build something truly from scratch like this?
要考虑所有可能出错的情况并做好准备,这样当问题发生时才能保持冷静。
And think about all the things that might go wrong and how to be prepared for them, and then stay very calm and cool under pressure if something happens.
这个故事最让我感兴趣的是,你提到团队做了大量准备工作,40-50页的'如果-那么'预案,这比大多数项目上线要细致得多。
So, what strikes me as very interesting about this story is you mentioned that the team had done a bunch of preparation, you mentioned 40 or 50 pages of if-then-else, which sounds way more detailed than what I've seen for most launches.
通常会有上线计划,但这次情况复杂所以准备更充分。
You usually have a launch plan but again, it was a complex launch so you're prepared.
即便如此,《爱情盲选》和杰克·保罗比赛还是出现了小插曲。
Even so, there were hiccups both with Love is Blind and with the Jake Paul match.
能说说团队是如何处理后续问题的吗?
Can you tell me how the team handled the aftermath?
虽然无责任复盘很常见,但我更好奇这个过程有多正式——是依靠员工主动担责,还是有更严格的流程?或者就是聚在一起解决问题?
Again, it's pretty common to have blameless post mortems, but I'm more curious about how formal this process is versus how much it is driven by people stepping up. Do you have more rigid processes around this, or is it just getting together and people go and fix things?
正如你所说,进行无责的事后分析或回顾并不罕见。
Like you said, it's not uncommon to have a blameless postmortem or retro.
所以,这一点是肯定的。
So, we have that for sure.
讨论从某件事中学到的东西比过分关注谁做错了什么要有趣得多。
It's much more interesting to talk about the learnings from something than to overly focus on who did what thing wrong.
我们从中实际获得的并不多。
There's not much that we actually gain from that.
这不是一个僵化的流程。
It's not a rigid process.
它非常自然地发生,由那些熟悉工作本身并深感有责任对做得好的和可以改进的地方进行反思的人主导。
It happens very organically, and it's led by the people who are close to the work itself and feel tons of accountability for doing reflections on both what went well and what can we do better.
以保罗对泰森那场比赛为例,之后那几天的心情相当复杂。
If I take Paul Tyson as a good example, it was a pretty complicated set of emotions in the couple of days afterwards.
我们正在庆祝有史以来规模最大的直播活动。
We're celebrating the biggest ever livestream event.
远超我们曾经期望的规模。
Way bigger than we ever could have hoped for.
我们庆祝自己在一家敢于如此大胆尝试的公司工作。
We're celebrating that we work at a company that takes such a big swing.
我们庆祝在达到2000万、3000万、4000万并发流时没有崩溃。
We're celebrating that we didn't collapse when we got to 20,000,000, 30,000,000, 40,000,000 concurrent streams.
如果有人告诉我会有6500万并发流,我肯定会说这不会顺利。
If someone said to me, You're going to have 65,000,000 concurrent streams, I would have said, This is not going to go well.
尽管如此,是的,我们遇到了一些小问题。
And yet, yes, we have hiccups.
我们始终致力于提供优质的会员体验。
We always want to deliver a great member experience.
我很想说我是第二天早上才醒来的,但其实我整晚都没睡,一直在想:我们接下来该怎么做?
I would like to say I woke up the next morning, but I was awake the whole night, thinking about, what do we do next?
距离NFL开赛只剩五周了。
We only have five weeks till NFL.
我醒来时看到团队已经写好的一叠备忘录,上面写着'这是我们观察到的现象'。
I woke up to a set of memos where the team had already written down, Here's what we observed.
'这是我们认为可以改进的地方'。
Here's what we think we can improve.
'这些是我们应该立即优先处理的事项'。
Here's some of the things we should immediately prioritize.
其中部分内容是:'当出现拥堵时,我们该如何疏导流量?'
So some of that was, How do we direct traffic when we get congested?
我们的算法在那一刻是如何考虑流量引导的?而当我们遭遇拥堵时,我们又希望它们如何运作?
What were our algorithms doing to think about directing traffic in that moment, versus what do we want them to do if we end up congested?
同时也在思考:当系统承受这种压力时,我们该如何优雅地降级或回退。
And thinking about what are ways that we can gracefully fall back or degrade when we're under that type of duress.
在这次事件中,我们无法在系统实际运行前预见到所有情况。
There was nothing in that event that we could have created before seeing what happens to the systems live.
所以我醒来时还在想“我们得做这个、得做那个”,结果这些都已经写在文档里了。
And so, I woke up thinking, we're going to have to do this, we're going to have to do that, and it was already there in the document.
团队对如何做得更好有着强烈的责任感,我们甚至不需要走流程说‘现在该做复盘了’。
The team feels such accountability to how do we do this better, that we don't really require a process to say, Now we should do a postmortem.
现在我们应该对此进行反思。
Now we should develop reflections on this.
团队直接承担着这份责任。
The team owns that very directly.
我们的文化备忘录中有关于‘异常负责’的表述。
We have language in our culture memo about being unusually responsible.
这确实是团队的人才特质。
That's really the talent on the team.
这与高人才密度密不可分。
It comes with high talent density.
这源于像对待成年人一样对待员工,让他们在决策中拥有高度自主权。
It comes with treating people like adults, where they get a lot of autonomy in making decisions.
然后他们会对结果负责。
And then they own the outcomes.
他们持有这样的心态:其中有很多令人振奋的部分。
And they have a mindset, which is: there was a lot that was exciting about that.
我们还有很多可以改进的地方。
There's a lot we can do better.
现在就让我们去改进吧。
Let's go do better now.
所以我在那里是为了提供意见,提出一些问题以便理解并准确传达。
So I'm there to help provide input, to ask some questions so that I understand and I can represent it accurately.
但几乎从不需要领导者说'我们现在必须做这件事',因为这是由团队驱动的。
But it almost never requires a leader to say, now we must do the following thing because it's so driven by the teams.
我不得不问这个问题,但我已经感觉到答案了。
I have to ask this question but I'm sensing the answer to it already.
谈到工程文化,在许多公司里,当你作为一名新工程师加入时,你会四处询问'嘿,我需要遵循哪些流程?'
When it comes to engineering culture, at a lot of companies, you go into a company as a new engineer and you ask around saying, hey, what are the processes I need to follow?
因为在许多公司都有强制性的代码审查。
Because at a lot of companies there's a mandatory code review.
如果你要发布一个功能,可能需要使用功能开关,比如在移动应用上。
If you launch a feature, you might need to use a feature flag, let's say on the mobile app.
在代码审查中,你可能需要获得某些人的签字批准。
On the code review, you might need to have certain sign-offs from people.
所以有一整套流程,CICD必须始终通过,你不能绕过它。
So there's a bunch: CI/CD always needs to pass, you cannot override it.
在Netflix工程团队中,有多少这类事情是在全局层面规定的,每个人都必须遵守,团队可以自行决定还是仅基于工程团队或工程师个人的判断?
In the Netflix engineering team, how much of this is laid down at a global level that everyone needs to follow, versus left for teams to decide, or just based on the judgment of the engineering team or the engineers themselves?
是的。
Yeah.
很多都留给了工程团队和工程师自己决定。
A lot is left to the engineering team and the engineers themselves.
所以,即使是新入职或初级人才也是如此。
So, even for new or early career talent.
我们逐渐形成的一个理念是,即使回想多年前Netflix引入混沌猴子概念时,让每位工程师都有责任理解他们的系统何时会崩溃、如何保持韧性、检测问题并快速恢复,这始终是文化的核心部分,也是我们持续坚持的理念。
One of the things we've evolved towards: even if I think back however many years to when Netflix introduced Chaos Monkey as a concept, the idea of an individual engineer having responsibility to understand how and when their system will break, and how they're going to be resilient, detect that, and recover quickly, was just a core part of the culture, and something that we continued to maintain.
几年前我们的主要工作,让我来谈谈直播前后的关键分界线。
A lot of where we were a few years ago... let me talk about pre- and post-live, which is probably the useful threshold.
在直播前,点播业务让我们有很多机会进行明智的风险尝试,因为我们积累了多年经验,知道系统出问题时该如何修复。
Pre live, there were a lot of ways to take smart risks with video on demand, because we had many years under our belt of understanding when something breaks, how are we going to fix that?
当时各团队需要自行考虑所需的测试范围和系统韧性。
And it was left to teams to think about the extent of testing and resilience that they needed to have.
我们还有优秀的待命团队和支持团队应对突发问题。
And great on call teams and support teams for when something does go wrong.
当我们引入直播业务时,标准就完全不同了,因为用户是在实时观看。
When we introduced live, there's a different threshold because you have to watch it live.
不可能出现'Netflix将临时下线维护'这种情况,就像我们过去在可控风险下处理问题时那样。
There's no such thing as Netflix is gonna be down temporarily while we address something where we were taking a smart risk.
起初这很可怕,但随着我们逐步建立安全护栏——特别是对直播关键路径中的零级和一级应用,我们制定了更严格的测试标准,确保系统能承受直播活动的压力。
That was scary at first, but as we started to introduce what are the guardrails we need to have to do this safely, it was things like introducing, especially for tier zero or tier one applications that were in the critical path for live, a higher threshold for what's the testing that you're doing to make sure that your system is ready for the duress it may be under in a live event.
我们将分享这些指导原则。
We will share those guidelines.
它们来自我们的核心工程团队。
It comes from our central engineering team.
这让团队能够减少流程负担。
And it gives people an opportunity to have less process.
因为他们可以说:如果我通过了这些指南要求,完成了这些测试,就不必像直播期间那样进入静默期。
Because they're able to say, If I pass these guidelines, if I've done this testing, I don't need to be in a quiet period, for example, during a live event.
或者我们已经完成端到端测试,深刻理解系统依赖关系,能够为可能的故障做好准备。
Or we've done end-to-end testing so we know those system dependencies very deeply, and we're able to prepare for the what-ifs when something goes wrong.
但这并不是一个非常结构化的流程,比如代码审查,或者你必须勾选这些检查项,又或是某种用于实际部署代码的闸门功能。
But that's not a very structured process, like a code review, or you must check these boxes, or some type of gating function for code being actually deployed.
我们确实在年底假期期间设有静默期。
We do have quiet periods during the end of year holidays.
我认为这相当普遍。
I think that's pretty common.
我们会围绕直播活动制定一些基本规则,以确保不承担不必要的风险。
We'll have some rules of the road around live events to make sure we don't take unnecessary risk.
但我们不断在寻找方法,如何缩短这些静默期?
But we are constantly finding ways to, how do we reduce the quiet periods?
如何将更多决策权下放给团队?
How do we leave a lot of judgment to the teams?
然后他们需要对自己的服务出现的任何问题负责。
And then they're accountable for anything that goes wrong with their service.
我这样理解对吗:相比流程,你们更关注的是系统的影响,比如你们有分级系统,零级(我想是最重要的),然后是一级,依次往下?
And do I sense it correctly that instead of the process, what you focus on is, let's say, the impact of the system? You have the tiering system: tier zero, I guess the most important one, then tier one, and it goes down.
还有工具,比如静默期或其他方式,团队可以用这些工具来管理风险。
And then the tools, for example, quiet periods or other ways that the teams can then use to manage risk.
没错。
That's right.
其中很多措施都是我们进入直播阶段后新引入的。
And a lot of those were new introductions once we entered live.
所以团队在很大程度上是自主运作的,依靠自己的判断力和责任感。
So the teams were doing that very much on their own, with their own judgment and accountability.
在直播业务中,我们最终不得不采取更结构化的方式,因为这是新领域,风险更高——必须实时观看,且涉及众多团队协作。
With live, we ended up in a situation where we had to be more structured about it, because it was new, because it was higher risk since you must watch it live, and because live is something that touches so many teams.
回到我们之前关于内容管道的讨论,想想直播流程:从摄像机到转播车,再到源站或云端,最后接入我们的内容分发网络——这需要众多系统在直播活动中实时交互。
Going back to our conversation around our content pipeline, if you think about live from the camera to the production truck, to the origin or cloud, to then being able to get to our content delivery network, there's a lot of systems that need to talk to one another in real time for a live event.
因此出现了一系列新的潜在故障点。
So there was a new set of things that could go wrong.
在完全掌握这些连接节点之前,我们需要制定更多指导原则来规范流程。
Until we were very confident that we knew those connection points, we wanted to introduce more guidelines for how to think about that.
这就是我们引入分级系统的背景。
That's when we introduced the tiering system.
当时我们还规定了哪些服务系统应考虑加入静默期。
It's when we introduced some things for which services or systems should think about being part of the quiet period or not.
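[Editor's sketch] The tiering and quiet-period idea described here can be pictured as a small policy check. This is only an illustration under stated assumptions: the `Service` fields, the tier cutoff, and the `deploy_allowed` function are hypothetical names invented for this sketch and are not Netflix's actual tooling or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    tier: int                # hypothetical: 0 = most critical, larger = less critical
    on_live_path: bool       # hypothetical: in the critical path for live events
    passed_guidelines: bool  # hypothetical: met the central team's testing guidelines


def deploy_allowed(service: Service, quiet_period_active: bool) -> bool:
    """Return True if the service may deploy right now under this toy policy."""
    if not quiet_period_active:
        return True
    # Services outside the live critical path are not slowed down.
    if not service.on_live_path or service.tier > 1:
        return True
    # Tier 0/1 services on the live path skip the quiet period only if
    # they have already met the stricter testing guidelines.
    return service.passed_guidelines
```

In this sketch, a tier-zero service on the live path is held back during a quiet period unless it has passed the guidelines, while unrelated services keep shipping, which mirrors the stated goal of not slowing down teams that have nothing to do with live.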
但相比直播业务初期,我们已经大幅简化了这些限制——因为更希望减少约束,避免拖累与直播无关的团队自主改进和创新的节奏。
But we've already dialed a lot of those things down from when we first started with live because our preference is to not have to have so many constraints because it slows down too many teams to be able to make their own improvements and innovations that have nothing to do with live along the way.
我们不愿为了单一重点领域而拖慢其他业务板块的发展。
And we don't wanna actually slow down other parts of the business in favor of just one priority area.
伊丽莎白刚谈到Netflix在达到一定规模后,必须为高风险系统设置基本防护措施。
Elizabeth was just talking about how, above a certain scale, Netflix had to put some basic guardrails in place for high risk systems.
但他们始终担忧过度约束会降低团队效率。
But they kept being worried about introducing too many constraints because that would slow teams down.
这种对可靠性与速度的双重追求并非Netflix独有。
This problem of wanting both reliability and speed is not unique to Netflix.
但顶尖企业总能找到合适的工具来培育正确文化。
But the best companies figure out what the right tools are to enable the right culture.
无论是持续开发还是实验优先的方法,这些文化价值观都需要合适的工具和基础设施。
Whether it's continuous development or experimentation first approaches, these cultural values need the right tooling and infrastructure.
而这正是我们的主要赞助商Statsig发挥作用的地方。
And this is where Statsig, our presenting sponsor, comes in.
Statsig构建了一个统一平台,同时支持持续交付和实验两种文化。
Statsig built a unified platform that enables both cultures, both continuous shipping and experimentation.
功能标志让你能自信地持续交付,先向10%用户发布,及早发现问题,必要时立即回滚。
Feature flags let you ship continuously with confidence, roll out to 10% of users, catch issues early, roll back instantly if needed.
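[Editor's sketch] A percentage rollout like the "10% of users" mentioned here is commonly implemented by hashing a stable user ID into a bucket. The code below is a generic illustration of that technique, not Statsig's SDK or API; the function name `in_rollout` and its parameters are assumptions made for this example.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percent: float) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    # Hash flag and user together so each flag gets an independent split.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < percent * 100  # percent=10 admits buckets 0..999
```

Because the bucket depends only on the flag name and user ID, a given user sees a consistent experience across requests, and setting `percent` to 0 acts as an immediate rollback.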
内置实验功能意味着每次发布都自动成为学习机会,通过统计分析准确展示功能对指标的影响。
Built in experimentation means every rollout automatically becomes a learning opportunity, with proper statistical analysis showing you exactly how features impact your metrics.
由于所有功能都在同一平台上(包含产品数据、分析、会话回放等),整个组织的团队可以协作做出数据驱动决策。
And because it's all in one platform with the same product data, analytics, session replays, everything, teams across your organization can collaborate and make data driven decisions.
像Notion这样的公司使用Statsig后,从每季度个位数实验增长到300多个实验。
Companies like Notion went from single digit experiments per quarter to over 300 experiments with Statsig.
他们通过功能标志交付了600多项功能,在快速推进的同时防止指标回退。
They shipped over 600 features behind feature flags, moving fast while protecting against metric regression.
微软、Atlassian和Brex选择Statsig也是出于同样原因。
Microsoft, Atlassian, and Brex use Statsig for the same reason.
这是能同时实现规模化速度和可靠性的基础设施。
It's the infrastructure that enables both speed and reliability at scale.
说到规模,Statsig每天处理数万亿事件。
Speaking of scale, Statsig processes trillions of events per day.
所以无论你是初创公司还是建设OpenAI级别的项目,这个平台都能与你共同成长。
So whether you're a startup or building at OpenAI scale, the platform grows with you.
你可以轻松将其集成到现有的产品数据栈中。
You can integrate it into your existing product data stack easily.
如果你有兴趣建立持续开发和实验的文化,请访问statsig.com/pragmatic。
If you're interested in building a culture of continuous development and experimentation, go to statsig.com/pragmatic.
他们提供慷慨的免费额度,5万美金的创业项目支持,以及实惠的企业方案。
They have a generous free tier, a $50,000 startup program, and affordable enterprise plans.
只要告诉他们是一位务实工程师推荐你来的。
Just tell them The Pragmatic Engineer sent you.
言归正传,让我们继续讨论Netflix的工程文化。
With this, let's get back to the conversation about Netflix's engineering culture.
当人们或工程师收听/观看这段录音时,很多人会频频点头感叹:'真希望能在这样的地方工作——我们可以自主决策,决定承担多少风险,并制定相应流程。'
So, when people or engineers will be listening or watching this recording, a lot of them will just be nodding like, yeah, wow, I'd love to work at a place where, you know, we can make these decisions and decide how much risk we take and, you know, the process we set.
据我所知,你们能做到这点的一个重要原因是——我认识Netflix的工程师们——你们对人才标准要求极高。
One reason that you can do this, I know for a fact because I know engineers at Netflix, is you have a very high bar for talent.
而且这个传统从创立之初就一直保持着。
And you've always had that from the very beginning.
事实上,如果我没记错的话,Netflix前25年里软件工程师只有'高级软件工程师'这一个职级。
In fact, for the first twenty-five years of Netflix, if I'm correct, the only software engineering level used to be senior software engineer.
能否谈谈你们的招聘标准?这个'高门槛'具体指什么?
Can you talk about how you go about hiring, what this bar is?
根据你的经验,Netflix是如何做到在长达数十年里保持单一职级体系、没有其他层级、仅凭这个高标准运作的?
And, in your experience, how did Netflix manage it? It's the only company that, again, for decades only had this one level; there were no other levels, and there was just this high bar.
这套机制是如何运转的?
How did it work?
Netflix的人才密度始终让我惊叹不已。
I continue to be amazed by the talent density at Netflix.
五年前刚加入时,我几乎不敢相信这一点。
I almost didn't believe it before I joined a little over five years ago.
眼见为实。
I'll believe it when I see it.
我认为理解Netflix人才密度的方式在于:我们文化中的许多方面,包括人才密度,都是实现卓越工作的手段。
I think the way to think about talent density at Netflix is a lot of the aspects of our culture, including talent density, are a means to excellence in our work.
这些都不是最终目的。
None of them are the endgame.
无论是'无规则'、'流程最小化'、'高人才密度',还是'情境管理而非控制',我们所讨论的这些要素都是让团队追求极致的关键。
So saying "no rules," "no process," "high talent density," or "context not control": any of the things that we're talking about are key elements of getting to a group of people who strive to do the best possible work they can.
Netflix能在没有复杂职级、规则和流程的情况下走到今天,这本身就向人们传递了一个信号:我们对你们期望很高。
Being able to get through so much of Netflix's history without the complexity of levels, or rules, or process helped to signify to people, We're expecting a lot of you.
我发现当有人说'我对你期望很高'时,这很能激发人性中向上的力量。
And I find it's a very human thing when someone says, I'm expecting a lot of you.
人们就会全力以赴做到最好。
That people step up and do the best work they can.
所以在某种程度上,'我们拥有高人才密度'这种说法会自我强化。
So in some ways, it builds upon itself to say, We have high talent density.
我们追求卓越。
We expect excellence.
你拥有高度自主权,同时也承担重大责任。
You have a lot of autonomy, but also a lot of accountability.
最优秀的人才会在这样的环境中脱颖而出。
That the best people will thrive in that situation.
他们不会被许多可能围绕这种情况的事物分散注意力。
They're not distracted by a lot of the things that you could surround that with.
他们知道标准很高,并渴望达到那个标准。
They know that the bar is high and they want to meet that bar.
与我共事的所有人都有这种感觉。
All the people I work with feel that way.
你不需要告诉他们该做什么,他们会主动努力做到最好。
You don't need to tell them what to do, they lean in to try to do the best possible work.
随着时间推移保持这种状态,特别是随着公司规模的扩大,是一个挑战。
Maintaining that over time, especially as we've scaled as a company, is a challenge.
因此,每当团队从100人增长到1000人再到几千人时,你必须考虑搭建什么样的框架来确保我们能保持那种文化精神。
So, anytime a team grows from 100 to 1,000 to a few thousand, you have to think about what's the scaffolding you put around that to make sure we can maintain the spirit of what that culture was.
所以随着时间的推移,我们不再只有单一层级。
So, as for things that have changed over time: we no longer have just a single level.
当我们考虑工程或更广泛的技术组织发展时,并非每个岗位都需要拥有十年、十五年或二十年经验的人。
So as we think about growing as an engineering or as a tech organization more broadly, not every role requires somebody who has ten, fifteen, twenty years of experience.
有些岗位非常适合刚毕业的大学生或只有几年工作经验的人。
Some roles are a great match for someone who's newly out of college or has a couple years of work experience.
但你需要考虑对这类人员的期望和薪酬待遇会有所不同。
But you would want to think about the expectations for that person, the compensation for that person being different.
所以我们确实开始围绕职级构建了一些框架体系。
So we did start to build some scaffolding around levels.
并不是说我们现在想要建立过多的框架,以至于让人感到窒息。
That's not to say we now want to have so much structure that it's suffocating.
我们希望保持独立性和责任感带来的所有优点。
We want to maintain all the great things about a lot of independence and accountability.
但我们甚至没有共同语言来讨论:我该如何组建一个拥有更广泛人才类型的团队?
But we didn't even have vocabulary to talk about, How might I construct a team to have a broader array of talent?
这就是我们过去几年做出的改变之一。
So that's one of the things we've changed over the last couple of years.
我们还必须考虑引入职级体系后带来的问题,比如个人贡献者路径或管理路径。
We've also had to think about the things you get as you introduce levels, like IC or people management pathways.
每个职级应该达到哪些期望?
What are the expectations that you have at each level?
其中部分与技能相关。
Some of that is about skills.
每家公司都会在这方面有所体现。
Every company is going to have that reflected.
但我们发展路径和沟通方式中很重要的一部分是文化因素。
But a big part of what we have in our pathways and our ways of talking are the cultural things.
你是否能激励身边的同事?
So do you uplift other people around you?
你的工作是否体现了卓越和责任感?
Do you deliver a lot of excellence and accountability in your work?
我们对达成这些标准的人设定了很高的门槛。
And we hold a high bar for people meeting those things.
这是明智的判断。
That's good judgment.
这是为Netflix的最佳利益着想。
That's thinking about what's best for Netflix.
我们的工程原则包括:为未来的团队打造产品,让他们感谢你今天所做的工作——这意味着不要走捷径。
Some of our engineering principles are things like building for the future teams who are going to thank you for the work that you did today, which means don't take shortcuts.
打造高质量、持久耐用的产品。
Build high quality, durable products.
比如'全球思考,本地行动'这样的理念。
Things like think globally, act locally.
意思是:在做出本地决策时,要考虑对整个技术部门乃至Netflix的广泛影响。
Meaning, think about the broader ramifications across the tech organization or Netflix, even as you make your local decision.
或许我个人最喜欢的是'渴望学习'这一条。
And maybe my personal favorite, which is Yearn to Learn.
这是'保持好奇心'的趣味化表达。
It's a nice memeified phrase of be curious.
要思考:'我是否在用正确的方式思考正确的问题?'
Think about, Am I thinking about the right problem in the right way?
这就是我们定义高人才密度文化的方式,这些是我们的工作准则,我们期待每个人都做到,无论职级高低。
That's how we've named high talent density, to say those are our ways of working, and we expect that from everyone, no matter what your level is.
你必须警惕事物产生的诱因。
And you have to watch out for the incentives that things create.
人们很容易会说:'我要做那些对我个人有利的事,而不是对团队或公司有利的事。'
It's very easy to say, I'm going to do the thing that I think puts me in a better position rather than my team or the company.
我们确实努力劝阻这种行为,同时大力赞扬那些做出无私行为或承担不太光鲜、不太显眼工作的人,只要这对其他人更有利。
And we really try to discourage that and really celebrate when people do the thing that's selfless or does the less glamorous or less visible work, as long as it's better for everyone else.
我发现当人们以这种方式行事时,会持续吸引并留住最优秀的人才。
I find that when people behave that way, it continues to attract and retain the best talent.
你提到你们正努力避免让人们分心的事情。
You mentioned that you're trying to not have things that distract people.
当我还是经理时,确实有件事让我分心——每半年准时到来的绩效评估季,我们简称它为'perf',这会占据我一个月的生活,尤其是年底时。
Now, when I was a manager, one thing that did distract me was, every six months on the dot, performance review season, or as we just called it, perf. It was one month of my life, especially at the end of the year.
甚至更久,是的。
Even longer, yeah.
甚至更久。
Even longer.
每当我与其他工程经理交谈时,比如我刚和一位朋友聊过,他说'是啊,perf要来了,所以这段时间我帮不了你'。
And whenever I talk with fellow engineering managers... I just caught up with a friend and he was saying, oh yeah, perf is coming up so I won't be able to help you in this period.
你如何应对这种必要的恶行,或者说必要的绩效管理流程?
How do you go about this necessary evil or just necessary process of performance management?
因为据我了解,你们的做法很不一样。与此相关的还有众所周知的'留任测试'(Keeper Test),Netflix在官网上也分享了这点。
Because I understand it's very different. And related to this, there's the very publicly known Keeper Test, which Netflix shares on its website as well.
这个测试在其中起到什么作用(如果有的话)?
How does this play into it, if it does at all?
我们没有正式的绩效评估,这可能是第一件不同寻常的事。
We don't have formal performance reviews, which is probably the first unusual thing.
所以,当你想到其他公司花时间讨论每个人,或给他们评级是否达标或超标——我在其他公司也见过——我们不是这样做的。
So, when you think about other companies spending that time to talk through each person, or assign a rating for whether they meet or exceed (I've seen that at other companies, too), we don't do it that way.
但我们确实会认真考虑反馈、表现、期望等所有会影响‘留任测试’的因素,我很乐意详细讨论这些。
But we do carefully think about feedback, performance, expectations, all the things that would feed into Keeper Test, which I'm happy to talk through.
在Netflix,我们的做法首先是努力实现持续、及时且坦诚的反馈机制。
The way we approach it at Netflix is first, trying to get to something that looks like continuous, timely, candid feedback.
说起来容易做起来难。
Easier said than done.
这需要信任。
It requires trust.
需要深厚的关系才能实时给予他人非常坦诚的反馈。
It requires deep relationships to be able to give someone in the moment very candid feedback.
可能是‘这件事你做得很好’。
It could be, Here's a thing you did great.
反馈不总是负面或建设性的,同时也要能接受这类反馈。
It's not always a negative or constructive thing. And it also requires being able to receive that type of feedback.
如果我们很好地践行Netflix文化,这种反馈应该是一年365天都熟悉且舒适的日常。
If we're living the Netflix culture well, that's something that would be familiar and comfortable every day of the year.
因此你无需等待特定绩效考核周期,就能及时了解自己的表现,或向他人提供这类意见。
So you're not having to wait for a certain performance review or feedback cycle in order to hear how you're doing, or be able to provide that type of input to others.
作为安全网,我们确实设有年度360度评估流程,我会向共事的同事们征集反馈。
We do, as kind of a safety net, have an annual 360 process, where I would request feedback from a bunch of people I work with.
我也会收到来自许多同事的反馈请求。
I get requests from a bunch of people.
但最重要的是你要直接与当事人就反馈进行对话。
But that's something that you're having a direct conversation with the individual about feedback.
我会与经理一起回顾这些内容,说:这是我听到的一些主题,以及我打算着手改进的方面。
It's something I would review with my manager to say, Here are some of the themes that I heard, some of the things I'm going to work on.
因此这是一个思考的机会:我的表现如何?
So there's an opportunity to think about, What is my performance?
人们如何看待我们的工作关系以及我的贡献?
How are people perceiving our working relationship and my contributions?
但它并不是以评估的形式构建的。
But it's not structured as an evaluation.
它是在帮助人们提升的反馈框架下构建的。
It's structured in the context of feedback that helps people improve.
另外每年我们还会进行一次薪酬评估,这从某种程度上反映了我的影响力水平、获得的技能以及对公司的贡献。
And then separately, once a year, we go through compensation review, which is a reflection, in some ways, of what's my level of impact, what skills have I gained, what are my contributions to the company.
因此在讨论某人的市场最高薪酬(这是我们的薪酬理念)时,自然会涉及表现话题。
So you naturally talk about performance as part of thinking about someone's personal top of market, which is our compensation philosophy.
这就会引发这样的对话:管理者需要为团队中的每个成员思考,如何制定能体现此人对Netflix价值及市场价值的薪酬?
So it comes up as a conversation there where managers really think about, for each person on their team, how do I think about the compensation that reflects this person's value to Netflix and value in the market?
所以这带有绩效色彩,但并非绩效评估。
So that has a performance flavor to it, but is not a performance review.
然后每年我们会有几次晋升评估。
And then a couple times a year, we evaluate promotions.
在这种情况下,对于可能从五级晋升到六级的候选人群,我们会收集有助于决策的反馈意见。
So in that case, for a group of people who might be up for promotion from level five to a level six, we would collect feedback that helps us make that decision.
因此,纵观持续反馈、360度反馈周期、薪酬评估和晋升评估,我们有很多接触点可以让人们了解自己的工作表现。
So if I look across the continuous feedback, the 360 cycle, compensation review, and promotion evaluations, we get quite a few touch points where people are hearing how they're doing.
但这比其他公司采用的绩效评估结构更具建设性和可操作性。
But it feels more constructive and actionable than the performance review structure that other companies have.
要做好这一点,仍然需要管理者投入大量关注和判断。
What this requires, to do well, is still a lot of manager attention and judgment.
这并非管理者单方面就能决定,比如直接断言'我认为你通过了留任测试'或'我认为这应该是你的薪酬'。
And it's not a manager in isolation, being able to say, I think you're meeting the keeper test, or I think this should be your compensation.
我们为此建立了制度框架,确保管理者要为自己团队的决策负责。
We do have structure around that so that managers are accountable for the decisions that they're making on their team.
以我领导技术部门的职责为例,我会综合评估晋升名单、晋升人数、360度反馈中的共性问题,以及各团队薪酬分布情况。
So if I think about my role in leading the tech organization, I review collectively who's getting promoted, how many people are getting promoted, what are the themes coming up in 360 feedback, where's compensation landing across the teams.
因此这算是一种制衡机制,因为我们确实把很多事情交给了一个相对非结构化的流程。
So it's a little bit of checks and balances, because we do leave so much to a relatively unstructured process.
我们努力在管理者进行留任测试决策时提供大量支持。
And we try to provide a lot of support to managers when they are making keeper test decisions.
具体来说,这个测试要回答的问题是:该员工是否真正达到岗位要求和企业需求?
For context, that's asking the question of, is this person really meeting expectations for the role and what the business requires?
管理者可能会用这个留任测试来自省,并与团队成员展开对话。
There's a keeper test that a manager might ask themselves and have that conversation with members of their team.
但说实话,团队成员对管理者同样存在留任测试。
But honestly, there's a keeper test that goes from members of their team to their managers.
我还想留下吗?
Do I want to stay?
我对当前的工作感到兴奋吗?
Am I excited about the work that I'm doing?
我的经理是否在促进我的成长与发展,以良好的方式指导我的工作?
Is my manager giving me growth and development, helping to guide my work in good ways?
我想明确一点,我们所有人都应对此负责,而不是仅仅由经理为团队做决定。
I want to make sure it's clear that we're all accountable for that, instead of it just being like a manager makes decisions for their teams.
但Keeper测试和自问这个问题,是确保我们所有人都对保持团队高人才密度标准负责的好方法。
But the Keeper Test and asking yourself that question is a good way to make sure that we're all accountable for holding a high talent density bar in our team.
这也是我团队成员询问我的好方式:嘿,我表现得怎么样?
It's also a good way for someone on my team to ask me, Hey, how am I doing?
有没有什么我可能没听到的反馈,是我应该了解的,以确保我达到期望?
Is there any feedback that maybe I haven't heard that I should know about to make sure I'm meeting expectations?
理想情况下,要以一种感觉像正常业务流程的方式进行。
And ideally, do that in a way that feels like normal course of business.
所以,Keeper测试很不寻常。
So, the Keeper test is unusual.
你们不做绩效评估的事实很不寻常,或者至少没有固定的节奏。
The fact that you don't do performance reviews is unusual or at least not a structured cadence.
对于听众来说,他们可能会觉得,这听起来压力很大。
For someone listening, they might think, well, that sounds real stressful.
然而,我查看了SignalFire的数据,他们有一张图表显示了科技公司在人才标准上的留存率,这里的人才标准较高,而Netflix在所有公司中位居榜首,这意味着根据他们的数据,在可比较的公司中,Netflix的高工程人才最不可能离职。
However, I looked at data from SignalFire. They had this chart with the retention of tech companies and the talent bar, so, like, kind of higher talent bar here. And Netflix comes in the top corner, above all companies, which means that, based on the data they have, high engineering talent is the least likely to leave at Netflix across companies that are comparable.
我的问题是:你认为人们为什么离开公司?
My question to you: why do you think people leave companies?
他们又为什么留下?
And why are they staying?
很高兴听到我们处于优势位置,但我们必须通过时间证明这一点。
I'm glad to hear we were on the upper right, but we have to earn that over time.
我个人发现,当人们无法从工作中获得他们想要的挑战和成就感时就会离开。
I personally find that people leave when they're not getting the challenges and the fulfillment that they would like to get from their work.
或者他们觉得自己所做的贡献没有得到充分认可。
Or they don't feel like they're adequately recognized for the contributions that they're making.
我不认为有任何公司能保证每个人都热爱自己的工作并每天感到被完全认可。
I don't know that any company can guarantee that everyone loves their job and feels perfectly recognized every day.
但我们在这方面有很多尝试机会,让人们觉得我正在解决真正困难而有趣的问题。
But we get a lot of at bats on that, for people to feel like I'm solving really hard interesting problems.
在如何解决这些问题上,我拥有很大的自主权和决策权。
I have a lot of agency and autonomy on how I solve those problems.
我不觉得受到太多规则或流程的约束。
I don't feel constrained by a lot of rules or process.
我们没有自上而下的命令控制文化来限制人们的贡献。
We don't have a top down command and control culture that really narrows people's contributions.
无论是成功还是失败,我们都要求大家承担很多责任。
And we expect a lot of that responsibility for both the successes and the failures.
很多人都喜欢这种环境。
A lot of people love that environment.
我们非常努力地维护这种环境,让人们愿意留下来。
We fight really hard to maintain that type of environment so that people are excited to stay.
这并不意味着我们的员工保留率是100%。
It doesn't mean that our retention is 100%.
人们在其他地方获得很好的机会,我其实认为他们应该抓住这些机会。
People get great opportunities other places, and I actually think it's good for them to take it.
因为我们并不是说,我们期望你永远留在Netflix。
Because we're not saying, We expect you to be at Netflix forever.
我们确实希望人们在这里感到兴奋,认为他们正在从事一生中最棒的工作。
We do want people to be excited here, think they're doing the best work of their life.
有时他们会在别处获得机会,但希望他们在Netflix经历的工作和文化能给他们带来积极的体验。
Sometimes they get opportunities elsewhere, but hopefully that's a positive experience they've had in terms of the work and the culture that they experienced at Netflix.
我也认为管理者和领导者在员工留任方面扮演着重要角色。
I also think managers and leaders have a big role to play in why people stay.
要设定愿景,制定清晰的战略,做出艰难而及时的决定,让人们能够发挥最佳水平。
In setting a vision, in setting a clear strategy, in making tough, timely decisions so that people can do their best work.
根据我的经验,有时人们会根据整体感受决定去留——'我对我们前进的方向感到非常振奋'。
And in my experience, sometimes people will decide to stay or leave based on that overall sense of, I feel really inspired by the direction we're taking.
我认为Netflix有很多可以提供的,特别是我们的一些新尝试和从零开始构建的项目,以及我们为工作室、广告商或会员带来的新体验。
On that, I think Netflix has had a lot to offer, especially with some of our newer bets and things that we're building from scratch, and new experiences that we're bringing to studios, or advertisers, or members.
希望这也能激发一些热情。
So hopefully that also builds some of the enthusiasm.
最后我想说,当人们被周围的才华所打动时,他们就会留下来。
And to finish my thought, I think people stay when they're impressed by the talent around them.
这就是人才密度自我强化的地方。
This is where talent density builds on itself.
如果你真正坚持高标准,优秀人才就更有可能愿意留下。
If you really hold people to a high bar, great talent is much more likely to want to stay.
因此,这是我们务必在这方面做得非常出色的另一个原因。
So this is another reason to make sure we do a really good job with that.
所以,当前非常令人兴奋的事情之一当然是人工智能和AI工具,既包括用它们进行构建,也包括工程师们对它们的使用。
So, one of the very exciting things these days, of course, is AI and AI tools, both building with them but also using it as engineers.
根据你在Netflix的经验,工程团队如何在自己的工作中使用这些AI工具?
In your experience, inside Netflix, how are the engineering teams using these AI tools for their own work?
他们是如何进行实验的?
How are they experimenting with them?
哪些方法是有效的?
What is working?
哪些可能不太合适?
What is maybe not a great fit?
是的。
Yep.
这是我们当前重点关注的巨大领域,但我们带着明确的意图和务实的态度,去判断这些工具在哪些方面真正有帮助,而在哪些方面……
It's a huge area of focus for us right now, but with a lot of intention and pragmatism about where these tools are actually helpful, versus where they're...
再次强调,我们的希望是找出那些真正能带来更高质量、更大业务影响的领域,而不是那些质量较低或仅关乎成本削减的事情。
Again, the hope is that we identify those places where we actually get higher quality, more impact for the business, versus things that are lower quality or just about cost reduction.
那些对我们来说真的没什么意思。
That's really not interesting to us.
因此,在任何技术应用中,包括生成式AI,我们都在寻找能实质性推动公司影响力的东西。
So, across any technical application, but including GenAI, we're looking for the thing that is meaningfully advancing our impact for the company.
因此对于工程师们,我们正在试验一套编码助手工具。
So for engineers, we are experimenting with a set of coding assistants.
我们的做法是为团队提供多种不同的工具。
The way we approach it is to provide a lot of different tools to the teams.
这样他们就能探索、试验、决定哪些工具符合需求,并逐渐了解哪些工具更适合某些用例或应用场景。
So they are able to explore, experiment, decide which tools meet their needs, start to learn what works better for some use cases or some applications than others.
我们试图创造空间,让人们真正有时间去做这些尝试。
We're trying to create space so that people actually have the time to do that.
众所周知,当你考虑改变编码方式、文档方式或决策方式时,学习曲线相当陡峭。
As we all know, there's quite a learning curve when you're thinking about, I'm going to change how I write code, how I document, how I think about making decisions.
特别是对那些在岗位上已经驾轻就熟的人而言,改变工作方式可能会带来不适。
Especially for the people who are very accomplished in their roles, changing your way of working can be kind of jarring.
而我们这里时间紧迫、目标远大,没有太多空闲时间让人慢慢摸索新技术。
And we have tight timelines and big ambitions here, so there's not a lot of free time running around of, Let me figure out how to use this new technology.
所以我们正在做的是同时启用这些工具。
So we're doing things like both enabling the tools.
我们会安排几周时间让人们专注于尝试新项目。
We have some weeks where we let people just be focused on, Let me try a new project.
让我可以试验些新东西。
Let me experiment with something new.
这给了人们一些喘息空间。
That gives people a little bit of space.
然后我们会收集来自整个团队的大量反馈,了解哪些工具真正有用,哪些应该升级为标准方案,哪些领域产生了最大影响。
And then we're collecting tons of feedback from across the team around which tools are actually useful, which do we want to graduate to paved paths, where are the areas where we actually see the most impact.
其中很多是自我报告的数据。
A lot of that is self reported.
是那些正在进行实验的团队。
It's teams that are experimenting.
我们在整个业务中设立了所谓的'生成式AI先锋'。
We have what we call Gen AI Champions throughout the business.
因此他们能帮助团队解决问题、了解可用资源,同时向核心团队反馈哪些方法有效、哪些无效,从而持续改进我们的实施策略。
So they're able to help teams troubleshoot and understand what's available, but also feed back to central teams what's working and what's not, so we can continue to advance how we're approaching this.
我认为我们在保持务实方面做得很好,没有把生成式AI当作解决工程或技术团队所有问题的万能钥匙。
And I do think we're doing a good job being pragmatic, and not feeling like GenAI needs to be a silver bullet for everything that engineering or technical teams are doing.
我们希望更精准地锁定影响产生的领域。
I think we want to be more surgical about where the impact comes from.
某种程度上,这反映了我们在面向会员或创作者用例时采用的相同策略——即通过大量能力实验,找出真正能提升体验和质量的领域。
And in some ways, that reflects the same strategy we're taking for member facing use cases, or creator facing use cases, where we're trying to figure out where do we actually get a better experience and higher quality, and experiment with a ton of different capabilities.
这也影响着我们的基础设施战略。
Which also has implications for what's our infrastructure strategy.
我们的整体战略是如何提供多种选择渠道,通过实验判断市场在哪些领域已有成熟解决方案?
What's our overall strategy of how do we give access to a lot of different options so we can experiment and then think about where's the market solving this well?
哪些领域我们应该自主开发?
Where should we build something in house?
很可能在技术生产力方面,市场已经提供了很好的解决方案。
It's very likely that for a lot of our tech productivity, the market is solving those problems very well.
自主开发工具对我们并无明显优势,但我们仍需谨慎选择实际采用的市场工具。
There's not really a big advantage to us building tools in house, but we still want to be choosy in which market tools we actually leverage.
你说已经收到很多反馈,知道人们、团队和组织都在分享哪些方法有效、哪些无效。
You said you're getting a lot of feedback already, you know, people, teams, organisations are sharing what's working, what's not.
你是否发现某些领域这些工具(特别是AI编程助手、代理工具)可能更有帮助?
Are you seeing some areas where these tools, specifically AI coding assistants and agentic tools, are maybe a little bit more helpful?
无论是新项目、迁移、原型设计还是其他领域。
Be that greenfield things, migrations, prototyping, or some other areas?
你提到的几个方面都很准确。
You named a couple of them very well.
那么从最后一点说起,原型设计现在快得多。
So, maybe starting at the end there, prototyping is a lot faster.
实际上,考虑到跨职能团队(工程师、数据科学家、产品经理、设计师),我们希望可以快速搭建原型。
And actually, that's a place where when you think about the cross functional teams across engineers, data scientists, product managers, designers, we're hoping that we can actually bootstrap things very quickly.
我有个想法。
I have an idea.
让我们把这个想法可视化。
Let's visualize that idea.
快速编写一套代码来实现它。
Let's quickly throw together a set of code that would help to bring this to life.
这些代码不一定达到产品级或生产就绪标准,但没关系,它能帮助团队快速推进、创新和研讨想法。
That's not necessarily something we would productize or consider production ready code, but that's okay because it helps teams advance and innovate and workshop ideas very quickly.
而且如你和听众所知,存在大量繁琐工作。
And then, as you probably know and your listeners know, there's a lot of what can be tedious work.
最耗时的环节并不总是实际编码工作。
And it's not always the actual coding work that feels like the biggest time commitment.
它可以是获取系统运作的知识。
It can be accessing knowledge about how systems work.
它可以是编写代码文档。
It can be documenting code.
它可以是思考那些需要我们处理的大型迁移项目,实际上我们可以将大部分工作自动化。
It can be thinking about big migrations that we've had on our plate, where we can actually automate much of the work.
此外还包括问题检测相关的工作。
And there's also things around detecting issues.
比如异常检测、响应机制,以及能够对问题进行深入分析。
So anomaly detection, response, being able to do deep dives of issues.
我们发现生成式AI工具在这个领域大有可为,这有助于提升我们的系统韧性,并促进工程团队遵循最佳实践和保持健康状态。
We're finding that there's a lot of promise for GenAI tools in that space, which helps us with some of our resilience and just general best practices and health as an engineering organization.
如果能在原型设计、文档编写、迁移工作、检测与响应等领域使用生成式AI工具,就能为更具创新性的工作腾出大量时间。
If we're able to use GenAI tools in those spaces (prototyping, documentation, migrations, detection and response), it leaves a lot of time for the more innovative work.
那么我们该如何设计架构、系统和产品,以实现业务影响力呢?
So how do we think about architectures and systems and products we're building to deliver business impact?
因此我们有望让工程师产生更大影响力,因为他们能利用这些工具或智能代理体验,减少在低价值活动上花费的时间。
So then hopefully we can actually get more impact for engineers because they're able to leverage some of the tools or agentic experiences to minimize the time spent on some of the less impactful activities.
但这实际上是一个工作组合。
But it's really a portfolio of work.
我想再次强调,它在任何领域都不是万能的解决方案。
I would say, again, it's not a silver bullet in any of those spaces.
应该说,相比几年前我们刚开始实验的那些工具——恕我直言,那些工具根本达不到我们需要的质量标准——现在已经取得了长足进步。
I would say it's come a long way since some of the tools we first started experimenting with a couple years ago, which let's just say, didn't meet the quality bar that we really need.
是的,要知道,在这方面发生了很大的变化。
Yeah, you know, it's been a big change on this.
Netflix独特的一点是,你提到长期以来Netflix只招聘资深及以上级别的软件工程师。
One thing that's unique to Netflix is, you mentioned how, for a long time, Netflix only hired senior-and-above software engineers.
大约两年前或几年前,你们开始招聘职业生涯初期的软件工程师。能告诉我这如何改变了Netflix的文化吗?
About two years ago or so, or a few years ago, you started to hire earlier-career software engineers. Can you tell me how that has changed the culture at Netflix?
通过这种招聘方式你们学到了什么?
What have you learned from that hiring?
你们的策略是什么?
And what is your strategy?
你们计划继续引进应届毕业生、实习生或职业初期人才吗?
And are you planning to keep bringing in new grads or interns or early career folks?
还是说你们打算——毕竟现在很多公司都在说,至少在弄清楚AI这回事之前,我们暂时只招资深人员?
Or are you planning to... because again, a lot of companies these days are saying, let's just go with seniors for now, at least until we figure out this whole AI thing.
是的,我们在应届生和职业初期人才方面有很好的经验。
Yeah, we've had a great experience with new grads and early career talent.
还有你提到的我们的实习项目。
And also, our internship program, which you mentioned.
我们的起点与其他许多科技公司非常不同。
We were starting from a very different place than a lot of other tech companies.
当你观察其他大型科技公司的职级或人才分布时,他们有些公司有30%、40%、50%的所谓三级、四级工程师。
So, when you look at the distribution of levels or talent at some of the other larger tech companies, they had in some cases 30%, 40%, 50%, what I'll call level three, level four engineers.
所以当考虑到新技术转型或这些公司现在需要做的工作时,我理解他们可能需要不同的人才分布。
So when you think about a new technology shift, or the work those companies need to do now, I understand why they might need a different distribution of talent.
在大多数情况下,我们都是从零开始的。
We were starting at 0% in most cases.
太疯狂了。
Crazy.
是啊。
Yeah.
因此,我们团队中大部分是五级及以上水平的成员。
So, we had mostly a level five and above population.
所以我们有巨大机会用早期职业人才来补充现有团队,他们为团队带来新技能、新视角和充沛活力。
So we had a huge opportunity to complement the team we had with earlier career talent, who brings new skills, new perspectives, great energy to the teams.
而随着当前生成式AI的技术变革,很多人天生就熟悉AI。
And with the technology shift right now with Gen AI, a lot of native AI familiarity.
想想那些近些年毕业的学生,他们早已习惯在开发产品、编写代码或解决数据问题时使用AI。
So when you think about somebody who's graduated from school in the last few years, they're very accustomed to using AI, whether it's in developing products, writing code, or solving data problems.
这实际上是为团队引入新技能和视角的有效方式。
So it's actually a useful way to bring new skills and perspectives to the team.
我认为我们绝对会保持对早期职业人才的投资,因为这为业务的各个领域都带来了增值。
I think we will absolutely maintain that investment in earlier career talent, because it's been so additive in different parts of the business.
我也认为凡事都要有适当比例。
I also think everything in its right proportion.
有很多问题需要我们聘用极其资深的专家。
There's plenty of problems where we need extremely senior talent.
所以同时,我也在推动团队思考:如何增加更多资深骨干工程师和科学家?
So at the same time, I would say, I'm pushing for us to also think about how we add more staff, principal, and distinguished engineers and scientists to the team.
因为那同样是一个重要场所。
Because that's also a place.
想想人才分布的那一端尾部,那是保持雄厚人才储备的重要位置。
You think about that tail of the distribution being a really important place to maintain strong talent.
因此,我们正在对人才分布的两端都进行投资。
So, we're investing in both of those tails of the distribution.
我很欣赏这点,因为通常企业只会谈论其中一方面,而非兼顾两者。
I love it because I usually hear companies talk about one or the other, but not both.
我想在某个时刻,这里的人们终将有望抵达那个高度。
And I guess at some point, the people here one day will hopefully be there.
另一种理解方式是内部培养人才。
Another way to think about it is building talent from within.
我们希望新加入的初级人才能获得优质体验,在此提升技能、创造价值,并逐渐成长为资深技术骨干。
So, we hope that a lot of the early career talent joining has a great experience, develops skills and impact here, and becomes that more senior technical talent over time.
而我们最资深的专业技术人才也必须成为优秀的榜样。
And our most senior technical talent have to be great role models there, too.
因此,我们在内部人才发展方面投入的精力远超五到十年前,可以说这对团队产生了巨大的推动作用。
So, we're doing more of that internally in that talent development than we would have done five or ten years ago, and I would say it's been a huge boost to the team.
我最近对Netflix最惊讶的发现之一,就是你们对开源的投资力度。
One of the most surprising things I've learned about Netflix just very recently is how much you invest in open source.
这话可能有点傻,因为我们都知道混沌猴子非常出名,事实上Netflix正因此闻名。
This might sound a bit silly because we know Chaos Monkey is very famous, in fact Netflix is known for that one.
但最近的报告再次显示,在统计各公司工程师参与开源工作的比例时,Netflix依然高居榜首。
But a recent report looked at all the companies and what percentage of their engineers end up working on open source, and Netflix, again, was at the very top.
这份刊物估计大约五分之一的工程师参与开源项目。
This publication estimated that about one in five engineers work on open source projects.
果然,我打开你们的开源页面,发现全是开源内容。
Sure enough, I go to your open source page, it's just so much open source.
能告诉我为什么、如何以及从何时起Netflix如此热衷开源?为什么我们对此一无所知?
Can you tell me why and how and since when is Netflix doing so much open source and why do we not know about this?
这对我来说是新鲜事。
This was new to me.
或许我们应该多谈谈这个话题。
Perhaps we should be talking about it more.
现在就是个很好的开始机会。
So, this is a good opportunity to start.
要知道,我们之前讨论过工程文化和人才密度意识,这往往源于对更广泛技术社区做贡献的热情。
You know, we were talking earlier about the engineering culture and the sense of talent density, which often comes with a passion to contribute to the broader technical community.
Netflix员工非常重视工作质量和更广泛的创新推进。
So the people at Netflix care deeply about the quality of their work and advancing innovation more generally.
有些创新是Netflix特有的,我们必须把这些知识产权作为竞争优势来保护。
For some things, it's Netflix specific innovation, and it's important we keep that IP as a competitive advantage.
但更多时候,我们的贡献能推动整个行业创新,长远来看Netflix也会受益。
But for many, it's something that helps to actually drive broader industry innovation, which also benefits Netflix over time.
举个例子,我们在内部和外部都深度参与的领域之一就是编码技术。
So if I can give you one example among the list of places where we've been very involved both internally and externally, it's in the encoding space.
并推动了大量创新
And driving a ton of innovation
视频编码,对吧?
Video encoding, right?
在视频编码领域,我们目前已获得了九项艾美奖的认可。
Video encoding, we've now won, I believe, nine Emmys for these contributions.
过去我总将艾美奖与电视节目和红毯联系在一起,但实际上我们在视频编码技术上已斩获众多技术与工程类艾美奖。
I always used to associate Emmys just with TV and the red carpet, but we've won a lot of technical and engineering Emmys at this point, specifically on video encoding work.
举个例子,这项技术极大提升了我们节目编码的质量与效率,并优化了内容交付能力。
So as one example, that helps to contribute incredibly to quality and efficiency of our ability to encode our titles and deliver them.
Netflix通过该领域的技术升级获得了立竿见影的收益。
So Netflix gets an immediate benefit by improving the technology in that space.
我们同时是开放媒体联盟的创始成员,该组织致力于推动编码技术的开放式发展。
But we are also a founding member of the Alliance for Open Media, which is an industry community that pushes for open advancement of encoding technology.
若能激励行业共同进步,Netflix也将受益——当整个行业标准提升后,我们就能整合各方技术持续突破创新边界。
If we're able to inspire that work, Netflix actually also benefits because the whole industry up levels and we think about integrations with different technologies that we might do over time with everyone helping to push the bar.
有个数据很能说明问题:相比刚推出原创内容时,现在Netflix内容库的规模已不可同日而语。
A statistic I like to cite is when you look at the catalog now of Netflix content, think about how much bigger the catalog is than when we were first starting with originals.
数据显示,我们现在所需的带宽减少了60%。
I believe we now require 60% less bandwidth.
在内容库大幅扩充的同时,用60%更低的码率实现了同等甚至更优的画质。
60% fewer bits for the same or better quality, with a much bigger catalog.
这都源于我们在媒体编码领域的创新。
That comes from our media encoding innovation.
整个行业共同推进的技术进步,将使所有娱乐产业参与者受益,最终惠及消费者与我们的会员。
And having a whole industry that's pushing that benefits anyone who's in the entertainment space, and definitely benefits consumers and our members.
这是一个很好的例子,它始于开源贡献。
So that's a good example where it starts from an open source contribution.
Netflix通过为更广泛的创新领域做出贡献,不仅没有损失,反而有所收获。
Netflix doesn't lose anything, and only gains something, by contributing to the broader innovation landscape.
我强烈主张更多地讨论我们推动的创新。
And I am a strong proponent of talking more about the innovation we're driving.
比如我们技术博客上的各种文章,举例来说,我们更多地讨论了实现LIVE功能所需的努力,我认为这对整个社区是巨大的贡献。
So things like the blog posts that show up on our tech blog, where, as one example, we're talking more about what it took to bring Live to life, I just think are a great contribution to the broader community.
这是我热爱软件工程的原因之一,因为参与开放和分享事物能让所有人受益。
This is one of the reasons I love software engineering, because I feel that contributing in the open and sharing things lifts the tide for everyone.
是的,我也这么认为。
Yeah, I believe so.
没错,我们确实受益匪浅。
Yeah, we definitely benefit.
我们正努力推动整体更好的成果,尤其以会员需求为核心。
And we are trying to drive better outcomes overall, especially with a real member focus.
因此,我们构建的许多技术都能实现这一目标。
And so, a lot of the technology we're building is able to do that.
作为总结,Netflix听起来与所有大型科技公司甚至创新型企业都截然不同且独特。
So, in closing, Netflix sounds like a very different and special place, even compared to the other large tech companies or the innovative companies.
对于刚加入Netflix的初级软件工程师,您会给出什么建议?
What would your advice be for a new software engineer starting at Netflix?
他们如何在这种环境中取得成功?如何成长为符合这类公司期望的人才?
How can they succeed in this environment, and how can they grow into the expectations at a place like this?
好奇心,好奇心,还是好奇心。
Curiosity, curiosity, curiosity.
当人们问我,网飞价值观中我最认同哪一点时?
When people ask me, What's the Netflix value that most resonates with me?
我最乐于在团队中看到的特质。
And I most love to see across the team.
就是好奇心。
It's curiosity.
不断提问,质疑我们是否在用正确方式解决正确问题。
Asking questions, questioning whether we're solving the right problems in the right way.
即使你是网飞新人或初入职场,也不意味着你不能成为创新源泉。
Just because you're new to Netflix or you're earlier in your career doesn't mean you're not going to be the source of innovation.
恰恰相反,伟大创意可能来自任何地方,这始于保持好奇、开放心态、勇于实验探索、承担明智风险。
If anything, great ideas come from everywhere, and that starts with just being curious and open-minded: experiment, explore, take smart risks.
试着压制内心那个害怕尝试新事物或承担风险的声音。
Try to reduce that voice in your head that is fearful of exploring something new or taking that risk.
我认为当人们带着这种好奇心态加入网飞时,他们就已经为成功做好了准备。
And I think when people join Netflix and they approach it with that type of curious mindset, they're already set up for success.
我还想说,要善于依靠他人。
I would also say lean on other people.
网飞人才济济,大家都非常乐意帮助他人取得成功。
We have great talent at Netflix, and they are all more than happy to help other people be successful.
所以不要羞于寻找导师,大胆提问:为什么这件事要这样做?
So don't shy away from finding a mentor, asking somebody, Why does this work this way?
你能详细说说这件事的历史背景吗?
Can you give me more of the history of this?
你能帮我理解我们正在解决什么业务问题以及为什么吗?
Can you help me understand which business problem we're solving and why?
这是好奇心的另一种体现,但也关乎更广泛的社区,并真正在Netflix发挥这种力量。
It's another flavor of curiosity, but it's also about the broader community and really leveraging that at Netflix.
哦,伊丽莎白,谢谢你。
Oh, Elizabeth, thank you.
这非常非常有趣,我学到了很多。
This was very, very interesting, and I've learned a lot.
很棒,我觉得。
Great, I thought.
很高兴能来到这里,也很高兴这次对话能顺利进行。
Really happy to be here, and I was happy it worked out to have this conversation.
谢谢你。
Thank you.
谢谢。
Thanks.
关于Netflix最让我惊讶的发现之一是他们对开源社区的贡献程度——大约每五名工程师中就有一人参与开源工作。
One of the most interesting learnings for me about Netflix was just how much open source they contribute to and how about one in five engineers is involved in open source work.
另一个发现是他们的绩效管理非常轻量化,并试图实现真正的持续性。
The other one was how performance management is really lightweight and tries to be truly continuous.
这两点感觉都与大多数其他大型科技公司的运作方式大不相同。
Both of these things feel like they are quite different from how most other big tech companies operate.
我之前深入探讨过Netflix的工程师级别如何从单一的资深级别转变为新的五个级别。
I previously did a deep dive on how Netflix's engineering levels changed from the single senior level to the new five levels.
详情请查看节目说明中链接的《务实工程师深度剖析》,以及关于Meta、亚马逊和谷歌等其他大型科技公司工程文化的深度分析。
Check out this Pragmatic Engineer deep dive, linked in the show notes below, as well as deep dives on the engineering culture of other big tech companies like Meta, Amazon and Google.
如果你喜欢这期播客,请在您常用的播客平台和YouTube上订阅我们。
If you enjoyed this podcast, please do subscribe on your favorite podcast platform and on YouTube.
如果你能为节目评分,我们将不胜感激。
A special thank you if you also leave a rating for the show.
感谢收听,我们下期再见。
Thanks and see you in the next one.