本集简介
双语字幕
欢迎回到Google DeepMind播客。
Welcome back to the Google DeepMind podcast.
我是主持人Hannah Fry教授。
I'm your host, professor Hannah Fry.
过去几年中,很少有哪个领域能像教育行业这样深切感受到人工智能的变革性影响。
Now there are a few places in the past couple of years that have felt the transformative influence of AI as keenly as the education sector.
全球教师都不得不突然重新思考如何与学生互动及评估学生,同时人们也担忧可能出现作弊和技术依赖问题。
Teachers around the world are having to abruptly rethink how they engage with and assess students, and there are concerns around the potential for cheating and dependence on technology.
这些担忧是合理的。
Those fears are valid.
但尽管面临快速转型,人类教师这个概念仍展现出惊人的韧性。
But despite the rapid transition, there has been something remarkably resilient about the idea of a human teacher.
其核心始终未受技术潮汐的侵蚀。
Something that, at its heart, has remained immune to ebbs and flows of technology.
毕竟,自文明诞生之初我们就有了教室。
After all, we've had classrooms for almost as long as we've had civilisations.
这里同样蕴含着不容忽视的机遇。
And there are undoubtedly opportunities here too.
想象这样一个课堂:每节课都按你的学习节奏定制,AI导师全天候待命,技术能在你卡壳前就预测到难点。
Imagine a classroom where each lesson is tailored to your individual pace of learning, where an AI tutor is available around the clock, and technology can predict where you're likely to get stuck before you do.
谷歌DeepMind的研究人员一直在努力应对AI在教育领域的机遇与挑战。
Well, researchers here at Google DeepMind have been grappling with both the opportunities and challenges of AI in education.
他们最近发表了关于该领域负责任开发AI的重要论文。
They recently published a major paper on developing AI responsibly in this area.
论文主要作者之一将作为嘉宾参与本期播客。
One of its lead authors is my guest on the podcast.
Irina Jurenka是谷歌DeepMind的研究负责人。
Irina Jurenka is a research lead at Google DeepMind.
她的学术背景横跨实验心理学和计算神经科学,十年来她一直在这些领域探索诸如'人类如何学习'的问题。
Her background spans experimental psychology and computational neuroscience, and she has spent a decade within these walls asking questions like, how do humans learn?
欢迎来到播客节目,Irina。
Welcome to the podcast, Irina.
这是一个人们投入大量精力的领域。
This is a space where people are very heavily invested.
对吧?
Right?
这是否使得这个领域变得相当难以驾驭?
Does that make it quite a difficult space to navigate?
确实如此,因为教育已经存在了数千年,它是我们社会的基础结构。
It does, because if you think about it, education has been around for thousands of years, and it is a fundamental structure in our society.
每个孩子都应该接受教育。
Every child is supposed to get educated.
所以,教育体系已经存在很长时间了。
So, the educational systems have been around for a while.
它们相当固化且根深蒂固。
They are quite rigid and they're very established.
因此,突然出现并宣称'看,我们拥有这项神奇技术,将彻底革新一切',我认为事情不会这么简单。
So, to come in and say, Look, we have this amazing technology and we're going to revolutionize everything and change everything, I think it's not going to work so easily.
我们在过去的技术中已经看到这种情况,比如智能辅导系统已存在五十多年。
And we've seen this happen with technologies of the past, like intelligent tutoring systems have existed for fifty plus years.
虽然投入了大量资金和研究,但可以说这些技术的承诺并未完全实现。
A lot of investment and research has gone into them, but you could argue that the promise of that technology hasn't fully materialized.
更近期的例子是慕课(MOOCs),这些大规模在线课程也曾引发热潮,人们认为不再需要传统教育,只需上网就能学到任何知识。
Or more recently we had MOOCs, these massive open online courses, and again, there was so much excitement about how we wouldn't need traditional education anymore, you could just go online and learn anything you would ever want to learn.
但现实情况是,真正使用这些系统的仍是那些已经接受过传统教育的人。
And once again, when you actually look at who uses these systems, it's people who have already gone through traditional education.
通常,是那些想攻读第二个硕士学位的人。
Typically, it's people trying to get their second Master's.
所以这绝对不是那个突然出现并破坏系统的东西。
So it's definitely not the thing that came and broke the system.
我想也许我们不应该试图破坏这个系统。
And I guess maybe we shouldn't be trying to break the system.
传统教育领域正在发生许多令人惊叹的事情。
There is a lot of amazing stuff happening in traditional education.
这不仅仅是把知识从老师那里提取出来,再以点滴方式灌输给学生。
It's not just about taking the knowledge from the teacher and kind of distilling or drip feeding that into the student.
更重要的是与同伴交流、共同学习的社会性互动。
It's about the social aspects of talking to your peers and learning together.
还包括老师传授的技能:如何成为世界公民,如何应对挑战,如何批判性思考,如何评估信息。
It's about the teachers giving skills, like how to be a global citizen, how to navigate, how to critically think, how to evaluate information.
因此,教育系统提供的远不止是知识本身。
So, there is so much more to the educational systems than just the knowledge that they give.
所以在我们团队里,我们思考新技术的角度是:它如何在现有系统中发挥作用,以及如何为其增值。
So, in our team, we're thinking about the new technology in terms of how it can work within the current system and how it can add to it.
所以这不是从一张白纸开始说要重新设计教育体系,而是对现有体系进行增强。
So this isn't like starting with a brand new blank sheet of paper and saying design an education system from scratch, it's like augmenting the one that exists.
是的。
Yeah.
实际上,MIT的研究员贾斯汀·莱克有句很精辟的话。
So actually, Justin Reich, a researcher at MIT, has a really nice quote.
他说新技术不会破坏教育体系。
So he says that new technology doesn't break educational systems.
教育系统某种程度上驯服了新技术。
Educational systems kind of tame new technology.
这正是MOOCs(大规模开放在线课程)所经历的情况,
Which is what happened with MOOCs,
确实,你说得完全正确。
Exactly, yes.
正如我所说,我们还看到师生互动中存在这些人性化的方面,是技术永远无法改变的。
So, as I said, we're also seeing that there are these human aspects of teacher student interactions that we can't possibly ever change with technology.
例如,想象一下学生和导师的关系,存在一些社交规则使得学生不太可能直接从人类导师面前站起来离开。
For example, if you think about a student and a tutor, there are some social rules that are in place where a student is very unlikely to just stand up and walk away from a human tutor.
但如果你面对的是AI导师,直接关闭窗口就完事了。
But if you are interacting with an AI tutor, you can just close the window and that's it.
因此,引入技术会带来特定挑战,而人与人互动中的某些特质是技术永远无法取代的。
So, there are certain challenges that come with bringing technology in, and there are certain things that human to human interactions have that technology will never replace.
这就是为什么我们一开始就尝试在现有体系内开展工作。
So, this is why we're trying to work within the system to begin with.
那么你预计这会对教育造成多大程度的颠覆?
How disruptive do you expect it will be to education then?
因为一方面,已经出现了相当多的颠覆性变化,特别是最近大型语言模型带来的影响。
Because, I mean, on the one hand, there has been quite a lot of disruption already, particularly recently with large language models.
当我与Demis交谈时,他提到人们总是高估某项技术的短期影响,却低估其长期影响。
When I spoke to Demis, he was talking about overestimating the impact of something in the short term and then underestimating how big the longer term impact will be.
你认为教育领域会如何适应这种变化?
Where do you think education fits in with this?
目前教育界对生成式AI的讨论非常热烈。
There is so much buzz about Gen AI right now in education.
我感觉所有人都期待它能立即彻底改变一切。
I feel like everyone actually expects it to completely change everything immediately.
涌现了如此多的教育科技公司,它们利用语言模型开发出各种辅导工具、作业助手等帮助学生学习的应用。
And there have been so many different edtechs that have sprung up, taking a language model and turning it into a tutor or a homework helper or anything else that kind of helps students.
老实说,迄今为止还没有哪项技术真正达到人们预期的颠覆性影响。
And honestly, so far nothing has really made the impact that I think everyone was expecting.
因此从我们的视角来看,这正是我们致力于改进这项技术的核心原因。
So, that's why, from our perspective, we are at the center of actually improving this technology.
我们拥有接触Gemini的独特优势,能直接影响其发展进程——事实上我们的目标之一就是将Gemini打造成最适合教育领域的大语言模型。
We have unprecedented access to Gemini, we can influence how things change, and in fact, one of our goals is to make Gemini the best large language model for education.
那么这里的雄心是什么?是要打造一个通用型AI导师吗?
So what is the ambition here? Is it to build a sort of universal AI tutor?
确实如此,但我们更希望赋能多元化的学习体验。
It is, but we also wanted to power different experiences.
所以我们首个部署AI导师的平台就是YouTube。
So, the very first place where we deployed our AI tutor was YouTube.
现在学习类视频新增了这样的功能:当你在观看教学视频时遇到不理解的内容,可以虚拟举手提问,AI导师会立即弹出窗口解答你的学习疑问。
So, on learning videos, there is now a new function, which is kind of like, if you're watching a learning video and you don't quite understand something, you can virtually raise your hand and an AI tutor will pop up and you can ask all your learning questions to the tutor.
最近我们还推出了Gemini的精品应用'学习教练',它经过专门优化,能成为你学习旅程的向导。
And then, more recently, we also launched a Gemini gem, so it's called the Learning Coach, and it's basically optimized to be your guide into learning experiences.
在这里你可以提出任何问题,比如'我想了解光合作用'或'讲讲美国内战'。
So here you can come up with any question, say, Oh, I want to learn about photosynthesis or Tell me about the American Civil War.
它会为你制定学习计划,评估你的知识盲区,并引导你循序渐进地掌握学习材料。
And it will give you a plan, it will try to understand what you know and don't know, and it will try to guide you through the materials.
我们希望做的是真正推动研究,使这些基础模型在教育领域尽可能完善,然后找出如何最好地利用它们。
So, what we're hoping to do is really push the research to make these base models as good as possible for education and then figure out how to actually make the best use of them.
事实上,我们希望社区能在这方面帮助我们,这样就不只是由我们来规定AI导师应该是什么样子,而是倾听那些在这个领域比我们经验更丰富的人,并帮助他们充分利用技术,让技术为他们发挥最大价值。
And in fact, I think we hope that the community can help us with that, so that it's not just us dictating what an AI tutor should be like, it's us listening to people who have been in this space for much longer than us and trying to help them make the most of the technology and make the technology be the best it can be for them.
你觉得这项技术能走多远呢?
How far do you think the technology can go, though?
你能描绘一个非常乐观的未来场景吗?你希望未来是什么样子?
Can you sort of paint me an image of what you, in a very optimistic scenario, would like the future to look like?
我认为很多人都在讨论AI优先的学校教育。
I think a lot of people talk about kind of AI first schooling.
英国甚至有一所学校刚刚转向以AI为主的教育模式,只保留少数教师辅助教学。
I think there is even a school in the UK that just switched to mostly having AI-based education, and I think they only have a few teachers on hand to kind of help around.
但我认为那不是我们应该追求的未来。
And I just don't think that that future is something we should be striving for.
我们真的不想取代人类教师,而是想提供一种工具,来增强师生之间面对面的课堂体验。
We really don't want to, you know, replace human teachers, we want to give a tool that enhances this kind of in person classroom experience between teachers and students.
如果学生来学校只是整天盯着屏幕,那会有点可悲。所以我们设想的是:教师仍作为导师和榜样存在,学习过程中保持大量同伴互动,同时有一个AI系统辅助教师和学习者,帮助他们实现最佳教学效果。
I think it's a little bit sad if students come to school and just sit around looking at screens all day. So, the way we are thinking about it is that there are still teachers as mentors, as role models to the students, and there is a lot of kind of peer interaction during learning, but there is this AI system that helps, that works with teachers and learners, and helps them make the best of the situation.
也许对每个学习者来说,AI导师能帮助他们按照自己的节奏学习,真正针对他们的兴趣。
So, maybe for each learner, the AI tutor can help them move at their own pace and really target their interests.
同时,教师可以掌握每个学生的学习进度,并引导这个AI导师。
And at the same time, the teacher gets a view of where everyone is and they can kind of still steer this tutor.
他们仍保持控制权,仍能将个人风格和教学方式带入课堂,因为我认为师生之间的这种联系非常重要。
They still have control and they can still kind of bring their own personality and teaching style to the lessons, because I think this connection between teachers and students is so important.
回想我自己的教育经历,最难忘的是那些让我对某个学科产生热情的杰出教师。
And like, looking back on my own education, what stands out to me is these amazing teachers who made me excited about a certain subject.
我认为技术应该致力于创造更多这样的互动和记忆,或许能消除那些不太理想的情况——比如师生间缺乏默契,或是教师因过度劳累而无法顾及最需要关注的学生。
So, I think what technology should be trying to do is make more interactions and memories like that, and maybe remove the less ideal situations where maybe the teacher and the student don't click, or the teacher is so overworked that they don't have time to spend with a particular student who actually needs them the most.
我想有些观众可能不太了解你的背景。
I imagine that there'll be some people watching who don't necessarily know about your background.
能简单说说你是如何开始思考人工智能在教育领域的应用吗?
So can you tell us a little bit about what your path was to get to thinking about AI in education?
我该从多久以前说起呢?
I mean, how far back shall I start?
从头开始吧。
Day one.
简史一下。
A brief history.
对。
Yeah.
我一直对各种智能形式着迷,无论是人类的还是人工智能的。
So, I've always been fascinated by intelligence, any kind of intelligence, human or artificial.
我很早就开始学习编程。
I started coding quite early on in life.
说来很巧,小时候我和哥哥得到一本漫画书,内容正是编程入门。
It was just a lucky coincidence that my brother and I got a comic book as children, and it was about basically introduction to programming.
很神奇吧。
Amazing.
大概十一二岁时,我们就开始编写小游戏了。
So we started writing small games around the age of probably 11 or 12.
记得有年夏天,我和哥哥无聊时发现可以获取某个射击游戏的源代码,还能自己编写对手程序。
And I remember at some point during the summer, my brother and I were bored, and we discovered that you can actually get access to the source code of one of those, like, shooter games, and you can actually code up your opponents.
那种感觉真是,哇,太令人兴奋了。
So, like, wow, this is exciting.
我们确实可以创造人工智能。
We can actually create AI.
我记得当时写日记时还想着,这个夏天我们要解决AI问题。
And I remember writing in my diary, like, This summer we're going to solve AI.
当然,那并没有实现。
Of course, that didn't happen.
哦,年少的雄心啊。
Oh, the ambition of youth.
哦,是啊。
Oh, yeah.
不过出人意料的是,我哥哥后来学了计算机科学,而我成长在一个相对传统的社会环境里,当时完全没意识到计算机科学学位、制作游戏、和哥哥一起玩电脑这些事其实是相通的。
Surprisingly though, my brother went on to study computer science, but I was growing up in a kind of traditional society where somehow it just didn't click to me that computer science, the degree, and making games and like playing around on computers with my brother are the same thing.
对我来说,计算机科学是门枯燥的学科,更多是关于硬件的,而我实在不喜欢那些。
To me, computer science was something kind of dry and more about the hardware, and I really did not enjoy that.
所以我最终选择了心理学作为专业。
So I ended up studying psychology as my degree.
我一直在想,怎样才能既研究AI又研究智能呢?
I was kind of wondering, okay, how can I move towards AI and still study intelligence?
因为我着迷于大脑是如何运作的。
Because I was fascinated, like, how does the brain do it?
这些惊人的行为、智能和推理能力,究竟是如何产生的?
How does this incredible behavior and intelligence and reasoning, how does it all arise?
后来我很幸运,在完成博士学位时听说了DeepMind,了解到可以通过深度学习进行神经科学研究,回答这些深层次的基础问题,这简直是我的理想工作。
And then I was very lucky that by the time I finished my PhD, and I heard about DeepMind and how you can actually do neuroscience research and answer these deep fundamental questions with deep learning, it was kind of this perfect job for me.
于是,我最初加入了神经科学团队。
So, I started off in the neuroscience team.
就像我说的,关于智能和推理的想法一直萦绕在我心头,因为推理能力正是让我们变得智能的关键。
And as I mentioned, kind of this idea of intelligence and reasoning has always been at the back of my mind, because reasoning is kind of what makes us intelligent.
于是,我开始致力于改进语言模型的推理能力。早在语言模型成为热门之前,我就意识到它们在推理方面表现相当糟糕。
So, I started to work on improving reasoning in language models, and very early on, even before language models became this big thing, I realized that they were quite bad at reasoning.
但我也意识到,人类其实并不怎么运用推理能力。
But also what I realized is that humans don't really use reasoning that much.
如果你仔细想想我们的日常生活,很多行为其实未经深思熟虑,我们几乎就像在自动驾驶模式下行动。
If you think about it, in our daily lives we don't actually think through a lot of our actions, we're almost acting on autopilot.
因此,要真正研究推理能力,我们需要一个推理至关重要的领域。
So, to really study reasoning, we needed a domain where reasoning was important.
这就是教育再次成为焦点的原因,因为这是人类学会良好推理的途径。
And that's where education became a thing again, because this is where humans discover how to reason well.
这其中有些非常有趣的地方:某种程度上说,这个动机就是试图教会AI更好地推理,并在此过程中理解教授推理意味着什么。
There's something so interesting in that, that the motivation is, in some ways, trying to teach AI to be better at reasoning, and in the process, understanding what it means to teach reasoning.
我觉得这是个相当巧妙的思考角度。
I mean, that's kind of quite a nice way around to look at it.
是的。
Yeah.
还有件有趣的事:自己做事和教会别人做事完全是两码事。
And also, it's interesting how doing something and teaching somebody else how to do it are not the same.
这本质上就是我们正在解决的挑战。
And this is basically the challenge we are now solving.
基础版Gemini正在推理、数学和编程等基础技能上逐步提升。
So the base Gemini is slowly improving at reasoning and math and coding, all of these basic skills.
但我们的工作其实是阻止模型直接运用这些技能给出答案,避免替学生完成任务,而是要学会克制,思考该提出什么样的问题才能引导学生自己找到答案。
But then our job is to actually stop the model from using these skills and kind of giving away the answer and really, like, just doing the job for the student, and instead holding back and thinking about what are the right questions I can ask the student so that they can figure it out by themselves.
这非常困难。
And that's very hard.
比如,模型经过微调以提供帮助。
Like, models are fine tuned to be helpful.
所以,最初的反应是,我会直接给你答案。
So, the initial reaction is, I'll just give you the answer.
因此我们必须做大量工作来阻止它们这样做。
So we have to do a lot of work to stop them from doing that.
但实际上,我认为你确实一针见血地指出了——能做某事与能教授它是两回事。
But then, actually, I think you've really hit the nail on the head there, that being able to do something is not the same as being able to teach it.
在大众教育领域——这是我最了解的领域——我深刻感受到不同部门对学生要求以及培养这些技能和知识的最佳方式存在拉锯战。
And I'm really struck, in mass education, which is the space that I know most about, by how there is this push and pull from different sectors about what is required of students and the best possible way to instill those skills and that knowledge.
如果你要构建一个具有普遍吸引力的AI,如何找到平衡点确保满足所有领域的要求?
If you're building a sort of AI which will have this universal appeal, how do you find that balance of making sure that you're hitting all of the notes that are required from all of the different areas?
这是个好问题。
It is a good question.
当我们最初构建这个导师系统时,我们想可以咨询教师、该领域的学者以及学习者,找出最完美的教学方式。
So when we first started building the tutor, we thought, you know, we can talk to teachers and kind of other academics in the field, as well as learners, and figure out, okay, what is the perfect way to teach?
然后我们...
Then we'll...
就好像确实存在某种...
As though there is a sort of a...
是的,但你总会假设万事都存在最优策略——可能是我们科学家的思维惯性。我们确实这么做了,采访了大量利益相关者,结果发现存在很多分歧。
Yeah, but you kind of assume that in everything there is this optimal strategy, maybe that's the scientist in us. And we did that, we went and interviewed a lot of stakeholders, and what we realized is that there is a lot of disagreement.
实际上,当我们开始在YouTube或Gemini应用等不同谷歌服务上部署早期辅导模型时,我们发现即使在这些场景下需求也不尽相同。
And actually, once we started even deploying our early tutor models on different Google services like YouTube or the Gemini app, we found that even there, there were different requirements.
比如在YouTube上,教育视频才是主角。
So, let's say on YouTube, the educational video is the main act.
辅导老师的职责就是提供支持,或许他们应该更多地直接给出答案,因为这对学习者表面上看确实有帮助。
So the tutor is really there to support that and maybe the tutor should be giving away answers much more because it's actually helpful for the learner on that surface.
与此同时,如果你和学校老师交谈,他们会提出截然不同的要求。
At the same time, if you talk to a teacher at school, they have very different requirements.
他们绝对不希望辅导老师直接透露答案,尤其是考试题目的答案。
They really don't want the tutor to give away answers, definitely not to the exam questions.
他们还会要求辅导老师遵循特定考试委员会的规定,或者配合某位任课教师特有的教学风格。
And they will also want the tutor to follow some particular exam board requirements or particular teaching style of that particular teacher.
那么,你究竟该如何将这些不同的声音整合到一位辅导老师身上呢?
So, how do you actually incorporate all of those diverse voices into a single tutor?
我们意识到需要建立一种基础教学模式,通过不同指令来引导教学方向。
So, what we've realized is that we need to build kind of this base pedagogical model that you can steer with different instructions.
比如一位老师可能会说:今天我只希望学生们玩得开心,回答他们所有问题,重点营造愉快的学习体验。
So, one teacher can come and say, Actually, I want my students to just have fun today and just answer any question you have and really push on some fun experiences.
而另一位老师可能更注重学术性,会直接说:不行。
And another teacher might be much more academic and say, no.
今天我们要专门练习考试题目。
Today, we're doing exam practice problems.
你只需要引导学生掌握这些知识点,确保他们完全理解。
You just guide the student through these topics and make sure that they understand everything.
我认为教育的关键在于,正如你所说,并不存在完美的教学方法,只有这些不尽完美的衡量标准。
I guess one of the big things about education, I mean, as you said, right, is that there isn't this optimal approach to teaching, but there are these kind of imperfect measures, really.
我们见到好的教学时能辨认出来,但要量化却相当困难。
You know, we sort of know good teaching when we see it, but it feels quite difficult to quantify.
那么在这个探索过程中,你们如何判定什么是好的教学法呢?
So how do you decide what counts as good pedagogy when you're sort of navigating this space?
首先你可能会说,这不是有教育科学理论吗?
So first, you might say, well, there's learning science, right?
那你为什么不直接看看那些论文呢,它们会给你答案。
So why don't you just look at the papers and they'll give you the answer?
确实,文献资料很多,但目前还没有达成共识。
And yes, there is a lot of literature, but there is no consensus as such.
但另一个问题是,教学法非常依赖具体情境。
But another thing is pedagogy is very context dependent.
所以,对初学者有效的方法可能对专家学习者无效;对数学这类程序性学科(你学习的是解决问题的步骤技巧)有效的方法,可能也不适用于历史这类更依赖记忆的学科。
So, what works for a novice learner might not work for an expert learner, or what works for a subject that's more procedural, like, let's say, math, where you actually learn the skill of the procedure of how to solve a problem, might not work for more memory-based subjects like history.
当你开始思考时,会发现有数百种被研究过的教学策略,它们在不同情境下的效果都略有差异,突然间你就面临一个巨大的空间——可能不存在一个最佳教学法的单一点,而是有许多不同区域各自在特定情境下成为最优教学法。
So, when you start thinking about, okay, there's these hundreds of different pedagogical strategies that have been studied, all of them work slightly different in different contexts, suddenly you have this massive space where maybe there isn't like one single point that that's the best pedagogy, but there are many different regions that are best pedagogies in the given context.
但问题在于,你该如何量化这个空间?
But the problem is, how do you even kind of quantify this space?
然后如何在这个空间中寻找完美的教学策略?
And then how do you search it for this perfect pedagogy strategy?
这开始有点像DeepMind之前的工作,比如下围棋。
And it becomes kind of similar to the work that DeepMind has done before, like playing the game of Go.
这对AI来说是巨大挑战的原因在于搜索空间太大——所有可能的走法数量多得惊人。
The reason why it was such a huge challenge for AI was because the search space was huge, like all the possible moves you can do, it's like so many.
而且并不存在一个已知的制胜策略。
And there isn't one known strategy.
你不能简单地说AI必须搜索所有可能的走法和策略,然后找出它认为最优的方案。
You can't say AI has to search the space of possible moves and strategies and discover what it thinks is the best one.
我们在AlphaGo项目中发现,首先AI的表现远超人类——基本上超越了人类几千年来的围棋水平。
And what we found with the AlphaGo work was that, first of all, AI was much better than humans, like basically all of humanity for thousands of years playing the game of Go.
AI实际上能在几天或几个月内搜索整个空间并发现更好的策略,这是人类无法企及的。
AI was actually able to search the space and discover better strategies in the matter of days or kind of months compared to what humans could do.
因此,我们希望能在教育领域实现类似的目标,但我们需要回到一个根本问题:我们如何真正定义成功?
So, our hope is that we can do something similar with education, but we're going back to this kind of question like, how do we actually know what success looks like?
在围棋中,胜负分明易于衡量;而在教育领域,圣杯般的追求是学生的学习成果是否真正提升。
In Go you can still measure who has won and it's pretty unambiguous, whereas in education the holy grail is whether the students' learning outcomes have become better.
但这并非能快速测量的指标,往往需要数月甚至数年时间持续追踪学习者,这在现实中可行性不高。
But this is not something you can measure quickly, you kind of need months, if not years, to really track the learner and that's not really feasible.
所以我们的大量工作实际上聚焦于:既然知道目标,如何找到更易测量、更快见效的近似评估方法?
So, a lot of our work is actually done on, okay, we know what we're aiming for, but how can we approximate it in a way that's easier to measure and faster to measure?
我们最近发布的70页报告详细记录了在亚利桑那州立大学的教学测量尝试:从与真实学生合作进行数月的长期评估,到邀请教学评估专家用数周时间分析学生与AI导师的对话样本,再到让AI系统互评以获取数小时内更精准但范围有限的反馈。
So, we published a report recently where it's like 70 pages of basically our trial and error and different attempts at measuring pedagogy, going from working with real students at Arizona State University and maybe measuring at the longer timescales of a couple of months, to asking pedagogical raters and teachers to look through, like, a few examples of conversations between students and our AI tutor and maybe give us quicker feedback on the order of weeks or days, to automatic measures where we actually ask AI to evaluate AI and give us much more targeted, much more limited, but still useful feedback in a matter of hours.
不过我认为,要真正评估优质教学,还是需要开展完整的随机对照试验,进行长期跟踪观察。
But I guess if you really want to evaluate what good teaching is, you want to do that full randomized control trial where you're monitoring people over a period of time.
你觉得我们距离能开展这类试验还有多远?
How far away do you think we are from being able to run those?
实际上这类试验正在进行中。
Well, these are being run right now.
比如亚利桑那州立大学就是我们当前的试验基地之一。
So, Arizona State University is one example where we're actually running these.
问题在于,即便将学生分为能接触AI导师组和对照组,我们发现理论上能使用辅导系统的组别中,实际参与的学生比例很低。
I think the problem with these is, even if you take your students and you split them into students who have access to the AI tutor and students who don't, what we find is that in the group where they theoretically have access to the tutor, only a small percentage actually engage with it.
这就产生了新问题:为什么有些学生参与而另一些不参与?
And that creates a problem, because why are some students engaging and others not?
这些学生是否存在本质差异?
Is there something inherently different about these students?
如果只在参与者中看到成效,这究竟归功于导师系统,还是因为这些学习者本身动机更强——没有辅导也会表现更好?
And then, if we only see success in those who engage, is it because of the tutor or is it because these learners were inherently more motivated and hence they would have done better anyway?
于是核心问题就变成:我们究竟在帮助谁?
And then the question is, who are we helping?
这在更大范围内会产生什么影响?
And what effect does it have at the larger scale?
所以,如果你考虑一下优等生和后进生的情况,你在帮助优等生变得更好,但实际上并没有帮助后进生,这实际上是在扩大差距。
So, if you kind of think about the top students and the bottom students, and then you're helping the top students do better, but you're not actually helping the bottom students, you're actually increasing the gap.
但我认为每个进入教育科技领域的人,实际上都是想缩小这种差距。
But I think everyone, when they go into EdTech, they actually want to decrease the gap.
我们该如何做到这一点?
How do we do that?
我们如何确保每个人都能参与进来?
How do we make sure that everyone engages?
这是我们正在研究的另一个重大问题。
That's another big question that we are working on.
我是说,无论你看哪里,都只能看到不完美的衡量标准,不是吗?
I mean, there's just imperfect measures everywhere you look, isn't there?
是啊。
Yeah.
比如,要获得任何领域的真实情况都非常非常困难。
Like, it's very, very difficult to get a real ground truth in any of it.
但我想说的是,这其中还存在更多复杂因素,因为那种教学方式...
But then I suppose there's also, I mean, there are further complications in this because, okay, so that sort of teaching style...
但可以推测,有些学科的客观事实比其他学科更多。
But presumably, there are some subjects where there's more of a ground truth than others.
我在想,比如如果你创建一个历史学科的辅导工具...嗯...
I'm thinking, for example, if you created a tutor for history... Mhmm.
我的意思是,根据你所在国家的不同,对特定问题最相关的答案也会有所不同。
I mean, it would change depending on which country you were in as to what might be the most relevant answers to a particular question.
是的。
Yes.
这对我们来说是个大问题。
This is a big issue for us.
确实如此。
We are, yeah.
我们深入思考过在这种情境下该如何应对,因为历史问题往往没有唯一正确的答案。
We've thought a lot about what do you do in this situation, because you can't give this one true answer to any history question.
这再次说明了为什么我们要考虑可引导性——让不同国家的教师能为辅导系统提供背景信息,使其了解回答某些问题的预期方式。
This is again why we're thinking about steerability, so that teachers in different countries can give kind of the background information to the tutor, so that it kind of knows what is the expected way of answering certain questions.
但话题常常会引发那些既重要又敏感、难以讨论的问题。
But also, often topics bring up questions that are really important to discuss, but are also hard to discuss and very sensitive.
比如大屠杀这类事件。
I'm thinking things like the Holocaust.
那么问题又来了:辅导系统在这些情况下该如何表现?
So, again, how should the tutor behave in these situations?
我认为安全防护的标准做法往往是回避困难对话。
I think the standard approach to safety often is effectively declining to engage in difficult conversations.
但辅导系统不能这样做。
But that's not something a tutor can do.
确实不能。
No.
教育的意义本就包括思考难题。
Mean, part of the point of education is to think about difficult things.
正是如此。
Exactly.
坦白说我们尚未解决这个问题。
So, I can't say that we've solved this problem.
我们正尝试呈现多元观点,为学习者创造批判性思考不同理念的空间。
We are trying to give different views and kind of trying to give the learners a chance to critically evaluate different ideas in this space.
我也在努力将元认知应用到这个问题上。
I'm also really trying to bring metacognition to this problem.
元认知是个很有趣的概念。
So, metacognition is an interesting one.
我认为它经常被忽视,但很多人其实并不知道如何学习。
I think it often gets overlooked, but a lot of people don't actually know how to learn.
学习通常没有想象中那么有趣,它需要你提前规划、真正投入材料,是的,大多数人并不真正知道该怎么做。
It's often not as much fun as you would expect, it requires you to plan ahead, to really engage with the materials, and yeah, most people don't really know how to do that.
所以,导师能做的是真正教会学习者:如果你想回答这个难题,也许你应该去查阅不同的原始资料,然后思考它们告诉了你什么,你自己怎么看,其他专家怎么看,这更像是教会学习者如何回答问题,而不是直接给出答案。
So, what a tutor can do is actually teach the learner, Okay, if you're trying to answer this difficult question, maybe what you should do is go and look up different primary sources, and then think about what are they telling you, and what do you think about it, what do other experts think about this, and kind of teaching the learner how to go about answering these questions rather than necessarily giving the answers directly.
那么这里面是有层次的。
There's layers to it then.
我我我想,在最底层,你有知识和事实,比如数学就充满了这些。
I guess on one layer, you have, like, knowledge and facts, which, I guess, maths is quite, you know, full of.
然后在这之上,你有批判性评估的技能,再往上就是元认知,即如何培养评估知识的技能。
And then above that, you've got, like, the skills of critically evaluating, and then above that, metacognition, which is how to develop the skills to evaluate the knowledge.
正是如此。
Exactly.
所以你认为这就是解决应对难题时的安全性问题的答案吗?
So you think that's the answer to this safety question of approaching difficult problems?
所以不一定是安全性的答案。
So not necessarily the answer to safety.
更像是一个关于如何接触那些如你所说没有唯一正确答案的学科的回答。
It's more of an answer for how to engage with subjects where, as you said, there isn't necessarily a single ground truth.
就安全性而言,我认为这是个略有不同的问题。
In terms of safety, I think it's a slightly different question.
有时人们会问我们,为什么要研究安全性问题?
Sometimes people ask us, like, Why are you working on safety at all?
你们不是已经在基础模型上做了大量安全微调工作吗?
Aren't you using base models which have already gone through a lot of safety fine-tuning work?
答案是,尽管它们已完成所有这些基础工作,但针对教育类用例时,必须考虑这些系统的实际使用场景。
And the answer to that is, even though they have done all of this background work, when it comes to the educational use case specifically, you have to think about how these systems will be used.
我们发现,我们的AI导师被部署给亚利桑那州立大学学生,特别是通过其学习大厅项目,该项目旨在让更多元化的学习者接受高等教育。
So, one thing we found was that our tutors are deployed to Arizona State University students, and in particular through their Study Hall program, which is aimed at bringing more diverse learners to higher education.
本质上,任何在YouTube观看ASU视频的人都能受邀参加这门课程——课程内容相同,但提供更多教师支持,并有获得学分继而转学成为ASU正式学生的机会。
So, essentially, anyone watching ASU videos on YouTube can get invited to take part in this course where it's the same lectures but with more faculty support and essentially an opportunity to earn credit and then transfer to become an actual student at Arizona State University.
这意味着这些学习者通常已有全职工作或家庭责任,时间紧迫且压力很大。
But what it means is that these learners are typically already in, like, full time work or they have family commitments, they're quite short on time and stressed.
因此当他们学习时,有时会处于糟糕状态——可能晚上11点才开始学习,只想发泄情绪。
And so, when they're learning, naturally, sometimes they are just, you know, in a bad state and there's no one around, maybe they're starting at 11PM and they just need to vent.
他们唯一能倾诉的对象就是屏幕上这个AI导师。
And the only thing that they can vent to is this AI tutor that's sitting in front of them on the screen.
我们经常看到这类情绪爆发:'我压力好大'、'真的坚持不下去了'、'我能解决这个问题吗'、'或许该放弃了'。
So we find these kind of emotional outbursts like, I am so stressed, I'm really struggling here, will I ever be able to solve this problem, maybe I should just quit.
导师不能忽视这些消息,不能简单回复'抱歉我无法回答'。
And, you know, the tutor can't ignore these messages, can't just say, Sorry, I can't answer this.
它必须积极回应,在这些脆弱时刻给予用户情感共鸣。
It really needs to engage and say something that connects with the user in these very vulnerable states.
因此我们训练的导师会这样回应:'有这种感受很正常'、'我们能共渡难关'、'这里有帮助你的资源'等——我们经常看到这类对话记录。
So, what our tutor is trained to do, and we see transcripts like this coming in, is like, you know, it's fine to feel this way, everyone feels this way, we can get through this together, there are resources that can help you and things like that.
我知道你曾写道AI导师应谨慎对待敏感自我表露,特别是在这类场景中。
I know that you've written that an AI tutor should be careful about sensitive self disclosure, I guess particularly in that sort of a setting.
你具体是指什么?
What did you mean by that?
当人们相互交谈时,常会发生这样的情况:其中一方可能会透露一些个人情况或事实,这会鼓励对方也敞开心扉分享自己的事。
So when people speak to each other, what often happens is maybe one of the conversation partners will say something personal, maybe mention a personal fact, and that encourages the other person to also open up and share something about them.
通过这种方式,他们建立起信任和某种联系,从而推动对话继续。
And through this, they build trust and kind of a connection that helps the conversation move forward.
当学习者倾诉自己压力很大的私密感受时,他们很自然地会期待导师也能分享类似经历。
And when a learner mentions something so personal about how stressed they are, it's almost natural that they would expect the tutor to share back.
但问题在于,导师并没有可以分享的过往压力经历。
But then, of course, the tutor doesn't have a stressful situation from their past that they can share.
任何这样的自我披露本质上都会构成谎言。
Anything they self disclose like that would be effectively a lie.
因此我们需要把握一个微妙的平衡——导师既要维系情感连接支持学习者,又不能误导对方,避免在人类与AI之间建立本不该存在的情感纽带。
So, there's this very kind of thin line that we have to walk, where the tutor needs to maintain the connection and make sure that they support the learner, but at the same time not mislead them and not create a connection which shouldn't exist between a human and an AI.
所以它绝不能伪装成人类,但必须懂得如何与人类学生共情。
So at no point can it pretend to be another human, but it needs to understand how to empathize with a human student.
正是如此。
Exactly.
不过我很好奇,这里其实涉及一个关于拟人化程度的有趣课题。
But then, okay, I sort of wonder, there's something really interesting there about, like, the correct amount of anthropomorphization.
让学生明确知道对方是AI而非人类,是否存在某些优势?
Are there some advantages to students knowing that it's an AI, knowing that there isn't a human at the other end?
比如学生是否更敢于在AI面前犯错?
Like, are students more comfortable making mistakes in front of the AI, for instance?
确实。
Yes.
毫无疑问。
For sure.
根据学生反馈,他们更愿意向AI导师提出自认为愚蠢的问题,因为不会像面对人类时那样感到被评判。
So something we've heard from students is that they feel much more comfortable asking what they might perceive as a silly question to AI tutors, just because they don't feel judged, as you kind of do when there is a human on the other side.
此外,在课堂上提问时还会面临同伴的评判,而在与AI导师一对一的场景中就没有这种顾虑。
Also, when you're in a class and you could ask a question, there's peer judgement as well, which you don't get in this one-on-one setting with an AI tutor.
你基本上可以畅所欲言,不会有问题。
You can basically say anything and it's going to be fine.
我们发现学习者对此非常赞赏。
So, we find that the learners really appreciate that.
但信任问题怎么解决呢?
But then what about trust?
你们是否发现人们最终会比信任人类导师更相信AI?
Do you find that people end up believing the AI more than they would a sort of human tutor?
有时确实如此。
Sometimes we do.
在开发AI导师的最初阶段,我们遇到了一个非常有趣的情况:我们想测试它与人类教师的对比效果。
So we had this very interesting situation where, in the very first stages of developing the AI tutor, we wanted to test it out, how it compares to human teachers.
我们联系了付费测评员,告诉他们:'看,你有机会学习不同科目,我们会为你匹配导师',但没说明是AI还是人类。
So, we brought in paid raters who were told: Look, you have this opportunity to learn different subjects, you will get connected to a tutor. And we didn't tell them whether it was an AI or a human.
就是让他们享受学习过程。
And just, you know, have fun, enjoy the learning experience.
之后我们让他们填写了问卷。
And after that, they were given a questionnaire.
问卷中我们询问了诸如'你认为自己学到了多少'、'体验感受如何'等问题。
And in this questionnaire we asked things like, you know, how much do you think you've learned, how much did you enjoy the experience?
这是我们导师系统的第一个版本,当时我们知道它相当糟糕。
And so, this was the very first version of our tutor, which we knew was quite bad.
但令人惊讶的是,学习者反馈从AI导师那里学到的比人类导师更多。
And we found surprisingly that the learners reported having learned more with an AI tutor than a human.
这确实显得很奇怪。
So, that seemed strange.
于是我们决定浏览一下文字记录,看看那里发生了什么。
So, we decided to kind of look through the transcript to understand what is going on there.
我们发现AI导师编造了各种有趣而惊人的事实,作为学习者,导师说的几乎所有内容听起来都像是'我没想到这个,这是我今天刚学到的冷知识'。
And we found that the AI tutor hallucinated all sorts of interesting, surprising facts. And of course, as a learner, pretty much everything the tutor says sounds like, I did not expect that, this is a fun fact I just learned today.
所以他们当然印象深刻,觉得自己学到了更多,但实际上这不是导师应该做的行为,这绝对是我们未来迭代中要解决的问题。
So, of course, they were very impressed and felt that they had learned more, but in fact, this is not something that the tutor should be doing, and it's definitely something that we worked to address in future iterations.
这是未来需要担心的问题吗?
Is that a concern going forwards?
我是说,关于AI幻觉和人们将其误认为真实知识的问题?
I mean, the idea of hallucinations and people mistaking those for real knowledge?
这绝对是个值得关注的问题。
It is definitely a concern.
基础技术在事实准确性方面正在进步;而在教育领域,由于我们教授的是已知的教学内容,所以总是有一定的事实依据。
The base technology is getting better at factuality. And with education specifically, because we're teaching material that is known, there's always some sort of grounding.
我们的导师通过声明'我只教授这个特定YouTube视频内容'或'只讲解老师提供的这篇文本',并将事实指向这些原始素材,从而避免了部分事实准确性问题。
Our tutors kind of avoid some of these issues of factuality by being able to say, you know, I'm only teaching you about this particular YouTube video, or this particular piece of text that your teacher has provided, and by referring facts back to that primary source.
这样就减少了导师编造内容的机会。
So it kind of gives the tutor less opportunity to actually make things up.
其实我还想思考大型语言模型对整个教育领域更广泛的影响,不局限于特定的AI导师。
I just wanted to think actually also about the effect that large language models have had on education more generally, so outside of a specific AI tutor.
因为大家都在问一个重要问题:如何设置防护措施来防止AI被用于作弊,比如替考或代写论文。
Because there is a big question that everyone has been asking, which is about putting in safeguards to stop AI being used to cheat, or to, you know, sort of do people's exams for them or do people's essays for them.
那么能设置什么样的防护措施呢?
So, what kind of safeguards can you put in?
这项技术已经无处不在,我们与学生交流过他们使用生成式AI的情况,连我们都惊讶于他们的使用频率。
This technology is so pervasive. We actually talked to students about their use of GenAI, and even we were surprised by how much they used it.
他们直言不讳地说屏幕上方是课堂讲义和笔记,底部就是生成式AI。
So, literally, they were saying that their screen is kind of their lecture, their notes, and then GenAI at the bottom.
因此,我认为这项技术将会持续存在,并被学习者所使用。
So, I think the technology is here to stay and it will be used by the learners.
可以尝试鼓励学习者批判性地评估回应,可能需要改变我们的评估方式,比如调整作业内容,使其真正与技术协同工作。
What can be done is trying to encourage learners to kind of critically evaluate the responses, trying to maybe change how we evaluate, like, what are the assignments, so that it actually works with the technology.
因为仔细想想,教育是为我们进入现实世界做准备。
Because if you think about it, education is preparing us for the real world.
而在现实世界中,我认为人们会越来越期望我们实际运用这项技术,因为它确实在很多方面有所帮助,能提高我们的效率。
And in the real world, I think the expectation will be more and more to actually work with this technology because it does help in many ways and does make us more productive.
所以,在教育阶段禁止使用,却期望学习者能在工作中正确使用它,这是不合理的。
So, it doesn't make sense to ban it during education and then expect learners to know how to use it properly in their work.
或许我们可以这样思考——这也是我们从教师那里听到的——如何改变作业形式及教学方式,将生成式AI作为合作伙伴来鼓励使用,但评估方式要有所不同。
So, maybe one way to think about it, and I think that's what we've heard from teachers, is how to change assignments and the ways of teaching and working, so that GenAI is encouraged as a partner but the evaluation is done slightly differently.
这有点像过去计算器的使用场景:某些数学考试允许使用计算器,但你仍然需要掌握独立运算的能力。
So it's kind of like calculators in the past, where you're allowed to use calculators in certain math exams, but you're still expected to know how to do these calculations without help.
我确实担心长期来看,当生成式AI成为全天候助手时,我们是否会产生某种依赖。
I do wonder in the longer term, as we start to see, I don't know, like, GenAI being the assistant at all times, whether we can end up building a bit of a dependency on them.
我的意思是,学生是否会形成一种虚假的掌握感?
I mean, do students end up with a feeling that they have mastery when they don't?
实际上完成工作的是AI。
Actually, it's the AI that's doing the work.
所以我认为,你指出了这里存在的两个潜在问题。
So I think there are two potential issues here that you've identified.
一个是这种虚假的掌握感。
One is kind of this feeling of mastery when there isn't one.
这是任何学习过程中都普遍存在的因素,即便是传统教育也不例外。
And this is a very common factor in any kind of learning, even if you're talking about traditional education.
例如,学生在备考时最常做的事情之一就是反复阅读笔记或教材。
For example, one of the things that students do a lot in preparation for exams is just like reread their notes or reread the textbook.
这种方式会让他们产生一种错觉,以为自己已经掌握了材料,仅仅是因为对内容很熟悉。
And that kind of creates a feeling like they have mastered the material just because they are so familiar with it.
但真正参加考试时,他们却发现自己记不住知识点,也无法有效运用信息。
But when they go into an exam, they actually find that they can't remember the facts and can't use the information well.
因此,这种单纯重复阅读的方式被公认为是一种糟糕的学习策略。
So this is kind of one of the things that is very well known to be kind of a bad educational strategy, just rereading.
我们在AI导师身上也发现了同样现象:当询问学习者对教学对话的感受和学习效果时,他们往往表示非常满意。
And we find the same with AI tutors, where if we ask a learner how they thought the conversation went and how much they think they've learned, they can report really good satisfaction.
而如果把同样的对话交给教师评估,询问教学效果如何,他们的评分可能会截然不同。
Whereas if we give the same conversation to a teacher and then ask them the same question, how pedagogical was the tutor, how well do you think that session went, they could rate it very, very differently.
我想,另一个因素是关于依赖性的问题。
I guess, the other factor is this question of dependency.
我们确实发现,如果学生在学习中频繁使用生成式AI,他们会觉得帮助很大。研究也表明这确实能提高作业成绩和分数。
We definitely find that if the learners use Gen AI a lot during their studies, they feel like it's really helping in the process and actually, studies show that it does increase the success in exercises and kind of marks.
但到了考试时,学生的表现反而会下降。
But when it comes to an exam, actually, the learner's performance drops.
这是因为在学习过程中,他们过度依赖AI提供答案——即使AI只是引导而非真正教学,他们也可能把思考过程完全外包给了AI。
And that's because during studies, they get so dependent on the AI providing them with the answers. And even if it guides them, if it doesn't actually teach them the right things, they might just be outsourcing their reasoning to the AI.
这在考试环境下就会成为问题,因为无法使用AI时,他们既不记得知识点,也不具备独立解决问题的能力。
That can be a problem during exam conditions where you don't have access to it anymore, and you don't actually remember or know how to reason through these problems on your own.
我想这是因为最好的考试不仅测试知识,还要测试技能。
I guess because the best exams aren't just testing knowledge, they're also testing skill.
确实如此。
Exactly.
嗯。
Mhmm.
根据你所有的研究,关于如何创建一个有效的人工智能导师,你得出的重要结论是什么?
From all of your research then, what are the big conclusions that you draw about about how to create an effective AI tutor?
你认为你已经解决这个问题了吗?
Do reckon you've solved it?
不,还差得远呢。
No, nowhere near.
我想说我们只是迈出了第一步,这一步某种程度上是意识到这个问题有多难。
And I will say we just made the very first step, and that step is kind of realizing how hard of a problem this is.
我认为当我们刚开始做这项工作时,我们很天真,有点理想化,以为我们能在一年内解决它。
I think when we first started doing this work, we were naive and kind of wide eyed, thinking that we will come in and solve it within a year.
但现在,我想我们对问题的范围和需要解决的主要事项有了更好的理解,这样才能开始取得有意义的进展。
But now, I think we have a better idea of the scope of the problem and what are the main things to address to start making meaningful progress.
这些问题包括:我们如何定义成功,如何衡量教学法,从哪里获取数据,如何实际训练这些模型,以及如何更好地与社区互动,我们为谁而构建,这样我们才不会意外地扩大教育差距,而是朝着缩小差距的方向迈出有意义的步伐。
And these are things like: how do we know success, like how do we measure pedagogy, where do we get the data, how do we actually train these models, and also how to engage the communities better, and, like, who are we building for, so that we are not accidentally increasing the gap in education, but are making meaningful steps towards decreasing it.
所以,我认为我们面前还有很长的路要走,实际上,我们认为真的需要让整个社区一起努力解决这个问题。
So, I think there is a very long road ahead of us, and actually, we think that we really need to bring all of the community to work on this problem together.
因此,我们试图创建一些共同的基准,让大家可以一起攀登。
So we are trying to kind of create common benchmarks that we can all climb together.
那真是太好了。
That was really nice.
真的很好。
It was really nice.
谢谢你的参与。在与伊琳娜的对话中,令我印象深刻的是这栋楼里所考虑的问题类型发生了显著转变。
Thank you for joining me. I was really struck, in that conversation with Irina, by the notable shift in the sorts of problems that are being considered in this building.
我们已经从处理确定性的事物(比如国际象棋或围棋的输赢,或者图像中是否有猫)转向了教育这个没有绝对标准的领域:什么是好的教学、什么是有效的学习体验、如何在知识、技能与学会学习之间取得平衡、导师应提示多少又保留多少,甚至导师应该有多人性化,在每个方向上都只有不完美的衡量标准。
We've gone from dealing with definites, like winning or losing at chess or Go, or recognising cat or no cat in images, to education: a space with no absolutes, only imperfect measures in every direction. In what counts as good teaching, in what counts as an effective learning experience, in how to get the balance between knowledge and skills and learning how to learn, or how to walk the line between how much a tutor should prompt and how much it should withhold, even how human a tutor should be.
这些问题都没有标准答案。
None of those questions have ground truths.
正是这一点使得这项挑战如此艰巨,但正如伊琳娜所精彩展示的那样,解决它需要谦逊与合作精神。
And that is what makes this challenge so incredibly difficult, but also one as beautifully demonstrated by Irina there, which requires humility and collaboration to solve.
您正在收听的是由我——汉娜·弗莱教授主持的《谷歌DeepMind》播客节目。
You've been listening to Google DeepMind: The Podcast, with me, Professor Hannah Fry.
如果您喜欢本期节目,请订阅我们的YouTube频道。
If you enjoyed that episode, do subscribe to our YouTube channel.
您也可以在您喜爱的播客平台上找到我们。
You can also find us on your favorite podcast platform.
我们还将推出涵盖各类主题的更多节目,敬请持续关注。
And we have got plenty more episodes on a whole range of topics to come, so do check those out too.