本集简介
双语字幕
希望在于你能发现一些让事物变得可理解的原理。从某种意义上说,唯一可理解的就是那些非临时拼凑的东西,对吧?唯一可理解的正是这些原理。因此,我认为这是我们工作中至关重要却未被充分认识的一点——在我看来,我们并非试图为某事物找到唯一机械论的解释。
The hope is that you can find some principles that make things understandable. In a sense, the only things that are understandable are the non-kludges, right? The only things that are understandable are the principles. And so this is a really important and, I think, underappreciated element of what we have to do: in my view, we're not trying to come up with the one mechanistic explanation of something.
我们试图寻找一类具有共同属性的等效解释。对我而言,神经科学中最大的空白就是我们尚未理解学习是如何运作的。当前我们使用的所有学习机器都在进行梯度下降——这基本上是它们的运作方式,而大脑并不采用梯度下降法。对吧。
We're trying to find a class of equivalent explanations that have some shared properties. To me, the biggest gaping hole in neuroscience is that we don't understand how learning works. All of the machines that we use for learning are doing gradient descent these days; that's basically what they do, and the brain doesn't do gradient descent. Right.
或许它近似于此。这些近似是什么?约束条件又是什么?我们尚不清楚。而我们不知道是因为目前还无法测量。
Maybe it approximates it. What are the approximations? What are the constraints? We don't know. And we don't know because we can't measure it yet.
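The gap pointed to here, between gradient descent and whatever the brain does, can be made concrete. Below is a minimal sketch in plain Python (the toy network, parameter values, and function names are my own illustrative assumptions, not anything from the episode): backprop's update for the first-layer weights needs the downstream weights `W2`, which is exactly the non-local information a biological synapse is not thought to have, while a Hebbian update uses only locally available pre- and post-synaptic activity.

```python
import random

random.seed(0)

# Toy two-layer linear network: h = W1 x, y = W2 . h.
def forward(W1, W2, x):
    h = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    y = sum(w2 * hi for w2, hi in zip(W2, h))
    return h, y

def backprop_step(W1, W2, x, target, lr):
    """Gradient descent on squared error. The W1 update uses W2:
    non-local credit assignment, which real synapses lack."""
    h, y = forward(W1, W2, x)
    err = target - y                            # global error signal
    for i, row in enumerate(W1):
        for j in range(len(row)):
            row[j] += lr * err * W2[i] * x[j]   # needs downstream W2[i]
    for i in range(len(W2)):
        W2[i] += lr * err * h[i]                # local: error * presynaptic h

def hebbian_step(W1, x, lr):
    """Purely local alternative: each weight changes by pre * post only,
    with no error signal at all."""
    h = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    for i, row in enumerate(W1):
        for j in range(len(row)):
            row[j] += lr * h[i] * x[j]

# Train the backprop network on a realizable linear target.
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
W2 = [random.uniform(-1, 1) for _ in range(4)]
target_w = [0.5, -1.0, 2.0]
xs = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
samples = [(x, sum(a * b for a, b in zip(target_w, x))) for x in xs]

def mse():
    return sum((forward(W1, W2, x)[1] - t) ** 2 for x, t in samples) / len(samples)

before = mse()
for _ in range(300):
    for x, t in samples:
        backprop_step(W1, W2, x, t, lr=0.01)
after = mse()
print(before, "->", after)   # error shrinks under gradient descent
```

The contrast is of course a caricature; the point made in the episode is precisely that we do not yet know which approximation to gradient descent, if any, local circuits implement.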
这里是《大脑启示录》,由《The Transmitter》支持。你以为自己掌握了原理?扎克·皮特考(Xaq Pitkow)才真正拥有原理。扎克是我今天的嘉宾,他负责运营卡内基梅隆大学的LAB LAB实验室。这里的LAB代表'算法大脑实验室'(Lab for the Algorithmic Brain),其首字母缩写恰好也是LAB——即算法大脑实验室(Lab for the Algorithmic Brain)。
This is Brain Inspired, powered by The Transmitter. You think you have principles? Xaq Pitkow has principles. Xaq is my guest today, and he runs the LAB LAB at Carnegie Mellon University. LAB here stands for Lab for the Algorithmic Brain, and an acronym for that is LAB, which stands for Lab for the Algorithmic Brain.
至于这个缩写的递归梗...你懂的。扎克是位理论神经科学家,正如我们将谈到的,他也具备实验神经科学背景。他涉猎广泛——实际上他自称'涉猎者'。他参与多项研究,但本次讨论的核心是他如何通过基本原理来开展认知研究,这些原理孕育了他的问题、模型和方法。
And an acronym for that... well, you get the point. Xaq is a theoretical neuroscientist with a background in some experimental neuroscience, as we talk about. He dabbles in... I think he actually describes himself as a dabbler. He dabbles in many endeavors. But the main theme of our discussion here is how he approaches his research into cognition by way of principles, from which his questions and models and methods spring forth.
因此我们探讨了这些原理,并由此延伸讨论了他试图理解和解释一系列认知过程的理论工作。具体议题包括:当我们设计任务让生物体解决以理解某些认知特征时,这些生物体会采用相对于任务次优、但相对于其行为信念近乎最优的策略——扎克称之为'逆向理性控制'。我们还讨论了概率图网络、大脑如何(或可能如何)运用概率进行计算,以及不同的概率计算方式。
So we discussed those principles and, in that light, some of his specific lines of work and ideas on the theoretical side of trying to understand and explain a slew of cognitive processes. A few of the specific topics we discuss: how, when we present tasks for organisms to solve in order to understand some facet of cognition, the organisms use strategies that are suboptimal relative to the task but nearly optimal relative to their beliefs about what they need to be doing, something Xaq calls inverse rational control. We talk about probabilistic graph networks, and we talk about how brains use probabilities, or may use probabilities, to compute, and the different ways they could use probabilities to compute.
他较新的项目是与多位合作者发起的'生态神经科学'。这些仅触及他正在开展、已完成或感兴趣的众多项目的冰山一角。更多关于他的研究原则和工作细节,请访问节目注释braininspired.co/podcast/219。衷心感谢我的Patreon支持者——若您支持本节目,可获取全部完整剧集和存档,加入Discord社区,并参与我们围绕复杂性基础论文组建的双周讨论组'复杂性小组会议'。
And one of his newer projects is Ecological Neuroscience, which he has started with multiple collaborators. These just touch on a few of the many projects that he is running, has run, and is interested in. You can learn more about his principles and about his work in the show notes at braininspired.co/podcast/219. Thank you so much to my Patreon supporters. If you support the show, you get access to all the full episodes and the full archive, you can join the Discord community, and you can access the Complexity Group meetings, a bi-weekly-ish discussion group that we've formed around the foundational papers of complexity.
如果想了解更多,可以找我几个月前关于大卫·克拉考尔的那期节目。总之,希望你们一切都好。再过几个月我会有个新工作室,不再是你们眼前这个狭小如壁橱的地方了。好吧。
Look for my David Krakauer episode from a couple months ago if you want to learn more about that. Anyway, I hope you're doing well out there. I'm gonna have a new studio soon, in a couple of months. It won't be this tiny, tiny closet that you see before you. Alright.
好好享受吧,扎克。扎克,我先试着概括一下你的工作,然后你来纠正我。我会用最宽泛的术语描述你的研究领域,然后想听听共同主题?最宽泛的共同主题,对吧?
Enjoy, Xaq. Xaq, I'm gonna give it a shot and then you can correct me. I'm gonna describe what you do in the broadest terms possible, and then you can... I wanna hear a common theme. The broadest possible common theme, right?
我喜欢这个方式。
I like it.
好的。那么,你在现实假设下研究规范性模型,以发现或推断认知功能。这些现实假设包括代谢成本、有限资源、认知的计算成本、次优条件下的理性等等。这只是极其简略的概括,我说错的地方在哪里?你会如何修正?
Okay. So, you study normative models under realistic assumptions to discover or infer cognitive functions. The realistic assumptions being things like metabolic cost, limited resources, the computational cost of cognition, rationality under suboptimality, and so on. That was super brief. Where did I go wrong, and how would you correct me?
哦,这是个很棒的开端。这是我们实验室正在研究的主要方向之一。我觉得这些问题非常有趣。我确实受到各种原理的启发,规范性原理是很自然的一个切入点。
Oh, that's a great start. That's a major theme of what we're working on in the lab. I think those things are really fun. I've definitely been motivated by principles, and different kinds of principles. Normative principles are a pretty natural one.
但你也提到了一些非规范性的原理,比如来自大脑内部的约束条件。我们是如何得出这些约束的?其中有些仍然像物理学一样具有原理性——这正是我最初对这个领域产生兴趣的原因。
But there's also... and you mentioned some non-normative principles, in the sense of constraints that come from inside the brain. How do we end up with those constraints? Some of them are still principled, like in physics. That was how I originally got interested in this whole endeavor.
等等,停一下。你这话是什么意思?
Wait. Stop there. What do you mean?
我本科时看过比尔·比亚莱克(Bill Bialek)的演讲,他展示了如何用物理解释大脑,我当时反应是:真的吗?没错。太神奇了。这就是我研究之路的开端。比如我们能感知的物理极限——根据物理学原理最微弱的光线。
I saw a talk by Bill Bialek when I was an undergrad, and he showed how you could use physics to understand the brain, and I was like, really? Yes. That's amazing. And so that was the beginning for me. Some of those constraints, like: we can see light about as dim as physics allows.
这是物理规律决定的限制。我们眼睛在分辨率与衍射模糊之间的平衡,也源于物理定律。但还有些遗留特征未必是必然的,比如某些架构性结构的存在,这些我也希望能理解。
That's a constraint that comes from physics. The balance in our eyes between resolution and diffraction blur, that comes from physics. Those constraints come from physics. But then there are some other legacy things that show up that maybe don't have to be that way. There are, you know, some architectural structures that are there, and I'd like to understand those as well.
这些更难探究,因为没有物理定律可循。它们只是进化历史的遗产和我们所处生态位的产物。比如大脑的学习与可塑性主要发生在局部区域,这就是神经连接方式带来的限制。
Those are harder to get at because you don't have physics to point to. That's just the legacy of our evolutionary history and the ecological niches that we occupy. So how can we understand something of how we end up this way? For example, one concrete example is that learning and plasticity in the brain are largely local. Mhmm. And that's a constraint that comes from our brain wiring.
像AI系统就不受局部学习规则的限制。这是我们大脑和生物系统的特殊性。两侧对称性是从远古继承的,这些特征未必是最优解,而是多重影响的结果。
Other systems, like AI systems, are not constrained to have local learning rules. Right? So that's a particularity of our brains and our biology and our systems. Bilateral symmetry, that's inherited from a long time ago. Those are things that are not necessarily optimal in some sense, but there are some influences that come in.
好的,请继续。
Okay. So go ahead.
我们另一个主要研究方向是各类相关性——它们的作用机制与起源。现在既研究小鼠大脑连接组这样的微观尺度,也探索人类语言在大脑中的宏观表征方式。
I was going to say, another major effort... so those are all kind of normative or normative-adjacent things that we work on. But we also have some other types of things that we work on. I'm known for doing a bunch of work on correlations of different sorts, like what those do for you, where they might come from. And we've now been working on some really fine-scale things, like the connectome of a mouse brain, and some very large-scale things, like human language and how that's represented in the brain.
研究范围其实很广。但核心支柱始终是这些规范性模型,其他方向都由此延伸。因为我特别钟爱原理性研究。
So it's a pretty wide range of things. But I would say that the bread and butter, the core out of which these other spokes emerge, is indeed these normative models. Because I really like principles.
好的。那么我们先坚持讨论一下规范性模型,因为你刚才提到你对物理产生了兴趣,而这些规范性原则正是建立在那些我们无法控制的、通过进化形成的结构之上。
Yeah. Okay. So let's stick with the normative models for a minute, because... Yeah. Absolutely. ...you were just describing how you got turned on by physics, and yet these normative principles are built on things that have happened through evolution, the structures that we can't control.
因此,你面对这些生物学上混乱的现象,却以某种方式将它们视为算法和计算过程的基础、载体或媒介,认为它们朝着某个目标具有规范性。那么,你如何调和——你知道的——规范性事物建立在非规范性基础上的矛盾?换句话说,如果这种描述是合理的话。
And so you have these biologically messy things, and then somehow you view the algorithms and the computational processes that they're enacting, built on top of them or within them or through them, as normative toward a goal. Like, how do you mesh, you know, the normative stuff being built on non-normative stuff, in other words, if that's a fair characterization?
没错。机制本身是非规范性的,它被推向那些最优化的方向。但它是否真能达到则是另一个问题。
Yeah. That's right. The machinery is non-normative, and it's pushed in those directions, the directions of optimality. But whether it actually gets there is a separate question.
你会说不能吧?几乎可以说——
Well you would say no. Right? And almost in
是的,它实际上从未真正达到。我是说,在某些案例中确实存在那些绝妙的例子。但更多情况下它并非完全最优,这时你会问:我们能否将其理解为某种原则?
Yeah. It never really gets there. I mean, in some cases it does; those are the most beautiful examples. But there are plenty of cases where it's not going to be exactly optimal, and then you can say, well, can we understand this anyway as a principle?
为此我们尝试提出一种称为'逆向理性控制'的框架。即动物虽非全局最优,尤其在实验环境中,但它们可能以自我一致的方式行动,在其认知范围内尽力而为。于是你可以定义一组假设——它认为自己要实现的目标是什么——然后说它在此范围内是最优的。
And so one of the ways that we've tried to formulate that is through something we call inverse rational control. Mhmm. Where we say that the animal is not optimal globally, and it's certainly not optimal for the experiments that it's being put into. But it might be acting in a way that's self-consistent, doing the best it can under its assumptions. So then you can define a set of assumptions, what it thinks it's trying to accomplish, what its goals are, and then say it's optimal within that.
当然可能是错的。谁可能出错?动物。是动物本身。
And you might be mistaken. Right? So... Who might be mistaken? The animal. The animal.
嗯,两者都是。我是说,研究人员可能误解了对动物而言重要的事情。当然,动物也可能误解了对研究人员而言重要的事情。对吧?
Well, both. I mean, the researcher might be mistaken about what's important for the animal. Sure. And the animal might be mistaken about what's important for the researcher. Right?
就实验设计而言,比如,哦,这种情况发生的频率。这些事情是独立的,或者这些事情是相关的。所以无论动物对你为它构建的小世界的结构有何种假设,那些假设都可能是错误的。因此,我们可以假设动物在这些错误的假设下仍然尽力表现良好,但从外部视角看,它的行为不会显得最优,因为它做的事情在任务背景下没有意义。你必须找到它们确实有意义的方式,我们称之为合理化。
In terms of the experimental design: like, oh, this happens this often; these things are independent, or these things are correlated. So whatever the animal is thinking about the structure of the little world you've put it in, those assumptions may be wrong. And so we can make the hypothesis that the animal, still under those wrong assumptions, is trying to behave as well as it can, but it's not going to look optimal from the outside point of view, because it's doing things that don't make sense according to the task. You have to find the way in which they do make sense, which we call rationalizing.
就像你为什么要先刷牙再吃早餐?现在你得为这种行为找出某种合理化解释,而不是反过来。也许这个原则放宽了最优性的概念,但并没有陷入‘随便怎样都行’的境地。
Just the same way as, like, why did you brush your teeth before you ate your breakfast? Well, now you have to come up with some rationalization of why you do it that way instead of the other way around. And so maybe that's a principle that relaxes the idea of optimality, but doesn't go all the way to "anything goes."
其实并没有放宽,反而直接指向了最优性。只是这不是任务要求的最优性。对吧?
Well, it doesn't relax it. In fact, it points directly toward it. It's just not the optimality that the task demands. Right?
是的,没错。在某些情况下,动物甚至在生态学意义上可能是最优的——在另一个不同的世界里。比如如果你真的在草原上奔跑采集果实,这样做就是正确的。那么在另一个任务、另一个环境中,它就会是最优的。
Yeah. That's right. And in some cases it might be the ecology: the animal is optimal under a different world. Like, if you were actually on the savannah running around and gathering fruit, this would be the right thing to do. So then it would be optimal in a different task, a different environment.
在它的自然环境中也可能不是最优的。那里也可能存在一些错误的假设。因此,它可能在某种虚构的环境中才是最优的。所以在你提到的那些规范性模型中——无论是概率推理还是强化学习——很多都归结于约束条件。
It may also not be optimal in its natural environment; there could be some bad assumptions there too. And so then it would be optimal in some kind of fictional environment. A lot of these things in the normative models, whether you're talking about probabilistic reasoning or reinforcement learning, a lot of them boil down to constraints.
是啊。
Yeah.
不仅仅是关于最优性,就像如果你能做任何事,你总是会有一些限制。那么你是将这些限制纳入原则中,还是将它们视为次要因素?实际上,从数学上看,你可以将它们视为等效的,但我认为在概念上区分它们更有帮助,即这是一个我们面临的限制,我们将在这个限制内工作,然后我们称其余部分为最优。
Not just optimality, as if you could do absolutely anything; you always have some constraints. And then, do you fold the constraints into the principle, or are the constraints some side thing? In fact, mathematically you can write them as equivalent, but I think it helps conceptually to separate them and say, here's a constraint that we have, we're going to work within that constraint, and then we'll call the rest of it optimal.
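The mathematical equivalence mentioned here is the standard Lagrangian one; a sketch (symbols mine):

```latex
\text{Constrained: } \max_{\theta}\; U(\theta) \quad \text{s.t.}\quad C(\theta) \le B
\qquad\Longleftrightarrow\qquad
\text{Folded in: } \max_{\theta}\; U(\theta) - \lambda\, C(\theta)
```

Under suitable convexity and regularity conditions, each budget \(B\) corresponds to a multiplier \(\lambda \ge 0\) (the shadow price of the resource) for which the two problems share solutions, which is why keeping the constraint separate versus folding it into the objective is a conceptual choice rather than a mathematical one.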
是的。好的,我在试着理解,从某种意义上说,神经科学的历史是基于任务的,对吧?很大一部分历史是这样。你设计一个任务,提供很多限制,减少生物体的准备,无论是固定头部,还是你有两个盒子,可以查看下面不同的奖励分布,就像在逆向理性控制中那样。但你说的是,这没问题。
Yeah. So okay, I'm trying to understand. In some sense, the history of neuroscience is task-based, right? A large part of it. You design a task, you provide a lot of constraints, you reduce the preparation for the organism, whether it's head-fixing, or you have two boxes you can look under with different reward distributions, et cetera, like in inverse rational control. But what you're saying is, okay, that's fine.
所以,关于这种基于任务的方法有很多批评,因为它不够生态,这不是生物体被设计来做的事情,比如查看这些盒子或看看下面是否有奖励。但你说的是,这没关系,因为它们仍在优化,但它们是在优化进化上更倾向于做的事情,我们可以推断它们实际上在做什么,这可能在某些意义上不是最优的,或者与我们希望它们做的关系不大,但我们仍然可以研究它。
So, okay, there's a lot of criticism of this task-based approach because, well, it's not ecological; this is not what organisms were designed to do, to look at these boxes or whatever and see if there are rewards under them. But what you're saying is, that's kind of okay, because they're still optimizing; they're just optimizing for a different thing, something they're evolutionarily more prone to do, evolutionarily selected for, and we can infer what they're actually trying to do, which may in some sense be suboptimal, or less related to what we want them to do, but we can still study it.
是的,没错。我的意思是,这种基于任务的方法对神经科学非常关键,因为我们想要控制事物。但这取决于任务的复杂性,这是我们可以调节的一个旋钮。随着时间的推移,我们从最初最简单的任务,甚至没有任务,比如动物可能是无意识的,你只是让眼睛睁开,观察动物在外面时大脑在做什么,它仍然在做事情。开始转向简单的任务,比如你必须选择A或B。随着我们获得更多数据,我们已经能够增加复杂性。
Yeah, that's right. I mean, this whole task-based thing is really critical for neuroscience because we want to control things. But the complexity of the task is a knob that we can turn. We've changed over time from the very simplest tasks in the beginning, even without tasks: the animal is maybe unconscious, and you just have its eyes open and you're looking at what the brain does, and it still does stuff. Then we started to move toward simple tasks, like you have to choose A or B. And as we've gained more data, we've been able to dial up the complexity.
现在,我认为我们还没有达到可以在机器学习风格中进行基准测试任务的地步。比如你可能有一个机器人在四处走动,装载洗碗机或在藤蔓上摇摆。我不知道现在是否有机器人在藤蔓上摇摆,但
Now, we're not yet at the point, I think, where we can do benchmark tasks in the machine learning style, where you might have a robot that's going around loading the dishwasher or swinging from vines. I don't know if there are any robots that swing from vines yet, but...
可能有。但那会很酷。我想
Probably. But that'd be cool. I want to
看看。是啊,不久的将来。你知道,肯定会在崎岖地形上跑来跑去,对吧?或许还会尝试实现某些特定目标。
...see it. Yeah. Someday soon. You know, certainly running around on rough terrain. Right? And maybe trying to acquire certain goals.
或者以一个真正的自然案例来说,我们会观察一只动物在其自然环境中,比如爬树、与同物种的其他动物进行社交互动、觅食、交配、逃避捕食者、玩耍等。所有这些自然行为都是我们目前尚未收集足够数据来作为研究课题的领域。因此我认为我们始终在寻找这种中间层次的任务——它既要足够复杂以揭示大脑计算中有用且有趣的结构,又要足够简单以便我们能够真正描述其特征。我们一直在朝这个方向调整,我个人倾向于比多数人更偏向复杂化一些,这就需要更复杂的分析系统或框架来解读数据。但最终目标还是要真正迈向自然主义。
Or, for a real natural case, we would have an animal that would just be in its natural environment: climbing trees, interacting socially with other animals of the same species, finding food, mating, running away from predators, playing. All of those natural behaviors are things we don't have enough data for yet to make them the task of interest. So I think we're always looking for this kind of intermediate level of task, which is complex enough to reveal useful and interesting structure about brain computations, but simple enough that we can actually characterize it. So we've been shifting this way, and I like to push a little more in the complex direction than most, and then you need a more complex analysis system or framework to interpret that data. But the goal is eventually to move really toward naturalism.
这是个有趣的矛盾。如何平衡这一点?我的一些合作者正在尝试收集海量数据,比如Andreas Tolias正在构建的Enigma项目,他的目标是在自由活动的猴子身上收集大量数据,让它们真正完成各种复杂互动。但当你拥有海量数据时,就能构建像我们看到的大型语言模型那样的巨型模型,只不过数据源不同。
So it's an interesting tension. How do you navigate that? Some of my collaborators are really trying to collect massive data. Andreas Tolias, for example, is building this Enigma project where he's collecting massive data in freely moving monkeys, that's the goal, having them really do all these kinds of complicated interactions. And if you have massive data, then you can build some massive models, like we've seen with large language models, but now from other sources.
有时人们称它们为基础模型或前沿模型。这些大型模型提供的是描述性模型,它不会告诉你应该怎么做。
Sometimes they call them foundation models or frontier models. And with those big models, now you have a description. It's a descriptive model; it doesn't say what you should do.
没错。
Right.
它不说明实现过程,只是呈现现象。之后你可以尝试进一步分析,我们也用这类模型做过很多尝试,某种程度上这是对数据的重构。你将海量数据集压缩成描述性模型,但它依然是数据,只是以更复杂且可能更易解析的方式重新格式化。
It doesn't say how it's done. It just says, this is what happens. And then you can try to analyze that further, and we've played a lot of games with those kinds of models as well. It's kind of a reformulation of the data, in a sense. You have this massive dataset that you compress into a descriptive model, but it's still the data. It's just reformatted in a much more sophisticated and potentially interrogatable way.
但你们是否对压缩成描述性模型的优化目标做了某些假设?
But are you making assumptions about what is being optimized when you compress it into the descriptive model?
通常不会。从某种意义上说存在隐性假设,但相比我们刚才讨论的规范性模型的假设要弱得多。嗯。这些都是更数据驱动的模型,它们只是描述输入输出关系的巨型神经网络。
Usually, no. I mean, implicitly in some sense, but those are weak assumptions compared to the ones we were just talking about with normative models. Mhmm. These are much more data-driven models. They're just big neural networks that describe input-output relationships.
然后你可以希望它们内部、在底层机制中,以某种方式反映出相关且有趣的潜在变量。但这可能不会发生,因为存在各种计算同一事物的等效方法。因此,我认为这是一个非常重要但未被充分认识的要素:我们并非试图寻找唯一的机械性解释,而是寻求一类具有某些共同特性的等效解释,进而理解这些特性如何与动物的行为及其感官输入在计算层面相关联。当我们在大型神经网络模型中这样做时,你可能会说,嘿,那个模型与大脑毫无关系。
And then you can hope that inside, under the hood, they somehow reflect latent variables that are relevant and interesting. But they might not, because there are all sorts of equivalent ways of computing the same thing. And so this is a really important, and I think underappreciated, element of what we have to do: in my view, we're not trying to come up with the one mechanistic explanation of something. We're trying to find a class of equivalent explanations that have some shared properties, and then understand what those properties are and how they relate computationally to the behavior of the animal and its sensory inputs. So when we do this in a big neural network model, you could say, hey, that model doesn't have anything to do with the brain.
这个模型中的神经元与大脑中的神经元并不相同。
The neurons in this model are not the same as the neurons in the brain.
而且
And
然而。确实如此。然而,我们仍能在其中发现一些共享的结构。你知道,这正是挑战所在。
...and yet. Yet. Exactly. And yet we can still find some shared structure there. And you know, that's the challenge.
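One concrete way "equivalent ways of computing the same thing" shows up: rotate a network's hidden units and undo the rotation at the readout, and you get a mechanistically different model, with different "neurons," but exactly the same input-output behavior. A toy sketch (all values and names are my own illustration):

```python
import math

def net(W1, W2, x):
    """Two-layer linear net: y = W2 . (W1 x)."""
    h = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    return sum(w2 * hi for w2, hi in zip(W2, h))

W1 = [[1.0, 2.0], [0.5, -1.0]]
W2 = [0.3, -0.7]

# Rotate the 2-D hidden space by R; compensate at the readout with
# W2' = W2 R^T (R is orthogonal, so R^{-1} = R^T).
t = 0.7
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
W1_rot = [[sum(R[i][k] * W1[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
W2_rot = [sum(W2[k] * R[i][k] for k in range(2)) for i in range(2)]

x = [0.8, -0.3]
diff = abs(net(W1, W2, x) - net(W1_rot, W2_rot, x))
print(diff < 1e-12)  # identical input-output map, different "neurons"
```

Any invertible transformation of the hidden layer, compensated at the readout, generates such an equivalence class, which is why matching individual units between model and brain is weaker evidence than matching the shared structure the whole class preserves.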
这也是乐趣所在。我们如何通过这些新技术进行识别和发现?这某种程度上触及了我称之为‘神经AI’的领域——一种大脑与机器的融合,既运用现代AI工具理解大脑(正如我们如今用AI工具理解万物),又因AI源于神经科学而赋予其独特视角。
That's the joy. How do we identify things? How do we discover things with these new techniques? This is getting at what I would classify as the field of NeuroAI, which is a synthesis of brains and machines, where you're trying to use modern AI tools to understand the brain, as we're trying to use AI tools to understand everything these days, because AI is so powerful. But NeuroAI has a particularly interesting spin on this, because AI came originally from neuro.
对吧?因此我们持续尝试向AI反馈新想法:不仅是卷积网络启发了AI,这里还有其他可供使用的精细结构。有时你能通过这种方式发现有趣的平行现象。正如费曼那句广为人知的名言:‘凡我不能创造的,我就不能理解。’
Right? And so we're continually trying to give back new ideas to AI, to say, hey, it wasn't just convolutional networks that neuroscience inspired; here are some other detailed structures that we could use. And sometimes you can find interesting parallels that way. There's the famous quote by Feynman that a lot of people know: "What I cannot create, I do not understand." Yeah.
所以这是对我们大脑理解的试验场。如果你真认为自己理解了大脑的某些机制——是吗?那就创造出具有智能的事物来证明。
And so this is a test bed for our understanding of brains. If you really think you understand something about the brain: oh yeah? Make something intelligent.
我刚想到,既然你提到AI起源于非常基础的神经科学,对吧?我们神经科学家一直在敲门呼吁:'听听我们的建议,你们需要这样做',但他们却自顾自地成功了。然后我们又用他们的模型对吧?但这让我想到了一条不同的路径。
I just had a thought. You just mentioned that AI came from very rudimentary neuro, right? And so we neuroscientists are constantly banging on the door: hey, listen to us, you need to do this, right? And they just forge ahead successfully. Then we use their models, right? But it struck me that there's a different way to go.
所以你的意思是,观察这些模型的内在机制,可能会发现某些潜在变量,然后或许能将那些潜在状态与生物体实现认知的方式联系起来。但你也提到,解决同一个问题本质上存在无限种方法。我在想,是否有用的做法是用与神经网络截然不同的方式解决问题。虽然,这某种程度上也是神经科学的历史——我不想说失败,但在理解大脑方面确实未能取得重大突破对吧?比如那些相对简单的心理数学模型、累加器模型、漂移扩散模型之类的,可能正是我刚才假设的那种情况。
So you're saying that you look at the models, you look at the innards, the inner workings, and you might find some latents, and then you can possibly relate those latent states to the way organisms are enacting their cognition. But you also mentioned that there are essentially an infinite number of ways to solve a single problem. I wonder if a useful exercise is to solve the problems in ways radically different from how neural networks do. Although maybe that is kind of the history of neuroscience, which has, I don't want to say failed, but failed to solve the brain, right? You know, maybe these mathematical psychological models that are fairly simple, accumulator models, drift-diffusion models, things like that, are sort of what I just posited.
我在思考的是,要解决你提到的那类优化问题,其解决方案可以偏离大脑活动模式多远?这听起来...我不知道该怎么表述,也不清楚该如何推进这个想法。
You know, I'm just wondering how far away from brain-like activity you can go and still be within that class you mentioned, that class of solutions for a given optimization problem. Does that sound like... I don't know. I don't know how one would move forward with that.
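The accumulator models mentioned above are easy to state in code. Here is a minimal drift-diffusion sketch (parameter values are arbitrary illustrations, not fitted to anything): evidence random-walks with a drift toward the correct bound, and choice plus reaction time pop out when a bound is crossed.

```python
import random

random.seed(1)

def ddm_trial(drift=0.2, noise=1.0, bound=5.0, dt=0.01):
    """One drift-diffusion trial: accumulate noisy evidence to a bound.
    Returns (chose_correct, reaction_time). Units are arbitrary."""
    x, t = 0.0, 0.0
    while abs(x) < bound:
        # Euler step of dx = drift*dt + noise*dW
        x += drift * dt + noise * random.gauss(0.0, dt ** 0.5)
        t += dt
    return x > 0, t

trials = [ddm_trial() for _ in range(500)]
accuracy = sum(c for c, _ in trials) / len(trials)
mean_rt = sum(t for _, t in trials) / len(trials)
print(round(accuracy, 2), round(mean_rt, 1))
```

With these settings, the analytic DDM accuracy is 1/(1 + e^(-2·drift·bound/noise²)) ≈ 0.88, and the simulation should land near that; raising the bound trades speed for accuracy, which is the model's whole point.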
是的,这是个好问题,更是一系列问题。探索这类'等效输入输出关系'的家族很有意思。我认为关键要区分两种等效性:
Yeah. So it's a good question. It's a whole family of questions. Exploring the family of, let's say, equivalent input-output relationships is an interesting one. And there are two ways that something can be equivalent that I think are critical to differentiate.
一种是迄今为止所有测试中都表现等效,另一种则是在所有方面——包括我们尚未测试的领域——都完全等效。
One is that they are equivalent over everything that we've tested so far. Right. And the other is equivalent in all ways, even ways that we haven't yet tested.
其实还有第三类,就是超级狭隘的等效——仅针对我们正在测试的这个特定基准有效。
Well, there's a third category, which is the super narrow equivalence that holds only for the very particular benchmark that we're testing.
对吧?对,没错。好的。
Right? Right. Yeah. Okay. Good.
现在我们有一个谱系。这个谱系反映的是,我们让系统偏离训练机制的程度有多广。如果我们仅在这个单一任务的狭窄范围内等效,那模型可能很脆弱。对吧?我们会得到一个脆弱的模型,然后问:'我们在这个任务上表现如何?'
So now we have a spectrum: how widely are we pushing this system out of the training regime? If we're equivalent only in this narrow one-task case, then we might be brittle. Right? We'd have a brittle model, and we'd say, oh, how well did we do with this?
就像捕捉某些特征。然后有人过来说:'哦,你用的是光栅刺激。换成随机噪声或自然图像测试,模型就崩溃了。' 这时你会说:'确实,这绝对是错误的模型。'
Like capturing things. And then somebody comes along and says, oh, well, you know, you used gratings. Test it with, I don't know, random noise or natural images, and it breaks. And then you say, well, yeah, you definitely have the wrong model.
对什么来说是错误的模型?
For what? The wrong model for what?
对于大脑而言。对吧?你是想拟合大脑,或者解决——也可能是针对机器学习。你的系统在这个奇怪任务上表现尚可,但整体表现不佳,在真实部署环境中也表现糟糕。
For the brain. Right? You're trying to fit the brain, or you're trying to solve... I mean, it could also be for machine learning. Right? Your system does fine at this one weird task, but it doesn't do fine in general, and it doesn't do fine when you deploy it in realistic conditions.
所以我们总是在寻找那些真正重要的测试条件。要知道,这是个移动靶。最初人们试图分类二进制数字,对吧?现在连线性模型都能以约88%的准确率在MNIST上分类数字了。
So we're always looking for those test conditions that you really care about. And you know, that's a moving target. In fact, in the beginning people were trying to classify binary digits. Right? Now, I mean, even linear models classify digits with, I think it's 88% accuracy or something, on MNIST?
但当我们把性能推向更高水平时,那个基准就不再是合适的测试了——既然我们能做得相当好,就该尝试更有挑战性的测试,然后继续推进——
But we keep pushing toward higher and higher performance until that benchmark no longer seems like the right test, because, okay, we can do that reasonably well; let's try something that's a better test, and then we move toward...
这就是古德哈特定律?对吗?就像
This is Goodhart's law? Is that right? Like
这个我不太了解。
I don't know this one.
古德哈特定律指出,一旦某个指标成为目标,它就不再是一个好指标。
Goodhart's law states that once a once a metric becomes a target, it ceases to be a good metric.
啊,是的。虽然这不是我要说的,但这个观点很好,非常重要。而且,鲁斯·萨拉赫丁诺夫(Russ Salakhutdinov)可能还提出了一个推论,他说过,制定基准的人就是赢家。
Ah, yes. That's not what I'm talking about, but that's a good one. A really important one. And Russ Salakhutdinov actually had maybe a corollary of that, where he said: whoever makes the benchmark wins.
对。没错。好吧。这个观点也很棒。所以人工智能赢了。
Right. Yeah. Okay. That's a good one too. So AI wins.
人工智能赢了。但我们正努力朝着更好的基准迈进,对吧?而且这些基准也在不断进化。
AI wins. But we're trying to move toward better benchmarks. Right? Right. And the benchmarks are constantly evolving.
事实上,你知道,进步往往是通过建立基准实现的。比如当李飞飞开发出ImageNet时,对整个领域产生了巨大推动。我认为许多像我们大实验室这样的大规模数据收集工作,将会推动领域发展,推动神经科学进步,因为这样人们就有了可以真正测试的目标。这长期以来并非神经科学的传统,但现在正逐渐成为现实。所以我认为这是机器学习与神经科学之间一个重要的社会学差异,当你将机器学习的基准测试模式引入神经科学等其他领域时,会带来非常丰硕的成果。
In fact, you know, advances were made by developing benchmarks. When Fei-Fei Li developed ImageNet, that was a huge spur for the field. So I think a lot of these large-scale data collection efforts, like we have at some big labs, are going to push the field forward, push neuroscience forward, because then you have targets that people can really test things on. That has not been the tradition in neuroscience for a long time, and now it's becoming so. So I think this is a major sociological distinction between machine learning and neuroscience, and it's really very fruitful when you import the machine-learning benchmarking style into other fields like neuroscience.
不知道现在是否适合转换话题,请教你工作中另一个常见主题——如果我说错了请纠正——生物体实际上花费更多能量或精力在元认知事务上,比如探索当前任务的同时,还需要弄清约束条件、这些约束的概率分布以及需要投入多少能量。所有这些因素都影响着问题的解决。你是否认为生物体在规划和理解这些限制条件上花费的认知努力,比实际解决问题的算法还要多?
I don't know if this is a good time to pivot and ask you about another common theme in your work, which is, and again correct me if I'm wrong, that organisms spend more energy or effort on metacognitive things. So there's a task at hand, but now they have to figure out the constraints, what the probabilities of those constraints are, and how much energy they have to spend. All these factors go into solving a given problem. Would you say that organisms have to spend more cognitive effort on arranging and figuring out those limitations and constraints than on the actual algorithm to solve the problem?
这是个好问题。我不知道答案。就纯粹的脑力而言,我怀疑我们至少将大部分能量用于初级感官处理。但认知层面的东西,我的意思是它的带宽要低得多。我们进行推理的结构要慢得多,维度也低得多。
That's a good question. I don't know the answer. In terms of sheer brainpower, my suspicion is that we spend a huge amount of our energy, at least, doing primary sensory processing. But the cognitive stuff is much lower bandwidth. The structures that we're reasoning about are much slower, much lower dimensional.
没错。你看一张图片,每只眼睛基本上能接收大约1亿像素。而我们拥有的认知变量,或者说输出,肯定不到一千个维度。基本上就是我们拥有的肌肉数量那么多。
Okay. But they are... Right? So you look at an image, you get, I don't know, a hundred million pixels per eye, basically. Whereas the kind of cognitive variables that we have, certainly the output that we have, is less than a thousand dimensions. Basically just the muscles that we have.
其实你只是设定了一个静态画面,因为现实中每次眼球转动我们都在看静态画面。当我们在世界中移动时,每隔几毫秒就会接收到那1亿像素。
Well, you just gave us a static picture, but in reality we're looking at static pictures every time we move our eyes. Right. As we move through the world, it's that hundred million every few milliseconds.
是的。虽然那1亿像素中有很多是相同的,从一刻到下一刻。所以你必须小心,必须谨慎评估这些东西的信息含量。
Yes. Although a lot of those hundred million pixels are the same... That's right. ...from moment to moment. So you have to be careful about how you characterize the information content of these things.
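The point about frame-to-frame redundancy can be demonstrated with ordinary compression. In this toy sketch (the frame size and 1% change rate are made-up numbers), delta-coding the second "frame" against the first exposes the repetition to `zlib`:

```python
import random
import zlib

random.seed(0)

# Two "frames" of 10,000 random byte-valued pixels; the second differs
# from the first in only 1% of positions.
frame1 = bytes(random.randrange(256) for _ in range(10_000))
frame2 = bytearray(frame1)
for i in random.sample(range(len(frame1)), 100):
    frame2[i] = random.randrange(256)

raw = frame1 + bytes(frame2)                 # send both frames verbatim
delta = frame1 + bytes((b - a) % 256         # send frame1 + per-pixel diffs
                       for a, b in zip(frame1, frame2))
raw_size = len(zlib.compress(raw))
delta_size = len(zlib.compress(delta))
print(raw_size, delta_size)  # delta coding exposes the redundancy
```

This is why a raw pixels-per-second count overstates the information rate: the entropy of natural input is far below its nominal bandwidth, which is exactly the care in characterization being asked for here.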
而且我们有中央凹和周边区域——中央凹是我们真正集中注意力或获取最高感官保真度的区域,至少在视觉上是对图像中非常小的区域。当然我们讨论的是视觉,因为这就是AI和神经科学的全部历史。几乎都是关于视觉的研究。
And we have a fovea, where we're actually paying most attention, or getting the highest fidelity of sensory input, at least in vision, for only a very small area of an image. Of course we're talking about vision, because that's the history of AI and neuroscience; it's almost all vision. Yeah.
是啊。视觉是个大课题。这些年能研究几个不同系统其实很有趣。视觉、一些听觉、一些本体感觉。
Yeah. It's a big one. It's actually been fun for me that I've gotten to work on a few different systems over the years. Yeah. You know, vision, some audition, some proprioception.
作为理论学家的乐趣之一就是:实验者必须投入大量资源在特定系统上,包括设备等等,但这只是数学问题。我的意思是,你需要同样的计算能力。数学确实如此。
That's one of the joys of being a theorist: experimentalists have to invest a huge amount in a particular system, with the equipment and everything, but for us it's just math. I mean, you need the same compute. Math, yeah.
是啊。
Yeah.
只要给我一张纸和一支笔,我们就能立马开始。
Just give me paper and a pencil and we're off to the races.
我的职业生涯有过转变。我曾是实验神经生理学家,一直对理论研究者羡慕不已。现在我们要稍微岔开话题聊聊理论。如今我主要从事计算分析工作,实验室也正准备用小鼠开展更多实验,但我有些犹豫,心想不如继续做计算研究。我觉得当初羡慕理论研究者是对的。
I've switched in my career. So I used to be an experimentalist, a neurophysiologist, and I was always super jealous of theorists. We're gonna take a little theory sidetrack here. These days it's all computational analysis that I do, and we're gearing up to do some more experiments in the lab with mice, and I'm sort of hesitant because I'm like, just do the computational stuff. I was right to be jealous of theorists, I think.
我问过理论研究者——现在也想问问你——他们说虽然也需要时间,但不会遇到实验派那些问题。你们遇到的是计算问题,而不是硬件故障或生物体问题这类烂摊子。所以你选择理论路线很明智。某种程度上,你是否同意你们确实更轻松?虽然需要同等甚至更深的思考,但在产出效率方面...
I mean, if I ask theorists, and I'll ask you this too, you know, they say, well, it takes us a while too, but you don't run into the same kinds of problems. You run into computational problems. You don't run into hardware problems or organism problems, where it's a huge mess. So good for you that you went the theory route. Do you agree with me that in some sense you have it easier, even though you have to think just as hard or maybe harder?
你可以和任何愿意合作的实验者搭档,或许需要说服他们分享数据(现在这比过去容易多了)。作为了解实验科学家的理论研究者,你会如何描述自己的工作状态?是否曾在办公室里窃喜'我只需对着电脑,不用面对那些麻烦事'?
But in terms of productivity, you can partner with any experimentalist that's willing to partner with you, and maybe you have to convince them to send you their data. That used to be a bigger deal than it is now. But how would you characterize being a theorist, knowing the experimentalists that you know? Are you over there kind of laughing in your office, like, I can just sit at my computer and I don't face the same challenges?
我确实很享受现在的工作,这份职业充满乐趣。但读研时我也经历过和你类似的挣扎。我最初尝试过各种方向——做过神经生理实验,搞过心理物理学,但最终厌倦了设备故障:可能是线路松动,或是溶液过期,这些不可控因素要求事无巨细的维护,实在不适合我的性格。
I'm loving life over here. It's definitely a lot of fun to do this job. But I grappled with the same thing that you were grappling with as a graduate student. I started off doing some of everything: I did some experiments, neurophysiology experiments, I did some psychophysics. And I just got tired of when things broke, you know, it being something weird, like a wire was loose or the solution was old, things that seemed really out of my control, and having to be so meticulous that everything was pristine. It just didn't suit me.
后来我反思时间分配问题时发现...
And I found myself like if I looked back and say, where am I spending my time?
嗯。
Mhmm.
我当时只是把时间花在电脑前或做些分析工作,而不是在暗房里做视觉实验——那些实验其实很有趣,我很高兴自己做过。我想很多与我共事的实验人员也为此感到高兴,因为这让你真正理解数据的复杂与混乱。要知道,有些理论研究者来自更纯粹的学科领域,他们可能认为物理就像某种黑客行为,可以随意假设。
I was just spending my time in front of the computer or doing some analysis, rather than being in the dark room doing vision experiments, which were interesting, and I'm really glad that I did them. And I think a lot of the experimentalists that I work with are also glad that I did, because it gives you an appreciation, right, for the difficulty and the messiness of data. You know, theorists who come from, let's say, disciplines where things are more pure might assume that you can just posit whatever you need. Physics is a little bit of hacking. It's like, okay:
物理与数学的关系,就像黑客与计算机的关系。这是种复杂的互动,因为这种交流通常有两种模式:一种是实验人员带着数据来找你问'这代表什么?'——这种情况其实很少见对吧?
Physics is to math as hackers are to computers. It's a complicated dance, because there are two ways that interaction can go. One is an experimentalist comes to you with some data and says, hey, what does this mean? That's rare, right?
这种情况不是很罕见吗?
Isn't that the rare case?
不,其实更常见。
No. That's more common.
在当下?
These days?
是的。我的意思是,假设他们主动来找你交流,因为他们手头有正在研究的课题。另一种模式是你有个理论需要验证。
Yeah. I think that, I mean, assuming that they're coming to you, that you're in a conversation, because they have the things that they've been working on and thinking about. The other way is that you have a theory you want to test. Right.
然后你得说服实验人员,能否请你投入半年或一年时间,来验证我这个有点疯狂的想法?当你真正与愿意合作的实验人员共事时,可以组建团队。这时你们可以说,让我们共同设计这个实验。我们有这些构想,也理解你们存在实验条件限制。
And then you have to convince an experimentalist: here, would you please dedicate six months or a year of your life to testing this particular wacky idea that I had? But when you actually work with an experimentalist who wants to collaborate, you can make it a team. And then you can say, let's co-design this experiment. We have these ideas. I know you have constraints, experimental constraints.
这些部分容易实现,那些部分比较困难。我们要找到最佳的组合方案。这个过程既充满乐趣又富有成效。有趣的是,许多实验方向的首席研究员其实更像理论家,因为他们并不亲自操作实验,而是负责实验设计。
These things are easy, these things are hard. Let's figure out if we can find the right combination of things. And that's really fun, and that's really fruitful. It's also a little bit amusing to me that a lot of PIs who are experimentalists are really operating very much as theorists, in a sense, because they're not doing the experiments. They're designing the experiments.
没错。从这个角度看,当我和实验人员合作时,我的工作方式其实类似于实验项目的首席研究员。
Right. So in that sense, when I'm collaborating with experimentalists, I'm kind of working like an experimental PI.
等等,为什么觉得有趣?神经科学的发展史不就是实验人员带着待验证的理论假设设计实验的过程吗?这些假设本质上就带有理论性质,对吧?
Wait, wait. Why is this amusing? This is kind of the way the history of neuroscience has gone: experimentalists had ideas that they wanted to test. So you set up an experiment, and that idea is somewhat theoretical, right?
嗯,并不是单纯抱着'看看会发生什么'的心态。
Mhmm. It's not just like, hey, let's see what happens.
就像大老板坐在办公室里,所有研究生都是执行者。
So I mean, the big boss is sitting in the office and all the graduate students are the hands.
明白了,好的。
Right. Okay.
这就是我的意思。但确实有些项目负责人(PIs),你知道,其中有几位仍然热衷于亲自去做实验。
That's what I mean. But some PIs, actually, you know, there are a few of them who really still like to go and do the experiments.
确实如此。
That's true.
他们在现场能获得极大的满足感。当然,你也能从中收获良多,可以亲眼见证系统如何演变。但你也明白,每个人都有时间限制,而这些限制往往相当严格。
They get a lot of satisfaction out of being there. And of course you get a lot out of it; you can see how the systems evolve. But you know, everybody has time constraints, and those time constraints are often pretty strict.
好的。回到正题,你说在与实验型实验室合作时,通常项目负责人可能不亲自做实验,而是由实验室人员操作,而你最终感觉自己更像实验人员,你是这个意思吗?
Okay, so getting back: you said when you're collaborating with experimentalist labs, the PI maybe is not doing the experiments, the lab personnel are doing the experiments, and then you end up feeling more like an experimentalist. Is that what you were saying?
运作方式类似于实验组的项目负责人。我的意思是,这变成了一种共同设计实验的合作。通常对实验负责人而言,需要了解所有设备情况——确保物品归位、资源到位。但我不必操心这些,因为我不负责搭建实验室或设备,不过我喜欢了解这些。首先,当今的技术实在太酷了。
Operating like the PI in an experimental group. I mean, it becomes, you know, we're co-designing the experiment, and usually that means the experimentalist PI needs to know what all the equipment is. They're making sure that things are in the right place, that the resources are available. I don't need to, because I'm not building that lab, I'm not building that equipment, but I like to know it. I mean, first of all, the technology that we have these days is so cool.
没错。其次,这能帮助我更透彻地理解实验的约束条件。
Yeah. And second, it just helps me understand the constraints of the experiment that much better.
好的。有道理。那么我们刚才关于理论派与实验派的讨论有点跑题了。
Okay. Yeah. Fair. Alright. So we went on this big tangent about theory versus experimentalists.
但在我们再次转向之前还有一件事。也许是我还抱着这个旧观念没放下,对吧?通常一个首席研究员有自己的实验室、自己的经费和自己的项目。他们想做的事情往往已经多得做不过来了,这时突然冒出个理论学家说:嘿,不如这样,这是我的想法。
But one more thing before we pivot again. Maybe I just have this old conception that I haven't let go of, right? So you have a PI, they have their own lab, their own grants, their own projects. Usually the things that they want to do already overflow what they can actually do, and along comes some theorist who says, hey, why don't you, here's my idea.
又来一个。
Here's another one.
又来一个。这已经超出你们目前的能力范围了,但我需要你订购这台新显微镜,还需要你配备一个新的暗房,诸如此类的要求。
Here's another one. This is, you know, beyond what you can do right now, but I need you to order this new microscope, and I need you to outfit a new dark room, things like that.
现实不是这样运作的。
That doesn't work like that.
不,我知道,但问题就在于——因为这里可能存在某种张力,对吧?
No, I know, but how does it work then? Because there could be a certain kind of tension there, right?
是啊。我是说,真正富有成效的合作应该建立在了解对方的兴趣点和技术能力的基础上。偶尔你会说:哦,要是我们能做这个就太棒了。而实验方可能回应:对啊,确实很棒,咱们干吧。
Yeah. I mean, a fruitful collaboration like that happens where you know what the person is interested in and you know what their technical capabilities are. Every once in a while you say, oh, you know, it'd be cool if we could do this. And maybe the experimentalist is like, oh yeah, that would be cool. Let's do it.
但大多数时候情况是:嗯,想法很棒,可惜我做不了。所以只能在现有条件内尽力而为,通常已经是非常强大的技术了。真的,和我合作过的那些人在测量技术方面都有着惊人的专长。
And otherwise, you know, most of the time it's like, yeah, that would be cool, I wish I could do it. And so then you just work within the opportunities that you have, and you try to make the most of what is usually really powerful technology. I mean, yeah, the people I've gotten to work with have incredible skills in measuring stuff.
他们也会带来,我不想低估他们已有的想法和解析能力。所以合作时通常存在互补性。比如他们知道许多我不了解的事物,可以在这方面指导我,反之亦然。我掌握大量数学技能,熟悉各种模型和理论体系,能帮助整合观点;而他们可能对某些特定文献的理解远胜于我,清楚大脑某区域可能出现的信号特征,了解这种动物的能力边界——所有这些对建立优质合作都至关重要。
They also come with, I don't want to understate, the ideas and interpretive skill that they already have. So there's usually complementarity: they'll know a lot of things that I don't know, and they can teach me about those, and vice versa. I have a lot of math skills, and I know sets of models and theories that we could bring together, which can help synthesize some ideas. And they probably know some particular literature way better than I do, and they know what kind of signals you might find in what part of the brain, and what this animal can and can't do, and all of that is critical to making a good collaboration.
没错。很多期节目之前,我记得是和Nathaniel Daw聊过——他也属于理论派——他提到和同事正在纠结是否要自建湿实验室,以便验证他们的理论。我当时就说:别别别,千万别这么做。
Yeah. Many, many episodes ago, I was talking with, I think it was Nathaniel Daw. So he's on the theorist side as well, and he related to me that he and his colleagues were sort of debating whether they should start a wet lab, an experimentalist lab, so that they could test their own theories. I was like, no, no, no. You don't want to do that, you know.
我记得Vijay Balasubramanian也尝试过。他确实建了实验室,但找人验证他的理论非常困难。不过有时确实有些简单实验值得尝试,会很有趣。
I think Vijay Balasubramanian tried to do that too. I mean, he did it. It was just tough to get people to test his theories. Right. I mean, sometimes there are some easy experiments that we could run that would be fun to do.
最常见的就是人类心理物理学实验。比如设计个游戏,收集人类玩游戏的数据——相对而言这很容易实现。如果有人愿意做这类实验就太好了。
The most common of these is just human psychophysics. Here's a game; let's just collect some data of a human playing the game. That's easy, in relative terms. And so yeah, if people are inclined to do that, that would be great.
但你是否认为随着神经科学逐渐成熟,它正变得更像物理学?物理学历史上就明确区分理论家和实验家,他们可以协作但保持独立。而神经科学过去通常是实验者即理论者,两者合一。大约七百年前我读研时(笑),理论实验室和实验实验室之间还存在紧张关系——这种对立现在是否消退了?人们更适应分工了吗?还是没有?
But would you say that as neuroscience matures a little bit, it's becoming more like physics? Physics historically has been happy with there being theorists and experimentalists who can collaborate but are kind of separate, whereas in neuroscience the past has been that the experimentalist is the theorist; it's one person. And there was this tension, when I was a graduate student about seven hundred years ago, between theoretical labs and experimental labs. Is that dissipating, where people are more comfortable? No, it's not?
我认为这种分野确实在加深。这与专业化趋势有关——不仅是职业细分,理论家内部也出现分化:有人专攻深度学习,有人研究动力学模型,还有人从事统计物理类模型。当擅长线性模型数学思维的研究者需要处理非线性模型时,要么自己拓展能力,要么与精通复杂训练模型(比如AI风格)的专家合作——这种数学派与AI派的结合正在结出硕果。
I think the division is indeed becoming stronger. I think it has to do with specialization; every one of these occupations is subdividing. So even within theorists, you have people who specialize in, let's say, deep learning stuff, and others who do dynamical models, and others who do, I don't know, statistical physics kinds of models. People pick their specialties, and sometimes they work together, and that can be really cool: you have somebody who has a math way of thinking about things, but it only works in linear models.
同样地,实验团队也在专业化:分子生物学专家与神经生理学专家协作,既能从分子层面操控神经元,又能进行介观尺度的神经电活动记录。
Then if you want to go to a nonlinear model, you need to either adopt that capability yourself or work with somebody else who deals with more complicated, let's say trained, models. Right? So the AI style and the math style are working together now in fruitful ways. And likewise, the experimentalists are coming up with teams where this person is an expert in molecular biology and this person is an expert in neurophysiology, so you can both manipulate the neurons at a molecular level and record from them at a mesoscale level.
或许你还会遇到一位认知科学家,他进行大量认知行为实验。于是他们所有的专业知识汇聚在一起,你就能获得一组更为丰富且相互关联的测量数据,以及一个可以通过多种方式连接并从中获取洞见的更丰富数据集。所以我认为这实际上是在持续细分领域。
And then maybe you also have somebody who is a cognitive scientist who does a lot of cognitive behavioral experiments. And so all their expertise comes together and you can have a much richer set of measurements that are related to each other and a much richer data set that you can connect in different ways and draw insight from. So I think it's actually continuing to subdivide.
没错。换句话说,为了变得更健康——如果我们把物理学史视为健康标准的话。对,确实如此。
Yeah. So in other words, to get healthier, if we consider the history of physics healthy. Yeah. Yeah.
我也这么认为。
I think so.
你选定专业方向了吗?你说过每个人都会选择某个领域...但你的兴趣似乎相当广泛。
Did you pick a specialty? You said everyone kind of, right, chooses one. But you're kind of all over the place.
确实涉猎广泛。我一直是个杂家,喜欢尝试各种不同事物——摆弄乐器,涉足不同科学领域。
I'm all over the place. Yeah. I've always been a dabbler. Like, I dabble at a lot of different things. I dabble at musical instruments, I dabble in different scientific fields.
不过你知道,总会浮现出某些主题,我们之前也提到过其中一些。我有一套自己开发的研究工具,还在不断获取新的,因为以问题为导向很重要——比如'如何解答这个问题',而不是'拿着锤子找钉子'。
But you know, there are definitely themes that emerge, and we touched on some of those themes earlier on. There's a set of tools that I have developed, and I keep acquiring new ones, because it's good when things can be question-driven, like: how do you answer this question? As opposed to: here's my hammer, where are the nails?
嗯,你本身也热爱学习。
Well, you love to learn also.
我热爱学习。真的。这简直是我最喜欢这份工作的地方——我能持续不断地学习。
I love to learn. Yeah, I really do. It's my favorite thing about this job, that I'm constantly learning.
太棒了。
It's amazing.
是啊。而且你能从中学到东西的那些人也非常了不起,对吧?这个领域里有太多深奥的知识,还有机会与那些拥有独特想法和创造力的人交流。这太美妙了。
Yeah. And the people that you get to learn from are amazing too. Right? There's so much deep knowledge in the field, and the chance to interact with all these other people who have their own ideas and creativity. That's amazing.
能拥有这样的经历真是太好了。
That's a wonderful experience to have.
好的。那么,我们转向概率和大脑的话题如何?这是个——
Okay. So, alright. Shall we pivot to probabilities and brains? It's a
急转弯啊。
hard pivot.
我们刚才还在讨论实验者与理论家的合作。嗯。你参与过这种生成对抗性合作,我几年前和拉尔夫·赫夫纳聊过这个,他是想弄明白大脑如何用概率进行计算。
We were just talking about experimentalist and theorist collaborations. Mhmm. And you've been part of this generative adversarial collaboration. I talked to Ralph Hefner about this a couple of years ago, and this is to figure out how the brain computes with probabilities.
是的,没错。
Yeah, that's right.
因此,关于概率如何在大脑神经元网络中表征和运用存在多种理论。每种理论都有支持证据,也有反对证据,这取决于你如何看待数据等等。于是就有了这种生成对抗性合作——一群对概率在大脑中的表征方式持有不同(或许用'对立'这个词太强烈)或替代观点的人聚集在一起。大约两年前我和拉尔夫聊到这个时,他对此非常赞赏。他惊讶于合作进展之顺利、大家相处之融洽、成果之丰硕,以及他从中学到了多少东西。
And so there are different theories about how probabilities are represented and used in networks of neurons in the brain. And they all have some evidence for them and some evidence against them, and it depends on how you look at the data, etc., etc. And so this generative adversarial collaboration is a bunch of people with, maybe opposing is too strong a word, with different, alternative views on how probabilities might be represented in the brain, who came together. And when I was talking to Ralph about this maybe two years ago, he was really appreciative of it. He was surprised at how well it had gone, how well everyone had gotten along, how productive it had been, and how much he had learned from it.
那么,整体问题是什么?比如为什么难以了解大脑如何用概率计算?然后稍微讲讲这次合作以及你们得出的结论。
So, what's the overall issue? Like, why is it difficult to know how brains compute with probabilities? And then tell me a little bit about the collaboration and what you guys have come up with.
当然。这确实是个可以讨论好几个小时的丰富话题。其中一个问题是人们所说的‘贝叶斯万能解释’。如果你想证明大脑在进行某种概率性操作——权衡带有不确定性的感官证据、提取潜变量、采取适当行动——你总能构造某个概率分布作为先验,使得你的数据恰好符合概率推断的正确做法。
Sure. Yeah. This is a really rich topic that we could go on about for hours. One of the problems is what people call Bayesian just-so stories. So if you want to say the brain is doing some probabilistic thing, weighing sensory evidence with uncertainty, extracting latent variables, acting appropriately, you can always construct some probability distribution as your prior under which your data would be the right thing to do for probabilistic inference.
这和我们之前讨论的‘一个问题有上千种解法’是同样的概念吗?有关联吗?
Is this the same as what we were talking about earlier where there's a thousand different solutions to a given problem? Is it related?
是的,有关联。我认为可以称之为‘退化现象’。这是个非常具体的例子,对吧?
Yeah, it's related to that. I guess I would call these degeneracies. I mean, this is a very specific one. Right?
但存在几种不同的退化现象。举个简单的例子:假设你处于强化学习框架中,试图最大化某个效用。但你不完全了解现实世界,于是面临两个问题——如果现实世界处于某状态而你猜测为另一状态,这个错误的严重性有多大?
But there are a couple of different degeneracies. One simple one to characterize: let's say that you're in the framework of reinforcement learning, where you're trying to maximize some utility. Right? But you don't know exactly what the real world is, so now you have two things. If the real world is in one state and you guess a different state, how bad is that error?
对吧?所以这会产生一些后果。
Right? So that has some consequence.
这是在定义了效用的情况下,
This is with a defined utility,
不是这个,是的。所以如果你定义的话,嗯,测量这些效用的方法有很多种。对吧?对。比如效用可以有不同后果,不同的效用函数。
Not quite, yeah. So if you define, well, there are different ways of measuring those utilities. Right? Right. The utilities could have different consequences, different utility functions.
对吧?而且这些正确或错误反应的概率也可能不同。对吧?世界的状态。我们关心的是这两者的乘积。
Right? And you can also have different probabilities of those correct or incorrect responses. Right? The states of the world. What we care about is the product of those two.
嗯。效用乘以概率。对吧?你根据事件发生的频率来权衡效用,然后计算期望值。比如,如果你在打赌,有50%的机会得到200美元或0美元,那就等同于100%得到100美元。
Mhmm. The utility times the probability. Right? You're weighing the utility by how often it happens, and then you take your expected value. So if you're taking a wager and you could have a fifty-fifty chance of $200 or $0, that's equivalent to a 100% chance of $100.
对吧?你只需要取平均值,即你的平均回报,你在那里得到的平均效用。但你可以想象,通过改变效用函数和概率,可以有多种方式达到完全相同的平衡。
Right? You just take the average, your average return, the average utility that you're going to get there. But you can imagine that there are different ways of changing the utility function and the probability that give you the exact same balance.
当然。
Sure.
产品是唯一重要的事情。因此,无论你以何种方式获得该产品,你可以将效用减半,将概率翻倍,诸如此类。所以这是一种在任何情况下都无法区分的退化现象,因为它总是存在。贝叶斯的故事就像是,哦,我们找到了这个数据的解释,它是贝叶斯的,是在这个先验下的最优概率推断。是的。
The product is the only thing that matters. So any way that you get that product works: you can halve the utility and double the probability, things along that line. So that's one degeneracy that's never possible to distinguish, because it's always going to be there. The Bayesian just-so story is like, oh, we found this explanation of the data, and it's Bayesian, it's optimal probabilistic inference under this prior. Yeah.
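To make the degeneracy concrete, here is a tiny Python sketch. The numbers are just the $200/$100 wager from the conversation, and the halved-utility model is an illustrative assumption: expected utility depends only on the product of probability and utility, so rescaled pairs are behaviorally indistinguishable.

```python
# Expected utility depends only on the product p * u summed over outcomes,
# so different (probability, utility) pairs can be behaviorally identical.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

# The wager from the conversation: 50/50 between $200 and $0 ...
gamble = [(0.5, 200), (0.5, 0)]
sure_thing = [(1.0, 100)]  # ... equals a sure $100
assert expected_utility(gamble) == expected_utility(sure_thing) == 100

# The degeneracy: halve the utility, double the probability -- same product,
# so no behavioral measurement can tell these two models apart.
model_a = [(0.25, 200), (0.75, 0)]  # illustrative numbers
model_b = [(0.50, 100), (0.50, 0)]
assert expected_utility(model_a) == expected_utility(model_b) == 50
```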
那么,这是一个真正合理的先验吗?我的意思是,你刚刚编造了它。也许也许它是一个非常锯齿状、形状怪异的概率,只是为了让你的数据正确运作。那么你如何解决这个问题?你通过测试泛化能力来解决它。
Well, is that a really reasonable prior? I mean, you just made that up. Maybe it was a very jagged, weird-shaped probability distribution that was just the thing necessary to get your data to work out right. So how do you resolve that? You resolve it by testing for generalization.
你寻找新的东西,并对你的模型、你的概率、贝叶斯先验概率做出承诺,这些概率说明了哪些事情可能会发生。所以你说,我承诺这一点。现在,我对大脑如何表示概率的模型必须保持一致。就像我仍然需要在测试一个新情境时解释数据,这个情境仍然遵循我承诺的模型。
You look for something new, and you make a commitment to your model, your probability, the Bayesian prior probability, which says what things are likely to happen. So you say, I'm committing to that. And now my model of how the brain represents probabilities has to remain consistent. I still need to explain the data when I test a new situation that obeys my committed model.
让我简单概括一下。所以二乘以四等于四乘以二。嗯。你得到了相同的解决方案。但如果你说先验是二,那么你需要使用相同的先验二而不是四。
So let me just really strip this down. Two times four is the same as four times two. Mhmm. You get the same solution. But if you say the prior is two, then you need to use that same prior, two, not four.
你需要在不同的领域使用二来得到相同的答案,这就是你的确切意思。
You need to use that two in different domains to get the same answer, and that's your... Exactly.
每次都是如此。所以你已经承诺了这个二,这个二代表了世界的结构,你的模型假设可能会发生的事情。因此,在所有这些情况下,当人们说,哦,证据支持这种特定的解释。哦,证据支持那种解释。他们可能在使用不同的生成模型,关于他们试图进行概率推断的变量,并且他们可能在使用不同的概率。
Every time. So you've made a commitment to that two, and that two represents, you know, the structure of the world, the things that your model assumes are likely to happen. So in all of these cases, when people have said, oh, the evidence favors this particular interpretation, oh, the evidence favors that interpretation, they may be using different generative models of what variables they're trying to do probabilistic inference over, and they could be using different probabilities.
所以我们不是在比较同类事物。因此,为了进行有成效的比较,你需要比较同类事物。你需要共享,需要对世界运作方式的模型做出承诺,然后你可以评估大脑如何权衡概率的不同模型。但你不能做第二部分,测试这些不同模型是否表示概率,直到你对生成模型做出承诺。这种语言,我认为还没有被广泛理解。
So we're not comparing apples to apples. And in order to do a fruitful comparison, you need to compare apples to apples. You need to share, you need to make a commitment to, a model of the way the world works, and then you can evaluate different models of the way the brain weighs probabilities. But you can't do the second part, testing whether these different models are representing probabilities, until you make a commitment to a generative model. That language, I think, has not been widely appreciated.
这种对抗性协作,起初就像,你知道,你们在建立方向模型,而我们在构建世界中的加性成分特征模型。所以没错,这些是不同的生成模型,因此你们会以不同方式解释现象。拉尔夫实际上精彩地展示了,你可以采用其中一个生成模型,将其作为基于采样的图像块模型。然后,若从某些关注概率群体编码的对手视角来看——他们研究的是输入中某些线条或条纹图案的方向——你就能解释他们的数据。因此,同一个模型可以用这两种不同方式解释相同或不同的数据。
This adversarial collaboration, in the beginning it was like, well, you know, you're making models of orientation, or we're making models of additive component features in the world. So yeah, those are different generative models, and so you're going to explain things differently. Ralph actually has shown quite beautifully that you can take one of those generative models and have a sampling-based model for image patches. Then if you look at it from the point of view of some of its adversaries, who are looking at probabilistic population codes, which are about the orientation of some line or stripe pattern in the input, then you get their data explained. So you can explain the same data, different data, with the same model in these two different ways.
要调和这些观点是项艰巨的工作。
Coming up reconciling that is hard work.
嗯。
Mhmm.
而发现我们所做的生成模型承诺,是每个研究者在描述自身模型时都需要面对的部分工作。
And finding out what generative-model commitments we've made is part of what everybody needs to do when they're describing their own models.
等等。你说可以用同一模型解释不同数据集,但一种情况使用序列采样(这是大脑运用概率的一种理论),另一种情况使用分布式群体概率。所以这是两个不同模型对吧?还是说...
Wait. You said that you can use the same model to explain different data sets, but in one instance you're using sequential sampling, which is one theory of how the brain uses probabilities, and in another you're using the distributed population... yeah... probabilities. So those are two different models, right? Or is
实际上,这是对相同数据的不同解读。你可以有一个底层分布,当你询问方向表征或图像块表征时,从不同视角观察它。这是同一机制,只是同一事物的不同表现。
it... It's actually, well, it's different interpretations of the same data. You can have one underlying distribution that you look at from this perspective when you ask what the representation of orientation is, or what the representation of the image patches is. And it's the same mechanism. It's just one thing. Yeah.
在这两种方式下它呈现出不同面貌。
It looks different in these two different ways.
你必须调和这两种不同叙事如何最终达成相同结果。
And you have to reconcile how those two different stories can end up doing the same thing.
是的。还有第三个流派。在这场对抗性合作中,我们实际上试图呈现三个不同学派的观点。这些都是概率表征的典型代表。其一是抽样假说,即当你观察世界现象并试图解释成因时,就像掷骰子——有时会得出某种解释,有时会得出另一种解释。
Yeah. There's a third group. So in this adversarial collaboration, there were really three different groups that we tried to get represented. These are prominent representations of probabilities. One is sampling, which basically means that if you see something out in the world and you're trying to interpret what caused it, you roll dice, and then some fraction of the time you'll come up with one interpretation, and another fraction of the time you'll come up with a different interpretation.
我们的大脑持续进行着这种机制。它就像不断掷骰子,产生各种可能的解释,而你采纳某个解释的时长就代表其正确的概率。明白吗?这就是抽样假说的核心。
And that's just constantly happening in our brain. Our brain is rolling dice. It's coming up with alternative interpretations, and the amount of time that you spend with one of those interpretations is the probability that it's right. Right? That's the sampling hypothesis.
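A minimal sketch of the sampling hypothesis as described here, with made-up percepts and a made-up posterior (the names and numbers are illustrative, not from any real model): draw one interpretation per "moment", and the fraction of time spent on each interpretation approximates its probability.

```python
import random

# Assumed posterior over two made-up percepts (illustrative numbers).
posterior = {"face": 0.7, "vase": 0.3}

random.seed(0)  # fixed seed so the sketch is reproducible
counts = {"face": 0, "vase": 0}
n = 100_000
for _ in range(n):
    # one "roll of the dice" per moment: pick an interpretation
    sample = random.choices(list(posterior), weights=list(posterior.values()))[0]
    counts[sample] += 1

freq_face = counts["face"] / n
# time spent on an interpretation approximates its probability
assert abs(freq_face - 0.7) < 0.01
```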
第二是概率群体编码,第三是分布式表征编码——这两个概率表征术语有趣地相似。它们在许多方面具有高度共性,主要区别在于直接使用概率还是对数概率进行表征(所谓直接,是指通过神经活动的线性表达)。关于孰优孰劣存在争议,但它们本质上是互补的。我认为发现其中一种就必然会发现另一种,因为它们各自擅长不同的计算任务。
The second one is probabilistic population codes, and the third is distributed distributional codes, which are funnily similar terms for probability representations. And in some ways they have a lot of similarities: they differ by whether you're representing probabilities or log probabilities directly, and by directly, I mean linearly through the neural activity. And there are some arguments about which one of these is better and which one is worse. But they are fundamentally complementary, and I think that if you find one, you're going to find the other, because they're good at different computations.
概率运算中始终涉及两种基本操作:乘法和加法。当两件独立事件同时发生——比如掷骰子又抛硬币时,两者同时发生的概率就是各自概率的乘积。
In doing probability computations, there are two operations that you have to do all the time: you have to multiply and you have to add. Right? The multiplication is when you have two independent things happening, like you roll a die and you flip a coin. The probability of both things happening is the product of each one separately.
这就是概率乘法规则。另一种情况是互斥事件:比如掷骰子得到5点或6点时(嗯...),单个骰子不可能同时出现两个结果。
So that's the product rule. And the other case is when only one event happens. Like, if you roll a die, you got a five or you got a six. Mhmm. You didn't get both on one die.
因此计算这类事件的概率必须满足总和为1。这就是概率的加法规则与乘法规则。面对新证据时,若证据独立就需要进行概率相乘——这正是信息累积的方式。不同的编码方式各自擅长其中某种运算。
And so, in order to compute the probabilities of those, they have to add up to one. Right. So that's the sum rule and the product rule of probabilities, and you have to use them constantly. When you see a new piece of evidence coming in, if it's independent, then you're going to multiply your probabilities, and that's the way you're going to accumulate information. These different codes are good at different ones of those computations.
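The product rule and sum rule from this exchange, plus the log-probability view that distinguishes the two codes, can be sketched in a few lines (the likelihood values are made up for illustration):

```python
import math

# Product rule: independent events multiply.
p_die_five = 1 / 6       # roll a five on one die
p_coin_heads = 1 / 2     # flip heads on a coin
p_both = p_die_five * p_coin_heads  # both happen: 1/12

# Sum rule: mutually exclusive outcomes of one die sum to one.
assert abs(sum([1 / 6] * 6) - 1.0) < 1e-12

# Accumulating independent evidence: multiplying probabilities is the
# same as adding log probabilities, which is why a code that is linear
# in log probability makes evidence accumulation a simple sum.
likelihoods = [0.9, 0.8, 0.7]  # made-up evidence terms
prod = math.prod(likelihoods)
log_sum = sum(math.log(p) for p in likelihoods)
assert abs(math.exp(log_sum) - prod) < 1e-12
```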
因此,很自然会在它们之间来回切换。事实上,这种互补性蕴含着许多数学之美。但其中一种优先考虑相互作用。这就是当你使用概率群体编码时的情况。这虽然有点技术性,但对结构的表征至关重要,不仅仅是概率,还包括结构本身。
And so it's natural to jump back and forth between them. And in fact, there's a lot of mathematical beauty in this kind of complementarity. But one of them prioritizes interactions: this is the probabilistic population code. And so this is a little bit of a technical thing, but it's fundamental to the representation of structure, not just probability, but structure.
这是否意味着?你说的结构具体指什么?
What does that mean? What do you mean, structure?
描述结构的方式有很多,而结构对于大脑理解世界的方式极为关键。或许我们可以详细讨论不同类型的结构?我个人喜欢研究的一种叫做概率图模型。其中一种版本可以表示因果关系。嗯。
So there are a lot of ways of characterizing structure, and structure is really critical to the way that the brain understands the world. I mean, I think it would maybe be helpful to talk more about the different kinds of structures that there are. But one that I like to work with is called probabilistic graphical models. And in one version of them, they represent causal interactions. Mhmm.
比如现在,我坐在椅子上。椅子放在地板上。地板由墙壁支撑。墙壁又依靠地基。地基则扎根于大地。
So you can have, like right now I'm sitting on a chair. The chair is sitting on the floor. The floor is held up by some walls. The walls held up by some foundation. The foundation held up by the earth.
这就是一条间接的因果链。所以我其实是被大地支撑着的,对吧?通过这条长长的链条间接实现的。
So there's an indirect chain of causation. So I'm held up by the earth. Right? Indirectly through this long chain.
通过概率图模型中那些代表椅子的节点。
Through those nodes in the probabilistic graphical model: chair,
每一个节点都是,没错。
Each one of those, yeah.
地基,地面。没错。
Foundation, earth. Exactly.
没错。每一个都是节点。它们之间的边表示相互作用。比如地基并不直接与烟囱或屋顶相互作用,或者说,可能不是烟囱,是屋顶。对吧?
Exactly. Each one of those is a node. And the edges between them say what the interactions are. So the foundation is not directly interacting with the chimney, or, I should say, maybe not the chimney, the roof. Right?
屋顶是由所有这些其他部分间接支撑的。因此我假设,这种结构——这是一个我非常感兴趣并想要验证的关键假设——大脑知晓并利用这种结构进行计算。这相当自然。
The roof is held up indirectly by all of these other things. And so that structure, I hypothesize, and this is a key hypothesis that I'm really very interested in testing, is known by the brain and used by the brain in its computations. It's pretty natural.
实际上,它为限制计算可能性提供了一种有效方式。不是所有情况都可能发生,因为如果一切皆有可能,你就得考虑所有可能性。而在这里,你可以以结构化的方式限制可能性。
It actually provides a good way of restricting the possibilities of what computations you have to do. Not everything is possible, because if everything were possible, then you'd have to consider all those possibilities. Here you can restrict the possibilities in a structured way.
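As a loose illustration, assuming nothing beyond the chair-to-earth example above, the support chain can be written as a tiny graph where a missing edge encodes "no direct interaction", exactly the kind of restriction being described:

```python
# Nodes are the objects from the example; a *missing* edge means
# "no direct interaction" -- the restriction the brain could exploit.
support_edges = {
    ("earth", "foundation"),
    ("foundation", "walls"),
    ("walls", "floor"),
    ("floor", "chair"),
}

def directly_interacts(a, b):
    return (a, b) in support_edges or (b, a) in support_edges

def indirectly_supported(a, b):
    """Follow the chain of direct supports outward from a."""
    frontier, seen = {a}, set()
    while frontier:
        node = frontier.pop()
        seen.add(node)
        frontier |= {y for x, y in support_edges if x == node} - seen
    return b in seen

# The earth holds up the chair only through the long chain:
assert not directly_interacts("earth", "chair")
assert indirectly_supported("earth", "chair")
```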
概率结构化?
Probabilistically structured?
你也可以在元层面上操作,即对不同图结构赋予概率。
You can also do the meta level thing where you have a probability over different graphs.
对。实际上,
Right. In effect,
从元层面来说,有一种巧妙的方法可以实现这一点,即把动态图解释为带有超边的图。明白了吗?这真的很有趣,也很美妙。实际上,我还3D打印出了这种自然统计形态的模型。
there's a nice way of doing that in a meta sense, where you have a dynamic graph interpreted as a graph with hyperedges. Okay. And it's really fun. It's beautiful. In fact, I made a 3D-printed model of the natural statistical shapes that emerge out of this.
它看起来像个圆角四面体。
It looks like a rounded tetrahedron.
这就是你打算给我看的、放在你办公室的那个东西吗?
Is this what you're going to show me that's in your office at work?
嗯,我可以发张照片给你
Well, I can send you a picture of
好啊,发照片给我,我会把它放到视频里。
it. Yeah, send me a picture and I'll put it up in the video.
对。这是一个小小的三维概率分布。最酷的地方在于,比如你的图中有两个变量(节点x和y),它们有时会直接互动,有时又会断开连接——这取决于第三个变量的取值。
Yeah. So this is a little three-dimensional probability distribution. And the cool thing about it is that you can have, let's say, two variables, two nodes in your graph, x and y, that are directly interacting at one time and disconnected at another time, not directly interacting, depending on the value of a third variable.
没错。就像是元元变量那种概念。
Right. The meta meta variable kind of yeah.
是的。而且这种关系是相互的,存在各种对称性。但它会形成一个小四面体结构,从不同棱边看——比如前棱表示这些变量是正相关或正向互动的。而后棱,当你的第三个门控变量为负值时,它们的关系会呈现相反状态。如果把这些点连起来,就得到了一个四面体。
Yeah. And it's mutual, and there are all sorts of symmetries. But it creates this little tetrahedron where, from different edges, like, the front edge is saying that these are positively correlated, or positively interacting. The back edge, if you're at a negative value of your third gating variable, has them the other way. If you connect the dots, you get a tetrahedron.
哦,有意思。这个圆润的四面体就是这样。所以你完全可以得到这些动态变化的图结构。如果变量之间没有直接互动的边连接,这种图结构就是有价值的约束条件。大脑可以利用这种有价值的结构来简化计算。
Oh, cool. And this rounded tetrahedron is like that. So you can definitely get these changing graphs. So the graph structure, if you don't have an edge between variables such that they're not directly interacting, that is a valuable restriction. That is a valuable structure that the brain could use to simplify its computations.
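The gating idea can be made concrete with a tiny toy model. This is an illustrative sketch, not anything from the conversation: three binary variables whose only interaction is a three-way term, so the sign of the coupling between x and y flips with the gating variable z (the coupling strength J is an arbitrary choice).

```python
import itertools
import math

# Toy gated interaction: x, y, z are binary spins in {-1, +1}.
# The only interaction term is the three-way product J*z*x*y, so there is
# no fixed edge between x and y: given z = +1 they are positively coupled,
# given z = -1 negatively. J = 1.0 is an arbitrary illustrative choice.
J = 1.0

states = list(itertools.product([-1, +1], repeat=3))          # (x, y, z)
weight = {s: math.exp(J * s[0] * s[1] * s[2]) for s in states}

def corr_xy_given_z(z):
    """Exact conditional expectation E[x*y | z] by enumerating all states."""
    sel = [s for s in states if s[2] == z]
    total = sum(weight[s] for s in sel)
    return sum(weight[s] * s[0] * s[1] for s in sel) / total

print(corr_xy_given_z(+1))   # tanh(J) > 0: positively interacting
print(corr_xy_given_z(-1))   # -tanh(J) < 0: the other way around
```

Enumerating the eight states exactly, the conditional correlation comes out to tanh(J) for z = +1 and -tanh(J) for z = -1: the x-y edge appears and flips sign depending on the gating variable, which is the kind of context-dependent structure the rounded-tetrahedron shape encodes.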
它们之间不存在因果依赖关系。
There's no causal dependency between them.
没有因果依赖。所以因果表征就像我们知道神经网络是通用函数逼近器,意味着你可以用非结构化方式处理任何想要的推理,就像用大型网络那样。你只需要不断训练,最终总会找到正确答案——因为这总是可行的。但这可能消耗巨大资源,需要很长时间和大量数据,这本身就是另一种资源消耗。
There's no causal dependency. And so causal representation, like we know that neural networks are universal function approximators, meaning that you can take any of those inferences that you want and do it in an unstructured way, just like a big network. You just throw it at it and you train it forever and you'll find the right answer in the end. You always can do that. But it might take huge resources, it might take a long time and a lot of data, which is another resource.
关键的是,由于缺乏正确结构,它可能无法泛化。比如在新情境下测试时——假设我现在往椅子下放个瑜伽垫,这不会改变其他结构。但如果仅用通用函数逼近器来描述互动关系,我就得从头开始,无法利用已有的结构化知识。因此我认为,因果影响关系的图结构将成为极其重要的归纳偏置——而神经科学至今尚未真正解决这个问题。
And critically, it may not generalize because you're not using the right structure. So if you test it in a new situation, right? If I now put, I don't know, like a yoga mat under my chair, it doesn't change the rest of the structure. But if I were just doing a universal function approximator to describe what's interacting with what, I would now need to start over. I would need a whole new model; I can't leverage all the structured knowledge that I already have. And so to me, the graph structure of what is causally influencing what is going to become a really important inductive bias that I would say neuroscience has not really yet resolved.
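One way to see the yoga-mat point is that a distribution factored over a graph is modular: a new variable adds one local factor and leaves everything else untouched. This is a minimal made-up sketch; the variables and probabilities are invented for illustration, not anything measured:

```python
# A joint probability factored along a chain graph:
# p(floor, chair, me) = p(floor) * p(chair | floor) * p(me | chair).
# Each entry is an invented local probability for the "everything is fine" state.
factors = {
    "floor": 0.9,   # p(floor is stable)
    "chair": 0.95,  # p(chair upright | floor stable)
    "me":    0.99,  # p(I stay seated | chair upright)
}

def joint(factors):
    """Probability of the all-fine configuration: product of local factors."""
    p = 1.0
    for value in factors.values():
        p *= value
    return p

p_before = joint(factors)

# A new variable (the yoga mat) touches exactly one new local factor;
# every existing factor is reused unchanged, so nothing is relearned.
factors["mat"] = 0.8    # p(mat lies flat | floor stable)
p_after = joint(factors)

print(p_before)   # 0.84645
print(p_after)    # 0.67716, i.e. 0.84645 * 0.8
```

An unstructured function approximator fit to the old joint has no such locality: adding the new variable can, in principle, change every parameter.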
我刚才提到了这个术语,我们...
I threw in that term there, we'll
其实正想提到这个,因为...好吧。确实。我是说...
was about to bring it up, actually, because... Yeah. Okay. I mean, it's
让我先把那个想法快速说完。
Let me just quickly finish the thought there.
不,不。我们要继续。我只是想... 是的,你说吧。
No. No. We're going to stay on. I just want to Yeah. Go ahead.
好的。所以那种具有直接和间接连接的结构,在其中一种概率编码中体现出来,而另一种则没有。
Okay. So that structure of having direct and indirect connections is something which is manifested in one of those probabilistic codes and not the other.
啊,明白了。
Ah, okay.
因此这又回到了生成对抗协作的话题。但仍有根本性问题——在我看来,自然参数(即图中非零连接边的存在与否)被其中一种表征方式突显,而另一种则没有。
And so that's bringing it back to this generative adversarial collaboration. But there's still fundamental questions. I mean, that's just my perspective that the natural parameters, which are basically the nonzero ones, like the edges on the graph of what's connected, are highlighted by one of the representations and not the other.
所以在你的表征中,连接可以存在或不存在,可以开启或关闭。你认为大脑或心智会学习这些关于世界因果结构的图形表征,这些归纳偏好吗?这很有趣,因为它是建立在有机神经网络之上,而这个网络又学会了另一种网络... 那么另一种理论是否也认为大脑必须学习这种结构?是的。
So in your representation, right, where you could have the connection or not, you could turn on or off the connection. Okay, so you posit that the brain or our minds learn these graphical representations of the causal structure of the world, these inductive biases, right? Which is interesting because it's built on top of an organic neural network that somehow then learns a network. That's very But then the other account, does it also posit that the brain has to learn the structure? Yeah.
确实如此。好的。
It does. Okay.
因此我认为,这就是为什么分布式分布编码中的某些工作实际上暗中体现了相同的结构,所以它某种程度上还是在表示这个图。
So I think this is why some of the work in distributed distributional codes actually secretly manifests this same structure, so it's kind of representing the graph anyway.
好的。
Okay.
所以它暗中像是概率群体编码的局部变换,但全局来看它与概率群体编码相同。这相当技术性。我们需要通过一些数学来理解,但隐藏图的概念我认为相当易懂。我们正试图开发方法来发现这些隐藏图,以及信息是否真的沿着这些隐藏图流动,不仅仅是表示可能存在的事物,而是你当下对世界的猜测,你正在观察的东西。这些信号是否在你脑海中沿着某种隐含的图流动?
So it's secretly like a local transformation of a probabilistic population code, but globally it's the same as a probabilistic population code. This is pretty technical. We'd have to, like, go through some math for it, but the idea of a hidden graph I think is pretty accessible. We're trying to develop methods to discover those hidden graphs and whether information is actually flowing along those hidden graphs, not just representing things that could be present, but what you're guessing about the world as it is right now, what you're looking at. Are those signals flowing along some implicit graph in your mind?
我们能找到那个图吗?我们能找到信号是如何从一种表示转换到另一种表示的吗?这里我认为归纳偏差变得非常关键,即我们想象大脑擅长表示概率。它实现这一点的一种方式就是生活在一个这对它有帮助的世界里。这意味着每次你面对一个新问题时,如果你大量练习,解决该问题的正确方法就是使用概率推理。
Can we find that graph? Can we find how the signals are transformed from representation to representation? Here is where I think the inductive bias becomes really critical, which is that we're imagining that the brain is good at representing probabilities. And one way that it could do that is by just living in a world where that's helpful. And this means that every time you're faced with a new problem, the right way to solve that problem if you practice it a lot is to use probabilistic reasoning.
贝叶斯式的。所以是贝叶斯推理。所以每次你知道,你在做听觉辨别,你试图跑出去追赶冰淇淋车,就像你可能在做所有这些不同的事情,对于每一个,最佳解决方案都将根据其概率权衡证据,并以这种贝叶斯方式将它们综合在一起。
Bayesian. So Bayesian reasoning. So every time, you know, you're doing an auditory discrimination, or you're trying to run down to catch the ice cream truck, all of these different things that you might be doing, for every one of them, the best solution is going to weigh evidence by its probabilities and synthesize it together in this Bayesian way.
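The "weigh evidence by its probabilities" recipe has a standard minimal form for Gaussian cues: combine them with weights proportional to their precisions (inverse variances). A sketch with made-up numbers, not anything from the episode:

```python
def combine_cues(mu1, var1, mu2, var2):
    """Precision-weighted (Bayes-optimal) fusion of two independent Gaussian cues
    about the same underlying quantity."""
    w1, w2 = 1.0 / var1, 1.0 / var2          # precisions
    mu = (w1 * mu1 + w2 * mu2) / (w1 + w2)   # posterior mean
    var = 1.0 / (w1 + w2)                    # posterior variance
    return mu, var

# The reliable cue (small variance) dominates the combined estimate,
# and the fused variance is smaller than either cue's alone.
mu, var = combine_cues(0.0, 1.0, 10.0, 4.0)
print(mu, var)   # 2.0 0.8
```

The same precision-weighting structure recurs across very different tasks, which is exactly why a reusable motif for it would be so valuable.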
但那成本非常高。
But that's super expensive.
所以问题是,是否有某种模式可以让我们以更低的成本做到这一点?它是否是可重用的?我认为这是一个大问题。我不知道我们是否有这个,但如果我们没有,假设你知道,你训练有素,最终为所有这些不同问题找到了良好的贝叶斯解决方案。这是否意味着我们实际上是贝叶斯大脑?
So the question is, is there some motif that lets us do that with less cost? Is it something which is reusable? And that's, I think, a big question. I don't know that we have that, but if we don't Let's say that, you know, you're well trained and you end up with good Bayesian solutions for all these different problems. Does that mean that we are actually Bayesian brains?
我们使用贝叶斯大脑?嗯,这可能只是一个训练有素的网络涌现出的特性,而这个网络最初并没有倾向于贝叶斯主义的归纳偏置。这只是训练的结果。所以如果你只是用一个普通的神经网络,事实上,Orhan和Ma已经做过这样的实验,他们有一个更早的版本,我更喜欢那个名字,叫做《概率的必然性》。
That we use Bayesian brains? Well, that might just be an emergent property of a well-trained network that did not have an inductive bias that favored Bayesianism to begin with. It's just the result of the training. So if you just took a generic neural network and in fact, this was done by Orhan and Ma. They had an earlier version, which I like the name of better. It was called The Inevitability of Probability.
厉害。是的。他们最终发表时换了个新标题。啊,好吧。但他们当时说的是,嘿,我们就把一个普通神经网络扔给这些问题,结果你看,概率表征就自己冒出来了。
Badass. Yeah. They had a new title when it was eventually published. Ah, okay. But they were saying, hey, let's just throw a generic neural network at these things, and lo and behold, probabilistic representations emerged.
但它们不会具备的是——因为它们没有这种机制——参数共享的能力,使得如果它学会了为这个任务做贝叶斯推断,那么它在新任务中自动就更擅长贝叶斯推断。所以这个元素是...这类网络并不具有概率推理的倾向性。它是从数据中涌现的,而非来自内置的、先天的能力。这意味着它对贝叶斯推理缺乏良好的归纳偏置,尽管它涌现出来了,我会说这不能算强有力的贝叶斯大脑案例。这更像是,好吧,只是训练得好而已。
But what they would not have, because they have no mechanism for this, is parameter sharing, such that if it learned to do Bayesian inference for this task, it would then automatically be better at Bayesian inference in a new task. So that element Those kinds of networks do not have a propensity towards probabilistic reasoning. It emerges from the data, not from the inbuilt, the innate capabilities. And so that means it does not have a good inductive bias for Bayesian reasoning, and though it emerges, I would say that's not a strong case of a Bayesian brain. That's like, okay, just good training.
因为从这个意义上说它并不规范。
Because it's not normative in that sense.
它没有在多种不同情况下运用那个原则。每次都需要重新学习那个原则。
It's not using that principle in lots of different cases. It has to relearn that principle every single time.
这是不是更适用于基于采样的方法?我想
Is that, like, more amenable to a sampling-based approach? I think
采样是完全相同的问题。你需要能够以正确的概率使用样本。对吧?是的。就像你仍然需要能够采取正确的方式。
the sampling is exactly the same kind of issue. You need to be able to use samples with the right probabilities. Right? Yeah. Like you still need to be able to take Right.
所以,可能确实如此,比如是的,采样可能是一个更容易实现参数共享的方式。举例来说,如果我说单个神经元的表征可能更容易实现参数共享,因为你可以像基因编码那样处理它,而基于群体的方式可能就不行。我我不确定。这真的是一个非常关键的问题。
So there, it might be, like, yes, it might be that sampling is an easier thing to share parameters over. For example, I would say maybe a single-neuron representation is easier to have parameter sharing for, because you could, like, genetically encode it, than a population-based thing might be. I don't know. This is, I think, a really critical question.
那么它就不需要是涌现的,可以直接硬编码。
Then it doesn't need to be emergent, it can just be hard coded.
某些元素可以直接硬编码,你可以拥有不同的微电路,它们非常擅长进行概率表征。
That some elements can just be hard coded, and you could have different micro circuits that are really good at doing representations of probabilities.
比如说一个皮质柱。
Say a cortical column.
皮质柱。对。或者不同的大脑区域。另一种实现方式,我们这里讨论的是空间上的参数共享。你有一组神经元负责某项功能,然后以某种非物理的方式复制它的参数。
Cortical column. Yeah. Or different brain areas. Another way that you could do it so here we're talking about parameter sharing over space. So you have this group of neurons that does something, and then you copy it somehow, which is non-physical.
你可以将它的参数复制到另一组神经元,这无法通过学习实现,但可以通过发育过程完成。就像它们都被编程遵循相同的发育路径,最终在你的大脑不同位置形成擅长概率推理的结构。
You could copy its parameters to another group of neurons and you can't do that by learning, but you could do that by development. Like they're both programmed to go down the same developmental path and then you end up with kind of good probabilistic reasoners at different locations in your brain.
好的。我们已经深入探讨了概率相关的内容,但我现在想继续推进话题,因为我想讨论你的具体工作以及你对其的思考。不过先退一步说,这种生成对抗性协作中,每个人对大脑如何利用或计算概率来实现功能都有不同观点。但大脑的处理功能容量是如此巨大。
Okay. So we've really gotten into the weeds about the probabilistic stuff, and I want to move on because I want to talk about your specific work and sort of your reflections on it, but just backing up, right? So there's this generative adversarial collaboration. Everyone has different perspectives on how probabilities might be used or computed in brains to do things. But the capacity of the functions, of the processing, of the brain is so vast.
难道你不能根据情境灵活运用所有这些吗?我只是说这两种分布编码、群体编码之间存在一种优雅的数学权衡关系,对吧?这种此消彼长的特性在不同情境下会很有用。
Couldn't you just be using all of them depending on the context? I'm just saying that the two distributional codes, population codes, sort of had this nice mathematical relationship, a trade-off, right? Back and forth, that would be useful in different situations.
对。存在对偶性。
Yeah. Duality there.
没错。可逆关系。但它们就不能全部使用吗?
Yeah. Inverting. But can't they just use it all?
当然可以。有可能。我是说,我们在寻找普遍规律,但可能找不到。对吧?
Sure. It could. Yeah. It could be that I mean, we're looking for universals, but we may not find them. Right?
可能不同部位会发生不同情况。
It might be that different things happen at different locations.
你持什么观点?你认为大脑是个拼凑系统吗?就像各种不同机制勉强协作那样?但你又是讲究规范的人。所以你认为大脑是以规范化方式在进行优化。
What's your bet? Do you think of the brain as a kludge? As, like, lots of different things just kind of working it out? But you're also a normative person. So you think of the brain as optimizing in a normative fashion.
那你最终怎么看这个问题?比如从全脑视角来看?
So where do you land on this? Like, from a whole-brain sort of perspective?
是啊。说实话,我目前还不确定。
Yeah. I mean, to be honest, I don't know yet.
还是两者兼具?可能每...其实总是两者都有。这很宏大
Or is it both? It could be every It's always both. It's huge
而且这是有原则的。我想我通常的思考方式是,这两方面都会存在。理想情况是能找到让事物变得可理解的原则。某种意义上,只有那些非临时拼凑的东西才是可理解的,对吧?
and it's principled. I mean, I guess that's generally the way that I go: there's gonna be elements of both there. The hope is that you can find some principles that make things understandable. In a sense, the only things that are understandable are the non-kludges. Right?
只有符合原则的事物才是可理解的。事实上,当我在数据集中试图发现某种图结构算法时,就经常强调这一点。如果大脑计算中存在沿着图结构进行的动态过程,而图中每条边都各行其是,那甚至很难称之为算法。如果大脑有算法,就意味着它能在不同情境下执行相同的操作。
The only things that are understandable are the principles. In fact, this is a point that I like to make when trying to discover one of these graph-structured algorithms in a dataset. So if there are dynamics in the brain's computations that proceed along the graph, and if, in that graph, every edge just did its completely separate thing, then it's not even really very meaningful to talk about that as an algorithm. An algorithm, if the brain has an algorithm, it means it's doing the same thing in different contexts.
对吧?算法就是为实现某个目标而定义的一系列步骤。
Right? An algorithm is a defined set of steps that need to happen to accomplish something.
没错。顺便提一下背景信息,我的实验室就叫'算法大脑实验室'。
Yeah. Yeah. In fact, just to give some context here, the name of my lab is the Lab for the Algorithmic Brain.
缩写也是LAB(实验室),对吧。
Also abbreviated as LAB. Right.
是的。所以第一个实际上,应该说这是算法大脑实验室,然后第一个实验室代表算法大脑实验室,然后那里的第一个实验室,就像是
And yeah. So actually, I should say it's LAB for the Algorithmic Brain, and then the first LAB stands for LAB for the Algorithmic Brain, and then the first LAB there, it's like
道格拉斯·霍夫施塔特会非常喜欢这个。
Douglas Hofstadter would really like that.
他会爱死它的。还有GNU Unix的那些人,也会喜欢。好吧。因为GNU代表的是
He would love it. And the GNU Unix people would like it too. Okay. Because GNU stands for
哦。
Oh.
GNU不是Unix,而那里的GNU是
GNU's Not Unix, and the GNU there is
哇。好吧。Gnu是递归的,并非一直到底。
Wow. Okay. GNU is recursive, but not all the way down.
一直到底。所以寻找这类结构时,我觉得如果我们没有一套共享的可重复步骤序列,就没什么可学的。那只是个大杂烩。所以我将要学习的任何东西,都将基于某种共享的原则。你知道,如果每个大脑都做不同的事,如果大脑的每个部分都做不同的事,那就很难理解任何东西。
All the way down. So looking for those kinds of structures, I feel like if we don't have a shared, repeatable series of steps, there's nothing to learn. It's just a big hack. So anything that I am going to learn is going to be from some kind of principle that is shared. You know, if every brain does something different, if every part of every brain does something different, it's going to be hard to make any kind of sense of anything.
现在确实有些人相信这一点。他们寻找原则的地方,不在于大脑在推理或操作过程中的功能本身,而在于学习过程。他们认为存在某种潜在的学习目标。哦,就是说你有一个目标或目的,这才是我们可以理解的。
It's just So now, some people actually do believe that. And the place that they look for principles is not in the functioning of the brain per se during, let's say, inference or operation, but rather in the learning. That there is some underlying learning goal. Oh. And that you have a goal or an objective, and that's what we can understand.
而不是那些随机的涌现计算——那些只是你用这个数据集学习时恰好发生的东西。
Not the resulting emergent computations which are just whatever happens when you learn it with this dataset.
哦,不是说那种有原则的学习规则最终会产生本质上等同于共享计算的结果。而是说它会学会任何需要学习的东西。
Oh, it's not that that principled learning rule would result in essentially the equivalent of a shared computation. It's that it learns whatever it needs to learn.
是的。而且
Yeah. And
最根本的是
the fundamental thing is
对,完全正确。所以有些人倾向于那个方向。
Yeah. Exactly. So some people kind of lean in that direction.
那很笨拙。那是笨拙的方向。对吧?
That's kludgy. That's the kludgy direction. Right?
在最终的计算结果和推理过程中,它确实显得有些笨拙,但在根本的学习原理获取方式上并不如此。
It's a little bit kludgy in the final result of what computations happen, what inferences happen, but it's not kludgy in how you get the fundamental learning principle.
它并不笨拙。是的。
It's not kludgy. Yeah.
是的,这正是关键所在。但你知道,我们还应该考虑进化历史的影响。我不确定其中有多少是临时拼凑的,又有多少是核心原则。实际上,这让我想到我们刚刚启动的另一项重要合作。
Yeah. That's the idea. But you know, there's also the evolutionary history that we should account for. And I don't know to what degree some of that is kludgy or to what degree some of that is, like, core principles. And in fact, this brings me to one other major collaboration that we've just started.
这项合作是西蒙斯生态神经科学合作项目,简单来说就是——关于生态神经科学,我先简要介绍一下背景。生态心理学是由吉布森夫妇在五六十年代创立的领域。
This collaboration, the Simons Collaboration for Ecological Neuroscience, is basically saying like Okay. So, ecological neuroscience, let me just give a little background on that. Ecological psychology was a field founded by Gibson and Gibson in the fifties and sixties.
你提到了两位吉布森,这很好。
You included both Gibsons. That's great.
两位吉布森。
Both Gibsons.
大多数人只提其中一位。
Most people just give the one.
是的。这对夫妻搭档,丈夫更专注于计算方面,妻子则更侧重于发展领域,比如儿童发展这类关键问题。他们都反对大脑创造表征的观点。这些吉布森派心理学家,他们非常不喜欢使用‘表征’这个词,这对他们来说简直是禁忌。
Yeah. The husband-and-wife team: the husband was more focused on, like, computations, and the wife was more focused on development. They're both critical there, like childhood development, that kind of thing. And they were arguing against the idea that the brain creates representations. So these Gibsonian psychologists, they really don't like this idea; even using the word representation is often anathema to them.
有人称之为反表征主义。确实如此。
It's anti-representational, as some people say. Yeah.
对,对,完全正确。实际交流中我发现吉布森派的态度比我预想的要温和些。
Yes. Yes. Exactly. I've found it's a little softer in practice than I expected when talking to Gibsonians.
温和是指他们更倾向于...
Softer in that they're They more
会使用‘表征’这个词。
will use the word representation.
哦,糟糕。
Oh shit.
比如,他们可能不小心说漏嘴。不过嘛...我不会点名,但你们心里清楚。这个对比的概念就像是——你现在正拿起一个咖啡杯,对吧?
Like, maybe they let it escape their lips accidentally. But Well, yeah. And I'm not gonna name any names, but you know who you are. And so the idea of the contrast would be: you're picking up a coffee cup here. Right?
所以你看这个场景,有一个杯子,它有些边缘,是个黑色物体,那里有个圆润的形状,中间还有块更深的色斑。你正握着那个东西。没错。
So you look at the scene, you have a cup, it's got some edges, it's a black object, it's got some rounded shape there, and it's got a darker patch in the middle. You're holding that thing right there. Exactly.
我在拿起它之前,已经想好了所有需要做的计划动作。我对将要发生的事情有了完整的心理模型。对吧?
I made a Before I picked it up, I figured out all of the planned movements I needed to do. I had a complete mental model of what was gonna happen. Right?
非常好。是的。完美。这种将物体从部件组合起来的方式,那种表征是吉布森夫妇不认同的。他们认为那是个错误。
Excellent. Yes. Perfect. And so that way of kind of putting together objects from pieces, that representation is something that the Gibsons didn't like. They thought that was a mistake.
相反,他们认为我们有两种视角。一种是直接感知——我个人不太推崇这个观点。但另一种是我们通过可供性(Affordances)来解读世界。可供性这个词是由J.J.吉布森提出的。
And instead, they think there are sort of two aspects. One is direct perception, which I'm not a big fan of. But the other is that we interpret the world in terms of things we can do with it. Affordances. It's a word that Gibson, J.
动词'affords'原本就存在,但吉布森将其名词化。他说,你的咖啡杯提供了通过手柄抓握的可能性。因此他称之为可供性。
J. Gibson came up with. The word affords was already there, but not as a noun. He said, your coffee cup affords picking up by the handle. And so, he says that that is an affordance.
你可以握住杯子,可以倒满杯子,可以扔掷杯子,还可以轻敲杯口发出'呼'的声响。
You can grasp the cup. You can fill the cup. You can throw the cup, you can bop the top of the cup and make a whoop sound.
我喜欢最后那个例子。听到你这么说我有点意外,因为根据我的观察,你更偏向表征主义——毕竟你在学习图形模型,涉及大量结构和世界模型等内容。那么生态心理学,或者你即将解释的生态神经科学,是你逐渐开始欣赏的领域,还是早已认同的理念?
I like that one. I'm somewhat surprised that I'm hearing you say this, because my perspective is that you're more on the representational side, because you're learning graphical models and there's a lot of structure and world models, etc. So is ecological psychology, and I guess you'll describe what ecological neuroscience is, something that you've come to appreciate, or have appreciated all along?
所以我非常喜欢'可供性'这个概念。我认为这是个强有力的理念,它能帮助我们以聚焦实用性的方式构建世界中的事物。我们这个名为SCENE的团队——西蒙斯生态神经科学协作组,正致力于通过神经科学和一系列精密的行为实验来验证这些理论,研究对象包括小鼠、猴子、人类、婴儿、蝙蝠等各类生物。
So I love the idea of affordances. I think it's a powerful idea, and it helps us structure things in the world in a way that is focusing on the stuff that's useful. As opposed to So this team, which we call SCENE, the Simons Collaboration for Ecological Neuroscience, we're really trying to test these ideas through neuroscience and sophisticated behavioral experiments with a whole variety of animals. We have mice and monkeys and humans and babies, human babies, and bats. And we're looking for different tasks where they have some information that they can't do anything about.
有些信号它们无法处理。这些不属于可供性范畴,不能提供任何行动可能性。但确实存在一些有用的元素,对吧?
Some signals they can't do anything about. They're not affordances. They don't afford anything. Then they have some things that are useful. Right?
有些事物是可操作的,你可以对其采取行动,它们是可控的。另一些则具有奖赏性——某些可操作行为未必会获得实验性奖励,它们不属于任务范畴。但纵观神经科学与机器学习的主要理论框架,基本都落在这两个类别中。
You can do stuff, you can act upon them, they're controllable. And then, you have some things that are rewarding. So some of the things that you can do are not necessarily experimentally rewarded. They're not part of the task. But when you think about the main themes of the big theories of neuroscience and machine learning, they're in these two categories.
其一是强化学习:竭尽所能达成目标,学习只服务于目标的内容。这是极端之一。另一个极端则是构建万物的生成模型——
One is reinforcement learning. So, like, you do everything that you can to get your goal. And you learn stuff insofar as it supports your goal. That's one extreme. And then on the other extreme, you make a generative model of everything.
试图通过世界压缩来描述所有感官观察的成因,嗯。即便这些内容既不实用也无奖赏性。因此介于两者之间且略有偏离的,是第三种可能性:既不学习全部,也不只学奖赏性内容,而是学习那些你可操作的事物。
You try to describe the causes of all of your sensory observations as a compression of your world. Mhmm. Even if it's not useful. Even if it's not rewarding. And so, I think kind of in between and a little off to the side is this other possibility: you don't learn everything and you don't learn just the rewarding stuff, you learn stuff that you can do.
这就是可供性。这个理念之所以引人入胜,在于它能让你更高效地利用有限数据和资源,获得更好的泛化能力。若只关注当下有用的内容,在不断变化的世界里,你会错过许多未来可能有用却未知的事物。而试图学习一切,则像福尔摩斯说的'太阳绕地球还是地球绕太阳对我毫无区别,我会立刻忘记'——
And that's the affordances. So that becomes a compelling idea because, I think, it allows you to use your limited data and your limited resources more efficiently for things that will generalize better. If you just focus on what's useful right now, the world is constantly changing and you're gonna miss a bunch of things that could have been useful later that you didn't know about. And if you try to learn everything, it's a little bit like Sherlock Holmes saying, whether the sun goes around the earth or the earth goes around the sun, it makes no difference to me. So I'm going to promptly forget it.
福尔摩斯对此非常务实。因为这两种解释都能说明相同的数据。但生成模型会试图描述日心说或地心说——并非因其有用,仅仅为了解释所有可能性。嗯。
Sherlock Holmes is very, like, practical about that. Because it's explaining, you know, the same data. But now, the generative model You'd be trying to describe whether the sun goes around the earth or the other way around. Not because it's useful for you, but just because you're trying to explain everything you can. Mhmm.
因此,在那个模型中你消耗了大量脑力去做一些可能永远不会用到的事情。所以这给了我们关于‘可供性’的概念,我认为它在泛化到未来可能带来回报的新事物与实际行动之间,提供了一种有趣的潜在平衡。
And so, you're using a lot of brainpower in that model to do things that you might never use. So this affordances idea, I think, gives us an interesting potential balance between generalization to new things that you might be able to act upon, that could be rewarding later.
就像完全泛化与... 对。
Like total generalization versus Yeah.
你不会采用那个模拟世界上所有因果关系的生成模型,那将是泛化的极致。但代价很高。没错,需要积累海量数据。
You're not going to The generative model, which models every cause in the world, that's going to be the best at generalization. But you pay for it. Yeah. A lot of data that you have to accumulate.
这也更低效,因为你在学习大量根本不需要或用不上的东西。所以从某种角度看,如果智能的本质是解决当下问题,这种方式反而更不智能。
It's less intelligent too, because you're learning a shit ton of things that you are not gonna need or use. So in some sense, it's less intelligent, if intelligence is solving the problems at hand.
正确。而且是在自然环境中实际遇到的生态问题集合。是的。因此我认为这成为我们可以探索的第三条有益线索,并且设计了一些神经科学实验来验证——这些实验非常有趣。生态心理学的另一个要素是直接感知。
Correct. And some ensemble of ecological problems that you actually encounter in nature. Yeah. So this becomes, I think, a useful third theme or third thread that we could explore, and we have some neuroscience experiments to test these ideas, which are a lot of fun. Now, the other element of ecological psychology was direct perception.
其核心观点是我们不需要分步计算。
And this basically says that we don't have steps of computation.
没错。
Right.
我们直接就能感知到眼前的事物。
We just directly know things that are there.
历史上这种观点确实让人难以接受,不仅是...对。这就像人们最常质疑的——你们到底在搞什么鬼?是啊。
And This has rubbed people the wrong way historically, not just Yes. This is, like, the main thing where people are like, what the hell are you Yeah.
没错。举个典型例子,回到拼凑的话题,保罗·西塞克算是生态神经科学家吧,他坚决反对表征主义阵营。他写过几篇关于大脑进化史的精彩论文,我实在太喜欢了。
Yeah. Exactly. One good example of this, which gets back to the kludges: Paul Cisek is an ecological neuroscientist, I guess, who is very much in this camp, this anti-representation camp. And he's written a couple of beautiful papers on the evolutionary history of brains. I just love this.
他写的几篇综述论文堪称杰作,整合得极其精妙,强烈推荐阅读。
There are a couple of papers that he's written that are gorgeous, a wonderful synthesis. I highly recommend them.
他在这上面投入了大量时间,和我交流时他说——因为这算是冷门方向,算是他的激情项目吧。嗯。但这工作确实很棒,很高兴他在做这些研究。
He's spending a lot of time on that, and in my conversations with him, he said, you know because it's sort of off the beaten path and it's a passion project for him. Mhmm. But yeah, it's beautiful work and I'm glad that he's doing it.
是啊,我也很感激。他举了个蜥蜴寻找庇护所的例子:当你在环境中移动时,看到上方明亮下方阴暗的光斑,若朝某方向移动时暗区扩大亮区升高,这就暗示着可能有悬垂物。
Yeah. I'm grateful as well. So he has an example of how you might find shelter as a lizard. So you're moving around in the world and you see a patch of light that is bright on top and dark below and as you move in one direction, the dark part gets bigger and the bright part gets higher. That suggests an overhang.
你正在接近可能作为庇护所的悬垂物。嗯。现在想象直接感知——你只需将这种亮区上升暗区扩大的视觉模式,直接神经连接到前进动作即可。对吧。
You're getting closer to an overhang which might be shelter. Mhmm. So now, you can imagine direct perception where you just wire this particular visual pattern of a bright patch going up and a dark patch growing. You just wire that directly to move forward. Right.
至少在某种情境下如此。然后还有像反馈调节之类的循环嵌套循环。但那种基本动作会是一种直接的感知方式。你不需要知道,也不必构建一个关于那里存在凸起或凹陷区域的模型。你只需这样做——将这种感官输入与那个运动输出连接起来,就完成了。
In some contexts, at least. Then there are, like, loops of loops that are feedback and regulating and all that. But that basic movement would be kind of a direct perception approach. You don't have to know, you don't have to make a model of the fact that there's a convex, a concave area there. You just do this particular thing You connect this sensory input to that motor output and you're done.
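The lizard example can be caricatured in a few lines: a policy that wires the optical pattern straight to a motor command, with no intermediate model of an "overhang" object. The signal names and thresholds here are invented for illustration:

```python
def lizard_policy(bright_patch_rising, dark_patch_growing):
    """Direct-perception caricature: sensory pattern -> motor command.

    No object model, no plan; the visual pattern itself triggers the move.
    Inputs are invented per-frame changes: how much the bright patch's lower
    edge rose, and how much the dark patch's area grew, since the last frame.
    """
    if bright_patch_rising > 0 and dark_patch_growing > 0:
        return "move_forward"    # pattern consistent with approaching shelter
    return "keep_exploring"

print(lizard_policy(0.2, 0.5))    # move_forward
print(lizard_policy(-0.1, 0.5))   # keep_exploring
```

Feedback loops would wrap around this mapping in practice, but the core claim of direct perception is that nothing in between needs to represent "overhang" as a thing.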
这可能就是为什么打苍蝇很难。
It's probably why it's hard to swat a fly.
因为它们特别擅长这类反应。
Because they're really good at those kind of things.
没错。
Right.
它们有非常迅速的本能反应。虽然吉布森认为这能解释所有现象,但表面上看显然不成立对吧?确实如此不是吗?
They have very quick, close reactions. Whether that describes I mean, Gibson thinks that that describes sort of everything. And it seems just false on its face. Right. Right?
这根本说不通。我去年还真从图书馆借了本《大脑与行为》期刊,里面有整篇文章讨论这个。当时我就觉得——哇。
It's just false. There was a whole article in, like, a brain and behavior journal that I actually took out of the library. Wow. Last year. I know.
我当时手里拿着这本书,它就像此刻真实存在的这本书一样。
I was holding this book, and it was like this real book here.
你因此长出老茧了吗?没有。
You develop calluses from that? No.
翻页时真的很疼,因为你知道,这是我不常使用的肌肉
It got really sore, because turning the pages was, you know, a muscle I hadn't used in a
有一阵子了。没错。
while. Right.
而且有很多人回应,我想是Shimon Ullmann在批判直接知觉理论时提出的观点。许多知名学者对此发表了看法。其中杰夫·辛顿的回应很有意思,他说吉布森不可能是这个意思——认为感知完全没有中间步骤。他认为吉布森反对的其实是传统AI那种做法:先提取边缘,再组合成轮廓,然后逐步处理,就像计算机科学那种分步操作。
And there were a bunch of responses to it I think it was Shimon Ullman who was critiquing direct perception in this way. And a lot of luminaries gave their responses to it. And one of them I thought was pretty interesting, from Geoff Hinton, which was like, Gibson couldn't possibly have meant that, that there are just no intermediate steps. I think instead what he was probably arguing against was more the good old-fashioned AI sense of, like, first you extract edges, and then you put them together into contours, and then you do this thing and then you do that thing, in a very computer-science-y way.
但辛顿读过吉布森的著作吗?我没读过吉布森的主要文本,所以...是的,我无法确定
But had Hinton read the Gibson? I have not read the main text of Gibson, so, I mean, yeah. I wouldn't know if
我也不清楚辛顿读过什么,但他试图调和这些观点。我认为他的联结主义思想与此相关——他说这不过是神经元集体运作的涌现行为,完全可能符合吉布森心中那种非逐步算法的理念。在我看来,或许可以把某种联结主义架构解释为:看,这些是关于轮廓的序列信息,信息流实际是沿着轮廓优先传递,其他信息则偏离轮廓。
I mean, I don't know what Hinton had read, but he was trying to reconcile these things. And I think his connectionism was along those lines. He was saying, yeah, it's just an emergent behavior of all these neurons working together, and that could be perfectly consistent with this idea of non-step-by-step algorithms that Gibson may have had in mind. You know, from my perspective, it might be possible to take one of those connectionist architectures and actually interpret it as, hey, look, here's some sequence information about contours. And the information flow actually prioritizes along the contours, and some other information is off the contours.
但并不是说存在专门的轮廓神经元。这里需要平衡——如果说这些神经元在此情境下不与那些神经元互动,这种计算结构恐怕仍然存在。
But it's not like, here's a contour neuron There's some balance here, but saying that these neurons don't interact with those neurons in this context, that would be structure in the computation that I suspect is still going to be there.
但你仍会发现单个神经元。你之前提到霍勒斯·巴洛对吧?就像那个叫什么来着?神经元假说?不对。
But you would still find single neurons. So you mentioned Horace Barlow earlier, right? And like the What is it? The neuron hypothesis? No.
神经元学说。单一神经元学说。
The neuron doctrine. Single neuron doctrine.
这是指
Which is
可能要归因于,你
probably responsible for, you
明白吗?是的。
know? Yep.
你仍会发现与轮廓相关的神经元对吧?但你不能简单将其视为轮廓神经元。没错。你可以解码出该神经元可能与轮廓有关,但不同于巴洛历史上提出的单一神经元学说,你不会称其为祖母细胞或轮廓细胞。
You'd still find neurons that correlate with the contour, right? But you wouldn't just think of it as a contour neuron. Correct. But you could decode, like, that it probably has to do with the contour from that single neuron. But as opposed to the historical single neuron doctrine of Barlow et al., you wouldn't call that a grandmother cell, a contour cell, for example.
神经元可能具有不同程度的特异性,因此完全可以将这类称为轮廓神经元。真正的问题在于泛化范围是什么
It may have Neurons will have more and less specificity, so it'd be perfectly fine calling that kind of a contour neuron. The real question is, what is the range of generalization
对。
Right.
确实如此,对吧?如果它在各种不同的情境、背景和对比中都能普遍适用,那么可以说那个神经元在某种程度上定位了关于轮廓的信息。你总能找到这样的信息,比如无论神经元A对什么有选择性反应,无论什么能激活它。
That it has. Right? If it generalizes over a wide variety of contexts, backgrounds, contrasts, then yeah, you might say that that neuron has kind of localized some information about contours. You can always find such information A neuron responds selectively to whatever turns it on.
但我觉得称之为轮廓神经元,会让人联想到,如果你破坏它,就看不到轮廓了。或者认为你对轮廓的心理表征完全依赖于那个神经元。这正是人们反对的那种隐含意义。
But I think by calling it a contour neuron, you harken back to like, well, if you kill it, you don't see contours anymore. Or that's like Right. No. Your mental representation of a contour is due to that neuron. That's the sort of implication that people rail against.
是的,完全同意。我认为,即使是那些思考单一神经元学说的人,也不一定认为只有那个神经元在起作用。
Yeah. Yeah. Absolutely. I think, you know, even people who were thinking about the single neuron doctrine weren't necessarily saying that it was only that neuron.
我同意,对吧?是的。
I agree. Right? Yeah.
但你知道,我认为证据表明这些信息是更广泛分布的。所以,找到正确的基础、正确的组合模式——我们寻找的是模式以及模式之间的关系。这引出了一个我认为思考表征时常被忽视的根本点:表征本身是无用的,它们需要相互连接。因此,我们需要从表征和转换的角度来思考大脑。
But you know, I think that the evidence shows that these pieces of information are more widely distributed. So, you know, I think that finding the right basis, the right pattern We're looking for patterns and how the patterns relate to each other. And this brings up, I think, a fundamental point that people who think about representations often neglect, which is that representations are useless by themselves. They need to be connected. So you need to think about the brain in terms of representations and transformations.
你说的表征是什么意思?我刚刚参加了一个关于表征的讨论小组,结果话题被带偏了,因为约翰·克拉考尔把所有东西都联系到心理表征上。那么在这里,你指的是嗯...神经活动的结构吗?这就是你的意思?
What do you mean by representation? I just had a panel to talk about representations, and it kind of got sidetracked because John Krakauer related everything to mental representations. So in this case, you mean the structure of the neural activity? Is that what you mean?
我指的是神经活动与外部世界之间的关系。好的。我听了那期节目。就是你们讨论的那期。我和别人聊过,这确实是个有争议的术语,
I mean the relationship between the neural activity and the external world. Okay. And I listened to that episode that you're talking about. It's just a contentious term,
你知道的。
you know.
这是个有争议的术语。事实上我正在参与另一个生成对抗性合作项目,
It's a contentious term. I actually have another generative adversarial collaboration that I've been working on, about
什么使表征变得有用?
what makes a representation useful?
对,没错。完全正确。不同人对表征这个词的使用方式不同。
Yes. Exactly. Exactly. So different people use the word representation differently.
是啊。
Yeah.
是的。其实没关系,只要明确表达你的意思,我们就能继续讨论:这种情况会发生吗?那种情况会出现吗?
Yeah. And you know, it's fine. Just say what you mean, and then we can go on and say, well, does that thing happen? Does this thing happen?
所以按你的理解,在你的案例中,这是神经活动与世界上发生的某些事情之间的某种关联。
So in your sense, in your case, it's some relation between the neural activity and something happening in the world.
是的,没错。因此我打算现在使用‘表征’这个词,在这个共同意义上指代拥有信息。这其实就是典型的、简单的神经科学风格——实际上是我最初采用的风格,在我被说服改变观点之前所理解的含义。比如,如果你认为某物是表征,当它包含关于世界的信息时
Yeah. So I'm going to use the word representation right now in this joint sense of having information. It's the simple, typical neuroscience style. It's actually the style I started with, the meaning I started with before I was convinced otherwise. Like, if you take the view that something is a representation if it has information about the world
是香农信息还是意义信息?
Shannon information or meaning information?
任何信息都可以。
Any any information.
指向性、意向性信息之类的吗?
Aboutness, intentionality information or something?
我认为这并不
I think it doesn't
重要?好吧。
Really matter? Okay.
嗯,这又是另一个问题。我们暂时就用香农信息这个术语吧。
Well, that's another question. Let's just use the term Shannon information for now.
好的,没问题。
Okay. Sure.
那么就不可能出现错误表征。
Then it is not possible to have a misrepresentation.
哦,明白了。
Oh, okay.
对吧?之前也有人提出过这个观点,但我当时没领会。后来有几位哲学家向我指出时,我才恍然大悟。所以现在你需要以符合世界转换规则的方式来运用这些信息,才能形成正确表征,对吗?
Right? And this is a point that other people have made before, and I didn't appreciate it, but some philosophers actually pointed it out to me and I was like, ah, okay, that makes sense. So now, you need to use that information in a way that is consistent with some rules of transformation in the world for it to be a correct representation. Right?
比如如果我看到左边有图像,然后我转向左边,这就是正确表征——如果我想朝那个方向移动的话。但如果我看到左边的东西却往右移动,那就是错误表征了。就像戴着棱镜眼镜时那样。所以表征本身只是静态存在。
Like if I see an image on my left and I turn to my left, then, if I want to be directed towards it, I have a representation that is used appropriately. But if I see something on my left and I actually move to the right, then that's a misrepresentation. Right? If I'm wearing prism glasses or something. And so a representation by itself is just sitting there.
它需要具备某种功能,需要发挥作用。因此我们既要关注神经元中关于世界的信息,也要研究这些信息如何转化为行为或传递给其他神经元。这种表征与转换的结合体才是我们应该研究的对象。
It needs to have some function. It needs to do something. So we always want to be thinking about information that's there in the neurons about the world, but also how it gets transformed either to behavior or to other neurons. And that joint representation and transformation is what we should be studying.
啊,好吧。
Ah, okay.
因为同一个表征可以通过不同方式转换,这可能意味着它反而会变成一种误表征,对吧?你可以对两种不同表征应用同一种转换,其中一种会有用,另一种则不会。比如,假设我们的转换只是简单相加。这在概率推理中是正确的做法,让我们回到之前那个概念。
Because you can have the same representation transformed in different ways, which would mean that it might be a misrepresentation instead. Right? You can have one transformation applied to two different representations, and one of them would be useful and one of them would not be. So for example, let's say the transformation is just adding things together. That is the correct thing to do for probabilistic inference, bringing back that other idea.
如果你用对数概率表示,那么通过相加对数概率来实现概率相乘。这就是表征与转换之间的匹配,对特定任务很有帮助。但如果你试图直接相加概率来综合证据,那就是错误的。那样得不到正确答案。所以你需要保持这种一致性——神经元活动模式如何与世界关联,以及你如何随时间或空间改变这些模式,以获得对世界有意义的新表征形式。
If you are representing log probability, then you multiply probabilities by adding log probabilities. So that's a match between the representation and the transformation that is helpful for that particular thing. But if you were trying to add probabilities together directly in order to synthesize evidence, that would be a mistake. You would not get the right answer that way. So you need to have this alignment between how the patterns of neurons relate to the world and how you change those patterns over time or over space to get to new formats of things that matter for the world.
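The match and mismatch between representation and transformation described here can be sketched in a few lines of Python. The numbers are hypothetical, purely for illustration: combining independent evidence works out correctly in a log-probability code, but not by adding raw probabilities.

```python
import math

# Two independent pieces of evidence, expressed as probabilities
# (hypothetical numbers).
p_a = 0.2
p_b = 0.3

# Matched representation and transformation: in a log-probability code,
# multiplying probabilities becomes adding log probabilities.
log_joint = math.log(p_a) + math.log(p_b)
joint = math.exp(log_joint)   # recovers p_a * p_b = 0.06

# Mismatched transformation: adding raw probabilities is not
# equivalent to combining evidence, and gives a different number.
wrong = p_a + p_b             # 0.5, not the joint evidence
```

The same addition operation is correct under one representational format and wrong under another, which is the alignment point being made.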
我们是怎么从生态神经学谈到这里的?我们是不是在反对...
And how did we get here from ecological neuroscience? Did we anti-
表征问题。对吧?就像你问我:我们怎么能...你当时是说
representation? Right? Like I think you asked me, hey, how could... You were
讨论存在表征结构的中间地带。对。但它既不完全通用也不完全脆弱。所以这不是直接感知。
talking about the middle ground where there is representational structure. Yeah. But it's not completely general and it's not completely brittle. So that's not direct perception.
确实如此。这是反对直接感知的观点,但更像是...我认为在某些情况下,当你找到正确的特征组合时,它就能正确运作。你可以说这是直接感知,或者说这是针对特定任务的有效计算。现在这就成了语义问题了。
Yeah. It's against direct perception, but it's something like... I think there are some cases where you find the right combination of features and it does the right thing. You can say, oh, that's direct perception. Or you could say, oh, this is a useful computation for this particular task. Now it becomes a matter of semantics.
对吧?这取决于你如何定义‘直接’。所以我认为不必过分纠结吉布森所说的直接感知具体指什么。但我们可以发现,神经元中存在与外界时空模式相对应的时空模式。
Right? It depends on what direct means. So I don't think it's all that important to go into tremendous detail about what Gibson might have meant by direct perception. But I think we can find that there are spatiotemporal patterns in neurons that relate to spatiotemporal patterns in the world.
这些模式在时空中的演化方式就是我们所进行的计算。这是通过动力学实现的计算。正是这种机制将输入模式(若愿意可称之为表征)与我们的使用和行为方式耦合起来,让我们能够判断——用克拉考尔的话说——我们是否拥有对未来世界状态的预测性心理表征,并以符合目标的方式作出反应。
And the way that those patterns evolve over time and space is the computation that we do. This is computation by dynamics. And so this is the thing that couples the patterns of input, which we can call representations if we want, to the way that we use them and behave upon them. That is what lets us know whether, in Krakauer's sense, we have a mental representation: predictions about the future states of the world, reacting in a way that's consistent with where we want to go.
这与我们早先讨论的逆向理性控制理论相契合:动物未必针对当前任务进行优化,而是针对某种在任务视角下次优、但从进化适应性角度理性优化的目标进行行为调整。
Mesh this with the idea, your work that we were talking about way earlier, on inverse rational control, where an animal is not necessarily optimizing for the task at hand. It is optimizing for something that's suboptimal relative to the task, but rational in the sense that it's optimizing for something that it has evolved to optimize for.
是的。可以说如果它们在虚假世界中表现理性,那么它们就是在误读感官证据——它们接收到的是一种扭曲的感官输入。
Yeah. You could say that if they're behaving rationally in this false world, right, then they're misrepresenting their sensory evidence. So they're getting one form of sensory evidence
那将被视为错误。这正是我想回归的观点——你会认为那是一种认知偏差。
That would be an error. That's what I wanted to bring it back to, the error. You would consider that an error.
从外部世界关联性角度我会认为这是错误,但其内部逻辑自洽。可以说这是理性的表现,因此我喜欢区分‘理性’与‘最优’这两个概念。
I would consider that an error in the sense of its relation to the external world. But it's self-consistent, right? You might say it's rational. So that's why I like to distinguish rational from optimal. Okay.
其他人对‘理性’一词有不同的使用方式。
Other people use the word rational differently.
是的,这正是我想说的。有些人会把它们混淆,对吧?是的,或者在某些人的定义中它们可能被混淆。
Yeah. That's what I was going to say. Some people confound them, right? Yeah. Or they could be confounded in some people's definitions.
没错。所以有些人谈论有限理性,指的是那些你知道的、可能带有一些迷信或其他近似判断的情况。比如在抽样情境下,Ed Vul有篇论文《One and Done》(一锤定音),基本上描述了在某些情况下,行为与对概率分布进行单次抽样并据此行动是一致的。这在时间极端受限的条件下是理性的。
Yep. And so some people talk about bounded rationality, which covers things where you might have some superstitions or some other approximations that you make. Like in the sampling context, Ed Vul has this paper, One and Done, where basically there are some cases where behavior is consistent with taking a single sample of a probability distribution and just acting upon that. That's rational, but under the bound that you have extreme time constraints.
对。
Right.
或者其他类型的资源限制。所以你看,我认为在很多这类讨论中,我们对某些词汇的含义有共同理解。当我们发现冲突时,就需要梳理清楚:我们是否对这些词汇的冲突含义有相同理解?
Or some other kind of resource constraints. So you know, I think in a lot of these discussions, we have a shared understanding of what some words mean. When we find some conflict, now we have to go through and say, all right, do we mean the same thing by these conflicting words?
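The "One and Done" strategy mentioned a few turns back can be sketched as a toy simulation. The task and numbers here are hypothetical, not from Vul and colleagues' actual experiments: the agent draws a single sample from an already-computed posterior and commits to it.

```python
import random

random.seed(0)

# Hypothetical two-alternative task: suppose the posterior probability
# that option A is the better choice is 0.7.
P_A_BETTER = 0.7

def one_and_done():
    """Draw a single posterior sample and act on it immediately,
    instead of integrating over the full distribution."""
    return "A" if random.random() < P_A_BETTER else "B"

# Over many trials the single-sample agent probability-matches: it picks
# A about 70% of the time, rather than always picking the best option.
n = 10_000
frac_a = sum(one_and_done() == "A" for _ in range(n)) / n
```

Under extreme time constraints, stopping after one sample is the rational thing to do in the bounded sense discussed above, even though it is suboptimal relative to using the whole distribution.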
但我们不想成为哲学家。我们不希望所有讨论都陷在语义里,我们想要推进实质进展。
But we don't want to be philosophers. We don't want we don't want it all to be about that. We want to move forward.
没错。所以我们实际上是在向前推进的。我们默认彼此理解对方的表述,直到出现分歧。这时我们会尝试调和术语的使用方式,也许最终有人会说:我不希望你那样使用那个术语。
Yeah. Exactly. So I think we do move forward. We assume that we know what each other is talking about until we don't. Then we try to reconcile the way we're using terms, and then maybe somebody says, I don't want you to use that term that way.
我会说:好吧,我们可以达成共识或保留分歧,但让我们继续讨论实质内容。
I say, okay, fine. Well, we can agree or not, but let's go on to the substance.
对,没错,就是这样。好的,扎克。我们已经花了很多时间,却除了逆向理性控制外,还没真正讨论过你的一些项目。简单列举几个,你知道,你最近的研究包括:研究了注意力如何根据任务限制和概率等因素在高与低之间波动。
Right. Yeah, exactly. Okay, Zach. We've spent a lot of time already, and we haven't really talked, except for inverse rational control, about some of your projects. Just to list some off: in some of your recent work, you've studied how attention fluctuates, the pattern of high and low attention, based on the constraints of the task and the probabilities, etc.
我们如何控制行为,那句话怎么说来着?‘动得多,想得少’,嗯,我笔记里有。你还研究了递归图形概率模型,虽然我们无法逐一探讨,但有没有什么你想重点强调的?有没有让你感到最自豪或最欣喜的成果?
How we control what we do. What is the phrase? I have it in my notes here: moving more to think less. You have work on recurrent graphical probabilistic models. We can't go through all of these, but is there something you want to highlight? Is there something you're most proud of, that you are most joyous about?
毕竟你涉猎广泛。所以我想让你自己来挑选,你觉得最有趣或最有意思的部分。
Because you are a person of many pursuits. So, I want to leave it up to you to sort of highlight, what you think is most fun or interesting.
这就像让你选哪个是——
It's like asking you to pick what's
我懂。‘你最喜欢的颜色是什么?’对吧。或者‘你最喜欢的孩子’。
I know. What's your favorite color? Yeah. Your favorite child.
哦,可以说最喜欢的颜色。我最喜欢绿色。
Oh, I can say favorite color. My favorite color is green.
好的,没问题。其实说出最喜欢的孩子也很容易,但我不会透露的。开玩笑的啦。
Okay. For sure. I can say my favorite child easily too, but I won't reveal it. No, I'm just kidding.
我说不出口。
I can't say it.
是啊。我最近最喜欢的项目是哪个?我我真的很享受做这些生态心理学神经科学的研究。我觉得
Yeah. Which is my favorite project lately? I'm really happy doing this ecological psychology neuroscience stuff. I think that
我非常惊讶,我之前完全不知道这件事。我真的很惊讶
I was super surprised. I didn't know about that, and I was super surprised
这方面零发表。七月份才刚启动。
zero publications on it. It just started in July.
好吧。
Okay.
而且整合这些花了漫长而缓慢的过程。团队规模很大,有20人。但其中有六个理论小组,这是非常理论导向的。
And it was a long, slow process of putting this together. And the team is big: it's 20 people. But there are six theory teams within it. It's very theory-led.
怎么加入?好吧。
How do I join? Alright.
敬请期待。是的,敬请期待。你知道,我认为未来会有机会扩展这些领域,
Stay tuned. Yeah, stay tuned. You know, I think there will be opportunities for broadening these,
还算公平。
Fair enough.
这将是一个长达十年的项目。
It's gonna be a ten year project.
天啊。
Holy cow.
是的,是的。所以这会花费我们一些时间。那个项目非常有趣。我们已经讨论过了。
Yeah. Yeah. So this will take us a while. That one's a lot of fun. We've already talked about it.
我对我们讨论的那个动态图概念感到非常兴奋,就是我会发给你那个形状,那个小四面体。酷。实际上,如果你为此建立一个图形模型,用一个圆圈代表每个变量,再用一个方块表示这些变量如何相互作用,最终得到的图像看起来就像《回到未来》里的通量电容器。
I'm really excited about this dynamic graph thing that we talked about. I'll send you that shape, the little tetrahedron. Cool. In fact, if you make a graphical model of this, where you have one circle for each variable and then a square for how those variables are interacting, you actually end up with a picture that looks like the flux capacitor from
嗯。
Mhmm.
回到未来。
Back to the Future.
是的。它是
Yeah. It's
就像这三样东西汇聚到中心。我对于用这个主题来解释一大堆事情感到兴奋。我称之为统计晶体管。
like these three things coming into the middle. And I am excited about using that motif to explain a whole bunch of things. I call it the statistical transistor.
不错。
Nice.
因为一旦有了第三个变量作为其他两个变量是否交互的门控,这个结构图的表达能力就会大幅提升。这解释了低级视觉感知中一些有趣的性质。我认为它能解释我们在改变认知模式时的许多结构。有趣的是,用于这三个交互变量的基本数学方程是x乘以y乘以z。值得注意的是,从中会显现出两点。
Because once you have that third variable gating whether the other two are interacting, the expressive power of that structural graph becomes vastly larger. And so it explains some interesting properties just in low-level visual perception. I think it can explain a whole lot of structure in the way that we're changing cognitive patterns. Interestingly, the fundamental math equation that you use for these three interacting variables is x times y times z. And remarkably, two things show up out of this.
首先,这正是你在Transformer中得到的。比如Transformer,目前支撑大型语言模型的非常流行的机器学习架构。这种乘法门控有着悠久的历史,可以追溯到六十年代,在计算机科学领域算是古老的,人们当时使用sigma pi网络,其中sigma是求和,pi是乘积。因此,将x乘以y乘以z实际上提供了很多有趣的表达能力。Transformer以一种略微特定的方式使用了这一点。
First, that is what you get in transformers. Like transformers, the very popular machine learning architecture that underlies large language models at the moment. And this kind of multiplicative gating has an ancient history: if you go back to the sixties, which is ancient in a computer science sense, people were using sigma-pi networks, where sigma is the sum and pi is the product. And so multiplying x times y times z actually gives you a lot of interesting expressive power. Transformers use that in a slightly specific way.
但根本上,其核心单元是将这些不同元素相乘,并将其用作所谓的注意力机制,这与生物注意力在门控方面有很多相似之处。我对此感到兴奋的另一点是,当你赋予神经元比传统神经网络更多的能力时,这个主题会自然浮现。传统的人工神经网络中,你取一个神经元,取它的所有输入,对这些输入进行加权求和,然后通过一个非线性函数传递。这就是神经元的响应。对吧?
But fundamentally, the unit is that you're multiplying these different elements together and using that as what they call attention, which has a lot of similarities with biological attention in terms of gating things. And the other thing that I really like about this, that I find exciting, is that the motif emerges naturally when you give neurons more capabilities than traditional neural networks. In traditional artificial neural networks, you take one neuron, you take all of its inputs, you take a weighted sum of all those inputs, and then you pass it through a nonlinear function. That's your neuron's response. Right?
而这具有极其宝贵的价值。
And this has been stupendously valuable.
只要权重相同,每次通过时它都会得到相同的输出。输入始终不变。
And it gets the same output every time, as long as the weights are the same and it gets the same input.
没错。其他每个神经元也都是同类型的非线性函数,只是对可能不同的输入集进行不同的加权求和。我们称这种加权和为输入的投影。如果允许它学习一个非线性函数来映射输入到输出,但你不是基于一个加权和,而是两个加权和来学习。那么,自然而然地就会涌现出乘积,即x乘以y。
Yeah. And every other neuron is going to be the same kind of nonlinear function that just takes a different weighted sum of maybe a different set of inputs. So the weighted sum is what we call a projection of the input. If you allow it to learn a nonlinear function mapping inputs to outputs, but you learn it based on not just one weighted sum but two weighted sums, then what automatically emerges, generically, is a product: x times y.
这个操作正是Transformer中的关键新要素——乘积运算。要知道,人们在1960年代就曾涉猎过这个概念。当你赋予神经元这种生物学中天然具备的能力时,它就会自然浮现。
That very operation shows up as the critical new ingredient in transformers, this product, and it was something that people dabbled with in the nineteen sixties. It emerges naturally when you give neurons the kind of power that they automatically have in biology.
嗯。
Mhmm.
所以在生物学中,神经元的树突并非仅传递单一输入。你有顶端树突,有基底树突,结构更为复杂,但至少具备这种特性。只要赋予神经元这种简单的生物结构,所有这些现象就会涌现。你会得到注意力机制,得到门控功能,这些统计晶体管般的东西就会自然而然地免费呈现。
So in biology, the dendrites don't just come in and give one input to the neuron. You have an apical dendrite, you have a basal dendrite. There's more structure there, but at least you have that. And so if you give neurons that very simple biological structure, all these things emerge. You get attention, you get this gating, you get these statistical transistors just popping out for free, generically.
因此在我看来,这连接了微观电路层面——比如通过简单地赋予神经元顶端和基底树突,就能在众多神经元间轻松共享的小规模神经元特性。所有优秀的计算属性都由此产生。这之所以美妙,是因为它将简单底层的事物与强大抽象的计算能力联系了起来。
So this, to me, is connecting the low-level microcircuit, a small neuron-scale thing that you could share across lots of neurons easily just by giving them apical and basal dendrites. And all these good computational properties emerge out of that. So that for me is really beautiful, because it connects something very simple and low-level to something very powerful, abstract, and computational.
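The two motifs in this passage can be sketched minimally, assuming nothing beyond what the transcript says: a third variable that gates whether two others interact at all (the "statistical transistor", out = x · y · g), and a toy two-compartment unit that computes two projections of its input (think apical vs. basal) and combines them multiplicatively, sigma-pi style, rather than as a single weighted sum plus nonlinearity. All weights and inputs here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# The "statistical transistor": a third variable g switches whether
# x and y interact at all.
def transistor(x, y, g):
    return x * y * g

gate_open = transistor(2.0, 3.0, 1.0)    # interaction expressed: 6.0
gate_closed = transistor(2.0, 3.0, 0.0)  # interaction suppressed: 0.0

# A toy two-compartment unit: two weighted sums (projections) of the
# same input vector, one per dendritic compartment.
x = rng.normal(size=8)
w_apical = rng.normal(size=8)
w_basal = rng.normal(size=8)

apical_drive = w_apical @ x
basal_drive = w_basal @ x

# The apical drive multiplicatively gates the basal drive, so the same
# basal input can be passed through or shut off.
output = np.tanh(apical_drive) * basal_drive
```

This is only a sketch of the general idea; transformer attention uses the same kind of product of projections, arranged in its own specific way.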
你之前提到的某件事,让我可以联系起来。但几个月前在华盛顿特区的NeuroAI会议上你说的某句话让我很惊讶。不过现在听你这么说,对于缺失的部分,我倒不那么意外了。我们需要什么?我们需要资金来做什么?
Something that you said, so I'll tie this in. Something you said at the NeuroAI conference in Washington, DC a few months back surprised me. But now, with you saying this, it doesn't surprise me as much, in terms of what's missing. What do we need? What do we need funding for?
我们需要实现什么目标?你当时说我们本质上需要一个突触组。就是说,我们需要了解所有突触的连接强度,这种极其微观的细节。而且存在阻力,比如我们不想深入到电子层面,不想研究离子通道,你却想深入到夸克级别。
What do we need to accomplish? And you said we need a synaptome, essentially. Like, we need to learn all the connection strengths of the synapses, which is such a low-level detail. And there's resistance to that: we don't want to go all the way down to electrons, we don't want to go down to ion channels. You want to go down to quarks.
我本来避免说这个词,因为我总提夸克。但你想研究突触强度层面——正如你刚才兴奋谈到的这些底层实现过程,本质上不同的加权求和,大致分为生物区室如顶端树突和基底树突。我们知道实际情况远比这复杂。但仅就这两个子集而言,像Matthew Larkum等人就在研究近端与远端树突的区别。
I was avoiding saying it because I always say quarks. But you want to go down to the synapse-strength level, which connects to what you just said about being excited about some of these low-level implementation processes: different weighted sums, essentially, separated into biological compartments, apical and basal dendrites. And we know it's way more complex than that. But even just with those two subsets, people like Matthew Larkum do work on apical versus proximal and distal dendrites. Yeah.
所以现在我稍微理解你为什么需要这些了,这两者是直接相关的吗?也就是说,你之所以想要那种突触层面的研究,是因为你看到了这些底层机制与高层功能(比如本例中的注意力)之间的联系,你在探索中试图建立这种关联?
So that makes a little bit more sense to me, why you would want that. Are those two directly related? Like, is that why you want that sort of synaptic measurement: because you see these low-level ways in which the higher-level functions, let's say attention in this case, arise, and you are connecting those in your pursuits?
确实存在关联,但我说需要突触组之类的时候,心里想的不是这个。我真正想表达的是:学习是我们大脑的基本功能之一,但我们无法量化测量它。
There are connections there, but that's not what I had in mind when I said we need the synaptome or whatever. What I really had in mind was that one of the fundamental things that our brains do is learn, but we can't measure it.
好吧,确实。
Okay. Yeah.
我的意思是,目前还无法大规模测量。
I mean, not at scale.
对。
Right.
让我们从单一神经元学说转向群体学说,并至少理解神经元模式如何与驱动行为和认识世界相关的某些要素,关键在于我们能够大规模测量它们。现在我们可以测量——我认为目前的世界纪录是在同一时间测量一只动物执行单一任务时的百万个神经元。这让我们对大脑中的存在有了更丰富的理解。明白吗?关于大脑在进行何种计算。
The thing that has taken us from the single-neuron doctrine to the population doctrine, and let us understand at least some elements of how patterns of neurons are related to driving behavior and understanding the world, is that we can measure them at scale. We can measure now, I think the world record is a million neurons at the same time, in one animal doing one thing. And that gives us a much richer understanding of what is there. Right? Of what the brain's computations are doing.
所以早前我们讨论时提到,也许关于大脑我们唯一能真正理解的——虽然我个人不同意这种观点,但持此看法的人认为——我们只能了解学习规则和大脑的目标。除非我们能观察到突触在自然状态下的变化,否则无法看到学习规则的实际运作。实际上我曾与人合作一个小项目,尝试在简单案例中仅通过神经活动推断学习规则,当时的学习规则确实非常简单。
And so when we talked earlier about how maybe the only things we can really understand about the brain (although I don't agree with that claim, but some people think this) are the learning rule and the objectives of the brain: we will not be able to see that learning rule operating in its natural context until we can see the synapses changing. I actually did a small project with somebody trying to infer the learning rule from just neural activity, in a simple case where the learning rule was really simple.
嗯。
Mhmm.
这需要从神经活动中推断突触权重,这极其困难。事实上多数情况下根本不可能,但假设存在因果扰动的情况下,要弄清连接关系已经非常困难。而现在你不仅要了解连接强度,还要知道它们如何变化,以及这种变化如何依赖于其他神经活动。
And it required inferring synaptic weights from activity, which is really hard. In fact, in many cases it's impossible. But even if you have causal perturbations, it's really hard to figure out what the connections are. And now you have to learn not just what the strengths are, but how they change, and how that change depends on other activity.
是啊。
Yeah.
这超级困难。但如果像Ehud Isacoff等人开始研究的那样——他们正着手测量突触——我们就有机会大规模理解学习规则的运作。我渴望测量的那种快速学习是单次学习。比如你第一次知道什么是汽车,第一次看到敞篷车时...
It's super hard. But if we had direct measurements of the synapses, as people like Ehud Isacoff are starting to make, then we have a chance of understanding what the learning rule is doing at scale. You know, the kind of rapid learning that I would love to measure is one-shot learning. The first time that you know what a car is, the first time you see a convertible. What all of
突然
a sudden
变化?Bobby Kasthuri,他参与了一些现代最早期的连接组研究,他提出我们可以利用鸟类印记作为一个模型系统,来发现那些导致重大计算后果的快速、大规模的突触变化。所有这些都需要观察突触。这是我过去未曾涉足的领域,但在我看来,这是神经科学中最大的空白——我们还不理解学习是如何运作的。
changes? Bobby Kasthuri, who was involved in some of the earliest connectomes of the modern age, was suggesting that we could use bird imprinting as a model system to find really rapid, huge-scale synaptic changes that cause major computational consequences. So all of that requires looking at the synapses. And that's not something I've done in the past, but to me this is the biggest gaping hole in neuroscience: that we don't understand how learning works.
我们用于学习的所有机器,如今它们都在进行梯度下降。基本上这就是它们所做的。而大脑并不进行梯度下降。
All of the machines that we use for learning are doing gradient descent these days. That's basically what they do. And the brain doesn't do gradient descent.
对。
Right.
也许它近似于此。这些近似是什么?约束条件又是什么?我们不知道。我们不知道是因为我们还无法测量它。
Maybe it approximates it. What are the approximations? What are the constraints? We don't know. And we don't know because we can't measure it yet.
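For contrast with the discussion above, here is gradient descent in its most minimal form, on a one-dimensional toy loss. This is a generic sketch of the machine learning update rule being referenced, not a model of any synaptic mechanism.

```python
# Gradient descent on the toy loss L(w) = (w - 3)^2: repeatedly step
# the parameter downhill along the negative gradient.
def grad(w):
    return 2.0 * (w - 3.0)   # dL/dw

w = 0.0        # initial parameter
lr = 0.1       # learning rate
for _ in range(100):
    w -= lr * grad(w)        # step downhill on the loss

# w converges to the minimum at 3.0. The open question in the
# conversation is what synaptic update rule, if any, the brain
# uses to approximate something like this.
```

Each update shrinks the error by a constant factor here, so after a hundred steps the parameter sits essentially at the minimum; the empirical question is whether and how synaptic plasticity approximates this kind of rule under biological constraints.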
但如果我们知道学习算法,或者即使近似方法最终结果相同,为什么还需要了解底层的实现细节呢?
But why do we need to know the low-level implementation details if we know the learning algorithms, or even approximations that have the same end result?
哦,那样的话我们确实可以。我是说,如果你能找到一种我们不需要了解底层细节的方法,那也很好。但我们确实认为,学习的机制依赖于突触,突触是进行大量学习的部分。当然,你知道,我绝对是支持采取更抽象一步的观点的。
Oh, then we could skip them. I mean, if you can find a way where we don't need to know the low-level details, that would be fine. But we do think that the machinery depends on the synapses; the synapses are the things that do a lot of the learning. But you know, I'm definitely an advocate of taking a step more abstract.
是啊,所以我才感到惊讶。
Yeah. That's why I was surprised.
没错。我是说,我会非常乐意尝试理解这一点。虽然我不太确定这个过程会很容易,但假设我们有一群神经元通过突触物理连接,接下来该怎么做?你需要从中抽象出一种图结构推理算法,这个算法存在于更抽象的层面上。
Yeah. I mean, I would be very happy trying to understand that. I'm not sure I have a lot of confidence that it's going to be easy to do. But let's say we have a population of neurons that are physically connected with all these synapses. Then what do you do? You abstract from that some kind of graph-structured inference algorithm that's there at a more abstract level.
与此同时,在并行层面上,既有低层次的突触更新,也有你从那个抽象图中提取出的图结构更新。如果能只进行抽象图的更新就太好了,那样或许能更好地理解事物的运作方式。
And now, in parallel, you have the low-level synaptic updates and you have the updates in that abstract graph that you extracted. Right. If you could just do the abstract graph updates, that would be great. Maybe that's a good way of understanding the way things work.
又或者可能需要深入到低层次的突触层面。我不确定。但有一点我很清楚:当我们测量新事物时,总能获得新的洞见。正因如此,我认为我们正处在实现这一目标的临界点,这必将带来大量新的理解。
Or maybe you need to go down to the low level, to the synapses. I don't know. But one thing I do know is that when we measure new things, we find new insights. And so that's why I was saying I think we're right on the cusp of being able to do this, and I'm sure it's going to provide a lot of new insight.
也许我们可以用这个想法作结,你也可以思考一下——因为我待会儿得去做些神经科学研究了。我们得再邀请你来节目,毕竟今天都没聊到你的具体研究。虽然讨论了很多支撑你工作的基本原理,我会在节目注释里列出你的论文和实验室。有趣的是,过去反对研究实现层面的论点该怎么表述呢...
Maybe we'll end with this thought, and you can reflect on it as well, because I actually have to go do some neuroscience here in a minute. And we'll have to have you back on, because we didn't even talk about your work. I mean, we talked about a lot of the principles which undergird all your work, and I'll point to your papers and your lab in the show notes. But the interesting thing, so, the arguments against studying the implementation level in the past... how do I want to phrase this?
我想听听你的看法。现代神经科学有了神经网络工具、概率图模型这些理论工具,当我们观测突触强度变化时,能将其与近几十年发展完善的理论实体联系起来。而过去测量突触强度时,只能假设它们导致学习,实验测量后仍存在巨大鸿沟。或许我们正进入一个能跨层级弥合这些鸿沟的时代,更能理解底层实现细节如何影响高层特性。你对我刚才说的有什么见解?
I want to see what you think about this. In modern neuroscience, we have these neural networks to work with, these probabilistic graphical models, these theoretical tools. So when we look at, let's say, synaptic strength changes and measure them, we can actually relate them to theoretical entities that have become better developed over the past couple of decades. Whereas in the past, you were sort of measuring synaptic strength with the assumption that it leads to learning; you could measure it experimentally, but then there was still this huge gap. Maybe we're in an era where we're able to actually go across levels, close those gaps a little better, and understand how the low-level implementation details matter for the higher-level properties. How would you reflect on what I just said?
别
Don't
我认为这将是一个值得庆祝的重大理由。
I would say that would be a great cause for celebration.
是啊。
Yeah.
绝对同意。我我觉得
Absolutely. I I think that
不过你真心赞同这个观点吗?
Do you agree with it though?
对,对。我是说,你看,进步是逐步累积的。我认为我们的认知正在不断深化,但仍有许多根本性的事物尚未被理解。
Yeah. Yeah. I mean, you know, progress is moving gradually. I think we're gaining more insight. There are some really fundamental things that we don't understand yet.
有些领域我们确实有所认知——至于算多算少取决于评判标准。前几天我和康拉德·科丁(Konrad Kording)讨论时,他坚称我们一无所知。我回应说至少我们掌握了些许知识。接着问题就变成了:
There are some other things that, you know, I think we do understand. And whether that's a lot or a little is going to depend on your judgment. I was having a discussion with Konrad Kording the other day, and he was saying we don't understand anything. I was like, well, I think we understand some things. And then, you know, the question became:
我们真的理解大脑吗?这个嘛
Do we understand the brain? Well
不。好吧,这到底是什么意思呢?是的。
No. Well, what does that even mean? Yeah.
是的。所以我认为我们对大脑确实有一些了解。对吧?我觉得我们理解了。它不像过去那样是一个巨大的谜团,尽管在很多方面仍充满未知,但已不再是完全的神秘领域。
Yeah. So I think we do understand some things about the brain. Right? It's still a mystery in a huge number of ways, but it's not a total mystery anymore, the way it used to be.
它不仅仅是一团纠缠的脂肪和纤维。我们知道存在模式,知道有突触。突触会变化,这些模式会影响我们的行为。
It's not just a big ball of tangled fat and string. So, you know, we know that there are patterns. We know that there are synapses. Synapses change. The patterns have influences on our behavior.
存在反馈循环。但还有很多我们不明白的——我们不懂符号系统,不理解语言机制,不清楚许多动态过程,也不了解记忆等基础模型或大部分计算原理。不过我们有些线索。
There are feedback loops. There's a lot that we don't understand: we don't understand symbols, we don't understand language, we don't understand many of the dynamics, we don't understand some fundamental things like memory, or most of the computations. But we have some hints.
我想说我们正在探索途中,而海量数据加速了这一进程。是的,我认为当下是个绝佳时期——各种因素惊人地汇聚,造就了研究良机:一方面我们拥有来自AI领域的高效分析工具,另一方面前所未有的神经技术为这些模型提供了研究素材。
I would say we're on our way, and it's been facilitated by massive data. Yeah. So I think this is a great time. We have such an amazing confluence of factors right now that makes this a really good time. One is that we have analysis tools with very high power from all the AI stuff, and we have incredible neurotechnology, which is giving those models something to chew on.
所以现在既是理论家的黄金时代,也是与那些收集史诗级新数据的研究者合作的理想时机。
So it's a good time to be a theorist and a good time to be partnering up with people collecting some of this epic new data.
好的,扎克。那么我们将继续探索大脑及其功能的奥秘,沿途也会遇到些不那么神秘的事物。感谢你的时间,校园里见。我们还得再请你来节目。
Alright, Zach. And so we carry on into the mystery of the brain and its functions, with some less mysterious things along the way. So thank you for your time. I'll see you around campus. We'll have to have you back on.
我很感激。
I appreciate it.
太棒了,保罗。非常感谢。
This was great, Paul. Thanks so much.
《Brain Inspired》由在线出版物The Transmitter提供支持,旨在传递有用信息、见解和工具,搭建神经科学领域的桥梁并推动研究进展。访问thetransmitter.org,探索由记者和科学家撰写的最新神经科学新闻与观点。如果您重视《Brain Inspired》,请通过Patreon支持我们,获取完整版节目、加入我们的Discord社区,甚至影响我邀请谁来参加播客。前往braininspired.co了解更多信息。您听到的音乐是一段舒缓的爵士蓝调,由我的朋友Kyle Donovan演奏。
Brain Inspired is powered by The Transmitter, an online publication that aims to deliver useful information, insights, and tools to build bridges across neuroscience and advance research. Visit thetransmitter.org to explore the latest neuroscience news and perspectives written by journalists and scientists. If you value Brain Inspired, support it through Patreon to access full-length episodes, join our Discord community, and even influence who I invite to the podcast. Go to braininspired.co to learn more. The music you hear is a little slow, jazzy blues performed by my friend, Kyle Donovan.
感谢您的支持。下次见。
Thank you for your support. See you next time.
关于 Bayt 播客
Bayt 提供中文+原文双语音频和字幕,帮助你打破语言障碍,轻松听懂全球优质播客。