Traditionally, dot products are something that's introduced really early on in a linear algebra course,
typically right at the start.
So it might seem strange that I pushed them back this far in the series.
I did this because there's a standard way to introduce the topic, which
requires nothing more than a basic understanding of vectors,
but a fuller understanding of the role that dot products play in math
can only really be found under the light of linear transformations.
Before that, though, let me just briefly cover
the standard way that dot products are introduced,
which I'm assuming is at least partially review for a number of viewers.
Numerically, if you have two vectors of the same dimension,
two lists of numbers with the same length,
taking their dot product means
pairing up all of the coordinates,
multiplying those pairs together,
and adding the result.
So the vector [1, 2] dotted with [3, 4]
would be 1 x 3 + 2 x 4.
The vector [6, 2, 8, 3] dotted with [1, 8, 5, 3] would be
6 x 1 + 2 x 8 + 8 x 5 + 3 x 3.
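To make the recipe concrete, here is a minimal Python sketch of the pair-up, multiply, and add computation just described. The function name `dot` is only an illustrative choice, and vectors are assumed to be plain Python lists.

```python
def dot(v, w):
    """Pair up coordinates, multiply each pair, and add the results."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(v, w))


print(dot([1, 2], [3, 4]))              # 1*3 + 2*4 = 11
print(dot([6, 2, 8, 3], [1, 8, 5, 3]))  # 6 + 16 + 40 + 9 = 71
```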
Luckily, this computation has a really nice geometric interpretation.
To think about the dot product between two vectors v and w,
imagine projecting w onto the line that passes through the origin and the tip of v.
Multiplying the length of this projection by the length of v, you have the dot product v・w.
Except when this projection of w is pointing in the opposite direction from v,
that dot product will actually be negative.
So when two vectors are generally pointing in the same direction,
their dot product is positive.
When they're perpendicular, meaning
the projection of one onto the other is the 0 vector,
the dot product is 0.
And if they're pointing in generally opposite directions, their dot product is negative.
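As a rough numerical check of this picture, the sketch below computes the dot product two ways for 2D vectors: once with the coordinate formula, and once as (signed length of w's projection onto v) times (length of v). It assumes the signed projection length can be written as the length of w times the cosine of the angle between the vectors; the helper names are made up for illustration.

```python
import math


def dot(v, w):
    """Numerical dot product: multiply matching coordinates and add."""
    return sum(a * b for a, b in zip(v, w))


def signed_projection_length(w, v):
    """Signed length of the projection of w onto the line through v (2D only).

    Uses only lengths and angles, not the coordinate formula. The result is
    positive when the projection points along v and negative when it points
    the opposite way.
    """
    angle_between = math.atan2(w[1], w[0]) - math.atan2(v[1], v[0])
    return math.hypot(w[0], w[1]) * math.cos(angle_between)


def geometric_dot(v, w):
    """(Signed length of w's projection onto v) times (length of v)."""
    return signed_projection_length(w, v) * math.hypot(v[0], v[1])


v = (3.0, 1.0)
# One w pointing roughly along v, one perpendicular to v, one roughly opposite.
for w in [(1.0, 2.0), (-1.0, 3.0), (-2.0, -1.0)]:
    print(w, dot(v, w), geometric_dot(v, w))
```

Up to floating-point rounding, the two columns agree, and the sign follows the same-direction, perpendicular, and opposite-direction cases.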
Now, this interpretation is weirdly asymmetric:
it treats the two vectors very differently,
so when I first learned this, I was surprised that order doesn't matter.
You could instead project v onto w,
multiply the length of the projected v by the length of w,
and get the same result.
I mean, doesn't that feel like a really different process?
Here's the intuition for why order doesn't matter:
if v and w happened to have the same length,
we could leverage some symmetry,
since projecting w onto v,
then multiplying the length of that projection by the length of v,
is a complete mirror image of projecting v onto w, then multiplying the length of that projection by the length of w.
Now, if you “scale” one of them, say v, by some constant like 2,
so that they don't have equal length,
the symmetry is broken.
But let's think through how to interpret the dot product between this new vector 2v and w.
If you think of w as getting projected onto v,
then the dot product 2v・w will be
exactly twice the dot product v・w.
This is because when you “scale” v by 2,
it doesn't change the length of the projection of w,
but it doubles the length of the vector that you're projecting onto.
But, on the other hand, let's say you're thinking about v getting projected onto w.
Well, in that case, the length of the projection is the thing that gets “scaled” when we multiply v by 2.
The length of the vector that you're projecting onto stays constant.
So the overall effect is still to just double the dot product.
So, even though symmetry is broken in this case,
the effect that this “scaling” has on the value of the dot product is the same
under both interpretations.
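The scaling argument can be spot-checked numerically. The sketch below, restricted to 2D and using made-up helper names, evaluates both interpretations (w projected onto v, and v projected onto w) before and after doubling v. The sample vectors are arbitrary.

```python
import math


def signed_projection_length(a, b):
    """Signed length of a's projection onto the line through b (2D only)."""
    angle_between = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])
    return math.hypot(a[0], a[1]) * math.cos(angle_between)


def project_w_onto_v(v, w):
    """Interpretation 1: (projection of w onto v) times (length of v)."""
    return signed_projection_length(w, v) * math.hypot(v[0], v[1])


def project_v_onto_w(v, w):
    """Interpretation 2: (projection of v onto w) times (length of w)."""
    return signed_projection_length(v, w) * math.hypot(w[0], w[1])


v = (3.0, 1.0)
w = (1.0, 2.0)
two_v = (2 * v[0], 2 * v[1])

# Both interpretations agree, and both double when v is scaled by 2.
print(project_w_onto_v(v, w), project_v_onto_w(v, w))
print(project_w_onto_v(two_v, w), project_v_onto_w(two_v, w))
```

In the first interpretation the doubling comes from the length of 2v; in the second it comes from the projection itself getting twice as long.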
There's also one other big question that confused me when I first learned this stuff:
why on earth does this numerical process of matching coordinates, multiplying pairs, and adding them together
have anything to do with projection?
Well, to give a satisfactory answer,
and also to do full justice to the significance of the dot product,
we need to unearth something a little bit deeper going on here,
which often goes by the name "duality".
But before getting into that,
I need to spend some time talking about linear transformations
from multiple dimensions to one dimension,
which is just the number line.
These are functions that take in a 2D vector and spit out some number.
But linear transformations are, of course,
much more restricted than your run-of-the-mill function with a 2D input and a 1D output.
As with transformations in higher dimensions,
like the ones I talked about in chapter 3,
there are some formal properties that make these functions linear.
But I'm going to purposely ignore those here so as to not distract from our end goal,
and instead focus on a certain visual property that's equivalent to all the formal stuff.
If you take a line of evenly spaced dots
and apply a transformation,
a linear transformation will keep those dots evenly spaced
once they land in the output space, which is the number line.
Otherwise, if there's some line of dots that gets unevenly spaced,
then your transformation is not linear.
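Here is a small sketch of that visual test in Python. The two sample functions and the particular line of dots are hypothetical choices for illustration. Note that passing on one line of dots is only a spot check, since the property has to hold for every line.

```python
def dots_on_line(start, step, count):
    """Evenly spaced 2D dots: start, start + step, start + 2*step, ..."""
    return [(start[0] + k * step[0], start[1] + k * step[1]) for k in range(count)]


def evenly_spaced(values, tol=1e-9):
    """True if consecutive numbers on the number line all have the same gap."""
    gaps = [b - a for a, b in zip(values, values[1:])]
    return all(abs(gap - gaps[0]) <= tol for gap in gaps)


def linear_example(p):
    """Hypothetical linear function from 2D vectors to numbers."""
    x, y = p
    return 1 * x - 2 * y


def nonlinear_example(p):
    """Hypothetical non-linear function from 2D vectors to numbers."""
    x, y = p
    return x * x + y


dots = dots_on_line(start=(0.0, 1.0), step=(1.0, 1.0), count=6)
print(evenly_spaced([linear_example(p) for p in dots]))     # True
print(evenly_spaced([nonlinear_example(p) for p in dots]))  # False
```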
As with the cases we've seen before,
one of these linear transformations
is completely determined by where it takes i-hat and j-hat,
but this time, each one of those basis vectors just lands on a number.
So when we record where they land as the columns of a matrix,
each of those columns just has a single number.
This is a 1 x 2 matrix.
Let's walk through an example of what it means to apply one of these transformations to a vector.
Let's say you have a linear transformation that takes i-hat to 1 and j-hat to -2.
To follow where a vector with coordinates, say, [4, 3] ends up,
think of breaking up this vector as 4 times i-hat + 3 times j-hat.
A consequence of linearity is that after the transformation,
the vector will be 4 times the place where i-hat lands, 1,
plus 3 times the place where j-hat lands, -2,
which in this case implies that it lands on -2.
When you do this calculation purely numerically, it's a matrix-vector multiplication.
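The worked example above can be written as a tiny Python sketch. The function name `apply_1x2` is made up, and the 1 x 2 matrix is assumed to be stored as a pair holding where i-hat and j-hat land.

```python
def apply_1x2(matrix, vector):
    """Apply a 1 x 2 matrix to a 2D vector.

    matrix holds the numbers where i-hat and j-hat land;
    vector holds the coordinates (x, y), meaning x * i-hat + y * j-hat.
    """
    i_hat_lands, j_hat_lands = matrix
    x, y = vector
    return x * i_hat_lands + y * j_hat_lands


print(apply_1x2([1, -2], [4, 3]))  # 4*1 + 3*(-2) = -2
```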
Now, this numerical operation of multiplying a 1 x 2 matrix by a vector
feels just like taking the dot product of two vectors.
Doesn't that 1 x 2 matrix just look like a vector that we tipped on its side?
In fact, we could say right now that there's a nice association between 1 x 2 matrices and 2D vectors,
defined by tilting the numerical representation of a vector on its side to get the associated matrix,
or tipping the matrix back up to get the associated vector.
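As a sketch of that association, the snippet below tips a vector on its side to get a 1 x 2 matrix (stored here as a list with one row, which is just one possible representation), tips it back up, and checks that matrix-vector multiplication gives the same number as the dot product. All function names are illustrative.

```python
def tip_on_side(vector):
    """2D vector -> associated 1 x 2 matrix (a list with a single row)."""
    return [list(vector)]


def tip_back_up(matrix):
    """1 x 2 matrix -> associated 2D vector."""
    return list(matrix[0])


def matrix_times_vector(matrix, vector):
    """Generic matrix-vector product: one output number per matrix row."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def dot(v, w):
    """Numerical dot product: multiply matching coordinates and add."""
    return sum(a * b for a, b in zip(v, w))


v = [1, -2]
x = [4, 3]
print(matrix_times_vector(tip_on_side(v), x))  # [-2]
print(dot(v, x))                               # -2
```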
Since we're just looking at numerical expressions right now,
going back and forth between vectors and 1 x 2 matrices might feel like a silly thing to do.
But this suggests something that's truly awesome from the geometric view:
there's some kind of connection between linear transformations that take vectors to numbers
and vectors themselves.
Let me show an example that clarifies the significance,
and which just so happens to also answer the dot product puzzle from earlier.
Unlearn what you have learned,
and imagine that you don't already know that the dot product relates to projection.
What I'm going to do here is take a copy of the number line
and place it diagonally in space somehow, with the number 0 sitting at the origin.
Now think of the two-dimensional unit vector
whose tip sits where the number 1 on the number line is.
I want to give that guy a name: u-hat.
This little guy plays an important role in what's about to happen,
so just keep him in the back of your mind.
If we project 2D vectors straight onto this diagonal number line,
in effect, we've just defined a function that takes 2D vectors to numbers.
What's more, this function is actually linear,
since it passes our visual test
that any line of evenly spaced dots remains evenly spaced once it lands on the number line.
Just to be clear,
even though I've embedded the number line in 2D space like this,
the outputs of the function are numbers, not 2D vectors.
You should think of a function that takes in two coordinates and outputs a single coordinate.
But that vector u-hat is a two-dimensional vector
living in the input space.
It's just situated in such a way that it overlaps with the embedding of the number line.
With this projection, we just defined a linear transformation from 2D vectors to numbers,
so we're going to be able to find some kind of 1 x 2 matrix that describes that transformation.
To find that 1 x 2 matrix, let's zoom in on this diagonal number line setup
and think about where i-hat and j-hat each land,
since those landing spots are going to be the columns of the matrix.
This part's super cool. We can reason through it with a really elegant piece of symmetry:
since i-hat and u-hat are both unit vectors,
projecting i-hat onto the line passing through u-hat
looks totally symmetric to projecting u-hat onto the x-axis.
So when we ask what number i-hat lands on when it gets projected,
the answer is going to be the same as whatever u-hat lands on when it's projected onto the x-axis.
But projecting u-hat onto the x-axis
just means taking the x-coordinate of u-hat.
So, by symmetry, the number where i-hat lands when it's projected onto that diagonal number line
is going to be the x-coordinate of u-hat.
Isn't that cool?
The reasoning is almost identical for the j-hat case.
Think about it for a moment.
For all the same reasons, the y-coordinate of u-hat
gives us the number where j-hat lands when it's projected onto the number line copy.
Pause and ponder that for a moment; I just think that's really cool.
So the entries of the 1 x 2 matrix describing the projection transformation
are going to be the coordinates of u-hat.
And computing this projection transformation for arbitrary vectors in space,
which requires multiplying that matrix by those vectors,
is computationally identical to taking a dot product with u-hat.
This is why taking the dot product with a unit vector
can be interpreted as projecting a vector onto the span of that unit vector and taking the length.
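To see this with numbers, the sketch below tilts the number line by an arbitrarily chosen 30 degrees, defines the projection purely from lengths and angles, and then records where i-hat and j-hat land. The tilt angle, the sample vector, and all names are assumptions for illustration only.

```python
import math

theta = math.radians(30)                    # hypothetical tilt of the number line
u_hat = (math.cos(theta), math.sin(theta))  # unit vector whose tip sits at the number 1


def project_onto_number_line(p):
    """Number where the 2D vector p lands on the tilted number line.

    Defined geometrically (length of p times the cosine of its angle to the
    line), without using coordinates of u-hat or any dot product.
    """
    angle_to_line = math.atan2(p[1], p[0]) - theta
    return math.hypot(p[0], p[1]) * math.cos(angle_to_line)


# Columns of the 1 x 2 matrix: where i-hat and j-hat land.
matrix = (project_onto_number_line((1.0, 0.0)),
          project_onto_number_line((0.0, 1.0)))
print(matrix, u_hat)  # the entries match the coordinates of u-hat

# Projecting an arbitrary vector matches the dot product with u-hat.
p = (4.0, 3.0)
print(project_onto_number_line(p), p[0] * u_hat[0] + p[1] * u_hat[1])
```

The matches hold up to floating-point rounding, so any comparison should use a tolerance rather than exact equality.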
So what about non-unit vectors?
For example,
let's say we take that unit vector u-hat,
but we “scale” it up by a factor of 3.
Numerically, each of its components gets multiplied by 3.
So looking at the matrix associated with that vector,
it takes i-hat and j-hat to 3 times the values where they landed before.
Since this is all linear,
it implies, more generally,
that the new matrix can be interpreted as projecting any vector onto the number line copy
and multiplying where it lands by 3.
This is why the dot product with a non-unit vector
can be interpreted as first projecting onto that vector,
then scaling up the length of that projection by the length of the vector.
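A quick numerical illustration of the non-unit case, using an arbitrarily chosen unit vector (0.6, 0.8) and a few sample vectors that are not from the video:

```python
import math


def dot(v, w):
    """Numerical dot product: multiply matching coordinates and add."""
    return sum(a * b for a, b in zip(v, w))


u_hat = (0.6, 0.8)                        # a unit vector: 0.36 + 0.64 = 1
scaled = (3 * u_hat[0], 3 * u_hat[1])     # u-hat scaled up by a factor of 3

for p in [(4.0, 3.0), (-1.0, 2.0), (0.5, -2.5)]:
    projected = dot(u_hat, p)             # where p lands on the number line copy
    # Dotting with the scaled vector gives 3 times the landing spot.
    print(p, dot(scaled, p), 3 * projected)
```

The last two columns agree up to floating-point rounding: dotting with 3 times u-hat is the same as projecting onto u-hat and then multiplying by 3, the length of the scaled vector.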
Take a moment to think about what happened here.
We had a linear transformation from 2D space to the number line,
which was not defined in terms of numerical vectors or numerical dot products.
It was just defined by projecting space onto a diagonal copy of the number line.
But because the transformation is linear,
it was necessarily described by some 1 x 2 matrix,
and since multiplying a 1 x 2 matrix by a 2D vector
is the same as turning that matrix on its side and taking a dot product,
this transformation was, inescapably, related to some 2D vector.
The lesson here is that anytime you have one of these linear transformations
whose output space is the number line,
no matter how it was defined, there's going to be some unique vector v
corresponding to that transformation,
in the sense that applying the transformation is the same thing as taking a dot product with that vector.
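One way to sketch this lesson in code: treat a linear transformation from 2D to the number line as a black-box function, read off the corresponding vector by seeing where i-hat and j-hat land, and check that dotting with that vector reproduces the function. The example transformation and the names below are hypothetical, and the recipe only makes sense if the function really is linear.

```python
def dot(v, w):
    """Numerical dot product: multiply matching coordinates and add."""
    return sum(a * b for a, b in zip(v, w))


def corresponding_vector(transformation):
    """Vector v such that transformation(p) equals dot(v, p) for all 2D p.

    Assumes the transformation is linear, so it is fully determined by
    where it takes i-hat and j-hat.
    """
    return (transformation((1, 0)), transformation((0, 1)))


def mystery_transformation(p):
    """Hypothetical linear function written without any explicit dot product."""
    x, y = p
    return 2 * (x + y) - 5 * y


v = corresponding_vector(mystery_transformation)
print(v)  # (2, -3)

for p in [(4, 3), (-1, 2), (0, 0), (7, -5)]:
    print(p, mystery_transformation(p), dot(v, p))
```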
To me, this is utterly beautiful.
It's an example of something in math called “duality”.
“Duality” shows up in many different ways and forms throughout math,
and it's super tricky to actually define.
Loosely speaking, it refers to situations where you have a natural but surprising correspondence
between two types of mathematical thing.
For the linear algebra case that you just learned about,
you'd say that the “dual” of a vector is the linear transformation that it encodes,
and the dual of a linear transformation from space to one dimension
is a certain vector in that space.
So, to sum up, on the surface, the dot product is a very useful geometric tool for understanding projections
and for testing whether or not vectors tend to point in the same direction.
And that's probably the most important thing for you to remember about the dot product.
But at a deeper level, dotting two vectors together
is a way to translate one of them into the world of transformations.
Again, numerically, this might feel like a silly point to emphasize;
it's just two computations that happen to look similar.
But the reason I find this so important
is that throughout math, when you're dealing with a vector,
once you really get to know its personality,
sometimes you realize that it's easier to understand it not as an arrow in space,
but as the physical embodiment of a linear transformation.
It's as if the vector is really just a conceptual shorthand for a certain transformation,
since it's easier for us to think about arrows in space
rather than moving all of that space to the number line.
In the next video, you'll see another really cool example of this "duality" in action
as I talk about the cross product.