  • In this video, we'll look at what the overall process of supervised learning is like.

    在本視頻中,我們將瞭解監督學習的整體過程。

  • Specifically, you'll see the first model of this course, a linear regression model.

    具體來說,您將看到本課程的第一個模型,即線性迴歸模型。

  • That just means fitting a straight line to your data.

    這只是意味著將數據擬合成一條直線。

  • It's probably the most widely used learning algorithm in the world today.

    它可能是當今世界上使用最廣泛的學習算法。

  • As you get familiar with linear regression, many of the concepts you see here will also apply to other machine learning models, models that you'll see later in this specialization.

    隨著你對線性迴歸的熟悉,你在這裡看到的許多概念也將適用於其他機器學習模型,也就是你在本專業後面將看到的模型。

  • Let's start with a problem that you can address using linear regression.

    讓我們從一個可以用線性迴歸來解決的問題開始。

  • Say you want to predict the price of a house based on the size of a house.

    假設您想根據房屋的面積預測房屋的價格。

  • This is the example we've seen earlier this week.

    這就是我們本週早些時候看到的例子。

  • We're going to use a dataset on house sizes and prices from Portland, a city in the United States.

    我們將使用美國波特蘭市的房屋面積和價格數據集。

  • Here we have a graph where the horizontal axis is the size of a house in square feet and the vertical axis is the price of the house in thousands of dollars.

    在這幅圖中,橫軸是以平方英尺為單位的房屋面積,縱軸是以千美元為單位的房屋價格。

  • Let's go ahead and plot the data points for various houses in the dataset.

    讓我們繼續繪製數據集中不同房屋的數據點。

  • Each data point here, each of these little crosses, is a house with its size and the price it most recently sold for.

    在這裡的每個數據點上,每一個小十字都是一棟房子,都有它的面積和最近售出的價格。

  • Now, let's say you're a real estate agent in Portland, and you're helping a client sell her house.

    現在,假設你是波特蘭的一名房地產經紀人,你正在幫助一位客戶出售她的房子。

  • She's asking you, how much do you think I can get for this house?

    她問你,你覺得這房子能賣多少錢?

  • This dataset might help you estimate the price she could get for it.

    這個數據集也許能幫您估算出她能得到的價格。

  • You start by measuring the size of the house, and it turns out that her house is 1,250 square feet.

    你先測量房子的面積,結果發現她的房子有 1250 平方英尺。

  • How much do you think this house could sell for?

    你認為這棟房子能賣多少錢?

  • One thing you could do is build a linear regression model from this dataset.

    你可以做的一件事是利用這個數據集建立一個線性迴歸模型。

  • Your model will fit a straight line to the data which might look like this.

    您的模型將根據數據擬合出一條直線,看起來可能是這樣的。

  • Based on this straight line fit to the data, you can see that if a house is 1,250 square feet, it will intersect the best fit line over here.

    根據這條直線擬合的數據,你可以看到,如果房屋面積為 1,250 平方英尺,它將與這裡的最佳擬合線相交。

  • If you trace that to the vertical axis on the left, you can see the price is maybe around here, say about $220,000.

    如果將其追蹤到左側的縱軸上,可以看到價格可能就在這附近,比如大約 22 萬美元。
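As a rough illustration of the straight-line fit just described, here is a minimal Python sketch. It assumes NumPy is available. The names `sizes`, `prices`, and `predict_price` are hypothetical, and apart from the first pair (2,104 square feet, 400 thousand dollars) the data values are made up for illustration rather than taken from the Portland dataset, so the estimate it prints should not be expected to match the $220,000 read off the plot in the video.

```python
import numpy as np

# Illustrative (size in sq ft, price in $1000s) pairs. Only the first pair
# (2104, 400) is mentioned in the video; the rest are invented placeholders.
sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
prices = np.array([400.0, 232.0, 315.0, 178.0])

# Fit a degree-1 polynomial, i.e. a straight line: price ~ w * size + b.
# np.polyfit returns coefficients from the highest degree down.
w, b = np.polyfit(sizes, prices, deg=1)


def predict_price(size_sqft: float) -> float:
    """Read the fitted line at a given size; result is in thousands of dollars."""
    return float(w * size_sqft + b)


# The client's house from the video is 1,250 square feet.
estimate = predict_price(1250.0)
print(f"Estimated price: about ${estimate:.0f}k")
```

Reading the fitted line at 1,250 square feet is the code equivalent of tracing from the horizontal axis up to the line and across to the vertical axis.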

  • This is an example of what's called a supervised learning model.

    這就是所謂的監督學習模型的一個例子。

  • We call this supervised learning because you first train your model by giving it data that has the right answers.

    我們之所以稱之為監督學習,是因為你首先要通過提供具有正確答案的數據來訓練你的模型。

  • Because you give the model examples of houses with both the size of the house, as well as the price that the model should predict for each house.

    因為你給模型提供了房屋的例子,既包括房屋的面積,也包括模型應該預測的每套房屋的價格。

  • Here the prices, that is, the right answers, are given for every house in the dataset.

    好了,下面就是數據集中每棟房屋的價格,也就是正確答案。

  • This linear regression model is a particular type of supervised learning model.

    這種線性迴歸模型是一種特殊的監督學習模型。

  • It's called a regression model because it predicts numbers as the output, like prices in dollars.

    之所以稱其為迴歸模型,是因為它預測的輸出是數字,例如以美元計的價格。

  • Any supervised learning model that predicts a number such as 220,000 or 1.5 or negative 33.2, is addressing what's called a regression problem.

    任何有監督的學習模型,如果能預測出一個數字,如 22 萬、1.5 或負 33.2,就能解決所謂的迴歸問題。

  • Linear regression is one example of a regression model, but there are other models for addressing regression problems too.

    線性迴歸是迴歸模型的一個例子,但也有其他模型可以解決迴歸問題。

  • We'll see some of those later in course 2 of this specialization.

    稍後,我們將在本專業的第 2 課中介紹其中的一些內容。

  • Just to remind you, in contrast with the regression model, the other most common type of supervised learning model is called a classification model.

    提醒一下,與迴歸模型相比,另一種最常見的監督學習模型叫做分類模型。

  • A classification model predicts categories or discrete categories, such as predicting if a picture is of a cat, meow, or a dog, woof.

    分類模型預測類別或離散類別,例如預測一張圖片是貓 "喵 "還是狗 "汪"。

  • Or if given a medical record, it has to predict if a patient has a particular disease.

    或者,如果給它一份醫療記錄,它必須預測病人是否患有某種疾病。

  • You'll see more about classification models later in this course as well.

    在本課程的後面部分,您還將看到更多關於分類模型的內容。

  • As a reminder about the difference between classification and regression, in classification, there are only a small number of possible outputs.

    在此提醒大家注意分類和迴歸的區別,在分類中,只有少量可能的輸出。

  • If your model is recognizing cats versus dogs, that's two possible outputs.

    如果您的模型要識別貓和狗,那就有兩種可能的輸出。

  • Or maybe you're trying to recognize any of 10 possible medical conditions in a patient.

    或者,您正在嘗試識別病人可能出現的 10 種病症中的任何一種。

  • There's a discrete finite set of possible outputs.

    可能的輸出有一個離散的有限集合。

  • We call it a classification problem.

    我們稱之為分類問題。

  • Whereas in regression, there are infinitely many possible numbers that the model could output.

    而在迴歸中,模型可能輸出的數字是無限的。
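To make the contrast concrete, here is a small Python sketch of the two kinds of output. Everything in it is an assumption for illustration: the function names, the slope and intercept in `regression_model`, and the 0.5 threshold in `classification_model` do not come from the video.

```python
def regression_model(size_sqft: float) -> float:
    """Regression: the output can be any number (here, a price in $1000s).

    The slope and intercept are hypothetical values, not fitted parameters.
    """
    return 0.18 * size_sqft - 5.0


def classification_model(cat_score: float) -> str:
    """Classification: the output comes from a small, fixed set of categories.

    `cat_score` is a hypothetical score between 0 and 1; the 0.5 cutoff is
    an arbitrary choice for this sketch.
    """
    return "cat" if cat_score >= 0.5 else "dog"


price = regression_model(1250.0)    # a real number
label = classification_model(0.8)   # one of exactly two labels
print(price, label)
```

The regression function can return infinitely many different values, while the classification function can only ever return one of two labels.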

  • In addition to visualizing this data as a plot here on the left, there's one other way of looking at the data that would be useful, and that's a data table here on the right.

    除了將這些數據可視化為左側的圖表外,還有一種查看數據的方法也很有用,那就是右側的數據表。

  • The data comprises a set of inputs.

    數據由一組輸入組成。

  • This would be the size of the house, which is this column here.

    這就是房屋的面積,也就是這一欄。

  • It also has outputs.

    它還具有輸出功能。

  • You're trying to predict the price, which is this column here.

    你試圖預測價格,也就是這一欄。

  • Notice that the horizontal and vertical axes correspond to these two columns, the size and the price.

    請注意,橫軸和縱軸與尺寸和價格這兩列相對應。

  • If you have, say, 47 rows in this data table, then there are 47 of these little crosses on the plot on the left, each cross corresponding to one row of the table.

    如果這個數據表中有 47 行,那麼左側的圖上就有 47 個這樣的小十字,每個十字對應數據表中的一行。

  • For example, the first row of the table is a house with size 2,104 square feet.

    例如,表格的第一行是面積為 2,104 平方英尺的房屋。

  • That's around here.

    就在附近

  • This house sold for $400,000, which is around here.

    這棟房子賣了 40 萬美元,就在附近。

  • This first row of the table is plotted as this data point over here.

    表格的第一行就繪製成了這個數據點。

  • Now, let's look at some notation for describing the data.

    現在,讓我們來看看一些描述數據的符號。

  • This is notation that you'll find useful throughout your journey in machine learning.

    這是您在機器學習的整個過程中都會用到的符號。

  • As you become more familiar with machine learning terminology, this will be terminology you can use to talk about machine learning concepts with others as well, since a lot of this is quite standard across AI.

    隨著你對機器學習術語的日益熟悉,這些術語也將成為你與他人談論機器學習概念時可以使用的術語,因為這其中有很多是人工智能領域的標準術語。

  • You'll be seeing this notation multiple times in this specialization, so it's okay if you don't remember everything the first time through.

    在本專業中,您會多次看到這種符號,是以,如果您第一次沒有記住所有內容也沒關係。

  • It will naturally become more familiar over time.

    隨著時間的推移,自然會越來越熟悉。

  • The dataset that you just saw and that is used to train the model is called a training set.

    你剛才看到的用於訓練模型的數據集稱為訓練集。

  • Note that your client's house is not in this dataset because it's not yet sold, so no one knows what its price is.

    請注意,您客戶的房子不在此數據集中,因為它尚未售出,所以沒有人知道它的價格是多少。

  • To predict the price of your client's house, you first train your model to learn from the training set, and that model can then predict your client's house's price.

    要預測客戶房屋的價格,首先要訓練模型從訓練集中學習,然後該模型才能預測客戶房屋的價格。

  • In machine learning, the standard notation to denote the input here is lowercase x, and we call this the input variable.

    在機器學習中,表示輸入的標準符號是小寫 x,我們稱之為輸入變量。

  • It's also called a feature or an input feature.

    它也被稱為特徵或輸入特徵。

  • For example, for the first house in your training set, x is the size of the house, so x equals 2,104.

    例如,對於訓練集中的第一棟房屋,x 是房屋的大小,是以 x 等於 2,104。

  • The standard notation to denote the output variable, which you're trying to predict, which is also sometimes called the target variable, is lowercase y.

    您要預測的輸出變量(有時也稱為目標變量)的標準符號是小寫 y。

  • So here, y is the price of the house, and for the first training example, this is equal to 400, so y equals 400.

    在這裡,y 是房子的價格,在第一個訓練示例中,這個價格等於 400,所以 y 等於 400。

  • So the dataset has one row for each house, and in this particular training set, there are 47 rows with each row representing a different training example.

    是以,數據集中的每棟房屋都有一行,在這個特定的訓練集中,共有 47 行,每一行代表一個不同的訓練實例。

  • We're going to use lowercase m to refer to the total number of training examples, and so here, m is equal to 47.

    我們將使用小寫 m 來表示訓練示例的總數,是以這裡的 m 等於 47。
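One possible way to hold this notation in code is sketched below, assuming NumPy. The array names `x_train` and `y_train` are hypothetical, and only the first row (2,104 square feet, 400 thousand dollars) comes from the video; the other rows are invented placeholders, so `m` here is 3 rather than the 47 of the actual training set.

```python
import numpy as np

# x: input variable (feature), the size of each house in square feet.
# y: output (target) variable, the price of each house in $1000s.
# Only the first row is from the video; the rest are illustrative values.
x_train = np.array([2104.0, 1416.0, 1534.0])
y_train = np.array([400.0, 232.0, 315.0])

# m: the total number of training examples, one per row of the table.
m = x_train.shape[0]
print(f"m = {m}")
```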

  • To indicate a single training example, we're going to use the notation (x, y).

    我們將使用符號 (x, y) 來表示單個訓練示例。

  • So for the first training example, (x, y), this pair of numbers is (2104, 400).

    是以,對於第一個訓練示例 (x, y),這對數字是 (2104, 400)。

  • Now, we have a lot of different training examples.

    現在,我們有很多不同的訓練示例。

  • We have 47 of them, in fact.

    事實上,我們有 47 個。

  • So to refer to a specific training example, this will correspond to a specific row in this table on the left.

    是以,要引用一個具體的訓練示例,這將與左側表格中的某一行相對應。

  • I'm going to use the notation (x^(i), y^(i)): x with superscript i in parentheses, and y with superscript i in parentheses.

    我將使用符號 (x^(i), y^(i)),即 x 帶括號上標 i,y 帶括號上標 i。

  • The superscript tells us that this is the i-th training example, such as the first, second, or third up to the 47th training example.

    上標表示這是第 i 個訓練實例,如第 1、第 2 或第 3 個,直到第 47 個訓練實例。

  • Here, i refers to a specific row in the table.

    這裡的 i 是指表格中的某一行。

  • So for instance, here is the first example when i equals 1 in the training set.

    例如,這裡是訓練集中 i 等於 1 時的第一個例子。

  • So x^(1) is equal to 2,104, and y^(1) is equal to 400.

    是以,x^(1) 等於 2,104,y^(1) 等於 400。

  • Let's add the superscript 1 here as well.

    讓我們在這裡也加上上標 1。

  • Just a note, this superscript i in parentheses is not exponentiation.

    請注意,括號中的上標 i 並不是指數。

  • So when I write x^(2), this is not x squared, this is not x to the power of 2.

    是以,當我寫這個時,這不是 x 的平方,也不是 x 的 2 次方。

  • It just refers to the second training example.

    它指的只是第二個訓練示例。

  • So this i is just an index in the training set, and refers to row i in the table.

    是以,這個 i 只是訓練集中的一個索引,指的是表格中的第 i 行。
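The point that the superscript is an index, not a power, can be sketched in Python as follows. The helper `training_example` is a hypothetical name, and only the first row of the data is from the video; the other rows are invented. Because the video counts rows from 1 while Python lists count from 0, the sketch subtracts 1 when looking up a row.

```python
# Only the first row (2104, 400) is from the video; the rest are placeholders.
x_train = [2104.0, 1416.0, 1534.0]
y_train = [400.0, 232.0, 315.0]


def training_example(i: int) -> tuple[float, float]:
    """Return (x^(i), y^(i)), where i is the 1-based row number used in the video."""
    if not 1 <= i <= len(x_train):
        raise IndexError(f"i must be between 1 and {len(x_train)}, got {i}")
    # Python lists are 0-based, so row i lives at position i - 1.
    return x_train[i - 1], y_train[i - 1]


x_1, y_1 = training_example(1)   # the first training example
x_2, y_2 = training_example(2)   # the second example, not "x squared"
print(x_1, y_1)
```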

  • In this video, you saw what a training set is like, as well as the standard notation for describing this training set.

    在本視頻中,你將看到訓練集是什麼樣的,以及描述訓練集的標準符號。

  • In the next video, let's look at what it takes to feed the training set you just saw to a learning algorithm so that the algorithm can learn from this data.

    在下一個視頻中,讓我們看看如何將剛才看到的訓練集輸入學習算法,以便算法能夠從這些數據中學習。

  • Let's see that in the next video.

    讓我們在下一個視頻中看看。
