Placeholder Image

Subtitles section Play video

  • [MUSIC PLAYING]

  • Welcome back.

  • We've covered a lot of ground already,

  • so today I want to review and reinforce concepts.

  • To do that, we'll explore two things.

  • First, we'll code up a basic pipeline

  • for supervised learning.

  • I'll show you how multiple classifiers

  • can solve the same problem.

  • Next, we'll build up a little more intuition

  • for what it means for an algorithm to learn something

  • from data, because that sounds kind of magical, but it's not.

  • To kick things off, let's look at a common experiment

  • you might want to do.

  • Imagine you're building a spam classifier.

  • That's just a function that labels an incoming email

  • as spam or not spam.

  • Now, say you've already collected a data set

  • and you're ready to train a model.

  • But before you put it into production,

  • there's a question you need to answer first--

  • how accurate will it be when you use it to classify emails that

  • weren't in your training data?

  • As best we can, we want to verify our models work well

  • before we deploy them.

  • And we can do an experiment to help us figure that out.

  • One approach is to partition our data set into two parts.

  • We'll call these Train and Test.

  • We'll use Train to train our model

  • and Test to see how accurate it is on new data.

  • That's a common pattern, so let's see how it looks in code.

  • To kick things off, let's import a data set into [? SyKit. ?]

  • We'll use Iris again, because it's handily included.

  • Now, we already saw Iris in episode two.

  • But what we haven't seen before is

  • that I'm calling the features x and the labels y.

  • Why is that?

  • Well, that's because one way to think of a classifier

  • is as a function.

  • At a high level, you can think of x as the input

  • and y as the output.

  • I'll talk more about that in the second half of this episode.

  • After we import the data set, the first thing we want to do

  • is partition it into Train and Test.

  • And to do that, we can import a handy utility,

  • and it makes the syntax clear.

  • We're taking our x's and our y's,

  • or our features and labels, and partitioning them

  • into two sets.

  • X_train and y_train are the features and labels

  • for the training set.

  • And X_test and y_test are the features and labels

  • for the testing set.

  • Here, I'm just saying that I want half the data to be

  • used for testing.

  • So if we have 150 examples in Iris, 75 will be in Train

  • and 75 will be in Test.

  • Now we'll create our classifier.

  • I'll use two different types here

  • to show you how they accomplish the same task.

  • Let's start with the decision tree we've already seen.

  • Note there's only two lines of code

  • that are classifier-specific.

  • Now let's train the classifier using our training data.

  • At this point, it's ready to be used to classify data.

  • And next, we'll call the predict method

  • and use it to classify our testing data.

  • If you print out the predictions,

  • you'll see there are a list of numbers.

  • These correspond to the type of Iris

  • the classifier predicts for each row in the testing data.

  • Now let's see how accurate our classifier

  • was on the testing set.

  • Recall that up top, we have the true labels for the testing

  • data.

  • To calculate our accuracy, we can

  • compare the predicted labels to the true labels,

  • and tally up the score.

  • There's a convenience method in [? Sykit ?]

  • we can import to do that.

  • Notice here, our accuracy was over 90%.

  • If you try this on your own, it might be a little bit different

  • because of some randomness in how the Train/Test

  • data is partitioned.

  • Now, here's something interesting.

  • By replacing these two lines, we can use a different classifier

  • to accomplish the same task.

  • Instead of using a decision tree,

  • we'll use one called [? KNearestNeighbors. ?]

  • If we run our experiment, we'll see that the code

  • works in exactly the same way.

  • The accuracy may be different when you run it,

  • because this classifier works a little bit differently

  • and because of the randomness in the Train/Test split.

  • Likewise, if we wanted to use a more sophisticated classifier,

  • we could just import it and change these two lines.

  • Otherwise, our code is the same.

  • The takeaway here is that while there are many different types

  • of classifiers, at a high level, they have a similar interface.

  • Now let's talk a little bit more about what

  • it means to learn from data.

  • Earlier, I said we called the features x and the labels y,

  • because they were the input and output of a function.

  • Now, of course, a function is something we already

  • know from programming.

  • def classify-- there's our function.

  • As we already know in supervised learning,

  • we don't want to write this ourselves.

  • We want an algorithm to learn it from training data.

  • So what does it mean to learn a function?

  • Well, a function is just a mapping from input

  • to output values.

  • Here's a function you might have seen before-- y

  • equals mx plus b.

  • That's the equation for a line, and there

  • are two parameters-- m, which gives the slope;

  • and b, which gives the y-intercept.

  • Given these parameters, of course,

  • we can plot the function for different values of x.

  • Now, in supervised learning, our classified function

  • might have some parameters as well,

  • but the input x are the features for an example we

  • want to classify, and the output y

  • is a label, like Spam or Not Spam, or a type of flower.

  • So what could the body of the function look like?

  • Well, that's the part we want to write algorithmically

  • or in other words, learn.

  • The important thing to understand here

  • is we're not starting from scratch

  • and pulling the body of the function out of thin air.

  • Instead, we start with a model.

  • And you can think of a model as the prototype for

  • or the rules that define the body of our function.

  • Typically, a model has parameters

  • that we can adjust with our training data.

  • And here's a high-level example of how this process works.

  • Let's look at a toy data set and think about what kind of model

  • we could use as a classifier.

  • Pretend we're interested in distinguishing

  • between red dots and green dots, some of which

  • I've drawn here on a graph.

  • To do that, we'll use just two features--

  • the x- and y-coordinates of a dot.

  • Now let's think about how we could classify this data.

  • We want a function that considers

  • a new dot it's never seen before,

  • and classifies it as red or green.

  • In fact, there might be a lot of data we want to classify.

  • Here, I've drawn our testing examples

  • in light green and light red.

  • These are dots that weren't in our training data.

  • The classifier has never seen them before, so how can

  • it predict the right label?

  • Well, imagine if we could somehow draw a line

  • across the data like this.

  • Then we could say the dots to the left

  • of the line are green and dots to the right of the line are

  • red.

  • And this line can serve as our classifier.

  • So how can we learn this line?

  • Well, one way is to use the training data to adjust

  • the parameters of a model.

  • And let's say the model we use is a simple straight line

  • like we saw before.

  • That means we have two parameters to adjust-- m and b.

  • And by changing them, we can change where the line appears.

  • So how could we learn the right parameters?

  • Well, one idea is that we can iteratively adjust

  • them using our training data.

  • For example, we might start with a random line

  • and use it to classify the first training example.

  • If it gets it right, we don't need to change our line,

  • so we move on to the next one.

  • But on the other hand, if it gets it wrong,

  • we could slightly adjust the parameters of our model

  • to make it more accurate.

  • The takeaway here is this.

  • One way to think of learning is using training data

  • to adjust the parameters of a model.

  • Now, here's something really special.

  • It's called tensorflow/playground.

  • This is a beautiful example of a neural network

  • you can run and experiment with right in your browser.

  • Now, this deserves its own episode for sure,

  • but for now, go ahead and play with it.

  • It's awesome.

  • The playground comes with different data

  • sets you can try out.

  • Some are very simple.

  • For example, we could use our line to classify this one.

  • Some data sets are much more complex.

  • This data set is especially hard.

  • And see if you can build a network to classify it.

  • Now, you can think of a neural network

  • as a more sophisticated type of classifier,

  • like a decision tree or a simple line.

  • But in principle, the idea is similar.

  • OK.

  • Hope that was helpful.

  • I just created a Twitter that you can follow

  • to be notified of new episodes.

  • And the next one should be out in a couple of weeks,

  • depending on how much work I'm doing for Google I/O. Thanks,

  • as always, for watching, and I'll see you next time.

[MUSIC PLAYING]

Subtitles and vocabulary

Click the word to look it up Click the word to find further inforamtion about it