
  • [MUSIC PLAYING]

  • YUFENG GUO: From detecting skin cancer

  • to sorting cucumbers to detecting escalators

  • in need of repair, machine learning

  • has granted computer systems entirely new abilities.

  • But how does it really work under the hood?

  • Let's walk through a basic example

  • and use it as an excuse to talk about the process of getting

  • answers from your data using machine learning.

  • Welcome to Cloud AI Adventures.

  • My name is Yufeng Guo.

  • On this show, we'll explore the art, science,

  • and tools of machine learning.

  • Let's pretend that we've been asked

  • to create a system that answers the question of whether a drink

  • is wine or beer.

  • This question answering system that we build

  • is called a model, and this model

  • is created via a process called training.

  • In machine learning, the goal of training

  • is to create an accurate model that answers our questions

  • correctly most of the time.

  • But in order to train the model, we

  • need to collect data to train on.

  • This is where we will begin.

  • Our data will be collected from glasses of wine and beer.

  • There are many aspects of drinks that we could collect data on--

  • everything from the amount of foam to the shape of the glass.

  • But for our purposes, we'll just pick two simple ones--

  • the color as a wavelength of light and the alcohol content

  • as a percentage.

  • The hope is that we can split our two types of drinks

  • along these two factors alone.

  • We'll call these our features from now on--

  • color and alcohol.

  • The first step to our process will

  • be to run out to the local grocery store,

  • buy up a bunch of different drinks,

  • and get some equipment to do our measurements-- a spectrometer

  • for measuring the color and a hydrometer

  • to measure the alcohol content.

  • It appears that our grocery store has an electronics

  • hardware section as well.

  • Once we have our equipment and the booze all set up,

  • it's time for our first real step of machine

  • learning-- gathering that data.

  • This step is very important because the quality

  • and quantity of data that you gather

  • will directly determine how good your predictive model can be.

  • In this case, the data we collect

  • will be the color and alcohol content of each drink.

  • This will yield us a table of color, alcohol content,

  • and whether it's beer or wine.

  • This will be our training data.
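
A minimal sketch of what that table might look like in code. The `Drink` class, its field names, and every measurement below are made up for illustration; they are not real readings from the episode.

```python
from dataclasses import dataclass


@dataclass
class Drink:
    color_nm: float     # color as a wavelength of light, in nanometers
    alcohol_pct: float  # alcohol content as a percentage
    label: str          # "beer" or "wine"


# Made-up measurements for illustration only, not real data.
training_data = [
    Drink(color_nm=590.0, alcohol_pct=5.0, label="beer"),
    Drink(color_nm=600.0, alcohol_pct=4.5, label="beer"),
    Drink(color_nm=650.0, alcohol_pct=13.0, label="wine"),
    Drink(color_nm=660.0, alcohol_pct=12.5, label="wine"),
]

# Separate the two features from the answer we want the model to learn.
features = [(d.color_nm, d.alcohol_pct) for d in training_data]
labels = [d.label for d in training_data]
```

Keeping the label in its own list makes it harder to feed the answer to the model as if it were a feature.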

  • So a few hours of measurements later, we've

  • gathered our training data and had a few drinks, perhaps.

  • And now it's time for our next step of machine learning--

  • data preparation--

  • where we load our data into a suitable place

  • and prepare it for use in our machine learning training.

  • We'll first put all our data together then randomize

  • the ordering.

  • We wouldn't want the order of our data

  • to affect how we learn since that's not

  • part of determining whether a drink is beer or wine.

  • In other words, we want to make a determination of what

  • a drink is independent of what drink came before or after it

  • in the sequence.
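
Randomizing the order can be sketched with Python's standard `random` module. The `shuffled` helper and the sample rows are hypothetical; the fixed seed is an assumption made only so the result is repeatable.

```python
import random


def shuffled(rows, seed=0):
    """Return a new list with the rows in random order; the input is left untouched."""
    rng = random.Random(seed)  # a fixed seed keeps the shuffle reproducible
    out = list(rows)
    rng.shuffle(out)
    return out


# Rows collected in order: all the beers first, then all the wines.
rows = [("beer", 1), ("beer", 2), ("wine", 3), ("wine", 4)]
mixed = shuffled(rows)
```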

  • This is also a good time to do any pertinent visualizations

  • of your data, helping you see if there

  • are any relevant relationships between different variables

  • as well as show you if there are any data imbalances.

  • For instance, if we collected way more data points about beer

  • than wine, the model we train will be heavily biased

  • toward guessing that virtually everything that it sees

  • is beer since it would be right most of the time.

  • However, in the real world, the model

  • may see beer and wine in equal amounts, which

  • would mean that it would be guessing beer wrong half

  • the time.
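
A quick way to spot such an imbalance is to count the labels before training. In this sketch the helper names and the 9-to-1 split are made up for illustration; it also shows how a model that always guesses beer would score on a skewed set versus a balanced one.

```python
from collections import Counter


def class_fractions(labels):
    """Return the share of the dataset that each label makes up."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def always_beer_accuracy(labels):
    """Accuracy of a 'model' that answers beer no matter what it sees."""
    return sum(1 for label in labels if label == "beer") / len(labels)


skewed = ["beer"] * 9 + ["wine"] * 1     # far more beer than wine
balanced = ["beer"] * 5 + ["wine"] * 5   # what the real world might look like

fractions = class_fractions(skewed)
skewed_score = always_beer_accuracy(skewed)
balanced_score = always_beer_accuracy(balanced)
```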

  • We also need to split the data into two parts.

  • The first part used in training our model

  • will be the majority of our dataset.

  • The second part will be used for evaluating our trained model's

  • performance.

  • We don't want to use the same data that the model was trained

  • on for evaluation since then it would just

  • be able to memorize the questions,

  • just as you wouldn't want to use the questions from your math

  • homework on the math exam.
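
The split itself can be as simple as slicing an already-shuffled list. This is a sketch: the `train_eval_split` name and the 0.8 default are assumptions, chosen only so that the training part is the majority.

```python
def train_eval_split(rows, train_fraction=0.8):
    """Split already-shuffled rows into a training part and an evaluation part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


rows = list(range(10))  # stand-in for ten shuffled measurements
train_rows, eval_rows = train_eval_split(rows)
```

Because the two slices never overlap, no evaluation row can be memorized during training.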

  • Sometimes the data we collected needs other forms

  • of adjusting and manipulation-- things

  • like deduplication, normalization, error correction, and others.

  • These would all happen at the data preparation step.
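
Two of these adjustments, removing duplicate rows and normalization, might look like the sketch below. Min-max scaling is just one possible normalization, picked here as an assumption; the helper names and the rows are made up.

```python
def deduplicate(rows):
    """Drop exact duplicate rows, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(rows))


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values identical, nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]


# Made-up (color, alcohol) rows; the first measurement was recorded twice.
rows = [(590.0, 5.0), (590.0, 5.0), (650.0, 13.0), (620.0, 9.0)]
unique_rows = deduplicate(rows)
alcohol_scaled = min_max_normalize([alcohol for _, alcohol in unique_rows])
```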

  • In our case, we don't have any further data preparation needs,

  • so let's move on forward.

  • The next step in our workflow is choosing a model.

  • There are many models that researchers and data scientists

  • have created over the years.

  • Some are very well suited for image data, others

  • for sequences, such as text or music, some for numerical data,

  • and others for text-based data.

  • In our case, we have just two features-- color and alcohol

  • percentage.

  • We can use a small linear model, which

  • is a fairly simple one that will get the job done.

  • Now we move on to what is often considered

  • the bulk of machine learning--

  • the training.

  • In this step, we'll use our data to incrementally improve

  • our model's ability to predict whether a given

  • drink is wine or beer.

  • In some ways, this is similar to someone

  • first learning to drive.

  • At first, they don't know how any of the pedals, knobs,

  • and switches work or when they should be pressed or used.

  • However, after lots of practice and correcting

  • for their mistakes, a licensed driver emerges.

  • Moreover, after a year of driving,

  • they've become quite adept at driving.

  • The act of driving and reacting to real-world data

  • has adapted their driving abilities, honing their skills.

  • We will do this on a much smaller scale with our drinks.

  • In particular, the formula for a straight line

  • is y equals mx plus b, where x is the input,

  • m is the slope of the line, b is the y-intercept,

  • and y is the value of the line at that position x.

  • The values we have available to us to adjust or train

  • are just m and b, where the m is that slope and b is

  • the y-intercept.

  • There is no other way to affect the position of the line

  • since the only other variables are x, our input, and y,

  • our output.

  • In machine learning, there are many m's

  • since there may be many features.

  • The collection of these values is usually

  • formed into a matrix that is denoted

  • w for the weights matrix.

  • Similarly, for b, we arrange them together,

  • and that's called the biases.
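
To make the step from one m to many weights concrete, here is a small sketch. The `linear_output` helper and all of the numbers are arbitrary; with a single feature it reduces to y = mx + b.

```python
def linear_output(weights, bias, features):
    """Compute w1*x1 + w2*x2 + ... + b for one example."""
    if len(weights) != len(features):
        raise ValueError("need one weight per feature")
    return sum(w * x for w, x in zip(weights, features)) + bias


# One feature: the familiar y = m*x + b with m = 2, b = 1, x = 3.
y_line = linear_output([2.0], 1.0, [3.0])

# Two features, such as (color, alcohol); the values are arbitrary.
y_two = linear_output([0.5, -1.0], 0.25, [4.0, 1.0])
```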

  • The training process involves initializing some random values

  • for w and b and attempting to predict

  • the outputs with those values.

  • As you might imagine, it does pretty poorly at first,

  • but we can compare our model's predictions with the output

  • that it should have produced and adjust the values in w

  • and b such that we will have more accurate predictions

  • on the next time around.

  • So this process then repeats.

  • Each iteration or cycle of updating the weights and biases

  • is called one training step.
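
One way this loop could be written is sketched below. It assumes a logistic (sigmoid) output and plain gradient descent, which the episode does not specify, and it uses made-up features already scaled to the range 0 to 1.

```python
import math
import random


def predict_proba(w, b, x):
    """Probability that an example is wine: sigmoid of the linear output."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(w, b, xs, ys):
    """Average cross-entropy between predictions and the true labels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict_proba(w, b, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def training_step(w, b, xs, ys, learning_rate):
    """One update of the weights and bias using the average gradient."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in zip(xs, ys):
        error = predict_proba(w, b, x) - y  # prediction minus expected output
        for j, xj in enumerate(x):
            grad_w[j] += error * xj
        grad_b += error
    n = len(xs)
    new_w = [wj - learning_rate * gj / n for wj, gj in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b / n
    return new_w, new_b


# Made-up (color, alcohol) features scaled to [0, 1]; 0 = beer, 1 = wine.
xs = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
ys = [0, 0, 1, 1]

# Start from random values for w and b.
rng = random.Random(0)
w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
b = rng.uniform(-1, 1)

loss_before = mean_loss(w, b, xs, ys)
for _ in range(500):  # 500 training steps
    w, b = training_step(w, b, xs, ys, learning_rate=0.5)
loss_after = mean_loss(w, b, xs, ys)
```

Comparing `loss_before` with `loss_after` shows whether the repeated steps actually moved the predictions closer to the expected outputs.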

  • So let's look at what that means more concretely

  • for our dataset.

  • When we first start the training,

  • it's like we drew a random line through the data.

  • Then as each step of the training progresses,

  • the line moves step by step closer

  • to the ideal separation of the wine and beer.

  • Once training is complete, it's time

  • to see if the model is any good, using evaluation.

  • This is where that dataset that we set

  • aside earlier comes into play.

  • Evaluation allows us to test our model

  • against data that has never been used for training.

  • This metric allows us to see how the model might

  • perform against data that it has not yet seen.

  • This is meant to be representative of how

  • the model might perform in the real world.
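
A simple evaluation metric is accuracy on the held-out examples. In this sketch `toy_model` is a hypothetical stand-in for a trained model, and the evaluation rows are made up.

```python
def accuracy(predict, examples):
    """Fraction of (features, label) examples that `predict` labels correctly."""
    if not examples:
        raise ValueError("need at least one evaluation example")
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


# Hypothetical stand-in for a trained model: call it wine above 8% alcohol.
def toy_model(features):
    color_nm, alcohol_pct = features
    return "wine" if alcohol_pct > 8.0 else "beer"


# Made-up held-out examples that were never used for training.
eval_set = [
    ((595.0, 5.2), "beer"),
    ((655.0, 12.8), "wine"),
    ((640.0, 7.5), "wine"),  # a light wine that the toy rule gets wrong
    ((600.0, 4.8), "beer"),
]
eval_accuracy = accuracy(toy_model, eval_set)
```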

  • A good rule of thumb I use for a training-evaluation split is

  • somewhere on the order of 80%-20% or 70%-30%.

  • Much of this depends on the size of the original source dataset.

  • If you have a lot of data, perhaps you

  • don't need as big of a fraction for the evaluation dataset.

  • Once you've done evaluation, it's

  • possible that you want to see if you can further improve

  • your training in any way.

  • We can do this by tuning some of our parameters.

  • There were a few that we implicitly

  • assumed when we did our training,

  • and now is a good time to go back and test

  • those assumptions, try other values.

  • One example of a parameter we can tune

  • is how many times we run through the training set

  • during training.

  • We can actually show the data multiple times.

  • Doing that can potentially

  • lead to higher accuracies.

  • Another parameter is learning rate.

  • This defines how far we shift the line

  • during each step based on the information

  • from the previous training step.

  • These values all play a role in how accurate our model can

  • become and how long the training takes.
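
The effect of these two settings can be tried out on a toy problem. This sketch fits a straight line to made-up points by gradient descent on mean squared error, which is an assumption made for simplicity, and records the final error for a few combinations of learning rate and number of passes.

```python
def fit_line(xs, ys, learning_rate, epochs):
    """Fit y = m*x + b by gradient descent on mean squared error; return (m, b, mse)."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):  # each epoch is one full pass over the data
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        m -= learning_rate * grad_m  # the learning rate scales how far we shift
        b -= learning_rate * grad_b
    mse = sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return m, b, mse


# Made-up points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Try each combination and keep the final error for comparison.
results = {}
for learning_rate in (0.01, 0.1):
    for epochs in (10, 1000):
        _, _, mse = fit_line(xs, ys, learning_rate, epochs)
        results[(learning_rate, epochs)] = mse
```

Printing `results` makes the trade-off visible: more passes take longer but reduce the error, and the learning rate changes how quickly that happens.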

  • For more complex models, initial conditions

  • can play a significant role as well in determining

  • the outcome of training.

  • Differences can be seen depending

  • on whether a model starts off training

  • with values initialized at zeros versus some distribution

  • of the values and what that distribution is.

  • As you can see, there are many considerations

  • at this phase of training, and it's important

  • that you define what makes a model good enough for you.

  • Otherwise, we might find ourselves tweaking parameters

  • for a very long time.

  • Now, these parameters are typically

  • referred to as hyperparameters.

  • The adjustment or tuning of these hyperparameters

  • still remains a bit more of an art than a science,

  • and it's an experimental process that

  • heavily depends on the specifics of your dataset, model,

  • and training process.

  • Once you're happy with your training and hyperparameters,

  • guided by the evaluation step, it's

  • finally time to use your model to do something useful.

  • Machine learning is using data to answer questions,

  • so prediction or inference is that step where we finally

  • get to answer some questions.

  • This is the point of all of this work where the value of machine

  • learning is realized.

  • We can finally use our model to predict whether a given

  • drink is wine or beer, given its color and alcohol percentage.
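
Prediction then comes down to plugging a new drink's features into the trained model. The weights, bias, and 0.5 threshold below are hypothetical values, standing in for the result of an earlier training run on features scaled to the range 0 to 1, and the sigmoid output is an assumption of this sketch.

```python
import math


def predict_drink(weights, bias, color, alcohol, threshold=0.5):
    """Return "wine" or "beer" for one drink, given already-trained parameters."""
    z = weights[0] * color + weights[1] * alcohol + bias
    p_wine = 1.0 / (1.0 + math.exp(-z))  # squash the linear output into (0, 1)
    return "wine" if p_wine >= threshold else "beer"


# Hypothetical parameters; a real run would take them from training.
weights = [3.0, 4.0]
bias = -3.5

dark_and_strong = predict_drink(weights, bias, color=0.9, alcohol=0.8)
pale_and_light = predict_drink(weights, bias, color=0.1, alcohol=0.2)
```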

  • The power of machine learning is that we

  • were able to determine how to differentiate between wine

  • and beer using our model rather than using human judgment

  • and manual rules.

  • You can extrapolate the ideas presented today

  • to other problem domains as well, where

  • the same principles apply--

  • gathering data, preparing that data, choosing a model,

  • training it and evaluating it, doing your hyperparameter

  • tuning, and finally, prediction.

  • If you're looking for more ways to play

  • with training and parameters, check out

  • the TensorFlow Playground.

  • It's a completely browser-based machine learning sandbox,

  • where you can try different parameters

  • and run training against mock datasets.

  • And don't worry, you can't break the site.

  • Of course, we will encounter more steps and nuances

  • in future episodes, but this serves

  • as a good foundational framework to help

  • us think through the problem, giving us a common language

  • to think about each step and go deeper in the future.

  • Next time on AI Adventures, we'll

  • build our first real machine learning model, using code--

  • no more drawing lines and going over algebra.

  • [MUSIC PLAYING]
