[MUSIC PLAYING]
LAURENCE MORONEY: Hi, and welcome to part three
of this series on using Google Colab to code, train, and test
neural networks in the browser without needing to install
any kind of runtime.
In the previous video, I showed you
how you can install TensorFlow with Colab.
In this video, I'll show you
how to use TensorFlow to build
a neural network for breast cancer classification.
It runs completely in the browser using Colab,
and it's really quick to train and test.
The data for training this neural network
comes from the Diagnostic Wisconsin Breast Cancer
Database.
You can find it at the URL in the description.
It has close to 600 samples of data,
each from a cell biopsy, where 30 features have
been extracted per cell.
I've pre-processed the data into several CSV files
so we can just focus on the neural network itself.
Let's now take a look at the code
for training this neural network using this data so you
can use that network to then perform breast cancer
classification yourself.
Let's start with uploading the CSV files.
Now, that's a really neat thing in Colab:
you can load external data into it.
I'm going to load my CSVs into pandas
DataFrames with this code.
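The loading step being described might look something like this sketch. The video's actual code isn't shown, so the in-memory CSV here is a stand-in; in Colab you'd upload the real files with `files.upload()` and pass their filenames to `pd.read_csv` instead.

```python
import io
import pandas as pd

# In Colab you would first run:
#   from google.colab import files
#   uploaded = files.upload()
# and then read the uploaded CSVs by filename. Here an in-memory
# CSV keeps the sketch self-contained.
csv_text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"
X_train = pd.read_csv(io.StringIO(csv_text), header=None)
print(X_train.shape)  # two rows, three feature columns
```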
Next, using Keras and the Sequential API,
I'm going to create a neural network
with an input dimension of 30.
And that's because each of these cells has 30 features.
And we'll then have a layer of 16, then 8, then 6,
and then, finally, 1.
The final layer will be activated
by a sigmoid function, which will push the output
towards a 1 or a 0.
We're classifying into two classes, so that's perfect.
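A sketch of the network as described: 30 input features, then layers of 16, 8, 6, and 1. The transcript only specifies the layer sizes and the sigmoid output, so ReLU on the hidden layers is my assumption.

```python
import tensorflow as tf

# 30 input features per cell, then Dense layers of 16, 8, 6, and 1.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(30,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(6, activation="relu"),
    # Sigmoid squashes the output towards 0 or 1 for the two classes.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```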
The network itself will need to have a loss function
and an optimizer defined on it.
On each iteration, it measures how well it did in training
using the loss function.
It then tries to improve on that using the optimizer.
And as you'll see in a moment, the training process
has 100 steps, with this loop happening once per step.
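Defining the loss and optimizer happens in Keras's `compile` step. The transcript doesn't name which ones are used; binary cross-entropy and Adam are common choices for a two-class problem like this, so this sketch assumes them.

```python
import tensorflow as tf

# Minimal model to attach a loss function and an optimizer to.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(30,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The loss measures how well each training step did; the optimizer
# then adjusts the weights to improve on that measurement.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```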
The training itself takes place in the fit function.
Here, I pass in the training x's and y's, and I
specify how many times it will loop,
where each loop makes a guess at the relationship
between the x's and the y's.
It measures how well or how badly it does using the loss
function, and then it improves on its guess
using the optimizer.
It's coded to do that 100 times, but you
can amend that easily and explore the results
for yourself.
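The training loop just described could be sketched like this, with synthetic stand-in data so it runs end to end; the real notebook would pass the x's and y's loaded from the CSVs.

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 64 samples with 30 features each, binary labels.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 30)).astype("float32")
y_train = rng.integers(0, 2, size=(64, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(30,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Each epoch is one full loop: guess, measure the loss, then let
# the optimizer improve the guess -- 100 times, as in the video.
history = model.fit(x_train, y_train, epochs=100, verbose=0)
print(len(history.history["loss"]))  # 100 recorded loss values
```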
As you'll see, once it finishes training,
the loss is 0.0595, showing that it's about 94% accurate.
We can now test that network with data
that the neural network hasn't yet seen.
This is the x-test data.
So we'll get a set of y predictions for this data.
Now, these predictions are going to be a probability.
They're not a 0 or a 1, but values that
are close to 0 or close to 1.
So we'll write code that treats every value
less than 0.5 as a 0
and everything else as a 1.
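That thresholding step can be done in one line with NumPy; the example probabilities here are made up for illustration.

```python
import numpy as np

# Example sigmoid outputs: probabilities near 0 or near 1.
y_prob = np.array([0.02, 0.91, 0.47, 0.73])

# Everything below 0.5 becomes 0; everything else becomes 1.
y_pred = (y_prob >= 0.5).astype(int)
print(y_pred)  # [0 1 0 1]
```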
And now, here's some simple code that
will compare the predicted values
for the test set against the actual known
values for the same set.
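The comparison itself can be as simple as an element-wise equality check; the labels here are illustrative, not the actual 114-sample test set.

```python
import numpy as np

y_pred = np.array([0, 1, 0, 1, 1])  # thresholded predictions
y_true = np.array([0, 1, 0, 0, 1])  # known labels

# Count how many predictions match the known values.
correct = int((y_pred == y_true).sum())
accuracy = correct / len(y_true)
print(f"{correct}/{len(y_true)} correct ({accuracy:.0%})")  # 4/5 correct (80%)
```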
Now, there were 114 values in this test set,
and you'll see it gets it 100% correct.
Now, remember earlier we said it was about 94% accurate.
So why do you think it gets it 100% correct?
Well, that's a little homework for you to do.
Post your answers in the comments below
and let's see who can get it right.
And that's it for this episode.
And in the next video in this series,
my colleague, Paige, will show you
how to use different runtimes and processors
and how your code can take advantage of GPUs and TPUs
right in your browser.
So whatever you do, don't forget to hit that subscribe button,
and you'll be able to catch up with it.
Thank you.
[MUSIC PLAYING]