  • ASHLEY: My name is Ashley.

  • I'll be your host for today.

  • We'll get this show on the road.

  • I'd like to introduce our first speaker, Josh Gordon, who's

  • on the TensorFlow team.

  • Josh is going to talk about the ease of TensorFlow 2.0

  • and will walk us through the three

  • styles of model building APIs complete with coding examples.

  • So please help me in welcoming Josh.

  • [APPLAUSE]

  • JOSHUA GORDON: Thanks so much.

  • How's it going everybody?

  • AUDIENCE: Good.

  • JOSHUA GORDON: So let me just unlock this laptop.

  • And we will get started.

  • So I have only good news about TensorFlow 2.0,

  • which is about a month old.

  • It's wonderful.

  • It is massively easier to use.

  • And it's great both for beginners and for experts.

  • And also from a teaching perspective,

  • it has a lot of things to recommend it

  • that I'll talk about too.

  • And no big deal.

  • But maybe we can close the doors in the back.

  • It's a little bit loud.

  • Thanks.

  • All right.

  • So I'm going to get into an outline in a sec.

  • But one thing I wanted to mention right off the bat

  • is TensorFlow 2.0 has all the power of graphs that we

  • had in TensorFlow 1.0 except they're massively, massively,

  • massively easier to use.

  • In TensorFlow 1.0, the name "TensorFlow"-- a tensor

  • is basically a fancy word for an array.

  • So a scalar's a tensor.

  • A list is a tensor.

  • A cube is a tensor.

  • And flow refers to a data flow graph.

  • And in TensorFlow 1.0, you would manually define a graph.

  • And you would execute it with a session.

  • And this felt a little bit like metaprogramming.

  • And this is exactly the system you

  • would have wanted several years back

  • if you were an engineer and the challenge you faced was

  • massively distributed training.

  • That's great for that.

  • However, as a developer or a student or as a researcher,

  • you want something that feels a lot more like Python.

  • And on the right, you're seeing how TensorFlow 2.0 looks.

  • Basically, you can think of a tensor in a very similar way

  • to a NumPy ndarray.

  • And you can work with them imperatively,

  • exactly as you would expect in Python.

  • So you no longer need to use things like sessions

  • or anything like that.

  • And it works as you would expect, which is great.
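
A minimal sketch of the imperative style described here, assuming TensorFlow 2.x is installed. The values and the 1.5 threshold are made up for illustration; the point is only that ordinary Python control flow works on tensors with no session.

```python
import tensorflow as tf

# In TensorFlow 2.x, operations run immediately (eager execution).
x = tf.constant([1.0, 2.0, 3.0])
total = tf.constant(0.0)

# Plain Python loops and conditionals work directly on tensor values.
for value in x:
    if value > 1.5:
        total += value

print(float(total))  # 5.0 (2.0 + 3.0)
```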

  • But that's low-level details.

  • And here's some of the things I'd like to talk about.

  • So this is a rough schematic of how TensorFlow 2.0 looks.

  • And TensorFlow 2.0 is a very, very large system

  • with many moving pieces.

  • It's a whole framework for doing machine learning.

  • And what I'd like to do here is show you a couple of the pieces

  • and what some of your options are to use it.

  • So we will start with designing models using my favorite API

  • of all time, which is Keras.

  • And what's awesome in TensorFlow 2.0--

  • and this is a really important point--

  • there is a spectrum of use cases.

  • And all of these are built into the same framework.

  • And you can mix and match as you go.

  • And what this means is if you're a total novice

  • to deep learning, you can start with something

  • called the Sequential API, which is

  • by far the easiest and clearest way

  • to develop deep learning models today.

  • And it's wonderful.

  • You can build a stack of layers.

  • You can call things like compile and fit.

  • And that is 100% valid TensorFlow 2.0 code.

  • It is just as fast as any other way of writing code.

  • There are no downsides to it at all

  • if your use case falls into that bucket.

  • And what's really important in TensorFlow 2.0 is that, when it's helpful

  • to you, you can optionally scale up in complexity

  • using things like the Functional API

  • or going all the way to subclassing

  • all in the same framework.

  • And you can mix and match as you go.

  • And what this means is when I'm teaching this, for example,

  • I can start in these really simple, clear, easy ways.

  • And then when I want to write gradient descent from scratch,

  • I can do that.

  • Or write custom layers from scratch, I can do that.

  • It's very easy.

  • So let's take a look at what some of these pieces look like.

  • And I only have one slide on sequential.

  • And the reason is we have so many tutorials on it.

  • There's an entire book on it, which

  • I'll recommend at the end.

  • So I've just one slide on this.

  • But in case you're new to this, the Sequential API

  • lets you define a stack of layers.

  • And this is by far the most common way

  • to build your models.

  • And something like, you know, I'd

  • say 80% to 90% of machine learning models

  • will fit into this framework.

  • So you define a stack of layers.

  • And that's great.

  • What's interesting is when you're using the Sequential

  • API and the Functional, although a lot of developers

  • don't realize this, what you're actually doing

  • is defining a data structure.

  • And this means you can do things like model.summary

  • and see a printout of all the layers and all the weights.

  • It also means that we can do compile time checks.

  • So when you call model.compile, we

  • can make sure all your layers are compatible.

  • It also means that when you share your model

  • with other people, for example, imagine

  • that you want to do fine tuning for transfer learning.

  • If you have a model that's defined with a Sequential

  • API or the Functional API, because it

  • has a data structure of a stack of layers or, with a Functional

  • API, a graph of layers, you can inspect that data structure

  • and you can pull layers out of it and get the activations.

  • And you can do things like fine-tuning really easily.
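
A small sketch of what "the model is a data structure" can look like in practice, assuming TensorFlow 2.x. The layer sizes and the 4-feature input are illustrative assumptions, not taken from the slide.

```python
import tensorflow as tf

# A stack of layers; the input shape is stated explicitly.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Print every layer with its output shape and parameter count.
model.summary()

# Because the stack is inspectable, a layer can be pulled out and reused.
# Here a second model returns the activations of the first Dense layer.
extractor = tf.keras.Model(inputs=model.inputs, outputs=model.layers[0].output)
activations = extractor(tf.zeros((2, 4)))
print(activations.shape)
```

The same pattern, applied to a pretrained model, is one common starting point for transfer learning.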

  • Anyway, defining a stack of layers is very common.

  • In TensorFlow 2.0, this works exactly

  • like it does on Keras.io with multi-backend Keras.

  • So that's great.

  • One thing that's also extremely powerful that a lot of people

  • are new to is the Functional API.

  • And the Sequential API is for building stacks.

  • The Functional API is for building DAGs,

  • or directed acyclic graphs.

  • And I just want to show you how powerful this is.

  • So to be honest, most of what I've been doing myself

  • is using either the Sequential API

  • or going all the way to subclassing and just

  • write everything from scratch.

  • But I heard a really awesome talk on the Functional API

  • a couple weeks ago in Montreal.

  • And I've been using it a lot since.

  • And I love it.

  • So I just want to show you what it can do.

  • And so I just want to show you what a quick model would

  • look like for something like visual question answering.

  • And a lot of the time, when you start with machine learning,

  • you spend a lot of your time-- or deep learning,

  • rather-- you spend a lot of your time building image classifiers

  • and doing things like cats and dogs.

  • But we can take a look at a slightly more sophisticated

  • model.

  • And this is VQA.

  • And in VQA, you're given two inputs.

  • One, you're given an image.

  • So here we have a pair of dogs.

  • And you're given a question in natural language.

  • And here, the question is asking, what color

  • is the dog on the right?

  • And so to answer a question like this,

  • you need a much, much, much more sophisticated model

  • than just an image classifier.

  • You can still phrase this as a classification problem.

  • And if you Google for VQA, there's

  • two really excellent papers that will go into detail.

  • But you can imagine you have some model--

  • well, let's talk about how we would do this.

  • Here, we have a model with two inputs.

  • We have an image and a question.

  • And if you take a machine learning course or deep

  • learning course, rather, you'll learn

  • about processing images with convolutional layers

  • and max pooling layers.

  • And you'll learn about processing text with things

  • like LSTMs and embeddings.

  • And one thing that's really powerful about deep learning

  • is all of these layers, regardless of what they are,

  • they take vectors as input.

  • And if you're a dense layer, you don't

  • care if your input happens to be the output

  • of some convolutional layer or if your input happens

  • to be the output of some LSTM.

  • It's just numbers that you're taking as input.

  • So in deep learning, there's no reason

  • that we can't process the image with the CNN,

  • process the text with an LSTM, and then concatenate the result

  • and feed that into a dense layer and phrase

  • this as a classification problem.

  • So you can imagine the output of our dense layer

  • might have 1,000 different classes.

  • And each class corresponds to one possible answer.

  • So here, if the answer is golden,

  • we want to classify both of these inputs jointly as golden.

  • And I want to show you how quickly we can design something

  • like this with the Functional API, which is really amazing.

  • So, cool, I probably should've flipped to this slide.

  • But this is what I was just talking through.

  • So this is the architecture that we want.

  • This is one model.

  • And it's going to have two heads.

  • And the first head is going to be a standard stack of CNNs

  • and max pooling layers.

  • And this is exactly the same model

  • you would use to classify cats and dogs.

  • And you can do all the same tricks

  • that you will learn about there.

  • You can import, like here, I want to show you

  • how to write it from scratch.

  • But there's no reason that you couldn't import like MobileNet

  • v whatever, and use that to get activations for the images.

  • But basically we're going to go from an image to a vector.

  • And in the other head, we're going to process the question.

  • And we're going to go from a question to a vector.

  • And to do that, we use an embedding and an LSTM.

  • At the end, we can concatenate the results and classify it.

  • And this is nearly the complete code for this entire VQA model.

  • Actually it is the complete code for the model-- a Hello World

  • version of it, which is nuts when you look at it.

  • So here, this is our image classifier.

  • And you would want something much deeper.

  • But this would be your Hello World image classifier.

  • So a vector is going to go in.

  • And just some random tips.

  • I can see my slides.

  • That's cool.

  • So you'll notice in the first layer,

  • this is a convolutional layer.

  • It has 64 filters, each of which is 3 by 3.

  • And it has relu activation.

  • And you see in the input shape there,

  • I'm specifying how large my image is.

  • Although you'll find that Keras can often

  • infer the input shape, whenever you can you should specify it.

  • And it's just one less thing that can go wrong.

  • So fully specify it, catch bugs early.

  • After that we're doing max pooling.

  • And the important part is we're flattening it.

  • So that's actually a sequential model

  • that we're using inside the Functional API.

  • After that, we're creating an input layer.

  • And this is for the Functional API.

  • And we're beginning to chain layers

  • together to build up a graph.

  • So here what we're doing is we're chaining the vision model

  • to the input layer.

  • So that's the first half of our model.

  • And here's the second half.

  • And we're almost done.

  • So this is the model that's going to process the question.

  • Here we're creating another input.

  • And I don't have the preprocessing here.

  • But you can imagine that we've tokenized the text.

  • And we vectorized it.

  • We've padded it.

  • And then what we're doing is we're

  • feeding that into an embedding and then into an LSTM.

  • And this is exactly what you might

  • do if you were training a text classifier.

  • The important thing is that a vector is coming out,

  • and we're chaining these together to build a graph.

  • And at the very end--

  • and this is the magic bit about deep learning-- we can simply

  • concatenate the results.

  • Nothing simple, but it's one line of code, which is nice.

  • Nothing simple conceptually.

  • But what we can do is we can concatenate the results.

  • And now we just have a vector.

  • And just like any other problem, now that we have this vector,

  • we can feed it into dense layers, and we can classify it.

  • And so here's the tail of our model.

  • And now we have a TensorFlow 2.0 model that will do VQA.

  • And this will work exactly like any other Keras model.

  • So if you want, you can call model.fit on this thing.

  • You can call model.train_on_batch.

  • You can use callbacks.

  • If you want, you can write a custom training

  • loop using GradientTape.
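
The two-headed model walked through above can be sketched roughly as follows. This is a reconstruction under assumptions, not the code from the slide: the 64 filters of size 3 by 3, the relu activation, and the 1,000 answer classes come from the talk, while the 32x32 image size, the 20-token question length, the 10,000-word vocabulary, and the layer widths are made up for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Head 1: a small Sequential CNN that maps an image to a vector.
vision_model = tf.keras.Sequential([
    layers.Conv2D(64, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Flatten(),
])
image_input = tf.keras.Input(shape=(32, 32, 3))
encoded_image = vision_model(image_input)  # chain the CNN to the input

# Head 2: tokenized, padded question -> embedding -> LSTM -> vector.
question_input = tf.keras.Input(shape=(20,), dtype="int32")
embedded = layers.Embedding(input_dim=10000, output_dim=64)(question_input)
encoded_question = layers.LSTM(64)(embedded)

# Tail: concatenate both vectors and classify over 1,000 possible answers.
merged = layers.concatenate([encoded_image, encoded_question])
hidden = layers.Dense(128, activation="relu")(merged)
output = layers.Dense(1000, activation="softmax")(hidden)

vqa_model = tf.keras.Model(inputs=[image_input, question_input], outputs=output)
vqa_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Optional rendering of the layer graph (needs pydot and Graphviz installed):
# tf.keras.utils.plot_model(vqa_model, "vqa_model.png", show_shapes=True)
```

Training would then call `vqa_model.fit([images, questions], answers)` with arrays of matching shapes.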

  • And so I think this is really powerful.

  • And what's nice about these Functional APIs,

  • just like sequential models, because there's

  • a data structure behind the scenes, there's a graph.

  • TensorFlow 2.0 can run compatibility checks

  • and make sure your layers work with each other.

  • So it's really, really nice.

  • So basically, if you haven't used the Functional API,

  • either you just learned about Keras Sequential

  • from a lot of books or you're coming from PyTorch

  • and you've only used things like subclassing,

  • I really encourage you to try it out.

  • I've had only positive results in the last few weeks.

  • So I love it.

  • Other things, of course, so that graph

  • I just made in Google Slides.

  • But because you have a data structure,

  • instead of calling something like model.summary,

  • you can call plot_model.

  • And you'll actually get a nice rendering

  • of a graph that looks exactly like what I just showed you.

  • So for complicated models--

  • so this is cool.

  • The only time I found it really useful

  • is when you have complicated models, things like ResNets,

  • you can actually plot out the whole graph

  • and just make sure that it looks as you

  • expect as you're assembling it.

  • So it's really nice.

  • Anyway, and then there's another style in TensorFlow 2.0.

  • So the last two things I showed you

  • were built into what you'll find at Keras.io,

  • and that's multi-backend Keras.

  • Keras in TensorFlow 2.0 is a superset

  • of what you find in Keras.io.

  • And this is something that's new.

  • So this is subclassing.

  • And this is a Chainer/PyTorch style

  • of developing models, which is also really, really nice.

  • And basically you'll see this little spectrum here.

  • You're getting increasing control as you move up.

  • And so in this style, you can be a researcher

  • or a student learning this for the first time.

  • But what you're saying is, I just

  • want to write everything from scratch to learn how it works

  • or because I have some special use case that doesn't

  • fit into the other ones.

  • So here what we're doing is we're

  • defining a subclass model.

  • And I also really love this.

  • This feels a lot like object-oriented NumPy

  • development.

  • So what we're doing-- and the idea is basically--

  • it's very, very similar in all these frameworks.

  • The framework gives you a class.

  • And here this class happens to be model.

  • And there's two chunks to writing a subclass model.

  • You have the constructor and the call method

  • or the forward method or the predict method.

  • And in the constructor, you define your layers.

  • So here I'm creating a pair of dense layers.

  • And it's the exact same layers that you'd

  • find in the Sequential and Functional model APIs,

  • which is great.

  • So basically, you learn these layers once,

  • you can use them all over the place.

  • And in the call method or the forward method,

  • you describe how these layers are chained together.

  • So here I have some inputs.

  • And I'm feeding the inputs through my dense layer.

  • And then I'm feeding that result through my second dense layer

  • and returning it.

  • And what's nice is this is not symbolic.

  • So if you're curious what x is, you can just

  • do print(x), of course, like you would in Python,

  • and that will give you the activations

  • of that first dense layer.

  • If you want you can modify it.

  • So for example, here I've highlighted relu.

  • Let's say for some reason I'm not

  • interested in using the built-in relu activation.

  • I want to write my own.

  • What you can do simply is just remove that.

  • And you can write your own with just regular Python flow right

  • there.

  • So there, I've written relu using the built-in method.

  • But I could also just write that using regular Python.

  • So this is great for hacking on things.

  • It's great.

  • If you want to really know the details of what exactly

  • is flowing in and out of these layers,

  • it's a perfect way to do it.

  • Also for example, there's nothing here

  • that's saying that you have to use these built-in dense layers.

  • So if you look at the code for the dense layer, that's

  • doing something like wx plus b, you

  • can absolutely just write that from scratch in Python

  • and use that here instead.
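
A sketch of the subclassing style, assuming TensorFlow 2.x. `MyDense` and `MyModel` are hypothetical names; the hand-written layer computes x times w plus b, and the relu is written by hand with `tf.maximum` instead of the built-in activation. Layer widths are illustrative.

```python
import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
    """Hypothetical from-scratch dense layer: inputs @ w + b."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created once the input size is known.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

class MyModel(tf.keras.Model):
    def __init__(self, num_classes=10):
        super().__init__()
        # The constructor defines the layers.
        self.dense_1 = MyDense(32)
        self.dense_2 = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs):
        # The call method describes how the layers are chained.
        x = self.dense_1(inputs)
        x = tf.maximum(x, 0.0)  # hand-written relu
        return self.dense_2(x)

model = MyModel()
out = model(tf.random.normal((2, 8)))
print(out.shape)
```

Adding `print(x)` inside `call` shows concrete activation values when the model runs eagerly.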

  • And then these are just a whole bunch of references.

  • I'm going to tweet out these slides when we finish so you

  • don't have to write it down.

  • But what these are, these are guides

  • from TensorFlow 2.0 that will go into detail on how you write--

  • how do you use each of these three styles of the Keras API?

  • How do you write custom layers and stuff like that?

  • They're great.

  • And then we have a couple of tutorials

  • that I recommend that are really, really nice

  • that show different ways of building stuff.

  • If you haven't seen these, by the way,

  • the segmentation one is super-nice.

  • We wrote it this summer.

  • It runs really, really fast.

  • And I think you'll like it.

  • Also, by the way, because this is an Intro to TensorFlow 2.0

  • talk, let me just show you what the tutorials are

  • in case you're new to these.

  • And there's just one thing I want to point out.

  • So this is the tutorial on our website.

  • Big surprise.

  • What I wanted to mention, and I think

  • this is a really nice feature of TensorFlow.org,

  • obviously this is a web page HTML.

  • But this is just a direct rendering

  • of this Jupyter Notebook.

  • So the web page is just the Jupyter Notebook.

  • And the reason we've done that is all the tutorials

  • are runnable end to end.

  • So you can install TensorFlow 2.0 locally

  • and run this tutorial.

  • Or if you click run in Colab, this is exactly the same page

  • as you have on TensorFlow.org.

  • For all of these, you can do Runtime, Run all.

  • And this has the complete code to reproduce the results

  • that you see here.

  • And what this means is that our tutorials are testable.

  • And if you know the expression, trust but verify?

  • So for a long time, I've seen nifty tutorials

  • that are like, yeah, like let me show you

  • how to write this like neural machine translation model.

  • And then the code doesn't work or it's

  • missing a key piece that's left as an exercise to the reader.

  • So at least all the tutorials on the website, all of them

  • run end to end, which is really, really nice.

  • For some, we still have plenty of work to do cleaning them up.

  • But at least we guarantee you they have the complete code

  • to do the thing.

  • So I really like that a lot.

  • All right.

  • I wanted to talk a little bit about training models.

  • And so basically, there are several ways to train models.

  • And again, you can use-- the nice thing

  • about TensorFlow 2.0 is you can use the one that's

  • most helpful for your use case.

  • So you don't always need to write a custom training

  • loop from scratch.

  • So you have other options.

  • And the first, which you might be familiar with from Keras,

  • is just simply calling model.fit.

  • And what's really nice about model.fit,

  • it doesn't care if you have a sequential model, a functional

  • model, or a subclass model.

  • It works for all of them.

  • And model.fit-- it's fast.

  • It's performant.

  • It's simple.

  • One thing that's a little bit less obvious,

  • when you do model.fit, this is not just

  • the baby way of training models.

  • So if you're working in a team and you call model.fit,

  • you've reduced your code footprint by a lot.

  • This is one less thing that your friends

  • need to worry about when they're playing with your models

  • down the road.

  • So if you can use the simple things,

  • you should unless there's a reason for more

  • complexity, of course, just like in regular software

  • engineering.

  • The nice thing about fit is you can pass in different metrics.

  • In a lot of examples, you'll see things like accuracy.

  • By the way, TensorFlow 2.0 has really nice metrics

  • for things like precision and recall built in.

  • You can also write custom metrics.

  • And something that's really helpful too is callbacks.

  • So for instance, these are things

  • that I don't see a lot of new developers using.

  • And they're super helpful.

  • So callbacks, and one of my favorites is EarlyStopping.

  • And so typically, when we're training models,

  • we need to prevent overfitting.

  • And a really wonderful way to do that

  • is to make plots of your loss over time

  • and so on and so forth.

  • These callbacks can do things like that for you

  • automatically, which can be helpful as well.

  • You can also write custom callbacks.

  • So a cool thing would be like let's say

  • you're training a model that takes

  • a very long time to train.

  • You could write a callback to send you a Slack notification

  • after every epoch of training completes.

  • And so that can be really nifty too.

  • So callbacks are great.
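
A sketch of `model.fit` with built-in metrics, `EarlyStopping`, and a custom callback, assuming TensorFlow 2.x. The synthetic data and the `EpochLogger` class are assumptions for illustration; a real notification callback would post to a chat webhook where this one just records the epoch.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary classification data (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = (x.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)

class EpochLogger(tf.keras.callbacks.Callback):
    """Hypothetical custom callback that runs after every epoch."""

    def __init__(self):
        super().__init__()
        self.epochs_seen = []

    def on_epoch_end(self, epoch, logs=None):
        # A notification could be sent here instead of printing.
        self.epochs_seen.append(epoch)
        print("finished epoch", epoch, "val_loss:", (logs or {}).get("val_loss"))

# Stop when validation loss has not improved for 2 epochs in a row.
early_stopping = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=2)
logger = EpochLogger()

history = model.fit(x, y, validation_split=0.2, epochs=5,
                    callbacks=[early_stopping, logger], verbose=0)
```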

  • And then I don't have slides for train_on_batch here.

  • But I did want to show you custom training

  • with a GradientTape because this is also very powerful,

  • especially for students who are learning this

  • for the first time and don't want a black box

  • or for researchers.

  • And so here is a custom training loop.

  • And I have an example of this for you in a minute

  • just with linear regression so you

  • can see exactly how this works.

  • But this is a custom training loop.

  • And what we're doing here is we have some function.

  • And for now you can pretend that @tf.function in the orange box

  • doesn't exist.

  • So that's optional.

  • So just pretend that doesn't exist.

  • We have some function that's taking features and labels

  • as input.

  • And whenever we're doing training in deep learning,

  • we're doing gradient descent.

  • The first step in doing gradient descent

  • is getting the gradients.

  • And the way all frameworks do this is by backprop,

  • which is reverse-mode autodiff.

  • And the implementation in TensorFlow

  • is we start recording operations on a tape.

  • So here we're creating a tape.

  • And we're recording what's happening beneath that tape.

  • So we have just regular Python code

  • that's calling the forward method or the call

  • method, rather, on your model.

  • So we forward the features through our model.

  • And we're computing some loss.

  • And maybe we're doing regression.

  • And that's squared error.

  • And then what we're doing is we're

  • getting the gradients of the loss with respect

  • to all the variables in the model.

  • And if you print those out, you'll

  • see exactly what the gradients are.

  • And then here, we're doing gradient descent manually.

  • We're applying them on an optimizer.

  • We can also write our own optimizer.

  • And I'll show you that in a second.

  • Anyway, this is a custom training loop from scratch.
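
The loop described above might look roughly like this, assuming TensorFlow 2.x. The one-layer regression model, the synthetic data, and the learning rate are assumptions; the `@tf.function` decorator from the slide is left off because it is optional.

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

def train_step(features, labels):
    # Record the forward pass on a tape so gradients can be computed.
    with tf.GradientTape() as tape:
        predictions = model(features, training=True)
        loss = loss_fn(labels, predictions)
    # Gradients of the loss with respect to every trainable variable.
    gradients = tape.gradient(loss, model.trainable_variables)
    # One step of gradient descent, applied by the optimizer.
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss

# Synthetic regression data: the label is the sum of the three features.
features = tf.random.normal((64, 3))
labels = tf.reduce_sum(features, axis=1, keepdims=True)

losses = [float(train_step(features, labels)) for _ in range(50)]
print(losses[0], losses[-1])
```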

  • And what this means is that if you--

  • so in model.fit, you can use optimizers

  • like RMSprop and Adam and all this stuff.

  • But if you'd like to write like the [? Sarah ?] optimizer,

  • you can go ahead and write it in Python.

  • And it will fit right in with your model.

  • So this is great for research.

  • tf.function, by the way, first of all,

  • you never need to write it.

  • Your code will work the same.

  • But if you do want a graph in TensorFlow 2.0 or if you--

  • basically, if you want to compile your code

  • and have it run faster, you can write that tf.function

  • annotation.

  • And what this means is that TensorFlow 2.0 will trace

  • your computation and compile it, and from the second time on that

  • you run this function, it will be much, much,

  • much faster because it's running entirely in C++.

  • So all there is to graphs in TensorFlow 2.0 is basically

  • @tf.function.

  • But it's optional.

  • You don't even need to use it.

  • But it's easy performance if you need it.
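
A tiny sketch of the decorator, assuming TensorFlow 2.x. `scaled_sum` is a hypothetical function; the Python `print` is there only to show that tracing happens once for a given input signature. How much speedup you get depends on the workload.

```python
import tensorflow as tf

@tf.function
def scaled_sum(x, y):
    # Python side effects such as print run only while tracing.
    print("tracing")
    return tf.reduce_sum(x * y)

a = tf.constant([1.0, 2.0, 3.0])
b = tf.constant([4.0, 5.0, 6.0])

first = scaled_sum(a, b)   # traces the function, then runs the graph
second = scaled_sum(a, b)  # reuses the traced graph; "tracing" is not printed again
print(float(first), float(second))  # 32.0 32.0
```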

  • And then I just want to make this super concrete

  • because this is a Getting Started with TensorFlow 2.0

  • talk.

  • There's a lot of awesome tutorials

  • that will quickly show you on the website

  • how to train image classifiers and whatnot.

  • But I think a good place to start too

  • is just looking at linear regression.

  • And the reason is it's gradient descent.

  • All deep neural networks are trained by gradient descent.

  • And a nice place to start is seeing exactly what that is.

  • And because I have a--

  • I know it's tiny.

  • But because I have a graphic here on the left,

  • I'm just briefly going to explain

  • how linear regression works.

  • And it's the same pattern for deep neural networks

  • too, which is really surprising.

  • So in linear regression or deep neural networks,

  • you need three things.

  • The first thing you need is a model,

  • which is a function that makes a prediction.

  • And so a model for linear regression,

  • you might have learned in high school, could be y

  • equals mx plus b.

  • We're trying to find the best fit line.

  • And we can define a line y equals mx plus b.

  • That means we have two parameters or variables

  • that we need to set.

  • We have m, which is the slope, right?

  • And we have b, which is the intercept.

  • And by wiggling those variables, we

  • can fit the line to our data.

  • So on the right, you'll see a plot.

  • And we have a scatter plot with a bunch of points.

  • And we have the best fit line.

  • And the idea is now that we have a model or a line

  • that we can wiggle, we need a way of saying or quantifying,

  • how well does this line fit the data?

  • One way to quantify how well the line fits the data

  • is squared error.

  • What that means is you drop a line on the page.

  • And then you measure the distance

  • from your line to all the points.

  • And you take the sum of the squares of that.

  • The higher the sum of the squares is,

  • the worse your line fits the data.

  • The better your line fits the data,

  • the lower the sum of the squares.

  • So you can have a single number, which

  • is called loss, that describes how badly your line fits

  • the data.

  • And then you want to reduce that loss.

  • And you know that if the loss gets to a minimum,

  • your line will fit the data well.

  • And you found the best fit line.

  • The way we reduce the loss is gradient descent.

  • And on the left, you'll see a gradient descent plot.

  • And we're looking at loss or a squared error

  • as a function of two variables-- m and b.

  • And you can see that if we set m and b with a random guess

  • to start, our loss is pretty high.

  • And then as we wiggle them, we can reduce the loss.

  • The trick is, how do we figure out

  • which way to wiggle m and b?

  • And briefly-- I don't want to go on too much of a tangent--

  • there's two ways to do that.

  • If you forget calculus, you can find the gradient numerically.

  • And it's not rocket science.

  • You take m, and you wiggle it up a little bit,

  • recompute your loss.

  • Then you take m, and you wiggle it down a little bit.

  • And you recompute your loss.

  • You figure out which way makes the loss go down.

  • That's the direction you're going to be wiggling m.

  • Do the same thing for b.

  • That's very, very slow.

  • And there's faster ways to do it too.
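
The "wiggle it up, wiggle it down" idea can be sketched in plain NumPy as a central finite difference. The data, the step size `eps`, and the function names are assumptions; the loss here is the mean rather than the sum of squared errors, which only rescales the gradient.

```python
import numpy as np

def squared_error(m, b, x, y):
    # Mean squared distance between the line m*x + b and the points.
    return np.mean((m * x + b - y) ** 2)

def numerical_gradient(m, b, x, y, eps=1e-4):
    # Nudge each parameter up and down and see how the loss changes.
    dm = (squared_error(m + eps, b, x, y) - squared_error(m - eps, b, x, y)) / (2 * eps)
    db = (squared_error(m, b + eps, x, y) - squared_error(m, b - eps, x, y)) / (2 * eps)
    return dm, db

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0  # points that lie exactly on y = 2x + 1

dm, db = numerical_gradient(0.0, 0.0, x, y)
print(dm, db)  # approximately -17.0 and -8.0
```

Both gradients are negative at the starting guess, so increasing m and b lowers the loss. This needs two extra loss evaluations per parameter, which is why it is slow for models with many parameters.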

  • But I want to show you what this code looks

  • like in TensorFlow 2.0.

  • So basically, you don't have to use Keras at all.

  • You can also use TensorFlow 2.0 a lot like you would use NumPy.

  • And basically, whenever you see something like tensor,

  • just replace that in your head with NumPy ndarray.

  • So we have constants.

  • And you see as you print out a constant,

  • it has shape and a data type.

  • One really nice thing about TensorFlow tensors

  • is they have a NumPy method.

  • So you can go straight from tensors to NumPy,

  • which is great.

  • So you're free of the clutches of TensorFlow 2.0.

  • Tensors have a shape and a data type.

  • And then just like you would expect in NumPy,

  • we have things like distributions.

  • So if you want to create some random normal,

  • here's how you would do it.

  • I just want to fly through this really quick.

  • And you can do math in TensorFlow 2.0 a lot

  • like you would do math in NumPy.

  • So basically, just like you have things like numpy.square

  • and numpy.matmul, TensorFlow

  • has all these same things too.

  • And the idea is the same.

  • The names might be slightly different.

  • You might have to poke around a little bit.

  • But the names are all there.
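
A short sketch of the NumPy-like basics mentioned here, assuming TensorFlow 2.x. The values and shapes are illustrative.

```python
import numpy as np
import tensorflow as tf

# A constant tensor has a shape and a data type, much like an ndarray.
t = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(t.shape, t.dtype)

# Go straight from a tensor to a NumPy array.
as_numpy = t.numpy()

# Sample from a normal distribution.
noise = tf.random.normal(shape=(3, 2), mean=0.0, stddev=1.0)

# Math mirrors NumPy, sometimes under slightly different names.
squared = tf.square(t)     # counterpart of np.square
product = tf.matmul(t, t)  # counterpart of np.matmul
print(squared.numpy())
print(product.numpy())
```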

  • And here's a very, very simple example.

  • Here's how we get the gradients using the GradientTape.

  • But this is more concrete.

  • So here we have a constant.

  • That's 3.

  • And we have a function, which is x squared.

  • And so if you think about your rules of calculus,

  • if we have x squared and x is 3, you take the 2.

  • And you multiply it by the 3.

  • And you get 6.

  • And if you walk through this code,

  • you'll see that it returns you 6 too.

  • So basically, this is how we get the gradients using

  • GradientTape.
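
The x squared example can be sketched as follows, assuming TensorFlow 2.x. Because `x` is a constant rather than a variable, the tape has to be told to watch it.

```python
import tensorflow as tf

x = tf.constant(3.0)

with tf.GradientTape() as tape:
    tape.watch(x)  # constants are not tracked automatically
    y = x * x      # y = x squared

# dy/dx = 2x, which is 6 at x = 3.
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())  # 6.0
```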

  • And you can also do that with all the variables and layers

  • at once.

  • So here we have a pair of dense layers.

  • And we're calling the dense layers on some data.

  • And we're getting the gradients also under the tape.

  • And let me just show you what this looks

  • like in linear regression just to make this concrete.

  • So this is code for y equals mx plus b.

  • And the first thing I wanted to mention

  • is how you install TensorFlow 2.0.

  • If you're running in Colab--

  • right now in Colab, TensorFlow 1.0 is installed by default.

  • But there's a magic flag you can run.

  • So that magic command will give you the latest

  • version of TensorFlow 2.0.

  • If you're running this locally, you can visit

  • Tensorflow.org/install.

  • And you can do pip install tensorflow.

  • So anything you can do in Colab, you can do locally.

  • But I have this here.

  • This is convenient.

  • TensorFlow is just a Python library.

  • You can import as you always would.

  • And the first thing we do in this notebook--

  • I know I flew through some of those code examples--

  • but what we're going to do is just create a scatterplot,

  • so just some random data.

  • And then we're going to find the best fit line.

  • So here what we're doing is we're creating some data.

  • Let me see if I've run this before.

  • Yeah.

  • And then we're plotting the data.

  • And here's what we get.

  • The slides had TensorFlow constants.

  • And these are TensorFlow variables.

  • And basically, constants are constant,

  • and variables can be adjusted over time.

  • You almost never need to write code this low level.

  • This is pretending that we don't have Keras.

  • We don't have any built-in fit methods.

  • We just want to do this from scratch.

  • But here's how you would do it from scratch.

  • So I'm creating some variables.

  • Here I've initialized them to 0.

  • I probably should have initialized them

  • to a random number.

  • But this will work too.

  • And then here, this doesn't look scary at all.

  • This is the predict function for linear regression.

  • So this is our equation for a line y equals mx plus b.

  • And our goal is going to be to find good values for m and b.

  • Here's our loss function.

  • And what we're doing is we're taking the results

  • that we predicted minus the results that we wanted.

  • We're squaring it.

  • And we're taking the average.

  • So that's squared error.

  • And then as we go through this notebook,

  • we can see our squared error when we start.

  • And here's gradient descent from scratch

  • pretending that we didn't have anything like model.fit.

  • So for some number of steps, what we're doing is

  • we're taking our x's, and we're forwarding through the model

  • to predict our y's.

  • We're getting the squared error, which is a single number.

  • And then we're getting the gradients of the loss

  • with respect to m and b.

  • And this will literally tell us-- if you print those out,

  • those are just numbers.

  • And the gradients point in the direction of steepest ascent.

  • So if we move in the direction of gradients,

  • our loss will increase.

  • We want the loss to decrease.

  • So we move in the reverse direction of the gradient,

  • which is gradient descent.

  • And here, again, this is like the lowest possible level

  • way to write this code.

  • We're doing gradient descent from scratch.

  • So we don't have any optimizer.

  • What we're doing is we're taking a step

  • in the negative gradient multiplied by our learning

  • rate.

  • And that adjusts m and b as we go.

  • And if you run this code, you'll see the losses decreasing.

  • And you'll see the final values for m and b

  • and plot the best fit line.
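
The steps walked through above can be sketched roughly as follows. This is not the notebook from the talk: the synthetic data, the true slope and intercept (3 and 2), the learning rate, and the step count are all assumptions chosen so the sketch runs on its own. Plotting is left out.

```python
import numpy as np
import tensorflow as tf

# Synthetic scatterplot data around the line y = 3x + 2 (illustrative values).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 101).astype(np.float32)
y = (3.0 * x + 2.0 + rng.normal(0.0, 0.1, size=x.shape)).astype(np.float32)

# Variables can be adjusted over time; both start at 0 as in the talk.
m = tf.Variable(0.0)
b = tf.Variable(0.0)

def predict(x):
    # Equation for a line: y = mx + b.
    return m * x + b

def squared_error(y_pred, y_true):
    # Predicted minus wanted, squared, then averaged.
    return tf.reduce_mean(tf.square(y_pred - y_true))

learning_rate = 0.1
losses = []
for step in range(200):
    with tf.GradientTape() as tape:
        loss = squared_error(predict(x), y)
    # Gradients of the loss with respect to m and b.
    grad_m, grad_b = tape.gradient(loss, [m, b])
    # Step in the negative gradient direction, scaled by the learning rate.
    m.assign_sub(learning_rate * grad_m)
    b.assign_sub(learning_rate * grad_b)
    losses.append(float(loss))

print(f"m = {m.numpy():.2f}, b = {b.numpy():.2f}, final loss = {losses[-1]:.4f}")
```

With these settings the fitted `m` and `b` should land close to the slope and intercept used to generate the data, and the recorded losses should shrink over the run.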

  • And then what I did for you is I wrote--

  • this is a little bit uglier.

  • But I wrote some code to produce this diagram

  • just so you can see exactly what the gradient descent is doing.

  • So that's how you would write things

  • from scratch in TensorFlow 2.0.

  • And what's really, really awesome,

  • when you move to things like neural networks,

  • this code is basically copy and pasted.

  • So if you can compare this custom training

  • loop for linear regression to the custom training

  • loop for DeepDream or any of the fancy models on the website,

  • it looks almost identical.

  • You always have this tape, and the steps are the same.

  • You make a prediction.

  • You get your loss.

  • You get the gradients.

  • And you go from there.

  • So that's really nice.

  • All right.

  • Other things I wanted to mention.

  • Oh, I got to move a lot faster here.

  • So in terms of data sets, basically

  • you have two options in TensorFlow 2.0.

  • The first option is at the top.

  • And these are all the existing Keras data

  • sets that you find at Keras.io.

  • And these are great.

  • They're good to start with.

  • They're in NumPy format.

  • And they're usually really tiny.

  • They fit into memory no problem.

  • Then we have this enormous collection of research data

  • sets, which is awesome.

  • And that's called TensorFlow Datasets.

  • And here, I'm showing you how you can download something

  • like cycle_gan in TensorFlow data sets.

  • What's important to be aware of, I just

  • have a couple of quick tips.

  • If you're downloading a data set in TensorFlow data format,

  • by the way, it's going to give you

  • something called-- the data set's not going to be NumPy.

  • It's going to be in tf.data.

  • And tf.data, it's a high-performance format

  • for data.

  • It's slightly trickier to use than what you might be used to.

  • And so if you're using TensorFlow data sets,

  • you have to be very, very careful to benchmark your input

  • pipeline.

  • If you just import a data set and try and call

  • model.fit on the data set, it might be slow.

  • So it's important to take your time

  • and make sure that your data pipeline can read images

  • off disk and things like that efficiently.
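
One simple way to benchmark an input pipeline is to time a full pass over it before handing it to `model.fit`. The sketch below assumes in-memory random tensors as a stand-in for images on disk and a resize as a stand-in for preprocessing. The helper `time_one_pass` is a hypothetical name, not a TensorFlow function, and `tf.data.AUTOTUNE` assumes a reasonably recent TensorFlow 2 release.

```python
import time
import tensorflow as tf

def time_one_pass(dataset):
    """Iterate over the dataset once; return (number of batches, seconds elapsed)."""
    start = time.perf_counter()
    num_batches = 0
    for _ in dataset:
        num_batches += 1
    return num_batches, time.perf_counter() - start

# Stand-in for images read off disk: 256 random 32x32 RGB images.
images = tf.random.uniform((256, 32, 32, 3))

dataset = (
    tf.data.Dataset.from_tensor_slices(images)
    .map(lambda img: tf.image.resize(img, (64, 64)))  # stand-in preprocessing
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)

num_batches, seconds = time_one_pass(dataset)
print(f"{num_batches} batches in {seconds:.3f} s")
```

If a pass over the data alone takes about as long as a training epoch, the input pipeline is the likely bottleneck.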

  • And I have just a couple tips that might be helpful.

  • TensorFlow data sets recently added an in_memory flag.

  • So if you don't want to write fast input pipelines,

  • you can pass in_memory.

  • And you can insert the whole thing into RAM.

  • So that will make it really easy.

  • And it also added this caching function,

  • which is really, really nice.

  • So here's some code for tf.data.

  • And maybe we have an image data set,

  • and we have some code to preprocess the images.

  • And let's pretend that that preprocessing code is expensive

  • and we don't want to run it every time, on every epoch.

  • What you can do is you can add this cache line at the end.

  • Cache will keep the results of the preprocessing in memory.

  • And it will make subsequent runs of your pipeline much faster.

  • So cache is a really handy thing to be aware of.

  • The goal here is not to give you all the details for this.

  • It's just to point you to some things that

  • are useful to know about.

  • You can also cache to files.

  • So cache without any parameters will cache it into RAM.

  • If you pass a file name, you can actually

  • cache to a file on disk too.
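
A small sketch of what `cache` buys you, assuming a toy dataset of five integers. The `tf.py_function` wrapper is there only so a Python counter can show how often the "expensive" step runs; a real pipeline would not normally need it.

```python
import tensorflow as tf

calls = {"count": 0}

def expensive_preprocess(x):
    # Pretend this is costly; count how many times it actually runs.
    calls["count"] += 1
    return x * 2

def tf_preprocess(x):
    # Wrap the Python function so it can be used inside Dataset.map.
    y = tf.py_function(expensive_preprocess, [x], tf.int64)
    y.set_shape(())
    return y

# cache() with no arguments keeps the preprocessed elements in memory.
dataset = tf.data.Dataset.range(5).map(tf_preprocess).cache()

first_epoch = [int(v) for v in dataset]   # runs the preprocessing
second_epoch = [int(v) for v in dataset]  # served from the cache

print(first_epoch, second_epoch, calls["count"])
```

The preprocessing runs once per element on the first pass and is skipped on the second. Passing a file name, as in `cache("some/path")` (path hypothetical), writes the cache to disk instead of RAM.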

  • One thing that's awesome in TensorFlow 2.0 that you--

  • if you're an expert, you'll care about,

  • and if not, you'll care about down

  • the road-- is distributed training.

  • And I'm going to skip some slides to move faster.

  • What I wanted to say briefly is distributed training

  • in TensorFlow 2.0 is awesome.

  • And it's awesome because if you're doing single machine,

  • multiple GPU synchronous data parallel training

  • or you're doing multimachine, multi-GPU synchronous data

  • parallel training, you don't need

  • to change the code of your model, which

  • is exactly what I care about.

  • And so it's awesome.

  • And basically, here is some Keras model.

  • This happens to be a built-in application for ResNet.

  • But it doesn't matter.

  • And I just want to show you how we run this code on one

  • machine with multiple GPUs.

  • We just wrap it in a block.

  • That's it.

  • And model.fit is distribute aware and will work.

  • So you don't need to change your model to run on multiple GPUs.

  • And this particular strategy is called the MirroredStrategy.

  • There's different strategies you can

  • use to distribute your models.

  • There's another strategy-- it's like Mirrored MultiWorker,

  • which you can change that one line.

  • And then if you have a network with multiple machines on it,

  • again, your code doesn't change.

  • So that's awesome.

  • I really, really like the design of this.
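
A minimal sketch of the "wrap it in a block" pattern, assuming `tf.distribute.MirroredStrategy` and a tiny stand-in model rather than the ResNet application from the slide. The data is random and only there to make the call to `model.fit` runnable.

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy uses all visible GPUs on one machine,
# and falls back to a single replica when there are none.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Only model creation and compilation go inside the scope.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")

# Random stand-in data; the model code itself is unchanged.
features = np.random.rand(64, 4).astype("float32")
targets = np.random.rand(64, 1).astype("float32")
history = model.fit(features, targets, batch_size=16, epochs=1, verbose=0)
print("loss:", history.history["loss"][0])
```

For multiple machines, the strategy line is the one that changes; in recent releases the class is `tf.distribute.MultiWorkerMirroredStrategy`, which also needs cluster configuration that is not shown here.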

  • All right.

  • Other things that are awesome about TensorFlow 2.0

  • that I'd really encourage you to check out, especially

  • if you're learning or you have students,

  • is going beyond Python.

  • So we've talked about training models in Keras.

  • And I just want to show you some of the cool things you

  • can do to deploy them.

  • And roughly, there's a bunch of different ways

  • that you can deploy your models.

  • The way that I was used to a few years ago

  • as a Python developer was I'd throw up a REST API.

  • And I can do that using like TensorFlow Serving or Flask

  • or whatever you want.

  • And I'd serve the thing behind an API

  • because that's what I know how to do.

  • There's a couple of things that I've been learning since then,

  • which have been great, and that's basically

  • deploying models in the browser with TensorFlow.js,

  • deploying them on Android and iOS

  • using TensorFlow Lite, and very recently,

  • running them on Arduino using TensorFlow Lite Micro.

  • And I have a couple of suggested projects for you

  • that I just wanted to point you to.

  • So the first is tinyML.

  • And this was a blog post--

  • this was a guest article on our blog

  • a few weeks ago by the Arduino team.

  • And this article is basically a tutorial.

  • And what we're looking at is an Arduino.

  • It's a microcontroller.

  • And if you're new to microcontrollers,

  • it's a system on a chip.

  • But it also has pins that it can run voltage to

  • or read voltage from.

  • And so for instance, you can plug an LED light bulb

  • into one of those pins, and you can

  • have C code which runs voltage to the pin

  • and turns on the light.

  • Likewise, you could have an accelerometer attached

  • to one of the pins, and you could

  • have C code that reads from the accelerometer

  • and gives you some time series data.

  • So that's what a microcontroller is.

  • It's a computer plus these, basically, pins

  • that you can read and write to.

  • TensorFlow Lite is the code we used to deploy

  • TensorFlow models onto phones.

  • But recently, TensorFlow Lite Micro

  • now lets you deploy them onto Arduino.

  • And these things are smaller than a stick of gum.

  • This is a really nice one.

  • It has-- this is the Nano that has built-in sensors.

  • I have one at home.

  • But it's still about $30.

  • And it has a built-in accelerometer, a temperature

  • sensor, stuff like that.

  • But anyway, what we're looking at here,

  • this is a demo using the accelerometer. Someone trained

  • a model to recognize two gestures.

  • One is like a punch, and one is an uppercut.

  • And you can see they're holding the Arduino in their hand.

  • And as they're moving it, the laptop

  • is recognizing the gestures.

  • But the workflow-- and this is in the blog post--

  • is not bad at all considering how much power you're

  • getting out of it.

  • So what you're doing--

  • and let me see if I can show you the steps.

  • Basically, the first thing you need to do is capture the data.

  • And I wanted to bring an Arduino with me.

  • I thought it would be better just

  • to quickly show you these GIFs.

  • But you can plug the Arduino into your laptop

  • with a USB cable.

  • You need to collect training data for your model.

  • So what you do is you hold the Arduino in your hand.

  • And you collect a bunch of data for your punching gesture.

  • And you save that to disk as a time series.

  • And the diagram on the right there--

  • I was just capturing the IDE this morning--

  • as you move the accelerometer-- as you move the Arduino around,

  • you're reading from the accelerometer.

  • And what you get out is just a CSV file with time series data,

  • exactly like you would have if you were looking

  • through the time series forecasting tutorial

  • on TensorFlow.org.

  • And you can do the same thing for your other gestures.

  • So what you do is you gather data.

  • You save CSV files.

  • You upload them to Colab.

  • In Colab, you write a model using TensorFlow 2.0 in Keras

  • to classify the data.

  • And that's just a regular Python model.

  • You don't need to know anything special about TensorFlow Lite

  • to do it.

  • And then what I wanted to show you--

  • and you can find the complete code in the blog post--

  • there's a very small amount of code

  • that you need to convert your model from Python down

  • into TensorFlow Lite format to run on device.

  • And this is a tiny amount of code.

  • Once you have this model--

  • and this is a little unusual--

  • we're going to convert it to a C array.

  • And that's because this Arduino, it

  • has like 1 megabyte of disk space

  • and I think like 256 KB of RAM.

  • So we're converting it into the smallest,

  • simplest possible format.

  • But you convert it to a C array.

  • And then what you can do-- we have an example.

  • And you can paste the C array into the example.

  • And now you're running your TensorFlow model on device.
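
To make the "convert it to a C array" step concrete, here is a rough Python sketch that renders raw bytes as C source. It is not the code from the blog post. The function `bytes_to_c_array` and the array name `gesture_model` are hypothetical, and placeholder bytes stand in for a real converted model.

```python
def bytes_to_c_array(data: bytes, name: str = "model") -> str:
    """Render raw bytes as C source: an unsigned char array plus its length."""
    lines = []
    # Emit 12 bytes per line as hex literals.
    for i in range(0, len(data), 12):
        chunk = data[i:i + 12]
        lines.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk))
    body = ",\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# In practice the bytes would come from the TensorFlow Lite converter, e.g.
#   tf.lite.TFLiteConverter.from_keras_model(keras_model).convert()
# Placeholder bytes are used here so the sketch runs without a trained model.
placeholder_model = bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33])

c_source = bytes_to_c_array(placeholder_model, name="gesture_model")
print(c_source)
```

The resulting text is what gets pasted into the Arduino example as a header or source file.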

  • And it's amazing.

  • I've had so much fun doing this.

  • The reason I wanted to point it out to you: if you have kids or you

  • have students that want to play with this stuff,

  • if you teach them just to train a time series forecasting

  • model, which is a really valuable skill.

  • We have a pretty good tutorial for it.

  • I mean, it's interesting.

  • But it's vastly more interesting if they can then

  • deploy it on device.

  • It gives them something tangible that they can play with

  • and show their friends.

  • It's super cool.

  • Also this is a brand new area.

  • So tinyML referring to doing machine

  • learning on small devices I think has a lot of promise.

  • Another way to deploy your models,

  • which is super-powerful, is in JavaScript.

  • And this is using TensorFlow.js.

  • Here's another project suggestion.

  • One of the first tutorials you might run through

  • in deep learning is sentiment analysis.

  • So given a sentence, predict if it's positive or negative.

  • Also a really valuable skill, but it can be somewhat dry, right?

  • But the first time I've ever been super-excited about

  • sentiment analysis, other than many years ago when I found

  • that I can make money for it--

  • that was not the link that I wanted.

  • If you go into this GitHub repo link from the slides,

  • you will find that you can run sentiment analysis live

  • in JavaScript.

  • So for example, this is just a web page.

  • And down at the bottom here, the movie was awesome.

  • So I wrote a sentence.

  • And you can see that's predicted positive.

  • And here, you can see that's predicted negative.

  • And what's nice about doing JavaScript in the browser

  • from a user perspective is there's nothing to install,

  • which means if you're a Python developer

  • and your goal is to have a cool demo,

  • instead of throwing up a REST API, you can create a web page

  • and share it with your friends.

  • And what's nice, this model was written

  • in TensorFlow 2.0 using Keras.

  • And we have a converter script that

  • will convert it into TensorFlow.js format

  • to run in the browser.

  • So following examples, I am not a JavaScript developer at all.

  • I can get through the examples.

  • And it's not too bad.

  • So it's possible to do.

  • But it's a really good opportunity

  • if you have friends that are good JavaScript developers.

  • This is a huge collaboration opportunity

  • where you can develop models, and your friends

  • can help you deploy them in the browser.

  • Also if you haven't seen it, you can also write models

  • from scratch in JavaScript.

  • And I just wanted to show you a couple of demos here.

  • I'm almost certainly going to accidentally unplug

  • this laptop.

  • But another super-convincing thing

  • about why you might want to do JavaScript in the browser,

  • this is a model called posenet.

  • This could be the end of the presentation.

  • This is a model called posenet.

  • It's running entirely client-side in the browser.

  • So nothing's being sent to a server.

  • And it's not meant for this many people.

  • But you can see that it's starting to recognize where

  • people are in the audience.

  • And what's cool--

  • I know this is all obvious for web developers.

  • But I'm not.

  • So this is all new to me.

  • For me to do this in Python would have been a nightmare.

  • Like I would have to be streaming data

  • from the video camera, sending it to a server,

  • classifying it server-side, sending the results back.

  • There's no way I'd be able to do that in real time like that.

  • Also for privacy reasons, that would not be cool.

  • But because this is running client-side in JavaScript,

  • we can do that.

  • And so immediately, like you can see all the things

  • that we can do on top of this.

  • There's other models like that too that are just

  • really compelling.

  • And then for people looking for applications,

  • so there is sentiment analysis.

  • But a model built right on top of that in the browser

  • is just a text classification.

  • But this is multiclass text classification.

  • And this is a toxicity detector.

  • It's basically a sentiment analysis model.

  • But the idea is you're given a sentence,

  • and you want to figure out if this is like a comment

  • that you might want to post on YouTube or something like this.

  • But you can build tools like this

  • that analyze text privately, quickly, client-side.

  • So you can imagine, for example, if you

  • had a job as like a Wikimedia moderator

  • and you wanted to take a look at article edits

  • to see if they were something you wanted to publish or not,

  • you might spend a lot of your time looking

  • for toxic comments.

  • With something like this, you could very quickly

  • preprocess-- you could immediately have code

  • that right in the web page highlights

  • the bad parts of the article.

  • And I need to move a little bit quicker.

  • So let me just point you to one demo and then I'll stop.

  • If you're new to TensorFlow.js, by far the best demo--

  • there's two, and they're right here.

  • One is posenet, which you can get at that link.

  • There's also Pac-Man.

  • If you haven't seen Pac-Man, you can

  • control Pac-Man with your face.

  • You can train a model live in the browser

  • to do that, which is awesome.

  • And then flying through this, last comment,

  • then learning more.

  • If you're working in Colab and you're

  • used to using Keras, what you don't want to do

  • is import keras.

  • You want to say from tensorflow import keras.

  • And that will give you the version

  • of Keras in TensorFlow 2.0 that's

  • a superset of regular Keras.

  • This is only a problem in Colab because Keras is installed

  • there by default. If you ever see a message that

  • says using TensorFlow backend, you've

  • imported the wrong version of Keras.
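
A short sketch of the import advice above. The model is a throwaway example with made-up layer sizes, there only to show that the Keras API is reached through the `tensorflow` package.

```python
# Use the Keras that ships inside TensorFlow, not the standalone package.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.summary()
```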

  • And then last slide, here are four books that I'd recommend.

  • The very first book is about TensorFlow 2.0.

  • And this will give you low-level details.

  • It's great.

  • Only buy the second edition.

  • The first edition teaches TensorFlow 1.0,

  • which you do not need to learn how

  • to use to use TensorFlow 2.0.

  • The second book doesn't mention the word "TensorFlow" at all.

  • That's the Keras book by Francois Chollet.

  • But it's outstanding if you're new to deep learning.

  • It's a perfectly good place to start.

  • All the code from the second book

  • will also work inside TensorFlow 2.0

  • just by saying from tensorflow import keras.

  • Nothing else will change.

  • It's all completely good.

  • The next book is Keras in JavaScript,

  • so deep learning with JavaScript, which is great.

  • The fourth book is brand new.

  • It's by Pete Warden, TinyML.

  • So thanks very much.

  • And I'll stop there.

  • And I'll be around after for questions.

  • [APPLAUSE]

  • ASHLEY: Thank you so much, Josh.

  • Where can we find you?

  • JOSHUA GORDON: I'll be right outside.

  • ASHLEY: OK.

  • Awesome.

  • All right.

  • We're going to take a 10-minute break.

  • For those of you who are standing in the back,

  • there's some more seats over on this side of the room.

  • We'll be back in about 10 minutes.

  • See you soon.

  • Thanks again.

  • JOSHUA GORDON: Thanks a lot.
