
  • [MUSIC PLAYING]

  • SANDEEP GUPTA: My name is Sandeep Gupta.

  • I'm a product manager in Google.

  • YANNICK ASSOGBA: And I'm Yannick Assogba,

  • and I'm a software engineer on the TensorFlow.js team.

  • SANDEEP GUPTA: And we are here to talk to you today

  • about machine learning and JavaScript.

  • So the video that you just saw, this

  • was our very first AI-inspired Google Doodle.

  • And it was able to bring machine learning

  • to life in a very fun and creative way

  • to millions of users.

  • And what users were able to do with this

  • was use a machine learning model running directly

  • in the browser to synthesize a Bach-style harmony.

  • And what made this possible was this library called

  • TensorFlow.js.

  • So TensorFlow.js is an open-source library for machine

  • learning in JavaScript.

  • It's part of the TensorFlow family of products,

  • and it's built specifically to make it easier for JavaScript

  • developers to build and use machine

  • learning models within their JavaScript applications.

  • You use this library in one of three ways.

  • You can use one of the pre-existing pre-trained models

  • that we provide, and directly run them within your JavaScript

  • applications.

  • You can use one of the models that we have packaged for you,

  • or you can take pretty much any TensorFlow model that you have

  • and use a converter and run it with TensorFlow.js.

  • You can use a previously-trained model,

  • and then retrain it with your own data to customize it,

  • and this is often useful to solve the problem that's

  • of interest to you.

  • This is done using a technique called transfer learning.

  • And then lastly, it's a full-featured JavaScript library

  • that lets you write and author models

  • directly in JavaScript.

  • And so you can create a completely new model

  • from scratch.

  • Today in this talk, we will talk a lot about the first

  • and the third one of these.

  • For the re-training examples, there

  • are a bunch of these on our website and in the codelabs,

  • and we encourage you to sort of take a look after the talk.

  • The other part is that JavaScript

  • is a very versatile language, and it works

  • on a variety of platforms.

  • So you can use TensorFlow.js on all of these platforms.

  • We see a ton of use cases in the browser,

  • and it has a lot of advantages because, you know,

  • the browser is super interactive.

  • You have easy access to sensors, such as webcam and microphone,

  • which you can then bring into your machine learning models.

  • And also we use WebGL-based acceleration.

  • So if you have a GPU in your system,

  • you can take advantage of that and get

  • really good performance.

  • TensorFlow.js will also run server-side using Node.js.

  • It runs on a variety of mobile platforms, on iOS and Android

  • via the mobile web, and also it

  • can run in desktop applications using Electron.

  • And we'll see, later in the talk, more examples of this.

  • So we launched TensorFlow.js a year ago last March,

  • and then earlier this year at our developer summit

  • we released version 1.0.

  • And we have been amazed to see really good adoption and usage

  • by the community, and some really good sort of popularity

  • numbers.

  • We are really, really excited to see

  • more than 100 external contributors who

  • are contributing to and making the library better.

  • So for those of you who are in the audience or those

  • of you listening, thank you very much from all

  • of the TensorFlow.js team.

  • So let's dive a little bit deeper into the library

  • and see how it is used.

  • OK, I'm going to start with looking

  • at some pre-trained models first.

  • So I want to show you a few of these today.

  • So we have packaged a variety or a collection

  • of pre-trained models for use out of the box

  • to solve some of the most common types of ML problems

  • that you might encounter.

  • These work with images--

  • for tasks such as image classification,

  • detecting objects, segmenting objects and finding

  • their boundaries, and recognizing human gesture and human

  • pose from image or video data.

  • We have a few speech audio models,

  • which work with speech commands to recognize spoken words.

  • We have a couple of text models for analyzing, understanding,

  • and classifying text.

  • All of these models are packaged with very easy-to-use wrapper

  • APIs for easy consumption in JavaScript applications.

  • You can either NPM install them, or you can directly

  • use them from our hosted scripts with nothing to install.

  • So let's take a look at two examples.

  • The first model I want to show you is an image-based model.

  • It's called BodyPix.

  • So this is the model that lets you take image data,

  • and it finds whether there is a person in that image or not.

  • And if there is a person, it will segment out

  • the boundary of that person.

  • So it will label each pixel as whether it

  • belongs to the person or not.

  • And you can also do body part segmentation.

  • So it can further divide up the pixels that belong to a person

  • into one of 24 body parts.

  • So let's take a look at what the code looks like

  • and how you would use a model like this.

  • So you start by loading the library

  • and by loading the model using the script

  • tag from our hosted scripts.

  • You choose an image file.

  • You can load it from disk or you could point to a webcam element

  • to load it from the webcam.

  • And once you have an image, then you

  • create an instance of the BodyPix model,

  • and you call its person segmentation method

  • on the image that you have chosen.

  • Because this runs asynchronously,

  • you wait for the result and we do that

  • by using the await keyword.

  • So once you get back the segmentation result,

  • it returns an object, and this object

  • has the width and the height of the image,

  • and also a binary array of zeros and ones, in which the pixels

  • where the person is found are labeled, and you see that

  • in the image on the right.

  • You could also use the body parts segmentation method

  • instead of the person segmentation

  • method, in which case you would get the sub-body part

  • classification.

  • The model is packaged with a set of utility functions

  • for rendering, and here you see the example

  • of the drawPixelatedMask() function,

  • which produces this image on the right.

  • OK, so this is how you would use one of these image-based models

  • directly in your web application.
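
For reference, here is a minimal sketch of that flow in code. The method names (estimatePersonSegmentation, toMaskImageData, drawPixelatedMask) follow the BodyPix release from around the time of this talk and were later renamed, so treat them as assumptions and check the model's README.

```js
// Minimal sketch. Assumes the hosted scripts were loaded via <script> tags:
//   https://cdn.jsdelivr.net/npm/@tensorflow/tfjs
//   https://cdn.jsdelivr.net/npm/@tensorflow-models/body-pix
// Method names follow the talk-era BodyPix API and may have changed since.
async function segmentPerson() {
  const img = document.getElementById('person');  // an <img>, or a webcam <video>
  const net = await bodyPix.load();               // create a model instance
  // Runs asynchronously, so we await the result.
  const segmentation = await net.estimatePersonSegmentation(img);
  // segmentation is {width, height, data}: data is a binary array of
  // zeros and ones labeling the pixels where a person was found.
  console.log(segmentation.width, segmentation.height);
  // Packaged rendering utility mentioned in the talk:
  const mask = bodyPix.toMaskImageData(segmentation);
  bodyPix.drawPixelatedMask(document.getElementById('output'), img, mask);
}
segmentPerson();
```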

  • The second model I want to show you is a speech commands model.

  • So this is an audio model that will listen

  • to microphone data,

  • and try to recognize some spoken words.

  • So you can use this to build voice controls and interfaces

  • or to recognize words for translation

  • and other types of applications.

  • So let me quickly switch to the demo laptop.

  • So we have a small Glitch application written,

  • which uses the speech commands model,

  • and we are using a version of a pre-trained model, which

  • is trained on a vocabulary of just four simple words-- up,

  • down, left and right.

  • So when I click start and speak these words,

  • this application will display a matching emoji.

  • So let's try it out.

  • Left.

  • Up.

  • Left.

  • Down.

  • Down.

  • OK.

  • Right.

  • Left.

  • Up.

  • There we go.

  • We can go back to the screen.

  • This actually points to what you would frequently encounter

  • with machine learning models.

  • There are a lot of other factors that you have to account for--

  • things like background noise, and whether the training data

  • adequately represents the type of data that the model

  • will encounter in real life.

  • So let's again take a look at what the code would look like.

  • Again, you use the script tag similar to before

  • to load the model and to load our library.

  • And now, we create an instance of the speech commands model

  • and initialize it with a version of the speech commands

  • model that's trained for the specific vocabulary

  • of interest.

  • So in this case, it's this directional four-word

  • vocabulary.

  • We have packaged this model with a couple of other vocabularies

  • that you can use, and also you can extend this model

  • to your own vocabulary using transfer learning.

  • And we have a codelab which shows how to do that.

  • So once you have initialized the model,

  • then we call its listen method, which

  • starts listening to the microphone data.

  • And then once it recognizes these words,

  • it returns a set of probabilities-- a matching

  • score for each of the spoken words

  • in its set of labeled classes.

  • And then once you've figured out what the spoken word is,

  • you can use that to display emojis,

  • as in this particular example.
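
Here's a rough sketch of that flow using the speech-commands package. The 'directional4w' vocabulary is the packaged four-word set; the threshold value is an arbitrary choice, and showEmojiFor() is a hypothetical UI helper.

```js
// Minimal sketch. Assumes @tensorflow/tfjs and
// @tensorflow-models/speech-commands are loaded via <script> tags.
async function listenForDirections() {
  // 'directional4w' is the four-word vocabulary: up, down, left, right.
  const recognizer = speechCommands.create('BROWSER_FFT', 'directional4w');
  await recognizer.ensureModelLoaded();
  const labels = recognizer.wordLabels();
  recognizer.listen(async result => {
    // result.scores holds one probability per label; pick the best match.
    const scores = Array.from(result.scores);
    const best = scores.indexOf(Math.max(...scores));
    showEmojiFor(labels[best]);  // hypothetical UI helper
  }, {probabilityThreshold: 0.75});  // threshold is an arbitrary choice
}
listenForDirections();
```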

  • OK, so I'm going to turn it over to Yannick,

  • who will show you how to do training using this library.

  • YANNICK ASSOGBA: Thanks Sandeep.

  • Hello.

  • So, Sandeep showed you one of the simplest ways

  • to get started with machine learning with TensorFlow.js,

  • and that's to take one of our pre-trained models,

  • incorporate it into your app.

  • And, if you noticed, you didn't even

  • have to think about tensors, but there

  • are situations where there won't be a model that works out

  • of the box for your use case, and that's

  • where training comes in.

  • So TensorFlow.js has a full API to support training custom

  • models right in JavaScript.

  • So last year here at I/O, we showed

  • a demo of training a game controller using webcam input--

  • in this example, your face--

  • to control this Pac-Man game.

  • And this year, we're going to look a bit more closely

  • at the training process, focusing

  • on training in Node.js, and what it looks

  • like to bring your own data.

  • Now some of the advantages of training in Node.js

  • include generally increased access to memory and storage,

  • increased performance in certain situations, and importantly,

  • being able to browse the internet while you

  • wait for your model to train.

  • In the browser, when you're training a model,

  • you have to keep the tab focused.

  • Otherwise, many browsers will throttle performance

  • on that tab.

  • So it's quite handy that you're able to do something else.

  • All right, so let's train a custom text classifier,

  • and there's really two main things

  • I'd like you to take away from this exercise.

  • The first is generally how to work

  • with text in TensorFlow.js, and the other

  • is a general principle of using an existing building block

  • to bootstrap your machine learning project.

  • This is referred to as transfer learning,

  • and it's really helpful when you're getting

  • started with machine learning.

  • And we'll see more about that in the example, but to step back

  • a bit, what can you do with a text classifier?

  • So there are classical examples, such as sentiment analysis,

  • or spam detection, but you can also

  • do things like log scrubbing, where

  • you may look through your logs for maybe

  • personal or private information that you don't want to keep,

  • and obfuscate it or remove it before you store it.

  • But you can also do things like analyze product reviews

  • or do document clustering.

  • But today, we're going to build a component for a chatbot,

  • and in particular, we're going to look

  • at classifying user intents.

  • So, for example, given the sentence,

  • "Will it rain in the next 30 minutes?"

  • we want the model to detect that that's a GetWeather

  • intent, or something like "Play the latest Bach album,"

  • should be a PlayMusic intent.

  • So any machine learning project needs data to learn from,

  • and today the data we're going to use

  • comes from the Snips AI NLU benchmark,

  • and it's an open-source dataset that's available on GitHub.

  • And for our first task, we're basically

  • going to start with a spreadsheet.

  • As you can see, it has the query sentences on one side,

  • and the intents on the other.

  • However, one thing we need to do is convert this text

  • into numbers so that we can feed it into our neural network

  • because neural networks don't really understand text

  • natively, and that is where the Universal Sentence

  • Encoder comes in.

  • It's a deep neural network created by Google

  • that I like to think of as NLP in a box.

  • It takes sentences and turns them

  • into lists of numbers that encode the meaning

  • and syntax of those sentences, and we'll

  • take a look at an example.

  • So let's think of this example, "What

  • is the weather in Cambridge, Massachusetts?"

  • The Universal Sentence Encoder will take that sentence

  • and turn it into an array of 512 numbers,

  • and it will always be 512 numbers regardless

  • of the length of the sentence, which is actually

  • quite nice because it gives us a regular sort of structure

  • to work with.

  • And this is what the code looks like to create those numbers.

  • So, similar to what Sandeep showed earlier,

  • we load our pre-trained model, the Universal Sentence Encoder.

  • We wait for its weights to finish loading,

  • and then we call this model.embed() with

  • the sentences that we want to pass in.

  • And this process of turning these sentences into numbers

  • is often referred to as embedding

  • in machine learning terminology, and you'll kind of

  • hear that term a bit.

  • So that's a bit of what that looks like.

  • We await the result, and that's our set of 512 numbers.
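
A minimal sketch of that embedding call in Node.js:

```js
// Also requires @tensorflow/tfjs (or @tensorflow/tfjs-node) to be installed.
const use = require('@tensorflow-models/universal-sentence-encoder');

async function embedSentences() {
  const model = await use.load();  // wait for the weights to finish loading
  const embeddings = await model.embed([
    'What is the weather in Cambridge, Massachusetts?',
  ]);
  // A 2D tensor of shape [numSentences, 512]: one row of 512 numbers
  // per sentence, regardless of the sentence's length.
  embeddings.print();
}
embedSentences();
```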

  • OK.

  • Next is the intent.

  • So we also have to convert these into numbers.

  • So since these are categories and we

  • have a small number of them, we can

  • use a scheme called one-hot encoding

  • to turn the label into a small array that has a 1

  • in the position corresponding to that label.

  • So in this example, we have our GetWeather intent,

  • and notice the first element of the array is a 1

  • and the rest are zeros.

  • In the corner, you will see the other two

  • intents we're using in this demo, the PlayMusic

  • one and the AddToPlaylist, and there's always

  • just a single 1 in the array representing

  • which category this represents.

  • And here is the code to do that, and it is basically

  • an index look-up into a list.

  • So we call the method tf.oneHot, given

  • the index of the label we want to encode

  • as well as the total number of categories we have.

  • So here, GetWeather and three, and then it's

  • going to return that compact array that

  • is our numerical representation of the label.
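
A sketch of that encoding step; the class ordering here is an assumption:

```js
const tf = require('@tensorflow/tfjs');

// Assumed class ordering: GetWeather = 0, PlayMusic = 1, AddToPlaylist = 2.
const intents = ['GetWeather', 'PlayMusic', 'AddToPlaylist'];
// tf.oneHot takes the index of the label we want to encode
// and the total number of categories.
const label = tf.oneHot(tf.tensor1d([intents.indexOf('GetWeather')], 'int32'),
                        intents.length);
label.print();  // [[1, 0, 0]]
```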

  • All right.

  • So now we have our inputs, often referred to

  • in machine learning as our xs, and our targets, which are

  • often referred to as ys.

  • And now our goal is to take the 512 numbers representing

  • the input and predict that smaller array that's

  • 1, 0, 0 that represents that particular intent,

  • and we're going to train a model to do that.

  • So let's code up a model.

  • So this is the entire code for the model.

  • It's not too much code, but it does a lot of work,

  • and as you spend time with this, it

  • becomes more and more familiar.

  • At the top, we see our EMBEDDING_DIMS, 512,

  • and NUM_CLASSES that represent the size of the input

  • and the output of the model, respectively.

  • So 512 numbers coming in, three numbers going out.

  • The part that's highlighted is the entire model definition,

  • and I won't dwell on this too long,

  • but this is a common building block

  • you'll see in neural networks, and it's

  • known as a dense network.

  • So we start with a sequential model,

  • and this network just has one layer,

  • and it's the one dense layer, and its job

  • is to convert 512 numbers to three numbers.

  • That's its entire job.

  • The deep learning part has already been

  • taken care of by the Universal Sentence Encoder.

  • Finally, we compile the model to get it ready for training,

  • and here we pick an optimizer, which

  • is an algorithm that's going to drive the weight

  • updates during the training process,

  • as well as a loss function to tell us how well we're doing.

  • And here, we're picking one that's commonly

  • used for classification tasks.
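
Here's a minimal sketch of that model definition. The talk doesn't name the optimizer or activation, so 'adam' and 'softmax' are assumptions (standard choices for a classifier like this):

```js
const tf = require('@tensorflow/tfjs');

const EMBEDDING_DIMS = 512;  // size of a Universal Sentence Encoder embedding
const NUM_CLASSES = 3;       // GetWeather, PlayMusic, AddToPlaylist

// One dense layer whose entire job is to map 512 numbers to 3 numbers.
const model = tf.sequential();
model.add(tf.layers.dense({
  inputShape: [EMBEDDING_DIMS],
  units: NUM_CLASSES,
  activation: 'softmax',  // assumption: standard choice for classification
}));

// An optimizer to drive the weight updates, and a loss function
// commonly used for classification tasks.
model.compile({
  optimizer: 'adam',  // assumption: the talk doesn't name one
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});
```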

  • So onto the training loop.

  • So model.fit() is the function that actually runs the whole

  • training process, and here we're calling it with a few

  • parameters. xs are our input sentences.

  • ys are those targets from our training set,

  • and then two extra parameters. epochs is a fancy word that

  • refers to the number of times you

  • go through the data set before you're done training,

  • and you can set that kind of to whatever you want.

  • validationSplit is a fraction between 0 and 1,

  • and it indicates a portion of the data

  • that we're going to set aside and not train with,

  • but we can use it to see how well we're

  • doing at making predictions.
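
A sketch of that call; the epoch count and split fraction are arbitrary example values:

```js
// (Inside an async function.)
// xs: [numExamples, 512] embedded sentences; ys: [numExamples, 3] one-hot intents.
await model.fit(xs, ys, {
  epochs: 50,             // passes through the dataset; set to whatever you want
  validationSplit: 0.15,  // hold out a fraction of the data to check predictions
  callbacks: {
    onEpochEnd: (epoch, logs) =>
        console.log(`epoch ${epoch}: loss=${logs.loss.toFixed(4)}`),
  },
});
```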

  • So before we look at the demo itself,

  • let's take a quick look at what the code to deploy

  • that trained model would look like in the browser,

  • and let's just take a quick sample.

  • So we'd first load the model and the metadata, but for now

  • I just want you to focus on the three lines in the middle.

  • So the basic process is that we take the input query

  • from the user.

  • We're going to use the Universal Sentence Encoder to embed it

  • into that numerical representation.

  • Then we're going to call our model, with model.predict(),

  • to get our final prediction.

  • We're going to call .array() to get the values out,

  • and we're going to use that to drive our UI.
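
A sketch of those three middle lines; encoder and model are assumed to have been loaded earlier:

```js
async function classifyQuery(query) {
  // Embed the user's query with the Universal Sentence Encoder...
  const embedded = await encoder.embed([query]);
  // ...run it through our trained model...
  const prediction = model.predict(embedded);
  // ...and pull the probability values out to drive the UI.
  const [probs] = await prediction.array();
  return probs;  // e.g. [0.93, 0.04, 0.03]
}
```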

  • And in this case, our UI is going to look something

  • like this, and it's about 150 lines of JavaScript,

  • and about 100 lines of CSS.

  • So it's not terribly large.

  • So let's get our hands into it.

  • So if you just switch to the demo laptop, please.

  • All right.

  • So we're just going to sort of take a quick tour of the code.

  • It's all available on GitHub.

  • So here is that spreadsheet that we started with--

  • our queries and our intents.

  • And I've done a bunch of the pre-processing beforehand,

  • so that we don't have to wait for that to complete.

  • So the first step was converting it to tensors,

  • and it's just those long lists of 512 numbers.

  • JSON isn't the most efficient format to store this,

  • but it's quite readable.

  • So you can actually just look at what a tensor is.

  • It's just a long list of numbers.

  • We also have some metadata for our model.

  • So these are the three classes that we're going to train,

  • and we have about 6,000 sentences

  • that we're going to learn from.

  • Our model itself looks pretty much like what

  • we saw in the slides.

  • There's no surprises there.

  • And finally, our script that's going to run the training.

  • Its main job-- it takes a bunch of options,

  • but its main job is to load the data and train the model.

  • So it's going to do that with model.fit() as we saw earlier.

  • We're going to wait for that to be complete,

  • and then we're going to save that model to disk.

  • So that's going to save a JSON file and a binary file.

  • The JSON file contains the sort of structure of the network.

  • The binary file contains the weights in an efficient format.

  • Both of those are loadable in the browser,

  • and that's our basic process.
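
A sketch of that save-and-load round trip; the 'intent-model' path is illustrative:

```js
// (Inside an async function.)
// In the Node.js training script (with @tensorflow/tfjs-node): writes
// model.json (the network structure) plus a binary weights file.
await model.save('file://./intent-model');

// Later, in the browser, both files load back with one call:
const loaded = await tf.loadLayersModel('intent-model/model.json');
```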

  • So let's see what that kind of looks like.

  • So here I'm going to run the training script for this--

  • I'll say yarn train-intent.

  • It's going to load up, load our script,

  • and use TensorFlow on the CPU and we're off to the races.

  • So each one of these lines that it prints is an epoch,

  • and that's a trip through the entire data

  • set of about 6,000 sentences, and you

  • notice it goes pretty quickly.

  • It typically finishes in about 20 seconds, and boom--

  • 19.90 today.

  • And we've trained a model.

  • It's done.

  • We can now load that in the browser and use that

  • to make predictions.

  • You can look at the file that's produced.

  • It's a really small model file.

  • So I'm going to start this demo app, which is here in our app

  • folder, and it's just like a client-side JavaScript app,

  • and this is really where all the machine learning happens,

  • where we make our predictions.

  • So this is going to copy the model that we just trained over

  • into the folder for the client-side app

  • and launch it in dev mode.

  • I'm actually going to lift this a bit higher,

  • and now we can try and make predictions.

  • So we can ask it like, what is the weather in Cambridge?

  • Sweet, and it responds with a nice cloud, like it

  • has detected a weather query.

  • Or we can say, play the latest Bach album.

  • And it's correctly classified that as a PlayMusic one.

  • Or even things like put those sick beats on my running list--

  • not that I run terribly often, but we

  • get the right response back of AddToPlaylist,

  • but what happens when you give it something surprising?

  • Like, get me a pizza.

  • Well, it's just going to throw its hands up and shrug,

  • and that's actually quite useful.

  • You generally don't want your model

  • to take action when the input isn't something it's

  • been trained to handle.

  • And how we've set this up is that we set a threshold

  • for confidence in classification,

  • and so it should be pretty confident in one

  • of these classes before it takes that action,

  • and that's very useful to think of when you're building

  • a machine-learning-driven app.

  • It's sometimes good to say, I don't know or not [INAUDIBLE]..

  • So, sweet.

  • That's our classifier.
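
A sketch of the confidence-threshold idea just described; the 0.8 cutoff and function names are illustrative, not taken from the example code:

```js
// Only act when the model is confident in one of its classes.
const CONFIDENCE_THRESHOLD = 0.8;  // arbitrary illustrative cutoff

function intentFor(probs, labels) {
  const best = probs.indexOf(Math.max(...probs));
  // Below the threshold, return null so the UI can shrug ("I don't know").
  return probs[best] >= CONFIDENCE_THRESHOLD ? labels[best] : null;
}
```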

  • Let's head back to the slides for a bit.

  • All right.

  • So we built our custom classifier.

  • Yay.

  • And in many instances, that might be all that you need.

  • It may be the final step of your pipeline,

  • or once you have the specific intent,

  • you can apply some handwritten rules

  • to extract information and do the rest of your processing.

  • However, we can train models to do more than just

  • whole sentence classification.

  • So, given our original query, "What

  • is the weather in Cambridge, Massachusetts?"

  • we may want to know, what's the location-related aspect

  • of this sentence, and that's what

  • we're going to look at next.

  • So we can reformulate our problem a bit to this.

  • So, given a sentence, we want to tag each word

  • in the sentence with a tag.

  • Like TOK for generic token or LOC for location.

  • We could have other tag types, but for now we're

  • just going to focus on these two and the weather queries.

  • Like before, we need to convert our text into numbers,

  • and like before, we're going to use the Universal Sentence

  • Encoder to do that.

  • We're just going to give it one word at a time.

  • So now each word becomes an array of 512 numbers,

  • and in addition we're going to add these special tokens

  • at the end of our sentences.

  • And that's the __PAD tokens.

  • These will have two purposes.

  • They'll let us know when the end of a sentence is,

  • but more importantly, we're going to add enough of them

  • so that all of our sentences are effectively the same length.

  • And this will be useful for us because it gives us

  • a nice rectangular matrix that we can use during training,

  • and that's just way more efficient to train.

  • So this is roughly what our input will look like.

  • There'll be sort of enough pad tokens to make everything

  • a given length.
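
A sketch of that padding step; representing the __PAD token as a vector of 512 zeros is an assumption (the example code may embed it differently):

```js
const EMBEDDING_DIMS = 512;
// Assumption: stand-in embedding for the __PAD token.
const PAD_EMBEDDING = new Array(EMBEDDING_DIMS).fill(0);

function padSequence(wordEmbeddings, maxLen) {
  const padded = wordEmbeddings.slice(0, maxLen);
  while (padded.length < maxLen) {
    padded.push(PAD_EMBEDDING);  // pad so every sentence is the same length
  }
  return padded;  // a rectangular [maxLen, 512] matrix, efficient to train on
}
```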

  • What about our targets?

  • So now we want to predict something for each word,

  • and we're going to use one-hot encoding again.

  • Conveniently, we have three categories like before.

  • So we have our TOK category, our LOC for location,

  • and the special pad category that

  • will tell us when we've reached the end of the sentence.

  • And we see the one-hot encoding scheme just like before.

  • So once you've done that for our inputs and our outputs,

  • we now have them represented as sequences.

  • So now each sentence is a sequence

  • of those arrays of numbers, and we need an appropriate model

  • to handle them.

  • So you can use a dense network like before,

  • though I'd advise maybe adding some capacity to that,

  • but today we're going to look at a special kind of network,

  • known as a recurrent neural network,

  • and in particular a special kind of layer

  • known as an LSTM that is geared towards handling sequences.

  • So let's take a look at that.

  • So here is our new model function.

  • First, I want you to notice that the start and end of it

  • is pretty similar to what we saw before.

  • EMBEDDING_DIMS is still 512, and the number of classes

  • is still three.

  • We still start with a sequential model, and the end of it

  • is pretty similar.

  • We're going to compile it with the same optimizer

  • and the same loss function.

  • So really, the meat of it is in the middle.

  • So instead of starting with a dense layer,

  • we're going to use an LSTM layer,

  • and this is a special kind of layer that--

  • it's designed to learn across sequences.

  • And here, think of each sentence as a sequence.

  • So we're going to configure it, set a maximum sequence length,

  • and then after that we're going to do one more special thing.

  • We're going to take the LSTM layer

  • and wrap it in a bi-directional layer

  • to give us a bi-directional LSTM.

  • And this is useful because it allows the model

  • to learn context in both directions.

  • So you can think of it as reading the sentence left

  • to right and then right to left, and trying to learn from that.

  • Finally, we end with a dense layer,

  • and this is very common in classification problems

  • to end with a dense layer that has

  • your number of output classes, the NUM_CLASSES

  • that you see in the slide.

  • But because of the LSTM stuff we did previously,

  • we do have to wrap it in this time-distributed layer that's

  • going to unroll some of the sequence stuff

  • that happened earlier.

  • So that's our entire model definition again.
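
A minimal sketch of that model; the LSTM's unit count and the maximum sequence length are assumptions, since the talk doesn't give exact values:

```js
const tf = require('@tensorflow/tfjs');

const MAX_SEQ_LEN = 30;      // assumption: maximum (padded) sentence length
const EMBEDDING_DIMS = 512;
const NUM_CLASSES = 3;       // TOK, LOC, and the special pad category

const model = tf.sequential();
// An LSTM learns across the sequence; wrapping it in a bidirectional
// layer lets the model read the sentence in both directions.
model.add(tf.layers.bidirectional({
  layer: tf.layers.lstm({units: 64, returnSequences: true}),  // units: assumption
  inputShape: [MAX_SEQ_LEN, EMBEDDING_DIMS],
}));
// End with a dense layer sized to the output classes, wrapped in
// timeDistributed so it makes one prediction per word in the sequence.
model.add(tf.layers.timeDistributed({
  layer: tf.layers.dense({units: NUM_CLASSES, activation: 'softmax'}),
}));
model.compile({optimizer: 'adam', loss: 'categoricalCrossentropy'});
```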

  • Let's get our hands into it again.

  • Let's see what that looks like.

  • So I'm going to go back to the demo machine. Sweet.

  • Let me close that and head to the code.

  • Again, I've pre-prepared the data.

  • So we can see what that looks like.

  • Here's our input data, and it's just the sentences broken up

  • into words with a tag for each word, and somewhere in there

  • there's some LOC ones.

  • So that's our input.

  • Another thing I've done is I've pre-embedded all of the words,

  • and just written them to a file.

  • So we can just look that up instead of calling Universal

  • Sentence Encoder each time.

  • So with that pre-processing done,

  • we can look at our model definition,

  • and-- sorry this is the tagger model--

  • and if you look at this on GitHub,

  • you'll see that it's a little more involved than what's

  • in the slide, and that's because the example we've put up

  • allows you to train three different kinds of models.

  • You can train a one-directional LSTM, the bi-directional LSTM

  • we were talking about today, or a dense network.

  • And that's just to let you compare and see

  • how they behave, but other than that,

  • it's pretty much the same.

  • Our training script is very similar.

  • The data is a bit bigger.

  • So there's a little bit more data management.

  • So we call fitDataset() this time,

  • and then we save that to disk just like before.

  • So that's the outline of our process, and we can run that.
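
A rough sketch of what that fitDataset() call might look like; the dataset variable names and batch size are assumptions:

```js
// (Inside an async function, with @tensorflow/tfjs-node loaded.)
// Stream the larger dataset through training with tf.data rather
// than one big tensor pair.
const dataset = tf.data
    .zip({xs: embeddedWordsDataset, ys: tagLabelsDataset})  // hypothetical datasets
    .shuffle(1000)
    .batch(32);

await model.fitDataset(dataset, {epochs: 10});
await model.save('file://./tagger-model');  // save to disk just like before
```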

  • So I'm just going to yarn train-tagger,

  • and we'll see what that looks like.

  • So we'll start it, and all of this training, by the way,

  • is just using CPU.

  • So you don't necessarily need a GPU to do any training,

  • but it does speed things up.

  • So we've started training, and probably

  • the first thing you've noticed is that it's

  • a lot slower than before.

  • The data is much bigger this time.

  • Each word is 512 numbers, and the model is more complex.

  • So it will take more time to train,

  • and I'd say on average it takes somewhere

  • between 10 and 20 minutes depending

  • on what options you pick.

  • So we're not going to wait for the whole thing.

  • To speed up the presentation, I did train a model last night.

  • So we're just going to use that, and look

  • at a demo app that's designed to sort of show you the process.

  • So I'm just going to start that, yarn tagger-app.

  • So we copy our model over it just like before and start up

  • our front-end application, and this

  • is a demo app just designed to give you

  • a sense of what is the pipeline that inputs are going through.

  • So we can now enter a query like "What is

  • the weather in Cambridge, MA?"

  • So the first line is our input sentence that's been tokenized.

  • The sort of grid thing is a representation

  • of those 512 numbers from the Universal Sentence Encoder,

  • and then below that is our top category that comes out.

  • And you can see that Cambridge, Massachusetts is nicely

  • classified as being location-related.

  • One nice thing about these models is you can try

  • somewhat more complex queries like "What

  • is the weather in white river junction Vermont?"

  • and that's a place I have actually been to,

  • and it does get it correct.

  • We have this longer location-related sequence,

  • and it's correctly tagged the tokens as belonging to that.

  • You'll notice the confidence on the VT is a little lower.

  • If we use the more traditional capitalization--

  • so, "What is the weather in white river junction VT?"

  • you'll notice that the classification score for VT,

  • the abbreviation for Vermont, goes way up,

  • but because we've used a bi-directional LSTM,

  • you'll also notice that the words before that,

  • their confidence scores go up because it's reading

  • the context in both directions.

  • So that can be super handy.

  • Another thing that's important to realize

  • is that it's not just memorizing place names.

  • So if you try just typing in "white river junction,"

  • it's not going to detect that as location, or even "white river

  • junction vt," and that's because it's

  • learned to find these in the context

  • of these weather-related queries.

  • That's the training data.

  • So it's not just gone and memorized

  • a bunch of location names, for example.

  • So it's important to keep in mind.

  • Like, it's really based on what you gave it to train.

  • All right, and we can switch back to these slides.

  • Sweet, that worked.

  • So, yay.

  • We trained an intent classifier and a model

  • that can extract information from that identified intent.

  • Sweet.

  • So is it time to ship this into production?

  • I would caution against this.

  • So you really should take care to first test

  • that your model is robust to different situations,

  • and that it will match your user's expectations.

  • Machine learning models are probabilistic

  • and behave differently based on often subtle differences

  • in the training data used.

  • So it's super important to have a good test set, including

  • some tricky cases, and just validate

  • that with your users to make sure

  • it matches those expectations.

  • And Google has a number of great resources online on this topic,

  • and I also recommend checking out

  • the designing human-centered AI products

  • talk by some of our colleagues in Google PAIR

  • later this afternoon at I/O.

  • So now you've seen a bit of what the workflow looks

  • like to train a model.

  • You can check out the full code on GitHub.

  • I've included a short link to it here,

  • and it's part of our larger repository of examples.

  • And next, I'm going to hand it back

  • to Sandeep to talk about different ways

  • that TensorFlow.js has been used.

  • Thanks.

  • SANDEEP GUPTA: Thank you, Yannick.

  • [APPLAUSE]

  • So we saw how the library can be used for using and training

  • machine learning models.

  • I want to just take a few minutes

  • and quickly show you a variety of applications of what people

  • are doing with TensorFlow.js.

  • We saw earlier that it runs on a bunch of different platforms.

  • So these examples will sort of span many of these.

  • Creatability, this is a project that's

  • being developed by the Creative Lab team at Google,

  • and this consists of a set of experiments

  • where they're exploring how to use

  • AI and ML to make these creative tools more accessible.

  • These run machine learning models in the browser,

  • and in this particular case you see

  • a person controlling a keyboard with head gestures and head

  • motion.

  • They have some really cool examples,

  • and I encourage you to check out the experiment sandbox, which

  • is showing many of these all on the website.

  • Uber uses machine learning in a very significant way

  • for a wide variety of problems at a very large scale,

  • and Manifold is a browser-based application

  • that Uber uses to visualize and debug their machine learning

  • models and data pipelines.

  • So this application runs in the browser,

  • and they're using TensorFlow.js for a lot

  • of numerical computations that they want to use here.

  • So, for example, distance calculations and visualization,

  • as well as clustering of data, et cetera.

  • They were able to find that, because of the WebGL

  • acceleration, they could accelerate

  • these computations more than 100x

  • compared to just natively using JavaScript.

  • Airbnb has an interesting use case

  • where their Trust team, when a user is trying

  • to upload a profile picture to the Airbnb website, sometimes

  • people accidentally use a driver's license

  • picture or a passport picture, which

  • may end up containing personal, sensitive information.

  • So Airbnb runs a machine learning model

  • directly on your client-side in the browser or on device

  • so that if you were to choose a picture which

  • may have some sensitive information,

  • it will alert you before you upload that picture

  • and prevent you from doing that.

  • On the server-side, Clinic.js is a really nice example

  • of an application that NearForm has built.

  • This is a Node.js-based application,

  • which is used for profiling Node.js jobs or processes.

  • And they're using TensorFlow.js to look for anomalies or spikes

  • in CPU usage or memory consumption

  • of these node applications.

  • So it's a really nice example of server-side application

  • of TensorFlow.js.

  • In the desktop, TensorFlow.js can be used with Electron,

  • and Magenta Studio is a set of plugins

  • that has packaged a collection of machine learning models

  • for music generation that use TensorFlow.js

  • for some very fun and creative music applications.

  • I think Magenta has a talk later today that you might

  • want to check out, and they also have some demos in the sandbox.

  • We have the Ableton Live plug-in for this in our sandbox

  • that you can see.

  • TensorFlow.js also runs on mobile platforms on mobile web

  • both on iOS and Android, and recently we

  • have been working on adding support

  • for the WeChat application.

  • So WeChat is a very popular social media platform

  • and messaging application, and it has a mini program

  • environment, which lets developers

  • build these small JavaScript applications called

  • mini programs and easily deploy and share them.

  • So let's take a quick look at a prototype

  • of what this could look like.

  • So in this video, what you will see

  • is a Pac-Man game that's shared as a WeChat mini-app,

  • and it will let the user control this game using head motion

  • from the phone's camera.

  • So you see the WeChat application

  • being launched here, and then one of my friends

  • has shared this mini program via a link.

  • And I just click on it and I launch this game,

  • and now this game is a little JavaScript

  • program that's running within the WeChat environment.

  • After I do a very quick calibration step

  • by looking straight up at the phone,

  • I'm ready to play this game.

  • And so this Pac-Man game is loaded,

  • and you will see in a moment, my head motion

  • is driving that little Pac-Man character.

  • So a really fun way of interacting

  • with device, and the nice thing is

  • that you can do a variety of things using webcams, using

  • text, using speech, and have a very convenient way of sharing

  • these applications without having to install anything.

  • So we saw a bunch of examples.

  • I just want to quickly show you that the community has

  • been building some really interesting applications

  • and use cases beyond all the examples

  • that I've shown you so far.

  • And for those of you in the audience

  • who have been using TensorFlow.js, a big thank you.

  • We have a collection of these examples

  • that you can check out on our gallery page, which

  • are extending TensorFlow.js and using it

  • for things like reinforcement learning

  • for that self-driving car example or generative models

  • and a variety of other interesting applications.

  • We also have a bunch of developers

  • who are building add-on libraries as extensions

  • on top of TensorFlow.js, and these

  • are extending TensorFlow.js in very useful ways

  • for libraries that let you track hand

  • gestures or facial movement and facial gestures

  • or also do more things like hyperparameter

  • tuning of machine learning models, et cetera.

  • So there are a bunch of these resources also available

  • on our website.

  • So in closing, I want to just point out a couple of resources

  • to help you get started.

  • There is this new textbook that is coming out.

  • It's called Deep Learning With JavaScript,

  • and it's written by our colleagues

  • on the TensorFlow.js team.

  • It's an excellent resource for learning about deep learning

  • and machine learning, and all examples

  • are written in JavaScript using TensorFlow.js.

  • For our audience here and people listening,

  • we have a really nice offer of a discount code.

  • So you might find that useful.

  • TensorFlow recently also launched a new Coursera course

  • with deeplearning.ai to introduce TensorFlow

  • and machine learning.

  • And as part of this course, we will

  • have a module on TensorFlow.js, which we'll be

  • launching in a couple of weeks.

  • I just want to point out a few more useful links

  • on our website for our models repo, for our gallery.

  • And then also we have an office hours session right here

  • at Google I/O tomorrow.

  • You can come by and ask your questions

  • and meet the TensorFlow.js team.

  • You can check out many more demos

  • at our demonstration in the AI/ML sandbox,

  • and also there are a few hands-on codelabs

  • that you can try interactively in the codelabs area.

  • So thank you so much for coming out here today,

  • and have a great rest of the I/O.

[MUSIC PLAYING]
