[MUSIC PLAYING]
ARUN VENKATESAN: Hi.
My name is Arun Venkatesan, and I'm
a TensorFlow developer advocate.
As you all know, TensorFlow Lite is a production-ready framework
for on-device machine learning.
TensorFlow Lite supports Android, iOS, Linux,
and other platforms.
TensorFlow Lite is now deployed on more than 4 billion edge
devices worldwide.
However, we think we've only scratched the surface.
We would like to continue to grow
the number of on-device machine-learning use cases
by making TensorFlow Lite more robust
and also keeping it simple.
And today we are excited to share with you
the recent improvements that we have made to the TensorFlow
Lite user journeys.
To make onboarding easier for new users,
we have recently updated the learning materials
on the TensorFlow website and also made them
available as a free course on Udacity.
In addition to tutorials and code labs,
we also have a list of sample apps and models
across a variety of use cases to help you ramp up quickly.
These sample apps are updated periodically
along with the release of newer on-device
machine-learning models to provide you
with startup resources.
Not only can you use these off-the-shelf models
and the sample apps, but we also show you an easy way
to customize these models for your own data,
as well as package and share these models
with your teammates, for example.
Now, once you've made the choice of either using
an off-the-shelf model or customizing an existing
model for your own needs, we introduce a new set of tools
that let you use this model within an Android app
while reducing the boilerplate code that you need to write.
And, finally, TensorFlow Lite is not only cross-platform
but also supports hardware accelerators
specific to each platform.
Over the next few minutes, we'll walk you
through each one of these features.
Let's start with onboarding.
We know that on-device machine learning has only
started to pick up steam.
And so we wanted to make sure that there
are good learning materials to help
you get started with on-device machine learning and TensorFlow
Lite.
We recently launched an introduction
to TensorFlow Lite course on Udacity
that targets mobile developers who
are new to machine learning.
We have also updated the code labs and tutorials
on the TensorFlow website.
For example, the code lab referenced in the slide
walks you through the end-to-end process
of training a TensorFlow machine-learning model that
can recognize handwritten digits, converting it to a TensorFlow Lite
model, and deploying it in an Android app.
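(As a rough sketch of what that conversion step looks like, assuming a simple Keras model; the architecture and file names here are illustrative, not the code lab's exact code.)

```python
import tensorflow as tf

# Illustrative Keras model for handwritten-digit recognition (MNIST);
# the code lab's actual model may differ.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
model.fit(x_train / 255.0, y_train, epochs=1)

# Convert the trained Keras model to TensorFlow Lite and save it,
# ready to be bundled into an Android app.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open('mnist.tflite', 'wb') as f:
    f.write(converter.convert())
```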
Once you've familiarized yourself with TensorFlow Lite
and have decided to start prototyping your use case,
we have a list of sample apps that showcase
what is possible with TensorFlow Lite.
So instead of developing the apps from scratch,
you can select a sample app that is closest to your use case
and see how TensorFlow Lite actually works on a device.
We released a couple of new samples,
such as the Android and iOS samples for style transfer,
with which you can convert any image into an artwork,
as the slide shows.
In addition to sample apps, there
are also a bunch of pre-trained TensorFlow Lite models
both from Google as well as from the TensorFlow community
that you can leverage.
They cover a variety of use cases,
from computer vision to natural-language processing
and speech recognition.
You can discover these models through TensorFlow Hub,
the TensorFlow.org website, or a GitHub repository
called Awesome TensorFlow Lite that one
of our Google Developer Experts has put together.
As we all know, machine learning is
an extremely fast-paced field with new research
papers breaking the state of the art every few months.
In TensorFlow Lite, we spend a significant amount of effort
to make sure that these models that
are relevant to on-device machine learning
are well-supported.
And in the natural-language processing domain,
TensorFlow Lite supports MobileBERT,
which is a faster and smaller version of the popular BERT
model, optimized for on-device machine learning.
It's up to 4.4x faster than standard BERT,
while being 4x smaller with no loss in accuracy.
The model size has also been reduced to 100 MB
and thus is usable even on lower-end devices.
The MobileBERT model is available on our website
with a sample app for question-and-answer type
of tasks and is ready to use right now.
We are currently working on the quantized version of MobileBERT
with an expected further 4x size reduction.
And in addition to MobileBERT, we also just released
on TensorFlow Hub the mobile version of ALBERT,
an upgrade to BERT that achieves state-of-the-art
performance on natural-language processing tasks.
We are really excited about the new use cases
that these models will enable.
And stay tuned for updates from us on this.
Other than NLP, TensorFlow Lite also supports
state-of-the-art computer-vision models.
EfficientNet-Lite brings the power of EfficientNet
to edge devices and comes in five variants,
allowing you to choose anywhere from the lowest-latency,
smallest-model variant
to the highest-accuracy option, which
is called EfficientNet-Lite4.
The largest variant, the quantized version
of EfficientNet-Lite4, achieves 80.4% ImageNet top-1 accuracy
while still running in real time on a Pixel 4 CPU.
The chart here in the slide shows
how the quantized EfficientNet-Lite models
perform compared to similarly quantized versions
of other popular ImageNet classification models.
In addition to ensuring that these models run well
on TensorFlow Lite, we also wanted
to make sure that you can easily customize these models
to your own use cases.
And so we're excited to announce TensorFlow Lite Model
Maker, which enables you to customize a TensorFlow Lite
model on your own data set without any prior ML expertise.
Here is what you would have to do
to train EfficientNet on your own data set without TensorFlow
Lite Model Maker.
First, you would have to clone EfficientNet from GitHub,
download the checkpoints, use the model as a feature
extractor, put a classification head on top
of it, apply transfer learning,
and then convert the result to TFLite.
But with Model Maker, it's just four lines of code.
You start by specifying your data set, choose the model spec
that you'd like to use, and, boom, it works.
You can also evaluate the model and easily export it
to a TensorFlow Lite format.
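(A minimal sketch of that Model Maker flow in Python. The exact import paths and export signature vary between tflite-model-maker releases, and 'flower_photos/' is a placeholder folder with one subdirectory per class.)

```python
from tflite_model_maker import image_classifier
from tflite_model_maker.image_classifier import DataLoader  # path may differ by release

data = DataLoader.from_folder('flower_photos/')  # 1. specify your data set
model = image_classifier.create(data)            # 2. train with the default model spec
loss, accuracy = model.evaluate(data)            # 3. evaluate the model
model.export(export_dir='.')                     # 4. export to TensorFlow Lite format
```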
In on-device ML, you have to constantly make trade-offs between
accuracy, inference speed, and model size,
and therefore, we want to allow you to not only customize
models for your own data but also easily switch
between different model architectures.
As shown in the code here, you can easily
switch by choosing to use either ResNet or EfficientNet.
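(A hedged sketch of what that switch looks like; the spec names 'efficientnet_lite0' and 'resnet_50' are assumptions and may differ by Model Maker release.)

```python
from tflite_model_maker import image_classifier, model_spec

# Same training call as before; only the model spec changes.
efficientnet_model = image_classifier.create(
    data, model_spec=model_spec.get('efficientnet_lite0'))
resnet_model = image_classifier.create(
    data, model_spec=model_spec.get('resnet_50'))
```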
We currently support image- and text-classification use cases,
but new use cases such as object detection
and question answering are coming soon.
I'll now hand it off to Lu to talk
about how you can easily share, package, and deploy
these models.
LU WANG: Thanks, Arun.
Hi.
My name is Lu.
I'm a software engineer from TensorFlow Lite.
Once you have a working TFLite model,
the next step is to share it with your mobile-app teammates
and integrate it into your mobile apps.
Model Metadata is a new feature that
makes model sharing and deployment much easier
than before.
It contains both human-readable and machine-readable
information about what a model does and how to use a model.
Let's look at an example of the metadata
of an image-classification model.
First, there is some general information about the model,
such as name, description, version, and author.
Then, for each input and output tensor,
it documents the details, such as name and description,
the content type, which is for things
like an image or a [INAUDIBLE], and the statistics of the tensor,
such as min and max values.
And it also has information about the associated files
that are related to a particular tensor--
for example, the label file of an image classifier
that describes the output.
All of this rich description of the model
helps users understand it more easily.
It also makes it possible to develop
tools that can parse the metadata
and then use the model automatically.
For example, we developed the TFLite Codegen tool for Android,
which generates a model interface with high-level APIs
to interact with the model.
We'll talk about that in a minute.
We provide two sets of tools to develop
with TFLite metadata, one for model authors
and one for mobile developers.
Both of them are available through the TFLite Support pip
package.
For model authors, we provide Python tools
to create the metadata and pack the associated
files into the model.
The new TFLite model becomes a zip file
that contains both the model with metadata
and the associated files.
It can be unpacked with common zip tools.
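(For example, because the associated files are appended in zip format, you can peek inside a packed model with ordinary zip tooling; the file name below is a placeholder.)

```python
import zipfile

# A model with packed associated files can be read as a zip archive.
with zipfile.ZipFile('model_with_metadata.tflite') as packed_model:
    print(packed_model.namelist())  # e.g. ['labels.txt']
```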
One nice feature of the new model format
is that it is compatible with the existing TFLite
framework and interpreter.
For mobile developers, we provide the Codegen tool
that can parse metadata and then generate
an Android module that is ready to be
integrated into an Android app.
The generated module contains a model file
with metadata and associated files packed
in, and a Java file with an easy-to-use API for inference--
in this example, it's MyModel.java.
And it also has the Gradle files and manifest
file with proper configurations and also
the README file, which has a [INAUDIBLE] of example
usage of the model.
Let's now take a closer look at those two sets of tools.
First, about creating and populating metadata.
TFLite metadata is essentially a FlatBuffer file.
It can be created using the FlatBuffer API.
Here's an example of how to create metadata using
FlatBuffer Python Object API.
Then, once you have the metadata, you can populate it
and the associated files into a TFLite model
through the Python tool, metadata populator.
You can find the full example of how to work with metadata
on TensorFlow.org.
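(A trimmed-down sketch of that flow, following the example on TensorFlow.org; the names, descriptions, and file names here are placeholders, and the full example also fills in richer per-tensor metadata.)

```python
from tflite_support import flatbuffers
from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb

# General model information.
model_meta = _metadata_fb.ModelMetadataT()
model_meta.name = "My image classifier"
model_meta.description = "Identifies the most prominent object in an image."
model_meta.version = "v1"
model_meta.author = "TensorFlow"

# Minimal per-tensor metadata (one input, one output); the populator
# expects the tensor counts to match the model.
input_meta = _metadata_fb.TensorMetadataT()
input_meta.name = "image"
output_meta = _metadata_fb.TensorMetadataT()
output_meta.name = "probability"
subgraph = _metadata_fb.SubGraphMetadataT()
subgraph.inputTensorMetadata = [input_meta]
subgraph.outputTensorMetadata = [output_meta]
model_meta.subgraphMetadata = [subgraph]

# Serialize the metadata into a FlatBuffer.
builder = flatbuffers.Builder(0)
builder.Finish(model_meta.Pack(builder),
               _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)

# Pack the metadata and the associated files into the model.
populator = _metadata.MetadataPopulator.with_model_file("model.tflite")
populator.load_metadata_buffer(builder.Output())
populator.load_associated_files(["labels.txt"])
populator.populate()
```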
Besides using the general FlatBuffer bindings,
we provide two ways to read the metadata from a model.
The first one is a convenient Python tool,
metadata displayer, that converts the metadata
into a JSON format and then returns
a list of the associated files that are packed into the model.
Metadata displayer is accessible through the TFLite Support pip
package.
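(A short sketch of reading the metadata back out with that tool; the model file name is a placeholder.)

```python
from tflite_support import metadata as _metadata

displayer = _metadata.MetadataDisplayer.with_model_file("model.tflite")
print(displayer.get_metadata_json())                # the metadata as JSON
print(displayer.get_packed_associated_file_list())  # e.g. ['labels.txt']
```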
The second one is a Java library,
metadata extractor, which contains APIs to return
specific metadata fields and model specs,
such as the associated files, tensor metadata,
and the quantization parameters.
It can be integrated directly into an app
to replace those hard-coded configuration values.
Metadata extractor is now available on the Maven
repository.
Now that you have added metadata to a TFLite model,
let's see how the Android code generator makes
it easy to deploy the model.
Running inference involves more than just the model;
it also involves steps like pre- and post-processing and data
conversions.
Here are the steps needed when
you run inference with a TFLite model in a mobile app.
First, you need to load your model,
then transform your input data into the format
that the model can consume.
Then you run inference with the TFLite interpreter.
And, finally, you process your output result.
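(The talk focuses on Android, but the same four steps look like this with the TensorFlow Lite Python interpreter, shown here purely for illustration; the model path is a placeholder.)

```python
import numpy as np
import tensorflow as tf

# 1. Load the model and set up the interpreter.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# 2. Transform the input into the shape and dtype the model expects.
input_data = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])

# 3. Run inference.
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# 4. Post-process the output.
output = interpreter.get_tensor(output_details[0]['index'])
print(output.shape)
```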
Without the TFLite support library and Codegen,
you would have to write a lot of boilerplate code
to use a model, such as loading your model
and setting up the interpreter, allocating memory for the input
array, converting the native bitmap instance to an RGB float
array that the model can consume,
and also post-processing the outputs for an app.
But with Codegen, the wall of code
is reduced to just five lines of code,
with a single line for each step: load your model,
transform your data, run inference, and use
the resulting output.
The Codegen tool reads the metadata
and automatically generates a Java wrapper
with model-specific API.
It also generates a code snippet for you,
like the example we just saw for the image classifier.
It makes it extremely easy to consume a TFLite
model without any ML expertise.
Your mobile-developer teammates will feel much relieved now
that they don't have to worry about maintaining
a chunk of complex ML logic in their apps like before.
And here's how you use the Codegen tool, a very simple
command-line tool.
Just specify your TFLite model with metadata,
your preferred Java package name and class name,
and the destination directory.
The Codegen tool will automatically
generate an Android module that is
ready to be integrated into your app, as we
introduced in the previous slide.
The Codegen tool is also available through the TFLite
Support pip package.
We're also working with Android Studio
to integrate the Codegen into your favorite IDE,
so that you can generate the Java model
interface for your model by simply importing
the model into Android Studio.
Try it out in the Canary version of Android Studio.
Now, once your TFLite model is integrated
into your mobile app, you are ready to scale it
to billions of mobile users around the world.
TFLite works cross-platform, from mobile operating systems
like Android and iOS to IoT devices such as the Raspberry Pi.
We've also added support for more hardware accelerators,
such as the Qualcomm Hexagon DSP for Android and Core ML for iOS.
You can accelerate your model with those delegates
by just adding one line of code.
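(On Android and iOS you enable these delegates through the platform APIs; purely as an illustration of the one-extra-line idea, here is how an external delegate is attached with the Python interpreter. The delegate library path is a placeholder.)

```python
import tensorflow as tf

# Attach an external delegate to the interpreter; the .so path is a placeholder.
delegate = tf.lite.experimental.load_delegate('path/to/delegate_library.so')
interpreter = tf.lite.Interpreter(model_path='model.tflite',
                                  experimental_delegates=[delegate])
interpreter.allocate_tensors()
```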
Please follow up with another TensorFlow video which
talks about those delegates.
Also, feel free to visit our website
for more details of what we've just covered,
and also check out our demo videos for TFLite reference
apps.
Thanks.
[MUSIC PLAYING]