
  • [MUSIC PLAYING]

  • NICHOLAS GILLIAN: OK.

  • We've just heard from the TensorFlow Lite team

  • how it's getting even easier to run machine

  • learning directly on devices.

  • And I'm sure this got many of you thinking,

  • what's possible here?

  • Now, I'm going to tell you about how

  • we're using Jacquard to do exactly this,

  • embed machine learning directly into everyday objects.

  • Before I jump into the details, let

  • me tell you a little bit about the Jacquard platform.

  • So Jacquard is a machine learning

  • powered ambient computing platform

  • that extends everyday objects with extraordinary powers.

  • At the core of the Jacquard platform is the Jacquard tag.

  • This is a tiny embedded computer that

  • can be seamlessly integrated into everyday objects,

  • like your favorite jacket, backpack, or pair of shoes.

  • The tag features a small embedded ARM processor

  • that allows us to run ML models directly on the tag with only

  • sparse gesture or motion predictions

  • being emitted via BLE to your phone when detected.
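
To make this concrete, here is a minimal C sketch of the kind of gated inference loop described here: run the model over each window of sensor data, and only wake the radio when a gesture is actually detected. Every name in it (imu_read_window, model_predict, ble_notify) is hypothetical, not the real Jacquard firmware API.

```c
#include <stdint.h>
#include <stdbool.h>

#define WINDOW_LEN 64  /* samples per inference window (assumed) */

typedef enum {
    GESTURE_NONE = 0,
    GESTURE_DOUBLE_TAP,
    GESTURE_COVER,
    GESTURE_SWIPE_IN,
    GESTURE_SWIPE_OUT,
} gesture_t;

/* Hypothetical driver, model, and radio hooks -- illustrative only. */
extern bool imu_read_window(int16_t *buf, int len);
extern gesture_t model_predict(const int16_t *buf, int len);
extern void ble_notify(uint8_t event_code);

void inference_loop(void)
{
    static int16_t window[WINDOW_LEN * 3];  /* x, y, z accelerometer axes */

    for (;;) {
        if (!imu_read_window(window, WINDOW_LEN * 3))
            continue;  /* no new window of data yet */

        gesture_t g = model_predict(window, WINDOW_LEN * 3);

        /* Only sparse events leave the tag; raw sensor data never
           goes over the air, which keeps radio and power cost low. */
        if (g != GESTURE_NONE)
            ble_notify((uint8_t)g);
    }
}
```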

  • What's interesting is that the tag has a modular design, where

  • the ML models can either run directly

  • on the tag as a standalone unit or via additional low-power

  • compute modules that can be attached along

  • with other sensors, custom LEDs, or haptic motors.

  • A great example of this is the Levi's Trucker Jacket

  • that I'm wearing.

  • Let me show you how this works.

  • So if we can switch over to the overhead camera,

  • so here I can take the Jacquard tag and add it

  • to a specially designed sensor module which

  • is integrated into the jacket.

  • Let me check that again.

  • What happens now is that the tag talks to an M0 processor that's

  • embedded in the jacket itself, which

  • is talking to some integrated sensor lines in the jacket.

  • The M0 processor not only reads data from the sensor lines,

  • but it also allows us to run ML directly on the jacket.

  • This allows us to do gestures, for example, on the jacket

  • to control some music.

  • So for example, I can do a double tap gesture,

  • and this can start to play some music.

  • Or I can use a cover gesture to silence it.

  • Users can also use swipe in and swipe out gestures

  • to control their music, drop pins on maps,

  • or whatever they'd like, depending

  • on the abilities in the app.

  • What's important here is that all of the gesture recognition

  • is actually running on the M0 processor.

  • This means that we can run these models at super low power,

  • sending only the events to the user's phone via the Jacquard

  • app.

  • So I'm sure many of you are wondering how we're actually

  • getting our ML models deployed--

  • in this case, in a jacket.

  • And by the way, this is a real product

  • that you can go to your Levi's store and buy today.

  • So as most of you know, there are three big on-device ML

  • challenges that need to be addressed to enable platforms

  • like Jacquard.

  • So first is how can we train high-quality ML models that

  • can fit on memory-constrained devices?

  • Second, let's assume we've solved problem one and have

  • a TensorFlow model that's small enough to fit within our memory

  • constraints.

  • How can we actually get it running

  • on low compute embedded devices for real-time inference?

  • And third, even if we solve problems one and two,

  • it's not going to be a great user experience

  • if the user has to keep charging their jackets

  • or backpacks every few hours.

  • So how can we ensure the ML model's always

  • ready to respond to a user's actions

  • when needed while still providing multi-day experiences

  • on a single charge?

  • Specifically for Jacquard, these challenges have mapped

  • to deploying models as small as 20 kilobytes,

  • in the case of the Levi's jacket,

  • or running ML models on low-compute microprocessors,

  • like a Cortex-M0+, which is what's embedded here

  • in the cuff of the jacket.

  • To show you how we've addressed these challenges for Jacquard,

  • I'm going to walk you through a specific case study for one

  • of our most recent products, so recent, in fact, that it

  • actually launched yesterday.

  • First, I'll describe the product at a high level,

  • and then we can review how we've trained and deployed

  • ML models that in this case fit in your shoe.

  • So the latest Jacquard-enabled product is called GMR.

  • This is an exciting new product that's

  • being built in collaboration between Google, Adidas,

  • and the EA Sports FIFA Mobile Team.

  • With GMR, you can take the same tag that's

  • in your jacket, insert it into an Adidas insole,

  • and go out in the world and play soccer.

  • So you can see here where the tag inserts at the back.

  • The ML models in the tag will be able to detect

  • your kicks, your motion, your sprints, how far you've run,

  • your top speed.

  • We can even estimate the speed of the ball as you kick it.

  • Then after you play, the stream of predicted soccer events

  • will be synced with your virtual team in the EA FIFA Mobile

  • Game, where you'll be rewarded with points by completing

  • various weekly challenges.

  • This is all powered by our ML algorithms that run directly

  • in your shoe as you play.
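
As a rough illustration of what that stream of predicted events could look like, here is a hypothetical fixed-size C record that a tag could buffer locally until the phone reconnects. The fields and units are assumptions for illustration, not the actual GMR format.

```c
#include <stdint.h>

typedef enum {
    EVT_KICK = 1,
    EVT_SPRINT,
    EVT_SESSION_SUMMARY,
} soccer_event_type_t;

typedef struct {
    uint32_t timestamp_ms;      /* time since session start */
    uint8_t  type;              /* a soccer_event_type_t value */
    uint16_t ball_speed_dms;    /* estimated ball speed, decimeters/second */
    uint16_t player_speed_dms;  /* player speed at the event */
    uint32_t distance_m;        /* cumulative distance run, meters */
} soccer_event_t;               /* small enough to log thousands in flash */
```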

  • So GMR is a great example of where

  • running ML inference on device really pays off,

  • as players will typically leave their phone in the locker room

  • and go out and play for up to 90 minutes

  • with just the tag in their shoes.

  • Here, you really need to have the ML models run directly

  • on-device and be smart enough to know when to turn off

  • when the user is clearly not playing soccer

  • to help save power.

  • So this figure gives you an idea of just how interesting

  • a machine learning problem this is.

  • Unlike, say, normal running, where

  • you would expect to see a nice, smooth, periodic signal

  • over time, soccer motions are a lot more dynamic.

  • For example, in just eight seconds of data here,

  • you can see that the player moves

  • from a stationary position on the left, starts to run,

  • breaks into a sprint, kicks the ball,

  • and then slows down again to a jog

  • all within an eight-second window.

  • For GMR, we needed our ML models to be responsive enough

  • to capture these complex motions and work across a diverse range

  • of players.

  • Furthermore, this all needs to fit within the constraints

  • of the Jacquard tag.

  • For GMR, we have the following on-device memory constraints.

  • We have around 80 kilobytes of ROM,

  • which needs to be used not just for the model weights,

  • but also the required ops, the model graphs, and, of course,

  • the supporting code required for plumbing everything

  • together so this can be plugged into the Jacquard OS.

  • We also have around 16 kilobytes of RAM,

  • which is needed to buffer the raw incoming sensor data and can

  • also be used as scratch buffers for the actual ML

  • inference in real time.
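
A minimal sketch of how such a RAM budget might be laid out in C follows, with a ring buffer for raw sensor samples and a shared scratch arena reused by every model. The exact split is an assumption; the real firmware's numbers are not public.

```c
#include <stdint.h>

#define SENSOR_RING_BYTES (8 * 1024)  /* buffered raw sensor samples */
#define SCRATCH_BYTES     (6 * 1024)  /* activations and temporaries,
                                         reused by every model in turn */

static int16_t sensor_ring[SENSOR_RING_BYTES / sizeof(int16_t)];
static uint8_t scratch[SCRATCH_BYTES] __attribute__((aligned(4)));

/* Model weights are const, so the linker places them in ROM (flash)
   alongside the code -- they cost no RAM at all. */
extern const uint8_t kick_model_weights[];  /* emitted by the exporter */
```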

  • So how do we train models that can detect kicks, a player's

  • speed and distance, and even estimate the ball

  • speed within these constraints?

  • Well, the first step is we don't--

  • well, at least initially.

  • We train much larger models in the cloud

  • to see just how far we can push the model's performance.

  • In fact, this is using TFX, which

  • is one of the systems that was shown earlier today.

  • This helps inform the design of the problem space

  • and guide what additional data needs to be collected

  • to boost the model's quality.

  • After we start to achieve good model performance

  • in the cloud, without the constraints,

  • we then use these learnings to design much smaller models that

  • start to approach the constraints of the firmware.

  • This is also when we start to think about not

  • just how the models can fit within the low compute

  • and low memory constraints, but how they can run at low power

  • to support multi-day use cases.

  • For GMR, this led us to design an architecture that

  • consists of not one, but four neural networks that all

  • work coherently.

  • This design is based on the insight

  • that even during an active soccer match,

  • a player only kicks the ball during a small fraction

  • of gameplay.

  • We therefore use much smaller models

  • that are tuned for high recall to first predict

  • if a potential kick or active motion is detected.

  • If not, there's no need to trigger the larger, more

  • precise models in the pipeline.
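
Here is a minimal sketch of what such a cascade could look like in C, with cheap high-recall gates deciding whether the larger models run at all. All model functions are illustrative stand-ins for the four networks in the pipeline.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the four networks. */
extern bool  motion_gate(const int16_t *win, int n);        /* tiny, high recall */
extern bool  kick_gate(const int16_t *win, int n);          /* tiny, high recall */
extern float kick_classifier(const int16_t *win, int n);    /* larger, precise */
extern float ball_speed_estimator(const int16_t *win, int n);

extern void record_kick(float ball_speed);  /* assumed event sink */

void process_window(const int16_t *win, int n)
{
    /* Most windows are rejected by the cheap gate, so the expensive
       models rarely run and average power stays low. */
    if (!motion_gate(win, n))
        return;

    if (kick_gate(win, n) && kick_classifier(win, n) > 0.5f)
        record_kick(ball_speed_estimator(win, n));
}
```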

  • So how do we actually get our multiple neural networks

  • to run on the tag?

  • To do this, we have built a custom C model exporter.

  • The model exporter is a Python tool

  • that pulls the C ops each graph needs from a lookup table.

  • It then generates custom C code: a lightweight ops

  • library that can be shared across multiple graphs,

  • and the actual .h and .c code that you get for each model.

  • This allows us to have zero dependency overheads

  • for our models and make every byte count.
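
The talk doesn't show the generated files themselves, but a plausible shape for the exporter's per-model output is a tiny header plus a .c file that chains shared ops over const weight arrays, along the lines of this sketch. Names, shapes, and signatures are guesses, not the actual generated code.

```c
/* kick_model.h (generated) */
#ifndef KICK_MODEL_H
#define KICK_MODEL_H
void kick_model_invoke(const float *input /* [8] */, float *output /* [2] */);
#endif

/* kick_model.c (generated) */
#include "kick_model.h"

/* One op from the shared lightweight ops library (assumed signature). */
static void op_fully_connected(const float *in, int in_len,
                               const float *w, const float *b,
                               float *out, int out_len)
{
    for (int o = 0; o < out_len; o++) {
        float acc = b[o];
        for (int i = 0; i < in_len; i++)
            acc += in[i] * w[o * in_len + i];
        out[o] = acc;
    }
}

/* const weight tables end up in ROM; real values would be
   emitted by the exporter. */
static const float fc_w[2 * 8] = { 0 };
static const float fc_b[2]     = { 0 };

void kick_model_invoke(const float *input, float *output)
{
    /* The exporter emits one call per graph node. */
    op_fully_connected(input, 8, fc_w, fc_b, output, 2);
}
```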

  • Here you can see an example of one of the C ops

  • that would be called from the library.

  • So this is for a rank three transpose operation,

  • which supports multiple IO types,

  • such as int8s or 32-bit floats.
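
The slide isn't reproduced here, but a rank-3 transpose that supports multiple element types can be written once over raw bytes, copying elements by size, as in this sketch (the signature is an assumption):

```c
#include <stddef.h>
#include <string.h>

/* dims: input shape {d0, d1, d2}; perm: axis permutation, e.g. {2, 0, 1}.
   Output axis a takes its extent from input axis perm[a]. */
void transpose_rank3(const void *in, void *out,
                     const int dims[3], const int perm[3],
                     size_t elem_size)
{
    const char *src = (const char *)in;
    char *dst = (char *)out;

    /* Output shape after permutation. */
    int od[3] = { dims[perm[0]], dims[perm[1]], dims[perm[2]] };

    /* Strides (in elements) of each input axis, row-major layout. */
    int stride[3] = { dims[1] * dims[2], dims[2], 1 };

    for (int i = 0; i < od[0]; i++)
        for (int j = 0; j < od[1]; j++)
            for (int k = 0; k < od[2]; k++) {
                /* Input element corresponding to output [i][j][k]. */
                size_t src_idx = (size_t)i * stride[perm[0]]
                               + (size_t)j * stride[perm[1]]
                               + (size_t)k * stride[perm[2]];
                size_t dst_idx = ((size_t)i * od[1] + j) * od[2] + k;
                memcpy(dst + dst_idx * elem_size,
                       src + src_idx * elem_size, elem_size);
            }
}
```

For example, transpose_rank3(in, out, (int[]){2, 3, 4}, (int[]){2, 0, 1}, sizeof(float)) would turn a 2x3x4 float tensor into a 4x2x3 one; passing elem_size 1 serves int8 tensors with the same code.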

  • So with this, you can see how we're

  • taking our neural networks and actually

  • getting them to run on the Jacquard tag live.

  • I hope that you're inspired by projects like Jacquard,

  • and this makes you think about things that you could possibly

  • do with tools like TF Lite Micro to actually build

  • your own embedded ML applications.

  • [MUSIC PLAYING]
