[MUSIC PLAYING]

NICHOLAS GILLIAN: OK. We've just heard from the TensorFlow Lite team how it's getting even easier to place machine learning directly on devices. And I'm sure this got many of you thinking, what's possible here? Now, I'm going to tell you about how we're using Jacquard to do exactly this: embed machine learning directly into everyday objects.

Before I jump into the details, let me tell you a little bit about the Jacquard platform. Jacquard is a machine learning powered ambient computing platform that extends everyday objects with extraordinary powers. At the core of the Jacquard platform is the Jacquard tag. This is a tiny embedded computer that can be seamlessly integrated into everyday objects, like your favorite jacket, backpack, or pair of shoes. The tag features a small embedded ARM processor that allows us to run ML models directly on the tag, with only sparse gesture or motion predictions being emitted via BLE to your phone when detected. What's interesting is that the tag has a modular design, where the ML models can either run directly on the tag as a standalone unit, or via additional low-power compute modules that can be attached along with other sensors, custom LEDs, or haptic motors.

A great example of this is the Levi's Trucker Jacket that I'm wearing. Let me show you how this works. So, if we can switch over to the overhead camera: here I can take the Jacquard tag and add it to a specially designed sensor module that's integrated into the jacket. Let me check that again. What happens now is that this talks to an M0 processor that's running on the jacket itself, which is talking to some integrated sensor lines in the jacket. The M0 processor not only reads data from the sensor lines, but it also allows us to run ML directly on the jacket itself. This allows us to do gestures on the jacket, for example, to control some music. So I can do a double-tap gesture, and this can start to play some music. Or I can use a cover gesture to silence it. Users can also use swipe-in and swipe-out gestures to control their music, drop pins on maps, or whatever they'd like, depending on the capabilities in the app.

What's important here is that all of the gesture recognition is actually running on the M0 processor. This means that we can run these models at super low power, sending only the events to the user's phone via the Jacquard app.

So I'm sure many of you are wondering how we're actually getting our ML models deployed, in this case, in a jacket. And by the way, this is a real product that you can go to your Levi's store and buy today. As most of you know, there are three big on-device ML challenges that need to be addressed to enable platforms like Jacquard. First, how can we train high-quality ML models that can fit on memory-constrained devices? Second, let's assume we've solved problem one and have a TensorFlow model that's small enough to fit within our memory constraints. How can we actually get it running on low-compute embedded devices for real-time inference? And third, even if we solve problems one and two, it's not going to be a great user experience if the user has to keep charging their jacket or backpack every few hours. So how can we ensure the ML models are always ready to respond to a user's actions when needed, while still providing multi-day experiences on a single charge?
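Before looking at how Jacquard addresses these challenges, here's a minimal sketch of the event-driven pattern described above for the jacket: inference runs continuously on the embedded processor, and only sparse gesture events cross the BLE link. The sensor, model, and BLE hooks (read_sensor_frame, run_gesture_model, ble_notify_gesture) are hypothetical placeholders, not Jacquard APIs.

```c
/*
 * Sketch of on-device gesture gating (illustrative, not Jacquard code).
 * Raw sensor data and per-frame predictions never leave the device;
 * only discrete gesture events are notified over BLE.
 */
#include <stdint.h>

typedef enum {
    GESTURE_NONE = 0,
    GESTURE_DOUBLE_TAP,
    GESTURE_COVER,
    GESTURE_SWIPE_IN,
    GESTURE_SWIPE_OUT,
} gesture_t;

/* Hypothetical hooks into the sensor driver, model, and BLE stack. */
extern int read_sensor_frame(float *frame, int len);
extern gesture_t run_gesture_model(const float *frame, int len);
extern void ble_notify_gesture(uint8_t gesture_id);

void gesture_loop(void)
{
    float frame[32]; /* one window of touch/motion samples */

    for (;;) {
        if (read_sensor_frame(frame, 32) <= 0) {
            continue; /* no new data yet */
        }

        gesture_t g = run_gesture_model(frame, 32);

        /* Only a detected gesture crosses the radio. */
        if (g != GESTURE_NONE) {
            ble_notify_gesture((uint8_t)g);
        }
    }
}
```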
Specifically for Jacquard, these challenges have mapped to deploying models as small as 20 kilobytes, in the case of the Levi's jacket, or running ML models on low-compute microprocessors, like a Cortex-M0+, which is what's embedded here in the cuff of the jacket. To show you how we've addressed these challenges for Jacquard, I'm going to walk you through a specific case study for one of our most recent products, so recent, in fact, that it actually launched yesterday. First, I'll describe the product at a high level, and then we can review how we've trained and deployed ML models that, in this case, fit in your shoe.

So the latest Jacquard-enabled product is called GMR. This is an exciting new product that's being built in collaboration between Google, Adidas, and the EA Sports FIFA Mobile team. With GMR, you can insert the same tag that's used in your jacket into an Adidas insole and go out in the world and play soccer. You can see here where the tag inserts at the back. The ML models in the tag will be able to detect your kicks, your motion, your sprints, how far you've run, and your top speed. We can even estimate the speed of the ball as you kick it. Then, after you play, the stream of predicted soccer events will be synced with your virtual team in the EA FIFA Mobile game, where you'll be rewarded with points for completing various weekly challenges. This is all powered by our ML algorithms that run directly in your shoe as you play.

GMR is a great example of where running ML inference on device really pays off, as players will typically leave their phone in the locker room and go out and play for up to 90 minutes with just the tag in their shoes. Here, you really need the ML models to run directly on-device and be smart enough to know when to turn off when the user is clearly not playing soccer, to help save power.

This figure gives you an idea of just how interesting a machine learning problem this is. Unlike, say, normal running, where you would expect to see a nice, smooth, periodic signal over time, soccer motions are a lot more dynamic. For example, in just eight seconds of data here, you can see that the player moves from a stationary position on the left, starts to run, breaks into a sprint, kicks the ball, and then slows down again to a jog, all within an eight-second window. For GMR, we needed our ML models to be responsive enough to capture these complex motions and work across a diverse range of players.

Furthermore, this all needs to fit within the constraints of the Jacquard tag. For GMR, we have the following on-device memory constraints. We have around 80 kilobytes of ROM, which needs to be used not just for the model weights, but also the required ops, the model graphs, and, of course, the supporting code required for plumbing everything together so this can be plugged into the Jacquard OS. We also have around 16 kilobytes of RAM, which is needed to buffer the raw sensor data and is also used as scratch buffers for the actual ML inference in real time.

So how do we train models that can detect kicks, a player's speed and distance, and even estimate the ball speed within these constraints? Well, the first step is: we don't. At least, not initially. We train much larger models in the cloud to see just how far we can push the model's performance. In fact, this is using TFX, which is one of the systems that was shown earlier today.
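To make those memory constraints a bit more concrete, here's a rough sketch, with illustrative sizes and hypothetical buffer names (not the Jacquard firmware), of how a roughly 16-kilobyte RAM budget might be split statically between the raw sensor buffer and the inference scratch space, so the worst case is known at compile time.

```c
/*
 * Illustrative static memory budget for a small embedded ML pipeline.
 * All buffers are allocated at compile time; there is no heap.
 */
#include <stdint.h>

#define RAM_BUDGET_BYTES   (16 * 1024)

/* Ring buffer for incoming motion samples (sizes are illustrative). */
#define IMU_CHANNELS       6     /* e.g. 3-axis accel + 3-axis gyro */
#define IMU_WINDOW_SAMPLES 400
static int16_t g_imu_ring[IMU_WINDOW_SAMPLES][IMU_CHANNELS]; /* 4800 B */

/* Scratch arena reused by every model's intermediate activations. */
#define SCRATCH_BYTES      (8 * 1024)
static uint8_t g_scratch[SCRATCH_BYTES];

/* Fail the build if the ML buffers blow the RAM budget,
 * leaving some headroom for the stack and the rest of the OS. */
_Static_assert(sizeof(g_imu_ring) + sizeof(g_scratch)
                   <= RAM_BUDGET_BYTES - 2 * 1024,
               "ML buffers exceed the RAM budget");
```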
Training these larger models in the cloud helps inform the design of the problem space and guide what additional data needs to be collected to boost the model's quality. After we start to achieve good model performance in the cloud, without the device constraints, we then use these learnings to design much smaller models that start to approach the constraints of the firmware. This is also when we start to think about not just how the models can fit within the low-compute and low-memory constraints, but how they can run at low power to support multi-day use cases.

For GMR, this led us to design an architecture that consists of not one, but four neural networks that all work together. This design is based on the insight that even during an active soccer match, a player only kicks the ball during a small fraction of gameplay. We therefore use much smaller models that are tuned for high recall to first predict whether a potential kick or active motion is detected. If not, there's no need to trigger the larger, more precise models in the pipeline.

So how do we actually get our multiple neural networks to run on the tag? To do this, we've built a custom C model exporter. The exporter is a Python tool that maps the ops in each model graph to C implementations from a lookup table. It then generates custom C code: a lightweight ops library that can be shared across multiple graphs, plus the actual .h and .c code that you get for each model. This allows us to have zero dependency overhead for our models and make every byte count. Here, for example, you can see an example of one of the C ops in the shared library. This is a rank-3 transpose operation, which supports multiple IO types, such as int8 or 32-bit float.

So with this, you can see how we're taking our neural networks and actually getting them to run live on the Jacquard tag. I hope that you're inspired by projects like Jacquard, and that it makes you think about things you could do with tools like TF Lite Micro to build your own embedded ML applications.

[MUSIC PLAYING]
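As a reference for the rank-3 transpose op described above, here's a rough sketch, assuming a row-major tensor layout and illustrative function names, of what such a C op could look like; it is not the actual exporter output. It works on raw bytes, so the same routine covers both int8 and 32-bit float tensors via an element-size parameter.

```c
/* Sketch of a rank-3 transpose op for a tiny generated-C ops library. */
#include <stdint.h>
#include <string.h>

/* Transpose a rank-3 tensor stored row-major.
 * in_dims   - input shape {D0, D1, D2}
 * perm      - permutation of {0, 1, 2}; output dim i has size in_dims[perm[i]]
 * elem_size - bytes per element (1 for int8, 4 for float32) */
static void transpose_rank3(const uint8_t *in, uint8_t *out,
                            const int32_t in_dims[3],
                            const int32_t perm[3],
                            size_t elem_size)
{
    const int32_t out_dims[3] = {
        in_dims[perm[0]], in_dims[perm[1]], in_dims[perm[2]]
    };
    /* Strides (in elements) of the input tensor, row-major layout. */
    const int32_t in_strides[3] = {
        in_dims[1] * in_dims[2], in_dims[2], 1
    };

    size_t out_idx = 0;
    for (int32_t i = 0; i < out_dims[0]; ++i) {
        for (int32_t j = 0; j < out_dims[1]; ++j) {
            for (int32_t k = 0; k < out_dims[2]; ++k) {
                /* Output coordinate d corresponds to input axis perm[d]. */
                const int32_t out_coord[3] = { i, j, k };
                size_t in_idx = 0;
                for (int d = 0; d < 3; ++d) {
                    in_idx += (size_t)out_coord[d] * in_strides[perm[d]];
                }
                memcpy(out + out_idx * elem_size,
                       in + in_idx * elem_size, elem_size);
                ++out_idx;
            }
        }
    }
}
```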