TensorFlow Lite, Experimental GPU Delegate (Coding TensorFlow)

[MUSIC PLAYING]

LAURENCE MORONEY: Running inference on compute-heavy machine learning models, such as face and contour detection, on mobile devices can be very demanding due to the device's limited processing power, as well as considerations such as battery life. While converting to a fixed-point model is one way to achieve acceleration, we've also had lots of feedback asking for GPU support. So now we're excited to announce a developer preview of a GPU back end for TensorFlow Lite. This back end leverages OpenGL ES 3.1 compute shaders on Android, and Metal compute shaders on iOS.

This release is a pre-compiled binary preview of the new GPU back end, giving you an early chance to play with it. We're working on a full open-source release later in 2019, and would love to hear your feedback as we shape it.

Let's take a look at its performance. We've experimented with the GPU back end on a number of models, including four public ones and two internal models that we use in Google products. The results showed that the GPU back end ran two to seven times faster than a floating-point CPU implementation. You can see the GPU inference time in orange and the CPU inference time in gray. We tested this on an iPhone 7, a Pixel 2, and a Pixel 3, amongst others, and this was the average inference time across 100 runs on each of these devices for each of these models. The speedup was most significant on more complex models, the kind that lend themselves better to GPU utilization. On smaller and simpler models, you may not see such a benefit because of, for example, the time cost of transferring data into GPU memory.

So I'm sure your next question is, how do I get started? Well, the easiest way is to try our demo apps that use the GPU delegate. We have written a tutorial to walk you through this, and I'll put a link to it in the description for this video. There are also a couple of really nice screencasts showing you how to get up and running quickly, one for Android and one for iOS, and I've linked to them too.

For Android, we've prepared a complete Android archive that includes TensorFlow Lite with the GPU back end. When you use this, you can initialize the TensorFlow Lite interpreter using the GPU back end with just a few lines of code (a hedged Java sketch follows this transcript). On iOS, in C++, you can call ModifyGraphWithDelegate after creating your model (a C++ sketch also follows below). All of this is done for you in the sample app, so download it and give it a try.

Now, not all operations are supported by the GPU back end at this moment. Your model will run fastest when it uses only GPU-supported ops; the others will automatically fall back to the CPU. To learn more about the GPU back end, including a deep dive into how it works, check out the documentation on TensorFlow.org, which also has a bunch of detail on optimizations, performance best practices, and a whole lot more.

This is just the beginning of our GPU support efforts. We're continuing to add more optimizations, performance improvements, and API updates all the time, and we'd love to hear your feedback on our GitHub page. We'd also love to hear your feedback on YouTube, so don't forget to hit that Subscribe button. And if you have any questions or comments, please leave them below. Thank you.

Are you excited to see what's new with TensorFlow in 2019? We've got lots of great new content coming, and it's all going to be covered right here on the YouTube channel. So whatever you do, don't forget to hit that Subscribe button, and we'll see you there.
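For reference, here is a minimal sketch of the Android initialization described above, using the experimental GpuDelegate API from the developer preview. The `model`, `input`, and `output` variables are placeholders for your own TFLite flatbuffer and tensors.

```java
import org.tensorflow.lite.Interpreter;
import org.tensorflow.lite.gpu.GpuDelegate;

// Create the experimental GPU delegate and register it
// with the interpreter's options.
GpuDelegate delegate = new GpuDelegate();
Interpreter.Options options = new Interpreter.Options().addDelegate(delegate);

// `model` is a MappedByteBuffer holding your .tflite model (placeholder).
Interpreter interpreter = new Interpreter(model, options);

// Run inference as usual; supported ops run on the GPU,
// anything unsupported falls back to the CPU automatically.
interpreter.run(input, output);

// Release GPU resources when you're done.
delegate.close();
interpreter.close();
```

The only change from a plain CPU setup is the `addDelegate` call, which is what makes the back end easy to try in an existing app.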
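And here is a minimal sketch of the iOS/C++ path. The header path and the NewGpuDelegate/DeleteGpuDelegate factory functions reflect the preview-era Metal delegate API and may differ in later releases; the model file name is a placeholder.

```cpp
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/delegates/gpu/metal_delegate.h"

// Build the interpreter from a flatbuffer model as usual.
auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::InterpreterBuilder(*model, resolver)(&interpreter);

// Create the Metal GPU delegate (nullptr selects default options)
// and hand the supported parts of the graph over to it.
auto* delegate = NewGpuDelegate(/*options=*/nullptr);
if (interpreter->ModifyGraphWithDelegate(delegate) != kTfLiteOk) {
  // Delegation failed; the model still runs entirely on the CPU.
}

// Fill input tensors, then run inference.
interpreter->Invoke();

// Clean up the delegate once inference is finished.
DeleteGpuDelegate(delegate);
```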