[MUSIC PLAYING]

LAURENCE MORONEY: Running inference on compute-heavy machine learning models, such as face and contour detection, on mobile devices can be very demanding due to the device's limited processing power, as well as considerations such as battery life. While converting to a fixed-point model is one way to achieve acceleration, we've also had lots of feedback asking about GPU support.

So now, we're excited to announce a developer preview of a GPU back end for TensorFlow Lite. This back end leverages OpenGL ES 3.1 compute shaders on Android and Metal compute shaders on iOS.

This release is a pre-compiled binary preview of the new GPU back end, giving you an early chance to play with it. We're working on a full open-source release later in 2019, and we'd love to hear your feedback as we shape it.

Let's take a look at its performance. We've experimented with the GPU back end on a number of models, including four public ones and two internal models that we use in Google products. The results showed that the GPU back end ran two to seven times faster than the floating-point CPU implementation. You can see the GPU inference time in orange and the CPU inference time in gray. We tested this on an iPhone 7, a Pixel 2, and a Pixel 3, amongst others, and this was the average inference time across 100 tests on each of these devices for each of these models.

The speed-up was most significant on more complex models, the kind that lend themselves better to GPU utilization. On smaller and simpler models, you may not see such a benefit because of, for example, the time cost of transferring data into GPU memory.

So I'm sure your next question is, how do I get started? Well, the easiest way is to try our demo apps that use the GPU delegate. We have written a tutorial to walk you through this, and I'll put a link to it in the description for this video. There are also a couple of really nice screencasts showing you how to get up and running quickly. There's one for Android and one for iOS, and I've linked to them too.

For Android, we've prepared a complete Android archive that includes TensorFlow Lite with the GPU back end. When you use this, you can then initialize the TensorFlow Lite interpreter using the GPU back end with code like the sketch below.
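
Here is a minimal sketch of that initialization in Java, based on the developer-preview API. The package for GpuDelegate and the loadModelFile() helper are assumptions that may differ in your setup:

    import org.tensorflow.lite.Interpreter;
    import org.tensorflow.lite.experimental.GpuDelegate; // preview package name; an assumption

    // Prepare the GPU delegate.
    GpuDelegate delegate = new GpuDelegate();

    // Attach the delegate to the interpreter options.
    Interpreter.Options options = new Interpreter.Options().addDelegate(delegate);

    // loadModelFile() is a hypothetical helper that memory-maps your .tflite model.
    Interpreter interpreter = new Interpreter(loadModelFile(), options);

    // Run inference as usual; supported ops execute on the GPU.
    // inputTensor and outputTensor are your pre-allocated input/output buffers.
    interpreter.run(inputTensor, outputTensor);

    // Release resources when you're done.
    interpreter.close();
    delegate.close();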

On iOS, in C++, you can call ModifyGraphWithDelegate after creating your model, along the lines of the sketch below.
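
A rough sketch of the C++ flow, assuming the preview's Metal delegate API; the NewGpuDelegate/DeleteGpuDelegate names and the header path are taken from the preview announcement and may have changed since:

    #include "tensorflow/lite/delegates/gpu/metal_delegate.h" // assumed preview header path

    // Create the GPU delegate with default options.
    auto* delegate = NewGpuDelegate(/*options=*/nullptr);

    // Hand the supported parts of the graph to the GPU.
    if (interpreter->ModifyGraphWithDelegate(delegate) != kTfLiteOk) {
      // Delegate could not be applied; the interpreter still runs on the CPU.
    }

    // Run inference as usual.
    if (interpreter->Invoke() != kTfLiteOk) {
      // Handle the error.
    }

    // Clean up after the interpreter is released
    // (assuming interpreter is a std::unique_ptr<tflite::Interpreter>).
    interpreter = nullptr;
    DeleteGpuDelegate(delegate);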

All of this is done for you in the sample app, so download it and give it a try.

Now, not all operations are supported by the GPU back end at this moment. Your model will run fastest when it uses only GPU-supported ops; any others will automatically fall back to the CPU.

To learn more about GPU support, including a deep dive into how it works, check out the documentation on TensorFlow.org, which also has a bunch of details on optimizations, performance best practices, and a whole lot more.

This is just the beginning of our GPU support efforts. We're continuing to add more optimizations, performance improvements, and API updates all the time, and we'd love to hear your feedback on our GitHub page.

We'd also love to hear your feedback on YouTube, so don't forget to hit that Subscribe button. And if you have any questions or comments, please leave them in the comments below. Thank you.

Are you excited to see what's new with TensorFlow in 2019? We've got lots of great new content coming, and it's all going to be covered right here on the YouTube channel. So whatever you do, don't forget to hit that Subscribe button, and we'll see you there.

[MUSIC PLAYING]
