Hello, I'm Martin Kronberg and welcome to the IoT Developer Show. In these episodes, we're taking a deep dive into OpenVINO, Intel's new toolkit for AI and computer vision development. In the previous episode, we gave a high-level overview of OpenVINO. And in this one, we will take a look at OpenVINO in much more detail. [MUSIC PLAYING]

First things first, let's take a look at the OpenVINO architecture. OpenVINO is a set of tools designed to help you build smart video applications. These tools can be broken down into two parts, a deep learning deployment toolkit and a traditional computer vision toolkit. The deep learning toolkit consists of the inference engine, which runs the deep learning model, a model optimizer used to convert and optimize existing models from other frameworks, and a set of prebuilt deep learning models. Also included are libraries to optimize the running of models, with the Math Kernel Library for Deep Neural Networks and the Compute Library for Deep Neural Networks to optimize deep neural networks on CPU and GPU, respectively. On the other hand, the traditional computer vision toolkit consists of OpenCV 3.3, which is a popular library for computer vision, the Intel Media SDK, used to leverage fast hardware encode and decode of video, and OpenCL drivers and runtimes in order to access the onboard Intel GPU effectively.

In order to give you guys a better understanding of how all of these features work together, I want to walk you through a sample deep neural network computer vision workflow. Let's say that I have a specific computer vision application in mind, and that I want to use OpenVINO. The first thing I can do is look online to see what pretrained models exist for me to use. If you can find a pretrained model that meets your needs, it's going to save you a lot of time versus having to train one yourself. I can go under software at Intel.com/OpenVINO and look at all the models available.
We have models that detect people, license plates, roadside objects, even models that detect emotion on faces, like we saw in the last episode. If one of these does not fit my needs, I can search for more models from any of the popular frameworks, including Caffe, TensorFlow, or MXNet. OpenVINO has a tool called the model downloader, which is a script that pulls all the necessary files for a model, including topology, weights, and labels, and makes sure that their naming conventions are compatible with the model optimizer. Once I have that model downloaded, I can use the model optimizer, which is a Python script, to convert the model into the intermediate representation format that the OpenVINO inference engine uses. For this workflow example, let's say that I'm building out a people tracker and the pedestrian tracking model works for me. So I'm going to use that.

Now that I have an idea of the model that I'm using, how should I go about developing an OpenVINO app? The first thing to think about is what IDE you will be using. While there are many options, I would suggest trying Intel System Studio 2019. In this newly released version of our development platform, integration with OpenVINO is super simple. And if you have used Eclipse-based IDEs in the past, it'll be very familiar to you. In addition to the debugging capabilities that you get with Intel System Studio, you can also use it to leverage VTune, a powerful performance analyzer. If you want to optimize your application, VTune gives you a lot of insight into how various process threads are performing on the CPU. In fact, it can even tell you how the various layers of your inference model are running. So if you have a bottleneck happening on one particular layer, a convolutional layer, for example, you can work to reduce its complexity or even send it to the GPU for processing to increase overall performance.
After you have your IDE set up, I would say go on our GitHub to explore some of the reference implementations there. To get an idea of how to leverage this model, I could take a look at the store traffic monitor reference implementation for installation and deployment information. Right now the sample is using OpenCV and FFmpeg to do the video stream encoding and decoding. However, I could use the Media SDK encode/decode functionality to get more optimized performance.

What decoding does is transform an MP4 or other video format into pixel value arrays for each frame. I'm going to be doing every operation on the image on a frame-by-frame basis. Now, before I can run my inference model, I want to do some image preprocessing, let's say denoise, convert to grayscale, or resize. I would use OpenCV to perform all those initial image transformations on each of my frames. Next, I will put my processed frame into the inference engine using the model I found earlier. This will analyze the image and will identify people in the frame as well as the bounding boxes around them. Now, if I want to display those labels and bounding boxes on my image, I will use OpenCV again to draw that information onto the frame. And finally, I want to encode all of those frames into an output video file or stream. Once again, I can either use OpenCV and FFmpeg or Media SDK.

Now, let's take a look at the end result of that whole pipeline. Here we see a retail environment where we can keep track of people as they enter and leave. We can also track inventory by seeing when people pick bottles off of a shelf. And that's it for today's episode. Tune in in two weeks and we'll discuss the various hardware kits that you can use to develop OpenVINO applications. Thanks for watching. Follow the links provided. And see you guys next time. [DING]
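
Editor's note: the frame-by-frame workflow described in the episode (decode, preprocess, infer, draw, encode) can be sketched in a few lines of Python. This is only a minimal illustration of the stage ordering, not the store traffic monitor code. It uses the standard library alone, and every name in it (`Frame`, `Detection`, `preprocess`, `fake_infer`, `draw`, `run_pipeline`) is hypothetical. In a real application the decode and encode steps would come from OpenCV with FFmpeg or from the Media SDK, preprocessing and drawing from OpenCV, and `fake_infer` would be replaced by a call into the inference engine with the pedestrian model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

# A decoded grayscale frame: rows of pixel values (stand-in for a real image array).
Frame = List[List[int]]
# A bounding box as (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    label: str
    box: Box


def preprocess(frame: Frame) -> Frame:
    """Stand-in for the OpenCV step: halve the size by keeping every other pixel."""
    return [row[::2] for row in frame[::2]]


def fake_infer(frame: Frame) -> List[Detection]:
    """Hypothetical detector: one 'person' box enclosing all non-zero pixels."""
    coords = [(x, y) for y, row in enumerate(frame) for x, v in enumerate(row) if v > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    box = (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
    return [Detection("person", box)]


def draw(frame: Frame, detections: List[Detection]) -> Frame:
    """Return a copy of the frame with each box outline set to 255."""
    out = [row[:] for row in frame]
    for det in detections:
        x, y, w, h = det.box
        for cx in range(x, x + w):
            out[y][cx] = 255          # top edge
            out[y + h - 1][cx] = 255  # bottom edge
        for cy in range(y, y + h):
            out[cy][x] = 255          # left edge
            out[cy][x + w - 1] = 255  # right edge
    return out


def run_pipeline(
    frames: Iterable[Frame],
    infer: Callable[[Frame], List[Detection]],
) -> Iterator[Tuple[Frame, List[Detection]]]:
    """Apply preprocess -> infer -> draw to each decoded frame, one at a time."""
    for frame in frames:  # 'frames' plays the role of the video decoder output
        small = preprocess(frame)
        detections = infer(small)
        # The caller consuming these results plays the role of the encoder.
        yield draw(small, detections), detections
```

Passing the inference step in as a function keeps the loop unchanged when the stub is swapped for a real model, and writing the loop as a generator means only one frame is held in memory at a time, which matches the frame-by-frame processing described above.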
OpenVINO Toolkit and Two Hardware Development Kits | IoT Developer Show Season 2 | Intel Software (posted 2019/04/26)