[MUSIC PLAYING]

JACQUES PIENAAR: Good afternoon, everybody. I am Jacques, and I'll be filling in for Tatiana today, presenting on MLIR: accelerating TensorFlow with compilers.

Now, I don't think I need to tell anybody in this room that machine learning is everywhere. There's a wide range of deployments happening in the industry today-- inference and training happening in the cloud and on the edge. We also have models getting larger and larger, and the computational requirements for training these models are ever increasing. We see near exponential growth in the complexity, the size, and the computational requirements for training these models.

Now, if you combine the growth in deployment strategies with the growth in models, velocity is a must. We need a faster, more scalable way to build infrastructure to keep up with these bigger, more complex models and deployment scenarios. So we need to build these ML systems faster. We want to unify efforts for extensibility and reusability, while allowing customization as needed. We want to standardize the representation of some basic concepts, such as operations and types: what defines an operation? How do you define an operation or a type? We want to create a common framework of reusable passes that you can combine to create your own solutions. And we want to make it fully customizable and extensible. The deployment scenarios and models of five years ago differ greatly from what we have today, so we want a system that can scale and adapt to all future needs.

With that intent, we designed MLIR, which stands for multi-level intermediate representation. It's an intermediate representation and compiler framework for TensorFlow and beyond, part of the LLVM project.

So what is MLIR, and why do we believe it's a compiler infrastructure for machine learning? Well, for one, MLIR is state-of-the-art compiler technology. It's not just a serialization format, and there's nothing like it. MLIR is modular and extensible: you can build different solutions using the MLIR building blocks that suit your solution. Importantly, MLIR is not opinionated. MLIR does not try to force you into a box; it allows you to create a solution for your problem space. MLIR is also fully customizable. Different deployment scenarios need different ways of integrating the components, and with MLIR, we want to make it easy for all of these different deployment scenarios to work. Finally, MLIR is part of the LLVM project. It's under LLVM governance and effectively on the desk of many compiler developers all around the world already.

And the industry agrees. MLIR is strongly supported by our partners. Our partners include the largest hardware vendors in the world, covering 95% of the data center hardware, four billion mobile phones, and countless IoT devices. Importantly, MLIR is an open community of academia and industry, all working together to solve this problem of compiling machine learning models.

So MLIR-- what if we want to use it for TensorFlow? Well, we want to use it to build a better TensorFlow: a better user experience, as well as better pluggable hardware support. Now, if you're a user, we want to make it easier for you to debug your model. We want to make optimizations transparent so you can see what's going on.
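[As a rough illustration of that transparency, here is a minimal sketch using the experimental tf.mlir.experimental.convert_graph_def API to dump the TensorFlow-dialect MLIR for a small traced function; the exact API surface and pass pipeline name are experimental and may differ across TensorFlow versions.]

```python
import tensorflow as tf


@tf.function
def add_and_scale(x, y):
    return 2.0 * (x + y)


# Trace the function to a concrete graph, then ask for its MLIR text.
concrete = add_and_scale.get_concrete_function(
    tf.TensorSpec([4], tf.float32), tf.TensorSpec([4], tf.float32))
graph_def = concrete.graph.as_graph_def()

# Experimental API: returns the module as textual MLIR in the TensorFlow
# dialect, so you can inspect what the compiler actually sees.
mlir_text = tf.mlir.experimental.convert_graph_def(
    graph_def, pass_pipeline='tf-standard-pipeline')
print(mlir_text)
```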
We want to make it so that if you get an error message in your optimized model, we can track it back to the original location in your code, and MLIR's location tracking enables this. And, of course, we want faster performance: going from writing a model to actually getting good performance on your hardware is essential.

And speaking of hardware, for our hardware partners, we know it's an awesome time. There are so many new accelerators and new generations of accelerators coming up, and we want to make it simpler and easier to integrate them with TensorFlow. Because while new accelerators are great, they're only really interesting when they're usable by our users. And, of course, for researchers, we want to provide the standard infrastructure for research: from representing different optimization passes to running them in an end-to-end workflow on production models, we want to make it easy for researchers to try new approaches, see their effects, and, if they work well, contribute them back.

So let's take a closer look at MLIR, progressive lowering, and the infrastructure around MLIR. Now, you've seen this before in the TensorFlow architecture, and if we zoom in a little bit, we can expand the different components. But let's focus on the parts where MLIR will be used. As I mentioned before, MLIR is the graph representation and optimization format for these TensorFlow models, but it is also used heavily in compilation. From optimization and conversion passes between different frameworks, to compilation of modules, to writing or generating AOT kernels, or exploiting handwritten kernels, MLIR will be involved in all of these different parts.

So as the previous slide showed, we can and will be using MLIR for many tasks in TensorFlow: graph optimizations, operation rewrites and lowerings, graph transformations, creating frameworks and components, and code generation. You can think of MLIR as a common graph representation and legalization framework, a common set of optimization and conversion passes, as well as a full code generation pipeline. But importantly, as I mentioned, MLIR is modular, so you can tailor it for your use case. You can use what you need to solve your problems. For example, you can configure MLIR for graph rewriting-- and that's how we use it for the new TensorFlow to TensorFlow Lite converter-- using just the parts we actually need to get the final product that we want.

So let's talk a little bit about progressive lowering. The ML in MLIR stands for multi-level. MLIR enables you to represent multiple different levels of operations, all in the same IR. From a TensorFlow operation to XLA HLO to LLVM IR, all of these can be represented in MLIR. You can lower progressively from one form to another, and all of these levels can coexist. So for example, you can have a function that contains TensorFlow ops, HLO ops, and LLVM IR at the same time. This ability to mix and match different levels of abstraction and dialects gives great power in modeling problems to suit what your hardware specialization needs.

But what about XLA? Well, we're using what we learned from XLA to build MLIR. XLA is a great acceleration tool for models with stable tensor shapes. For example, the tf.function API in TensorFlow 2.2 enables great performance improvements by exploiting XLA, and we've made sure that they work really well together.
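[A minimal sketch of what that looks like in user code, assuming TensorFlow 2.2, where XLA compilation of a tf.function is requested with the experimental_compile flag; later releases renamed it jit_compile.]

```python
import tensorflow as tf


# Ask TensorFlow to compile this function with XLA. In TF 2.2 the flag is
# experimental_compile=True; in later releases it became jit_compile=True.
@tf.function(experimental_compile=True)
def dense_layer(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)


x = tf.random.normal([8, 128])
w = tf.random.normal([128, 64])
b = tf.zeros([64])
print(dense_layer(x, w, b).shape)  # (8, 64), computed by the XLA-compiled function
```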
And we are working on ensuring that there's full interoperability between MLIR and XLA. Speaking of full interoperability, we are working very hard to make MLIR and all existing TensorFlow components interact very well. So whether you want to import or export a graph, an XLA HLO proto, or a TF Lite FlatBuffer, all of these are possible, and you can mix and match your workflows with XLA.

Importantly, MLIR allows for open integration at any level of the stack. You can start with a TensorFlow graph, import it into MLIR, lower it to HLO, optimize the HLOs, or go further and lower it to LLVM IR and then to code generation. MLIR allows you to hook into any part of this compilation flow, and in particular, MLIR does not require that you use only one of these paths, so if your problem needs a combination of them, that's possible. This makes it very easy to incrementally enable MLIR in conjunction with your existing tools.

Now, let's look at MLIR in action. We'll take a look at the new TF Lite converter, as well as the new features provided by MLIR there. The new TF to TF Lite converter launched just in February this year-- we're very excited about it. It starts from a TensorFlow graph or model, imports it into MLIR, does all the optimizations and legalizations, and finally exports a TF Lite FlatBuffer for the TensorFlow Lite runtime to execute. All of this comes with better error messages, so you can find out what went wrong during conversion and get more actionable feedback. With support for TensorFlow control flow, you can finally deploy some of these models with control flow on the edge, and there's also a new unified quantization workflow.

Now, looking ahead beyond the converter, you'll see MLIR in action in a lot of different places in TensorFlow. In particular, I mentioned MLIR being the graph representation and optimization framework in TensorFlow, so we'll be unifying the different graph optimization infrastructure that we have, as well as all the different converters, using MLIR. Another part that's very important for us is partner integration and support for new hardware. As I mentioned, new hardware is coming up every day, and we want to make it very easy for folks to integrate with TensorFlow. So especially if you're a partner, please reach out to the TensorFlow team if you want to get involved in this discussion. And for code generation, we're enhancing MLIR: we're looking at more advanced code generation, particularly code generation with dynamic shapes. MLIR is also integrating very tightly for optimization and code gen with the new TensorFlow runtime.

So there are many different ways of getting involved. Like I mentioned, MLIR is an open community. We have open design meetings where everybody can sign in and ask questions, with talks from our team and from other teams. We have the TensorFlow MLIR special interest group. And of course we have code on GitHub, in the LLVM repo as well as the TensorFlow repo. So feel free to send some PRs, fix some bugs, add new features, and get involved.

And with that, thank you.

[MUSIC PLAYING]