Swift for TensorFlow (TensorFlow @ O'Reilly AI Conference, San Francisco '18)

RICHARD WEI: Good afternoon, everyone. My name's Richard Wei, and I'm from the Google Brain team. Today, let's talk about Swift.

Traditionally, there are two ways of building models in Python. One way is graph building: the user is required to explicitly build a graph and pass it to a session to run at runtime. The other way is eager execution, which lets you write everything in source code that is interpreted line by line, and it's really easy to use. But there's a tension between the two. Graph execution has great performance, but it's not very easy to use, especially when it comes to control flow. Eager execution is super easy to use, but it can be challenging to reach graph-level performance.

So, can we combine usability and performance? What we are building is something that combines ease of use with high performance for your machine learning code. But to do that, we had to do more than write a new library: we had to innovate in the programming language. Machine learning is so important to us that we're willing to improve a programming language and an entire software stack for it. Since we need first-class machine learning capabilities and primitives, not only do we need ease of use and high performance, we also need an open design process driven by an established community for the language.

This brings us to the Swift programming language. Swift is a fast, modern programming language that is super easy to use. It's cross-platform, has a clean syntax like Python, and it has powerful capabilities like type inference and optionals, and it supports functional programming and all the great features.

So what exactly is Swift for TensorFlow? The key component of Swift for TensorFlow is a magical tensor type. You can write anything using tensors. As you would expect, you can initialize a tensor just like in Python, or you can use the full power of generics to build your models, libraries, and applications. Math operators like sigmoid and matmul are type-safe generic functions in Swift, with no namespacing needed when you use them. When you combine things together, it's nice and clean; the programming style looks just like Python's eager execution.

As you can see, tensor just works, and it's part of the TensorFlow library, written in Swift. However, it works very differently from other language bindings, because tensor is supported as a first-class citizen in the programming language. For this, we built a technology called graph program extraction into the Swift compiler. In addition to the ability to treat tensors as first-class citizens, we also need the ability to train models using automatic differentiation. And with graph program extraction, automatic differentiation, and Python interoperability, you can do interesting things, even with Python. All of these components are integrated natively into the Swift programming language, providing great productivity for developers.

So let's start with the TensorFlow library. The TensorFlow library has all the common APIs, like the Tensor API and the data API. With the data API, you can use the dataset initializer and functional zip and batch operations, just like tf.data in Python. And in addition to the nice-looking TensorFlow APIs defined in Swift, you also have access to all the raw TensorFlow operations. (Both APIs are sketched below.)
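A minimal sketch of the tensor style described above, assuming an early Swift for TensorFlow toolchain; initializer spellings varied across releases, so take them as approximate:

```swift
import TensorFlow

// Initialize tensors from literals or shapes, much as in Python.
let x = Tensor<Float>([[1, 2], [3, 4]])
let w = Tensor<Float>(ones: [2, 2])

// matmul and sigmoid are type-safe generic functions --
// no namespace prefix is needed to call them.
let prediction = sigmoid(matmul(x, w))
print(prediction)
```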
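And a hedged sketch of the dataset API: the `Dataset` type and its zip and batch operations existed in the toolchain, but their exact signatures evolved, so the details here are assumptions:

```swift
import TensorFlow

// Build datasets from tensors, then combine them functionally,
// in the spirit of tf.data in Python. Elements are slices along
// the first dimension of each tensor.
let images = Dataset(elements: Tensor<Float>(ones: [8, 28, 28]))
let labels = Dataset(elements: Tensor<Int32>(zeros: [8]))

// zip pairs images with labels; batched groups them for training.
for batch in zip(images, labels).batched(4) {
    print(batch.first.shape, batch.second.shape)
}
```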
Tensor code sits alongside the rest of the Swift program, just like eager execution, and it interacts with arbitrary user-defined data structures and algorithms. But will it actually execute line by line, one statement at a time, just like eager execution? The answer is no. The TensorFlow library works quite differently from other language bindings. Swift for TensorFlow is a full-fledged compiler, and it makes your code run fast without sacrificing ease of use. The technology is called graph program extraction.

So how does it work? We want users to write code as if it were eager execution, but we also want graph-level performance. We want to support native language control flow, so that control flow like ifs and whiles can be compiled directly to a TensorFlow graph. And we want the compiler to handle data transfer between devices, and between the device and the host, so that the burden is on the compiler, not the developer. Developers don't even have to think about it.

Here's how it works. Dataset operations such as zip and batch work in a graph, and tensor operations such as matmul can also be represented in a graph. When the Swift compiler sees your code, it automatically identifies all the graph-extractable operations and compiles them into a single TensorFlow graph, which becomes part of the binary the compiler produces. And since it's compiling the graph, it produces error messages before your code even runs, catching shape errors and type errors.

With graph program extraction, we can combine performance and usability. But machine learning code also needs to differentiate functions in order to train models; that's one of the key pieces of any machine learning algorithm. A common way to do that is automatic differentiation. Although most automatic differentiation tools are implemented as a library, we wanted to take full advantage of being able to improve the programming language, so we built automatic differentiation into the compiler as well.

Automatic differentiation in Swift works on custom data structures. It works on tensors, as well as on lots of standard library functions and types. And as a user, you can define your own type to be differentiable: you can define arbitrary structs, enums, and other data types, make them represent a vector space, and they become differentiable.

The core of the system is the differential operator. It's actually a keyword in Swift, because we built it into the language. For example, say we have a function on 32-bit floating point values. To differentiate the function, we just apply the differential operator to it, and we get a gradient function we can call. It's all functional. The same thing applies to tensors: given a tensor inference function that computes a prediction, I can get the prediction by calling it, and I can also get its gradient, apply the gradient, and use code like this for training.

Swift also supports custom gradients, which is a commonly requested feature. For example, say we have a times operator and we want to give it a custom gradient. All we have to do is use the differentiable attribute to specify the reverse-mode adjoint for that function. (Sketches of native control flow, the differential operator, and a custom adjoint follow below.)
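First, native control flow. The loop below is ordinary Swift over tensor values, the kind of code graph program extraction is designed to compile directly into a TensorFlow graph. A minimal sketch:

```swift
import TensorFlow

// Plain Swift control flow over tensors; no graph-building API
// in sight. The compiler extracts the tensor work into a graph.
var state = Tensor<Float>(zeros: [2, 2])
var step = 0
while step < 10 {
    state += Tensor<Float>(ones: [2, 2])
    step += 1
}
print(state)  // every element is 10.0
```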
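Next, the differential operator. In the 2018-era toolchain it was spelled `#gradient`; later toolchains respelled it as a `gradient(of:)` function, so treat the exact syntax here as an assumption:

```swift
import TensorFlow

// A scalar function over 32-bit floats.
func square(_ x: Float) -> Float {
    return x * x
}

// Throw the differential operator at the function to get its
// gradient function, then call it. It's all functional.
let dSquare = #gradient(square)
print(dSquare(3))  // 6.0

// The same applies to tensors: a scalar-valued loss around an
// inference function, differentiated with respect to the weights.
// (wrt: .0 and a scalar-returning mean() are assumed spellings.)
func loss(_ w: Tensor<Float>, _ x: Tensor<Float>, _ y: Tensor<Float>) -> Float {
    let prediction = sigmoid(matmul(x, w))
    return (prediction - y).squared().mean()
}
let dLoss = #gradient(loss, wrt: .0)
```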
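Finally, a custom gradient, using the attribute syntax from the 2018 design documents; later toolchains respelled this as `@differentiable` plus `@derivative(of:)`, so the exact form and parameter labels are assumptions:

```swift
import TensorFlow

// A multiply function with a hand-written reverse-mode adjoint.
// The adjoint receives the original inputs, the primal result,
// and the incoming gradient ("seed"), and returns the gradients
// with respect to each input.
@differentiable(reverse, adjoint: adjointMultiply)
func myMultiply(_ a: Tensor<Float>, _ b: Tensor<Float>) -> Tensor<Float> {
    return a * b
}

func adjointMultiply(_ a: Tensor<Float>, _ b: Tensor<Float>,
                     primal: Tensor<Float>, seed: Tensor<Float>)
    -> (Tensor<Float>, Tensor<Float>) {
    // d(a*b)/da = b and d(a*b)/db = a, chain-ruled with the seed.
    return (seed * b, seed * a)
}
```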
So when automatic differentiation sees this code, it'll automatically call into the adjoint when it's differentiating the data flow. And because automatic differentiation is language-integrated, it shows errors at compile time when you're differentiating a non-differentiable function, with cursors pointing exactly at each non-differentiable operation in the call stack. It's really useful. In the future, we also plan to support warnings for numeric stability.

All of this is enabled by having automatic differentiation built into the compiler. And since advanced automatic differentiation is immensely useful for machine learning research, we have support for arbitrary control flow, including loops and recursion. Control flow also enables differentiating through tree data structures defined using algebraic data types in Swift. We'll also support APIs for manipulating the gradients of variables, compile-time warnings for numeric stability, and the ability to compute forward-mode Jacobian-vector products and full Jacobians.

With all these features providing a super easy-to-use programming interface in Swift, we also care a lot about the ecosystem. The machine learning ecosystem relies on a lot of Python libraries, and we don't want to lose the ability to call into those libraries from Swift. For that, we built Python interoperability into Swift. You can import a Python library with a Python import, just like in Python, and you can use numpy with Python syntax. Users can directly import their favorite libraries in Swift and use them as if they were writing Python; this gives great compatibility with the entire machine learning ecosystem. You can use your favorite libraries like numpy or pickle or gzip and load some images. It's that simple. (A sketch of the interop API follows at the end.)

Swift supports interpreted mode, scripting mode, and Jupyter notebooks, and you can write interactive code as if you were writing Python. We're releasing an Iris tutorial on our website, and you can follow the tutorial and try it out. If you want to participate in the development, you can download a development toolchain from our website. Everything is open source. On GitHub, at tensorflow/swift, you can find our technical documentation, white papers, and everything else, and we have an open design process as well.

So this is Swift for TensorFlow. Thank you, everyone. [APPLAUSE]
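A closing sketch of the Python interoperability demonstrated above, assuming the `Python.import` API from the Swift for TensorFlow toolchain; the file name is hypothetical:

```swift
import Python

// Import Python libraries directly from Swift.
let np = Python.import("numpy")
let gzip = Python.import("gzip")
let pickle = Python.import("pickle")

// Use numpy with Python-like syntax.
let x = np.array([[1.0, 2.0], [3.0, 4.0]])
print(np.matmul(x, x))

// Load pickled images through gzip, as if writing Python.
// ("images.pkl.gz" is a hypothetical file.)
let file = gzip.open("images.pkl.gz", "rb")
let images = pickle.load(file)
print(images)
```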