[MUSIC PLAYING]

RICHARD WEI: Good morning, and welcome. My name's Richard.

JAMES BRADBURY: I'm James.

RICHARD WEI: We're thrilled to talk to you about a next-generation machine learning platform, Swift for TensorFlow. TensorFlow is a world-leading ecosystem for machine learning, and we're always looking to make it better for all kinds of users. TensorFlow 2.0, available in alpha, makes it easier than ever to get started with ML. But we're also looking further beyond. In this session, I'm going to tell you about a project to rethink what a machine learning framework could look like. You might hear Swift for TensorFlow and think, this is going to be about iOS apps. But Swift is also a general purpose language that we think will help enable the future of ML.

When figuring out what a next-generation machine learning platform should look like, we have been guided by three fundamental goals. First, we would like to make sure that every developer can be a machine learning developer. Second, we'd like to make sure that every application can benefit from the power of ML. Third, we've seen that the frontiers of ML research are pushing the boundaries of a software stack split between Python and C++. So we want a unified platform in a friendly and fast language.

This is why we decided to develop something a little outside the box. So we've taken the Swift programming language, a general purpose and cross-platform language that compiles to native code, and built general machine learning functionality right into its core, and combined its power with the TensorFlow ecosystem. And this is Swift for TensorFlow.

Swift is a language designed to be easy to learn and use. Let's take a look. Swift's syntax is familiar and free of boilerplate. Swift has types, so it catches errors early for you. But it uses type inference, so it feels clean and simple. Swift encourages fluent API design.
So your code is fluent and readable at both the definition site and the call site. Swift is open source and has an open process that allows us to propose changes and new language features for all Swift developers. Swift supports prototyping platforms such as a command line interpreter, Jupyter Notebooks, and Playgrounds. Just import your favorite libraries and get started right away. The TensorFlow library includes our carefully curated Tensor APIs, layer APIs, as well as general abstractions that help you build your own.

Now let's look at how to build a deep learning model in Swift. This is an image classification model. It is built out of several layers, forming a simple convolutional neural network. The call function applies every layer sequentially, producing a final output. Once we have defined the model, it's time to train it. We start by initializing the model and the stochastic gradient descent optimizer. We'll be using random data here, x and y. Now we write a training loop, apply the model, compute the loss from labels using the softmaxCrossEntropy() function, and compute the gradient of the loss with respect to the model's parameters. Finally, we can use the optimizer to update all the differentiable variables and all parameters in the model. Our layer and optimizer APIs are inspired by libraries like Keras, and are designed to be familiar.

A huge part of Swift for TensorFlow is workflow. Lots and lots of research breakthroughs in machine learning come from prototyping ideas. Google Colab is a hosted Jupyter Notebook environment that allows you to prototype your ideas right in your browser. A lot of researchers and developers use Python with TensorFlow in Colab today. And now we've made Colab work seamlessly with Swift.

Now let's show you Swift in action. This is a Colab notebook. As we start typing, we get immediate code completion. Here we use a functional map to add 1 to each element in an array. And it prints just like you expect.
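As a minimal sketch of the map step just described, using only the Swift standard library (the array contents are an assumption; the talk does not say which numbers were shown on screen):

```swift
// Hypothetical input array; the values shown in the demo are not given in the talk.
let numbers = [1, 2, 3, 4]

// map applies the closure to every element and returns a new array.
let incremented = numbers.map { $0 + 1 }

print(incremented) // prints "[2, 3, 4, 5]"
```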
Now let's look at a basic workflow in more detail. Here's a dictionary of arrays of numbers. And I want to find the largest even number in each array. To do that, I'm using a for loop that loops through all elements where the element is both greater than the last largest one found and is even. Let's run that and see what happens. Oops, what's wrong? Well, it looks like Int doesn't have a property called isEven. No problem, let's add one. Even though the Int type is defined in the standard library, we can still add methods and properties to it. In Swift, this is called an extension. Here we define a Boolean computed property called isEven, and implement it in terms of the isMultiple(of:) method. As you can see, Swift's syntax and naming conventions look more like English than other programming languages. And now everything works.

By importing TensorFlow, we get the power of TensorFlow operators as functions and methods on a Tensor type. Many of these methods are the same as they would be in NumPy and Python TensorFlow. Other APIs make use of Swifty features, like generics and argument labels. TensorFlow also provides deep learning building blocks, such as layers and optimizers. Here's the same simple [INAUDIBLE] that we just showed you. We're on GPU, and it's running really fast. Running the cells trains the model. It is that simple.

So that was Swift for TensorFlow in Colab. Writing Swift for machine learning is not all that different from writing dynamic languages. With Colab, you can start Swift right away in your browser, and it gives you code completion and free GPUs, and you can even install a package, using Swift Package Manager, right in your notebook. You don't have to install anything. It's really handy.

Next up, interoperability. Swift interoperates transparently with C. So you can literally just import a C header and use it. But how about all the libraries, everybody's favorite machine learning and data science libraries in Python?
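As an aside, the isEven workflow described earlier in this segment can be sketched in plain Swift roughly as follows. The dictionary contents and the names `numbers` and `largestEvens` are assumptions made for illustration; the talk does not list the actual data.

```swift
// Add a computed property to the standard library's Int type via an extension.
extension Int {
    /// True when the integer is divisible by 2.
    var isEven: Bool {
        return isMultiple(of: 2)
    }
}

// Hypothetical sample data; the arrays used in the demo are not given in the talk.
let numbers: [String: [Int]] = [
    "primes": [2, 3, 5, 7, 11, 13],
    "triangular": [1, 3, 6, 10, 15, 21, 28],
]

// For each array, keep the largest even element seen so far.
var largestEvens: [String: Int] = [:]
for (name, values) in numbers {
    var largest: Int? = nil
    for value in values where value.isEven && value > (largest ?? Int.min) {
        largest = value
    }
    // Assigning nil leaves the key absent when an array has no even element.
    largestEvens[name] = largest
}

print(largestEvens)
```

The `where` clause on the inner loop mirrors the condition described in the talk: the element must be even and greater than the last largest one found.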
Do we have to rewrite everything in Swift? The answer is, we don't. In Swift for TensorFlow, we have extended Swift's seamless language interoperability to Python. As you start using Swift for TensorFlow, we want to make sure that you don't miss all the utilities and libraries that you're used to.

Now let's show you how interoperability works in Swift. Python interoperability is enabled by a lightweight Swift library called Python. Let's start by importing that. Python.import lets you import any Python package installed with pip. Hundreds of popular packages are pre-installed in Colab. Here we're going to use NumPy and Matplotlib. We also enable their in-notebook visualization functionality. Now we can call some common NumPy and plotting functions directly from Swift exactly as if we're writing Python, and it just works.

Now, OpenAI Gym is a Python library of reinforcement learning environments, little games for AIs. Let's import that and solve one of the games. One of these games is a cart-pole environment where you train a neural net to balance a pendulum. We'll use a simple two-layer neural network. Each episode of the game is a loop. We get observations from the Python environment and call our Swift neural net to choose an action. After running the game a few times, we use the games that went well as training data and update our neural net's parameters so that it will pick better actions next time. Repeating this process, we get better and better, with higher and higher rewards. Finally, the problem's solved. We can also plot the mean rewards. As you see, you can mix and match idiomatic Swift code, Python code, and a neural net defined inline in Swift. It is so seamless.

In fact, Python interoperability isn't hard-coded in the Swift compiler. It's actually written as a library in pure Swift. And you can do the same to make Swift interoperate with Ruby, JavaScript, and other dynamic languages.
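To illustrate how a pure-Swift library can offer dynamic-language-style syntax, here is a minimal sketch built on two standard Swift attributes, `@dynamicMemberLookup` and `@dynamicCallable`. The `DynamicObject` type is hypothetical and is not the Python library's actual implementation; it only shows the kind of language hooks such a bridge can rely on.

```swift
// Hypothetical stand-in for an object that lives in a dynamic language runtime.
@dynamicMemberLookup
@dynamicCallable
struct DynamicObject {
    // Backing storage for members that are looked up by name at run time.
    var members: [String: Int] = [:]

    // The compiler rewrites `object.name` to `object[dynamicMember: "name"]`.
    subscript(dynamicMember name: String) -> Int {
        get { return members[name] ?? 0 }
        set { members[name] = newValue }
    }

    // The compiler rewrites `object(1, 2, 3)` to
    // `object.dynamicallyCall(withArguments: [1, 2, 3])`.
    func dynamicallyCall(withArguments arguments: [Int]) -> Int {
        return arguments.reduce(0, +)
    }
}

var object = DynamicObject()
object.width = 3
object.height = 4
let area = object.width * object.height
let total = object(1, 2, 3)

print(area, total) // prints "12 6"
```

A real bridge would forward these lookups and calls to the other language's runtime instead of a dictionary, but the member and call syntax on the Swift side would look the same.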
Now, in cutting-edge research and applications, people often need to dive into low-level implementations for better performance. Swift's C interoperability makes that surprisingly easy. Importing Glibc gives you access to the C standard library, from qsort() all the way down to malloc() and free(). Here we allocate some memory using malloc(), store a string, and print it. Now it's probably fair to say that Swift in Colab gives you a better C prototyping experience than the C language itself.

Now I hope you'll agree that one of the best parts of Swift for TensorFlow is how easy it is to work with code that is not in Swift. And that is interoperability. Having a few Python libraries in hand is incredibly useful for day-to-day tasks in data science. Swift for TensorFlow embraces interoperability and lets you use C and Python without wrappers. So seamless.

Now it's time to talk about the dark magic in Swift for TensorFlow, differentiation. It is fair to say that calculus is an integral part of deep learning. This is because machine learning models are really just mathematical functions expressed in code. But the special thing about them is that we don't just want to evaluate those functions, we also want to train them by changing their parameters based on their derivatives. A common technique adopted by deep learning frameworks is called automatic differentiation. Unlike traditional programming, doing calculus in code requires the language or the library to understand the structure of your code. But who understands your code better than the compiler? We think that computing derivatives from code is not just a simple technique, but also a new programming paradigm in the age of machine learning. So we've integrated first-class automatic differentiation, differentiable programming, right into Swift. First-class differentiable programming means that native functions like this can be differentiated. Here's a function of floats.
You can use a gradient operator to get its gradient function. And just call it. It's that simple. And you can import TensorFlow, of course, to compute gradients over tensors in the exact same way. In fact, we've been using this feature in the model training examples that you just saw.

Now instead of telling you how cool it is, let's show you. Here's a simple function that works with the native Double type in Swift. It squares x and adds y to it. If we mark this function with @differentiable, the compiler checks to see if it can differentiate everything that this function calls. Now f is a differentiable function. One of the things we can do with a differentiable function is to get its gradient. A gradient represents derivatives of the output with respect to each of its inputs. Here are the derivatives with respect to x and y, produced by the gradient operator.

Well, sometimes it's more efficient to compute the output and the gradient together. Here's a function that calls the squaring function f. We can use a value-with-gradient operator to do this. All our differential operators, like value-with-gradient, also work with arbitrary closures, not just named functions. Here we use Swift's trailing closure syntax to simplify it. The closure represents the same g function passed directly to the differential operator.

Now so far, we've seen differentiation on types from the standard library and, in the model training examples earlier, differentiation on tensors, layers, and models. But in applications, we often need to define custom types, like a Point, which has an x and a y component. Here we also define a custom function, dot(), that takes the dot product of two points representing vectors. Now what if I call the differential operator on the dot product function? Boom, Swift reports an error at compile time saying Point is not differentiable because it does not conform to the protocol "Differentiable." Don't worry.
This is because Swift supports generics and generic constraints. The differential operator takes functions whose parameter types and return type conform to the Differentiable protocol. Now all we have to do is to make the Point struct conform to the Differentiable protocol. And now everything works. Because the gradient operator here is taken with respect to two arguments, a and b, it returns a tuple of gradients. We can simplify the code by using Swift's pattern matching capability to retrieve the tuple elements in one line.

Swift's compile-time automatic differentiation can also detect more subtle errors. Here we define a function called roundedSum() on the Point that rounds the x and y components by converting them, first, to integers, then converting the sum back to a Double. Let's see if this function can be differentiated. Well, it looks like we shouldn't have done that. Functions over integers are not differentiable, because the functions are not continuous. And the compiler knows exactly why and tells you.

We showed you earlier that Swift has seamless interoperability with C. But what happens if I use a differential operator on a C function? Well, let's try that. Let's try to differentiate square root. Well, it looks like it can't be differentiated. This is because the function is not defined in Swift, and the C compiler cannot compute derivatives. But don't worry. Let's write a Swift function that calls the C version and write a custom derivative for it. Now, in elementary calculus, the derivative of square root is 1 over 2 root x. We can define such a derivative as a Swift function that returns both the original output and a closure that applies the chain rule of differentiation to compute the derivative. When we mark this function as differentiating square root, we're telling the compiler, whenever you differentiate square root, treat this function as its derivative.
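The shape of such a derivative function can be sketched in plain Swift without any differentiation support from the compiler. The name `sqrtWithPullback` and the returned tuple labels are assumptions for illustration. The attribute that registers a function as a derivative is omitted here because its spelling depends on the toolchain version.

```swift
import Foundation

// Hypothetical helper: returns sqrt(x) together with a closure (a "pullback").
// The pullback applies the chain rule: it scales an incoming derivative v
// by d/dx sqrt(x) = 1 / (2 * sqrt(x)).
func sqrtWithPullback(_ x: Double) -> (value: Double, pullback: (Double) -> Double) {
    let value = sqrt(x) // calls the C library's sqrt
    return (value, { v in v / (2 * value) })
}

// Tuple pattern matching retrieves both results in one line.
let (root, pullback) = sqrtWithPullback(9)

// Seeding the pullback with 1 yields the plain derivative at x = 9.
let derivative = pullback(1)

print(root, derivative)
```

Reusing the already computed `value` inside the closure is one reason to return the output and the derivative closure together: the square root is evaluated only once.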
Now we can differentiate that function. This is also exactly how we made all the functions and methods in the TensorFlow library differentiable. It is that simple. Isn't that amazing?

Having automatic differentiation built directly into a general purpose language is unprecedented. It's a super-exciting time to bring your intelligent applications to the next level. Differentiable programming is a new era of programming. Language integration gives you useful compile-time errors and the flexibility to make user-defined types differentiable. There's also a tremendous amount of depth here. If you're interested in learning how it works, we have some detailed design documentation online.

Next up, performance. Swift is fast. Swift has good low-level performance thanks to its LLVM-powered compiler. And it lets programmers use threads for multi-core concurrency. For deep learning code, we start with eager execution, so writing TensorFlow code feels great. But the compiler is also doing a lot of things behind the scenes. It's optimizing the tensor computation, all without changing the programming model.

One trend we're seeing is that deep learning models are getting integrated into bigger and bigger applications. The problem is, doing this often requires you to export models and treat them as black boxes in your code. But we think that Swift for TensorFlow can make it better, because your model is expressed just as code, and it sits alongside the rest of the application. This enables all sorts of whole-program optimizations that a compiler is really good at. It also makes debugging a lot easier.

What about performance in a research context? Some of the most cutting-edge projects like DeepMind's AlphaGo and AlphaZero worked by bringing these three things together. They use deep neural nets integrated into classical AI search algorithms.
While it's possible for an advanced team like DeepMind to build all of this, we think that an AI-first language can make it easier. And lowering the barriers in the infrastructure between deep learning and traditional programming can lead to new breakthroughs in science.

The MiniGo project is a ground-up, open source replication of the central advances of AlphaGo and AlphaZero using GCP and Cloud TPUs. In order to achieve the best performance from the Monte Carlo tree search algorithm that evaluates possible moves, the author also had to rewrite everything in a complex mixture of C++ and Python. But Swift for TensorFlow gives you a better option, because MiniGo can be written in one language without sacrificing performance. Let's see what that looks like in Swift.

The Swift models repository on GitHub contains machine learning models that we built using Swift for TensorFlow, such as common [INAUDIBLE], transformer, and of course, MiniGo. On Linux or macOS, we can build the models repository using Swift Package Manager via swift build. Or we can just open the IDE to develop machine learning models like traditional applications. The MiniGo game has three components: one, the game board; two, the Monte Carlo tree search algorithm; three, the convolutional model to use with tree search. They're all defined in Swift code in this workspace.

Recently, we've started doing some refactoring on MiniGo. Now let's run some unit tests and see if we have broken anything. Whoa, what's wrong? Well, if we go to the test, it looks like the shape of the value tensor was 2, 1, which is not what the test expected. Well, where did that come from? Let's just jump to the definition of the prediction method that returned it. Well, the prediction method simply calls self via the call method. So let's look at that. Now let's set a breakpoint and run the test again to see what's going on.
At the breakpoint, we can print out the shape of the value tensor using the lldb debugger, exactly like application development. Well, it looks like the tensor has shape 2, 1, but the test expected just 2. To fix that, we can simply flatten the tensor by calling the flatten method. Now let's drop the breakpoint and run everything again. Great, all tests are passing.

We could run the game locally, but it'll be pretty slow on a laptop CPU. How about running it on a free GPU in Colab? Let's do it. Let's open Chrome. And here's a Colab that's installing the MiniGo library that we've been working on using Swift Package Manager. Swift Package Manager pulls directly from GitHub and sets things up. Now it's linking and initializing Swift. Here we download some pre-trained weights using a shell command. And we also set up the game environment, like the Go board and the players. Now the game is starting to play. The stones are being placed on the Go board one step at a time. Now we're running a Go AI on GPU in Colab. Isn't that amazing? To skip ahead, here's the same game after it has been running for quite a while. Isn't that cool?

So that is MiniGo in Colab. Having deep learning models sit alongside complex algorithms in an application and debugging through all of them, the dream has finally come true. Now you've seen how a unified programming model makes it so easy to push the state of the art in machine learning research. But we want to go even further. For example, Google is making a major investment in fundamental compiler technologies and infrastructure that will allow you to write custom kernels for hardware acceleration right in Swift, and even in Colab someday. Now James will talk to you about that work and where Swift fits into the future of the TensorFlow ecosystem. James.

JAMES BRADBURY: Thanks, Richard. So this is a broad-brush look at the lifecycle of a TensorFlow model today. You build a model, typically using Python.
And then you can run it in many different ways on many different kinds of hardware. And Swift can plug right into this ecosystem. In fact, that's what we're doing. We want it to be possible to use a TensorFlow model written in Swift in all the places you can use a TensorFlow model built in Python. But the reality is there's a lot of complexity here, many different hardware back ends and compiler stacks. And thankfully, the team is working on a long-term project to unify and simplify the situation with a new compiler infrastructure called MLIR, or Multi-Level Intermediate Representation. Things like the TFLite model converter and compilers for specialized hardware, all of them will be able to share functionality, like source location tracking. And this will make life easier for everyone in the ecosystem, all the way from app developers who want a better experience like high-quality error messages when exporting to mobile, all the way to hardware vendors who want to implement TensorFlow support more easily. Check out the TensorFlow blog for more details on MLIR. Now the reason I'm bringing this up is that MLIR is a great opportunity to take advantage of the strengths of Swift as a compiled language. So while Python will be able to drive MLIR-based compilers at runtime, Swift will also drive them from its own compiler, bringing benefits like type safety and static binaries. In short, for users who need the flexibility, we can make the entire TensorFlow ecosystem directly accessible from Swift, all the way down to low-level device capabilities for writing custom kernels, all the while improving the experience for everyone who uses TensorFlow. So that's MLIR. Let's talk about the original vision again. We think Swift for TensorFlow can help every developer become a machine learning developer. 
Thanks to Swift's support for quality tooling like context-aware autocomplete, and the integration of differentiable programming deep into the language, Swift for TensorFlow can be one of the most productive ways to get started learning the fundamentals of machine learning. In fact, you don't have to take our word for it. Jeremy Howard's fast.ai recently announced that they're including Swift for TensorFlow in the next iteration of their popular deep learning course. We think Swift for TensorFlow can help make sure every app can benefit from the power of machine learning by bridging the gap between application code and machine learning code. And we think that Swift for TensorFlow will provide a great option for researchers who need a high-performance host language like Swift.

To break down the barriers for you to get started, Google Colab hosted notebooks give you free access to GPU resources right in your browser. And Jupyter Notebooks let you quickly prototype your ideas locally or on remote machines. Swift for TensorFlow also supports Linux and macOS natively. So you can use your favorite IDE or run the interpreter in a terminal.

RICHARD WEI: Swift for TensorFlow's 0.3 release is available today, which contains all the technologies that powered the demos that you just saw. We think it's ready for pioneers, especially if you're running into limits of Python as a host language or mixing deep learning with traditional programming.

JAMES BRADBURY: Swift for TensorFlow is not yet production-ready. But if you're excited about what you see, we have Colab tutorials you can run in your browser and binary toolchains you can download and use. We're actively designing and building high-level machine learning libraries and improving the programming experience.

RICHARD WEI: The whole project is developed in open source at github.com/tensorflow/swift. Try it out today, and we think you're going to love it.
JAMES BRADBURY: This is an incredible opportunity for developers like you to help shape the future of TensorFlow.

RICHARD WEI: Thank you for coming, and have a great I/O.

[MUSIC PLAYING]
Swift for TensorFlow (Google I/O'19), posted by 林宜悉 on 2020/04/04