[MUSIC PLAYING]

KRZYSZTOF OSTROWSKI: All right, TensorFlow Federated-- what's exciting about TFF is that it enables everyone to experiment with computations on decentralized data. And decentralized data is exciting, because it's everywhere. It's in intelligent home devices, in sensor networks, in distributed medical databases. And of course, there's a ton of it on personal devices like cell phones. And we love our cell phones. We want them to be intelligent. This data could help.

Traditionally, the way we implement intelligence is on the server. So here we have a model on the server. The clients all talk to the server to make predictions. So all the data accumulates on the server as well. So the model, the data, it's all in one place-- super easy. The downside of this is that all this back and forth communication can hurt user experience due to network latency, lack of connectivity, shortened battery life. And of course, there's a ton of data that would be really useful in implementing intelligence but that, for various reasons, you may choose not to collect.

So what can we do? Well, one idea is to take all the TensorFlow machinery and put it on-device. So here we have each client independently training its own model using only its own local data. No communication necessary-- great. Well, maybe not so great-- actually, we realize that, very often, there's just not enough data on each individual device to learn a good model. And unlike before, even though there might be millions of clients, you can't benefit from their combined data. We can mitigate this by pre-training the model on the server on some proxy data. But just think of a smart keyboard. If, today, everyone starts using a new word, then a smart model trained on yesterday's data won't be able to pick it up. So this technique has limitations.

OK, so now what? Do we just give up? Do we have to choose between more intelligence and more privacy? Or can we have both? Until a few years ago, we didn't know the answer. It turns out the answer is yes, we can. In fact, it's very simple. It goes like this-- you start with the model on the server. You distribute it to some of the clients. Now each client trains the model locally using its own local data-- and that doesn't have to mean training to convergence. It could be just training a little bit. Each client produces a new, locally trained model and sends it to the server. And in practice, we would send updates and not models, but that's an implementation detail.

All right, so now the server gets locally trained models from all the clients. And now is the crazy part. We just average them out-- so simple. So OK, the average model trivially reflects the training from every client, right? So it's good. But how do we know it's a good model, that this procedure is doing something meaningful? In fact, you would think it's too simple. There's just no way-- no way-- this can possibly work. And you would be correct. It's not enough to do it once. You have to earn it. So we repeat the process. The combined model becomes the initial model for the next round. And so it goes in rounds. In every round, the combined model gets a little bit better thanks to the data from all the clients. And now, hundreds or thousands-- many, many rounds later-- your smart keyboard begins to show signs of intelligence. So this is quite amazing. It's mind-boggling that something this incredibly simple can actually work in practice. And yet it does. And then it gets even more crazy.
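To make the round structure concrete, here is a rough, framework-free sketch of one round of this procedure in plain Python. It is illustrative only: `local_train`, the client objects, and the unweighted layer-wise average are placeholders and simplifications, not part of any real API.

```python
import random
import numpy as np

def federated_averaging_round(server_weights, clients, local_train, num_sampled=100):
    """One round of the procedure described above (a conceptual sketch only).

    server_weights: list of numpy arrays, the current global model.
    clients: available client devices, each holding its own local data.
    local_train: placeholder for "train the model a little on one client's data",
                 returning the locally trained weights.
    """
    # 1. Distribute the current model to a sample of clients.
    sampled = random.sample(clients, k=min(num_sampled, len(clients)))
    # 2. Each client trains locally on its own data (not necessarily to convergence).
    client_weights = [local_train(server_weights, client.local_data) for client in sampled]
    # 3. The server simply averages the locally trained models, layer by layer.
    return [np.mean(layers, axis=0) for layers in zip(*client_weights)]

# Repeated over many rounds, the combined model gradually improves:
#   for _ in range(num_rounds):
#       server_weights = federated_averaging_round(server_weights, clients, local_train)
```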
You can do things like compress the update from each client down to one bit, or add some random noise to it to implement differential privacy. Many extensions are possible. And it still works. And you can apply it to things other than learning. For example, you can use it to compute a statistic over sensitive data. So experimenting with all the different things you can do with federated learning is actually a lot of fun. And TFF is here basically just so that everyone can have fun doing it.

It is open source. It's inspired by our experiences with federated learning at Google, but now generalized to non-learning use cases as well. We're doing it in the open, in public. It's on GitHub. We just recently started. So now is actually a great time to jump in and contribute, because you can have influence on the way this goes from the early stages.

We want to create an ecosystem, so TFF is all about composability. If you're building a new extension, you should be able to combine it with all of the existing ones. If you're interfacing a new platform for deployment, you should be able to deploy all of the existing code to it. So we've made a number of design decisions to really promote composability. And speaking of deployments, in order to enable flexibility in this regard, TFF compiles all your code into an abstract representation, which, today, you can run in a simulator, but that, in the future, could potentially run on real devices-- no promises here. In the first release, we only provide a simulation runtime.

I mentioned that TFF was all about having fun experimenting. In our past work on federated learning-- and that's before TFF was born-- we discovered certain things that consistently get in the way of having fun. And the worst offender was really all the different types of logic getting interleaved. So it's model logic, communication, checkpointing, differential privacy. All this stuff gets mixed up, and it gets very confusing. So in order to avoid this, to preserve the joy of creation for you, we've designed programming abstractions that will allow you to write your federated learning code at a similar level as when you write pseudocode or draw on a whiteboard. You'll see an example of this later in the talk. And I hope that it will work for you.

OK, so what's in the box? You get two sets of interfaces. The upper layer allows you to create a system that can perform federated training or evaluation using your existing model. And this sits on top of a layer of lower-level, more modular abstractions that allow you to express and simulate custom types of computations. And this layered architecture is designed to enable a clean separation of concerns so that developers who specialize in different areas, whether that be federated learning, machine learning, compiler theory, or systems integration, can all independently contribute without stepping on each other's toes.

OK, federated learning-- we've talked about this as an idea. Now let's look at the code. We provide interfaces to represent federated data sets for simulations, and a couple of data sets for experiments. If you have a Keras model, you can wrap it like this with a one-liner for use with TFF-- very easy. And now we can use one of the build functions we provide to construct various kinds of federated computations. And those are essentially abstract representations of systems that can perform various federated tasks. And I'll explain what that means in a minute.
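For reference, a minimal sketch of what that wrapping and build step can look like with the tff.learning APIs. The tiny model, input_spec, and optimizer below are placeholders, and the exact helper names (from_keras_model, build_federated_averaging_process) reflect one version of the library and may differ in other releases.

```python
import collections
import tensorflow as tf
import tensorflow_federated as tff

def model_fn():
    # Any ordinary Keras model; this tiny classifier is just a placeholder.
    keras_model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation='softmax', input_shape=(784,)),
    ])
    # The "one-liner" wrap for use with TFF.
    return tff.learning.from_keras_model(
        keras_model,
        input_spec=collections.OrderedDict(
            x=tf.TensorSpec(shape=[None, 784], dtype=tf.float32),
            y=tf.TensorSpec(shape=[None, 1], dtype=tf.int32)),
        loss=tf.keras.losses.SparseCategoricalCrossentropy())

# One of the provided build functions: constructs the federated computations
# that make up a federated averaging system.
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02))
```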
Training, for instance, is represented as a pair of computations-- one that constructs the initial state of a federated training system, and another that executes a single round of federated averaging. And those are still kind of abstract. But you can also invoke them just like functions in Python. And when you do, they, by default, execute in a local simulation runtime. So this is actually how you can write little experiment loops. You can do things like pick a different set of clients in each round and so on. The state of the system includes the model being trained. So this is how you can very easily simulate federated evaluation of your model.

All of this sits on top of the FC API, which is basically a language for constructing distributed systems. It is embedded in Python, so you just write Python code as usual. It does introduce a couple of new abstract concepts that are worth explaining. So maybe let's take a deep dive and look.

All right, first concept-- imagine you have a group of clients again. Each of them has a temperature sensor that generates a reading, some floating-point number. I'm going to refer to the collective of all these sensor readings as a federated value, a single value. So you can think of a federated value as a multi-set. Now in TFF, values like this are first-class citizens, which means, among other things, that they have types. The types of these kinds of values consist of the identity of the group of devices that are hosting the value-- we call that the placement-- and the local type-- the type of the local data items that are hosted by each member of the group.

All right, now let's throw the server into the mix. There's a number on the server. We can also give it a federated type. In this case, I'm dropping the curly braces to indicate that there's actually just one number, not many. OK, now let's introduce a distributed aggregation protocol that runs among these system participants. So let's say it computes the number on the server based on all the numbers on the clients. Now in TFF, we can think of that as a function, even though the inputs and outputs of that function reside in different places-- the inputs on the clients and the output on the server. Indeed, we can give it a functional type signature that looks like this. So in TFF, you can simply think of distributed systems, or components of distributed systems, distributed protocols, as functions.

We also provide a library of what we call federated operators that represent, abstractly, very common types of building blocks-- like, in this case, computing an average among client values and putting the result on the server. Now with all this that I've just described, you can actually draw system diagrams in code, so to speak. It goes like this-- you declare the federated type that represents the inputs to your distributed system. Now you pass it as an argument to a special function decorator to indicate that, in the system you're constructing, this is going to be the input. Now in the body of the decorated function, you invoke all the different federated operators to essentially populate your data flow diagram, like this. It works conceptually in very much the same way as when you construct non-eager TensorFlow graphs.

OK, now let's look at something more complex and more exciting. So again, we have a group of clients. They have temperature sensors. Suppose you want to compute what fraction of your clients have temperatures exceeding some threshold.
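A sketch of the experiment loop described above, assuming the `iterative_process` built earlier and a list `train_data` of per-client `tf.data.Dataset`s (for example from one of the provided simulation data sets); `test_data`, the exact metric contents, and the `state.model` field are assumptions that can vary by TFF version.

```python
import random

# The pair of computations: `initialize` builds the initial server state,
# `next` executes a single round of federated averaging.
state = iterative_process.initialize()

for round_num in range(1, 11):
    # Pick a different subset of simulated clients in each round.
    sampled_clients = random.sample(train_data, k=min(10, len(train_data)))
    state, metrics = iterative_process.next(state, sampled_clients)
    print('round {:2d}: {}'.format(round_num, metrics))

# Federated evaluation of the current model, using the matching build function.
evaluation = tff.learning.build_federated_evaluation(model_fn)
eval_metrics = evaluation(state.model, test_data)
```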
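The next part of the talk walks through exactly this example step by step. As a reference, a sketch of the complete computation might look like the following; the federated operators (tff.federated_broadcast, tff.federated_map, tff.federated_mean) are the FC API building blocks being described, while small details such as the explicit federated_zip may differ by version.

```python
import tensorflow as tf
import tensorflow_federated as tff

# Federated types of the two inputs: readings live on the clients,
# the threshold lives on the server.
READINGS_AT_CLIENTS = tff.FederatedType(tf.float32, tff.CLIENTS)
THRESHOLD_AT_SERVER = tff.FederatedType(tf.float32, tff.SERVER)

@tff.tf_computation(tf.float32, tf.float32)
def is_over_threshold(reading, threshold):
    # The local "map" step: ordinary TensorFlow producing 1.0 or 0.0.
    return tf.cast(reading > threshold, tf.float32)

@tff.federated_computation(READINGS_AT_CLIENTS, THRESHOLD_AT_SERVER)
def fraction_over_threshold(readings, threshold):
    # 1. Broadcast the server-side threshold to all the clients.
    threshold_at_clients = tff.federated_broadcast(threshold)
    # 2. Map: each client locally computes its 1/0 indicator.
    indicators = tff.federated_map(
        is_over_threshold,
        tff.federated_zip((readings, threshold_at_clients)))
    # 3. Aggregate: average the indicators and place the result on the server.
    return tff.federated_mean(indicators)

# Invoking it like a Python function runs it in the local simulation runtime:
print(fraction_over_threshold([68.0, 70.5, 71.0, 69.5], 70.0))
```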
So in this system, in this computation I'm constructing, there are two inputs. One is the temperature readings on the clients. The other is the threshold on the server. And again, the inputs can be in different places, and that's OK.

All right, how do I execute this? First, we probably want to just broadcast the threshold to all the clients. So that's our first federated operator in action. Now that each client has both the threshold and its own local temperature reading, you can run a little bit of TensorFlow to compute 1 if it's over the threshold, 0 otherwise. OK, you can think of this as basically a map step in MapReduce. And the result of that is a federated float, yet another one. OK, now we have all these ones and zeros. The only thing that remains to do is to perform a distributed aggregation to compute the average of those ones and zeros and place the result on the server. OK, that's the third federated operator in our system. And that's it. That's a complete example.

Now let's look at how this example works in the code. Again, you declare the federated types of your inputs. You pass them as arguments to the function decorator. And now, in the body of the decorated function, you simply invoke all the federated operators you need in the proper sequence-- so the broadcast, the map, and the average are all there. And that piece of TensorFlow that was a parameter to the mapping operator is expressed using ordinary TensorFlow ops, just as normal. And this is the complete example. It's working code that you can copy-paste into a codelab and try out.

OK, so this example obviously has nothing to do with federated learning. However, in the tutorials on our website, you can find examples of fully implemented federated training and federated evaluation code that look basically just like this, modulo some variable renaming. So they also fit on one screen. So yeah, in TFF, you can express your federated learning logic very concisely, in a way that you can just look at it and understand what it does. And it's actually really easy to modify. And I personally-- I feel it's liberating to be able to express my ideas at this level without getting bogged down in all the unnecessary detail. And this empowers me to try and create and experiment with new things. And I hope that you will check it out, and try it, and that you'll feel the same.

And that's all I have. Everything you've seen is on GitHub. As I mentioned, there are many ways to contribute depending on what your interests are. Thank you very much.

[MUSIC PLAYING]