[MUSIC PLAYING]

JASON MAYES: Hello, everyone. My name is Jason Mayes. I'm a developer advocate within the Research and Machine Intelligence group here at Google. And today, I've got Thomas and Sean from the ML Fairness team at Google to talk to us about their new updates. But first of all, what does the ML Fairness team actually do?

THOMAS GREENSPAN: Our team is an infrastructure team within TensorFlow. And our mission is to help lower the barrier to entry for any developer looking to improve model fairness.

JASON MAYES: Awesome. Great stuff. And what do you have in store for us today?

THOMAS GREENSPAN: Today, I'm going to be presenting on Fairness Indicators and TensorFlow Constrained Optimization. Fairness Indicators is a suite of tools that helps developers regularly evaluate for fairness concerns. And constrained optimization is a technique where you can introduce a constraint into the model on a specific metric.

So today, what we're going to do is first train a simple model to predict whether a subject is smiling or not smiling in a given image. We're going to run Fairness Indicators to evaluate this model. We're going to retrain using constrained optimization to fix the fairness concerns that we may find. And then we'll rerun Fairness Indicators to see whether we've improved the model.

First, the dataset that we'll be using is called CelebA. It's a public dataset made available by the Chinese University of Hong Kong. It has about 200,000 images with about 40 attributes, such as smiling or not smiling, young or not young, that kind of binary classification.

First, we train a model. We use Keras, and we keep it very simple, because that's not what we're concentrating on. When we evaluate it, we see that we get 86% accuracy, which is actually pretty good. But we need to make sure that we're fair across certain slices. To do that, we specify how to set up Fairness Indicators. Specifically, we tell it that we're using various indicators with certain thresholds, and we specify a slice. In this case, it's going to be young or not young.

Next, we run it and we get our results. In this case, we're going to be looking at false positive rates specifically. As you can see, there's actually a really large gap between the young slice, where the false positive rate is around 4%, and the not-young slice, where it's about 11%.

So what do we do? We're going to apply constrained optimization to this model. To do this, we first define the context. We tell it that we're looking at smiling or not smiling, and we tell it which slice we're interested in, young or not young. Next, we ask the model to keep the false positive rate for each slice under 5%. And, finally, we define the optimizer that has this constraint defined.

Next, we train the model and we re-evaluate. As we can see, the overall false positive rate is about the same. But on the not-young slice, the false positive rate has decreased by about 5%, which is really big. It's worth noting that on the young slice, the false positive rate has increased a little bit, but that's a worthwhile trade-off.
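A minimal sketch of the kind of Fairness Indicators setup Thomas describes, using TensorFlow Model Analysis: metrics are computed at a few decision thresholds, both overall and sliced by the CelebA "Young" attribute. The thresholds and the model and data paths below are illustrative, not taken from the talk.

```python
from google.protobuf import text_format
import tensorflow_model_analysis as tfma
from tensorflow_model_analysis.addons.fairness.view import widget_view

# Fairness Indicators at a few decision thresholds, computed overall and
# sliced by the CelebA "Young" attribute.
eval_config = text_format.Parse("""
  model_specs { label_key: "Smiling" }
  metrics_specs {
    metrics { class_name: "ExampleCount" }
    metrics {
      class_name: "FairnessIndicators"
      config: '{"thresholds": [0.22, 0.5, 0.75]}'
    }
  }
  slicing_specs {}                         # overall results
  slicing_specs { feature_keys: "Young" }  # young vs. not-young slices
""", tfma.EvalConfig())

eval_result = tfma.run_model_analysis(
    eval_shared_model=tfma.default_eval_shared_model(
        eval_saved_model_path='path/to/saved_smiling_model',  # illustrative
        eval_config=eval_config),
    eval_config=eval_config,
    data_location='path/to/celeb_a_eval.tfrecord')            # illustrative

# In a notebook, this renders the interactive Fairness Indicators widget.
widget_view.render_fairness_indicator(eval_result)
```

The empty `slicing_specs` entry gives the overall metrics, and the `Young` entry adds the per-slice breakdown that exposes the gap in false positive rates.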
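And a rough sketch of the constrained optimization step, following the pattern documented for the TensorFlow Constrained Optimization library: define a rate context over the smiling predictions, subset it by the young and not-young slices, constrain each slice's false positive rate to stay under 5%, and hand the resulting problem to a constrained optimizer. The `model`, `features_tensor`, `labels_tensor`, and `groups_tensor` names are assumed placeholders for the Keras model and the current training batch.

```python
import tensorflow as tf
import tensorflow_constrained_optimization as tfco

# Rate context over the model's smiling predictions and true labels.
# The lambdas are re-evaluated on the current batch at each step.
context = tfco.rate_context(
    predictions=lambda: model(features_tensor),
    labels=lambda: labels_tensor)

# Slices of interest: young and not-young examples.
young = context.subset(lambda: groups_tensor > 0)
not_young = context.subset(lambda: groups_tensor <= 0)

# Objective: overall error rate. Constraints: the false positive rate of
# each slice must stay under 5%.
problem = tfco.RateMinimizationProblem(
    tfco.error_rate(context),
    [tfco.false_positive_rate(young) <= 0.05,
     tfco.false_positive_rate(not_young) <= 0.05])

# Optimizer that jointly updates the model and the constraint multipliers.
optimizer = tfco.ProxyLagrangianOptimizerV2(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    num_constraints=problem.num_constraints)

var_list = (model.trainable_weights
            + list(problem.trainable_variables)
            + optimizer.trainable_variables())

# One constrained training step; loop this over batches as usual.
optimizer.minimize(problem, var_list=var_list)
```

Each `optimizer.minimize` call takes one constrained training step, so it replaces the plain Keras `fit` loop when retraining the model.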
JASON MAYES: Awesome. Well, thank you for that. And what's up next?

THOMAS GREENSPAN: Yeah, no problem. Next, I'll hand it over to Sean.

SEAN O'KEEFE: In this next case study, we'll follow on from Thomas's previous case study to understand Fairness Indicators within a TFX pipeline. We'll be using many of the same tools that Thomas already spoke to us about, but with the addition of TensorFlow Extended, or TFX, which is Google's production-grade machine learning pipeline platform, and also ML Metadata, which will help us track and understand the underlying performance of our machine learning models.

Now, the data that we're using is known as the COMPAS dataset. This is two years of actual criminal record data from the Florida judicial system. It is important to note that using this data to predict whether anyone might commit another crime raises a lot of fairness concerns. I would encourage anybody who follows along with this to also take a look at the Partnership on AI and their article on the use of this dataset to understand and predict a person's likelihood of committing another crime.

Now, the higher-level overview of what we'll be doing within TFX is, first, ingesting our data with ExampleGen. This will help us transform and validate our data to make sure that we can correctly ingest it into TensorFlow. After that, we will transform it for the model that we need, and we will train it with a Keras estimator. And finally, we'll evaluate it to understand how the model performs. Once we see the fairness concerns within our model, we will track those changes within ML Metadata, then rerun the model to see how it changes, and, once again, track those further developments for anybody who might be interested in following along at a future date.

Now, the Keras model itself is listed here on the right. This is within the TFX Trainer component, and we'll be passing it back as an estimator. This is the initial run for the first model. And to modify the model, we will reweight it within the model compile function.

Now, once we run our model and understand the fairness issues that it produces, we will be able to see a pretty drastic difference between our data slices, between African-Americans and Caucasians, with regard to whether or not our model predicts another crime. In order to understand where this bias is coming from, we will use ML Metadata to understand the lift statistics within each one of our features, to identify how we could possibly bring this issue down, and then, finally, track this within each one of the runs in ML Metadata for any other developers to follow.

For further information on this case study and also Thomas's case study, please check out TensorFlow.org/tfx/fairness indicators.
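A minimal sketch of how Fairness Indicators can be attached to a TFX Evaluator component in this kind of pipeline, slicing the evaluation on a race feature. The label key, feature name, thresholds, and the `example_gen` and `trainer` handles are assumed placeholders rather than details from the talk.

```python
import tensorflow_model_analysis as tfma
from tfx.components import Evaluator

# Evaluation config: Fairness Indicators at a few thresholds, computed
# overall and per slice of the (assumed) "race" feature.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='is_recid')],  # assumed label key
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(class_name='ExampleCount'),
        tfma.MetricConfig(
            class_name='FairnessIndicators',
            config='{"thresholds": [0.25, 0.5, 0.75]}'),
    ])],
    slicing_specs=[
        tfma.SlicingSpec(),                       # overall metrics
        tfma.SlicingSpec(feature_keys=['race']),  # per-slice metrics
    ])

# Evaluator component wired to the upstream ExampleGen and Trainer outputs.
evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    eval_config=eval_config)
```

Because the Evaluator's results are recorded as pipeline artifacts, ML Metadata can then be used to compare these per-slice metrics across runs, as Sean describes.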
JASON MAYES: Awesome. Thank you very much for the talk.

SEAN O'KEEFE: Thank you.

[MUSIC PLAYING]