[MUSIC PLAYING]
CATHERINE XU: Hi.
I'm Kat from the TensorFlow team,
and I'm here to talk to you about responsible AI
with TensorFlow.
I'll focus on fairness for the first half of the presentation,
and then my colleague, Miguel, will end
with privacy considerations.
Today, I'm here to talk about three things.
The first, an overview of ML fairness.
Why is it important?
Why should we care?
And how does it affect an ML system exactly?
Next, we'll walk through what I'll
call throughout the presentation a fairness workflow.
Surprisingly, this isn't too different from what
you're already familiar with--
for example, a debugging or a model evaluation workflow.
We'll see how fairness considerations
can fit into each of the discrete steps.
Finally, we'll introduce tools in the TensorFlow ecosystem,
such as Fairness Indicators that can be
used in the fairness workflow.
Fairness Indicators is a suite of tools
that enables easy evaluation of commonly used fairness
metrics for classifiers.
Fairness Indicators also integrates
well with remediation libraries to mitigate the bias you find,
and it provides structure to support your deployment decision
with features such as model comparison.
We must acknowledge that humans are at the center of technology
design, in addition to being impacted by it,
and humans have not always made product design decisions
that are in line with the needs of everyone.
Here's one example.
Quick, Draw! was developed through the Google AI
experiments program where people drew little pictures of shoes
to train a model to recognize them.
Most people drew shoes that look like the one on the top right,
so as more people interacted with the game,
the model stopped being able to recognize shoes
like the shoe on the bottom.
This is a social issue first, which
is then amplified by fundamental properties of ML--
aggregation and using existing patterns to make decisions.
Minor repercussions in a faulty shoe classification product,
perhaps, but let's look at another example that can
have more serious consequences.
Perspective API was released in 2017
to protect voices in online conversations
by detecting and scoring toxic speech.
After its initial release, users experimenting
with the web interface found something interesting.
The user tested two clearly non-toxic sentences
that were essentially the same, but with the identity term
changed from straight to gay.
Only the sentence using gay was perceived by the system
as likely to be toxic, with a classification score of 0.86.
This behavior doesn't just constitute
a representational harm.
When used in practice, such as in a content moderation system,
it can lead to the systematic silencing of voices
from certain groups.
How did this happen?
For most of you using TensorFlow,
a typical machine learning workflow
will look something like this.
Human bias can enter into the system
at any point in the ML pipeline, from data collection
and handling to model training to deployment.
In both of the cases mentioned above,
bias primarily resulted from a lack of diverse training data--
in the first case, diverse shoe forms,
and in the second case, examples of comments containing gay
that were not toxic.
However, the causes and effects of bias are rarely isolated.
It is important to evaluate for bias at each step.
You define the problem the machine learning
system will solve.
You collect your data and prepare it,
oftentimes checking, analyzing, and validating it.
You build your model and train it on the data
you just prepared.
And if you're applying ML to a real world use case,
you'll deploy it.
And finally, you'll iterate and improve
your model, as we'll see throughout the next few slides.
The first question is, how can we do this?
The answer, as I mentioned before,
isn't that different from a general model quality workflow.
The next few slides will highlight the touch points
where fairness considerations are especially important.
Let's dive in.
How do you define success in your model?
Consider what your metrics and fairness-specific metrics
are actually measuring and how they relate to areas
of product risk and failure.
Similarly, the data sets you choose
to evaluate on should be carefully selected
and representative of the target population of your model
or product in order for the metrics to be meaningful.
Even if your model is performing well at this stage,
it's important to recognize that your work isn't done.
Good overall performance may obscure poor performance
on certain groups of data.
Going back to an earlier example,
accuracy of classification for all shoes was high,
but accuracy for women's shoes was unacceptably low.
To address this, we'll go one level deeper.
By slicing your data and evaluating performance
for each slice, you will be able to get a better
sense of whether your model is performing equitably
for a diverse set of user characteristics.
Based on your product use case and audience,
what groups are most at risk?
And how might these groups be represented in your data,
in terms of both identity attributes and proxy
attributes?
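As a rough illustration of what slicing looks like in practice, here is a minimal sketch that computes a fairness metric per slice with pandas; the column names and numbers are hypothetical, not from the talk.

```python
import pandas as pd

# Hypothetical evaluation frame: one row per example, with the thresholded
# prediction, the true label, and a group column to slice on.
eval_df = pd.DataFrame({
    "label":      [0, 0, 1, 1, 0, 1, 0, 0],
    "prediction": [1, 0, 1, 0, 1, 1, 0, 1],
    "group":      ["young", "young", "young", "not_young",
                   "not_young", "not_young", "young", "not_young"],
})

def false_positive_rate(df):
    negatives = df[df["label"] == 0]
    if len(negatives) == 0:
        return float("nan")
    return (negatives["prediction"] == 1).mean()

# The overall number can hide per-slice gaps, so also compute it slice by slice.
print("overall FPR:", false_positive_rate(eval_df))
print(eval_df.groupby("group").apply(false_positive_rate))
```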
Now you've evaluated your model.
Are there slices that are performing significantly worse
than overall or worse than other slices?
How do we get intuition as to why
these mistakes are happening?
As we discussed, there are many possible sources of bias
in a model, from the underlying training data to the model
and even in the evaluation mechanism itself.
Once the possible sources of bias have been identified,
data and model remediation methods
can be applied to mitigate the bias.
Finally, we will make a deployment decision.
How does this model compare to the current model?
This is a highly iterative process.
It's important to monitor changes
as they are pushed to a production setting
or to iterate on evaluating and remediating
models that aren't meeting the deployment threshold.
This may seem complicated, but there
is a suite of tools in the TensorFlow ecosystem
that makes it easier to regularly evaluate and remediate
for fairness concerns.
Fairness Indicators is a tool available via TFX, TensorBoard,
Colab, and standalone model-agnostic evaluation
that helps automate various steps of the workflow.
This is an image of what the UI looks like,
as well as a code snippet detailing
how it can be included in the configuration.
Fairness Indicators offers a suite of commonly-used fairness
metrics, such as false positive rate and false negative rate,
that come out of the box for developers
to use for model evaluation.
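For orientation, here is a minimal sketch of what such a configuration can look like with TensorFlow Model Analysis; the label key, slicing feature, and thresholds are placeholders, and exact field names may vary across library versions.

```python
import tensorflow_model_analysis as tfma

# Evaluate the model overall and sliced by an assumed 'age_group' feature,
# reporting the Fairness Indicators metric family (false positive rate,
# false negative rate, etc.) at a few decision thresholds.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="label")],
    slicing_specs=[
        tfma.SlicingSpec(),                            # overall
        tfma.SlicingSpec(feature_keys=["age_group"]),  # per-slice
    ],
    metrics_specs=[
        tfma.MetricsSpec(metrics=[
            tfma.MetricConfig(
                class_name="FairnessIndicators",
                config='{"thresholds": [0.25, 0.5, 0.75]}'),
        ]),
    ])

# The evaluation itself would then be run with tfma.run_model_analysis,
# pointing at your saved model and evaluation data.
```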
In order to ensure responsible and informed use,
the toolkit comes with six case studies that
show how Fairness Indicators can be applied across use cases
and problem domains and stages of the workflow.
By offering visuals for each slice of data,
as well as confidence intervals, Fairness Indicators
helps you figure out which slices are underperforming
with statistical significance.
Most importantly, Fairness Indicators
works well with other tools in the TensorFlow ecosystem,
leveraging their unique capabilities
to create an end-to-end experience.
Fairness Indicators data points can easily
be loaded into the What-If Tool for a deeper analysis,
allowing users to test counterfactual use cases
and examine problematic data points in detail.
This data can also be loaded into TensorFlow Data Validation
to identify the effects of data distribution
on model performance.
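As a hedged sketch of that kind of check with TensorFlow Data Validation, with placeholder file paths:

```python
import tensorflow_data_validation as tfdv

# Assumed CSV paths for the training and evaluation splits.
train_stats = tfdv.generate_statistics_from_csv("train.csv")
eval_stats = tfdv.generate_statistics_from_csv("eval.csv")

# Visualize the two distributions side by side; skew between them is one
# place a per-slice performance gap can come from.
tfdv.visualize_statistics(
    lhs_statistics=train_stats, rhs_statistics=eval_stats,
    lhs_name="train", rhs_name="eval")
```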
This Dev Summit, we're launching new capabilities
to expand the Fairness Indicators
workflow with remediation, easier deployments, and more.
We'll first focus on what we can do
to improve once we've identified potential sources of bias
in our model.
As we've alluded to previously, technical approaches
to remediation come in two different flavors--
data-based and model-based.
Data-based remediation involves collecting data, generating
data, re-weighting, and rebalancing in order
to make sure your data set is more representative
of the underlying distribution.
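As one small, hypothetical example of the re-weighting idea (not a specific library API), you can weight each training example inversely to the frequency of its slice:

```python
import numpy as np

# Hypothetical array: the slice each training example falls into.
groups = np.array(["young", "not_young", "young", "young", "not_young"])

# Weight each example inversely to its slice frequency so under-represented
# slices contribute more to the training loss.
unique, counts = np.unique(groups, return_counts=True)
weight_per_group = {g: len(groups) / (len(unique) * c)
                    for g, c in zip(unique, counts)}
sample_weights = np.array([weight_per_group[g] for g in groups])

# Keras models accept these directly:
# model.fit(x_train, y_train, sample_weight=sample_weights, ...)
```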
However, it isn't always possible to get or to generate
more data, and that's why we've also investigated
model-based approaches.
One of these approaches is adversarial training,
in which you penalize the extent to which a sensitive attribute
can be predicted by the model, thus mitigating the notion
that the sensitive attribute affects
the outcome of the model.
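To make the idea concrete, here is a minimal gradient-reversal sketch of adversarial debiasing in Keras. It illustrates the general technique rather than the specific remediation library the talk refers to; the layer sizes and loss weight are arbitrary.

```python
import tensorflow as tf

@tf.custom_gradient
def reverse_gradient(x):
    # Identity on the forward pass; flips the gradient sign on the backward
    # pass, so the shared layers are pushed to hide the sensitive attribute.
    def grad(dy):
        return -dy
    return tf.identity(x), grad

class GradientReversal(tf.keras.layers.Layer):
    def call(self, inputs):
        return reverse_gradient(inputs)

num_features = 16  # assumed input width

inputs = tf.keras.Input(shape=(num_features,))
shared = tf.keras.layers.Dense(64, activation="relu")(inputs)
task = tf.keras.layers.Dense(1, activation="sigmoid", name="task")(shared)
adversary = tf.keras.layers.Dense(1, activation="sigmoid", name="adversary")(
    GradientReversal()(shared))

model = tf.keras.Model(inputs, {"task": task, "adversary": adversary})
model.compile(
    optimizer="adam",
    loss={"task": "binary_crossentropy", "adversary": "binary_crossentropy"},
    # The adversary tries to predict the sensitive attribute from the shared
    # representation; the reversed gradient penalizes how well it can do so.
    loss_weights={"task": 1.0, "adversary": 0.5})

# model.fit(x, {"task": labels, "adversary": sensitive_attribute}, ...)
```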
Another methodology is demographic-agnostic
remediation, an early research method
in which the demographic attributes don't need
to be specified in advance.
And finally, there's constraint-based optimization,
which we will go into in more detail over the next few slides
with a case study that we have released.
Remediation, like evaluation, must be used with care.
We aim to provide both the tools and the technical guidance
to encourage teams to use this technology responsibly.
CelebA is a large-scale face attributes
data set with more than 200,000 celebrity images,
each with 40 binary attribute annotations, such as is
smiling, age, and headwear.
I want to take a moment to recognize
that binary attributes do not accurately
reflect the full diversity of real attributes
and are highly contingent on the annotations and annotators.
In this case, we are using the data set
to test a smile detection classifier
and how it works for various age groups characterized
as young and not young.
I also recognize that this does not capture
the full span of possible ages, but bear
with me for this example.
We trained an unconstrained-- and you'll find out what
unconstrained means--
tf.keras.Sequential model and evaluated and visualized it
using Fairness Indicators.
As you can see, not young has a significantly higher false
positive rate.
Well, what does this mean in practice?
Imagine that you're at a birthday party
and you're using this new smile detection
camera that takes a photo whenever everyone in the photo
frame is smiling.
However, you notice that in every photo,
your grandma isn't smiling, because the camera falsely
detected smiles that weren't actually there.
This doesn't seem like a good product experience.
Can we do something about this?
TensorFlow Constrained Optimization
is a technique released by the Glass Box research team
here at Google.
And here, we incorporate it into our case study.
TensorFlow Constrained Optimization works by first defining
the subsets of interest.
For example, here, we look at the not young group,
represented by groups_tensor less than 1.
Next, we set the constraints on this group, such
that the false positive rate of this group is less than
or equal to 5%.
And then we define the optimizer and train.
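Here is a toy sketch of those three steps with the tensorflow_constrained_optimization library, modeled on its documented eager-mode API; the data, the tiny linear model standing in for the classifier, and the hyperparameters are placeholders, and signatures may differ between library versions.

```python
import numpy as np
import tensorflow as tf
import tensorflow_constrained_optimization as tfco

# Toy stand-ins: features, binary smile labels, and a group column where
# 0 marks the "not young" slice.
features = tf.constant(np.random.normal(size=(32, 4)), dtype=tf.float32)
labels = tf.constant(np.random.randint(0, 2, size=32), dtype=tf.float32)
groups = tf.constant(np.random.randint(0, 2, size=32), dtype=tf.float32)

weights = tf.Variable(tf.zeros([4]))
bias = tf.Variable(0.0)

def predictions():  # a minimal linear model standing in for the classifier
    return tf.tensordot(features, weights, axes=(1, 0)) + bias

# 1. Define the subset of interest: the "not young" group (groups < 1).
context = tfco.rate_context(predictions, lambda: labels)
not_young = context.subset(lambda: groups < 1)

# 2. Minimize overall error subject to FPR(not young) <= 5%.
problem = tfco.RateMinimizationProblem(
    tfco.error_rate(context),
    [tfco.false_positive_rate(not_young) <= 0.05])

# 3. Define the optimizer and train.
optimizer = tfco.ProxyLagrangianOptimizerV2(
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=1.0),
    num_constraints=problem.num_constraints)
var_list = ([weights, bias] + list(problem.trainable_variables) +
            optimizer.trainable_variables())
for _ in range(1000):
    optimizer.minimize(problem, var_list=var_list)
```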
As you can see here, the constrained sequential model
performs much better.
We ensured that we picked a constraint
where the overall rate is equalized
for the unconstrained and constrained model, such that we
know that we're actually improving the model, as opposed
to merely shifting the decision threshold.
And this applies to accuracy, as well-- making sure
that the accuracy and AUC have not gone down over time.
But as you can see, the not young FPR
has decreased by over 50%, which is a huge improvement.
You can also see that the false positive rate for young
has actually gone up, and that shows that there are often
trade-offs in these decisions.
If you want to find out more about this case study,
please see the demos that we will
post online to the TF site.
Next, we want to figure out
how to compare our models across different decision thresholds
so that we can help you in your deployment decision
and make sure that you're launching the right model.
Model Comparison is a feature that we launched
so that you can compare models side by side.
In this example, which is the same example
that we used before, we're comparing the CNN and SVM
model for the same smile detection example.
Model comparison allows us to see that CNN outperforms SVM--
in this case, has a lower false positive rate--
across these different groups.
And we can also do this comparison
across multiple thresholds, as well.
You can also see the tabular data
and see that CNN outperforms SVM at all of these thresholds.
In addition to remediation and Model Comparison,
we also launched Jupyter notebook support, as well as
a Fairness Lineage with ML Metadata Demo Colab, which
traces the root cause of fairness disparities using
stored run artifacts, helping us detect
which parts of the workflow might have contributed
to the fairness disparity.
Fairness Indicators is still early
and we're releasing it here today
so we can work with you to understand how it works
for your needs and how we can partner together to build
a stronger suite of tools to support various questions
and concerns.
Learn more about Fairness Indicators here
at our tensorflow.org landing page.
Email us if you have any questions.
And the Bitly link is actually our GitHub page and not
our tf.org landing page, but check it out
if you're interested in our code or case studies.
This is just the beginning.
There are a lot of unanswered questions.
For example, we didn't quite address, where do I
get relevant features from if I want to slice
my data by those features?
And how do I get them in a privacy-preserving way?
I'm going to pass it on to Miguel
to discuss privacy tooling in TensorFlow in more detail.
Thank you.
MIGUEL GUEVARA: Thank you, Cat.
So today, I'm going to talk to you about machine learning
and privacy.
Before I start, let me give you some context.
We are in the early days of machine learning and privacy.
The field at the intersection of machine learning and privacy
has existed for a couple of years,
and companies across the world are deploying models
to be used by regular users.
Hundreds, if not thousands, of machine learning models
are deployed to production every day.
Yet, we have not ironed the privacy issues out
with these deployments.
For this, we need you, and we've got your back
with tooling in TensorFlow.
Let's walk through some of those privacy concerns and ways
in which you can mitigate them.
First of all, as you all probably know,
data is a key component of any machine learning model.
Data is at the core of any aspect that's
needed to train a machine learning model.
However, I think one of the pertinent questions
that we should ask ourselves is, what
are the primary considerations
when we're building a machine learning system?
We can start by looking at the very basics.
We're generally collecting data from an end device.
Let's say it's a cell phone.
The first privacy question that comes up
is, who can see the information in the device?
As a second step, we need to send that information
to the server.
And there are two questions there.
While the data is transiting to the server,
who has access to the network?
And third, who can see the information in the server
once it's been collected?
Is this only reserved for admins,
or can regular [INAUDIBLE] also access that data?
And then finally, when we deploy a model to the device,
there's a question as to who can see the data that
was used to train the model.
In a nutshell, if I were to summarize these concerns,
I think I can capture them with those black boxes.
The first concern is, how can we minimize data exposure?
The second one is, how can we make sure
that we're only collecting what we actually need?
The third one is, how do we make sure
that the data we collect is kept only ephemerally,
for the purposes that we actually need?
Fourth, when we're releasing it to the world,
are we releasing it only in aggregate?
And are the models that we're releasing memorizing or not?
One of the biggest motivations for privacy
is some ongoing research that some of my colleagues
have done here at Google.
A couple of years ago, they released a paper
where they show how neural networks can
unintentionally memorize training data, exposing it to extraction attacks.
So for instance, let's imagine that we are training a language
model to predict the next word.
Generally, we need text to train that machine learning model.
But imagine that that data or that core piece of text
has, or potentially has, sensitive information,
such as social security numbers, credit card numbers, or others.
What the paper describes is a method
for measuring the propensity of the model
to actually memorize some of that data.
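As a simplified, illustrative version of the paper's "exposure" measure (the numbers below are made up), the idea is to insert a known canary sequence into the training data and check how unusually well the trained model scores it compared with random candidates:

```python
import math

def exposure(canary_log_perplexity, candidate_log_perplexities):
    # Rank the inserted canary among random candidate secrets by model
    # log-perplexity (lower = the model finds it more likely). Rank 1 gives
    # the maximum exposure of log2(number of candidates); a middling rank
    # gives exposure near zero, i.e. little evidence of memorization.
    scores = sorted(candidate_log_perplexities + [canary_log_perplexity])
    rank = scores.index(canary_log_perplexity) + 1
    return math.log2(len(scores)) - math.log2(rank)

# Hypothetical numbers: the canary scores far better than 9,999 random
# candidates, which would suggest it was memorized.
print(exposure(2.1, [20.0 + 0.001 * i for i in range(9999)]))
```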
I really recommend reading it,
and I think that one of the interesting aspects
is that we're still in the very early days of this field.
The research that I showed you is
very good for neural networks, but there are ongoing questions
around classification models.
We're currently exploring more attacks
against machine learning models that
can be more generalizable and used by developers like you,
and we hope to update you on that soon.
So how can you get started?
What are the steps that you can take
to do machine learning in a privacy preserving way?
Well, one of the techniques that we use is differential privacy,
and I'll walk you through what that means.
You can look at the image there and imagine
that that image is the collection of data
that we've collected from a user.
Now let's zoom into one specific corner, that blue square
that you see down there.
So assume that we're training on the individual data
that I'm zooming in on.
If we trained without privacy, we'd
train directly on that piece of data.
However, we can be clever about the way that we train a model.
And what we could do, for instance,
is flip each bit with a 25% probability.
One of the biggest concerns that people
have with this approach is that it naturally
introduces some noise, and people
ask, what's the performance of the resulting
model?
Well, I think one of the interesting things
from this image is that even after flipping 25% of the bits,
the image is still there.
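A tiny sketch of that bit-flipping idea, known as randomized response, with made-up data:

```python
import numpy as np

def randomized_response(bits, flip_probability=0.25, seed=None):
    # Flip each bit independently with the given probability. Any single
    # reported bit is now deniable, yet aggregate statistics over many users
    # can still be recovered after correcting for the known flip rate.
    rng = np.random.default_rng(seed)
    flips = rng.random(len(bits)) < flip_probability
    return np.where(flips, 1 - np.asarray(bits), bits)

original = np.array([1, 0, 1, 1, 0, 0, 1, 0])
print(randomized_response(original, seed=0))
```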
And that's kind of the big idea around differential privacy,
which is what powers TensorFlow Privacy.
As I said, differential privacy is the notion of privacy
that protects the presence or absence of a user in a data
set, and it allows us to train models
in a privacy-preserving way.
We released, last year, TensorFlow Privacy,
which you can check at our GitHub repository,
github.com/tensorflow/privacy.
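To give a feel for what training with TensorFlow Privacy looks like, here is a minimal DP-SGD sketch; the model, hyperparameters, and import path are illustrative and may differ across library versions.

```python
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import (
    DPKerasSGDOptimizer)

# A small Keras classifier trained with DP-SGD instead of plain SGD.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])

optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,       # clip each microbatch gradient to bound influence
    noise_multiplier=1.1,   # add calibrated Gaussian noise to the clipped sum
    num_microbatches=32,    # should evenly divide the batch size
    learning_rate=0.15)

# Per-example losses are needed so each microbatch can be clipped separately.
loss = tf.keras.losses.BinaryCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=32, epochs=...)
```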
However, I want to talk to you also about some trade-offs.
Training with privacy might reduce the accuracy
of the models and increase training time,
sometimes exponentially.
Furthermore, and I think more worryingly and tied
to Cat's talk, if a model is already biased,
differential privacy might make things even worse, as in
even more biased.
However, I do want to encourage you
to try to use differential privacy because it's
one of the few ways in which we have to do privacy in ML.
The second technique is Federated learning.
As a refresher, TensorFlow Federated
is an approach to machine learning where a shared
global model is trained across many participating clients that
keep their training data locally.
It allows you to train a model without ever collecting
the raw data, therefore, reducing some privacy concerns.
And of course, you can also check it out
at our GitHub repository.
This is what I was mentioning
about TensorFlow Federated learning.
The idea is that devices generate a lot of data
all the time-- phones, IoT devices, et cetera.
Traditional ML requires us to centralize
all of that data in a server and then train the models.
One of the really cool aspects about Federated learning
is that each device trains locally,
and only the outputs are aggregated to create improved models,
so the orchestrator never sees any private user data.
As a recap, Federated learning
allows you to train models without ever
collecting the raw data.
So if you remember the first slide that I showed,
it really protects the data at the very edge.
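As a rough sketch of what that looks like with TensorFlow Federated (the tiny Keras model and input spec are placeholders, and the API has shifted between releases):

```python
import tensorflow as tf
import tensorflow_federated as tff

def model_fn():
    # A small Keras model, wrapped so TFF can serialize it; input_spec must
    # match the (features, label) batches each client dataset yields.
    keras_model = tf.keras.Sequential(
        [tf.keras.layers.Dense(1, input_shape=(10,))])
    return tff.learning.from_keras_model(
        keras_model,
        loss=tf.keras.losses.MeanSquaredError(),
        input_spec=(tf.TensorSpec(shape=[None, 10], dtype=tf.float32),
                    tf.TensorSpec(shape=[None, 1], dtype=tf.float32)))

# Federated averaging: each round, clients train locally and only model
# updates (never raw examples) are sent back and aggregated on the server.
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02))

state = iterative_process.initialize()
# state, metrics = iterative_process.next(state, federated_train_data)
```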
In terms of next steps, we would really
like you to reach out to tf-privacy.
We would love to partner with you
to build responsible AI cases with privacy.
As I said earlier, we're in this together.
We are still learning.
The research is ongoing.
And we want to learn more from your use cases.
I hope that from the paper that I showed you,
you have the sense that keeping user data private
is super important.
But most importantly, this is not trivial.
The trade-offs in machine learning and privacy are real,
and we need to work together to find what the right balance is.
[MUSIC PLAYING]