[MUSIC PLAYING]
LEZHI LI: Hi, everyone.
My name's Lezhi, and I'm a software engineer at Uber.
So today I'm going to talk about how we built a visual debugging
tool for machine learning using TensorFlow.js.
So why is model debugging important?
Machine learning practitioners report
that they spend the majority of their time
not on building a model, but instead
on iterating on and debugging existing models.
So there's a huge opportunity for us
to improve the efficiency of that 80% of their time.
Traditionally, the only guidance for model developers
evaluating a model's performance
is to look at performance metrics.
Although these metrics are useful,
they do not give much insight
into how to improve a model or why a model performs
in a certain way.
So given the intrinsic opacity of machine learning algorithms,
it is very hard for anyone
to understand model performance.
So how do we solve that problem?
The idea here is that we can transform a model-space problem
into a data-space problem.
And by that we mean that, instead
of asking what went wrong with the model,
we ask on which data the model made mistakes.
And instead of asking why a model makes certain mistakes,
we look into the feature characteristics
of these failed data points.
So based on those two ideas, we developed
Manifold, which is a model-agnostic visual debugger
for machine learning.
Here's the workflow of using Manifold.
The user connects Manifold to the output
datasets of several machine learning models,
and Manifold automatically segments these datasets
into subsets, each subset containing
data points with similar performance.
The user then chooses the subsets of interest
to compare against each other, and Manifold
highlights the feature distribution differences
between these subsets, helping
them diagnose the causes behind the performance outcome.
So as we developed these ideas into production,
we faced several technical challenges.
Among them are a performance challenge and a
portability challenge.
So traditionally, it is the model training backend's job
to handle the performance metric calculation.
But that pattern is no longer applicable
to our visual interface, because of the latency introduced
by recalculation in response to user interaction.
And also, if you want to connect Manifold
to another machine learning training backend,
there are two pieces of code you need to port: the backend
code and the frontend code.
But in reality, the metrics calculation logic
belongs to the visual tool and should not
be injected into the training backend.
Those two reasons are why we put this computation logic
inside the frontend.
And because this computation can get intensive as the data
volume increases, we use
TensorFlow.js to help us increase the computation
efficiency.
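To make that concrete, here is a minimal sketch, not Manifold's actual initialization code, of how a browser frontend can route TensorFlow.js computation to the GPU via the WebGL backend, falling back to the CPU backend when WebGL is unavailable:

```ts
import * as tf from '@tensorflow/tfjs';

// Illustrative setup only: pick the GPU-accelerated WebGL backend in the
// browser, falling back to plain CPU execution if WebGL is unavailable.
async function initCompute(): Promise<void> {
  const ok = await tf.setBackend('webgl');
  if (!ok) {
    await tf.setBackend('cpu');
  }
  await tf.ready();
  console.log(`Using TensorFlow.js backend: ${tf.getBackend()}`);
}
```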
So what are the intensive computations
involved in the Manifold interface?
In the performance comparison view,
we compute performance scores for each data point
on each model and use those metrics
to run K-means clustering to segment
the dataset into subsets.
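As an illustration of that segmentation step, here is a minimal K-means sketch in TensorFlow.js, assuming the per-instance scores arrive as a [numPoints, numModels] tensor; the function name, initialization strategy, and iteration count are illustrative, not Manifold's actual implementation:

```ts
import * as tf from '@tensorflow/tfjs';

// Cluster per-instance performance scores into k groups of data points
// that perform similarly. `points` has shape [numPoints, numModels].
function kMeans(points: tf.Tensor2D, k: number, iters = 10): tf.Tensor1D {
  return tf.tidy(() => {
    // Simple initialization: use the first k points as centroids.
    let centroids: tf.Tensor = points.slice([0, 0], [k, -1]);
    let assignments = tf.zeros([points.shape[0]], 'int32') as tf.Tensor1D;
    for (let i = 0; i < iters; i++) {
      // Squared Euclidean distance from every point to every centroid: [n, k].
      const dists = points.expandDims(1)
        .sub(centroids.expandDims(0))
        .square()
        .sum(2);
      assignments = dists.argMin(1) as tf.Tensor1D;
      // Recompute each centroid as the mean of its assigned points,
      // using a one-hot membership matrix to stay fully vectorized.
      const members = tf.oneHot(assignments, k).toFloat();   // [n, k]
      const counts = members.sum(0).expandDims(1);           // [k, 1]
      centroids = members.transpose().matMul(points).div(counts.add(1e-8));
    }
    return assignments;
  });
}
```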
And in the feature attribution view, for each feature
we compute the distribution histograms of the two
different subsets, and use those
histograms to compute the KL divergence, which ranks
feature importance for model developers
inspecting the model performance.
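A hedged sketch of that ranking step: given the per-bin counts of one feature over two subsets, normalize the histograms into probability distributions (with a small smoothing constant, an assumption here, to avoid empty bins) and score their difference with KL divergence:

```ts
import * as tf from '@tensorflow/tfjs';

// D_KL(P || Q) = sum_i p_i * log(p_i / q_i), computed from raw bin counts.
function klDivergence(histP: tf.Tensor1D, histQ: tf.Tensor1D): tf.Scalar {
  return tf.tidy(() => {
    const eps = 1e-8; // smoothing so empty bins don't produce log(0)
    const p = histP.add(eps).div(histP.sum().add(eps * histP.size));
    const q = histQ.add(eps).div(histQ.sum().add(eps * histQ.size));
    return p.mul(p.div(q).log()).sum() as tf.Scalar;
  });
}

// Features whose histograms diverge most between subsets rank highest.
klDivergence(
  tf.tensor1d([5, 20, 40, 10]),  // bin counts of a feature in subset A
  tf.tensor1d([30, 25, 10, 15])  // bin counts of the same feature in subset B
).print();
```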
And in all of those scenarios, TensorFlow.js
gave us a big performance boost
compared to a plain JavaScript implementation.
And in some cases, such as the per-instance model metrics
computation, the speedup can be as high as 100 times.
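To show where that kind of speedup comes from, here is an illustrative per-instance metric, binary log loss, computed for every data point in one vectorized pass instead of a JavaScript loop; the exact metrics Manifold computes may differ:

```ts
import * as tf from '@tensorflow/tfjs';

// Per-instance binary log loss: one tensor op pipeline over all points,
// rather than one JavaScript loop iteration per data point.
function perInstanceLogLoss(labels: tf.Tensor1D, preds: tf.Tensor1D): tf.Tensor1D {
  return tf.tidy(() => {
    const p = preds.clipByValue(1e-7, 1 - 1e-7); // numerical safety
    // -(y * log(p) + (1 - y) * log(1 - p)), element-wise
    return labels.mul(p.log())
      .add(tf.scalar(1).sub(labels).mul(tf.scalar(1).sub(p).log()))
      .neg() as tf.Tensor1D;
  });
}
```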
So to conclude, complex tasks such as machine learning
diagnosis can benefit a lot from the numerical computation
capacity of TensorFlow.js.
And TensorFlow.js opens up new opportunities for developers
of visual analytics tools.
OK, that's it.
Thank you.
[APPLAUSE]
[MUSIC PLAYING]