  • Hi, I'm Robert Crowe

  • and today I'm going to be talking about TensorFlow Extended,

  • also known as TFX, and how it helps you put

  • your amazing machine learning models into production.

  • This is the first episode in our five-part series

  • on real world machine learning which will help you

  • get up to speed on using TFX to create your own

  • production machine learning pipelines.

  • In today's episode, we'll be asking the question,

  • what exactly is this TFX thing anyway?

  • Let's find out.

  • ♪ (upbeat music) ♪

  • When we think about ML, we usually only think

  • about the great models that we can now create.

  • After all, that's what all the research papers

  • are focused on.

  • But when we want to take that amazing model

  • and make it available to the world, we need to think about

  • all the things that a production solution requires.

  • So that's why we have TFX, to build production pipelines

  • so that we can offer our amazing models to the world.

  • Google created TFX because we needed it.

  • And there was nothing already available that could meet our needs.

  • Google does a ton of ML.

  • And not just Google but all of the Alphabet companies.

  • There's ML in almost everything we do.

  • In fact, TFX wasn't the first ML pipeline framework

  • that Google created.

  • It evolved out of earlier attempts and is now the default framework

  • for the majority of Google's ML production solutions.

  • And now, Google is open sourcing TFX

  • and making it available to everyone.

  • And it's not just Google.

  • TFX has had deep impact on our partners,

  • including Twitter, Airbnb and PayPal.

  • As ML developers putting a model into production,

  • what do we need to think about?

  • First, when we start planning for developing an ML application,

  • we have all the normal ML things to think about.

  • That includes getting labeled data if we're doing supervised learning,

  • and making sure that our data set covers well the space

  • of possible inputs.

  • We also want to minimize the dimensionality of our feature set

  • while maximizing the predictive information it contains.

  • And we need to think about fairness

  • and make sure that our application won't be unfairly biased.

  • We also need to consider rare conditions,

  • especially in applications like healthcare

  • where we might be making predictions

  • for conditions that only occur in rare, but important, situations.

  • And finally, we need to consider that this will be a living solution

  • that will evolve over time as new data flows in

  • and as conditions change, and plan for life cycle management

  • of our data.

  • But in addition to all that, we need to remember that

  • we're putting a software application into production.

  • That means that we still have all the requirements that

  • any production software application has,

  • including scalability, consistency, modularity

  • and testability, as well as safety and security.

  • We're way beyond just training a model now.

  • By themselves, these are challenges

  • for any production software deployment.

  • And we can't forget about them just because we're doing ML.

  • How are we going to meet all these needs

  • and get our amazing new model into production?

  • We don't pretend to have all the answers.

  • This is an evolving field within the ML community

  • and we welcome contributions.

  • If you're interested in a more in-depth discussion

  • of the challenges of machine learning in production environments,

  • this is a great paper.

  • That's what TFX is all about.

  • TFX allows you to create production ML pipelines that include

  • many of the requirements for production software deployments

  • and best practices.

  • It starts with ingesting your data and flows through data validation,

  • feature engineering, training, evaluating and serving.

  • In addition to TensorFlow itself, we've created libraries

  • for each of the major phases of an ML pipeline:

  • TensorFlow Data Validation, TensorFlow Transform

  • and TensorFlow Model Analysis.

  • TFX implements a series of pipeline components

  • which leverage these libraries,

  • which in this diagram are in orange, and allows you to create

  • your own components too.

  • To tie all this together, we created some horizontal layers

  • for things like pipeline storage, configuration and orchestration.

  • These layers are really important

  • for managing and optimizing your pipelines

  • and the applications that you run on them.

  • We'll be discussing those more in later episodes.

  • For now, that should give you an idea of what we're talking about

  • when we think about implementing a production ML pipeline with TFX.

  • In our next episode, we'll discuss how TFX pipelines actually work.

  • For more information on TFX, visit us at tensorflow.org/tfx

  • and don't forget to comment and like us below

  • and thanks for watching.

  • ♪ (music) ♪
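
The talk describes a pipeline that starts with ingesting data and flows through data validation, feature engineering, training, evaluating and serving. As a rough illustration of that ordering only, here is a minimal standard-library Python sketch. It does not use the TFX API: `STAGE_ORDER`, `ToyPipeline` and `build_demo_pipeline` are hypothetical names invented for this example, and the stage functions are toy stand-ins for real components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Stage order as listed in the talk. These strings are illustrative labels
# (an assumption of this sketch), not TFX component class names.
STAGE_ORDER = [
    "ingest",
    "validate_data",
    "engineer_features",
    "train",
    "evaluate",
    "serve",
]

StageFn = Callable[[dict], dict]


@dataclass
class ToyPipeline:
    """Hypothetical helper: runs registered stages in STAGE_ORDER."""

    stages: Dict[str, StageFn] = field(default_factory=dict)

    def register(self, name: str, fn: StageFn) -> None:
        # Reject stage names that are not part of the fixed order.
        if name not in STAGE_ORDER:
            raise ValueError(f"unknown stage: {name}")
        self.stages[name] = fn

    def run(self, artifacts: Optional[dict] = None) -> dict:
        # Each stage receives the artifacts produced so far and returns
        # an updated dictionary; unregistered stages are skipped.
        artifacts = dict(artifacts or {})
        executed = []
        for name in STAGE_ORDER:
            fn = self.stages.get(name)
            if fn is None:
                continue
            artifacts = fn(artifacts)
            executed.append(name)
        artifacts["executed"] = executed
        return artifacts


def build_demo_pipeline() -> ToyPipeline:
    pipeline = ToyPipeline()
    # Registered out of order on purpose: run() still follows STAGE_ORDER.
    pipeline.register(
        "train",
        lambda a: {**a, "model": sum(a["features"]) / len(a["features"])},
    )
    pipeline.register("ingest", lambda a: {**a, "rows": [1.0, 2.0, None, 3.0]})
    pipeline.register(
        "validate_data",
        lambda a: {**a, "rows": [r for r in a["rows"] if r is not None]},
    )
    pipeline.register(
        "engineer_features",
        lambda a: {**a, "features": [r * 2 for r in a["rows"]]},
    )
    return pipeline


if __name__ == "__main__":
    result = build_demo_pipeline().run()
    print(result["executed"])
    print(result["model"])
```

The fixed stage list is a deliberate simplification: it shows why validation sits before feature engineering and training, so that bad rows are dropped before anything is learned from them. A real TFX pipeline wires components together through their inputs and outputs rather than through a hard-coded list.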
