[MUSIC PLAYING]

EWA MATEJSKA: Hi, everyone. Thank you for joining us. I'm Ewa Matejska, and I'm a technical program manager on the TensorFlow team.

ZHITAO LI: Hi, my name is Zhitao. I'm a software engineer from Google's TensorFlow Extended team, TFX.

EWA MATEJSKA: Today, we'll be talking about a brand new feature: the addition of native Keras model support to TFX pipelines. So could you tell me what's TFX, what are TFX pipelines, and what is native Keras model support?

ZHITAO LI: Happy to do that. TFX is Google's production-ready machine learning platform. TFX pipelines is something we released last year to bring the pipeline experience to open source users, as well as Google Cloud users. And the native Keras support is something we started working on last October to make sure our TensorFlow 2 users can use the native Keras API inside TFX to train their machine learning models.

EWA MATEJSKA: What can I do with TFX pipelines?

ZHITAO LI: So you can ingest data into TFX, do data processing and data understanding, do feature engineering on top of your data, train the TensorFlow model, do model analysis and model validation on your model, and then finally, when everything is ready, push the model onto production-ready [INAUDIBLE] solutions.

EWA MATEJSKA: Awesome. I'm excited to see the native Keras support. So what do I do?

ZHITAO LI: Let me show that in this notebook. So this is a public notebook from the TFX team that demonstrates how to use various components in TFX. This notebook also uses native Keras. I'm going to show how to do it that way. So to do that, we first need to install TFX and various other software, including TensorFlow and TensorBoard. We make sure all the packages are preloaded, and then make sure the software versions are correct. After that, we set our pipeline paths to make sure we can correctly access all the data we need.
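
The version check described here can be approximated outside the notebook. The following is a minimal sketch using only the Python standard library, not the notebook's own setup cell; the helper name `installed_versions` is hypothetical, and the package names in the usage line are assumed to be the distribution names for the software mentioned above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names for TFX, TensorFlow, and TensorBoard.
    report = installed_versions(["tfx", "tensorflow", "tensorboard"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Running this before starting a pipeline gives a quick view of which of the required packages are present and at what version.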
EWA MATEJSKA: OK, and what kind of model will you be using? What kind of data?

ZHITAO LI: So the data set here is the public taxi data set from the city of Chicago. And the problem to solve is to predict whether the driver will receive a tip of more than 20% of the fare, which we call [INAUDIBLE]. So we are going to download the example data to the path and make sure the data is loadable. We check the first couple of lines. Then we create the interactive context, which lets us run each component of a TFX pipeline in the notebook.

EWA MATEJSKA: Is the interactive context a new API?

ZHITAO LI: The interactive context is an API from last October. It helps us run each component of the TFX pipeline in a notebook. So we first start with ExampleGen. This ingests the data into the pipeline and transforms it into [INAUDIBLE] examples. We can check the first couple of examples to make sure they're correct. Then we can use the StatisticsGen component to generate some statistics for the data.

EWA MATEJSKA: Can you tell me a little more about the statistics?

ZHITAO LI: Sure. The statistics tell us, for each of the features in the data set, what's the distribution? How many [INAUDIBLE] records are there? Minimum value, maximum value, median value, et cetera, et cetera.

EWA MATEJSKA: OK, cool.

ZHITAO LI: And we can also generate a schema out of the data, which will tell us, in an aggregated view, what the data looks like. And we can list out all the schemas from here. We can also use the ExampleValidator to make sure the data is correct. Now, we can use Transform to do feature engineering on top of our existing data. To do that, people simply write a preprocessing function, which takes the original inputs and uses Python functions to define the transforms on them. And we can easily capture all these transforms in the result.
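
To make the label and the per-feature statistics concrete, here is a minimal sketch in plain Python. It is not the TFX API: in the notebook these steps run through StatisticsGen and Transform. The function names, the `fare` and `tips` field names, the sample rides, and the handling of missing values are all assumptions made for illustration.

```python
import statistics


def big_tipper(fare, tips, threshold=0.2):
    """Return 1 if the tip is more than `threshold` times the fare, else 0."""
    # Assumption: rides with a missing tip or a non-positive fare get label 0.
    if fare is None or tips is None or fare <= 0:
        return 0
    return int(tips > threshold * fare)


def summarize(values):
    """Summarize one feature: record counts, min, max, and median."""
    present = [v for v in values if v is not None]
    summary = {"count": len(present), "missing": len(values) - len(present)}
    if present:
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["median"] = statistics.median(present)
    return summary


# Made-up rides, used only to exercise the two functions.
rides = [
    {"fare": 10.0, "tips": 3.0},
    {"fare": 20.0, "tips": 2.0},
    {"fare": 8.0, "tips": None},
    {"fare": 15.0, "tips": 4.0},
]

labels = [big_tipper(r["fare"], r["tips"]) for r in rides]
fare_stats = summarize([r["fare"] for r in rides])
tips_stats = summarize([r["tips"] for r in rides])

print(labels)
print(fare_stats)
print(tips_stats)
```

The label function is the kind of logic a preprocessing function would express, and the summary mirrors the minimum, maximum, and median values mentioned for the statistics step.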
Now, to support native Keras, we ask users to write their TensorFlow training code as if they're just writing the Keras [? space ?] code in a normal environment. The model type we are using here is a wide and deep model. We simply ask people to write their training code. This is a--

EWA MATEJSKA: Wide and deep model, you said?

ZHITAO LI: Yes. Build a Keras model. People can build a wide and deep classifier. And once this classifier is defined using the native Keras API, they can wrap that in the run function. The run function will then be fed into the TFX Trainer executor. And we expect the function to export a saved model. After that, we kick off the training component. And we can see the training happen.

EWA MATEJSKA: OK, awesome.

ZHITAO LI: The training really happened in the Jupyter Notebook. We see these are the features we are using. These are the advanced features. These are the layers we used in the model. And we train them for 10,000 steps. And then we exported the model [INAUDIBLE].

EWA MATEJSKA: So this is a lot of meaty content. How can I follow along at home?

ZHITAO LI: Sure. So feel free to check out the TensorFlow.org/TFX page. That is our home page. We have all the tutorials, API docs, as well as component guides available there. And feel free to reach out to us on either GitHub or the TFX Google Group.

EWA MATEJSKA: And I have one last question for you, a high-level question. How do I take this out to production?

ZHITAO LI: Oh, sure. Happy to do that. So to do that, you can simply use the Pusher component to push the model onto various types of production-ready serving solutions, including TensorFlow Serving, or mobile devices using TensorFlow Lite, or TensorFlow Hub.

EWA MATEJSKA: Thank you so much for showing me a little bit about the native Keras model support. And thank you for joining us.

ZHITAO LI: Thank you.

[MUSIC PLAYING]
TensorFlow Extended (TF Dev Summit '20). Posted by 林宜悉 on 2020/03/31.