[MUSIC PLAYING] MAGNUS HYTTSTEN: Hi there, everybody. What's up? My name is Magnus, and you're watching Coding TensorFlow, the show where you learn how to code in TensorFlow. [MUSIC PLAYING]

All right. In this episode, we'll talk about saving and loading models. So why do we want to talk about this? Well, first of all, whenever you train a model of any significant complexity, the training can take a long time. Most of the models in this Getting Started series will just take a minute or so to train, whereas real-life models can take days or even weeks to train. So if you were to hit Control-C on your training job after it's been running for a day or so, all your model weights and values would be lost, and you would have to restart training from the beginning and be a very sad camper. But if you save your model every so often, you can always resume training from that point, making you a happy camper. Another benefit is that you can take your model and transfer it to another computer, where you can continue training. But I'm pretty sure you already guessed that I was going to bring that up.

That's enough talking for now. Check out the links below to locate the code, because that's what we're going to do now. Check out the code! Oh, finally! We get to check out the code! That's awesome! Let's go and check out the code!

All right. Let's start by checking out the awesome licenses here at the top. Then we install packages for HDF5 and YAML support. And here we do some imports, and print the TensorFlow version. It's totally OK if you have a later version than me here. We use the MNIST dataset to demonstrate model loading and saving. Then we reshape the images to batches of 28 by 28 arrays, which is the pixel size of MNIST images, and normalize all pixel values to be between 0 and 1. Next is the model definition, which is defined in the create_model function.
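The preprocessing and model definition described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the layer sizes, the helper name `preprocess`, and the random stand-in images (used so the snippet runs offline) are all assumptions. Only `create_model` and the 28 by 28 image size come from the episode.

```python
import numpy as np
import tensorflow as tf


def preprocess(images):
    """Reshape to batches of 28x28 arrays and scale pixels from [0, 255] to [0, 1]."""
    return images.reshape(-1, 28, 28).astype("float32") / 255.0


def create_model():
    """Build a deliberately small classifier (layer sizes are assumptions)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Random stand-in for MNIST so the sketch needs no download. With network
# access, tf.keras.datasets.mnist.load_data() would supply the real images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(32, 28, 28), dtype=np.uint8)

train_images = preprocess(fake_images)
model = create_model()
```

Keeping the model construction in one function matters later in the episode, because loading saved weights requires a fresh model with the same architecture.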
This is a very basic model, which is totally OK, because in this screencast we're interested in learning how to load and save models, not in creating the best model for the MNIST dataset.

And here, we finally get to see how a model can be saved. checkpoint_path will be the path of the saved model. A model checkpoint callback object is created with this path. We also specify that only the weights of the model should be saved, and that we want debug output when the saving is performed. Finally, we perform the model training by calling the fit method and providing this callback. As you can see, this causes the model to be saved each time an epoch completes.

And if we look at the checkpoints directory, we can now see three files. The cp.ckpt.data file contains all the weight values. This file name carries a numbered range suffix (of the form 00000-of-00001), because multiple partitions could potentially be used if we have a lot of weights. The cp.ckpt.index file specifies which partition file contains which weights. And finally, the checkpoint file is a text file that points to the latest model. In our case, we only have one data file, but shortly, we'll see an example where we have saved multiple versions of the model.

All right. So now that we have our saved model, let's try out loading it. First, let's just create a model from scratch and try it out. Since it hasn't been trained, you can see that the accuracy really sucks. And now for the magic. If we call the method load_weights with our checkpoint path, our model gets initialized with the previous training state, and has much better accuracy.

OK. Those are the basics of saving and loading models. Let's look at some more options we have. One option is to provide the period parameter when creating the model checkpoint object. In this case we use the value 5, which as you can see saves a new model every five epochs. Observe that in this case, we also use a parameterized file name based on the epoch. This means a unique file is saved every time.
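To make the epoch-parameterized file name concrete, here is a small pure-Python sketch. It relies on the fact that the Keras checkpoint callback fills an `{epoch}` placeholder using ordinary Python string formatting. The path pattern, the helper name `saved_paths`, and the total of 50 epochs are assumptions for illustration; only the period of 5 is stated in the episode.

```python
# Hypothetical path pattern; "{epoch:04d}" is a standard Python format field
# that renders the epoch number zero-padded to four digits.
checkpoint_pattern = "training_2/cp-{epoch:04d}.ckpt"


def saved_paths(pattern, total_epochs, period):
    """Return the file names produced when saving once every `period` epochs."""
    return [
        pattern.format(epoch=epoch)
        for epoch in range(1, total_epochs + 1)
        if epoch % period == 0
    ]


paths = saved_paths(checkpoint_pattern, total_epochs=50, period=5)

print(len(paths))   # 10
print(paths[0])     # training_2/cp-0005.ckpt
print(paths[-1])    # training_2/cp-0050.ckpt
```

Without the placeholder, every save would overwrite the same file. Note that later Keras releases deprecate the `period` argument in favor of `save_freq`, so the exact argument name depends on the installed version.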
That's also why we can see multiple files when looking at the checkpoint directory. We can also use a function called tf.train.latest_checkpoint that will return the latest model that was saved, in our case the one with index 50. This function looks into the file with the name checkpoint to find the latest checkpoint. Remember that the checkpoint file is a text file, so you can actually check the file content yourself. And now we can load the model using the load_weights function like we did before, providing the value returned by tf.train.latest_checkpoint.

Another way of saving models is to call the save method on the model. This will create an HDF5-formatted file. Remember that we set save_weights_only to True the last time we saved a model. In addition to the variables, the save method saves additional data, like the model's configuration and even the state of the optimizer. A model that was saved using the save method can be loaded with the function keras.models.load_model. And as you can see, we have the accuracy of a trained model.

In addition to everything we've looked at, TensorFlow also has a very important file format, called SavedModel. This is a file format that lets you exchange models between many different parts of TensorFlow, like TensorFlow Python, TensorFlow.js, and also TensorFlow Lite. We are currently building out first-class support for SavedModel in Keras, and you can check out the links below to read more about it.

And that's it for this episode of Coding TensorFlow. Make sure to subscribe to the channel to get more videos like this. Now it's your turn to go out there and create some great models. And don't forget to tell us all about it. [MUSIC PLAYING]
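The two ways of saving covered in this episode, weights only and the whole model, can be compared in a short round trip. This is a hedged sketch rather than the notebook's code: the file names `demo.weights.h5` and `my_model.h5` are hypothetical, the model is untrained and merely stands in for a trained one, and the `.weights.h5` suffix is chosen on the assumption that newer Keras releases require it for weight files (the `cp.ckpt` path in the video comes from an older release).

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def create_model():
    """Small stand-in classifier; the layer sizes are assumptions."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


# Random inputs used only to compare predictions before and after loading.
x = np.random.default_rng(0).random((8, 28, 28)).astype("float32")
source_model = create_model()  # stands in for a model that has been trained
expected = source_model.predict(x, verbose=0)

with tempfile.TemporaryDirectory() as tmp:
    # Weights only: the architecture must be rebuilt in code before loading.
    weights_path = os.path.join(tmp, "demo.weights.h5")
    source_model.save_weights(weights_path)
    fresh_model = create_model()
    fresh_model.load_weights(weights_path)

    # Whole model: configuration, weights, and optimizer state in one HDF5 file.
    model_path = os.path.join(tmp, "my_model.h5")
    source_model.save(model_path)
    restored_model = tf.keras.models.load_model(model_path)

    weights_match = np.allclose(expected, fresh_model.predict(x, verbose=0))
    model_match = np.allclose(expected, restored_model.predict(x, verbose=0))

print(weights_match, model_match)
```

The practical difference is that `load_weights` needs `create_model` to be available, while `load_model` rebuilds the architecture from the file itself.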
Saving and Loading Models (Coding TensorFlow)
Posted by 林宜悉 on 2020/03/25