Hello, and welcome to another ml5 neural network video tutorial. I'm following up on what I did in the previous video, where I built this example. This example has this interaction where you click the mouse all over this canvas and press keys to assign each xy point a label, C, D, E, and I added a bunch of notes since the previous video. So I have the full scale: C, D, E, F, G, A, B. Then I trained the model with the inputs being the xy of all these points and the target being the actual label. And once the model is trained, it can make guesses. So in theory, I just collected this data set and trained the model. When I click over here, I should hear the musical note D. I should hear the musical note E, G, A. And in between, it's sort of interesting to see what I get. But it works as expected.

I ran into a pretty significant problem while working on this. Because once I've collected the data set and trained the model, if I had a bug in the code or something I needed to fix or I wanted to try a different parameter, I have to stop the sketch and run it again and sit there and do this highly manual process of clicking, clicking, clicking to collect the data set. So in this video, I want to look at saving the data. And I also want to look at saving the model, and those are two pretty different things. It might seem like the same idea. I want to save the data. I want to save the model. Why would I do one versus the other?

Let's pause for a second and examine all the steps of a machine learning project and where we might want to save the data versus save the model and why. Step 1, collect the data. Now, this could be a really big, complicated step. But in my scenario, in my example, it's just clicking the mouse a whole bunch of times and pressing keys on the keyboard. Step 2, train the model. Once the model has been trained, the idea is to use that model in some scenario.
So that we can call step 3: deploy the model, also known as prediction or inference. So now the question is, where along the way might you want to save the state of what you're doing? So in the most traditional machine learning sense, once you've done all of this and your model is trained, you don't ever need to look back. You've got a trained model. You can save that model. So right here in between steps 2 and 3 is a point where we might want to save the model if we're done and our model is exactly the way we want it and we're ready to just use it in a project. However, you might want to try training the model a variety of different ways. And this is where you might want to save the data, in between steps 1 and 2. We also might collect a lot of data, want to take a break, reload that data, and collect more data. There are a lot of different reasons why we might want to stop in between steps 1 and 2 and save where we are.

And the functions in the ml5 neural network class that we want to use are saveData and save. So just save is saving the model. saveData is saving the data. There are also functions for loading it back, which we'll look at: loadData and load.

Let's begin by just looking at saveData. So in this particular example, all of the interaction happens with key presses. And certainly, as I've mentioned before, you might want to think about a more thoughtful interface for doing all this work. But for me, I'm just going to add another key press, s, for saveData. So I'm going to say else if the key is s, then I'm going to call model.saveData. I can look up more about how the saveData function works by looking at the ml5 website. And we can see there are two optional arguments. So one argument is a file name, which I want to use because I want to set the file name. If you don't, it'll just name the file with the date. And then a callback to know that it's done. I don't actually really need to worry about that.
Because I'll know that it's done when the file is there and downloaded to the downloads directory. I'm going to give it the name mouse-notes and run the sketch and collect some data. So I'm just going to do a little bit just to make sure it's working. So now I can hit s. And look, a file has been downloaded. I can take a look at this file in Visual Studio Code. And here's what the file looks like. So I've got a data property with an array that has all the data in it: xs and ys with a label, xs and ys with a label. And if I reformat the JSON, you can actually see it here, and it's much more legible what's going on. So this is all of the data that I've collected. Not very much data, but there it is.

So now that I've done that, I might as well take the time to collect a lot more data, knowing that I can save it. I've methodically collected a large data set. Now I'm going to press s to save it. And here's what it looks like. Almost 400 data samples. Let's see how it performs. I'm going to train the model. Try doing some inference. It works pretty well.

So now, the next thing that I want to try to do is hit stop and run the sketch again but have all of my data reappear. Let's see if I can make that happen. Now, instead of just creating the neural network, I can create the neural network and load data into it. And that's as easy as saying model.loadData("mouse-notes.json"). The only thing here is that you have to remember that I'm working in client-side JavaScript only. So if I run this right now, well, it's giving me this nice error here because it's looking for a JSON object with an array called data. But it can't find it because that JSON file doesn't exist. It doesn't exist because I downloaded it to the downloads directory. So for my p5 sketch to be able to access it, I have to manually upload it back to the p5 web editor. If I were writing my own server, maybe with Node.js, I could do something where I could save the data and have it reload back automatically.
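To make the save and load steps a little more concrete, here is a rough sketch of how they might be wired up. The helper names handleSaveKey and loadSavedData are made up for illustration and are not from the video's code. The model argument is assumed to be the object returned by ml5.neuralNetwork(), and saveData and loadData are the ml5 functions described above. Passing the model in as a parameter is just a convenience so the helpers can be tried with a stand-in object outside the browser.

```javascript
// Hypothetical helpers (not the video's code). `model` is assumed to be an
// ml5 neural network, or any object with the same saveData/loadData methods.

// Call this from p5's keyPressed(), passing the model and the pressed key.
// Returns true if the key triggered a save, false otherwise.
function handleSaveKey(model, key) {
  if (key === 's') {
    // The file name is optional; the browser downloads the data as JSON.
    model.saveData('mouse-notes');
    return true;
  }
  return false;
}

// Call this from setup(), right after creating the model.
// The JSON file has to be uploaded next to the sketch first.
function loadSavedData(model, onLoaded) {
  model.loadData('mouse-notes.json', onLoaded);
}
```

In the sketch itself, keyPressed() would call handleSaveKey(model, key), and setup() would call loadSavedData(model, dataLoaded) with whatever callback should run once the data is ready.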
But that's another example for another time. Let me do Add File and drag mouse-notes.json in here. Now, we can see that that file is part of my p5 sketch. And I should be able to run the sketch now. All right, I think the data was loaded. I don't see it, because I'm not drawing it. So this might be something I want to add in a moment: being able to draw the data that's been loaded. But in theory, there's no reason why I couldn't train the model. All right, the model is trained. And you can see, I'm not seeing the data. I'm not seeing those clusters. But it's clearly been trained based on that data.

To show you how this can be useful, one thing that I might want to do is change some property that affects the training process. So I could try it multiple times. And an obvious one might be to try learning rate. So let's say I make a smaller learning rate, 0.01, and I run the sketch again. I've got to click in here so that my key press gets activated. So I'm going to add one more piece of data and hit T. So you can see, with a small learning rate, the loss is going down very, very, very, very slowly. So in this case, having a small learning rate is not super helpful. But I can say, OK, that learning rate wasn't good. Let me try a much larger learning rate, like 0.5, run it again, hit T for train. And then you can see, with this high learning rate, that loss is going down really, really quickly. Now, I don't mean to suggest here that universally a high learning rate is better than a low learning rate. There's a lot of "it depends" here. But this shows you how you can now retrain the model, changing all of the different kinds of options, and you could look at the ml5 neural network documentation and see what other kinds of options you might want to play around with or change.

You might be finding this example a little bit tricky to follow because you can't actually see the data. So let's add that feature of, once I've loaded the data, also drawing it to the canvas.
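Going back to the learning rate experiment for a moment, here is a minimal sketch of how the two settings might be expressed. The makeOptions helper is hypothetical, and the inputs, outputs, task, and debug values are assumptions based on the example as described (x and y in, a label out, with the loss chart visible during training). Only the learningRate values of 0.01 and 0.5 come from the walkthrough above.

```javascript
// Hypothetical helper: builds the options object for ml5.neuralNetwork().
// The inputs/outputs/task/debug values are assumed from the example as
// described, not copied from the video's code.
function makeOptions(learningRate) {
  return {
    inputs: ['x', 'y'],
    outputs: ['label'],
    task: 'classification',
    debug: true, // assumed: shows the loss chart while training
    learningRate: learningRate
  };
}

// The two settings tried in the walkthrough.
const slowOptions = makeOptions(0.01);
const fastOptions = makeOptions(0.5);

// In the sketch, one of these would be passed when creating the model, e.g.
//   model = ml5.neuralNetwork(fastOptions);
```

Because the data set is now loaded from the file, switching between slowOptions and fastOptions only requires rerunning the sketch and training again, not recollecting the points.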
So in this case, having a callback for when the data is ready would be very useful. I'm going to say dataLoaded. I'm going to write my dataLoaded function. And let me just look at where I'm drawing stuff. I'm going to grab all this drawing code and bring it back up here. Let me comment it out. And let's actually look at what the data looks like in the neural network model. So I think I should be able to just console.log the model's data. So this is what the ml5 data object looks like. And for me, the important bit here is under data, under raw. This raw data is all of the actual xy coordinates with their target labels. I can make a variable called data. And I can iterate over it. So here, I'm looking at the raw data, iterating over every single element, pulling out the xs into inputs and the ys into target. And then I can add the drawing code back in. Only here, I'm saying inputs.x, inputs.y. And this is target.label.

So I believe if I run it... Aha, I figured out what my mistake is. This is confusing, because what I'm seeing here is model.data, and I want to look at model.data.data.raw. This leads me to think that maybe it would make sense to have something like a getData function. Because in many cases, you just want the raw data. ml5 is storing a lot of additional information about the data set to help it when it loads it later. But for looking at it again, it might be easier just to have a function that grabs that rather than saying data.data.raw. But take a look. Maybe by the time you're watching this, this will have been added to the ml5 library.

Let's see if this works. There we go. Ooh, so it drew all the circles, but I have the position wrong. Why do I have it wrong? Oops, I copy/pasted mouseX, mouseY in there. But I've got to get the actual xy coordinate. Always making silly mistakes. Let's run this again. And there we go. Now, it's loading that data.
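As a rough illustration of that loop over the raw data, here is a small sketch that turns the entries into plain points for drawing. The helper name pointsFromRaw and the sample values are made up. The entry shape, an xs object holding x and y plus a ys object holding label, is an assumption based on the saved file as described earlier, so it is worth checking against the actual JSON.

```javascript
// Hypothetical helper: flattens raw entries into simple points for drawing.
// Assumes each entry looks like { xs: { x, y }, ys: { label } }.
function pointsFromRaw(raw) {
  return raw.map((entry) => ({
    x: entry.xs.x,
    y: entry.xs.y,
    label: entry.ys.label
  }));
}

// Made-up sample in the assumed shape of the saved file.
const sample = {
  data: [
    { xs: { x: 100, y: 80 }, ys: { label: 'C' } },
    { xs: { x: 300, y: 220 }, ys: { label: 'D' } }
  ]
};

// Each point carries its own stored position and label.
for (const p of pointsFromRaw(sample.data)) {
  console.log(p.label, p.x, p.y);
}
```

Inside the dataLoaded callback, the same helper could be given model.data.data.raw (the path shown in the video for the ml5 version used there, which may differ in other versions), and each circle and label would then be drawn at p.x, p.y rather than mouseX, mouseY.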
And maybe actually what I want to do also, once the data is loaded, is just automatically call training. So I could do all this stuff right here. I could run the sketch. And it's immediately going to train the model. And I'm ready for inference.

Now that this is working, I might offer an exercise suggestion. And I will try to remember to, in the video's description, link to code that does this in addition to just the code that's here right now. So remind me in the comments if I haven't. But how could you take this example, which now loads the data set and immediately trains the model, and allow you to change the state back to data collection? So I trained the model. Then I want to add some more data, retrain the model, resave the data. How could you have a workflow that allows you to load the data you previously had and add new data, remove some of the data even (how would you do that?), and retrain the model? And once we are there and we've got this workflow where I can collect data, train, change parameters, train again, recollect, all that kind of stuff, I'm ready for the next step (definitely want to ring the bell for this), which is actually saving the trained model itself.
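One possible starting point for that exercise, offered as a sketch and not as the intended solution, is to track the current mode in a single state variable and decide transitions in one place. Everything here is hypothetical: the state names, the event names, and the nextState function are invented for illustration, and the events are deliberately abstract so they can be mapped to whatever keys or buttons are free (the letter keys are already used for note labels).

```javascript
// Hypothetical state machine for the exercise; none of these names come
// from the video's code.
// States: 'collection', 'training', 'prediction'.
// Events: 'collect' (go back to adding data), 'train' (start training),
//         'trainingDone' (the training-finished callback fired).
function nextState(state, event) {
  if (state === 'training') {
    // While training is running, only its completion changes the state.
    return event === 'trainingDone' ? 'prediction' : state;
  }
  if (event === 'train') return 'training';
  if (event === 'collect') return 'collection';
  return state; // ignore anything else
}

// Example walk through one full cycle.
let state = 'collection';
state = nextState(state, 'train');        // start training on the loaded data
state = nextState(state, 'trainingDone'); // ready for inference
state = nextState(state, 'collect');      // back to adding more data
```

With something like this in place, mouse clicks would add data only while the state is 'collection' and run a prediction only while it is 'prediction', which leaves room to retrain and resave after new points are added.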
ml5.js: Save Neural Network Training Data. Posted by 林宜悉 on 2020/03/27.