  • What is going on, everybody?

  • And welcome to part nine, I think, of deep learning with the Halite III AI competition (2018). All right, let's go.

  • So what I've been working on is doing machine learning with Halite III, and this is where we left off.

  • We've got some code here, and we figured out how to at least save the frame from the game, or at least the information we're curious about. We visualized it and saw that it looks like what we want, except for the flippy stuff.

  • But then we figured out what's going on there.

  • Now we're ready to actually create and save the data.

  • So we need to come up with some sort of principle for why we would or would not want to save some data.

  • Now, historically, I've done this before with an evolutionary-based model.

  • And if you just go off of, let's say, the best between your models, that allows for the model to actually get worse over time.

  • So if you train a model and that new model actually sucks for whatever reason, maybe it overfit or whatever, and then you go back and start playing some games,

  • you're still going to have winning games.

  • But the total halite collected, let's say, is going to be worse than before, and we don't want that.

  • So rather than an evolutionary-based model, as in who wins,

  • what we should do instead is go off of some sort of threshold of halite. That's what this entire game is about.

  • Anyways, it's all about the halite.

  • So that's what I'd like to do. Now, a couple of things. For example, it's a little challenging because the bot never actually knows when the game is over, because of the way Halite is set up.

  • It's kind of unfortunate, but we can know in advance how many turns there will be, given a game size.

  • So, for example, let me just zoom in a little bit here.

  • The bot can query the game map, and then we can get the size of the game map, which will always be square.

  • And then what we can do is have some sort of variable here called map_settings, which will be a dictionary.

  • Let's say the map is 32 by 32.

  • That means there are going to be 400 turns.

  • Then we can just continue this.

  • At 40 by 40, it's going to be 425.

  • We have 48, oops, not 49, 48, which is going to be 450. Wow, I can't type.

  • I hope this doesn't turn out like yesterday's videos.

  • Anyways, 56 becomes 475, and finally 64 is 500 turns.
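
  • As a rough sketch of that dictionary (the sizes and turn counts are the ones just listed; the helper name total_turns_for is made up for illustration and is not from the video):

```python
# Turn counts keyed by the width of the (always square) map.
map_settings = {32: 400, 40: 425, 48: 450, 56: 475, 64: 500}


def total_turns_for(map_size):
    """Return how many turns a game lasts on a square map of this width."""
    # Hypothetical helper: a plain dictionary lookup, nothing more.
    return map_settings[map_size]


print(total_turns_for(48))
```

  • Each step of 8 in map width adds 25 turns, so the dictionary is just a lookup table for that pattern.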

  • So we can query the game map and we can know

  • how many turns there are.

  • This way, on the last turn, we can query and ask: how much halite do we have?

  • Does it meet a certain threshold?

  • Yea or nay? If it does, we'll save the data.

  • So in theory, all of the bots in any given game could hit that threshold, and that would be good.

  • Now, there are a couple of downsides with this. One is that each map is seeded randomly.

  • If you look at replays on halite.io, some maps have an insane abundance of halite everywhere, especially close to the drop-off.

  • So these games are likely to produce higher halite numbers,

  • because ships don't have to travel as far to get the bulk of the halite. Also, it allows you to create many more ships immediately.

  • So this is somewhat problematic.

  • Maybe there should be some sort of combination of winner and halite,

  • or maybe ships produced and halite, or, even better, some sort of percentage of halite collected.

  • At least on halite.io,

  • when they visualize a game, they do show what percentage of the halite was collected.

  • I think we'll probably eventually go that route and base it on a threshold of percent collected rather than raw halite collected, because the raw number might wind up confusing a model.

  • But for now, what I'd like to do is continue on this track that we've got going.

  • So first of all, we want to set some sort of minimum threshold.

  • I'm going to say threshold and, I don't know, 1000 halite.

  • I don't think we're going to be able to collect 1000 halite,

  • because it's going to be random, but I just want to have a variable there.

  • In fact, let's call this SAVE_THRESHOLD, so it will require 1000 halite.

  • Also, just for now, I'm going to throw in another variable called TOTAL_TURNS.

  • For now we're going to limit it to 50 turns.

  • Because when you call the Halite command-line interface, there is a flag for it. I think it's turn-limit,

  • and then you can say 50.

  • Let me check real quick to make sure that's true.

  • I think it's turn... yeah.

  • It's just --turn-limit, and then you can set it, so the game size doesn't matter.

  • The turn limit is 50.

  • Now, why do we want to do that?

  • We'd like to do that because the first models are going to be totally random.

  • So the quote-unquote reason why one AI was able to collect 1000 halite, let's say,

  • and the other one was not, when both of them moved 100% randomly, is maybe one move out of all of the moves made by any one of the ships.

  • So if we allow this to go for 500 freaking turns, the results are probably going to be almost untrainable, because the little kernel of what the AI did well during 500 turns is going to be obscured by insane amounts of noise.

  • So instead, we can lower the total turn number and then maybe lower the threshold. 50 turns to collect 1000 halite is going to be quite challenging,

  • but if we could do that, that would be beneficial.

  • Also, the save threshold.

  • I can't decide what I want to do, whether I want to produce all of the ships, because you start with 5000 halite.

  • So you might even want to say the threshold is 6000 or something like that. I can't really decide what I want to do there.

  • Because we're only going to make one ship,

  • it's actually not 6000. You start with 5000,

  • you make one ship, and you'll have 4000.

  • So let's say anything above 4000 is good.

  • That means we at least dropped off something.

  • So in theory, we could say the save threshold

  • is 5000 if we want to collect 1000.

  • Anyway, we can hash that out in a little bit, but I'll leave it at 5000 for now.

  • Okay, so we've got a save threshold.

  • We've got a total number of turns.

  • Now let's add MAX_SHIPS.

  • I keep forgetting my caps lock doesn't work.

  • MAX_SHIPS equals,

  • let's go with one for now.

  • Over time we'd like more ships, but just like I was saying before with the total turns, let's just do one ship, because this model is a ship-only model.

  • So it's a very micro model.

  • We'd first like to train a single ship to do a smart thing, just to see if that can work. This will be the easiest version of this challenge.

  • Eventually, we do need the model to take the other ships into account.

  • So we do need a model that's trained on the presence of other friendly ships.

  • That way, if a friendly ship is closer to a bigger plot of halite, we don't waste our time trying to go to that spot, because our other ship is probably doing the same thing.

  • Okay, so MAX_SHIPS is one.

  • Let's go ahead and scroll down and fix that, so: less than MAX_SHIPS.

  • Okay, so now, rather than the np.save

  • here,

  • I'm going to comment that out.

  • Then come over here and let's make a new directory.

  • And I'm going to call this, not in all caps,

  • training_data.

  • So this is where all our training data is going to go

  • if we meet a certain threshold. I'm going to throw another if statement down here and say: if game.turn_number equals, what was the thing?

  • Max turns? TOTAL_TURNS?

  • That's got to be it.

  • TOTAL_TURNS. So if that's the case, we are on the final turn. Again, our AI does not get information from the game that the game is over, so we have no way of knowing that directly, but we do have a way of knowing what the final turn is.

  • And then we can kind of surmise,

  • okay, it's the end of the game.

  • So, if game.turn_number equals TOTAL_TURNS, what do we want to do?

  • Well, if me.halite_amount is greater than or equal to that SAVE_THRESHOLD that we set, then we want to save.
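
  • The end-of-game check described here can be sketched on its own, without the game engine. This is a minimal illustration, not the code from the video: should_save is a made-up helper, and the two plain integers stand in for game.turn_number and me.halite_amount.

```python
SAVE_THRESHOLD = 5000  # value at this point in the video; it gets tuned later
TOTAL_TURNS = 50       # hard-coded to match the turn limit passed to the game


def should_save(turn_number, halite_amount,
                total_turns=TOTAL_TURNS, save_threshold=SAVE_THRESHOLD):
    """Return True only on the final turn, and only if enough halite was banked."""
    # The bot is never told the game is over, so the final turn is
    # inferred by comparing the turn counter with the known total.
    on_final_turn = turn_number == total_turns
    return on_final_turn and halite_amount >= save_threshold


print(should_save(50, 5200))  # final turn, above the threshold
print(should_save(49, 9000))  # not the final turn yet
```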

  • So say np.save, whoops, and then where do we want to save it?

  • Let's make this an f-string again.

  • training_data, right?

  • training_data, yep.

  • training_data slash,

  • and then let's save it.

  • Let's name it by two variables.

  • Let's use the halite amount, because later we might set an initial threshold and then suddenly we've got, like, a freaking 20 games.

  • And then we still might want to somehow set some sort of limit there.

  • So for now, let's just say me.halite_amount.

  • So that'll be the first segment of this file name.

  • So then later we could go in via Python and os.listdir or something,

  • and then we could filter by that first segment with just a simple string split by dash and then get the zeroth element. Anyways, me.halite_amount.

  • And then let's get an int of time.time(). That should be fine, but let's go ahead and go down to the millisecond, just to be certain that it's a unique file name.

  • We don't want to overwrite some other file, though I really think the chances are pretty much none.

  • But okay, and then .npy.
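
  • Here is one way the naming scheme could look, as a sketch under assumptions: the helper names are invented for illustration, and the exact format string in the video may differ. The halite amount goes first, then a dash, then a millisecond timestamp, so the files can later be filtered by splitting on the dash.

```python
import os
import time


def training_file_name(halite_amount, directory="training_data"):
    """Build a name like training_data/4034-1541000000000.npy."""
    # Millisecond timestamp keeps names unique across games.
    millis = int(time.time() * 1000)
    return f"{directory}/{halite_amount}-{millis}.npy"


def halite_from_file_name(file_name):
    """Recover the halite amount from the first dash-separated segment."""
    return int(os.path.basename(file_name).split("-")[0])


def files_at_or_above(file_names, minimum_halite):
    """Keep only the files whose recorded halite meets the minimum."""
    return [name for name in file_names
            if halite_from_file_name(name) >= minimum_halite]


print(training_file_name(4034))
```

  • With real data, the list passed to files_at_or_above would come from os.listdir("training_data").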

  • Also, I'm not sure I brought in time.

  • I have not.

  • So we shall import time.

  • Now, later, once we don't set a turn limit,

  • you'd set TOTAL_TURNS

  • somewhere down here, like around line 39. I think you have game_map.width and game_map.height.

  • Or maybe it's size.

  • We'll figure that out.

  • So you get that number,

  • and then you would say TOTAL_TURNS equals map_settings with that number

  • as the key.

  • But in this case, we're going to hard-code 50 for now because we want the game to stop at 50.

  • Okay.

  • Bop, bop, bop. I think we're all set.

  • So let's go ahead.

  • Actually, let me change that save threshold.

  • I just feel,

  • I feel dirty asking for that.

  • Let's just say the save threshold is 4050.

  • I don't even know. I just don't know how many will collect that at random.

  • So we'll just set it at 4000. Actually, do 4001, so any drop-off we're happy with.

  • Okay.

  • And then, okay, secrets.choice

  • of direction_order, so they already move randomly.

  • Am I forgetting anything else, or are we ready?

  • Oh, I think we're ready to rumble.

  • So let's go into run_game.

  • Okay.

  • So now, running two versions

  • won't cause the same problem as before. The problem before was overwriting, because we were using the same directory, whereas in this case each game will create its own little unique file.

  • Okay, so run_game.

  • I kind of got rid of it.

  • Okay, so this is the command to run the game.

  • We could just run one game really quickly to make sure it works.

  • That would be nice.

  • And actually, after the replay directory,

  • let's just throw in a --turn-limit 50.

  • Okay, copy.

  • Save.

  • Break that and run it again.

  • Let's see what happens, and whether we hit any errors.

  • Okay.

  • Looks good.

  • Zero halite was collected.

  • Let's run it again.

  • Still zero halite.

  • This could get ugly, man.

  • No halite, really?

  • I figured with random it can't be that hard to accidentally go left, then right, you know, left, right, right.

  • Oh, wait.

  • Oh, no, no, no.

  • Um, we're hitting an error, first of all, And because of that error, we kill the player.

  • I was missing that.

  • So we passed the file name, but not the array.

  • Okay.

  • Idiot.

  • Okay.

  • What are we saving here?

  • surroundings.

  • Yeah. Stupid.

  • Run it again.

  • There we go.

  • We got some halite. Look at that. Again?

  • 4034.

  • Let's just run this a couple of times.

  • That one got zero extra halite.

  • Okay, we got, like, 3rd 28 6 40 99 40 83.

  • It looks like 100 would not be a super challenging threshold to hit.

  • We want something that's challenging something that's ah, relatively difficult.

  • Maybe 100.

  • Maybe we'll go with 100.

  • Yeah.

  • So we've definitely surpassed 100 there, but oops.

  • I kind of want to go to that second-to-last replay.

  • So not this one, but this one.

  • Let's look at that.

  • Because I'm going to bet there's a lot of halite close by, but at least while it's random, that may not be the case.

  • It might just be that we got very randomly lucky, but let's just see what happened.

  • I just resized it.

  • Are you kidding me?

  • You Oh, no.

  • What now?

  • Really?

  • What happened?

  • Oh, my gosh.

  • Uh, see, if I can do this, I think I can Can I?

  • I thought that would give me the last one.

  • No, that is the last one.

  • Okay, so at least, yes, in this case there are some close clusters.

  • But that's not why he did well.

  • He just went down, down, and collected some serious halite there.

  • Oh, but he already had it at that point. Oh, ours is the green one.

  • Oops.

  • I was looking at the red guy.

  • So he just goes, collects, and happens to just come up.

  • And then I think he must make another deposit.

  • He does.

  • Yay for random.

  • Okay, so yeah, we wish he would have collected a little more halite, but hey, this is random.

  • And in only 50 steps, that's a pretty good,

  • I mean, that's a pretty good, better-than-random thing we'd like to mimic.

  • So cool.

  • So that's that is what we want.

  • So I'm going to say 4100 will be our threshold, and then save that.

  • And then, rather than us sitting here running it manually, we could do, like, a while loop.

  • I think you can do,

  • like, what is it,

  • while true?

  • Maybe with a capital T here.

  • while true; do,

  • and then you do the thing, and then you do done.

  • I think that works.

  • But I'm not going to do that, especially for our sad Linux users.

  • Instead, hey, we could make a Python script.

  • So let's copy, paste. Let's call this run_game, and we will open it with Sublime Text and get rid of everything.

  • And then what we want to do is simply run a command.

  • Now, let's see, we could probably get away with os.system.

  • os.system should run the command.

  • So let's do import os.

  • We will import secrets.

  • Let's also bring in,

  • we'll set a MAX_TURNS, just to set it for now. We'll say 50. And then we want to have both two-player and four-player games.

  • Also, we do want to have varying map sizes.

  • So actually, let's go ahead and copy map_settings over here and paste.

  • And then we'll just set a while True loop here.

  • And then, first of all, let's say map_size equals secrets.choice(map_settings).

  • Print map_size, just to make sure

  • that's going to do what we intended.

  • No: KeyError: 2.

  • Why?

  • I don't know. It might be keys or get_keys or something.

  • Really?

  • I feel like that should be an acceptable option.

  • What is it?

  • Is it get_keys? I'm trying to remember. .get_keys?

  • Maybe. No, it has no such attribute, so it may be keys.

  • Oh bro, why is it so difficult?

  • "Get a list of keys from dictionary py."

  • Thank you, sir.

  • Yeah, .keys().

  • Okay, so secrets must be trying to index it.

  • So probably what we could say then is list of the keys.

  • Okay, look at us, hacker man.
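
  • The KeyError makes sense once you know that secrets.choice picks a random position and then indexes the sequence with it. On a dictionary, map_settings[2] is a key lookup, not "the item at position 2". A small sketch of the failure and the fix:

```python
import secrets

map_settings = {32: 400, 40: 425, 48: 450, 56: 475, 64: 500}

# choice() draws an index in range(len(map_settings)), i.e. 0..4, and then
# evaluates map_settings[index]. None of 0..4 is a key, so this raises KeyError.
try:
    secrets.choice(map_settings)
    failed = False
except KeyError as error:
    failed = True
    print(f"KeyError: {error}")

# Converting the keys to a list gives choice() a real sequence to index.
map_size = secrets.choice(list(map_settings.keys()))
print(map_size, map_settings[map_size])
```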

  • Okay, so that's that.

  • That's out of the way.

  • Hey guys, check out my Python basics series. Now that that's out of the way,

  • we've got our map size.

  • We've got our max turns.

  • So the next thing, basically, is to call this with the turn limit.

  • So let's copy this. Let's say commands.

  • We'll make it a list, and we basically have two versions of this command.

  • And in fact, we need to,

  • we'll get away with just putting it in single quotes.

  • So that's one of the options.

  • The other option would be a four-player game.

  • So copy,

  • come down here, paste.

  • And then we want, why did that happen?

  • We basically want this doubled.

  • So two more: four players. Boom.

  • Done.

  • Okay, so now the turn limit.

  • Oh, this needs to be an f-string.

  • Okay, then what we want to say is the turn limit shall be MAX_TURNS.

  • I really should have done this before we copied it.

  • It would have made more sense.

  • 40 becomes, what is our map size? map_size.

  • So then I just copy, paste, paste, paste.

  • And then we run os.system.

  • Oh wait, no, no, no.

  • Actually, yeah, we could do this.

  • Well, say command.

  • So the command that we want to run is commands dot, or no, secrets.choice(commands).

  • And then let's just run os.system(command).

  • I'm trying to recall if os.system

  • is going to work everywhere.

  • I think os.system will work even on Linux and Mac and all that too, right?

  • I'm pretty sure that that's, like, a, yeah.

  • So we should get away with os.system on any operating system. If not, check out subprocess, which would probably be the next best thing, but this should be fine.
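
  • Putting those pieces together, a run_game.py along these lines might look like the sketch below. Treat the details as assumptions: the executable name (halite.exe), the bot command (python MyBot.py), and the helper functions are placeholders for illustration, and the exact command line in the video may carry other flags. The --replay-directory and --turn-limit flags are the ones mentioned above; --width and --height set the map size.

```python
import os
import secrets

HALITE = "halite.exe"    # assumption: name/path of your Halite executable
BOT = "python MyBot.py"  # hypothetical bot command; use your own bot file
MAX_TURNS = 50
map_settings = {32: 400, 40: 425, 48: 450, 56: 475, 64: 500}


def build_command(map_size, players, turn_limit=MAX_TURNS):
    """Build one game command for a square map and a number of players."""
    # Each player is the same bot command, wrapped in double quotes.
    bots = " ".join(f'"{BOT}"' for _ in range(players))
    return (
        f"{HALITE} --replay-directory replays/ --turn-limit {turn_limit} "
        f"--width {map_size} --height {map_size} {bots}"
    )


def random_command():
    """Pick a random map size, then a random two- or four-player command."""
    map_size = secrets.choice(list(map_settings.keys()))
    commands = [build_command(map_size, 2), build_command(map_size, 4)]
    return secrets.choice(commands)


def run_games(count, runner=os.system):
    """Run `count` games; `runner` is swappable so the loop can be dry-run."""
    for i in range(count):
        print(i)
        runner(random_command())


print(build_command(32, 2))
```

  • Calling run_games(10) would launch ten games through os.system; passing runner=print instead only shows the commands, which is a cheap way to check them first.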

  • So it's os.system(command).

  • And then let's just do, like, 10.

  • So for

  • i in range(10).

  • Let's print i. I just want to make sure it's working.

  • So let's come into here and delete all of our replays, be gone, and then I want to go into training.

  • It's not game_play.

  • Actually, we could just delete game_play.

  • We're not using that anymore.

  • training_data.

  • We'll delete all that, and let's begin.

  • So let's say python run_game.py.

  • That's a lot of output.

  • It looks like we're running a game, though, for sure.

  • Do we get any training data?

  • Wow.

  • 43.

  • So what?

  • Wow.

  • We go.

  • Oh, my gosh.

  • That's a lot of halite, bro.

  • Okay.

  • Um, all right, that appears to work.

  • So rather than for i in range(10), we will say while True.

  • Uh, we don't need this anymore.

  • Okay, close these and this.

  • So then what we're gonna do?

  • Let's close this.

  • Uh, oops.

  • We need to back up one.

  • Mmm.

  • Mmm.

  • Mmm.

  • python run_game.py.

  • Come on.

  • I am not in the mood today, all right?

  • And we're off, Okay.

  • Yes, sir.

  • So now you could be running this one, and then we could pop up into here and maybe run another one: python run_game.py.

  • Okay?

  • And now we are making games.

  • We might want to, like, shut up the print statements, or the output.

  • I believe the output is actually, is it standard...

  • I think it's stderr that's coming out, but I can't remember.

  • But again, you could probably use subprocess and shut those up.
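
  • For the "shut those up" idea, one option (not shown in the video, so treat it as a suggestion) is subprocess.run with both output streams sent to DEVNULL. Since it's unclear here whether the game writes to stdout or stderr, this sketch silences both; run_quietly is a made-up helper name.

```python
import subprocess


def run_quietly(command):
    """Run a shell command with stdout and stderr discarded; return its exit code."""
    completed = subprocess.run(
        command,
        shell=True,                 # same string-style command os.system takes
        stdout=subprocess.DEVNULL,  # drop normal output
        stderr=subprocess.DEVNULL,  # drop error/log output too
    )
    return completed.returncode


print(run_quietly("exit 0"))
```

  • It is a drop-in replacement for os.system(command) in the loop, at the cost of hiding genuine error messages as well.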

  • But for now, wow, we've got a lot of training data: 4400, almost 4500.

  • So keep in mind,

  • we already know each of these is 50 steps, or 50 turns.

  • We might even decrease the number of turns.

  • But if we go to the calculator... Hello?

  • You don't have a calculator.

  • Okay.

  • Whatever.

  • Each one is 50, though.

  • So let's say the target is generally 100,000 samples, and then we divide by 50. That would mean our target should be approximately 2000 games to give us 100,000 steps of samples.
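
  • Since the calculator wasn't cooperating, here is the same arithmetic in Python, assuming one ship and therefore one sample per turn of each saved game:

```python
target_samples = 100_000  # rough target for the number of training samples
turns_per_game = 50       # the turn limit used for every game

# One ship makes one move per turn, so each saved game yields 50 samples.
games_needed = target_samples // turns_per_game
print(games_needed)  # 2000
```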

  • So at some point, I would just leave this running overnight, all day,

  • whatever.

  • Run more games simultaneously, however many you can do.

  • I would, but this is kind of a CPU hog. Anyway, run a bunch of these.

  • Gather a bunch of training samples, about 2000.

  • Hopefully we can get about 2000 samples

  • above,

  • I don't know, 4200 or 4300.

  • That'd be pretty nice.

  • Look at this guy.

  • Almost 5000, and he's moving randomly.

  • It's one ship in 50 turns.

  • 1000 halite.

  • That is awesome.

  • Okay, anyway, I would let these run for a while. I think I'm going to stop it here and pick up in the next tutorial, where we will take the best of the best training data.

  • And then basically we begin our infinitely iterative loop of training a model on the top halite threshold, or plausibly some other rule

  • we set, like a percentage.

  • Percentage of halite collected is probably the superior route.

  • So maybe that will be the better way to go. Anyway,

  • set some sort of threshold to save by, and over time slowly increment that threshold. Train a model on data that met those criteria.

  • Increment the threshold.

  • Play a bunch of games, train a model, increment the threshold, play a bunch of games.

  • Repeat that process.

  • Hopefully you understand.

  • Anyway, because I don't. I don't know what I'm doing.

  • That's it for now.

  • Questions, comments, concerns,

  • whatever,

  • feel free to leave them below.

  • Also, there's the Discord channel: discord.gg/sentdex

  • will get you there.

  • Otherwise, I will see you guys later

  • in another video, where we'll train an amazing model built completely off of random movements.

  • See you there. Actually, hold on.

  • It just clicked.

  • I want to make absolutely certain, before I leave this, of our size.

  • How big were we working?

  • Where is it?

  • 16.

  • Because I want to be able to use, like, MobileNet,

  • and I think with Inception v3 and those things, the minimum size is 32, so we'd have to pad and stuff, and I don't want to do that.

  • Okay, also,

  • sorry.

  • I know I said we were leaving.

  • I feel bad for the people who click off at the end of videos,

  • and then they come back, and there are new things that they've never seen.

  • Let's go.

  • What should we call this?

  • Rad...

  • I hate to call it radius.

  • It's not really a radius.

  • I don't know.

  • SIGHT_DISTANCE equals 16.

  • Copy that.

  • Come down here, paste.

  • Good, save.

  • I kind of want to confirm that that worked.

  • Break, break, run_game.

  • Cool.

  • Also, just out of curiosity, how many training files do we have so far?

  • 197.

  • So it definitely should not be challenging to meet the 4000 threshold.

  • I think that was our threshold, right?

  • 4000.

  • Oh no,

  • 4100.

  • So definitely not a challenge to meet that.

  • It only takes, like, I don't know, 20 minutes,

  • probably.

  • So maybe 4200, 4300, 4500.

  • Somewhere around there, we will set a limit.

  • Anyway, yep.

  • Clearly working.

  • Cool.

  • All right, I will see you guys in the next video.

  • Just kidding. We really need to stop meeting like this.

  • But of course, that is not gonna work.

  • So I was looking over all this stuff, and I realized I've made an egregious mistake. Actually, we've made a couple of mistakes.

  • One is that we're not saving the entire game's worth of data.

  • We're only saving the final surroundings image.

  • Also, we are not saving the move associated with it.

  • So we need to do two things: we need to save the entire game, and every move associated with every frame.

  • So, as sucky as that was, forget that.

  • Let me just add that in really quickly.

  • So basically, what we're going to do is, I thought I had that. I did.

  • Okay, so we'll take this direction_order, which actually doesn't need to be where it is.

  • So I'm going to put it up here. There's no need to have it defined every time.

  • It's the same.

  • Then we're going to have a training_data, and this is what's going to contain every single frame and every single move for the entire game.

  • And if we meet a certain threshold, that's what we want to actually save.

  • So then we're going to come all the way down to where we actually make our move.

  • So in this case, the append of ship.move with secrets.

  • Okay, so what we really want to say is something like choice equals secrets.choice,

  • and then, let's say, range(len(direction_order)).

  • Basically, it's going to be a 0, 1, 2, 3, or 4: five options for this prediction.

  • So that's going to be the choice that we make.

  • I'm trying to think. I don't think that's going to cause trouble later.

  • I'm a little curious, due to the way that we're going to actually train this model,

  • if it would make a difference.

  • I don't think it will, but we can change it later.

  • But anyways, with TFLearn, you can train with scalars rather than one-hot vectors.

  • You can pass those in. I think behind the scenes it just does a quick operation to convert it to a one-hot for you.

  • I don't think it actually matters that much, but anyways, that's the choice that we want to make.

  • Then what we want to say is, ah, frozen.

  • Okay.

  • Thanks, virtual machine.

  • Hopefully it'll come back.

  • Basically, what we want to do is append the surroundings data and that specific move to training_data.

  • And then at the very end, that's when we want to

  • save it.

  • Hello?

  • All right.

  • As soon as I stop the recording, it comes back.

  • Awesome.

  • Okay, training_data.

  • Then we want to append surroundings and the choice that we make. And hey, let's make it PEP 8.

  • Okay, so in this case here, we actually want to say direction_order[choice] instead.

  • So move

  • direction_order[choice].

  • And then, at the very end, down here, if we want to save it, what we want to save is training_data rather than just the surroundings.
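
  • The fix can be sketched without the game engine. In this illustration the direction strings are stand-ins for the hlt Direction values (their order here is an assumption), surroundings is whatever frame data the bot built for the turn, and choose_and_record is a made-up helper, not a function from the video.

```python
import secrets

# Stand-ins for the real Direction constants; defined once, outside the game loop.
direction_order = ["north", "south", "east", "west", "still"]

# One [surroundings, choice] pair per turn, for the entire game.
training_data = []


def choose_and_record(surroundings):
    """Pick a random move index, log it with the frame, and return the move."""
    # The index (0..4) is what gets stored as the training target.
    choice = secrets.choice(range(len(direction_order)))
    training_data.append([surroundings, choice])
    # The actual move handed to the ship is looked up from the index.
    return direction_order[choice]


move = choose_and_record([[0, 0], [0, 1]])  # toy frame for demonstration
print(move, len(training_data))
```

  • At the end of the game, training_data (not just the last surroundings) is what gets saved if the threshold is met.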

  • Okay, let's save that.

  • Let's come back up into one of these.

  • Also, I just renamed the old directory. This is now an empty directory.

  • Rerun it.

  • Hopefully everything works immediately.

  • At least we've got a game.

  • Cool.

  • And now, let me just copy this and come back over here.

  • We go into testing grounds.

  • Sublime Text: import numpy as np.

  • n equals np.load.

  • And then let's print n[0]. Oops.

  • What we want is training_data slash the file name. Okay, next.

  • There we go.

  • Okay, so now we have the visual.

  • Now we have the choice.

  • Okay, now this is the data that we want to train against.

  • Unfortunately, I'd actually built quite a large,

  • I had it running for, like, six hours.

  • I came back and then it hit me.

  • I remembered that I probably saved just the surroundings rather than actually building the training data with targets and all of it. It was quite the heart-sinking moment.

  • Okay, anyways, now we're going to build the training data.

  • It is going to, hopefully, not take too long.

  • Wow, look at this.

  • Almost got a 5k game.

  • Incredible.
