And I've seen it do some pretty cool maneuvers and all that.
And so, I mean, it's something, but it's not that great of a driver yet.
But I don't know what we should have expected, to be honest, especially considering this is a DQN.
Daniel claims to have a model that's pretty good that also just has constant loss explosions. He says it works well. We'll see. If we find that we have a model that actually works well despite having loss explosions, I'll be sure to let you know.
But for the most part, we wanted to move away from Xception.
And at least the reason why we assumed that was probably a problem is mainly that Xception has, in terms of the amount of parameters that you can actually train and tweak as a neural network, basically 23 million, whereas a 64x3 convnet has three million.
So it's just a much simpler problem for the agent, for the neural network, to just kind of try to figure out.
And again, with the DQN, yes, we're using a neural network, but it's kind of like the neural network is more so there to help us generalize Q-values, right?
That's all it has to do.
It's not really a super complex task that the neural network has to do. So anyway, it seems like maybe Xception was overkill.
But then it seems like a 64x3 was maybe not large enough of a model, because our accuracy just never became very good.
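To make the "generalize Q-values" point a bit more concrete, here's a rough sketch of the usual DQN training target for a single transition. This is the textbook update, not necessarily the exact code from this series. The function name `dqn_target_qs`, the three-action example, and the `gamma` value are all assumptions for illustration.

```python
def dqn_target_qs(current_qs, next_qs, action, reward, done, gamma=0.99):
    """Build the Q-value vector the network is trained to predict for one step.

    current_qs: the network's Q-values for the current state (one per action)
    next_qs:    the (target) network's Q-values for the next state
    action:     index of the action that was actually taken
    gamma:      discount factor (0.99 here is an assumed value)
    """
    if done:
        # Episode ended (a crash, say), so there is no future reward to add.
        target = reward
    else:
        # Bellman target: immediate reward plus discounted best future Q-value.
        target = reward + gamma * max(next_qs)

    updated = list(current_qs)   # copy so the caller's list is untouched
    updated[action] = target     # only the taken action's Q-value changes
    return updated


if __name__ == "__main__":
    # Hypothetical three-action agent (left, straight, right).
    print(dqn_target_qs([0.1, 0.2, 0.3], [1.0, 2.0, 0.5],
                        action=1, reward=1.0, done=False, gamma=0.9))
```

The network only has to learn to map an image to a handful of numbers like these, which is part of why a smaller model seemed worth trying.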
So if it approached 10 seconds, that's what I would do.
Unfortunately, it basically stayed stagnant around like 6 to 7.
So I didn't really see too much reason to raise episode time too much.
But eventually I put it to 12, just because I felt like, let's make it at least half. So, continuing on, you can see the minimum time. That really didn't change at all.
I think it's probably because the car would get dropped in, maybe dropped on top of another car or something like that.
Like, just being put in a bad situation.
Uh, epsilon over time.
I just tried to recycle epsilon just to see if the model would learn anything new over time. In the most recent changes, I bumped it up just to see, can we get anything else out of the model? And then here you can see loss, too, again.
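For anyone wondering what "recycling" epsilon looks like in practice, here's a minimal sketch: epsilon decays toward a floor as usual, and at chosen episodes it gets bumped back up so the agent explores again. The function name `epsilon_schedule` and every number in it (decay rate, floor, reset point, reset value) are assumptions for illustration, not the values used in this run.

```python
def epsilon_schedule(episodes, start=1.0, decay=0.98, floor=0.05,
                     reset_at=(), reset_to=0.5):
    """Return the epsilon used at each episode.

    Epsilon is multiplied by `decay` after every episode and never drops
    below `floor`. At any episode listed in `reset_at`, epsilon is raised
    back up to at least `reset_to` (the "recycle" step).
    """
    eps = start
    values = []
    for episode in range(episodes):
        if episode in reset_at:
            # Recycle: bump exploration back up for a while.
            eps = max(eps, reset_to)
        values.append(eps)
        # Standard multiplicative decay, clamped at the floor.
        eps = max(floor, eps * decay)
    return values


if __name__ == "__main__":
    schedule = epsilon_schedule(300, reset_at={200})
    for episode in (0, 100, 199, 200, 250):
        print(episode, round(schedule[episode], 3))
```

Plotting a schedule like this gives the sawtooth shape you'd expect on an epsilon-over-time graph after a bump.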
Connecting to Carla at a certain point costs more than running another Carla instance, which is bonkers to me. And again, like I said in video number one, Carla is really impressive.
I think probably it's better to play with the reward function, or maybe approach it with a different reinforcement learning algorithm altogether.
Like I said, DQN is, I mean, it's OK.
It can learn some really awesome things.
It just takes a very long time, which is very challenging when what you're running is Carla, which already, like, most people have a hard time running one Carla instance, and we need to do just tons and tons and tons.
So anyway, cool.
A quick shout-out to my most recent brand new channel members.