Yeah, if those weights are configured correctly, and we'll get to that, one of those neurons will have a strong output on certain images and tell you this is an airplane, and the other one will have a strong output that will tell you, you know, this is not an airplane.
Okay, there is one more thing in this.
There are activation functions.
So a little bit more detail about what a neuron does, for those who like it.
You have the full transfer function right here.
So you see, it's a weighted sum, plus something called a bias.
That's just an additional degree of freedom.
And then you feed this through an activation function.
And in neural networks, that is always a nonlinear function.
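As a minimal sketch of the neuron just described (a weighted sum, plus a bias, fed through a nonlinear activation function), assuming plain Python lists rather than TensorFlow tensors. The function names and the input values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """A common nonlinear activation that squashes z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Another common nonlinear activation: zero for negative inputs."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs, plus a bias, fed through an activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum + bias)


# Hypothetical values, not taken from the talk.
pixels = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

out_sigmoid = neuron(pixels, weights, bias, sigmoid)
out_relu = neuron(pixels, weights, bias, relu)
```

The same weighted sum gives different outputs depending on the activation, which is why the choice of activation function is part of the network design.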
In a format that looks like what the network is producing, it's the simplest encoding you can think of.
It's called one-hot encoding, and basically it's a bunch of zeros with just one 1 in the middle, at the index of the class you want. So here, to represent the six, I have a vector of zeros with the one in the sixth position, and now those vectors are very similar.
I can compute the distance between them, and the people who studied this in classifiers tell us: don't choose any distance, use the cross-entropy distance.
Why, I don't know.
They're smarter than me.
I just follow, and the cross-entropy distance is computed like this.
So you multiply, element by element, the elements of the vector, the known answer from the top, by the logarithms of the probabilities you got from your neural network, and you then sum that up across the vector. This is the distance between what the network has predicted and the correct answer.
That's what you want.
If you want to train a neural network, you need that. It's called an error function or loss function.
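The one-hot encoding and the cross-entropy distance can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code from the talk: the ten classes are assumed to be the digits 0 to 9 (so the six sits at index 6), the probability vectors are made up, and the conventional leading minus sign is included so that the distance comes out positive.

```python
import math


def one_hot(index, num_classes):
    """A vector of zeros with a single 1 at the index of the class."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


def cross_entropy(label, probabilities):
    """Minus the sum of label[i] * log(probabilities[i]).

    Entries where the label is 0 contribute nothing, so they are skipped,
    which also avoids evaluating log(0).
    """
    return -sum(y * math.log(p) for y, p in zip(label, probabilities) if y > 0.0)


label = one_hot(6, 10)  # assumption: classes are the digits 0..9

# Hypothetical network outputs: one confident, one undecided.
confident = [0.01] * 10
confident[6] = 0.91
undecided = [0.1] * 10

loss_confident = cross_entropy(label, confident)
loss_undecided = cross_entropy(label, undecided)
```

A prediction that puts most of its probability on the correct class gets a small distance, and an undecided prediction gets a larger one.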
Once I have that, I can give it to TensorFlow, take an optimizer, ask it to optimize this loss, and the magic will happen.
So what is this magic?
TensorFlow will take this error function, differentiate it relative to all the weights and all the biases, all the trainable variables in the system, and that gives it something that is mathematically called a gradient.
And by following this gradient, it can figure out how to adjust the weights and biases in the neural network in a way that makes this error smaller.
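To make "following the gradient" concrete, here is a hand-written sketch on the smallest possible model: one weight, one bias, one training example. It is an assumption-laden toy. It uses a squared error instead of cross-entropy to keep the derivative short, and the gradient is written out by hand, whereas TensorFlow computes it automatically.

```python
def loss(w, b, x, y):
    """Squared error of the one-weight model w * x + b against the target y."""
    return (w * x + b - y) ** 2


def gradient(w, b, x, y):
    """Partial derivatives of the loss with respect to w and b."""
    error = w * x + b - y
    return 2.0 * error * x, 2.0 * error


# Hypothetical starting point, training example, and step size.
w, b = 0.0, 0.0
x, y = 2.0, 1.0
learning_rate = 0.05

initial_loss = loss(w, b, x, y)
for _ in range(50):
    dw, db = gradient(w, b, x, y)
    # Step against the gradient: the direction that makes the error smaller.
    w -= learning_rate * dw
    b -= learning_rate * db
final_loss = loss(w, b, x, y)
```

Each step moves the weight and the bias a little in the direction that reduces the loss, so the error shrinks over the iterations.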
Instead, we're going to look at the sampled moves as if they're constants, and then we play many games across many, many moves to get a lot of data, treating all of these sampled moves as if they're constant labels, and only differentiating the probabilities that are output by the model, in blue here on the screen. And those probabilities directly depend on the model's weights and biases.
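A rough sketch of a loss built this way, in plain Python: the sampled moves are used only as fixed indices (constant labels), and the only quantities that depend on the model are the probabilities. Weighting each term by a reward is an assumption here, in the style of a standard policy-gradient loss. The function name and all numbers are hypothetical.

```python
import math


def policy_gradient_loss(probabilities, sampled_moves, rewards):
    """Sum over moves of -reward * log(probability of the sampled move).

    sampled_moves are treated as constants: they only select which
    probability enters the loss. In a real framework, only the
    probabilities would be differentiated.
    """
    total = 0.0
    for probs, move, reward in zip(probabilities, sampled_moves, rewards):
        total += -reward * math.log(probs[move])
    return total


# Two moves from a hypothetical game with two possible actions.
probabilities = [[0.7, 0.3], [0.4, 0.6]]  # model outputs per move
sampled_moves = [0, 1]                    # what was actually played
rewards = [1.0, -1.0]                     # first move helped, second hurt

loss_value = policy_gradient_loss(probabilities, sampled_moves, rewards)

# Making the rewarded move more likely lowers the loss.
improved = policy_gradient_loss([[0.9, 0.1], [0.4, 0.6]], sampled_moves, rewards)
```

Minimizing such a loss pushes up the probability of moves that were followed by a positive reward and pushes down the probability of moves that were followed by a negative one.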
For those not familiar with TensorFlow, TensorFlow builds a node, a graph of operations, in memory.
So that's why, out of tf.multinomial, we get an operation, and then we will have an additional step to run it and actually get those predictions out.
And placeholders are the data you need to put in when you actually run a node to be able to get a numerical result.
It cares a lot about where the opponent's paddle is and where it is moving, and cares about the ball's trajectory across the game board and, quite interestingly, also cares a whole lot about where its own paddle is on the right because, like Martin pointed out before, at the beginning of learning the model doesn't even know which paddle it is playing.
So what you're saying is that reinforcement learning can solve many more problems than just Pong, and it's a way of getting around some non-differentiable step that you find in your problem.
You should be able.
That's great.
That's great.
So where is this going?
We want to show you a couple of things from the lab, because this has had mostly lab applications, and then one last thing.
What is this?
This is very interesting.
Everyone.
What we're witnessing here is a human expert in pancake flipping, trying to teach a robotic arm to do the same thing.
And there's a model behind this robotic arm whose output controls the joints' movement, or the motors in it: what angle, what speed to move toward.
And the goal of this is to flip a pancake in the pan. This is not just any regular pancake.
The neural network is predicting the power to send to the simulated muscles and joints of these models, and the reward is basically a positive reward whenever you manage to move forward, and a negative reward when you either move backward, or when you fall through a hole, or when you just crumble to the ground. The rest is just reinforcement learning, as we have shown you today.
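The reward described here can be written down as a small function. This is only a sketch: the function name, the parameters, and the reward magnitudes of plus or minus one are all hypothetical, since the talk gives only the signs of the rewards.

```python
def step_reward(previous_x, current_x, fell_in_hole=False, collapsed=False):
    """Reward for one simulation step of a walking model.

    Positive for moving forward, negative for moving backward,
    falling through a hole, or crumbling to the ground.
    The magnitudes are placeholders, not values from the talk.
    """
    if fell_in_hole or collapsed:
        return -1.0
    if current_x > previous_x:
        return 1.0
    if current_x < previous_x:
        return -1.0
    return 0.0  # standing still: neither rewarded nor punished here


forward = step_reward(0.0, 0.5)
backward = step_reward(0.5, 0.2)
fallen = step_reward(0.0, 0.5, fell_in_hole=True)
```

Nothing in this function says how to walk or jump. It only scores the outcome, and the behaviors have to emerge from training against it.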
So all of these behaviors are emergent behaviors.
Nobody taught those models.
And look, you have some wonderful emergent behaviors.
It's coming in a couple of seconds.
Look at this jump after this.
Those are nice jumps, but there is a much nicer one.
In a second, you will see a jump, a really athletic jump, with the model swinging its arms to get momentum, then lifting one leg, cushioning, right here.
So let's say we build one that produces sequences of characters, and we structure it so that those characters actually represent a neural network.
Yeah, you know, a neural network is a sequence of layers, so you can figure out a syntax for saying this is my first layer.
This is how big it is, and blah blah blah.
So why not produce a sequence of characters that represents a neural net?
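One way to picture "a sequence of characters that represents a neural net" is a tiny made-up syntax and a parser for it. The syntax below ("kind:size" entries separated by semicolons) is purely hypothetical and is not the encoding used in the published work. It only illustrates that a string can describe a stack of layers.

```python
def parse_architecture(text):
    """Turn a string like 'dense:8;dense:2' into a list of (kind, size)."""
    layers = []
    for token in text.split(";"):
        kind, size = token.split(":")
        layers.append((kind, int(size)))
    return layers


def count_dense_parameters(layers, input_size):
    """Number of weights and biases in a stack of fully connected layers."""
    total = 0
    previous = input_size
    for kind, size in layers:
        if kind != "dense":
            raise ValueError("this sketch only understands dense layers")
        total += previous * size + size  # weights plus one bias per neuron
        previous = size
    return total


# A hypothetical generated string: first layer with 8 neurons, then 2 outputs.
architecture = parse_architecture("dense:8;dense:2")
parameter_count = count_dense_parameters(architecture, input_size=4)
```

A generator network that emits such strings is, in effect, proposing network shapes that can then be built and trained.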
What if then we train this neural network on some problem we care about, let's say, spotting airplanes in pictures? So this will train to a given accuracy.
What if now we take this accuracy and make it a reward in a reinforcement learning algorithm?
So this accuracy becomes a reward, and we apply reinforcement learning, which allows us to modify the weights and biases in our original neural network to produce a better neural network architecture.
It's not just tuning the parameters. We are changing the shape of the network so that it works better for our problem, the problem we care about.
We get a neural network that is generating a neural network for our specific problem.
It's called neural architecture search, and we actually published a paper on this, and I find this a very nice application of a technology designed initially to beat Pong.
So, Martin, you're saying we have neural networks that are used to build other neural networks.
I really don't want to run the training on my laptop, and I probably could. So I used Cloud Machine Learning Engine for the training. The model that was playing the game that we saw before in the live demo took maybe about one day of training.
We just released this code to GitHub, so you have the URL.
If you want to train a Pong agent yourself, go and do it.
You can take a picture there, and if you want the URL, it is still on the screen.
If you want to learn machine learning, I'm not going to say it's easy, but I'm not going to say it's impossible, either.
We have this series, a relatively short series of videos and code samples and codelabs called TensorFlow without a PhD, that is designed to give you the keys to the machine learning kingdom.
So you go through those videos, of which this talk is one, and it gives you all the vocabulary and all the concepts, and we are trying to explain the concepts in a language that developers understand, because we are developers.
Thank you very much.
Hello.