So today we're continuing a little bit of our subject from last stream.
Taking a step back for, like, a moment from images, we're heading to neural networks, kind of the biggest buzzword of ML at the moment.
Going a little bit lower level than images, we're gonna use a little bit easier dataset. When I built kind of a sample version of this last weekend it took, I think, like, an hour to run or so. Not great.
So we're going to use a very simple dataset and figure out just what a neural network is.
Actually, I think in that stream... yeah, the first stream we used TensorFlow, Google's library, which is even higher level. Yeah, today we're gonna be using just NumPy matrices, with the goal of eventually continuing, because this is kind of like a new sub-series.
Yeah, we're doing a sort of, like, building-a-neural-network, talking-about-machine-learning series, with the eventual goal of being able to generate images as part of streams, which we talked about last week.
So even, like, deepfakes is kind of a similar concept, where it's like you can't necessarily tell that it's not a real person, like presidential videos where they're, like, dancing a little wacky.
The only thing that I think we'll probably explore, we might even start with next week because it's a little easier to build from scratch, is style transfer networks.
So let's say I give you, like, a Van Gogh painting, and I then hand you a modern-day picture of something, right?
Once we have kind of like our basic understanding of, like, what's a CNN?
What is convolution?
There's a lot of steps along the way.
It's kind of overwhelming, but I think a lot of it is cool.
So if you're ever confused, luckily it's on video.
See, if anybody wants to watch, it's on YouTube.
Our first part, where we built the k-means classifier, because we're gonna eventually need to classify images, presumably to be able to generate them, comparing them against those clusters.
Exactly.
Um, but you can check that; that's on YouTube.
That should be about it at the time of this recording.
Actually true, Connie says hi.
Oh, nice job on the podcast? Thank you very much. We are releasing a podcast very soon, probably today.
And if you're familiar with this at all, hearing that is a little whack, but that's kind of the intuition: I have some weight, which is just a number that says, hey, any time I'm given an input, I'm gonna apply this number to that input, and I have some bias.
So I know somehow that I am off linearly from whatever the actual answer is.
If you're familiar with, like, y = mx + b, the bias would be like the b term, and then m might be your weight.
Now, that's not quite how this works.
It's all done through matrix math, so things get generalized a little bit more broadly.
But that's kind of the intuition: every time a piece of our input feature is given to a neuron, the neuron that gets it says, I'm gonna multiply you by a weight, add a bias, and I'm gonna pass you on to the next layer.
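As a minimal sketch of that intuition (the function name and numbers here are illustrative, not the stream's actual code), a single neuron is just the y = mx + b idea applied to an input:

```python
# A single neuron: "multiply you by a weight, add a bias."
def neuron(x, weight, bias):
    return weight * x + bias

# Hypothetical numbers, just to show the shape of the idea.
print(neuron(3.0, weight=2.0, bias=0.5))  # 6.5
```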
We're gonna build a single-layer neural network, which some of you may go:
that's silly.
And it is, to a degree. A couple... maybe, like, 40 years ago, not so silly, but now it is kind of silly.
We have GPUs.
We can calculate like hundreds of neurons in one layer at a time and things like that.
These are often referred to as dense layers, because... and there is kind of a cool Facebook comment somewhere, uh, I think from a software engineer, probably an ML engineer, who said that, like, there is no such thing as a truly connected... like a full, fully connected layer.
If you're familiar with a whole lot of matrix math, then if I take something that is, we'll say, four by two, and I multiply that by something that is two by one, then each kind of stacked... I don't know why I'm illustrating it like this, but I think that is a very convenient way to imagine how this works: each element of the input is then going to go down that matrix and be summed up into the corresponding column-and-row pair.
If what I just said didn't make any sense, let me find an example to illustrate that.
Ah, it's... it's a little tricky to see. I would like an example out here.
Yeah, so I'm a very visual person.
This is not necessary, but here's the biggest image I could have found that might help.
Essentially, you're going to take a column from the thing being multiplied and multiply each corresponding entry by the row of the thing multiplying.
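In NumPy, that row-times-column picture looks something like this (shapes chosen to match the four-by-two example above; the values are arbitrary):

```python
import numpy as np

A = np.arange(8).reshape(4, 2)  # the (4, 2) input being multiplied
w = np.array([[1.0], [2.0]])    # a (2, 1) weight column

# Each output entry sums the products of a row of A with the column of w.
out = A @ w
print(out.shape)  # (4, 1)
print(out)        # [[2.], [8.], [14.], [20.]]
```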
But essentially, this should have a normal distribution is what we were saying.
But if we switch this back to a uniform distribution, which I'm going to start with... but we can see how swapping these out might change things, although in this example it might not be super easy... the expected value of a uniform distribution from 0 to 1 should be about 0.5.
And we see that here. The standard deviation, I don't know off the top of my head, but it's around 0.33. Sounds reasonable to me.
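For the record, the exact standard deviation of a uniform distribution on [0, 1] is 1/√12 ≈ 0.289, a touch under the ~0.33 eyeballed here. A quick sanity check (the sample size is an assumption):

```python
import numpy as np

# Draw a batch of uniform weights and check the summary statistics.
w = np.random.uniform(0.0, 1.0, size=10_000)
print(w.mean())  # ≈ 0.5
print(w.std())   # ≈ 0.289
```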
So if you are trying to say, like, oh, I'm saving my model, or I want to use it later to make a prediction:
this is what we'd use.
Essentially, I have some set of weights, some set of biases, and I want to know: how does my model think the input that it's given, given these weights and biases, should be classified?
And then once we get to the end, it's going to say: based on what I know and what I am currently set at, I'm going to spit out a number.
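A hedged sketch of that forward pass (all names here are mine, and the sigmoid squashing at the end is an assumption based on the binary classifier discussed below, not necessarily the stream's exact code):

```python
import numpy as np

def predict(x, weights, biases):
    # One dense layer: multiply by the weights, add the biases.
    z = x @ weights + biases
    # Squash to (0, 1) so the number reads as a probability of class 1.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.2, 0.7])                  # one input with two features
weights = np.random.uniform(size=(2, 1))  # shape (features, outputs)
biases = np.zeros(1)
print(predict(x, weights, biases))        # a number between 0 and 1
```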
Okay, so we're ready for that then, eh?
So what we would end up wanting is some format of this.
We want either zero or one.
So it's a binary classifier in this case.
But more generally speaking, if we give it a set of classes, we want it to spit out something that's close to one of those classes, and we'll pretend as if it is. What it actually ends up
spitting out is a probability that it's either of those classes.
And, in this case, it actually spits out something a little strange, but we'll kind of see how that looks in a moment.
And these biases, how do these operate?
What's the relevance of the bias? It's just basically, like, a way to correct based on, like: I'm totally right, but I'm just, like, off every time.
Like, every time I answer, I got, like, the pattern right, but I'm off by a little bit, so I'm just gonna say: okay, we're gonna add in this bias term that says you're biased.
It's not super meaningful, but in a relative sense, it is super meaningful.
So if I have a loss of five versus a loss of three versus a loss of 0.1, then the latter is the best version of our model.
Given our... a lot of the ways that people come up with these loss functions, they either deal with, like, uh, um... what, maybe for linear functions, root mean square error.
So I take the square root of the mean of the squared errors from some given knowledge that I have, and that would be supervised learning.
I know the answers for some of the dataset, and I can tell you how far off your model is on those answers.
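Root mean squared error, exactly as described: square the errors, take the mean, then the square root. The example values are made up:

```python
import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([1.0, 0.0, 1.0, 1.0])  # known answers
y_pred = np.array([0.9, 0.2, 0.8, 0.4])  # model outputs
print(rmse(y_true, y_pred))              # ≈ 0.335
```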
It basically just says: entropy is a measure of, kind of, disorder and, in this sense, can actually be interpreted as a form of error, and the "categorical" in categorical cross-entropy means that we have a categorical problem.
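A small sketch of categorical cross-entropy under that interpretation (my formulation, not code from the stream): the loss is low when the predicted probabilities agree with the true class and large when they don't:

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true is one-hot; clip predictions so log(0) never happens.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 1.0])  # the true class is class 1
print(categorical_cross_entropy(y_true, np.array([0.1, 0.9])))  # ≈ 0.105
print(categorical_cross_entropy(y_true, np.array([0.9, 0.1])))  # ≈ 2.303
```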
We use their, uh, non-real-valued equivalents... like e to the ix is equal to, like, cosine of x plus i sine of x, something along those lines.
Because of that property, we can actually get a numerator for tanh that looks something like np.exp of, um, like, two times x, minus one.
And then we'll get a denominator that looks something like this: np.exp of two times x, plus one, and then we'll return the numerator divided by the denominator.
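Putting that together, the function being described is tanh(x) = (e^(2x) − 1) / (e^(2x) + 1); a reconstruction of the snippet:

```python
import numpy as np

def tanh(x):
    # tanh(x) = (e^(2x) - 1) / (e^(2x) + 1)
    numerator = np.exp(2 * x) - 1
    denominator = np.exp(2 * x) + 1
    return numerator / denominator

print(tanh(np.array([-2.0, 0.0, 2.0])))  # ≈ [-0.964, 0.0, 0.964]
```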
Now, if we want to see what this function looks like, I'll actually just comment this out for a second. I'm going to graph what that function might look like. Actually, I'm gonna have to do this several times.
So I'm actually just going to define a plot:
plot_fx. This is given some fx, and then what I want to do is say our x is equivalent to np.linspace... that gives me, like, an x that is very, very, very close together.
Not quite continuous.
And then I'll do, like, a plt.figure, plt.scatter of x by fx of x... that was wild.
Uh, that was confusing.
Don't worry.
I confused myself as well, and then I just need plt.
So we're gonna import mat...
plotlib.
Thank you.
Oh, no.
So close: matplotlib.pyplot as plt.
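Reconstructing the helper from the narration (the linspace range and point count are my guesses, not the stream's exact values):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_fx(fx):
    # Points packed closely enough to look continuous, though not quite.
    x = np.linspace(-5, 5, 500)
    plt.figure()
    plt.scatter(x, fx(x))
    plt.show()

plot_fx(np.tanh)  # plot any function, more or less
```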
Okay, so now I can plot any function, more or less, without messing anything up.
You'll notice I can't really switch back and forth between the two... but, I'm sorry, uh, I didn't plot them quite like that... but you'll notice that our tanh function had a very slick, harsh, almost vertical line in the middle.
Whereas our sigmoid function actually is a little bit softer; it's a little bit more generous, if you will, about how it plots things.
So because of that, it ends up being a little bit less convenient, since we know we have exactly two classes, but you'll see sigmoid activations all the time.
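For comparison, a sketch of sigmoid next to tanh (my plot, not the stream's exact one): sigmoid maps into (0, 1) and rises softly, while tanh maps into (−1, 1) and is much steeper around zero:

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5, 5, 500)
plt.plot(x, sigmoid(x), label="sigmoid")
plt.plot(x, np.tanh(x), label="tanh")
plt.legend()
plt.show()
```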
So, basically, if you're in an extreme case, you're just one or the other.
If you're not in an extreme case, then I can't necessarily tell.
And I might give you some probability, using something called softmax, on where you are. Now, if that doesn't make any sense, or if all of these sorts of things look crazy, then don't worry; you're welcome to go Google them.
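Softmax, for the curious, turns raw scores into probabilities that sum to one; a standard sketch (not from the stream):

```python
import numpy as np

def softmax(scores):
    # Subtracting the max keeps the exponentials numerically stable.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ≈ [0.659, 0.242, 0.099]
```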
Oh, I was gonna say thanks to Graham Walton, saying good night. See you all tomorrow. I want to say thanks for tuning in. And some other people asked... Annie Babic asked if it was some... oh, the Taylor series they were talking about earlier.
Um, so, badignite,
you're asking...
I'm not exactly sure which part you're asking is similar to a Taylor series.
A Taylor series, for those that are familiar,
is a way of kind of approximating functions by expanding out their terms; for example, like, the Taylor expansion of e to the x is, like, one plus x plus x squared over two, or something.
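A quick numeric check of that expansion (values chosen arbitrarily): near zero, 1 + x + x²/2 tracks e^x closely, and it drifts as x grows, which is the point made next:

```python
import numpy as np

for x in [0.1, 0.5, 2.0]:
    approx = 1 + x + x**2 / 2
    print(x, np.exp(x), approx)  # the gap widens away from zero
```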
It becomes less useful in a lot of cases where the behavior changes away from zero.
But it is generally... you're absolutely correct that it's generally used for polynomial functions; like, usually if I have, like, some weird polynomial and I just want to approximate its behavior, you can do that.
There's also, like, the Maclaurin series, which is just, like, a specific... I always forget.
Uh, that said, if you take, like, a signal processing or systems class, they use Fourier series, which are used to approximate periodic functions, and Fourier series and Fourier transforms have all sorts of beautiful properties, such that, like, random noise is really high frequency.
And if I do a fast Fourier transform on an input signal's amplitudes, then I can kind of ignore the noise, because it's going to be very high frequency... at least, usually noise is high frequency compared to the actual signal that I'm looking for.
And I can use that to clean signals.
There's all sorts of cool things you can do with those.
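A hedged sketch of that cleaning idea (the signal, noise level, and cutoff frequency are all assumptions): take the FFT, zero the high-frequency bins where the noise lives, and invert:

```python
import numpy as np

t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 5 * t)              # slow, "real" signal
noisy = signal + 0.3 * np.random.randn(t.size)  # broadband noise on top

spectrum = np.fft.rfft(noisy)
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
spectrum[freqs > 20] = 0                        # crude low-pass cutoff
cleaned = np.fft.irfft(spectrum, n=t.size)

print(np.abs(noisy - signal).mean())    # error before cleaning
print(np.abs(cleaned - signal).mean())  # noticeably smaller after
```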
There's many different kinds of series that could be used; Fourier series just happens to be very, very commonly used.
Signal processing is a very useful course to take.
I actually took it last semester for bioengineering, so I ended up absorbing lots and lots of information in a class that a CS undergrad typically... no, it's not required for CS.
It's required for bio.
For all of our engineering concentrators, actually... maybe not all engineering; I think just bioE.