And I thinkthattherewas a certainmomentinourtrainingprocesswhereweputmorecomputesinour L thanbeforeandtrainedfirstwhilegeneratingcoherentchainsofthought.
I thinkrelatedtothat, whenwethinkabouttraining a modelforreasoning, onethingthatimmediatelyjumpstomindisyoucouldhavehumanswriteouttheirthoughtprocessandtrainonthat.
For a lotofthetimethat I'vebeenhere, we'vebeentryingtomakethemodelsbetteratsolvingmathproblems, asanexample.
Andwe'veput a lotofworkintothis.
Andwe'vecomeupwith a lotofdifferentmethods.
Butonethingthat I kept, like, everytime I wouldreadtheseoutputsfromthemodels, I'd alwaysbesofrustratedthatthemodeljustwouldneverseemtoquestionwhatwaswrongorwhenitwasmakingmistakesorthingslikethat.