Placeholder Image

Subtitles section Play video

  • PROFESSOR: Well, yesterday was easy.

  • You learned all of the rules of programming and lived.

  • Almost all of them.

  • And so at this point, you're now certified programmers--

  • it says.

  • However, I suppose what we did is we, aah, sort of got you a

  • little bit of into an easy state.

  • Here, you still believe it's possible that this might be

  • programming in BASIC or Pascal with just a funny syntax.

  • Today, that illusion--

  • or you can no longer support that belief.

  • What we're going to do today is going to

  • completely smash that.

  • So let's start out by writing a few programs on the

  • blackboard that have a lot in common with each other.

  • What we're going to do is try to make them abstractions that

  • are not ones that are easy to make in most languages.

  • Let's start with some very simple ones that you can make

  • in most languages.

  • Supposing I want to write the mathematical expression which

  • adds up a bunch of integers.

  • So if I wanted to write down and say the sum from i

  • equal a to b on i.

  • Now, you know that that's an easy thing to compute in a

  • closed form for it, and I'm not interested in that.

  • But I'm going to write a program that

  • adds up those integers.

  • Well, that's rather easy to do to say I want to define the

  • sum of the integers from a to b to be--

  • well, it's the following two possibilities.

  • If a is greater than b, well, then there's nothing to be

  • done and the answer is zero.

  • This is how you're going to have to think recursively.

  • You're going to say if I have an easy case that I know the

  • answer to, just write it down.

  • Otherwise, I'm going to try to reduce this problem to a

  • simpler problem.

  • And maybe in this case, I'm going to make a subproblem of

  • the simpler problem and then do something to the result.

  • So the easiest way to do this is say that I'm going to add

  • the index, which in this case is a, to the result of adding

  • up the integers from a plus 1 to b.

  • Now, at this point, you should have no trouble looking at

  • such a definition.

  • Indeed, coming up with such a thing might be a little hard

  • in synthesis, but being able to read it at this point

  • should be easy.

  • And what it says to you is, well, here is the subproblem

  • I'm going to solve.

  • I'm going to try to add up the integers, one fewer integer

  • than I added up for the the whole problem.

  • I'm adding up the one fewer one, and that subproblem, once

  • I've solved it, I'm going to add a to that, and that will

  • be the answer to this problem.

  • And the simplest case, I don't have to do any work.

  • Now, I'm also going to write down another simple one just

  • like this, which is the mathematical expression, the

  • sum of the square from i equal a to b.

  • And again, it's a very simple program.

  • And indeed, it starts the same way.

  • If a is greater than b, then the answer is zero.

  • And, of course, we're beginning to see that there's

  • something wrong with me writing this down again.

  • It's the same program.

  • It's the sum of the square of a and the sum of the square of

  • the increment and b.

  • Now, if you look at these things, these programs are

  • almost identical.

  • There's not much to distinguish them.

  • They have the same first clause of the conditional and

  • the same predicate and the same consequence, and the

  • alternatives are very similar, too.

  • They only differ by the fact that where here I have a,

  • here, I have the square of a.

  • The only other difference, but this one's sort of unessential

  • is in the name of this procedure is sum int, whereas

  • the name of the procedure is sum square.

  • So the things that vary between these

  • two are very small.

  • Now, wherever you see yourself writing the same thing down

  • more than once, there's something wrong, and you

  • shouldn't be doing it.

  • And the reason is not because it's a waste of time to write

  • something down more than once.

  • It's because there's some idea here, a very simple idea,

  • which has to do with the sigma notation--

  • this much--

  • not depending upon what it is I'm adding up.

  • And I would like to be able to--

  • always, whenever trying to make complicated systems and

  • understand them, it's crucial to divide the things up into

  • as many pieces as I can, each of which I understand

  • separately.

  • I would like to understand the way of adding things up

  • independently of what it is I'm adding up so I can do that

  • having debugged it once and understood it once and having

  • been able to share that among many different uses of it.

  • Here, we have another example.

  • This is Leibnitz's formula for finding pi over 8.

  • It's a funny, ugly mess.

  • What is it?

  • It's something like 1 over 1 times 3 plus 1 over 5 times 7

  • plus 1 over 9 times 11 plus--

  • and for some reason, things like this tend to have

  • interesting values like pi over 8.

  • But what do we see here?

  • It's the same program or almost the same program.

  • It's a sum.

  • So we're seeing the figure notation, although over here,

  • we're dealing with incrementing by 4, so it's a

  • slightly different problem, which means that over here, I

  • have to change a by 4, as you see right over here.

  • It's not by 1.

  • The other thing, of course, is that the thing that's

  • represented by square in the previous sum of squares, or a

  • when adding up the integers.

  • Well, here, I have a different thing I'm adding up, a

  • different term, which is 1 over a times a plus 2.

  • But the rest of this program is identical.

  • Well, any time we have a bunch of things like this that are

  • identical, we're going to have to come up with some sort of

  • abstraction to cover them.

  • If you think about this, what you've learned so far is the

  • rules of some language, some primitive, some means of

  • combination, almost all of them, the means of

  • abstraction, almost all of them.

  • But what you haven't learned is common patterns of usage.

  • Now, most of the time, you learn idioms when learning a

  • language, which is a common pattern that mean things that

  • are useful to know in a flash.

  • And if you build a great number of them, if you're a

  • FORTRAN programmer, of course, everybody knows how to--

  • what do you do, for example, to get an integer which is the

  • biggest integer in something.

  • It's a classic thing.

  • Every FORTRAN programmer knows how to do that.

  • And if you don't know that, you're in real hot water

  • because it takes a long time to think it out.

  • However, one of the things you can do in this language that

  • we're showing you is not only do you know something like

  • that, but you give the knowledge of that a name.

  • And so that's what we're going to be going after right now.

  • OK, well, let's see what these things have in common.

  • Right over here we have what appears to be a general

  • pattern, a general pattern which covers all of the cases

  • we've seen so far.

  • There is a sum procedure, which is being defined.

  • It has two arguments, which are a lower bound

  • and an upper bound.

  • The lower bound is tested to be greater than the upper

  • bound, and if it is greater, then the result is zero.

  • Otherwise, we're going to do something to the lower bound,

  • which is the index of the conversation, and add that

  • result to the result of following the procedure

  • recursively on our lower bound incremented by some next

  • operation with the same upper bound as I had before.

  • So this is a general pattern, and what I'd like to do is be

  • able to name this general pattern a bit.

  • Well, that's sort of easy, because one of the things I'm

  • going to do right now is-- there's nothing very special

  • about numbers.

  • Numbers are just one kind of data.

  • It seems to me perfectly reasonable to give all sorts

  • of names to all kinds of data, for example, procedures.

  • And now many languages allow you have procedural arguments,

  • and right now, we're going to talk

  • about procedural arguments.

  • They're very easy to deal with.

  • And shortly, we'll do some remarkable things that are not

  • like procedural arguments.

  • So here, we'll define our sigma notation.

  • This is called sum and it takes a term, an A, a next

  • term, and B as arguments.

  • So it takes four arguments, and there was nothing

  • particularly special about me writing this in lowercase.

  • I hope that it doesn't confuse you, so I'll write it in

  • uppercase right now.

  • The machine doesn't care.

  • But these two arguments are different.

  • These are not numbers.

  • These are going to be procedures for computing

  • something given a number.

  • Term will be a procedure which, when given an index,

  • will produce the value of the term for that index.

  • Next will be given an index, which will

  • produce the next index.

  • This will be for counting.

  • And it's very simple.

  • It's exactly what you see.

  • If A is greater than B, then the result is 0.

  • Otherwise, it's the sum of term applied to A and the sum

  • of term, next index.

  • Let me write it this way.

  • Now, I'd like you to see something, first of all.

  • I was writing here, and I ran out of space.

  • What I did is I start indenting according to the

  • Pretty-printing rule, which says that I align all of the

  • arguments of the procedure so I can see

  • which ones go together.

  • And this is just something I do automatically, and I want

  • you to learn how to do that, too, so your programs can be

  • read and understood.

  • However, what do we have here?

  • We have four arguments: the procedure, the lower index--

  • lower bound index--

  • the way to get the next index, and the upper bound.

  • What's passed along on the recursive call is indeed the

  • same procedure because I'm going to need it again, the

  • next index, which is using the next procedure to compute it,

  • the procedure for computing next, which I also have to

  • have separately, and that's different.

  • The procedure for computing next is different from the

  • next index, which is the result of using next on the

  • last index.

  • And I also have to pass along the upper bound.

  • So this captures both of these and the other nice program

  • that we are playing with.

  • So using this, we can write down the original program as

  • instances of sum very simply.

  • A and B. Well, I'm going to need an identity procedure

  • here because ,ahh, the sum of the integers requires me to in

  • this case compute a term for every integer, but the term

  • procedure doesn't want to do anything to that integer.

  • So the identity procedure on A is A or X or whatever, and I

  • want to say the sum of using identity of the term procedure

  • and using A as the initial index and the incrementer

  • being the way to get the next index and B being the high

  • bound, the upper bound.

  • This procedure does exactly the same as the sum of the

  • integers over here, computes the same answer.

  • Now, one thing you should see, of course, is that there's

  • nothing very special over here about what I used as the

  • formal parameter.

  • I could have, for example, written this

  • X. It doesn't matter.

  • I just wanted you to see that this name does not conflict

  • with this one at all.

  • It's an internal name.

  • For the second procedure here, the sum of the squares, it's

  • even a little bit easier.

  • And what do we have to do?

  • Nothing more than add up the squares, this is the procedure

  • that each index will be given, will be given each--

  • yes.

  • Each index will have this done to it to get the term.

  • That's the thing that maps against term over here.

  • Then I have A as the lower bound, the incrementer as the

  • next term method, and B as the upper bound.

  • And finally, just for the thing that we did about pi

  • sums, pi sums are sort of--

  • well, it's even easier to think about them this way

  • because I don't have to think.

  • What I'm doing is separating the thing I'm adding up from

  • the method of doing the addition.

  • And so we have here, for example, pi sum A B

  • of the sum of things.

  • I'm going to write the terms procedure here explicitly

  • without giving it a name.

  • This is done anonymously.

  • I don't necessarily have to give a name to something if I

  • just want to use it once.

  • And, of course, I can write sort of a expression that

  • produces a procedure.

  • I'm going to write the Greek lambda letter here instead of

  • L-A-M-B-D-A in general to avoid taking up a lot of space

  • on blackboards.

  • But unfortunately, we don't have

  • lambda keys on our keyboards.

  • Maybe we can convince our friends in the computer

  • industry that this is an important.

  • Lambda of i is the quotient of 1 and the product of i and the

  • sum of i 2, starting at a with the way of incrementing being

  • that procedure of an index i, which adds i to 4, and b being

  • the upper bound.

  • So you can see that this notation, the invention of the

  • procedure that takes a procedural argument, allows us

  • to compress a lot of these procedures into one thing.

  • This procedure, sums, covers a whole bunch of ideas.

  • Now, just why is this important?

  • I tried to say before that it helps us divide a problem into

  • two pieces, and indeed, it does, for example, if someone

  • came up with a different way of implementing this, which,

  • of course, one might.

  • Here, for example, an iterative

  • implementation of sum.

  • Iterative implementation for some reason might be better

  • than the recursive implementation.

  • But the important thing is that it's different.

  • Now, supposing I had written my program this way that you

  • see on the blackboard on the left.

  • That's correct, the left.

  • Well, then if I want to change the method of addition, then

  • I'd have to change each of these.

  • Whereas if I write them like this that you see here, then

  • the method by which I did the addition is encapsulated in

  • the procedure sum.

  • That decomposition allows me to independently change one

  • part of the program and prove it perhaps without changing

  • the other part that was written for some

  • of the other cases.

  • Thank you.

  • Are there any questions?

  • Yes, sir.

  • AUDIENCE: Would you go over next A and next again on--

  • PROFESSOR: Yes.

  • It's the same problem.

  • I'm sure you're going to--

  • you're going to have to work on this.

  • This is hard the first time you've ever seen

  • something like this.

  • What I have here is a-- procedures

  • can be named by variables.

  • Procedures are not special.

  • Actually, sum square is a variable, which has gotten a

  • value, which is a procedure.

  • This is define sum square to be

  • lambda of A and B something.

  • So the procedure can be named.

  • Therefore, they can be passed from one to another, one

  • procedure to another, as arguments.

  • Well, what we're doing here is we're passing the procedure

  • term as an argument to sum just when we get it around in

  • the next recursive.

  • Here, we're passing the procedure next

  • as an argument also.

  • However, here we're using the procedure next.

  • That's what the parentheses mean.

  • We're applying next to A to get the next value of A. If

  • you look at what next is mapped against, remember that

  • the way you think about this is that you substitute the

  • arguments for the formal parameters in the body.

  • If you're ever confused, think of the thing that way.

  • Well, over here, with sum of the integers.

  • I substitute identity for a term and 1 plus the

  • incrementer for next in the body.

  • Well, the identity procedure on A is what I get here.

  • Identity is being passed along, and here, I have

  • increment 1 plus being applied to A and 1 plus is being

  • passed along.

  • Does that clarify the situation?

  • AUDIENCE: We could also define explicitly those two

  • functions, then pass them.

  • PROFESSOR: Sure.

  • What we can do is we could have given names to them, just

  • like I did here.

  • In fact, I gave you various ways so you

  • could see it, a variety.

  • Here, I define the thing which I passed the name of.

  • I referenced it by its name.

  • But the thing is, in fact, that procedure, one argument

  • X, which is X. And the identity procedure is just

  • lambda of X X. And that's what you're seeing here.

  • Here, I happened to just write its canonical name there for

  • you to see.

  • Is it OK if we take our five-minute break?

  • As I said, computers to make people happy, not people to

  • make computers happy.

  • And for the most part, the reason why we introduce all

  • this abstraction stuff is to make it so that programs can

  • be more easily written and more easily read.

  • Let's try to understand what's the most complicated program

  • we've seen so far using a little bit of

  • this abstraction stuff.

  • If you look at the slide, this is the Heron of Alexandria's

  • method of computing square roots that we saw yesterday.

  • And let's see.

  • Well, in any case, this program is a little

  • complicated.

  • And at the current state of your thinking, you just can't

  • look at that and say, oh, this obviously means

  • something very clear.

  • It's not obvious from looking at the

  • program what it's computing.

  • There's some loop here inside try, and a loop does something

  • about trying the improvement of y.

  • There's something called improve, which does some

  • averaging and quotienting and things like that.

  • But what's the real idea?

  • Can we make it clear what the idea is?

  • Well, I think we can.

  • I think we can use abstraction that we have learned about so

  • far to clarify what's going on.

  • Now, what we have mathematically is a procedure

  • for improving a guess for square roots.

  • And if y is a guess for a square root, then what we want

  • to get we'll call a function f.

  • This is the means of improvement.

  • I want to get y plus x/y over 2, so the average of y and x

  • divided by y as the improved value for the square root of x

  • such that-- one thing you can notice about this function f

  • is that f of the square root of f is in fact the

  • square root of x.

  • In other words, if I take the square root of x and

  • substitute it for y here, I see the square root of x plus

  • x divided by the square of x, which is the square root of x.

  • That's 2 times the square root of x divided by 2, is the

  • square root of x.

  • So, in fact, what we're really looking for is we're looking

  • for a fixed point, a fixed point of the function f.

  • A fixed point is a place which has the property that if you

  • put it into the function, you get the same value out.

  • Now, I suppose if I were giving some nice, boring

  • lecture, and you happened to have in front of you an HP-35

  • desk calculator like I used to have when I

  • went to boring lectures.

  • And if you think it was really boring, you put it into

  • radians mode, and you hit cosine, and you hit cosine,

  • and you hit cosine.

  • And eventually, you end up with 0.734 or

  • something like that.

  • 0.743, I don't remember what exactly, and it gets closer

  • and closer to that.

  • Some functions have the property that you can find

  • their fixed point by iterating the function, and that's

  • essentially what's happening in the square root program by

  • Heron's method.

  • So let's see if we can write that down, that idea.

  • Now, I'm not going to say how I compute fixed points yet.

  • There might be more than one way.

  • But the first thing to do is I'm going to

  • say what I just said.

  • I'm going to say it specifically, the square root.

  • The square root of x is the fixed point of that procedure

  • which takes an argument y and averages of x

  • divided by y with y.

  • And we're going to start up with the initial guess for the

  • fixed point of 1.

  • It doesn't matter where it starts.

  • A theorem having to do with square roots.

  • So what you're seeing here is I'm just trying to write out

  • by wishful thinking.

  • I don't know how I'm going to make fixed point happen.

  • We'll worry about that later.

  • But if somehow I had a way of finding the fixed point of the

  • function computed by this procedure, then I would have--

  • that would be the square root that I'm looking for.

  • OK, well, now let's see how we're going to write--

  • how we're going to come up with fixed points.

  • Well, it's very simple, actually.

  • I'm going to write an abbreviated version here just

  • so we understand it.

  • I'm going to find the fixed point of a function f--

  • actually, the fixed point of the function computed by the

  • procedure whose name will be f in this procedure.

  • How's that?

  • A long sentence--

  • starting with a particular starting value.

  • Well, I'm going to have a little loop inside here, which

  • is going to push the button on the calculator repeatedly,

  • hoping that it will eventually converge.

  • And we will say here internal loops are written by defining

  • internal procedures.

  • Well, one thing I'm going to have to do is I'm going to

  • have to say whether I'm done.

  • And the way I'm going to decide when I'm done is when

  • the old value and the new value are close enough so I

  • can't distinguish them anymore.

  • That's the standard thing you do on the calculator unless

  • you look at more precision, and eventually,

  • you run out of precision.

  • So the old value and new value, and I'm going to stay

  • here if I can't distinguish them if they're close enough,

  • and we'll have to worry about what that is soon.

  • The old value and the new value are close enough to each

  • other and let's pick the new value as the answer.

  • Otherwise, I'm going to iterate around again with the

  • next value of old being the current value of new and the

  • next value of new being the result of calling f on new.

  • And so this is my iteration loop that pushes the button on

  • the calculator.

  • I basically think of it as having two registers on the

  • calculator: old and new.

  • And in each step, new becomes old, and new gets F of new.

  • So this is the thing where I'm getting the next value.

  • And now, I'm going to start this thing up

  • by giving two values.

  • I wrote down on the blackboard to be slow

  • so you can see this.

  • This is the first time you've seen something quite this

  • complicated, I think.

  • However, we might want to see the whole thing over here in

  • this transparency or slide or whatever.

  • What we have is all of the details that are required to

  • make this thing work.

  • I have a way of getting a tolerance for a close enough

  • procedure, which we see here.

  • The close enough procedure, it tests whether u and v are

  • close enough by seeing if the absolute value of the

  • difference in u and v is less than the given tolerance, OK?

  • And here is the iteration loop that I just wrote on the

  • blackboard and the initialization for it, which

  • is right there.

  • It's very simple.

  • But let's see.

  • I haven't told you enough.

  • It's actually easier than this.

  • There is more structure to this problem than I've

  • already told you.

  • Like why should this work?

  • Why should it converge?

  • There's a hairy theorem in mathematics tied up in what

  • I've written here.

  • Why is it that I should assume that by iterating averaging

  • the quotient of x and y and y that I should

  • get the right answer?

  • It isn't so obvious.

  • Surely there are other things, other procedures, which

  • compute functions whose fixed points would also be the

  • square root.

  • For example, the obvious one will be a new function g,

  • which maps y to x/y.

  • That's even simpler.

  • The fixed point of g is surely the square root also, and it's

  • a simpler procedure.

  • Why am I not using it?

  • Well, I suppose you know.

  • Supposing x is 2 and I start out with 1, and if I divide 1

  • into 2, I get 2.

  • And then if I divide 2 into 2, I get 1.

  • If I divide 1 into 2, I get 2, and 2 into 2, I get 1, and I

  • never get any closer to the square root.

  • It just oscillates.

  • So what we have is a signal processing system, an

  • electrical circuit which is oscillating, and I want to

  • damp out these oscillations.

  • Well, I can do that.

  • See, what I'm really doing here when I'm taking my

  • average, the average is averaging the last two values

  • of something which oscillates, getting something in between.

  • The classic way is damping out oscillations in a signal

  • processing system.

  • So why don't we write down the strategy that I just said in a

  • more clear way?

  • Well, that's easy enough.

  • I'm going to define the square root of x to be a fixed point

  • of the procedure resulting from average damping.

  • So I have a procedure resulting from average damp of

  • the procedure, that procedure of y, which divides x by y

  • starting out at 1.

  • Ah, but average damp is a special procedure that's going

  • to take a procedure as its argument and return a

  • procedure as its value.

  • It's a generalization that says given a procedure, it's

  • the thing which produces a procedure which averages the

  • last value and the value before and

  • after running the procedure.

  • You can use it for anything if you want to damp out

  • oscillations.

  • So let's write that down.

  • It's very easy.

  • And stylistically here, I'm going to use lambda notation

  • because it's much easier to think when you're dealing with

  • procedure, the mid-line procedures, to understand that

  • the procedures are the objects I'm dealing with, so I'm going

  • to use lambda notation here.

  • Not always.

  • I don't always use it, but very specifically here to

  • expand on that idea, to elucidate it.

  • Well, average damp is a procedure, which takes a

  • procedure as its argument, which we will call f.

  • And what does it produce?

  • It produces as its value--

  • the body of this procedure is a thing which produces a

  • procedure, the construct of the procedures right here, of

  • one argument x, which averages f of x with x.

  • This is a very special thing.

  • I think for the first time you're seeing a procedure

  • which produces a procedure as its value.

  • This procedure takes the procedure f and does something

  • to it to produce a new procedure of one argument x,

  • which averages f--

  • this f--

  • applied to x and x itself.

  • Using the context here, I apply average damping to the

  • procedure, which just divides x by y.

  • It's a division.

  • And I'm finding to fixed point of that, and that's a clearer

  • way of writing down what I wrote down over

  • here, wherever it was.

  • Here, because it tells why I am writing this down.

  • I suppose this to some extent really clarifies what Heron of

  • Alexandria was up to.

  • I suppose I'll stop now.

  • Are there any questions?

  • AUDIENCE: So when you define average damp, don't you need

  • to have a variable on f?

  • PROFESSOR: Ah, the question was, and here we're having--

  • again, you've got to learn about the syntax.

  • The question was when defining average damp, don't you have

  • to have a variable defined with f?

  • What you are asking about is the formal parameter of f?

  • AUDIENCE: Yeah.

  • PROFESSOR: OK.

  • The formal parameter of f is here.

  • The formal parameter of f--

  • AUDIENCE: The formal parameter of average damp.

  • PROFESSOR: F is being used to apply it to

  • an argument, right?

  • It's indeed true that f must have a formal parameter.

  • Let's find out what f's formal parameter is.

  • AUDIENCE: The formal parameter of average damp.

  • PROFESSOR: Oh, f is the formal parameter of average damp.

  • I'm sorry.

  • You're just confusing a syntactic thing.

  • I could have written this the other way.

  • Actually, I didn't understand your question.

  • Of course, I could have written it this other way.

  • Those are identical notations.

  • This is a different way of writing this.

  • You're going to have to get used to lambda notation

  • because I'm going to use it.

  • What it says here, I'm defining the name average damp

  • to name the procedure whose of one argument f.

  • That's the formal parameter of the procedure average damp.

  • What define does is it says give this name a value.

  • Here is the value of for it.

  • That there happens to be a funny syntax to make that

  • easier in some cases is purely convenience.

  • But the reason why I wrote it this way here is to emphasize

  • that I'm dealing with a procedure that takes a

  • procedure as its argument and produces a

  • procedure as its value.

  • AUDIENCE: I don't understand why you use lambda twice.

  • Can you just use one lambda and take two

  • arguments f and x?

  • PROFESSOR: No.

  • AUDIENCE: You can't?

  • PROFESSOR: No, that would be a different thing.

  • If I were to write the procedure lambda of f and x,

  • the average of f of x and x, that would not be something

  • which would be allowed to take a procedure as an argument and

  • produce a procedure as its value.

  • That would be a thing that takes a procedure as its

  • argument and numbers its argument and

  • produces a new number.

  • But what I'm producing here is a procedure to fit in the

  • procedure slot over here, which is going to

  • be used over here.

  • So the number has to come from here.

  • This is the thing that's going to eventually end up in the x.

  • And if you're confused, you should do some substitution

  • and see for yourself.

  • Yes?

  • AUDIENCE: Will you please show the definition for average

  • damp without using lambda notation in both cases.

  • PROFESSOR: I can't make a very simple one like that.

  • Let me do it for you, though.

  • I can get rid of this lambda easily.

  • I don't want to be--

  • actually, I'm lying to you.

  • I don't want to do what you want because I think it's more

  • confusing than you think.

  • I'm not going to write what you want.

  • So we'll have to get a name.

  • FOO of x to be of F of x and x and return as a value FOO.

  • This is equivalent, but I've had to make an

  • arbitrary name up.

  • This is equivalent to this without any lambdas.

  • Lambda is very convenient for naming anonymous procedures.

  • It's the anonymous name of something.

  • Now, if you really want to know a cute way of doing this,

  • we'll talk about it later.

  • We're going to have to define the anonymous procedure.

  • Any other questions?

  • And so we go for our break again.

  • So now we've seen how to use high-order

  • procedures, they're called.

  • That's procedures that take procedural arguments and

  • produce procedural values to help us clarify and abstract

  • some otherwise complicated processes.

  • I suppose what I'd like to do now is have a bit of fun with

  • that and sort of a little practice as well.

  • So let's play with this square root thing even more.

  • Let's elaborate it and understand what's going on and

  • make use of this kind of programming style.

  • One thing that you might know is that there is a general

  • method called Newton's method the purpose of which is to

  • find the roots--

  • that's the zeroes--

  • of functions.

  • So, for example, to find a y such that f of y equals 0, we

  • start with some guess.

  • This is Newton's method.

  • And the guess we start with we'll call y0, and then we

  • will iterate the following expression.

  • y n plus 1-- this is a difference equation--

  • is yn minus f of yn over the derivative with respect to y

  • of f evaluated at y equal yn.

  • Very strange notation.

  • I must say ugh.

  • The derivative of f with respect to y is a function.

  • I'm having a little bit of unhappiness with that, but

  • that's all right.

  • It turns out in the programming language world,

  • the notation is much clearer.

  • Now, what is this?

  • People call it Newton's method.

  • It's a method for finding the roots of the function f.

  • And it, of course, sometimes converges, and when it does,

  • it does so very fast. And sometimes, it doesn't

  • converge, and, oh well, we have to do something else.

  • But let's talk about square root by Newton's method.

  • Well, that's rather interesting.

  • Let's do exactly the same thing we did last time: a bit

  • of wishful thinking.

  • We will apply Newton's method, assuming we knew how to do it.

  • You don't know how to do it yet.

  • Well, let's go.

  • What do I have here?

  • The square root of x.

  • It's Newton's method applied to a procedure which will

  • represent that function of y, which computes

  • that function of y.

  • Well, that procedure is that procedure of y, which is the

  • difference between x and the square of y.

  • Indeed, if I had a value of y for which this was zero, then

  • y would be the square root of x.

  • See that?

  • OK, I'm going to start this out searching at 1.

  • Again, completely arbitrary property of square roots that

  • I can do that.

  • Now, how am I going to compute Newton's method?

  • Well, this is the method.

  • I have it right here.

  • In fact, what I'm doing is looking for a fixed point of

  • some procedure.

  • This procedure involves some complicated expressions in

  • terms of other complicated things.

  • Well, I'm trying to find the fixed point of this.

  • I want to find the values of y, which if I put y in here, I

  • get the same value out here up to some degree of accuracy.

  • Well, I already have a fixed point process

  • around to do that.

  • And so, let's just define Newton's method over here.

  • A procedure which computes a function and a

  • guess, initial guess.

  • Now, I'm going to have to do something here.

  • I'm going to need the derivative of the function.

  • I'm going to need a procedure which computes the derivative

  • of the function computed by the given a procedure f.

  • I'm trying to be very careful about what I'm saying.

  • I don't want to mix up the word procedure and function.

  • Function is a mathematical word.

  • It says I'm mapping from values to other values, a set

  • of ordered pairs.

  • But sometimes, I'll accidentally mix those up.

  • Procedures compute functions.

  • So I'm going to define the derivative of f to be by

  • wishful thinking again.

  • I don't know how I'm going to do it.

  • Let's worry about that later--

  • of F. So if F is a procedure, which happens to be this one

  • over here for a square root, then DF will be the derivative

  • of it, which is also the derivative of the function

  • computed by that procedure.

  • DF will be a procedure that computes the derivative of the

  • function computed by the procedure F. And then given

  • that, I will just go looking for a fixed point.

  • What is the fixed point I'm looking for?

  • It's the one for that procedure of one argument x,

  • which I compute by subtracting x.

  • That's the old-- that's the yn here.

  • The quotient of f of x and df of x, starting out with the

  • original guess.

  • That's all very simple.

  • Now, I have one part left that I haven't written, and I want

  • you to see the process by which I write these things,

  • because this is really true.

  • I start out with some mathematical idea, perhaps.

  • By wishful thinking, I assume that by some magic I can do

  • something that I have a name for.

  • I'm not going to worry about how I do it yet.

  • Then I go walking down here and say, well, by some magic,

  • I'm somehow going to figure how to do that, but I'm going

  • to write my program anyway.

  • Wishful thinking, essential to good engineering, and

  • certainly essential to a good computer science.

  • So anyway, how many of you wished that your

  • computer ran faster?

  • Well, the derivative isn't so bad either.

  • Sort of like average damping.

  • The derivative is a procedure that takes a procedure that

  • computes a function as its argument, and it produces a

  • procedure that computes a function, which needs one

  • argument x.

  • Well, you all know this definition.

  • It's f of x plus delta x minus f of x over delta x, right?

  • For some small delta x.

  • So that's the quotient of the difference of f of the sum of

  • x and dx minus f point x divided by dx.

  • I think the thing was lining up correctly when I balanced

  • the parentheses.

  • Now, I want you to look at this.

  • Just look.

  • I suppose I haven't told you what dx is.

  • Somewhere in the world I'm going to have to write down

  • something like that.

  • I'm not interested.

  • This is a procedure which takes a procedure and produces

  • an approximation, a procedure that computes an approximation

  • of the derivative of the function computed by the

  • procedure given by the standard methods that you all

  • know and love.

  • Now, it may not be the case that doing this operation is

  • such a good way of approximating a derivative.

  • Numerical analysts here should jump on me and

  • say don't do that.

  • Computing derivatives produces noisy answers, which is true.

  • However, this again is for the sake of understanding.

  • Look what we've got.

  • We started out with what is apparently a mathematically

  • complex thing.

  • and.

  • In a few blackboards full, we managed to decompose the

  • problem of computing square roots by the way you were

  • taught in your college calculus class--

  • Newton's method--

  • so that it can be understood.

  • It's clear.

  • Let's look at the structure of what it is we've got.

  • Let's look at this slide.

  • This is a diagram of the machine described by the

  • program on the blackboard.

  • There's a machine described here.

  • And what have I got?

  • Over here is the Newton's method function f that we have

  • on the left-most blackboard.

  • It's the thing that takes an argument called y and puts out

  • the difference between x and the square of y, where x is

  • some sort of free variable that comes in from the outside

  • by some magic.

  • So the square root routine picks up an x, and builds this

  • procedure, which I have the x rolled up in it by

  • substitution.

  • Now, this procedure in the cloud is fed in as the f into

  • the Newton's method which is here, this box.

  • The f is fanned out.

  • Part of it goes into something else, and the other part of it

  • goes through a derivative process into something else to

  • produce a procedure, which computes the function which is

  • the iteration function of Newton's method when we use

  • the fixed point method.

  • So this procedure, which contains it by substitution--

  • remember, Newton's method over here, Newton's method builds

  • this procedure, and Newton's method has in it defined f and

  • df, so those are captured over here: f and df.

  • Starting with this procedure, I can now feed this to the

  • fixed point process within an initial guess coming out from

  • the outside from square root to produce the

  • square root of x.

  • So what we've built is a very powerful engine, which allows

  • us to make nice things like this.

  • Now, I want to end this with basically an idea of Chris

  • Strachey, one of the

  • grandfathers of computer science.

  • He's a logician who lived in the--

  • I suppose about 10 years ago or 15 years ago, he died.

  • I don't remember exactly when.

  • He's one of the inventors of something called

  • denotational semantics.

  • He was a great advocate of making procedures or functions

  • first-class citizens in a programming language.

  • So here's the rights and privileges of first-class

  • citizens in a programming language.

  • It allows you to make any abstraction you like if you

  • have functions as first-class citizens.

  • The first-class citizens must be able

  • to be named by variables.

  • And you're seeing me doing that all the time.

  • Here's a nice variable which names a procedure which

  • computes something.

  • They have to be passed as arguments to procedures.

  • We've certainly seen that.

  • We have to be able to return them as values from

  • procedures.

  • And I suppose we've seen that.

  • We haven't yet seen anything about data structures.

  • We will soon, but it's also the case that in order to have

  • a first-class citizen in a programming language, the

  • object has to be allowed to be part of a data structure.

  • We're going to see that soon.

  • So I just want to close with this and say having things

  • like procedures as first-class data structures, first-class

  • data, allows one to make powerful abstractions, which

  • encode general methods like Newton's method

  • in very clear way.

  • Are there any questions?

  • Yes.

  • AUDIENCE: Could you put derivative instead of df

  • directly in the fixed point?

  • PROFESSOR: Oh, sure.

  • Yes, I could have put deriv of f right here, no question.

  • Any time you see something defined, you can put the thing

  • that the definition is there because you

  • get the same result.

  • In fact, what that would look like, it's interesting.

  • AUDIENCE: Lambda.

  • PROFESSOR: Huh?

  • AUDIENCE: You could put the lambda expression in there.

  • PROFESSOR: I could also put derivative of f here.

  • It would look interesting because of the open paren,

  • open paren, deriv of f, closed paren on an x.

  • Now, that would have the bad property of computing the

  • derivative many times, because every time I would run this

  • procedure, I would compute the derivative again.

  • However, the two open parens here both would be meaningful.

  • I want you to understand syntactically that that's a

  • sensible thing.

  • Because if was to rewrite this program-- and I should do it

  • right here just so you see because

  • that's a good question--

  • of F and guess to be fixed point of that procedure of one

  • argument x, which subtracts from x the quotient of F

  • applied to x and the deriv of F applied to x.

  • This is guess.

  • This is a perfectly legitimate program,

  • because what I have here--

  • remember the evaluation rule.

  • The evaluation rule is evaluate all of the parts of

  • the combination: the operator and the operands.

  • This is the operator of this combination.

  • Evaluating this operator will, of course, produce the

  • derivative of F.

  • AUDIENCE: To get it one step further, you could put the

  • lambda expression there, too.

  • PROFESSOR: Oh, of course.

  • Any time I take something which is define, I can put the

  • thing it's defined to be in the place where the thing

  • defined is.

  • I can't remember which is definiens and which is

  • definiendum.

  • When I'm trying to figure out how to do a lecture about this

  • in a freshman class, I use such words and tell everybody

  • it's fun to tell their friends.

  • OK, I think that's it.

PROFESSOR: Well, yesterday was easy.

Subtitles and vocabulary

Click the word to look it up Click the word to find further inforamtion about it