[MUSIC PLAYING]
PROFESSOR: Well, Hal just told us how you build robust
systems. The key idea was--
I'm sure that many of you haven't really assimilated it
yet-- but the key idea is that in order to make a system
that's robust, it has to be insensitive to small changes,
that is, a small change in the problem should lead to only a
small change in the solution.
There ought to be a continuity.
The space of solutions ought to be continuous in this space
of problems.
The way he was explaining how to do that was this: instead of
solving a particular problem, at every level of
decomposition of the problem into subproblems, you
solve the class of problems which are a neighborhood of
the particular problem that you're trying to solve.
The way you do that is by producing a language at that
level of detail in which the solutions to that class of
problems are representable in that language.
Therefore, when you make small changes to the problem you're
trying to solve, you generally have to make only small local
changes to the solution you've constructed, because at the
level of detail you're working, there's a language
where you can express the various solutions to alternate
problems of the same type.
Well, that's the beginning of a very important idea, perhaps
the most important idea that makes computer science more
powerful than most of the other kinds of engineering
disciplines we know about.
What we've seen so far is sort of how to use
embedding of languages.
And, of course, the power of embedding languages partly
comes from procedures like this one that
I showed you yesterday.
What you see here is the derivative program that we
described yesterday.
It's a procedure that takes a procedure as an argument and
returns a procedure as a value.
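(The procedure on that slide is essentially the following sketch; the particular value of dx here is an assumption, whatever small number the slide used.)

```scheme
;; Numerical approximation: deriv takes a procedure f as its
;; argument and returns, as its value, a new procedure that
;; approximates the derivative of f.
(define dx 0.00001)

(define (deriv f)
  (lambda (x)
    (/ (- (f (+ x dx)) (f x))
       dx)))
```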
And using such things is very nice.
You can make things like push combinators and all that sort
of wonderful thing that you saw last time.
However, now I'm going to really muddy the waters.
See this confuses the issue of what's the procedure and what
is data, but not very badly.
What we really want to do is confuse it very badly.
And the best way to do that is to get involved with the
manipulation of the algebraic expressions that the
procedures themselves are expressed in.
So at this point, I want to talk about instead of things
like on this slide, the derivative procedure being a
thing that manipulates a procedure--
this is a numerical method you see here.
And what you're seeing is a representation of the
numerical approximation to the derivative.
That's what's here.
In fact what I'd like to talk about is instead things that
look like this.
And what we have here are rules from a calculus book.
These are rules for finding the derivatives of the
expressions that one might write in
some algebraic language.
It says things like: the derivative of a constant is 0.
The derivative of the variable with respect to which you are
taking the derivative is 1.
The derivative of a constant times the function is the
constant times the derivative of the function,
and things like that.
These are exact expressions.
These are not numerical approximations.
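(For reference, the rules on the slide read roughly as follows, where c is a constant with respect to x, and u and v are expressions that may depend on x:)

$$\frac{dc}{dx} = 0 \qquad \frac{dx}{dx} = 1 \qquad \frac{d(cu)}{dx} = c\,\frac{du}{dx}$$

$$\frac{d(u+v)}{dx} = \frac{du}{dx} + \frac{dv}{dx} \qquad \frac{d(uv)}{dx} = u\,\frac{dv}{dx} + v\,\frac{du}{dx}$$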
Can we make programs?
And, in fact, it's very easy to make programs that
manipulate these expressions.
Well let's see.
Let's look at these rules in some detail.
You all have seen these rules in your elementary calculus
class at one time or another.
And you know from calculus that it's easy to produce
derivatives of arbitrary expressions.
You also know from your elementary calculus that it's
hard to produce integrals.
Yet integrals and derivatives are opposites of each other.
They're inverse operations.
And they have the same rules.
What is special about these rules that makes it possible
for one to produce derivatives easily, while producing
integrals is so hard?
Let's think about that very simply.
Look at these rules.
Every one of these rules, when used in the direction for
taking derivatives, which is in the direction of this
arrow, the left side is matched against your
expression, and the right side is the thing which is the
derivative of that expression.
The arrow is going that way.
In each of these rules, the expressions on the right-hand
side of the rule that are contained within derivatives
are subexpressions, are proper subexpressions, of the
expression on the left-hand side.
So here we see that the derivative of the sum, which is the
expression on the left-hand side, is the sum of the
derivatives of the pieces.
So the rules, moving to the right, are reduction rules.
The problem becomes easier.
I turn a big complicated problem into lots of smaller
problems and then combine the results, a perfect place for
recursion to work.
If I'm going in the other direction like this, if I'm
trying to produce integrals, well there are several
problems you see here.
First of all, if I try to integrate an expression like a
sum, more than one rule matches.
Here's one that matches.
Here's one that matches.
I don't know which one to take.
And they may be different.
I may get to explore different things.
Also, the expressions become larger in that direction.
And when the expressions become larger, then there's no
guarantee that any particular path I choose will terminate,
because we will only terminate by accidental cancellation.
So that's why integrals are complicated
searches and hard to do.
Right now I don't want to do anything as hard as that.
Let's work on derivatives for a while.
Well, these rules are ones you know, for
the most part, hopefully.
So let's see if we can write a program which is these rules.
And that should be very easy.
Just write the program.
See, because what I showed you is that these are reduction
rules, something appropriate for a recursion.
And, of course, what we have for each of these rules is we
have a case in some case analysis.
So I'm just going to write this program down.
Now, of course, I'm going to be saying something you have
to believe.
Right?
What you have to believe is I can represent these algebraic
expressions, that I can grab their parts, that I can put
them together.
We've invented list structures so that you can do that.
But you don't want to worry about that now.
Right now I'm going to write the program that encapsulates
these rules independent of the representation of the
algebraic expressions.
You have a derivative of an expression with
respect to a variable.
This is a different thing than the
derivative of the function.
That's what we saw last time, that numerical approximation.
You can't open up a function.
It's just the answers.
The derivative of an expression is
the way it's written.
And therefore it's a syntactic phenomenon.
And so a lot of what we're going to be doing today is
worrying about syntax, syntax of expressions
and things like that.
Well, there's a case analysis.
Anytime we do anything complicated by a
recursion, we presumably need a case analysis.
It's the essential way to begin.
And that's usually a conditional
of some large kind.
Well, what are the possibilities?
The first rule that you saw: is this something a constant?
And what I'm asking is, is the expression a constant with
respect to the variable given?
If so, the result is 0, because the derivative
represents the rate of change of something.
If, however, the expression that I'm taking the derivative
of is the variable I'm varying-- the expression is the same
variable var-- then the rate of change of the
expression with respect to the variable is 1.
It's the same one.
Well now there are a couple of other possibilities.
It could, for example, be a sum.
Well, I don't know how I'm going to express sums yet.
Actually I do.
But I haven't told you yet.
But is it a sum?
I'm imagining that there's some way of telling.
I'm doing a dispatch on the type of the expression here,
absolutely essential in building languages.
Languages are made out of different expressions.
And soon we're going to see that in our more powerful
methods of building languages on languages.
Is an expression a sum?
If it's a sum, well, we know the rule for derivative of the
sum is the sum of the derivatives of the parts.
One of them is called the addend and the
other is the augend.
But I don't have enough space on the blackboard
for such long names.
So I'll call them A1 and A2.
I want to make a sum.
Do you remember which is the subtrahend and the minuend?
Or was it the dividend and the divisor, or
something like that?
Make a sum of the derivative of the A1-- I'll call the addend
that-- of the expression, with respect to the
variable, and the derivative of the A2 of the expression,
the second argument of the addition, with
respect to the variable.
And another rule that we know is the product rule, which
applies if the expression is a product.
By the way, it's a good idea when you're defining things,
when you're defining predicates, to give them a
name that ends in a question mark.
This question mark doesn't mean anything.
It's for us as an agreement.
It's a conventional interface between humans so you can read
my programs more easily.
So I want you to, when you write programs, if you define
a predicate procedure, something that returns true or
false, give it a name which ends in a question mark.
The Lisp doesn't care.
I care.
I want to make a sum.
Because the derivative of a product is the sum of the
first times the derivative of the second plus the second
times the derivative of the first. Make a sum of two
things, a product of, well, I'm going to say the M1 of the
expression, and the derivative of the M2 of the expression
with respect to the variable, and the product of the
derivative of M1, the multiplier of the expression,
with respect to the variable.
It's the product of that and the multiplicand, M2, of the
expression.
Make that product.
Make the sum.
Close that case.
And, of course, I could add as many cases as I like here for
a complete set of rules you might find in a calculus book.
So this is what it takes to encapsulate those rules.
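(What goes on the blackboard is essentially the following; the predicates, selectors, and constructors it uses are the wishful thinking, defined below.)

```scheme
;; The calculus-book rules as a case analysis.  More clauses
;; could be added for a more complete set of rules.
(define (deriv exp var)
  (cond ((constant? exp var) 0)
        ((same-var? exp var) 1)
        ((sum? exp)
         (make-sum (deriv (a1 exp) var)
                   (deriv (a2 exp) var)))
        ((product? exp)
         (make-sum
          (make-product (m1 exp)
                        (deriv (m2 exp) var))
          (make-product (deriv (m1 exp) var)
                        (m2 exp))))))
```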
And you see, you have to realize there's a lot of
wishful thinking here.
I haven't told you anything about how I'm going to make
these representations.
Now, once I've decided that this is my set of rules, I
think it's time to play with the representation.
Let's attack that.
Well, first of all, I'm going to play a pun.
It's an important pun.
It's a key to a sort of powerful idea.
If I want to represent sums, and products, and differences,
and quotients, and things like that, why not use the same
language as I'm writing my program in?
I write my program in algebraic expressions that
look like this: the sum of the product of a and the product
of x and x, and things like that.
And the sum of the product of b and x, and c-- whatever--
made of nested sums and products.
Right now I don't want to have operators with unknown
numbers of arguments, so a sum or a product here always
takes exactly two.
This is list structure.
And the reason why this is nice, is because any one of
these objects has a property.
I know where the car is.
The car is the operator.
And the operands are the successive
cars of the cdrs of the list that this is.
It makes it very convenient.
I don't have to parse it.
It's been done for me.
I'm using the embedding in Lisp to advantage.
So, for example, let's start using list structure to write
down the representation that I'm implicitly assuming here.
Well I have to define various things that are implied in
this representation.
Like I have to define how you test for a constant,
and how you test for the same variable.
Let's do those first. That's pretty easy enough.
Now I'm going to be introducing lots of primitives
here, because these are the primitives that
come with list structure.
OK, we define constant.
And what I mean by a constant, an expression that's constant
with respect to a variable, is that the expression is
something simple:
I can't break it into pieces, and yet
it isn't that variable.
That does not mean that there aren't more complicated
expressions that are constants.
It's just that I'm going to look at the primitive
constants in this way.
So what this says is: it's an and.
I can combine predicate expressions, which return true
or false, with and.
First, the expression is atomic, meaning
it cannot be broken into parts.
It doesn't have a car and a cdr. It's not a list. Atom is
a special test built into the system.
And it's not identically equal to that variable.
I'm representing my variable by things that are symbols
which cannot be broken into pieces, things like x, and y,
things like this.
Whereas, of course, something like this can be broken up
into pieces.
And for an expression to be the same variable with respect
to a variable, it must, in fact, be an atomic
expression which is identical to that variable.
I don't want to look inside this stuff anymore.
These are primitive maybe.
But it doesn't matter.
I'm using things that are given to me with a language.
I'm not terribly interested in them.
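(On the blackboard, the two definitions are essentially these; atom? is the special test mentioned above, built into this Lisp system.)

```scheme
;; Constant with respect to var: atomic (no car and cdr to
;; take) and yet not the variable var itself.
(define (constant? exp var)
  (and (atom? exp)
       (not (eq? exp var))))

;; The same variable: atomic and identical to var.
(define (same-var? exp var)
  (and (atom? exp)
       (eq? exp var)))
```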
Now how do we deal with sums?
Ah, something very interesting will happen.
A sum is something which is not atomic and begins with the
plus symbol.
That's what it means.
So here, I will define it.
An expression is a sum if it's not atomic and its head,
its beginning, the car of the expression, is the symbol plus.
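(The definition on the blackboard is essentially:)

```scheme
;; A sum is a non-atomic expression whose car is the symbol +.
(define (sum? exp)
  (and (not (atom? exp))
       (eq? (car exp) '+)))
```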
Now you're about to see something you haven't seen
before, this quotation.
Why do I have that quotation there?
Say your name.
AUDIENCE: Susanna.
PROFESSOR: Louder.
AUDIENCE: Susanna.
PROFESSOR: Say "your name."
AUDIENCE: Your name.
PROFESSOR: Louder.
AUDIENCE: Your name.
PROFESSOR: OK.
What I'm showing you here is that the words
of English are ambiguous.
I was saying, say your name.
I was also possibly saying, say "your name."
But that cannot be distinguished in speech.
However, we do have a notation in writing, which is quotation
for distinguishing these two possible meanings.
In particular, over here, in Lisp we have a notation for
distinguishing these meanings.
If I were to just write a plus here, a plus symbol, I would
be asking, is the first element of the expression, is
the operator position of the expression,
the addition operator?
I don't know.
I would have to have written the addition operator there,
which I can't write.
However, this way I'm asking, is this the symbolic object
plus, which normally stands for the addition operator?
That's what I want.
That's the question I want to ask.
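(As a small illustration, not from the lecture, of the difference the quote makes:)

```scheme
(eq? (car '(+ 3 5)) '+)  ; #t -- the car is the symbol +
(eq? (car '(+ 3 5)) +)   ; #f -- the symbol is not the addition procedure
```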
Now before I go any further, I want to point out that
quotation is a very complex concept, and adding it to a
language causes a great deal of trouble.
Consider the next slide.
Here's a deduction which we should all agree with.
We have: Alyssa is smart, and Alyssa is George's mother.
This "is" is an equality.
From those two, we can deduce that George's mother is smart.
Because we can always substitute equals for equals
in expressions.
Or can we?
Here's a case where we have "Chicago" has seven letters.
The quotation means that I'm discussing the word Chicago,
not what the word represents.
Here I have that Chicago is the biggest city in Illinois.
As a consequence of this, I would like to deduce that the
biggest city in Illinois has seven letters.
But that's manifestly false.
Something has gone wrong.
OK, so once we have things like that, our language gets
much more complicated.
Because it's no longer true that things we tend to like to
do with languages, like substituting equals for equals
and getting right answers, are going to work without being
very careful.
We can't substitute into what's called referentially
opaque contexts, of which a quotation is the prototypical
type of referentially opaque context.
If you know what that means, you can consult a philosopher.
Presumably there is one in the room.
In any case, let's continue now, now that we at least have
an operational understanding of a 2000-year-old issue that
has to do with use, and mention, and all sorts of
things like that.
I have to define what I mean, how to make a sum of two
things, an a1 and a2.
And I'm going to do this very simply.
It's a list of the symbol plus, and a1, and a2.
And I can determine the first element.
Define a1 to be cadr. I've just
introduced another primitive.
This is the car of the cdr of something.
You might want to know why car and cdr are the names of these
primitives, and why they've survived, even though there are
much better ideas, like left and right.
We could have called them things like that.
Well, first of all, the names come from the fact that in the
great past, when Lisp was invented, I suppose in '58 or
something, it was on the IBM 704 or something like
that, which was a machine that had an address register and a
decrement register.
And these were the contents of the address register and the
decrement register.
So it's an historical accident.
Now why have these names survived?
It's because Lisp programmers like to talk to each other
over the phone.
And if you want to have a long sequence of cars and cdrs you
might say, cdaddr, which can be understood.
But "left of right of right of left" is not so clear once
you get good at it.
So that's why we have these words.
All of them up to four deep are defined typically in a
Lisp system.
And A2 to be caddr.
And, of course, you can see that if I looked at one of
these expressions like the sum of 3 and 5, what that is is a
list containing the symbol plus, and a number 3,
and a number 5.
Then the car is the symbol plus.
The car of the cdr. Well I take the cdr and
then I take the car.
And that's how I get to the 3.
That's the first argument.
And the car of the cdr of the cdr gets me to
this one, the 5.
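(So the blackboard definitions are essentially:)

```scheme
(define (make-sum a1 a2)
  (list '+ a1 a2))

(define a1 cadr)   ; the car of the cdr: the first summand
(define a2 caddr)  ; the car of the cdr of the cdr: the second

;; (a1 '(+ 3 5))  =>  3
;; (a2 '(+ 3 5))  =>  5
```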
And similarly, of course, I can define what's going on
with products.
Let's do that very quickly.
Is the expression a product?
Yes, if and only if it's not atomic and its car
is eq to quote, the asterisk symbol, which is the operator
for multiplication.
Make-product of an M1 and an M2 is the list of quote, the
asterisk operation, and M1 and M2.
And I define M1 to be cadr and M2 to be caddr. You get to be
a good Lisp programmer as you start talking that way.
You cdr down lists and cons them up, and so on.
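(In code, essentially:)

```scheme
(define (product? exp)
  (and (not (atom? exp))
       (eq? (car exp) '*)))

(define (make-product m1 m2)
  (list '* m1 m2))

(define m1 cadr)   ; the multiplier
(define m2 caddr)  ; the multiplicand
```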
Now, now that we have essentially a complete program
for finding derivatives, you can add more
rules if you like.
What kind of behavior do we get out of it?
I'll have to clear that x.
Well, supposing I define foo here to be
ax squared plus bx plus c.
That's the same thing we see here as the algebraic
expression written in the more conventional
notation over there.
Well, the derivative of foo with respect to x, which we
can see over here, is this horrible, horrendous mess.
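(Reconstructed by tracing the rules on the blackboard, the interaction looks roughly like this:)

```scheme
(define foo '(+ (* a (* x x)) (+ (* b x) c)))

(deriv foo 'x)
;; =>  (+ (+ (* a (+ (* x 1) (* 1 x)))
;;           (* 0 (* x x)))
;;        (+ (+ (* b 1) (* 0 x))
;;           0))
```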
I would like it to be 2ax plus b.
But it's not.
It's equivalent to it.
What is it?
I have here, what do I have?
I have the derivative of the product of x and x.
Over here is, of course, the sum of x times
1 and 1 times x.
Now, well, it's the first times the derivative of the
second plus the second times the derivative of the first.
It's right.
That's 2x of course.
a times 2x is 2ax, plus 0 times x squared, which doesn't
count, plus b over here, plus a bunch of 0's.
Well, the answer is right.
But people would take off points on an exam for that,
sadly enough.
Let's worry about that in the next segment.
Are there any questions?
Yes?
AUDIENCE: If you had left off the quote when you put the
plus, then would that be referring to the procedure plus,
and could you do a comparison between that procedure and
some other procedure if you wanted to?
PROFESSOR: Yes.
Good question.
If I had left this quotation off at this point, if I had
left that quotation off at that point, then I would be
referring here to the procedure which is the thing
that plus is defined to be.
And indeed, I could compare some procedures with each
other for identity.
Now what that means is not clear right now.
I don't like to think about it.
Because I don't know exactly what it would mean to compare
procedures.
There are reasons why that may make no sense at all.
However, the symbols, we understand.
And so that's why I put that quote in.
I want to talk about the symbol that's
apparent on the page.
Any other questions?
OK.
Thank you.
Let's take a break.
[MUSIC PLAYING]
PROFESSOR: Well, let's see.
We've just developed a fairly plausible program for
computing the derivatives of algebraic expressions.
It's an incomplete program; you might
like to add more rules.
And perhaps you might extend it to deal with
addition with any number of arguments and multiplication
with any number of arguments.
And that's all rather easy.
However, there was a little fly in that ointment.
We go back to this slide.
We see that the expressions that we get are rather bad.
This is a rather bad expression.
How do we get such an expression?
Why do we have that expression?
Let's look at this expression in some detail.
Let's find out where all the pieces come from.
As we see here, we have a sum--
just what I showed you at the end of last time--
of X times 1 plus 1 times X. That is the
derivative of this product.
The product of a times that, where a does not depend upon
x, and therefore is constant with respect to x, is this
sum, which goes from here all the way through here and
through here.
Because it is the first thing times the derivative of the
second plus the derivative of the first times the second as
the program we wrote on the blackboard
indicated we should do.
And, of course, the product of bx over here manifests itself
as B times 1 plus 0 times X because we see that B does not
depend upon X. And so the derivative of B is this 0, and
the derivative of X with respect to itself is the 1.
And, of course, the derivatives of the sums over here turn
into these two sums of the derivatives of the parts.
So what we're seeing here is exactly the thing I was trying
to tell you about with Fibonacci numbers a while ago,
that the form of the process is expanded from the local
rules that you see in the procedure, that the procedure
represents a set of local rules for the expansion of
this process.
And here, the process left behind some stuff, which is
the answer.
And it was constructed by the walk it takes of the tree
structure, which is the expression.
So every part in the answer we see here derives from some
part of the problem.
Now, we can look at, for example, the derivative of
foo, which is ax square plus bx plus c, with respect to
other things, like here, for example, we can see that the
derivative of foo with respect to a.
And it's very similar.
It's, in fact, the identical algebraic expression, except
for the fact that these 0's and 1's are
in different places.
Because the only degree of freedom we have in this tree
walk is what's constant with respect to the variable we're
taking the derivative with respect to, and
what's the same variable.
In other words, if we go back to this blackboard and we
look, we have no choice what to do when we take the
derivative of the sum or a product.
The only interesting place here is, is the expression the
variable, or is the expression a constant with respect to
that variable for very, very small expressions?
In which case we get various 1's and 0's. If we go
back to this slide, we can see the 0's and 1's that appear
here: for example, this 1 over here in the derivative of foo
with respect to a, which gets us the x squared, because that
1 gets the product of x and x into the answer.
Over here, we're now taking the derivative of foo with
respect to c.
But the shapes of these expressions are the same.
See all those shapes.
They're the same.
Well is there anything wrong with our rules?
No.
They're the right rules.
We've been through this one before.
One of the things you're going to begin to discover is that
there aren't too many good ideas.
When we were looking at rational numbers yesterday,
the problem was that we got 6/8 rather than 3/4.
The answer was unsimplified.
The problem, of course, is very similar.
There are things I'd like to be identical by simplification
that don't become identical.
And yet the rules for doing addition and multiplication of
rational numbers were correct.
So the way we might solve this problem is to do the thing we
did last time, which always works.
If something worked last time, it ought to work again.
That thing is: change the representation.
Perhaps in the representation we could put in a
simplification step that produces a simplified
representation.
This may not always work, of course.
I'm not trying to say that it always works.
But it's one of the pieces of artillery we have in our war
against complexity.
You see, because we solved our problem very carefully.
What we've done is we've divided the
world into several parts.
There are derivative rules, and general rules of algebra
of some sort, at this level of detail.
And I have an abstraction barrier.
And I have the representation of the algebraic expressions
as list structure.
And in this barrier, I have the interface procedures.
I have constant, and things like same-var.
I have things like sum, make-sum.
I have A1, A2.
I have products and things like that, all the other
things I might need for various kinds of algebraic
expressions.
Making this barrier allows me to arbitrarily change the
representation without changing the rules that are
written in terms of that representation.
So if I can make the problem go away by changing
representation, the decomposition of the problem
into these two parts has helped me a great deal.
So let's take a very simple case of this.
What was one of the problems?
Let's go back to this transparency again.
And we see here, oh yes, there are horrible things, like
here is the sum of an expression and 0.
Well that's no reason to think of it as anything other than
the expression itself.
Why should the summation operation have
made up this addition?
It can be smarter than that.
Or here, for example, is a multiplication of
something by 1.
It's another thing like that.
Or here is a product of something with 0, which is
certainly 0.
So we won't have to make this construction.
So why don't we just do that?
We need to change the way the representation
works, right here.
Define make-sum to be--
Well, now it's not something so simple.
I'm not going to make a list containing the symbol plus and
things unless I need to.
Well, what are the possibilities?
I have some sort of cases here.
If I have numbers, if A1 is a number--
and here's another primitive I've just introduced; it's
possible to tell whether something's a number--
and if A2 is a number, meaning they're not symbolic
expressions, then why not do the addition now?
The result is just a plus of A1 and A2.
I'm not asking if these represent numbers.
Of course all of these symbols represent numbers.
I'm talking about whether the one I've got is the
number 3 right now.
And, for example, supposing A1 is a number, and it's equal to
0, well then the answer is just A2.
There is no reason to make anything up.
And if A2 is a number, and equal to 0, then
the result is A1.
And only if I can't figure out something better to do with
this situation do I start a list. Otherwise, I want
the representation to be the list containing the quoted
symbol plus, and A1, and A2.
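(So the new constructor on the blackboard is essentially:)

```scheme
;; Replaces the earlier make-sum: simplify where we can.
(define (make-sum a1 a2)
  (cond ((and (number? a1) (number? a2))
         (+ a1 a2))          ; two actual numbers: add them now
        ((and (number? a1) (= a1 0))
         a2)                 ; 0 + a2 is just a2
        ((and (number? a2) (= a2 0))
         a1)                 ; a1 + 0 is just a1
        (else (list '+ a1 a2))))
```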
And, of course, a very similar thing
can be done for products.
And I think I'll avoid boring you with them.
I was going to write it on the blackboard.
I don't think it's necessary.
You know what to do.
It's very simple.
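(One plausible version, following the same pattern, is sketched here; the lecture leaves it unwritten.)

```scheme
;; Replaces the earlier make-product: simplify where we can.
(define (make-product m1 m2)
  (cond ((and (number? m1) (number? m2)) (* m1 m2))
        ((and (number? m1) (= m1 0)) 0)   ; 0 * m2 is 0
        ((and (number? m2) (= m2 0)) 0)   ; m1 * 0 is 0
        ((and (number? m1) (= m1 1)) m2)  ; 1 * m2 is just m2
        ((and (number? m2) (= m2 1)) m1)  ; m1 * 1 is just m1
        (else (list '* m1 m2))))
```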
But now, let's just see the kind of results we get out of
changing our program in this way.
Well, here are the derivatives after having just changed the
constructors for expressions.
The same foo, ax squared plus bx plus c, and what I get is
nothing more than: the derivative of that is 2ax plus
b.
Well, it's not completely simplified.
I would like to collect common terms and sums.
Well, that's more work.
And, of course, programs to do this sort of thing are huge
and complicated.
Algebraic simplification, it's a very complicated mess.
There's a very famous program you may have heard of called
Macsyma, developed at MIT in the past, which is 5,000 pages
of Lisp code, mostly the algebraic simplification
operations.
There we see the derivative of foo with respect to x.
In fact, it's something I wouldn't take off more than 1
point for in an elementary calculus class.
And the derivative of foo with respect to a, well it's gone
down to X times X, which isn't so bad.
And the derivative of foo with respect to b is just X itself.
And the derivative of foo with respect to c comes out 1.
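(Tracing the rules with the smarter constructors, the results on the slide come out as something like:)

```scheme
(deriv foo 'x)  ; =>  (+ (* a (+ x x)) b)
(deriv foo 'a)  ; =>  (* x x)
(deriv foo 'b)  ; =>  x
(deriv foo 'c)  ; =>  1
```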
So I'm pretty pleased with this.
What you've seen is, of course, a little bit contrived, a
carefully organized example to show you how we can
manipulate algebraic expressions, how we do that
abstractly in terms of abstract syntax rather than
concrete syntax, and how we can use the abstraction to
control what goes on in building these expressions.
But the real story isn't just such a simple thing as that.
The real story is, in fact, that I'm manipulating these
expressions.
And the expressions are the same expressions--
going back to the slide--
as the ones that are Lisp expressions.
There's a pun here.
I've chosen my representation to be the same as the
representation in my language of similar things.
By doing so, I've created
the necessity to have things like quotation,
because of the fact that my language is capable of writing
expressions that talk about expressions of the language.
I need to have something that says, this is an expression
I'm talking about rather than this expression is talking
about something, and I want to talk about that.
So quotation stops and says, I'm talking about this
expression itself.
Now, given that power, if I can manipulate expressions of
the language, I can begin to build even much more powerful
layers upon layers of languages.
Because I can write languages that not only are embedded in
Lisp or whatever language you start with, but languages that
are completely different, that are just, if we say,
interpreted in Lisp or something like that.
We'll get to understand those words more in the future.
But right now I just want to leave you with the fact that
we've hit a point which gives us tremendous power.
At this point, we've bought a sledgehammer.
We have to be careful what flies when we apply it.
Thank you.
[MUSIC PLAYING]