Tonight is Act III: Function the Ultimate. We're going to be talking about functions tonight. Functions are the very best part of JavaScript. It's where most of the power is, it's where the beauty is. Like everything else in JavaScript, they're not quite right, but you can work around that, and there's a lot of good stuff here. Tonight, unlike the previous two nights, I'm going to be showing you quite a lot of code. Because we're talking about functions, you need to see how they work. I personally tend to fall asleep in presentations that put a lot of code on the screen; it's just kind of not a good time, so I have a lot of examples and I tried to make them all fit on one screen in big type. They're all going to be simple, but they should be interesting and useful. Let's begin. Function is the key idea in JavaScript. It's what makes it so good and so powerful. In other languages you've got lots of things: you've got methods, classes, constructors, modules, and more. In JavaScript there's just function, and function does all of those things and more. That's not a deficiency, that's actually a wonderful thing: having one thing that can do a lot, and can do it brilliantly, at scale, that's what functions do in this language. Here's what a function is. A function is the word 'function'. It optionally has a name, which can be used to allow it to call itself. It can have a set of parameters, which are wrapped in parens, containing zero or more names which are separated by commas. It can have a body which is wrapped in curly braces, containing zero or more statements. A function expression like that produces an instance of a function object. Function objects in this language are first class, which means that they can be passed as an argument to another function, they may be returned as a return value from a function, they can be assigned to a variable, and they can be stored in an object or an array. 
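As a small illustration of what "first class" means here, the sketch below assigns a function to a variable, passes one as an argument, returns one from a function, and stores functions in an object and an array. All of the names (`square`, `twice`, `table`, `list`, `factorial`) are hypothetical and chosen only for this example.

```javascript
// Assigned to a variable.
var square = function (n) {
    return n * n;
};

// Passed as an argument, and a new function returned as the result.
var twice = function (f) {
    return function (x) {
        return f(f(x));
    };
};

// Stored in an object and in an array.
var table = {op: square};
var list = [square, twice(square)];

// The optional name lets a function expression call itself.
var factorial = function fact(n) {
    return n <= 1 ? 1 : n * fact(n - 1);
};

// table.op(4) returns 16, list[1](3) returns 81, factorial(5) returns 120.
```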
Anything you can do with any other kind of value in this language, you can do with a function. A function expression is like an object literal in that it produces a value, except in this case it produces something that inherits from Function.prototype. It may seem kind of strange that a function can inherit methods from something else, but it can. So in this language, functions have methods. That may sound odd, but we've got that. I'll show you some examples of that. We have a var statement which allows us to declare and initialize variables within a function. Because JavaScript is not a strongly typed language you don't specify types in the var statement, you just give a name for the variable. Any variable can contain any value that's expressible in the language. A variable that's declared anywhere within a function is visible everywhere within the function; we don't respect block scope. The way var statements work is the var statement gets split into two pieces. The declaration part gets hoisted to the top of the function and is initialized with undefined. Back at the place where the original var statement was, it gets turned into an assignment statement so that the var gets removed. Here we have an example. I've got myVar = 0 and myOtherVar. What that does is, at the top of the function it defines myVar and myOtherVar and sets them both to undefined. Then at the point in the function where the original var statement was, we have an assignment statement. The separation and the hoisting operation changes the way you might think of the scoping of variable names. We also have a function statement. Unfortunately, the function statement looks exactly like a function expression. The only difference is that the name, instead of being optional, is now mandatory. But in all other respects it looks exactly the same, and it is confusing to have both. Why do we have both? 
Well, the function statement was the older thing, and the function expression, which is really the more useful form, was added to the language later. What the function statement does is it expands into a var statement which creates a variable and assigns a function value to it. That expansion, because it's actually a var statement, splits into two things. Except unlike the ordinary var statement that we saw earlier, both pieces of it are hoisted to the top of the function, so things are not necessarily declared in the order that you think they are. It's confusing having both function expressions and function statements, so how do you know which is which when you're looking at it? The rule is, if the first token of a statement is function, then it's a function statement. Otherwise, it's a function expression. Generally, function expressions are easier to reason about. For example, you can't put a function statement inside of an if statement because of the hoisting stuff. You might want to have a different function being defined if you take the else branch or the then branch, but hoisting doesn't look at branching, and it happens before we know the result of the if, so the language definition says that you can't do that. It turns out every browser lets you do that anyway, but because the language definition doesn't tell you what it's supposed to do, they all do something different. That's one of those edge cases that you want to stay away from. In this language we have function scope. In most other languages that have C syntax we have block scope, but because of the way vars get hoisted, block scope doesn't work in this language. In JavaScript, blocks do not have scope. Scope means that, in another language such as Java, if you declare a variable inside of curly braces, it's visible only inside of the curly braces and not outside. But that doesn't happen in JavaScript because of hoisting. 
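The hoisting rules described so far can be observed directly. This is a minimal sketch with hypothetical function names, showing that a `var` is declared (as undefined) before its statement is reached, that a function statement is fully usable before its position in the source, and that a `var` inside a block is visible outside the block.

```javascript
function varHoisting() {
    // myVar is already declared here, but still holds undefined.
    var seen = typeof myVar;
    var myVar = 0;
    return [seen, myVar];
}

function functionHoisting() {
    // Both the name and the function value are hoisted, so this call works
    // even though the function statement appears further down.
    var result = helper();
    function helper() {
        return "called before its statement";
    }
    return result;
}

function noBlockScope() {
    if (true) {
        var inner = "declared inside the block";
    }
    // Still visible: for var, only functions create scope.
    return inner;
}

// varHoisting() returns ["undefined", 0].
```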
The variable declaration gets pulled out of the if statement and moved to the top of the function, so the variables will be visible everywhere within the function. Only functions, in this language, have scope. If you declare a variable in a function, that variable is not visible outside of the function, but it's still visible everywhere within the function. If you're coming from other languages, this can be confusing. For example, a function like this will work in most other languages and will fail in JavaScript without an error. What you'll find is that it will run forever, and that's because the programmer thinks he's created two i variables, but in fact there's only one i variable. So the inner loop is constantly resetting the i value so that the outer loop will never finish. That's something to be aware of: in JavaScript, you can't be depending on block scope. Because of hoisting, because of the way that variable statements and function statements work, I recommend that you declare all variables at the top of the function and declare all functions before you call them. In other languages the prevailing style is to declare variables near the site of their first use, and in languages which have block scope that's good advice, but I don't recommend it in this language. We have a return statement. A return statement allows a function to return early, and also indicates what value the function should be returning. There are two forms of it: there's one that takes an expression, and one that does not. If there's no expression, then the value that gets returned is undefined. It turns out, every function in JavaScript returns a value, and if you don't explicitly say what the value is, it will return the undefined value. Unless it was called as a constructor, in which case it will return the new object that you're constructing. One other note: you cannot put a line break between the word return and the expression. 
Semi-colon insertion will go in and turn it into a statement that returns undefined, which is tragically awful. There are two pseudo parameters that every function can receive. One is called arguments, and the other has the unfortunate name of this. Let's look at arguments first. When a function is invoked, in addition to the parameters that it declares, it also gets a special parameter called arguments. It contains all of the arguments that were actually specified in the invocation. It is an array-like object, but it is not an array, which is unfortunate. I'll show you some examples of why that's unfortunate. It's array-like in that it has a length property, so you can ask arguments how many arguments were actually passed to this function, which might be different than the number of parameters that you specify. It also has very weird interaction with parameters. If you change one of the elements of the arguments array, you may change one of the parameters that it's associated with. If you do something really scary like splicing on the arguments array, you may scramble and reassign all of your parameters. Generally, you don't want to mess with the arguments array. While the language doesn't require you to treat it as a read-only structure, I highly recommend that you treat it as a read-only structure. OK, let's look at an example. I want to have a function in which I can pass it some number of numbers and it will then add them all and return the result. The way I do that is I first look at arguments.length to find out how many numbers I'm going to be adding. Then I will have a loop which will go through each of those members of the arguments' pseudo array and figure out the total, and then when it's done it returns the total. This is how you would write that in ES3, or in the third edition of the ECMAScript Standard. This gets a little bit nicer in the fifth edition. In the fifth edition, arguments is more array-like than before. 
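Two of the points above lend themselves to a short sketch: the line break after `return`, and the ES3-style summing function that walks `arguments` by index. The names `returnsUndefined` and `sum` are assumptions made for this illustration, not code taken from the slides.

```javascript
// A line break after return: semicolon insertion ends the statement there,
// so the braces below are parsed as an unreachable block, not an object.
function returnsUndefined() {
    return
    {
        ok: true
    }
}

// ES3-style sum: treat arguments as read-only and index into it.
function sum() {
    var i,
        n = arguments.length,
        total = 0;
    for (i = 0; i < n; i += 1) {
        total += arguments[i];
    }
    return total;
}

// returnsUndefined() returns undefined; sum(1, 2, 3, 4) returns 10.
```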
It's more array-like in that it actually inherits, now, from array.prototype, and array.prototype now contains some interesting functions like reduce. I can call arguments.reduce and pass it a function that does adding, and the result of that will be to add up all the members of that array and return it. I think it's a more elegant way of expressing the same program. Then we have the this parameter. I'm discovering that I don't like the name 'this' because it makes it really difficult to talk about it. My first sentence: 'the this parameter…' Already you're in trouble. I mean, it's just hard to talk about it in doing code reviews: 'oh, I see your problem, this is wrong.' [laughter] Well, you might be right. So what is this? The this parameter contains a reference to the object of invocation. This allows a method to know what object it is concerned with. It allows a single instance of a function object to serve as many functions. You can take a single function object and store it in lots of different objects, or put it in lots of prototypes, and allow it to be inherited by even more objects. There's just one instance of the function in the system, but all of those objects think that they have that method, and they will do the right thing with it because they use this to figure out what object they should actually be manipulating. So this is the key to prototypal inheritance. Prototypal inheritance works in this language because of this. We have the parens suffix operator, which is used for invoking, or calling, or executing the function. It surrounds zero or more comma separated expressions which will become the arguments of the function, and those arguments will be bound to the parameters of the function. If a function is called with too many arguments, the extra arguments are ignored. You don't get an error for that, they're just ignored. But they'll still go into the arguments array, so if you want to find out about them they're still accessible to you. 
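Here is a hedged sketch of the reduce-based sum and of one function object serving as a method of several objects. One assumption to flag: rather than calling `arguments.reduce` directly, the sketch borrows `reduce` from `Array.prototype`, which works whether or not a given engine lets `arguments` inherit array methods. The names `describe`, `alice`, and `bob` are hypothetical.

```javascript
// Sum via reduce, borrowed from Array.prototype (an assumption of this sketch).
function sum() {
    return Array.prototype.reduce.call(arguments, function (a, b) {
        return a + b;
    }, 0);
}

// One function object, stored in two different objects.
function describe() {
    return "I am " + this.name;
}

var alice = {name: "alice", describe: describe};
var bob = {name: "bob", describe: describe};

// sum(1, 2, 3, 4) returns 10.
// alice.describe() returns "I am alice"; bob.describe() returns "I am bob".
// alice.describe === bob.describe is true: there is only one function.
```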
If a function is called with too few arguments, that's not an error either. It will fill in undefined for any things that you did not include. There's no implicit type checking at all, so if the types of the parameters are important to you then you need to check them yourself within your function. There are four ways to call a function. There's the function form, the method form, the constructor form, and the apply form. They differ in what they do with this. In the method form, we have an object, and then we say dot and a function name, or a subscript with some method name, and then pass the arguments. That will call the function and associate this with whatever that object was. That will allow the function, then, to manipulate this. Then there's the function form, in which we simply take a function value and call it immediately. In this case there's no object to associate this to, so in ES3 this was set to the global object, which was just awful. In ES5/Strict we improve that a little bit: we now bind this to undefined, which is less awful. But one problem with this form shows up when you have an inner function inside of an outer method, and that method wants the inner function to have access to this. It doesn't have access to it, because it has its own this, which is different from the outer this. So in order to make this visible to the inner function, the outer function can declare a variable, perhaps called that, assign this to it, and then the inner function will have access to that. We have a constructor form, which looks like the function form except we have the new prefix. Now when the function is called, this is bound to a new object that inherits from the function's prototype member. Then if the function does not explicitly return a value, that new object will be returned. This is used very much in the pseudo classical style which we'll look at a little bit later. 
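The first three calling forms can be sketched as follows. The `counter` object and the `Point` constructor are hypothetical examples; the `that` variable follows the naming suggested above.

```javascript
var counter = {
    count: 0,

    // Method form: this is bound to the object before the dot.
    increment: function () {
        this.count += 1;
        return this.count;
    },

    // An inner function called in the function form gets its own this,
    // so the outer method saves its this in a variable called that.
    addTwice: function (amount) {
        var that = this;
        function add() {
            that.count += amount;
        }
        add();
        add();
        return this.count;
    }
};

// Constructor form: this is bound to a new object that inherits from
// Point.prototype, and that object is returned implicitly.
function Point(x, y) {
    this.x = x;
    this.y = y;
}
Point.prototype.sum = function () {
    return this.x + this.y;
};

var p = new Point(3, 4);

// counter.increment() returns 1; counter.addTwice(5) then returns 11.
// p.sum() returns 7.
```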
Then finally there's the apply form in which we use either the function's apply method or its call method. What they have in common is they both allow us to specify what this is. The value that this should have will be the first parameter. The difference between them is that apply takes an array of arguments and call takes zero or more individual parameters, which will become the arguments. Here I show how to define call in terms of apply, and also show a little bit of the ugliness that's caused by the fact that arguments is not a real array. What I want to do to implement call is take all of the parameters that were passed except for the first one, and I do that by using the slice method, except arguments doesn't have a slice method in ES3, so instead I have to go out and find it. I know that I can find it at Array.prototype, so I go Array.prototype.slice.apply, and then I can take that piece of arguments. Really awful. Again, we fix that in ES5 a little bit. To summarize, this is a bonus parameter, and its value depends on the calling form. If it's called as a function, it's bound to either the global object in ES3, or to undefined in ES5/Strict. If it's called as a method it's bound to the object containing the method. If it's called as a constructor, it's bound to the new object being constructed. And if it's called in the apply form, then we explicitly pass in an argument that determines what this is going to be. We call these things functions, but they don't behave exactly like mathematical functions. In a mathematical function you would expect that every time you use a function with a particular set of inputs, you should get exactly the same outputs. There are some programming languages in which people are trying to match that ideal and there is some attractiveness in doing that, because the behavior of programs is more predictable and it's easier to reason about them, but they're also a lot harder to write. 
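To make the apply form concrete, here is a rough sketch of call expressed in terms of apply. To avoid touching `Function.prototype`, it is written as a standalone helper with the hypothetical name `callVia`; the `greet` function is likewise only an illustration.

```javascript
// callVia(func, thisValue, a, b, ...) behaves like func.call(thisValue, a, b, ...).
function callVia(func, thisValue) {
    // arguments is not a real array, so slice is borrowed from
    // Array.prototype to drop the first two entries.
    var rest = Array.prototype.slice.apply(arguments, [2]);
    return func.apply(thisValue, rest);
}

function greet(greeting, mark) {
    return greeting + ", " + this.name + mark;
}

// callVia(greet, {name: "Ada"}, "Hello", "!") returns "Hello, Ada!",
// the same result as greet.call({name: "Ada"}, "Hello", "!").
```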
Because it turns out that programs, in order to be interesting, are interacting with the world, and the world is always different. The functions are always going to be dealing with different things, so they'll tend to want to keep state, and to manipulate that state and to mutate things. So functions will tend to have side effects. In JavaScript you can program in the pure functional sense, in that you can assume: OK, I'm never going to assign to a variable, and I'm never going to change any object once it's created. And the language will let you do that, but you're going to find it's really hard. We tend to change things a lot, because it's just an easier style of programming. Where did functions come from? Originally, there was something called the subroutine. The subroutine began life back in the assembly language era, where you'd want to be able to define your own op codes, and you could take a bunch of instructions that you used frequently and create a pseudo op and call it. Subroutines were born. They introduced the idea of call and return, where we call the thing and when it's finished it comes back and we resume from where we were. That idea has been in virtually every language since then. In different languages they've been called subs, procedures, procs, funcs, functions, lambdas, but it's all the same idea of taking some specification of computation and packaging it so that it can be re-used conveniently. The first motivation for sub-routines was code reuse. The first generations of computers had really small memories, so in order to get programs to fit you'd want to take pieces of the program that were recurring and factor them out so that they were only there once, and then call them. That was the only way you could hope to get it to fit. It turned out that was such a good idea that was then used in the design of programs. 
Treating a program as a single, monolithic list of instructions was too difficult to reason about, so if we'd divide and conquer that program into smaller components then we can think about those components more easily. A subroutine or function was a natural form for doing that. The next step was using them to do modular things — for example, to create libraries of routines that could be loaded with any program so that you could have stuff that could be reused from one program to another. That led to a sense of expressiveness where, in thinking about how to design an application, you would first think of what set of subroutines would make it really easy to write this application, essentially designing a programming language expressed as subroutine calls, which are ideal for implementing this application, and then write those subroutines. The next step up was higher order functions, in which we're going to do things with functions which couldn't be done otherwise. That's when some of the power of the language really starts to work for you. One of the cases where that occurs is in recursion. Recursion is when a function calls itself, or is defined in terms of itself. Now, at first this didn't make sense to some programmers. For example, if you were working in FORTRAN, FORTRAN couldn't do this. It was not possible for a function to call itself in FORTRAN. A lot of very good programmers looked at the idea of recursion, and reasoned: well, I've never used it, and I don't understand the need for it, therefore it could not be very important. It turns out it's actually really important, and when you learn to think recursively you become a much stronger programmer. One of the classic algorithms for a recursive solution is the Quicksort, which was invented basically because ALGOL was invented. Expressing this in a recursive programming language turned out to be really easy. There are basically two steps in Quicksort. 
The first is you divide an array into two groups: all the big values and all the small values. One way you could do that is you could have two pointers that are going through the array, and one starts on the small end and one starts on the big end, and when either finds something that's in the wrong group they swap them, and then continue scanning in until they meet. When they meet, you're done. Then you go to step two, where you take each of those groups and call Quicksort on those groups, and you're done. That's the whole sort, and it's really fast. There are more optimizations you can do to it that make it even faster, but just doing what I was describing in the average case is n log n, which is really good for a sort, and you hardly do anything. It's just really, really simple. So once you can learn to think recursively, a lot of really interesting things fall out. Here's another kind of recursion. You might recognize these from the JSON language. We've got the syntax diagrams for values and arrays, and you might notice that there's a dependence issue going on here where a value can be an array, but an array can contain a value. A naïve programmer might struggle, thinking how do I organize my functions in order to parse something like this? I've got this circular dependency and that's really hard, but it turns out if you have recursion working for you, there's a trivial solution to this. Here I have two functions. Each exactly implements one of those syntax diagrams. I've got the value function which, when it sees a square bracket, will call the array function, and return whatever it returns. Then I've got my array function, and for each of the things that it finds it calls value to find out what it is. Here I've got mutual recursion going on, and it all works out. You don't have to think about how to manage the transition from one to another, just the ordinary function plumbing does all of that work for you. You don't even have to think about it. 
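The mutual recursion between value and array can be sketched with a deliberately tiny parser. This is not the parser from the slides: it assumes a made-up subset of JSON in which a value is either a non-negative integer or an array, and the helper names `white` and `number` are hypothetical.

```javascript
function parse(text) {
    var at = 0;     // index of the current character

    function white() {
        while (text.charAt(at) === " ") {
            at += 1;
        }
    }

    function number() {
        var start = at;
        while (text.charAt(at) >= "0" && text.charAt(at) <= "9") {
            at += 1;
        }
        if (start === at) {
            throw new SyntaxError("Expected a digit at " + at);
        }
        return Number(text.slice(start, at));
    }

    function array() {
        var result = [];
        at += 1;                        // skip "["
        white();
        if (text.charAt(at) === "]") {
            at += 1;
            return result;
        }
        while (true) {
            result.push(value());       // array calls value...
            white();
            if (text.charAt(at) === "]") {
                at += 1;
                return result;
            }
            if (text.charAt(at) !== ",") {
                throw new SyntaxError("Expected , or ] at " + at);
            }
            at += 1;
        }
    }

    function value() {
        white();
        if (text.charAt(at) === "[") {
            return array();             // ...and value calls array
        }
        return number();
    }

    return value();
}

// parse("[1, [2, 3], []]") returns the nested array [1, [2, 3], []].
```

Neither function needs to know how deep the nesting goes; the ordinary call and return mechanism keeps track of it.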
Lisp had this stuff going on in 1958, ALGOL had it in '60, but it took a while to get into the mainstream languages, partly because people couldn't think about how to implement it efficiently. At that time, the way subroutine calls worked, on most machines you had self modifying instructions, so when you called a function it would destroy whatever is in the first word of the function and replace it with a jump back to the place that it was called from, and you couldn't do that recursively because once you've clobbered that address there's no getting back. So you needed some other place to keep the return addresses, and eventually that turned out to be a stack. All modern CPUs now have support for that, usually in the form of auto incrementing or auto decrementing pointer instructions. These assembly language notions eventually found their way into programming languages, so these things come straight out of assembly language but found their way into C and then into Java, and everything else. I don't like them. They look way too primitive and fish brained to me, if you know what I mean. I think we can do better than that. One of the other key ideas which was, again, alien to people who'd never used it, was closure. People who were working in languages in which closure was not an option were like 'I've been programming for years without it, I don't understand why you'd ever want it.' But it turns out JavaScript's got it, and it's really, really good. That's where we're going to be spending most of our time tonight. It's sometimes called lexical scoping, sometimes called static scoping. It has to do with how variable names are resolved in nested functions. The context of an inner function includes the scope of the outer functions, so all of the variables that are in the outer function are available to the inner function, and this continues even after the parent function has returned. That sounds kind of weird, so I've got a lot of examples to show you what this means. 
I'll start with a simple one. I've got a function called digit_name, and digit_name will take a number as an argument, and will return the name of that number in English. It will take advantage of an array of strings stored in names. As you can see, it's a really simple function. Unfortunately, the way I've defined it here, names is a global variable. The problem with that is, if there's anything else in the environment that is also a global variable that has that name, they're going to interfere with each other and will probably cause this to fail. That's something you cannot test for, because it's impossible to test with everything that might be loaded on a page. For example, it might be that a third party ad gets loaded one day that happens to have a global variable called names, and now your page died. That's intolerable, so we want to, as much as possible, reduce our dependence on global variables. One way we could do that is to rewrite this program so that names is now a local variable of the digit_name function. And that works. It's a local variable, we have function scope, names is not visible on the outside, so even if an evil ad comes in and has a names variable it will not interfere with this one, so that's good. This is a much more reliable version of the function. Unfortunately, every time we call the function, we're going to allocate a new array and stuff ten things into it, which is going to take some time. We don't want to do that; that's a terrible waste. In this case it's a fairly trivial thing, but we might have a more complicated function with a more complicated initialization, so we want to be able to factor that out. Closure provides a really nice way to do that. Now I have a function and it has a private names variable, and it returns a function. The function it returns is assigned to digit_name. The important thing is, notice at the bottom, we're invoking the function now. 
We're invoking the function immediately, so what I'm storing in digit_name is not the whole function, it is the function that it returns. OK? This is really important. In order to give the reader a clue that there's something interesting going on here (because assigning a function looks almost the same as assigning the result of a function that's immediately invoked), I wrapped it in parens. The whole thing is wrapped in the golden parens. That's a clue to the reader; it's not required by the language, but I think it is required by humans. It gives us a clue that there's something really interesting going on here. We assign the return value of the outer function to digit_name. The outer function has now returned, digit_name now contains a function, which is the green function. That green function still has access to names, even though names is a private variable of a function that's already returned. That's closure: one function closes over the variables of another function. This turns out to be one of the most important features in JavaScript; this is the thing that it got amazingly right. This is the thing that makes JavaScript one of the world's brilliant programming languages. There's another pattern going around called lazy function definition. I show you this as a warning. Don't do this. The idea here is that, in this form, I unconditionally initialize the function before we're going to start calling it. But what if the initialization is really expensive, so we don't want to do it unless we know the function is going to end up getting called at least once? This lazy pattern attempts to do that. What it does is it assigns to digit_name a function, and when that function is called it will then store another function into the same variable. So it'll replace itself, it'll modify itself. The idea here is that that allows us to avoid having to initialize the thing, if we don't need to do it. But it comes at a cost, and the cost is confusion. 
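For reference, the closure form of digit_name described above (the one that precedes the lazy variant) can be sketched like this. The exact contents of the array are an assumption; the shape follows the description: a private names variable, an inner function that is returned, and an immediate invocation wrapped in parens.

```javascript
var digit_name = (function () {
    // Private to the outer function; allocated once.
    var names = ["zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"];

    // The returned function closes over names.
    return function (n) {
        return names[n];
    };
}());

// digit_name(3) returns "three".
```

The outer function has already returned by the time digit_name is called, yet names is still reachable from the inner function and from nowhere else.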
Digit_name is no longer first class in that if I were to pass it to a function and let that function call it, or if I were to assign it to an object and let someone call it as a method, every time it gets called from that point on it will do the initialization and stuff a new function into digit_name. Instead of making it faster we've actually made it slower. It's slower than the slow case we started off with. Now, the counter-argument is, OK, you've got to be really careful to not do that, so one of the rules we'll put in the documentation is that this can only be called from the global variable, you can't use the function value as a function value except to call it immediately, and that it's worth it because we're saving the initialization cost. It turns out that analysis is wrong. All we're saving is the cost of an if per iteration, and let me show you why that's the case. Here we're going back to the closure form, except I put an if statement in it, so that if names hasn't been initialized yet, we'll initialize it now, and then we'll do what we always do. The cost of this compared to the previous one was one if statement per invocation, which is in the noise, it's not even measurable. The optimization that we were hoping to get in the lazy form just doesn't pay off, and we get weirdness instead. Now, an argument about that might be: well, suppose we call this function a million times, or a gazillion times. A gazillion if statements, that starts to add up to something. You can go yeah, maybe that's true. But if you think you're really going to call this a gazillion times, we shouldn't be optimizing the case where we're not going to call it at all. [laughter] I thought I heard some applause there. Maybe not. [laughter] OK, here's another example. A fade function. This is something you might do in an Ajax application. 
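The cost comparison above can be made visible with a counter. This sketch is illustrative only: `initCount`, `lazy_digit_name`, and `saved` are hypothetical names, and the counter exists purely to show how often the initialization runs when the original function value is held somewhere else.

```javascript
var initCount = 0;  // counts how many times the lazy initialization runs

// Lazy definition: the function replaces itself on first call.
var lazy_digit_name = function (n) {
    initCount += 1;
    var names = ["zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"];
    lazy_digit_name = function (n) {
        return names[n];
    };
    return lazy_digit_name(n);
};

// Someone holds on to the original function value...
var saved = lazy_digit_name;
saved(1);
saved(2);
saved(3);
// ...and initCount is now 3: every call through saved re-initialized.

// Closure form with an if guard: initialize on first use, once.
var digit_name = (function () {
    var names;
    return function (n) {
        if (!names) {
            names = ["zero", "one", "two", "three", "four",
                    "five", "six", "seven", "eight", "nine"];
        }
        return names[n];
    };
}());

// digit_name(7) returns "seven", however the function value is passed around.
```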
I want to take some object (maybe a div or something) and have it fade from yellow to white, maybe as an indication to the user that something changed and they should pay attention to it. I've got my fade function. First thing I do is find a DOM element and create a variable called level, which I'll set initially to 1. Then I'll define a step function, and then I will call setTimeout, passing that step function with a time, so it'll fire in a tenth of a second. And then it returns. Done. That's the end of fade. Then suddenly, a tenth of a second later approximately, the step function executes. It will first define a variable H, and initialize it with level. What is level? Level is the variable of fade. It's not the value of fade when it was created, it is the current value, it is the current variable. It does the same thing with DOM: it gets access to the DOM variable and uses that to change the background color of that DOM node. It then looks at level, and if it's less than 15 (which it will be, at this point) it will add 1 to it. It's adding 1 to the level variable of the fade function that's already returned, and then it will call setTimeout, and in a tenth of a second will do this again. It will keep doing it until eventually we reach 15, and then we stop. Now, suppose we had three things on the page and we wanted them all to fade simultaneously. We call fade 1, 2, 3, with three different IDs at the same time. Are those three executions going to interfere with each other? No, not at all. Because each invocation of fade has its own unique set of variables: its own DOM, its own level, creates its own step functions, and they do not interfere with each other at all. So this works, again, because of closure. Because step is able to close over the DOM and level variables, it just works. Everybody still with me? OK, one more example along these lines. I want to make a later method. 
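A sketch of the fade function walked through above, under two assumptions that are not in the talk: it takes the element itself rather than an ID, and it accepts an optional `schedule` function so the stepping can be exercised outside a browser. The color string format is also a guess at what the slide showed.

```javascript
function fade(dom, schedule) {
    var level = 1;

    // Default to a tenth of a second between steps.
    schedule = schedule || function (fn) {
        setTimeout(fn, 100);
    };

    function step() {
        var h = level.toString(16);
        // dom and level are the variables of a fade call that has
        // already returned; step closes over them.
        dom.style.backgroundColor = "#FFFF" + h + h;
        if (level < 15) {
            level += 1;
            schedule(step);
        }
    }

    schedule(step);
}

// In a browser: fade(document.getElementById("notice"));
// where "notice" is a hypothetical element id.
```

Each call to fade gets its own level and its own step, so several fades can run at once without interfering.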
It's like setTimeout except more object oriented, so I want it to be a method of all objects. I can take any object, call later, give it the number of milliseconds in which to wait. It doesn't actually wait, it puts it on the timer queue, and eventually it'll get around to dispatching it. Give it the name of a method, or perhaps pass in a function which will be treated as a method, and then the other parameters that method would need. On the next screen I'll show you what it looks like. But again, I'll point out the problem with arguments. What I'm going to want to be able to say is: arguments.slice(2), so that I can take all of the parameters that were passed except for the first two and make a nice little array out of it. I can't do that in ES3, instead I have to write Array.prototype.slice.apply(arguments, [2]), which is pretty nasty. So when you see that on the next screen, you'll know why that is. In ES5, you can do the simpler thing. I'm going to add this to Object.prototype. I could add it to any of the ancestors of my application. This is one place to put it. Object.prototype is a global object and all of the problems you have with global variables you have with global prototypes as well, so this is something you want to do really cautiously. You want to do it conditionally, just in case the language ever actually adds later as standard equipment, so that you're not going to be replacing the official version with your version. Generally you don't want to be doing this in applications, although it's sometimes a reasonable thing to be doing in Ajax libraries. In this case, if we don't already have an Object.prototype.later method, we're going to define one. We're going to pass in the number of milliseconds and the method, and then we'll create an array of the additional arguments. We're binding that to this; it's doing the thing I showed you before, because in the green function we're going to want access to this, but this doesn't work. 
'This' is not captured in closure. But 'that' is, and so that's how we get that into it. That will call setTimeout, and will cause that function's method to get invoked at that time. One other thing I'm doing here is when later is finished, which happens immediately, it returns the value of 'that', which is also 'this'. The advantage of doing that is it allows us to then cascade on that. So if I had several things that I wanted to have happen later but at different times, I could say myObject.later(5, ...).later(10, ...).later(20, ...), and so on. I could just cascade all these things one after another because each returns its own object, so we can then go right on and invoke the next one. There are a lot of Ajax libraries that carry this idea to excess, but it's a really nice pattern, and I think it works really nicely in this language. Another example: partial application. We're starting to get a little theoretical now. Partial application says I'll take a function and a parameter and return another function which doesn't execute that yet, but will when it's supplied with additional parameters. Let's start with the example first. Using a function called curry, I'm going to pass it an add function (which takes two arguments and adds them together) and I'm going to pass it 1. It will return a function which will add 1 to whatever gets passed to it. I'm going to store it in inc, because that's a good name for that, and then I can call it. So if I now pass a 6 to inc, I get 7. This is called partial application. The implementation of it is, I'll first get an array of arguments, except for the first one, because the first one is the function and I don't need that one. In this case I'm assuming I'm on ES5, so I'm not doing the awful Array.prototype.slice.apply trick. Then curry returns a function, and that function will apply the arguments to the function.
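The curry function just described might be sketched like this. It is a sketch under one stated assumption: since the arguments object in standard engines has no slice method, the code borrows Array.prototype.slice to turn arguments into real arrays before concatenating them.

```javascript
// curry(func, a, b, ...) returns a function that calls func with
// a, b, ... followed by whatever arguments it is given later.
function curry(func) {
    var slice = Array.prototype.slice,
        args = slice.call(arguments, 1);   // everything except func itself
    return function () {
        // Convert the inner arguments to a real array so that concat
        // appends its members rather than the arguments object itself.
        return func.apply(null, args.concat(slice.call(arguments)));
    };
}

function add(a, b) {
    return a + b;
}

var inc = curry(add, 1);
var seven = inc(6);   // add(1, 6)
```

The returned function closes over func and args, so inc keeps working long after curry has returned.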
One bit of weirdness that's left over from arguments not being a real array is that if I pass arguments as a parameter to concat, it doesn't recognize that it's an array and then take all the members of it and concatenate them to the other thing. It will concatenate the whole thing as a single element, which is not what we want, in this case. We need to turn it into a real array so that concat will do the right thing to it, and we do that by calling its slice method. ES5 has the slice method, so slice returns an array, and that will work. But we shouldn't have had to do that; there are still some things left to get fixed in future editions. Everybody still with me? OK, here's one other. Suppose we've got a process which cannot be resolved immediately. Maybe it's going to require a lot of computation, maybe it has to go out to a worker pool and do something, maybe it has to go back to the server and get some stuff. But we'd like to be able to return something immediately that we can start acting on, even though it's not going to be real for a while; we don't know when that while is yet. A service that's doing something like that could return something that's called a promise, and the promise is an object which allows us to call methods on the thing. If we know what the thing is then it will immediately get executed. But if we don't know what the thing is yet, it'll get queued up. It will finally get executed when we know what the thing is. That turns out to be a really useful pattern for doing a lot of things, particularly when you're doing a lot of communications. Here we're going to implement a promise maker, and the promise maker will return a set of five functions: when, fail, fulfill, smash, and status. You could pass any one, or any fraction of these functions to someone else. For example, you might have a service, and I want to return something to you immediately. I give you back an object containing a when and a fail method.
You can then pass to when functions that you want called when the thing is fulfilled. You can also pass functions to fail for the case where a failure comes back. It'll just sit on all those things until it knows what the disposition is. And then the creator of the service might hang on to the fulfill and smash methods. He'll call fulfill and pass a value in when he knows what the value finally is, and that's the thing that will get delivered to the functions. If it turns out that it's going to be an error, at this point it turns out it's too late to throw an exception because that was a long time ago, and the other guy's not in your call stack anymore, so instead you smash the promise, you break the promise, and that will cause all of his fail methods, now, to run. The way these things work is they depend on the vouch and resolve methods, which are private to the promise maker. But again, it closes over, so it'll always have access to those functions and the state that they refer to. Let me show you implementations of vouch and resolve. First we've got a few more variables. We've got status, which initially is unresolved, and eventually could be fulfilled or failed. We've got the outcome, so when we know what the value is we'll stick it in there. We've got the waiting list of functions that were registered with when. And we've got the dreading list for the functions that were registered with fail. Then vouch will take a deed and a function and then it'll look at the status. If the status is still unresolved, then it will put it onto one of those lists. Which list it will put it on will depend on what the deed is. But if the current state of the promise matches the deed, then we can execute it immediately. Then the other piece of this is resolve. If the status has already been resolved then we throw an error, because we can only do it once. Otherwise, we'll go through and use one of the nice things in ES5 now: we've got a forEach method.
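A promise maker along these lines could be sketched as below. This is an illustration of the pattern in the talk, not the built-in Promise of later JavaScript editions. The function name make_promise and the status strings 'unresolved', 'fulfilled' and 'smashed' are assumptions; the names when, fail, fulfill, smash, status, vouch, resolve, waiting and dreading follow the description.

```javascript
function make_promise() {
    var status = 'unresolved',
        outcome,
        waiting = [],     // functions registered with when
        dreading = [];    // functions registered with fail

    // Private: register func for a deed, or run it now if the deed already happened.
    function vouch(deed, func) {
        if (status === 'unresolved') {
            (deed === 'fulfilled' ? waiting : dreading).push(func);
        } else if (status === deed) {
            func(outcome);
        }
    }

    // Private: settle the promise exactly once and notify the matching list.
    function resolve(deed, value) {
        if (status !== 'unresolved') {
            throw new Error('The promise has already been resolved: ' + status);
        }
        status = deed;
        outcome = value;
        (deed === 'fulfilled' ? waiting : dreading).forEach(function (func) {
            try {
                func(outcome);
            } catch (ignore) {
                // One failing callback must not stop the others from running.
            }
        });
        waiting = null;
        dreading = null;
    }

    return {
        when: function (func) { vouch('fulfilled', func); },
        fail: function (func) { vouch('smashed', func); },
        fulfill: function (value) { resolve('fulfilled', value); },
        smash: function (reason) { resolve('smashed', reason); },
        status: function () { return status; }
    };
}
```

A service could hand out only when and fail to its clients while keeping fulfill and smash for itself, since all five functions share the same closed-over state.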
We'll figure out which of those two arrays of functions we've got, and we'll say for each one of those functions, 'call this function'. This function will then go and call each of those with the value. We had to wrap it in a try catch, because if any of those functions should throw, we don't want that to interfere with the other functions getting a chance to run. OK, everybody still with me? We'll look at one more: sealers and unsealers. Sometimes we'd like to be able to pass secret information around through the application. Say that I give to you a secret envelope and tell you to give it to the cashier, and the cashier will take care of you. I want you to be able to take that envelope to the cashier and get reimbursed, and I'd like you to be able to give that envelope to someone else and allow them to be reimbursed. But I don't want you to be able to open it yourself, I don't want you to be able to tamper with it, and I want the cashier to be able to verify that it is, in fact, the original un-tampered-with thing. We can do that really easily in JavaScript, it turns out. It sounds like something you'd need cryptography to be able to do, but that doesn't really work inside of an application. But it turns out there is a much simpler solution. The way it works is I've got a sealer maker which will return a pair of functions, a sealer and unsealer, and they have to be used in pairs. I will keep the sealer, and I will give the unsealer to the cashier, and then I can call the sealer with the value that I want to give to you, and it will return to me a box which I can then give to you. The box is useless to you, except that if you can give it to someone who's got an unsealer, they can reclaim the original object. This function is a tiny bit harder to write than it should be, because in JavaScript object keys have to be strings, they can't be objects. If they could be objects, this function would be totally trivial. As it is, it's just slightly trivial.
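A sealer maker of this kind can be sketched with two parallel arrays held in a closure. The function name make_sealer and the array names boxes and values are assumptions made for this sketch; the idea of an empty object acting as the box and indexOf doing the lookup follows the talk.

```javascript
function make_sealer() {
    var boxes = [],    // the opaque tokens handed out
        values = [];   // the secret value for each token, at the same index

    return {
        sealer: function (value) {
            var i = boxes.length,
                box = {};          // an empty object: it reveals nothing
            boxes[i] = box;
            values[i] = value;
            return box;
        },
        unsealer: function (box) {
            // indexOf returns -1 for an unknown box, and values[-1] is undefined.
            return values[boxes.indexOf(box)];
        }
    };
}
```

Only code holding the matching unsealer can turn a box back into its value, and a box made by a different sealer, or a forged object, simply yields undefined.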
What I will do is I'll create the box, the secret container, which is just an empty object. It's really just a token; I'm not actually giving you a real box, but it acts like a box. I'll store it in my boxes array, and right next to it I will store in my values array the value that it represents, and then return the box to you. That was really easy. Then the unsealer uses the new indexOf method that we have in arrays, and goes looking for that box in the list of boxes. If it finds it then it returns the corresponding value, and then we've got it. If something goes wrong, if you pulled a substitution, gave an object that was not sealed, you get undefined back, which is how it should be. We're going to shift slightly and start looking at inheritance, but we're still going to reflect it back onto what we can do with closure. Here's an example of how you can do things with what I call pseudoclassical inheritance. This was the inheritance scheme that was designed for the language, and I really don't care for it at all. I don't think it looks very good. Here we're defining a gizmo, and you can see the gizmo's constructor. Then we add to the gizmo's prototype the methods that we want the instances to inherit. This just looks really weird. We're sort of used to the idea of a class containing all of its stuff, and in this case it's kind of hanging on the end of it in a haphazard way. It also induces people to do things incorrectly. For example, I've seen people trying to assign functions to prototypes inside of the constructor because it just seems like that's where you should do it, and doing it on the outside just feels wrong even though that's how you're supposed to do it. It gets even worse in the case of the hoozit where I want the hoozit to inherit from the gizmo. The way I specify that in the language is I replace hoozit's prototype with a new instance of gizmo, and that just looks crazy. And it's potentially dangerous.
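For reference, the pseudoclassical formulation being criticized looks roughly like this. The method names toString and test and the id parameter are assumptions for this sketch; the shape (constructor, methods hung on the prototype, and the prototype replaced by a new instance) is what the talk describes.

```javascript
// Constructor, with the methods added to the prototype afterwards.
function Gizmo(id) {
    this.id = id;
}
Gizmo.prototype.toString = function () {
    return 'gizmo ' + this.id;
};

function Hoozit(id) {
    this.id = id;
}
// Inheritance is expressed by running the Gizmo constructor, with no
// arguments, just to obtain an object to use as the prototype.
Hoozit.prototype = new Gizmo();
Hoozit.prototype.test = function (id) {
    return this.id === id;
};
```

Note that the Gizmo constructor runs here with no id at all, which is the source of the danger mentioned above, and that replacing the prototype means Hoozit.prototype.constructor now points at Gizmo.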
If it turns out that the gizmo constructor would throw if there were no parameters, then it would actually fail. But this is the way the language was intended to be used, and it's because the language itself is confused about its prototypal nature. I think there's a better way to do this. So let me suggest another formulation of exactly these same objects. Here I'm going to make a gizmo, and to make it for me I'm going to call my new_constructor function. It will make the new instance of gizmo, or the new definer of gizmo. I will pass to it Object because I want gizmo to inherit from Object. I'm going to pass to it the constructor function, and I'm going to pass to it an object containing the methods that it should add to its own prototype. This does exactly the same thing that we saw on the other screen, but I think it's just more pleasant looking. Then it gets even better with the hoozit. With the hoozit I call new_constructor, pass in the gizmo that says I want hoozit to inherit from gizmo, and I also pass it a constructor. I'll also pass it an object containing additional methods that I want it to add to its prototype. To my eye, this looks a whole lot more rational than that did, with all the stuff hanging out and the weird replacement. The language doesn't provide the new_constructor function that you need to do this, but it turns out it's a really easy function to write. So let's write that function. Function new_constructor takes three parameters: extend, initializer, and methods. The first thing it does is it creates the prototype object, which it makes by calling Object.create. Then if there are methods available it will call the keys method (this is a new thing in ES5), which will return an array of all of the own keys of that object, which is really nice because an array has a forEach method, so it will then call that. That will allow us to easily copy all of the methods into the prototype. It's a really nice construction.
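A sketch of new_constructor and of the gizmo and hoozit definitions built with it follows. Two assumptions are worth flagging: the own keys are obtained with the standard Object.keys(methods) call, and the initializer is invoked with the new object as 'this', which is one reasonable reading of "calls the initializer, passing that same object in". The method names toString and test are also assumptions.

```javascript
function new_constructor(extend, initializer, methods) {
    var func,
        prototype = Object.create(extend && extend.prototype);

    if (methods) {
        // Copy each own method into the new prototype.
        Object.keys(methods).forEach(function (key) {
            prototype[key] = methods[key];
        });
    }
    func = function () {
        var that = Object.create(prototype);   // does the job of new, without new
        if (typeof initializer === 'function') {
            initializer.apply(that, arguments);
        }
        return that;
    };
    func.prototype = prototype;        // extra plumbing, to be nice
    prototype.constructor = func;      // restore the constructor link
    return func;
}

var gizmo = new_constructor(Object, function (id) {
    this.id = id;
}, {
    toString: function () {
        return 'gizmo ' + this.id;
    }
});

var hoozit = new_constructor(gizmo, function (id) {
    this.id = id;
}, {
    test: function (id) {
        return this.id === id;
    }
});
```

Here hoozit(3) produces an object that inherits test from hoozit's prototype and toString from gizmo's, and no constructor is ever run merely to set up the inheritance.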
Then we'll create the function itself, which we'll use to make our hoozits or whatever, and you can see that closure's working in there because it has access to prototype, and it has access to the initializer. So it will create a new instance of the prototype using Object.create, which makes a new object that inherits from the object that you pass in. It will then call the initializer, passing that same object in, and when it's done it will return the object that we just created. So this does the same thing as new, except we don't use new. Then a little bit of extra plumbing: we don't really need to do this, but just to be nice we'll set the function's prototype property to the prototype, because in the case of the hoozit, the prototype got replaced, so we lost the constructor value. We'll fix that there, as well. Again, we're using closure in order to implement a classical pattern, and I think this works really nicely in the language. Another thing we can do with functions is to create modules. We'd like to be able to minimize using global variables because of the conflicts that they can create, and functions provide a very nice way of doing that. Here I want to create a singleton object (there'll just be one instance of it), so you don't want to have to create a class to define something there's just going to be one instance of; that'd be silly. So I'm going to assign to singleton not that function, but the consequence of calling that function. Again, I'm wrapping the whole function and the invocation in parens as a sign to the reader that there's something bigger going on than just assignment of a function. There are some people who would put the golden paren around the function, and not around the whole invocation. That doesn't make sense to me, because what we're trying to tell the reader is: look at the whole thing. Putting parentheses around just part of it is, I think, counterproductive. I think the whole thing needs to be wrapped in parens.
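The singleton module being set up here might look like the following sketch. The names privateVariable, privateFunction, firstMethod and secondMethod are placeholders chosen for this illustration, and what the methods do is invented purely so that the shared private state is visible.

```javascript
// The whole function and its invocation are wrapped in parentheses:
// singleton receives the result of calling the function, not the function.
var singleton = (function () {
    var privateVariable = 0;

    function privateFunction(x) {
        privateVariable += x;
        return privateVariable;
    }

    // The returned methods close over the private variable and function.
    return {
        firstMethod: function (a) {
            return privateFunction(a);
        },
        secondMethod: function () {
            return privateVariable;
        }
    };
}());
```

Nothing but singleton itself is added to the enclosing scope, yet both methods can communicate through privateVariable.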
The outer function has variables and functions, and they will return an object using an object literal, and the object will contain some methods. Those methods will be closed over the private stuff. We're returning, in this case, two functions. In the earlier cases we returned one function, but this time we're returning two. We could return as many as we want. And they share their access; they're both closed over the variables of the parent function. So they can communicate through that shared state without corrupting the global space. A related pattern to this is if we want to have a common global object where we'll keep our whole application. At Yahoo! we keep a lot of stuff in a global Yahoo! object, so everything that's ours we keep in one common namespace. I want to add a new thing to my global object called methodical, which will have my two methods in it. Just as before, I'm going to be assigning the result of my function into that object. Now, sometimes I want to be adding not a new object but just a couple of methods to that structure. I can do that as well. Here's another variation on the same pattern. I've got a function, and it's got the private stuff, and then I'm going to assign to GLOBAL.firstMethod my first method, and to GLOBAL.secondMethod my second method, the other one. Again, the whole thing is wrapped in the golden parentheses. In this case, the parentheses are syntactically required, and that's because I want this to be a function expression and not a function statement. If it were a function statement, I couldn't immediately execute it, and I want to immediately execute it. Everybody still with me? I can take this module pattern and very easily turn it into a constructor pattern. It's the same basic idea, I'm just going to make lots of instances, not just one instance. Here's the recipe. Step one: make an object using any of the techniques available in the language. 
I can use an object literal, I can use new, I can use Object.create, I can call another of these power constructors and use the thing that it returns. Then step two: I define some variables and functions, and these will be the private members of the object that I'm about to make. Step three: I augment the object with privileged methods. A privileged method is a method which has access to that private state, that closes over the private state. And step four, I return the object. Really simple recipe, but it's a little abstract, so let me turn it into a template that's a little easier to follow. Step one. This is going to be my new power constructor, and I'm going to create a variable called 'that'. I can't call it 'this', because 'this' is a reserved word. I will initialize it somehow; somehow I'll turn it into an object. Then step two, I declare secrets, the secret variable, stuff that's going to be available to my privileged method. Step three, I create my privileged methods and assign them to that. Step four, I return that. So it's really simple. Here's gizmo and hoozit again. This is how we would write it, again, in the classical style, pseudoclassical style. It so bothers me how all this stuff's hanging out. Also, gizmo's got a constructor, and hoozit's got a constructor, and they both do the same thing. So even though one inherits from the other, we don't get the advantage of that code reuse. There's some redundant waste going on there. I want to apply this functional system instead of doing this. This is how we'd write it. I've got my gizmo, it returns an object literal, done. That was really easy. Then my hoozit calls gizmo to create an instance, it augments that, adding its test method, and returns that. Done. So it's really simple. But there are some other benefits that come from writing in this style. One is that we've got privacy.
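The four-step recipe and the functional gizmo and hoozit can be sketched together as below. The template's names (myPowerConstructor, secret, priv) are hypothetical placeholders, and the toString and test method names are assumptions; in this first version id is still an ordinary property of the object.

```javascript
// Template for the recipe; all names here are placeholders.
function myPowerConstructor(x) {
    var that = {},                 // step 1: make an object somehow
        secret = x * 2;            // step 2: private variables and functions
    that.priv = function () {      // step 3: privileged method, closes over secret
        return secret;
    };
    return that;                   // step 4: return the object
}

// gizmo just returns an object literal.
function gizmo(id) {
    return {
        id: id,
        toString: function () {
            return 'gizmo ' + this.id;
        }
    };
}

// hoozit calls gizmo to get an instance, augments it, and returns it.
function hoozit(id) {
    var that = gizmo(id);
    that.test = function (testid) {
        return testid === this.id;
    };
    return that;
}
```

Because hoozit reuses gizmo to build the object, the duplicated constructor code of the pseudoclassical version disappears.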
Right now, with the way it's written, the ID is a public property of the object, so anybody could go in and get the ID directly or modify it. Maybe I don't want them to be able to do that, maybe the integrity of my object depends on nobody being able to mess with the ID. Writing this in the functional style, we can do that; not only can we do that, the code gets simpler. We just don't have the ID property in the object. We're referring now to the ID parameter, and because of closure, our toString method always has access to that parameter. So we just took the 'this's out, and it's done. We do a similar thing with hoozit. So again, it just became simpler. There are other things we could do, too. We could have a shared secret which we pass between all of the constructors, which could be used to simulate something like a package relationship, where they all contain something that they know. You can get arbitrarily complicated with this stuff; you usually don't need to get anywhere near this fancy, but it's nice knowing that you can, if the need should ever arise. When I started working with this language, I spent a lot of time thinking about how to simulate things that we did in the classical languages, like how do we get super functions? In the pseudoclassical model there's no easy way to write super functions, but in the functional style it's really easy. Just capture a super function from the thing that I'm inheriting from, keep that in the closure, and then I can call it at any time I want. It turns out, though, in my career with this language I've never once written a super function. I just think about things in a different way, so I haven't found the need for that style of dependency that I came from. So if you find yourself wanting to have super functions, you might step back and figure out: why do I think I need that? Maybe there's a simpler way to think about this. Here's another thing we can do.
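The private-id version of gizmo and hoozit, together with a captured super function, might look like this sketch. The overriding toString in hoozit and the name super_toString are illustrative assumptions added only to show how a super function can be kept in the closure; the talk does not show that part.

```javascript
// id is now only a parameter; the methods reach it through closure.
function gizmo(id) {
    return {
        toString: function () {
            return 'gizmo ' + id;
        }
    };
}

function hoozit(id) {
    var that = gizmo(id),
        super_toString = that.toString;   // capture the inherited method

    that.test = function (testid) {
        return testid === id;
    };
    // Hypothetical override that still calls the captured super function.
    that.toString = function () {
        return super_toString.call(that) + ' (hoozit)';
    };
    return that;
}
```

There is no id property left for anyone to read or overwrite, and the code is shorter than the version that used this.id.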
I want to have a memoizer, which will remember the result of previous calls of a function (particularly recursive functions) so that we can avoid doing some work. For example, factorial can be given a recursive definition in which it's the product of the value and of calling factorial on the value diminished by 1. If you're computing a table of factorials, you could spend a lot of time going over the same ground over and over and over again, and this function will prevent that. What I'm going to pass to the memoizer is an array containing some of the values that we're going to remember. The results for factorial of 0 and factorial of 1 will be 1 and 1, so we'll pass that in to get it started, and then we'll also pass in a function that defines what a factorial step is. In this case, it's multiplying n times the recurrence of n minus 1. When we go up to the memoizer it takes that memo array and it takes the formula we just passed in, and it will create a recurrence function, which is the thing that we'll call for each iteration, which will first look to see if we already have the result that we need in the memo array. If it does, then we're done. If not, then we will call the formula passing in itself, its own recurrence function, so that it can do the next step. Where this is a big win is in computing Fibonacci, because Fibonacci recurs on two legs at the same time, so it gets explosive. If you do a Fibonacci of 40, say, it's in the trillions of iterations, and this gets it down into the tens. So even though the program looks a little bit more complicated, it's hugely more efficient. Again, this is happening because of closure, because the recur function closes over the memo array and over the formula that we're recurring on. One bit of warning about functions: don't declare functions in a loop. Don't make functions in a loop, for two reasons. One is it can be wasteful, because a new function object is created for each iteration. It's just wasteful.
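The memoizer described above can be sketched as follows. The names memoizer, memo, formula and recur follow the description; the starting array [0, 1] for Fibonacci is an assumption, since the talk only spells out the factorial case.

```javascript
// memo holds the results known so far; formula computes one step,
// receiving the recurrence function so it can ask for smaller results.
function memoizer(memo, formula) {
    var recur = function (n) {
        var result = memo[n];
        if (typeof result !== 'number') {
            result = formula(recur, n);
            memo[n] = result;          // remember it for next time
        }
        return result;
    };
    return recur;
}

var factorial = memoizer([1, 1], function (recur, n) {
    return n * recur(n - 1);
});

var fibonacci = memoizer([0, 1], function (recur, n) {
    return recur(n - 1) + recur(n - 2);
});
```

With the memo in place, computing fibonacci(40) runs the formula only once for each n from 2 to 40, because every later request for the same n is answered from the array.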
JavaScript compilers tend not to do any kind of loop-invariant analysis, so anything you're doing in a loop that doesn't change over each iteration, you probably want to move it out of the loop anyway just to make it go a little faster. But the bigger reason is that it gets really confusing, because you think that you're closing over the value the loop variables had when the function was made, but you're actually closing over the variables themselves, so what you see is their current values, which by then are their final values, and that's almost always not what you want. Let me show you an example of a really common error. Say you've got an array of divs and you want to attach an event handler to each one. You go through the array in a loop and for each one you want to add an onclick handler which will display its ID number when it's clicked on. What you find is that they all come up with the same number, and it's the wrong number. You wonder, how did that happen? It's because when you add the function to onclick it's closing over div_id, which is constantly changing. By the time you finally get around to clicking on them you're going to be getting the final value, which was the value that kicked you out of the loop. The way you get around that is by creating a separate function which you're going to use to assign the functions to the event handler. Here I have a function called make_handler which will take the div ID and return the event handler function. Then within the loop we call make_handler and take its result and stuff it into onclick. By doing that, we avoided creating any functions inside of the loop, and that way we avoided the confusion that came from that problem with closure. Here I have two versions of the factorial function that do exactly the same thing. The only difference is that one of them uses a variable, and the other uses a parameter to represent result. Otherwise, they're exactly the same.
R. D. Tennent wrote a book called 'The Principles of Programming Languages' in which he demonstrated the Principle of Correspondence, which was a correspondence between variables and parameters. JavaScript demonstrates it really well. This shows that you could imagine a subset of JavaScript which didn't have variables: would that still be a useful language? It turns out yes, and this is the proof that anything you can write with variables you can write without variables. You can use a function closure instead to do the same thing. We can take that thought experiment one crazy step-off-the-edge farther. Suppose we had a language in which we didn't have variables and in which we didn't have assignment, and we didn't have named functions. Could we still do recursion? It turns out you can. I'm not sure you'd want to, but you can. Here is the strangest artifact in computer science: it's called the Y Combinator. It's a function. It's a really complicated function, although it's not very big. It's incredibly nested; functions within functions, calling themselves, passing themselves as parameters to themselves. I call Y passing in a factorial formula. It returns a function, and the function it returns is the recursive factorial function. This is really wild stuff. If you can figure this out, you can call yourself a computer scientist, because this is the really good stuff. You can express this stuff in JavaScript; I mean, JavaScript is right up there with Lisp and Scheme. It is a functional language. You can do this stuff. While this may have little practical value, in terms of increasing your powers as a programmer, this is the stuff to be playing with. You can get really, really deep. I see a lot of people playing with their Ajax stuff, or wanting to show off (look at all the stuff I can do) and sometimes doing things which are probably reckless and ultimately not very smart. If you want to show that you're really smart, you ought to be doing this stuff.
You know, off to the side, where you're not going to hurt anybody. [laughter] JavaScript has good parts. It has really good parts. And these, I think, are the best of the parts. Again, this comes as a big surprise, because when JavaScript was introduced nobody expected there was anything good about it at all. The stuff that is good about this language is in there intentionally, by design, it wasn't accidental. You don't get stuff this good by accident. This is an amazingly good language. And that's why Ajax happened — we'll be talking a lot more about Ajax next week. The reason I was able to discover that JavaScript had good parts was because I knew something about functions. The place where I first learned about functions was in a little book called 'The Little LISper', which I highly recommend to you. The current edition of it is called 'The Little Schemer' — it was updated to be about Scheme. It's not really about Scheme; there isn't very much Scheme in the book. It's mostly about functions, and it's really, really good. It turns out that everything in the book can be written in JavaScript. Although Scheme and JavaScript couldn't be more different syntactically, at their roots they're surprisingly similar. There's a simple transformation from one language to the other; it's surprisingly simple. If you go to this web page, it'll show you exactly what they are, and that'll give you enough to be able to read and write the examples in the book. I highly, highly recommend that you go out and get this book. It will change the way you think, and there are very few books that do that. This is one of those books. Next time we meet: The Metamorphosis of Ajax. It'll be awful. [laughter] See you here. Thank you, and good night. [applause]
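Two of the later examples in the talk, the event handlers made in a loop and the Y combinator, can be sketched as follows. Both are illustrations under stated assumptions: the handler returns the id instead of displaying it so that the behavior can be checked without a browser, add_handlers is a hypothetical helper name, and the Y combinator shown is one common JavaScript formulation rather than a transcription of the slide.

```javascript
// Part 1: build each handler in a separate function, so every handler
// closes over its own div_id parameter rather than a shared loop variable.
function make_handler(div_id) {
    return function () {
        return div_id;
    };
}

function add_handlers(divs) {       // divs: array of objects with an id property
    var i;
    for (i = 0; i < divs.length; i += 1) {
        divs[i].onclick = make_handler(divs[i].id);
    }
}

// Part 2: a Y combinator for one-argument functions. It takes a formula
// that expects "itself" as a parameter and returns the recursive function,
// without that function ever referring to its own name.
function y(le) {
    return (function (f) {
        return f(f);
    }(function (f) {
        return le(function (x) {
            return f(f)(x);
        });
    }));
}

var factorial = y(function (fac) {
    return function (n) {
        return n <= 1 ? 1 : n * fac(n - 1);
    };
});
```

In the first part no function is created inside the loop body itself, and in the second the recursion comes entirely from functions being passed to themselves.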
Crockford on JavaScript - Act III: Function the Ultimate (posted by kleeff on 2013/12/25)