[MUSIC PLAYING] >> DAVID J. MALAN: All right, this is CS50. And this is week one. So recall that last time in week zero, we focused on computational thinking. And we transitioned from that to Scratch, a graphical programming language from our friends at MIT's Media Lab. >> And with Scratch, we explored ideas like functions, and conditions, and loops, and variables, and even events, and threads, and more. And today, we're going to continue using those ideas, and really taking them for granted, but translate them to another language known as C. Now, C is a more traditional language. It's a lower level language, if you will. >> It's purely textual. And so at first glance, it's all going to look rather cryptic if you've never programmed before. We're going to have semi-colons, and parentheses, and curly braces, and more. But realize that even though the syntax is about to look a little unfamiliar to most of you, see past that. And try to see the ideas that are, indeed, familiar, because here in week one what we'll begin to do is to compare, initially, Scratch versus C. >> So, for instance, recall that when we implemented the first of our programs last time, we had a block that looked a little something like this-- when green flag clicked, and then we had one or more puzzle pieces below it, in this case, say, hello world. So, indeed, in Scratch, when I click that green flag to run my program, so to speak, these are the blocks that get executed, or run. And, specifically, Scratch said, hello, world. >> Now, I could have specified different words here. But we'll see that, indeed, many of these blocks-- and indeed, in C many functions-- can be parametrized or customized to do different things. In fact, in C if we want to convert, now, this Scratch program to this other language, we're going to write a little something like this. >> Granted, there is some unfamiliar syntax there most likely, int, and parentheses, and void. 
But printf-- even though you would think it would just be print-- means print formatted, as we'll soon see. This literally will print to the screen whatever is inside of those parentheses, which of course in this case is, hello world. >> But you'll notice some other syntax, some double quotes, the parentheses at the end, the semi-colon and the like. So there's a bit of overhead, so to speak, both cognitively and syntactically, that we're going to have to remember before long. But realize that with practice, this will start to jump out at you. >> In fact, let's focus on that one function specifically-- in this case, say hello world. So say is the function. Hello world is its parameter, or argument, its customization. >> And the equivalent in C is just going to be this one line here, where printf is equivalent to say, and the double quoted string, hello world, is equivalent, of course, to what's in the white box there. And the backslash n, though a little strange and absent from Scratch, simply is going to have the effect, as we'll see on a computer like my Mac or a PC, of just moving the cursor to the next line. It's like hitting Enter on your keyboard. >> So we'll see that again before long. But first, let's take a look at this other example in the case of loops. We had this forever loop last time, which was a series of puzzle pieces that did something literally forever-- in this case, say, hello world, hello world, hello world, hello world. So it's an infinite loop by design. >> In C, if we want to implement this same idea, we might simply do this. While true, printf hello world-- now while, just semantically, kind of conjures up the idea of doing something again, and again, and again, and for how long? Well, true-- recall that true is just on or one. >> And true is, of course, always true. So it's kind of a meaningless statement just to say true. 
But indeed, this is deliberate, because if true is just always true, then while true just implies, if a little indirectly, that the following lines of code in between those curly braces should just execute again, and again, and again, and never actually stop. >> But if you do want your loop to stop, as we did last time with something like this, repeat the following 50 times, in C we can do the same with what's called a for loop-- the keyword not being while, but for. And then we have some new syntax here, with int i equals 0, i less than 50, i++. And we'll come back to that. But this is simply how we would translate the set of Scratch blocks to a set of C lines of code. >> Meanwhile, consider variables. And, in fact, we just saw one a moment ago. And in the case of Scratch, if we wanted to declare a variable called i for i being integer, just a number, and we want to set it to some value, we would use this orange block here-- set i to 0. >> And we'll see today and beyond, just like last week, programmers do almost always start counting from zero, really by convention. But also because recall from our discussion of binary, the smallest number you can represent with any number of bits is just going to be 0 itself. And so we'll generally start initializing even our variables to 0. >> And in C to do the same, we're going to say int for integer, i just by convention. I could have called this variable anything I want, just like in Scratch. And then equals 0 just assigns the value 0 from the right and puts it into the variable, or the storage container there, on the left. And the semi-colon as we'll see-- and we've seen a few of these already-- just means end of thought. Proceed to do something else on the lines that follow. >> Now, what about Boolean expressions? Recall that in Scratch, these were expressions that are either true or false-- questions, really, that are either true or false. 
So in the case of Scratch, we might ask a simple question like this, is i less than 50? So i, again, is an integer. Maybe we're using it in a Scratch program to keep track of a score or something like that. So this syntax here in Scratch just means, is i less than 50? Well, thankfully, some things are simple in C. And to translate this, we would simply say i less than 50, using the familiar less-than key on your keyboard. >> Meanwhile, if you wanted to say something more general, like, well, is x less than y where each of x and y are themselves variables? We can do the same thing in C, so long as we've created these variables already. And we'll see how to do that before long. We would simply say x less than y. >> So you're starting to see some similarities. And those folks who made Scratch were certainly inspired by some of these basic ideas. And you'll see this kind of syntax in many languages-- not just Scratch, not just C, but Python, and JavaScript, and other languages still. >> Let's consider another construct from C, the notion of a condition, doing something conditionally. If something is true, do this. If something else is true, do that. It's sort of the programming equivalent of a fork in the road. Maybe it's a two-way fork, a three-way fork, or more. And in Scratch, we might have seen something like this. >> So this one's a big one. But consider the relative simplicity of the logic. If x is less than y, then say x is less than y, else if x is greater than y, then say x is greater than y. And then, logically, if you think back to Scratch or just your own human intuition, well, if x is not greater than y, and x is not less than y, then of course x is going to be equal to y. So in this case, by nesting those Scratch blocks, we can achieve a three-way fork in the road. >> Meanwhile, if we want to do that in C, it arguably looks a little simpler-- at least once you get familiar with the syntax. If x is less than y, printf x is less than y. 
Else if x is greater than y, printf x is greater than y. Else printf x is equal to y-- and, again, with those backslash n's just for those new lines so that if you actually ran this kind of program it would just move your cursor ultimately to the next line of the screen. >> Now, meanwhile Scratch had other more sophisticated features, only some of which we're going to initially move over to the world of C. And one of them was called a list in Scratch. And this was a special type of variable that allowed you to store multiple things in it back, to back, to back, to back. >> C doesn't have lists, per se, but something more generally called arrays, although we'll come back later this semester to looking at something called a list, or really a linked list. But for now, the closest equivalent in C for us is going to be something called an array. And an array is simply a special type of variable that allows you to store data back, to back, to back, to back. >> And, indeed, in Scratch, if we wanted to access the first element of an array or a list-- and I'm going to call it, by convention, argv, argument vector, but more on that before long. If I want to get at the first element of argv, in the world of Scratch you actually do typically start counting from 1. >> And so I might get item 1 of argv. That's just how MIT implemented the notion of lists. But in C, I'm going to more simply just say, argv, which again is the name of my list-- or to be clear, an array. And if I want the first element, I'm going to use square brackets, which you might not often use on a keyboard. >> But 0 just means, get me the first. So on occasion and as time passes, we're going to start to see these dichotomies between Scratch and C, whereby Scratch uses one. We in C use 0 here. But you'll quickly see once you understand the foundations of each language, that these things start to get all the more familiar through practice and practice. 
>> So let's actually look now at a program. Here shall be the first of our C source code for complete programs. And the program we're going to offer for consideration is the one that's equivalent to that earlier Scratch piece. >> So in here, we have what's arguably the simplest C program you can write that actually does something. Now, we'll look past, for now, hash include, standard io.h, and these angle brackets, and int, and void, and the curly braces, and the like. >> And let's just focus on what, at least intuitively, might jump out at you already. In fact, main, I don't necessarily know what this is, but much like Scratch had that when green flag clicked puzzle piece, so does C as a programming language have a main piece of code that gets executed by default. And, indeed, it's literally going to be called main. >> So main is a function. And it's a special function that exists in C that when you run a program, it is main that gets run by default. In the world of Scratch, it was usually when green flag clicked that got run by default. >> Meanwhile, we've seen this before, printf or print formatted, that's going to be a function that comes with C, along with a whole bunch of others, that we'll use time and time again, in order to do exactly as its name suggests, print something. What do we want to print? Well, we'll see that by enclosing characters like these-- hello world, backslash n in double quotes, we can tell printf exactly what to print on the screen. >> But in order to do that, we unfortunately need to take something that is already cryptic to us humans, but at least it's somewhat readable-- sharp include, standard io.h, int, main, void, printf, all of the magical incantations we just saw on the screen. But we actually have to go more arcane still. We first need to translate the code that we write into machine code. And recall from last week that machines, at least the ones we know here, at the end of the day only understand zeros and ones. 
>> And my God, if we had to write these zeros and ones to actually program, it would very, very quickly take the fun out of anything. But it turns out, per last week, that these patterns of zeros and ones just have special meaning. In certain contexts, they might mean numbers. >> In some contexts, they might mean letters, or colors, or any number of other abstractions thereupon. But your computer also has a CPU, Central Processing Unit, or the brains inside of your computer. It's usually Intel inside, because that's one of the biggest companies that makes CPUs for computers. >> Well, Intel CPUs and others simply have decided in advance that certain patterns of zeros and ones shall mean specific things. Certain patterns of zeros and ones will mean, print this to the screen, or add these two numbers, or subtract these two numbers, or move this piece of data from my computer's memory over here, or any number of other very low level, but ultimately useful, operations. But, thankfully, we humans are not going to need to know this level of detail. Indeed, just like last time, where we abstracted again, and again, and again, building from very low level primitives like zeros and ones to higher level concepts like numbers, and letters, and colors, and more, so can we as programmers stand on the shoulders of others who have come before us and use software that other people have written before us-- namely programs called compilers. >> C is a language that is usually compiled, which means converted from source code to machine code. In particular, what this means is that if you've got your source code that you yourself write, as we soon will in just a moment on the screen, and you want to convert it ultimately to machine code-- those zeros and ones that only your Mac or your PC understands-- you've got to first feed that source code in as input to a special program called a compiler, the output of which we shall see is machine code. 
And, indeed, last time we talked about, really, at the end of the day, problem solving. You've got inputs. And you've got outputs. And you've got some kind of algorithm in the middle. >> Algorithms can surely be implemented in software, as we saw with pseudocode last week and as we'll see with actual code this week. And so a compiler really just has a set of algorithms inside of it that know how to convert the special keywords, like main, and printf, and others that we just saw into the patterns of zeros and ones that Intel Inside and other CPUs actually understand. So how do we do this? Where do we get a compiler? >> Most of us here have a Mac or a PC. And you're running Mac OS, or Windows, or Linux, or Solaris, or any number of other operating systems. And, indeed, we could go out onto the web and download a compiler for your Mac or your PC for your particular operating system. But we would all be on different pages, so to speak. We'd have slightly different configurations. And things wouldn't work all the same. And, indeed, these days many of us don't use software that runs only on our laptops. Instead, we use something like a browser that allows us to access web-based applications in the cloud. And later this semester, we will do exactly that. We will write applications or software using code-- not C, but other languages like Python and JavaScript-- that run in the cloud. >> And to do that, we ourselves during the semester will actually use a cloud-based environment known as CS50 IDE. This is a web-based programming environment, or integrated development environment, IDE, that's built atop some open source software called Cloud9. And we've made some pedagogical simplifications to it so as to hide certain features in the first weeks that we don't need, after which you can reveal them and do most anything you want with the environment. >> And it allows us, too, to pre-install certain software. 
Things like a so-called CS50 library, which we'll soon see provides us in C with some additional functionality. So if you go to, ultimately, CS50.io, you'll be prompted to log in, and once you do and create an account for free, you will be able to access an environment that looks quite like this. >> Now, this is in the default mode. Everything is nice and bright on the screen. Many of us have a habit of working on CS50 p-sets quite late into the night. And so some of you might prefer to turn it into night mode, so to speak. >> But, ultimately, what you're going to see within CS50 IDE is three distinct areas-- an area on the left where your files are going to be in the cloud, an area on the top right where your code is going to be editable. You'll be able to open individual tabs for any program that you write this semester inside of that top right hand corner. And then most arcanely, and yet powerfully, is going to be this thing at the bottom known as a terminal window. >> This is an old school Command Line Interface, or CLI, that allows you to execute commands on the computer-- in this case, the computer in the cloud-- to do things like compile your code from source code to machine code, to run your programs, or to start your web server, or to access your database, and any number of other techniques that we'll start to use before long. But to get there, we're going to actually have to go online and start playing. And to do that, let's first start tinkering with main, and write the main part of a program. And let's use that function printf, which we used earlier, simply to say something. >> So here I am already inside of CS50 IDE. I've logged in in advance. And I full screened the window. And so, ultimately, you too in coming problems will follow similar steps, for which we'll provide online documentation. So you don't need to worry about absorbing every little technical step that I do here today. >> But you'll get a screen like this. I happen to be in night mode. 
And you can brighten everything up by disabling night mode. And at the end of the day, you're going to see these three main areas-- the file browser at left, the code tabs up top, and the terminal window at the bottom. >> Let me go ahead and write my first program. I'm going to preemptively go to File, Save, and save my file as hello.c. Indeed, by convention, any program we write that's written in the C language should be named something dot c. So I'm going to name it hello.c, because I just want to say hello to the world. Now I'm going to zoom out and click Save. And all I have here now is a tab in which I can start writing code. >> This is not going to compile. This means nothing. And so even if I converted this to zeros and ones, the CPU is going to have no idea what's going on. But if I write lines that do match up with C's conventions-- C being, again, this language-- with syntax like this, printf hello world-- and I've gotten comfortable with doing this over time. So I don't think I made any typographical errors. >> But, invariably, the very first time you do this, you will. And what I am about to do might very well not work for you the first time. And that's perfectly OK, because right now you might just see a whole lot of newness, but over time once you get familiar with this environment, and this language, and others, you'll start to see things that are either correct or incorrect. >> And this is what the teaching fellows and course assistants get so good at over time, is spotting mistakes or bugs in your code. But I claim that there are no bugs in this code. So I now want to run this program. >> Now on my own Mac or PC, I'm in the habit of double clicking icons when I want to run some program. But that's not the model here. In this environment, which is CS50 IDE, we are using an operating system called Linux. Linux is reminiscent of another operating system, generally known as Unix. 
And Linux is particularly known for having a Command Line Interface, CLI. Now, we're using a specific flavor of Linux called Ubuntu. And Ubuntu is simply a certain version of Linux. >> But these Linuxes these days do actually come with graphical user interfaces. And the one we happen to be using here is web-based. So this might look even a little different from something you yourself might have seen or run in the past. >> So I'm going to go ahead now and do the following. I've saved this file as hello.c. I'm going to go ahead and type clang hello.c. So Clang, for the C language, is a compiler. It's pre-installed in CS50 IDE. And you can absolutely download and install this on your own Mac or PC. >> But, again, you wouldn't have all of the pre-configuration done for you. So for now, I'm just going to run clang hello.c. And now notice this syntax here, you'll eventually realize, just means that I'm in a folder or directory called Workspace. This dollar sign is just convention for meaning, type your commands here. >> It's what's called a prompt, and just by convention it's a dollar sign. And if I go ahead now and hit Enter, nothing seems to have happened. But that's actually a good thing. The less that happens on your screen, the more likely your code is to be correct, at least syntactically. >> So if I want to run this program, what do I do? Well, it turns out that the default name by convention for programs when you don't specify a name for your program is just a.out. And this syntax too, you'll get familiar with before long. >> Dot slash just means, hey, CS50 IDE, run a program called a.out that's inside my current directory. That dot means the current directory. And we'll see what other such sequences of characters mean before long. >> So here we go, Enter, hello world. And you'll notice, that what happened? Not only did it print hello world. It also moved the cursor to the next line. >> And why was that? 
What was the code that we wrote before that ensured that the cursor would go on the next line? Funny thing about a computer is it's only going to do literally what you tell it to do. >> So if you tell it to printf hello, comma, space, world, close quote, it's literally only going to print those characters. But I had this special character at the end, recall, backslash n. And that's what ensured that the cursor went to the next line of the screen. >> In fact, let me go and do this. Let me go ahead and delete this. Now, notice that at the top of my screen there's a little red light in the tab indicating, hey, you've not saved your file. So I'm going to go ahead with control S or command S, save the file. Now it went, for a moment, green. And now it's back to just being a close icon. >> If I now run clang hello.c again, Enter, dot slash, a.out, Enter, you'll see that it still worked. But it's arguably a little buggy. Right now, my prompt-- workspace, and then that dollar sign, and then my actual prompt-- is all on the same line. So this is certainly an aesthetic bug, even if it's not really a logical bug. >> So I'm going to undo what I just did. I'm going to rerun a.out. Notice I've added the newline character back. I've saved the file. >> So I'm going to rerun a.out, and-- dammit, a bug, a bug meaning mistake. So the bug is that even though I added the backslash n there, re-saved, re-ran the program, the behavior was the same. Why would that be? >> I'm missing a step, right? That key step earlier was that when you change your source code, it turns out, you also have to run it through the compiler again so you get new machine code. And the machine code, the zeros and ones, are going to be almost identical, but not perfectly so, because we need, of course, that new line. >> So to fix this, I'm going to need to rerun clang hello.c, Enter, dot slash, a.out. And now, hello world is back to where I expect it to be. So this is all fine and good. 
But a.out is a pretty stupid name for a program, even though it happens to be, for historical reasons, the default-- meaning assembler output. >> But let me go ahead here and do this differently. I want my hello world program to actually be called hello. So if it were an icon on my desktop, it wouldn't be a.out. It would be called hello. >> So to do this, it turns out that Clang, like many programs, supports command line arguments, or flags, or switches, which simply influence its behavior. Specifically, Clang supports a dash o flag, which then takes a second word. In this case, I'll arbitrarily, but reasonably, call it hello. But I could call it anything I want, except a.out, which would be rather beside the point. >> And then just specify the name of the file I do want to compile. So now even though at the beginning of the command I still have Clang, at the end of the command I still have the filename, I now have these command line arguments, these flags that are saying, oh, by the way, output, dash o, a file called hello, not the default a.out. >> So if I hit Enter now, nothing seems to have happened. And, yet, now I can do dot slash hello. So it's the same program. The zeros and ones are identical at the end of the day. >> But they're in two different files-- a.out, which is the first version and just foolishly named, and now hello, which is a much more compelling name for a program. But, honestly, I am never going to remember this again, and again, and again. And, actually, as we write more complicated programs, the commands you're going to have to write are going to get even more complicated still. >> And so not to worry. It turns out that humans before us have realized they too had this exact same problem. They too did not enjoy having to type fairly long, arcane commands, let alone remember them. And so humans before us have made other programs that make it easier to compile your software. >> And, indeed, one such program is called Make. 
So I'm going to go ahead and do this. I'm going to undo everything I just did in the following way. Let me type ls. And you'll notice three things-- a.out, and a star, hello and a star, and hello.c. Hopefully, this should be a little intuitive, insofar as earlier there was nothing in this workspace. There was nothing that I had created until we started class. >> And I created hello.c. I then compiled it, and called it a.out. And then I compiled it again slightly differently and called it hello. So I have three files in this directory, in this folder called Workspace. Now, I can see that as well if I zoom out actually. >> If I zoom out here and look at that top left hand corner, as promised the left hand side of your screen is always going to show you what's in your account, what's inside of CS50 IDE. And there are three files there. >> So I want to get rid of a.out and hello. And as you might imagine intuitively, you could sort of control click or right click on this. And this little menu pops up. You can download the file, run it, preview it, refresh, rename, or what not. >> And I could just delete, and it would go away. But let's do things with a command line for now, so as to get comfortable with this, and do the following. I'm going to go ahead and remove a.out by typing literally rm a.out. It turns out, the command for removing or deleting something, is not remove or delete. >> It's more succinctly rm, just to save you some keystrokes, and hit Enter. Now we're going to be asked, somewhat cryptically, remove regular file a.out. I don't really know what an irregular file would be yet. But I do want to remove it. >> So I'm going to type y for yes. Or I could type it out, and hit Enter. And, again, nothing seems to happen. But that is, generally, a good thing. >> If I type ls this time, what should I see? Hopefully, just hello and hello.c. Now, as an aside, you'll notice this star, asterisk, that's at the end of my programs. And they're also showing up in green. 
That is just CS50 IDE's way of cluing you into the fact that that's not source code. That's an executable, a runnable program that you can actually run by doing dot slash, and then its name. >> Now, let me go ahead and remove this, rm hello, Enter, remove regular file hello, yes. And now if I type ls, we're back to hello.c. Try not to delete your actual source code. Even though there are features built into CS50 IDE where you can go through your revision history and rewind in time if you accidentally delete something, do be mindful as per these prompts yes or no, of what you actually want to do. And if I go up to the top left hand corner here, all that remains is hello.c. So there are bunches of other commands that you can execute in the world of Linux, one of which is, again, Make. And we're going to Make my program now as follows. >> Instead of doing clang, instead of doing clang dash o, I'm going to simply literally type, make hello. And now notice, I am not typing make hello.c. I am typing make hello. >> And this program Make that comes with CS50 IDE, and more generally with Linux, is a program that's going to make a program called hello. And it's going to assume, by convention, that if this program can be made, it's going to be made from a source code file ending in dot c, hello.c. >> So if I hit Enter now, notice that the command that gets executed is actually even longer than before. And that's because we've preconfigured CS50 IDE to have some additional features built in that we don't need just yet, but soon will. But the key thing to realize is now I have a hello program. >> If I type ls again, I have a hello program. And I can run it with dot slash a.out, no, because the whole point of this exercise was dot slash hello. And now I have my hello world program. So moving forward, we're almost always just going to compile our programs using the command make. And then we're going to run them by dot slash, and the program's name. 
But realize what Make is doing for you, is it is itself not a compiler. It's just a convenience program that knows how to trigger a compiler to run so that you yourself can use it. >> What other commands exist in Linux, and in turn the CS50 IDE? We'll soon see that there's a cd command, change directory. This allows you within your command line interface to move forward, and back, and open up different folders without using your mouse. >> ls we saw, which stands for list the files in the current directory. mkdir, you can probably start to infer what these mean now-- make directory, if you want to create a folder. rm for remove, rmdir for remove directory-- and these, again, are the command line equivalents of what you could do in CS50 IDE with your mouse. But you'll soon find that sometimes it's just a lot faster to do things with a keyboard, and ultimately a lot more powerful. >> But it's hard to argue that anything we've been doing so far is all that powerful, when all we've been saying is, hello world. And, in fact, I hardcoded the words hello world into my program. There is no dynamism yet. Scratch was an order of magnitude more interesting last week. >> And so let's get there. Let's take a step toward that by way of some of these functions. So while C comes with printf, and bunches of other functions some of which we'll see over time, it doesn't make it all that easy right out of the gate to get user input. >> In fact, one of the weaknesses of languages like C, and even Java and yet others, is that they don't make it easy to just get things like integers from users, or strings, words, and phrases, let alone things like floating point values, or real numbers with decimal points, and really long numbers, as we'll soon see. 
So this list of functions here, these are like other Scratch puzzle pieces that we have pre-installed in CS50 IDE that we'll use for a few weeks as training wheels of sorts, and eventually take them off, and look underneath the hood, perhaps, at how these things are implemented. >> But to do this, let's actually write a program. Let me go ahead now. And I'm going to create a new file by clicking this little plus, and clicking New File. >> I'm going to save this next one as, let's say, string.c, because I want to play with strings. And a string in C is just a sequence of characters. So now let's go ahead and do the following. >> Include standard io.h-- and it turns out standard IO, IO just means input and output. So it turns out that this line here is what is enabling us to use printf. Printf, of course, produces output. So in order to use printf, it turns out you have to have this line of code at the top of your file. >> And we'll come back to what that really means before long. It turns out that in any C program I write, I've got to start it with code that looks like this. And you'll notice CS50 IDE, and other integrated development environments like it, are going to try as best they can to finish your thought. In fact, a moment ago if I undo what I just did, I hit Enter. >> I then hit open curly brace, hit Enter again. And it finished my thought. It gave me a new line, indented no less for nice stylistic reasons we'll see. And then it automatically gave me that curly brace to finish my thought. Now, it doesn't always guess what you want to do. But in large part, it does save you some keystrokes. So a moment ago, we ran this program-- hello, world, and then compiled it, and then ran it. But there's no dynamism here. What if we wanted to do something different? Well, what if I wanted to actually get a string from the user? I'm going to use a puzzle piece called exactly that-- get string. 
>> Turns out in C that when you don't want to provide input to a puzzle piece, or more properly to a function, you literally just do open parenthesis, close parenthesis. So it's as though there's no white box to type into. The say block before had a little white box. We don't have that white box now. >> But when I call get string, I want to put the result somewhere. So a very common paradigm in C is to call a function, like get string here, and then store its return value. It's the result of its effort in something. >> And what is the construct in programming, whether in Scratch or now C, that we can use to actually store something? Called it a variable, right? And in Scratch, we don't really care what was going in variables. >> But in this case, we actually do. I'm going to say string. And then I could call this anything I want. I'm going to call it name, gets get string. >> And now even if you're a little new to this, notice that I'm lacking some detail. I'm forgetting a semi-colon. I need to finish this thought. So I'm going to move my cursor, and hit semi-colon there. And what have I just done? In this line of code, number 5 at the moment, I'm calling get string with no inputs. So there's no little white box like the Save block has. >> I'm just saying, hey, computer, get me a string. The equal sign is not really an equal sign, per se. It's the assignment operator, which means, hey, computer, move the value from the right over to the left. And in the left, I have the following. >> Hey, computer, give me a string-- a sequence of characters. And call that string Name. And I don't even have to call it Name. >> I could call it, conventionally, something like S, much like we used i to call the variable i. But now I need to do something with it. It would be pretty stupid to try compiling this code, running this program, even though I'm getting a string, because it's still just going to say hello world. >> But what if I do want to change this. Why don't I do this? 
Percent s, comma s. And this is a little cryptic still. >> So let me make my variables more clear. Let me name this variable Name. And let's see if we can't tease apart what's happening here. >> So on line five, I'm getting a string. And I'm storing that string, whatever the user has typed in at his or her keyboard, in a variable called Name. And it turns out that printf doesn't just take one argument in double quotes, one input in double quotes. >> It can take two, or three, or more, such that the second, or third, or fourth, are all the names of variables, or specifically values, that you want to plug into, dynamically, that string in quotes. In other words, what would be wrong with this? If I just said hello name, backslash n, saved my file, compiled my code, and ran this, what would happen? >> It's just going to say, hello name, literally N-A-M-E, which is kind of stupid because it's no different from world. So anything in quotes is what literally gets printed. So if I want to have a placeholder there, I actually need to use some special syntax. And it turns out if you read the documentation for the printf function, it will tell you that if you use percent s, you can substitute a value as follows. >> After a comma after that double quote, you simply write the name of the variable that you want to plug in into that format code, or format specifier, percent s for strings. And now if I've saved my file, I go back down to my terminal. And I type Make String, because, again, the name of this file that I chose before is string.c. >> So I'm going to say Make String, enter. Oh my goodness, look at all of the mistakes we've made already. And this is-- what, this is really like a six, seven line program? So this is where it can very quickly get overwhelming. >> This terminal window has now just regurgitated a huge number of error messages. Surely, I don't have more error messages than I have lines of code. So what is going on? 
>> Well, the best strategy anytime you do encounter an overwhelming list of errors like that, is scroll back, look for the command you just ran, which in my case is make string. Look at what make did, and that's that long Clang command, no big deal there. >> But the red is bad. Green is trying to be gentle and helpful. But it's still bad, in this case. But where is it bad? >> String.c, line five, character five. So this is just common convention. Something colon something means line number and character number. Error, use of undeclared identifier string. Did you mean standard in? >> So, unfortunately, Clang is trying to be helpful. But it's wrong, in this case. No, Clang, I did not mean standard IO. I meant that on line one, yes. >> But line five is this one here. And Clang does not understand S-T-R-I-N-G. It's an undeclared identifier, a word it just has never seen before. And that's because C, the language we're writing code in right now, does not have variables called strings. >> It doesn't, by default, support something called a string. That's a CS50 piece of jargon, but very conventional. But I can fix this as follows. >> If I add one line of code to the top of this program, include CS50.h, which is another file somewhere inside of CS50 IDE, somewhere on the hard drive, so to speak, of the Ubuntu operating system that I'm running, that is the file that's going to teach the operating system what a string is, just like standard io.h is the file in the operating system that's going to teach it what printf is. >> Indeed, we would have gotten a very similar message if I had omitted standard IO.h and tried to use printf. So I'm going to go ahead and just hit Control L to clear my screen. Or you can type clear and it will just clear the terminal window. But you can still scroll back in time. >> And I'm going to rerun Make String. Cross my fingers this time, Enter. Oh my God, it worked.
It shows me a long cryptic command that is what Make generated via Clang, but no error messages. So realize, even though you might get completely overwhelmed with the number of error messages, it just might be this annoying cascading effect, where Clang doesn't understand one thing, which means it then doesn't understand the next word, or the next line. And so it just chokes on your code. But the fix might be simple. And so always focus on the very first line of output. And if you don't understand it, just look for keywords that might be clues, and the line number, and the character, where that mistake might be. >> Now let me go ahead and type dot slash, string, enter. Hm, it's not saying hello anything. Why? Well, recall, where is it running? >> It's probably stuck at the moment in a loop, if you will, on line six, because Get String by design, written by CS50 staff, is literally meant to just sit there waiting, and waiting, and waiting for a string. All we mean by string is human input. So you know what? Let me go ahead. And just on a whim, let me type my name, David, enter. Now I have a more dynamic program. It said, hello David. >> If I go ahead and run this again, let me try, say, Zamila's name, enter. And now we have a dynamic program. I haven't hard coded world. I haven't hard coded name, or David, or Zamila. >> Now it's much more like the programs we know, where if it takes input, it produces slightly different output. Now, this is not the best user experience, or UX. I run the program. >> I don't know what I'm supposed to do, unless I actually look at or remember the source code. So let's make the user experience a little better with the simplest of things. Let me go back into this program, and simply say printf. >> And let me go ahead and say name, colon, and a space, and then a semi-colon. And just for kicks, no backslash n. And that's deliberate, because I don't want the prompt to move to the next line.
>> I want to, instead, do this, make string to recompile my code into new machine code, and then dot slash string. Ah, this is much prettier. Now I actually know what the computer wants me to do, give it a name. >> So I'm going to go ahead and type in Rob, enter, and hello, Rob. So, realize, this is still, at the end of the day, only a nine line program. But we've taken these baby steps. >> We wrote one line with which we were familiar, printf, hello world. Then we undid a little bit of that. And we actually used get string. And we tossed that value in a variable. And then we went ahead and improved it further with a third line. And this iterative process of writing software is truly key. In CS50, and in life in general, you should generally not sit down, have a program in mind, and try writing the whole damn thing all at once. >> It will, inevitably, result in way more errors than we ourselves saw here. Even I, to this day, constantly make other stupid mistakes, or actually harder mistakes that are harder to figure out. But you will make more mistakes the more lines of code you write all at once. And so this practice of, write a little bit of code that you're comfortable with, compile it, run it, test it more generally, then move on-- so just like we kept layering and layering last week, building from something very simple to something more complex, do the same here. Don't sit down, and try to write an entire program. Actually take these baby steps. >> Now, strings aren't all that useful unto themselves. We'd actually, ideally, like to have something else in our toolkit. So let's actually do exactly that. >> Let me go ahead now and whip up a slightly different program. And we'll call this int.c, for integer. I'm going to, similarly, include CS50.h. I'm going to include standard IO. And that's going to be pretty common in these first few days of the class. >> And I'm going to ready myself with a main function.
And now instead of getting a string, let's go ahead and get an int. Let's call it i, and call get int, close parens, semi-colon. And now let's do something with it, printf. >> Let's say something like hello, percent s, backslash n, comma i. So I'm pretty much mimicking what I did just a moment ago. I have a placeholder here. I have comma i here, because I want to plug i into that placeholder. >> So let's go ahead and try compiling this program. The file is called int.c. So I'm going to say, make int, enter. Oh my God, but no big deal, right? There's a mistake. >> There's a syntactic mistake here such that the program can't be compiled inside int.c, line seven, character 27, error format specifies type char star, whatever that is. But the argument type is int. >> So here, too, we're not going to-- even though today is a lot of material, we're not going to overwhelm you with absolutely every feature of C, and programming more generally, in just these first few weeks. So there's often going to be jargon with which you're not familiar. And, in fact, char star is something we're going to come back to in a week or two's time. >> But for now, let's see if we can parse words that are familiar. Formats-- so we heard format specifier, format code before. That's familiar. Type-- but the argument has type int. Wait a minute, i is an int. >> Maybe percent s actually has some defined meaning. And, indeed, it does. An integer, if you want printf to substitute it, you actually have to use a different format specifier. And you wouldn't know this unless someone told you, or you had done it before. But percent i is what can be commonly used in printf for plugging in an integer. You can also use percent d for a decimal integer. But i is nice and simple here. So we'll go with that. >> Now let me go ahead and rerun make int, Enter. That's good, no errors. Dot slash int-- OK, bad user experience, because I haven't told myself what to do. But that's fine. I'm catching on quickly.
>> And now let me go ahead and type in David, OK, Zamila, Rob. OK, so this is a good thing. This time, I'm using a function, a puzzle piece, called get int. And it turns out-- and we'll see this later in the term-- the CS50 staff has implemented get string in such a way that it will only physically get a string for you. >> It has implemented get int in such a way that it will only get an integer for you. And if you, the human, don't cooperate, it's literally just going to say retry, retry, retry, literally sitting there looping, until you oblige with some magical number, like 50, and hello 50. >> Or if we run this again and type in 42, hello 42. And so the get int function inside of that puzzle piece has enough logic, enough thought, to figure out, what is a word? And what is a number? Only accepting, ultimately, numbers. >> So it turns out that this isn't all that expressive so far. So, yay, last time we went pretty quickly into implementing games, and animation, and artistic works in Scratch. And here, we are being content with hello world, and hello 50. >> It's not all that inspiring. And, indeed, these first few examples will take some time to ramp up in excitement. But we have so much more control now, in fact. And we're going to very quickly start layering on top of these basic primitives. >> But first, let's understand what the limitations are. In fact, one of the things Scratch doesn't easily let us do is really look underneath the hood, and understand what a computer is, what it can do, and what its limitations are. And, indeed, that lack of understanding, potentially, long-term can lead to our own mistakes-- writing bugs, writing insecure software that gets hacked in some way. >> So let's take some steps toward understanding this a little better by way of, say, the following example. I'm going to go ahead and implement real quick a program called Adder. Like, let's add some numbers together.
And I'm going to cut some corners here, and just copy and paste where I was before, just so we can get going sooner. So now I've got the basic beginnings of a program called Adder. >> And let's go ahead and do this. I'm going to go ahead and say, int x gets get int. And you know what? Let's make a better user experience. >> So let's just say x is, and effectively prompt the user to give us x. And then let me go ahead and say, printf how about y is, this time expecting two values from the user. And then let's just go ahead and say, printf, the sum of x and y is. And now I don't want to do percent s. I want to do percent i, backslash n, and then plug in some value. >> So how can I go about doing this? You know what? I know how to use variables. Let me just declare a new one, int z. >> And I'm going to take a guess here. If there are equal signs in this language, maybe I can just do x plus y, so long as I end my thought with a semi-colon? Now I can go back down here, plug in z, finish this thought with a semi-colon. And let's see now, if these sequences of lines-- x is get int. Y is get int. >> Add x and y, store the value in z-- so, again, remember the equal sign is not equal. It's assignment from right to left. And let's print out that the sum of x and y is not literally z, but what's inside of z. So let's make Adder -- nice, no mistakes this time. Dot slash Adder, enter, x is going to be 1. >> Y is going to be 2. And the sum of x and y is 3. So that's all fine and good. >> So you would imagine that math should work in a program like this. But you know what? Is this variable, line 12, even necessary? You don't need to get in the habit of just storing things in variables just because you can. And, in fact, it's generally considered bad design if you are creating a variable, called z in this case, storing something in it, and then immediately using it, but never again.
Why give something a name like z if you're literally going to use that thing only once, and so proximal to where you created it in the first place, so close in terms of lines of code? So you know what? It turns out that C is pretty flexible. If I actually want to plug in values here, I don't need to declare a new variable. I could just plug in x plus y, because C understands arithmetic, and mathematical operators. >> So I can simply say, do this math, x plus y, whatever those values are, plug the resulting integer into that string. So this might be, though only one line shorter, a better design, a better program, because there's less code, therefore less for me to understand. And it's also just cleaner, insofar as we're not introducing new words, new symbols, like z, even though they don't really serve much of a purpose. >> Unfortunately, math isn't all that reliable sometimes. Let's go ahead and do this. I'm going to go ahead now and do the following. >> Let's do printf, percent i, plus percent i, shall be percent i, backslash n. And I'm going to do this-- x, y, x plus y. So I'm just going to rewrite this slightly differently here. Let me just do a quick sanity check. Again, let's not get ahead of ourselves. Make adder, dot slash adder. x is 1, y is 2, 1 plus 2 is 3. So that's good. But let's complicate this now a bit, and create a new file. >> I'm going to call this one, say, ints, plural for integers. Let me start where I was a moment ago. But now let's do a few other lines. Let me go ahead and do the following, printf, percent i, minus percent i, is percent i, comma x, comma y, x minus y. So I'm doing slightly different math there. Let's do another one. So percent i times percent i is percent i, backslash n. Let's plug in x, and y, and x times y. We'll use the asterisk on your computer for times. >> You don't use x. x is a variable name here. You use the star for multiplication. Let's do one more. Printf percent i, divided by percent i, is percent i, backslash n.
x, y, x divided by y-- so you use the forward slash in C to do division. And let's do one other. Remainder of percent i, divided by percent i, is percent i. x, y-- and now remainder is what's left over. When you try dividing a denominator into a numerator, how much is left over that you couldn't divide out? >> So there isn't really, necessarily, a symbol we've used in grade school for this. But there is in C. You can say x modulo y, where this percent sign in this context-- confusingly when you're inside of the double quotes, inside of printf, percent is used as the format specifier. >> When you use percent outside of that in a mathematical expression, it's the modulo operator for modular arithmetic-- for our purposes here, just means, what is the remainder of x divided by y? So x divided by y is x slash y. What's the remainder of x divided by y? It's x mod y, as a programmer would say. >> So if I made no mistakes here, let me go ahead and make ints, plural, nice, and dot slash ints. And let's go ahead and do, let's say, 1, 10. All right, 1 plus 10 is 11, check. 1 minus 10 is negative 9, check. >> 1 times 10 is 10, check. 1 divided by 10 is-- OK, we'll skip that one. Remainder of 1 divided by 10 is 1. That's correct. But there's a bug in here. >> So the one I put my hand over, not correct. I mean, it's close to 0. 1 divided by 10, you know, if we're cutting some corners, sure, it's zero. But it should really be 1/10, 0.1, or 0.10, 0.1000, or so forth. >> It should not really be zero. Well, it turns out that the computer is doing literally what we told it to do. We are doing math like x divided by y. And both x and y, per the lines of code earlier, are integers. >> Moreover, on line 15, we are telling printf, hey, printf plug in an integer, plug in an integer, plug in an integer-- specifically x, and then y, and then x divided by y. x and y are ints. We're good there. >> But what is x divided by y?
x divided by y should be, mathematically, 1/10, or 0.1, which is a real number, a real number having, potentially, a decimal point. It's not an integer. >> But what is the closest integer to 1/10, or 0.1? Yeah, it kind of is zero. 0.1 is like this much. And 1 is this much. So 1/10 is closer to 0 than it is to one. >> And so what C is doing for us-- kind of because we told it to-- is truncating that value to an integer. It's taking the value, which again is supposed to be something like 0.1000, 0 and so forth. And it's truncating everything after the decimal point so that all of this stuff, because it doesn't fit in the notion of an integer, which is just a number like negative 1, 0, 1, up and down, it throws away everything after the decimal point because you can't fit a decimal point in an integer by definition. >> So the answer here is zero. So how do we fix this? We need another solution altogether. And we can do this, as follows. >> Let me go ahead and create a new file, this one called floats.c. And save it here in the same directory, floats.c. And let me go ahead and copy some of that code from earlier. >> But instead of getting an int, let's do this. Give me a floating point value called x, where a floating point value is just literally something with a floating point. It can move to the left, to the right. It's a real number. >> And let me call not get int, but get float, which also was among the menu of options in the CS50 library. Let's change y to a float. So this becomes get float. >> And now, we don't want to plug in ints. It turns out we have to use percent f for float, percent f for float, and now save it. And now, fingers crossed, make floats, nice, dot slash floats. x is going to be 1. y is going to be 10 again. >> And, nice, OK my addition is correct. I was hoping for more, but I forgot to write it. So let's go and fix this logical error. >> Let's go ahead and grab the following. We'll just do a little copy and paste. And I'm going to say minus.
>> And I'm going to say times. And I'm going to say divided. And I'm not going to do modulo, which is not as germane here, divided by f, and times plus-- OK, let's do this again. >> Make floats, dot slash floats, and 1, 10, and-- nice, no, OK. So I'm an idiot. So this is very common in computer science to make stupid mistakes like this. >> For pedagogical purposes, what I really wanted to do was change the signs here to plus, to minus, to times, and to divide, as you hopefully noticed during this exercise. So now let's re-compile this program, do dot slash floats. >> And for the third time, let's see if it meets my expectations. 1, 10, enter, yes, OK, 1.000, divided by 10.000, is 0.100000. And it turns out we can control how many numbers are after those decimal points. We actually will. We'll come back to that. >> But now, in fact, the math is correct. So, again, what's the takeaway here? It turns out that in C, there are not only just strings-- and, in fact, there aren't really, because we add those with the CS50 library. But there aren't just ints. >> There are also floats. And it turns out a bunch of other data types too, that we'll use before long. Turns out if you want a single character, not a string of characters, you can use just a char. >> Turns out that if you want a bool, a Boolean value, true or false only, thanks to the CS50 library, we've added to C the bool data type as well. But it's also present in many other languages as well. And it turns out that sometimes you need bigger numbers than come by default with ints and floats. >> And, in fact, a double is a number that uses not 32 bits, but 64 bits. And a long long is a number that uses not 32 bits, but 64 bits, for floating point values and integers, respectively. So let's actually now see this in action. >> I'm going to go ahead here and whip up one other program. Here, I'm going to go ahead and do include CS50.h. And let me go, include standard IO.h.
>> And you'll notice something funky is happening here. It's not color coding things in the same way as it did before. And it turns out, that's because I haven't given the thing a file name. >> I'm going to call this one sizeof.c, and hit Save. And notice what happens to my very white code against that black backdrop. Now, at least there's some purple in there. And it is syntax highlighted. >> That's because, quite simply, I've told the IDE what type of file it is by giving it a name, and specifically a file extension. Now, let's go ahead and do this. I'm going to go ahead and very simply print out the following-- bool is percent LU. >> We'll come back to that in just a moment. And then I'm going to print size of bool. And now, just to save myself some time, I'm going to do a whole bunch of these at once. And, specifically, I'm going to change this to a char and char. This one, I'm going to change to a double and a double. >> This one, I'm going to change to a float and a float. This one, I'm going to change to an int and an int. And this one, I'm going to change to a long long. And it's still taking a long time, long long. >> And then, lastly, I gave myself one too many, string. It turns out that in C, there's the special operator called size of that's literally going to, when run, tell us the size of each of these variables. And this is a way, now, we can connect back to last week's discussion of data and representation. >> Let me go ahead and compile size of dot slash size of. And let's see. It turns out that in C, specifically on CS50 IDE, specifically on the operating system Ubuntu, which is a 64-bit operating system in this case, a bool is going to use one byte of space. That's how size is measured, not in bits, but in bytes. And recall that one byte is eight bits. So a bool, even though you technically only need a 0 or 1, it's a little wasteful how we've implemented it. 
It's actually going to use a whole byte-- so all zeros, are maybe all ones, or something like that, or just one 1 among eight bits. >> A char, meanwhile, used for a character like an Ascii character per last week, is going to be one character. And that synchs up with our notion of it being no more than 256 bits-- rather, synchs up with it being no longer than 8 bits, which gives us as many as 256 values. A double is going to be 8 bytes or 64 bits. >> A float is 4. An int is 4. A long, long is 8. And a string is 8. But don't worry about that. We're going to peel back that layer. It turns out, strings can be longer than 8 bytes. >> And, indeed, we've written strings already, hello world, longer than 8 bytes. But we'll come back to that in just a moment. But the take away here is the following. >> Any computer only has a finite amount of memory and space. You can only store so many files on your Mac or PC. You can only store so many programs in RAM running at once, necessarily, even with virtual memory, because you have a finite amount of RAM. >> And just to picture-- if you've never opened up a laptop or ordered extra memory for a computer, you might not know that inside of your computer is something that looks a little like this. So this is just a common company named Crucial that makes RAM for computers. And RAM is where programs live while they're running. >> So on every Mac or PC, when you double click a program, and it opens up, and it opens some Word document or something like that, it stores it temporarily in RAM, because RAM is faster than your hard disk, or your solid state disk. So it's just where programs go to live when they're running, or when files are being used. >> So you have things that look like this inside of your laptop, or slightly bigger things inside of your desktop. But the key is you only have a finite number of these things. And there's only a finite amount of hardware sitting on this desk right here. 
>> So, surely, we can't store infinitely long numbers. And, yet, if you think back to grade school, how many digits can you have to the right of a decimal point? For that matter, how many digits can you have to the left of a decimal point? Really, infinitely many. >> Now, we humans might only know how to pronounce million, and billion, trillion, and quadrillion, and quintillion. And I'm pushing the limits of my understanding-- or my-- I understand numbers, but my pronunciation of numbers. But they can get infinitely large with infinitely many digits to the left or to the right of a decimal point. >> But computers only have a finite amount of memory, a finite number of transistors, a finite number of light bulbs inside. So what happens when you run out of space? In other words, if you think back to last week when we talked about numbers themselves being represented in binary, suppose that we've got this 8-bit value here. >> And we have seven 1's and one 0. And suppose that we want to add 1 to this value. This is a really big number right now. >> This is 254, if I remember the math from last week right. But what if I change that rightmost 0 to a 1? The whole number, of course, becomes eight 1's. So we're still good. >> And that probably represents 255, though depending on context it could actually represent a negative number. But more on that another time. This feels like it's about as high as I can count. >> Now, it's only 8 bits. And my Mac, surely, has way more than 8 bits of memory. But it does have finite. So the same argument applies, even if we have more of these ones on the screen. >> But what happens if you're storing this number, 255, and you want to count 1 bit higher? You want to go from 255 to 256. The problem, of course, is that if you start counting at zero like last week, you can't count as high as 256, let alone 257, let alone 258,m because what happens when you add a 1? 
If you do the old grade school approach, you put a 1 here, and then 1 plus 1 is 2, but that's really a zero, you carry the 1, carry the 1, carry the 1. All of these things, these 1's, go to zero. And you wind up, yes, as someone pointed out, a 1 on the left hand side. But everything you can actually see and fit in memory is just eight 0's, which is to say at some point if you, a computer, tried counting high enough up, you're going to wrap around, it would seem, to zero, or maybe even negative numbers, which are even lower than zero. >> And we can kind of see this. Let me go ahead and write a real quick program here. Let me go ahead and write a program called Overflow. Include CS50.h, include standard IO.h-- oh, I really missed my syntax highlighting. So let's save this as overflow.c. >> And now int main void-- and before long, we'll come back to explaining why we keep writing int main void. But for now, let's just do it, taking it for granted. Let's give myself an int, and initialize it to 0. >> Let's then do for int i get zero-- actually, let's do an infinite loop and see what happens. While true, then let's print out n is percent i, backslash n, plug-in n. But, now, let's do n gets n plus 1. >> So in other words, on each iteration of this infinite loop, let's take n's value, and add 1 to it, and then store the result back in n on the left. And, in fact, we've seen syntax slightly like this, briefly. A cool trick is instead of writing all this out, you can actually say an n plus equals 1. >> Or if you really want to be fancy, you can say n plus plus semi-colon. But these latter two are just what we'd call syntactic sugar for the first thing. >> The first thing is more explicit, totally fine, totally correct. But this is more common, I'll say. So we'll do this for just a moment. >> Let's now make overflow, which sounds rather ominous, dot slash overflow. Let's see, n's getting pretty big. But let's think, how big can n get? >> n is an int. 
We saw a moment ago with the size of example that an int is four bytes. We know from last week, four bytes is 32 bits, because 8 times 4, that's 32. That's going to be 4 billion. >> And we are up to 800,000. This is going to take forever to count as high as I possibly can. So I'm going to go ahead, as you might before long, and hit Control C-- frankly, Control C, a lot, where Control C generally means cancel. Unfortunately, because this is running in the cloud, sometimes the cloud is spitting out so much stuff, so much output, it's going to take a little while for my input to get to the cloud. So even though I hit Control C a few seconds ago, this is definitely the side effect of an infinite loop. >> And so in such cases, we're going to leave that be. And we're going to add another terminal window over here with the plus, which of course doesn't like that, since it's still thinking. And let's go ahead and be a little more reasonable. >> I'm going to go ahead and do this only finitely many times. Let's use a for loop, which I alluded to earlier. Let's do this. Give me another variable int i gets 0. i is less than, let's say, 64 i++. And now let me go ahead and print out n is percent i, comma n. And then n-- this is still going to take forever. Let's do this. >> n gets n times 2. Or we could be fancy and do times equals 2. But let's just say n equals itself, times 2. In other words, in this new version of the program, I don't want to wait forever from like 800,000 to 4 billion. Let's just get this over with. >> Let's actually double n each time. Which, recall, doubling is the opposite of having, of course. And whereas last week we have something again, and again, and again, super fast, doubling will surely get us from 1 to the biggest possible value that we can count to with an int. >> So let's do exactly this. And we'll come back to this before long. But this, again, is just like the repeat block in Scratch. And you'll use this before long. 
>> This just means count from zero up to, but not equal to, 64. And on each iteration of this loop, just keep incrementing i. So i++-- and this general construct on line 7 is just a super common way of repeating some lines of code, some number of times. Which lines of code? These curly braces, as you may have gleaned by now, mean, do the following. >> It's like in Scratch, when it has the yellow blocks and other colors that kind of embrace or hug other blocks. That's what those curly braces are doing here. So if I got my syntax right-- you can see the caret symbol followed by C in the terminal; that's how many times I was trying to solve this problem with Control C. So let's get rid of that one altogether, and close that window. And we'll use the new one. Make overflow, dot slash overflow, Enter, all right, it looks bad at first. But let's scroll back in time, because I did this 64 times. >> And notice the first time, n is 1. Second time, n is 2, then 4, then 8, then 16. And it seems that as soon as I get to roughly 1 billion, if I double it again, that should give me 2 billion. But it turns out, it's right on the cusp. >> And so it actually overflows an int from 1 billion to roughly negative 2 billion, because an integer, unlike the numbers we were assuming last week, can be both positive and negative in reality and in a computer. And so at least one of those bits is effectively stolen. So we really only have 31 bits, or 2 billion possible values. >> But for now, the takeaway is quite simply, whatever these numbers are and whatever the math is, something bad happens eventually, because eventually you are trying to permute the bits one too many times. And you effectively go from all 1's to maybe all 0's, or maybe just some other pattern that clearly, depending on context, can be interpreted as a negative number. And so it would seem the highest I can count in this particular program is only roughly 1 billion. But there's a partial solution here. You know what?
>> Let me change from an int to a long long. And let me go ahead here and say-- I'm going to have to change this to an unsigned long. Or, let's see, I never remember myself. >> Let's go ahead and make overflow. No, that's not it, lld, thank you. So sometimes Clang can be helpful. I did not remember what the format specifier was for a long long. >> But, indeed, Clang told me. Green is kind of good, but it still means you made a mistake. It's guessing that I meant lld. >> So let me take its advice, a long long decimal number, save that. And let me rerun it, dot slash overflow, Enter. And now what's cool is this. >> If I scroll back in time, we still start counting at the same place-- 1, 2, 4, 8, 16. Notice, we get all the way up to 1 billion. But then we safely get to 2 billion. >> Then we get to 4 billion, then 8 billion, 17 billion. And we go higher, and higher, and higher. Eventually, this, too, breaks. >> Eventually, with a long long, which is a 64-bit value, not a 32-bit value, if you count too high, you wrap around to 0. And in this case, we happen to end up with a negative number. >> So this is a problem. And it turns out that this problem is not all that arcane. Even though I've deliberately induced it with these mistakes, it turns out we see it kind of all around us, or at least some of us do. >> So in Lego Star Wars, if you've ever played the game, it turns out you can go around breaking things up in LEGO world, and collecting coins, essentially. And if you've ever played this game for way too much time, as this unnamed individual here did, the total number of coins that you can collect is, it would seem, 4 billion. >> Now, it's actually rounded. So LEGO was trying to keep things user friendly. They didn't do it exactly 2 to the 32 power, per last week. But 4 billion is a reason.
It seems, based on this information, that LEGO, and the company that made this actual software, decided that the maximum number of coins the user can accumulate is, indeed, 4 billion, because they chose in their code to use not a long long, apparently, but just an integer, an unsigned integer, only a positive integer, whose max value is roughly that. Well, here's another funny one. So in the game Civilization, with which some of you might be familiar, it turns out that years ago there was a bug in this game whereby if you played the role of Gandhi in the game, instead of being very pacifist, he was incredibly, incredibly aggressive, in some circumstances. In particular, the way that Civilization works is that if you, the player, adopt democracy, your aggressiveness score gets decremented by two, so minus minus, and then minus minus. >> So you subtract 2 from your actual rating. Unfortunately, if your rating is initially 1, and you subtract 2 from it after adopting democracy as Gandhi here might have done, because he was very passive-- 1 on the scale of aggressiveness. But if he adopts democracy, then he goes from 1 to negative 1. >> Unfortunately, they were using unsigned numbers, which means they treated even negative numbers as though they were positive. And it turns out that the positive equivalent of negative 1, in typical computer programs, is 255. So if Gandhi adopts democracy, and therefore has his aggressiveness score decreased, it actually rolls around to 255 and makes him the most aggressive character in the game. So you can Google this. And it was, indeed, an accidental programming bug, but one that's entered quite the lore ever since. >> That's all fun and cute. More frightening is when actual real world devices, and not games, have these same bugs. In fact, just a year ago an article came out about the Boeing 787 Dreamliner. >> And the article at first glance reads a little arcane.
But it said this, a software vulnerability in Boeing's new 787 Dreamliner jet has the potential to cause pilots to lose control of the aircraft, possibly in mid-flight, the FAA officials warned airlines recently. It was the determination that a model 787 airplane that has been powered continuously for 248 days can lose all alternating current, AC, electrical power due to the generator control units, GCUs, simultaneously going into fail safe mode. It's kind of losing me. But the memo stated, OK, now I got that, the condition was caused by a software counter internal to the generator control units that will overflow after 248 days of continuous power. We are issuing this notice to prevent loss of all AC electrical power, which could result in loss of control of the airplane. >> So, literally, there is some integer, or some equivalent data type, being used in software in an actual airplane that if you keep your airplane on long enough, which apparently can be the case if you're just running them constantly and never unplugging your airplane, it seems, or letting its batteries die, will eventually count up, and up, and up, and up, and up, and up. >> And, by nature, a finite amount of memory will overflow, rolling back to zero or some negative value, a side effect of which is the frighteningly real reality that the plane might need to be rebooted, effectively, or might fall, worse, as it flies. So these kinds of issues are still with us, even-- this was a 2015 article, all the more frightening when you don't necessarily understand, appreciate, or anticipate those kinds of errors. >> So it turns out there's one other bad thing about data representation. It turns out that even floats are kind of flawed, because floats, too, I proposed are 32 bits, or maybe 64 if you use a double. But that's still finite. 
>> And the catch is that if you can put an infinite number of numbers after the decimal point, there is no way you can represent all the possible numbers that we were taught in grade school can exist in the world. A computer, essentially, has to choose a subset of those numbers to represent accurately. >> Now, the computer can round maybe a little bit, and can allow you to roughly store any number you might possibly want. But just intuitively, if you have a finite number of bits, you can only permute them in so many finite ways. So you can't possibly use a finite number of permutation of bits, patterns of zeros and ones, to represent an infinite number of numbers, which suggests that computers might very well be lying to us sometimes. >> In fact, let's do this. Let me go back into CS50 IDE. Let me go ahead and create a little program called Imprecision, to show that computers are, indeed, imprecise. >> And let me go ahead and start with some of that code from before, and now just do the following. Let me go ahead and do printf, percent f, backslash n, 1 divided by 10. In other words, let's dive in deeper to 1/10, like 1 and divided by 10. Surely, a computer can represent 1/10. >> So let's go ahead and make imprecision. Let's see. Format specifies type double. But the argument has type int. What's going on? >> Oh, interesting, so it's a lesson learned from before. I'm saying, hey, computer show me a float with percent f. But I'm giving it 2 ints. So it turns out, I can fix this in a couple of ways. >> I could just turn one into 1.0, and 10 into 10.0, which would, indeed, have the effect of converting them into floats-- still hopefully the same number. Or it turns out there's something we'll see again before long. You could cast the numbers. >> You can, using this parenthetical expression, you can say, hey, computer, take this 10, which I know is an int. But treat it, please, as though it's a float. But this feels unnecessarily complex. 
>> For our purposes today, let's just literally make them floating point values with a decimal point, like this. Let me go ahead and rerun, make imprecision, good, dot slash imprecision, enter. OK, we're looking good. >> 1 divided by 10, according to my Mac here, is, indeed, 0.100000. Now, I was taught in grade school there should be an infinite number of 0's. So let's at least try to see some of those. It turns out that printf is a little fancier still than we've been using. It turns out you don't have to specify just percent f, or just percent i. You can actually specify some control options here. >> Specifically, I'm going to say, hey, printf, actually show me 10 decimal places. So it looks a little weird. But you say percent, dot, how many numbers you want to see after the decimal point, and then f for float, just because that's what the documentation says. Let me go ahead and save that. >> And notice too, I'm getting tired of retyping things. So I'm just hitting the up and down arrows on my keys here. And if I keep hitting up, you can see all of the commands that I made, or incorrectly made. >> And I'm going to go ahead now and not actually use that, apparently. Make imprecision, dot slash imprecision-- so what I was taught in grade school checks out. Even if I print it to 10 decimal places, it, indeed, is 0.10000. But you know what? >> Let's get a little greedy. Let's say, like, show me 55 places after the decimal. Let's really take this program out for a spin. Let me remake it with make imprecision, dot slash, imprecision. >> And here we go. Your childhood was a lie. Apparently, 1 divided by 10 is indeed 0.100000000000000005551115123-- >> What is going on? Well, it turns out, if you kind of look far enough out in the underlying representation of this number, it actually is not exactly 1/10, or 0.1 and an infinite number of zeros. Now, why is that?
>> Well, even though this is a simple number to us humans, 1 divided by 10, it's still one of infinitely many numbers that we could think up. But a computer can only represent finitely many such numbers. And so, effectively, what the computer is showing us is its closest approximation to the number we want to believe is 1/10, or really 0.10000 ad infinitum. >> Rather, though, this is as close as it can get. And, indeed, if you look underneath the hood, as we are here by looking 55 digits after the decimal, we actually see that reality. Now as an aside, if you've ever seen the movie-- most of you probably haven't-- but in Superman 3 some years ago, Richard Pryor essentially leveraged this reality in his company to steal a lot of fractions and fractions of pennies, because the company-- as I recall, it's been a while-- was essentially throwing away anything that didn't fit into the notion of cents. >> But if you add up all these tiny, tiny, tiny numbers again, and again, and again, you can, as in his case, make a good amount of money. >> That same idea was ripped off by a more recent, but still now older movie, called Office Space, where the guys in that movie did the same thing, screwed it up completely, and ended up with way too much money in their bank account. It was all very suspicious. But at the end of the day, imprecision is all around us. >> And that, too, can be frighteningly the case. It turns out that Superman 3 and Office Space aside, there can be some very real world ramifications of the realities of imprecise representation of data that even we humans to this day don't necessarily understand as well as we should, or remember as often as we should. And, indeed, the following clip is a look at some very real world ramifications of what happens if you don't appreciate the imprecision that can happen in the representation of numbers.
>> [VIDEO PLAYBACK] >> -Computers, we've all come to accept the often frustrating problems that go with them-- bugs, viruses, and software glitches are small prices to pay for the convenience. But in high tech and high speed military and space program applications, the smallest problem can be magnified into disaster. >> On June 4th, 1996, scientists prepared to launch an unmanned Ariane 5 rocket. It was carrying scientific satellites designed to establish precisely how the earth's magnetic field interacts with solar winds. The rocket was built for the European Space Agency, and lifted off from its facility on the coast of French Guiana. >> -At about 37 seconds into the flight, they first noticed something was going wrong. The nozzles were swiveling in a way they really shouldn't. Around 40 seconds into the flight, clearly, the vehicle was in trouble. >> And that's when they made a decision to destroy it. The range safety officer, with tremendous guts, pressed the button, blew up the rocket, before it could become a hazard to the public safety. >> -This was the maiden voyage of the Ariane 5. And its destruction took place because of a flaw embedded in the rocket's software. -The problem on the Ariane was that there was a number that required 64 bits to express. And they wanted to convert it to a 16-bit number. They assumed that the number was never going to be very big, that most of those digits in a 64-bit number were zeroes. They were wrong. >> -The inability of one software program to accept the kind of number generated by another was at the root of the failure. Software development had become a very costly part of new technology. The Ariane 4 rocket had been very successful, so much of the software created for it was also used in the Ariane 5. >> -The basic problem was that the Ariane 5 was faster, accelerated faster. And the software hadn't accounted for that. >> -The destruction of the rocket was a huge financial disaster, all due to a minute software error.
But this wasn't the first time data conversion problems had plagued modern rocket technology. >> -In 1991, with the start of the first Gulf War, the Patriot Missile experienced a similar kind of number conversion problem. And as a result, 28 people, 28 American soldiers, were killed, and about 100 others wounded, when the Patriot, which was supposed to protect against incoming scuds, failed to fire a missile. >> -When Iraq invaded Kuwait, and America launched Desert Storm in early 1991, Patriot Missile batteries were deployed to protect Saudi Arabia and Israel from Iraqi Scud missile attacks. The Patriot is a US medium-range surface to air system, manufactured by the Raytheon company. >> -The size of the Patriot interceptor itself is about roughly 20 feet long. And it weighs about 2,000 pounds. And it carries a warhead of about, I think it's roughly 150 pounds. And the warhead itself is a high explosive, which has fragments around it. The casing of the warhead is designed to act like buckshot. >> -The missiles are carried four per container, and are transported by a semi trailer. >> -The Patriot anti-missile system goes back at least 20 years now. It was originally designed as an air defense missile to shoot down enemy airplanes. In the first Gulf War, when that war came along, the Army wanted to use it to shoot down scuds, not airplanes. >> The Iraqi Air Force was not so much of a problem. But the Army was worried about scuds. And so they tried to upgrade the Patriot. >> -Intercepting an enemy missile traveling at mach 5 was going to be challenging enough. But when the Patriot was rushed into service, the Army was not aware of an Iraqi modification that made their scuds nearly impossible to hit. >> -What happened is the scuds that were coming in were unstable. They were wobbling. The reason for this was the Iraqis, in order to get 600 kilometers out of a 300 kilometer range missile, took weight out of the front warhead. They made the warhead lighter. 
>> So now the Patriot is trying to come at the Scud. And most of the time, the overwhelming majority of the time, it would just fly by the Scud. Once the Patriot system operators realized the Patriot missed its target, they detonated the Patriot's warhead to avoid possible casualties if it was allowed to fall to the ground. >> -That was what most people saw, those big fireballs in the sky, and misunderstood as intercepts of Scud warheads. >> -Although in the night skies, Patriots appeared to be successfully destroying Scuds, at Dhahran, there could be no mistake about its performance. There, the Patriot's radar system lost track of an incoming Scud, and never launched due to a software flaw. It was the Israelis who first discovered that the longer the system was on, the greater the time discrepancy became, due to a clock embedded in the system's computer. >> -About two weeks before the tragedy in Dhahran, the Israelis reported to the Defense Department that the system was losing time. After about eight hours of running, they noticed that the system was becoming noticeably less accurate. The Defense Department responded by telling all of the Patriot batteries to not leave the systems on for a long time. They never said what a long time was-- eight hours, 10 hours, 1,000 hours. Nobody knew. >> -The Patriot battery stationed at the barracks at Dhahran and its flawed internal clock had been on over 100 hours on the night of February 25th. >> -It tracked time to an accuracy of about a tenth of a second. Now, a tenth of a second is an interesting number, because it can't be expressed in binary exactly, which means it can't be expressed exactly in any modern digital computer. It's hard to believe. >> But use this as an example. Let's take the number one third. One third cannot be expressed in decimal exactly. One third is 0.333 going on for infinity. >> There is no way to do that with absolute accuracy in decimal.
That's exactly the kind of problem that happened in the Patriot. The longer the system ran, the worse the time error became. >> -After 100 hours of operation, the error in time was only about one third of a second. But in terms of targeting a missile traveling at mach 5, it resulted in a tracking error of over 600 meters. It would be a fatal error for the soldiers. >> -What happened is a Scud launch was detected by early warning satellites and they knew that the Scud was coming in their general direction. They didn't know where it was coming from. >> -It was now up to the radar component of the Patriot system defending Dhahran to locate and keep track of the incoming enemy missile. >> -The radar was very smart. It would actually track the position of the Scud, and then predict where it probably would be the next time the radar sent a pulse out. That was called a range gate. >> -Then, once the Patriot decides enough time has passed to go back and check the next location for this detected object, it goes back. So when it went back to the wrong place, it then sees no object. And it decides that there was no object, it was a false detection, and drops the track. >> -The incoming Scud disappeared from the radar screen. And seconds later, it slammed into the barracks. The Scud killed 28, and was the last one fired during the first Gulf War. >> Tragically, the updated software arrived at Dhahran the following day. The software flaw had been fixed, closing one chapter in the troubled history of the Patriot missile. >> [END PLAYBACK] >> DAVID J. MALAN: So this is all to say that these issues of overflow and imprecision are all too real. So how did we get here? We began with just talking about printf. Again, this function that prints something to the screen, and we introduced thereafter a few other functions from the so-called CS50 library. And we'll continue to see these in due time.
And we, particularly, used get string, and get int, and now also get float, and yet others still will we encounter and use ourselves before long. >> But on occasion, have we already seen a need to store what those functions hand back? They hand us back a string, or an int, or a float. And sometimes we need to put that string, or int, or float, somewhere. >> And to store those things, recall just like in Scratch, we have variables. But unlike in Scratch, in C we have actual types of variables-- data types, more generally-- among them, a string, an int, a float, and these others still. >> And so when we declare variables in C, we'll have to declare our data types. This is not something we'll have to do later in the semester as we transition to other languages. But for now, we do need to a priori in advance, explain to the computer what type of variable we want it to give us. >> Now, meanwhile, to print those kinds of data types, we have to tell printf what to expect. And we saw percent s for strings, and percent i for integers, and a few others already. And those are simply requirements for the visual presentation of that information. >> And each of these can actually be parametrized or tweaked in some way, if you want to further control the type of output that you get. And, in fact, it turns out that not only is there backslash n for a new line. There's something else called backslash r for a carriage return, which is more akin to an old school typewriter, and also Windows used for many years. >> There's backslash t for tabs. Turns out, that if you want to double quote inside of a string, recall that we've used double quote double quote on the left and the right ends of our strings thus far. That would seem to confuse things. >> If you want to put a double quote in the middle of a string-- and, indeed, it is confusing to see. And so you have to escape, so to speak, a double quote with something like, literally, backslash double quote. And there's a few other still. 
And we'll see more of those in actual use before long. >> So let's now transition from data, and representation, and arithmetic operators, all of which gave us some building blocks with which to play. But now let's actually give us the rest of the vocabulary that we already had last week with Scratch by taking a look at some other constructs in C-- not all of them. But the ideas we're about to see really just to emphasize the translation from one language, Scratch, to another, C. >> And over time, we'll pick up more tools for our toolkit, so to speak, syntactically. And, indeed, you'll see that the ideas are now rather familiar from last week. So let's do this. >> Let's go ahead and whip up a program that actually uses some expressions, a Boolean expression. Let me go ahead here and create a new file. I'll call this condition.c. >> Let me go ahead and include the CS50 library. And let me go ahead and include standard IO.h for our functions, and printf, and more respectively. Let me give myself that boilerplate of int main void, whose explanation we'll come back to in the future. >> Now let me go ahead and give myself an int via get int. Then let me go ahead and do this. I want to say if i is less-- let's distinguish between positive, negative, or zero values. >> So if i is less than zero, let me just have this program simply say, negative, backslash n, else if i is greater than zero. Now I'm, of course, going to say printf positive, backslash n. And then else if-- I could do this. >> I could do if i equals 0. But I'd be making at least one mistake already. Recall that the equal sign is not equal, as we humans know it. >> But it's the assignment operator. And we don't want to take 0 on the right and put it in i on the left. So to avoid this confusion, or perhaps misuse of the equals sign, humans decided some years ago that in many programming languages when you want to check for equality between the left and the right, you actually use equals equals. 
So you hit the equals sign twice. When you want to assign from right to left, you use a single equal sign. So we could do this-- else if i equals equals zero. >> I could then go and open my curly braces, and say, printf 0, backslash n, done. But remember how these forks in the road can work. And, really, just think about the logic. i is a number. It's an integer, specifically. And that means it's going to be less than 0, or greater than 0, or 0. So there is kind of this implied default case. >> And so we could, just like Scratch, dispense with the else if, and just say else. Logically, if you the programmer know there's only three buckets into which a scenario can fall-- the first, the second, or the third in this case-- don't bother adding the additional precision and the additional logic there. Just go ahead with the default case here of else. >> Now, let's go ahead after saving this, make conditions dot slash conditions-- not a great user interface, because I'm not prompting the user, as I mentioned earlier. But that's fine. We'll keep it simple. Let's try the number 42. And that's positive. Let's try the number negative 42, negative. >> Let's try the value 0. And, indeed, it works. Now, you'll see with problems before long, testing things three times, probably not sufficient. You probably want to test some bigger numbers, some smaller numbers, some corner cases, as we'll come to describe them. >> But for now, this is a pretty simple program. And I'm pretty sure, logically, that it falls into three cases. And, indeed, even though we just focused on the potential downsides of imprecision and overflow, in reality, for many of CS50's problems, we are not going to worry all the time about those issues of overflow and imprecision, because, in fact, in C, it's actually not all that easy to avoid those things.
If you want to count up bigger, and bigger, and bigger, it turns out there are techniques you can use, often involving things called libraries, collections of code, that other people wrote that you can use, and other languages like Java and others, actually make it a lot easier to count even higher. So it really is some of these dangers a function of the language you use. And in the coming weeks, we'll see how dangerous C really can be if you don't use it properly. But from there, and with Python, and JavaScript, will we layer on some additional protections, and run fewer of those risks. >> So let's make a little more interesting logic in our program. So let me go ahead and create a program called Logical just so I can play with some actual logic, logical.c. I'll just copy and paste some code from earlier so I get back to this nice starting point. >> Let me this time do char C. I'm going to give it a name of C just because it's conventional, get a character from the user. And let's pretend like I'm implementing part of that Rm program, the remove program before that prompted the user to remove a file. How could we do this? >> I want to say, if C equals equals, quote unquote, y, then I'm going to assume that the user has chosen yes. I'm just going to print yes. If it were actually writing the removal program, we could remove the file with more lines of code. But we'll keep it simple. >> Else if c equals equals n-- and now here, I'm going to say, the user must have meant no. And then else, you know what? I don't know what else the user is going to type. So I'm just going to say that that is an error, whatever he or she actually typed. >> So what's going on here? There is a fundamental difference versus what I've done in the past. Double quotes, double quotes, double quotes, and, yet, single quotes, single quotes. It turns out in C, that when you want to write a string, you do use double quotes, just as we've been using all this time with printf. 
>> But if you want to deal with just a single character, a so-called char, then you actually use single quotes. Those of you who've programmed before, you might not have had to worry about this distinction in certain languages. In C, it does matter. And so when I get a char and I want to compare that char using equals equals to some letter like y or n, I do, indeed, need to have the single quotes. >> Now, let's go ahead and do this. Let's go ahead and do make logical dot slash logical. And now I'm being prompted. So, presumably, a better user experience would actually tell me what to do here. But I'm going to just blindly say y for yes, OK, nice. >> Let's run it again, n for no, nice. Suppose like certain people I know, my caps lock key is on all too often. So I do capital Y, enter, error. OK, it's not exactly what I'm expecting. Indeed, the computer is doing literally what I told it to do-- check for lowercase y and lowercase n. This doesn't feel like good user experience, though. Let me ask for and accept either lower case or upper case. So it turns out, you might want to say something like in Scratch, like literally or C equals equals capital single quoted y. Turns out, C does not have this literal keyword or. >> But it does have two vertical bars. You have to hold Shift usually, if you're using a US keyboard, and hit the vertical bar key above your return key. But this vertical bar vertical bar means or. >> If, by contrast, we wanted to say and, like in Scratch, we could do ampersand ampersand. That makes no logical sense here, because a human could not possibly have typed both y and lowercase y and capital Y as the same character. So or is what we intend here. >> So if I do this in both places, or c equals equals capital N, now rerun, make logical, rerun logical. Now, I can type y. And I can do it again with capital Y, or capital N. And I could add in additional combinations still. 
>> So this is a logical program insofar as now I'm checking logically for this value or this value. And I don't have to, necessarily, come up with two more ifs or else ifs. I can actually combine some of the related logic together in this way. So this would be better designed than simply saying, if C equals lower case y, print yes, else if c equals capital Y, print yes, else if c equals lower-- in other words, you don't have to have more and more branches. You can combine some of the equivalent branches logically, as in this way. >> So let's take a look at just one final ingredient, one final construct, that C allows. And we'll come back in the future to others still. And then we'll conclude by looking at not the correctness of code-- getting code to work-- but the design of code, and plant those seeds early on. >> So let me go ahead and open up a new file here. You know what? I'm going to re-implement that same program, but using a different construct. >> So let me quickly give myself access to include CS50.h for the CS50 library, standard IO.h for printf. Give me my int main void. And then over here, let me go ahead and do this. >> Char c gets get char, just like before. And I'm going to use a new construct now-- switch, on what character? So switch is kind of like switching train tracks. Or, really, it is kind of an if, else if, else, but written somewhat differently. >> A switch looks like this. You have switch, and then what character or number you want to look at, then some curly braces which, like in Scratch, just say do this stuff. And then you have different cases. >> You don't use if and else. You literally use the word case. And you would say something like this. >> So in the case of a lowercase y, or in the case of a capital Y, go ahead and print out yes. And then break out of the switch. That's it. We're done. >> Else if, so to speak, lower case n, or capital N, then go ahead and print out no, and then break.
Else-- and this kind of is the default case indeed-- printf error-- and just for good measure, though logically this break is not necessary because we're at the end of the switch anyway, I'm now breaking out of the switch. So this looks a little different. >> But, logically, it's actually equivalent. And why would you use one over the other? Sometimes, just personal preference, sometimes the aesthetics, if I glance at this now, there's something to be said for the readability of this code. I mean, never mind the fact that this code is new to many of us in the room. >> But it just kind of is pretty. You see lowercase y, capital Y, lower case n, capital N default, it just kind of jumps out at you in a way that, arguably, maybe the previous example with the ifs, and the vertical bars, and the else ifs, might not have. So this is really a matter of personal choice, really, or readability, of the code. >> But in terms of functionality, let me go ahead and make a switch, dot slash switch, and now type in lowercase y, capital Y, lowercase n, capital N, David, retry because that's not a single character. Let's do x, error, as expected. And, logically-- and this is something I would encourage in general-- even though we're only scratching the surface of some of these features. >> And it might not be obvious when you yourself sit down at the keyboard, how does this work? What would this do? The beautiful thing about having a laptop, or desktop, or access to a computer with a compiler, and with a code editor like this, is you can almost always answer these questions for yourself just by trying. >> For instance, if the rhetorical question at hand were, what happens if you forget your break statements? Which is actually a very common thing to do, because it doesn't look like you really need them. They don't really complete your thought like a parenthesis or a curly brace does. Let's go ahead and recompile the code and see. So make switch, dot slash switch. 
Let's type in lowercase y, the top case, Enter. So I typed y. >> The program said yes, no, error, as though it was changing its mind. But it kind of was, because what happens with a switch is the first case that matches essentially means, hey computer, execute all of the code beneath it. And if you don't say break, or don't say break, or don't say break, the computer is going to blow through all of those lines and execute all of them until it gets to that curly brace. So the breaks are, indeed, necessary. But a takeaway here is, when in doubt, try something. Maybe save your code first, or save it in an extra file if you're really worried about messing up and having to recover the work that you know is working. >> But try things. And don't be as afraid, perhaps, of what the computer might do, or that you might break something. You can always revert back to some earlier version. >> So let's end by looking at the design of code. We have this ability now to write conditions, and write loops, and variables, and call functions. So, frankly, we're kind of back at where we were a week ago with Scratch, albeit with a less compelling textual environment than Scratch allows. >> But notice how quickly we've acquired that vocabulary, even if it's going to take a little while to sink in, so that we can now use this vocabulary to write more interesting programs. And let's take a baby step toward that, as follows. Let me go ahead and create a new file here. >> I'm going to call this prototype.c, and introduce for the first time, the ability to make your own functions. Some of you might have done this with Scratch, whereby you can create your own custom blocks in Scratch, and then drag them into place wherever you'd like. In C, and in most programming languages, you can do exactly that-- make your own functions, if they don't already exist. >> So, for instance, let me go ahead and include CS50.h, and include standard IO.h, int main void. And now we have a placeholder ready to go.
I keep printing things like people's names today. And that feels like-- wouldn't it be nice if there were a function called print name? I don't have to use printf. I don't have to remember all the format codes. Why don't I, or why didn't someone before me, create a function called print name, that given some name, simply prints it out? >> In other words, if I say, hey, computer, give me a string by asking the user for such, via CS50's get string function. Hey, computer, put that string in the variable on the left hand side, and call it s. And then, hey computer, go ahead and print that person's name, done. >> Now, it would be nice, because this program, aptly named, tells me what it's supposed to do by way of those functions' names. Let me go and make prototype, Enter. And, unfortunately, this isn't going to fly. >> Prototype.c, line 7, character 5, error, implicit declaration of function print name is invalid in C99, C99 meaning a version of C that came out in 1999. That's all. >> So I don't know what all of this means yet. But I do recognize error in red. That's pretty obvious. >> And it seems that with the green character here, the issue is with print name, open paren s, close paren, semi-colon. But implicit declaration of function we did see briefly earlier. This means, simply, that Clang does not know what I mean. >> I've used a vocabulary word that it's never seen or been taught before. And so I need to teach it what this function means. So I'm going to go ahead and do that. >> I'm going to go ahead and implement my own function called Print Name. And I'm going to say, as follows, that it does this, printf, hello, percent s, backslash n, name, semi-colon. So what did I just do? >> So it turns out, to implement your own function, we kind of borrow some of the same structure as main that we've just been taking for granted, and, I know, just copying and pasting pretty much what I've been writing in the past. But notice the pattern here.
Int, Main, Void, we'll tease apart before long what that actually means. >> But for today, just notice the parallelism. Void, print name, string name, so there's a purple keyword, which we're going to start calling a return type, the name of the function, and then the input. So, actually, we can distill this kind of like last week as, this is the name or the algorithm of the code we're going to write-- the algorithm underlying the code we're going to write. >> This is its input. This is its output. This function, print name, is designed to take a string called name, or whatever, as input, and then void. It doesn't return anything, like get string or get int does. So it's not going to hand me something back. It's just going to have a side effect, so to speak, of printing a person's name. So notice, line 7, I can call print name. Line 10, I can define or implement print name. But, unfortunately, that's not enough. >> Let me go ahead and recompile this after saving. Whoa, now, I've made it worse, it would seem. So implicit declaration of function print name is invalid. And, again, there are more errors. But as I cautioned earlier, even if you get overwhelmed with, or a little sad to see so many errors, focus only on the first initially, because it might just have had a cascading effect. So C, or Clang more specifically, still does not recognize print name. >> And that's because Clang, by design, is kind of dumb. It only does what you tell it to do. And it only does so in the order in which you tell it to do so. >> So I have defined main on line four, like we've been doing pretty often. I've defined print name on line 10. But I'm trying to use print name on line seven. >> It's too soon, doesn't exist yet. So I could be clever, and be like, OK, so let's just play along, and move print name up here, and re-compile. Oh my God. It worked. It was as simple as that. >> But the logic is exactly that. You have to teach Clang what it is by defining the function first.
Then you can use it. But, frankly, this feels like a slippery slope. >> So every time I run into a problem, I'm just going to highlight and copy the code I wrote, cut it and paste it up here. And, surely, we could contrive some scenarios where one function might need to call another. And you just can't put every function above every other. >> So it turns out there's a better solution. We can leave this be. And, frankly, it's generally nice, and convenient, and good design to put main first, because, again, main just like when green flag clicked, that is the function that gets executed by default. So you might as well put it at the top of the file so that when you or any other human looks at the file you know what's going on just by reading main first. So it turns out, we can tell Clang proactively, hey, Clang, on line four, I promise to implement a function called Print Name that takes a string called name as input, and returns nothing, void. And I'll get around to implementing it later. >> Here comes Main. Main now on line 9 can use Print Name because Clang is trusting that, eventually, it will encounter the definition of the implementation of Print Name. So after saving my file, let me go ahead and make prototype, looks good this time. Dot slash, prototype, let me go ahead and type in a name. David, hello David, Zamila, hello Zamila, and, indeed, now it works. >> So the ingredient here is that we've made a custom function, like a custom Scratch block we're calling it. But unlike Scratch where you can just create it and start using it, now we have to be a little more pedantic, and actually train Clang to use, or to expect it. Now, as an aside, why all this time have we been just blindly on faith including CS50.h, and including standard IO.h? >> Well, it turns out, among a few other things, all that's in those dot h files, which happen to be files. They're header files, so to speak. They're still written in C. But they're a different type of file. 
>> For now, you can pretty much assume that all that is inside of CS50.h is some one-liners like this, not for functions called Print Name, but for Get String, Get Float, and a few others. And there are similar prototypes, one liners, inside of standard IO.h for printf, which is now in my own Print Name function. So in other words, this whole time we've just been blindly copying and pasting include this, include that, what's going on? Those are just kind of clues to Clang as to what functions are, indeed, implemented, just elsewhere in different files elsewhere on the system. >> So we've implemented print name. It does have this side effect of printing something on the screen. But it doesn't actually hand me something back. How do we go about implementing a program that does hand me something back? >> Well, let's try this. Let me go ahead and implement a file called return.c so we can demonstrate how something like Get String, or Get Int, is actually returning something back to the user. Let's go ahead and define int main void. >> And, again, in the future, we'll explain what that int and that void are actually doing. But for today, we'll take it for granted. I'm going to go ahead and printf, for a good user experience, x is. And then I'm going to wait for the user to give me x with get int. >> And then I'm going to go ahead and print out x squared. So when you only have a keyboard, people commonly use the little caret symbol on the keyboard to represent to the power of, or the exponent of. So x squared is percent i. >> And now I'm going to do this. I could just do-- what's x squared? x squared is x times x. >> And we did this some time ago already today. This doesn't feel like all that much progress. You know what? Let's leverage some of that idea from last time of abstraction. >> Wouldn't it be nice if there's a function called square that does exactly that? It still, at the end of the day, does the same math.
But let's abstract away the idea of taking one number multiplied by another, and just give it a name, like square this value. >> And, in other words, in C, let's create a function called square that does exactly that. It's going to be called square. It's going to take an int. And we'll just call it n, by default. >> But we could call it anything we want. And all that it's going to do, literally, is return the result of n times n. But because it is returning something, which is the keyword in purple we've never seen before, I, on line 11, can't just say void this time. >> Void, in the example we just saw of print name, just means, do something. But don't hand me something back. In this case, I do want to return n times n, or whatever that is, that number. >> So I can't say, hey, computer, I return nothing, void. It's going to return, by nature, an int. And so that's all that's going on here. >> The input to square is going to be an int. And so that we can use it, it has to have a name, n. It's going to output an int that doesn't need a name. We can leave it to main, or whoever is using me, to remember this value if we want with its own variable. >> And, again, the only new keyword here is return. And I'm just doing some math. If I really wanted to be unnecessarily verbose, I could say int product gets n times n. >> And then I could say, return product. But, again, to my point earlier of this just not being good design-- like, why introduce a name, a symbol, like product, just to immediately return it? It's a little cleaner, a little tighter, so to speak, just to say return n times n, get rid of this line altogether. >> And it's just less code to read, less opportunity for mistakes. And let's see if this actually now works. Now, I'm going to go ahead and make return. >> Uh-oh, implicit declaration of function. I made this mistake before, no big deal. Let me just type, or highlight and copy, the exact same function prototype, or signature, of the function up here.
Or I could move the whole function. >> But that's a little lazy. So we won't do that. Now, let me make return again, dot slash return. >> x is 2. x squared is 4. x is 3. x squared is 9. And the function seems now to be working. So what's the difference here? I have a function that's called square, in this case, into which I put an input. And I get back an output. And yet, previously, if I open the other example from earlier, which was called prototype.c, I had print name, which returned void, so to speak, or it returned nothing, and simply had a side effect. >> So what's going on here? Well, consider the function get string for just a moment. We've been using the function get string in the following way. >> We've had programs that use get string, like: include CS50.h, include standard IO.h, int, main, void. And then every time I've called get string thus far, I've said something like, string s gets get string, because get string-- let's call this get.c-- get string itself returns a string that I can then use, and say, hello, comma, percent s, backslash n, s. >> So this is the same example, really, that we had earlier. So get string returns a value. But a moment ago, print name did not return a value. It simply has a side effect. So this is a fundamental difference. We've seen different types of functions now, some of which have returned values, some of which don't. So maybe it's string, or int, or float. Or maybe it's just void. >> And the difference is that these functions that get data and return a value are actually bringing something back to the table, so to speak. So let's go ahead and look at one final set of examples that gives a sense, now, of how we might, indeed, abstract better, and better, and better, or more, and more, and more, in order to write, ultimately, better code. Let's go ahead, and in the spirit of Scratch, do the following. >> Let me go ahead and include CS50.h and standard IO.h. Let me go ahead and give myself an int, main, void.
And let me go ahead, call this cough.c. >> And let me go ahead and just like Scratch, print out cough, backslash n. And I want to do this three times. So I'm, of course, just going to copy and paste three times. I'm now going to make cough, dot slash cough. Let's give myself a little more room here, Enter, cough, cough, cough. >> There's, obviously, already an opportunity for improvement. I've copied and pasted a few times today. But that was only so I didn't have to type as many characters. I still changed what those lines of code are. >> These three lines are identical, which feels lazy and indeed is, and is probably not the right approach. So with what ingredient could we improve this code? We don't have to copy and paste code. >> And, indeed, any time you feel yourself copying and pasting, and not even changing code, odds are there's a better way. And, indeed, there is. Let me go ahead and do a for loop, even though the syntax might not come naturally yet. >> Do this three times, simply by doing the following-- and I happen to know this from practice. But we have a number of examples now. And you'll see online more references still. >> This is the syntax on line 6 that, much like Scratch's repeat block, repeats the following three times. It's a little magical for now. But this will get more and more familiar. >> And it's going to repeat line eight three times, so that if I re-compile, make cough, dot slash cough, cough, cough, cough. It still works the same way. So that's all fine and good. But that's not very abstracted. >> It's perfectly correct. But it feels like there could be an opportunity, as in the world of Scratch, to kind of start to add some semantics here so that I don't just have some for loop, and a function that says cough, or does cough. You know what? Let me try to be a little cooler than that, and actually write a function that has some side effects, call it cough. >> And it takes no input, and returns no value as output. But you know what it does?
It does this-- printf, quote unquote, cough. >> And now up here, I'm going to go ahead and for int, i gets zero, i less than 3, i plus plus. I'm going to not do printf, which is arguably a low level implementation detail. I don't care how to cough. I just want to use the cough function. And I'm just going to call cough. >> Now, notice the dichotomy. When you call a function, if you don't want to give it inputs, totally fine. Just do open paren, close paren, and you're done. >> When you define a function, or declare a function's prototype, if you know in advance it's not going to take any arguments, say void in those parentheses there. And that makes certain that you won't accidentally misuse it. Let me go ahead and make cough. And, of course, I've made a mistake. >> Dammit, there's that implicit declaration. But that's fine. It's an easy fix. I just need the prototype higher up in my file than I'm actually using it. >> So now let me make cough again, nice. Now, it works. Make cough, cough, cough, cough. So you might think that we're really just over engineering this problem. And, indeed, we are. This is not a good candidate of a program at the moment for refactoring, and doing what's called hierarchical decomposition, where you take some code, and then you kind of factor things out, so as to ascribe more semantics to them, and reuse it ultimately longer term. But it's a building block toward more sophisticated programs that we will start writing before long that allows us to have the vocabulary with which to write better code. And, indeed, let's see if we can't generalize this further. >> It seems a little lame that I, main, need to worry about this darn for loop, and calling cough again and again. Why can't I just tell cough, please cough three times? In other words, why can't I just give input to cough and do this? >> Why can't I just say, in main cough three times. And now, this is kind of magical. It's very iterative here. And it's, indeed, a baby step. 
>> But just the ability to say on line eight, cough three times, it's just so much more readable. And, plus, I don't have to know or care how cough is implemented. And, indeed, later in the term and for final projects, if you tackle a project with a classmate or two classmates, you'll realize that you're going to have to, or want to, divide the work. >> And you're going to want to decide in advance, who's going to do what, and in which pieces? And wouldn't it be nice if you, for instance, take charge of writing main, done. And your roommate, or your partner more generally, takes care of implementing cough. >> And this division, these walls of abstraction, or layers of abstraction if you will, are super powerful, because especially for larger, more complex programs and systems, it allows multiple people to build things together, and ultimately stitch their work together in this way. But, of course, we need to now fix cough. We need to tell cough that, hey, you know what? You're going to need to take an input-- so not void, but int n now. Let's go ahead and put into cough: int i gets zero. >> i is less than how many times? I said three before. But that's not what I want. I want cough to be generalized to support any number of iterations. >> So, indeed, it's n that I want, whatever the user tells me. Now, I can go ahead and say print cough. And no matter what number the user passes in, I will iterate that many times. >> So at the end of the day, the program is identical. But notice all of this stuff could even be in another file. Indeed, I don't know at the moment how printf is implemented. >> I don't know at the moment how get string, or get int, or get float are implemented. And I don't want to see them on my screen. As it is, I'm starting to focus on my program, not those functions. >> And so, indeed, as soon as you start factoring code like this out, we could even move cough to a separate file. Someone else could implement it.
And you and your program become the very beautiful, and very readable, arguably, really four line program right there. >> So let's go ahead now and make one more change. Notice that my prototype has to change up top. So let me fix that so I don't get yelled at. >> Make cough, let me run cough once more, still doing the same thing. But now, notice we have an ingredient for one final version. You know what? I don't want to just cough, necessarily. I want to have something more general. So you know what? I want to do this. I want to have, much like Scratch does, a say block, but not just say something some number of times. I want it to say a very specific string. And, therefore, I don't want it to just say cough. I want it to say whatever string is passed in. >> So notice, I've generalized this so that now say, which feels like a good name for this, like Scratch, takes two arguments, unlike Scratch. One is a string. One is an int. >> And I could switch them. I just kind of like the idea of say the string first, and then how many times later. Void means it still doesn't return anything. These are just visual side effects, like with [? Jordan, ?] a verbal side effect of yelling. It still does something n times, 0 up to, but not equal to n. This means n total times. And then just print out whatever that string is. So I've really generalized this line of code. So now, how do I implement the cough function? >> I can do void cough. And I can still take in how many times you want to cough. But you know what? I can now punt to say. >> I can call say with the word cough, passing in n. And if I want to also implement, just for fun, a sneeze function, I can sneeze some number of times. And I can keep reusing n, because notice that n in this context or scope only exists within this function. >> And n in this context only exists within this function here. So we'll come back to these issues of scope. And here, I'm just going to say, achoo, and then n times, semi-colon.
>> And now, I just need to borrow these function signatures up here. So cough is correct. Void sneeze is correct now. >> And I still just need say. So I'm going to say, say string s, int n, semi-colon. So I've over-engineered the heck out of this program. >> And this doesn't necessarily mean this is what you should do when writing even the simplest of programs: take something that's obviously really simple, really short, and re-implement it using way too much code. But you'll actually see, and in time look back on these examples, and realize, oh, those are the steps we took to actually generalize, to factor something out, until at the end of the day my code is actually pretty reasonable. Because if I want to cough three times then sneeze three times, I'm simply going to recompile this program, make cough, and run cough. And I have three coughs and three sneezes. >> And so this is a basic paradigm, if you will, for how we might go about actually implementing a program. But let's just see now what it is we've been doing all of this time, and what some of the final pieces are behind this simple command. At the end of the day, we've been using Clang as our compiler. We've been writing source code, converting it via Clang into machine code. >> And we've been using Make just to facilitate our keystrokes so that we don't have to remember those incantations of Clang itself. But what is Make actually doing? And, in turn, what is Clang actually doing? >> It turns out, though we have simplified today's discussion by saying, you take source code, pass it as input to a compiler, which gives you output of machine code, turns out there are a few different steps inside there. And compiling happens to be the umbrella term for a whole bunch of steps. But let's just tease this out really quickly. >> It turns out that we've been doing more things every time I run a program, or every time I compile a program today.
So preprocessing refers to this-- anything in a C program, as we'll see again and again, that starts with this hash symbol, or the hashtag symbol here, means it's a preprocessor directive. That means, in this case, hey computer, do something with this file before you actually compile my own code. >> In this case, hash include is, essentially, C's way of saying, hey computer, go get the contents of CS50.h and paste them here. Hey computer, go get the contents of standard IO.h, wherever that is on the hard drive, paste it here. So those things happen first during preprocessing. >> And Clang does all of this for us. And it does it so darn fast, you don't even see four distinct things happening. But that's the first such step. >> What actually happens next? Well, the next official step is compiling. And it turns out that compiling a program technically means going from source code, the stuff we've been writing today, to something called assembly code, something that looks a little different. >> And, in fact, we can see this real fast. Let me actually go into my IDE. Let me go ahead and open hello.c, which is the very first program with which we began today. And let me go ahead and run Clang a little differently, clang -S hello.c, which is actually going to give me another file, hello.s. >> And we will probably never again see this kind of code. If you take a lower level systems class like CS61, you will see a lot more of this kind of code. But this is assembly language. This is x86 assembly language that the CPU that is underlying CS50 IDE actually understands. >> And cryptic as it does look, it is something the computer understands pretty well. Sub q, this is a subtract. There's movements. >> There's calling of functions here, x oring, a movement, an add, a pop, a return. So there's some very low level instructions that CPUs understand that I alluded to earlier. That is what's Intel Inside.
>> There are patterns of zeros and ones that map to these arcanely worded, but somewhat well-named, instructions, so to speak. That is what happens when you compile your code. You get assembly language out of it, which means the third step is to assemble that assembly code into, ultimately, machine code-- zeros and ones, not the text that we just saw a moment ago. >> So pre-processing does that find and replace, and a few other things. Compiling takes your source code from C, source code that we wrote, to assembly code that we just glanced at. Assembling takes that assembly code to zeroes and ones that the CPU really will understand at the end of the day. And linking is the last step that happens for us-- again, so fast we don't even notice-- that says, hey computer, take all of the zeros and ones that resulted from compiling David's code, and his main function in this case. >> And hey computer, go get all of the zeros and ones that the CS50 staff wrote inside the CS50 library. Mix those in with David's. And hey computer, go get all the zeros and ones that someone else wrote years ago for printf. And add those into the whole thing, so that we've got my zeros and ones, the CS50 staff's zeros and ones, the printf zeros and ones, and anything else we're using. >> They all get combined together into one program called, in this case, hello. So henceforth, we will just use the word compiling. And we will take for granted that when we say, compile your program, it means, hey do the pre-processing, assembling, and linking. But there's actually some juicy stuff going on there underneath the hood. And especially if you get curious some time, you can start poking around at this lower level. But for now, realize that among the takeaways for today are quite simply the beginning of a process, of getting comfortable with something like hello world. Indeed, most of what we did today certainly won't sink in super fast. And it will take some time, and some practice. 
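For the curious, the four steps summarized above can be tried one at a time on a file like hello.c. The commands in the comment below use standard Clang flags (`-E`, `-S`, `-c`, `-o`); the file names are simply the ones from the lecture, and this is a sketch of one way to poke around, not what make literally runs.

```c
// hello.c
//
// Running each stage by hand:
//   clang -E hello.c          preprocess only; the expanded source is printed
//   clang -S hello.c          compile to assembly; writes hello.s
//   clang -c hello.s          assemble; writes machine code to hello.o
//   clang hello.o -o hello    link; writes the program hello
//
// A program that uses the CS50 library would also need -lcs50
// on the linking step so that the library's machine code is mixed in.

#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
```

Running plain `clang hello.c -o hello` performs all four stages in one go.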
And odds are, you will sort of want to hit your keyboard or yell at the screen. And all of that's OK. Though, perhaps try not to do it in the library so much. >> And ultimately, you'll be able, though, to start seeing patterns, both in good code that you've written and in mistakes that you've made. And much like the process of becoming a TF or a CA is like, you'll start to get better and better at seeing those patterns, and just solving your own problems ultimately. In the meantime, there will be plenty of us to lend you support, and get you through this. And in the write-ups for all of the problems you will be guided through all of the commands that I certainly know from a lot of practice by now, but might have flown over one's head for now. And that's totally fine. >> But, ultimately, you're going to start to see patterns emerge. And once you get past all of the stupid details, like parentheses, and curly braces, and semi-colons, and the stuff, frankly, that is not at all intellectually interesting, and is not the objective of taking any introductory class, it's the ideas that are going to matter. >> It's the loops, and the conditions, and the functions, and more powerfully the abstraction, and the factoring of code, and the good design, and the good style, and ultimately the correctness of your code, that's ultimately going to matter the most. So next week, we will take these ideas that we first saw in Scratch and have now translated to C. And we'll start to introduce the first of the course's real world domains. >> We'll focus on the world of security, and more specifically cryptography, the art of scrambling information. And among the first problems you yourself will get to write beyond playing with some of the syntax and solving some logical problems, ultimately before long, is to actually scramble, or encrypt, and ultimately decrypt information.
And everything we've done today, while fairly low level, is just going to allow us to take one, and one, and one more step above toward writing the most interesting code yet. >> So more on that next week. >> [VIDEO PLAYBACK] >> -What can you tell me about the last time you saw him? -What can I say, really? I mean, it was like any other pre-production rehearsal, except there was something he said at the very end that stuck with me. >> -This was CS50. >> -That's a cut everyone, great job on rehearsal. >> -That's lunch? >> -Yeah, you and I can grab a sandwich in a bit. Let me just debrief with David really quickly. David? David? >> [END PLAYBACK]
CS50 2016 - Week 1 - C, posted on 2016/11/06