  • [Valgrind]

  • [Nate Hardison, Harvard University]

  • [This is CS50. CS50.TV]

  • Some of the most difficult bugs in C programs

  • come from the mismanagement of memory.

  • There are a huge number of ways to screw things up,

  • including allocating the wrong amount of memory,

  • forgetting to initialize variables,

  • writing before or after the end of a buffer,

  • and freeing heap memory multiple times.

  • The symptoms range from intermittent crashes

  • to mysteriously overwritten values,

  • often at places and times far removed from the original error.

  • Tracing the observed problem back to the underlying root cause

  • can be challenging,

  • but fortunately there's a helpful program called Valgrind

  • that can do a lot to help.

  • >> You run a program under Valgrind to enable

  • extensive checking of heap memory allocations and accesses.

  • When Valgrind detects a problem, it gives you immediate,

  • direct information that allows you to

  • more easily find and fix the problem.

  • Valgrind also reports on less deadly memory issues,

  • such as memory leaks--allocating heap memory

  • and forgetting to free it.

  • Like our compiler, Clang, and our debugger, GDB,

  • Valgrind is free software, and it is installed on the appliance.

  • Valgrind runs on your binary executable,

  • not your .c or .h source code files,

  • so be sure you have compiled an up-to-date copy of your program

  • using Clang or Make.

  • Then, running your program under Valgrind can be

  • as simple as just prefixing the standard program command with the word Valgrind,

  • which starts up Valgrind and runs the program inside of it.

  • When starting, Valgrind does some complex

  • jiggering to configure the executable for the memory checks,

  • so it can take a bit to get up and running.

  • The program will then execute normally, albeit much more slowly,

  • and when it finishes, Valgrind will print a summary of its memory usage.

  • If all goes well, it will look something like this:

  • In this case, ./clean_program

  • is the path to the program I want to run.

  • And while this one doesn't take any arguments,

  • if it did I'd just tack them on to the end of the command as usual.

  • clean_program is just a silly little program I created

  • that allocates space for a block of ints on the heap,

  • puts some values inside of them, and frees the whole block.

  • This is what you're shooting for, no errors and no leaks.
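
As a rough sketch of what a program like clean_program might contain, here is the allocate, fill, and free pattern in C. The function name fill_and_sum and its return convention are our own assumptions for illustration, not the source shown in the video.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the work clean_program does: allocate a block
 * of n ints on the heap, store values in it, and free the whole block.
 * Returns the sum of the stored values, or -1 if malloc fails. */
long fill_and_sum(size_t n)
{
    assert(n > 0);

    int *block = malloc(n * sizeof *block);
    if (block == NULL)
    {
        return -1;
    }

    long sum = 0;
    for (size_t i = 0; i < n; i++)
    {
        block[i] = (int) i;   /* every index stays inside the block */
        sum += block[i];
    }

    free(block);              /* one malloc, one matching free */
    return sum;
}
```

Calling fill_and_sum from main and running the compiled binary under Valgrind should, if the sketch is right, report no errors and no leaks for this block.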

  • >> Another important metric is the total number of bytes allocated.

  • Depending on the program, if your allocations are in the megabytes or higher,

  • you're probably doing something wrong.

  • Are you unnecessarily storing duplicates?

  • Are you using the heap for storage, when it would be better to use the stack?

  • So, memory errors can be truly evil.

  • The more overt ones cause spectacular crashes,

  • but even then it can still be hard to pinpoint

  • what exactly led to the crash.

  • More insidiously, a program with a memory error

  • can still compile cleanly

  • and can still seem to work correctly

  • because you managed to get lucky most of the time.

  • After several "successful outcomes,"

  • you might just think that a crash is a fluke of the computer,

  • but the computer is never wrong.

  • >> Running Valgrind can help you track down the cause of visible memory errors

  • as well as find lurking errors you don't even yet know about.

  • Each time Valgrind detects a problem, it prints information about what it observed.

  • Each item is fairly terse--

  • the source line of the offending instruction, what the issue is,

  • and a little info about the memory involved--

  • but often it's enough information to direct your attention to the right place.

  • Here is an example of Valgrind running on a buggy program

  • that does an invalid read of heap memory.

  • We see no errors or warnings in compilation.

  • Uh-oh, the error summary says that there are two errors--

  • two invalid reads of size 4--bytes, that is.

  • Both bad reads occurred in the main function of invalid_read.c,

  • the first on line 16 and the second on line 19.

  • Let's look at the code.

  • Looks like the first call to printf tries to read one int past the end of our memory block.

  • If we look back at Valgrind's output,

  • we see that Valgrind told us exactly that.

  • The address we're trying to read starts 0 bytes

  • past the end of the block of size 16 bytes--

  • four 32-bit ints that we allocated.

  • That is, the address we were trying to read starts right at the end of our block,

  • just as we see in our bad printf call.
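
The off-by-one read described above can be sketched like this. The function name last_element is hypothetical, and the buggy access is left in a comment so that the code itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fills a heap block of n ints and returns its last valid element,
 * or -1 if malloc fails. With n == 4 and 32-bit ints the block is
 * 16 bytes, matching the size mentioned in the transcript. */
int last_element(size_t n)
{
    assert(n > 0);

    int *block = malloc(n * sizeof *block);
    if (block == NULL)
    {
        return -1;
    }

    for (size_t i = 0; i < n; i++)
    {
        block[i] = (int) i * 10;
    }

    /* A bug of the kind described above would be:
     *     int value = block[n];
     * which reads the int that starts 0 bytes past the end of the block. */
    int value = block[n - 1];   /* the last valid index is n - 1 */

    free(block);
    return value;
}
```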

  • Now, invalid reads might not seem like that big of a deal,

  • but if you're using that data to control the flow of your program--

  • for example, as part of an if statement or loop--

  • then things can silently go bad.

  • Watch how I can run the invalid_read program

  • and nothing out of the ordinary happens.

  • Scary, huh?

  • >> Now, let's look at some more kinds of errors that you might encounter in your code,

  • and we'll see how Valgrind detects them.

  • We just saw an example of an invalid read,

  • so now let's check out an invalid write.

  • Again, no errors or warnings in compilation.

  • Okay, Valgrind says that there are two errors in this program--

  • an invalid write and an invalid read.

  • Let's check out this code.

  • Looks like we've got an instance of the classic strlen plus one bug.

  • The code doesn't malloc an extra byte of space

  • for the \0 character,

  • so when strcpy went to write it at s[strlen("cs50 rocks!")],

  • it wrote 1 byte past the end of our block.

  • The invalid read comes when we make our call to printf.

  • printf ends up reading invalid memory when it reads the \0 character

  • as it looks for the end of the string it's printing.

  • But none of this escaped Valgrind.

  • We see that it caught the invalid write as part of the strcpy

  • on line 11 of main, and the invalid read is part of printf.

  • Rock on, Valgrind.

  • Again, this might not seem like a big deal.

  • We can run this program over and over outside of Valgrind

  • and not see any error symptoms.
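
To make the strlen plus one bug concrete, here is a minimal sketch of the corrected allocation. The helper name copy_string is an assumption for illustration; the video's program does this inline in main.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of src, or NULL if malloc fails.
 * The caller is responsible for freeing the result. */
char *copy_string(const char *src)
{
    assert(src != NULL);

    /* The buggy version would be malloc(strlen(src)): that leaves no room
     * for the terminating '\0', so strcpy writes 1 byte past the block. */
    char *s = malloc(strlen(src) + 1);
    if (s == NULL)
    {
        return NULL;
    }

    strcpy(s, src);   /* copies the characters and the '\0' */
    return s;
}
```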

  • >> However, let's look at a slight variation of this to see

  • how things can get really bad.

  • So, granted, we are abusing things more than just a bit in this code.

  • We're only allocating space on the heap for two strings

  • the length of cs50 rocks,

  • this time, remembering the \0 character.

  • But then we throw in a super-long string into the memory block

  • that s is pointing to.

  • What effect will that have on the memory block that t points to?

  • Well, if t points to memory that's just adjacent to s,

  • coming just after it,

  • then we might have written over part of t.

  • Let's run this code.

  • Look at what happened.

  • The strings we stored in our heap blocks both appeared to have printed out correctly.

  • Nothing seems wrong at all.

  • However, let's go back into our code and

  • comment out the line where we copy cs50 rocks

  • into the second memory block, pointed to by t.

  • Now, when we run this code we should

  • only see the contents of the first memory block print out.

  • Whoa, even though we didn't strcpy

  • any characters into the second heap block, the one pointed to by t,

  • we get a printout.

  • Indeed, the string we stuffed into our first block

  • overran the first block and into the second block,

  • making everything seem normal.

  • Valgrind, though, tells us the true story.

  • There we go.

  • All of those invalid reads and writes.
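
One way to keep a long string from spilling into a neighboring heap block is to bound the copy by the block's size. This is a different technique from anything shown in the video, which overflows on purpose; the helper name copy_bounded is hypothetical, and truncating the input is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Copies src into a new heap block of exactly size bytes, truncating
 * src if it does not fit. Returns NULL if malloc fails. The caller
 * frees the result. */
char *copy_bounded(const char *src, size_t size)
{
    assert(src != NULL && size > 0);

    char *s = malloc(size);
    if (s == NULL)
    {
        return NULL;
    }

    /* snprintf writes at most size bytes, including the '\0',
     * so it cannot run past the end of the block. */
    snprintf(s, size, "%s", src);
    return s;
}
```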

  • >> Let's look at an example of another kind of error.

  • Here we do something rather unfortunate.

  • We grab space for an int on the heap,

  • and we initialize an int pointer--p--to point to that space.

  • However, while our pointer is initialized,

  • the data that it's pointing to just has whatever junk is in that part of the heap.

  • So when we load that data into int i,

  • we technically initialize i,

  • but we do so with junk data.

  • The call to assert, which is a handy debugging macro

  • defined in the aptly named assert.h header,

  • will abort the program if its test condition fails.

  • That is, if i is not 0.

  • Depending on what was in the heap space, pointed to by p,

  • this program might work sometimes and fail at other times.

  • If it works, we're just getting lucky.

  • The compiler won't catch this error, but Valgrind sure will.

  • There we see the error stemming from our use of that junk data.
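
A minimal sketch of one possible fix for the junk-data problem follows. It assumes the intent is for the int to start at 0, and it swaps malloc for calloc, which zero-fills the block; the function name read_fresh_int is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates one int on the heap, reads it, and frees it.
 * Returns the value read, or -1 if the allocation fails. */
int read_fresh_int(void)
{
    /* malloc(sizeof(int)) would leave *p holding whatever junk was in
     * that part of the heap; calloc sets every byte of the block to 0. */
    int *p = calloc(1, sizeof *p);
    if (p == NULL)
    {
        return -1;
    }

    int i = *p;       /* i is now initialized with defined data */
    free(p);

    assert(i == 0);   /* no longer depends on luck */
    return i;
}
```

Assigning *p = 0 right after a plain malloc would work just as well; the point is only that the data is written before it is read.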

  • >> When you allocate heap memory but don't deallocate it or free it,

  • that is called a leak.

  • For a small, short-lived program that runs and immediately exits,

  • leaks are fairly harmless,

  • but for a project of larger size and/or longevity,

  • even a small leak can compound into something major.

  • For CS50, we do expect you to

  • take care of freeing all of the heap memory that you allocate,

  • since we want you to build the skills to properly handle the manual process

  • required by C.

  • To do so, your program should have an exact

  • one-to-one correspondence between malloc and free calls.

  • Fortunately, Valgrind can help you with memory leaks too.

  • Here is a leaky program called leak.c that allocates

  • space on the heap, writes to it, but doesn't free it.

  • We compile it with Make and run it under Valgrind,

  • and we see that, while we have no memory errors,

  • we do have one leak.

  • There are 16 bytes definitely lost,

  • meaning that the pointer to that memory wasn't in scope when the program exited.

  • Now, Valgrind doesn't give us a ton of information about the leak,

  • but if we follow this little note that it gives down towards the bottom of its report

  • to rerun with --leak-check=full

  • to see the full details of leaked memory,

  • we'll get more information.

  • Now, in the heap summary,

  • Valgrind tells us where the memory that was lost was initially allocated.

  • Just as we know from looking in the source code,

  • Valgrind informs us that we leaked the memory

  • allocated with a call to malloc on line 8 of leak.c

  • in the main function.

  • Pretty nifty.
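
Here is a rough sketch of a program shaped like the leaky one described above, with the missing free put back. The function names make_block and use_block are assumptions; the actual leak.c from the video is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates a zeroed block of n ints. Returns NULL if malloc fails.
 * Ownership passes to the caller, who must free the block. */
int *make_block(size_t n)
{
    assert(n > 0);

    int *block = malloc(n * sizeof *block);
    if (block == NULL)
    {
        return NULL;
    }

    for (size_t i = 0; i < n; i++)
    {
        block[i] = 0;
    }
    return block;
}

/* Writes to a 4-int block (16 bytes with 32-bit ints) and frees it.
 * Returns the value written, or -1 if the allocation fails. */
int use_block(void)
{
    int *block = make_block(4);
    if (block == NULL)
    {
        return -1;
    }

    block[0] = 50;
    int first = block[0];

    free(block);   /* leaving this line out would leak the whole block */
    return first;
}
```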

  • >> Valgrind categorizes leaks using these terms:

  • Definitely lost--this is heap allocated memory

  • to which the program no longer has a pointer.

  • Valgrind knows that you once had the pointer but have since lost track of it.

  • This memory is definitely leaked.

  • Indirectly lost--this is heap allocated memory

  • to which the only pointers are themselves lost.

  • For example, if you lost your pointer to the first node of a linked list,

  • then the first node itself would be definitely lost,

  • while any subsequent nodes would be indirectly lost.

  • Possibly lost--this is heap allocated memory

  • to which Valgrind cannot be sure whether there is a pointer or not.

  • Still reachable--this is heap allocated memory

  • to which the program still has a pointer at exit,

  • which typically means that a global variable points to it.

  • To check for these leaks, you'll also have to include the option

  • --show-reachable=yes

  • in your invocation of Valgrind.
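
The linked list example above can be sketched as follows. The node type and the names build_list and free_list are our own assumptions. If the program dropped the head pointer without calling free_list, the first node would be definitely lost and the nodes after it indirectly lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node
{
    int value;
    struct node *next;
} node;

/* Frees every node in the list and returns how many were freed. */
size_t free_list(node *head)
{
    size_t freed = 0;
    while (head != NULL)
    {
        node *next = head->next;   /* save the link before freeing */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}

/* Builds a list of n nodes by pushing 0, 1, ..., n - 1 onto the front,
 * so the head holds n - 1. Returns NULL if any malloc fails, after
 * freeing the nodes that were already built. */
node *build_list(size_t n)
{
    assert(n > 0);

    node *head = NULL;
    for (size_t i = 0; i < n; i++)
    {
        node *fresh = malloc(sizeof *fresh);
        if (fresh == NULL)
        {
            free_list(head);
            return NULL;
        }
        fresh->value = (int) i;
        fresh->next = head;
        head = fresh;
    }
    return head;
}
```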

  • >> These different cases might require different strategies for cleaning them up,

  • but leaks should be eliminated.

  • Unfortunately, fixing leaks can be hard to do,

  • since incorrect calls to free can blow up your program.

  • For example, if we look at invalid_free.c,

  • we see an example of bad memory deallocation.

  • What should be a single call to free the entire block

  • of memory pointed to by int_block,

  • has instead become an attempt to free each int-sized section

  • of the memory individually.

  • This will fail catastrophically.

  • Boom! What an error.

  • This is definitely not good.

  • If you're stuck with this kind of error, though, and you don't know where to look,

  • fall back on your new best friend.

  • You guessed it--Valgrind.

  • Valgrind, as always, knows exactly what's up.

  • The alloc and free counts don't match up.

  • We've got 1 alloc and 4 frees.

  • And Valgrind also tells us where the first bad free call--

  • the one that triggered the blowup--is coming from--

  • line 16.
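
As a sketch of the fix, the block gets exactly one call to free, on the pointer that malloc returned. The name int_block comes from the transcript; the function name sum_then_free and the rest of the code are assumptions, with the bad per-element frees left in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocates n ints, sets each to 1, sums them, and frees the block.
 * Returns the sum, or -1 if malloc fails. */
int sum_then_free(size_t n)
{
    assert(n > 0);

    int *int_block = malloc(n * sizeof *int_block);
    if (int_block == NULL)
    {
        return -1;
    }

    int sum = 0;
    for (size_t i = 0; i < n; i++)
    {
        int_block[i] = 1;
        sum += int_block[i];
    }

    /* The bad deallocation described above would look like:
     *     for (size_t i = 0; i < n; i++) free(&int_block[i]);
     * Only int_block itself was returned by malloc, so only it
     * may be passed to free, and only once. */
    free(int_block);
    return sum;
}
```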

  • As you see, bad calls to free are really bad,

  • so we recommend letting your program leak

  • while you're working on getting the functionality correct.

  • Start looking for leaks only after your program is working properly,

  • without any other errors.

  • >> And that's all we've got for this video.

  • Now what are you waiting for?

  • Go run Valgrind on your programs right now.

  • My name is Nate Hardison. This is CS50. [CS50.TV]
