So we're gonna be talking about memory layout in Swift. As I'm sure you know, Swift is Apple's brand new, magical, fancy programming language and I'm gonna go dive into it a little bit and talk about the bits and bytes and how it's all put together and what stuff looks like in memory when you actually run the code on a computer. Real brief about me, I'm online at mikeash.com, I have a blog where I do all sorts of crazy stuff like this, I like to take things apart and see how they tick and I've got a bunch of crazy Github projects which you should probably never use for anything, but are lots of fun to play with. I'm on Twitter if you feel like following me. There's a picture of my cat because, you know, the internet is all about cats and we're fundamentally all about the internet these days. I fly gliders, just point of information, it's a lot of fun. And the arrow is kinda pointing over here so that's me, you know, I always put a photo of myself on these slides and then afterwards I'm like, people can just look at me. So I decided I'd stop doing that. So here's the plan: first I just wanna give a quick overview of what memory is. I'm sure you all know what memory is, but it can help to get a little bit of perspective and just tear it down, you know, go to the foundations, revisit the fundamental stuff. Then I wrote a program that basically is where I generated all this information, it actually crawls through the program starting from a particular value and dumps out everything that's in those values in memory. And then finally I'm gonna actually dive into how Swift lays out stuff in memory, what that program actually produces and some contrast with how C does it and how C++ does it. So what is memory? Fundamentally, memory is what stops this from happening. So you gotta keep track of where you are essentially. 
You've got a computational process and you are at some state within that process at all times. And if you can't keep track of that then you will just never get anywhere. So we don't wanna just endlessly repeat, we wanna actually make progress and that's what this is all about. So figuring out how to actually build hardware which can remember things and store information and dig it out later is kinda one of the fundamental problems in computing and there's lots of technologies along the way. Started out with vacuum tubes. You can imagine these things are like this big and they're essentially like an incandescent light bulb and each one holds one bit. So if you wanna actually store some reasonable amount of data you're talking about a room full of incredibly hot equipment. Later on there were mercury delay lines, this is kind of a cool technology of a pipe, you basically fill it with mercury, you have a speaker or something like that on one end and something like a microphone on the other end and you pulse your data through it. And it takes time to travel and because of that you can fit stuff in and store your information that way. And there was a fun little proposal, somebody decided that gin would make a good medium for this, had all the right chemical properties and whatever. As far as I can tell nobody ever built that, but fun little aside. Magnetic core memory was an advancement of this stuff, it was a very neat technology, you got little rings of iron and you run wires through them and depending on the electrical current you send through them you can store data or retrieve data by storing it in the magnetic field in those rings. And so that was one ring per bit. And the state of the art of this in the 60s or 70s was basically a cube about this big could hold 32,000 bits of information and then you can imagine this thing I've got in my hand can hold many gigabytes memory, billions of bits. And so things have advanced a lot since then. 
So DRAM, dynamic RAM, basically silicon chips, is the state of the art today. Which we should all be incredibly thankful for because it really makes our lives a lot easier. The fact that we can have these allows us to store billions and billions of bits of information all at once. And my phone is misbehaving, if you can pardon me for just a moment here, there we go. Alright, so that's the hardware view of things, we don't really care too much about hardware most of the time if we're programming because that all just works, we ignore it. So how does it look for a programmer? The fundamental unit of information is the bit, that's a one or a zero. Traditionally we organize bits in groups of eight, those are bytes, and memory is essentially just a long sequence of bytes one after the other just heading off into the mist. And they're arranged in a certain order, every byte gets a number, that's called its address. So we start at zero and then one and then two and then three, and then on an 8,000,000,000 byte system we've got billions off in the distance. You can view these things in different ways, often we view it like this, organized by word instead of by byte. So a word is a vague term of art in computer science, but usually it means a unit that's the size of a pointer. So on modern devices it's 64 bits or eight bytes. And it just heads off into infinity. So here I've got the bytes addressed in steps of eight. And we like hexadecimal. Hexadecimal is where you've got base 16 instead of base 10. So zero through nine then A through F, and 16 is a nice power of two so everything fits together nicely, it's kind of the native language of computing. So it's the natural language to use here. And so I've got the addresses done in hexadecimal instead, zero, eight, 16 is 10, 24 is 18 and all that. And this is just the big picture of what this whole thing looks like. 
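To make the word-by-word addressing above concrete, here is a minimal sketch that prints the first few word addresses in decimal and in hexadecimal. It assumes an 8-byte word, as on the 64-bit machines discussed in the talk; the helper name hex is made up for this illustration.

```swift
// Word size assumed to be 8 bytes (64-bit), as in the talk.
let wordSize = 8

// Hypothetical helper: formats an integer as a hexadecimal string.
func hex(_ value: Int) -> String {
    return "0x" + String(value, radix: 16)
}

// Addresses of the first four words, in decimal and in hexadecimal.
for index in 0..<4 {
    let address = index * wordSize
    print(address, "=", hex(address))
}
// 0 = 0x0
// 8 = 0x8
// 16 = 0x10
// 24 = 0x18
```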
If you zoom out, this is a Mac running on x86-64, everything's a little bit different and it's all very platform specific, but essentially you've got a gap of stuff that doesn't exist, the first four gigabytes of memory is not mapped, this doesn't take up any physical space, it's just an addressing trick. Then you've got the user, your program essentially, your memory is the green stuff. So you get a chunk that's for you and then you've got a nice, big, empty space after that and then finally the kernel is down at the bottom. So you've got these two to the 64th power bytes which get sliced up and organized like this. And this is essentially how it looks if you zoom in a little bit, so this is the same picture as before except it's more realistic because instead of starting at zero we're starting at 4,000,000,000. We've got pointers in memory, I'm sure you're all familiar with the term pointer, references. A pointer at this level is just a number. And it's a number that just happens to correspond to the address of something else in memory. So here we've got this thing up there, that stores the address of this bit down there and I just indicate that with an arrow. The arrow doesn't exist in reality, it's just a number that we treat as if it were an arrow. And one more detail on all of this, whoops, went too far: most modern systems store things in little-endian order, which essentially means the least significant part of the number comes first. So it's as if you wrote 234 as 432, everything's backwards, it's just one of those things you kinda have to learn to live with and read it that way. Memory, as far as we see it, is organized into three fundamental parts. At the hardware level it's just a big list of stuff. But the way we actually treat it, parts of it have more specific purposes. 
So we've got the stack which is where all your local variables go when you write variables that you're using in your computations in a function, that goes on the stack and it's called a stack because every time you make a function call it adds that function's local variables to it. And when you make another call it goes up and another call it goes up and then when you return from a function it goes back down, back down, back down. So it's like a stack of things. You've also got global data in your program. Those are essentially things that came along as part of the program when you loaded it. So your global variables are part of that, your string constants, your type metadata in Swift, in other languages, just gets loaded as part of that. And then you've got the heap, and the heap is dynamically allocated stuff, when you create a new object, that allocates some memory that's on the heap. And these are things where basically they don't live permanently, they've got some lifetime but they're not tied to a function, they're not tied to your program, they come to life and go away kind of arbitrarily. And that's essentially everything else. When you create a new object, when you allocate memory manually, when you concatenate strings, whatever, that's all on the heap. So that's kinda the overview, remind you what memory is, what the whole deal is that we're talking about here. Let's get into dumping memory. Actually diving in and inspecting the contents, seeing what's actually in there. So I'm gonna explore this program that I wrote, how it works, that actually goes in and traces all this stuff out. If anyone wants to take a look at it I've got it up on Github, the Github address is there, or there's a tiny URL below, or if anyone likes really huge URLs you can use that one at the bottom, but I'm not gonna wait for you to type it out. Just real quick, this uses Xcode 8 and Swift 3. 
If anyone's doing anything with Swift so far you know that there have been a lot of versions of it and they like to break compatibility. So last year was Swift 2, starting a couple of weeks ago we've got Swift 3 now and that's different. So if you wanna use that code you need this. So back to this. The kind of fundamental unit of this program is a function that looks like this. And this is a function that works on an arbitrary type, it takes a value and it's gonna return an array of unsigned eight bit integers, or bytes. And we'll just use it as in the demo above, you create a variable containing something, any arbitrary thing, and then you just call this and it'll hand you back the bytes that make it up. And that's going to serve as the foundation for this whole program. And the question is, how do we write this in Swift specifically? This is a real quick overview of one possible implementation which is not what the program actually uses, but it kinda gets the ball rolling as far as how this works. So the idea is you get the value and you get its size. This is a generic function in Swift so it works on any type, but it knows what type it's being called with at any given time. The MemoryLayout type allows us to get the actual size, so that tells us how many bytes it is, so we know how long it is in memory. And then there's this built-in function in Swift called withUnsafePointer. So you call that, you give it a variable and it comes back and gives you a pointer to that variable. And once we have a pointer we can do things like look at that pointer as if it were a pointer of a different type. So imagine you have a pointer to an int, and we do this withMemoryRebound call here which says, okay, pretend that this is not a pointer to an integer, pretend this is a pointer to bytes, and just work with me on this, it's the same thing but it's a different type. Now go through and read it. 
So what this does, this takes a pointer to whatever arbitrary thing you've got and says, alright, just pretend it's raw bytes, interpret it that way. And then once we have that we can go through and just read one by one and that's a bit of a shortcut here, I just tell the system to read it for me, there's no loop or anything like that. UnsafeBufferPointer basically lets me say it's a container and then I can create an array from that and it all just kind of happens. Swift lets you write short stuff like that. So real quick demo of what this produces. I created a variable that contains one, two, three, four, five, six, seven, eight and another one that just contains 42, and if I print these things then I get these results here. So you can see the first one prints out exactly what we saw except it's backwards because remember modern systems tend to do things backwards. So it prints out eight, seven, six, five, four, three, two, one, and then 42 comes out as 42 with a bunch of zeros after it because it's a 64 bit integer. And just real quick, you don't need to actually follow this, this is just some code I wrote that I wanted to put up real quick. Hexadecimal is again the natural language of low level computing. So this just takes an array of bytes and dumps it out as a hexadecimal string instead of as a sequence of decimal integers like we saw before. So if we use that then we get this instead. Same basic thing except instead of printing decimal we get hex. So one, two, three, four, five, six, seven, eight comes out just as it did before and then 42 comes out as 2a, since that's what 42 is in hexadecimal, followed by a bunch of zeros. So this lets us dump this stuff out in a form that we can understand, but that's still close to what the computer has. Alright, so we've got this where we can take a value and see what's in it, but real programs are more complicated than single values, right, they've got a lot more going on. Real programs look more like this. 
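The slides are not part of the transcript, so here is a minimal sketch of the byte-dumping function and the hex formatter as they are described above. It is written for current Swift rather than the Swift 3 used in the talk, the names bytes(of:) and hexString are assumptions, and the printed results assume a little-endian machine.

```swift
// Returns the raw bytes that make up any value.
func bytes<T>(of value: T) -> [UInt8] {
    var value = value                       // mutable copy so we can take its address
    let size = MemoryLayout<T>.size         // how many bytes the value occupies
    return withUnsafePointer(to: &value) { pointer in
        // Pretend the pointer to T is a pointer to `size` individual bytes.
        pointer.withMemoryRebound(to: UInt8.self, capacity: size) { bytePointer in
            // Treat the bytes as a collection and copy them into an array.
            Array(UnsafeBufferPointer(start: bytePointer, count: size))
        }
    }
}

// Formats bytes as two hexadecimal digits each, with no separators.
func hexString(_ bytes: [UInt8]) -> String {
    return bytes.map { byte -> String in
        let digits = String(byte, radix: 16)
        return byte < 16 ? "0" + digits : digits
    }.joined()
}

let x: UInt64 = 0x0102030405060708
let y: UInt64 = 42
print(bytes(of: x))             // [8, 7, 6, 5, 4, 3, 2, 1] on little-endian hardware
print(hexString(bytes(of: x)))  // 0807060504030201
print(hexString(bytes(of: y)))  // 2a00000000000000
```

In current Swift, withUnsafeBytes(of:) does the same job with less ceremony; the longer form is kept here because it follows the steps named in the talk.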
Okay, you've got a value which contains a bunch of bytes and some of those bytes are actually pointers which point to other stuff and those point to other things and you get this whole tree thing going on. So we want to be able to actually chase all this stuff down in an automated fashion. The program needs to be able to actually find all of this stuff. So how do we do that? We start off with the knowledge that pointers are just integers, alright, a pointer is just an address, it's just a number which we interpret as another location in memory. So I wrote a quick struct which gets used in the program, all it does is contain an address which is an unsigned, pointer sized integer. And wrapping it in a struct helps me keep things apart so you don't confuse which parameters are actually integers and which parameters are integers which we are treating as pointers, right. Just make a new type so that the type system helps us write the program correctly. And then this bit of code essentially takes an array of bytes, which we already know how to obtain, we just wrote that function, and it takes that array of bytes and tries to scan for pointers in it. So again, a pointer is just an integer that you happen to treat as an address. And we can't know how stuff is being treated at this level because we just get a bunch of bytes and we don't know what they mean. So we're just gonna kind of optimistically go through and slice it up into chunks of eight bytes and pull them all out and pretend, say, what if these were pointers, what would that mean? So that's what this does, we take this array of bytes and we say, instead of treating it as an array of bytes, give me a pointer to its contents and treat it as a pointer to pointers, okay, which means that we can essentially go through and say, read the first pointer size chunk, read the second pointer size chunk, read the third pointer size chunk. And then we take all of that and return it as an array. 
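A minimal sketch of the wrapper struct and the pointer scan described above. The names Pointer and candidatePointers are assumptions. Instead of rebinding the array's storage to a pointer-to-pointers as the talk describes, this version assembles each word byte by byte, which avoids unsafe code but assumes little-endian byte order.

```swift
// Wraps an address so the type system keeps "integer" and
// "integer we treat as a pointer" apart.
struct Pointer: Hashable {
    var address: UInt
}

// Slices the bytes into pointer-sized chunks and returns each chunk as a
// candidate pointer. Leftover bytes at the end that do not fill a whole
// word are ignored.
func candidatePointers(in bytes: [UInt8]) -> [Pointer] {
    let wordSize = MemoryLayout<UInt>.size
    var result: [Pointer] = []
    var start = 0
    while start + wordSize <= bytes.count {
        var word: UInt = 0
        // Little-endian: the first byte is the least significant one.
        for offset in 0..<wordSize {
            word |= UInt(bytes[start + offset]) << UInt(8 * offset)
        }
        result.append(Pointer(address: word))
        start += wordSize
    }
    return result
}
```

Nothing here knows whether a chunk really is a pointer; every word is returned and it is up to the caller to find out which ones can actually be followed.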
So this code essentially will just go through this big array of bytes that we get from the thing before and divide it up like this. So this is a visual indication of what's going on with that code. So we give it a value, it returns a bunch of bytes then we go through, slice it up and get the individual pieces. And then we can start chasing those down. So we can read a value, grab all of its bytes, then we can grab all of the pointers that those bytes might indicate. And then we can take those pointers and repeat the process and essentially that gives you the tree, you can in a loop keep going through as long as you've got pointers to explore, you read their contents and then you spit them out. The problem with this approach is we don't know which pointers are actually pointers and which pointers are just integers, it might be the player's high score, it might be the number of people who dislike you or something like that and we don't know what they mean. And normally in a program when you try to read from a pointer that's not actually a pointer, it's just some illegal piece of memory, then your program crashes, which is good in normal code because you don't want to proceed when your program is that confused, you want it to just stop and produce a crash log or something like that. But in this code we want to be able to keep going so we can explore this stuff. So we wanna be able to read from pointers without crashing if they're bad. So on the Mac and on iOS we've got this nice low level function. Apple platforms use a Mach kernel, highly modified and with stuff added onto it, but the low level Mach calls are still there and there's a Mach call called mach_vm_read_overwrite. And essentially it's a system call where you give it two pointers and you say, I wanna copy this many bytes from that pointer to this pointer. 
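The read-scan-repeat loop described above can be sketched independently of how memory is actually read. In this sketch the reader is passed in as a closure that returns the pointer-sized words stored at an address, or nil when the address cannot be read; the function name crawl, the closure shape and the toy memory dictionary are all assumptions for illustration, not the program's real design.

```swift
// Follows candidate pointers starting at `root` and returns every address
// that could actually be read, in the order it was visited.
func crawl(from root: UInt, read: (UInt) -> [UInt]?) -> [UInt] {
    var seen: Set<UInt> = []        // addresses already tried, readable or not
    var visited: [UInt] = []        // addresses that were successfully read
    var pending: [UInt] = [root]    // candidates still to explore

    while let address = pending.popLast() {
        // Skip anything we have tried before so cycles terminate.
        guard seen.insert(address).inserted else { continue }
        // A failed read just means "not a real pointer": keep going.
        guard let words = read(address) else { continue }
        visited.append(address)
        pending.append(contentsOf: words)
    }
    return visited
}

// Toy "memory": 100 points to 200 and holds the plain integer 7,
// 200 points to 300 and back to 100, 300 holds nothing.
let memory: [UInt: [UInt]] = [100: [200, 7], 200: [300, 100], 300: []]
print(crawl(from: 100) { memory[$0] })   // [100, 200, 300]
```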
If you're familiar with the memcpy function from the C standard library, mach_vm_read_overwrite is exactly like that except that if you give memcpy a bad pointer your program crashes and if you give mach_vm_read_overwrite a bad pointer it's okay, it returns an error because it's a system call, it happens at the kernel level, the kernel can do all this checking safely and so it can come back and say, I couldn't do that because that is not a real pointer, that was just a bunch of junk and the address there doesn't exist. And so based on that we can go through and reliably follow this tree without crashing because we can essentially optimistically try every pointer, pass it to this function and then if it comes back and says there was an error, that's fine, we just say, okay, couldn't follow that, keep on going. Real quick, this is just the function prototype, what it looks like. It takes a task, which is like a process, if you've got the right permissions on the Mac you can actually read from other processes, not just your own, which is sort of the foundation of how you can build a debugger. It takes an address, it takes a length, it takes a destination address and it takes a pointer to something where it will tell you how many bytes it actually read. So back at the beginning I showed a function that would read from a pointer, but it would crash if you gave it a bad pointer, this will read from a pointer safely. So essentially it's just a wrapper around that Mach call. It takes the pointer you give it, it does a little bit of casting to get it into the form that the system wants and then it just makes that call. If it succeeds it returns true and says, hey, we did it, and if it doesn't then it returns false, so the caller can know that it didn't work. And so that way based on this we can build this whole recursive scanning system. Let's see, there we go, alright. So we can read this stuff safely, but we need to know how much to read. 
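A minimal sketch of such a safe-read wrapper, for Apple platforms only (the whole block is compiled out elsewhere). The function name safeRead and its shape, returning the bytes or nil rather than true or false, are assumptions; only mach_vm_read_overwrite, mach_task_self_ and KERN_SUCCESS come from the system.

```swift
#if canImport(Darwin)
import Foundation

// Tries to copy `count` bytes starting at `address` in the current process.
// Returns nil instead of crashing when the range is not readable.
func safeRead(from address: UInt, count: Int) -> [UInt8]? {
    var buffer = [UInt8](repeating: 0, count: count)
    var bytesRead: mach_vm_size_t = 0
    let status = buffer.withUnsafeMutableBytes { destination -> kern_return_t in
        mach_vm_read_overwrite(
            mach_task_self_,                                   // task: our own process
            mach_vm_address_t(address),                        // source address
            mach_vm_size_t(count),                             // number of bytes to copy
            mach_vm_address_t(UInt(bitPattern: destination.baseAddress)), // destination
            &bytesRead)                                        // out: bytes actually copied
    }
    guard status == KERN_SUCCESS, bytesRead == mach_vm_size_t(count) else {
        return nil
    }
    return buffer
}

// A real address reads fine; a junk address in the unmapped first
// four gigabytes comes back as nil rather than crashing.
let source: [UInt8] = [1, 2, 3, 4]
let copy = source.withUnsafeBytes {
    safeRead(from: UInt(bitPattern: $0.baseAddress), count: 4)
}
print(copy as Any)
print(safeRead(from: 8, count: 8) as Any)
#endif
```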
For the first value we read we can get the size of the type because we know the type at compile time, it's a generic function, we get that metadata from the compiler, but after we start chasing pointers we can't do that anymore because we're dealing with arbitrary bags of bytes, we don't know what this stuff is, so we need to be able to at least guess how many bytes to read at any given time when we chase these pointers through. For stuff that's on the heap, there's the malloc_size function, at least on the Mac and on iOS, where you give it a pointer and it comes back and says, there were actually 32 bytes allocated on the heap here. So we can call that and it comes back and tells us exactly how much we can read. Which is great. And even better, this function is tolerant of bad data, so if you give it something that's not a pointer or you give it a pointer to something that's legitimate, but not allocated on the heap, or you give it a pointer to something in the middle of something else, whatever, it doesn't care, it'll give you back zero. So it doesn't crash, which is really convenient for our purposes. And finally, we've got global variables, code, things like that. Those are symbols in your app, and there's the dladdr function where you give it an address and it comes back and tells you what symbol is nearby. And so we can use that to check to see if something is actually a symbol and we can also use it to kind of extract the size by essentially scanning. It gives you the symbol that comes immediately before the pointer you give it. So you start from here and you say, give me the symbol information and if it comes back and says, yes, I have symbol information, then advance it by one byte and say, how 'bout here, how 'bout here, how 'bout here, and just keep doing that until it gives you a different answer and then you know exactly how long that thing was. And as a bonus, it also gives you the names. 
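A minimal sketch of both size queries, for Apple platforms only. The helper names heapSize, symbol and symbolLength are assumptions, as is the scan limit of 4096 bytes; malloc_size and dladdr are the real system calls. The byte-by-byte scan mirrors the "how 'bout here" trick described above.

```swift
#if canImport(Darwin)
import Foundation

// Size of the heap allocation starting at `address`, or 0 if there is none.
func heapSize(at address: UInt) -> Int {
    return malloc_size(UnsafeRawPointer(bitPattern: address))
}

// Name and start address of the symbol at or just before `address`.
func symbol(at address: UInt) -> (name: String, start: UInt)? {
    var info = Dl_info()
    guard dladdr(UnsafeRawPointer(bitPattern: address), &info) != 0,
          let name = info.dli_sname,
          let start = info.dli_saddr else { return nil }
    return (String(cString: name), UInt(bitPattern: start))
}

// Estimates how long a symbol is by stepping forward one byte at a time
// until dladdr reports a different symbol (or the assumed limit is hit).
func symbolLength(at address: UInt, limit: Int = 4096) -> Int? {
    guard let first = symbol(at: address) else { return nil }
    var length = 1
    while length < limit,
          let next = symbol(at: address + UInt(length)),
          next.start == first.start {
        length += 1
    }
    return length
}

// Usage: ask about a 32-byte heap block and about the malloc function itself.
let block = malloc(32)!
print(heapSize(at: UInt(bitPattern: block)))   // at least 32
free(block)
if let address = dlsym(dlopen(nil, RTLD_NOW), "malloc") {
    print(symbol(at: UInt(bitPattern: address)) as Any)
}
#endif
```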
So your function names, your global variable names, things like that, those all pop out of this API, and so we can use them to annotate our scan and help us understand what's going on. Those names in Swift and also in C++ tend to come up mangled because the compiler tries to embed information about what the type is besides just the name. So in C for example, if you have a function called summon, the symbol name that it spits out just says summon, and in Swift if you have a function called summon the symbol name comes out more like this where you've got a bunch of extra stuff on it because it will not only include that name, but it will also include the fact that it takes two integers and returns a string or whatever it actually is. So in order to help with that there's the swift-demangle command that comes with Xcode. I imagine it's available in the Swift open source tools as well. You give it a mangled symbol and it comes back with something like this which is more readable. So in my code I just dump everything through that. swift-demangle is a very nice program because if you give it something it doesn't understand it just gives it back to you unmodified. So I could just feed everything through it without having to fear that it would explode or crash or something like that on data that wasn't actually mangled Swift symbols. And then C++ has the same thing, there's a tool called c++filt which does the same job for C++ names and it has the same semantics where if you give it something it doesn't understand it gives it back to you without changing it. So I could just pass every name that I came across to these two tools. A lot of the data that we come across in memory is actually strings, alright, textual information like method names, like user input, and it's useful to be able to find these. 
And the trouble is again, we're working with these bags of bytes, we don't know what's going on with them, they're just a sequence of data and we want to be able to at least guess at which sequences of data actually represent text and which don't. And there's no way to do this reliably, but a decent heuristic is to look for ASCII characters and look for printable ASCII characters, so zero through 31 in ASCII are control characters which we don't expect to find as part of text in a program, at least not the text that we're interested in. And then stuff beyond 126 is either the delete character in ASCII, or it's non-ASCII characters. So we look for printable ASCII characters and we look for sequences of at least four. So the idea is that if you just have one or two or three then it's likely that's just some other data that just coincidentally happened to look like text. And once you get up to four there's a decent chance that it's something textually interesting, and it's not a guarantee, but it's a decent heuristic, it gives you good results. And this is code that just goes through and implements that heuristic here. You give it an array of bytes and it goes through, it splits that array into chunks of continuous printable ASCII characters and then filters out all the short ones and gives you back the long ones. So we can run this on the byte arrays that we get out of the scanner to see what's going on in them. Alright, so those are the foundations of the program, there's a bunch of bookkeeping that goes on in it if you're interested in that part look it up on Github, but those are the fundamental pieces and we now know how to build all of that. And so we can read all of this stuff, but we wanna be able to actually output it in some form that's nice for the human to look at. So we could just dump it all in text form or something like that, but it's gonna take a lot of work to interpret. Ideally we want something more like this. 
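Going back to the string heuristic described earlier in this passage, here is a minimal sketch of it: split the bytes wherever a non-printable byte occurs, then keep only runs of at least four characters. The function name printableStrings and the minimumLength parameter are assumptions.

```swift
// Finds runs of printable ASCII (32 through 126) that are long enough
// to plausibly be text rather than a coincidence.
func printableStrings(in bytes: [UInt8], minimumLength: Int = 4) -> [String] {
    let printable: ClosedRange<UInt8> = 32...126
    return bytes
        .split(whereSeparator: { !printable.contains($0) })  // break at non-printable bytes
        .filter { $0.count >= minimumLength }                // drop the short runs
        .map { String(decoding: $0, as: UTF8.self) }         // ASCII is valid UTF-8
}

var sample: [UInt8] = Array("abc".utf8)   // too short, will be dropped
sample.append(0)
sample.append(contentsOf: Array("hello".utf8))
sample.append(contentsOf: [1, 200])
sample.append(contentsOf: Array("wxyz".utf8))
print(printableStrings(in: sample))       // ["hello", "wxyz"]
```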
And as an intermediate step I produce something like this which is not very readable at all. But this is an open source program called Graphviz and essentially you give it a list of nodes and you give it a list of connections and you say this node has this label and it's connected to these nodes and this node has this label and it's connected to these nodes. And when you hand it over to that program it hands you back stuff like this which is really cool and readable and you can go through and look. This is, I wrote a simple C program that creates a little structure in memory and then hands it off to my dumper program and that generates the Graphviz stuff and then Graphviz turns it into a PDF which looks like this. So we can go through, we can read, we can see we started off with a pointer up at the top, that pointer points to some malloc memory which contains this and those point to more malloc memory which point to more malloc memory and we've got a couple of strings at the bottom and we can go through and you can just see this whole structure visually, which is cool, so that helps us figure out what's going on. So that's the theory of how we're looking at these things. So let's actually go through and look at them and see what's going on with this stuff. How does Swift represent things in memory? How does C represent things in memory? How does C++ represent things? So quick notes, this is all very architecture specific, I did this stuff on Mac on x86-64, iOS 64 bit is likely to be very similar, Swift on Linux 64 bit is likely to be similar, but this is stuff that's very useful for debugging, it's very useful for understanding how the system works, it is not a good idea to write any code that relies on this stuff unless it's kind of a hobby project or an experimental thing. You don't wanna write any production code that relies on this stuff because offsets, sizes, the meaning of various fields is all subject to change from one release to the next. 
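As a rough idea of the intermediate Graphviz text mentioned above, here is a minimal sketch that turns a list of nodes and connections into DOT source. The MemoryNode type, its fields and the function name dot(for:) are hypothetical and are not taken from the actual dumper program.

```swift
// Hypothetical description of one region of memory in the scan.
struct MemoryNode {
    var id: String          // unique node name, e.g. derived from the address
    var label: String       // text shown inside the bubble
    var pointsTo: [String]  // ids of the nodes this region points to
}

// Emits a Graphviz "digraph": first every node with its label,
// then one edge per pointer.
func dot(for nodes: [MemoryNode]) -> String {
    var lines = ["digraph memory {"]
    for node in nodes {
        lines.append("    \"\(node.id)\" [label=\"\(node.label)\"];")
    }
    for node in nodes {
        for target in node.pointsTo {
            lines.append("    \"\(node.id)\" -> \"\(target)\";")
        }
    }
    lines.append("}")
    return lines.joined(separator: "\n")
}

let graph = dot(for: [
    MemoryNode(id: "n0", label: "root", pointsTo: ["n1"]),
    MemoryNode(id: "n1", label: "malloc 32 bytes", pointsTo: []),
])
print(graph)
```

Saved to a file, text like this can be rendered with Graphviz's dot tool, for example dot -Tpdf graph.dot -o graph.pdf.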
So it's really useful stuff, but you don't want to incorporate this into that library that you're writing for work that's gonna ship to users. Oh, my phone is not cooperating with me today. There we go. Alright, let's take a look at some C structs. C is very simple in how it lays things out in memory, that's kind of its appeal. And we'll take a look at this real quick, I made a C struct which just contains three long fields, x, y and z, I wrote a little bit of code that fills them with one, two and three and then I dumped out that memory using my nice graphical dumper and that's what we get up here in the bubble. And you can see that it essentially just lays them out sequentially. We've got one followed by two followed by three and there's a bunch of empty space because long is an eight byte value and these are small numbers, so they have a lot of leading zeroes and it just puts them out one by one. It gets more interesting when you get different sizes. So this is a struct that has a bunch of fields of different sizes, a through h, some of them are one byte, that's a character, some of them are two bytes, those are shorts, some of them are four, that's integers, and some of them are eight bytes, that's long. And again, the compiler just lays them out one by one, you can see one, two, three, four, five, six, seven, eight, but if you look closely you'll see that some of them take up more space than they maybe ought to. Number three for example, three is one byte, corresponds to c, that's a one byte field, but if you look here you've got three followed by zero followed by four, so there's extra space in there. The reason for that is that struct fields get padded. The idea is that it's more efficient to access data when it's on a memory address which is divisible by its size, or at least which is divisible by whatever the hardware architecture likes for it to have. Typically it's its size. 
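The padding rule can be worked through by hand: each field's offset is rounded up to a multiple of its alignment before the field is placed. The sketch below assumes every field is aligned to its own size, which is the typical case mentioned above, and uses a made-up field list (char, short, char, int, long) rather than the exact struct from the slide.

```swift
// Rounds `offset` up to the next multiple of `alignment`.
func roundUp(_ offset: Int, toMultipleOf alignment: Int) -> Int {
    return (offset + alignment - 1) / alignment * alignment
}

// Computes C-style field offsets and total size for fields given by size,
// assuming each field's alignment equals its size.
func layout(fieldSizes: [Int]) -> (offsets: [Int], size: Int) {
    var offsets: [Int] = []
    var next = 0            // first free byte after the previous field
    var maxAlignment = 1
    for size in fieldSizes {
        let alignment = size
        next = roundUp(next, toMultipleOf: alignment)   // insert padding if needed
        offsets.append(next)
        next += size
        maxAlignment = max(maxAlignment, alignment)
    }
    // The whole struct is padded out to its strictest field alignment.
    return (offsets, roundUp(next, toMultipleOf: maxAlignment))
}

// char, short, char, int, long
let result = layout(fieldSizes: [1, 2, 1, 4, 8])
print(result.offsets)   // [0, 2, 4, 8, 16]
print(result.size)      // 24
```

The third field sits at offset 4 and the next one at offset 8, so three bytes of padding follow it, which is the kind of gap visible after the value three in the dump.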
So a two byte value wants to be on an even numbered address, a four byte value wants to be on an address divisible by four. And what the compiler does, this essentially wastes memory, but it trades off memory against time by expanding these fields a little bit, adding some space between them when necessary to make sure they all line up nicely so that they're fast to access. And that's really it for C. C kinda has structs and that's about it and it just lays things out sequentially and there's no metadata, there's no implicit pointers or anything like that. With C, what you see is what you get. C++ gets more interesting though, here's a simple C++ class, I've got three virtual methods on it. It's got one field, I create one and initialize it and dump it out, and this is what I get. So we can see now it's not just one bubble, it's got a bunch of different stuff. And I'll zoom it in so we can actually see what's going on here. Up at the top is the actual object and that's the thing I created which, in its single field, contains the value one. And we can see that it's got more than that. So it just had one field, but here we've got more stuff at the top. And the program explored this and found that that thing at the top is a pointer which points here, and then that points to more stuff. And so that thing at the top is a vtable pointer. So in C++ the way you do virtual method dispatch is the first pointer sized chunk of an object is a pointer to a vtable, which is a table of function pointers. So when you call through to something like object.x, what it actually does is it uses that table to look up the implementation of x for that object. And that's how inheritance is implemented. If you subclass something and override, then that generates a new table and that new table contains new entries for those method implementations so that the code knows what it needs to call. So here's an example of that. 
Quick C++ subclass, it inherits from the previous one, it adds a new field, it adds a couple of new methods. And when you dump that out you get a little more stuff. And again, I'll zoom in. So here we've got the object at the top, like before you've got this vtable pointer and then you've got the fields. And if you'll remember, field number one was from the super class, field number two is from the subclass, it just puts them sequentially. So the idea is that when the super class is doing stuff it can look at it and it sees what it thinks is itself and then the subclass data gets laid out afterwards so there's no conflict there, but they're just efficiently packed in memory just the same. And then the vtable for the subclass gets longer because there are five methods now, we had three from the super class, two from the subclass and then it just lists them sequentially. So every method just gets an index in this table. And the subclasses get the same table as their super class, except they can be potentially longer if there are more methods added and entries get replaced to indicate overriding. Let's take a look at multiple inheritance, this is where things get interesting. C++ allows a class to subclass multiple classes simultaneously. So here's a second super class to go along with our original. And here is a subclass of both. So each super class has some methods, each super class has a field, the subclass has a field, create it, fill it out with some data and this is what we get. It's a little bit more complicated. The good news is that most of that is runtime type information stuff that we can kind of not look at too hard. Let's zoom in and see what's going on. So again, the object is at the top and we can see that it starts out similar. So it's got a vtable pointer followed by that first field, which is one, but then something interesting happens. 
Instead of doing just one, two, three, laying out all the fields sequentially, we get another vtable pointer right in the middle of the object. And so this is how C++ implements multiple inheritance. We've got one vtable pointer at the top, we've got another one over there. And the idea is that it's kind of like two objects glued together. So if you take this first one here, that's the vtable that indicates it's an instance of that first super class, and then the second super class gets laid out below it. And what happens normally in C and with simple C++ classes, if you cast between types, say you've got an instance of a subclass and you say, treat this as if it were an instance of its super class, this is just some bookkeeping trickery, right, you've got the exact same pointer and you just say, okay, pretend this means something else. But when you get multiple inheritance involved suddenly things get a little more complicated. And if you say, take this pointer and interpret it as a pointer to its second super class, what it will actually do is it will move that pointer a little bit. So in this case it's going to add 16 to that address and give you a pointer into the middle of this object. And because that vtable is right there, it all just kinda works out. And it's a bit of a crazy system, but it gets the job done. And so you can see the effect here where you've got essentially the vtable for the subclass and each part of this object, you've got two vtable pointers in the object, each one points to a different part of this vtable and everything just kinda lines up with these multiple super classes so it all just works out. Lots of compile time trickery and then the end effect is at runtime everything is nicely laid out, friendly, and quick. Friendly for the computer, not for us, but that's usually okay. So that's C++, you get crazy stuff with multiple inheritance, but it's usually straightforward.
Again, you get that vtable at the top which tells you what kind of object it is and then all the fields are just laid out. Sometimes you get padding depending on their sizes, but it's just one after the other, after the other. Just in line. So let's move on to Swift now. And Swift starts out very simple like C and like C++. So just to get the ball rolling I created an empty struct and you'll never guess what it looks like, an empty struct contains nothing at all, it's a zero size object. Interesting feature of this, it does still have an address in memory even though it doesn't contain anything. The compiler still gives it an address which I thought was kinda funny. It probably doesn't make a whole lot of sense for the compiler to optimize for zero size structs since we don't use those very much. Move on to a more realistic example, more useful example, here's a struct with three fields. This is essentially the Swift equivalent of that C example I did at the beginning. Three fields, one, two and three, and this is the result, the output, the way it's laid out in memory looks a lot like the way it was laid out in memory in C. And in fact it doesn't just look like it, it is exactly the same. These two are laid out in exactly the same way. So Swift is just laying it out one, two, three in a row like that. There's no fancy metadata going on, there's no extra stuff, it's just your fields. And then I did the same thing that I did before with the multiple sizes. And again, we get the exact same result. So this is a complicated struct with different fields of different sizes and the output is exactly the same as it was in C. With one exception, you'll notice that after three, you get one, two, three and then there's this five f thing before four, that's just because the padding that gets inserted does not have to contain any particular value because it doesn't mean anything. The padding is ignored.
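The struct layouts above can be checked without a memory dumper by asking `MemoryLayout` for sizes and field offsets. This is a small sketch, not the code from the slides: the type names and the exact field types (`UInt8`, `UInt16`, `UInt32`, `Int64`) are assumptions chosen so that padding shows up, and the numbers reflect what current Swift compilers do rather than a documented guarantee.

```swift
// Illustrative types; the talk's actual field types are not known.
struct Empty {}

struct ThreeFields {
    var a: Int64 = 1
    var b: Int64 = 2
    var c: Int64 = 3
}

struct Mixed {
    var a: UInt8 = 1    // offset 0
    var b: UInt16 = 2   // offset 2, after 1 byte of padding
    var c: UInt8 = 3    // offset 4
    var d: UInt32 = 4   // offset 8, after 3 bytes of padding
}

// An empty struct occupies no bytes, but array elements still advance by 1.
print(MemoryLayout<Empty>.size)                    // prints 0
print(MemoryLayout<Empty>.stride)                  // prints 1

// Three 8-byte fields in a row, with no header and no metadata.
print(MemoryLayout<ThreeFields>.size)              // prints 24

// Mixed sizes: each field is pushed up to a multiple of its own alignment.
print(MemoryLayout<Mixed>.offset(of: \Mixed.b)!)   // prints 2
print(MemoryLayout<Mixed>.offset(of: \Mixed.c)!)   // prints 4
print(MemoryLayout<Mixed>.offset(of: \Mixed.d)!)   // prints 8
print(MemoryLayout<Mixed>.size)                    // prints 12
```

The gaps at offsets 1 and 5 through 7 are the padding bytes, and nothing ever reads them, which is why they can hold whatever happened to be in memory.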
So before when I ran the thing on C it just happened to contain zero and when I ran it on this it just happened to contain five f. So this is kinda like the junk DNA inside your program. But again, it's just laid out exactly the same way C is, so there's no overhead, it's very straightforward. Let's look at a little more complicated thing, let's see how a Swift class looks. So simple thing, complicated result. It's not as bad as it looks. Essentially what you're getting in there is that Swift has this whole hierarchy of stuff and it knows what types mean at runtime. And I wanna zoom in a little bit so we can see the object, but what all this other stuff is, essentially it's saying that your class is actually a subclass of this hidden SwiftObject class and then that class has a metaclass and all that stuff. So there's all this metadata that's going on that you can use to inspect objects and things like that. But we can mostly ignore it. So if we ignore all that other stuff and kind of zoom in, we look at the instance of the object here and we can see the data laid out in memory, one, two, three, and there's a header above it which is similar to the way C++ was. In C++ we had a vtable and then there were the fields, and in Swift we have an isa pointer, which is essentially the Swift equivalent of a vtable pointer, it points to the object's class, then we've got some other stuff which I'll talk about in a moment and then you've got the fields. So you've got the same arrangement of a header followed by the fields just packed in memory. Nice, linear, fast, hardware friendly. And let's take a look at a little bit more complicated class, this is the class equivalent of that struct that I showed. And it ends up being the exact same thing with that sort of header put at the top. So you've got that isa pointer, you've got this other stuff which I'll get to. And then all the fields are just laid out the exact same way they would be in a struct.
Sequentially with some padding to make everything line up nicely. So this is sort of the visual representation, the abstract representation 'cause those hexadecimal things get painful to read after a while. So this is what they mean if you actually go in and interpret it. You've got the isa pointer, that other header field that I didn't mention yet, those are retain counts, you may or may not know Swift operates using automatic reference counting. So it needs to count the number of references to each object and in Swift those counts are stored in the object itself as that second header field. And then your stored properties just get laid out after that, the compiler just puts them one by one. And I did say retain counts, plural, so there are actually two counts in a Swift object, this is an interesting little feature of the way the system works. There's the strong count and the weak count. So when you make a normal reference to a Swift object that increments the strong count and then if you make a weak reference to an object that increments the weak count. And the idea is that when the strong count goes to zero, if the weak count is non-zero then the object is destroyed, but it's not deallocated and that could be a talk by itself. I've got a blog article about it if anyone cares about exactly how that works. But that's essentially what we're seeing there. So there are two separate counts packed into the same field. Each one I think is like 31 bits or something like that. And then let's look at that isa structure. So in C++ the vtable was just a list of method pointers. In Swift it's a little bit more complicated, partly because of Objective-C interop. Swift has to work with Apple's Objective-C stuff and in fact all Swift classes in memory are also Objective-C classes. This fact is hidden from us sometimes. If you explicitly subclass an Objective-C class then you can see it, if you use the @objc annotation you can see it.
But even if you do none of that and you do what looks like a pure Swift class, it's actually an Objective-C class just the same. And just to be a little bit more accurate, the first part of an object is not necessarily the isa pointer, sometimes it's the isa pointer along with some other junk. This is just a way to sort of efficiently pack some metadata in there. Apple does this on iOS 64 bit, I don't believe they do this on the Mac currently. This is all subject to change, but basically they can put little extra bits of information in there like whether this object has ever had any associated objects with it that need to be cleaned up when it's deallocated and things like that. Just real quick detail there. So what do these class structures look like? Since every Swift class is also an Objective-C class, that means that we can look at Objective-C class definitions to see what's going on. And Objective-C class definitions are part of the Objective-C runtime which is open source, that's convenient. So we can just look in runtime.h in the open source dump there. And if we look there and we see what's going on, this is what we get. So every class is also a valid object in memory. So if you remember an object starts with an isa pointer, so that means every class starts with an isa pointer as well. So every class is also an object, a class has a class, that's called the metaclass and you can follow that rabbit hole all the way down until you get very confused. The class also stores its super class. So that allows you to follow the chain up and essentially explore the class hierarchy. The class stores its name, it stores a bunch of other stuff, it stores how big its objects are, it stores a list of instance variables and methods and it's got a cache which speeds up method dispatch. And then Swift classes take all of that and they add more stuff because Swift has more stuff going on.
So if you look in the Swift open source to see what's involved there, we've got some flags, we've got these offsets, a lot of bizarre stuff. But essentially a Swift class is the Objective-C class with more stuff on the end. And then, this is an interesting part, after all of those fields it's a list of methods again, an array of method implementations. So essentially it's the C++ vtable approach again with some extra stuff at the top that we can ignore. And so what that means is that when you do a method call in Swift it translates into essentially an array lookup. So you write obj.method up here and that translates into code like this down here. So essentially you take the object, you get that isa field out of it and then you just index into it to get the method pointer and then you jump to it. You essentially make a function call based on that. And so it's quick, it's efficient at runtime. Let's take a look at what an object looks like when you subclass a bunch of stuff. So I made a class, a subclass, a subclass of that and so forth four levels deep. And it looks exactly the same. So you've got that isa pointer at the top which tells you what it is, you've got the retain counts below that and then the fields of all those classes just get laid out sequentially. Just like in C++ we saw before. So at runtime it's very simple. Even though the class hierarchy we looked at was kind of long and complex. Let's take a look at arrays in Swift. Arrays in Swift are value types, which means that they act like primitives, essentially. When you assign x equals y, that conceptually creates a new array which is totally separate from the original. This dump reveals that's essentially a lie, they are, in fact, reference types under the hood, the way they're implemented. So this array, one, two, three, four, five, if you actually look at it it's just a single pointer. And that points to one, two, three, four and then five after that which ran off the end so it got truncated.
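A quick way to see the single pointer for yourself is to compare the size of an array value with the size of a pointer, and to compare storage addresses before and after a mutation. This is a sketch under assumptions: `storageAddress` is a helper invented here, and the address comparison relies on how the standard library's native array storage behaves today rather than on anything the language promises.

```swift
/// Hypothetical helper: address of the array's element storage, for comparison only.
func storageAddress(_ array: [Int]) -> UnsafeRawPointer? {
    return array.withUnsafeBufferPointer { UnsafeRawPointer($0.baseAddress) }
}

let original = [1, 2, 3, 4, 5]
var copy = original   // conceptually a brand new array

// The array value itself is one pointer wide, however many elements it holds.
print(MemoryLayout<[Int]>.size == MemoryLayout<UnsafeRawPointer>.size)  // prints true

// Right after assignment, both values point at the same storage.
let sharedBefore = storageAddress(original) == storageAddress(copy)
print(sharedBefore)   // prints true

// The first mutation notices the storage is shared and copies it.
copy[0] = 99
let sharedAfter = storageAddress(original) == storageAddress(copy)
print(sharedAfter)    // prints false

print(original[0], copy[0])   // prints "1 99"
```

So the value semantics are real from the outside, and the sharing only shows up when you go looking at addresses.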
And so what's going on with that is every array that you work with is actually a pointer to the storage and when you make a new array you just get a new pointer to the storage, nothing really happens and it's only when you actually modify it, it will go in and it will see, oh, someone else references this, I will create a copy and then modify that copy. So it still references under the hood, you just don't see it until you run a program like this, then you see it. Let's take a look at protocol types, this is an interesting aspect of Swift. So here's a Swift protocol, it's got three methods in it. Here is a struct, which holds three instances of that protocol, right, you can use a protocol as a type itself and that can hold an instance of anything that implements that protocol. Here is a struct which implements it, it's just got empty implementations of those three methods. It's also got a field which is just an integer containing the strange hex value, that hex value will spell out the word small in ASCII, basically that's there so that when I do the dump we can identify it because it will search for that string, it will show it in the printing. Here's another struct, this is a larger one, it's got four fields. The first one spells out large and the other ones just contain one, two, and three repeated just so that they show up nicely. And finally, here is a class, if this wants to advance, yeah, my wifi is not cooperating. There's a class which, same as the struct essentially except it spells out class instead. And so we wanna see how these get represented. So here we create an instance of that protocol holder containing one instance of the small struct, one instance of the large struct and one instance of that class. And if we dump it out here's what we get. The larger view of this is very complicated, but we can see that struct in the list of strings that it found, it found small. 
So we can tell from this that that small struct actually gets stored inline, that field of protocol type is able to store that struct inline, but the large struct does not get stored inline and of course the class doesn't because the class is a reference. And where did that large struct go? Because structs normally get stored inline, but this one was large, it ends up getting stored off on the side here if you chase the arrows around. And essentially what happens is it's too big to fit. The compiler can't know how big these things are gonna be so it places an arbitrary size limit and when you go over that limit then the compiler behind your back boxes it up, allocates something dynamically and stores it over here. So here large gets stored off in the weeds somewhere. If you chase it down you can actually look at how these things are implemented. A value of protocol type holds five fields. It holds three arbitrary data fields and then it holds some type metadata which essentially tells you what it is and then it holds a witness table which is like a vtable for the protocol. And those three data fields are given over to whatever the type needs them for. So if you've got a struct which holds that much stuff or less it gets stored inline very efficiently and everything is quick and as soon as you go over that limit, suddenly it has to get broken out, it gets boxed up, it gets allocated dynamically and you lose a lot of efficiency. And this is all hidden from you, you don't notice it until your code gets slow. So the witness table is basically a vtable, it's just an array of implementations just like the C++ vtable. And so that means that when you make a method call on a protocol type it looks a lot like a method call on an object 'cause you've got this special table just for the protocol. So when you make a call you get a protocol type like that, you do p.g, make a call, it translates into something like this.
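The five fields can be seen directly by reinterpreting a protocol-typed value as five machine words. This is only a sketch and not the talk's own code: `P`, `Small`, `Large` and `Words` are names made up here, the marker constants are chosen so their bytes spell "small" and "large" on a little-endian machine, and the whole thing assumes a 64-bit platform and the container layout current compilers use.

```swift
protocol P {
    func f()
    func g()
    func h()
}

struct Small: P {
    var marker: UInt = 0x6c6c616d73   // bytes spell "small" in memory
    func f() {}
    func g() {}
    func h() {}
}

struct Large: P {
    var marker: UInt = 0x656772616c   // bytes spell "large" in memory
    var a: UInt = 1
    var b: UInt = 2
    var c: UInt = 3
    func f() {}
    func g() {}
    func h() {}
}

// Three data words, then type metadata, then the witness table.
typealias Words = (UInt, UInt, UInt, UInt, UInt)

let small: any P = Small()
let large: any P = Large()

print(MemoryLayout<any P>.size)   // prints 40 on a 64-bit platform

let smallWords = unsafeBitCast(small, to: Words.self)
let largeWords = unsafeBitCast(large, to: Words.self)

// One word fits in the three-word buffer, so the marker sits right there.
print(smallWords.0 == 0x6c6c616d73)   // prints true

// Four words do not fit, so the first word is a pointer to a heap box instead.
print(largeWords.0 == 0x656772616c)   // prints false
```

Adding a single field can move a type from the first case to the second, which is the hidden cost the talk is warning about.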
You just take that, you look up the table by looking up the fourth word in the protocol and then you use the offset in the table that you know about because the compiler just knows that it's that method and then you just make the call based on the function pointer. And then if you have a struct that's too big it ends up looking like this, instead of having data fields, that first data field is actually a pointer to the real data. So everything gets stored off over here, you've got the table over there. And then the methods here know that when they need to do their stuff they have to go up and chase that pointer and it's all just handled behind your back. And this is not cooperating again, indeed, there we go. Enums are a very interesting case in Swift. So Swift has these high level enumeration types where you can have associated data and all that or they can just be very simple things. Here is a simple case, just five cases, nothing associated with them, just A, B, C, D, and E. Here's a struct which will hold those. And the result is those get laid out very succinctly, zero, one, two, three, four, each one gets a different number, they're one byte long because we don't need more than one byte to represent five values and it's all very nice and compact. Here is a version with a raw value so you can actually go through and tell Swift, I want my cases to correspond to specific values. So what this does is it says A is one, B is two, C is three, D is four, E is five. And let's see what that looks like. And an interesting thing, it does not change. It doesn't go one, two, three, four, five, it still goes zero, one, two, three, four. Alright, running out of time here. Real quick, if we just go through (clicks) for a string, you can do string raw values. So A is whatever and then B, C, D, and E get defaults, those are just B, C, D, and E. And those still are zero, one, two, three, four. 
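The enum cases above are easy to reproduce. The following is a small sketch with made-up enum names, using lowercase case names in current Swift style. It reads the stored byte with `unsafeBitCast`, which depends on the tag numbering current compilers use and is not something to rely on in real code.

```swift
enum Plain { case a, b, c, d, e }

enum IntRaw: Int { case a = 1, b, c, d, e }

enum StringRaw: String { case a = "whatever", b, c, d, e }

// Five cases fit comfortably in a single byte.
print(MemoryLayout<Plain>.size)       // prints 1
print(MemoryLayout<IntRaw>.size)      // prints 1
print(MemoryLayout<StringRaw>.size)   // prints 1

// The stored byte is the case's position, not its raw value.
print(unsafeBitCast(Plain.c, to: UInt8.self))       // prints 2
print(unsafeBitCast(IntRaw.a, to: UInt8.self))      // prints 0
print(IntRaw.a.rawValue)                            // prints 1
print(unsafeBitCast(StringRaw.e, to: UInt8.self))   // prints 4
print(StringRaw.b.rawValue)                         // prints "b"
```

The raw value only appears when you ask for `rawValue`, which is consistent with it being produced from the case number rather than stored in each instance.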
And essentially what's going on here is the raw value can be stored off in a separate table somewhere, the compiler knows about it, there's no per instance raw value of any kind, so it can just be zero, one, two, three, four and somewhere else there's a table that says zero is whatever, one is B, two is C, and so forth. Alright, let's look at associated objects real quick. This is just an enum, the first case has an object associated with it and the others do not. And if we dump that out we find that it has expanded because it needs to be able to store that object pointer, but it has expanded intelligently. So the first thing is just a raw object pointer and then the other ones are just small integers. And the compiler's able to pack these so that it knows zero, two, four, and six can never be a valid pointer. So it's able to use that to distinguish between those. And then if we make it larger we have an enum with A, B, C, D, and E where they all have objects associated with them and suddenly everything gets bigger. Every entry is a pointer followed by an integer. So object pointer zero, object pointer one, object pointer two, object pointer three. So the number that gets assigned to each enum case and the associated value essentially get laid out next to each other. The compiler's able to pack them compactly for that one specific case, but not in the general sense. Alright, so wrapping up, I'm just gonna kinda skip through these real quick since we're behind on time. We've got real physical memory, we've got conceptual memory and then we've got sort of the actual, the architecture of it all. C just lays things out nice and straightforward with a little padding. C++ objects get a vtable at the top. Swift objects get the same sort of thing, but with more stuff going on. Protocol values end up taking up five words of memory, sometimes they can store data inline, but if you get too big they don't. 
And enums end up being packed in many different ways depending on what's going on. There's our quick sum up which I just said. And so you can learn a lot by poking around. It's a lot of fun and sometimes it's useful. And as they asked me to remind you, and as I did before, remember to rate the session in the app and that's it. So if there's questions we can, or if we have no time for questions or (laughs), okay, can you use swift-demangle with po in Xcode when debugging? I don't think there's any built in way to do that, but what you can do is just copy, paste onto the command line, swift-demangle should be available in the terminal. If you wanted to I'm sure you could build a little script, LLDB is scriptable through Python so you could do that if that turned out to be useful. Yes. Okay, so the question is, Any versus AnyObject in Swift 3 and whether there are changes based on whether you import Foundation? I don't think there would be changes in the layout based on what you import because there needs to be cross-compatibility between files that import Foundation and files that don't. So they would still need to be the same. But Any, in Swift 3 untyped Objective-C objects now come in as Any instead of AnyObject, so there's definitely a change there, I believe that's just essentially a translation phase. Any I think looks like one of the protocol types where it's a five word thing, it's got three inline and whatever and it's essentially storing things that way. And then there's just a step where it takes that Objective-C pointer that comes in and just kind of puts it in one of those things. The source code is online for my thing, it should run more or less out of the box, so if you wanna experiment and see what it does feel free. Anything else? No? Alright, oh, yes? (inaudible) Sure. Yeah, so question is about the new memory debugging facilities in Xcode 8 and how it compares to my stuff.
So that new memory debugging stuff is really cool, you can go through and it will just essentially show you kind of graphs like I showed here except they're live which is really neat. And I haven't played with it a ton, but I'm sure it's gonna be really useful. It's a little bit more limited from what little I have done with it, in that it tries to, well it's gotta work at runtime, so it has to be kind of limited in that respect. I believe it will not trace like, C pointers and things like that, at least not beyond a certain point. It's not gonna be tracing global symbols and things like that. But as far as like, looking at plain Swift objects, it's really cool, it'll show you the trees, it'll show you this object points to that object. And I think it's gonna skip over things like the pointers up to the classes, so it doesn't give you everything. But for what you care about day to day, it looks really cool, really useful. Alright, looks like that's it. Thank you very much for coming and enjoy the rest of the conference. (applause)
GOTO 2016 • Exploring Swift Memory Layout • Mike Ash