Okay, so endianness. It's a simple problem, really. Computers these days arrange their memory in terms of bytes: 8-bit values, eight binary digits, arranged in the computer's memory, and we have an individual address for each of those bytes. Now, computers don't always deal in bytes. Sometimes they use 16-bit values, sometimes 32-bit values, and so you have an interesting question. Say you've got a 32-bit value (let's just stick with 32-bit values for now) and you need to put it into memory. You've got 8 bits per memory address and you've got 32 bits, so you're going to have to split it into four bytes, four individual pieces, and then assign each of those pieces to one memory location.

Let's pick a 32-bit value, and we'll do it in hexadecimal, just because it makes the numbers easier. The 0x means it's hexadecimal, and we're going to go for 0x00C0FFEE. This is going to be our 32-bit value that we want to assign into four different memory locations. So this would be address 0, 1, 2, 3, and then 4, and it would go on like that. Each of those addresses holds a byte: a number between 0 and 255, which is equivalent to two hexadecimal digits. Each hexadecimal digit represents one nibble, four bits, so two of them is a byte's worth and eight of them is 32 bits' worth. So we need to assign these bytes to the memory locations. How do we do it? What would your suggestion be, Shawn?

Shawn: "To me, it looks like you would just kind of translate that down and have the 0 0 in 0 and just carry on like that."

So you want me to put the 0 0 there, and then I put C 0 in there, F F in there, and then E E in there?

Shawn: "Yeah, but I do feel like I'm walking into a trap."

No. Obviously you just like to eat your hard-boiled eggs from the big end.

Shawn: "Right."

OK. There is another way you could do it, though: you could start from the little end. And there is a reason why I'm talking about a hard-boiled egg; I haven't completely flipped in this Computerphile video. We'll come back to that in a minute. Let's draw out another set of four memory locations: 0, 1, 2, 3 and 4. We could also have started from this end and put the E E in there, the F F in there, the C 0 in there, and then the two zeros in there, and that would be another way of doing it. In actual fact, as long as you're consistent about it, and you build the computer knowing that when it reads a 32-bit value the bytes will come in this order or that order or whatever order, your computer system will work.

What we've got here are two different ways of writing these things out, and that is basically the issue of endianness: how does your computer store values that are bigger than one byte when memory is made up of individually addressed 8-bit locations? How do we map, say, a 16-bit, 32-bit, or 64-bit value into those 8-bit locations?
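To make that concrete, here's a small C sketch (not from the video) that stores the 32-bit value 0x00C0FFEE from the example and then looks at its four bytes through a byte pointer. The order in which they print reveals the endianness of whatever machine you run it on.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t value = 0x00C0FFEE;         /* the 32-bit value from the example */
    uint8_t *bytes = (uint8_t *)&value;  /* view the same memory byte by byte */

    /* Print the byte stored at each increasing address offset 0..3. */
    for (int i = 0; i < 4; i++)
        printf("address offset %d: %02X\n", i, bytes[i]);

    /* A big-endian machine prints 00 C0 FF EE;
       a little-endian machine prints EE FF C0 00. */
    if (bytes[0] == 0xEE)
        printf("this machine is little-endian\n");
    else
        printf("this machine is big-endian\n");
    return 0;
}
```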
And this is where we come back to our friend the egg. There's a book published in the 1700s by Jonathan Swift called Gulliver's Travels. It's a novel, a satire of society, in which Gulliver goes on his travels, and the first place he goes to is Lilliput. In Lilliput everyone's very tiny, but they like to argue about things, and apparently (I haven't read the book) at one point civil war breaks out over which way you eat an egg. Do you start from the top, the little end, because it's pointy, or do you start from the bottom, the big end? Half of Lilliput was little-endian: they would start from the pointy end. The other half were big-endian: they would start from the other end, smack it down like that, and start peeling the egg, or hit it with, probably, a teaspoon, and serve it and dip into the yolk.

And those are the two main types that are used here: this one is called big-endian and this one is called little-endian. The reason for the names becomes clearer if we write our value out as a binary number. If you've got a hexadecimal number, you can convert each hexadecimal digit into four binary digits, so it's relatively easy to write out. We get 1 1 1 0 for the first E, followed by 1 1 1 0 for the second E, then 1 1 ... 0 0, and that should be 32 bits in all. Now, each of those bits has a number associated with it. This one would be bit 0 and this one bit 31, and we can count down: this is bit 24, that's bit 23, bit 16 and 15, and then that would be bit 8 and that's bit 7. So this byte, the E E, is what we call the least significant byte, because it holds the bits with the lowest numbers, and this one is the most significant byte, because it holds the bits with the highest numbers: 24 to 31 as opposed to 0 to 7.

Someone had the big idea that the way to name these things was to reference the egg wars of Gulliver's Travels. Systems that start (the sensible way, in my opinion) by putting the 0 0, then C 0, then F F, then E E into memory like that are big-endian systems. Systems that start by putting the E E at the bottom and then F F, C 0, 0 0 are called little-endian systems. So that's why we call it endianness: it all traces back to the eggs of Lilliput in Gulliver's Travels.

Now, you might ask: why have two systems at all? Why not just standardize on doing it one way or the other? Well, as I said, it doesn't make any difference as long as your computer system is consistent: the people writing the software know how it's done, the hardware designers know how it's done, everything lines up in the right place, and it isn't a problem. But there are some advantages to doing it one way over the other. The big-endian system, for example, is what you naturally went for, and the people who designed some of the IBM mainframes, the PowerPC architecture, the 68000 chip, and things like the original Macintosh and the Atari ST went the same way: they're all big-endian systems. When they store a 32-bit value, they put the most significant byte at the first address and work down towards the least significant byte. On the other hand, the 6502, the ARM chip by default (it can work the other way), the Intel and AMD x86 chips, and the Z80 as well are all little-endian systems: they put the least significant byte first in memory.
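One way to make the two conventions concrete in code is to lay the value out into a byte buffer yourself. This is a minimal sketch, not how any particular chip does it internally: the shifts pick out each byte by its significance in the number, which works the same on any machine.

```c
#include <stdio.h>
#include <stdint.h>

/* Store the value most significant byte first, the way a
   big-endian system such as the 68000 lays it out. */
void store_big_endian(uint8_t buf[4], uint32_t value) {
    buf[0] = (value >> 24) & 0xFF;   /* 00 */
    buf[1] = (value >> 16) & 0xFF;   /* C0 */
    buf[2] = (value >> 8)  & 0xFF;   /* FF */
    buf[3] = value & 0xFF;           /* EE */
}

/* Store the value least significant byte first, the way a
   little-endian system such as x86 lays it out. */
void store_little_endian(uint8_t buf[4], uint32_t value) {
    buf[0] = value & 0xFF;           /* EE */
    buf[1] = (value >> 8)  & 0xFF;   /* FF */
    buf[2] = (value >> 16) & 0xFF;   /* C0 */
    buf[3] = (value >> 24) & 0xFF;   /* 00 */
}

int main(void) {
    uint8_t big[4], little[4];
    store_big_endian(big, 0x00C0FFEE);
    store_little_endian(little, 0x00C0FFEE);
    for (int i = 0; i < 4; i++)
        printf("address %d: big-endian %02X, little-endian %02X\n",
               i, big[i], little[i]);
    return 0;
}
```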
And there is an advantage to that, because when you're reading values and building the hardware, it doesn't matter whether you've got a 16-bit value or a 32-bit value. If we had a 16-bit value, let's say A B C D, storing it as A B then C D would be big-endian, and you could also write it out as C D then A B, which would be little-endian. On a little-endian system the first byte always goes in bits 0 to 7 and the second byte always goes in bits 8 to 15, regardless of whether it's a 16-bit number, a 32-bit number, or a 64-bit number, so your hardware is simpler to design. On the other hand, if you're reading the memory in a debugger or something, it becomes harder, and you have to rearrange the bytes in your own head.

There's also another system, sometimes referred to as PDP-11 ordering, or mixed ordering, where you really mix things up and start from the middle and work outwards. You can get some really weird orderings, but we'll ignore those for now.

Generally, on one system that isn't talking to anything else, it doesn't matter which endianness you use as long as you know what it is. The problem comes when one computer communicates with another, whether that's over a network or by putting data onto a floppy disk, a USB stick, or something like that. You've then got bytes laid out by one machine being read by another, and when you do that you need to make sure both machines agree on how the bytes are laid out. Networks, when they're transferring data across, need to agree on what order the bits come in and what order the bytes come in to represent a 32-bit number. If they didn't agree on a standard (the Internet, for example, has settled on everything being big-endian, a sensible choice), one machine would send it big-endian, the other would read it little-endian, and it would get completely the wrong number out. So the only time it really matters is when you're transferring data between machines of different types, in which case you have to make sure you agree on what standard you're using to transfer it.

Shawn: "Where's that translation happen?"

That's a good question. Normally it happens in the software. For example, when you write software to communicate over a network using IP, there are various functions you call to take a number, say your TCP port number (if you're trying to connect to a web server that's port 80, or port 443 if you've got encryption). Rather than setting the value directly in memory, you run it through a function that does host-to-network ordering or network-to-host ordering, depending on which way you're going: if you're setting the port number you'd use the first one, and if you're reading it out of a network packet you'd use the second, and it will do the conversion for you, if needed. On, say, an Intel system that function will be defined to convert between little-endian and big-endian, but on a Motorola system using a 68000, which is natively big-endian, it will just do nothing and copy the value across.

Shawn: "Does it slow things down?"

Yes, a bit. You may have to read the bytes individually and shuffle them around in memory, although in actual fact modern CPUs, both modern ARM chips and modern Intel chips, have instructions that can load big-endian numbers even though they're natively little-endian, and at that point it's done as fast as possible. These days, with the clock speeds you're dealing with, the slowdown won't be noticeable, because you're not doing it that often: you set the port number once when you create the socket, and the rest of the transmission is probably in ASCII anyway, so you never need to convert anything and it doesn't make that much of a difference.
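In the C sockets API those functions are spelled htons/htonl (host to network, short/long) and ntohs/ntohl going the other way. Here's a minimal sketch of setting a port number the way described above; the sockaddr_in setup is just enough to show where the conversions sit.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* sockaddr_in, htons, ntohs */

int main(void) {
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;

    /* Port 80 as the host stores it must become big-endian
       (network byte order) before it goes out in a packet. */
    addr.sin_port = htons(80);

    /* Reading a port out of a received packet goes the other
       way: network order back to host order. */
    unsigned short port = ntohs(addr.sin_port);

    /* On a big-endian machine both calls do nothing;
       on a little-endian machine they swap the bytes. */
    printf("port in host order: %u\n", (unsigned)port);
    return 0;
}
```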
If I write down 0 0 1 0, that represents a 2 in its simplest form. That is what binary-coded decimal is, and you just use them in 4-bit nibbles. Now, we all know a nibble is half a byte. A byte equals eight...