We're going to do the final stage: we're going to end up with the self-compiling compiler. In the previous video, what we did was develop a C compiler by writing it in assembler, hypothetically anyway. The net result of that is you get a working C compiler that works on the particular chipset that's in your computer: it could be ARM, it could be Intel, it could be whatever. It does what you would expect it to do. It drinks in C program statements and, via its code generator modules, it spits out at the other end a binary which, when suitably packaged, will become an executable on your particular piece of machinery. Traditionally under UNIX and Linux these are called a.out files, and under Windows they're called .exe files, aren't they? But it's basically something that is very, very close to a loadable, runnable binary as a translation of your C program. We left it last time at, if you like, "this is our C compiler". But we could almost call it the Mark 1, because we're saying, hypothetically, that we developed it via an assembler route. So when it executes, it's running assembler-quality binary, and it's also producing, shall we say, assembler-quality binary. I'll label that compiler Mark 1. So you say: well, what's wrong with that? Why not just stick with that? We've got a compiler that works. Well, there could be a variety of reasons. It could be that, although it's running binary of quality A, that binary is not very good. It could be slow. It could be a perfectly good compiler, but it seems to take forever. That would be one good reason for doing it. There might just possibly, though you hope not, be a very, very hidden bug, so that for certain C inputs it just crashes. You hope that testing and debugging have removed all those, but you can never be sure. So that's maybe another reason: certain sorts of constructs sometimes cause it to crash.
You don't want that. But what you're now saying is: can we use a not-top-quality thing to make a better instance of itself? Now, this is not a new idea. Some of the commenters on my previous video have been pointing out that the machine tool industry has been doing this for ages: using a not very good tool that's on its last legs as a means of making something that's sharper and better. You can imagine a 3D printer that has been produced, in a component-wise way, by other 3D printers, and there is a vital cog in the middle of it which turns around a lot for some reason or another and gets worn. What you'd like to do is improve the quality of that vital cog, because you're worried that it's going to split apart. So why not program your machine to make that vital cog on the machine as it now is, and just hope it withstands the stress until the piece is made? Then, when the piece is made, it's got better tolerances and so on, hopefully, than the bit it's replacing. So you take the bad piece out, throw it away and put the new one in. You're actually feeding back into the machine itself a new instance of something that is vital to its running. So even out there in the everyday world, this business of eating yourself, or producing a better version of yourself, has been around for a long time, and we're going to do something very similar here.
What I'm saying is: ideally, I'd like to make Bin A history, a thing of the past, because I'd like to write a new version of the compiler, which we will call Mark 2. So, just to summarize: Mark 1 is a compiler running binary code of quality A at the moment, and code-generating binary of a similar sort of quality. We now write a new version of the C compiler, still written in C, which produces better quality binary. We pay a lot of attention to the code generation modules, so we'll call its output Bin B. Version 1 is A, version 2 is B, or you can think of the B as meaning better, best quality binary in some sense. So that is what you initially write: a new version of the C compiler which takes in C and produces Bin B. You've spent a lot of time on those code generation modules. And of course at this stage you think of it as being written in C, which it is: you're writing a new C compiler in C. But the burning question then comes: well, I can't execute C directly on the hardware. I don't have a C interpreter, although in principle I suppose you could develop one. No, that C has got to be turned into binary. How do you turn the C that this is written in into binary, when you are in the process of constructing the new thing? And the answer is: revert to the previous generation. Revert to the one that you're hoping to leave behind, as a means of propelling you forward, as part of the bootstrap process. What we're now saying is we do it like this. We write "C producing Bin B", the new version of the compiler, written in C. How do we compile that C? We have already got the old Mark 1 version of the compiler which, you'll remember from the top of the previous sheet, takes in C, runs on Bin A (it executes Bin A, let's hope it doesn't fall over) and squirts out binary of quality A. But that's sufficient: C goes in there, it whirs round, and it produces Bin A.
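That first step can be sketched as a small model, written in Python rather than C purely for brevity. The `Compiler` record, its field names, the `EXECUTABLE_FORMS` set and the `compile_with` function are all illustrative assumptions, not anything shown in the video. The model only tracks three things per T-diagram: what goes in, what quality of binary comes out, and what the compiler itself is currently expressed in.

```python
from dataclasses import dataclass

# Hypothetical labels for forms the hardware can run directly.
EXECUTABLE_FORMS = {"BinA", "BinB"}


@dataclass(frozen=True)
class Compiler:
    """One T-diagram: translates `source` into `emits`, and is itself held as `written_in`."""
    name: str
    source: str      # language accepted, e.g. "C"
    emits: str       # quality of binary produced, e.g. "BinA"
    written_in: str  # form the compiler itself is in: "C" source or a binary quality


def compile_with(host: Compiler, program: Compiler) -> Compiler:
    """Feed `program` (a compiler still in source form) through the executable `host`."""
    if host.written_in not in EXECUTABLE_FORMS:
        raise ValueError(f"{host.name} is not an executable, so it cannot be run")
    if program.written_in != host.source:
        raise ValueError(f"{host.name} only reads {host.source}, not {program.written_in}")
    # The result behaves like `program` (same source, same emits),
    # but its own binary has whatever quality `host` generates.
    return Compiler(program.name, program.source, program.emits, host.emits)


# Mark 1: built via the assembler route, runs as Bin A and emits Bin A.
mark1 = Compiler("Mark 1", source="C", emits="BinA", written_in="BinA")
# Mark 2: written in C, with improved code generation that emits Bin B.
mark2_source = Compiler("Mark 2", source="C", emits="BinB", written_in="C")

stage1 = compile_with(mark1, mark2_source)
print(stage1)
```

In this model `stage1` emits Bin B but is itself still held as Bin A, which is the position the talk has reached at this point.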
You've now got an executable which takes in C and produces much better binary, of quality B. Its only weakness is that it's still running itself on binary A. Maybe it's still slow; maybe binary A is congenitally slow. It's not been very well written at all, but at least it works. Is there some final step, then, that would get rid of any vestigial remains of Bin A in all this process cycle? Yes, there is. That is exactly what you do next. We've got a new version of the C compiler, but its executable is still only of Bin A quality. How to improve it? Feed the new version of the compiler to the executable that you have just created. And it would go like this. Look, here's your new Mark 2 version of the C compiler: a C compiler written in C, producing Bin B. Feed it through the thing that we developed at the end of the previous sheet. What we ended up with, if you remember, is a version of the compiler that can now do C to Bin B. So it's a new version; the only little weakness inside it is that it's running Bin A. But can you see that, by feeding that source into a binary executable of itself, what you do is feed your C in, this is running on Bin A, but it spits out Bin B. So what you collect in your .exe or a.out file at the end is what you want: C producing Bin B, but running on Bin B. So this is now our executable for the new version. And the beautiful thing about it is that, despite that remaining weakness here, you have used the version running Bin A to produce a version that doesn't have Bin A in it at all. Final stage: feed that back to itself. And then, sound of trumpets: C written in C producing Bin B. But as a result of the previous exercise, we've now got a version of the compiler that is the binary instantiation of this. We have got, from over here, a C compiler producing Bin B, written in Bin B. Well, use it. Use it to recompile yourself. And you end up
now with C written in Bin B producing Bin B. Any memories of the hell we went through with the assembly version can now be forgotten. You can throw that away and, carefully of course, store away your version now, which only needs to go back to a Bin B level of history in order to regenerate itself. And I suppose you could say, therefore, that this triumphant thing which I put three stars on here, the product we were trying for all along, is now a self-sufficient compiler as well as being a self-compiling compiler. It only needs a working and debugged version of itself in order to be a wondrous new replacement. Now, things are never quite that perfect. There's always a downside. I've got a question, because you kind of alluded to it, though you skipped over it nicely: "debugged", right? I mean, what happens if there are problems with this? Where do we go? You always have to retain the ability to go back further than you would wish. Of course you do. I think it is fair to say that compiling a C compiler is a very heavy and demanding task for a C compiler to do. You will get varieties of code, usages of data structures and all sorts in a compiler that you will not likely find in a weather forecasting program, let alone an events listing, a date calendar, whatever. A compiler is a demanding thing to write, so it's going to be a demanding thing to compile. It may well be that, when you just say "oh, here's the old compiler", the old compiler just falls over when faced with the unbelievable quality of the C you've written for the Mark 2 thing. So yes, that always happens: you can feed it to itself and then it falls over. What do you do?
Well, you've got to back off and do it again. But you can see that in the end it is the way to go. Use the whole idea of feeding something to itself to, how shall we say, motivate the quality of the C that you write to be compilable, having a mind, of course, to efficiency and so on all the way along the chain. But I do hope that this example helps those of you who have maybe been a bit puzzled, saying "what is all this bootstrapping?" One of the problems, I think, is that even sometimes when you see T-diagram explanations, they make it hard by not distinguishing, if you like, between Bin A and Bin B. I've seen explanations which just write "bin" and don't make it clear that it's a different sort of binary, and that, I think, causes a lot of confusion. So I hope this has helped. Is it possible that this goes through oodles more (that's the technical term, oodles more) iterations? You get Bin C, Bin D, Bin E? Oh, yes. If you decide that, well, Bin B was fine for its time, but frankly there's a witty new idea with interlocked triple-ref pointers pointing to data structures that do this and not the other, and that's absolutely the way to write the next version of the C compiler, then you can do exactly the same thing. You'd be daft not to use the previous version of the compiler to compile yourself. But what you must be careful of is that you're in the process of, as it were, defining the subset of the language that the C compiler compiles. You hope it will be total, but there may be glitches in it. You've got to try and write C that it's capable of compiling itself, and you've got to have that in your mind all the time when you're doing this. Worst-case scenario? Yes, exactly: "I'm a fabulous compiler, but please don't feed me with myself; all hell may break loose." Yeah.
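The whole chain, including the later generations asked about above, can be sketched as a toy model in Python. Everything here is an illustrative assumption: the `T` tuple, the `run` and `bootstrap` helpers, and the quality labels are hypothetical names, and the model deliberately ignores the real failure mode discussed in the talk, where the older compiler falls over on the newer source.

```python
from typing import NamedTuple


class T(NamedTuple):
    """A T-diagram for a C compiler: it emits `emits` and is itself held as `runs_on`."""
    emits: str
    runs_on: str  # "C" means it is still source text and cannot be executed yet


def run(host: T, source: T) -> T:
    """Compile `source` (a compiler written in C) with the executable compiler `host`."""
    if host.runs_on == "C":
        raise ValueError("host is still source code and cannot be executed")
    if source.runs_on != "C":
        raise ValueError("only C source can be fed to the compiler")
    # Same behaviour as the source, but its binary has the quality the host generates.
    return T(emits=source.emits, runs_on=host.emits)


def bootstrap(old: T, new_quality: str) -> T:
    """The three passes from the talk, for one new generation of the compiler."""
    source = T(emits=new_quality, runs_on="C")
    stage1 = run(old, source)     # emits the new quality, still runs on the old one
    stage2 = run(stage1, source)  # emits the new quality and runs on it
    stage3 = run(stage2, source)  # recompiling itself changes nothing further
    assert stage3 == stage2, "the compiler should reproduce itself"
    return stage3


compiler = T(emits="BinA", runs_on="BinA")  # Mark 1, built via the assembler route
for quality in ["BinB", "BinC", "BinD"]:
    compiler = bootstrap(compiler, quality)
    print(compiler)
```

After each generation the only thing that needs to be kept is the latest self-reproducing executable, which is the sense in which the talk calls the result self-sufficient.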
Self Compiling Compilers - Computerphile. Posted by 林宜悉 on 2020/04/13.