  • On "Computerphile" we just love provocative and mysterious titles and

  • carrying on from the last time we spoke let's say this is going to be a chat

  • about what came to be called the UNCOL problem. "Universal Computer [Oriented] Language" I think it stands for.

  • It was more specific than just any old computer language. It

  • was: "... is there a unique intermediate language which would suit everybody?"

  • You know, not as high as C even and not quite down at the absolute binary level but

  • more like a sort of absolutely Universal Assembler - a pseudo-assembler. It's not

  • really hardware implemented on any machine but it's one that we can

  • all work with and every compiler in the whole world - all they would have to do is

  • produce the UNCOL language, if we can only agree what it is, and then every

  • system could talk to every other system at this agreed low level. Well, as you can

  • imagine, it doesn't work like that. It very soon became obvious that, yes, this

  • business of putting a level in there and saying: "We'll all compile to intermediate

  • code", is fine but when you start looking at what facilities it should have and what

  • facilities it shouldn't have, you're up against the fact that computer hardware

  • designers like to do things their own way. I mean, numbers of registers? Might be

  • 16, might be fewer; that's no big deal. Some of them have arranged those registers

  • [as pointers into] a formal or informal stack. Others don't. Should we always assume that we

  • have stack capabilities? And I think somebody pointed out to me - I think Ron

  • Knott, who originally created these notes - he said: "The thing in the end that

  • kills you is that they all do input-output differently; there's almost

  • no agreement about how you do I/O". So, fairly soon the idea of finding one

  • unique intermediate language had to be forgotten about.

  • But the idea of different intermediate languages at different levels of

  • sophistication really did gain traction in the 1970s. We mentioned that Steve

  • Bourne had his Z-code as part of his Algol 68 project, way back in the early 70s.

  • A little bit later on, I think it was in the 90s, many of you will

  • know this one better, James Gosling developed the language Java, in which

  • he decided that pointers were dangerous and should be hidden (but therein lies

  • another story). But the big thing that James made a feature of was to say:

  • I want my Java systems to compile down to what he called 'bytecode'. In other words

  • it was a sort of pseudo-assembler with really, like, single-byte op codes like A

  • and Z, and whatever. And, yes, bytecode became flavour of the month: we all go down

  • to bytecode, but then what do you do? Well, you've got choices. You could either

  • write an interpreter for bytecode which will be easy to change. It will be a

  • little slow, a bit big. If you really care passionately about having the ultimate

  • super-fast and efficient binary, you can always compile bytecode; get it

  • smaller and all that. So you had options. But the idea was that, yes, you would have

  • an intermediate code. Even so, it's not a one-size-fits-all. There's still ... it was

  • ideal for what James wanted to do but its extensibility to be a universal

  • panacea? Not so. You see, let me give you another good example of why some people

  • might want to move the semantic barrier a bit higher. I mean, bytecode is fairly

  • low-level. What about if we move it up so we're getting more airy-fairy?

  • Heaven knows, we might encounter Haskell way up there somewhere!

  • A classic example, of course, is the development of C++ and, as many of you

  • know, as its name implies, it goes beyond C. It adds classes and all sorts of

  • other features to C. And the idea from Bjarne Stroustrup,

  • the inventor, was that to get something going, in the first instance at least, he

  • would, of course, do the obvious thing. His compiler would compile C++ down to

  • C. And then you could put it just through an ordinary C compiler, for the back end.

  • So, you see, his 'UNCOL' is at a much higher level of sophistication than

  • pseudo-assembler type bytecode level. And you might say: "Oh well, that's great,

  • I mean, it obviously suits C++ to do that". Yes, it did, but there are big problems

  • with this approach. Once you broadcast the fact that

  • actually your "Mark I" C++ compiler is producing C under the hood, you will have

  • the devil's own job stopping benevolent hackers, who think they can

  • generate better C code than Bjarne Stroustrup can, from getting in behind the

  • scenes and messing about with the way he does classes for example. So, I suppose

  • what I'm saying more generally about this, is that very often you will have a

  • very good solution for a language system to establish a bridge-head, and to get

  • something working. But, in the longer run, you might want a more direct version

  • that isn't as prone to hacker intrusion, gross abuse or just things going a bit

  • wrong because of the nature of the intermediate language being so rich and

  • having a mind of its own. Now, you might say: "That can't be an issue, can it?" Yes, it can.

  • Because this whole question of 'level' of your intermediate code. This thing gets

  • me there. So, why do I need to go direct? Let me give you another example.

  • Not C++ this time. Another well-known example for many of you is PDF. It's been

  • so well-established for so long now, since 1993,

  • that many of you using it will not know that in the early days it came off the

  • back of Adobe's very successful language called PostScript. And PostScript was

  • there as, you know, the universal graphics language.

  • It drove laser printers; it drove whatever. It was a wonderful achievement. In order to

  • get a PDF - the way you did it originally was you compiled your PostScript with

  • an Adobe-provided utility called Distiller. But the problem was that, in many

  • ways, it was very graphically sophisticated but it was Turing Complete.

  • You could do anything in it [given enough memory] and, indeed, I often thought: "Well, the next

  • program I write in PostScript, before I do any typesetting as ordered by the

  • customer, I will get my program to solve Ackermann's function first". Can you

  • imagine the delay: "I'm sorry, I'm going to compute ackermann(3,1) before I turn my

  • attention to doing your miserable little piece of typesetting". But in principle

  • you could have done that - as long as it didn't run out of memory. But, you know,

  • I'm just saying this to make the silly point that that's perfectly do-able!

  • You sometimes found that your PDF, produced out of compiling PostScript with Distiller was

  • yards bigger than the input. Not very often but sometimes. So, there again, you

  • see, in order to stop abuse and to point the way to the future Adobe

  • very quickly said: "What we must do, for those that don't know about PostScript,

  • have no need for it, is to give a direct route to PDF". And they called it PDFwriter,

  • back in the early days. And then, of course, people not wanting to be

  • beholden to Adobe, quite rightly said: "Fabulous! What we need to do is to

  • replicate something of what Distiller does. We'll write utilities with names

  • like 'ps2pdf'", which you'll typically find in Ghostscript offerings on Linux

  • and all this kind of stuff. But it makes the point that very often that

  • directness of approach gives you a good result

  • and stops people messing about under the hood and doing things which are

  • ridiculous and expensive. If you start saying: "No, from now on it's much quicker

  • to go direct, we know how to do it. Let's do it; let's keep it clean". So that is I

  • guess, I think, a feature. Still I keep reading stories of people using

  • intermediate codes for compiling programming languages who are suddenly saying:

  • "Well, 20 years down the line we think intermediate codes are bad. It's far

  • better to do it direct, in some other way". And all you can say, out of this, is that

  • every time you get into porting software you learn something about the

  • pros and cons of having an intermediate representation. Or do you jump over it

  • and go direct? There is no universal right answer. The more you look at the

  • scene at the moment about programming language implementation the more you

  • realize that a huge number of the offerings out there might look to you

  • like straightforward point-to-point compilers. You know: " ... I'm running on ...

  • whatever I'm running on at the moment ... I'm running on an ARM chip. It's all self-hosted on the

  • ARM chip. It compiles ARM code for further use on further ARM chips. It

  • doesn't do anything else!" Not true. If you look under the hood of gcc - of course,

  • Stallman and the GNU effort did such a wonderful job in creating for us a new

  • version of 'cc' - when you look in there at the possible back-ends for different

  • architectures you realize it's really a cross-compilation system. You can compile

  • from anything to anything. Now other people, other than the GNU effort, got

  • there eventually and realized the same thing. I mean I think Microsoft, in the

  • mid 80s, did actually have the nerve to develop something that I think they

  • called "Intermediate Language". I don't know whether Microsoft did try and

  • actually copyright the phrase "Intermediate Language" but it's part of

  • the same mindset. It's not just them. [It's also] Apple, and Steve Jobs

  • always used to have: "It may have existed [before] but it was done by a bunch of

  • no-hopers. And until we discovered it, packaged it, marketed it and put it up for you,

  • you might as well think that it never existed!" And that was Jobs through and through.

  • But maybe all big computer companies have a little bit of this inside them:

  • " ... it didn't really exist, in a usable way, until *we* discovered it".
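
As a rough illustration of the bytecode choice described earlier in the talk (write an interpreter for it, or compile it further), here is a minimal Python sketch. The opcodes PUSH, ADD, MUL and HALT and the functions interpret and translate are invented for this example and are not real Java bytecode. The translate function only emits a host-language expression, standing in for a genuine compiler back end.

```python
# Hypothetical single-byte opcodes for a tiny stack machine.
# These are illustrative only; they are NOT the real JVM instruction set.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def interpret(code: bytes) -> int:
    """Option 1: execute the bytecode directly, one instruction at a time."""
    stack = []
    pc = 0  # program counter: offset of the next opcode
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])  # one-byte operand follows the opcode
            pc += 2
        elif op in (ADD, MUL):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == ADD else a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc}")


def translate(code: bytes) -> str:
    """Option 2: translate the bytecode ahead of time into host-language source.

    Same walk as the interpreter, but the stack holds source text
    instead of values, so nothing is executed here.
    """
    stack = []
    pc = 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(str(code[pc + 1]))
            pc += 2
        elif op in (ADD, MUL):
            b, a = stack.pop(), stack.pop()
            symbol = "+" if op == ADD else "*"
            stack.append(f"({a} {symbol} {b})")
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc}")


# The same bytecode for (2 + 3) * 4 feeds both routes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])

if __name__ == "__main__":
    print(interpret(program))
    print(translate(program))
```

Both routes start from the same intermediate form. The interpreter is the easier one to change, while the translated output can be handed to whatever native tool chain the target machine already has.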
