This is an Ogham stone, or "ogg-em" stone, depending on whose pronunciation you follow. The carvings on here are from an alphabet unlike anything else in the world, an alphabet that is literally an exception to modern rules. I'm not talking about the Roman letters on the face: I'm talking about the markings carved into the corner. Those markings are in Ogham -- it's a way of writing down the early Irish language with marks like these. There are only about 400 surviving stones like this, found in Ireland and the western parts of the UK. This particular one is about 1500 years old, and it comes from Devon in the south west of England. It's on display here at the British Museum.
The inscription is a name. Ogham stones are mostly used to record names, either as a tombstone or as a marker of land ownership. You read this along the stemline, along the corner. Each character is made of one to five markings, which will all be on one side of the line, the other side of the line, through the line, or on the line.
When modern scholars started to analyse this script, they wanted to write it down on paper, and they adapted it a little to make it easier for them: they changed it so it always went left to right, and each phrase was drawn on a horizontal line so you could easily tell which marks were on which side. Some Ogham inscriptions, on later stones or on other artefacts, do actually carve their own stemline into a flat surface, so adding that line in print wasn't too much of a stretch. And after all, trying to fold a bit of paper and sketch markings on the corner wouldn't be easy to work with for academic papers.
And that left us, years later, with an interesting technology problem. When it came time to encode Ogham characters as 1s and 0s, to fold them into Unicode, the international standard for how to display text on a computer: Ogham became the only language in Unicode where a space is not a space.
A space character, to a computer, has three properties: it has a certain width, you don't display or print anything in that width, and if your text has run out of room on a line, you can go back to the previous space character and replace it with a line break. Now, there have been well-understood variations on those for years. Two of those rules are flouted all the time. You can have a non-breaking space, like the one between a word and a French quotation mark. That space has width, it has nothing displayed in it, but you can't put a line break there. The word and punctuation must travel together. You can also have a zero-width space, which sounds like a ridiculous idea, but it's a good way to tell a computer that, if there isn't room, it's OK to break a long string of characters somewhere that it otherwise wouldn't. These are all commonly used.
But until Ogham was added to Unicode, the rule that a space character must be empty had never been broken. Why would it? It's a space. Well, an Ogham space includes that stemline. The line doesn't stop between words, because the corner doesn't stop between words. The space is not a space... but it behaves like one. It can be replaced with a line break. If you spread an Ogham inscription over two lines, the space character vanishes, same as in English.
Now, Ogham isn't the only language that uses a separator like this. Ancient Latin used an interpunct, a middle dot, the same way. But in modern usage that is not a space, and modern usage wins. Ogham is the only case where modern folks have gone, yeah, okay, it's a space that also involves drawing something. It's a space that isn't a space. There's been an actual argument about it, down in one of the mailing lists for linguists and computer science nerds at the Unicode Consortium. The Irish contingent had some very strong opinions. And the final ruling: yep. It's a space that's also a line.
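If you want to poke at those three properties yourself, here is a rough Python sketch. The table of flags simply restates what was said above (width, blank, breakable) and is hard-coded as an assumption; it is not read from any Unicode line-breaking data. The `wrap` helper is a hypothetical, deliberately naive greedy wrapper that counts every character as one column, so treat it as an illustration and not as how a real text renderer works.

```python
import unicodedata

# Properties as described in the transcript, hard-coded for illustration.
SPACES = {
    "\u0020": {"has_width": True,  "blank": True,  "breakable": True},   # SPACE
    "\u00A0": {"has_width": True,  "blank": True,  "breakable": False},  # NO-BREAK SPACE
    "\u200B": {"has_width": False, "blank": True,  "breakable": True},   # ZERO WIDTH SPACE
    "\u1680": {"has_width": True,  "blank": False, "breakable": True},   # OGHAM SPACE MARK
}


def wrap(text, width):
    """Greedy wrap: break only at breakable spaces and drop the space that
    was used as the break. Every character counts as one column, which is
    a simplification (it ignores zero-width characters)."""
    lines = []
    start = 0         # index where the current line begins
    last_break = -1   # index of the latest breakable space on this line
    for i, ch in enumerate(text):
        if SPACES.get(ch, {}).get("breakable"):
            last_break = i
        # Line is too long and there is somewhere legal to break it.
        if i - start + 1 > width and last_break >= start:
            lines.append(text[start:last_break])
            start = last_break + 1
            last_break = -1
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    # Look up the official character names and general categories.
    for ch, props in SPACES.items():
        print(f"U+{ord(ch):04X}", unicodedata.name(ch),
              unicodedata.category(ch), props)

    # Two Ogham words joined by an Ogham space mark (U+1680).
    ogham = "\u1688\u1691\u168B" + "\u1680" + "\u1684\u1689\u1691\u1688\u1688"
    for line in wrap(ogham, 5):
        print(line)
```

Wrapping the two Ogham words at five columns puts each word on its own line, and the space mark with its stemline is gone from both, which is the "behaves like a space" part. Swap in a no-break space and the wrapper will let the line overflow instead of splitting there.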
This is one of the things I love about linguistics: an ancient script, carved into stones more than a millennium ago, is an exception to a rule that I never even realised was there. A space doesn't have to be a space.
᚛ᚈᚑᚋ ᚄᚉᚑᚈᚈ᚜ and ᚛ᚑᚌᚐᚋ᚜, posted by 林宜悉 on 2020/04/01