17 October 2012

The daleth and the resh, part 1 of 2


Here's what the Hebrew letters daleth and resh look like: ד (daleth) and ר (resh).


With that in mind, consider the following excerpt about Isaiah 33.8.

The RSV, NRSV, NAB, and NIV follow 1QIsaa in reading
ʾdym [‘ê·ḏîm] [עֵדִ֔ים] [concordance] “witnesses”
instead of the MT
ʾrym [‘ā·rîm] [עָרִ֔ים] [concordance] “cities.”
“Witnesses” seems appropriate to the meaning of the passage, and the interchange of resh for daleth is understandable in light of the similarity of the letter shapes. The NJV also calls attention to this reading in a footnote.
Harold Scanlin, The Dead Sea Scrolls & Modern Translations of the Old Testament, p. 130

This got me thinking: how should one go about writing a sacred text in a way that avoids such problems?

Or, more generally, how should one go about writing a text that needs to be transmitted with high fidelity, i.e. faithfully? Sacred texts are just a specific example of this. The situation reminds me of an old FedEx slogan:

When it Absolutely, Positively has to be there overnight.

So, sacred texts are those where it absolutely, positively must be copied right. Yet, historically, they have fallen far short of this. At least, this is true of the sacred texts of Judaism, which are the only ones I know anything about.

The following is a rambling set of comments on the topic of faithful transmission of text. I'd like to be able to call it something more profound, like an "extended meditation," but it is really just a ramble. As the title suggests, it is the first of what I hope will be a two-part whole.

Avoid homoglyphs

The first rule of faithful transmission is "avoid homoglyphs." Well, really it should be "avoid homoglyphs and near-homoglyphs," but that isn't as catchy. Anyway, it is just a fancy way of saying "use letterforms that look different."

Hebrew homoglyphs

Hebrew is littered with near-homoglyphs. We've already seen the issue with daleth and resh; here it is again, along with various other issues.

נג כב עצ זןו רדך סם
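As a rough illustration (my own sketch, not something taken from the sources quoted here), the groups above can be treated as equivalence classes, and two words can be flagged when they differ only by letters from the same class. The names CONFUSABLE_GROUPS, GROUP_OF, and differ_only_by_confusables are made up for this example, and the two words are written without vowel points.

```python
# Groups of look-alike Hebrew letters, copied from the line above.
CONFUSABLE_GROUPS = ["נג", "כב", "עצ", "זןו", "רדך", "סם"]

# Map each letter to the index of the group it belongs to.
GROUP_OF = {
    letter: index
    for index, group in enumerate(CONFUSABLE_GROUPS)
    for letter in group
}


def differ_only_by_confusables(a: str, b: str) -> bool:
    """Return True if a and b are different words of equal length and
    every differing position holds two letters from the same group."""
    if a == b or len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x == y:
            continue
        group_x = GROUP_OF.get(x)
        if group_x is None or group_x != GROUP_OF.get(y):
            return False
    return True


# Consonants of the 1QIsaa reading ("witnesses").
witnesses = "עדים"
# Consonants of the MT reading ("cities").
cities = "ערים"

print(differ_only_by_confusables(witnesses, cities))
```

A check like this only says that a copying slip is plausible; it says nothing about which of the two readings is the original.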

A more detailed presentation of these issues is available on the following web page. (Note that it shows Sofit and Fey Sofit in their cursive form. Their printed forms are not easily confused.)

Similar Hebrew Letters
DIGRESSION By the way, this "Hebrew for Christians" site is an example of a general pattern:
Some of the best resources for studying the Tanakh are for studying the Old Testament.
Whether this is a good thing or a bad thing, or just a thing, neither good nor bad, I will perhaps opine on in another blog post. But I will remark here that it feels a little strange to me. But that feeling itself is a little strange, since when I step back and think about it, it is not surprising that it should be the case. Our sacred texts are sacred to them, too. And, while numbers don't tell all, I'll just point out that there are something like 150 times as many Christians as Jews in the world.

English homoglyphs

Anyway, back to homoglyphs (and near-homoglyphs). To be fair to Hebrew, English is not immune to this problem. Or, rather, the Latin alphabet and Arabic digits are not immune to this problem. For example, consider the following characters.

  • 1 (the digit one)
  • I (the letter capital I as in India) (henceforth "CI")
  • l (the letter lowercase el) (henceforth "LL")

Putting all three together, you get "1 I l." How much these differ depends on whatever font is operative. For some edification and entertainment, I recommend typing "1 I l" in at one of the following sites:

I chose "India" as an example of a word starting with I since that's the choice of the NATO phonetic alphabet, which is designed to give letters different-sounding names, solving a problem analogous to the one we are discussing here.

Are homoglyphs "wrong"?

I think we have to let sans serif fonts off easy on the CI/LL distinction. Though the distinction can be made with full strokes, not just serifs, I feel that it is not really in the charter of a sans serif font to have to make distinctions like that.

More generally, we can't say that any font is wrong if it fails to make one or more of these distinctions. Fonts serve a variety of purposes; their design goals span concerns of form, function, and the great gray area in between. Many of these goals are at odds with each other, and hence trade-offs must be made. In the service of one goal, another goal may be sacrificed, or at least compromised. For example, if simplicity of letterform is allowed to trump distinctness of letterform, then perhaps one, CI, and LL would be allowed to be very similar or even the same.

That having been said, for most purposes, distinctness of letterform is a very important goal for a font. Thus a font intended for general use should give this goal great weight in making its design trade-offs.

One and CI

My particular interest is the one/CI distinction. The most common problem here is a one that looks like CI. The above sites allowed me to quickly identify the following fonts on my computer as "offenders" in this area.

Hoefler Text
Big Caslon

A Roman-style one (henceforth "R1"), i.e. a one that looks like CI, is common as part of what are called old-style numerals or old-style figures (OSF). Such a one may be distinguishable (with effort) from CI since it is usually only x-height. Even then, when mixed with a small caps CI, the problem may persist. This might seem an obscure situation, but the use of small caps for acronyms is a somewhat common style.

Here's the story of how Vice President Al Gore caused the one in the Brioni font to be changed from Roman to Arabic to make it easily distinguishable from CI.

Roman ones and the Great Isaiah Scroll

Let's get back to the sacred. Strangely enough, my first experience of one/CI confusion happened while reading about some famous daleth/resh confusions of more than two thousand years ago! The very perceptive reader may have noticed the opportunity for one/CI confusion in the prickly-looking abbreviation "1QIsa" that appears in my opening quote.

"1QIsa" is an interesting opportunity for confusion. On the positive side, it has a one and a CI, making slight differences between the glyphs easier to see than if they appeared independently and the reader had to find and compare far-flung examples. Also on the positive side, many readers would know, from context, that the third glyph is a CI since it begins an abbreviation for Isaiah. Slightly fewer, but still many, readers would know, from context, that the glyph before the Q is supposed to represent a number.

But here begins the real problem. When the one resembles a CI, is the reader to infer that the convention is to use Roman rather than Arabic numerals? For the trivial case of the number one, it doesn't really matter, since they represent the same thing. But for the case of the number eleven, confusion could be serious, since R1 glyphs would make eleven look like the Roman numeral representation for two. And, as it turns out, there was a Cave 11 at Qumran, containing, among other things, the Great Temple Scroll (11QTa).

Let's see how the Great Isaiah Scroll is referred to in the NJPS translation of Prophets (Nevi'im) that was first published in 1978 (ISBN 0-8276-0096-8). (This is not the first publication of its translation of Isaiah; that was in 1973.)

First, we should note that the NJPS Prophets abbreviates Isaiah to just "Is" as opposed to the more standard "Isa." Perhaps that standard had not been established yet, and in any case that's not our issue here.

Our issue here is that its typesetting of "1QIs" visits the whole gamut of the one/CI/LL confusion. The one is represented

  • confusingly, as an R1
  • incorrectly, as LL
  • correctly, as an Arabic one (henceforth "A1")

The CI is represented

  • incorrectly, as an LL
  • correctly, as a CI

Below are scans of examples of its different settings of "1QIsa." The page numbers and the representations of one and CI are listed in the first row. Apologies that these are bilevel rather than grayscale images.

356: R1 / LL; 368: LL / CI; 369: R1 / CI; 468: A1 / CI

This was all fixed, to A1/CI, in the 1985 NJPS Tanakh. More generally, the 1985 Tanakh moved to using A1 rather than R1 in footnotes.

And then came the digital age.

What promise it offered, and continues to offer! Yet, what typographic barbarisms it has facilitated. I suppose that only from great heights can great falls happen. Another way I've seen it well-put is:

To err is human; to really screw things up requires a computer.

Don't get me wrong, my Kindle version of the NJPS Tanakh is one of my prized possessions, inasmuch as something so intangible can be thought of as a possession. But somehow one/CI confusion in "1QIs" crept back in, with a vengeance. In particular, the one is usually represented as a CI.

Below are scans of examples of its different settings of "1QIsa." The Kindle locations, Kindle and printed page numbers, and the representations of one and CI are listed in the first row.

16293 / 762 / 633: CI / CI; 16330 / 762 / 640: A1 / CI
DIGRESSION In typical great heights/great falls fashion, Kindle locations offer citations of intriguingly high resolution, but all footnotes have been converted to endnotes, and thus all Isaiah footnotes appear to be on the last page of Isaiah, 762. Within the hyper-linked Kindle world, this doesn't really matter. For citations that "work" for the printed version, you need to follow the endnote's hyperlink "backward" to find the real page.

In the Kindle edition, how can we know whether the one is being represented by a CI or an R1? Well, for one thing it is visually identical to other CI glyphs, but, more deeply, if you copy and paste it, it is a CI; and if you search for IQI (CI-Q-CI), you'll find those instances that use it.

An important consequence of this is that if you search for 1QI (one-Q-CI), you won't find the IQI (CI-Q-CI) instances. And here we really find form spilling over into function. So far all my complaints about the typesetting of "1QIs" could be dismissed as the whinings of an aesthete with too much time on his hands. I would mostly disagree with this characterization, but would have to admit that function was only impaired, not destroyed, by these problems. If we narrowly define function as transmitting the meaning, "Isaiah manuscript from Cave 1 at Qumran," then these problems probably did not destroy function for most readers. They probably just made it more difficult to decode this meaning, i.e. they only impaired function.

But the digital version adds (or should add) a new function: the ability to search. And this was not just impaired but destroyed by the misrepresentation of one as CI.
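To make the search failure concrete, here is a minimal Python sketch. It assumes the text is available as a plain string; the sample sentence, the FOLD table, and the fold helper are all invented for illustration and are not a description of how the Kindle software works. The point is that the digit one and CI are different code points, so an exact search misses the mis-set instances unless the look-alikes are folded together first.

```python
# Hypothetical sample text: one correctly set "1QIs" and one mis-set "IQIs".
text = "So 1QIs-a; but elsewhere IQIs-a appears."

# The three look-alikes are three distinct code points.
print(hex(ord("1")), hex(ord("I")), hex(ord("l")))

# An exact search tells the digit one apart from capital I.
print(text.count("1QI"))  # counts only the correctly set instance
print(text.count("IQI"))  # counts only the mis-set instance

# Fold capital I and lowercase el onto the digit one before comparing.
FOLD = str.maketrans({"I": "1", "l": "1"})


def fold(s: str) -> str:
    """Replace capital I and lowercase el with the digit one."""
    return s.translate(FOLD)


# Searching the folded text with a folded query counts both instances.
print(fold(text).count(fold("1QI")))
```

Folding like this is a blunt instrument: it also makes unrelated strings match each other, so it is a workaround for searching a damaged text, not a substitute for setting the one correctly in the first place.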

DIGRESSION When one is represented as one in the Kindle edition, it shows up as an Arabic, yet old-style, figure. This and other research leads me to believe that Georgia is the font used by the Kindle Reader for Mac. Or at least it is the font used when a serif, proportional font is requested. Note that you can't easily change fonts on any Kindle-reading platform. That's why I said it is "the font," not "the default font." On Kindle hardware, the font seems to be PMN Caecilia, which has a lining one, i.e. a non-old-style one. Some fonts may have an option for both lining and old-style figures, and perhaps even an option for both an Arabic and Roman old-style figure for 1. I'm not sure if either of these fonts does, but the relevant question here is what their default one looks like.


If you have the luxury of making up your own alphabet, avoid homoglyphs. This luxury is rarely available; the only recent example I can think of is the invention of the Klingon alphabet. I wonder how distinguishable its glyphs are.

DIGRESSION Another "I wonder ..." about a recent example: did anything analogous to daleth/resh confusion ever happen with the Book of Mormon?

Back to obvious conclusions: if you, like most of us, are stuck using someone else's alphabet, choose your fonts so as to avoid homoglyphs. For example, avoid fonts with Roman ones if the text you're setting might use them in a way that would cause confusion. Perhaps you don't need to avoid the font altogether if it provides an Arabic one as an alternative.

Finally, let's zoom out to a theological question: if Isaiah's words are holy, why didn't G-d give him a better alphabet to record them in?

My suggestion is, the Hebrew alphabet is no more the alphabet of G-d than the Hebrew language is the language of G-d. Indeed problems like daleth/resh confusion serve a useful purpose. They remind us that we are reading holy words, not G-d's words. Holy words bring us closer to G-d, but they are written in man's imperfect alphabets, and in man's imperfect languages.

To me, the very notion of "G-d's words" unacceptably diminishes G-d by seeing him as acting within the limits of language and therefore possibly constrained by language.

Like anyone else (perhaps more so), I can't claim to know much about G-d. But I'm pretty sure he is without limits. (So I'm also pretty sure he is not a "he" or a "she"!) And I'm pretty sure that if G-d had an alphabet, we surely could not read it.


  1. In the modern day I think we could dismiss most of these issues with a few error-correcting codes and checksums - or better yet, digital signatures. Hopefully God has a way to transmit His Public Key to us. I'd accept the text as divine if the key fingerprint appeared as a cloud formation.

  2. Hehe.. nice lead-in to what I plan to cover in part 2, i.e. how you handle these issues with what we now know about coding. Regarding the cloud formation, couldn't you be misled by elaborate sky-writing?