
28th Unicode Conference-Call for Papers-Orlando, FL-September 7-9, 2005

  • Tex Texin
    Message 1 of 5 , May 16, 2005
      Send in your submissions now! Due date is May 20!

      Call for Papers!

      Twenty-eighth Internationalization and Unicode Conference (IUC28)

      Unicode 4.1 - Multilingual Challenges and Solutions for 2006

      See Call for Papers at:
      http://www.global-conference.com/iuc28

      September 7-9, 2005
      Orlando, Florida, USA

      Send in your submission now!

      Submissions due: May 20, 2005
      Notification date: June 10, 2005
      Papers due: July 15, 2005

      Unicode 4.1 - Multilingual Challenges and Solutions for 2006

      The Internationalization and Unicode Conference (IUC) is the premier
      technical conference for software and multilingual computing, and it is
      your source for the latest information on advances in the globalization
      of software and the Internet. The 28th IUC features a number of
      presentation formats including tutorials, workshops, lectures, and
      panel discussions to support different learning styles.

      The release this year of version 4.1 brings us closer to 100,000
      characters in the Unicode Character Standard. Leaders in Software
      and Web Internationalization will share their expertise and
      best practices in working with the expanding character set and
      identify new challenges and opportunities in multilingual computing.

      You are invited to contribute to the discussion. All topics related to
      Unicode, software and Web internationalization, and specific regional
      challenges are relevant, but emphasis will be given to topics that fit
      this theme with a focus on actual practice. The conference will also
      explore how to make internationalization and localization more efficient
      even when extending support to languages with complex processing and
      rendering behavior. There will be a track devoted to localization
      standards, tools, methodologies and techniques.

      The conference web site has more details and numerous example topics:
      http://www.global-conference.com/iuc28


      ATTENTION WEBSITE and SOFTWARE DEVELOPERS!

      Share your ideas for best practices for designing applications that can
      accommodate any language. If you are using Unicode in software or on
      the Web, bring your experience, knowledge and any remaining questions
      to light! We invite you to submit papers describing challenges you
      faced, lessons learned, and ideas for future implementation. Our
      audience is very interested in how Unicode and internationalization
      are being applied in the real world. Come and share your ideas with
      the peers and industry experts in attendance.

      INVITATION TO SUBMIT PAPERS

      This is the premier technical conference for software and Web
      internationalization and your source for the latest information on
      standards, best practices, development tools and advances in the
      globalization of software and the Internet. The Internationalization &
      Unicode Conference features a number of presentation formats including
      tutorials, workshops, lectures, and panel discussions to support
      different learning styles. The conference also provides a forum for
      identifying and discussing new issues in internationalization.

      Attendees benefit from the wide range of basic to advanced topics
      and the opportunities for dialog and idea exchange with experts
      and peers. We invite you to submit papers on the conference themes or
      topics that relate to Unicode or any aspect of software and Web
      internationalization.

      You can view the programs of previous conferences at:
      http://www.unicode.org/unicode/conference/about-conf.html


      COLLABORATION WITH TILP

      The Institute of Localisation Professionals (TILP) will chair a
      track during the conference, devoted to localization standards,
      tools, methodologies and techniques. Improving efficiencies in
      localization is key to enabling cost-effective, quick-to-market
      support of new language versions of software.

      Registration for the conference will grant access to the TILP
      localization track. Speakers therefore will have opportunities to
      reach a wider audience of localization and internationalization
      management and professionals. We invite papers that are appropriate
      for this expanded audience.

      CONFERENCE ATTENDEES

      Conference attendees are generally involved in either the
      development and deployment of Unicode software, or the
      globalization of software and the Internet. They include
      managers, software engineers, testers, systems analysts, program
      managers, font designers, graphic designers, content developers,
      web designers, web administrators, site coordinators, technical
      writers, and product marketing personnel.

      EXHIBIT OPPORTUNITIES

      The Conference SHOWCASE area is for corporations and individuals
      who wish to display and promote their products, technology and/or
      services. Every effort will be made to provide maximum exposure,
      advertising and traffic.

      Exhibit space is limited. For further information or to reserve a
      place, please contact Global Meeting Services at
      info@....

      THE UNICODE CONSORTIUM

      The Unicode Consortium is a non-profit organization dedicated to
      the development, maintenance and promotion of The Unicode
      Standard, a worldwide character encoding. The Unicode Standard
      encodes the characters of the world's principal scripts and
      languages, and is code-for-code identical to the international
      standard ISO/IEC 10646. The Consortium also defines character
      properties and algorithms for use in implementations. The
      membership base of the Unicode Consortium includes major computer
      corporations, software producers, database vendors, research
      institutions, international agencies and various user groups.

      For further information on the Unicode Standard, visit the
      Unicode Web site at http://www.unicode.org or e-mail info@...
      * * * * *

      Unicode(r) and the Unicode logo are registered trademarks of Unicode,
      Inc.
      Used with permission.
      Copyright 2005 Global Meeting Services, Inc.
      All Rights Reserved.
    • i18n@yahoo.com
      Message 2 of 5 , May 17, 2005
        Hi all - just came across this on the Straight Dope web site at
        http://www.straightdope.com/mailbag/mindusscript.html

        Comments, currently not really on point for qalam, are available at
        http://boards.straightdope.com/sdmb/showthread.php?t=315899

        Best,

        Barry
        -------------------------------------------------

        A Staff Report by the Straight Dope Science Advisory Board
        How come we can't decipher the Indus script?

        10-May-2005

        Dear Straight Dope:

        I just got a book on ancient civilizations. In the chapter dealing with
        written languages, they list Egyptian hieroglyphics, Mesopotamian
        pictographs, and Indus script as the three oldest known written
        languages. The book goes on to say Indus script has never been
        deciphered even though over 2,500 examples of it exist. Maybe I've
        watched too many sci-fi movies where a master linguist deciphers alien
        languages, but I really thought we had terrestrial languages mastered.
        What's the deal with Indus script? Is the art of linguistics still held
        hostage by our inability to decipher ancient languages without a "key" à
        la the Rosetta Stone? --Troy Dayton, Fargo, ND

        SDSTAFF bibliophage replies:

        Too much science fiction? No such thing. Star Trek, for example, teaches
        us that a good communications officer can send a message that transcends
        mere language, especially if she has legs down to here and a hemline up
        to there. Mmmmm. Mm-HMMMmmmmm . . . er, sorry. Was I saying something?

        Yes, I was. The Indus script, which was written in and around Pakistan
        over a period of several centuries centered around 2500 B.C., is the
        most famous undeciphered script, but there are many others. Other
        mystery writing systems include Linear A (Greece, 1800 B.C.), Zapotec
        (Mexico, 500 B.C.), Meroitic (Sudan, 300 B.C.), Isthmian (Central
        America, A.D. 200), Rongorongo (Easter Island, A.D. 1800) and Joycean
        (Ireland, A.D. 1900). Okay, maybe not that last one.

        Why haven't they been deciphered? It's instructive to look at some
        deciphered scripts to see what makes the enigmatic writing of the Indus
        valley different. Script decipherment is not as easy as it's made out to
        be in science fiction--and sometimes not as easy as it's made out to be
        in history books. Chances are the impression you took away from school
        was that the Rosetta stone made it child's play to decipher Egyptian
        hieroglyphics. Not so. How many schools teach that some of the best
        minds in the world pored over the Rosetta stone for a quarter century
        before it finally revealed its secrets?

        One of the biggest obstacles was that the ancient Egyptians used a
        writing system unlike anything known when the Rosetta stone was
        discovered in 1799. Scholars knew about logographic systems like
        Chinese, where there are thousands of symbols, each normally
        representing a whole word or idea. They knew about alphabetic systems
        like Hebrew and English, where there are typically 20 to 30 symbols,
        each normally representing one consonant or vowel. Some scholars may
        have known about syllabaries, with several dozen symbols each
        representing one syllable, as in Japanese hiragana and katakana. But
        Egyptian hieroglyphics had too many distinct symbols to be an alphabet
        or syllabary, and too few to be logographic.

        The decipherment published by Champollion in 1823 (building on work by
        many others, including Thomas Young) showed that Egyptian hieroglyphics
        were (neglecting some complications) a logo-phonetic system. In such a
        writing system, any given symbol can represent either an entire idea or
        word, or the sound (or initial sound) of that word. Some simple ideas
        can be expressed efficiently with a drawing of the object or an object
        it's associated with. But to express an abstract idea that can't be
        readily drawn, you can use a string of sounds. Suppose you want to
        express the English word "charitable" without an alphabet. You could
        draw a picture of a chair and a table (since "chair table" sounds sort
        of like "charitable"). This is the rebus principle. Today we may
        consider rebus puzzles to be nothing but a silly game, but to the
        ancients, they were a natural way to write a language. Other early
        scripts, like Mayan hieroglyphs and Mesopotamian cuneiform, are built on
        the same principle.

        The rebus approach may seem an unwieldy way to write a language, but
        it's a step up from non-linguistic pictograms. A picture of a chair and
        a table can only convey "chair and table," or at best an idea associated
        with a chair and table, such as the act of sitting down at a table. An
        abstract concept such as "charitable" is difficult to get across using
        pictograms. Writing systems built on the rebus system are a way of
        filling the void, but have the drawback (for us latter-day translators)
        that, unlike pictograms, they'll only work in one language. For a
        speaker of Latin, for example, pictograms of a chair (in Latin, sella)
        and a table (mensa) would never suggest the word for charitable
        (benignus).

        I go into such detail about logo-phonetic systems because the Indus
        script appears to have about the right number of distinct symbols (250
        to 400, depending on who's counting) to use this system. Knowing that,
        shouldn't it be easier to decipher the Indus script? Not really--the
        decipherers of Egyptian hieroglyphics had the help of the Rosetta stone,
        a bilingual or bitext (parallel texts of the same message in the unknown
        script and a known script). No bitext for the Indus script has yet been
        found.
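
        The sign-count argument can be put in rough computational terms. The
        sketch below is only an illustration: the numeric cut-offs are
        assumptions chosen to fit the ranges quoted above (20 to 30 signs for
        an alphabet, several dozen for a syllabary, thousands for a
        logographic system), and the function name classify_script is
        hypothetical.

```python
def classify_script(sign_count: int) -> str:
    """Guess a writing-system type from the number of distinct signs.

    The thresholds are illustrative assumptions, not established values.
    """
    if sign_count <= 40:      # roughly alphabet-sized inventory
        return "alphabet"
    if sign_count <= 150:     # several dozen signs (upper bound is a guess)
        return "syllabary"
    if sign_count < 1000:     # too many for a syllabary, too few for logograms
        return "logo-phonetic"
    return "logographic"


# Estimates of the Indus sign inventory quoted in the text: 250 to 400.
for estimate in (250, 400):
    print(estimate, classify_script(estimate))
```

        Both estimates land in the middle band, which is all the argument
        claims: the count is compatible with a logo-phonetic system, not
        proof of one.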

        A bitext is no guarantee that decipherment will be easy. Take the case
        of Etruscan writing, found in Italy. At a superficial level the script
        is easily deciphered, since the letters are close in form to archaic
        Greek and Latin alphabets. But the language remains largely
        uninterpreted. What's the difference? Given a piece of Etruscan writing,
        we have no difficulty pronouncing the words, but no idea what most of
        the words mean (think of a trained politician reading off a
        TelePrompTer). The trouble is that Etruscan is apparently unrelated to
        any language understood today. Champollion, the decipherer of Egyptian
        hieroglyphics, had the advantage of knowing Coptic, which he correctly
        suspected was the descendant of the ancient Egyptian language. Etruscan
        has left no descendants.

        The dozens of Etruscan bitexts (with Latin, Greek, or Phoenician) aren't
        very helpful. All they really tell you is that a given block of
        mysterious text means such-and-such. There's no sure way to tell which
        Etruscan word corresponds to which word in the parallel text, since the
        order of ideas and number of words vary widely among the different
        languages. All is not lost, however. If, for example, a Latin word
        occurs several times in a text and a mystery word occurs the same number
        of times in the corresponding Etruscan text, you may be justified in
        supposing that they mean the same thing. But beware--often the two
        messages in a bilingual text are just paraphrases of each other, not
        word-for-word translations. Still, using methods like this, together
        with glosses (explicit translations of individual words in the
        documents), scholars have been able to determine--or at least make a
        reasonable guess at--the meanings of a couple hundred Etruscan words.
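
        The counting trick described above can be sketched in a few lines.
        This is a minimal illustration, assuming both halves of a bitext have
        already been split into words; the function name candidate_pairs and
        the sample tokens are invented placeholders, not real Latin or
        Etruscan data.

```python
from collections import Counter


def candidate_pairs(known_tokens, unknown_tokens, min_count=2):
    """Pair each known word with unknown words that occur equally often.

    Words seen fewer than min_count times are skipped, since a match on
    a single occurrence tells us almost nothing.
    """
    known_counts = Counter(known_tokens)
    unknown_counts = Counter(unknown_tokens)
    pairs = []
    for known_word, n in known_counts.items():
        if n < min_count:
            continue
        for unknown_word, m in unknown_counts.items():
            if m == n:
                pairs.append((known_word, unknown_word, n))
    return sorted(pairs)


# Invented placeholder tokens standing in for the two halves of a bitext.
known_side = ["king", "gave", "king", "temple", "king"]
unknown_side = ["w1", "w2", "w1", "w1", "w3"]
print(candidate_pairs(known_side, unknown_side))
```

        The output is a list of candidates, nothing more. If the two halves
        are paraphrases rather than word-for-word translations, equal counts
        can easily be coincidence.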

        If we understand the language or a close relative or descendant of the
        language, it ought to be pretty easy to decipher the script, right? Not
        so fast. The Rongorongo script used on Easter Island after European
        contact almost certainly represents Rapa Nui, the well known Polynesian
        language of the Easter Islanders. But no one now remembers how the
        script symbols are meant to be read. Steven Fischer recently claimed to
        have deciphered Rongorongo, but his critics say "Wrong-o, wrong-o." I
        don't know if Fischer is right or wrong, but undeciphered scripts do
        seem to invite harebrained analysis. Jacques Guy bluntly calls them
        "kook attractors," but even serious scholars aren't immune. Hrozný, who
        correctly deciphered Hittite, later went down many wrong paths with
        other scripts.

        The real kooks are those like Goropius Becanus of the Netherlands, who
        in 1580 proved to his satisfaction that Egyptian hieroglyphics
        represented Dutch. A Jesuit priest named Heras is one of scores who have
        claimed to decipher Indus script. Here's one of his translations: "There
        is no feast in the place outside the country of the Minas of the three
        fishes of the despised country of the woodpeckers." Whatever you say, padre.

        You mention the 2,500 examples of the Indus script. The number of
        available texts now exceeds 4,000, but quantity is no indication of ease
        of decipherment. Some scripts have been translated with far fewer texts.
        Take Palmyrene, the first ancient script ever deciphered. A handful of
        inscriptions were found on the walls of the ruins of the city of Palmyra
        in Syria. Scholars knew from ancient Greek writers that the language
        spoken there was closely related to Syriac, a well known Semitic
        language. The script was obviously derived from the known Aramaic
        alphabet but many letters weren't immediately identifiable. Among the
        ruins were several bilingual inscriptions in Greek and Palmyrene. If you
        know the Aramaic alphabet, it's a fairly simple matter to use the
        identifiable Aramaic letters and the similarity of proper names in Greek
        and Palmyrene to get a good start. Then you can use your knowledge of
        Greek and Syriac to fill in the blanks. Your Syriac is a little rusty,
        you say? Not to worry--a decent Syriac dictionary will serve just as
        well. Soon after the first decent reproductions of Palmyrene
        inscriptions were published in Europe in the 1750s, Barthélemy in France
        and Swinton in England independently deciphered them, each taking just a
        few hours to finish the job. It was perhaps a bit more challenging than
        the cryptogram puzzles you can find in your Sunday paper, but not by
        much. Most decipherments, needless to say, are a good deal tougher to
        crack than that.

        Returning to the matter at hand, is the lack of a bitext for the Indus
        script an insurmountable obstacle? Not necessarily. Some scripts have
        been deciphered without them, although not without a good deal of
        cleverness. Ugaritic writings, like Palmyrene, were found in Syria (in
        1929), suggesting that they too might be a Semitic language. About two
        dozen symbols were used, suggesting an alphabetic script. Several of the
        words were only a single letter long, suggesting Ugaritic used a
        consonantal alphabet written without vowels (as was the case with other
        early Semitic alphabets such as Hebrew). Applying letter frequency
        analysis to the problem, Hans Bauer tentatively assigned the values L
        and M to two Ugaritic letters. In Semitic languages, L is common as a
        single-letter word, but not so common in suffixes and prefixes; M is the
        only letter that is really common in Semitic suffixes, prefixes, and as
        single-letter words.
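
        A rough sketch of this kind of positional frequency count follows. It
        assumes the text has already been divided into words, each a tuple of
        sign labels, and it treats the first and last sign of a word as a
        crude stand-in for prefixes and suffixes. The name positional_counts
        and the sample signs are hypothetical, not Ugaritic data.

```python
from collections import Counter


def positional_counts(words):
    """Count how often each sign stands alone, starts a word, or ends one."""
    single, first, last = Counter(), Counter(), Counter()
    for word in words:
        if len(word) == 1:
            single[word[0]] += 1
        else:
            first[word[0]] += 1
            last[word[-1]] += 1
    return single, first, last


# Invented sample: each word is a tuple of arbitrary sign labels.
sample = [("a",), ("b",), ("a",), ("b", "c", "d"), ("c", "d", "b"), ("a", "c")]
single, first, last = positional_counts(sample)
print("stand-alone:", single.most_common())
print("word-initial:", first.most_common())
print("word-final:", last.most_common())
```

        A sign that scores high in all three tallies would be a candidate for
        M under the reasoning above; one that is common alone but rare at
        word edges would be a candidate for L.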

        On the assumption that related languages use similar words for common
        concepts (much as European languages have father/vater/pater), Bauer
        then used the M and L assignments to search the texts for the expected
        Semitic word for "king" (M-L-K or similar) and "kings" (M-L-K-M or
        similar). Proceeding along these lines, he found the words for "son" and
        the name of the god Ba`al, and so eventually determined the values of
        several other letters. His real insight was to guess that the word for
        axe might occur in the text inscribed on several axes. He turned out to
        be right about that, but chose the wrong phonetic values (he guessed
        G-R-Z-N as in Hebrew; the actual Ugaritic form was the related but not
        identical H-R-S-N). Édouard Dhorme later corrected the reading and
        finished the decipherment. One of the axe inscriptions said, in a
        language related to biblical Hebrew, "Unto the high priest doth this axe
        belong, wherefore shouldst thou keep thy hands off it!" Or something
        like that. It strikes me that Bauer's guess was pretty lucky--I have two
        axes in my garage but have yet to inscribe either with the word "axe."
        But hey, when the high priest tells me, "Inscribe the word 'axe' on this
        axe, chop-chop," I'm not about to wait around for him to axe me politely.
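
        The search for an expected word, given a few tentative letter values,
        can be sketched as a simple pattern match. This is an illustration
        under stated assumptions, not a reconstruction of Bauer's actual
        working method: words are tuples of sign labels, "?" stands for a
        sign with no value assigned yet, and the name find_candidates and
        the sample signs are hypothetical.

```python
def find_candidates(words, assigned, pattern):
    """Return words that fit a pattern of known letters and '?' gaps.

    assigned maps sign labels to tentative letter values. A '?' in the
    pattern matches only a sign that has no value yet.
    """
    hits = []
    for word in words:
        if len(word) != len(pattern):
            continue
        fits = True
        for sign, wanted in zip(word, pattern):
            if wanted == "?":
                if sign in assigned:
                    fits = False
                    break
            elif assigned.get(sign) != wanted:
                fits = False
                break
        if fits:
            hits.append(word)
    return hits


# Invented sample: s1 and s2 carry the tentative values M and L.
assigned = {"s1": "M", "s2": "L"}
words = [("s1", "s2", "s3"), ("s2", "s1"), ("s4", "s2", "s3")]
print(find_candidates(words, assigned, "ML?"))
```

        Each hit proposes a value for the unknown sign (here, K for s3), and
        that new value can be fed back into assigned for the next search.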

        Ugaritic isn't the only language to have been deciphered without a
        bilingual. Georg Friedrich Grotefend made considerable progress in
        deciphering Persian cuneiform by looking for and finding proper names of
        Persian emperors known from ancient Greek and Hebrew sources. (Henry
        Rawlinson finished the decipherment in the 1830s.) The point is that
        bilinguals aren't necessary to decipher an unknown script. Still, in the
        case of Ugaritic and Persian, scholars had a pretty good handle on the
        language the script represented before they started work. In the case of
        Etruscan, where the language is largely unknown, complete decipherment
        thus far has eluded us.

        What do we know about the language the Indus script wrote? We can say
        little for certain, but the best guess is that it's a language of the
        Dravidian family, an idea that has been around since at least the 1920s.
        Today most Dravidian speakers live in Sri Lanka and southern India, 800
        miles or more from the Indus valley where the bulk of the Indus
        inscriptions have been found. But about a hundred thousand speakers of
        one Dravidian language, Brahui, live in western Pakistan and neighboring
        parts of Iran and Afghanistan, not too far west of the Indus. Contrary
        to earlier speculation about recent migrations, linguistic and genetic
        analyses show that they have been separated from other Dravidian
        speakers for at least several thousand years. Further evidence that
        Dravidian or related languages were once spoken in the general area
        comes from Linear Elamite inscriptions, found in the ruins of the
        ancient city of Susa in southwestern Iran. The script has been
        deciphered from a phonetic standpoint because of its similarity to
        Mesopotamian cuneiform, but as with Etruscan, the language remains
        largely unknown. A significant percentage of words in Linear Elamite
        appear to be of Dravidian origin, which could mean it is descended from
        a hypothetical Elamo-Dravidian ancestor language, or just that it
        borrowed a lot of words from a Dravidian language spoken nearby. In
        either case, the Elamite connection makes it seem more likely that a
        Dravidian or related language was spoken in the Indus valley when the
        inscriptions were made.

        Many Indian nationalists, and some serious scholars, believe the Indus
        script writes a language of the Indo-Iranian (Aryan) branch of the
        Indo-European family, which includes Farsi (modern Persian), Sanskrit
        and Hindi. All things considered, this seems unlikely. The inscriptions
        go back to about 3200 B.C., which according to mainstream archaeological
        thinking is before any Indo-Europeans had come that far southeast.
        Another problem is that Indo-European peoples kept domesticated horses
        and used chariots and had other cultural traits not shared with the
        ancient Indus civilization. Indeed, according to the mainstream
        thinking, the arrival of the Indo-Europeans in the Indus Valley around
        1800 B.C. is more likely to have been the end of the Harappan culture
        than the beginning of it.

        If the Indus script turns out to write a language that is neither
        Indo-European nor Dravidian (or Elamo-Dravidian), then the chances of
        deciphering it are slim. In the words of Alice Kober, who helped
        decipher Linear B, "an unknown language written in an unknown script
        cannot be deciphered, bilingual or no bilingual." There are really no
        other decent candidates among known languages, so we would be left with
        an unknown language, and the prospects of complete decipherment would be
        as poor as with Etruscan.

        But faint hope is better than none. Sumerian is a linguistic isolate,
        but the script has been phonetically deciphered, and the language partly
        deciphered. Most of the cuneiform scripts of Mesopotamia are direct
        descendants of the Sumerian script, though they're used to write
        unrelated languages. Babylonian and Akkadian and some other languages
        written in these related scripts were amenable to decipherment in part
        because they were members of the well understood Semitic family. The
        similarity of the scripts, the many Sumerian loanwords in these Semitic
        languages, and the unusually large number of bilingual texts have
        allowed scholars to reconstruct the Sumerian language with considerable
        success despite its being unrelated to any known language. No such
        combination of circumstances exists for the Indus script, and no
        discoveries along these lines are seriously expected.

        What will we get if the Indus script is finally deciphered--great
        historical works that reveal the local political situation 5,000 years
        ago? Classic works of literature like the Egyptian Book of the Dead or
        the Mesopotamian epic of Gilgamesh? Insight into ancient religious
        practices of the sort revealed by Ugaritic? No to all the above. The sad
        truth is that the longest known Indus inscription is only 17 symbols
        long. The bulk of the 4,000 or so Indus inscriptions are believed to be
        simple identifying marks. Most of the inscriptions are on seals or seal
        impressions, similar to signet rings or rubber stamps. So even if we
        decipher the script and the language, chances are we'll discover they
        say nothing more fascinating than "government property" or "John Smith"
        or "tax paid." As with the revelation that Linear B wrote an archaic
        form of Greek, if the Indus script is deciphered, the most interesting
        fact learned will be what language the ancient script wrote--that is, if
        it writes a language at all.

        If it writes a language? They wouldn't call it the "Indus script" if it
        weren't a script, would they? Don't be so sure. When the first
        inscriptions were discovered in the 1870s in and around the Indus valley
        of Pakistan, and when the early cities of Harappa and Mohenjo-Daro were
        excavated in the 1920s, archaeologists assumed that civilization and
        writing always went together--a complex urban culture couldn't possibly
        develop without writing. The Indus sites were urban; ergo, the
        inscriptions were writing.

        Today we recognize that civilization and writing don't always go
        together. The Inca empire, for example, was urban but lacked true
        writing. Historian Steve Farmer now questions the assumption that the
        Indus script is true writing. In a recent paper, he and two linguists
        compare the Indus script with medieval European heraldry. Like heraldry,
        they say, the Indus script may consist of discrete conventional elements
        that serve as identification marks but don't encode a spoken language.

        This controversial idea has some points in its favor. Considering the
        corpus of texts as a whole, there's a considerable amount of repetition
        among symbols, as would be expected if they wrote a spoken language. But
        there's less repetition than expected within the texts, even considering
        their brevity. Further, several systems of pictograms from around the
        world--for example, the Vinča signs of southeastern Europe, written
        about 4000 B.C.--resemble the Indus script in their use of conventional
        symbols, but nobody believes they encode a spoken language.
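
        One simple way to put a number on repetition within texts is to count
        how many inscriptions use any sign more than once. The sketch below
        assumes each inscription is a list of sign labels; the name
        repetition_rate and the sample data are hypothetical, and this is not
        presented as the measure used in the paper mentioned above.

```python
def repetition_rate(inscriptions):
    """Fraction of inscriptions in which at least one sign occurs twice."""
    if not inscriptions:
        return 0.0
    repeated = sum(1 for text in inscriptions if len(set(text)) < len(text))
    return repeated / len(inscriptions)


# Invented sample inscriptions, each a list of arbitrary sign labels.
sample = [["a", "b", "c"], ["a", "b", "a"], ["d"], ["b", "b"]]
print(repetition_rate(sample))
```

        Very short texts will score low on any such measure, which is why the
        brevity of the inscriptions has to be taken into account before
        drawing conclusions.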

        Traditionalists have some points in their favor too. The Indus script
        was linear, that is, usually written with symbols following one another
        in a line, rather than being placed randomly or in some other geometric
        pattern. Linearity is found in most writing, though not exclusively so.
        More to the point, the characters often crowd at the end of a line, as
        if the writer wanted to avoid breaking up a word. This is a distinctive
        feature of true writing. The comparison with heraldry may not hold water
        either. Hittite hieroglyphics were initially considered heraldry by
        serious linguists but were eventually found to be true writing and
        deciphered. Much the same has been said about many other undeciphered
        scripts likewise shown to be true writing.

        Still, Farmer feels so strongly that the Indus script is not a real
        script that he has offered a $10,000 reward for proof that it is true
        writing. He will accept as proof an authenticated inscription more than
        50 symbols long. Farmer thinks the extant texts are all so short because
        they don't write a language. The pro-language side thinks the longer
        texts once produced in Harappa and other cities have been lost because
        they were written on perishable surfaces. Certainly a long text would be
        a great gift to modern science. I just wish they wouldn't use the lame
        excuse that they couldn't give it to us because they ran out of Harappan
        paper.

        Further reading

        Lost Languages: The Enigma of the World's Undeciphered Scripts by Andrew
        Robinson, 2002

        The Story of Decipherment: From Egyptian Hieroglyphs to Maya Script by
        Maurice Pope, revised edition, 1999

        "The Collapse of the Indus-Script Thesis: The Myth of a Literate
        Harappan Civilization" by Steve Farmer, Richard Sproat, and Michael
        Witzel in Electronic Journal of Vedic Studies, Dec. 13, 2004. This and
        related items can be accessed from Steve Farmer's download page at
        www.safarmer.com/downloads/.

        --SDSTAFF bibliophage
        Straight Dope Science Advisory Board


        Staff Reports are researched and written by members of the Straight Dope
        Science Advisory Board, Cecil's online auxiliary. Although the SDSAB
        does its best, these articles are edited by Ed Zotti, not Cecil, so
        accuracywise you'd better keep your fingers crossed.


        The Straight Dope / Questions or comments for Cecil Adams to:
        cecil@...
        Comments regarding this website to: webmaster@...
        For advertising information, see the Chicago Reader Online Rate Sheet
        Copyright 2005 Chicago Reader, Inc. All rights reserved.
        No material contained in this site may be republished or reposted
        without express written permission.
        The Straight Dope is a registered trademark of Chicago Reader, Inc.
      • i18n@yahoo.com
        Message 3 of 5 , May 19, 2005
          i18n@... wrote:
          Hi all - just came across this on the Straight Dope web site at
          http://www.straightdope.com/mailbag/mindusscript.html

          Comments, currently not really on point for qalam, are available at
          http://boards.straightdope.com/sdmb/showthread.php?t=315899

          Best,

          Barry
          -------------------------------------------------

          A Staff Report by the Straight Dope Science Advisory Board
          How come we can't decipher the Indus script?

          10-May-2005

          Dear Straight Dope:

          I just got a book on ancient civilizations. In the chapter dealing with
          written languages, they list Egyptian hieroglyphics, Mesopotamian
          pictographs, and Indus script as the three oldest known written
          languages. The book goes on to say Indus script has never been
          deciphered even though over 2,500 examples of it exist. Maybe I've
          watched too many sci-fi movies where a master linguist deciphers alien
          languages, but I really thought we had terrestrial languages mastered.
          What's the deal with Indus script? Is the art of linguistics still held
          hostage by our inability to decipher ancient languages without a "key" à
          la the Rosetta Stone? --Troy Dayton, Fargo, ND

          SDSTAFF bibliophage replies:

          Too much science fiction? No such thing. Star Trek, for example, teaches
          us that a good communications officer can send a message that transcends
          mere language, especially if she has legs down to here and a hemline up
          to there. Mmmmm. Mm-HMMMmmmmm . . . er, sorry. Was I saying something?

          Yes, I was. The Indus script, which was written in and around Pakistan
          over a period of several centuries centered around 2500 B.C., is the
          most famous undeciphered script, but there are many others. Other
          mystery writing systems include Linear A (Greece, 1800 B.C.), Zapotec
          (Mexico, 500 B.C.), Meroitic (Sudan, 300 B.C.), Isthmian (Central
          America, A.D. 200), Rongorongo (Easter Island, A.D. 1800) and Joycean
          (Ireland, A.D. 1900). Okay, maybe not that last one.

          Why haven't they been deciphered? It's instructive to look at some
          deciphered scripts to see what makes the enigmatic writing of the Indus
          valley different. Script decipherment is not as easy as it's made out to
          be in science fiction--and sometimes not as easy as it's made out to be
          in history books. Chances are the impression you took away from school
          was that the Rosetta stone made it child's play to decipher Egyptian
          hieroglyphics. Not so. How many schools teach that some of the best
          minds in the world pored over the Rosetta stone for a quarter century
          before it finally revealed its secrets?

          One of the biggest obstacles was that the ancient Egyptians used a
          writing system unlike anything known when the Rosetta stone was
          discovered in 1799. Scholars knew about logographic systems like
          Chinese, where there are thousands of symbols, each normally
          representing a whole word or idea. They knew about alphabetic systems
          like Hebrew and English, where there are typically 20 to 30 symbols,
          each normally representing one consonant or vowel. Some scholars may
          have known about syllabaries, with several dozen symbols each
          representing one syllable, as in Japanese hiragana and katakana. But
          Egyptian hieroglyphics had too many distinct symbols to be an alphabet
          or syllabary, and too few to be logographic.

          The decipherment published by Champollion in 1823 (building on work by
          many others, including Thomas Young) showed that Egyptian hieroglyphics
          were (neglecting some complications) a logo-phonetic system. In such a
          writing system, any given symbol can represent either an entire idea or
          word, or the sound (or initial sound) of that word. Some simple ideas
          can be expressed efficiently with a drawing of the object or an object
          it's associated with. But to express an abstract idea that can't be
          readily drawn, you can use a string of sounds. Suppose you want to
          express the English word "charitable" without an alphabet. You could
          draw a picture of a chair and a table (since "chair table" sounds sort
          of like "charitable"). This is the rebus principle. Today we may
          consider rebus puzzles to be nothing but a silly game, but to the
          ancients, they were a natural way to write a language. Other early
          scripts, like Mayan hieroglyphs and Mesopotamian cuneiform, are built on
          the same principle.

          The rebus approach may seem an unwieldy way to write a language, but
          it's a step up from non-linguistic pictograms. A picture of a chair and
          a table can only convey "chair and table," or at best an idea associated
          with a chair and table, such as the act of sitting down at a table. An
          abstract concept such as "charitable" is difficult to get across using
          pictograms. Writing systems built on the rebus system are a way of
          filling the void, but have the drawback (for us latter-day translators)
          that, unlike pictograms, they'll only work in one language. For a
          speaker of Latin, for example, pictograms of a chair (in Latin, sella)
          and a table (mensa) would never suggest the word for charitable
          (benignus).

          I go into such detail about logo-phonetic systems because the Indus
          script appears to have about the right number of distinct symbols (250
          to 400, depending on who's counting) to use this system. Knowing that,
          shouldn't it be easier to decipher the Indus script? Not really--the
          decipherers of Egyptian hieroglyphics had the help of the Rosetta stone,
          a bilingual or bitext (parallel texts of the same message in the unknown
          script and a known script). No bitext for the Indus script has yet been
          found.
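
          As a rough sketch of that sign-count reasoning, the Python below sorts
          a script into a type by its number of distinct symbols. The cut-offs
          and the helper name guess_script_type are assumptions chosen to echo
          the ranges mentioned above, not established thresholds.

```python
def guess_script_type(distinct_signs: int) -> str:
    """Guess a writing-system type from the number of distinct signs.

    The cut-offs are illustrative assumptions; real scripts blur them.
    """
    if distinct_signs <= 40:
        return "alphabet"        # typically 20 to 30 signs
    if distinct_signs <= 100:
        return "syllabary"       # several dozen signs
    if distinct_signs < 1000:
        return "logo-phonetic"   # hundreds of signs
    return "logographic"         # thousands of signs


# English has 26 letters; 250 and 400 are the low and high Indus sign counts.
for label, count in [("English", 26), ("Indus, low count", 250), ("Indus, high count", 400)]:
    print(label, "->", guess_script_type(count))
```

          Both ends of the Indus range land in the same bucket here, which is
          about all the sign count alone can tell us.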

          A bitext is no guarantee that decipherment will be easy. Take the case
          of Etruscan writing, found in Italy. At a superficial level the script
          is easily deciphered, since the letters are close in form to archaic
          Greek and Latin alphabets. But the language remains largely
          uninterpreted. What's the difference? Given a piece of Etruscan writing,
          we have no difficulty pronouncing the words, but no idea what most of
          the words mean (think of a trained politician reading off a
          TelePrompTer). The trouble is that Etruscan is apparently unrelated to
          any language understood today. Champollion, the decipherer of Egyptian
          hieroglyphics, had the advantage of knowing Coptic, which he correctly
          suspected was the descendant of the ancient Egyptian language. Etruscan
          has left no descendants.

          The dozens of Etruscan bitexts (with Latin, Greek, or Phoenician) aren't
          very helpful. All they really tell you is that a given block of
          mysterious text means such-and-such. There's no sure way to tell which
          Etruscan word corresponds to which word in the parallel text, since the
          order of ideas and number of words vary widely among the different
          languages. All is not lost, however. If, for example, a Latin word
          occurs several times in a text and a mystery word occurs the same number
          of times in the corresponding Etruscan text, you may be justified in
          supposing that they mean the same thing. But beware--often the two
          messages in a bilingual text are just paraphrases of each other, not
          word-for-word translations. Still, using methods like this, together
          with glosses (explicit translations of individual words in the
          documents), scholars have been able to determine--or at least make a
          reasonable guess at--the meanings of a couple hundred Etruscan words.
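
          The counting trick can be sketched in a few lines of Python. This is
          an illustration only: candidate_pairs is a hypothetical helper, the
          sample words on the unknown side are invented rather than real
          Etruscan, and equal counts are merely a hint, since paraphrased
          bitexts will throw up false matches.

```python
from collections import Counter


def candidate_pairs(known_text, unknown_text, min_count=2):
    """Pair words that occur equally often in two parallel texts.

    Returns sorted (known_word, unknown_word, count) tuples.
    """
    known = Counter(known_text.lower().split())
    unknown = Counter(unknown_text.lower().split())
    pairs = []
    for k_word, k_count in known.items():
        if k_count < min_count:
            continue  # words seen only once match far too many candidates
        for u_word, u_count in unknown.items():
            if u_count == k_count:
                pairs.append((k_word, u_word, k_count))
    return sorted(pairs)


# Toy data: a Latin-like word list and a made-up "mystery" text.
latin = "rex dedit templum rex fecit aram rex"
mystery = "kuv tes kuv mol kuv"
print(candidate_pairs(latin, mystery))
```

          Here only one pairing survives because only one word on each side
          occurs three times; in a real bitext there would usually be several
          rivals to weigh against glosses and context.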

          If we understand the language or a close relative or descendant of the
          language, it ought to be pretty easy to decipher the script, right? Not
          so fast. The Rongorongo script used on Easter Island after European
          contact almost certainly represents Rapa Nui, the well known Polynesian
          language of the Easter Islanders. But no one now remembers how the
          script symbols are meant to be read. Steven Fischer recently claimed to
          have deciphered Rongorongo, but his critics say "Wrong-o, wrong-o." I
          don't know if Fischer is right or wrong, but undeciphered scripts do
          seem to invite harebrained analysis. Jacques Guy bluntly calls them
          "kook attractors," but even serious scholars aren't immune. Hrozný, who
          correctly deciphered Hittite, later went down many wrong paths with
          other scripts.

          The real kooks are those like Goropius Becanus of the Netherlands, who
          in 1580 proved to his satisfaction that Egyptian hieroglyphics
          represented Dutch. A Jesuit priest named Heras is one of scores who have
          claimed to decipher Indus script. Here's one of his translations: "There
          is no feast in the place outside the country of the Minas of the three
          fishes of the despised country of the woodpeckers." Whatever you say, padre.

          You mention the 2,500 examples of the Indus script. The number of
          available texts now exceeds 4,000, but quantity is no indication of ease
          of decipherment. Some scripts have been translated with far fewer texts.
          Take Palmyrene, the first ancient script ever deciphered. A handful of
          inscriptions were found on the walls of the ruins of the city of Palmyra
          in Syria. Scholars knew from ancient Greek writers that the language
          spoken there was closely related to Syriac, a well known Semitic
          language. The script was obviously derived from the known Aramaic
          alphabet but many letters weren't immediately identifiable. Among the
          ruins were several bilingual inscriptions in Greek and Palmyrene. If you
          know the Aramaic alphabet, it's a fairly simple matter to use the
          identifiable Aramaic letters and the similarity of proper names in Greek
          and Palmyrene to get a good start. Then you can use your knowledge of
          Greek and Syriac to fill in the blanks. Your Syriac is a little rusty,
          you say? Not to worry--a decent Syriac dictionary will serve just as
          well. Soon after the first decent reproductions of Palmyrene
          inscriptions were published in Europe in the 1750s, Barthélemy in France
          and Swinton in England independently deciphered them, each taking just a
          few hours to finish the job. It was perhaps a bit more challenging than
          the cryptogram puzzles you can find in your Sunday paper, but not by
          much. Most decipherments, needless to say, are a good deal tougher to
          crack than that.

          Returning to the matter at hand, is the lack of a bitext for the Indus
          script an insurmountable obstacle? Not necessarily. Some scripts have
          been deciphered without them, although not without a good deal of
          cleverness. Ugaritic writings, like Palmyrene, were found in Syria (in
          1929), suggesting that they too might write a Semitic language. About two
          dozen symbols were used, suggesting an alphabetic script. Several of the
          words were only a single letter long, suggesting Ugaritic used a
          consonantal alphabet written without vowels (as was the case with other
          early Semitic alphabets such as Hebrew). Applying letter frequency
          analysis to the problem, Hans Bauer tentatively assigned the values L
          and M to two Ugaritic letters. In Semitic languages, L is common as a
          single-letter word, but not so common in suffixes and prefixes; M is the
          only letter that is really common in Semitic suffixes, prefixes, and as
          single-letter words.
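
          A minimal sketch of this kind of positional frequency analysis
          follows. It is not a reconstruction of Bauer's actual procedure; the
          scoring rule, the helper names, and the toy word list (arbitrary
          signs A to F) are assumptions made for illustration.

```python
from collections import Counter


def positional_counts(words):
    """Count each sign as a one-sign word, a word-initial sign, and a word-final sign."""
    single, prefix, suffix = Counter(), Counter(), Counter()
    for word in words:
        if len(word) == 1:
            single[word] += 1
        else:
            prefix[word[0]] += 1
            suffix[word[-1]] += 1
    return single, prefix, suffix


def rank_candidates(words):
    """Pick an M-like and an L-like sign among those seen as one-sign words.

    Assumed heuristic: the M-like sign is common alone and at word edges;
    the L-like sign is common alone but rare at word edges.
    """
    single, prefix, suffix = positional_counts(words)
    if not single:
        return None, None
    edge = prefix + suffix
    m_like = max(single, key=lambda c: single[c] + edge[c])
    l_like = max((c for c in single if c != m_like),
                 key=lambda c: single[c] - edge[c], default=None)
    return m_like, l_like


words = ["A", "B", "A", "ACD", "DEA", "AEF", "B", "CDA", "B", "FED"]
print(rank_candidates(words))
```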

          On the assumption that related languages use similar words for common
          concepts (much as European languages have father/vater/pater), Bauer
          then used the M and L assignments to search the texts for the expected
          Semitic word for "king" (M-L-K or similar) and "kings" (M-L-K-M or
          similar). Proceeding along these lines, he found the words for "son" and
          the name of the god Ba`al, and so eventually determined the values of
          several other letters. His real insight was to guess that the word for
          axe might occur in the text inscribed on several axes. He turned out to
          be right about that, but chose the wrong phonetic values (he guessed
          G-R-Z-N as in Hebrew; the actual Ugaritic form was the related but not
          identical H-R-S-N). Édouard Dhorme later corrected the reading and
          finished the decipherment. One of the axe inscriptions said, in a
          language related to biblical Hebrew, "Unto the high priest doth this axe
          belong, wherefore shouldst thou keep thy hands off it!" Or something
          like that. It strikes me that Bauer's guess was pretty lucky--I have two
          axes in my garage but have yet to inscribe either with the word "axe."
          But hey, when the high priest tells me, "Inscribe the word 'axe' on this
          axe, chop-chop," I'm not about to wait around for him to axe me politely.
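
          The search for an expected word can be sketched as a pattern match
          against a partial key. The helper names and toy sign strings below
          are hypothetical; "A" and "B" stand for the two signs tentatively
          read as M and L.

```python
def fits(word, pattern, known):
    """True if word could spell pattern under the partial key.

    In pattern, "?" marks a slot that must hold a sign with no value yet.
    """
    if len(word) != len(pattern):
        return False
    for sign, want in zip(word, pattern):
        value = known.get(sign)
        if want == "?":
            if value is not None:
                return False
        elif value != want:
            return False
    return True


def propose_readings(words, pattern, target, known):
    """Return (word, extended_key) for each word fitting pattern, read as target."""
    results = []
    for word in words:
        if fits(word, pattern, known):
            extended = dict(known)
            extended.update(zip(word, target))
            results.append((word, extended))
    return results


known = {"A": "M", "B": "L"}           # assumed starting key
words = ["ABC", "CAB", "BA", "ABD"]    # toy sign strings, not real Ugaritic
for word, key in propose_readings(words, "ML?", "MLK", known):
    print(word, key)
```

          Two rival candidates come back, which is the usual situation: each
          proposed reading has to be tested against further words before it
          can be trusted.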

          Ugaritic isn't the only language to have been deciphered without a
          bilingual. Georg Friedrich Grotefend made considerable progress in
          deciphering Persian cuneiform by looking for and finding proper names of
          Persian emperors known from ancient Greek and Hebrew sources. (Henry
          Rawlinson finished the decipherment in the 1830s.) The point is that
          bilinguals aren't necessary to decipher an unknown script. Still, in the
          case of Ugaritic and Persian, scholars had a pretty good handle on the
          language the script represented before they started work. In the case of
          Etruscan, where the language is largely unknown, complete decipherment
          thus far has eluded us.

          What do we know about the language the Indus script wrote? We can say
          little for certain, but the best guess is that it's a language of the
          Dravidian family, an idea that has been around since at least the 1920s.
          Today most Dravidian speakers live in Sri Lanka and southern India, 800
          miles or more from the Indus valley where the bulk of the Indus
          inscriptions have been found. But about a hundred thousand speakers of
          one Dravidian language, Brahui, live in western Pakistan and neighboring
          parts of Iran and Afghanistan, not too far west of the Indus. Contrary
          to earlier speculation about recent migrations, linguistic and genetic
          analyses show that they have been separated from other Dravidian
          speakers for at least several thousand years. Further evidence that
          Dravidian or related languages were once spoken in the general area
          comes from Linear Elamite inscriptions, found in the ruins of the
          ancient city of Susa in southwestern Iran. The script has been
          deciphered from a phonetic standpoint because of its similarity to
          Mesopotamian cuneiform, but as with Etruscan, the language remains
          largely unknown. A significant percentage of words in Linear Elamite
          appear to be of Dravidian origin, which could mean it is descended from
          a hypothetical Elamo-Dravidian ancestor language, or just that it
          borrowed a lot of words from a Dravidian language spoken nearby. In
          either case, the Elamite connection makes it seem more likely that a
          Dravidian or related language was spoken in the Indus valley when the
          inscriptions were made.

          Many Indian nationalists, and some serious scholars, believe the Indus
          script writes a language of the Indo-Iranian (Aryan) branch of the
          Indo-European family, which includes Farsi (modern Persian), Sanskrit
          and Hindi. All things considered, this seems unlikely. The inscriptions
          go back to about 3200 B.C., which according to mainstream archaeological
          thinking is before any Indo-Europeans had come that far southeast.
          Another problem is that Indo-European peoples kept domesticated horses
          and used chariots and had other cultural traits not shared with the
          ancient Indus civilization. Indeed, according to the mainstream
          thinking, the arrival of the Indo-Europeans in the Indus Valley around
          1800 B.C. is more likely to have been the end of the Harappan culture
          than the beginning of it.

          If the Indus script turns out to write a language that is neither
          Indo-European nor Dravidian (or Elamo-Dravidian), then the chances of
          deciphering it are slim. In the words of Alice Kober, who helped
          decipher Linear B, "an unknown language written in an unknown script
          cannot be deciphered, bilingual or no bilingual." There are really no
          other decent candidates among known languages, so we would be left with
          an unknown language, and the prospects of complete decipherment would be
          as poor as with Etruscan.

          But faint hope is better than none. Sumerian is a linguistic isolate,
          but the script has been phonetically deciphered, and the language partly
          deciphered. Most of the cuneiform scripts of Mesopotamia are direct
          descendants of the Sumerian script, though they're used to write
          unrelated languages. Babylonian and Akkadian and some other languages
          written in these related scripts were amenable to decipherment in part
          because they were members of the well understood Semitic family. The
          similarity of the scripts, the many Sumerian loanwords in these Semitic
          languages, and the unusually large number of bilingual texts have
          allowed scholars to reconstruct the Sumerian language with considerable
          success despite its being unrelated to any known language. No such
          combination of circumstances exists for the Indus script, and no
          discoveries along these lines are seriously expected.

          What will we get if the Indus script is finally deciphered--great
          historical works that reveal the local political situation 5,000 years
          ago? Classic works of literature like the Egyptian Book of the Dead or
          the Mesopotamian epic of Gilgamesh? Insight into ancient religious
          practices of the sort revealed by Ugaritic? No to all the above. The sad
          truth is that the longest known Indus inscription is only 17 symbols
          long. The bulk of the 4,000 or so Indus inscriptions are believed to be
          simple identifying marks. Most of the inscriptions are on seals or seal
          impressions, similar to signet rings or rubber stamps. So even if we
          decipher the script and the language, chances are we'll discover they
          say nothing more fascinating than "government property" or "John Smith"
          or "tax paid." As with the revelation that Linear B wrote an archaic
          form of Greek, if the Indus script is deciphered, the most interesting
          fact learned will be what language the ancient script wrote--that is, if
          it writes a language at all.

          If it writes a language? They wouldn't call it the "Indus script" if it
          weren't a script, would they? Don't be so sure. When the first
          inscriptions were discovered in the 1870s in and around the Indus valley
          of Pakistan, and when the early cities of Harappa and Mohenjo-Daro were
          excavated in the 1920s, archaeologists assumed that civilization and
          writing always went together--a complex urban culture couldn't possibly
          develop without writing. The Indus sites were urban; ergo, the
          inscriptions were writing.

          Today we recognize that civilization and writing don't always go
          together. The Inca empire, for example, was urban but lacked true
          writing. Historian Steve Farmer now questions the assumption that the
          Indus script is true writing. In a recent paper, he and two linguists
          compare the Indus script with medieval European heraldry. Like heraldry,
          they say, the Indus script may consist of discrete conventional elements
          that serve as identification marks but don't encode a spoken language.

          This controversial idea has some points in its favor. Considering the
          corpus of texts as a whole, there's a considerable amount of repetition
          among symbols, as would be expected if they wrote a spoken language. But
          there's less repetition than expected within the texts, even considering
          their brevity. Further, several systems of pictograms from around the
          world--for example, the Vinca signs of southeastern Europe, written
          about 4000 B.C.--resemble the Indus script in their use of conventional
          symbols, but nobody believes they encode a spoken language.

          Traditionalists have some points in their favor too. The Indus script
          was linear, that is, usually written with symbols following one another
          in a line, rather than being placed randomly or in some other geometric
          pattern. Linearity is found in most writing, though not exclusively so.
          More to the point, the characters often crowd at the end of a line, as
          if the writer wanted to avoid breaking up a word. This is a distinctive
          feature of true writing. The comparison with heraldry may not hold water
          either. Hittite hieroglyphics were initially considered heraldry by
          serious linguists but were eventually found to be true writing and
          deciphered. Much the same has been said about many other undeciphered
          scripts likewise shown to be true writing.

          Still, Farmer feels so strongly that the Indus script is not a real
          script that he has offered a $10,000 reward for proof that it is true
          writing. He will accept as proof an authenticated inscription more than
          50 symbols long. Farmer thinks the extant texts are all so short because
          they don't write a language. The pro-language side thinks the longer
          texts once produced in Harappa and other cities have been lost because
          they were written on perishable surfaces. Certainly a long text would be
          a great gift to modern science. I just wish they wouldn't use the lame
          excuse that they couldn't give it to us because they ran out of Harappan
          paper.

          Further reading

          Lost Languages: The Enigma of the World's Undeciphered Scripts by Andrew
          Robinson, 2002

          The Story of Decipherment: From Egyptian Hieroglyphs to Maya Script by
          Maurice Pope, revised edition, 1999

          "The Collapse of the Indus-Script Thesis: The Myth of a Literate
          Harappan Civilization" by Steve Farmer, Richard Sproat, and Michael
          Witzel in Electronic Journal of Vedic Studies, Dec. 13, 2004. This and
          related items can be accessed from Steve Farmer's download page at
          www.safarmer.com/downloads/.

          --SDSTAFF bibliophage
          Straight Dope Science Advisory Board

        • Peter T. Daniels
          Message 4 of 5 , May 19, 2005
            Cecil's staff got it almost entirely right. The one important point
            about the Rosetta Stone that they missed is that the Rosetta Stone by
            itself was not sufficient for deciphering the hieroglyphs, because (by
            accident) only one pharaoh's name (Ptolemy) was preserved -- only when
            Champollion got hold of Cleopatra's cartouche (probably provided to him
            by Young) was he able to cross-check which symbols went with which
            consonant sounds. Thereafter, all he needed to do was apply his
            knowledge of Coptic.

            (The great surprise was that not only Greek names -- Ptolemy, Cleopatra
            -- could be written phonetically, but even Egyptian ones: Rameses was
            the first to appear, since he knew the Coptic for 'sun' and had the <m>
            from <Ptolemy>.)
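
            The cross-check can be sketched as follows. This is a toy only: the
            sign IDs (s1, s2, ...) are placeholders rather than real
            hieroglyphs, the one-sign-per-letter alignment against the modern
            spellings is a deliberate simplification, and the helper names are
            hypothetical.

```python
def derive_key(signs, sounds):
    """Map each sign to a sound, assuming signs and sounds align one-to-one."""
    if len(signs) != len(sounds):
        raise ValueError("sign and sound sequences must have equal length")
    key = {}
    for sign, sound in zip(signs, sounds):
        if key.setdefault(sign, sound) != sound:
            raise ValueError(f"sign {sign!r} would need two values")
    return key


def cross_check(key_a, key_b):
    """Split the signs shared by two keys into agreeing and conflicting ones."""
    shared = key_a.keys() & key_b.keys()
    agree = {s: key_a[s] for s in shared if key_a[s] == key_b[s]}
    conflict = {s: (key_a[s], key_b[s]) for s in shared if key_a[s] != key_b[s]}
    return agree, conflict


# Placeholder sign sequences for the two names (invented for illustration).
ptolemy = derive_key(["s1", "s2", "s3", "s4", "s5", "s6", "s7"], "PTOLEMY")
cleopatra = derive_key(["s8", "s4", "s5", "s3", "s1", "s9", "s2", "s10", "s9"], "CLEOPATRA")
agree, conflict = cross_check(ptolemy, cleopatra)
print(sorted(agree.items()), conflict)
```

            With one name there is nothing to check a proposed value against;
            with two, every shared sign either confirms or contradicts it.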
            --
            Peter T. Daniels grammatim@...