Re: [Synoptic-L] counts of word tokens

  • David Mealand
    May 8, 2013
      Verses are too variable a unit to count by. There is reasonable
      agreement on word-token counts for whole books, so token counts
      are what is needed for estimating the size of what is found
      only in one text or in any of the pairs in question.

      The slight variability in word counts is discussed in Kenny,
      Stylometric Study 14-15: the differences between the counts he
      cites are something like 26 in a text of c. 6800 words, i.e.
      well under half a percent. They are due to factors such as
      (a) text-critical decisions and (b) word-division decisions.

      There will be larger differences in estimating words shared by
      a pair of texts. Does one count only words identical in form,
      or include a word as present in both, even if it has one case
      ending in one text and a different one in the other? Ditto
      with verb tenses and endings, etc.
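
To make the two counting conventions concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the word lists are invented, transliterated toy data (not quotations from any Gospel), the lemma table is a hand-made stand-in for a real lemmatised text, and `shared_tokens` is a hypothetical helper, not a method taken from the literature cited in this thread.

```python
from collections import Counter

def shared_tokens(a, b, key=lambda w: w):
    """Count tokens common to both lists (multiset intersection),
    after mapping every token through `key`."""
    count_a = Counter(key(w) for w in a)
    count_b = Counter(key(w) for w in b)
    # Counter & Counter keeps the minimum count of each shared item.
    return sum((count_a & count_b).values())

# Hypothetical toy lemma table: inflected form -> dictionary form.
LEMMAS = {
    "logon": "logos",   # accusative -> nominative
    "ton": "ho",        # article, accusative -> nominative
    "legei": "lego",    # present tense form
    "elegen": "lego",   # imperfect tense form
}

def lemma(word):
    # Forms missing from the table are treated as their own lemma.
    return LEMMAS.get(word, word)

# Invented word lists, for illustration only.
text_a = ["kai", "legei", "ton", "logon"]
text_b = ["kai", "elegen", "ho", "logos"]

exact = shared_tokens(text_a, text_b)                 # identical forms only
by_lemma = shared_tokens(text_a, text_b, key=lemma)   # same word, any ending

print("shared (identical form):", exact)
print("shared (same lemma):    ", by_lemma)
```

On this toy input the strict count is 1 and the lemma-based count is 4, which is the kind of gap between the two conventions described above.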

      An article by John Poirier (in CBR 2008) charts a long series
      of published items on Synoptic analysis and shows how people
      have wrestled with how to count what is shared between two or
      more texts containing similar material.
      You will find other recent items here on the Birkbeck Uni site:
      http://www.ems.bbk.ac.uk/faculty/abakuks/synoptic

      David M.

      ---------
      David Mealand, University of Edinburgh
