Re: Online Moten Dictionary

  • Christophe Grandsire-Koevoets
    Message 1 of 19 , Mar 6, 2013
      On 6 March 2013 19:27, BPJ <bpj@...> wrote:

      >
      > Provided that Toolbox's data file is just a plain text file
      > in Shoebox format you might find this olde dogge, which parses
      > such files, useful:
      >
      > <https://metacpan.org/release/Text-Shoebox>
      >
      >
      Toolbox's files are plain text, but I don't know if they have a similar
      format as Shoebox. I'll look into it. Thanks for the link!


      > You would have to write routines to go through the parsed
      > data[^1] and format it, but you could target LaTeX!
      >
      > [^1]: the object-oriented helper modules are really helpful,
      > although they have a slightly arcane method set.
      >
      >
      My programming skills are rusty, but that may be a good opportunity to
      revive them. It may be useful for other people as well.

      > I have great trouble distinguishing | from l in the dictionary,
      > as they are the same height in the font used. A font where |
      > extends well below the baseline, or with heavier serifs, would be
      > preferable


      Good point! I'll experiment with other fonts. The template used by Toolbox
      uses styles throughout, so it shouldn't be too difficult. I'm not too happy
      with Times New Roman's look anyway.


      > (ĺ ń ś ź or ḷ ṇ ṣ ẓ would be even preferabler,
      > but something tells me you won't go for that! ;-)
      >
      >
      Indeed not! While I have nothing against diacritics, the pipe has been part
      of Moten's typographic identity since the very beginning, and comes closest
      to the way I actually write Moten with pen and paper (which predates my
      first foray on the Internet!). I am not going to throw away 20 years of
      history if I can help it :P .
      The alternative would be to create a special font with the correct shapes
      for these characters (maybe using ligatures for ease of typing). But it
      seems a lot of work for just four characters (eight if we count the capital
      letters).
      --
      Christophe Grandsire-Koevoets.

      http://christophoronomicon.blogspot.com/
      http://www.christophoronomicon.nl/
    • BPJ
      Message 2 of 19 , Mar 7, 2013
        Den torsdagen den 7:e mars 2013 skrev Christophe Grandsire-Koevoets:

        > On 6 March 2013 19:27, BPJ <bpj@...> wrote:
        >
        > >
        > > Provided that Toolbox's data file is just a plain text file
        > > in Shoebox format you might find this olde dogge, which parses
        > > such files, useful:
        > >
        > > <https://metacpan.org/release/Text-Shoebox>
        > >
        > >
        > Toolbox's files are plain text, but I don't know if they have a similar
        > format as Shoebox. I'll look into it. Thanks for the link!
        >
        As far as I could gauge from information found on the SIL website they are
        basically the same. I used Shoebox once upon a time but have been keeping
        my vocabularies in CSV files for years now.

        >
        > > You would have to write routines to go through the parsed
        > > data[^1] and format it, but you could target LaTeX!
        > >
        > > [^1]: the object-oriented helper modules are really helpful,
        > > although they have a slightly arcane method set.
        > >
        > >
        > My programming skills are rusty, but that may be a good opportunity to
        > revive them. It may be useful for other people as well.
        >
        I looked at the code for that parser and it wasn't that complicated. I got
        a bit of an itch to rewrite it -- easy to resist since I don't have any
        currently relevant data set of my own and many other things including Real
        Work(TM) on my hands. The main disadvantage with the format is that keys
        aren't required to be unique and that the order of items within an entry
        may be significant since this means you can't just slurp each entry into an
        associative array. With CSV I can do that although I often want more than
        two dimensions. In practice I often have semicolon-separated subfields or
        even compact YAML fragments within a field.
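
The point about repeated keys and significant order can be sketched in Python. This is only an illustration under stated assumptions: it assumes the common backslash-marker layout (`\marker value`, one field per line, each record starting with a record marker such as `\lx`), the function name `parse_entries` is invented here, and the sample entry is made up rather than taken from any real dictionary.

```python
from typing import List, Tuple

Field = Tuple[str, str]   # (marker, value)
Entry = List[Field]       # ordered list; the same marker may appear twice


def parse_entries(text: str, record_marker: str = "lx") -> List[Entry]:
    """Split backslash-marker text into entries, keeping order and duplicates."""
    entries: List[Entry] = []
    current: Entry = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue  # blank lines carry no data in this sketch
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            # A new record marker closes the entry collected so far.
            if marker == record_marker and current:
                entries.append(current)
                current = []
            current.append((marker, value.strip()))
        elif current:
            # Continuation line: append it to the previous field's value.
            prev_marker, prev_value = current[-1]
            current[-1] = (prev_marker, (prev_value + " " + line.strip()).strip())
    if current:
        entries.append(current)
    return entries


# Made-up sample: one headword with two senses, so "ps" and "ge" repeat.
SAMPLE = """\\lx example
\\ps n
\\ge house
\\ps v
\\ge to dwell
"""

entries = parse_entries(SAMPLE)
```

Converting such an entry with `dict(entries[0])` keeps only the last `ps` and `ge` values, which is the loss described above; a list of pairs preserves both senses and their order.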



        > > I have great trouble distinguishing | from l in the dictionary,
        > > as they are the same height in the font used. A font where |
        > > extends well below the baseline, or with heavier serifs, would be
        > > preferable
        >
        >
        > Good point! I'll experiment with other fonts. The template used by Toolbox
        > uses styles throughout, so it shouldn't be too difficult. I'm not too happy
        > with Times New Roman's look anyway.
        >
        >
        It does have its uses though when you need to use narrow columns. Perhaps
        you could take some font with a suitable license and adjust the pipe
        character!



        >
        > > (ĺ ń ś ź or ḷ ṇ ṣ ẓ would be even preferabler,
        > > but something tells me you won't go for that! ;-)
        > >
        > >
        > Indeed not! While I have nothing against diacritics, the pipe has been part
        > of Moten's typographic identity since the very beginning, and comes closest
        > to the way I actually write Moten with pen and paper (which predates my
        > first foray on the Internet!). I am not going to throw away 20 years of
        > history if I can help it :P .


        I know the feeling! I changed the transcription of Sohlob once when going
        from ASCII to Latin-1 -- which then only meant to replace some unambiguous
        digraphs tj sj dj ae with c ç j æ -- but I won't 'remedy' the digraphs that
        remain, especially since ny ng ngg for /J N Ng/ and hl hr hm hn hng for
        voiceless liquids and nasals are pretty intuitive. Rather the problem is
        that c ç for /ts\ s\/ aren't intuitive for most people!

        /bpj



        > The alternative would be to create a special font with the correct shapes
        > for these characters (maybe using ligatures for ease of typing). But it
        > seems a lot of work for just four characters (eight if we count the capital
        > letters).
        > --
        > Christophe Grandsire-Koevoets.
        >
        > http://christophoronomicon.blogspot.com/
        > http://www.christophoronomicon.nl/
        >
      • Christophe Grandsire-Koevoets
        Message 3 of 19 , Mar 7, 2013
          On 7 March 2013 10:43, BPJ <bpj@...> wrote:

          > As far as I could gauge from information found on the SIL website they
          > are basically the same. I used Shoebox once upon a time but have been
          > keeping my vocabularies in CSV files for years now.
          >
          >
          I like Toolbox because it keeps things tidy, automatically sorts entries
          (with the correct alphabet order, despite my use of weird letters :) ),
          while the back-end is still just plain text files. It lowers the overhead a
          lot.


          > > My programming skills are rusty, but that may be a good opportunity to
          > > revive them. It may be useful for other people as well.
          >
          > I looked at the code for that parser and it wasn't that complicated. I got
          > a bit of an itch to rewrite it -- easy to resist since I don't have any
          > currently relevant data set of my own and many other things including Real
          > Work(TM) on my hands.


          I've actually discovered a PDF entitled "From Toolbox to LaTeX", with a
          link to a Perl script and a LaTeX style that claim to do exactly what I
          want. It's at: http://www.zas.gwz-berlin.de/uploads/media/tb-to-tex.pdf
          I've downloaded the scripts, and it seems that they could be useful as
          starting point, but there's a lot of work needed before either can be used
          with my dictionary. You're welcome to scratch your itch on those if you
          want (my Perl skills are basically non-existent. I'm more of a Ruby guy
          myself).


          > The main disadvantage with the format is that keys
          > aren't required to be unique and that the order of items within an entry
          > may be significant since this means you can't just slurp each entry into an
          > associative array. With CSV I can do that although I often want more than
          > two dimensions. In practice I often have semicolon-separated subfields or
          > even compact YAML fragments within a field.
          >
          >
          Since I started using Toolbox I haven't looked back. So far it's better for
          my own use than any alternative I've come across.


          > It does have its uses though when you need to use narrow columns. Perhaps
          > you could take some font with a suitable license and adjust the pipe
          > character!
          >
          >
          >
          Unfortunately, the only computer I have access to on which I can use Word
          is locked down, and I can't install fonts on it. So I'll just have to make
          do with the fonts already installed there.

          That's why I'm so keen on a LaTeX solution. I could then use XeLaTeX on my
          home computer and use any font I want to.


          >
          >
          > I know the feeling! I changed the transcription of Sohlob once when going
          > from ASCII to Latin-1 -- which then only meant to replace some unambiguous
          > digraphs tj sj dj ae with c ç j æ -- but I won't 'remedy' the digraphs that
          > remain, especially since ny ng ngg for /J N Ng/ and hl hr hm hn hng for
          > voiceless liquids and nasals are pretty intuitive. Rather the problem is
          > that c ç for /ts\ s\/ aren't intuitive for most people!
          >
          >
          In the past I've looked at alternatives for |l, |n, |s and |z, but I could
          never find anything pleasing. It's important that those four characters
          should keep a connection (the phonemes they represent behave in the same
          way in some environments, and differently from any other consonant), but
          I've never found anything that worked across the board. Diacritics just
          look wonky on _l_. Using unused letters of the alphabet would be
          unintuitive (how do I represent /ʎ/ when the only letters available are c,
          h, q, r, w, x and y? Or /ɲ/ for that matter?). Replacing the | with an
          unused letter would result in something not unlike the x-notation of
          Esperanto (which was useful in the days before Unicode, but was ugly as
          hell!), also not especially intuitive. I toyed with doubling letters (based
          on Castilian Spanish's _l_ /l/ vs. _ll_ /ʎ/), which would have been
          unambiguous since Moten phonotactics don't allow doubled or long phonemes.
          But I'm not sure how intuitive _nn_ for /ɲ/, _ss_ for /ts/ and _zz_ for
          /dz/ would be (I'd rather have people plainly not knowing how to pronounce
          words rather than people *thinking* they know how to pronounce words and
          doing it wrong). Also, it would interfere with writing down interjections in
          the Moten script (interjections routinely break Moten phonotactics and
          allow long and doubled phonemes, and I wanted to be able to mark those with
          double letters).

          In the end, I decided to stick with the pipe. It may be a weird choice, but
          it works for me, and it now *feels* like part of Moten's identity. It gives
          it a unique look on the page at least :) .
          --
          Christophe Grandsire-Koevoets.

          http://christophoronomicon.blogspot.com/
          http://www.christophoronomicon.nl/
        • A. da Mek
          Message 4 of 19 , Mar 7, 2013
            > In the past I've looked at alternatives for |l, |n, |s and |z, but I could
            > never find anything pleasing. It's important that those four characters
            > should keep a connection (the phonemes they represent behave in the same
            > way in some environments, and differently from any other consonant),

            > Using unused letters of the alphabet would be
            > unintuitive (how do I represent /ʎ/ when the only letters available are c,
            > h, q, r, w, x and y? Or /ɲ/ for that matter?)

            ly and ny seem to me intuitive enough for palatals; these digraphs are used
            for example in Hungarian.
            But if the palatals and affricates shall be marked alike, then anything
            intuitive for the first pair will be unintuitive for the second pair.
            Maybe l̉ n̉ s̉ z̉ could work; this hook may recall an apostrophe which is
            sometimes used to mark palatalisation and sometimes for a glottal stop,
            which in a combination with a fricative could suggest the plosive onset of
            an affricate.
            And it is one of the five diacritic modifiers which are on my computer
            available in the usual fonts (Times, Courier, Arial):
            ò grave
            ó acute (Both are available precomposed on most letters anyway, but
            the modifiers can be useful for combining two diacritics, such as
            an accented long vowel ṓ.)
            õ tilde (Useful if there is no precomposed ẽ, and also if you want
            to write g̃ instead of ŋ)
            ỏ hook
            ọ dot under

            > In the end, I decided to stick with the pipe.

            There is one disadvantage of non-letter characters: Google does not
            recognize such a string as one word.
            I considered using <è èh àh òh> instead of <¨ ¨h ªh ºh> which I am now
            writing for [?], [h], [X\] and [?\], but a text with è used as a consonant
            looks like a file with č written in the Central European codepage and then
            misinterpreted as the Western codepage.
            It is difficult to find some letter for the glottal stop. The letter з may
            recall the glyph used by Egyptologists to transcribe the glottal stop, but
            most people would probably read з either as [Z] or [dz)].
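
The codepage remark can be reproduced in a few lines of Python. The message does not name the codepages, so this sketch assumes "Central European" means Windows-1250 (or ISO 8859-2) and "Western" means Windows-1252 (or Latin-1); the helper name `misread` is invented for illustration.

```python
def misread(text: str, written_as: str = "cp1250", read_as: str = "cp1252") -> str:
    """Encode text in one single-byte codepage and decode it in another."""
    return text.encode(written_as).decode(read_as)


# Both Central European encodings store c-with-caron as byte 0xE8,
# and both Western encodings display byte 0xE8 as e-with-grave.
windows_case = misread("\u010d")                         # cp1250 -> cp1252
iso_case = misread("\u010d", "iso8859_2", "latin_1")     # ISO 8859-2 -> Latin-1
```

In both cases the result is è, which is why a text using è as a consonant resembles a mis-decoded Central European file.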
          • BPJ
            Message 5 of 19 , Mar 7, 2013
              On 2013-03-07 11:33, Christophe Grandsire-Koevoets wrote:
              > On 7 March 2013 10:43, BPJ <bpj@...> wrote:
              >
              >> As far as I could gauge from information found on
              >> the SIL website they are basically the same. I used
              >> Shoebox once upon a time but have been keeping my
              >> vocabularies in CSV files for years now.
              >>
              >>
              > I like Toolbox because it keeps things tidy,
              > automatically sorts entries (with the correct
              > alphabet order, despite my use of weird letters :) ),
              > while the back-end is still just plain text files. It
              > lowers the overhead a lot.

              Too much mouse-pointing and menu-mucking for my taste.
              I'm basically a plain-text guy. Besides I can sort any
              which way me pleases (almost) from within perl:

              <https://metacpan.org/module/Sort::ArbBiLex>

              You'll notice it's by the same author. He used to be
              *the* linguist on the CPAN. BTW I have an object-
              oriented wrapper around this module which allows you to
              define a sort-key generating function, chuse the
              normalization form to use during sort[^1], return
              objects which can tell their string value as well as
              their sort key and their family rather than as plain
              strings -- although they stringify to their string
              values! -- get the entries as a list of lists, one for
              each family, and give families arbitrary names like
              "digits". It's in need of documentation, but if anyone
              is interested I might get around to writing that
              documentation. It was part of a project to write a
              Unicode-aware drop-in replacement for makeindex with
              support for arbitrary sort orders. Maybe one day...

              [^1]: It's a good idea to use NFD during sorting
              because then letters with unforeseen decomposable
              diacritics get sorted under their base letter rather
              than just ignored!
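
A sort-key function of the kind described, together with the NFD footnote, might look roughly like the Python sketch below. It does not use or reproduce Sort::ArbBiLex or the wrapper mentioned above. The alphabet is a hypothetical fragment chosen for illustration (it is not Moten's actual letter order), and `sort_key` is an invented name.

```python
import unicodedata
from typing import List, Tuple

# Hypothetical alphabet: "|l", "|n", "|s", "|z" each count as ONE letter.
ALPHABET = ["a", "b", "d", "e", "l", "|l", "n", "|n", "s", "|s", "z", "|z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
# Try longer letters first so "|l" is not read as "|" followed by "l".
BY_LENGTH = sorted(ALPHABET, key=len, reverse=True)


def sort_key(word: str, form: str = "NFD") -> Tuple[int, ...]:
    """Map a word to a tuple of letter ranks in the custom alphabet."""
    text = unicodedata.normalize(form, word.lower())
    key: List[int] = []
    i = 0
    while i < len(text):
        for letter in BY_LENGTH:
            if text.startswith(letter, i):
                key.append(RANK[letter])
                i += len(letter)
                break
        else:
            # Character outside the alphabet (e.g. a combining mark): skip it.
            i += 1
    return tuple(key)


words = ["|la", "le", "la", "za", "|sa"]
custom_order = sorted(words, key=sort_key)
```

With NFD, an unforeseen precomposed letter such as é decomposes into e plus a combining accent, so only the accent is skipped and the word still files under e. With `form="NFC"` the whole é falls outside the alphabet and is dropped from the key, which is the "just ignored" case the footnote warns about. Words that differ only in skipped characters get equal keys here; a fuller implementation would add a tie-breaker.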

              >
              >
              >>> My programming skills are rusty, but that may be a
              >>> good opportunity to revive them. It may be useful
              >>> for other people as well.
              >>
              >> I looked at the code for that parser and it wasn't
              >> that complicated. I got a bit of an itch to rewrite
              >> it -- easy to resist since I don't have any currently
              >> relevant data set of my own and many other things
              >> including Real Work(TM) on my hands.
              >
              >
              > I've actually discovered a PDF entitled "From Toolbox
              > to LaTeX", with a link to a Perl script and a LaTeX
              > style that claim to do exactly what I want. It's at:
              > http://www.zas.gwz-berlin.de/uploads/media/tb-to-tex.pdf
              > I've downloaded the scripts, and it seems
              > that they could be useful as starting point, but
              > there's a lot of work needed before either can be
              > used with my dictionary. You're welcome to scratch
              > your itch on those if you want (my Perl skills are
              > basically non-existent. I'm more of a Ruby guy
              > myself).

              Sorry to say but there was a bug which would cause it
              not to compile right on line 11! Also it's quite
              ancient from days before perl was unicode-aware or
              before XeTeX was around! Anyway my itch got kinda
              piqued, so maybe I'll look into it once my current
              commission is done in a couple of weeks. I'm unlikely
              to get a new commission right away anyway.

              Anyway you could probably write something in Ruby to
              get your database into a datastructure. The parsing
              code in Text::Shoebox isn't exactly complicated, though
              it too shows its age.

              > Unfortunately, the only computer I have access to on
              > which I can use Word is locked down, and I can't
              > install fonts on it. So I'll just have to make do
              > with the fonts already installed there.

              I'm not surprised! I ditched MSW in both senses years
              ago and haven't looked back. That's part of why I'm
              reluctant to use Toolbox even under wine.

              > That's why I'm so keen on a LaTeX solution. I could
              > then use XeLaTeX on my home computer and use any font
              > I want to.

              I hardly ever use a WYSIWYG WP program willingly any
              more; it's vim, pandoc and XeLaTeX all over the place.
              (*un*willing = paid work is another matter. Luckily
              OpenOffice/LibreOffice can open most anything they
              throw at me -- usually .doc(x)!)

              >> I know the feeling! I changed the transcription of
              >> Sohlob once when going from ASCII to Latin-1 --
              >> which then only meant to replace some unambiguous
              >> digraphs tj sj dj ae with c ç j æ -- but I won't
              >> 'remedy' the digraphs that remain, especially since
              >> ny ng ngg for /J N Ng/ and hl hr hm hn hng for
              >> voiceless liquids and nasals are pretty intuitive.
              >> Rather the problem is that c ç for /ts\ s\/ aren't
              >> intuitive for most people!
              >>
              >>
              > In the past I've looked at alternatives for |l, |n,
              > |s and |z, but I could never find anything pleasing.
              > It's important that those four characters should keep
              > a connection (the phonemes they represent behave in
              > the same way in some environments, and differently
              > from any other consonant), but I've never found
              > anything that worked across the board. Diacritics
              > just look wonky on _l_.

              At least unless they go below, and Unicode doesn't
              offer much in that department, pre-composedwise. In
              fact acute and underdot are the only ones which come
              with all your four letters, and ś/ź and ṣ/ẓ look wonky
              for affricates! At a pinch I'd probably use ḷṅṡż
              -- after all the dot is the mother of all diacritics
              and it would be justified to exceptionally put it
              below l.

              I now see that the latest Unicode has some new offerings
              which might give you a reasonable set which recalls the pipe:

              0141 LATIN CAPITAL LETTER L WITH STROKE

              0142 LATIN SMALL LETTER L WITH STROKE

              A7A4 LATIN CAPITAL LETTER N WITH OBLIQUE STROKE

              A7A5 LATIN SMALL LETTER N WITH OBLIQUE STROKE

              A7A8 LATIN CAPITAL LETTER S WITH OBLIQUE STROKE

              A7A9 LATIN SMALL LETTER S WITH OBLIQUE STROKE

              01B5 LATIN CAPITAL LETTER Z WITH STROKE

              01B6 LATIN SMALL LETTER Z WITH STROKE
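
For anyone wanting to try these characters out, here is a small Python sketch that looks up the eight code points listed above and swaps them in for pipe digraphs. It assumes the capital forms are typed as `|L`, `|N`, `|S`, `|Z`, which the thread does not confirm; `pipe_to_stroke` is an invented helper name.

```python
import unicodedata

# Pipe form -> stroked letter, using the code points from the list above.
# The capital pipe forms ("|L" etc.) are an assumption of this sketch.
STROKE_LETTERS = {
    "|L": "\u0141", "|l": "\u0142",
    "|N": "\uA7A4", "|n": "\uA7A5",
    "|S": "\uA7A8", "|s": "\uA7A9",
    "|Z": "\u01B5", "|z": "\u01B6",
}


def pipe_to_stroke(text: str) -> str:
    """Replace each pipe+letter pair with the corresponding stroked letter."""
    for pipe_form, stroke in STROKE_LETTERS.items():
        text = text.replace(pipe_form, stroke)
    return text


# Print code point and official Unicode name for each replacement.
for pipe_form, stroke in STROKE_LETTERS.items():
    print(pipe_form, "U+%04X" % ord(stroke), unicodedata.name(stroke))
```

Whether the result is readable depends entirely on the font: the A7A4 to A7A9 letters are recent additions and many fonts lack them.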

              The s with oblique stroke glyphs I've seen so far
              are way too similar to a digit 8 though! There is
              also your very good point that

              > I'd rather have
              > people plainly not knowing how to pronounce words
              > rather than people *thinking* they know how to
              > pronounce words and doing it wrong.

              Very good point indeed!

              But then I suppose you should replace j with y!

              And I suppose you know that based on your own
              description of Moten morphophonology and spelling
              lj nj ts dz would be perfectly unambiguous!

              >
              > In the end, I decided to stick with the pipe. It may
              > be a weird choice, but it works for me, and it now
              > *feels* like part of Moten's identity. It gives it a
              > unique look on the page at least :) .

              I can't blame you. Back in typewriter days I used
              overstruck slash with impunity! :-)

              On 2013-03-07 14:02, A. da Mek wrote:
              > There is one disadvantage of non-letter characters: Google does
              > not recognize such a string as one word.

              Well Christophe could always (ab)use 01C0 LATIN LETTER
              DENTAL CLICK! ;-)
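
The difference between the two characters shows up in their Unicode general categories, which is what many tokenizers go by. The Python sketch below uses the regex class `\w` as a stand-in for "counts as part of a word"; it says nothing about how Google itself splits text, and the sample strings are placeholders.

```python
import re
import unicodedata

PIPE = "|"          # U+007C VERTICAL LINE
CLICK = "\u01C0"    # U+01C0 LATIN LETTER DENTAL CLICK


def word_tokens(text: str):
    """Split text into runs of word characters, as a simple tokenizer would."""
    return re.findall(r"\w+", text)


pipe_category = unicodedata.category(PIPE)     # a symbol category
click_category = unicodedata.category(CLICK)   # a letter category

with_pipe = word_tokens("a" + PIPE + "la")     # pipe breaks the word in two
with_click = word_tokens("a" + CLICK + "la")   # click letter keeps it whole
```

The pipe is classed as a math symbol (`Sm`), so it splits the string, while the click letter is classed as a letter (`Lo`) and stays inside the word.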

              /bpj
            • Christophe Grandsire-Koevoets
              Message 6 of 19 , Mar 8, 2013
                On 7 March 2013 14:02, A. da Mek <a.da_mek0@...> wrote:

                > ly and ny seem to me intuitive enough for palatals; these digraphs are used
                > for example in Hungarian.
                >

                True enough.


                > But if the palatals and affricates shall be marked alike, then anything
                > intuitive for the first pair will be unintuitive for the second pair.
                >

                Exactly! Which is why I stuck with the pipe.


                > Maybe l̉ n̉ s̉ z̉ could work; this hook may recall an apostrophe which is
                > sometimes used to mark palatalisation and sometimes for a glottal stop,
                > which in a combination with a fricative could suggest the plosive onset of
                > an affricate.
                >

                Gah! l̉ is ugly!


                >> In the end, I decided to stick with the pipe.
                >
                > There is one disadvantage of non-letter characters: Google does not
                > recognize such a string as one word.
                >

                It's Google's loss, not mine :P .


                > I considered using <è èh àh òh> instead of <¨ ¨h ªh ºh> which I am now
                > writing for [?], [h], [X\] and [?\], but a text with è used as a consonant
                > looks like a file with č written in the Central European codepage and then
                > misinterpreted as the Western codepage.
                > It is difficult to find some letter for the glottal stop. The letter з may
                > recall the glyph used by Egyptologists to transcribe the glottal stop, but
                > most people would probably read з either as [Z] or [dz)].
                >

                I would personally read it as a vowel, but that's me :P .

                On 7 March 2013 17:49, BPJ <bpj@...> wrote:

                >
                > Too much mouse-pointing and menu-mucking for my taste.
                >

                Toolbox has lots of keyboard shortcuts :P .


                > I'm basically a plain-text guy. Besides I can sort any
                > which way me pleases (almost) from within perl:
                >
                >
                I'm more a GUI person myself. I like using the keyboard as much as
                possible, but love to be able to fall back on using the mouse if I forget
                commands.


                > <https://metacpan.org/module/Sort::ArbBiLex>
                >
                > You'll notice it's by the same author. He used to be
                > *the* linguist on the CPAN. BTW I have an object-
                > oriented wrapper around this module which allows you to
                > define a sort-key generating function, chuse the
                > normalization form to use during sort[^1], return
                > objects which can tell their string value as well as
                > their sort key and their family rather than as plain
                > strings -- although they stringify to their string
                > values! -- get the entries as a list of lists, one for
                > each family, and give families arbitrary names like
                > "digits". It's in need of documentation, but if anyone
                > is interested I might get around to writing that
                > documentation. It was part of a project to write a
                > Unicode-aware drop-in replacement for makeindex with
                > support for arbitrary sort orders. Maybe one day...
                >
                > [^1]: It's a good idea to use NFD during sorting
                > because then letters with unforeseen decomposable
                > diacritics get sorted under their base letter rather
                > than just ignored!
                >
                >
                See, that's how rusty my programming skills are: I understand everything
                you write, but have no idea how I'd go about implementing it myself! I
                haven't done any serious programming for more than 10 years, and I wasn't
                that great to begin with...


                >
                >> I've actually discovered a PDF entitled "From Toolbox
                >> to LaTeX", with a link to a Perl script and a LaTeX
                >> style that claim to do exactly what I want. It's at:
                >> http://www.zas.gwz-berlin.de/uploads/media/tb-to-tex.pdf
                >> I've downloaded the scripts, and it seems
                >> that they could be useful as starting point, but
                >> there's a lot of work needed before either can be
                >> used with my dictionary. You're welcome to scratch
                >> your itch on those if you want (my Perl skills are
                >> basically non-existent. I'm more of a Ruby guy
                >> myself).
                >>
                >
                > Sorry to say but there was a bug which would cause it
                > not to compile right on line 11! Also it's quite
                > ancient from days before perl was unicode-aware or
                > before XeTeX was around! Anyway my itch got kinda
                > piqued, so maybe I'll look into it once my current
                > commission is done in a couple of weeks. I'm unlikely
                > to get a new commission right away anyway.
                >
                >
                See, I would never have found that bug. Perl is not really my forte.


                > Anyway you could probably write something in Ruby to
                > get your database into a datastructure. The parsing
                > code in Text::Shoebox isn't exactly complicated, though
                > it too shows its age.
                >
                >
                The parsing isn't what I fear most. It's the next step, converting the data
                into a useful XeLaTeX file. I'm currently looking at bilingual dictionaries
                typeset in LaTeX to see how I could create a template. My LaTeX programming
                skills are *very* rusty, so I'd rather not have to create my own styles :P .
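
For the conversion step, a very small Python sketch is given below. It assumes entries have already been parsed into ordered (marker, value) pairs, that `lx`, `ps` and `ge` hold headword, part of speech and English gloss, and that bold/italic/plain is an acceptable layout; all of these are assumptions for illustration, and the names `escape`, `TEMPLATES` and `entry_to_latex` are invented here, not taken from the tb-to-tex scripts.

```python
from typing import List, Tuple

Entry = List[Tuple[str, str]]  # ordered (marker, value) pairs

# Characters that need escaping in ordinary LaTeX text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

# Assumed marker-to-markup mapping; markers not listed are left out.
TEMPLATES = {"lx": r"\textbf{%s}", "ps": r"\textit{%s}", "ge": "%s"}


def escape(text: str) -> str:
    """Escape LaTeX special characters one character at a time."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def entry_to_latex(entry: Entry) -> str:
    """Render one entry as a single LaTeX paragraph, preserving field order."""
    parts = [TEMPLATES[m] % escape(v) for m, v in entry if m in TEMPLATES]
    return r"\noindent " + " ".join(parts) + r"\par"


# Made-up entry, not real dictionary data.
line = entry_to_latex([("lx", "example"), ("ps", "n"), ("ge", "a_b")])
```

Keeping the markup in one table means the layout can be changed, or swapped for macros from an existing dictionary style, without touching the parsing code.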


                >
                > I'm not surprised! I ditched MSW in both senses years
                > ago and haven't looked back. That's part of why I'm
                > reluctant to use Toolbox even under wine.
                >
                >
                I have little choice with my work laptop. At home I use GNU/Linux
                exclusively. That computer has never had any other OS installed on it! :)
                And I don't mind using Toolbox under Wine, it works pretty well.


                >
                > I hardly ever use a WYSIWYG WP program willingly any
                > more; it's vim, pandoc and XeLaTeX all over the place.
                > (*un*willing = paid work is another matter. Luckily
                > OpenOffice/LibreOffice can open most anything they
                > throw at me -- usually .doc(x)!)
                >
                >
                At work I *have* to use all kinds of GUIs. Nearly our entire business runs
                on Excel! :( Not to mention all the modelling tools I have to use.


                > I'd rather have
                >
                >> people plainly not knowing how to pronounce words
                >> rather than people *thinking* they know how to
                >> pronounce words and doing it wrong.
                >>
                >
                > Very good point indeed!
                >
                > But then I suppose you should replace j with y!
                >
                >
                Not here in the Netherlands :P .


                > And I suppose you know that based on your own
                > description of Moten morphophonology and spelling
                > lj nj ts dz would be perfectly unambiguous!
                >
                >
                Very true. But I decided against those digraphs extremely early in the
                design of Moten's orthography. I wanted a true phonemic script, i.e. no
                digraphs, even if those would be unambiguous. I don't treat |l, |n, |s and
                |z as digraphs either, by the way. They are single letters and part of the
                alphabet.


                >
                >
                >> In the end, I decided to stick with the pipe. It may
                >> be a weird choice, but it works for me, and it now
                >> *feels* like part of Moten's identity. It gives it a
                >> unique look on the page at least :) .
                >>
                >
                > I can't blame you. Back in typewriter days I used
                > overstruck slash with impunity! :-)
                >
                >
                Yeah, Moten is definitely pre-Internet :P .

                > Well Christophe could always (ab)use 01C0 LATIN LETTER
                > DENTAL CLICK! ;-)
                >
                >
                Ouch! :P
                --
                Christophe Grandsire-Koevoets.

                http://christophoronomicon.blogspot.com/
                http://www.christophoronomicon.nl/
              • George Corley
                Message 7 of 19 , Mar 8, 2013
                  On Fri, Mar 8, 2013 at 2:11 AM, Christophe Grandsire-Koevoets <
                  tsela.cg@...> wrote:

                  > On 7 March 2013 14:02, A. da Mek <a.da_mek0@...> wrote:
                  > > There is one disadvantage of non-letter characters - Google does not
                  > > recognize such string as one word.
                  > >
                  >
                  > It's Google's loss, not mine :P .
                  >

                  Google likely won't care, but people searching for words might run into
                  issues.
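The point about non-letter characters can be illustrated with Unicode character categories. The sketch below does not show how Google tokenizes text, which nobody in this thread documents; it assumes a simple regex-based word splitter, and the string `a|lo` is a made-up example, not a real Moten word.

```python
import re
import unicodedata

PIPE = "|"        # U+007C VERTICAL LINE
CLICK = "\u01c0"  # U+01C0 LATIN LETTER DENTAL CLICK, mentioned earlier in the thread


def words(text: str) -> list:
    """Split text into runs of 'word' characters (letters, digits, underscore)."""
    return re.findall(r"\w+", text)


# General categories: 'Sm' is a math symbol, 'Lo' is a letter.
pipe_category = unicodedata.category(PIPE)
click_category = unicodedata.category(CLICK)

# The same hypothetical word, spelled with each character.
with_pipe = words("a" + PIPE + "lo")
with_click = words("a" + CLICK + "lo")
```

Under this assumption the pipe splits the string into two tokens, while the look-alike click letter keeps it as one, because only the latter is classified as a letter.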
                • Christophe Grandsire-Koevoets
                  Message 8 of 19 , Mar 8, 2013
                    On 8 March 2013 09:54, George Corley <gacorley@...> wrote:

                    > Google likely won't care, but people searching for words might run into
                    > issues.
                    >

                    Why would *anyone* ever search for Moten words *via Google*? Searching for
                    "Moten" or "Moten language", I'd understand, but anything else is just
                    weird. Especially since Moten morphology means that the shape of a noun in
                    the citation form may be very different from the shape of that word in the
                    genitive case plural! So a plain Google search (even one that could handle
                    the pipe correctly) would most likely not return the results you'd want.

                    But really, why would one want to search specific Moten words via Google?
                    --
                    Christophe Grandsire-Koevoets.

                    http://christophoronomicon.blogspot.com/
                    http://www.christophoronomicon.nl/
                  • BPJ
                    Message 9 of 19 , Mar 8, 2013
                      On 2013-03-08 09:11, Christophe Grandsire-Koevoets wrote:
                      > On 7 March 2013 17:49, BPJ<bpj@...> wrote:
                      >
                      >> >
                      >> >Too much mouse-pointing and menu-mucking for my taste.
                      >> >
                      > Toolbox has lots of keyboard shortcuts :P .

                      It's actually more than a peeve in my case. The less I use
                      the mouse the less my shoulder hurts (yes it *always* hurts
                      but it's reducible). I guess I should get a scrollball in
                      front of the keyboard instead -- or a laptop, but my hands
                      are too big! :-)

                      >
                      >
                      >> >I'm basically a plain-text guy. Besides I can sort any
                      >> >which way me pleases (almost) from within perl:
                      >> >
                      >> >
                      > I'm more a GUI person myself. I like using the keyboard as much as
                      > possible, but love to be able to fall back on using the mouse if I forget
                      > commands.

                      I'm so vimmified that I try to use vim commands in other programs.
                      If you see any random letters littered around my emails you'll
                      know why!

                      >
                      > See, that's how rusty my programming skills are: I understand everything
                      > you write, but have no idea how I'd go around implementing it myself! I
                      > haven't done any serious programming for more than 10 years, and I wasn't
                      > that great to begin with...

                      And I'm no real programmer! (No formal training etc.)

                      >> >Sorry to say but there was a bug which would cause it
                      >> >not to compile right on line 11! Also It's quite

                      > See, I would never have found that bug. Perl is not really my forte.

                      I wouldn't have spotted it without syntax highlighting
                      either, as it was a forward slash instead of a backslash
                      in the middle of a regular expression! I realized later
                      that it would actually compile, but if they intended
                      what they say in the associated comment, it wouldn't do
                      what they intended, which is worse of course!

                      >
                      >> >Anyway you might probably write something in Ruby
                      >> >to get your database into a datastructure. The
                      >> >parsing code in Text::Shoebox isn't exactly
                      >> >complicated, though it too shows its age.
                      >> >
                      >> >
                      > The parsing isn't what I fear most. It's the next
                      > step, converting the data into a useful XeLaTeX
                      > file. I'm currently looking at bilingual
                      > dictionaries typeset in LaTeX to see how I could
                      > create a template. My LaTeX programming skills
                      > are *very* rusty, so I'd rather not have to create my
                      > own styles :P .

                      That's only stage 3 I'm afraid. Going through the data,
                      grouping and ordering it correctly and wrapping it in
                      LaTeX commands correctly is what I fear most -- more
                      exactly how to do it without getting lost in a maze of
                      conditionals; 'chunking it down' as someone called it.
                      Not only must you divide the whole lexicon into
                      entries, you must also divide the entry into the part-of-
                      speech/sense number/subentry hierarchy -- at least
                      unless everything in each output entry is always going
                      to come in exactly the order it stands in the database.
                      I'm thinking an object class for each (sub)entryish
                      type which can be accessed either as a plain array --
                      in the order they come in the database -- or as an
                      associative array -- by 'marker' (which I keep thinking
                      about as 'tag'), which then in turn would have arrays
                      as values since there may be several fields with the
                      same tag --, and at the bottom level a field object
                      class with a tag and a value property, and then for
                      each class a template or method which stringifies it in
                      a sensible way. All in the interest of the user only
                      needing to worry about the template bit.
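The design described above can be sketched roughly as follows, in Python rather than Perl. This is one possible reading of the idea, not an existing module: the class names, the `to_latex` methods and the one-macro-per-marker LaTeX output are all hypothetical, and values are not escaped for LaTeX.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Field:
    """Bottom level: one database field with a tag (marker) and a value."""
    tag: str
    value: str

    def to_latex(self) -> str:
        # Hypothetical convention: each marker maps to a macro of the same name.
        return "\\" + self.tag + "{" + self.value + "}"


@dataclass
class Entry:
    """An entry-like unit, readable in database order or grouped by tag."""
    fields: List[Field] = field(default_factory=list)

    def by_tag(self) -> Dict[str, List[Field]]:
        # Values are lists because a tag may occur several times.
        grouped = defaultdict(list)
        for f in self.fields:
            grouped[f.tag].append(f)
        return dict(grouped)

    def to_latex(self, order: Optional[List[str]] = None) -> str:
        if order is None:
            chosen = self.fields              # plain array: database order
        else:
            grouped = self.by_tag()           # associative array: template order
            chosen = [f for tag in order for f in grouped.get(tag, [])]
        return " ".join(f.to_latex() for f in chosen)


# Hypothetical entry with a repeated tag.
entry = Entry([
    Field("lx", "foo"),
    Field("ge", "first gloss"),
    Field("ps", "n"),
    Field("ge", "second gloss"),
])
in_database_order = entry.to_latex()
in_template_order = entry.to_latex(order=["lx", "ps", "ge"])
```

The `order` list plays the role of the template: the user only states which tags come in which sequence, and repeated tags keep their relative database order. A part-of-speech/sense/subentry hierarchy would need nested entries on top of this flat version.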

                      See, already scratching that itch... :-/

                      And I have to <del>make money</del><ins>do serious work</ins>!

                      /bpj
                    • A. da Mek
                      Message 10 of 19 , Mar 8, 2013
                        > In the end, I decided to stick with the pipe.

                        Maybe you could use the broken one, 00A6 ¦ BROKEN BAR.
                        The palatal sounds are often described as "soft", so the yin bar would be
                        more appropriate than the yang one.
                      • George Corley
                        Message 11 of 19 , Mar 8, 2013
                          On Fri, Mar 8, 2013 at 3:55 AM, Christophe Grandsire-Koevoets <
                          tsela.cg@...> wrote:

                          > On 8 March 2013 09:54, George Corley <gacorley@...> wrote:
                          >
                          > > Google likely won't care, but people searching for words might run into
                          > > issues.
                          > >
                          >
                          > Why would *anyone* ever search for Moten words *via Google*? Searching for
                          > "Moten" or "Moten language", I'd understand, but anything else is just
                          > weird. Especially since Moten morphology means that the shape of a noun in
                          > the citation form may be very different from the shape of that word in the
                          > genitive case plural! So a plain Google search (even one that could handle
                          > the pipe correctly) would most likely not return the results you'd want.
                          >
                          > But really, why would one want to search specific Moten words via Google?
                          >

                          Someone knows the word, but not what language it comes from, either because
                          it was posted somewhere with insufficient information, or they are
                          remembering the word but not where they saw it last.

                          I'm not trying to get you to change anything. I'm just throwing out my
                          ideas as to why conlangers in general would want to be Google-friendly.
                          It's not so hard to do, anyway -- Google ignores diacritics, so that
                          misspellings of foreign words can still find what the user is after. But I
                          respect that you, specifically, have an established orthography that would
                          be difficult to change at this point.
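What "ignoring diacritics" could mean in practice can be sketched with Unicode normalization. This is only an assumption about the general technique (decompose, then drop combining marks), not a description of Google's actual behaviour; `fold_diacritics` is a name made up here, and the test strings are the diacritic spellings suggested earlier in the thread.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


acute = fold_diacritics("ĺńśź")      # letters with acute accent
dot_below = fold_diacritics("ḷṇṣẓ")  # letters with dot below
piped = fold_diacritics("|l|n|s|z")  # the pipe is not a combining mark
```

Both diacritic variants fold down to the bare letters `lnsz`, so a search for the unaccented spelling could still match them, whereas the pipe spelling passes through unchanged.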
                        • Christophe Grandsire-Koevoets
                          Message 12 of 19 , Mar 8, 2013
                            On 8 March 2013 12:58, A. da Mek <a.da_mek0@...> wrote:

                            > In the end, I decided to stick with the pipe.
                            >>
                            >
                            > Maybe you could use the broken one, 00A6 ¦ BROKEN BAR.
                            > The palatal sounds are often described as "soft", so the yin bar would be
                            > more appropriate than the yang one.
                            >

                            Can't type it easily on my keyboard. At least the pipe is readily available
                            without having to do weird contortions.

                            But I don't understand why we're having this discussion. I am *not* going
                            to change Moten's orthography. I was just musing about the time when I
                            considered doing so. That time is past, and no argument is going to make me
                            change my mind on that.

                            On 8 March 2013 18:26, George Corley <gacorley@...> wrote:

                            >
                            > Someone knows the word, but not what language it comes from, either because
                            > it was posted somewhere with insufficient information, or they are
                            > remembering the word but not where they saw it last.
                            >
                            >
                            That's an awfully specific scenario to change an entire orthography for.


                            > I'm not trying to get you to change anything. I'm just throwing out my
                            > ideas as to why conlangers in general would want to be Google-friendly.
                            > It's not so hard to do, anyway -- Google ignores diacritics, so that
                            > misspellings of foreign words can still find what the user is after. But I
                            > respect that you, specifically, have an established orthography that would
                            > be difficult to change at this point.
                            >

                            Thanks :) .
                            --
                            Christophe Grandsire-Koevoets.

                            http://christophoronomicon.blogspot.com/
                            http://www.christophoronomicon.nl/
                          • George Corley
                            Message 13 of 19 , Mar 9, 2013
                              On Fri, Mar 8, 2013 at 5:09 PM, Christophe Grandsire-Koevoets <
                              tsela.cg@...> wrote:

                              >
                              > But I don't understand why we're having this discussion. I am *not* going
                              > to change Moten's orthography. I was just musing about the time when I
                              > considered to do so. That time is past, and no argument is going to make me
                              > change my mind on that.
                              >

                              It could be useful to other conlangers whose orthographies are not set in
                              stone yet.


                              > On 8 March 2013 18:26, George Corley <gacorley@...> wrote:
                              >
                              > >
                              > > Someone knows the word, but not what language it comes from, either
                              > because
                              > > it was posted somewhere with insufficient information, or they are
                              > > remembering the word but not where they saw it last.
                              > >
                              > >
                              > That's an awfully specific scenario to change an entire orthography for.
                              >

                              This is only the scenario where optimizing for Google would be important,
                              and it could become quite a common one if your conlang were to become
                              popular for some reason. Seeing as yours is not one of the conlangs riding
                              along the back of a popular book/TV/movie franchise, that's still unlikely.

                              There could be other arguments to be made. Readers will have no idea what
                              the pipe means without reading a pronunciation guide (the only natlang use
                              I can think of is the dental click, which will of course be vanishingly
                              rare). Again, I don't care if you change your orthography, but this could
                              be something for other conlangers to think about.