
Re: alphanumeric character transcoding, by ones and pairs

  • rickah
    Message 1 of 19 , Jun 18, 2013
      Flo and Ian,
      To display what function this all serves, I created a simple webpage. The two fonts must be installed to view it properly.
      web page: https://sites.google.com/site/my37s8ks8a/Latin-Sgaw
      fonts zip: https://sites.google.com/site/my37s8ks8a/2KarenFonts.zip

      I found a character error in my original sample line. There is one additional 'A' in this line. I hope this didn't cause much trouble:
      1: lABTAkvFHWAkolaKELdAFqalanWElacO:laKv:sE:muIsE:sFlanolapLUB

      =====

      > In array %Char%, you see a sequence where each character is followed by the character it has to be replaced with. Note: In the array, a string like 'd' must follow 'dD'.
      >
      > The point is that an 'h' that has replaced an 'E' must not be replaced again with '['. To prevent this, each character that has already been replaced is protected with '|'. So '|h' won't get replaced with '[' again. This is achieved by the Negative Lookbehind '(?<!\|)'.
      >
      > Probably, there are more issues in this but I hope it might be useful as a first approach...
      >
      > Regards,
      > Flo
      >
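
As a rough illustration of the protection scheme quoted above, here is a minimal sketch in Python rather than NoteTab clip code. The function name `transcode` and the sample pairs are assumptions made for this example. It also assumes the text being converted contains no '|' of its own.

```python
import re

def transcode(text, pairs, marker="|"):
    """Apply (search, replacement) pairs in order, shielding replaced characters."""
    for search, replacement in pairs:
        # Prefix every output character with the marker so later pairs skip it.
        shielded = "".join(marker + ch for ch in replacement)
        # The negative lookbehind rejects a match that directly follows a marker.
        pattern = "(?<!" + re.escape(marker) + ")" + re.escape(search)
        text = re.sub(pattern, lambda m: shielded, text)
    # Strip the markers once every pair has been applied.
    return text.replace(marker, "")

pairs = [("E", "h"), ("h", "[")]
print(transcode("Eh", pairs))
```

Without the marker, the 'h' produced from 'E' would be turned into '[' by the second pair. Note that the lookbehind only checks the first character of a multi-character search string.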
    • flo.gehrke
      Message 2 of 19 , Jun 18, 2013
        --- In ntb-clips@yahoogroups.com, "rickah" <richolland@...> wrote:
        >
        > Flo and Ian, To display what function this all serves, I created
        > a simple webpage...

        I had a look at your webpage and tested those two lines (Karen Standard/Karen Normal Unique) with my clip (as posted with messages #23882 and #23883).

        I get to a correct transcoding of your KSTD sample...

        lABTAkvFHWAkolaKELdAFqalanWElacO:laKv:sE:muIsE:sFlanolapLUB

        to your KNU sample...

        vhRxhul.oGhuDvgcHs'h.ngvgEGHvgCd;vgcl;qH;rk>qH;q.vgeDvgysKR

        However, unlike your 54 replacements, this works only after changing my LIST.TXT as follows...

        A >> h
        E >> H
        n >> E
        U >> K

        Regards,
        Flo
      • rickah
        Message 3 of 19 , Jun 19, 2013
          Vielen Dank (many thanks), Flo. Most excellent.

          I had to update to NTB v7 (for the %RepWith%), and I had trouble with this one line relating to the list.txt, for obvious reasons:

          ^!SetListDelimiter »

          instead of:

          ^!SetListDelimiter >>
          --

          And a gold star for catching the "... n >> E, U >> K" ...

          This will help tremendously, since there are more than those two font variations to work with. Only recently has the language gained a Unicode font with a standard keyboard layout. Gradually converting the various older texts to the new font set will be so much easier now.

          Thanks again, Flo and Ian,
          Richard

          --

          --
        • Ian NTnerd
          Message 4 of 19 , Jun 19, 2013
            Richard,

            Part of the idea is for you to learn how clips go together. If we give
            you everything, you don't learn as much. :-)

            I changed the list delimiter from a tab so that email does not mangle
            it. It is now space, greater-than, greater-than, space, set with:
            ^!SetListDelimiter " >> "

            Here is my working code with start and end samples.

            If you are using it to go the other way, Normal to Standard, then the [
            character needs to be escaped as \[ in the list.

            H="Karen Standard to Normal"
            ^!SetListDelimiter ^p
            ^!SetArray %Charpair%=^$GetClipText("list")$
            ^!Set %i%=0

            :Loop
            ^!Inc %i%
            ^!If ^%i% > ^%Charpair0% Out
            ^!SetListDelimiter " >> "
            ^!SetArray %pair%=^%Charpair^%i%%
            ^!Set %Search%=^%pair1%
            ^!Set %ReplaceWith%=^%pair2%
            ^!SetDebug Off
            ^!If ^$StrSize("^%ReplaceWith%")$ = 2 addchar ELSE noadd
            :addchar
            ^!Set %ReplaceWith%=^$StrCopyLeft("^%ReplaceWith%";1)$|^$StrCopyRight("^%ReplaceWith%";1)$
            :noadd
            ^!Replace "(?<!\|)^%Search%" >> "|^%ReplaceWith%" WARS
            ^!Goto Loop

            :Out
            ^!Replace "|" >> "" WATS

            H=";List follows has the form character1 space greater_than greater_than space character2"


            H="_list"
            bJ >> bS
            dD >> 'f
            kY >> uF
            KY >> cF
            mD >> rf
            mJ >> rS
            pJ >> yS
            PJ >> zS
            SJ >> pS
            sJ >> qS
            a >> g
            A >> H
            b >> b
            B >> R
            c >> C
            d >> '
            E >> h
            e >> J
            F >> .
            f >> m
            g >> i
            h >> [
            H >> o
            I >> >
            i >> X
            j >> *
            J >> S
            K >> c
            k >> u
            L >> s
            l >> v
            m >> r
            n >> E
            o >> D
            O >> d
            p >> y
            P >> z
            q >> n
            r >> &
            S >> p
            s >> q
            t >> w
            T >> x
            u >> k
            U >> K
            v >> l
            w >> 0
            W >> G
            X >> {
            x >> t
            y >> ,
            Y >> F
            z >> ±S
            : >> ;
            \\ >> -



            H="Karen Standard"
            lABTAkvFHWAkolaKELdAFqalanWElacO:laKv:sE:muIsE:sFlanolapLUB

            H="Karen Normal"
            vhRxhul.oGhuDvgcHs'h.ngvgEGHvgCd;vgcl;qH;rk>qH;q.vgeDvgysKR
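
The list format used in the clip above (search string, space, two greater-than signs, space, replacement) could be parsed along these lines. This is only a sketch in Python, not clip code. `parse_list` is a hypothetical helper name, and the three sample lines are taken from the list above.

```python
def parse_list(list_text, delimiter=" >> "):
    """Turn 'search >> replacement' lines into (search, replacement) tuples."""
    pairs = []
    for line in list_text.splitlines():
        if delimiter not in line:
            continue  # skip blank or malformed lines
        # Split only once, so a replacement such as '>' survives intact.
        search, replacement = line.split(delimiter, 1)
        pairs.append((search, replacement))
    return pairs

sample = "dD >> 'f\nI >> >\n: >> ;\n"
print(parse_list(sample))
```

Splitting on the full four-character delimiter rather than on '>' alone is what keeps an entry like 'I >> >' unambiguous.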


          • rickah
            Message 5 of 19 , Jun 20, 2013
              Ian,
              I was not expecting nearly so much. The finished script is so complex yet compact (i.e., elegant) that it will take some time to study just to figure out what it does. I couldn't be happier that Flo took up this challenge.

              You guys went far above and beyond what I expected. I was completely thrilled to be able to re-code an entire test page of text with one click, with no mis-coding that I could detect. Now, internet searches not possible using one font set may work when using another.

              After some study and research, I'm going to see if I can use what I learn to make it available in a web-share-able format. The people I know who could benefit from this script are not very computer literate to begin with.

              I cannot thank y'all enough.

              Yours, Richard.

              --- In ntb-clips@yahoogroups.com, Ian NTnerd <indiamcq@...> wrote:
              >
              > Richard,
              >
              > Part of the idea is for you to learn how clips go together.
              > If we give you all you don't learn as much. :-)
              >
              > I changed the tab list delimiter so email does not mess up the tab.
              > Now it is space greater than, greater than space.
              > With
              > ^!SetListDelimiter " >> "
              --
            • flo.gehrke
              Message 6 of 19 , Jun 20, 2013
                --- In ntb-clips@yahoogroups.com, Ian NTnerd <indiamcq@...> wrote:
                >
                >
                > Here is my working code with start and end samples...

                If those sample lines on Richard's webpage show a correct transcoding then there are two differences in Ian's result: His clip replaces 'A >> H' instead of 'h', and 'E >> h' instead of 'H'. It seems to work the other way round as I mentioned in message #23887.

                There's also an issue with 'n'. In the table on Richard's webpage, 'n' is replaced with 'e'. This accords with his sample lines where, at position #52, 'n' is replaced with 'e'. At #26, however, 'n' is replaced with 'E'. Why is this?

                In other words: If 'n' has to be replaced with 'e' where does an 'E' come from? As a replace character, 'E' doesn't occur either in Richard's 54 replacements (see message #23877) or in the table on his webpage.

                So, if I'm not mistaken, some fine-tuning is needed here.

                Regards,
                Flo
              • rickah
                Message 7 of 19 , Jun 21, 2013
                  Yes, you are again correct. I did make those changes but failed to note them all here. This is my LIST.TXT as it stands now.

                  --
                  LIST.TXT
                  a»g
                  A»h
                  b»b
                  B»R
                  c»C
                  D»f
                  d»'
                  E»H
                  e»J
                  F».
                  f»m
                  g»i
                  G»A
                  h»[
                  H»o
                  I»>
                  i»X
                  j»*
                  J»S
                  K»c
                  k»u
                  L»s
                  l»v
                  m»r
                  n»e
                  o»D
                  O»d
                  p»y
                  P»z
                  q»n
                  r»&
                  S»p
                  s»q
                  t»w
                  T»x
                  u»k
                  U»K
                  v»l
                  w»0
                  W»G
                  X»{
                  x»t
                  y»,
                  Y»F
                  z»&S
                  ,»<
                  »A
                  :»;
                  …»µ
                  .»$
                  \\»-

                  --end--

                  KSTD keyboard character "G" and KNU character "A" are non-printing 'gaps' merely for visual effect but are not part of the written language. In this list, a KSTD space (" ") is changed to KNU "A " to both show a visible gap and add a word delimiting space.

                  Common "Western" punctuation marks will eventually come in handy for newer publications, but these are not found in traditional S'gaw.

                  Cheers,
                  Richard.
                  --
                • flo.gehrke
                  Message 8 of 19 , Jun 21, 2013
                    --- In ntb-clips@yahoogroups.com, "rickah" <richolland@...> wrote:
                    >
                    > Yes, you are again correct. I did make those changes but failed
                    > to note them all here. This is my LIST.TXT as it stands now
                    > --
                    > LIST.TXT
                    > a»g
                    > A»h
                    > (...)

                    Richard,

                    Two more ideas:

                    1. If '»' is OK for you as the separator in LIST.TXT, then we no longer have to replace the pipe in that list, only the line breaks (CRLF).

                    2. Your new LIST.TXT contains...

                    .»$

                    Please note that the dot is a RegEx metacharacter that matches any character except a newline. To be matched as a literal character, it must be escaped as '\.' on the left-hand side:

                    \.»$

                    If you like we could omit the escaping in the list and insert two command lines which will automatically check any search character and add the backslash if needed -- see below.

                    If you prefer this solution, then remove all backslashes on the left in LIST.TXT.
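
To illustrate why the unescaped dot is a problem, here is a small sketch in Python rather than clip code. `literal_pattern` is a hypothetical helper built on Python's `re.escape`. Unlike the two clip lines, which prefix a single backslash to the search string, `re.escape` escapes every special character in the string.

```python
import re

def literal_pattern(search):
    """Return a regex that matches `search` literally, whatever characters it contains."""
    return re.escape(search)

text = "a.b"
# Unescaped, '.' matches every character except a newline.
print(re.sub(".", "$", text))
# Escaped, only the literal dot is replaced.
print(re.sub(literal_pattern("."), "$", text))
```

Escaping automatically at the point of use means the list itself can stay free of backslashes.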

                    Regarding these ideas, now the latest version could be...


                    ^!SetHintInfo Working...
                    ^!SetScreenUpdate Off
                    ^!SetClipboard ^$GetFileText(^$GetDocumentPath$LIST.TXT)$
                    ^!SetClipboard ^$StrReplace(\R;»;^$GetClipboard$;RA)$
                    ^!SetListDelimiter »
                    ^!SetArray %Char%=^$GetClipboard$
                    ^!Set %i%=0

                    :Loop
                    ^!Inc %i%
                    ^!If ^%i% > ^%Char0% Out
                    ^!Set %Search%=^%Char^%i%%
                    ; New: Check for metacharacters
                    ; --- Long line start---
                    ^!IfMatch "(\.|\[|\(|\)|\^|\$|\*|\+|\?|\\|{|\|)" "^%Search%" Next Else Skip
                    ; --- Long line end ---
                    ^!Set %Search%=\^%Search%
                    ^!Inc %i%
                    ^!Set %RepWith%=^%Char^%i%%
                    ^!Replace "(?<!\|)^%Search%" >> "|^%RepWith%" WARS
                    ^!Goto Loop

                    :Out
                    ^!Replace "|" >> "" WATS
                    ^!Info Finished!

                    Regards,
                    Flo
                  • rickah
                    Message 9 of 19 , Jun 22, 2013
                      That does simplify things. I don't have to guess which characters need to be escaped. Eventually, nearly all keyboard characters may need to be added to the list.


                      --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                      > Richard,
                      >
                      > Two more ideas:
                      >
                      > > >
                      > If you like we could omit the escaping in the list and insert two command lines which will automatically check any search character and add the backslash if needed -- see below.
                      >
                      > If you prefer this solution then remove all backslashes on the left in LIST.TXT.
                      >
                      > Regarding these ideas, now the latest version could be...
                      >

                      ^!IfMatch "(\.|\[|\(|\)|\^|\$|\*|\+|\?|\\|{|\|)" "^%Search%" Next Else Skip
                      ^!Set %Search%=\^%Search%

                      I've put the list.txt in the notepad.exe folder so I don't lose track of it.

                      Would you be able to help implement the suggestion of Ian to use a clip list instead? I'm thinking it would make things much easier to share. (The clip H="_list" matches the LIST.TXT.)

                      > ^!SetListDelimiter ^p
                      > ^!SetArray %Charpair%=^$GetClipText("list")$
                      > ^!Set %i%=0

                      > :Loop
                      > ^!Inc %i%
                      > ^!If ^%i% > ^%Charpair0% Out
                      > ^!SetListDelimiter ^T
                      > ^!SetArray %pair%=^%Charpair^%i%%
                      > ^!Set %Search%=^%pair1%
                      > ;^!Inc %i%
                      > ^!Set %ReplaceWith%=^%pair2%
                      > ^!Replace "(?<!\|)^%Search%" >> "|^%ReplaceWith%" WARS
                      > ^!Goto Loop

                      One very minor issue when ordering list items: with "M|&Sl", the letter M is replaced by the complex character "&Sl". I found that this entry must come after "S|p"; otherwise "&Sl" becomes "&pl".
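
The ordering problem described above arises because the replacements are applied one after another, so text produced by one entry can be picked up by a later entry. One way around it, sketched here in Python and using a different technique from the clip, is to read the input once and look each match up in a table. `transcode_single_pass` and the three-entry table are assumptions for illustration only.

```python
import re

def transcode_single_pass(text, table):
    """Replace every key of `table` found in `text` in one left-to-right pass."""
    # Longest keys first, so a pair such as 'dD' wins over the single 'd'.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Output text is never rescanned, so the order of entries no longer matters.
    return pattern.sub(lambda m: table[m.group(0)], text)

table = {"M": "&Sl", "S": "p", "l": "v"}
print(transcode_single_pass("MSl", table))
```

With this approach no protective '|' marker is needed, because each input character is examined exactly once.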

                      This reminds me that I'll eventually be working with character codes such as: ၁ and "\u1063\u103A". Do you foresee much difficulty?
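
For reference, the escape notation mentioned above stands for a sequence of two code points. The short Python sketch below only shows that correspondence. It makes no claim about how NoteTab itself handles such characters, which the thread does not cover.

```python
sequence = "\u1063\u103A"   # escape notation from the message above

# Python resolves each \uXXXX escape to one code point when the literal is read.
code_points = [f"U+{ord(ch):04X}" for ch in sequence]
print(len(sequence), code_points)
```

A replacement that looks like one character on screen may therefore be several code points long, which matters for any scheme that protects replacements character by character.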

                      Rick
                    • rickah
                      Message 10 of 19 , Jun 28, 2013
                        I'm going to end up with three or four conversion lists, so having them as separate text files is the better idea.

                        Thanks again, Flo.
                        R. Holland.