Finding gaps in a sequence

  • flo.gehrke
    Message 1 of 29, Nov 12, 2011
      I've got a database where each record is indexed with an alpha-code from 'aaa' to 'zzz'.

      Every now and then, I want to find out if there is a gap in a sorted list of these codes. There's a gap, for example, in...

      zbx
      zby
      zbz
      zca
      zcc
      zcd

      (missing 'zcb'). How can I detect such gaps with a clip?

      My only basic idea is to calculate a kind of "checksum" for each code and then the difference between the checksums of two consecutive codes:

      1. Calculate "checksum" %A% with...

      ^!Set %A%=^$Calc(^$CharToDec(^$StrIndex(^%A%;1)$)$+^$CharToDec(^$StrIndex(^%A%;2)$)$+^$CharToDec(^$StrIndex(^%A%;3)$)$)$

      2. Calculate the "checksum" of next code and assign it to %B%

      3. Calculate the difference diff = B - A

      If I'm not mistaken, the sequence is OK if diff=1 or diff=-24, and there's a gap if diff is anything else.
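
For readers who want to try the arithmetic outside NoteTab, here is a minimal Python sketch of the three steps above. It is only an illustration, not clip code; the names `letter_sum`, `diffs` and `suspects` are made up for this example.

```python
def letter_sum(code):
    """Sum of the character codes, like the ^$CharToDec$ checksum above."""
    return sum(ord(ch) for ch in code)

codes = ["zbx", "zby", "zbz", "zca", "zcc", "zcd"]
pairs = list(zip(codes, codes[1:]))

# Step 3: difference between the checksums of two consecutive codes
diffs = [letter_sum(b) - letter_sum(a) for a, b in pairs]

# Pairs whose difference is neither 1 nor -24 are reported as gaps
suspects = [pair for pair, d in zip(pairs, diffs) if d not in (1, -24)]

print(diffs)     # [1, 1, -24, 2, 1]
print(suspects)  # [('zca', 'zcc')]
```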

      Any ideas - possibly more efficient? Thanks!

      Flo
    • Rod Dav4is
      Message 2 of 29, Nov 12, 2011
        Treat them as 3-digit radix-26 numbers, and check the numerical
        difference between two records. If not =1, something's missing.
        -R.
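
A rough Python sketch of the radix-26 idea, assuming 'a' counts as digit 0 and 'z' as digit 25 (the post does not say which digit values to use). The function names are hypothetical.

```python
def to_base26(code):
    """Read a three-letter code as a base-26 number with a=0 ... z=25."""
    value = 0
    for ch in code:
        value = value * 26 + (ord(ch) - ord("a"))
    return value

def find_gaps(codes):
    """Return the neighbouring pairs whose numeric difference is not 1."""
    return [(a, b) for a, b in zip(codes, codes[1:])
            if to_base26(b) - to_base26(a) != 1]

print(find_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))  # [('zca', 'zcc')]
```

With this encoding the rollovers (zbz/zca, azz/baa) need no special cases, because the carry is part of the number itself.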

      • Axel Berger
        Message 3 of 29, Nov 12, 2011
          "flo.gehrke" wrote:
          > if there is a gap in a sorted list of these codes.

          A totally different approach:
          Get the first triple, say pqf and add it to the first line with a space,
          thus

          pqf pqf some content.

          Then count it upward and prepend each line with the next one. This
          should yield

          pqg pqg more content
          pqh pqh and more
          pqi pqj something missing

          You'll end up in the last line and can see at a glance if the triples
          match. At the very end delete the first four characters of each line.
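
The side-by-side idea might look like this in Python. This is only a sketch: it builds the full list of codes in memory instead of counting upward in the editor, and `side_by_side` is a made-up name.

```python
from itertools import product
from string import ascii_lowercase

def side_by_side(lines):
    """Prefix each line with the code it would carry if nothing were missing."""
    all_codes = ["".join(t) for t in product(ascii_lowercase, repeat=3)]
    start = all_codes.index(lines[0][:3])
    return [f"{expected} {line}"
            for expected, line in zip(all_codes[start:], lines)]

lines = ["pqf some content", "pqg more content",
         "pqh and more", "pqj something missing"]
for row in side_by_side(lines):
    print(row)
```

If the two codes in the last row differ, something is missing somewhere above it.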

          No idea if this is any better; probably much worse.

          Axel
        • diodeom
          Message 4 of 29, Nov 12, 2011
            Flo wrote:
            >
            > I've got a database where each record is indexed with an alpha-code from 'aaa' to 'zzz'.
            >
            > Every now and then, I want to find out if there is a gap in a sorted list of these codes. There's a gap, for example, in...
            >
            > zbx
            > zby
            > zbz
            > zca
            > zcc
            > zcd
            >


            The following basic take looks for what the next index in sequence should be and flags a gap if no match is found:

            ;Assure blank line at the end of the list
            ^!Replace "\R*\Z" >> "\r\n" WRS
            ;26-character set to get ordinals from
            ^!Set %az%=abcdefghijklmnopqrstuvwxyz
            ^!Jump 1

            :Current_index
            ^!Set %1st%=^$StrIndex("^$GetLine$";1)$
            ^!Set %2nd%=^$StrIndex("^$GetLine$";2)$
            ^!Set %3rd%=^$StrIndex("^$GetLine$";3)$
            ^!Set %i1%=^$StrPos("^%1st%";"^%az%";0)$
            ^!Set %i2%=^$StrPos("^%2nd%";"^%az%";0)$
            ^!Set %i3%=^$StrPos("^%3rd%";"^%az%";0)$

            ;Establish what the next index *should be*
            :Next_index
            ^!Inc %i3%
            ^!Set %3rd%=^$StrIndex("^%az%";^%i3%)$
            ^!IfEmpty ^%3rd% Next Else Roll_call
            ^!Set %i3%=1; %3rd%=a
            ^!Inc %i2%
            ^!Set %2nd%=^$StrIndex("^%az%";^%i2%)$
            ^!IfEmpty ^%2nd% Next Else Roll_call
            ^!Set %i2%=1; %2nd%=a
            ^!Inc %i1%
            ^!Set %1st%=^$StrIndex("^%az%";^%i1%)$
            ^!IfEmpty ^%1st% End Else Roll_call

            ;Check if given index is present where expected
            :Roll_call
            ^!Jump +1
            ^!IfEmpty "^$GetLine$" End
            ^!If "^%1st%^%2nd%^%3rd%" = "^$GetLine$" Next_Index
            ;Report if absent
            ^!Jump Line_Start
            ^!InsertText ****** (gap)^%nl%
            ^!Goto Current_index
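
The same next-index logic, sketched in Python for anyone who wants to check it away from NoteTab. It works on a list in memory rather than on the document lines, and `next_code` and `flag_gaps` are names invented for this sketch.

```python
AZ = "abcdefghijklmnopqrstuvwxyz"

def next_code(code):
    """Return the code that should follow, or None after 'zzz'."""
    letters = list(code)
    for pos in (2, 1, 0):               # 3rd letter first, then carry leftwards
        idx = AZ.index(letters[pos]) + 1
        if idx < 26:
            letters[pos] = AZ[idx]
            return "".join(letters)
        letters[pos] = "a"              # roll over and carry on
    return None

def flag_gaps(codes):
    """Copy the list, inserting a marker wherever the expected code is absent."""
    out = []
    for cur, nxt in zip(codes, codes[1:]):
        out.append(cur)
        if next_code(cur) != nxt:
            out.append("****** (gap)")
    out.extend(codes[-1:])
    return out

print(flag_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))
```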

            Alternatively, maybe it would not be too crazy to generate all possible (17,576) combinations, append to them entries from the target list (array) where they match, and then rapidly replace all blocks of solo (unmatched) lines with "gap" statements.
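
A quick Python sketch of that enumerate-everything idea, assuming a pure list of lowercase codes. It reports the missing codes between the first and last entry instead of rewriting the text; `missing_codes` is a hypothetical name.

```python
from itertools import product
from string import ascii_lowercase

def missing_codes(codes):
    """List every code between the smallest and largest entry that is absent."""
    present = set(codes)
    lo, hi = min(codes), max(codes)
    all_codes = ("".join(t) for t in product(ascii_lowercase, repeat=3))
    return [c for c in all_codes if lo <= c <= hi and c not in present]

print(missing_codes(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))  # ['zcb']
```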
          • diodeom
            Message 5 of 29, Nov 12, 2011
              I wrote:
              >
              > ;Check if given index is present where expected
              > :Roll_call
              > ^!Jump +1
              > ^!IfEmpty "^$GetLine$" End
              > ^!If "^%1st%^%2nd%^%3rd%" = "^$GetLine$" Next_Index


              Naturally, for working with indexed lines of data

              zby Stuff
              zbz Other stuff
              zca Etc.

              the comparing statement would look like

              ^!If "^%1st%^%2nd%^%3rd%" = "^$StrCopyLeft("^$GetLine$";3)$" Next_Index
            • flo.gehrke
              Message 6 of 29, Nov 12, 2011
                --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                >
                > I've got a database where each record is indexed with an alpha-code
                > from 'aaa' to 'zzz'...I want to find out if there is a gap in a
                > sorted list of these codes...


                Thanks for all replies to my question!

                @Rod

                > Treat them as 3-digit radix-26 numbers, and check the numerical
                > difference between two records. If not =1, something's missing.

                I understand that diodeom's solution is based on something like that where the numeric value of each character is determined by ^$StrPos$ and a string from a to z.

                @Axel

                > Get the first triple, say pqf and add it to the first line with
                > a space...

                Witty indeed! But we know from experience that this would lead to a lot of cursor movements, insertions, etc., resulting in very slow performance with a long list.

                @diodeom

                > The following basic take looks for what the next index in sequence
                > should be and flags a gap if no match is found...

                Thanks, that's great! The advantage is that it needs no '^$Calc$' functions which are rather slow when being called very often.

                Just for the fun of it, I would like to present my first approach. It deals with a pure list of three-letter alpha codes from 'aaa' to 'zzz' without any further content in the lines.

                It's quite fast as well, since it doesn't move from line to line but assigns the whole list to an array. The result, displayed in a second window, is a list of the places where two codes do not immediately follow one another (avoid an empty line at the end of the list):


                ; Assign whole list to array %Codes%
                ^!SetListDelimiter ^%NL%
                ^!SetArray %Codes%=^$GetText$
                ^!Set %i%=1

                :Loop
                ; Assign first and second code to array %A% and %B%
                ^!Set %A%=^%Codes^%i%%
                ^!Inc %i%
                ^!Set %B%=^%Codes^%i%%
                ; Calculate numeric value of each code
                ^!Set %x%=^$Calc(^$CharToDec(^$StrIndex(^%A%;1)$)$+^$CharToDec(^$StrIndex(^%A%;2)$)$+^$CharToDec(^$StrIndex(^%A%;3)$)$)$
                ^!Set %y%=^$Calc(^$CharToDec(^$StrIndex(^%B%;1)$)$+^$CharToDec(^$StrIndex(^%B%;2)$)$+^$CharToDec(^$StrIndex(^%B%;3)$)$)$
                ; Calculate difference
                ^!Set %Diff%=^$Calc(^%y%-^%x%)$
                ^!If ^%Diff% = 1 Skip Else Next
                ^!If ^%Diff%=-24 Next Else Error
                ^!If ^%i%=^%Codes0% Out
                ^!Goto Loop

                :Error
                ; Assign wrong sequence to variable %Gaps%
                ^!Set %Gaps%=^%Gaps%^%A%/^%B%^P
                ^!If ^%i%=^%Codes0% Out
                ^!Goto Loop

                :Out
                ^!IfEmpty ^%Gaps% Next Else Skip_2
                ^!Info No gaps!
                ^!Goto End
                ^!Toolbar New Document
                ^!InsertText There's a gap at:^P^%Gaps%
                ^!Toolbar Second Window
                ^!ClearVariables

                Regards,
                Flo
              • flo.gehrke
                Message 7 of 29, Nov 12, 2011
                  --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                  >
                  > Just for the fun of it, I would like to present my first
                  > approach...

                  Errata: I don't know where those backslashes come from. Inserted by Yahoo?

                  Please correct:

                  ^!Set %x%=^$Calc(^$CharToDec(^$StrIndex(^%A%;1)$)$+^$CharToDec(^$StrIndex(^%A%;2)$)$+^$CharToDec(^$StrIndex(^%A%;3)$)$)$
                  ^!Set %y%=^$Calc(^$CharToDec(^$StrIndex(^%B%;1)$)$+^$CharToDec(^$StrIndex(^%B%;2)$)$+^$CharToDec(^$StrIndex(^%B%;3)$)$)$

                  Also...

                  > ; Assign first and second code to array %A% and %B%

                  No "arrays", of course -- it's "variables"!

                  Flo
                • diodeom
                  Message 8 of 29, Nov 13, 2011
                    Flo wrote:
                    >
                    > Just for the fun of it, I would like to present my first approach. It's dealing with a pure list of 3-digit-alpha-codes from 'aaa' to 'zzz' without any further stuff in the lines.
                    >
                    > It's quite fast as well since it doesn't move from line to line but assigns the whole list to an array. The result is a list of codes with gaps, i.e. codes that do not immediately follow one another displayed in a second window (avoid empty line at end of list):
                    >
                    > (...)
                    >
                    > ^!Set %Diff%=^$Calc(^%y%-^%x%)$
                    > ^!If ^%Diff% = 1 Skip Else Next
                    > ^!If ^%Diff%=-24 Next Else Error
                    >


                    To prevent erroneous gap reports between pairs like azz/baa, bzz/caa, czz/daa, etc., it looks like one more provision is needed (for ^%Diff%=-49).
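
The -49 case is easy to confirm with a few lines of Python (a sketch only; `step` is a made-up helper). The last line is an extra observation that is not from the thread: because a plain letter sum ignores position, some real gaps also produce a difference of 1.

```python
def letter_sum(code):
    """Sum of the character codes, as in the ^$CharToDec$ checksum."""
    return sum(ord(ch) for ch in code)

def step(a, b):
    """Checksum difference between two neighbouring codes."""
    return letter_sum(b) - letter_sum(a)

print(step("zby", "zbz"))  # 1   (ordinary step)
print(step("zbz", "zca"))  # -24 (3rd letter rolls over)
print(step("azz", "baa"))  # -49 (2nd and 3rd letters roll over)
print(step("aab", "aca"))  # 1   (yet 'aac' through 'abz' are missing)
```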

                    - - -
                    As you pointed out, looping in memory ought to be incomparably faster than taking an illustrative walk over the displayed lines (for either of the index-comparing methods), but for those of us who find rapid regex swaps on 'live' text appealing :) here's yet another exercise in gap-sniffing that doesn't directly evaluate any of the indices; it checks instead for presence of any sequence-breaking patterns.

                    The first simple replacement iteration does the vast majority of the spotting. I wish that were all, but the 'bloat' of the subsequent ones is needed to locate any remaining cases (in quickly diminishing order of probability of occurrence):

                    ^!SetArray %az%=a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z;a
                    ^!Set %i%=0

                    :3rd
                    ^!Inc %i%
                    ^!If ^%i%=27 2_nd
                    ^!Set %j%=^$Calc(^%i%+1)$
                    ;Flag where 3rd chars skip the sequence
                    ;(e.g. zbx/zbz, zbz/zcb)
                    ^!Replace "^%az^%i%%\R\K(?!(\*|..^%az^%j%%|\R|\Z))" >> "*****\r\n" WARS
                    ^!Goto 3rd

                    :2_nd
                    ;Flag where after 3rd [a-y] the next 2nd doesn't remain the same
                    ;(e.g. zcc/zdd, zdd/zfe)
                    ^!Replace "(.)[a-y]\R\K(?!.(\*|\1|\R|\Z))" >> "*****\r\n" WARS
                    ^!Set %i%=0
                    :2nd
                    ^!Inc %i%
                    ^!If ^%i%=27 1_st
                    ^!Set %j%=^$Calc(^%i%+1)$
                    ;Flag where 3rd 'z' doesn't increment by one the next 2nd
                    ;(e.g. aaz/aca, zez/zga)
                    ^!Replace "^%az^%i%%z\R\K(?!.(\*|^%az^%j%%|\R|\Z))" >> "*****\r\n" WARS
                    ^!Goto 2nd

                    :1_st
                    ;Flag where after 2nd [a-y] the next 1st doesn't remain the same
                    ;(e.g. bcc/dcd, dcd/gce)
                    ^!Replace "^(.)[a-y].\R\K(?!(\*|\1|\R|\Z))" >> "*****\r\n" WARS
                    ^!Set %i%=0
                    :1st
                    ^!Inc %i%
                    ^!If ^%i%=26 End
                    ^!Set %j%=^$Calc(^%i%+1)$
                    ;Flag where 'zz' doesn't increment by one the next 1st
                    ;(e.g. azz/caa, czz/faa)
                    ^!Replace "^%az^%i%%zz\R\K(?!(\*|^%az^%j%%|\R|\Z))" >> "*****\r\n" WARS
                    ^!Goto 1st
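
For comparison, here is a Python approximation of the first pass only (the `:3rd` loop). It is a sketch under assumptions: Python's `re` module has no `\K` or `\R`, so a lookbehind and a plain `\n` stand in for them, and the text is assumed to be a pure list of codes with `\n` line ends. The function name is invented.

```python
import re
from string import ascii_lowercase

def flag_third_letter_breaks(text):
    """Insert a marker line wherever the 3rd letter skips the sequence."""
    for i, cur in enumerate(ascii_lowercase):
        nxt = ascii_lowercase[(i + 1) % 26]      # 'z' is followed by 'a'
        # After a line ending in `cur`, the next line must be a marker,
        # have `nxt` as its 3rd letter, be empty, or be the end of the text.
        pattern = rf"(?<={cur}\n)(?!\*|..{nxt}|\n|\Z)"
        text = re.sub(pattern, "*****\n", text)
    return text

print(flag_third_letter_breaks("zbx\nzby\nzbz\nzca\nzcc\nzcd\n"))
```

The later passes for the 2nd and 1st letters would follow the same pattern and are left out here.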
                  • Eb
                    Message 9 of 29, Nov 14, 2011
                      Hi Flo,

                      I have a somewhat outside-the-box approach for converting and testing, which requires no calculations. I have not tested the code below, but have used the technique successfully in similar situations. Note that there are a couple of holes you need to fill: the alpha variables and the GAP_HANDLER.

                      If you start your code with a series of variables, one per letter in the alphabet, assigning two-digit hex codes from 01 .. 1a, the StrSplit function will separate the code letters, and the StrReplace function converts the letters to variables. Finally the assignment parses these variables (long line).
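
To make the letter-to-hex conversion concrete, here is a small Python sketch. Note the assumptions: it compares neighbouring values rather than the loop index, and the three allowed step sizes were worked out for this sketch and are not part of the original post. All names are hypothetical.

```python
from string import ascii_lowercase

# Two-digit hex code per letter: 01 for 'a' up to 1a for 'z'
HEX = {ch: format(i, "02x") for i, ch in enumerate(ascii_lowercase, start=1)}

def hex_value(code):
    """Concatenate the per-letter hex codes and parse them as one hex number."""
    return int("".join(HEX[ch] for ch in code), 16)

# Steps between true neighbours (assumption of this sketch):
# 1 = last letter +1, 231 = 3rd letter rolls over, 59111 = 2nd and 3rd roll over
VALID_STEPS = {1, 231, 59111}

def find_gaps(codes):
    return [(a, b) for a, b in zip(codes, codes[1:])
            if hex_value(b) - hex_value(a) not in VALID_STEPS]

print(find_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))  # [('zca', 'zcc')]
```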

                      --------------------------------------------------------
                      ;n variables, 1 per letter in alphabet used in your index codes,
                      ; containing 2-digit hex code each, starting with 01
                      ^!Set %a%=01; %b%=02;...%z%=1a

                      ;fetch index codes -- use TABS to avoid confusion with StrSplit
                      ^!SetListDelimiter ^%tab%
                      ^!SetArray %codes%=^$GetDocMatchAll("^[a-z]{3}")$

                      ;split character codes into variables, convert to dec nums
                      ^!Set %i%=0
                      :LOOP
                      ^!Inc %i%
                      ;-----long line ahead
                      ^!Set %codes^%i%%=^%^$StrReplace("^%nl%";"%*^%";"^$StrSplit("^%codes^%i%%";1;True)$"%
                      ;-----end long line
                      ^!Set %codes^%i%%=^$HexToInt(^%codes^%i%%)$
                      ;test if gap -- if gap, have the gap handler update the %gapdetect% variable
                      ^!If ^%i%=^%codes^%i%% NEXT else GAP_HANDLER
                      ^!If ^%i%<^%codes0% LOOP
                      :GAP_HANDLER
                      ;reports only the first gap, but using a separate line-number variable allows you to report multiple gaps.
                      --------------------------------------------------------




                    • joy8388608
                      Message 10 of 29, Nov 14, 2011
                        Oh I love this type of problem so I had to submit my entry!

                        I am viewing the three digits as a number in base 26 and thinking something like "if the line I'm on has a value of x, the next line should have a value of x+1."

                        I keep calculations to a minimum and think the code is relatively easy to follow and modify.

                        Note: VIS stands for Value IS and VSB for Value Should Be.

                        Joy

                        GapFind
                        ; Finds gaps in series aaa aab...aaz aba abb...
                        ^!Set %GapCount%=0
                        ^!Set %Line%=1
                        ^!Set %VSB%=^$ConvertTo26(^$StrCopyLeft("^$GetLine(^%Line%)$";3)$)$
                        ^!Set %AZ%="abcdefghijklmnopqrstuvwxyz"
                        ^!SetWordWrap OFF

                        :LoopStart
                        ^!If ^%Line% > ^$GetLineCount$ LoopEnd
                        ^!If ^$StrSize("^$GetLine(^%Line%)$")$<2 LoopEnd
                        ^!Set %VIS%=^$ConvertTo26(^$StrCopyLeft("^$GetLine(^%Line%)$";3)$)$
                        ^!If ^%VSB%=^%VIS% OK else NOTOK

                        :OK
                        ^!Inc %Line%
                        ^!Inc %VSB%
                        ^!Goto LoopStart

                        :NOTOK
                        ^!Set %VSB%=^$ConvertTo26(^$StrCopyLeft("^$GetLine(^%Line%)$";3)$)$
                        ^!SetCursor ^%Line%:1
                        ^!InsertText ----- THERE IS A GAP HERE -----^%NL%
                        ^!Inc %Line%
                        ^!Inc %Line%
                        ^!Inc %VSB%
                        ^!Inc %GapCount%
                        ^!Goto LoopStart

                        :LoopEnd
                        ^!Prompt ^%GapCount% gaps were found

                        _ConvertTo26
                        ^!Set %Passed%=^&

                        ^!Set %C1%=^$strIndex("^%Passed%";1)$
                        ^!Set %C2%=^$strIndex("^%Passed%";2)$
                        ^!Set %C3%=^$strIndex("^%Passed%";3)$

                        ^!Set %V1%=^$StrPos("^%C1%";"^%AZ%";False)$
                        ^!Set %V2%=^$StrPos("^%C2%";"^%AZ%";False)$
                        ^!Set %V3%=^$StrPos("^%C3%";"^%AZ%";False)$

                        ^!Result ^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$
                      • flo.gehrke
                        Message 11 of 29, Nov 15, 2011
                          --- In ntb-clips@yahoogroups.com, "diodeom" <diomir@...> wrote:

                          > As you pointed out, looping in memory ought to be incomparably
                          > faster than taking an illustrative walk over the displayed lines
                          > (for either of the index-comparing methods), but for those of us
                          > who find rapid regex swaps on 'live' text appealing :) here's yet
                          > another exercise in gap-sniffing that doesn't directly evaluate
                          > any of the indices; it checks instead for presence of any
                          > sequence-breaking patterns....

                          So we've seen five solutions now -- that's great!

                          @Eb

                          For me, your clip didn't provide any result straightaway. But I trust it will do when getting to the bottom of it.

                          @Joy

                          Yet another interesting approach! When testing your clip with 10,000 codes it works perfectly. On my notebook, it performs that task in 78 seconds.

                          @diodeom

                          Your second clip is the most efficient solution by far -- it's doing the job within two seconds. Thanks, I'll install that in my clipbooks!

                          Flo
                        • joy8388608
                          Message 12 of 29, Nov 17, 2011
                            I realized I made two slight errors in my program. The original code works as is, but the Set %AZ% line should be moved up one line to come before the Set %VSB% line, and the value of %AZ% should be "bcdefghijklmnopqrstuvwxyz" (remove the "a"). This makes the conversion to decimal truly base 26, using the 26 "digits" a-z with values 0-25. The original code used values 1-26, which is technically not correct.
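
A small Python sketch of why the original still works. It assumes, as the correction implies, that a letter not found in the alphabet string counts as 0; `value` and the constant names are invented for this example.

```python
from itertools import product
from string import ascii_lowercase

def value(code, alphabet):
    """1-based position of each letter in `alphabet` (0 if absent), read in base 26."""
    total = 0
    for ch in code:
        total = total * 26 + (alphabet.find(ch) + 1)
    return total

AZ_ORIGINAL = "abcdefghijklmnopqrstuvwxyz"   # a=1 ... z=26
AZ_CORRECTED = "bcdefghijklmnopqrstuvwxyz"   # a=0, b=1 ... z=25

all_codes = ["".join(t) for t in product(ascii_lowercase, repeat=3)]
offsets = {value(c, AZ_ORIGINAL) - value(c, AZ_CORRECTED) for c in all_codes}
print(offsets)  # {703}
```

Every code is shifted by the same constant (676 + 26 + 1 = 703), so the difference between two neighbouring codes is the same under both conventions.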

                            Joy
                          • Eb
                          Message 13 of 29, Nov 19, 2011
                              Flo,

                              It seems I had an oops (typo) in my code. Remove the '*' between '%*^%' in the replace function, so the result can parse as a hex number.

                              Eb

                            • flo.gehrke
                              Message 14 of 29 , Nov 23, 2011
                                --- In ntb-clips@yahoogroups.com, "Eb" <ebbtidalflats@...> wrote:
                                >
                                > Flo,
                                >
                                > It seems I had an oops (typo) in my code. Remove the '*' between '%*^%' in the replace function, so the result can parse as a hex number.

                                Eb,

                                Even after removing the asterisk, I wonder how that line should work.

                                Let's reduce your clip to...

                                ^!Set %a%=61; %y%=79; %z%=7A
                                ^!SetListDelimiter ^%tab%
                                ^!SetArray %codes%=^$GetDocMatchAll("^[a-z]{3}")$
                                ^!Set %i%=0

                                :Loop
                                ^!Inc %i%
                                ^!Set %codes^%i%%=^%^$StrReplace("^%nl%";"%^%";"^$StrSplit("^%codes^%i%%";1;True)$"%
                                ^!Info ^%codes^%i%%
                                ^!Set %codes^%i%%=^$HexToInt(^%codes^%i%%)$
                                ^!Info ^%codes^%i%%

                                When running these lines on three codes only...

                                ayz
                                yza
                                zay

                                ...the result is '0'.

                                As the first '^!Info' shows, the '^$StrReplace$' returns...

                                ("
                                ";"%

                                which, of course, can't be converted to an integer.

                                Furthermore, I think '^$StrReplace$' needs five arguments. But the result isn't any better even when writing...

                                ^!Set %codes^%i%%=^%^$StrReplace("^%nl%";"%^%";"^$StrSplit("^%codes^%i%%";1;True)$";0;0)$%

                                So what's still wrong with that line?

                                Flo
                              • Eb
                                Message 15 of 29 , Nov 29, 2011
                                  Hi Flo,

                                  I pulled the translation code from my memory, didn't test it, and added typos. I know it worked where I used it before, but do not remember where I used it :=( .

                                  I'll get back to you when I find it.

                                  Sorry to waste your time like this.


                                  Cheers,

                                  Eb


                                • flo.gehrke
                                  Message 16 of 29 , Nov 30, 2011
                                    --- In ntb-clips@yahoogroups.com, "Eb" <ebbtidalflats@...> wrote:
                                    >
                                    > Hi Flo,
                                    >
                                    > I pulled the translation code from my memory, didn't test it, and added typos. I know it worked where I used it before, but do not remember where I used it :=( .
                                    >
                                    > I'll get back to you when I find it.
                                    >
                                    > Sorry to waste your time like this.
                                    >
                                    > Cheers,
                                    >
                                    > Eb

                                    Eb,

                                    I'm still racking my brains over that line because I think there's an interesting issue in here.

                                    I suspect your intention is to create a string like '^%a%^%y%^%z%' with...

                                    ^!Set %a%=61; %y%=79; %z%=7A
                                    ^!Set %Code%=^%^$StrReplace("^%NL%";"%^%";"^$StrSplit("ayz";1;True)$";0;0)$%
                                    ^!Info ^%Code%

                                    In detail, 'ayz' is split into...

                                    a
                                    y
                                    z

                                    Next, the line breaks (CR+LF) in this list are replaced with '%^%', yielding 'a%^%y%^%z'. By adding a leading '^%' and a trailing '%', you are trying to get '^%a%^%y%^%z%'.

                                    I found out that we could achieve that string by replacing the caret with its preset variable:

                                    ^!Set %a%=61; %y%=79; %z%=7A
                                    ^!Set %Code%=^%CARET%%^$StrReplace("^%NL%";"%^%CARET%%";"^$StrSplit("ayz";1;True)$";0;0)$%
                                    ^!Info ^%Code%

                                    Now the problem is that NoteTab outputs that string of variable names but, surprisingly, not their contents, that is, the hex values '61797A'. Why is that?
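
For comparison, the string that line is meant to build can be sketched outside NoteTab. This Python snippet only illustrates the intended result (the helper name `code_to_hex` is hypothetical): each letter is replaced by its two-digit ASCII hex value and the pieces are concatenated.

```python
def code_to_hex(code):
    """Concatenate the two-digit ASCII hex value of each character."""
    return "".join(format(ord(ch), "02X") for ch in code)

hex_string = code_to_hex("ayz")
print(hex_string)           # 61797A
print(int(hex_string, 16))  # the same digits read as one hexadecimal number
```

It says nothing about why the clip parser does not expand the nested variables; it only pins down the target value.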

                                    Regards,
                                    Flo


                                    P.S. Even if that line did provide the hex values, it would be another question how to convert that string into an integer with '^$HexToInt(^%codes^%i%%)$'. Anyway, first of all I would be happy to understand why the line above doesn't provide the hex values...
                                  • Eb
                                    Message 17 of 29 , Nov 30, 2011
                                      Hi Flo,

                                      You are right about what the hex conversion was supposed to do.
                                      In the meantime I found my original char-to-hex clip, which only converted a single digit. I applied the single-digit approach to your problem. While I got it to work, it just raised another problem.

                                      The alphabet is like a base-26 number set (English alphabet), after shifting a to zero. Straight conversion to numbers creates gaps where it rolls over to the next digit, i.e. aaz --> aba has a gap of 26 (the value of the next digit), and azz to baa has a much larger gap!

                                      I believe you already had a base-26 suggestion. But I was on a roll and created my own version, looking at two ideas:

                                      1. a single array, containing the base-26 values 0..25, used with the calc function, to arrive at consecutive decimal values for your alpha codes.

                                      2. Three separate arrays, one for each digit in your codes, containing look-up values for each alpha character by digit.

                                      The second version is more efficient, and I have included it here (mind the long lines):


                                      ----------->8-------------
                                      H="ThreeDigitAlphaToBase26"
                                      ;value of characters in 1st to 3rd (right to left) digit of code
                                      ^!SetArray %digits3%=0;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25
                                      ^!SetArray %digits2%=0;26;52;78;104;130;156;182;208;234;260;286;312;338;364;390;416;442;468;494;520;546;572;598;624;650
                                      ^!SetArray %digits1%=0;676;1352;2028;2704;3380;4056;4732;5408;6084;6760;7436;8112;8788;9464;10140;10816;11492;12168;12844;13520;14196;14872;15548;16224;16900
                                      ^!Set %offset%=64
                                      ;note: subtracting 64 maps A (ASCII 65) to array index 1, whose value is zero in the numbering system
                                      ;--------------------------------------------
                                      ;extract codes
                                      ^!SetListDelimiter ^%nl%
                                      ^!SetArray %codes%=^$GetDocMatchAll("^[a-z]{3}")$
                                      ;loop codes
                                      ^!Set %i%=0
                                      :Loop
                                      ^!Inc %i%
                                      ;fetch code digits one at a time, consolidate in temp
                                      ^!Set %one%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";1)$)$)$
                                      ^!Dec %one% ^%offset%
                                      ^!Set %temp%=^%digits1^%one%%
                                      ^!Set %two%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";2)$)$)$
                                      ^!Dec %two% ^%offset%
                                      ^!Inc %temp% ^%digits2^%two%%
                                      ^!Set %tre%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";3)$)$)$
                                      ^!Dec %tre% ^%offset%
                                      ^!Inc %temp% ^%digits3^%tre%%
                                      ;assign assembled code back to codes array
                                      ^!Set %codes^%i%%=^%temp%
                                      ;--------------------------------------------
                                      :gap_trap
                                      ^!If ^%i%>1 SKIP_2
                                      ^!Set %OLD%=^%temp%
                                      ^!Dec %old%
                                      ;incrementing OLD should set it to same as new
                                      ^!Inc %OLD%
                                      ;temporarily disable for testing
                                      ^!If ^%old%<>^%temp% HANDLE_GAP
                                      ;if difference > 1 signal a gap
                                      ;--------------------------------------------
                                      :NOGAP
                                      ^!Set %old%=^%temp%
                                      ^!Info ^%codes^%i%%
                                      ^!If ^%i%<^%codes0% LOOP
                                      ^!Goto END

                                      :HANDLE_GAP
                                      ^!Info [L]There is a gap at ^%old% to ^%temp%^%nl%Continuing...
                                      ^!Set %OLD%=^%TEMP%
                                      ^!Goto NOGAP
                                      ----------->8-------------
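
The lookup-table idea behind this clip can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the clip: the names `DIGITS1`, `DIGITS2`, `DIGITS3`, `lookup_value` and `report_gaps` are invented here, and dictionaries stand in for the three NoteTab arrays.

```python
import string

LETTERS = string.ascii_lowercase
# One lookup table per position, with the weight 676, 26 or 1 already applied.
DIGITS1 = {ch: i * 676 for i, ch in enumerate(LETTERS)}
DIGITS2 = {ch: i * 26 for i, ch in enumerate(LETTERS)}
DIGITS3 = {ch: i for i, ch in enumerate(LETTERS)}

def lookup_value(code):
    """Add up the pre-weighted values of the three letters."""
    return DIGITS1[code[0]] + DIGITS2[code[1]] + DIGITS3[code[2]]

def report_gaps(codes):
    """Return one message per place where the next value is not previous + 1."""
    messages = []
    expected = None
    for code in codes:
        value = lookup_value(code)
        if expected is not None and value != expected:
            messages.append(f"There is a gap at {expected} to {value}")
        expected = value + 1
    return messages

print(report_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))
```

With the tables built once, each code costs three lookups and two additions, which is the reason for preferring this variant over computing the products for every record.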


                                      Cheers,


                                      Eb

                                      PS
                                      I'm guessing at some of the stuff below:
                                      The conversion to hex failed because NoteTab saw the inserted plain caret as the beginning of something parsable; when it was changed to ^%caret%, it included the caret like an escaped character, no longer capable of triggering the parser.

                                      As to the HexToInt function, it works fine when an actual hex number is passed to it.
                                    • joy8388608
                                      Message 18 of 29 , Dec 1, 2011
                                        --- In ntb-clips@yahoogroups.com, "Eb" <ebbtidalflats@...> wrote:
                                        >
                                        > Hi Flo,
                                        >
                                        > You are right about what the hex conversion was supposed to do.
                                        > In the meantime I found my original char-to-hex clip, which only converted a single digit. I applied the single-digit approach to your problem. While I got it to work, it just raised another problem.
                                        >
                                        > The alphabet is like a base-26 number set (English alphabet), after shifting a to zero. Straight conversion to numbers creates gaps where it rolls over to the next digit, i.e. aaz --> aba has a gap of 26 (the value of the next digit), and azz to baa has a much larger gap!


                                        Sorry if I misunderstood you but I'll reply just in case in order to save you possible extra work and confusion...

                                        You said aaz to aba has a gap but you correctly noted a=0...z=25.
                                        Therefore, aaz=(0*26^2 + 0*26 + 25)=25 and aba=(0*26^2 + 1*26 + 0)=26 - No gap. Likewise, azz=675 and baa=676. Again, no gap.
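
The arithmetic above can be checked with a short Python sketch. For contrast it also shows a "straight" conversion that reads the ASCII codes of the three letters as one hexadecimal number; that this is what was meant by straight conversion is an assumption, and the function names `base26` and `ascii_hex` are invented for the example.

```python
def base26(code):
    """Positional base-26 value with a=0 ... z=25."""
    value = 0
    for ch in code:
        value = value * 26 + (ord(ch) - ord("a"))
    return value

def ascii_hex(code):
    """The letters' ASCII bytes read as a single hexadecimal number."""
    return int(code.encode("ascii").hex(), 16)

for lower, upper in (("aaz", "aba"), ("azz", "baa")):
    print(lower, upper,
          base26(upper) - base26(lower),        # step in base 26
          ascii_hex(upper) - ascii_hex(lower))  # step in the ASCII-hex reading
```

The base-26 steps are both 1, while the ASCII-hex steps jump at every rollover, which would explain where the impression of gaps comes from.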

                                        Hope this helps, sorry if I misunderstood.

                                        Joy
                                      • flo.gehrke
                                        Message 19 of 29 , Dec 1, 2011
                                          Thanks, Eb! Now we've got another working solution.

                                          I've tested it successfully. In a list of 10,000 3-digit-alpha-codes it needs 118 seconds to find a gap.

                                          Maybe it's a bit complicated to see those gaps because it outputs numbers and not the code -- but never mind. What matters here is the basic concept.

                                          Flo

                                        • flo.gehrke
                                          Message 20 of 29 , Dec 1, 2011
                                            Joy,

                                            I also went through your clip again (messages #22230, #22245). I like that formula '^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$' which, actually, seems to be the heart of your solution.

                                            So I combined it with some ideas of my first concept and managed to speed up your clip significantly. Originally, your clip needed 78 seconds (on my notebook) to check a list of 10,000 codes. The following version is doing it in 9 seconds:


                                            ^!SetHintInfo Working...
                                            ; Assign code list to array %List%
                                            ^!SetListDelimiter ^%NL%
                                            ^!SetArray %List%=^$GetText$
                                            ^!Set %AZ%="abcdefghijklmnopqrstuvwxyz"
                                            ^!Set %i%=1

                                            :CodeToInt
                                            ; Save current code to variable for later output in case of gap
                                            ^!Set %CurrCode%=^%List^%i%%
                                            ; Convert code to number (with Joy's formula)
                                            ^!Set %First%=^$Convert(^%List^%i%%)$
                                            ^!Inc %First%
                                            ^!Inc %i%
                                            ^!If ^%i% > ^%List0% Out
                                            ^!Set %Second%=^$Convert(^%List^%i%%)$
                                            ^!IfSame ^%First% ^%Second% CodeToInt Else False

                                            :False
                                            ^!Append %Gaps%=^%CurrCode%^P
                                            ^!Goto CodeToInt

                                            :Out
                                            ^!IfEmpty ^%Gaps% Next Else Skip_2
                                            ^!Info No gaps!
                                            ^!Goto Skip_3
                                            ^!Toolbar New Document
                                            ^!InsertText Gap found after...^P^%Gaps%
                                            ^!Toolbar Second Window
                                            ^!ClearVariables


                                            The subclip with the custom function ^$Convert$ and your formula is...

                                            ^!Set %C1%=^$StrIndex(^&;1)$
                                            ^!Set %C2%=^$StrIndex(^&;2)$
                                            ^!Set %C3%=^$StrIndex(^&;3)$
                                            ^!Set %V1%=^$StrPos(^%C1%;^%AZ%;0)$
                                            ^!Set %V2%=^$StrPos(^%C2%;^%AZ%;0)$
                                            ^!Set %V3%=^$StrPos(^%C3%;^%AZ%;0)$
                                            ^!Result ^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$
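
The control flow of this version can be sketched in Python. It is a rough equivalent only, assuming that '^$StrPos$' returns a 1-based position, so that a=1 ... z=26 as in the clip; the names `convert` and `codes_before_gaps` are made up for the sketch.

```python
def convert(code, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Mirror of the subclip: 1-based letter positions weighted 676, 26 and 1."""
    v1, v2, v3 = (alphabet.index(ch) + 1 for ch in code)
    return v1 * 676 + v2 * 26 + v3

def codes_before_gaps(codes):
    """Collect every code whose successor in the list is not the next code."""
    gaps = []
    for current, following in zip(codes, codes[1:]):
        # Same test as the clip: first value + 1 must equal the second value.
        if convert(current) + 1 != convert(following):
            gaps.append(current)
    return gaps

print(codes_before_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))
```

Each code is still converted twice here, once as the current and once as the following element, just as in the clip; caching the previous value would halve the conversions.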


                                            Thanks again for your proposal! Maybe you'll have a look at this revised version...

                                            Regards,
                                            Flo


                                          • Eb
                                            Message 21 of 29 , Dec 1, 2011
                                            • 0 Attachment
                                              Flo,

                                              --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:

                                              > I've tested it successfully. In a list of 10,000 3-digit-alpha-codes it needs 118 seconds to find a gap.


                                              Is that fast or slow?


                                              > Maybe it's a bit complicated to see those gaps because it outputs numbers and not the code -- but never mind. What matters here is the basic concept.


                                              To change the output to the original code, just make a copy of the 'codes' array and use the unadulterated copy to display the gap (or display both the original code and the numeric code, since the numbers give a clearer picture of how large the gap is).


                                              Cheers
                                            • Art Kocsis
                                              Message 22 of 29 , Dec 1, 2011
                                                At 11/30/2011 13:28, Eb wrote:
                                                >The alphabet is like a base-26 number set (English alphabet), after
                                                >shifting a to zero. Straight conversion to numbers creates gaps, where it
                                                >rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of
                                                >the next digit, and azz to baa has a gap much larger!

                                                If I am interpreting correctly what you said here, the statement is not
                                                correct - there is no gap using the alphabet as symbols for a base 26
                                                numbering system.

                                                Any integer of magnitude 2 or more (including negative ones) may be used
                                                as a base for counting sequentially, and a number takes the form:
                                                sum(d(i) * b^i) where d(i) is the ith "digit" (right to left, 0 based)
                                                and "b" is the base; for a positive base each digit runs from 0 to the
                                                base minus 1. In the case of
                                                using the alphabet symbols to represent base 26 digits: a=0, b=1 ... z=25
                                                and the base 10 value of any such number would be d2 * 26^2 + d1 * 26^1 +
                                                d0 * 26^0 or d2*676 + d1*26 + d0*1.

                                                Thus aaz = 0*676 + 0*26 + 25*1 = 25 and aba = 0*676 + 1*26 +0*1 = 26
                                                (no gap)
                                                Also azz = 0*676 + 25*26 + 25*1 = 675 and baa = 1*676 + 0*26 +0*1 = 676
                                                (again, no gap)
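
The arithmetic above can be checked mechanically. The following is only a minimal sketch in Python (not the clip language used in this thread); the function name `code_to_int` is made up for illustration. It reads a code as a base-26 number with a=0 ... z=25 and prints the two rollover pairs discussed here.

```python
def code_to_int(code: str) -> int:
    """Read a lower-case alpha code as a base-26 number (a=0 ... z=25)."""
    value = 0
    for ch in code:
        if not "a" <= ch <= "z":
            raise ValueError(f"not a lower-case letter: {ch!r}")
        # Shift the digits seen so far one place left, then add the new digit.
        value = value * 26 + (ord(ch) - ord("a"))
    return value


# The rollover pairs from the message are consecutive integers.
for left, right in [("aaz", "aba"), ("azz", "baa")]:
    print(left, code_to_int(left), right, code_to_int(right))
```

Each pair differs by exactly 1, which is the "no gap" result worked out by hand above.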

                                                Your code does this correctly, so the statement may just be ambiguously worded.

                                                BTW, very clever use of ^!Inc & ^!Dec to do arithmetic! I'll have to
                                                remember that.

                                                I have noted that none of the suggested solutions have done any input data
                                                verification but all assume that each line truly begins with a
                                                three-character (lower case) alpha code. Your use of ^$GetDocMatchAll("^[a-z]{3}")$ to
                                                extract the sequence codes would seem to offer a simple, one-line way to
                                                verify that assumption: just compare the size of the ^%codes% array to the
                                                line count of the source document.

                                                ^!If ^$GetParaCount$ <> ^%codes0% ^!Continue Input data error - missing sequence code(s)
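
The same kind of input check can be sketched outside the clip language. This Python example is an illustration only and assumes each line holds nothing but the code, as in the sample list that opened the thread; `invalid_lines` is a hypothetical helper name. Instead of comparing counts, it reports which lines fail.

```python
import re

# A valid line is exactly three lower-case letters and nothing else.
CODE_RE = re.compile(r"[a-z]{3}")


def invalid_lines(lines):
    """Return (line_number, text) for every line that is not a 3-letter code."""
    return [(n, text) for n, text in enumerate(lines, start=1)
            if CODE_RE.fullmatch(text) is None]


sample = ["zbx", "zby", "ZBZ", "zc", "zcd"]
print(invalid_lines(sample))
```

An empty result means the number of valid codes equals the number of lines, which is the condition the one-line clip test above compares.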


                                                Namaste', Art
                                              • Eb
                                                Message 23 of 29 , Dec 2, 2011
                                                  Joy,

                                                  I observed a gap while using a non-mathematical (==>) technique to convert from base 26 (the alphabet) to base 16 by using the ascii codes:

                                                  aaz ==> 0 x 41 41 5A = 4,276,570
                                                  aba ==> 0 x 41 42 41 = 4,276,801

                                                  Once I shifted to the base 26 array approach, I may have stayed in the haze of non-math confusion for a bit longer <g>.
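
A short Python sketch (an illustration, not part of the original clips) makes the difference between the two conversions visible. The helper names `packed_ascii` and `base26` are invented here. Upper-case letters are used because 0x41, 0x42 and 0x5A in the figures above are the ASCII codes of A, B and Z.

```python
def packed_ascii(code: str) -> int:
    """Pack the ASCII bytes of the code into one big-endian integer."""
    return int.from_bytes(code.encode("ascii"), "big")


def base26(code: str) -> int:
    """Read the code as a base-26 number with a=0 ... z=25 (case-insensitive)."""
    value = 0
    for ch in code.lower():
        value = value * 26 + (ord(ch) - ord("a"))
    return value


# Byte packing gives each position a weight of 256 although only 26 values
# are used, so a rollover jumps; base 26 steps by exactly 1.
print(packed_ascii("AAZ"), packed_ascii("ABA"))
print(packed_ascii("ABA") - packed_ascii("AAZ"))
print(base26("ABA") - base26("AAZ"))
```

The packed values reproduce the two figures quoted above, and their difference is 256 - 25 = 231, which is the apparent "gap" at the rollover.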


                                                  Eb


                                                • Eb
                                                  Message 24 of 29 , Dec 2, 2011
                                                    Yes, I was still confused by my earlier attempt to convert character codes to hex codes using ascii.

                                                    My test clip still had elements of hex code in it.

                                                    Color me embarrassed.

                                                    Eb

                                                    --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
                                                    >
                                                    > At 11/30/2011 13:28, Eb wrote:
                                                    > >The alphabet is like a base-26 number set (English alphabet), after
                                                    > >shifting a to zero. Straight conversion to numbers creates gaps, where it
                                                    > >rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of
                                                    > >the next digit, and azz to baa has a gap much larger!
                                                    >
                                                    > If I am interpreting correctly what you said here, the statement is not
                                                    > correct - there is no gap using the alphabet as symbols for a base 26
                                                    > numbering system.
                                                  • ebbtidalflats
                                                    Message 25 of 29 , Dec 2, 2011
                                                      Hi Art,

                                                      I suspect that none of the people offering solutions are privy to the format of the data file. So verifying input must be left to Flo.

                                                      For example, the ^$GetDocMatchAll statement must include the field delimiter to avoid also matching the first three characters of longer words, which might not be index codes at all.
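
The effect of leaving out the delimiter can be shown with an ordinary regular expression. This Python sketch is hypothetical throughout: the thread does not say what the field delimiter is, so a tab is assumed purely for illustration, and the sample lines and the `extract` helper are invented.

```python
import re

# Assumed layout: the code is followed by a tab and then the record text.
lines = ["zbx\tfirst record", "zby\tsecond record", "zebra is not a code"]

PREFIX_ONLY = re.compile(r"[a-z]{3}")
WITH_DELIMITER = re.compile(r"[a-z]{3}(?=\t)")  # code must be followed by a tab


def extract(pattern, lines):
    """Collect what the pattern matches at the start of each line."""
    matches = (pattern.match(line) for line in lines)
    return [m.group() for m in matches if m]


print(extract(PREFIX_ONLY, lines))
print(extract(WITH_DELIMITER, lines))
```

The first pattern also picks up "zeb" from the third line; the second one skips that line because no delimiter follows the first three letters.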


                                                      Cheers


                                                      Eb

                                                      --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
                                                      > ...
                                                      > I have noted that none of the suggested solutions have done any input data
                                                      > verification but all assume that each line truly begins with a three (lower
                                                      > case) alpha character. Your use of ^$GetDocMatchAll("^[a-z]{3}")$ to
                                                      > extract the sequence codes would seem to offer a simple, one-line way to
                                                      > verify that assumption: just compare the size of the ^%codes% array to the
                                                      > line count of the source document.
                                                      >
                                                      > ^!If ^$GetParaCount$ <> %codes0% ^!Continue Input data error - missing
                                                      > sequence code(s)
                                                      >
                                                      >
                                                      > Namaste', Art
                                                      >
                                                    • flo.gehrke
                                                      Message 26 of 29 , Dec 2, 2011
                                                        > --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@> wrote:
                                                        > I have noted that none of the suggested solutions have done any
                                                        > input data verification but all assume that each line truly
                                                        > begins with a three (lower case) alpha character...

                                                        --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...> wrote:
                                                        >
                                                        > Hi Art,
                                                        >
                                                        > I suspect that none of the people offering solutions are privy to
                                                        > the format of the data file. So verifying input must be left to Flo.

                                                        Friends,

                                                        I started this topic with message #22221 writing...

                                                        > I've got a database where each record is indexed with an alpha-code
                                                        > from 'aaa' to 'zzz'. Every now and then, I want to find out if there
                                                        > is a gap in a sorted list of these codes. There's a gap, for
                                                        > example, in...
                                                        >
                                                        > zbx
                                                        > zby
                                                        > zbz
                                                        > zca
                                                        > zcc
                                                        > zcd

                                                        So why speculate about the format of the data? Why invent characters and strings that actually are not there?

                                                        "For we write none other things unto you,
                                                        than what ye read or acknowledge..."
                                                        Corinthians 2, 1:13

                                                        Flo
                                                      • joy8388608
                                                        Message 27 of 29 , Dec 5, 2011
                                                          Flo -

                                                          Very interesting. Your clip is much faster than mine even when I turned ScreenUpdate off. Mine took 41 seconds and yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines removed). Why? I'm not sure. Perhaps it is because you work with an array, even though the lines on a screen are probably just another type of array.

                                                          This has been fun, interesting, and I've learned several new things.

                                                          Oh, yes. You don't have to, but as I posted previously, you can modify the value of %AZ% to "bcdefghijklmnopqrstuvwxyz" (remove the 'a') for correctness.
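
Why this change is optional can be seen in a small Python sketch (an illustration only). It assumes, as the suggestion implies, that the position lookup is 1-based and yields 0 when the letter is not found; `convert` is a stand-in written for this example, not the clip function itself.

```python
AZ_WITH_A = "abcdefghijklmnopqrstuvwxyz"
AZ_WITHOUT_A = "bcdefghijklmnopqrstuvwxyz"


def convert(code: str, alphabet: str) -> int:
    """Mimic a 1-based position lookup that yields 0 for an absent letter."""
    v1, v2, v3 = (alphabet.find(ch) + 1 for ch in code)
    return v1 * 676 + v2 * 26 + v3


for code in ("aaa", "aaz", "aba", "zzz"):
    print(code, convert(code, AZ_WITH_A), convert(code, AZ_WITHOUT_A))
```

With the 'a' included, every value is larger by the same constant, 676 + 26 + 1 = 703, so the difference between neighbouring codes is unchanged. Removing the 'a' only makes the absolute values start at 0 for aaa.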

                                                          Thanks,
                                                          Joy

                                                          P.S. On the off chance anyone else (still) wants to play with this for learning purposes, I wrote a quick clip to generate the lines aaa to zzz. Let me know if anyone wants me to post the code.


                                                          --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                                                          >
                                                          > Joy,
                                                          >
                                                          > I also went through your clip again (messages #22230, #22245). I like that formula '^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$' which, actually, seems to be the heart of your solution.
                                                          >
                                                          > So I combined it with some ideas of my first concept and managed to speed up your clip significantly. Originally, your clip needed 78 seconds (on my notebook) to check a list of 10,000 codes. The following version is doing it in 9 seconds:
                                                          >
                                                          >
                                                          > ^!SetHintInfo Working...
                                                          > ; Assign code list to array %List%
                                                          > ^!SetListDelimiter ^%NL%
                                                          > ^!SetArray %List%=^$GetText$
                                                          > ^!Set %AZ%="abcdefghijklmnopqrstuvwxyz"
                                                          > ^!Set %i%=1
                                                          >
                                                          > :CodeToInt
                                                          > ; Save current code to variable for later output in case of gap
                                                          > ^!Set %CurrCode%=^%List^%i%%
                                                          > ; Convert code to number(with Joy's formula)
                                                          > ^!Set %First%=^$Convert(^%List^%i%%)$
                                                          > ^!Inc %First%
                                                          > ^!Inc %i%
                                                          > ^!If ^%i% > ^%List0% Out
                                                          > ^!Set %Second%=^$Convert(^%List^%i%%)$
                                                          > ^!IfSame ^%First% ^%Second% CodeToInt Else False
                                                          >
                                                          > :False
                                                          > ^!Append %Gaps%=^%CurrCode%^P
                                                          > ^!Goto CodeToInt
                                                          >
                                                          > :Out
                                                          > ^!IfEmpty ^%Gaps% Next Else Skip_2
                                                          > ^!Info No gaps!
                                                          > ^!Goto Skip_3
                                                          > ^!Toolbar New Document
                                                          > ^!InsertText Gap found after...^P^%Gaps%
                                                          > ^!Toolbar Second Window
                                                          > ^!ClearVariables
                                                          >
                                                          >
                                                          > The subclip with custom function ^$Convert$ and your formula is...
                                                          >
                                                          > ^!Set %C1%=^$StrIndex(^&;1)$
                                                          > ^!Set %C2%=^$StrIndex(^&;2)$
                                                          > ^!Set %C3%=^$StrIndex(^&;3)$
                                                          > ^!Set %V1%=^$StrPos(^%C1%;^%AZ%;0)$
                                                          > ^!Set %V2%=^$StrPos(^%C2%;^%AZ%;0)$
                                                          > ^!Set %V3%=^$StrPos(^%C3%;^%AZ%;0)$
                                                          > ^!Result ^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$
                                                          >
                                                          >
                                                          > Thanks again for your proposal! Maybe you'll have a look at this revised version...
                                                          >
                                                          > Regards,
                                                          > Flo
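
For readers who do not use NoteTab, the logic of the quoted clip can be sketched in Python. This is an approximation for illustration, not a translation of the clip: the names `code_to_int` and `find_gaps` are invented, and the input is assumed to be a sorted list of well-formed three-letter codes.

```python
def code_to_int(code: str) -> int:
    """Base-26 value of a three-letter code, a=0 ... z=25."""
    v1, v2, v3 = (ord(ch) - ord("a") for ch in code)
    return v1 * 676 + v2 * 26 + v3


def find_gaps(codes):
    """Return every code whose successor in the list is not the next code."""
    gaps = []
    for current, following in zip(codes, codes[1:]):
        # Same test as the clip: value of the current code plus one
        # must equal the value of the code that follows it.
        if code_to_int(current) + 1 != code_to_int(following):
            gaps.append(current)
    return gaps


print(find_gaps(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))
```

For the sample list from the first message this reports 'zca', the code after which 'zcb' is missing.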
                                                        • flo.gehrke
                                                          Message 28 of 29 , Dec 5, 2011
                                                            --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@...> wrote:
                                                            >
                                                            > Flo -
                                                            >
                                                            > Very interesting. Your clip is much faster than mine even
                                                            > when I turned ScreenUpdate off. Mine took 41 seconds and
                                                            > yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines
                                                            > removed). Why? I'm not sure...

                                                            Joy,

                                                            I think there are three main reasons for that:

                                                            1. Assigning the whole list to an array

                                                            2. Calculating ^$ConvertTo26$ only twice -- it's done three times in your clip

                                                            3. Gathering up the gaps with ^!Append and outputting them all at once -- no ^!InsertText

                                                            > I wrote a quick clip to generate the lines aaa to zzz. Let
                                                            > me know if anyone wants me to post the code.

                                                            I put my hand up and would enjoy seeing that clip!

                                                            Flo
                                                          • joy8388608
                                                            Message 29 of 29 , Dec 7, 2011
                                                              My pleasure. Joy

                                                              Generate Base 26 numbers
                                                              ; by Joy
                                                              ^!Continue This will generate 17576 lines from aaa to zzz.

                                                              ^!SKIP Leave Screen update on? (Slower...)
                                                              ^!Setscreenupdate OFF
                                                              ^!StatusShow Generating sequences aaa to zzz...

                                                              ; Start with aaa
                                                              ^!Set %I%=-1

                                                              :LoopStart
                                                              ^!Inc %I%
                                                              ^!Set %Num%=^%I%

                                                              ; Find value of first digit (of 3) (will be 0 to 25)
                                                              ^!Set %x%=^$Calc(INT(^%Num%/676))$

                                                              ; Convert first digit to letter (will be a to z)
                                                              ^!Set %B26%=^$DecToChar(^$Calc(^%x%+97)$)$

                                                              ; adjust value of current number
                                                              ^!Set %Num%=^$Calc(^%Num% - (^%x%*676))$

                                                              ; Find value of second digit (of 3) (will be 0 to 25)
                                                              ^!Set %x%=^$Calc(INT(^%Num%/26))$

                                                              ; Convert second digit to letter (will be a to z) and append
                                                              ^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%x%+97)$)$

                                                              ; adjust value of current number
                                                              ^!Set %Num%=^$Calc(^%Num% - (^%x%*26))$

                                                              ; Convert remaining value (0 to 25) to letter (will be a to z) and append
                                                              ^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%Num%+97)$)$

                                                              ; Output value
                                                              ^!InsertText ^%B26%^%NL%

                                                              ^!If "^%B26%" <> "zzz" LoopStart

                                                              ^!Sound SystemExclamation
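
The digit-by-digit conversion in this clip can also be written compactly in Python. The following is a sketch for comparison only, with `int_to_code` as an invented name; it uses the same constants as the clip (676, 26 and the offset 97 for the letter a).

```python
def int_to_code(num: int) -> str:
    """Convert 0..17575 to a three-letter code, a=0 ... z=25."""
    first, rest = divmod(num, 676)    # 676 = 26 * 26, value of the first digit
    second, third = divmod(rest, 26)  # remaining two digits
    # 97 is the character code of 'a', as in the clip.
    return "".join(chr(digit + 97) for digit in (first, second, third))


codes = [int_to_code(n) for n in range(26 ** 3)]
print(len(codes), codes[0], codes[25], codes[26], codes[-1])
```

The list has 17576 entries, starts at aaa, rolls over from aaz to aba, and ends at zzz.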