[Clip] Re: Finding gaps in a sequence

  • Eb
    Message 1 of 29 , Dec 2, 2011
      Yes, I was still confused by my earlier attempt to convert character codes to hex codes using ascii.

      My test clip still had elements of hex code in it.

      Color me embarrassed.

      Eb

      --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
      >
      > At 11/30/2011 13:28, Eb wrote:
      > >The alphabet is like a base-26 number set (English alphabet), after
      > >shifting a to zero. Straight conversion to numbers creates gaps, where it
      > >rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of
      > >the next digit, and azz to baa has a gap much larger!
      >
      > If I am interpreting correctly what you said here, the statement is not
      > correct - there is no gap using the alphabet as symbols for a base 26
      > numbering system.
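
To illustrate Art's point, here is a minimal Python sketch (not clip code; the function name `code_to_int` is made up for this example). It assumes the letters a..z stand for the digits 0..25. Under that assumption, the codes on either side of a rollover map to consecutive integers, so no gap appears.

```python
def code_to_int(code: str) -> int:
    """Read a lowercase code as a base-26 number with a=0 ... z=25."""
    value = 0
    for ch in code:
        value = value * 26 + (ord(ch) - ord("a"))
    return value

# Rollover pairs from the discussion: the difference is 1 in both cases.
print(code_to_int("aba") - code_to_int("aaz"))
print(code_to_int("baa") - code_to_int("azz"))
```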
    • ebbtidalflats
      Message 2 of 29 , Dec 2, 2011
        Hi Art,

        I suspect that none of the people offering solutions are privy to the format of the data file. So verifying input must be left to Flo.

        For example, the ^$GetDocMatchAll$ statement must include the field delimiter to avoid also matching the first three characters of longer words, which might not be index codes at all.


        Cheers


        Eb

        --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
        > ...
        > I have noted that none of the suggested solutions have done any input data
        > verification but all assume that each line truly begins with a three (lower
        > case) alpha character. Your use of ^$GetDocMatchAll("^[a-z]{3}")$ to
        > extract the sequence codes would seem to offer a simple, one-line way to
        > verify that assumption: just compare the size of the ^%codes% array to the
        > line count of the source document.
        >
        > ^!If ^$GetParaCount$ <> %codes0% ^!Continue Input data error - missing
        > sequence code(s)
        >
        >
        > Namaste', Art
        >
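
The check Art describes, together with Eb's caveat about the field delimiter, might look roughly like this in Python. This is only a sketch: the thread never states the real file format, so the tab delimiter, the sample data, and the function name `invalid_lines` are all assumptions made for illustration.

```python
import re

def invalid_lines(text: str, delimiter: str = "\t") -> list[int]:
    """Return 1-based numbers of lines that do not start with a 3-letter code.

    A line is accepted if it is exactly three lowercase letters, or three
    lowercase letters followed by the delimiter (an assumed tab here).
    """
    pattern = re.compile(r"[a-z]{3}(?:" + re.escape(delimiter) + r"|$)")
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if not pattern.match(line)]

# Hypothetical sample: line 2 starts with a longer word, not an index code.
sample = "zbx\tfoo\nzebra\tbar\nzbz\n"
print(invalid_lines(sample))  # [2]
```

Requiring the delimiter (or end of line) right after the third letter is what keeps "zebra" from being mistaken for the code "zeb".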
      • flo.gehrke
        Message 3 of 29 , Dec 2, 2011
          > --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@> wrote:
          > I have noted that none of the suggested solutions have done any
          > input data verification but all assume that each line truly
          > begins with a three (lower case) alpha character...

          --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...> wrote:
          >
          > Hi Art,
          >
          > I suspect that none of the people offering solutions are privy to
          > the format of the data file. So verifying input must be left to Flo.

          Friends,

          I started this topic with message #22221 writing...

          > I've got a database where each record is indexed with an alpha-code
          > from 'aaa' to 'zzz'. Every now and then, I want to find out if there
          > is a gap in a sorted list of these codes. There's a gap, for
          > example, in...
          >
          > zbx
          > zby
          > zbz
          > zca
          > zcc
          > zcd

          So why speculate about the format of the data? Why invent characters and strings that are not actually there?

          "For we write none other things unto you,
          than what ye read or acknowledge..."
          Corinthians 2, 1:13

          Flo
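
For readers who want to try the sample list above outside NoteTab, one possible approach is to build the full aaa..zzz sequence and compare. This Python sketch is not the method used in any of the posted clips, and `missing_codes` is a name invented for this example; it assumes a sorted list of valid three-letter codes.

```python
from itertools import product
from string import ascii_lowercase

def missing_codes(codes: list[str]) -> list[str]:
    """List codes absent between the first and last entries of a sorted list."""
    # product() yields tuples in lexicographic order: aaa, aab, ..., zzz.
    all_codes = ["".join(t) for t in product(ascii_lowercase, repeat=3)]
    start = all_codes.index(codes[0])
    end = all_codes.index(codes[-1])
    present = set(codes)
    return [c for c in all_codes[start:end + 1] if c not in present]

print(missing_codes(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))  # ['zcb']
```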
        • joy8388608
          Message 4 of 29 , Dec 5, 2011
            Flo -

            Very interesting. Your clip is much faster than mine, even after I turned ScreenUpdate off: mine took 41 seconds and yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines removed). Why? I'm not sure. Perhaps it is because you work with an array, although the lines on screen are probably just another kind of array.

            This has been fun, interesting, and I've learned several new things.

            Oh, yes. You don't have to, but as I posted previously, you can modify the value of %AZ% to "bcdefghijklmnopqrstuvwxyz" (remove the 'a') for correctness.
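
The effect of dropping the 'a' can be sketched in Python. This assumes the position lookup is 1-based and returns 0 when the letter is not found, which is what the suggestion implies; `to_number` is a made-up name that mimics that behaviour with `str.find`.

```python
def to_number(code: str, alphabet: str) -> int:
    # str.find returns -1 when the letter is absent, so +1 yields 0,
    # mimicking a 1-based position lookup that reports 0 for "not found".
    v1, v2, v3 = (alphabet.find(ch) + 1 for ch in code)
    return v1 * 676 + v2 * 26 + v3

WITH_A = "abcdefghijklmnopqrstuvwxyz"     # a=1 ... z=26
WITHOUT_A = "bcdefghijklmnopqrstuvwxyz"   # a=0 ... z=25

for alphabet in (WITH_A, WITHOUT_A):
    print(to_number("aaa", alphabet),
          to_number("aaz", alphabet),
          to_number("aba", alphabet))
# prints 703 728 729, then 0 25 26
```

With the 'a' included, every result is shifted by a constant 703 (676 + 26 + 1), so neighbouring codes still differ by 1 and the gap test works either way. Only the version without the 'a' yields the true base-26 value.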

            Thanks,
            Joy

            P.S. On the off chance anyone else (still) wants to play with this for learning purposes, I wrote a quick clip to generate the lines aaa to zzz. Let me know if anyone wants me to post the code.


            --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
            >
            > Joy,
            >
            > I also went through your clip again (messages #22230, #22245). I like that formula '^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$' which, actually, seems to be the heart of your solution.
            >
            > So I combined it with some ideas of my first concept and managed to speed up your clip significantly. Originally, your clip needed 78 seconds (on my notebook) to check a list of 10,000 codes. The following version does it in 9 seconds:
            >
            >
            > ^!SetHintInfo Working...
            > ; Assign code list to array %List%
            > ^!SetListDelimiter ^%NL%
            > ^!SetArray %List%=^$GetText$
            > ^!Set %AZ%="abcdefghijklmnopqrstuvwxyz"
            > ^!Set %i%=1
            >
            > :CodeToInt
            > ; Save current code to variable for later output in case of gap
            > ^!Set %CurrCode%=^%List^%i%%
            > ; Convert code to number (with Joy's formula)
            > ^!Set %First%=^$Convert(^%List^%i%%)$
            > ^!Inc %First%
            > ^!Inc %i%
            > ^!If ^%i% > ^%List0% Out
            > ^!Set %Second%=^$Convert(^%List^%i%%)$
            > ^!IfSame ^%First% ^%Second% CodeToInt Else False
            >
            > :False
            > ^!Append %Gaps%=^%CurrCode%^P
            > ^!Goto CodeToInt
            >
            > :Out
            > ^!IfEmpty ^%Gaps% Next Else Skip_2
            > ^!Info No gaps!
            > ^!Goto Skip_3
            > ^!Toolbar New Document
            > ^!InsertText Gap found after...^P^%Gaps%
            > ^!Toolbar Second Window
            > ^!ClearVariables
            >
            >
            > The subclip with custom function ^$Convert$ and your formula is...
            >
            > ^!Set %C1%=^$StrIndex(^&;1)$
            > ^!Set %C2%=^$StrIndex(^&;2)$
            > ^!Set %C3%=^$StrIndex(^&;3)$
            > ^!Set %V1%=^$StrPos(^%C1%;^%AZ%;0)$
            > ^!Set %V2%=^$StrPos(^%C2%;^%AZ%;0)$
            > ^!Set %V3%=^$StrPos(^%C3%;^%AZ%;0)$
            > ^!Result ^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$
            >
            >
            > Thanks again for your proposal! Maybe you'll have a look at this revised version...
            >
            > Regards,
            > Flo
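
The logic of the quoted clip can be mirrored in a few lines of Python for anyone who wants to check it outside NoteTab. This is a rough equivalent rather than a translation: it assumes a=0 ... z=25 and clean three-letter input, and the names `code_to_int` and `gaps_after` are invented here.

```python
def code_to_int(code: str) -> int:
    """Same weighting as the formula in the clip: 676, 26, 1 (with a=0)."""
    v1, v2, v3 = (ord(ch) - ord("a") for ch in code)
    return v1 * 676 + v2 * 26 + v3

def gaps_after(codes: list[str]) -> list[str]:
    """Return every code whose successor in the list is not the next code."""
    gaps = []
    for current, following in zip(codes, codes[1:]):
        # Like the clip: increment the first value, then compare to the second.
        if code_to_int(current) + 1 != code_to_int(following):
            gaps.append(current)
    return gaps

print(gaps_after(["zbx", "zby", "zbz", "zca", "zcc", "zcd"]))  # ['zca']
```

As in the clip, the gaps are collected in a list and reported once at the end instead of being written out one at a time.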
          • flo.gehrke
            Message 5 of 29 , Dec 5, 2011
              --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@...> wrote:
              >
              > Flo -
              >
              > Very interesting. Your clip is much faster than mine even
              > when I turned ScreenUpdate off. Mine took 41 seconds and
              > yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines
              > removed). Why? I'm not sure...

              Joy,

              I think there are three main reasons for that:

              1. Assigning the whole list to an array

              2. Calculating ^$ConvertTo26$ only twice -- it's done three times in your clip

              3. Gathering up the gaps with ^!Append and outputting them all at once -- no ^!InsertText

              > I wrote a quick clip to generate the lines aaa to zzz. Let
              > me know if anyone wants me to post the code.

              I put my hand up and would enjoy seeing that clip!

              Flo
            • joy8388608
              Message 6 of 29 , Dec 7, 2011
                --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                >
                > --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@> wrote:
                > >
                > > Flo -
                > >
                > > Very interesting. Your clip is much faster than mine even
                > > when I turned ScreenUpdate off. Mine took 41 seconds and
                > > yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines
                > > removed). Why? I'm not sure...
                >
                > Joy,
                >
                > I think there are three main reasons for that:
                >
                > 1. Assigning the whole list to an array
                >
                > 2. Calculating ^$ConvertTo26$ only twice -- it's done three times in your clip
                >
                > 3. Gathering up the gaps with ^!Append and outputting them all at once -- no ^!InsertText
                >
                > > I wrote a quick clip to generate the lines aaa to zzz. Let
                > > me know if anyone wants me to post the code.
                >
                > I put my hand up and would enjoy seeing that clip!
                >
                > Flo
                >

                My pleasure. Joy

                Generate Base 26 numbers
                ; by Joy
                ^!Continue This will generate 17576 lines from aaa to zzz.

                ^!SKIP Leave Screen update on? (Slower...)
                ^!Setscreenupdate OFF
                ^!StatusShow Generating sequences aaa to zzz...

                ; Start with aaa
                ^!Set %I%=-1

                :LoopStart
                ^!Inc %I%
                ^!Set %Num%=^%I%

                ; Find value of first digit (of 3) (will be 0 to 25)
                ^!Set %x%=^$Calc(INT(^%Num%/676))$

                ; Convert first digit to letter (will be a to z)
                ^!Set %B26%=^$DecToChar(^$Calc(^%x%+97)$)$

                ; adjust value of current number
                ^!Set %Num%=^$Calc(^%Num% - (^%x%*676))$

                ; Find value of second digit (of 3) (will be 0 to 25)
                ^!Set %x%=^$Calc(INT(^%Num%/26))$

                ; Convert second digit to letter (will be a to z) and append
                ^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%x%+97)$)$

                ; adjust value of current number
                ^!Set %Num%=^$Calc(^%Num% - (^%x%*26))$

                ; Convert remaining value (0 to 25) to letter (will be a to z) and append
                ^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%Num%+97)$)$

                ; Output value
                ^!InsertText ^%B26%^%NL%

                ^!If "^%B26%" <> "zzz" LoopStart

                ^!Sound SystemExclamation
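
For comparison, the same digit-peeling idea (divide by 676, then by 26, then add 97 to get the letter) can be sketched in Python. It is not a replacement for the clip; `int_to_code` is a name chosen for this example.

```python
def int_to_code(n: int) -> str:
    """Convert 0..17575 to 'aaa'..'zzz' by peeling off base-26 digits."""
    first, rest = divmod(n, 676)      # first digit, 0 to 25
    second, third = divmod(rest, 26)  # second and third digits, 0 to 25
    # 97 is the character code of 'a', as in the clip's DecToChar calls.
    return "".join(chr(d + 97) for d in (first, second, third))

lines = [int_to_code(n) for n in range(26 ** 3)]
print(len(lines), lines[0], lines[-1])  # 17576 aaa zzz
```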