
Re: Finding gaps in a sequence

Message 1 of 29, Dec 1, 2011
--- In ntb-clips@yahoogroups.com, "Eb" <ebbtidalflats@...> wrote:
>
> Hi Flo,
>
> You are right in what the hex conversion was supposed to do.
> In the meantime I found my original char to hex clip, which only converted a single digit. I applied the single-digit approach to your problem. While I got it to work, it just raised another problem.
>
> The alphabet is like a base-26 number set (English alphabet), after shifting a to zero. Straight conversion to numbers creates gaps, where it rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of the next digit, and azz to baa has a gap much larger!

Sorry if I misunderstood you but I'll reply just in case in order to save you possible extra work and confusion...

You said aaz to aba has a gap but you correctly noted a=0...z=25.
Therefore, aaz=(0*26^2 + 0*26 + 25)=25 and aba=(0*26^2 + 1*26 + 0)=26 - No gap. Likewise, azz=675 and baa=676. Again, no gap.
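
For anyone who wants to check this arithmetic outside NoteTab, here is a minimal Python sketch. It assumes lower-case codes with a=0 ... z=25 as described above; the function name code_to_int is made up for illustration and does not come from any clip in this thread.

```python
def code_to_int(code: str) -> int:
    """Read a lower-case alpha code as a base-26 number with a=0 ... z=25."""
    value = 0
    for ch in code:
        if not "a" <= ch <= "z":
            raise ValueError(f"not a lower-case letter: {ch!r}")
        # Shift the running value one base-26 place and add the new digit.
        value = value * 26 + (ord(ch) - ord("a"))
    return value


for code in ("aaz", "aba", "azz", "baa"):
    print(code, code_to_int(code))
```

The four values printed are 25, 26, 675 and 676, so each pair differs by exactly 1.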

Hope this helps, sorry if I misunderstood.

Joy
Message 2 of 29, Dec 1, 2011
Thanks, Eb! Now we've got another working solution.

I've tested it successfully. In a list of 10,000 3-digit alpha codes it needs 118 seconds to find a gap.

Maybe it's a bit complicated to see those gaps because it outputs numbers and not the code -- but never mind. What matters here is the basic concept.

Flo

--- In ntb-clips@yahoogroups.com, "Eb" <ebbtidalflats@...> wrote:
>
> Hi Flo,
>
> You are right in what the hex conversion was supposed to do.
> In the meantime I found my original char to hex clip, which only converted a single digit. I applied the single-digit approach to your problem. While I got it to work, it just raised another problem.
>
> The alphabet is like a base-26 number set (English alphabet), after shifting a to zero. Straight conversion to numbers creates gaps, where it rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of the next digit, and azz to baa has a gap much larger!
>
> I believe you already had a base-26 suggestion. But I was on a roll and created my own version, looking at two ideas:
>
> 1. a single array, containing the base-26 values 0..25, used with the calc function, to arrive at consecutive decimal values for your alpha codes.
>
> 2. Three separate arrays, one for each digit in your codes, containing look-up values for each alpha character by digit.
>
> The second version is more efficient, and I have included it here (mind the long lines):
>
>
> ----------->8-------------
> H="ThreeDigitAlphaToBase26"
> ;value of characters in 1st to 3rd (right to left) digit of code
> ^!SetArray %digits3%=0;1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25
> ^!SetArray %digits2%=0;26;52;78;104;130;156;182;208;234;260;286;312;338;364;390;416;442;468;494;520;546;572;598;624;650
> ^!SetArray %digits1%=0;676;1352;2028;2704;3380;4056;4732;5408;6084;6760;7436;8112;8788;9464;10140;10816;11492;12168;12844;13520;14196;14872;15548;16224;16900
> ^!Set %offset%=64
> ;note: offsets a to 1, which has value zero in numbering system
> ;--------------------------------------------
> ;extract codes
> ^!SetListDelimiter ^%nl%
> ^!SetArray %codes%=^$GetDocMatchAll("^[a-z]{3}")$
> ;loop codes
> ^!Set %i%=0
> :Loop
> ^!Inc %i%
> ;fetch code digits one at a time, consolidate in temp
> ^!Set %one%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";1)$)$)$
> ^!Dec %one% ^%offset%
> ^!Set %temp%=^%digits1^%one%%
> ^!Set %two%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";2)$)$)$
> ^!Dec %two% ^%offset%
> ^!Inc %temp% ^%digits2^%two%%
> ^!Set %tre%=^$CharToDec(^$StrUpper(^$StrIndex("^%codes^%i%%";3)$)$)$
> ^!Dec %tre% ^%offset%
> ^!Inc %temp% ^%digits3^%tre%%
> ;assign assembled code back to codes array
> ^!Set %codes^%i%%=^%temp%
> ;--------------------------------------------
> :gap_trap
> ^!If ^%i%>1 SKIP_2
> ^!Set %OLD%=^%temp%
> ^!Dec %old%
> ;incrementing OLD should set it to same as new
> ^!Inc %OLD%
> ;temporarily disable for testing
> ^!If ^%old%<>^%temp% HANDLE_GAP
> ;if difference > 1 signal a gap
> ;--------------------------------------------
> :NOGAP
> ^!Set %old%=^%temp%
> ^!Info ^%codes^%i%%
> ^!If ^%i%<^%codes0% LOOP
> ^!Goto END
>
> :HANDLE_GAP
> ^!Info [L]There is a gap at ^%old% to ^%temp%^%nl%Continuing...
> ^!Set %OLD%=^%TEMP%
> ^!Goto NOGAP
> ----------->8-------------
>
>
> Cheers,
>
>
> Eb
>
> PS
> I'm guessing at some of the stuff below:
> The conversion to hex failed because NoteTab saw the insertion of a plain caret as the beginning of a parsable something, and when changed to ^%caret% it included the caret like an escaped character, no longer capable of triggering the parser.
>
> As to the HexToInt function, it works fine when an actual hex number is passed to it.
>
Message 3 of 29, Dec 1, 2011
Joy,

I also went through your clip again (messages #22230, #22245). I like that formula '^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$' which, actually, seems to be the heart of your solution.

So I combined it with some ideas of my first concept and managed to speed up your clip significantly. Originally, your clip needed 78 seconds (on my notebook) to check a list of 10,000 codes. The following version does it in 9 seconds:

^!SetHintInfo Working...
; Assign code list to array %List%
^!SetListDelimiter ^%NL%
^!SetArray %List%=^$GetText$
^!Set %AZ%="abcdefghijklmnopqrstuvwxyz"
^!Set %i%=1

:CodeToInt
; Save current code to variable for later output in case of gap
^!Set %CurrCode%=^%List^%i%%
; Convert code to number(with Joy's formula)
^!Set %First%=^$Convert(^%List^%i%%)$
^!Inc %First%
^!Inc %i%
^!If ^%i% > ^%List0% Out
^!Set %Second%=^$Convert(^%List^%i%%)$
^!IfSame ^%First% ^%Second% CodeToInt Else False

:False
^!Append %Gaps%=^%CurrCode%^P
^!Goto CodeToInt

:Out
^!IfEmpty ^%Gaps% Next Else Skip_2
^!Info No gaps!
^!Goto Skip_3
^!Toolbar New Document
^!InsertText Gap found after...^P^%Gaps%
^!Toolbar Second Window
^!ClearVariables

The subclip with custom function ^$Convert$ and your formula is...

^!Set %C1%=^$StrIndex(^&;1)$
^!Set %C2%=^$StrIndex(^&;2)$
^!Set %C3%=^$StrIndex(^&;3)$
^!Set %V1%=^$StrPos(^%C1%;^%AZ%;0)$
^!Set %V2%=^$StrPos(^%C2%;^%AZ%;0)$
^!Set %V3%=^$StrPos(^%C3%;^%AZ%;0)$
^!Result ^$Calc(^%V1%*676 + ^%V2%*26 + ^%V3%)$
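
The same compare-with-successor idea can be sketched outside NoteTab. The Python below is only an illustration, not a translation of the clip: the names code_to_int and find_gaps are hypothetical, it assumes a sorted list of lower-case 3-letter codes with a=0 ... z=25, and it collects the code in front of each gap, as the clip does with ^!Append.

```python
def code_to_int(code):
    """Base-26 value of a 3-letter code, a=0 ... z=25 (same weights as the Calc formula)."""
    v1, v2, v3 = (ord(ch) - ord("a") for ch in code)
    return v1 * 676 + v2 * 26 + v3


def find_gaps(codes):
    """Return every code that is not followed by its direct successor."""
    gaps = []
    for current, following in zip(codes, codes[1:]):
        # A gap exists when current + 1 is not the next value in the list.
        if code_to_int(current) + 1 != code_to_int(following):
            gaps.append(current)
    return gaps


# Sample list from the message that started this topic (#22221).
sample = ["zbx", "zby", "zbz", "zca", "zcc", "zcd"]
print("Gap found after:", find_gaps(sample))
```

For the sample list this reports a gap after zca, because zcb is missing.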

Thanks again for your proposal! Maybe you'll have a look at this revised version...

Regards,
Flo

Message 4 of 29, Dec 1, 2011
Flo,

--- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:

> I've tested it successfully. In a list of 10,000 3-digit alpha codes it needs 118 seconds to find a gap.

Is that fast or slow?

> Maybe it's a bit complicated to see those gaps because it outputs numbers and not the code -- but never mind. What matters here is the basic concept.

To change the output to the original code, just make a copy of the 'codes' array and use the unadulterated copy to display the gap (or display both the original code and the numeric code, since the numbers give a clearer picture of how large the gap is).

Cheers
Message 5 of 29, Dec 1, 2011
At 11/30/2011 13:28, Eb wrote:
>The alphabet is like a base-26 number set (English alphabet), after
>shifting a to zero. Straight conversion to numbers creates gaps, where it
>rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of
>the next digit, and azz to baa has a gap much larger!

If I am interpreting correctly what you said here, the statement is not
correct - there is no gap using the alphabet as symbols for a base 26
numbering system.

Any integer of magnitude greater than 1 (including negative ones) may be used
as a base for counting sequentially, and a number then takes the form:
sum(d(i) * b^i) where d(i) is the ith "digit" (right to left, 0 based) and
"b" is the base. In the case of using the alphabet symbols to represent
base 26 digits: a=0, b=1 ... z=25 and the base 10 value of any such number
would be d2 * 26^2 + d1 * 26^1 + d0 * 26^0 or d2*676 + d1*26 + d0*1.

Thus aaz = 0*676 + 0*26 + 25*1 = 25 and aba = 0*676 + 1*26 +0*1 = 26
(no gap)
Also azz = 0*676 + 25*26 + 25*1 = 675 and baa = 1*676 + 0*26 +0*1 = 676
(again, no gap)

Your code does this correctly, so the statement may just be ambiguously worded.

BTW, very clever use of ^!Inc & ^!Dec to do arithmetic! I'll have to
remember that.

I have noted that none of the suggested solutions does any input data
verification; all assume that each line truly begins with three lower-case
alpha characters. Your use of ^$GetDocMatchAll("^[a-z]{3}")$ to
extract the sequence codes would seem to offer a simple, one-line way to
verify that assumption: just compare the size of the ^%codes% array to the
line count of the source document.

^!If ^$GetParaCount$ <> ^%codes0% ^!Continue Input data error - missing sequence code(s)
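
Outside NoteTab, the same count comparison could be sketched in Python as follows. This is an illustration only: the function name check_codes is hypothetical, and it assumes one record per line with the code at the start of the line. Like the regular expression above, it also accepts a line that starts with a longer lower-case word.

```python
import re


def check_codes(text):
    """Compare the number of leading 3-letter codes with the number of lines."""
    lines = text.splitlines()
    # With re.MULTILINE, ^ anchors at the start of every line, not just the text.
    codes = re.findall(r"^[a-z]{3}", text, flags=re.MULTILINE)
    return len(codes) == len(lines), codes


ok, codes = check_codes("zbx\nzby\nZca\nzcc\n")
if not ok:
    print("Input data error - missing sequence code(s)")
```

Here the third line starts with an upper-case letter, so only three codes are found for four lines and the check fails.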

Namaste', Art
Message 6 of 29, Dec 2, 2011
Joy,

I observed a gap while using a non-mathematical (==>) technique to convert from base 26 (the alphabet) to base 16 by using the ASCII codes of the upper-case letters:

aaz ==> 0x41415A = 4,276,570
aba ==> 0x414241 = 4,276,801

Once I shifted to the base 26 array approach, I may have stayed in the haze of non-math confusion for a bit longer <g>.
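
A short Python sketch makes the difference between the two approaches visible. It is only an illustration under the assumptions stated here: the names ascii_hex_value and base26_value are hypothetical, the first function strings together the ASCII hex codes of the upper-cased letters as in the two lines above, and the second uses a=0 ... z=25.

```python
def ascii_hex_value(code):
    """Concatenate the ASCII hex codes of the upper-cased letters, read as one hex number."""
    hex_digits = "".join(f"{ord(ch):02X}" for ch in code.upper())
    return int(hex_digits, 16)


def base26_value(code):
    """Base-26 value with a=0 ... z=25."""
    value = 0
    for ch in code.lower():
        value = value * 26 + (ord(ch) - ord("a"))
    return value


for first, second in (("aaz", "aba"), ("azz", "baa")):
    hex_step = ascii_hex_value(second) - ascii_hex_value(first)
    b26_step = base26_value(second) - base26_value(first)
    print(first, "->", second, "hex step:", hex_step, "base-26 step:", b26_step)
```

The hex step from aaz to aba is 231, while the base-26 step is 1. Each letter occupies a full byte in the hex number but only 26 of its 256 values are used, which is where the apparent gap at every rollover comes from.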

Eb

Message 7 of 29, Dec 2, 2011
Yes, I was still confused by my earlier attempt to convert character codes to hex codes using ascii.

My test clip still had elements of hex code in it.

Color me embarrassed.

Eb

--- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
>
> At 11/30/2011 13:28, Eb wrote:
> >The alphabet is like a base-26 number set (English alphabet), after
> >shifting a to zero. Straight conversion to numbers creates gaps, where it
> >rolls to the next digit, i.e. aaz --> aba has a gap of 26!, the value of
> >the next digit, and azz to baa has a gap much larger!
>
> If I am interpreting correctly what you said here, the statement is not
> correct - there is no gap using the alphabet as symbols for a base 26
> numbering system.
Message 8 of 29, Dec 2, 2011
Hi Art,

I suspect that none of the people offering solutions are privy to the format of the data file. So verifying input must be left to Flo.

For example, the ^$GetDocMatchAll$ statement must include the field delimiter to avoid also matching the first three characters of longer words, which might not be index codes at all.

Cheers

Eb

--- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@...> wrote:
> ...
> I have noted that none of the suggested solutions have done any input data
> verification but all assume that each line truly begins with a three (lower
> case) alpha character. Your use of ^$GetDocMatchAll("^[a-z]{3}")$ to
> extract the sequence codes would seem to offer a simple, one-line way to
> verify that assumption: just compare the size of the ^%codes% array to the
> line count of the source document.
>
> ^!If ^$GetParaCount$ <> ^%codes0% ^!Continue Input data error - missing sequence code(s)
>
>
> Namaste', Art
>
Message 9 of 29, Dec 2, 2011
> --- In ntb-clips@yahoogroups.com, Art Kocsis <artkns@> wrote:
> I have noted that none of the suggested solutions have done any
> input data verification but all assume that each line truly
> begins with a three (lower case) alpha character...

--- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...> wrote:
>
> Hi Art,
>
> I suspect that none of the people offering solutions are privy to
> the format of the data file. So verifying input must be left to Flo.

Friends,

I started this topic with message #22221 writing...

> I've got a database where each record is indexed with an alpha-code
> from 'aaa' to 'zzz'. Every now and then, I want to find out if there
> is a gap in a sorted list of these codes. There's a gap, for
> example, in...
>
> zbx
> zby
> zbz
> zca
> zcc
> zcd

So why speculate about the format of the data? Why invent characters and strings which actually are not there?

"For we write none other things unto you,
than what ye read or acknowledge..."
Corinthians 2, 1:13

Flo
Message 10 of 29, Dec 5, 2011
Flo -

Very interesting. Your clip is much faster than mine even when I turned ScreenUpdate off. Mine took 41 seconds and yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines removed). Why? I'm not sure. Perhaps it is because you work with an array, even though the lines on a screen are probably just another type of array.

This has been fun, interesting, and I've learned several new things.

Oh, yes. You don't have to, but as I posted previously, you can modify the value of %AZ% to "bcdefghijklmnopqrstuvwxyz" (remove the 'a') for correctness.

Thanks,
Joy

P.S. On the off chance anyone else (still) wants to play with this for learning purposes, I wrote a quick clip to generate the lines aaa to zzz. Let me know if anyone wants me to post the code.

Message 11 of 29, Dec 5, 2011
--- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@...> wrote:
>
> Flo -
>
> Very interesting. Your clip is much faster than mine even
> when I turned ScreenUpdate off. Mine took 41 seconds and
> yours took 15 for 17550 lines (aaa to zzz with 26 .rr lines
> removed). Why? I'm not sure...

Joy,

I think there are three main reasons for that:

1. Assigning the whole list to an array

2. Calculating ^$ConvertTo26$ only twice -- it's done three times in your clip

3. Gathering up the gaps with ^!Append and outputting them all at once -- no ^!InsertText

> I wrote a quick clip to generate the lines aaa to zzz. Let
> me know if anyone wants me to post the code.

I put my hand up and would enjoy seeing that clip!

Flo
Message 12 of 29, Dec 7, 2011
--- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
>
> --- In ntb-clips@yahoogroups.com, "joy8388608" <mycroftj@> wrote:
> >
> > I wrote a quick clip to generate the lines aaa to zzz. Let
> > me know if anyone wants me to post the code.
>
> I put my hand up and would enjoy seeing that clip!
>
> Flo
>

My pleasure. Joy

Generate Base 26 numbers
; by Joy
^!Continue This will generate 17576 lines from aaa to zzz.

^!SKIP Leave Screen update on? (Slower...)
^!Setscreenupdate OFF
^!StatusShow Generating sequences aaa to zzz...

; Start with aaa
^!Set %I%=-1

:LoopStart
^!Inc %I%
^!Set %Num%=^%I%

; Find value of first digit (of 3) (will be 0 to 25)
^!Set %x%=^$Calc(INT(^%Num%/676))$

; Convert first digit to letter (will be a to z)
^!Set %B26%=^$DecToChar(^$Calc(^%x%+97)$)$

; adjust value of current number
^!Set %Num%=^$Calc(^%Num% - (^%x%*676))$

; Find value of second digit (of 3) (will be 0 to 25)
^!Set %x%=^$Calc(INT(^%Num%/26))$

; Convert second digit to letter (will be a to z) and append
^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%x%+97)$)$

; adjust value of current number
^!Set %Num%=^$Calc(^%Num% - (^%x%*26))$

; Convert remaining value (0 to 25) to letter (will be a to z) and append
^!Set %B26%=^%B26%^$DecToChar(^$Calc(^%Num%+97)$)$

; Output value
^!InsertText ^%B26%^%NL%

^!If "^%B26%" <> "zzz" LoopStart

^!Sound SystemExclamation
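
For comparison, the same digit-by-digit conversion can be sketched in Python. This is only an illustration of the technique used in the clip (integer division by 676 and 26, then adding 97 to get the ASCII code of a lower-case letter); the function name int_to_code is hypothetical.

```python
def int_to_code(num):
    """Convert 0 ... 17575 to a 3-letter code aaa ... zzz (a=0 ... z=25)."""
    if not 0 <= num < 26 ** 3:
        raise ValueError("value out of range for a 3-letter code")
    first, rest = divmod(num, 676)   # first digit and what is left over
    second, third = divmod(rest, 26)  # second digit and remaining value
    # 97 is the ASCII code of "a", as in the clip's DecToChar calls.
    return "".join(chr(digit + 97) for digit in (first, second, third))


codes = [int_to_code(n) for n in range(26 ** 3)]
print(len(codes), codes[0], codes[-1])
```

This builds the full list of 17576 codes from aaa to zzz in memory; writing them to a file, one per line, gives a test document for the gap-finding clips.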