## Finding a sequence of elements in a list

Expand Messages
• I m a little unsure whats the best way to try and find 2 elements in a list that are next to each other. For example if I had a list [1;2;3;4;5] and I wanted
Message 1 of 21 , Jan 7, 2005
I'm a little unsure whats the best way to try and find 2 elements in a
list that are next to each other. For example if I had a list
[1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
it does) how would I go about doing this?
Thanks
Robin
• ... I m fairly certain that the standard library doesn t contain any functions for considering adjacent elements. But it s easy enough to write your own! The
Message 2 of 21 , Jan 7, 2005
On Fri, Jan 07, 2005 at 11:41:51AM -0000, rob_lyles wrote:
>
>
> I'm a little unsure whats the best way to try and find 2 elements in a
> list that are next to each other. For example if I had a list
> [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
> it does) how would I go about doing this?

I'm fairly certain that the standard library doesn't contain any
functions for considering adjacent elements. But it's easy enough to

The simple case:

# let rec test = function
[] -> false
| [x] -> false
| 3 :: 4 :: xs -> true
| x :: xs -> test xs ;;
val test : int list -> bool = <fun>
# test [ 1; 2; 3; 4; 5 ];;
- : bool = true
# test [ 1; 2; 3; 3; 5 ];;
- : bool = false
# test [ 3; 4; 1; 2; 3 ];;
- : bool = true
# test [ 1; 2; 3; 4 ];;
- : bool = true

You can also abstract the general pattern (testing adjacent elements)
into a higher-order function:

# let rec test_pairs f = function
[] -> false
| [x] -> false
| x :: y :: xs when f x y -> true
| x :: xs -> test xs;;
val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

'test' can now be redefined in terms of this higher-order function:

# let test = test_pairs (fun a b -> a = 3 && b = 4);;
val test : int list -> bool = <fun>
# test [ 1; 2; 3; 4; 5 ];;
- : bool = true
# test [ 1; 2; 3; 3; 5 ];;
- : bool = false
# test [ 3; 4; 1; 2; 3 ];;
- : bool = true
# test [ 1; 2; 3; 4 ];;
- : bool = true

It's also possible, albeit less efficient, to do something involving
List.fold_left.

Rich.

--
• ... When the function is corrected, this gives: val test_pairs : ( a - a - bool) - a list - bool = Rich. --
Message 3 of 21 , Jan 7, 2005
On Fri, Jan 07, 2005 at 12:40:57PM +0000, Richard Jones wrote:
> You can also abstract the general pattern (testing adjacent elements)
> into a higher-order function:
>
> # let rec test_pairs f = function
> [] -> false
> | [x] -> false
> | x :: y :: xs when f x y -> true
> | x :: xs -> test xs;;
^^^^ should be test_pairs f !

> val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

When the function is corrected, this gives:

val test_pairs : ('a -> 'a -> bool) -> 'a list -> bool = <fun>

Rich.

--
• ... I don t know if this is the best way , but I guess it s not very bad either ocaml # let rec find_next l a b = match l with ... ;; val find_next : a list
Message 4 of 21 , Jan 7, 2005
Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
> I'm a little unsure whats the best way to try and find 2 elements in a
> list that are next to each other. For example if I had a list
> [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
> it does) how would I go about doing this?
> Thanks
> Robin

I don't know if this is the "best way", but I guess it's not very bad either

ocaml
# let rec find_next l a b =
match l with
|_a::_b::_ when (_a,_b)=(a,b)-> true
|[] -> false
|_ -> find_next (List.tl l) a b
;;
val find_next : 'a list -> 'a -> 'a -> bool = <fun>
# find_next [1;2;3;4;5] 3 4;;
- : bool = true
# find_next [1;2;3;4;5] 3 5;;
- : bool = false
# find_next [1;2;3;4;5] 1 5;;
- : bool = false

====

Or you can try to search for a list as a sublist of another list to make
things more generic (so you can look for more than adjacent elements)

something like

find_sublist [1;2;3;4;5;6;7] [3;4;5] -> true

this is not so simple as this but not very hard either.

--
Cláudio Valente
• Thank you both for the help Robin ... bad either ... make
Message 5 of 21 , Jan 7, 2005
Thank you both for the help

Robin

--- In ocaml_beginners@yahoogroups.com, Cláudio Valente
<cvalente@c...> wrote:
> Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
> > I'm a little unsure whats the best way to try and find 2 elements in a
> > list that are next to each other. For example if I had a list
> > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
> > it does) how would I go about doing this?
> > Thanks
> > Robin
>
> I don't know if this is the "best way", but I guess it's not very
>
> ocaml
> # let rec find_next l a b =
> match l with
> |_a::_b::_ when (_a,_b)=(a,b)-> true
> |[] -> false
> |_ -> find_next (List.tl l) a b
> ;;
> val find_next : 'a list -> 'a -> 'a -> bool = <fun>
> # find_next [1;2;3;4;5] 3 4;;
> - : bool = true
> # find_next [1;2;3;4;5] 3 5;;
> - : bool = false

> # find_next [1;2;3;4;5] 1 5;;
> - : bool = false
>
>
> ====
>
> Or you can try to search for a list as a sublist of another list to
make
> things more generic (so you can look for more than adjacent elements)
>
> something like
>
> find_sublist [1;2;3;4;5;6;7] [3;4;5] -> true
>
> this is not so simple as this but not very hard either.
>
> --
> Cláudio Valente
• ... It s correct but weird, why use List.tl when you are doing a pattern matching ? a better version would be: # let rec find_next l a b = match l with ... It
Message 6 of 21 , Jan 7, 2005
On Fri, 7 Jan 2005 12:50:07 +0000, Cláudio Valente <cvalente@...> wrote:
>
> Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
> > I'm a little unsure whats the best way to try and find 2 elements in a
> > list that are next to each other. For example if I had a list
> > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
> > it does) how would I go about doing this?
> > Thanks
> > Robin
>
> I don't know if this is the "best way", but I guess it's not very bad either
>
> ocaml
> # let rec find_next l a b =
> match l with
> |_a::_b::_ when (_a,_b)=(a,b)-> true
> |[] -> false
> |_ -> find_next (List.tl l) a b
> ;;
> val find_next : 'a list -> 'a -> 'a -> bool = <fun>
> # find_next [1;2;3;4;5] 3 4;;
> - : bool = true
> # find_next [1;2;3;4;5] 3 5;;
> - : bool = false
> # find_next [1;2;3;4;5] 1 5;;
> - : bool = false
>

It's correct but weird, why use List.tl when you are doing a pattern matching ?
a better version would be:
# let rec find_next l a b =
match l with
|_a::_b::_ when (_a,_b)=(a,b)-> true
|[] -> false
|_::rest -> find_next rest a b

It does the same thing, but it is more idiomatic (and efficient, but
this is less important)
• ... you are right that was the one I ended up with after cleaning my previous version, so I agree with you. Would you care to try to solve my other problem.
Message 7 of 21 , Jan 7, 2005
Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
> It's correct but weird, why use List.tl when you are doing a pattern
> matching ? a better version would be:
> # let rec find_next l a b =
>     match l with
>     |_a::_b::_  when (_a,_b)=(a,b)-> true
>     |[] -> false
>     |_::rest -> find_next rest a b
you are right
that was the one I ended up with after cleaning my previous version, so I
agree with you.

Would you care to try to solve my other problem.
checking if b is a sublist from a

I ended with:

# let rec sub_list a b =
let rec start_list a b = match (a,b) with
|(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
|(_,[]) -> true
|_ -> false
in
match (a,b) with
| (_,[]) -> true
| ([],_) -> false
| _ when (start_list a b) -> true
| (_::a_,_) -> sub_list a_ b
;;
val sub_list : 'a list -> 'a list -> bool = <fun>
# sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
- : bool = true
# sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
- : bool = false
# sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
- : bool = false
# sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
- : bool = true
# sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
- : bool = true

I'm not very convinced there isn't an easier way to do this and since you
improved on my previous solution, maybe you can find a more elegant way to
solve this more general problem.

Or am I just complicating and the standard library has a way to accomplish
this?

--
Cláudio Valente
• how about: let sublist a b = let rec shared a b = match (a, b) with (_, []) - true ... let rec step a = match a with [] - false ... and sublist
Message 8 of 21 , Jan 7, 2005

let sublist a b =
let rec shared a b =
match (a, b) with
(_, []) -> true
| (a'::as, b'::bs) -> if a' = b' then shared as bs else false in
let rec step a =
match a with
[] -> false
| _::as -> sublist' as
and sublist' a = shared a b || step a in
sublist' a

(i'm not sure if it's "||" or "or", but you get the idea...)

andrew

Cláudio Valente said:
>
> Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
>> It's correct but weird, why use List.tl when you are doing a pattern
>> matching ? a better version would be:
>> # let rec find_next l a b =
>> match l with
>> |_a::_b::_ when (_a,_b)=(a,b)-> true
>> |[] -> false
>> |_::rest -> find_next rest a b
> you are right
> that was the one I ended up with after cleaning my previous version, so I
> agree with you.
>
> Would you care to try to solve my other problem.
> checking if b is a sublist from a
>
> I ended with:
>
> # let rec sub_list a b =
> let rec start_list a b = match (a,b) with
> |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
> |(_,[]) -> true
> |_ -> false
> in
> match (a,b) with
> | (_,[]) -> true
> | ([],_) -> false
> | _ when (start_list a b) -> true
> | (_::a_,_) -> sub_list a_ b
> ;;
> val sub_list : 'a list -> 'a list -> bool = <fun>
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
> - : bool = true
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
> - : bool = false
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
> - : bool = false
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
> - : bool = true
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
> - : bool = true
>
> I'm not very convinced there isn't an easier way to do this and since you
> improved on my previous solution, maybe you can find a more elegant way to
> solve this more general problem.
>
> Or am I just complicating and the standard library has a way to accomplish
> this?
>
> --
> Cláudio Valente
>
>
>
> http://www.connettivo.net/cntprojects/ocaml_beginners/
> The archives of the very official ocaml list (the seniors' one) can be
> found at http://caml.inria.fr
> Attachments are banned and you're asked to be polite, avoid flames etc.
>
>
>
>
>
>
>
>

--
` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
/ _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
\__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
• ... This is probaby the easier way to do this, but it is not the more efficient one. In fact it is an intersting problem, and probably the same as the search
Message 9 of 21 , Jan 7, 2005
On Fri, 7 Jan 2005 16:01:59 +0000, Cláudio Valente <cvalente@...> wrote:
>
> Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
> > It's correct but weird, why use List.tl when you are doing a pattern
> > matching ? a better version would be:
> > # let rec find_next l a b =
> > match l with
> > |_a::_b::_ when (_a,_b)=(a,b)-> true
> > |[] -> false
> > |_::rest -> find_next rest a b
> you are right
> that was the one I ended up with after cleaning my previous version, so I
> agree with you.
>
> Would you care to try to solve my other problem.
> checking if b is a sublist from a
>
> I ended with:
>
> # let rec sub_list a b =
> let rec start_list a b = match (a,b) with
> |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
> |(_,[]) -> true
> |_ -> false
> in
> match (a,b) with
> | (_,[]) -> true
> | ([],_) -> false
> | _ when (start_list a b) -> true
> | (_::a_,_) -> sub_list a_ b
> ;;
> val sub_list : 'a list -> 'a list -> bool = <fun>
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
> - : bool = true
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
> - : bool = false
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
> - : bool = false
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
> - : bool = true
> # sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
> - : bool = true
>
> I'm not very convinced there isn't an easier way to do this and since you
> improved on my previous solution, maybe you can find a more elegant way to
> solve this more general problem.

This is probaby the easier way to do this, but it is not the more
efficient one. In fact it is an intersting problem, and probably the
same as the search of a substring in a string. Their is severall well
knonw algorithm to make this (If I remind correctly, the
Knuth-Morris-Pratt for example but there are other) .
• Hi, If what you mean is advanced string matching algorithm like KMP in OCaml, I don t have idea either. As the simple algorithm, I think my solution is just
Message 10 of 21 , Jan 7, 2005
Hi,

If what you mean is advanced string matching algorithm like KMP in
OCaml, I don't have idea either. As the simple algorithm, I think my
solution is just similar to yours, except minor syntactic difference.
>
> # let rec sub_list a b =
> let rec start_list a b = match (a,b) with
> |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
> |(_,[]) -> true
> |_ -> false
> in
> match (a,b) with
> | (_,[]) -> true
~~~~~~~~~~~~~The line above is unnecessary, since it's already handled
in function start_list. And "b" is a free variable here so you don't
really need to "match (a,b)", "match a" should be sufficient.

> | ([],_) -> false
> | _ when (start_list a b) -> true
> | (_::a_,_) -> sub_list a_ b
> ;;
> val sub_list : 'a list -> 'a list -> bool = <fun>
My code here:

let rec sub_list a b =
let rec compare = function
_,[] -> true
| [],_ -> false
|_x::x_,_y::y_ -> (_x=_y) && compare (x_,y_) in
match a with
[] -> false
| h::t -> compare (a,b) || (sub_list t b);;

Another fact is I using "sequential AND" and "sequential OR" instead the
"when" in your program. I'm not sure about the efficiency at this point,
I suppose them to be not that different. I don't like to use "when",
because it occasionally causes short pauses during my code reading :( I
usually feel a pattern matching with "when" doesn't have a very clear
structure for understanding. Probably just my prejudice. If anyone has
suggestions on that, I'd like to hear.
• [repost- original never showed up] ... When the function is corrected, this gives: val test_pairs : ( a - a - bool) - a list - bool = Rich.
Message 11 of 21 , Jan 7, 2005
[repost- original never showed up]

On Fri, Jan 07, 2005 at 12:40:57PM +0000, Richard Jones wrote:
> You can also abstract the general pattern (testing adjacent elements)
> into a higher-order function:
>
> # let rec test_pairs f = function
> [] -> false
> | [x] -> false
> | x :: y :: xs when f x y -> true
> | x :: xs -> test xs;;
^^^^ should be test_pairs f !

> val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

When the function is corrected, this gives:

val test_pairs : ('a -> 'a -> bool) -> 'a list -> bool = <fun>

Rich.
• ... don t they assume constant time access to characters (arrays)? this question is about linked lists. andrew -- ` __ _ __ ___ ___| |_____ work web site:
Message 12 of 21 , Jan 7, 2005
Remi Vanicat said:

> This is probaby the easier way to do this, but it is not the more
> efficient one. In fact it is an intersting problem, and probably the
> same as the search of a substring in a string. Their is severall well
> knonw algorithm to make this (If I remind correctly, the
> Knuth-Morris-Pratt for example but there are other) .

andrew

--
` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
/ _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
\__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
• ... At first sight KMP doesn t seem to be easily adapted to linked lists, as you said. But I m pretty sure it can be done for Rabin-Karp. I ll have to try
Message 13 of 21 , Jan 7, 2005
--- andrew cooke <andrew@...> wrote:

> > knonw algorithm to make this (If I remind correctly, the
> > Knuth-Morris-Pratt for example but there are other) .
>

At first sight KMP doesn't seem to be easily adapted to linked lists,
as you said. But I'm pretty sure it can be done for Rabin-Karp. I'll
have to try though, to see if there aren't any surprises.

regards,

__________________________________
Do you Yahoo!?
The all-new My Yahoo! - What will yours do?
http://my.yahoo.com
• On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke ... Well, I haven t reread the KMP algorithm recently, but I remember that the idea is more or less to
Message 14 of 21 , Jan 7, 2005
On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke
<andrew@...> wrote:
>
>
> Remi Vanicat said:
>
> > This is probaby the easier way to do this, but it is not the more
> > efficient one. In fact it is an intersting problem, and probably the
> > same as the search of a substring in a string. Their is severall well
> > knonw algorithm to make this (If I remind correctly, the
> > Knuth-Morris-Pratt for example but there are other) .
>

Well, I haven't reread the KMP algorithm recently, but I remember that
the idea is more or less to build a finite automaton whose state are
represented by index in the string that we are looking for. So one
could use some way to represent this automaton for linked list. Of
course, As I'm not sure anymore, I way be remembering of another
algorithm
• ok, maybe i m confusing my substring algorithms. i thought the important part of the standard fast solution (maybe i m even wrong in thinking there is such
Message 15 of 21 , Jan 7, 2005
ok, maybe i'm confusing my substring algorithms. i thought the important
part of the "standard fast solution" (maybe i'm even wrong in thinking
there is such a thing) included jumping ahead when it could infer that no
match was possible in a certain section (ie that the state machine didn't
process values sequentially).

...goes away and checks cormen...

for kmp the prefix function tells you, when a match fails, how far
forwards you can jump. however, i think i'm right (ie overall i'm wrong)
in thinking that it's not so much the jump that is important, but rather
not needing to repeatedly rematch while traversing that jump. so you're
not big-O slower if you have to follow a chain of links to do so.

andrew

Remi Vanicat said:
>
> On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke
> <andrew@...> wrote:
>>
>>
>> Remi Vanicat said:
>>
>> > This is probaby the easier way to do this, but it is not the more
>> > efficient one. In fact it is an intersting problem, and probably the
>> > same as the search of a substring in a string. Their is severall well
>> > knonw algorithm to make this (If I remind correctly, the
>> > Knuth-Morris-Pratt for example but there are other) .
>>
>
> Well, I haven't reread the KMP algorithm recently, but I remember that
> the idea is more or less to build a finite automaton whose state are
> represented by index in the string that we are looking for. So one
> could use some way to represent this automaton for linked list. Of
> course, As I'm not sure anymore, I way be remembering of another
> algorithm
>
>
> http://www.connettivo.net/cntprojects/ocaml_beginners/
> The archives of the very official ocaml list (the seniors' one) can be
> found at http://caml.inria.fr
> Attachments are banned and you're asked to be polite, avoid flames etc.
>
>
>
>
>
>
>
>

--
` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
/ _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
\__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
• On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke ... KMP jump ahead in string we are looking for, not in the string we are looking into. And one can see
Message 16 of 21 , Jan 8, 2005
On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke
<andrew@...> wrote:
>
>
> ok, maybe i'm confusing my substring algorithms. i thought the important
> part of the "standard fast solution" (maybe i'm even wrong in thinking
> there is such a thing) included jumping ahead when it could infer that no
> match was possible in a certain section (ie that the state machine didn't
> process values sequentially).

KMP jump ahead in string we are looking for, not in the string we are
looking into. And one can see that to jump ahead in the string, and to
use an automaton whose state are location in the string is
fondamentaly the same thing.

>
> ...goes away and checks cormen...
>
> for kmp the prefix function tells you, when a match fails, how far
> forwards you can jump. however, i think i'm right (ie overall i'm wrong)
> in thinking that it's not so much the jump that is important, but rather
> not needing to repeatedly rematch while traversing that jump. so you're
> not big-O slower if you have to follow a chain of links to do so.

You will be slower. You change an constant time thing by a O(n) thing,
it is slower.
My idea was that as the jump are precomputed, you could not save the
locations where you will jump as an integer, but as the actual
reference in the sublist that is the sublist where you go (and it
won't be worse in memory use, because ocaml does share sublist). And
you will probably be in constant time then.

By the way I haven't done it, I'm not sure that it is easy.
• Or you could copy the sublist into an array first, then you can use integer offsets on the sublist. Seems easier than storing references to sublists...
Message 17 of 21 , Jan 8, 2005
Or you could copy the sublist into an array first, then you can use integer
offsets on the sublist. Seems easier than storing references to sublists...
• ... Sure. By the way I ve tried to do what I meant, and the result can be find at teh follwoing url: http://aspellfr.free.fr/ocaml_things/. I ve done two
Message 18 of 21 , Jan 8, 2005
On Sat, 8 Jan 2005 05:59:48 -0800, Aaron W. West <tallpeak@...> wrote:
>
> Or you could copy the sublist into an array first, then you can use integer
> offsets on the sublist. Seems easier than storing references to sublists...
>

Sure. By the way I've tried to do what I meant, and the result can be
find at teh follwoing url: http://aspellfr.free.fr/ocaml_things/.

I've done two version, one using some imperative feature of ocaml, the
other using lazy expression (I don't see any good solution using
neither one or the other mean). It far for simple, but it seem to
work.
• ... List walking and pattern matching will do it. You can match on multiple elements of the list simultaneously. Code like this should work: let rec
Message 19 of 21 , Jan 8, 2005
On Fri, 7 Jan 2005, rob_lyles wrote:

>
>
> I'm a little unsure whats the best way to try and find 2 elements in a
> list that are next to each other. For example if I had a list
> [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
> it does) how would I go about doing this?

List walking and pattern matching will do it. You can match on multiple
elements of the list simultaneously. Code like this should work:

let rec find_pair_in_list x y lst =
(* Find the first occurance of x :: y :: _ in the list *)
match lst with
| [] (* Empty lists and *)
| _ :: [] (* one element lists can never match. *)
-> invalid_arg "find_pair_in_list"

| a :: b :: _ when (a = x) && (b = y) ->
(* We have found a match *)
lst
| _ :: t ->
(* Search the rest of the list *)
find_pair_in_list x y t
;;

are in is signifigant. Second, I'm using structural comparisons not
pointer comparisons. Third, notice the trick with using a when clause on
my pattern. Fourth, notice that this function is tail recursion, and
Ocaml will compile it effectively as a loop.

Brian
• ... No surprises. Here is the code (not cleaned up, sorry): ===== exception No_match;; let rec pow_mod a b q = if b = 1 then a mod q else if b 1 then (a *
Message 20 of 21 , Jan 9, 2005

> But I'm pretty sure it can be done for Rabin-Karp.
> I'll have to try though, to see if there aren't any surprises.

No surprises. Here is the code (not cleaned up, sorry):

=====
exception No_match;;

let rec pow_mod a b q =
if b = 1 then a mod q else
if b > 1 then (a * (pow_mod a (b-1) q)) mod q else
0;;

let first_hash base hash q model text =
let next h v = (h * base + hash v) mod q in
let rec fh m t mh th =
match (m, t) with
([], _) -> (mh, th, t)
| (_, []) -> raise No_match
| (h1 :: t1, h2 :: t2) ->
fh t1 t2 (next mh h1) (next th h2) in
fh model text 0 0;;

let is_prefix cmp model text =
let rec ip m t =
match (m, t) with
([], _) -> true
| (_, []) -> false
| (h1 :: _, h2 :: _) when not (cmp h1 h2) -> false
| (_ :: t1, _ :: t2) -> ip t1 t2 in
ip model text;;

let is_sublist base hash cmp q model text =
try
let h = pow_mod base ((List.length model) - 1) q in
let next v ot it =
((v - (hash ot) * h) * base + (hash it)) mod q in
let mh, th, rt = first_hash base hash q model text in
let rec isl th rt text =
if mh = th && is_prefix cmp model text then true else
match (rt, text) with
([], _) -> false
| (h1 :: t1, h2 :: t2) ->
isl (next th h2 h1) t1 t2 in
isl th rt text
with No_match -> false
;;
=====

Example uses:

=====
let is_sublist_int = is_sublist 100 (fun x -> x) (=) 104729;;
let is_sublist_char = is_sublist 256 int_of_char (=) 104729;;

is_sublist_int [3;4] [1;2;3;4;5;6;7;8];;
is_sublist_char ['u'; 'b'] ['b';'a';'u';'b';'a';'u'];;
=====

regards,

__________________________________
Do you Yahoo!?
The all-new My Yahoo! - What will yours do?
http://my.yahoo.com
• thanks for the clarification - i obviously need to go back and look at this again. i ll have a look, but it will be in a few weeks i m afraid, as i m now away
Message 21 of 21 , Jan 9, 2005
thanks for the clarification - i obviously need to go back and look at
this again. i'll have a look, but it will be in a few weeks i'm afraid,
as i'm now away travelling.

andrew

Remi Vanicat said:
>
> On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke
> <andrew@...> wrote:
>>
>>
>> ok, maybe i'm confusing my substring algorithms. i thought the
>> important
>> part of the "standard fast solution" (maybe i'm even wrong in thinking
>> there is such a thing) included jumping ahead when it could infer that
>> no
>> match was possible in a certain section (ie that the state machine
>> didn't
>> process values sequentially).
>
> KMP jump ahead in string we are looking for, not in the string we are
> looking into. And one can see that to jump ahead in the string, and to
> use an automaton whose state are location in the string is
> fondamentaly the same thing.
>
>>
>> ...goes away and checks cormen...
>>
>> for kmp the prefix function tells you, when a match fails, how far
>> forwards you can jump. however, i think i'm right (ie overall i'm
>> wrong)
>> in thinking that it's not so much the jump that is important, but rather
>> not needing to repeatedly rematch while traversing that jump. so
>> you're
>> not big-O slower if you have to follow a chain of links to do so.
>
> You will be slower. You change an constant time thing by a O(n) thing,
> it is slower.
> My idea was that as the jump are precomputed, you could not save the
> locations where you will jump as an integer, but as the actual
> reference in the sublist that is the sublist where you go (and it
> won't be worse in memory use, because ocaml does share sublist). And
> you will probably be in constant time then.
>
> By the way I haven't done it, I'm not sure that it is easy.
>
>
> http://www.connettivo.net/cntprojects/ocaml_beginners/
> The archives of the very official ocaml list (the seniors' one) can be
> found at http://caml.inria.fr
> Attachments are banned and you're asked to be polite, avoid flames etc.
>
>
>
>
>
>
>
>

--
` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
/ _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
\__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
Your message has been successfully submitted and would be delivered to recipients shortly.