Loading ...
Sorry, an error occurred while loading the content.
 

Finding a sequence of elements in a list

Expand Messages
  • rob_lyles
    I m a little unsure whats the best way to try and find 2 elements in a list that are next to each other. For example if I had a list [1;2;3;4;5] and I wanted
    Message 1 of 21 , Jan 7, 2005
      I'm a little unsure whats the best way to try and find 2 elements in a
      list that are next to each other. For example if I had a list
      [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
      it does) how would I go about doing this?
      Thanks
      Robin
    • Richard Jones
      ... I m fairly certain that the standard library doesn t contain any functions for considering adjacent elements. But it s easy enough to write your own! The
      Message 2 of 21 , Jan 7, 2005
        On Fri, Jan 07, 2005 at 11:41:51AM -0000, rob_lyles wrote:
        >
        >
        > I'm a little unsure whats the best way to try and find 2 elements in a
        > list that are next to each other. For example if I had a list
        > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
        > it does) how would I go about doing this?

        I'm fairly certain that the standard library doesn't contain any
        functions for considering adjacent elements. But it's easy enough to
        write your own!

        The simple case:

        # let rec test = function
        [] -> false
        | [x] -> false
        | 3 :: 4 :: xs -> true
        | x :: xs -> test xs ;;
        val test : int list -> bool = <fun>
        # test [ 1; 2; 3; 4; 5 ];;
        - : bool = true
        # test [ 1; 2; 3; 3; 5 ];;
        - : bool = false
        # test [ 3; 4; 1; 2; 3 ];;
        - : bool = true
        # test [ 1; 2; 3; 4 ];;
        - : bool = true

        You can also abstract the general pattern (testing adjacent elements)
        into a higher-order function:

        # let rec test_pairs f = function
        [] -> false
        | [x] -> false
        | x :: y :: xs when f x y -> true
        | x :: xs -> test xs;;
        val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

        'test' can now be redefined in terms of this higher-order function:

        # let test = test_pairs (fun a b -> a = 3 && b = 4);;
        val test : int list -> bool = <fun>
        # test [ 1; 2; 3; 4; 5 ];;
        - : bool = true
        # test [ 1; 2; 3; 3; 5 ];;
        - : bool = false
        # test [ 3; 4; 1; 2; 3 ];;
        - : bool = true
        # test [ 1; 2; 3; 4 ];;
        - : bool = true

        It's also possible, albeit less efficient, to do something involving
        List.fold_left.

        Rich.

        --
      • Richard Jones
        ... When the function is corrected, this gives: val test_pairs : ( a - a - bool) - a list - bool = Rich. --
        Message 3 of 21 , Jan 7, 2005
          On Fri, Jan 07, 2005 at 12:40:57PM +0000, Richard Jones wrote:
          > You can also abstract the general pattern (testing adjacent elements)
          > into a higher-order function:
          >
          > # let rec test_pairs f = function
          > [] -> false
          > | [x] -> false
          > | x :: y :: xs when f x y -> true
          > | x :: xs -> test xs;;
          ^^^^ should be test_pairs f !

          > val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

          When the function is corrected, this gives:

          val test_pairs : ('a -> 'a -> bool) -> 'a list -> bool = <fun>

          Rich.

          --
        • Cláudio Valente
          ... I don t know if this is the best way , but I guess it s not very bad either ocaml # let rec find_next l a b = match l with ... ;; val find_next : a list
          Message 4 of 21 , Jan 7, 2005
            Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
            > I'm a little unsure whats the best way to try and find 2 elements in a
            > list that are next to each other. For example if I had a list
            > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
            > it does) how would I go about doing this?
            > Thanks
            > Robin

            I don't know if this is the "best way", but I guess it's not very bad either

            ocaml
            # let rec find_next l a b =
            match l with
            |_a::_b::_ when (_a,_b)=(a,b)-> true
            |[] -> false
            |_ -> find_next (List.tl l) a b
            ;;
            val find_next : 'a list -> 'a -> 'a -> bool = <fun>
            # find_next [1;2;3;4;5] 3 4;;
            - : bool = true
            # find_next [1;2;3;4;5] 3 5;;
            - : bool = false
            # find_next [1;2;3;4;5] 1 5;;
            - : bool = false


            ====

            Or you can try to search for a list as a sublist of another list to make
            things more generic (so you can look for more than adjacent elements)

            something like

            find_sublist [1;2;3;4;5;6;7] [3;4;5] -> true

            this is not so simple as this but not very hard either.

            --
            Cláudio Valente
          • rob_lyles
            Thank you both for the help Robin ... bad either ... make
            Message 5 of 21 , Jan 7, 2005
              Thank you both for the help

              Robin

              --- In ocaml_beginners@yahoogroups.com, Cláudio Valente
              <cvalente@c...> wrote:
              > Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
              > > I'm a little unsure whats the best way to try and find 2 elements in a
              > > list that are next to each other. For example if I had a list
              > > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
              > > it does) how would I go about doing this?
              > > Thanks
              > > Robin
              >
              > I don't know if this is the "best way", but I guess it's not very
              bad either
              >
              > ocaml
              > # let rec find_next l a b =
              > match l with
              > |_a::_b::_ when (_a,_b)=(a,b)-> true
              > |[] -> false
              > |_ -> find_next (List.tl l) a b
              > ;;
              > val find_next : 'a list -> 'a -> 'a -> bool = <fun>
              > # find_next [1;2;3;4;5] 3 4;;
              > - : bool = true
              > # find_next [1;2;3;4;5] 3 5;;
              > - : bool = false

              > # find_next [1;2;3;4;5] 1 5;;
              > - : bool = false
              >
              >
              > ====
              >
              > Or you can try to search for a list as a sublist of another list to
              make
              > things more generic (so you can look for more than adjacent elements)
              >
              > something like
              >
              > find_sublist [1;2;3;4;5;6;7] [3;4;5] -> true
              >
              > this is not so simple as this but not very hard either.
              >
              > --
              > Cláudio Valente
            • Remi Vanicat
              ... It s correct but weird, why use List.tl when you are doing a pattern matching ? a better version would be: # let rec find_next l a b = match l with ... It
              Message 6 of 21 , Jan 7, 2005
                On Fri, 7 Jan 2005 12:50:07 +0000, Cláudio Valente <cvalente@...> wrote:
                >
                > Em Sexta, 7 de Janeiro de 2005 11:41, o rob_lyles escreveu:
                > > I'm a little unsure whats the best way to try and find 2 elements in a
                > > list that are next to each other. For example if I had a list
                > > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
                > > it does) how would I go about doing this?
                > > Thanks
                > > Robin
                >
                > I don't know if this is the "best way", but I guess it's not very bad either
                >
                > ocaml
                > # let rec find_next l a b =
                > match l with
                > |_a::_b::_ when (_a,_b)=(a,b)-> true
                > |[] -> false
                > |_ -> find_next (List.tl l) a b
                > ;;
                > val find_next : 'a list -> 'a -> 'a -> bool = <fun>
                > # find_next [1;2;3;4;5] 3 4;;
                > - : bool = true
                > # find_next [1;2;3;4;5] 3 5;;
                > - : bool = false
                > # find_next [1;2;3;4;5] 1 5;;
                > - : bool = false
                >

                It's correct but weird, why use List.tl when you are doing a pattern matching ?
                a better version would be:
                # let rec find_next l a b =
                match l with
                |_a::_b::_ when (_a,_b)=(a,b)-> true
                |[] -> false
                |_::rest -> find_next rest a b

                It does the same thing, but it is more idiomatic (and efficient, but
                this is less important)
              • Cláudio Valente
                ... you are right that was the one I ended up with after cleaning my previous version, so I agree with you. Would you care to try to solve my other problem.
                Message 7 of 21 , Jan 7, 2005
                  Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
                  > It's correct but weird, why use List.tl when you are doing a pattern
                  > matching ? a better version would be:
                  > # let rec find_next l a b =
                  >     match l with
                  >     |_a::_b::_  when (_a,_b)=(a,b)-> true
                  >     |[] -> false
                  >     |_::rest -> find_next rest a b
                  you are right
                  that was the one I ended up with after cleaning my previous version, so I
                  agree with you.

                  Would you care to try to solve my other problem.
                  checking if b is a sublist from a

                  I ended with:

                  # let rec sub_list a b =
                  let rec start_list a b = match (a,b) with
                  |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
                  |(_,[]) -> true
                  |_ -> false
                  in
                  match (a,b) with
                  | (_,[]) -> true
                  | ([],_) -> false
                  | _ when (start_list a b) -> true
                  | (_::a_,_) -> sub_list a_ b
                  ;;
                  val sub_list : 'a list -> 'a list -> bool = <fun>
                  # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
                  - : bool = true
                  # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
                  - : bool = false
                  # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
                  - : bool = false
                  # sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
                  - : bool = true
                  # sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
                  - : bool = true

                  I'm not very convinced there isn't an easier way to do this and since you
                  improved on my previous solution, maybe you can find a more elegant way to
                  solve this more general problem.

                  Or am I just complicating and the standard library has a way to accomplish
                  this?

                  --
                  Cláudio Valente
                • andrew cooke
                  how about: let sublist a b = let rec shared a b = match (a, b) with (_, []) - true ... let rec step a = match a with [] - false ... and sublist
                  Message 8 of 21 , Jan 7, 2005
                    how about:

                    let sublist a b =
                    let rec shared a b =
                    match (a, b) with
                    (_, []) -> true
                    | (a'::as, b'::bs) -> if a' = b' then shared as bs else false in
                    let rec step a =
                    match a with
                    [] -> false
                    | _::as -> sublist' as
                    and sublist' a = shared a b || step a in
                    sublist' a

                    (i'm not sure if it's "||" or "or", but you get the idea...)

                    andrew



                    Cláudio Valente said:
                    >
                    > Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
                    >> It's correct but weird, why use List.tl when you are doing a pattern
                    >> matching ? a better version would be:
                    >> # let rec find_next l a b =
                    >> match l with
                    >> |_a::_b::_ when (_a,_b)=(a,b)-> true
                    >> |[] -> false
                    >> |_::rest -> find_next rest a b
                    > you are right
                    > that was the one I ended up with after cleaning my previous version, so I
                    > agree with you.
                    >
                    > Would you care to try to solve my other problem.
                    > checking if b is a sublist from a
                    >
                    > I ended with:
                    >
                    > # let rec sub_list a b =
                    > let rec start_list a b = match (a,b) with
                    > |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
                    > |(_,[]) -> true
                    > |_ -> false
                    > in
                    > match (a,b) with
                    > | (_,[]) -> true
                    > | ([],_) -> false
                    > | _ when (start_list a b) -> true
                    > | (_::a_,_) -> sub_list a_ b
                    > ;;
                    > val sub_list : 'a list -> 'a list -> bool = <fun>
                    > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
                    > - : bool = true
                    > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
                    > - : bool = false
                    > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
                    > - : bool = false
                    > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
                    > - : bool = true
                    > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
                    > - : bool = true
                    >
                    > I'm not very convinced there isn't an easier way to do this and since you
                    > improved on my previous solution, maybe you can find a more elegant way to
                    > solve this more general problem.
                    >
                    > Or am I just complicating and the standard library has a way to accomplish
                    > this?
                    >
                    > --
                    > Cláudio Valente
                    >
                    >
                    >
                    > Archives up to September 30, 2004 are also downloadable at
                    > http://www.connettivo.net/cntprojects/ocaml_beginners/
                    > The archives of the very official ocaml list (the seniors' one) can be
                    > found at http://caml.inria.fr
                    > Attachments are banned and you're asked to be polite, avoid flames etc.
                    > Yahoo! Groups Links
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >


                    --
                    ` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
                    / _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
                    \__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
                  • Remi Vanicat
                    ... This is probaby the easier way to do this, but it is not the more efficient one. In fact it is an intersting problem, and probably the same as the search
                    Message 9 of 21 , Jan 7, 2005
                      On Fri, 7 Jan 2005 16:01:59 +0000, Cláudio Valente <cvalente@...> wrote:
                      >
                      > Em Sexta, 7 de Janeiro de 2005 13:54, o Remi Vanicat escreveu:
                      > > It's correct but weird, why use List.tl when you are doing a pattern
                      > > matching ? a better version would be:
                      > > # let rec find_next l a b =
                      > > match l with
                      > > |_a::_b::_ when (_a,_b)=(a,b)-> true
                      > > |[] -> false
                      > > |_::rest -> find_next rest a b
                      > you are right
                      > that was the one I ended up with after cleaning my previous version, so I
                      > agree with you.
                      >
                      > Would you care to try to solve my other problem.
                      > checking if b is a sublist from a
                      >
                      > I ended with:
                      >
                      > # let rec sub_list a b =
                      > let rec start_list a b = match (a,b) with
                      > |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
                      > |(_,[]) -> true
                      > |_ -> false
                      > in
                      > match (a,b) with
                      > | (_,[]) -> true
                      > | ([],_) -> false
                      > | _ when (start_list a b) -> true
                      > | (_::a_,_) -> sub_list a_ b
                      > ;;
                      > val sub_list : 'a list -> 'a list -> bool = <fun>
                      > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;5;6];;
                      > - : bool = true
                      > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;4;6];;
                      > - : bool = false
                      > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [4;1;6];;
                      > - : bool = false
                      > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [];;
                      > - : bool = true
                      > # sub_list [0;1;2;3;4;5;6;7;8;9;10] [3;4];;
                      > - : bool = true
                      >
                      > I'm not very convinced there isn't an easier way to do this and since you
                      > improved on my previous solution, maybe you can find a more elegant way to
                      > solve this more general problem.

                      This is probaby the easier way to do this, but it is not the more
                      efficient one. In fact it is an intersting problem, and probably the
                      same as the search of a substring in a string. Their is severall well
                      knonw algorithm to make this (If I remind correctly, the
                      Knuth-Morris-Pratt for example but there are other) .
                    • code17
                      Hi, If what you mean is advanced string matching algorithm like KMP in OCaml, I don t have idea either. As the simple algorithm, I think my solution is just
                      Message 10 of 21 , Jan 7, 2005
                        Hi,

                        If what you mean is advanced string matching algorithm like KMP in
                        OCaml, I don't have idea either. As the simple algorithm, I think my
                        solution is just similar to yours, except minor syntactic difference.
                        >
                        > # let rec sub_list a b =
                        > let rec start_list a b = match (a,b) with
                        > |(_a::a_, _b::b_) when (_a=_b) -> start_list a_ b_
                        > |(_,[]) -> true
                        > |_ -> false
                        > in
                        > match (a,b) with
                        > | (_,[]) -> true
                        ~~~~~~~~~~~~~The line above is unnecessary, since it's already handled
                        in function start_list. And "b" is a free variable here so you don't
                        really need to "match (a,b)", "match a" should be sufficient.

                        > | ([],_) -> false
                        > | _ when (start_list a b) -> true
                        > | (_::a_,_) -> sub_list a_ b
                        > ;;
                        > val sub_list : 'a list -> 'a list -> bool = <fun>
                        My code here:

                        let rec sub_list a b =
                        let rec compare = function
                        _,[] -> true
                        | [],_ -> false
                        |_x::x_,_y::y_ -> (_x=_y) && compare (x_,y_) in
                        match a with
                        [] -> false
                        | h::t -> compare (a,b) || (sub_list t b);;

                        Another fact is I using "sequential AND" and "sequential OR" instead the
                        "when" in your program. I'm not sure about the efficiency at this point,
                        I suppose them to be not that different. I don't like to use "when",
                        because it occasionally causes short pauses during my code reading :( I
                        usually feel a pattern matching with "when" doesn't have a very clear
                        structure for understanding. Probably just my prejudice. If anyone has
                        suggestions on that, I'd like to hear.
                      • Richard Jones
                        [repost- original never showed up] ... When the function is corrected, this gives: val test_pairs : ( a - a - bool) - a list - bool = Rich.
                        Message 11 of 21 , Jan 7, 2005
                          [repost- original never showed up]

                          On Fri, Jan 07, 2005 at 12:40:57PM +0000, Richard Jones wrote:
                          > You can also abstract the general pattern (testing adjacent elements)
                          > into a higher-order function:
                          >
                          > # let rec test_pairs f = function
                          > [] -> false
                          > | [x] -> false
                          > | x :: y :: xs when f x y -> true
                          > | x :: xs -> test xs;;
                          ^^^^ should be test_pairs f !

                          > val test_pairs : (int -> int -> bool) -> int list -> bool = <fun>

                          When the function is corrected, this gives:

                          val test_pairs : ('a -> 'a -> bool) -> 'a list -> bool = <fun>

                          Rich.
                        • andrew cooke
                          ... don t they assume constant time access to characters (arrays)? this question is about linked lists. andrew -- ` __ _ __ ___ ___| |_____ work web site:
                          Message 12 of 21 , Jan 7, 2005
                            Remi Vanicat said:

                            > This is probaby the easier way to do this, but it is not the more
                            > efficient one. In fact it is an intersting problem, and probably the
                            > same as the search of a substring in a string. Their is severall well
                            > knonw algorithm to make this (If I remind correctly, the
                            > Knuth-Morris-Pratt for example but there are other) .

                            don't they assume constant time access to characters (arrays)? this
                            question is about linked lists.

                            andrew

                            --
                            ` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
                            / _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
                            \__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
                          • Radu Grigore
                            ... At first sight KMP doesn t seem to be easily adapted to linked lists, as you said. But I m pretty sure it can be done for Rabin-Karp. I ll have to try
                            Message 13 of 21 , Jan 7, 2005
                              --- andrew cooke <andrew@...> wrote:

                              > > knonw algorithm to make this (If I remind correctly, the
                              > > Knuth-Morris-Pratt for example but there are other) .
                              >
                              > don't they assume constant time access to characters (arrays)?
                              > this question is about linked lists.

                              At first sight KMP doesn't seem to be easily adapted to linked lists,
                              as you said. But I'm pretty sure it can be done for Rabin-Karp. I'll
                              have to try though, to see if there aren't any surprises.

                              regards,
                              radu





                              __________________________________
                              Do you Yahoo!?
                              The all-new My Yahoo! - What will yours do?
                              http://my.yahoo.com
                            • Remi Vanicat
                              On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke ... Well, I haven t reread the KMP algorithm recently, but I remember that the idea is more or less to
                              Message 14 of 21 , Jan 7, 2005
                                On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke
                                <andrew@...> wrote:
                                >
                                >
                                > Remi Vanicat said:
                                >
                                > > This is probaby the easier way to do this, but it is not the more
                                > > efficient one. In fact it is an intersting problem, and probably the
                                > > same as the search of a substring in a string. Their is severall well
                                > > knonw algorithm to make this (If I remind correctly, the
                                > > Knuth-Morris-Pratt for example but there are other) .
                                >
                                > don't they assume constant time access to characters (arrays)? this
                                > question is about linked lists.

                                Well, I haven't reread the KMP algorithm recently, but I remember that
                                the idea is more or less to build a finite automaton whose state are
                                represented by index in the string that we are looking for. So one
                                could use some way to represent this automaton for linked list. Of
                                course, As I'm not sure anymore, I way be remembering of another
                                algorithm
                              • andrew cooke
                                ok, maybe i m confusing my substring algorithms. i thought the important part of the standard fast solution (maybe i m even wrong in thinking there is such
                                Message 15 of 21 , Jan 7, 2005
                                  ok, maybe i'm confusing my substring algorithms. i thought the important
                                  part of the "standard fast solution" (maybe i'm even wrong in thinking
                                  there is such a thing) included jumping ahead when it could infer that no
                                  match was possible in a certain section (ie that the state machine didn't
                                  process values sequentially).

                                  ...goes away and checks cormen...

                                  for kmp the prefix function tells you, when a match fails, how far
                                  forwards you can jump. however, i think i'm right (ie overall i'm wrong)
                                  in thinking that it's not so much the jump that is important, but rather
                                  not needing to repeatedly rematch while traversing that jump. so you're
                                  not big-O slower if you have to follow a chain of links to do so.

                                  sorry, my bad.

                                  andrew

                                  Remi Vanicat said:
                                  >
                                  > On Fri, 7 Jan 2005 16:04:22 -0300 (CLST), andrew cooke
                                  > <andrew@...> wrote:
                                  >>
                                  >>
                                  >> Remi Vanicat said:
                                  >>
                                  >> > This is probaby the easier way to do this, but it is not the more
                                  >> > efficient one. In fact it is an intersting problem, and probably the
                                  >> > same as the search of a substring in a string. Their is severall well
                                  >> > knonw algorithm to make this (If I remind correctly, the
                                  >> > Knuth-Morris-Pratt for example but there are other) .
                                  >>
                                  >> don't they assume constant time access to characters (arrays)? this
                                  >> question is about linked lists.
                                  >
                                  > Well, I haven't reread the KMP algorithm recently, but I remember that
                                  > the idea is more or less to build a finite automaton whose state are
                                  > represented by index in the string that we are looking for. So one
                                  > could use some way to represent this automaton for linked list. Of
                                  > course, As I'm not sure anymore, I way be remembering of another
                                  > algorithm
                                  >
                                  >
                                  > Archives up to September 30, 2004 are also downloadable at
                                  > http://www.connettivo.net/cntprojects/ocaml_beginners/
                                  > The archives of the very official ocaml list (the seniors' one) can be
                                  > found at http://caml.inria.fr
                                  > Attachments are banned and you're asked to be polite, avoid flames etc.
                                  > Yahoo! Groups Links
                                  >
                                  >
                                  >
                                  >
                                  >
                                  >
                                  >
                                  >


                                  --
                                  ` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
                                  / _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
                                  \__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
                                • Remi Vanicat
                                  On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke ... KMP jump ahead in string we are looking for, not in the string we are looking into. And one can see
                                  Message 16 of 21 , Jan 8, 2005
                                    On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke
                                    <andrew@...> wrote:
                                    >
                                    >
                                    > ok, maybe i'm confusing my substring algorithms. i thought the important
                                    > part of the "standard fast solution" (maybe i'm even wrong in thinking
                                    > there is such a thing) included jumping ahead when it could infer that no
                                    > match was possible in a certain section (ie that the state machine didn't
                                    > process values sequentially).

                                    KMP jump ahead in string we are looking for, not in the string we are
                                    looking into. And one can see that to jump ahead in the string, and to
                                    use an automaton whose state are location in the string is
                                    fondamentaly the same thing.

                                    >
                                    > ...goes away and checks cormen...
                                    >
                                    > for kmp the prefix function tells you, when a match fails, how far
                                    > forwards you can jump. however, i think i'm right (ie overall i'm wrong)
                                    > in thinking that it's not so much the jump that is important, but rather
                                    > not needing to repeatedly rematch while traversing that jump. so you're
                                    > not big-O slower if you have to follow a chain of links to do so.

                                    You will be slower. You change an constant time thing by a O(n) thing,
                                    it is slower.
                                    My idea was that as the jump are precomputed, you could not save the
                                    locations where you will jump as an integer, but as the actual
                                    reference in the sublist that is the sublist where you go (and it
                                    won't be worse in memory use, because ocaml does share sublist). And
                                    you will probably be in constant time then.

                                    By the way I haven't done it, I'm not sure that it is easy.
                                  • Aaron W. West
                                    Or you could copy the sublist into an array first, then you can use integer offsets on the sublist. Seems easier than storing references to sublists...
                                    Message 17 of 21 , Jan 8, 2005
                                      Or you could copy the sublist into an array first, then you can use integer
                                      offsets on the sublist. Seems easier than storing references to sublists...
                                    • Remi Vanicat
                                      ... Sure. By the way I ve tried to do what I meant, and the result can be find at teh follwoing url: http://aspellfr.free.fr/ocaml_things/. I ve done two
                                      Message 18 of 21 , Jan 8, 2005
                                        On Sat, 8 Jan 2005 05:59:48 -0800, Aaron W. West <tallpeak@...> wrote:
                                        >
                                        > Or you could copy the sublist into an array first, then you can use integer
                                        > offsets on the sublist. Seems easier than storing references to sublists...
                                        >

                                        Sure. By the way I've tried to do what I meant, and the result can be
                                        find at teh follwoing url: http://aspellfr.free.fr/ocaml_things/.

                                        I've done two version, one using some imperative feature of ocaml, the
                                        other using lazy expression (I don't see any good solution using
                                        neither one or the other mean). It far for simple, but it seem to
                                        work.
                                      • Brian Hurt
                                        ... List walking and pattern matching will do it. You can match on multiple elements of the list simultaneously. Code like this should work: let rec
                                        Message 19 of 21 , Jan 8, 2005
                                          On Fri, 7 Jan 2005, rob_lyles wrote:

                                          >
                                          >
                                          > I'm a little unsure whats the best way to try and find 2 elements in a
                                          > list that are next to each other. For example if I had a list
                                          > [1;2;3;4;5] and I wanted to see if the list contained 3 then 4 (which
                                          > it does) how would I go about doing this?

                                          List walking and pattern matching will do it. You can match on multiple
                                          elements of the list simultaneously. Code like this should work:

                                          let rec find_pair_in_list x y lst =
                                          (* Find the first occurance of x :: y :: _ in the list *)
                                          match lst with
                                          | [] (* Empty lists and *)
                                          | _ :: [] (* one element lists can never match. *)
                                          -> invalid_arg "find_pair_in_list"

                                          | a :: b :: _ when (a = x) && (b = y) ->
                                          (* We have found a match *)
                                          lst
                                          | _ :: t ->
                                          (* Search the rest of the list *)
                                          find_pair_in_list x y t
                                          ;;

                                          A couple of things to note about this code. First, the order the patterns
                                          are in is signifigant. Second, I'm using structural comparisons not
                                          pointer comparisons. Third, notice the trick with using a when clause on
                                          my pattern. Fourth, notice that this function is tail recursion, and
                                          Ocaml will compile it effectively as a loop.

                                          Brian
                                        • Radu Grigore
                                          ... No surprises. Here is the code (not cleaned up, sorry): ===== exception No_match;; let rec pow_mod a b q = if b = 1 then a mod q else if b 1 then (a *
                                          Message 20 of 21 , Jan 9, 2005
                                            --- Radu Grigore <radugrigore@...> wrote:

                                            > But I'm pretty sure it can be done for Rabin-Karp.
                                            > I'll have to try though, to see if there aren't any surprises.

                                            No surprises. Here is the code (not cleaned up, sorry):

                                            =====
                                            exception No_match;;

                                            let rec pow_mod a b q =
                                            if b = 1 then a mod q else
                                            if b > 1 then (a * (pow_mod a (b-1) q)) mod q else
                                            0;;

                                            let first_hash base hash q model text =
                                            let next h v = (h * base + hash v) mod q in
                                            let rec fh m t mh th =
                                            match (m, t) with
                                            ([], _) -> (mh, th, t)
                                            | (_, []) -> raise No_match
                                            | (h1 :: t1, h2 :: t2) ->
                                            fh t1 t2 (next mh h1) (next th h2) in
                                            fh model text 0 0;;

                                            let is_prefix cmp model text =
                                            let rec ip m t =
                                            match (m, t) with
                                            ([], _) -> true
                                            | (_, []) -> false
                                            | (h1 :: _, h2 :: _) when not (cmp h1 h2) -> false
                                            | (_ :: t1, _ :: t2) -> ip t1 t2 in
                                            ip model text;;

                                            let is_sublist base hash cmp q model text =
                                            try
                                            let h = pow_mod base ((List.length model) - 1) q in
                                            let next v ot it =
                                            ((v - (hash ot) * h) * base + (hash it)) mod q in
                                            let mh, th, rt = first_hash base hash q model text in
                                            let rec isl th rt text =
                                            if mh = th && is_prefix cmp model text then true else
                                            match (rt, text) with
                                            ([], _) -> false
                                            | (h1 :: t1, h2 :: t2) ->
                                            isl (next th h2 h1) t1 t2 in
                                            isl th rt text
                                            with No_match -> false
                                            ;;
                                            =====

                                            Example uses:

                                            =====
                                            let is_sublist_int = is_sublist 100 (fun x -> x) (=) 104729;;
                                            let is_sublist_char = is_sublist 256 int_of_char (=) 104729;;

                                            is_sublist_int [3;4] [1;2;3;4;5;6;7;8];;
                                            is_sublist_char ['u'; 'b'] ['b';'a';'u';'b';'a';'u'];;
                                            =====

                                            regards,
                                            radu




                                            __________________________________
                                            Do you Yahoo!?
                                            The all-new My Yahoo! - What will yours do?
                                            http://my.yahoo.com
                                          • andrew cooke
                                            thanks for the clarification - i obviously need to go back and look at this again. i ll have a look, but it will be in a few weeks i m afraid, as i m now away
                                            Message 21 of 21 , Jan 9, 2005
                                              thanks for the clarification - i obviously need to go back and look at
                                              this again. i'll have a look, but it will be in a few weeks i'm afraid,
                                              as i'm now away travelling.

                                              andrew

                                              Remi Vanicat said:
                                              >
                                              > On Fri, 7 Jan 2005 17:53:09 -0300 (CLST), andrew cooke
                                              > <andrew@...> wrote:
                                              >>
                                              >>
                                              >> ok, maybe i'm confusing my substring algorithms. i thought the
                                              >> important
                                              >> part of the "standard fast solution" (maybe i'm even wrong in thinking
                                              >> there is such a thing) included jumping ahead when it could infer that
                                              >> no
                                              >> match was possible in a certain section (ie that the state machine
                                              >> didn't
                                              >> process values sequentially).
                                              >
                                              > KMP jump ahead in string we are looking for, not in the string we are
                                              > looking into. And one can see that to jump ahead in the string, and to
                                              > use an automaton whose state are location in the string is
                                              > fondamentaly the same thing.
                                              >
                                              >>
                                              >> ...goes away and checks cormen...
                                              >>
                                              >> for kmp the prefix function tells you, when a match fails, how far
                                              >> forwards you can jump. however, i think i'm right (ie overall i'm
                                              >> wrong)
                                              >> in thinking that it's not so much the jump that is important, but rather
                                              >> not needing to repeatedly rematch while traversing that jump. so
                                              >> you're
                                              >> not big-O slower if you have to follow a chain of links to do so.
                                              >
                                              > You will be slower. You change an constant time thing by a O(n) thing,
                                              > it is slower.
                                              > My idea was that as the jump are precomputed, you could not save the
                                              > locations where you will jump as an integer, but as the actual
                                              > reference in the sublist that is the sublist where you go (and it
                                              > won't be worse in memory use, because ocaml does share sublist). And
                                              > you will probably be in constant time then.
                                              >
                                              > By the way I haven't done it, I'm not sure that it is easy.
                                              >
                                              >
                                              > Archives up to September 30, 2004 are also downloadable at
                                              > http://www.connettivo.net/cntprojects/ocaml_beginners/
                                              > The archives of the very official ocaml list (the seniors' one) can be
                                              > found at http://caml.inria.fr
                                              > Attachments are banned and you're asked to be polite, avoid flames etc.
                                              > Yahoo! Groups Links
                                              >
                                              >
                                              >
                                              >
                                              >
                                              >
                                              >
                                              >


                                              --
                                              ` __ _ __ ___ ___| |_____ work web site: http://www.ctio.noao.edu/~andrew
                                              / _` / _/ _ \/ _ \ / / -_) personal web site: http://www.acooke.org/andrew
                                              \__,_\__\___/\___/_\_\___| list: http://www.acooke.org/andrew/compute.html
                                            Your message has been successfully submitted and would be delivered to recipients shortly.