Re: "ocaml_beginners"::[] Unix "locate" database access ?

  • William Neumann
    Message 1 of 13 , Mar 11, 2007
      On Mar 11, 2007, at 12:04 PM, Fabrice Marchant wrote:

      > I've soon used Sys.command but I do not know how to crop the
      > results back to my OCaml program :
      > have no idea about how to proceed to the "screen scrape"...

      Look at the Unix.open_process* functions
      <http://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html#6_Highlevelprocessandredirectionmanagement>

      William D. Neumann

      "I eat T-bone steaks, I lift barbell plates, I'm sweeter than a
      German chocolate cake. I'm the reflection of perfection, the number
      one selection. I'm the man of the hour, the man with the power, too
      sweet to be sour. The ladies' pet, the men's regret, where what you
      see is what you get, and what you don't see, is better yet."

      --Superstar Billy Graham
    • Fabrice Marchant
      Message 2 of 13 , Mar 12, 2007
        Thanks a lot William !

        > Look at the Unix.open_process* functions <http://caml.inria.fr/pub/
        > docs/manual-ocaml/libref/
        > Unix.html#6_Highlevelprocessandredirectionmanagement>

        I've seen the doc and can try these process functions now.
        But that isn't obvious for me. If you know a "screen scrape" example somewhere...

        Regards

        Fabrice
      • Robert Roessler
        Message 3 of 13 , Mar 12, 2007
          Fabrice Marchant wrote:
          > Thanks a lot William !
          >
          > > Look at the Unix.open_process* functions
          > > <http://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html#6_Highlevelprocessandredirectionmanagement>
          >
          > I've seen the doc and can try these process functions now.
          > But that isn't obvious for me. If you know a "screen scrape" example
          > somewhere...

          All that is really meant here is that you must examine the text being
          generated/returned by running commands and parse it enough to be able
          to recognize and extract the data of interest to you.

          Typically, this could mean using simple pattern-matching to "see"
          lines that have useful data, and ignore ones that don't.

          The next level of complexity (for information spread across multiple
          lines) is to remember what you have seen and are therefore expecting
          to see next - essentially simulating a simple finite state machine.

          At the extreme end of this approach, you might actually create a
          grammar and employ lexical analysis and generator tools.
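
The first two levels described above can be sketched in a few lines of OCaml. This is only an illustration: the helper names (`ends_with`, `pdf_lines`, `between_markers`) and the "BEGIN"/"END" marker lines are assumptions made up for the example, not something from a real command's output.

```ocaml
(* Illustrative sketch; all names and the marker format are assumptions. *)

(* Level 1: simple pattern matching on single lines.
   Keep a line only if it ends with the given suffix. *)
let ends_with ~suffix s =
  let ls = String.length s and lx = String.length suffix in
  ls >= lx && String.sub s (ls - lx) lx = suffix

let pdf_lines lines = List.filter (ends_with ~suffix:".pdf") lines

(* Level 2: a tiny finite state machine for data spread over
   several lines. A "BEGIN" line starts collecting, an "END"
   line stops it; the marker lines themselves are dropped. *)
type state = Skipping | Collecting

let between_markers lines =
  let step (state, acc) line =
    match state, line with
    | Skipping, "BEGIN" -> (Collecting, acc)
    | Skipping, _ -> (Skipping, acc)
    | Collecting, "END" -> (Skipping, acc)
    | Collecting, _ -> (Collecting, line :: acc)
  in
  let _, acc = List.fold_left step (Skipping, []) lines in
  List.rev acc
```

Both functions work on a plain list of strings, so they can be tried out without running any external command.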

          Robert Roessler
          robertr@...
          http://www.rftp.com
        • Martin Jambon
          Message 4 of 13 , Mar 12, 2007
            On Mon, 12 Mar 2007, Fabrice Marchant wrote:

            > Thanks a lot William !
            >
            >> Look at the Unix.open_process* functions <http://caml.inria.fr/pub/
            >> docs/manual-ocaml/libref/
            >> Unix.html#6_Highlevelprocessandredirectionmanagement>
            >
            > I've seen the doc and can try these process functions now.
            > But that isn't obvious for me. If you know a "screen scrape" example
            > somewhere...

            See "slurp_command" on that page:
            http://martin.jambon.free.fr/toolbox.html#programs

            Micmatch.Text.iter_lines_of_channel can also be useful when combined with
            Unix.open_process_in.


            Martin

            --
            Martin Jambon
            http://martin.jambon.free.fr
          • Fabrice Marchant
            Message 5 of 13 , Mar 12, 2007
              Thanks Robert for your abstract but useful explanations.

              This way of working is new to me. There is no reason for it to be specific to OCaml, though.

              Regards

              Fabrice
            • Grant Olson
              Message 6 of 13 , Mar 12, 2007
                Sorry, I guess 'screen scrape' might be an American term. But yes, this
                isn't specific to OCaml.



                Robert pretty much summed it up: you issue a command and process the
                returned text content to get the information you're looking for. Opinions
                vary as to whether this is a good way to do things. In some ways it fits in
                with the Unix tradition of processing info by piping text output into the
                input of another program like grep or awk or sed. But as with Unix piping,
                your program can suddenly break when you do a system update or move to
                another OS or do something else that changes an expected input. Sometimes
                it's the ONLY way to get what you need.



                For something quick, screen scraping may be easier than reverse engineering
                or writing a library for locatedb, if you're trying to get some simple
                information. But if you're going to be doing significant work with the
                database, it is probably worth writing a proper access library to manipulate
                the db file.



                As usual, wikipedia has more info than you wanted to know. ;-)



                http://en.wikipedia.org/wiki/Screen_scraping



                -Grant



              • Oliver Bandel
                Message 7 of 13 , Mar 13, 2007
                  On Mon, Mar 12, 2007 at 11:35:16PM +0100, Fabrice Marchant wrote:
                  > Thanks a lot William !
                  >
                  > > Look at the Unix.open_process* functions <http://caml.inria.fr/pub/
                  > > docs/manual-ocaml/libref/
                  > > Unix.html#6_Highlevelprocessandredirectionmanagement>
                  >
                  > I've seen the doc and can try these process functions now.
                  > But that isn't obvious for me. If you know a "screen scrape" example somewhere...
                  >
                  [...]

                  Unix.open_process_in

                  is what you can use.

                  The term "screen scrape", when used for the following, is a misnomer,
                  because you read directly from the process you indirectly invoked via
                  open_process_in:

                  ===============================================================================
                  first:~/Desktop/OCAML-Programmierung-Dokus oliver$ ocaml unix.cma
                  Objective Caml version 3.09.3

                  # open Unix;;
                  # let channel = open_process_in "ls -lt";;
                  val channel : in_channel = <abstr>
                  # while true do print_endline (input_line channel) done ;;
                  total 14216
                  drwxr-xr-x 59 oliver oliver 2006 28 Feb 11:45 OCAML-htmlman-Reference-Manual
                  -rw-r--r-- 1 oliver oliver 124188 21 Feb 19:32 ocamldoc-doku.pdf
                  drwxr-xr-x 4 oliver oliver 136 14 Feb 23:30 style Files
                  -rw-r--r-- 1 oliver oliver 28295 14 Feb 23:30 style.html
                  drwxr-xr-x 5 oliver oliver 170 12 Feb 10:45 Camlp4-Tutorial
                  drwxr-xr-x 8 oliver oliver 272 12 Feb 10:41 OCAMl-Infos-Web-OCamlP3l-und-anderes
                  -rw-r--r-- 1 oliver oliver 415384 4 Feb 07:11 javavsocaml.pdf
                  drwxr-xr-x 3 oliver oliver 102 24 Aug 2006 GoF-DesignPatterns
                  drwxr-xr-x 36 oliver oliver 1224 7 May 2006 OCAML-COCOA
                  drwxr-xr-x 27 oliver oliver 918 7 May 2006 Ocaml--Format-Module
                  drwxr-xr-x 3 oliver oliver 102 7 May 2006 diverses
                  -rw-r--r-- 1 oliver oliver 71171 7 Feb 2006 Ocaml-two-forms-of-LET.pdf
                  -rw-r--r-- 1 oliver oliver 1880564 19 Nov 2005 ocaml-3.09-refman.pdf
                  drwxr-xr-x 5 oliver oliver 170 25 Oct 2005 81bbc08defeb05351c2e0e3164dca32c.en Files
                  -rw-r--r-- 1 oliver oliver 10907 25 Oct 2005 81bbc08defeb05351c2e0e3164dca32c.en.html
                  drwxr-xr-x 5 oliver oliver 170 21 Oct 2005 OCaml-vs-other-Languages
                  drwxr-xr-x 22 oliver oliver 748 21 Oct 2005 OCAML-diverses
                  -rw-r--r-- 1 oliver oliver 21316 15 May 2005 not a bug.html
                  drwxr-xr-x 3 oliver oliver 102 15 May 2005 not a bug_files
                  -rw-r--r-- 1 oliver oliver 100525 15 May 2005 FAQ_EXPERT-eng.html
                  -rw-r--r-- 1 oliver oliver 17976 2 May 2005 Polymorphic-Variants.pdf
                  drwxr-xr-x 4 oliver oliver 136 29 Mar 2005 Ocaml-an-introduction
                  drwxr-xr-x 4 oliver oliver 136 29 Mar 2005 OCaml-concise-introduction
                  drwxr-xr-x 30 oliver oliver 1020 29 Mar 2005 OCAML-Tutorial
                  -rw-r--r-- 1 oliver oliver 360389 29 Mar 2005 book.pdf
                  -rw-r--r-- 1 oliver oliver 302580 27 Feb 2005 chapter1.pdf
                  -rw------- 1 oliver oliver 5096 10 Feb 2005 Ocaml-Extlib.url
                  -rw-r--r-- 1 oliver oliver 188570 22 Dec 2004 ocamllex-tutorial.pdf
                  -rw-r--r-- 1 oliver oliver 458918 22 Dec 2004 ocamlyacc-tutorial.pdf
                  -rw-r--r-- 1 oliver oliver 2875150 20 Dec 2004 ocaml-OReilly-Book.pdf
                  -rw-r--r-- 1 oliver oliver 129871 13 May 2003 recursive-modules-note.pdf
                  -rwxr-xr-x 1 oliver oliver 126386 11 Jan 2003 KOPIE-camlp4-3.06-tutorial.ps.gz
                  -rwxr-xr-x 1 oliver oliver 126386 20 Aug 2002 camlp4-3.06-tutorial.ps.gz
                  Exception: End_of_file.
                  #
                  ===============================================================================

                  open_process_in is the equivalent of a Unix popen(3) call
                  with OCaml channels on top of it.

                  You give that function the command you would type in a shell
                  and get back the stdout of the process - the stuff you
                  normally get to screen, when calling the command from
                  the shell.

                  BTW: when you have finished, you have to call close_process_in; I didn't do that in the
                  above example.
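
A minimal sketch of the same idea outside the toplevel, catching End_of_file and closing the process afterwards. The function name `lines_of_command` is made up for this example and is not part of the Unix module. It assumes the unix library is linked and an OCaml recent enough to accept `exception` cases in `match` (4.02 or later, so not the 3.09.3 toplevel shown above).

```ocaml
(* Sketch only: [lines_of_command] is a hypothetical helper.
   Needs the unix library (e.g. `ocaml unix.cma` in the toplevel). *)

(* Run [cmd] through the shell and return the lines it wrote to
   stdout, together with the process status reported by
   Unix.close_process_in. *)
let lines_of_command cmd =
  let ic = Unix.open_process_in cmd in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  let lines = loop [] in
  (* close_process_in closes the channel and waits for the command. *)
  let status = Unix.close_process_in ic in
  (lines, status)
```

For the original question, something like `lines_of_command ("locate " ^ Filename.quote "ocaml")` should return the matching paths, assuming a `locate` command is installed. A status other than `Unix.WEXITED 0` suggests the command failed and the lines should not be trusted.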

                  Best wishes,
                  Oliver Bandel
                • Fabrice Marchant
                  Message 8 of 13 , Mar 14, 2007
                    Thanks to Martin and Oliver for their code examples, which I've experimented with: absolutely perfect!

                    Oliver said "screen scraping" was a misnomer. I think so too! Before your explanations, I was close to imagining that we had to feed an OCR system with pixel data...

                    Thanks to Grant for his interesting explanations about "screen scraping" and its possible drawbacks, and for his advice on when to choose this programming method.

                    The quality of the answers on this list is incredible.

                    Very sorry that I am only now thanking you for these clever answers.

                    Regards