parsing a csv line of text from a list of files

  • Jean Saint-Remy
    Message 1 of 3 , Mar 7 2:59 PM
      Hi Folks,

      I seem to be stuck in the mud and need a little help.
      We have a directory ~/Stock_Quotes with approximately 1300 CSV files. Somewhere inside each file is a line containing our query string, for example:

      "MSFT","3/5/2010",28.64,28.68,28.42,28.5875,56005056

      which we want to tokenize to produce the output

      Symbol MSFT
      Date 3/5/2010
      Open 28.64
      High 28.68
      Low 28.42
      Settle 28.5875
      Volume 56,005,056

      We want to use a recursive function to parse the lines in each file, and for the time being we will use a 'while' loop to process the files.

      This is what we came up with so far:

      #load "str.cma" ;;

      let dir_h = "~/Stock_Quotes" ;;
      let query = "MSFT" ;; (* our query symbol *)

      begin
      try (* outer loop, could be more efficient *)
      while true do (* as long as there are files in the directory *)
      let file_h = Sys.readdir dir_h in

      (* we need efficient loop to parse lines from the files *)
      let read_l file_h =
      let line_in = open_in file_h in
      let rec aux accum =
      try aux (( input_line line_in ) :: accum )

      (* we want to match our symbol 'MSFT' *)
      let m_line query line_in = Str.string_match ( Str.regexp query ) m_line 0 in

      (* once we have a match we process CSV tokens *)
      let split_str = Str.split ( Str.regexp_string "," ) m_line in
      List.iter print_endline split_str;

      (* inner try block *)
      with End_of_file -> close_in line_in; List.rev accum
      | Not_found -> List.rev (line_in :: accum ) (* if no match *)
      in aux [];

      (* outer try block *)
      with
      | End_of_file -> ()

      end; (* close begin block *)

      closedir dir_h ;;

      Thanks in advance for any suggestions.

      With kind regards,

      Jean
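One way the steps described above (read lines recursively, match the symbol, split the CSV tokens, print labelled fields) could be sketched with only the standard library is shown here. This is a rough sketch, not a fix of the posted code: the names `labels`, `unquote`, `fields_of_line`, `matches`, `find_in_channel`, `find_in_file`, `report` and `search_dir` are made up for illustration, and the splitting assumes that no field contains an embedded comma or escaped quote, which holds for the sample line. It also assumes OCaml 4.04 or later for `String.split_on_char`.

```ocaml
(* Field labels, in the order the columns appear in the CSV line. *)
let labels = [ "Symbol"; "Date"; "Open"; "High"; "Low"; "Settle"; "Volume" ]

(* Remove one pair of surrounding double quotes, if present. *)
let unquote s =
  let n = String.length s in
  if n >= 2 && s.[0] = '"' && s.[n - 1] = '"' then String.sub s 1 (n - 2)
  else s

(* Split a simple CSV line on commas (assumption: no embedded commas). *)
let fields_of_line line =
  List.map unquote (String.split_on_char ',' (String.trim line))

(* True when the first field of the line equals the query symbol. *)
let matches query line =
  match fields_of_line line with
  | symbol :: _ -> symbol = query
  | [] -> false

(* Recursively read lines from an open channel; return the first matching
   line, or None once the end of the file is reached. *)
let rec find_in_channel query ic =
  match input_line ic with
  | line -> if matches query line then Some line else find_in_channel query ic
  | exception End_of_file -> None

(* Open one file, search it, and close the channel even on error. *)
let find_in_file query path =
  let ic = open_in path in
  let result =
    try find_in_channel query ic
    with e -> close_in_noerr ic; raise e
  in
  close_in ic;
  result

(* One "Label value" string per field. List.map2 raises Invalid_argument
   if the line does not have exactly seven fields. *)
let report line =
  List.map2 (fun label value -> label ^ " " ^ value) labels (fields_of_line line)

(* Search every non-directory entry of dir and print a report per match. *)
let search_dir query dir =
  Array.iter
    (fun name ->
      let path = Filename.concat dir name in
      if not (Sys.is_directory path) then
        match find_in_file query path with
        | Some line -> List.iter print_endline (report line)
        | None -> ())
    (Sys.readdir dir)
```

Note that `Sys.readdir` returns the whole array of file names in one call, so no `while true` loop or `closedir` is needed, and that OCaml does not expand `~` (that is done by the shell). A call might look like `search_dir "MSFT" (Filename.concat (Sys.getenv "HOME") "Stock_Quotes")`. This sketch prints the volume without thousands separators.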
    • Adrien
      Message 2 of 3 , Mar 7 3:12 PM
        Hi,

        Have you tried to use the ocaml-csv library[1]? It works pretty well.

        [1] https://forge.ocamlcore.org/projects/csv/

        ---

        Adrien Nader
      • Raphael Speyer
        Message 3 of 3 , Mar 7 11:03 PM
          Disclaimer: I'm still a beginner to OCaml, but I think this could be
          more declarative. If you're using Batteries Included you can get it to
          handle most of the IO, and then you can mainly just operate on
          enumerations.

          What about something along these lines?

          module Quote = struct
            type t = { symbol:string; date:string; opening:float; high:float;
                       low:float; settle:float; volume:int }

            let of_string str =
              Scanf.sscanf str "%S,%S,%f,%f,%f,%f,%i"
                (fun symbol date opening high low settle volume ->
                  { symbol = symbol; date = date; opening = opening; high = high;
                    low = low; settle = settle; volume = volume })

            let to_string { symbol = symbol; date = date; opening = opening;
                            high = high; low = low; settle = settle;
                            volume = volume } =
              Printf.sprintf
                "Symbol %s\nDate %s\nOpen %.2f\nHigh %.2f\nLow %.2f\nSettle %.4f\nVolume %i"
                symbol date opening high low settle volume

            let symbol { symbol = s } = s
          end


          let files_in dirname =
            Shell.files_of dirname |> map ((^) (dirname ^ "/"))

          let find_quote_with_symbol symbol filename =
            File.lines_of filename |> map Quote.of_string
            |> find (Quote.symbol |- (=) symbol)

          let print_quote =
            Quote.to_string |- print_endline

          ;;
          files_in "./Stock_Quotes" |> map (find_quote_with_symbol "MSFT")
          |> iter print_quote
          ;;


          Raphael
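For readers without Batteries, the `Scanf`-based parsing idea above can be tried with the standard library alone. The following is only a sketch under a few assumptions: the names `quote`, `quote_of_string`, `with_commas` and `quote_to_string` are invented for illustration, a line that does not have the expected shape is turned into `None` rather than raising, and `with_commas` is an added helper that produces the `56,005,056` style of volume asked for in the original question (a plain `%i` prints no separators).

```ocaml
type quote = {
  symbol : string; date : string;
  opening : float; high : float; low : float; settle : float;
  volume : int;
}

(* Parse a line such as "MSFT","3/5/2010",28.64,28.68,28.42,28.5875,56005056.
   Returns None when the line does not have this shape. *)
let quote_of_string line =
  try
    Some
      (Scanf.sscanf line "%S,%S,%f,%f,%f,%f,%d"
         (fun symbol date opening high low settle volume ->
           { symbol; date; opening; high; low; settle; volume }))
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

(* Insert a comma between each group of three digits, counted from the
   right. Assumes n is not min_int. *)
let with_commas n =
  let digits = string_of_int (abs n) in
  let len = String.length digits in
  let buf = Buffer.create (len + (len / 3) + 1) in
  if n < 0 then Buffer.add_char buf '-';
  String.iteri
    (fun i c ->
      if i > 0 && (len - i) mod 3 = 0 then Buffer.add_char buf ',';
      Buffer.add_char buf c)
    digits;
  Buffer.contents buf

(* Render a quote in the labelled layout requested in the question. *)
let quote_to_string q =
  Printf.sprintf
    "Symbol %s\nDate %s\nOpen %.2f\nHigh %.2f\nLow %.2f\nSettle %.4f\nVolume %s"
    q.symbol q.date q.opening q.high q.low q.settle (with_commas q.volume)
```

Returning an option here means that lines which are not quotes (a header row, say) can be skipped by the caller instead of aborting the whole scan with an exception.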