Ocamlyacc isn't liking me...

  • Richard Lyman
    Message 1 of 7 , Mar 11, 2008
      It zips through all of the tokens and throws a Parsing error once it
      hits the end.

      Where can I find some guidelines to help in constructing my grammar?

      I've found some great documentation on ocamlyacc in general, but I'm
      not running across much in the way of grammar guidelines...

      I know I'm being vague and I apologize for that. I might have the
      source to a point soon where I can share it, so maybe I'll re-post in
      a few days with some sample source.

      Also... how close is Bison to ocamlyacc? Wouldn't the concepts be
      close enough to help in building the grammar? I haven't looked at
      their documentation, so I don't know if it has what I'm looking for
      anyway.

      Should I try to get my hands on this 'Dragon' book I've seen mentioned
      everywhere?

      -Rich
      P.s. - If I construct an LR(1) grammar, that's good enough for the
      LALR(1) of ocamlyacc - right?
    • Michael DiBernardo
      Message 2 of 7 , Mar 11, 2008
        On 11-Mar-08, at 2:19 PM, Richard Lyman wrote:
        > Should I try to get my hands on this 'Dragon' book I've seen mentioned
        > everywhere?
        >
        If you're talking about the Aho/Ullman Dragon book, it's very...
        dense. I read it (and even got my copy signed), but I've been told by
        friends who make compilers for a living that it is now both a smidge
        antiquated _and_ hard to read on top of that.

        My personal advice would be to read it for posterity's sake at some
        point, but it's probably not your best bet if you are a novice
        compiler constructor.

        Apparently, there is a new edition available as of 2006. I can't
        comment on that one, it may be much better now.

        I haven't played much with ocamlyacc as of yet, but I've seen talk
        about it on this list before, so I'm sure someone smarter and more
        experienced than me will come along shortly and give you some hints.

        -Mike
        >
        >
        > -Rich
        > P.s. - If I construct an LR(1) grammar, that's good enough for the
        > LALR(1) of ocamlyacc - right?
        >
        >
      • Oliver Bandel
        Message 3 of 7 , Mar 11, 2008
          Hello Richard,



          Quoting Richard Lyman <richard.lyman@...>:

          > It zips through all of the tokens and throws a Parsing error once it
          > hits the end.
          [...]


          Well, possibly you missed something needed to make the grammar correct?

          Without an example (including sources as well as some input examples),
          it is not that easy to help you.


          >
          > Where can I find some guidelines to help in constructing my grammar?
          [...]


          I would recommend three things:

          1) Yacc-tutorial (the original Unix-stuff from some decades ago,
          AT&T Bell labs)

          2) Ocamlyacc tutorial by SooHyoung Oh

          3) The book "lex & yacc" from O'Reilly, if you are not a
          C-allergic person ;-)


          You can find 1) (together with other tutorials) here:

          http://dinosaur.compilertools.net/yacc/index.html


          You can find 2) here:

          http://plus.kaist.ac.kr/~shoh/ocaml/ocamllex-ocamlyacc/ocamlyacc-tutorial/


          You can find 3) here:
          http://www.oreilly.com/catalog/lex/index.html


          In the book there are examples of how to create grammars
          and improve them, as well as hints for solving
          typical problems (e.g. if-then vs if-then-else parsing).

          IMHO this book is a good starting point.


          The Dragon book, IMHO, goes very deep into the fundamentals.
          If you want to write special stuff, this might be of interest,
          otherwise it's disproportionate for practical programming.

          I think it's going far beyond the possibilities of lex and yacc.



          [...].
          >
          > Also... how close is Bison to ocamlyacc?

          As far as I know, bison is an enhanced yacc, but I have never used it,
          so I'm vague here ;-)


          > Wouldn't the concepts be
          > close enough to help in building the grammar?

          Look in 1) and especially 3) for the grammar stuff.
          Look in 2) for ocamlyacc-specific things.



          Ciao,
          Oliver
        • darioteixeira
          Message 4 of 7 , Mar 11, 2008
            Hi Richard,

            > I've found some great documentation on ocamlyacc in general, but I'm
            > not running across much in the way of grammar guidelines...

            Any tutorial on Yacc/Bison applies also to Ocamlyacc. There's
            quite a few of those on the net, plus the O'Reilly book that
            has been mentioned already.

            In any case, I recommend you try using Menhir instead of Ocamlyacc.
            It is mostly backwards compatible, and includes a number of options
            that make debugging a lot easier. Moreover, it is already available
            on GODI. Here's the homepage:
            http://cristal.inria.fr/~fpottier/menhir/

            > P.s. - If I construct an LR(1) grammar, that's good enough for the
            > LALR(1) of ocamlyacc - right?

            Menhir is an LR(1) parser generator. So you won't have to worry
            about that issue...

            Cheers,
            Dario
          • Richard Lyman
            Message 5 of 7 , Mar 11, 2008
              I'd be interested in Menhir, but their site mentions:

              "Warning: the current release is BETA quality"

              Everybody has their own definition of Beta quality so it's possible
              that theirs allows for more stable code...

              Have you run into any problems with that?

              -Rich

              On Tue, Mar 11, 2008 at 2:21 PM, darioteixeira <darioteixeira@...> wrote:
              <snip>
              >
              > In any case, I recommend you try using Menhir instead of Ocamlyacc.
              > It is mostly backwards compatible, and includes a number of options
              > that make debugging a lot easier. Moreover, it is already available
              > on GODI. Here's the homepage:
              > http://cristal.inria.fr/~fpottier/menhir/
              >
              <snip>
              >
              > Cheers,
              > Dario
            • darioteixeira
              Message 6 of 7 , Mar 11, 2008
                > I'd be interested in Menhir, but their site mentions:
                >
                > "Warning: the current release is BETA quality"
                >
                > Everybody has their own definition of Beta quality so it's possible
                > that theirs allows for more stable code...
                >
                > Have you run into any problems with that?

                Hi,

                I haven't had any problems. In any case, I suggest you use Menhir
                to develop your application, since it makes debugging so much easier
                (check Menhir's manual for options --explain, --graph, --trace,
                --dump, --log-automaton, --log-code, and --log-grammar). As long
                as you don't use any Menhir-specific features in your grammar (and
                if the grammar is LALR(1) of course) you can always switch back to
                Ocamlyacc if you run into any problems.

                Cheers,
                Dario
              • Richard Lyman
                Message 7 of 7 , Mar 12, 2008
                  So, what I'm wanting is to transform some XML.

                  I know there are libraries to handle this task - I'm only using XML
                  since I thought it would be something simple to lex and parse. I'm
                  hoping that I will just get a string of numbers 0-9 that correspond
                  with the non-terminals that were visited. Here's the code that isn't
                  doing that...

                  (... and in the end I'm hoping that I can just have regular OCaml code
                  that writes out to a file or a sqlite DB for each production, instead
                  of concatenating sequences of number-strings for each production... )


                  Main.ml:
                  let linebuf = Lexing.from_string
                  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
                  <root>
                  <child/>
                  <child/>
                  </root>" in
                  while true do
                    try
                      Printf.printf "%s\n%!" (Parser.main Lexer.token linebuf)
                    with
                    | Lexer.Error msg -> print_endline "Lexer error"
                    | Parser.Error -> print_endline "Parser error"
                  done




                  Lexer.mll:
                  {
                    open Str
                    open Parser
                    open Printf

                    exception Error of string

                    let line = ref 0
                    let file = ref ""
                    let incLine lexbuf =
                      let pos = lexbuf.Lexing.lex_curr_p in
                      lexbuf.Lexing.lex_curr_p <- {
                        pos with
                        Lexing.pos_lnum = pos.Lexing.pos_lnum + 1;
                        Lexing.pos_bol = pos.Lexing.pos_cnum;
                      }
                    let usefulError point lexbuf =
                      let lpos = (Lexing.lexeme_start_p lexbuf).Lexing.pos_bol in
                      let pos = (Lexing.lexeme_start_p lexbuf).Lexing.pos_cnum - lpos in
                      let charIndicator =
                        if( pos <= 0 ) then
                          "^"
                        else
                          ((String.make pos ' ') ^ "^") in
                      let line = (Lexing.lexeme_start_p lexbuf).Lexing.pos_lnum in
                      let ic = (open_in !file) in
                      let ignore = seek_in ic lpos in
                      let context = input_line ic in
                      failwith( sprintf "File '%s'\nLine %d\nCharacter %d ('%c')\n%s\n%s"
                        !file line pos point context charIndicator )
                  }

                  let openBracket = '<'
                  let eol = ['\n' '\r' '\013']+
                  let ops = [' ' '\t']*
                  let reqs = [' ' '\t']+
                  let alpha = ['A'-'Z' 'a'-'z']
                  let alphanum = (alpha | ['0'-'9'])
                  let extraChars = ['.' '-' ':' '\\' '/' ',' ' ' '[' ']' '_' '(' ')' '&' ';' '=']
                    (* Dbl Quote is the only thing that should stay out *)
                  let attributeValue = (alphanum | extraChars )
                  let attribute = alpha+ ops '=' ops '"' attributeValue* '"'
                  let attributeList = (reqs attribute)*

                  rule token = parse
                  | "<?" { print_endline "OPEN_PD"; OPEN_PD; token lexbuf }
                  | "?>" { print_endline "CLOSE_PD"; CLOSE_PD; token lexbuf }
                  | "<" { print_endline "LT"; LT; token lexbuf }
                  | (alpha+ as name) { print_endline ("IDENT "^name); IDENT name; token lexbuf }
                  | '"' (attributeValue* as value) '"'
                      { print_endline ("QUOTE STRING QUOTE "^value); QUOTE; STRING value; QUOTE; token lexbuf }
                  | '=' { print_endline "EQUAL"; EQUAL; token lexbuf }
                  | "</" { print_endline "CLOSE_NODE"; CLOSE_NODE; token lexbuf }
                  | "/>" { print_endline "CLOSE_NODE"; CLOSE_NODE; token lexbuf }
                  | ">" { print_endline "GT"; GT; token lexbuf }
                  | reqs { print_endline "WHITESPACE"; WHITESPACE; token lexbuf }
                  | eol { incLine lexbuf; token lexbuf }
                  | eof { print_endline "Reached eof"; EOF; exit 0 }
                  | _ as point { usefulError point lexbuf }
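
One thing worth checking in the lexer actions above: in OCaml, `a; b` evaluates `a`, discards its result, and returns the value of `b`. An action of the form `{ ...; LT; token lexbuf }` therefore appears to throw the token away and recurse, so the parser would never be handed it, and `-w a` in the build script silences the compiler warning that would normally point this out. The sketch below is plain OCaml rather than ocamllex; `classify`, `discard_and_recurse`, and `return_token` are made-up names that contrast the two shapes on a list of strings, and the end of input simply returns `EOF` here instead of calling `exit 0`.

```ocaml
(* A cut-down token type; only what this illustration needs. *)
type token = LT | GT | IDENT of string | EOF

(* Hypothetical stand-in for the regular-expression matching step. *)
let classify = function
  | "<" -> LT
  | ">" -> GT
  | s -> IDENT s

(* Same shape as the posted actions: the token is computed, thrown
   away, and the function recurses until the input runs out. *)
let rec discard_and_recurse input =
  match !input with
  | [] -> EOF
  | x :: rest ->
    input := rest;
    ignore (classify x);  (* equivalent to writing `LT; token lexbuf` *)
    discard_and_recurse input

(* The token is the value of the action, so the caller receives it. *)
let return_token input =
  match !input with
  | [] -> EOF
  | x :: rest ->
    input := rest;
    classify x

let () =
  (* The caller only ever sees the end-of-input token. *)
  assert (discard_and_recurse (ref ["<"; "root"; ">"]) = EOF);
  (* One token per call, in order. *)
  let input = ref ["<"; "root"; ">"] in
  assert (return_token input = LT);
  assert (return_token input = IDENT "root");
  assert (return_token input = GT);
  assert (return_token input = EOF)
```

If this reading is right, each action would end with the token itself, for example `{ print_endline "LT"; LT }`, and only rules that skip input (such as `eol`) would call `token lexbuf` again. Since an action returns a single value, `QUOTE; STRING value; QUOTE` would likewise need to become one token.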




                  Parser.mly:
                  %token <string> IDENT
                  %token <string> STRING
                  %token WHITESPACE OPEN_PD CLOSE_PD GT LT CLOSE_NODE QUOTE EQUAL EOF

                  %start main
                  %type <string> main
                  %type <string> pd
                  %type <string> attribute
                  %type <string> attlist
                  %type <string> node_start
                  %type <string> node_end
                  %type <string> node_list
                  %type <string> root

                  %%

                  pd:
                  | OPEN_PD IDENT WHITESPACE attlist CLOSE_PD { "1" }

                  attribute:
                  | IDENT EQUAL QUOTE STRING QUOTE { "2" }

                  attlist:
                  | { "3" }
                  | attribute { "4" }
                  | attlist WHITESPACE attribute { "5" }

                  node_start:
                  | LT IDENT WHITESPACE attlist GT { "6" }

                  node_end:
                  | CLOSE_NODE IDENT GT { "7" }

                  node_list:
                  | { "8" }
                  | root { $1^"9" }

                  root:
                  | node_start node_list node_end { $1^$2^$3^"0" }

                  main:
                  | pd root EOF { $1^$2 }

                  %%
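
A second thing to compare against the sample input: `node_start` requires a `WHITESPACE` token after the tag name, while `<root>` contains none. The sketch below hand-translates that one production into plain OCaml for the case where `attlist` is empty. It assumes the lexer hands back one token per call, and `is_node_start_no_attrs` is a made-up helper, not something generated by ocamlyacc or Menhir.

```ocaml
(* Token names copied from Parser.mly; only the ones needed here. *)
type token = LT | GT | WHITESPACE | IDENT of string

(* Hand-written check for
     node_start: LT IDENT WHITESPACE attlist GT
   restricted to the case where attlist derives the empty string. *)
let is_node_start_no_attrs = function
  | [LT; IDENT _; WHITESPACE; GT] -> true
  | _ -> false

let () =
  (* "<root>" lexes to LT IDENT GT: there is no WHITESPACE token,
     so the production does not match. *)
  assert (not (is_node_start_no_attrs [LT; IDENT "root"; GT]));
  (* "<root >" would lex to LT IDENT WHITESPACE GT and match. *)
  assert (is_node_start_no_attrs [LT; IDENT "root"; WHITESPACE; GT])
```

The same kind of comparison suggests two more gaps: `<child/>` lexes to `LT IDENT CLOSE_NODE`, which no production accepts, and `node_list` allows at most one nested `root`, while the sample has two children.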




                  I'm compiling all of them with 'compile.sh':
                  ocamllex Lexer.mll
                  menhir --graph --trace --dump --log-grammar 2 Parser.mly

                  ocamlc -c -w a Parser.mli
                  ocamlc -c -w a Parser.ml
                  ocamlc -c -w a Lexer.ml
                  ocamlc -g -w a str.cma unix.cma Lexer.cmo Parser.cmo Main.ml -o Main

                  dot -Tpdf Parser.dot -o cp.pdf

                  rm -f *.cmx
                  rm -f *.cmi
                  rm -f *.cmo
                  rm -f *.mli
                  rm -f *.o
                  rm -f Lexer.ml
                  rm -f Parser.ml