HTML Parser needed
- [If you do not know what a "parser" is, you can delete this message]
The parser I use for Xenu's Link Sleuth is pretty basic, and I'm almost
ashamed of it :-) After all, I did take a one-semester course in syntax
analysis, and I have owned the book "Compiler Construction" by Prof.
Niklaus Wirth for many years.
Xenu's parser isn't a real parser, it's just something "quick and
dirty". Does anyone know of source code for an HTML parser that supports
the newer parts of the language and is easy to use?
What I'd like is something I can hook into to add my own little
components, e.g. through callbacks, or by simply modifying parts of the
code. This way I could handle the "id=" attribute, which I don't
support at this time because it can appear on any element.
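
To make clear what I mean by "callbacks", here is a rough sketch using
Python's standard html.parser module. It is only an illustration of the
style of interface I'm after, not how Xenu works; the names
AnchorCollector and collect are made up for this example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Hypothetical helper: records link targets and every id= it sees."""

    def __init__(self):
        super().__init__()
        self.links = []  # values of href= on <a> tags
        self.ids = []    # values of id= on any tag

    def handle_starttag(self, tag, attrs):
        # The parser calls this for each opening tag.
        # attrs is a list of (name, value) pairs, names already lowercased.
        for name, value in attrs:
            if name == "id" and value:
                # id= may appear on any element, so no tag check here.
                self.ids.append(value)
            elif name == "href" and tag == "a" and value:
                self.links.append(value)


def collect(html_text):
    """Parse html_text and return (links, ids) in document order."""
    parser = AnchorCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links, parser.ids
```

With something like this, a link checker could compare the fragment
part of each link ("page.html#top") against the ids collected from the
target page.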
Like Prof. Wirth, I prefer non-table based syntax analysis.