MS Word 2000 to HTML
- I have been using Word 97 to originate my documents, then save them
as HTML, and clean them up with NoteTab.
I have started to use Word 2000, but the new "Save as Web Page" adds
a whole bunch of extra tags and styles. For example, I use a
style in Word 97 for H3 and H4 that saves great as HTML, but in Word
2000 it is coded as <p class=H3> and <p class=H4>.
I have been teaching myself clip programming, and I think I can write
something that will clean up the mess that Word leaves behind. My
main question is: can I do a "search and replace" with a clip that
will convert the <p class=H3></p> tag set to a standard <h3></h3>
set, with some sort of wildcard standing in for the text between the
opening and closing tags? I don't mind
doing a series of simple search and replace actions, but it would be
nice to automate it somehow.
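
For illustration only, here is a minimal sketch of that kind of wildcard replace, written in Python rather than as a NoteTab clip. It assumes Word 2000 writes the headings as <p class=H3 ...>text</p> and <p class=H4 ...>text</p>, possibly with quotes around the class name and other attributes alongside it. The names HEADING_P and p_to_heading are made up for this example.

```python
import re

# Wildcard match for <p class=H3 ...>text</p> and <p class=H4 ...>text</p>.
# Group 1 captures the heading level, group 2 the text between the tags.
# Assumption: no attribute value inside the <p ...> tag contains a ">".
HEADING_P = re.compile(
    r'<p\b[^>]*\bclass\s*=\s*["\']?H([34])["\']?(?=[\s>])[^>]*>(.*?)</p\s*>',
    re.IGNORECASE | re.DOTALL,
)

def p_to_heading(html):
    """Rewrite <p class=H3>...</p> as <h3>...</h3> (likewise for H4)."""
    return HEADING_P.sub(
        lambda m: "<h{0}>{1}</h{0}>".format(m.group(1), m.group(2)),
        html,
    )

if __name__ == "__main__":
    print(p_to_heading("<p class=H3>Chapter One</p>"))
```

The "wildcard" is the (.*?) group: it matches as little as possible up to the next </p>, so each heading is converted separately even when several sit on one line.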
I have the same problem with the standard <p> set, which Word creates
as a <p class=MsoNormal> set, and with the standard <li> set, which
Word adds all kinds of extra stuff to.
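
The <p> and <li> cleanup could be handled the same way. The following is a rough sketch in Python (not clip code) that simply throws away every attribute on opening <p> and <li> tags. BLOATED_TAG and strip_attributes are hypothetical names, and the sample tag in the demo line is invented, not copied from real Word output.

```python
import re

# Opening <p ...> or <li ...> tag with any attributes.
# The \b keeps it from matching <pre> or <link>.
# Assumption: no attribute value contains a ">".
BLOATED_TAG = re.compile(r'<(p|li)\b[^>]*>', re.IGNORECASE)

def strip_attributes(html):
    """Reduce every opening <p ...> or <li ...> tag to a bare <p> or <li>."""
    return BLOATED_TAG.sub(lambda m: "<" + m.group(1).lower() + ">", html)

if __name__ == "__main__":
    print(strip_attributes('<li class=MsoNormal style="margin:0">Item</li>'))
```

Because this discards the class attribute too, it would have to run after any step that relies on class=H3 or class=H4 to recognise headings.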
- FWIW, there are some new developments with Tidy at the following
link... I haven't had time to sift through them yet.