
23972 How to fix non-ASCII characters using NoteTab

  • John Shotsky
    Aug 30, 2013
      When converting ebooks to other formats, one of the tasks is to convert the ebook to HTML. Generally, that means encoding the
      text as UTF-8, but because many of those creating ebooks do not understand character encoding, many of the characters
      that should be coded as entities are 'in the open'. That is, browsers display these characters correctly even though they are
      not encoded as entities, but some of them don't exist in ASCII at all. Here is an example:
      <strong><span class="sgc-3">2 shallots, chopped (about ⅓ cup) or ⅓ cup chopped scallion or onion</span></strong>
      Those 1/3 fraction symbols are called 'vulgar fractions', but the 8-bit ANSI (Latin-1) character set, often loosely called
      US ASCII, supports only three of them: ¼, ½ and ¾. Using NoteTab, there is no way to search and replace the others, because
      you can't write the character into your find expression - it doesn't exist in the character set.

      So, my question is this: Is there a way to use NoteTab to open these HTML files, FIND these unencoded characters, and replace them
      with the equivalent US ASCII characters, which in this case would be the three-character sequence 1/3?
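
      For comparison, here is how the same replacement could be sketched outside NoteTab, in Python. This is not a NoteTab
      solution, only an illustration of the find-and-replace being asked for. The table FRACTIONS and the function
      replace_fractions are hypothetical names, and the sketch assumes the HTML has already been read in as UTF-8 text. It adds a
      space after a preceding digit so that a mixed number such as 1⅓ becomes "1 1/3" rather than "11/3".

```python
import re

# Hypothetical lookup table: vulgar-fraction code points that fall
# outside the 8-bit Latin-1 range, mapped to plain ASCII sequences.
FRACTIONS = {
    "\u2153": "1/3",  # one third
    "\u2154": "2/3",  # two thirds
    "\u215B": "1/8",  # one eighth
    "\u215C": "3/8",  # three eighths
    "\u215D": "5/8",  # five eighths
    "\u215E": "7/8",  # seven eighths
}

# An optional digit directly before the fraction is captured so a
# separating space can be inserted for mixed numbers.
_PATTERN = re.compile(r"(\d?)([" + "".join(FRACTIONS) + r"])")


def replace_fractions(text: str) -> str:
    """Replace each listed vulgar fraction with its ASCII equivalent."""

    def _sub(match: re.Match) -> str:
        whole, frac = match.group(1), match.group(2)
        separator = " " if whole else ""
        return whole + separator + FRACTIONS[frac]

    return _PATTERN.sub(_sub, text)


if __name__ == "__main__":
    sample = "2 shallots, chopped (about \u2153 cup) or \u2153 cup chopped scallion"
    print(replace_fractions(sample))
```

      To apply this to a file, the text would need to be opened with the encoding stated explicitly, for example
      open(path, encoding="utf-8"), since a wrong encoding guess is what turns these characters into question marks.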

      There are a whole host of other characters that are not properly encoded for HTML/UTF-8 as well, but if there is a way to make this
      one work, I can work out the rest.
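
      For the remaining characters, a first step might be to list which non-ASCII characters a file actually contains, and then
      either map them by hand or convert them all to numeric HTML entities. The Python sketch below shows both ideas; again it is
      an illustration rather than a NoteTab technique, the function names are hypothetical, and it assumes the text was decoded
      correctly (for example as UTF-8) before being passed in.

```python
from collections import Counter
import unicodedata


def non_ascii_inventory(text: str) -> list:
    """List (code point, Unicode name, count) for each non-ASCII character,
    most frequent first."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), n)
        for ch, n in counts.most_common()
    ]


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal entity such as &#8531;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = "about \u2153 cup or \u2153 cup"
    for code_point, name, count in non_ascii_inventory(sample):
        print(code_point, name, count)
    print(to_numeric_entities(sample))
```

      The entity form keeps the file pure ASCII while still displaying the original character in a browser, whereas a hand-built
      mapping is needed when a plain sequence such as 1/3 is wanted instead.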

