7224 Re: [NH] Notetab refuses to perform edits on this .html file

  • Axel Berger
    Oct 10, 2012
      Marcelo Bastos wrote:
      > I had a quick look at the logic, and it seems to be generic enough to
      > tackle the entire Basic Multilingual Plane.

      Even more than that: it will also translate illegal UTF-8 into equally
      illegal entities. I have another clip that checks a document for legal
      UTF-8 and flags errors such as stray ANSI characters.

      ---------------------------------------------------------------
      :loop
      ^!Find "([\x80-\xBF]|[\xC0-\xFF][\x80-\xBF]*)" RS
      ^!IfError usasc
      ^!IfMatch "[\xC2-\xDF][\x80-\xBF]" "^$GetSelection$" loop
      ^!IfMatch "\xE0[\xA0-\xBF][\x80-\xBF]" "^$GetSelection$" loop
      ^!IfMatch "[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}" "^$GetSelection$" loop
      ^!IfMatch "\xED[\x80-\x9F][\x80-\xBF]" "^$GetSelection$" loop
      ^!IfMatch "\xF0[\x90-\xBF][\x80-\xBF]{2}" "^$GetSelection$" loop
      ^!IfMatch "[\xF1-\xF3][\x80-\xBF]{3}" "^$GetSelection$" loop
      ^!IfMatch "\xF4[\x80-\x8F][\x80-\xBF]{2}" "^$GetSelection$" loop
      ^!Continue Illegal sequence, not valid UTF-8
      ^!Goto loop
      :usasc
      ^!Continue No errors found
      ---------------------------------------------------------------
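For anyone who wants to try the same check outside NoteTab, here is a rough Python sketch of the logic in the clip above. It is only an illustration, not a translation tested against NoteTab itself: the names `CANDIDATE`, `WELL_FORMED` and `find_illegal_sequences` are made up for this example, and it assumes each test in the clip is meant to cover the whole selection, which is why it uses `fullmatch`.

```python
import re

# Any run starting with a non-ASCII byte: a stray continuation byte,
# or a lead byte followed by however many continuation bytes come next.
CANDIDATE = re.compile(rb"[\x80-\xBF]|[\xC0-\xFF][\x80-\xBF]*")

# The seven well-formed multi-byte patterns, same byte ranges as the clip.
WELL_FORMED = re.compile(
    rb"[\xC2-\xDF][\x80-\xBF]"               # 2 bytes
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"          # 3 bytes, no overlong forms
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"   # 3 bytes, general case
    rb"|\xED[\x80-\x9F][\x80-\xBF]"          # 3 bytes, no surrogates
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"       # 4 bytes, no overlong forms
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"           # 4 bytes, general case
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"       # 4 bytes, up to U+10FFFF
)


def find_illegal_sequences(data: bytes):
    """Return (offset, sequence) for each non-ASCII run that is not legal UTF-8."""
    errors = []
    for match in CANDIDATE.finditer(data):
        # The whole run has to match one of the legal patterns.
        if not WELL_FORMED.fullmatch(match.group()):
            errors.append((match.start(), match.group()))
    return errors
```

Unlike the clip, which stops at each illegal sequence and asks whether to continue, this version collects all offsets in one pass. Pure ASCII input yields an empty list, and ANSI (Latin-1) letters show up as single-byte errors.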

      Axel